Presentation Slides

The Chromium™ System
Linked Read and Single Cell RNA-Seq Applications Powered by GemCode Technology
Chris Black, M.S.
Sales Executive
10x Genomics
chris.black@10xgenomics.com
Presentation Structure
• GemCode™ Technology Overview
• Linked Read Applications with Chromium
– Chromium Genome & Exome (Hybrid Capture)
– De Novo Assembly Using Supernova
• Single Cell 3’ RNA-seq with Chromium
• Chromium Software Suite Overview
Confidential: Do not distribute
GemCode Technology Overview
GemCode Technology – Creating GEMs
• GEM – Gel Bead in EMulsion
GEM Structure
Reagent Delivery, Sample Partitioning and Barcoding System
[Figure labels: HMW DNA or mRNA from SC; Gel Bead; Reagents; Emulsion (aka Partition)]
Gel Bead Oligo Delivery Substrate
[Figure: gel bead scaffold carrying functional oligos with barcode; high-diversity library. Oligo elements: P5, 10x Barcode, R1, N-mer]
• Millions of copies of identical oligos
• Defined barcode sequence
• Built-in sequencing adapter and primer content
Generating GEMs with the Chromium Controller
[Figure: GEM generation workflow. Labels: Barcoded Gel Beads; Enzyme; HMW gDNA; Oil; Genome GEMs (1–5 gDNA molecules per GEM); Isothermal Incubation; Barcoded Amplicons; Pool; Remove Oil; Collect. Captions: Solid phase reagent delivery; Fluid partitioning; Liquid phase biochemistry]
Diversity Drives Performance
                Standard Barcoding   Chromium Single Cell 3’ RNA-seq   Chromium Genome
Partitions      384                  >1,000,000                        >1,000,000
Barcode Pool    384                  750,000                           4,000,000
Input           100 ng+ gDNA         2k – 12k cells                    1 ng HMW gDNA
• Massive partitioning and barcode diversity enables
– lower sequencing depth requirement (LR)
– lower purity/quality samples (LR)
– de novo assembly (LR)
– Minimal duplication rate even with 10k+ cells (SC)
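To make the barcode-diversity point concrete, the sketch below estimates how often two cells would end up with the same barcode if each cell drew one barcode uniformly at random from the pool. The uniform-draw model and the function name are assumptions made for illustration, not a description of the Chromium chemistry; the pool sizes of 384 and 750,000 are the figures quoted on this slide.

```python
def shared_barcode_fraction(num_cells: int, pool_size: int) -> float:
    """Expected fraction of cells whose barcode is also carried by another cell.

    Assumes each cell draws one barcode uniformly at random, with
    replacement, from a pool of `pool_size` barcodes (an idealization).
    """
    if num_cells < 1 or pool_size < 1:
        raise ValueError("num_cells and pool_size must be positive")
    # Probability that none of the other cells picked this cell's barcode.
    p_unique = (1.0 - 1.0 / pool_size) ** (num_cells - 1)
    return 1.0 - p_unique


for pool in (384, 750_000):
    frac = shared_barcode_fraction(10_000, pool)
    print(f"pool={pool:>7}: {frac:.2%} of 10,000 cells share a barcode")
```

Under this simplified model, a 384-barcode pool leaves essentially every one of 10,000 cells sharing a barcode, while a 750,000-barcode pool keeps the shared fraction at roughly 1.3%.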
Linked Read Applications with Chromium:
Chromium Genome and Exome (Hybrid Capture)
Linked Reads & The Chromium Genome
• Resolve the genome into multi-megabase phase blocks
– Phase the full spectrum of called variants
• Detect structural variants
– Translocations, insertions, deletions, duplications, complex rearrangements
• Recover variants in previously inaccessible parts of the genome
– Confidently map reads and call variants even in repetitive regions
Chromium Genome: Phase Compound Hets
[Figure: two variants phased in TRANS, 28.2 kb apart: ΔPhe508 (ATCT>A) at chr7: 117,199,644 and Gly551Asp (G>A) at chr7: 117,227,860]
• NA11274 – Coriell sample from female affected with cystic fibrosis
• Compound heterozygous variants in the cystic fibrosis transmembrane conductance regulator (CFTR) gene
• Gene within 18 Mb phase block
Chromium Genome: Phase Large Deletions
• NA12878 – Coriell sample from female
• 128 Gb ILMN sequence data (2x150)
• N50 phase block – 8 Mb
• 99% of all SNPs phased
Chromium Genome: Phase Tandem Dupes
[Figure: panels compare "Standard Library, BWA" with "Chromium Genome, Lariat"]
• NA12878 – Coriell sample from female
• 128 Gb ILMN sequence data (2x150)
• N50 phase block – 8 Mb
• 99% of all SNPs phased
Chromium Genome: Detect SVs
• Automated calling of structural variants <50 bp & >30 kb
• Data supports detection of variants of all sizes
[Figure: four example calls: Heterozygous deletion (HCC1954T); Homozygous deletion (NA12878); Tandem duplication (HCC1143T); Homozygous inversion (NA12878)]
Chromium Genome: Resolve Mapping Issues
• BWA core generates multiple candidate alignments and fails to assign reads
• Lariat aligner uses linked read data from 10x barcodes to rescue unmapped reads
[Figure: a repetitive sequence present at Locus 1 and Locus 2. Reads inside the repeat produce ambiguous alignments; reads in the flanking sequence produce unique alignments]
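The idea of using barcodes to place ambiguous reads can be sketched as follows. This is a toy illustration under stated assumptions, not Lariat's actual algorithm: the function name, the 50 kb window and the data are all hypothetical. A read with several candidate loci is assigned to the candidate that has the most uniquely aligned reads from the same barcode nearby.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Locus = Tuple[str, int]  # (chromosome, position)


def rescue_by_barcode(
    candidates: List[Locus],
    barcode: str,
    unique_hits: Dict[str, List[Locus]],
    window: int = 50_000,
) -> Optional[Locus]:
    """Pick the candidate locus best supported by same-barcode unique reads.

    Returns None when no candidate has support or the top support is tied.
    """
    support: Counter = Counter()
    for chrom, pos in candidates:
        for hit_chrom, hit_pos in unique_hits.get(barcode, []):
            if hit_chrom == chrom and abs(hit_pos - pos) <= window:
                support[(chrom, pos)] += 1
    ranked = support.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # still ambiguous
    return ranked[0][0]


# Hypothetical data: one read maps equally well to two repeat copies.
unique_hits = {"BC1": [("chr1", 1_010_000), ("chr1", 1_025_000), ("chr5", 9_000_000)]}
candidates = [("chr1", 1_000_000), ("chr7", 4_000_000)]
print(rescue_by_barcode(candidates, "BC1", unique_hits))
```

The design choice here is deliberately conservative: a read stays unassigned unless exactly one candidate locus has the strongest barcode support.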
Chromium Genome: Rescue Repeat Regions
Chromium Genome: Eliminate Trio Analysis
[Figure: panels titled "Trio validation of Lariat calls" and "Comparison of NA12878 aligned with Lariat and BWA"]
• NA12878 – Coriell sample from female child of NA12891 (P) and NA12892 (M)
• All three samples sequenced to 128 Gb (ILMN 2x150)
Linked Reads for Exome (Hybrid Capture)
• Average of 5 different HapMap samples including NA12878
• V6 bait set; down-sampled to 6 Gb after removing barcode sequence; NIST Confident regions
[Figure: two charts, "V6 Confident Performance" and "Rescuing Hard V6 Content". Y-axis: Percent Bases at Given Coverage (0% to 100%); x-axis: 1X, 5X, 10X and 20X Coverage. Comparison: Agilent XT (200 ng) vs. Chromium Exome (1 ng)]
Highly uniform and complex library from just 1 ng
Optimized SureSelect Baits - Coming Soon
• Optimized SureSelect baits designed specifically for the Chromium system to improve gene phasing by closing gaps, and recovering hard-to-map loci in the genome.
Identify Fusion Genes From Exome Data
• H2228: Non-small cell lung cancer cell line
Linked Read Applications with Chromium:
De Novo Assembly using Supernova
Supernova De Novo Assembly Workflow
Sequence Genome (Make 1 Library)
DNA extraction (1 hr - 3 days), 1 ng DNA input
• Assay requires 1 ng of HMW DNA
• >50 kb acceptable
• >100 kb preferred
Library prep (2 days)
• Single linked read library per sample
Sequencers (<3 days)
• 2x150 reads
• HiSeq recommended
• >50X coverage recommended
Supernova Assembler: Supernova Human Assembly (2 days)
• Run on single, standalone CentOS/RedHat or Ubuntu system
• At least 24 cores
• 512 GB RAM
• 2 TB disk
Characteristics Of Supported Genomes
• Genome size: 1 - 3.2 Gb total
• Ploidy
– Haploid: not tested but likely to work well
– Diploid: fully tested and supported
– Polyploid: not tested, likely to require additional development
• Input should be from a single individual or clonal population
• Repeat content and G/C composition similar to human genome recommended.
True Diploid Assembly From Linked Reads
[Figure: sequences from haplotype 1 and sequences from haplotype 2. Old Assembly Model: compress into a consensus. Supernova Assembly Model: represent both haplotypes]
Church et al., 2011 PLoS Biology
Key Assembly Metrics/Terms
[Figure labels: Multi-Mb Scaffold; Multi-kb Contig; Multi-Mb Phase Block]
• Contig: an ungapped sequence
• Scaffold: a gapped sequence containing multiple contigs for which the order and orientation is asserted
• Phase Block: True diploid sequences; size is dependent on heterozygosity
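The assembly slides that follow report N50 values for contigs, scaffolds and phase blocks. The deck does not define N50; by the conventional definition it is the length L such that sequences of length L or longer account for at least half of the total. A minimal sketch, with made-up lengths:

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths; kept for safety


# Made-up contig lengths in kb, for illustration only.
print(n50([80, 70, 50, 40, 30, 20]))
```

In this example the total is 290 kb, and the two longest contigs (80 + 70 = 150 kb) already cover half of it, so the N50 is 70 kb.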
Supernova De Novo Assembly of NA12878
Sample Input and Sequencing
• Input DNA: 1.25 ng
• Molecule size: 80.3 kb
• 2x150 bp on Illumina HiSeq X Ten
• 1200 M reads (56x coverage)
Supernova Compute Requirements
• 28 cores (1 server) running for 48 hours
• 1,344 total core-hours
• 2 TB of data generated
Supernova Assembly
# of scaffolds >10 kb                  1651
N50 contig size                        103.67 kb
N50 scaffold size                      14.06 Mb
Assembly size (scaffolds >= 10 kb)     2.74 Gb
N50 phase block size                   1.96 Mb
View and download at
http://software.10xgenomics.com/de-novo-assembly/overview/datasets
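The coverage and core-hour figures on this slide follow from simple arithmetic, sketched below. Two assumptions are made here: that "1200 M reads" counts individual 150 bp reads, and that the genome size is the 3.2 Gb human value quoted elsewhere in the deck. The function names are illustrative.

```python
def mean_coverage(num_reads: float, read_length_bp: int, genome_size_bp: float) -> float:
    """Average depth: total sequenced bases divided by genome size."""
    return num_reads * read_length_bp / genome_size_bp


def core_hours(cores: int, wall_clock_hours: float) -> float:
    """Core-hours consumed if every core is busy for the whole run."""
    return cores * wall_clock_hours


# Read count, read length, cores and hours are taken from the slide;
# the 3.2 Gb genome size is an assumption (human value from the deck).
coverage = mean_coverage(1200e6, 150, 3.2e9)
print(f"coverage: {coverage:.2f}x")
print(f"core-hours: {core_hours(28, 48):.0f}")
```

Both results agree with the slide: about 56x coverage and 1,344 core-hours.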
Phased De Novo Assembly
Supernova Assemblies From Diverse Genomes
Sample          Size (Gb)  DNA size (kb)  N50 contig (kb)  N50 scaffold (Mb)  HET SNP spacing (kb)  N50 phase block (Mb)
Human
NA12878         3.2        95.5           85.0             12.8               1.7                   2.8
NA24385         3.2        111.3          90.0             10.4               1.5                   3.9
HGP             3.2        138.8          104.9            19.4               1.5                   4.6
Yoruban         3.2        126.9          100.5            16.1               1.1                   11.4
Nonhuman
Komodo dragon   1.8        85.4           95.3             10.2               10.3                  0.4
Spotted owl     1.5        72.2           118.3            10.1               7.1                   0.2
Hummingbird     1.0        86.2           87.6             12.5               0.4                   10.1
Monk seal       2.6        92.3           93.8             14.8               17.7                  0.6
Chili pepper    3.5        53.3           84.7             4.0                0.4                   2.1
Single Cell 3’ RNA-seq with Chromium
Gel Bead Oligo Delivery Substrate
[Figure: gel bead scaffold carrying functional oligos with barcode; high-diversity library. Oligo elements: P7, 10x Barcode, R2, Poly(dT)VN]
• Millions of copies of identical oligos
• Defined barcode sequence
• Built-in sequencing adapter and primer content
Partitioning and Barcoding of Single Cells
[Figure labels: Barcodes; Cell Suspension and RT Reagents; Oil; Single Cell GEMs; Collect; Incubate and Recover; Barcoded cDNA]
• High GEM fill ratio (~90% of droplets contain beads)
• Poisson loading of cells in GEMs
• Beads dissolve for efficient, liquid phase biochemistry
• Cell lysis starts immediately following encapsulation
Chromium Single Cell Sample Loading
Single-use microfluidics chip
• Up to 8 channels processed in parallel
• 1,000 to 6,000 recovered cells per channel
• 10 minute run time per chip
• ~50% cell processing efficiency
[Figure: chip wells: Outlet well; Bead well; Sample well; Oil well]
User controlled trade-off between cell numbers and doublet rate
Number of Recovered Cells    Expected Doublet Rate (%)*
1,200                        ~1.2
3,000                        ~2.9
6,000                        ~5.7
*Expected Doublet Rate assuming an ideal single cell suspension
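The trade-off between cell numbers and doublet rate is what Poisson loading predicts: the more cells per partition on average, the more partitions receive two or more. The sketch below is a simplified model, not the method used to produce the table. In particular, `effective_gems` is a hypothetical partition count chosen only for illustration, and the function names are invented.

```python
import math


def multiplet_fraction(mean_cells_per_gem: float) -> float:
    """Fraction of cell-containing GEMs that hold two or more cells (Poisson)."""
    lam = mean_cells_per_gem
    if lam <= 0:
        raise ValueError("mean_cells_per_gem must be positive")
    p_empty = math.exp(-lam)
    p_single = lam * p_empty
    return 1.0 - p_single / (1.0 - p_empty)


def multiplet_fraction_for_recovery(recovered: int, effective_gems: int = 50_000) -> float:
    """Multiplet fraction when `recovered` GEMs out of `effective_gems` hold cells.

    `effective_gems` is a hypothetical value chosen for illustration only.
    """
    occupied = recovered / effective_gems
    if not 0 < occupied < 1:
        raise ValueError("recovered must be between 0 and effective_gems")
    lam = -math.log(1.0 - occupied)  # invert P(occupied) = 1 - exp(-lam)
    return multiplet_fraction(lam)


for cells in (1_200, 3_000, 6_000):
    print(f"{cells:>5} recovered: {multiplet_fraction_for_recovery(cells):.1%} multiplets")
```

Under these assumptions the multiplet rate grows roughly in proportion to the number of recovered cells. The exact percentages depend on the assumed partition count and are not expected to reproduce the table.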
Chromium Single Cell: Data Reproducibility
[Figure: Genes Detected per Cell (0 to 4,000) versus Reads per Cell (25,000 to 100,000)]
• Graph shows mean (black) and range (dark gray) over 16 independent experiments using HEK293T cells
Chromium Single Cell: Doublet Rates
[Figure: scatter plot of Mouse Transcript Counts versus Human Transcript Counts, both axes 0 to 60,000. Legend: Human:Mouse; Human only; Mouse only]
• 1:1 mixture of ~1,400 human (HEK293T) and mouse (NIH3T3) cells
• 99.4% of cell-occupied GEMs yielded reads mapping to only one species
• 1% inferred doublet rate*
• *includes unobserved human:human and mouse:mouse doublets
• Number of cells detected: ~1400 cells, Number of raw reads per cell: ~130k
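The footnote explains why the inferred doublet rate is higher than the fraction of mixed-species GEMs actually observed: same-species doublets cannot be seen in a species-mixing experiment. A minimal sketch of the correction, assuming cells pair at random (the function name and the random-pairing model are assumptions, not the presenter's stated method):

```python
def inferred_doublet_rate(observed_mixed_fraction: float, human_fraction: float = 0.5) -> float:
    """Scale the observed human:mouse fraction up to all doublets.

    Assumes cells pair at random, so a doublet is mixed-species with
    probability 2 * p * (1 - p), where p is the human fraction.
    """
    if not 0 < human_fraction < 1:
        raise ValueError("human_fraction must be strictly between 0 and 1")
    p_mixed_given_doublet = 2 * human_fraction * (1 - human_fraction)
    return observed_mixed_fraction / p_mixed_given_doublet


# 99.4% single-species GEMs on the slide implies 0.6% observed mixed GEMs.
observed = 1 - 0.994
print(f"inferred total doublet rate: {inferred_doublet_rate(observed):.1%}")
```

With a 1:1 mixture only half of all doublets are mixed-species, so 0.6% observed corresponds to about 1.2% in total, in line with the roughly 1% quoted on the slide.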
Chromium Single Cell: Cell Cycle POP
[Figure: "Combined Expression of Known Phase Markers". Rows: G1/S, S, G2, G2/M, M/G1; x-axis: HEK293T Cells Ordered by Inferred Cell Cycle Phase (0 to 400); color scale: Expression level]
• Proliferating HEK293T cells were profiled and scored for expression of markers associated with each major cell cycle phase
• Cells from all phases were identified
• Phase-specific genes derived from Whitfield et al., 2002
• Number of cells detected: ~400 cells, Number of raw reads per cell: ~40k
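Scoring cells by phase markers can be sketched as below. This is a simplified illustration, not the analysis behind the figure: the gene names are placeholders rather than the Whitfield et al. marker sets, and taking the mean expression per phase is an assumed scoring rule.

```python
from typing import Dict, List

# Placeholder marker sets; a real analysis would use published phase-specific genes.
PHASE_MARKERS: Dict[str, List[str]] = {
    "G1/S": ["geneA", "geneB"],
    "S": ["geneC", "geneD"],
    "G2": ["geneE"],
    "G2/M": ["geneF", "geneG"],
    "M/G1": ["geneH"],
}


def phase_scores(expression: Dict[str, float]) -> Dict[str, float]:
    """Mean expression of each phase's markers for one cell (missing genes count as 0)."""
    return {
        phase: sum(expression.get(g, 0.0) for g in genes) / len(genes)
        for phase, genes in PHASE_MARKERS.items()
    }


def assign_phase(expression: Dict[str, float]) -> str:
    """Label a cell with its highest-scoring phase."""
    scores = phase_scores(expression)
    return max(scores, key=scores.get)


# Hypothetical expression values for a single cell.
cell = {"geneC": 5.0, "geneD": 3.0, "geneF": 1.0}
print(assign_phase(cell))
```

Ordering cells by their assigned phase, and then by score within each phase, would give an arrangement like the x-axis of the figure.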
Data Clustering: No Cell Sorting Needed
[Figure: two t-SNE projections. Left: "Unbiased Automatic Clustering of Three Breast Cell Lines" (HCC1954, HCC1143, HCC38). Right: "HER2 Expression Matches Expected Cell Line Status", colored by log(x+1) HER2 counts]
• Number of cells detected: ~1000 cells, Number of raw reads per cell: ~40k
Identifying Rare Cell Types
[Figure: PCA plots (PC1 versus PC2, axes from -20 to 20) of T-Cell (Jurkat) and B-Cell (Raji) mixtures. Panels labeled 12%, 1.5% and 0.6%, with cell counts 618 and 83, 1087 and 17, and 1314 and 8]
• Jurkat and Raji cells were combined at 9:1, 99:1 and 199:1 ratios and then profiled
• The minority Raji populations were identified in all three mixtures
• Number of cells detected: ~1000 cells, Number of raw reads per cell: ~60k
Increasing Cell Count Increases Resolution
[Figure: t-SNE projections (TSNE1 versus TSNE2) for Bulk RNA-Seq, 4,500 PBMCs, 16,000 PBMCs and 68,000 PBMCs]
Major populations of PBMCs are detected
[Figure: t-SNE projection (TSNE1 versus TSNE2) with labeled clusters]
• CD4+ T (28.4%)
• CD45RA+ Naïve T (26.4%)
• CD8+ T (18.7%)
• CD56+ NK (13.5%)
• CD19+ B (5.5%)
• CD14+ Monocytes (5.3%)
• Dendritic (1.9%)
• CD34+ Progenitors (0.3%)
Chromium Single Cell: 68k PBMCs in One Run
• CD45RA+ Naïve T Cells
• CD4+ T Cells
• CD8+ T Cells
• CD14+ Monocytes
• CD19+ B Cells
• CD34+ Myeloid Progenitors
• CD56+ Natural Killer Cells
• Number of cells detected: ~68,000 cells, Number of raw reads per cell: ~21k
Chromium Software Suite Overview
Overview – Chromium Software Suite
Turn-key analysis and visualization software is included with all Chromium products:
Chromium Linked Read Data
– Long Ranger Analysis Pipelines
– Loupe Genome Browser
– Supernova Assembler (Genome only)
Chromium Single Cell 3’ RNA-Seq
– Cell Ranger Analysis Pipelines
– Loupe Cell Browser (coming soon)
Chromium Software Platform
• All software products built on a common platform
– Consistent look and feel and ease of use
• Linux-based analysis pipelines
– Self-contained, simple to install
– Run on workstations, and scale to clusters and cloud
– Novel algorithms exploit Linked-Read data type
– Multiple versions of pipelines can run side-by-side
• Mac and Windows-based desktop visualization apps
Linked Reads Informatics Workflow
• Complete software package for long range analysis
• Supports WGS and WES
• Outputs are standard formats plus Loupe visualization
[Figure: BCL input feeds the Long Ranger™ Analysis Pipelines, which output BAM, VCF, BEDPE and LOUPE files for Standard Informatics (samtools, vcftools, bedtools) and the Loupe™ Genome Browser]
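Because the outputs are standard formats, they can be read with ordinary tools or a few lines of code. The sketch below parses BEDPE text using the column layout defined by bedtools (chrom1, start1, end1, chrom2, start2, end2, then optional fields such as name). Whether Long Ranger adds further columns is not covered here, and the example line is made up rather than real pipeline output.

```python
from typing import Iterable, Iterator, NamedTuple


class BedpeRecord(NamedTuple):
    chrom1: str
    start1: int
    end1: int
    chrom2: str
    start2: int
    end2: int
    name: str


def parse_bedpe(lines: Iterable[str]) -> Iterator[BedpeRecord]:
    """Yield records from BEDPE text, skipping comments and blank lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 6:
            raise ValueError(f"expected at least 6 columns, got {len(fields)}")
        name = fields[6] if len(fields) > 6 else "."
        yield BedpeRecord(fields[0], int(fields[1]), int(fields[2]),
                          fields[3], int(fields[4]), int(fields[5]), name)


# Made-up example lines, not real pipeline output.
example = ["#chrom1\tstart1\tend1\tchrom2\tstart2\tend2\tname",
           "chr2\t1000\t1100\tchr5\t5000\t5100\tcall_1"]
for rec in parse_bedpe(example):
    print(rec.name, rec.chrom1, rec.start1, rec.chrom2, rec.start2)
```

Each record describes a pair of genomic intervals, which is why the format suits structural variant breakpoints.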
Long Ranger Pipeline
• Complete, standalone pipeline optimized for human WGS and WES
• Exploits barcodes for alignment, variant calling, phasing, SVs
• Builds upon standard components and file formats
• Runs on Linux, easy to download and run
[Figure: pipeline stages: Barcode Processing; Alignment (10x Lariat™); Variant Calling (GATK, Freebayes); Phasing; SV Calling; Visualization (10x Loupe). Legend: Standard Pipeline Stages; 10x Genomics Stages. File formats: BAM, VCF, BEDPE]
Long Ranger System Requirements
Local Mode
– Run on single, standalone Linux system
– CentOS/RedHat 5.2+ or Ubuntu 8.04+
– 16+ cores, 128 GB RAM, and 2 TB+ disk
Cluster Mode
– Run on SGE and LSF
– Each node must have 8+ cores and 8 GB+ RAM/core
– Shared filesystem between nodes (e.g. NFS)
Runtime
– 30X (100 Gb) Genome: 640 core-hrs (36 hrs on 20 cores)
– 10 Gb Exome: 148 core-hrs
Loupe: Visualize Long Range Information
Loupe is a precision tool for inspecting GEMs
• Runs on output of Long Ranger pipeline
• Fully haplotype-enabled genome browser
• Visualization of breakpoints and structural variants
• Desktop application for Mac and Windows
Loupe: Summary View
Loupe: Haplotype View
Fluidly search for and browse haplotype-resolved variants at multiple loci at once
Loupe: Structural Variant View
Structural variant calls and candidates produced by Long Ranger
Supernova Informatics & Pipeline Overview
• Complete software package for De Novo assembly
• Demultiplexing to FASTA generation for contigs and scaffolds
[Figure: Supernova™ Analysis Pipelines take BCL input through Barcode Processing, k-mer Graph Construction, Barcode-based Graph Resolution and Chromosome Phasing to FASTA output. File formats: FASTA]
Supernova System Requirements
Local Mode
– Run on single, standalone Linux system
– CentOS/RedHat or Ubuntu
– 24+ cores, 512 GB RAM, 2 TB+ disk
Runtime
– 180 Gb genome: 500 core-hours (36 hr on 28 cores)
Single Cell Informatics Workflow
• Complete software package for single cell analysis
• Outputs are standard formats plus Loupe visualization
[Figure: BCL input feeds the Cell Ranger™ Analysis Pipelines, which output BAM, HDF5, MEX and LOUPE files for Standard Informatics (samtools, Python, R) and the Loupe™ Cell Browser (coming in Q2 2016)]
Cell Ranger Pipeline
• Complete, standalone single cell, gene expression pipeline
• Builds upon standard components and file formats
• Runs on Linux, easy to download and run
[Figure: pipeline stages: Barcode Processing; Alignment (STAR); Transcript Counting; Gene-Cell Matrix; Expression Analysis; Visualization (Loupe, coming in Q2 2016). Legend: Standard Pipeline Stages; 10x Genomics Stages. File formats: BAM, Matrix]
Cell Ranger System Requirements
Local Mode
– Run on single, standalone Linux system
– CentOS/RedHat 5.2+ or Ubuntu 8.04+
– 8+ cores, 64 GB RAM
Cluster Mode
– Run on SGE and LSF
– Each node must have 8+ cores and 8 GB+ RAM/core
– Shared filesystem between nodes (e.g. NFS)
Runtime
– 50 core-hours per 100M clusters
– 5000 cells, 40k reads/cell: 95 core-hours
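The two runtime figures can be related with a rough linear estimate, sketched below. It assumes that one sequencing cluster corresponds to one read and that runtime scales linearly with read count. Both are simplifications, and the function names are illustrative.

```python
def estimated_core_hours(num_cells: int, reads_per_cell: int,
                         core_hours_per_100m: float = 50.0) -> float:
    """Scale the slide's "50 core-hours per 100M clusters" rule linearly."""
    total_reads = num_cells * reads_per_cell
    return total_reads / 100e6 * core_hours_per_100m


def wall_clock_hours(core_hours: float, cores: int) -> float:
    """Idealized wall-clock time, assuming perfect scaling across cores."""
    return core_hours / cores


estimate = estimated_core_hours(5_000, 40_000)
print(f"{estimate:.0f} core-hours, about {wall_clock_hours(estimate, 8):.1f} h on 8 cores")
```

For 5,000 cells at 40k reads per cell (200M reads) the linear rule gives 100 core-hours, close to the 95 core-hours quoted on the slide.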
Software Website
For more information, downloads, file formats, and specifications:
http://software.10xgenomics.com
Thank you!
Questions?
