Presentation Slides
Transcription
The Chromium™ System
Linked Read and Single Cell RNA-Seq Applications Powered by GemCode Technology
Chris Black, M.S.
Sales Executive, 10x Genomics
chris.black@10xgenomics.com

Presentation Structure
• GemCode™ Technology Overview
• Linked Read Applications with Chromium
  – Chromium Genome & Exome (Hybrid Capture)
  – De Novo Assembly Using Supernova
• Single Cell 3’ RNA-seq with Chromium
• Chromium Software Suite Overview
Confidential - Do not distribute

GemCode Technology Overview

GemCode Technology – Creating GEMs
• GEM – Gel Bead in EMulsion

GEM Structure
Reagent Delivery, Sample Partitioning and Barcoding System
Diagram labels: HMW DNA or mRNA from single cells (SC); Gel Bead; Reagents; Emulsion (aka Partition)

Gel Bead Oligo Delivery Substrate
Diagram labels: Functional oligo with barcode; Gel bead scaffold; P5; 10x Barcode; R1; N-mer; High-diversity library
• Millions of copies of identical oligos
• Defined barcode sequence
• Built-in sequencing adapter and primer content

Generating GEMs with the Chromium Controller
Diagram labels: Barcoded Gel Beads; Enzyme; HMW gDNA; Oil; Genome GEMs (1–5 gDNA molecules per GEM); Collect; Isothermal Incubation; Pool; Remove Oil; Barcoded Amplicons
Stages: Solid phase reagent delivery; Fluid partitioning; Liquid phase biochemistry

Diversity Drives Performance
               Standard Barcoding   Chromium Single Cell 3’ RNA-seq   Chromium Genome
Partitions     384                  >1,000,000                        >1,000,000
Barcode Pool   384                  750,000                           4,000,000
Input          100 ng+ gDNA         2k–12k cells                      1 ng HMW gDNA
• Massive partitioning and barcode diversity enables
  – lower sequencing depth requirement (LR)
  – lower purity/quality samples (LR)
  – de novo assembly (LR)
  – minimal duplication rate even with 10k+ cells (SC)

Linked Read Applications with Chromium: Chromium Genome and Exome (Hybrid Capture)

Linked Reads & The Chromium Genome
• Resolve the genome into multi-megabase phase blocks
  – Phase the full spectrum of called variants
• Detect structural variants
  – Translocations, insertions, deletions, duplications, complex rearrangements
• Recover variants in previously inaccessible parts of the genome
  – Confidently map reads and call variants even in repetitive regions

Chromium Genome: Phase Compound Hets
Figure labels: Gly551Asp; ΔPhe508; chr7: 117,199,644; chr7: 117,227,860; TRANS; 28.2 kb
• NA11274 – Coriell sample from female affected with cystic fibrosis
• Compound heterozygous variants in the cystic fibrosis transmembrane conductance regulator (CFTR) gene
• Gene within 18 Mb phase block

Chromium Genome: Phase Large Deletions
• NA12878 – Coriell sample from female
• 128 Gb ILMN sequence data (2x150)
• N50 phase block – 8 Mb
• 99% of all SNPs phased

Chromium Genome: Phase Tandem Dupes
Figure panels: Standard Library, BWA; Chromium Genome, Lariat
• NA12878 – Coriell sample from female
• 128 Gb ILMN sequence data (2x150)
• N50 phase block – 8 Mb
• 99% of all SNPs phased

Chromium Genome: Detect SVs
• Automated calling of structural variants <50 bp & >30 kb
• Data supports detection of variants of all sizes
Figure panels: Heterozygous deletion (HCC1954T); Homozygous deletion (NA12878); Tandem duplication (HCC1143T); Homozygous inversion (NA12878)

Chromium Genome: Resolve Mapping Issues
• BWA core generates multiple candidate alignments and fails to assign reads
• Lariat aligner uses linked read data from 10x barcodes to rescue unmapped reads
Diagram labels: Repeat Locus 1; Repeat Locus 2; Repetitive Sequence; Unique Alignment; Ambiguous Alignments

Chromium Genome: Rescue Repeat Regions

Chromium Genome: Eliminate Trio Analysis
Figure panels: Trio validation of Lariat calls; Comparison of NA12878 aligned with Lariat and BWA
• NA12878 – Coriell sample from female child of NA12891 (P) and NA12892 (M)
• All three samples sequenced to 128 Gb (ILMN 2x150)

Linked Reads for Exome (Hybrid Capture)
• Average of 5 different HapMap samples including NA12878
• V6 bait set; down-sampled to 6 Gb after removing barcode sequence; NIST Confident regions
Figure panels: V6 Confident Performance; Rescuing Hard V6 Content. Y-axis: Percent Bases at Given Coverage (0–100%); x-axis: 1X, 5X, 10X, 20X Coverage
Agilent XT (200 ng) vs. Chromium Exome (1 ng)
Highly uniform and complex library from just 1 ng

Optimized SureSelect Baits - Coming Soon
• Optimized SureSelect baits designed specifically for the Chromium system to improve gene phasing by closing gaps and recovering hard-to-map loci in the genome.

Identify Fusion Genes From Exome Data
• H2228: Non-small cell lung cancer cell line

Linked Read Applications with Chromium: De Novo Assembly using Supernova

Supernova De Novo Assembly Workflow
Sequence Genome (Make 1 Library), from 1 ng DNA input:
• DNA extraction (1 hr - 3 days)
  – Assay requires 1 ng of HMW DNA
  – >50 kb acceptable, >100 kb preferred
• Library prep (2 days)
  – Single linked read library per sample
• Sequencers (<3 days)
  – 2x150 reads
  – HiSeq recommended
  – >50X coverage recommended
Supernova Assembler:
• Supernova Human Assembly (2 days)
  – Run on single, standalone CentOS/RedHat or Ubuntu system
  – At least 24 cores
  – 512 GB RAM
  – 2 TB disk

Characteristics Of Supported Genomes
• Genome size: 1–3.2 Gb total
• Ploidy
  – Haploid: not tested but likely to work well
  – Diploid: fully tested and supported
  – Polyploid: not tested, likely to require additional development
• Input should be from a single individual or clonal population
• Repeat content and G/C composition similar to human genome recommended.
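The sequencing recommendations for Supernova (2x150 reads, >50X coverage, genomes of 1 to 3.2 Gb) can be turned into a rough read budget. The following is a minimal sketch, not part of the 10x software: the function names are hypothetical, reads are counted individually (each read of a pair counts once), and depth is assumed to be raw sequenced bases divided by genome size, ignoring barcode bases, duplicates and unmapped reads.

```python
import math


def raw_coverage(num_reads: int, read_length_bp: int, genome_size_bp: float) -> float:
    """Raw depth: total sequenced bases divided by genome size (assumed definition)."""
    return num_reads * read_length_bp / genome_size_bp


def reads_for_coverage(target_coverage: float, read_length_bp: int,
                       genome_size_bp: float) -> int:
    """Smallest number of individual reads that reaches the target raw depth."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


if __name__ == "__main__":
    genome_bp = 3.2e9  # upper end of the supported 1-3.2 Gb range
    print("reads needed for 50x:", reads_for_coverage(50, 150, genome_bp))
    print("depth from 1,200M reads:", raw_coverage(1_200_000_000, 150, genome_bp))
```

Under these assumptions, 1,200M reads of 150 bp over a 3.2 Gb genome works out to about 56x, which is in line with the figure quoted for the NA12878 Supernova dataset in this deck.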
True Diploid Assembly From Linked Reads
Figure labels: Sequences from haplotype 1; Sequences from haplotype 2
Old Assembly Model: compress into a consensus
Supernova Assembly Model: represent both haplotypes
Church et al., 2011, PLoS Biology

Key Assembly Metrics/Terms
Figure labels: Multi-Mb Scaffold; Multi-kb Contig; Multi-Mb Phase Block
• Contig: an ungapped sequence
• Scaffold: a gapped sequence containing multiple contigs for which the order and orientation is asserted
• Phase Block: True diploid sequences; size is dependent on heterozygosity

Supernova De Novo Assembly of NA12878
Sample Input and Sequencing
• Input DNA: 1.25 ng
• Molecule size: 80.3 kb
• 2x150 bp on Illumina HiSeq X Ten
• 1200M reads (56x coverage)
Supernova Compute Requirements
• 28 cores (1 server) running for 48 hours
• 1,344 total core-hours
• 2 TB of data generated
Supernova Assembly
# of scaffolds >10 kb                 1651
N50 contig size                       103.67 kb
N50 scaffold size                     14.06 Mb
Assembly size (scaffolds >= 10 kb)    2.74 Gb
N50 phase block size                  1.96 Mb
View and download at http://software.10xgenomics.com/de-novo-assembly/overview/datasets

Phased De Novo Assembly

Supernova Assemblies From Diverse Genomes
Sample          Size (Gb)   DNA size (kb)   N50 contig (kb)   N50 scaffold (Mb)   HET SNP spacing (kb)   N50 phase block (Mb)
Human
NA12878         3.2         95.5            85.0              12.8                1.7                    2.8
NA24385         3.2         111.3           90.0              10.4                1.5                    3.9
HGP             3.2         138.8           104.9             19.4                1.5                    4.6
Yoruban         3.2         126.9           100.5             16.1                1.1                    11.4
Nonhuman
Komodo dragon   1.8         85.4            95.3              10.2                10.3                   0.4
Spotted owl     1.5         72.2            118.3             10.1                7.1                    0.2
Hummingbird     1.0         86.2            87.6              12.5                0.4                    10.1
Monk seal       2.6         92.3            93.8              14.8                17.7                   0.6
Chili pepper    3.5         53.3            84.7              4.0                 0.4                    2.1

Single Cell 3’ RNA-seq with Chromium

Gel Bead Oligo Delivery Substrate
Diagram labels: Functional oligo with barcode; Gel bead scaffold; P7; 10x Barcode; R2; Poly(dT)VN; High-diversity library
• Millions of copies of identical oligos
• Defined barcode sequence
• Built-in sequencing adapter and primer content

Partitioning and Barcoding of Single Cells
Diagram labels: Barcodes; Cell Suspension and RT Reagents; Oil; Single Cell GEMs; Collect;
Barcoded cDNA; Incubate and Recover
• High GEM fill ratio (~90% of droplets contain beads)
• Poisson loading of cells in GEMs
• Beads dissolve for efficient, liquid phase biochemistry
• Cell lysis starts immediately following encapsulation

Chromium Single Cell Sample Loading
Single-use microfluidics chip
• Up to 8 channels processed in parallel
• 1,000 to 6,000 recovered cells per channel
• 10 minute run time per chip
• ~50% cell processing efficiency
Chip wells: Outlet well; Bead well; Sample well; Oil well
User controlled trade-off between cell numbers and doublet rate
Number of Recovered Cells   Expected Doublet Rate (%)*
1,200                       ~1.2
3,000                       ~2.9
6,000                       ~5.7
*Expected Doublet Rate assuming an ideal single cell suspension

Chromium Single Cell: Data Reproducibility
Figure: Genes Detected per Cell (y-axis, 0 to 4,000) vs. Reads per Cell (x-axis, up to 100,000)
• Graph shows mean (black) and range (dark gray) over 16 independent experiments using HEK293T cells

Chromium Single Cell: Doublet Rates
Figure: Mouse Transcript Counts (y-axis) vs. Human Transcript Counts (x-axis), both 0 to 60,000; legend: Human:Mouse, Human only, Mouse only
• 1:1 mixture of ~1,400 human (HEK293T) and mouse (NIH3T3) cells
• 99.4% of cell-occupied GEMs yielded reads mapping to only one species
• 1% inferred doublet rate*
• *includes unobserved human:human and mouse:mouse doublets
• Number of cells detected: ~1400 cells, Number of raw reads per cell: ~130k

Chromium Single Cell: Cell Cycle POP
Figure: Combined Expression of Known Phase Markers; rows: G1/S, S, G2, G2/M, M/G1; color scale: Expression level; x-axis: HEK293T Cells Ordered by Inferred Cell Cycle Phase (0 to 400)
• Proliferating HEK293T cells were profiled and scored for expression of markers associated with each major cell cycle phase
• Cells from all phases were identified
• Phase-specific genes derived from Whitfield et al., 2002
• Number of cells detected: ~400 cells, Number of raw reads per cell: ~40k

Data Clustering: No Cell Sorting Needed
Figure panels: Unbiased Automatic Clustering of Three Breast Cell Lines; HER2 Expression Matches Expected Cell Line Status
Figure axes: t-SNE Projection; color scale: log(x+1) HER2 counts; clusters: HCC1954, HCC1143, HCC38
• Number of cells detected: ~1000 cells, Number of raw reads per cell: ~40k

Identifying Rare Cell Types
Figure: PC1 vs. PC2 plots (axes -20 to 20) for T-Cell (Jurkat), B-Cell (Raji) and the three mixtures; panel labels: 12%, 1.5%, 0.6%; cell counts shown: 618, 83, 1087, 17, 1314, 8
• Jurkat and Raji cells were combined at 9:1, 99:1 and 199:1 ratios and then profiled
• The minority Raji populations were identified in all three mixtures
• Number of cells detected: ~1000 cells, Number of raw reads per cell: ~60k

Increasing Cell Count Increases Resolution
Figure: t-SNE plots (TSNE1 vs. TSNE2) for Bulk RNA-Seq, 4,500 PBMCs, 16,000 PBMCs and 68,000 PBMCs

Major populations of PBMCs are detected
Figure: t-SNE plot (TSNE1 vs. TSNE2) with labeled clusters:
• CD45RA+ Naïve T (26.4%)
• CD4+ T (28.4%)
• CD8+ T (18.7%)
• CD56+ NK (13.5%)
• CD19+ B (5.5%)
• CD14+ Monocytes (5.3%)
• Dendritic (1.9%)
• CD34+ Progenitors (0.3%)

Chromium Single Cell: 68k PBMCs in One Run
• CD45RA+ Naïve T Cells
• CD4+ T Cells
• CD8+ T Cells
• CD14+ Monocytes
• CD19+ B Cells
• CD34+ Myeloid Progenitors
• CD56+ Natural Killer Cells
• Number of cells detected: ~68,000 cells, Number of raw reads per cell: ~21k

Chromium Software Suite Overview

Overview – Chromium Software Suite
Turn-key analysis and visualization software is included with all Chromium products:
Chromium Linked Read Data
  – Long Ranger Analysis Pipelines
  – Loupe Genome Browser
  – Supernova Assembler (Genome only)
Chromium Single Cell 3’ RNA-Seq
  – Cell Ranger Analysis Pipelines
  – Loupe Cell Browser (coming soon)

Chromium Software Platform
• All software products built on a common platform
  – Consistent look and feel and ease of use
• Linux-based analysis pipelines
  – Self-contained, simple to install
  – Run on workstations, and scale to clusters and cloud
  – Novel algorithms exploit Linked-Read data type
  – Multiple versions of pipelines can run side-by-side
• Mac and Windows-based desktop visualization apps

Linked Reads Informatics Workflow
• Complete software package for long range analysis
• Supports WGS and WES
• Outputs are standard formats plus Loupe visualization
Diagram labels: BCL; Long Ranger™ Analysis Pipelines; BAM, VCF, BEDPE (Standard Informatics: samtools, vcftools, bedtools); LOUPE (Loupe™ Genome Browser)

Long Ranger Pipeline
• Complete, standalone pipeline optimized for human WGS and WES
• Exploits barcodes for alignment, variant calling, phasing, SVs
• Builds upon standard components and file formats
• Runs on Linux, easy to download and run
Diagram labels: Barcode Processing; Alignment (10x Lariat™); Variant Calling (GATK, Freebayes); Phasing; SV Calling; Visualization (10x Loupe); file formats: BAM, VCF, BEDPE
Legend: Standard Pipeline Stages; 10x Genomics Stages; File Formats

Long Ranger System Requirements
Local Mode
  – Run on single, standalone Linux system
  – CentOS/RedHat 5.2+ or Ubuntu 8.04+
  – 16+ cores, 128 GB RAM, and 2 TB+ disk
Cluster Mode
  – Run on SGE and LSF
  – Each node must have 8+ cores and 8 GB+ RAM/core
  – Shared filesystem between nodes (e.g. NFS)
Runtime
  – 30X (100 Gb) Genome: 640 core-hrs (36 hrs on 20 cores)
  – 10 Gb Exome: 148 core-hrs

Loupe: Visualize Long Range Information
Loupe is a precision tool for inspecting GEMs
• Runs on output of Long Ranger pipeline
• Fully haplotype-enabled genome browser
• Visualization of breakpoints and structural variants
• Desktop application for Mac and Windows

Loupe: Summary View

Loupe: Haplotype View
Fluidly search for and browse haplotype-resolved variants at multiple loci at once

Loupe: Structural Variant View
Structural variant calls and candidates produced by Long Ranger

Supernova Informatics & Pipeline Overview
• Complete software package for De Novo assembly
• Demultiplexing to FASTA generation for contigs and scaffolds
Diagram labels: BCL; Supernova™ Analysis Pipelines; Barcode Processing; k-mer Graph Construction; Barcode-based
Graph Resolution; Chromosome Phasing; FASTA; File Formats

Supernova System Requirements
Local Mode
  – Run on single, standalone Linux system
  – CentOS/RedHat or Ubuntu
  – 24+ cores, 512 GB RAM, 2 TB+ disk
Runtime
  – 180 Gb genome: 500 core-hours (36 hr on 28 cores)

Single Cell Informatics Workflow
• Complete software package for single cell analysis
• Outputs are standard formats plus Loupe visualization
Diagram labels: BCL; Cell Ranger™ Analysis Pipelines; BAM, HDF5, MEX (Standard Informatics: samtools, Python, R); LOUPE (Loupe™ Cell Browser, coming in Q2 2016)

Cell Ranger Pipeline
• Complete, standalone single cell, gene expression pipeline
• Builds upon standard components and file formats
• Runs on Linux, easy to download and run
Diagram labels: Barcode Processing; Alignment (STAR); Transcript Counting; Gene-Cell Matrix; Expression Analysis; Visualization (Loupe, coming in Q2 2016); file formats: BAM, Matrix
Legend: Standard Pipeline Stages; 10x Genomics Stages; File Formats

Cell Ranger System Requirements
Local Mode
  – Run on single, standalone Linux system
  – CentOS/RedHat 5.2+ or Ubuntu 8.04+
  – 8+ cores, 64 GB RAM
Cluster Mode
  – Run on SGE and LSF
  – Each node must have 8+ cores and 8 GB+ RAM/core
  – Shared filesystem between nodes (e.g. NFS)
Runtime
  – 50 core-hours per 100M clusters
  – 5000 cells, 40k reads/cell: 95 core-hours

Software Website
For more information, downloads, file formats, and specifications: http://software.10xgenomics.com

Thank you! Questions?
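As a planning aid, the Cell Ranger runtime rule of thumb from the system requirements slide (50 core-hours per 100M clusters) can be written as a small calculation. This is a rough sketch under stated assumptions, not a 10x tool: the function names are hypothetical, one cluster is treated as one read, and wall-clock time assumes perfect scaling across cores, which real runs will not achieve.

```python
def estimate_core_hours(n_cells: int, reads_per_cell: int,
                        core_hours_per_100m_reads: float = 50.0) -> float:
    """Scale the quoted rule of thumb linearly with total read count (assumption)."""
    total_reads = n_cells * reads_per_cell
    return total_reads / 100_000_000 * core_hours_per_100m_reads


def wall_clock_hours(core_hours: float, cores: int) -> float:
    """Idealized wall-clock time, assuming the work divides evenly across cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return core_hours / cores


if __name__ == "__main__":
    # 5,000 cells at 40k reads per cell, the workload quoted on the slide
    hours = estimate_core_hours(5_000, 40_000)
    print("estimated core-hours:", hours)
    print("idealized hours on 8 cores:", wall_clock_hours(hours, 8))
```

For 5,000 cells at 40k reads per cell the linear rule gives 100 core-hours, close to the 95 core-hours quoted on the slide, so the estimate is best read as an order-of-magnitude guide.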