Tutorial on Microarray Analysis using Bioconductor and R (Sample Study)
341: Introduction to Bioinformatics
February 11, 2011

Tools Used

• Bioconductor, http://www.bioconductor.org/, provides open source tools for the analysis and comprehension of high-throughput genomic data
• R, http://www.r-project.org/, is a free software environment for statistical computing and graphics

Data Used

• Resveratrol effect on lung carcinoma cell line: analysis of lung carcinoma A549 cells treated with resveratrol. Resveratrol is a phytoestrogen found in red wine. Results provide insight into the protective effect of resveratrol against lung cancer. Available from the Gene Expression Omnibus (GEO): http://www.ncbi.nlm.nih.gov/sites/GDSbrowser?acc=GDS2966
• 4 CEL files (microarray output data) {GSM228717.CEL, GSM228718.CEL, GSM228719.CEL, GSM228720.CEL}, available from the above link
• 1 Sample Phenotype Mapping {GDS296-pheno.csv}

Tasks

1. Import and normalise data
2. Compare tissue treated with resveratrol with similar tissue treated with an ethanol control
3. Filter the most differentially expressed genes
4. Cluster and view

Analysis R Code

    ## Step 1 - Load packages
    library(affy)
    library(limma)

    ## Step 2 - Import Sample Data
    setwd("C:/Users/asrowe/Documents/Tutorial/celfiles")
    phenoData <- read.AnnotatedDataFrame("GDS296-pheno.csv", header=TRUE, sep="\t")

    ## Step 3 - Import and Normalise Data using functions from the Bioconductor affy package
    eset <- justRMA(phenoData=phenoData)

    ## Step 4 - Differential expression filtering using Bioconductor limma package
    design <- model.matrix(~substance, pData(eset))
    fit <- lmFit(eset, design)     # fit each probeset to the model
    efit <- eBayes(fit)            # empirical Bayes adjustment
    tt <- topTable(efit, coef=2)   # table of differentially expressed probesets
    fix(tt)                        # view the table

    ## Step 5 - Hierarchical Cluster and Dendrogram from the R stats package
    selected <- p.adjust(efit$p.value[, 2]) < 0.05   # select probesets with adjusted p-value below 0.05 (p.adjust defaults to the Holm method)
    esetSel <- eset[selected, ]                      # keep only the selected probesets
    heatmap(exprs(esetSel))                          # display heatmap

GDS296-pheno.csv

    Sample       substance
    GSM228717    Control
    GSM228718    Wine
    GSM228719    Wine
    GSM228720    Wine

Notes:

For those interested in more information on normalisation methods and why RMA (justRMA) is used, start here:
Bolstad, B.M., Irizarry, R.A., Astrand, M., and Speed, T.P. (2003), A Comparison of Normalization Methods for High Density Oligonucleotide Array Data Based on Bias and Variance. Bioinformatics 19(2):185-193
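
Worked example (base R only):

Two steps in the analysis code are easy to misread: why topTable is called with coef=2, and what p.adjust does before the 0.05 cut-off. The sketch below illustrates both using only base R, so it runs without the CEL files or Bioconductor. It is an illustration under stated assumptions, not part of the original handout: the phenotype table is rebuilt inline from the listing above, the probe names and p-values are hypothetical stand-ins for efit$p.value[, 2], and the Benjamini-Hochberg adjustment is shown only as a common alternative that the original code does not use.

```r
# Phenotype table rebuilt inline from the GDS296-pheno.csv listing
# (assumption: built here so the sketch runs without the file).
pheno <- data.frame(
  substance = c("Control", "Wine", "Wine", "Wine"),
  row.names = c("GSM228717", "GSM228718", "GSM228719", "GSM228720"),
  stringsAsFactors = TRUE
)

# Same formula as Step 4. Factor levels sort alphabetically, so "Control"
# is the baseline: column 1 is the intercept and column 2 is the
# Wine-versus-Control effect, which is why Step 4 asks for coef=2.
design <- model.matrix(~substance, pheno)
print(design)

# Hypothetical raw p-values standing in for efit$p.value[, 2].
p <- c(probeA = 0.0001, probeB = 0.004, probeC = 0.02,
       probeD = 0.20, probeE = 0.75)

# p.adjust() with no method argument applies the Holm correction,
# which is what Step 5 uses implicitly.
p_holm <- p.adjust(p)

# Alternative not used in the original code: Benjamini-Hochberg,
# which controls the false discovery rate and is less conservative.
p_bh <- p.adjust(p, method = "BH")

# Same thresholding as Step 5, applied to each adjustment.
selected_holm <- p_holm < 0.05
selected_bh <- p_bh < 0.05
print(data.frame(p, p_holm, p_bh, selected_holm, selected_bh))
```

With these toy values, the Holm adjustment keeps two probes and Benjamini-Hochberg keeps three. This shows that the number of rows reaching the heatmap in Step 5 depends on the adjustment method as well as on the 0.05 threshold.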