edgeR计算差异表达示例-行业观点-衍因科技官网

edgeR计算差异表达示例

admin 119 2025-02-12 14:20:08 编辑

实验数据：同一组织，分为两组，control vs treat，每组7例sample。数据列为基因名，后14列为对应的count。

bioconductor和edgeR包的安装

source(“http://bioconductor.org/biocLite.R“)biocLite(“edgeR”)library(“limma”)library(“edgeR”)

读取数据，方法随意

rawdata<-read.delim(“2.txt”,header=T)head(rawdata) #检查读入是否正确y<-DGEList(counts=rawdata[,2:15],genes=rawdata[,1])

过滤与标准化

left<-rowSums(cpm(y)>1)>=4 #过滤标准为至少one count per million (cpm)y<-y[left,]y<-DGEList(counts=y$counts,genes=y$genes)y<-calcNormFactors(y)#默认为TMM标准化

检查样本的outlier and relationship

y<-plotMDS(y)

设计design matrix

group<-factor(c(‘H’,’H’,’H’,’H’,’H’,’H’,’H’,’M’,’M’,’M’,’M’,’M’,’M’,’M’))design <- model.matrix(~group)y<-DGEList(counts=rawdata[,2:15],genes=rawdata[,1])

推测dispersion（离散度）

y<-estimateGLMCommonDisp(y,design,verbose=TRUE)y<-estimateGLMTrendedDisp(y, design)y<-estimateGLMTagwiseDisp(y, design)

差异表达基因，to perform quasi-likelihood F-tests:

fit <- glmQLFit(y,design)qlf <- glmQLFTest(fit,coef=2)topTags(qlf)#前10个差异表达基因

or 差异表达基因，to perform likelihood ratio tests:

fit<-glmFit(y, design)lrt<-glmLRT(fit)topTags(lrt)#前10个差异表达基因

火山图

summary(de<-decideTestsDGE(qlf))##qlf或可改为lrtdetags<-rownames(y)[as.logical(de)]plotSmear(qlf, de.tags=detags)abline,col=’blue’) #蓝线为2倍差异表达基因，差异表达的数据在qlf中

原文：http://blog.sina.com.cn/s/blog_5d188bc40102vwci.html

欢迎关注

NC重磅！CellChat：单细胞通讯分析工具！

634 2024-12-18

edgeR计算差异表达示例

bioconductor和edgeR包的安装

读取数据，方法随意

过滤与标准化

检查样本的outlier and relationship

设计design matrix

推测dispersion（离散度）

差异表达基因，to perform quasi-likelihood F-tests:

or 差异表达基因，to perform likelihood ratio tests:

火山图

推荐阅读

热门文章

最新文章

热门标签