HISAT, StringTie and Ballgown (Part 2)

admin 33 2025-03-27 10:52:08

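A few steps of the HISAT, StringTie and Ballgown workflow were left over from yesterday's post, so today we finish them. This part mainly covers how to use Ballgown.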

12. Add gene names and gene IDs to the transcript results.

>results_transcripts = data.frame(geneNames=ballgown::geneNames(bg_chrX_filt),
    geneIDs=ballgown::geneIDs(bg_chrX_filt), results_transcripts)

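13. Sort the results by p-value, from smallest to largest. arrange() comes from the dplyr package, which must be loaded first.

>results_transcripts = arrange(results_transcripts,pval)

>results_genes = arrange(results_genes,pval)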
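To illustrate what this sort does, here is a minimal sketch on a toy table using base R's order(), which gives the same row order as arrange() for a simple ascending sort. The table `toy_results` and its values are made up for illustration and are not Ballgown output.

```r
# Toy results table (hypothetical values, not real Ballgown output)
toy_results <- data.frame(
  id   = c("t1", "t2", "t3", "t4"),
  pval = c(0.20, 0.001, 0.04, 0.50)
)

# Sort rows by ascending p-value, smallest first
toy_sorted <- toy_results[order(toy_results$pval), ]
rownames(toy_sorted) <- NULL  # reset row names after reordering
print(toy_sorted)
```

After sorting, the most significant rows sit at the top of the table, which is convenient when the file is opened in a spreadsheet.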

14. Save the results to files.

>write.csv(results_transcripts, "chrX_transcript_results.csv", row.names=FALSE)

>write.csv(results_genes, "chrX_gene_results.csv", row.names=FALSE)
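The effect of row.names=FALSE can be checked with a small round trip. This is a sketch on an invented table (`toy`) written to a temporary file. Without row.names=FALSE, write.csv would add an extra leading column of row numbers.

```r
# Toy table (hypothetical values)
toy <- data.frame(id = c("t1", "t2"), pval = c(0.01, 0.2))

# Write to a temporary file without the row-name column, then read it back
out_file <- tempfile(fileext = ".csv")
write.csv(toy, out_file, row.names = FALSE)
back <- read.csv(out_file)

print(colnames(back))  # only the original columns are present
```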

15. Identify transcripts and genes with a q-value < 0.05.

>subset(results_transcripts,results_transcripts$qval<0.05)

>subset(results_genes,results_genes$qval<0.05)
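A q-value is a p-value adjusted for multiple testing. The sketch below shows the idea on made-up p-values, assuming a Benjamini-Hochberg adjustment via base R's p.adjust(). Whether this matches the exact adjustment used to produce the qval column should be checked against the Ballgown documentation. The table `toy` is hypothetical.

```r
# Toy p-values (hypothetical, not real Ballgown output)
toy <- data.frame(
  id   = paste0("t", 1:5),
  pval = c(0.001, 0.01, 0.04, 0.2, 0.8)
)

# Benjamini-Hochberg adjustment (an assumption for this illustration)
toy$qval <- p.adjust(toy$pval, method = "BH")

# Keep rows with q-value below 0.05; inside subset() the column
# can be referenced by name directly
sig <- subset(toy, qval < 0.05)
print(sig)
```

Note that t3 has a p-value below 0.05 but does not pass the q-value cutoff, which is why filtering on qval is stricter than filtering on pval.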

16. Plotting. Start by defining a color palette.

>tropical= c('darkorange', 'dodgerblue', 'hotpink', 'limegreen', 'yellow')

>palette(tropical)

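17. Plot the distribution of FPKM values across samples, colored by sex. texpr() returns transcript-level expression, and the values are transformed to log2(FPKM+1) before plotting.

>fpkm = texpr(bg_chrX,meas="FPKM")

>fpkm = log2(fpkm+1)

>boxplot(fpkm,col=as.numeric(pheno_data$sex),las=2,ylab='log2(FPKM+1)')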
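The reason for adding 1 before taking the logarithm is easiest to see on a small example. This is a sketch with an invented matrix (`toy_fpkm`), where rows stand for transcripts and columns for samples.

```r
# Toy FPKM matrix (hypothetical values): 3 transcripts x 2 samples
toy_fpkm <- matrix(
  c(0, 1, 3, 7, 15, 1023),
  nrow = 3,
  dimnames = list(c("t1", "t2", "t3"), c("sampleA", "sampleB"))
)

# Adding 1 keeps unexpressed transcripts (FPKM = 0) at 0 instead of -Inf
toy_log <- log2(toy_fpkm + 1)
print(toy_log)
```

The transform compresses the very wide range of FPKM values, so that highly expressed transcripts do not dominate the boxplot.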

18. Plot the expression of a single transcript across samples. For example, here we show how to create a plot for the 12th transcript in the data set. The first two commands below show the name of the transcript (NM_012227) and the name of the gene that contains it (GTP binding protein 6, GTPBP6):

>ballgown::transcriptNames(bg_chrX)[12]

## 12

## "NM_012227"

>ballgown::geneNames(bg_chrX)[12]

## 12

## "GTPBP6"

>plot(fpkm[12,] ~ pheno_data$sex, border=c(1,2),
    main=paste(ballgown::geneNames(bg_chrX)[12],' : ',
    ballgown::transcriptNames(bg_chrX)[12]),pch=19, xlab="Sex",
    ylab='log2(FPKM+1)')

>points(fpkm[12,] ~ jitter(as.numeric(pheno_data$sex)),
    col=as.numeric(pheno_data$sex))
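The plotting calls above rely on as.numeric(pheno_data$sex) returning the integer codes of a factor, which are then used both as x positions and as indices into the color palette. The sketch below shows this on an invented `sex` vector. It also shows a possible pitfall: if the column was read as plain character rather than as a factor (an assumption about how pheno_data was loaded), as.numeric() returns NA and the colors are lost.

```r
# Palette from step 16
tropical <- c('darkorange', 'dodgerblue', 'hotpink', 'limegreen', 'yellow')

# Hypothetical phenotype column stored as a factor
sex <- factor(c("female", "male", "female", "male"))

# Factor levels are sorted alphabetically: female = 1, male = 2
codes <- as.numeric(sex)
point_colors <- tropical[codes]
print(point_colors)

# The same call on a character vector gives NA (with a warning)
bad <- suppressWarnings(as.numeric(c("female", "male")))
print(bad)
```

If the column turns out to be character, wrapping it in factor() before calling as.numeric() restores the behavior shown in the first part.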

19. Plot the structure and expression levels of all transcripts at one gene locus in a single sample.

>plotTranscripts(ballgown::geneIDs(bg_chrX)[1729], bg_chrX, main=c('Gene XIST in sample ERR188234'), sample=c('ERR188234'))

20. We can also use plotMeans to plot the average expression of all transcripts belonging to a gene, here grouped by sex.

>plotMeans('MSTRG.56', bg_chrX_filt,groupvar="sex",legend=FALSE)

That's all. In summary: This protocol does not require programming expertise, but it does assume familiarity with the Unix command-line interface and the ability to run basic R scripts. Users should be comfortable running programs from the command line and editing text files in the Unix environment.
