This script runs the following steps: generate rarefied OTU tables; compute alpha diversity metrics for each rarefied OTU table; collate alpha diversity results; and generate alpha rarefaction plots.
alpha_rarefaction.py
-i, input BIOM file
-m, mapping file
-o, output directory
-p, parameter file specifying which metrics to compute
-n, --num_steps
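Number of steps (or rarefied OTU table sizes) to make between min and max counts [default: 10]
-f, force overwrite of an existing output directory with the same name
-w, print the commands that would be run, but do not run them (useful for debugging)
-a, run in parallel
-t, phylogenetic tree file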
--min_rare_depth
The lower limit of rarefaction depths [default: 10]
-e, --max_rare_depth
The upper limit of rarefaction depths [default: median sequence/sample count]
-O, --jobs_to_start
Number of jobs to start. NOTE: you must also pass -a to run in parallel, this defines the number of jobs to be started if and only if -a is passed [default: 2]
--retain_intermediate_files
Retain intermediate files: rarefied OTU tables (rarefaction) and alpha diversity results (alpha_div). By default these will be erased [default: False]
Example:
(1) First, write the diversity metrics you want to compute into a text file:
echo "alpha_diversity:metrics shannon,PD_whole_tree,chao1,observed_species,goods_coverage,simpson" > alpha_params.txt
(2) Then run the script (it may take several hours):
alpha_rarefaction.py -i otu_table/otu_table.biom -m map.txt -o div_alpha/ -p alpha_params.txt -t rep_phylo.tre
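# Input files: otu_table.biom, rep_phylo.tre
# Output is written to div_alpha/
Open div_alpha/alpha_rarefaction_plots/rarefaction_plots.html in a web browser; there you can choose which plot to display.
The log file shows the commands that were called: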
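python /usr/lib/qiime/bin//multiple_rarefactions.py -i otu_table/otu_table.biom -m 10 -x 16544 -s 1653 -o div_alpha//rarefaction/
Sequences are drawn at random: the minimum depth defaults to 10 sequences, the maximum here is 16544 sequences, each step adds 1653 sequences, and the draw at each depth is repeated 10 times.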
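To make the numbers in that command concrete, here is a minimal Python sketch of the depth schedule and of a single random draw. It is not QIIME's code: the function names and the sample counts are made up for illustration, and two behaviours are assumptions on my part, namely that the maximum depth is treated as inclusive and that a sample with fewer sequences than the requested depth is skipped.

```python
import random
from collections import Counter

def rarefaction_depths(min_depth, max_depth, step):
    """Depths from min_depth up to max_depth in increments of step.

    Assumption: max_depth is inclusive if a step lands exactly on it.
    """
    return list(range(min_depth, max_depth + 1, step))

def rarefy(counts, depth, rng):
    """Draw `depth` sequences without replacement from one sample.

    counts maps OTU id -> number of sequences in that sample.
    Returns None if the sample has fewer than `depth` sequences
    (assumption: such samples are simply left out).
    """
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        return None
    return dict(Counter(rng.sample(pool, depth)))

depths = rarefaction_depths(10, 16544, 1653)
print(depths)
print(len(depths) * 10, "rarefied tables with 10 repetitions per depth")

sample = {"OTU_1": 40, "OTU_2": 25, "OTU_3": 5}  # made-up counts
print(rarefy(sample, 10, random.Random(0)))
```

With -m 10 -x 16544 -s 1653 the last depth this schedule reaches is 16540, because the next step would overshoot the maximum.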
# Alpha diversity on rarefied OTU tables command
python /usr/lib/qiime/bin//alpha_diversity.py -i div_alpha//rarefaction/ -o div_alpha//alpha_div/ --metrics shannon,PD_whole_tree,chao1,observed_species,goods_coverage,simpson -t rep_phylo.tre
sam@sam-Precision-WorkStation-T7500[mtt3] alpha_diversity.py -s
Known metrics are: ACE, berger_parker_d, brillouin_d, chao1, chao1_confidence, dominance, doubles, equitability, esty_ci, fisher_alpha, gini_index, goods_coverage, heip_e, kempton_taylor_q, margalef, mcintosh_d, mcintosh_e, menhinick, michaelis_menten_fit, observed_species, osd, simpson_reciprocal, robbins, shannon, simpson, simpson_e, singles, strong, PD_whole_tree
This lists all the alpha diversity metrics that are available.
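# Collate alpha command
python /usr/lib/qiime/bin//collate_alpha.py -i div_alpha//alpha_div/ -o div_alpha//alpha_div_collated/
# The previous step produces a folder of files, each containing several alpha diversity metrics. This step pulls the values for one metric out of every file, writes them to a file named after that metric, and puts these files in a new folder.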
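The regrouping done by the collate step can be pictured with a small in-memory sketch. This is only an illustration of the idea, not what collate_alpha.py actually does internally: the nested-dict layout, the file names, and the values are all made up.

```python
from collections import defaultdict

def collate(results_by_file):
    """Regroup {file: {metric: {sample: value}}}
    into {metric: {file: {sample: value}}}."""
    collated = defaultdict(dict)
    for filename, metrics in results_by_file.items():
        for metric, per_sample in metrics.items():
            collated[metric][filename] = per_sample
    return dict(collated)

# Hypothetical per-table results: one entry per rarefied OTU table.
results = {
    "alpha_rarefaction_10_0.txt": {
        "chao1": {"S1": 8.0, "S2": 6.5},
        "shannon": {"S1": 2.1, "S2": 1.7},
    },
    "alpha_rarefaction_10_1.txt": {
        "chao1": {"S1": 7.0, "S2": 6.0},
        "shannon": {"S1": 2.0, "S2": 1.8},
    },
}

by_metric = collate(results)
for metric, tables in by_metric.items():
    print(metric, sorted(tables))
```

After collating, everything needed to plot one metric against rarefaction depth sits under a single key.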
# Rarefaction plot: All metrics command
python /usr/lib/qiime/bin//make_rarefaction_plots.py -i div_alpha//alpha_div_collated/ -m map.txt -o div_alpha//alpha_rarefaction_plots/
Plotting: open div_alpha/alpha_rarefaction_plots/rarefaction_plots.html in a web browser and the results should be self-explanatory.
The metrics mentioned above:
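shannon: community diversity index
The Shannon-Wiener index is H = -∑ (Pi)(ln Pi)
Pi = the proportion of individuals in the sample that belong to species i; if the sample contains N individuals in total and ni of them belong to species i, then Pi = ni/N
The more evenly individuals are distributed among species, the larger H is. If every individual belongs to a different species, the index is at its maximum; if every individual belongs to the same species, it is at its minimum.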
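As a quick check of the Shannon formula, here is a small Python sketch that computes H from a list of per-species counts. The function name and the example counts are my own; the natural log follows the formula as written here, and the `base` parameter is included because QIIME's own output may be computed with a different log base (I have not verified which).

```python
import math

def shannon(counts, base=math.e):
    """Shannon index H = -sum(Pi * log(Pi)) over species with ni > 0."""
    n_total = sum(counts)
    h = 0.0
    for n_i in counts:
        if n_i > 0:
            p_i = n_i / n_total
            h -= p_i * math.log(p_i, base)
    return h

print(shannon([10]))          # one species only: lowest diversity
print(shannon([7, 1, 1, 1]))  # four species, uneven
print(shannon([1, 1, 1, 1]))  # every individual a different species: ln(4)
```

For a fixed number of species, the even distribution gives the largest H, which matches the description above.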
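dominance: the probability that two randomly drawn sequences belong to the same OTU, Σ(Si(Si-1))/(N(N-1)), where Si is the number of sequences in OTU i and N is the total number of sequences
simpson: community diversity index
Simpson diversity index = the probability that two randomly sampled individuals belong to different species
= 1 - the probability that two randomly sampled individuals belong to the same species
The more even the community, the larger the value.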
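The relationship between dominance and the Simpson index can be sketched in a few lines of Python. This follows the sampling-without-replacement formula given above; QIIME's own implementation may instead use the Σ Pi² form, so treat the function names and the exact numbers as illustrative assumptions rather than a reproduction of QIIME's output.

```python
def dominance(counts):
    """Probability that two sequences drawn without replacement
    belong to the same OTU: sum(ni * (ni - 1)) / (N * (N - 1))."""
    n_total = sum(counts)
    if n_total < 2:
        raise ValueError("need at least two sequences")
    same = sum(n_i * (n_i - 1) for n_i in counts)
    return same / (n_total * (n_total - 1))

def simpson(counts):
    """Probability that the two sequences belong to different OTUs."""
    return 1.0 - dominance(counts)

print(simpson([4]))     # a single OTU: 0.0
print(simpson([2, 2]))  # two equally abundant OTUs
print(simpson([3, 1]))  # same richness, less even, lower value
```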
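PD_whole_tree,
Phylogenetic alpha diversity (phylogenetic diversity, Faith 1992): examines the preservation of evolutionary history; applied in population, community, biogeography, and conservation biology studies.
Phylogenetic beta diversity (phylobetadiversity, Webb 2002): examines the phylogenetic distance between communities or regions and what causes it.
Phylogenetic signal and phylogenetic structure: examines the mechanisms of species coexistence in communities and regions.
Phylogenetic diversity (PD): the sum of the shortest evolutionary branch lengths connecting all species at a site, as a proportion of the total branch length over all nodes (Faith, 1992)
Community phylogenetic distance: the mean of the phylogenetic branch lengths between every pair of species taken from community I and community II (Webb, 2002)
PD_whole_tree: sum of branch lengths between all representatives (?)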
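One way to picture what a "sum of branch lengths" metric measures is the following toy Python sketch of Faith's phylogenetic diversity: add up the length of every branch that lies on a path from an observed tip to the root, counting shared branches once. The tree, its branch lengths, and the function name are made up, and whether QIIME's PD_whole_tree includes the path up to the root in exactly this way is an assumption I have not checked.

```python
def faith_pd(tree, observed_tips):
    """Total length of the branches connecting the observed tips to the root.

    tree maps each non-root node to (parent, branch_length).
    Each branch is counted once, even if several tips share it.
    """
    visited = set()
    total = 0.0
    for tip in observed_tips:
        node = tip
        # Walk towards the root; stop at the root or at a node whose
        # ancestors have already been counted.
        while node in tree and node not in visited:
            visited.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total

# Toy tree: ((A:1.0, B:2.0)n1:0.5, C:3.0)root
tree = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "n1": ("root", 0.5),
    "C": ("root", 3.0),
}

print(faith_pd(tree, ["A", "B"]))       # 1.0 + 2.0 + 0.5
print(faith_pd(tree, ["A", "C"]))       # 1.0 + 0.5 + 3.0
print(faith_pd(tree, ["A", "B", "C"]))  # every branch in the tree
```

Under this reading, a sample whose OTUs are spread across distant parts of the tree scores higher than a sample with the same number of closely related OTUs.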
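chao1: species richness index; estimates the number of OTUs in the community
Schao1 = Sobs + n1(n1-1)/(2(n2+1)), where Schao1 is the estimated number of OTUs, Sobs is the observed number of OTUs, n1 is the number of OTUs with only one sequence, and n2 is the number of OTUs with only two sequences.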
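The Chao1 formula above is easy to check by hand with a short Python sketch. The function name and the counts are made up for illustration; the code implements only the formula as written here, not the confidence-interval variant.

```python
def chao1(counts):
    """Schao1 = Sobs + n1 * (n1 - 1) / (2 * (n2 + 1))."""
    s_obs = sum(1 for n in counts if n > 0)  # observed OTUs
    n1 = sum(1 for n in counts if n == 1)    # OTUs with one sequence
    n2 = sum(1 for n in counts if n == 2)    # OTUs with two sequences
    return s_obs + n1 * (n1 - 1) / (2 * (n2 + 1))

# 5 observed OTUs, 3 with one sequence, 1 with two sequences:
# 5 + 3*2 / (2*2) = 6.5
print(chao1([1, 1, 1, 2, 5]))
# No rare OTUs, so the estimate equals the observed count.
print(chao1([3, 4]))
```

The more OTUs are seen only once relative to those seen twice, the more unseen OTUs the estimator adds on top of Sobs.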
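observed_species,
the number of OTUs
goods_coverage: sequencing depth (coverage) index
Coverage: C = 1 - n1/N, where n1 is the number of OTUs that contain only one sequence and N is the total number of sequences in the sample.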
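These last two metrics are simple enough to compute directly. A minimal Python sketch, with made-up function names and counts, following the definitions given above:

```python
def observed_species(counts):
    """Number of OTUs with at least one sequence."""
    return sum(1 for n in counts if n > 0)

def goods_coverage(counts):
    """C = 1 - n1 / N, with n1 = OTUs seen once, N = total sequences."""
    n_total = sum(counts)
    n1 = sum(1 for n in counts if n == 1)
    return 1.0 - n1 / n_total

counts = [1, 1, 1, 2, 5]  # 10 sequences, 3 OTUs seen only once
print(observed_species(counts))
print(round(goods_coverage(counts), 3))
```

A coverage close to 1 means few OTUs were seen only once, which suggests that sequencing deeper would turn up few new OTUs.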
PS: what exactly PD_whole_tree refers to here, and what it means, still needs some thought.
References:
http://qiime.org/scripts/alpha_rarefaction.html
multiple_rarefactions.py documentation: http://qiime.org/scripts/multiple_rarefactions.html
alpha_diversity.py documentation: http://qiime.org/scripts/alpha_diversity.html
collate_alpha.py documentation: http://qiime.org/scripts/collate_alpha.html
make_rarefaction_plots.py documentation: http://qiime.org/scripts/make_rarefaction_plots.html