We will be offering mothur and R workshops throughout 2019. Learn more.

Bstick

From mothur
Jump to: navigation, search

The bstick calculator returns the Kolmogorov-Smirnov test statistic for the comparison of observed rank-abundance data to a fitted broken stick distribution and the critical values for α equal to 0.01 or 0.05. This calculator can be used in the summary.single, collect.single, and rarefaction.single commands.


<math>S_i=\frac{N}{S_{obs}} \sum_{n=i}^{S_{obs}}\frac{1}{n}</math>

where,

<math>S_i</math> = number of individuals in the ith OTU

<math>N</math> = the total number of individuals

<math>S_{obs}</math> = total number of observed OTUs


Open the file 98_lt_phylip_amazon.fn.sabund generated using the Amazonian dataset with the following commands:

mothur > read.dist(phylip=98_lt_phylip_amazon.dist, cutoff=0.10)
mothur > cluster()

The 98_lt_phylip_amazon.fn.sabund file is also outputted to the terminal window when the cluster() command is executed:

unique	2	94	2	
0.00	2	92	3	
0.01	2	88	5	
0.02	4	84	2	2	1	
0.03	4	75	6	1	2	
0.04	4	69	9	1	2	
0.05	4	55	13	3	2	
0.06	4	48	14	2	4	
0.07	4	44	16	2	4	
0.08	7	35	17	3	2	1	0	1	
0.09	7	35	14	3	3	0	0	2	
0.10	7	34	13	3	2	0	0	3	

The first column is the label for the OTU definition and the second column is an integer indicating the number of sequences in the dominant OTU. The numbers in the subsequent columns indicate the number of singletons, doubletons, etc. Here we will calculate the expected number of individuals in each OTU based on the broken stick distribution:

<math>S_i=\frac{98}{55} \sum_{n=i}^{55}\frac{1}{n}</math>


OTU Rank Indiv. Obs. Expected Cum. Obs. Cum. Exp. Difference
1 7 8.18 7 8.18 1.18
2 7 6.40 14 14.59 0.59
3 7 5.51 21 20.10 0.90
4 4 4.92 25 25.02 0.02
5 4 4.47 29 29.49 0.49
6 3 4.12 32 33.61 1.61
7 3 3.82 35 37.43 2.43
8 3 3.56 38 40.99 2.99
9 2 3.34 40 44.33 4.33
10 2 3.14 42 47.48 5.48
11 2 2.97 44 50.45 6.45
12 2 2.80 46 53.25 7.25
13 2 2.66 48 55.90 7.90
14 2 2.52 50 58.42 8.42
15 2 2.39 52 60.81 8.81
16 2 2.27 54 63.09 9.09
17 2 2.16 56 65.25 9.25
18 2 2.06 58 67.30 9.30
19 2 1.96 60 69.26 9.26
20 2 1.86 62 71.13 9.13
21 2 1.77 64 72.90 8.90
22 1 1.69 65 74.59 9.59
23 1 1.61 66 76.20 10.20
24 1 1.53 67 77.73 10.73
25 1 1.46 68 79.19 11.19
26 1 1.39 69 80.57 11.57
27 1 1.32 70 81.89 11.89
28 1 1.25 71 83.14 12.14
29 1 1.19 72 84.33 12.33
30 1 1.13 73 85.45 12.45
31 1 1.07 74 86.52 12.52
32 1 1.01 75 87.53 12.53
33 1 0.95 76 88.48 12.48
34 1 0.90 77 89.38 12.38
35 1 0.85 78 90.23 12.23
36 1 0.80 79 91.03 12.03
37 1 0.75 80 91.77 11.77
38 1 0.70 81 92.47 11.47
39 1 0.65 82 93.12 11.12
40 1 0.61 83 93.73 10.73
41 1 0.56 84 94.29 10.29
42 1 0.52 85 94.81 9.81
43 1 0.48 86 95.28 9.28
44 1 0.43 87 95.72 8.72
45 1 0.39 88 96.11 8.11
46 1 0.35 89 96.47 7.47
47 1 0.32 90 96.78 6.78
48 1 0.28 91 97.06 6.06
49 1 0.24 92 97.30 5.30
50 1 0.20 93 97.50 4.50
51 1 0.17 94 97.67 3.67
52 1 0.13 95 97.80 2.80
53 1 0.10 96 97.90 1.90
54 1 0.07 97 97.97 0.97
55 1 0.03 98 98.00 0.00

To determine whether the broken stick describes the distribution of individuals among OTUs as we observed, we will use the Kolmogorov-Smirnov test statistic (<math>D_{max}</math>). The statistic is the maximum difference between the cumulative observed and expected values (i.e. 12.53) divided by the total number of individuals observed (i.e. 98). So for this case the value was 0.1279. To test this statistic we can calculate the critical value for α=0.05 as 0.886√Sobs or 0.1195 and α=0.01 as 1.031√Sobs or 0.1390. Because our calculated value falls between the two critical values we are confident (0.01<P<0.05) that the observed and expected values are significantly different. Thus, we can reject the hypothesis that the observed data follows the broken stick distribution.


Running...

mothur > summary.single(calc=bstick)


...and opening 98_lt_phylip_amazon.fn.summary gives:


label	bstick		bstick_lci	bstick_hci
unique	0.351821	0.090427	0.105226
0.00	0.345455	0.090902	0.105778
0.01	0.332841	0.091874	0.106910
0.02	0.308102	0.093916	0.109286
0.03	0.278116	0.096671	0.112491
0.04	0.260635	0.098444	0.114556
0.05	0.216118	0.103698	0.120669
0.06	0.189863	0.107443	0.125027
0.07	0.179733	0.109059	0.126907
0.08	0.145906	0.115347	0.134225
0.09	0.136796	0.117354	0.136559
0.10	0.127853	0.119468	0.139020 <---


In this table the data in the column "bstick" are the calculated statistic value, those in column "bstick_lci" are the critical values for α=0.01, and those in column "bstick_hci" are the critical values for α=0.05. These are the same values that we found above for a cutoff of 0.10.