# Bootstrap

Validate output by making calculations by hand

**Example Calculations**

**\*.bootstrap**

These files give the bootstrap richness estimate as described by Smith and van Belle (5), implemented for a single "quadrat".

<math>S_{Bootstrap} = S_{obs} + \sum_{i=1}^{S_{obs}} \left ( 1 - \frac {S_i}{N}\right )^N </math>

where,

N = The number of individuals sampled

<math>S_{i}</math> = The number of sequences in the ith OTU

<math>S_{obs}</math> = Observed number of species

For the Amazonian dataset at distance 0.03, <math>S_{Bootstrap}</math> = 112.33. There is no simple expression for the 95% confidence interval.
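
To check the value by hand, the formula can be applied directly to the abundance counts. The following is a minimal sketch, not mothur's own implementation; the function name `bootstrap_richness` is hypothetical, and the input counts (75 OTUs with one sequence, 6 with two, 1 with three, 2 with four) are read from the 0.03 row of the .sabund sample for the Amazonian dataset.

```python
def bootstrap_richness(sabund_counts):
    """Bootstrap richness estimate from species-abundance counts.

    sabund_counts[k] is the number of OTUs that contain k + 1 sequences.
    (Hypothetical helper for hand-checking; not part of mothur.)
    """
    # N: total number of sequences sampled
    n = sum(size * otus for size, otus in enumerate(sabund_counts, start=1))
    if n == 0:
        return 0.0
    # S_obs: observed number of OTUs
    s_obs = sum(sabund_counts)
    # Each OTU with S_i sequences contributes (1 - S_i / N) ** N
    correction = sum(
        otus * (1 - size / n) ** n
        for size, otus in enumerate(sabund_counts, start=1)
    )
    return s_obs + correction


# Distance 0.03: S_obs = 84 OTUs, N = 98 sequences
estimate = bootstrap_richness([75, 6, 1, 2])
print(f"{estimate:.2f}")  # 112.33
```

Grouping OTUs by abundance class gives the same sum as iterating over every OTU, because all OTUs of the same size contribute the same term.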

**File Samples on the Amazonian Dataset**

- .sabund

This file contains data for constructing a rank-abundance plot of the OTU data for each distance level. The first column contains the distance. The second column is the number of abundance values that follow in the row, which equals the number of sequences in the largest OTU at that distance. The successive values in the row are the number of OTUs that were found once, twice, etc.

    unique 2 94 2
    0 2 92 3
    0.01 2 88 5
    0.02 4 84 2 2 1
    0.03 4 75 6 1 2
    0.04 4 69 9 1 2
    0.05 4 55 13 3 2
    0.06 4 48 14 2 4
    0.07 4 44 16 2 4
    0.08 7 36 15 4 2 1 0 1
    0.09 7 36 12 4 3 0 0 2
    0.1 7 35 12 2 3 0 0 3
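
A row of this file can be turned into the quantities used in the bootstrap formula with a few lines of code. This is a sketch under two assumptions: fields are whitespace-delimited, and, as the sample rows suggest, the second field gives the number of abundance values that follow. The function name `parse_sabund_row` is hypothetical.

```python
def parse_sabund_row(line):
    """Split one .sabund row into (label, counts).

    counts[k] is the number of OTUs that contain k + 1 sequences.
    (Hypothetical helper; assumes whitespace-delimited fields.)
    """
    fields = line.split()
    label, n_classes = fields[0], int(fields[1])
    counts = [int(value) for value in fields[2:]]
    if len(counts) != n_classes:
        raise ValueError(
            f"{label}: expected {n_classes} counts, got {len(counts)}"
        )
    return label, counts


label, counts = parse_sabund_row("0.08 7 36 15 4 2 1 0 1")
s_obs = sum(counts)  # observed number of OTUs
n_seqs = sum(size * otus for size, otus in enumerate(counts, start=1))
print(label, s_obs, n_seqs)  # 0.08 59 98
```

Every row of the sample should give the same total of 98 sequences, which is a quick consistency check on a hand-copied row.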

- .bootstrap

The first line contains the labels of all the columns. The first label, numsequences, heads the column giving the number of sequences that had been sampled when each calculation was made. The sampling frequency was set to 10, so the estimate is calculated at each distance after every 10 sequences sampled, with a final calculation after all sequences have been sampled. The remaining labels in the first line are the distances at which the calculations were made. Each additional line starts with the number of sequences sampled, followed by the bootstrap estimate at each column's distance. For instance, at distance 0.01, the bootstrap estimate after sampling 98 sequences was 125.87.

    numsequences  unique  0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09    0.1
    1             1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
    10            13.49   13.49   13.49   13.49   13.49   13.49   11.90   11.90   11.90   10.31   13.49
    20            27.17   27.17   27.17   27.17   27.17   22.38   25.57   24.13   23.98   20.94   22.54
    30            40.85   40.85   40.85   40.85   39.25   34.46   32.86   36.21   32.86   33.02   26.99
    40            54.53   54.53   51.33   52.93   51.33   46.69   44.94   43.49   37.10   40.45   33.24
    50            68.21   66.61   65.01   65.01   58.62   57.17   54.12   52.52   47.93   48.08   42.12
    60            80.29   80.29   74.04   78.69   66.25   64.45   61.40   63.00   55.66   56.96   49.76
    70            92.37   92.37   84.87   86.12   75.13   71.88   65.98   71.88   65.10   61.84   57.27
    80            106.05  104.45  95.50   98.40   85.91   79.36   75.36   82.36   69.93   63.62   60.83
    90            119.73  116.53  109.18  107.43  96.59   86.94   81.24   82.84   71.16   68.40   65.71
    98            130.67  125.87  120.12  112.33  107.53  95.03   87.59   84.39   74.39   72.01   69.55
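
To look up individual estimates when comparing against a hand calculation, the file can be read into a nested dictionary. This is a sketch that assumes one whitespace-delimited row per line with the labels on the first line; `read_bootstrap_table` is a hypothetical helper, and `sample` is a small excerpt of the table above rather than the full file.

```python
def read_bootstrap_table(text):
    """Parse .bootstrap content into {distance label: {numsequences: estimate}}.

    (Hypothetical helper; assumes whitespace-delimited rows, labels first.)
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header, data = rows[0], rows[1:]
    labels = header[1:]  # skip the "numsequences" label
    table = {label: {} for label in labels}
    for row in data:
        sampled = int(row[0])
        for label, value in zip(labels, row[1:]):
            table[label][sampled] = float(value)
    return table


# Excerpt: first four distance columns, last two rows
sample = """\
numsequences unique 0.01 0.02 0.03
90 119.73 116.53 109.18 107.43
98 130.67 125.87 120.12 112.33
"""
table = read_bootstrap_table(sample)
print(table["0.01"][98])  # 125.87
print(table["0.03"][98])  # 112.33
```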

**References**

5. Smith, E. P., and G. van Belle. 1984. Nonparametric estimation of species richness. Biometrics 40:119-129.