
# Difference between revisions of "Jack"


## Revision as of 15:38, 14 January 2009

Validate output by making calculations by hand

**Example Calculations**

**\*.jack**

These files give the interpolated Jackknife estimate as described by Burnham and Overton (1). Because this calculation is complicated, it makes MOTHUR run much longer. Therefore, if you want the rarefied interpolated Jackknife estimate, you must tell MOTHUR to calculate it by using the "-j" flag at the command line. If it is not calculated, MOTHUR will still produce the \*.r_jack files, but they will be filled with zeros.

<math>S_{jack,k} = S_{obs} + \sum_{i=1}^k \left ( -1 \right )^{i+1} {k \choose i} n_i </math>

<math>var \left( S_{jack,k} \right ) = \sum_{i=1}^{n_1} \left ( a_{ik} \right )^2 n_i - S_{jack,k} </math>

<math>a_{ik} = \begin{cases} \left ( -1 \right )^{i+1} {k \choose i} + 1, & i = 1 \ldots k \\ 1, & i > k \end{cases}</math>

where,

<math>k</math> = the order of the Jackknife estimate

<math>n_t</math> = the number of sequences in the largest OTU
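
The k-th order estimate and its variance can be sketched in a few lines of Python. This is only an illustration of the formulas above, not MOTHUR's implementation. It assumes that n_i is the number of OTUs containing exactly i sequences (the text does not define n_i explicitly), and the function names and abundance counts are hypothetical.

```python
from math import comb

def a(i, k):
    """Coefficient a_ik: (-1)^(i+1) * C(k, i) + 1 for i <= k, otherwise 1."""
    if i <= k:
        return (-1) ** (i + 1) * comb(k, i) + 1
    return 1

def s_jack_k(n, k):
    """k-th order jackknife estimate.

    n maps an abundance i to n_i, assumed here to be the number of OTUs
    that contain exactly i sequences.
    """
    s_obs = sum(n.values())  # number of observed OTUs
    correction = sum((-1) ** (i + 1) * comb(k, i) * n.get(i, 0)
                     for i in range(1, k + 1))
    return s_obs + correction

def var_s_jack_k(n, k):
    """Variance of the k-th order estimate: sum(a_ik^2 * n_i) - S_jack,k."""
    return sum(a(i, k) ** 2 * n_i for i, n_i in n.items()) - s_jack_k(n, k)

# Hypothetical counts: 10 singletons, 5 doubletons, 2 tripletons,
# and one OTU with 7 sequences.
n = {1: 10, 2: 5, 3: 2, 7: 1}
for k in (1, 2):
    print(k, s_jack_k(n, k), var_s_jack_k(n, k))
```

Because the sum of a_ik n_i over all i equals S_jack,k, the estimate can also be computed directly from the coefficients, which is a handy check when doing the calculation by hand.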

To determine which order of the estimate to use, calculate the test statistic <math>T_k</math> for each order:

<math>T_k = \frac{S_{jack,k+1} - S_{jack,k}}{\left ( var \left( S_{jack,k+1} - S_{jack,k} | S \right )\right )^{0.5}}</math>

<math>var \left( S_{jack,k+1} - S_{jack,k} | S \right ) = \frac {S_{obs}}{S_{obs}-1} \left [ \sum_{i=1}^{n_1} \left ( b_{i}^2 n_i \right ) - \frac{\left ( S_{jack,k+1} - S_{jack,k} \right )^2 }{S_{obs}}\right ]</math>

where,

<math>b_i = a_{i,k+1}-a_{i,k}</math>
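
A minimal Python sketch of the test statistic follows. It assumes that n_i is the number of OTUs containing exactly i sequences and that the statistic is the difference between successive estimates divided by its standard error, that is, the square root of the conditional variance. The function names and counts are hypothetical.

```python
from math import comb, sqrt

def a(i, k):
    """Coefficient a_ik: (-1)^(i+1) * C(k, i) + 1 for i <= k, otherwise 1."""
    if i <= k:
        return (-1) ** (i + 1) * comb(k, i) + 1
    return 1

def s_jack_k(n, k):
    """k-th order estimate written as the sum of a_ik * n_i."""
    return sum(a(i, k) * n_i for i, n_i in n.items())

def t_statistic(n, k):
    """Test statistic T_k comparing the orders k and k + 1."""
    s_obs = sum(n.values())
    diff = s_jack_k(n, k + 1) - s_jack_k(n, k)
    # b_i = a_{i,k+1} - a_{i,k}
    sum_b2 = sum((a(i, k + 1) - a(i, k)) ** 2 * n_i for i, n_i in n.items())
    var_diff = s_obs / (s_obs - 1) * (sum_b2 - diff ** 2 / s_obs)
    return diff / sqrt(var_diff)

# Hypothetical counts: abundance i -> number of OTUs with i sequences
n = {1: 10, 2: 5, 3: 2, 7: 1}
print(t_statistic(n, 1))
```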

For each Tk value, calculate its two-sided p-value. Find the first k-value where Pk>0.05 and calculate c and d:

<math>c = \frac {0.05 - P_{k-1}}{P_k - P_{k-1}}</math>

<math>d_i = ca_{i,k} + \left( 1-c \right )a_{i,k-1}</math>

With c and d, calculate the interpolated Sjack and its standard error:

<math>S_{jack} = \sum_{i=1}^{n_1}d_i n_i</math>

<math>se \left ( S_{jack} \right ) = \left ( \sum_{i=1}^{n_1} \left ( d_{i}^2 n_i\right )-S_{jack}\right )^{0.5}</math>
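
The interpolation step can be sketched as follows. The two-sided p-value here uses a standard normal approximation for T_k, which is an assumption, since the text does not say which reference distribution is used. The weights c and 1 - c blend the coefficients of orders k and k - 1. The function names and inputs are hypothetical.

```python
from math import comb, erfc, sqrt

def two_sided_p(t):
    """Two-sided p-value, assuming T_k is approximately standard normal."""
    return erfc(abs(t) / sqrt(2))

def a(i, k):
    """Coefficient a_ik: (-1)^(i+1) * C(k, i) + 1 for i <= k, otherwise 1."""
    if i <= k:
        return (-1) ** (i + 1) * comb(k, i) + 1
    return 1

def interpolate(n, k, p_prev, p_k, alpha=0.05):
    """Interpolated estimate and standard error.

    n maps abundance i -> n_i (assumed: number of OTUs with i sequences),
    k is the first order whose p-value exceeds alpha,
    p_prev and p_k are the p-values for orders k - 1 and k.
    """
    c = (alpha - p_prev) / (p_k - p_prev)
    d = {i: c * a(i, k) + (1 - c) * a(i, k - 1) for i in n}
    s_jack = sum(d[i] * n[i] for i in n)
    se = sqrt(sum(d[i] ** 2 * n[i] for i in n) - s_jack)
    return c, s_jack, se

# Hypothetical counts and p-values, chosen only to exercise the code
n = {1: 10, 2: 5, 3: 2, 7: 1}
c, s_jack, se = interpolate(n, k=2, p_prev=0.01, p_k=0.09)
print(c, s_jack, se)
```

Since d_i is linear in the two sets of coefficients, the interpolated estimate equals c times S_jack,k plus (1 - c) times S_jack,k-1, which gives a quick way to verify the result.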

For the Amazonian dataset, you can calculate the following:

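| k | Sj,k | var | Tk | Pk |
|---|------|-----|----|----|
| 1 | 159 | 150 | 13.91 | <0.0001 |
| 2 | 228 | 450 | 8.89 | <0.0001 |
| 3 | 292 | 938 | 5.77 | <0.0001 |
| 4 | 350 | 1700 | 3.36 | 0.0008 |
| 5 | 399 | 2940 | 1.54 | 0.1235 |
| 6 | 434 | 5250 | | |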

The p-value crosses 0.05 between orders 4 and 5, which gives a c-value of 0.40 and an interpolated Sjack of 369.64 with a 95% confidence interval between 278.98 and 460.30. Note that programs like EstimateS and various microbial ecology papers present the first-order and/or second-order Jackknife estimate. This method instead uses a statistical procedure to determine which order results in the minimum bias (error).
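
The arithmetic for the Amazonian example can be checked from the rounded table values alone. Because d_i blends the coefficients of orders k and k - 1 with weights c and 1 - c, the interpolated estimate is the same blend of the two tabulated estimates. The implied standard error below assumes a normal-theory interval (estimate plus or minus 1.96 standard errors), which the text does not state.

```python
# Values read from the table for the Amazonian dataset (rounded as printed)
p4, p5 = 0.0008, 0.1235   # p-values for orders 4 and 5
s4, s5 = 350, 399         # jackknife estimates for orders 4 and 5

c = (0.05 - p4) / (p5 - p4)
s_interp = c * s5 + (1 - c) * s4

# Reported 95% confidence interval
ci_low, ci_high = 278.98, 460.30
midpoint = (ci_low + ci_high) / 2
implied_se = (ci_high - ci_low) / 2 / 1.96  # assumes a normal-theory interval

print(round(c, 2), round(s_interp, 2), round(midpoint, 2), round(implied_se, 2))
```

With the rounded inputs, c comes out at about 0.40 and the interpolated estimate at about 369.65, within rounding of the reported 369.64, which is also the midpoint of the reported interval.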