

From mothur
Revision as of 14:24, 15 January 2009 by Westcott


Validate output by making calculations by hand

Example Calculations

*.sharedJclass


Estimating the fraction of shared OTUs between two communities. Incidence-based measures of community similarity, such as the classic Jaccard (Jclass) similarity index, calculate the ratio of the number of shared OTUs to the total number of distinct OTUs across the two communities:

<math>J_{class} = \frac{S_{12}}{S_1 + S_2 - S_{12}}</math>


<math> S_1, S_2</math> = number of OTUs observed or estimated in communities A and B, respectively.

<math> S_{12} </math> = number of OTUs shared between A and B.

The observed numbers of OTUs in A and B were 89 and 81, respectively, and 60 OTUs were shared between the two libraries. Therefore <math>J_{class} = \frac{60}{89 + 81 - 60} = \frac{60}{110} \approx 0.545455</math>.
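
The hand calculation can also be checked with a few lines of Python. This is a minimal sketch, not part of mothur: the function name `jclass` is chosen here for illustration, and the inputs are the counts quoted in the paragraph above.

```python
def jclass(s1: int, s2: int, s12: int) -> float:
    """Classic Jaccard similarity from OTU counts.

    s1, s2 -- number of OTUs in communities A and B
    s12    -- number of OTUs shared between A and B
    """
    union = s1 + s2 - s12  # distinct OTUs across both communities
    if union <= 0:
        raise ValueError("the two communities contain no OTUs")
    return s12 / union


# Counts from the worked example: 89 and 81 OTUs, 60 shared.
value = jclass(89, 81, 60)
print(f"{value:.6f}")  # 0.545455
```

The denominator subtracts the shared count once so that OTUs present in both communities are not counted twice.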

File Samples on the Eckburg 70.stool_compare Dataset

  • .shared

This file contains the frequency of sequences from each group found in each OTU. Each row consists of the distance being considered, the group name, the number of OTUs, and the abundance information, all separated by tabs. In the abundance information, each successive number corresponds to a different OTU and gives the number of sequences from that group that clustered within that OTU. Note that OTU frequencies can only be compared within a single distance definition. Below is a link to the file used in the calculations.
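
As an illustration of the row layout described above, the following sketch parses two rows and derives the incidence counts needed for Jclass. It assumes exactly the layout given in the text (distance, group name, number of OTUs, then tab-separated abundances, with no header line). The function names and the two sample rows are hypothetical and are not taken from the Eckburg data.

```python
def parse_shared_line(line: str):
    """Split one .shared row into (distance, group, abundances)."""
    fields = line.rstrip("\n").split("\t")
    distance, group, n_otus = fields[0], fields[1], int(fields[2])
    # Read exactly n_otus abundance values; ignores any trailing empty field.
    abundances = [int(x) for x in fields[3:3 + n_otus]]
    return distance, group, abundances


def jclass_from_abundances(a, b):
    """Classic Jaccard index from two abundance vectors at one distance."""
    s1 = sum(1 for x in a if x > 0)   # OTUs present in the first group
    s2 = sum(1 for x in b if x > 0)   # OTUs present in the second group
    s12 = sum(1 for x, y in zip(a, b) if x > 0 and y > 0)  # shared OTUs
    union = s1 + s2 - s12
    if union == 0:
        raise ValueError("neither group contains any OTUs")
    return s12 / union


# Hypothetical rows for groups "A" and "B" at distance 0.03.
rows = [
    "0.03\tA\t5\t4\t0\t2\t1\t0",
    "0.03\tB\t5\t1\t3\t0\t2\t0",
]
(_, _, a), (_, _, b) = (parse_shared_line(r) for r in rows)
print(jclass_from_abundances(a, b))  # 0.5
```

Only presence or absence matters for this index, so the abundances are reduced to counts of non-zero entries. Both rows must come from the same distance for the comparison to be meaningful.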


  • .sharedJclass

The first line contains the labels of all the columns. The first label, sampled, marks the column giving the number of sequences sampled at each <math>J_{class} </math> calculation. The sampling frequency was set to 500, so <math>J_{class} </math> is calculated at each of the distances after every 500 sequences sampled, with a final calculation done after all sequences are sampled. The remaining labels in the first line give the distances at which the calculations were made and the names of the groups compared. Each additional line starts with the number of sequences sampled, followed by the <math>J_{class} </math> value for each column's distance. For instance, at distance 0.01, after 4392 samples <math>J_{class} </math> was 0.371094.
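
A file with this layout could be loaded into a lookup table along the following lines. This is a sketch under stated assumptions: the columns are taken to be tab-separated, the column label `0.01_A_B` is a made-up placeholder for a distance plus group names, and the row for 500 samples is invented for illustration. Only the value 0.371094 at 4392 samples comes from the text above.

```python
import csv
import io


def read_sharedjclass(handle):
    """Return {sampled: {column_label: jclass_value}} from a .sharedJclass file."""
    reader = csv.reader(handle, delimiter="\t")  # assumption: tab-separated
    header = next(reader)
    columns = header[1:]  # labels after the leading "sampled" column
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        sampled = int(row[0])
        table[sampled] = {c: float(v) for c, v in zip(columns, row[1:])}
    return table


# Hypothetical file contents; see the assumptions in the lead-in.
sample = "sampled\t0.01_A_B\n500\t0.250000\n4392\t0.371094\n"
table = read_sharedjclass(io.StringIO(sample))
print(table[4392]["0.01_A_B"])  # 0.371094
```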