# Sorclass

Validate output by making calculations by hand

**Example Calculations**

**\*.sharedSorClass**

The classic Sørensen index (<math>S_{class}</math>) estimates the fraction of OTUs shared between two communities. It is an incidence-based measure of community similarity: the ratio of twice the number of shared OTUs to the sum of the numbers of OTUs in the individual communities:

<math>S_{class} = \frac{2S_{12}}{S_1 + S_2}</math>,

where,

<math> S_1, S_2</math> = number of OTUs observed or estimated in A and B.

<math> S_{12} </math> = number of OTUs shared between A and B.

The observed number of OTUs at distance 0.03 in A and B was 89 and 81, respectively, and 60 OTUs were shared between the two libraries. Therefore <math>S_{class} = \frac{2 \times 60}{89 + 81} = \frac{120}{170} = 0.705882</math>, which matches the final value in the 0.03 column of the .sharedSorclass file shown below.
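The hand calculation above can be reproduced with a few lines of Python. This is a minimal sketch for checking the arithmetic, not mothur's own implementation; the function name `sorensen_classic` is a hypothetical helper introduced here for illustration.

```python
def sorensen_classic(s1: int, s2: int, s12: int) -> float:
    """Classic Sorensen similarity: 2 * S12 / (S1 + S2).

    s1, s2 -- number of OTUs in communities A and B
    s12    -- number of OTUs shared between A and B
    (Hypothetical helper, not part of mothur.)
    """
    if s1 + s2 == 0:
        raise ValueError("both communities are empty")
    return 2 * s12 / (s1 + s2)


# Values from the worked example: 89 and 81 OTUs, 60 shared.
value = sorensen_classic(89, 81, 60)
print(f"{value:.6f}")  # 0.705882
```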

**File Samples on the Eckburg 70.stool_compare Dataset**

- .shared

This file contains the frequency of sequences from each group found in each OTU. Each row consists of the distance being considered, the group name, the number of OTUs, and the abundance information, all separated by tabs. In the abundance information, each subsequent number represents a different OTU, and its value is the number of sequences from that group that clustered within that OTU. Note that OTU frequencies can only be compared within a distance definition. Below is a link to the file used in the calculations.

Media:/users/westcott/desktop/70.fn.shared
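To illustrate how the row layout described above leads to <math>S_{class}</math>, the following sketch parses two tab-separated rows and counts observed and shared OTUs. It assumes the layout is exactly as described (distance, group name, number of OTUs, then one abundance per OTU). The function names and the two sample rows are invented for illustration and are not taken from 70.fn.shared.

```python
def parse_shared_row(line: str):
    """Split one .shared row into (distance label, group, abundance list)."""
    fields = line.rstrip("\n").split("\t")
    label, group, num_otus = fields[0], fields[1], int(fields[2])
    counts = [int(x) for x in fields[3:3 + num_otus]]
    return label, group, counts


def sorensen_from_rows(row_a: str, row_b: str) -> float:
    """Observed classic Sorensen similarity for two rows at the same distance."""
    label_a, _, a = parse_shared_row(row_a)
    label_b, _, b = parse_shared_row(row_b)
    if label_a != label_b:
        raise ValueError("rows must come from the same distance definition")
    s1 = sum(1 for x in a if x > 0)                            # OTUs seen in A
    s2 = sum(1 for x in b if x > 0)                            # OTUs seen in B
    s12 = sum(1 for x, y in zip(a, b) if x > 0 and y > 0)      # OTUs seen in both
    if s1 + s2 == 0:
        raise ValueError("both groups are empty")
    return 2 * s12 / (s1 + s2)


# Hypothetical rows: 5 OTUs at distance 0.03 for two groups.
tissue = "0.03\ttissue\t5\t3\t0\t1\t2\t0"
stool = "0.03\tstool\t5\t1\t4\t0\t2\t0"
print(f"{sorensen_from_rows(tissue, stool):.6f}")  # 0.666667
```

In the sample rows each group has three non-empty OTUs and two of them are shared, so the result is 2 × 2 / (3 + 3).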

- .sharedSorclass

The first line contains the labels of all the columns. The first label, sampled, gives the number of sequences sampled when each <math>S_{class} </math> calculation was made. The sampling frequency was set to 500, so <math>S_{class} </math> is calculated at each of the distances after every 500 sequences selected, with a final calculation once all sequences have been sampled. The remaining labels in the first line combine the distance at which the calculations were made with the names of the groups compared. Each additional line starts with the number of sequences sampled, followed by the <math>S_{class} </math> value at each column's distance. For instance, at the unique distance, after 4392 sequences were sampled, <math>S_{class} </math> was 0.0463128.

| sampled | uniquetissuestool | 0.00tissuestool | 0.01tissuestool | 0.02tissuestool | 0.03tissuestool |
|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 0 |
| 500 | 0.0454545 | 0.201613 | 0.357616 | 0.545455 | 0.609756 |
| 1000 | 0.0643432 | 0.194226 | 0.411765 | 0.609929 | 0.576923 |
| 1500 | 0.0673527 | 0.2079 | 0.448133 | 0.699386 | 0.651163 |
| 2000 | 0.0609137 | 0.228873 | 0.492424 | 0.729282 | 0.666667 |
| 2500 | 0.0544057 | 0.241379 | 0.510345 | 0.731183 | 0.675497 |
| 3000 | 0.0479281 | 0.260116 | 0.51634 | 0.727273 | 0.666667 |
| 3500 | 0.0495222 | 0.262162 | 0.518519 | 0.724638 | 0.679012 |
| 4000 | 0.0462784 | 0.269182 | 0.542274 | 0.720379 | 0.678788 |
| 4392 | 0.0463128 | 0.279126 | 0.541311 | 0.71028 | 0.705882 |
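To compare a hand calculation against this output, the final row can be looked up programmatically. The sketch below assumes the file is whitespace-delimited with the header row described above. `read_sorclass` is a hypothetical helper, and the embedded text reproduces only the header and the last two rows of the example.

```python
def read_sorclass(text: str) -> dict:
    """Map each 'sampled' count to a {column label: S_class value} dict."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    columns = lines[0].split()[1:]          # skip the 'sampled' label
    table = {}
    for ln in lines[1:]:
        fields = ln.split()
        values = [float(x) for x in fields[1:]]
        table[int(fields[0])] = dict(zip(columns, values))
    return table


sample = """sampled uniquetissuestool 0.00tissuestool 0.01tissuestool 0.02tissuestool 0.03tissuestool
4000 0.0462784 0.269182 0.542274 0.720379 0.678788
4392 0.0463128 0.279126 0.541311 0.71028 0.705882
"""

table = read_sorclass(sample)
final_row = table[max(table)]               # row with the most sequences sampled
print(final_row["0.03tissuestool"])         # 0.705882

# Agrees with the hand calculation 2 * 60 / (89 + 81) to six decimal places.
assert abs(final_row["0.03tissuestool"] - 2 * 60 / (89 + 81)) < 1e-6
```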