Summary.shared
The summary.shared command produces a summary file containing the calculator values for each line in the OTU data and for all possible pairwise comparisons between the groups in the group file. This is useful if you aren't interested in generating collector's or rarefaction curves for your multi-sample data. It is still worth looking at the collector's curves for the calculators you are interested in to determine how sensitive the values are to sampling. If the values are not sensitive to sampling, you can trust them; otherwise, you need to keep sampling. For this tutorial you should download and decompress Patient70Data.zip.
Default settings
To execute the summary.shared() command you first need to run the read.otu() command with the list and group options. For example:
mothur > read.otu(list=patient70.fn.list, group=patient70.tissue_stool.groups)
The summary data for multi-sample calculators are generated by default with the following command:
mothur > summary.shared()
This will result in output to the screen looking like:
unique  1
0.00    2
0.01    3
0.02    4
0.03    5
0.04    6
0.05    7
0.06    8
0.07    9
0.08    10
0.09    11
0.10    12
The left column indicates the label for each line in the data set and the right column indicates the row number in the data set. In sons, the summary data was provided in a file ending in "sons.ltt" and was only generated after the collector's curves were generated. Now, in mothur, all of this data is contained within a single "shared.summary" file. In this case data was written to the file patient70.fn.shared.summary, which looks like:
label   comparison      sharedsobs  sharedchao  sharedace  JAbund    SorAbund  Jclass    SorClass
unique  stool   tissue  73.000000   161.449997  108.60603  0.150565  0.261723  0.026613  0.051847
0.00    stool   tissue  124.000000  237.481247  254.53860  0.489131  0.656935  0.174402  0.297006
0.01    stool   tissue  94.000000   162.892853  135.36864  0.736210  0.848066  0.367188  0.537143
0.02    stool   tissue  76.000000   110.477272  86.50789   0.892669  0.943291  0.554745  0.713615
0.03    stool   tissue  60.000000   75.916664   72.30236   0.926541  0.961870  0.545455  0.705882
...
Again, the first column contains the label for the row in the data set you are analyzing. The second and third columns give the group names of the pairwise comparison that is represented by the row. The remaining columns are labeled with the calculator that was used to generate the data. For instance, the column labeled sharedsobs contains the number of OTUs that were observed to be shared between the groups for each line in the list file. This is just a snippet of the file; 11 calculators are run by default.
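Because the summary file is plain whitespace-delimited text, it can be loaded into a script for further analysis. The following Python sketch is not part of mothur. It assumes the layout shown in the output above (a label column, two group-name columns, then one numeric column per calculator), and the function name parse_shared_summary is hypothetical. The sample text is copied from two rows of that output.

```python
def parse_shared_summary(text):
    """Parse shared.summary text into a list of dicts, one per data row.

    Assumed layout (taken from the example output, not from a mothur spec):
    the header ends with the calculator names, and each data row holds the
    label, the two group names, and one numeric value per calculator.
    """
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        return []
    header, data = lines[0], lines[1:]
    n_calcs = len(data[0]) - 3      # columns after the label and two groups
    calcs = header[-n_calcs:]       # calculator names are the last headers
    rows = []
    for fields in data:
        row = {"label": fields[0], "groups": (fields[1], fields[2])}
        for name, value in zip(calcs, fields[3:]):
            row[name] = float(value)
        rows.append(row)
    return rows


# Two rows copied from the example output in this tutorial.
SAMPLE = """\
label comparison sharedsobs sharedchao sharedace JAbund SorAbund Jclass SorClass
unique stool tissue 73.000000 161.449997 108.60603 0.150565 0.261723 0.026613 0.051847
0.03 stool tissue 60.000000 75.916664 72.30236 0.926541 0.961870 0.545455 0.705882
"""

rows = parse_shared_summary(SAMPLE)
for row in rows:
    print(row["label"], row["groups"], row["sharedsobs"], row["Jclass"])
```

To read a real file, the same function could be given the contents of patient70.fn.shared.summary, for example with open("patient70.fn.shared.summary").read(). Because the calculator names are taken from the end of the header, the sketch does not depend on how the two group columns are titled.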
Options
calc
If you don't want to see all of the default calculators, you can tell mothur which ones to use in the summary file:
mothur > summary.shared(calc=sharedsobs-sharedchao-jest)
This would generate the patient70.fn.shared.summary file:
label   A      B       sharedsobs  sharedchao  Jest
unique  stool  tissue  73.000000   161.449997  0.008066
0.00    stool  tissue  124.000000  237.481247  0.219289
0.01    stool  tissue  94.000000   162.892853  0.546228
0.02    stool  tissue  76.000000   110.477272  0.665435
...
label
There may only be a couple of lines in your OTU data that you are interested in summarizing. There are two options. You could (i) manually delete the lines you aren't interested in from your rabund, sabund, or list file, or (ii) use the label option. To use the label option with the summary.shared() command you need to know the labels you are interested in. If you want the summary data for the lines labeled unique, 0.03, 0.05 and 0.10 you would enter:
mothur > summary.shared(label=unique-0.03-0.05-0.10, calc=sharedsobs-sharedchao)
Opening patient70.fn.shared.summary you would see the output as:
label   A      B       sharedsobs  sharedchao
unique  stool  tissue  73.000000   161.449997
0.03    stool  tissue  60.000000   75.916664
0.05    stool  tissue  51.000000   63.312500
0.10    stool  tissue  28.000000   33.416668
groups
If you had started this tutorial with the following commands:
mothur > read.otu(list=patient70.fn.list, group=patient70.sites.groups)
mothur > get.group()
You would have seen that there were 7 groups here: 70A-70F and 70S. The sequences from 70S were collected from Patient 70's stool sample; those from samples 70A-70F were from their mucosa. These 7 groups would yield 21 pairwise comparisons if you ran the summary.shared command; however, if you were only interested in the comparisons between each mucosa site and the stool sample, you could use the groups option:
mothur > summary.shared(calc=sharedsobs, groups=70A-70S)
mothur > summary.shared(calc=sharedsobs, groups=70B-70S)
mothur > summary.shared(calc=sharedsobs, groups=70C-70S)
mothur > summary.shared(calc=sharedsobs, groups=70D-70S)
mothur > summary.shared(calc=sharedsobs, groups=70E-70S)
mothur > summary.shared(calc=sharedsobs, groups=70F-70S)
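The count of 21 pairwise comparisons, and the six mucosa-versus-stool commands, can be reproduced with a short script. This Python sketch is only an illustration and is not part of mothur; the group names come from the tutorial text, and the variable names are arbitrary.

```python
from itertools import combinations

# Group names from the tutorial: six mucosa sites and one stool sample.
mucosa = ["70A", "70B", "70C", "70D", "70E", "70F"]
stool = "70S"
groups = mucosa + [stool]

# Every unordered pair of the 7 groups: 7 * 6 / 2 = 21 comparisons.
all_pairs = list(combinations(groups, 2))
print(len(all_pairs))

# Only the mucosa-versus-stool pairs, written out as summary.shared commands.
commands = [
    f"summary.shared(calc=sharedsobs, groups={site}-{stool})" for site in mucosa
]
for command in commands:
    print(command)
```

Generating the command strings this way is mainly useful when there are many groups and typing each pair by hand becomes error-prone.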
Alternatively, if you want all of the pairwise comparisons you can either omit the groups option or set it equal to "all":
mothur > summary.shared(calc=sharedsobs, groups=all)
all
The sharedsobs and sharedchao calculators not only compute the pairwise estimates, but can also estimate the shared richness of all the groups in your file. This calculation is RAM-intensive. If your RAM is limited and you have a large number of groups, it may result in a crash, so by default the all parameter is set to false. To calculate the shared richness of all your groups, set the all parameter to true:
mothur > summary.shared(calc=sharedsobs-sharedchao, all=true)