Difference between revisions of "Summary.shared"

Revision as of 18:50, 27 January 2010

The summary.shared command produces a summary file containing the calculator values for each line in the OTU data and for all possible comparisons between the groups in the group file. This is useful if you aren't interested in generating collector's or rarefaction curves for your multi-sample data analysis. It is still worth your while to look at the collector's curves for the calculators you are interested in to determine how sensitive the values are to sampling. If the values are not sensitive to sampling, you can trust them; otherwise, you need to keep sampling. For this tutorial you should download and decompress Patient70Data.zip.



Default settings

To execute the summary.shared() command you first need to run the read.otu() command with the list and group options. For example:

mothur > read.otu(list=patient70.fn.list, group=patient70.tissue_stool.groups)

The summary data for multi-sample calculators are generated by default with the following command:

mothur > summary.shared()

This will result in output to the screen looking like:

unique	1
0.00	2
0.01	3
0.02	4
0.03	5
0.04	6
0.05	7
0.06	8
0.07	9
0.08	10
0.09	11
0.10	12

The left column indicates the label for each line in the data set and the right column indicates the row number in the data set. In sons, the summary data was provided in a file ending in "sons.ltt" and was only generated after the collector's curves were generated. Now, in mothur, all of this data is contained within a single "shared.summary" file. In this case data was written to the file patient70.fn.shared.summary, which looks like:

label	comparison	sharedsobs	sharedchao	sharedace	JAbund		SorAbund	Jclass		SorClass
unique	stool	tissue	73.000000	161.449997	108.60603	0.150565	0.261723	0.026613	0.051847
0.00	stool	tissue	124.000000	237.481247	254.53860	0.489131	0.656935	0.174402	0.297006
0.01	stool	tissue	94.000000	162.892853	135.36864	0.736210	0.848066	0.367188	0.537143
0.02	stool	tissue	76.000000	110.477272	86.50789	0.892669	0.943291	0.554745	0.713615
0.03	stool	tissue	60.000000	75.916664	72.30236	0.926541	0.961870	0.545455	0.705882
...

Again, the first column contains the label for the row in the data set you are analyzing. The second and third columns give the names of the two groups in the pairwise comparison that the row represents. The remaining columns are labeled with the calculator that was used to generate the data. For instance, the column labeled sharedsobs contains the number of OTUs observed to be shared between the groups for each line in the list file. This is just a snippet of the file; 11 calculators are computed by default.
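The column layout described above can be read back with a few lines of Python. This is only a minimal sketch: the function name parse_shared_summary is hypothetical, and it assumes the file is whitespace-delimited, that each data row starts with the label and two group names, and that the trailing header names line up with the numeric calculator columns.

```python
def parse_shared_summary(text):
    """Parse shared.summary text into a list of dict rows (assumed layout)."""
    # Drop blank lines and the "..." used to mark a truncated listing.
    lines = [ln for ln in text.splitlines() if ln.strip() and ln.strip() != "..."]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        label, group_a, group_b = fields[:3]
        values = [float(v) for v in fields[3:]]
        # Assumption: the last len(values) header names are the calculators,
        # whatever the label/comparison columns are called.
        calc_names = header[len(header) - len(values):]
        row = {"label": label, "groups": (group_a, group_b)}
        row.update(zip(calc_names, values))
        rows.append(row)
    return rows


# Two rows copied from the listing above, cut down to two calculators.
SAMPLE = (
    "label\tcomparison\tsharedsobs\tsharedchao\n"
    "unique\tstool\ttissue\t73.000000\t161.449997\n"
    "0.03\tstool\ttissue\t60.000000\t75.916664\n"
)

rows = parse_shared_summary(SAMPLE)
print(rows[0]["label"], rows[0]["groups"], rows[0]["sharedsobs"])
```

Keeping the label as a string preserves values such as "unique" and "0.10" exactly as mothur wrote them.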

Options

calc

If you don't want to see all of the default calculators, you can tell mothur which ones to use in the summary file:

mothur > summary.shared(calc=sharedsobs-sharedchao-jest)

This would generate the patient70.fn.shared.summary file:

label	A	B		sharedsobs	sharedchao	Jest
unique	stool	tissue		73.000000	161.449997	0.008066
0.00	stool	tissue		124.000000	237.481247	0.219289
0.01	stool	tissue		94.000000	162.892853	0.546228
0.02	stool	tissue		76.000000	110.477272	0.665435
 ...


label

There may only be a couple of lines in your OTU data that you are interested in summarizing. There are two options. You could (i) manually delete the lines you aren't interested in from your rabund, sabund, or list file, or (ii) use the label option. To use the label option with the summary.shared() command you need to know the labels you are interested in. If you want the summary data for the lines labeled unique, 0.03, 0.05 and 0.10 you would enter:

mothur > summary.shared(label=unique-0.03-0.05-0.10, calc=sharedsobs-sharedchao)

Opening patient70.fn.shared.summary you would see the output as:

label	A	B		sharedsobs	sharedchao
unique	stool	tissue		73.000000	161.449997
0.03	stool	tissue		60.000000	75.916664
0.05	stool	tissue		51.000000	63.312500
0.10	stool	tissue		28.000000	33.416668
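The label and calc options both take a hyphen-separated list of values. If you assemble such commands in a script, a small helper along these lines may be convenient. This is an illustrative sketch only: build_summary_shared is a hypothetical name, not part of mothur, and it merely produces the text you would type at the mothur prompt.

```python
def build_summary_shared(labels=None, calcs=None):
    """Return a summary.shared command string with hyphen-joined option values."""
    options = []
    if labels:
        options.append("label=" + "-".join(labels))
    if calcs:
        options.append("calc=" + "-".join(calcs))
    return "summary.shared(" + ", ".join(options) + ")"


# Labels are passed as strings so that "0.10" keeps its trailing zero.
command = build_summary_shared(
    labels=["unique", "0.03", "0.05", "0.10"],
    calcs=["sharedsobs", "sharedchao"],
)
print(command)
```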

groups

If you had started this tutorial with the following commands:

mothur > read.otu(list=patient70.fn.list, group=patient70.sites.groups)
mothur > get.group()

You would have seen that there are 7 groups here: 70A-70F and 70S. The sequences in 70S were collected from Patient 70's stool sample; those in 70A-70F came from their mucosa. These 7 groups would yield 21 pairwise comparisons if you ran the summary.shared command; however, if you were only interested in the comparisons between each mucosa site and the stool sample, you could use the groups option:

mothur > summary.shared(calc=sharedsobs, groups=70A-70S)
mothur > summary.shared(calc=sharedsobs, groups=70B-70S)
mothur > summary.shared(calc=sharedsobs, groups=70C-70S)
mothur > summary.shared(calc=sharedsobs, groups=70D-70S)
mothur > summary.shared(calc=sharedsobs, groups=70E-70S)
mothur > summary.shared(calc=sharedsobs, groups=70F-70S)
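The count of 21 comparisons and the six commands above can be reproduced with a short Python sketch. The group names are those given in this tutorial; the variable names are illustrative, and the script only generates command text rather than running mothur.

```python
from itertools import combinations

groups = ["70A", "70B", "70C", "70D", "70E", "70F", "70S"]

# Every unordered pair of groups: n * (n - 1) / 2 comparisons.
all_pairs = list(combinations(groups, 2))
n = len(groups)
print(len(all_pairs), n * (n - 1) // 2)

# Only the comparisons between each mucosa site and the stool sample.
stool = "70S"
commands = [
    f"summary.shared(calc=sharedsobs, groups={site}-{stool})"
    for site in groups
    if site != stool
]
for command in commands:
    print("mothur >", command)
```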

Alternatively, if you want all of the pairwise comparisons you can either omit the groups option or set it equal to "all":

mothur > summary.shared(calc=sharedsobs, groups=all)

all

The sharedsobs and sharedchao calculators not only do the pairwise estimates, but also estimate the shared richness of all the groups in your file. This calculation is RAM intensive. If your RAM is limited and you have a large number of groups this may result in a crash, so by default the all parameter is set to false. To calculate the shared richness of all your groups, set the all parameter to true.

mothur > summary.shared(calc=sharedsobs-sharedchao, all=true)