We will be offering mothur and R workshops throughout 2019. Learn more.

Difference between revisions of "Remove.rare"

From mothur
Jump to: navigation, search
m (formatting & layout changes)
(Revisions)
 
Line 174: Line 174:
 
* 1.28.0 Added count option.
 
* 1.28.0 Added count option.
 
* 1.30.1 - Bug Fix: added OTUlabels to shared file.
 
* 1.30.1 - Bug Fix: added OTUlabels to shared file.
 +
* 1.40.0 - Speed and memory improvements for shared files. [https://github.com/mothur/mothur/issues/357 #357] , [https://github.com/mothur/mothur/issues/347 #347]
  
 
[[Category:Commands]]
 
[[Category:Commands]]

Latest revision as of 14:30, 29 March 2018

The remove.rare command removes OTUs at a specified rarity (number of observations in the dataset) and outputs a new file. The files that we discuss in this tutorial can be obtained by downloading the AmazonData.zip file and decompressing it.


Default settings

The remove.rare command can be run on a list, rabund, sabund, or shared file. Options include nseqs, label, groups, and bygroup.

list file

Reading a list file:

mothur > remove.rare(list=98_lt_phylip_amazon.fn.list, nseqs=2, label=0.10)

or if you clustered with a count file, be sure to include it:

mothur > remove.rare(list=98_lt_phylip_amazon.fn.unique_list, count=amazon.count_table, nseqs=2, label=0.10)

Originally 98_lt_phylip_amazon.fn.list at distance 0.10 contained 55 OTUs, after running the above command 98_lt_phylip_amazon.fn.pick.list contains 8 OTUs.

rabund file

Reading a rabund file:

mothur > remove.rare(rabund=98_lt_phylip_amazon.fn.rabund, nseqs=2)

98_lt_phylip_amazon.fn.rabund looks like:

unique	96	2	2	1	1	1	1	1	1	1	1	1 ...	
0.00	95	2	2	2	1	1	1	1	1	1	1	1 ...
0.01	93	2	2	2	2	2	1	1	1	1	1	1 ...
0.02	89	4	3	3	2	2	1	1	1	1	1	1 ...
0.03	84	4	4	3	2	2	2	2	2	2	1	1 ...
0.04	81	4	4	3	2	2	2	2	2	2	2	2 ...
0.05	73	4	4	3	3	3	2	2	2	2	2	2 ...
0.06	68	4	4	4	4	3	3	2	2	2	2	2 ...	
0.07	66	4	4	4	4	3	3	2	2	2	2	2 ...
0.08	59	7	5	4	4	3	3	3	3	2	2	2 ...
0.09	57	7	7	4	4	4	3	3	3	3	2	2 ...
0.10	55	7	7	7	4	4	4	3	3	2	2	2 ...

and after running remove.rare, 98_lt_phylip_amazon.fn.pick.rabund looks like:

0.02	3	4	3	3	
0.03	3	4	4	3	
0.04	3	4	4	3	
0.05	5	4	4	3	3	3	
0.06	6	4	4	4	4	3	3	
0.07	6	4	4	4	4	3	3	
0.08	8	7	5	4	4	3	3	3	3	
0.09	9	7	7	4	4	4	3	3	3	3	
0.10	8	7	7	7	4	4	4	3	3	

sabund file

Reading a sabund file:

mothur > remove.rare(sabund=98_lt_phylip_amazon.fn.sabund, nseqs=2)

98_lt_phylip_amazon.fn.sabund looks like:

unique	2	94	2	
0.00	2	92	3	
0.01	2	88	5	
0.02	4	84	2	2	1	
0.03	4	75	6	1	2	
0.04	4	69	9	1	2	
0.05	4	55	13	3	2	
0.06	4	48	14	2	4	
0.07	4	44	16	2	4	
0.08	7	36	15	4	2	1	0	1	
0.09	7	36	12	4	3	0	0	2	
0.10	7	35	12	2	3	0	0	3

and after running remove.rare, 98_lt_phylip_amazon.fn.pick.sabund looks like:

0.02	4	0	0	2	1	
0.03	4	0	0	1	2	
0.04	4	0	0	1	2	
0.05	4	0	0	3	2	
0.06	4	0	0	2	4	
0.07	4	0	0	2	4	
0.08	7	0	0	4	2	1	0	1	
0.09	7	0	0	4	3	0	0	2	
0.10	7	0	0	2	3	0	0	3

shared file

Reading a shared file:

mothur > remove.rare(shared=98_lt_phylip_amazon.fn.shared, nseqs=2)

98_lt_phylip_amazon.fn.shared looks like:

unique	forest	96	1	1	1	1	1	1	1	1	1	1 ...
unique	pasture	96	0	0	0	0	0	0	0	0	0	0 ...
0.00	forest	95	1	1	1	1	1	1	1	1	1	1 ...	
0.00	pasture	95	0	0	0	0	0	0	0	0	0	0 ...	
0.01	forest	93	1	1	1	1	1	1	1	1	1	1 ...
0.01	pasture	93	0	0	0	0	0	0	0	0	0	0 ...
0.02	forest	89	1	1	1	1	1	1	1	1	1	1 ...
0.02	pasture	89	0	0	0	0	0	0	0	0	0	0 ...
0.03	forest	84	1	1	1	1	1	1	1	1	1	1 ...
0.03	pasture	84	0	0	0	0	0	0	0	0	0	0 ...
0.04	forest	81	1	1	1	1	1	1	1	2	1	1 ...	
0.04	pasture	81	0	0	0	0	0	0	0	0	0	0 ...	
0.05	forest	73	1	1	1	1	1	1	1	2	2	1 ...
0.05	pasture	73	0	0	0	0	0	0	1	0	0	0 ...
0.06	forest	68	1	1	1	1	1	1	1	2	2	1 ...
0.06	pasture	68	0	0	0	0	0	0	1	0	0	3 ...
0.07	forest	66	1	1	1	1	1	1	1	2	2	1 ...
0.07	pasture	66	0	0	0	1	0	0	1	0	0	3 ...
0.08	forest	59	1	1	1	1	1	1	1	3	2	2 ...
0.08	pasture	59	0	0	0	1	0	0	1	0	0	5 ...
0.09	forest	57	1	1	1	1	1	1	3	3	2	2 ...
0.09	pasture	57	0	0	0	1	0	0	1	0	0	5 ...
0.10	forest	55	1	1	1	1	1	1	3	3	2	2 ...
0.10	pasture	55	0	0	0	1	0	0	1	0	0	5 ...

and after running remove.rare, 98_lt_phylip_amazon.fn.pick.shared looks like:

0.02	forest	3	3	0	0	
0.02	pasture	3	0	3	4	
0.03	forest	3	3	0	0	
0.03	pasture	3	1	3	4	
0.04	forest	3	3	0	0	
0.04	pasture	3	1	3	4	
0.05	forest	5	1	3	1	0	0	
0.05	pasture	5	2	1	2	3	4	
0.06	forest	6	1	3	1	3	1	0	
0.06	pasture	6	3	0	3	1	2	4	
0.07	forest	6	1	3	1	3	1	0	
0.07	pasture	6	3	0	3	1	2	4	
0.08	forest	8	3	2	3	1	2	3	0	0	
0.08	pasture	8	0	5	0	3	1	1	3	5	
0.09	forest	9	3	3	2	3	1	2	3	0	0	
0.09	pasture	9	1	0	5	0	3	1	1	3	7	
0.10	forest	8	3	3	2	3	1	1	5	0	
0.10	pasture	8	1	0	5	0	3	3	2	7

Options

nseqs

The nseqs parameter allows you to set the cutoff for an OTU to be deemed rare. An OTU will be removed if its total abundance is less than or equal to nseqs. For example nseqs=1, will remove all singletons.

label

The label parameter is used to analyze specific labels in your input. default=all. You may separate label names with dashes. If you are reading a list file, you must select one label or the first label in your list file will be used.

groups

The groups parameter allows you to specify which of the groups you would like analyzed. Default=all. You may separate group names with dashes.

bygroup

The bygroup parameter is only valid with the shared file. default=f, which results in removal of any OTU that has fewer than the threshold nseqs across all groups. bygroups=T tests for threshold nseqs and drops OTUs one group at a time. E.g., OTUi is represented by 1 sequence in groupA and by 100 sequences in groupB. With nseqs=1 and bygroup=t, abundance of OTUi is set to 0 for groupA, but is retained at 100 for groupB.

mothur > remove.rare(shared=98_lt_phylip_amazon.fn.shared, nseqs=2, bygroup=t)

and after running remove.rare, 98_lt_phylip_amazon.fn.pick.shared looks like:

0.02	forest	3	3	0	0	
0.02	pasture	3	0	3	4	
0.03	forest	3	3	0	0	
0.03	pasture	3	0	3	4	
0.04	forest	3	3	0	0	
0.04	pasture	3	0	3	4	
0.05	forest	3	3	0	0	
0.05	pasture	3	0	3	4	
0.06	forest	5	0	3	0	3	0	
0.06	pasture	5	3	0	3	0	4	
0.07	forest	5	0	3	0	3	0	
0.07	pasture	5	3	0	3	0	4	
0.08	forest	7	3	0	3	0	3	0	0	
0.08	pasture	7	0	5	0	3	0	3	5	
0.09	forest	8	3	3	0	3	0	3	0	0	
0.09	pasture	8	0	0	5	0	3	0	3	7	
0.10	forest	8	3	3	0	3	0	0	5	0	
0.10	pasture	8	0	0	5	0	3	3	0	7

Revisions

  • 1.28.0 Added count option.
  • 1.30.1 - Bug Fix: added OTUlabels to shared file.
  • 1.40.0 - Speed and memory improvements for shared files. #357 , #347