We will be offering mothur and R workshops throughout 2019. Learn more.

Remove.rare

From mothur
Revision as of 18:09, 16 November 2015 by Westcott (Talk | contribs) (Options)

Jump to: navigation, search

The remove.rare command reads one of the following file types: list, rabund, sabund or shared file. It outputs a new file after removing the rare otus. The files that we discuss in this tutorial can be obtained by downloading the AmazonData.zip file and decompressing it.


Default settings

Reading a list_file:

mothur > remove.rare(list=98_lt_phylip_amazon.fn.list, nseqs=2, label=0.10)

or if you clustered with a count file, be sure to include it:

mothur > remove.rare(list=98_lt_phylip_amazon.fn.unique_list, count=amazon.count_table, nseqs=2, label=0.10)

Originally 98_lt_phylip_amazon.fn.list at distance 0.10 contained 55 otus, after running the above command 98_lt_phylip_amazon.fn.pick.list contains 8 otus.

or reading a rabund_file:

mothur > remove.rare(rabund=98_lt_phylip_amazon.fn.rabund, nseqs=2)

98_lt_phylip_amazon.fn.rabund looks like:

unique	96	2	2	1	1	1	1	1	1	1	1	1 ...	
0.00	95	2	2	2	1	1	1	1	1	1	1	1 ...
0.01	93	2	2	2	2	2	1	1	1	1	1	1 ...
0.02	89	4	3	3	2	2	1	1	1	1	1	1 ...
0.03	84	4	4	3	2	2	2	2	2	2	1	1 ...
0.04	81	4	4	3	2	2	2	2	2	2	2	2 ...
0.05	73	4	4	3	3	3	2	2	2	2	2	2 ...
0.06	68	4	4	4	4	3	3	2	2	2	2	2 ...	
0.07	66	4	4	4	4	3	3	2	2	2	2	2 ...
0.08	59	7	5	4	4	3	3	3	3	2	2	2 ...
0.09	57	7	7	4	4	4	3	3	3	3	2	2 ...
0.10	55	7	7	7	4	4	4	3	3	2	2	2 ...

and after running remove.rare, 98_lt_phylip_amazon.fn.pick.rabund looks like:

0.02	3	4	3	3	
0.03	3	4	4	3	
0.04	3	4	4	3	
0.05	5	4	4	3	3	3	
0.06	6	4	4	4	4	3	3	
0.07	6	4	4	4	4	3	3	
0.08	8	7	5	4	4	3	3	3	3	
0.09	9	7	7	4	4	4	3	3	3	3	
0.10	8	7	7	7	4	4	4	3	3	

or reading a sabund_file:

mothur > remove.rare(sabund=98_lt_phylip_amazon.fn.sabund, nseqs=2)

98_lt_phylip_amazon.fn.sabund looks like:

unique	2	94	2	
0.00	2	92	3	
0.01	2	88	5	
0.02	4	84	2	2	1	
0.03	4	75	6	1	2	
0.04	4	69	9	1	2	
0.05	4	55	13	3	2	
0.06	4	48	14	2	4	
0.07	4	44	16	2	4	
0.08	7	36	15	4	2	1	0	1	
0.09	7	36	12	4	3	0	0	2	
0.10	7	35	12	2	3	0	0	3

and after running remove.rare, 98_lt_phylip_amazon.fn.pick.sabund looks like:

0.02	4	0	0	2	1	
0.03	4	0	0	1	2	
0.04	4	0	0	1	2	
0.05	4	0	0	3	2	
0.06	4	0	0	2	4	
0.07	4	0	0	2	4	
0.08	7	0	0	4	2	1	0	1	
0.09	7	0	0	4	3	0	0	2	
0.10	7	0	0	2	3	0	0	3

or reading a shared_file:

mothur > remove.rare(shared=98_lt_phylip_amazon.fn.shared, nseqs=2)

98_lt_phylip_amazon.fn.shared looks like:

unique	forest	96	1	1	1	1	1	1	1	1	1	1 ...
unique	pasture	96	0	0	0	0	0	0	0	0	0	0 ...
0.00	forest	95	1	1	1	1	1	1	1	1	1	1 ...	
0.00	pasture	95	0	0	0	0	0	0	0	0	0	0 ...	
0.01	forest	93	1	1	1	1	1	1	1	1	1	1 ...
0.01	pasture	93	0	0	0	0	0	0	0	0	0	0 ...
0.02	forest	89	1	1	1	1	1	1	1	1	1	1 ...
0.02	pasture	89	0	0	0	0	0	0	0	0	0	0 ...
0.03	forest	84	1	1	1	1	1	1	1	1	1	1 ...
0.03	pasture	84	0	0	0	0	0	0	0	0	0	0 ...
0.04	forest	81	1	1	1	1	1	1	1	2	1	1 ...	
0.04	pasture	81	0	0	0	0	0	0	0	0	0	0 ...	
0.05	forest	73	1	1	1	1	1	1	1	2	2	1 ...
0.05	pasture	73	0	0	0	0	0	0	1	0	0	0 ...
0.06	forest	68	1	1	1	1	1	1	1	2	2	1 ...
0.06	pasture	68	0	0	0	0	0	0	1	0	0	3 ...
0.07	forest	66	1	1	1	1	1	1	1	2	2	1 ...
0.07	pasture	66	0	0	0	1	0	0	1	0	0	3 ...
0.08	forest	59	1	1	1	1	1	1	1	3	2	2 ...
0.08	pasture	59	0	0	0	1	0	0	1	0	0	5 ...
0.09	forest	57	1	1	1	1	1	1	3	3	2	2 ...
0.09	pasture	57	0	0	0	1	0	0	1	0	0	5 ...
0.10	forest	55	1	1	1	1	1	1	3	3	2	2 ...
0.10	pasture	55	0	0	0	1	0	0	1	0	0	5 ...

and after running remove.rare, 98_lt_phylip_amazon.fn.pick.shared looks like:

0.02	forest	3	3	0	0	
0.02	pasture	3	0	3	4	
0.03	forest	3	3	0	0	
0.03	pasture	3	1	3	4	
0.04	forest	3	3	0	0	
0.04	pasture	3	1	3	4	
0.05	forest	5	1	3	1	0	0	
0.05	pasture	5	2	1	2	3	4	
0.06	forest	6	1	3	1	3	1	0	
0.06	pasture	6	3	0	3	1	2	4	
0.07	forest	6	1	3	1	3	1	0	
0.07	pasture	6	3	0	3	1	2	4	
0.08	forest	8	3	2	3	1	2	3	0	0	
0.08	pasture	8	0	5	0	3	1	1	3	5	
0.09	forest	9	3	3	2	3	1	2	3	0	0	
0.09	pasture	9	1	0	5	0	3	1	1	3	7	
0.10	forest	8	3	3	2	3	1	1	5	0	
0.10	pasture	8	1	0	5	0	3	3	2	7

Options

nseqs

The nseqs parameter allows you to set the cutoff for an otu to be deemed rare. An OTU will be removed if it's total abundance is less than or equal to nseqs. For example nseqs=1, will remove all singletons.


groups

The groups parameter allows you to specify which of the groups you would like analyzed. Default=all. You may separate group names with dashes.

label

The label parameter is used to analyze specific labels in your input. default=all. You may separate label names with dashes. If you are reading a list file, you must select one label or the first label in your list file will be used.

bygroup

The bygroup parameter is only valid with the shared file. default=f, meaning remove any OTU that has nseqs or fewer sequences across all groups. bygroups=T means remove any OTU that has nseqs or fewer sequences in each group (if groupA has 1 sequence and group B has 100 sequences in OTUZ and nseqs=1, then set the groupA count for OTUZ to 0 and keep groupB's count at 100.

mothur > remove.rare(shared=98_lt_phylip_amazon.fn.shared, nseqs=2, bygroup=t)

and after running remove.rare, 98_lt_phylip_amazon.fn.pick.shared looks like:

0.02	forest	3	3	0	0	
0.02	pasture	3	0	3	4	
0.03	forest	3	3	0	0	
0.03	pasture	3	0	3	4	
0.04	forest	3	3	0	0	
0.04	pasture	3	0	3	4	
0.05	forest	3	3	0	0	
0.05	pasture	3	0	3	4	
0.06	forest	5	0	3	0	3	0	
0.06	pasture	5	3	0	3	0	4	
0.07	forest	5	0	3	0	3	0	
0.07	pasture	5	3	0	3	0	4	
0.08	forest	7	3	0	3	0	3	0	0	
0.08	pasture	7	0	5	0	3	0	3	5	
0.09	forest	8	3	3	0	3	0	3	0	0	
0.09	pasture	8	0	0	5	0	3	0	3	7	
0.10	forest	8	3	3	0	3	0	0	5	0	
0.10	pasture	8	0	0	5	0	3	3	0	7

Revisions

  • 1.28.0 Added count option.
  • 1.30.1 - Bug Fix: added OTUlabels to shared file.