We will be offering mothur and R workshops throughout 2019. Learn more.

Difference between revisions of "List.seqs"

From mothur
Jump to: navigation, search
m (New page: Given an input file, write out the names of the sequences found within it __NOTOC__ ==Options== fasta, name, group, alignreport; each takes a file name ==Required== One of the options m...)
 
(Revisions)
 
(10 intermediate revisions by 3 users not shown)
Line 1: Line 1:
Given an input file, write out the names of the sequences found within it
+
The [[list.seqs]] command will write out the names of the sequences found within a [[fastq file | fastq]], [[fasta file | fasta]], [[name file | name]], [[group file | group]], [[Count_File | count]], [[list file | list]], or [[align.report file | align.report]] file.  This could be useful for using the [[get.seqs]] and [[remove.seqs]] commands as well as to generate a [[group file]].    To complete this analysis, you need to download the folder compressed in the [[Media:Esophagus.zip | Esophagus.zip]] archive.
  
 
__NOTOC__
 
__NOTOC__
  
 
==Options==
 
==Options==
fasta, name, group, alignreport; each takes a file name
+
At least one of the following options must be used.  Each of these options will generate a file ending in accnos that contains a single column containing the list of the sequences contained in the input file.
  
==Required==
+
===fastq option===
One of the options must be provided
+
  
==Output==
+
The fastq option is used as presented in the following command:
A file with the extension “.accnos” (i.e. accession numbers) in place of the previous extension; the content should be a single column and on each row should be a separate sequence accession
+
  
==Algorithm==
+
mothur > list.seqs(fastq=test.fastq)
# as you scan through the input file, put every sequence name into a list<string> container
+
 
# sort the list<string> container once it is filled
+
===fasta option===
# print the list container out to the accnos file
+
The fasta option is used as presented in the following command:
 +
 
 +
mothur > list.seqs(fasta=esophagus.fasta)
 +
 
 +
The resulting esophagus.accnos file looks something like:
 +
 
 +
59_10_1
 +
59_10_10
 +
59_10_11
 +
59_10_13
 +
59_10_15
 +
59_10_16
 +
59_10_17
 +
...
 +
 
 +
===name option===
 +
The name option is used as presented in the following command (be sure to run [[unique.seqs]] on esophagus.fasta first):
 +
 
 +
mothur > list.seqs(name=esophagus.names)
 +
 
 +
===count option===
 +
The [[Count_File | count]] file is similar to the name file in that it is used to represent the number of duplicate sequences for a given representative sequence.  It can also contain group information.
 +
 
 +
mothur > list.seqs(count=esophagus.count_table)
 +
 
 +
===group option===
 +
The group option is used as presented in the following command:
 +
 
 +
mothur > list.seqs(group=esophagus.groups)
 +
 
 +
===alignreport option===
 +
The alignreport option is used as presented in the following command:
 +
 
 +
mothur > list.seqs(alignreport=esophagus.align.report)
 +
 
 +
===list option===
 +
The list option is used as presented in the following command:
 +
 
 +
mothur > list.seqs(list=esophagus.fn.list)
 +
 
 +
===taxonomy option===
 +
The taxonomy option is used as presented in the following command:
 +
 
 +
mothur > list.seqs(taxonomy=esophagus.silva.full.taxonomy)
 +
 
 +
 
 +
==Revisions==
 +
* 1.28.0 Added count option
 +
* 1.33.0 Added fastq option
 +
* 1.40.0 - Allow for () characters in taxonomy definitions. [https://github.com/mothur/mothur/issues/350 #350]
 +
 
 +
[[Category:Commands]]

Latest revision as of 14:56, 29 March 2018

The list.seqs command will write out the names of the sequences found within a fastq, fasta, name, group, count, list, or align.report file. This could be useful for using the get.seqs and remove.seqs commands as well as to generate a group file. To complete this analysis, you need to download the folder compressed in the Esophagus.zip archive.


Options

At least one of the following options must be used. Each of these options will generate a file ending in accnos that contains a single column containing the list of the sequences contained in the input file.

fastq option

The fastq option is used as presented in the following command:

mothur > list.seqs(fastq=test.fastq)

fasta option

The fasta option is used as presented in the following command:

mothur > list.seqs(fasta=esophagus.fasta)

The resulting esophagus.accnos file looks something like:

59_10_1
59_10_10
59_10_11
59_10_13
59_10_15
59_10_16
59_10_17
...

name option

The name option is used as presented in the following command (be sure to run unique.seqs on esophagus.fasta first):

mothur > list.seqs(name=esophagus.names)

count option

The count file is similar to the name file in that it is used to represent the number of duplicate sequences for a given representative sequence. It can also contain group information.

mothur > list.seqs(count=esophagus.count_table)

group option

The group option is used as presented in the following command:

mothur > list.seqs(group=esophagus.groups)

alignreport option

The alignreport option is used as presented in the following command:

mothur > list.seqs(alignreport=esophagus.align.report)

list option

The list option is used as presented in the following command:

mothur > list.seqs(list=esophagus.fn.list)

taxonomy option

The taxonomy option is used as presented in the following command:

mothur > list.seqs(taxonomy=esophagus.silva.full.taxonomy)


Revisions

  • 1.28.0 Added count option
  • 1.33.0 Added fastq option
  • 1.40.0 - Allow for () characters in taxonomy definitions. #350