Difference between revisions of "Get.seqs"

Revision as of 14:13, 24 November 2009

The get.seqs command takes a list of sequence names and a fasta, name, group, or align.report file, and generates a new file that contains only the sequences in the list. This command may be used in conjunction with the list.seqs command to help screen a sequence collection. To follow the examples below, you need to download the folder compressed in the Esophagus.zip archive.


Options

To run get.seqs, you must provide the accnos option and at least one other option. The command will generate a *.pick.* file. To generate an accnos file, let's first run unique.seqs, summary.seqs, screen.seqs, and list.seqs:

mothur > unique.seqs(fasta=esophagus.fasta)

mothur > summary.seqs(fasta=esophagus.unique.fasta)

		Start	End	NBases	Ambigs	Polymer
Minimum:	1	831	831	0	4
2.5%-tile:	1	841	841	0	4
25%-tile:	1	857	857	0	5
Median: 	1	866	866	0	5
75%-tile:	1	870	870	0	5
97.5%-tile:	1	900	900	5	7
Maximum:	1	1378	1378	20	8
# of Seqs:	656

mothur > screen.seqs(fasta=esophagus.unique.fasta, maxambig=0)

mothur > list.seqs(fasta=esophagus.unique.good.fasta)

This generates esophagus.unique.good.accnos, a file listing the names of the 527 sequences that passed the screen.

A .accnos file is simply a list of sequence names that meet some given criterion (see list.seqs for further detail). If you have a subset of sequences of interest and want to retrieve them from a larger .fasta file, you can also create the .accnos file by hand: type the list (a single column of names) in a text editor or spreadsheet program that can save your work as tab-delimited text. Notepad and Excel both allow this. When saving, put quotation marks around the file name so that the file keeps the .accnos extension (e.g. "My_subset_sequences.accnos").
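If the names already exist in a script, the single-column list can also be written programmatically. The following is a minimal Python sketch, not part of mothur; the helper names write_accnos and read_accnos are hypothetical, and it assumes only that a .accnos file holds one sequence name per line.

```python
from pathlib import Path


def write_accnos(names, path):
    """Write one sequence name per line, skipping blanks and duplicates.

    Hypothetical helper: assumes a .accnos file is a plain one-column list.
    Returns the number of names written.
    """
    seen = set()
    unique = []
    for name in names:
        name = name.strip()
        if name and name not in seen:
            seen.add(name)
            unique.append(name)
    Path(path).write_text("".join(n + "\n" for n in unique))
    return len(unique)


def read_accnos(path):
    """Read the names back as a set, ignoring empty lines."""
    text = Path(path).read_text()
    return {line.strip() for line in text.splitlines() if line.strip()}
```

Calling write_accnos(["9_1_12", "9_1_14"], "My_subset_sequences.accnos") would produce a two-line file that can then be passed to the accnos option.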


accnos option

To use the accnos option, follow this example:

mothur > get.seqs(accnos=esophagus.unique.good.accnos, fasta=esophagus.fasta)

This generates the file esophagus.pick.fasta, which contains the following lines:

>9_1_12
GCAAGTCGAGGGGAAAC...
>9_1_14
GCAAGTCGAGGGGAACG...
>9_1_15
GCAAGTCGAGGGGAAAC...
...
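
Conceptually, the selection keeps every fasta record whose name appears in the accnos list. The Python sketch below illustrates that idea only; it is not mothur's implementation, the function name filter_fasta is hypothetical, and the short sequences in the example are made up rather than taken from esophagus.fasta.

```python
def filter_fasta(fasta_text, wanted):
    """Return fasta text holding only the records whose name is in `wanted`.

    The record name is taken to be the header text after '>' up to the
    first whitespace (an assumption of this sketch).
    """
    kept = []
    keep = False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            keep = name in wanted
        if keep:
            kept.append(line)
    return "".join(line + "\n" for line in kept)


# Made-up records for illustration only.
example = ">9_1_12\nACGT\nACGT\n>9_1_13\nGGCC\n>9_1_14\nTTAA\n"
print(filter_fasta(example, {"9_1_12", "9_1_14"}), end="")
```

Here the record named 9_1_13 is dropped because it is absent from the set of wanted names.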

name option

To use the name option, follow this example:

mothur > get.seqs(accnos=esophagus.unique.good.accnos, name=esophagus.names)

This generates the file esophagus.pick.names, which contains the following lines:

65_5_22	65_5_22
65_5_12	65_5_12
59_7_23	59_7_23
59_7_7	59_7_7
65_5_28	65_5_28
65_9_13	65_9_13
9_6_11	9_6_11
...

group option

To use the group option, follow this example:

mothur > get.seqs(accnos=esophagus.unique.good.accnos, group=esophagus.groups)

This generates the file esophagus.pick.groups, which contains the following lines:

9_1_12	B
9_1_14	B
9_1_15	B
9_1_16	B
9_1_18	B
...
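
A quick way to check a picked group file is to count how many of the selected sequences fall in each group. The Python sketch below mimics the selection on a two-column name/group listing and tallies the result. It is an illustration under stated assumptions, not mothur's code: filter_groups is a hypothetical name, and the row "other_1 C" in the example is invented.

```python
from collections import Counter


def filter_groups(group_text, wanted):
    """Keep rows whose first column is in `wanted`; tally rows per group.

    Assumes each row is 'name<whitespace>group'. Rows with fewer than
    two fields are skipped.
    """
    kept = []
    counts = Counter()
    for line in group_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        name, group = fields[0], fields[1]
        if name in wanted:
            kept.append(name + "\t" + group)
            counts[group] += 1
    return "".join(row + "\n" for row in kept), counts


# The last row is invented for illustration.
example = "9_1_12\tB\n9_1_14\tB\nother_1\tC\n"
picked, per_group = filter_groups(example, {"9_1_12", "9_1_14"})
print(picked, end="")
print(dict(per_group))
```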

alignreport option

To use the alignreport option, follow this example:

mothur > get.seqs(accnos=esophagus.unique.good.accnos, alignreport=esophagus.align.report)

This generates the file esophagus.pick.align.report, which contains the following lines:

QueryName	QueryLength	TemplateName	TemplateLength	SearchMethod	SearchScore	AlignmentMethod	QueryStart	QueryEnd	TemplateStart	TemplateEnd	PairwiseAlignmentLength	GapsInQuery	GapsInTemplate	LongestInsert	SimBtwnQuery&Template	
9_1_12	866	108139	1525	kmer	62.17	needleman	1	866	50	917	868	2	0	0	91.36	
9_1_14	847	134265	1524	kmer	65.71	needleman	1	847	50	896	849	2	2	0	90.81	
9_1_15	866	108139	1525	kmer	61.47	needleman	1	866	50	917	869	3	1	1	91.02	
9_1_16	854	13820	1555	kmer	90.67	needleman	1	854	43	897	859	5	4	1	97.56	
...
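
Unlike the other file types, the align report starts with a header row, which the picked file above retains. The Python sketch below shows one way to express that: always keep the first line, then keep data rows whose QueryName is in the list. It is not mothur's implementation; filter_report is a hypothetical name, the example uses only two of the report's columns, and the row for 9_1_13 is invented.

```python
def filter_report(report_text, wanted):
    """Keep the header line plus rows whose first tab-separated field is wanted."""
    lines = report_text.splitlines()
    if not lines:
        return ""
    kept = [lines[0]]  # header row is always retained
    for line in lines[1:]:
        query_name = line.split("\t", 1)[0]
        if query_name in wanted:
            kept.append(line)
    return "".join(line + "\n" for line in kept)


# Two-column excerpt; the 9_1_13 row is invented for illustration.
example = "QueryName\tQueryLength\n9_1_12\t866\n9_1_13\t850\n9_1_14\t847\n"
print(filter_report(example, {"9_1_12", "9_1_14"}), end="")
```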