
Get.seqs

Revision as of 13:40, 30 July 2009 by Pschloss (Talk | contribs)


The get.seqs command takes a list of sequence names and a fasta, name, group, or align.report file, and generates a new file that contains only the sequences in the list. It can be used together with the list.seqs command to help screen a sequence collection. To follow the examples on this page, download the Esophagus.zip archive and extract the folder it contains.


Options

To run get.seqs, you must provide the accnos option and at least one of the fasta, name, group, or alignreport options. The command writes its output to a *.pick.* file.
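The *.pick.* naming pattern can be illustrated with a short Python sketch. This is not mothur code: pick_filename is a hypothetical helper, and its rule (insert "pick" before the extension, treating ".align.report" as a single extension) is inferred only from the file names shown in the examples on this page.

```python
# Hypothetical helper, not part of mothur; the naming rule is inferred
# from the example file names on this page.
def pick_filename(input_name: str) -> str:
    """Return the *.pick.* name that corresponds to an input file name."""
    # Assumption: align.report files keep their two-part extension intact.
    two_part_ext = ".align.report"
    if input_name.endswith(two_part_ext):
        return input_name[: -len(two_part_ext)] + ".pick" + two_part_ext

    root, dot, ext = input_name.rpartition(".")
    if not dot:
        # Assumption: with no extension at all, simply append the tag.
        return input_name + ".pick"
    return f"{root}.pick.{ext}"


fasta_out = pick_filename("esophagus.fasta")
report_out = pick_filename("esophagus.align.report")
```

Under these assumptions, esophagus.fasta maps to esophagus.pick.fasta and esophagus.align.report maps to esophagus.pick.align.report, matching the outputs described in the sections below.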

accnos option

An accnos file can be generated with the list.seqs command. For this example, create a text file by hand containing the following lines:

59_10_1
59_10_10
59_10_11

Save the file as esophagus.accnos.


fasta option

To use the fasta option, follow this example:

mothur > get.seqs(accnos=esophagus.accnos, fasta=esophagus.fasta)

This generates the file esophagus.pick.fasta, which contains the following lines:

>59_10_1
TGCAAGTCGAACGATGAAGCCTAGCTTG...
>59_10_10
TGCAAGTAGAACGCTGAAGAGAGGAGCT...
>59_10_11
TGCAAGTCGAACGAAACTTTCTTACACC...
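What the fasta option does can be approximated in a few lines of Python. This is a minimal sketch, not mothur's implementation: read_accnos and pick_fasta are hypothetical names, the sketch assumes the sequence name is the first whitespace-delimited token after ">", and the short sequences in the usage lines are placeholders rather than real data.

```python
from typing import Iterable, Iterator, Set


def read_accnos(lines: Iterable[str]) -> Set[str]:
    """Collect the sequence names from an accnos file, one name per line."""
    return {line.strip() for line in lines if line.strip()}


def pick_fasta(fasta_lines: Iterable[str], wanted: Set[str]) -> Iterator[str]:
    """Yield only the FASTA records whose name is in `wanted`."""
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            # Assumption: the name is the first token after '>'.
            fields = line[1:].split()
            keep = bool(fields) and fields[0] in wanted
        if keep:
            # Header and all following sequence lines of a kept record.
            yield line


# Placeholder data for illustration only.
wanted = read_accnos(["59_10_1\n", "59_10_10\n"])
fasta = [">59_10_1\n", "TGCA\n", ">59_10_2\n", "AAAA\n", ">59_10_10\n", "GGGG\n"]
picked = list(pick_fasta(fasta, wanted))
```

Records are emitted in the order they appear in the input FASTA file, not in the order of the accnos list.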


name option

To use the name option, follow this example (assuming you have used unique.seqs on esophagus.fasta):

mothur > get.seqs(accnos=esophagus.accnos, name=esophagus.names)

This generates the file esophagus.pick.names, which contains the following lines:

59_10_10	59_10_10
59_10_11	59_10_11
59_10_1	59_10_1


group option

To use the group option, follow this example:

mothur > get.seqs(accnos=esophagus.accnos, group=esophagus.groups)

This generates the file esophagus.pick.groups, which contains the following lines:

59_10_1	C
59_10_10	C
59_10_11	C
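The name and group options both operate on whitespace-delimited files whose first column is a sequence name, so the selection can be sketched with one Python function. This is an illustration under stated assumptions, not mothur's code: pick_by_first_column is a hypothetical name, and the sketch matches on the first column only, so it does not cover how mothur treats names that appear only in the second column of a names file. The row for 59_10_2 is a placeholder.

```python
from typing import Iterable, Iterator, Set


def pick_by_first_column(lines: Iterable[str], wanted: Set[str]) -> Iterator[str]:
    """Yield the lines whose first whitespace-delimited field is in `wanted`."""
    for line in lines:
        fields = line.split()
        # Skip blank lines; keep a line only if its first column matches.
        if fields and fields[0] in wanted:
            yield line


# Placeholder group-file rows for illustration only.
wanted = {"59_10_1", "59_10_10", "59_10_11"}
groups = ["59_10_1\tC\n", "59_10_2\tC\n", "59_10_10\tC\n", "59_10_11\tC\n"]
picked_groups = list(pick_by_first_column(groups, wanted))
```

Because the match is exact on the whole first field, 59_10_1 does not accidentally select 59_10_10 or 59_10_11.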


alignreport option

To use the alignreport option, follow this example:

mothur > get.seqs(accnos=esophagus.accnos, alignreport=esophagus.align.report)

This generates the file esophagus.pick.align.report, which contains the following lines:

QueryName	QueryLength	TemplateName	TemplateLength	SearchMethod	SearchScore	AlignmentMethod	QueryStart	QueryEnd	TemplateStart	TemplateEnd	PairwiseAlignmentLength	GapsInQuery	GapsInTemplate	LongestInsert	SimBtwnQuery&Template
59_10_1	869	176825	1527	kmer	69.84	needleman	1	869	5914	870	1	6	1	93.79	
59_10_10	868	196718	1542	kmer	78.05	needleman	1	868	49	916	870	2	2	0	95.29	
59_10_11	870	97946	1560	kmer	92.12	needleman	1	870	51	920	870	0	0	0	99.08
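Unlike the other file types, the align.report output shown above keeps a header row. A minimal Python sketch of that behavior follows; pick_align_report is a hypothetical name, and the sketch assumes the first line is always the header and that columns are tab-separated, as in the example. The rows in the usage lines are shortened placeholders, not real report data.

```python
from typing import Iterable, Iterator, Set


def pick_align_report(lines: Iterable[str], wanted: Set[str]) -> Iterator[str]:
    """Yield the header line, then the rows whose QueryName is in `wanted`."""
    rows = iter(lines)
    header = next(rows, None)
    if header is None:
        return  # empty input: nothing to emit
    # Assumption: the first line is the column header and is always kept.
    yield header
    for row in rows:
        query_name = row.split("\t", 1)[0].strip()
        if query_name in wanted:
            yield row


# Shortened placeholder rows for illustration only.
report = [
    "QueryName\tQueryLength\n",
    "59_10_1\t869\n",
    "59_10_2\t871\n",
    "59_10_10\t868\n",
]
picked_report = list(pick_align_report(report, {"59_10_1", "59_10_10"}))
```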