Given a list of accession numbers (i.e., sequence names) and one or more input files, generate a new file for each input that contains only the listed sequences. The keep option controls the direction of the filter: with keep=true (the default) the output contains only the listed sequences; with keep=false the output contains every sequence except those listed.
Options: accnos, fasta, name, group and alignreport each take a file name; keep takes true or false (default=true)
Required: accnos and at least one of fasta, name, group or alignreport
- read accnos file into a set<string> container, close the file
- read through the file to be parsed and, for each entry, check whether the sequence name is in the set<string> container:
  - if it is, write the entry to the new file and delete the name from the set<string> container (when keep=false the test is inverted: entries whose names are not in the container are written instead)
  - otherwise do nothing
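
The steps above can be sketched in C++ for the fasta case. This is a minimal illustration, not mothur's actual implementation: the function names `readAccnos` and `filterFasta` are hypothetical, the streams stand in for the real files, and it assumes the sequence name is the first whitespace-delimited token on each accnos line and after the `>` in each fasta header. It also assumes that a matched name is erased from the set in both keep modes.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: read one sequence name per line into a set.
// Only the first whitespace-delimited token of each line is used.
std::set<std::string> readAccnos(std::istream& in) {
    std::set<std::string> names;
    std::string line, name;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        if (fields >> name) names.insert(name);
    }
    return names;
}

// Hypothetical helper: copy fasta records from `in` to `out`.
// keep=true  writes records whose name is in `names`.
// keep=false writes records whose name is NOT in `names`.
// Assumption: a matched name is erased from `names` in both modes,
// so whatever remains afterwards was never seen in the fasta input.
// Returns the number of records written.
std::size_t filterFasta(std::istream& in, std::ostream& out,
                        std::set<std::string>& names, bool keep = true) {
    std::string line;
    bool writing = false;      // are we inside a record being copied?
    std::size_t written = 0;
    while (std::getline(in, line)) {
        if (!line.empty() && line[0] == '>') {
            // Header line: the name is the first token after '>'.
            std::istringstream header(line.substr(1));
            std::string name;
            header >> name;
            const bool listed = names.erase(name) > 0;
            writing = (listed == keep);
            if (writing) ++written;
        }
        if (writing) out << line << '\n';
    }
    return written;
}
```

One plausible reason for deleting matched names is that any names still in the set after the pass were listed in the accnos file but never found in the parsed file, which makes them easy to report. The name, group and alignreport formats would need their own record parsing, which this sketch does not cover.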