Revision as of 17:32, 23 June 2009

Given a list of accession numbers (i.e. sequence names) and one or more files in the supported formats, generate a new file that contains only those sequences. The keep option controls the direction of the filter: with keep=true (the default), the output contains only the listed sequences; with keep=false, the output contains every sequence except those listed.


accnos, fasta, name, group, alignreport (each takes a file name); keep (true or false, default=true)


accnos and one of fasta/name/group/alignreport




  1. read the accnos file into a set<string> container, then close the file
  2. read through the file to be parsed and, for each entry, check whether the sequence name is in the set<string> container:
    • if it is, write the entry to the new file and delete the name from the set<string> container (when keep=false the test is reversed: entries whose names are not in the set are written)
    • otherwise do nothing
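
The two steps above can be sketched in C++ for the fasta case. This is only an illustration under stated assumptions, not the actual implementation: the function names readAccnos and filterFasta are hypothetical, the accnos file is assumed to hold one sequence name per line, and a fasta sequence name is assumed to be the first whitespace-delimited token after the '>' character. How the set should be updated when keep=false is not specified above; the sketch leaves it untouched in that case.

```cpp
#include <istream>
#include <ostream>
#include <set>
#include <sstream>
#include <string>

// Step 1: read the accession names into a set<string>.
// Assumption: one name per line; blank lines are skipped.
std::set<std::string> readAccnos(std::istream& in) {
    std::set<std::string> names;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string name;
        if (fields >> name) names.insert(name);
    }
    return names;
}

// Step 2 (fasta case): copy each entry to `out` if its name is in the set
// (keep=true) or if its name is not in the set (keep=false).
// Hypothetical helper; the name and signature are not taken from the source.
void filterFasta(std::istream& in, std::ostream& out,
                 std::set<std::string>& names, bool keep = true) {
    std::string line;
    bool writing = false;  // whether the current entry is being copied
    while (std::getline(in, line)) {
        if (!line.empty() && line[0] == '>') {
            // Header line: the name is the first token after '>'.
            std::istringstream header(line.substr(1));
            std::string name;
            header >> name;
            const bool listed = names.count(name) > 0;
            writing = (listed == keep);
            // Delete matched names so each listed sequence is written once.
            if (listed && keep) names.erase(name);
        }
        if (writing) out << line << '\n';
    }
}
```

One consequence of deleting matched names: after a keep=true pass, any names still in the set were listed in the accnos file but never found in the parsed file, which a caller could report to the user.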