
Align.check

From mothur
Revision as of 19:53, 23 November 2009 by Westcott (Talk | contribs)


The align.check command calculates the number of potentially misaligned bases in a 16S rRNA gene sequence alignment. If you are familiar with the editor window in ARB, this is the same as counting the number of ~, #, -, and = signs. The ~ indicates a strong Watson-Crick base pair, # indicates a very poor pairing, and - and = indicate weaker pairings. The lack of a symbol indicates that the base is in a loop and does not pair with another base in the secondary structure. Note that every sequence alignment will have some non-ideal pairings; this tool should be used to identify outliers in your alignment. To run through the commands below, use the greengenes secondary structure map and the Amazonian dataset.



Default settings

To run align.check you need to provide the sequences to be checked in fasta, nexus, clustal, or phylip format, along with a file containing the secondary structure map. The output will be written to a file with the .align.check extension. Try the following command:

mothur > align.check(fasta=amazon.fasta, map=gg.ss.map)

The output file will look like:

name	pound	dash	plus	equal	loop	tilde	total
U68589	36	18	4	22	683	54	939
U68590	32	8	0	18	381	20	497
U68591	68	30	2	16	670	22	930
U68592	34	8	2	20	403	14	530
U68593	52	24	6	18	705	32	962
U68594	60	14	4	22	705	38	965
U68595	20	12	2	22	399	18	521
U68596	38	14	0	12	407	14	534
U68597	54	12	4	26	705	42	964
U68598	32	10	0	14	381	22	498
U68599	64	18	2	16	710	40	971
U68600	52	21	7	19	670	40	929
U68601	40	12	0	12	394	12	513
...
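
Since the command is meant to help identify outliers, one way to inspect the output is to rank sequences by the share of positions counted in the pound column. The following is a minimal Python sketch, not part of mothur. It assumes the output is tab-delimited with the header shown above, and the choice of pound/total as a ranking score is an illustrative assumption rather than a recommended threshold. The helper names (read_align_check, pound_fraction, rank_by_pound) are hypothetical.

```python
import csv
import io

# First three rows of the example output above, tab-delimited (assumed layout).
SAMPLE = (
    "name\tpound\tdash\tplus\tequal\tloop\ttilde\ttotal\n"
    "U68589\t36\t18\t4\t22\t683\t54\t939\n"
    "U68590\t32\t8\t0\t18\t381\t20\t497\n"
    "U68591\t68\t30\t2\t16\t670\t22\t930\n"
)

COUNT_COLUMNS = ("pound", "dash", "plus", "equal", "loop", "tilde", "total")


def read_align_check(handle):
    """Parse an .align.check table into a list of dicts with integer counts."""
    rows = []
    for record in csv.DictReader(handle, delimiter="\t"):
        row = {"name": record["name"]}
        for column in COUNT_COLUMNS:
            row[column] = int(record[column])
        rows.append(row)
    return rows


def pound_fraction(row):
    """Share of the 'total' column that falls in the 'pound' column."""
    return row["pound"] / row["total"]


def rank_by_pound(rows):
    """Sort sequences so the largest pound fraction comes first."""
    return sorted(rows, key=pound_fraction, reverse=True)


# To read a real file instead, pass an open file handle, for example:
#   with open("your_output.align.check") as handle:   # hypothetical path
#       rows = read_align_check(handle)
rows = read_align_check(io.StringIO(SAMPLE))
ranked = rank_by_pound(rows)
for row in ranked:
    print(f'{row["name"]}\t{pound_fraction(row):.3f}')
```

Sequences at the top of this ranking are candidates for a closer look in an alignment editor; the score alone does not show that a sequence is misaligned.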