Create.database

Revision as of 18:12, 17 February 2017 by Westcott

The create.database command reads a list or shared file, together with the *.cons.taxonomy, *.rep.fasta, and *.rep.names or *.rep.count_table files and an optional group file, and creates a database file. To run the following tutorial, please download: Example Files

Default Settings

The create.database command parameters are repfasta, list, shared, repname, count, constaxonomy, group and label. A list or shared file, repfasta, repname or count, and constaxonomy are required; as of version 1.39.0, repfasta and repname are optional (see Revisions). NOTE: Make SURE the repfasta, repname or count, and constaxonomy files are for the same label as the list file.

mothur > get.oturep(list=final.an.list, label=0.03, fasta=final.fasta, column=final.dist, name=final.names) 
mothur > classify.otu(list=final.an.list, name=final.names, taxonomy=final.taxonomy, label=0.03)
mothur > create.database(list=final.an.list, label=0.03, repfasta=final.an.0.03.rep.fasta, repname=final.an.0.03.rep.names, constaxonomy=final.an.0.03.cons.taxonomy)

or with a count file:

mothur > get.oturep(list=final.an.unique_list, label=0.03, fasta=final.fasta, column=final.dist, count=final.count_table) 
mothur > classify.otu(list=final.an.unique_list, count=final.count_table, taxonomy=final.taxonomy, label=0.03)
mothur > create.database(list=final.an.unique_list, label=0.03, repfasta=final.an.0.03.rep.fasta, count=final.an.0.03.rep.count_table, constaxonomy=final.an.0.03.cons.taxonomy)

or with a shared file:

mothur > get.oturep(list=final.an.list, label=0.03, fasta=final.fasta, column=final.dist, name=final.names) 
mothur > classify.otu(list=final.an.list, name=final.names, taxonomy=final.taxonomy, label=0.03)
mothur > create.database(shared=final.an.shared, label=0.03, repfasta=final.an.0.03.rep.fasta, repname=final.an.0.03.rep.names, constaxonomy=final.an.0.03.cons.taxonomy)
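
The note about matching labels can be checked mechanically before running create.database. The following Python sketch is only a heuristic: it assumes the label appears in each file name as a dot-delimited token, as it does in the example file names above (e.g. final.an.0.03.rep.fasta). The helper name files_missing_label is hypothetical and is not part of mothur.

```python
def files_missing_label(label, *filenames):
    """Return the file names that do not contain the label as a dot-delimited token.

    Assumption: the label is embedded in the file name between dots,
    as in the example names on this page.
    """
    token = f".{label}."
    return [name for name in filenames if token not in name]


# File names taken from the create.database examples above.
mismatched = files_missing_label(
    "0.03",
    "final.an.0.03.rep.fasta",
    "final.an.0.03.rep.names",
    "final.an.0.03.cons.taxonomy",
)
print(mismatched)  # an empty list means every file name carries the label
```

An empty result only shows that the names are consistent; it does not prove that the file contents were generated for that label.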


If you open the final.an.database file you will see:

OTUNumber	Abundance	repSeqName	repSeq	OTUConTaxonomy
1	6307	GQY1XT001C296C	A-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);"Bacteroidia"(100);...
2	5124	GQY1XT001A3TJI	G-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);"Bacteroidia"(100);...
3	3177	GQY1XT001CS2B8	G-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);"Bacteroidia"(100);...
4	2947	GQY1XT001CD9IB	G-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);"Bacteroidia"(100);...
...
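
For downstream work it can be convenient to load the database file into a script. The sketch below is a minimal Python example, assuming the file is tab-delimited with the header row shown above. The function name read_database is hypothetical, and the sample rows use shortened sequences and taxonomies for illustration only.

```python
import csv
import io


def read_database(handle):
    """Yield one dict per OTU from a tab-delimited .database file."""
    # QUOTE_NONE keeps the double quotes inside the taxonomy strings as-is.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        row["Abundance"] = int(row["Abundance"])
        yield row


# Shortened sample data modeled on the output above (sequences truncated).
sample = (
    "OTUNumber\tAbundance\trepSeqName\trepSeq\tOTUConTaxonomy\n"
    '1\t6307\tGQY1XT001C296C\tA-GC--GA\tBacteria(100);"Bacteroidetes"(100);\n'
    '2\t5124\tGQY1XT001A3TJI\tG-GC--GA\tBacteria(100);"Bacteroidetes"(100);\n'
)

otus = list(read_database(io.StringIO(sample)))
for otu in otus:
    print(otu["OTUNumber"], otu["repSeqName"], otu["Abundance"])
```

To read a real file, pass an open file handle instead of the io.StringIO object, for example open("final.an.database", newline="").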

repfasta

The repfasta file is the fasta file output by get.oturep(fasta=yourFastaFile, list=yourListfile, column=yourDistFile, name=yourNameFile)

repname

The repname file is the name file output by get.oturep(fasta=yourFastaFile, list=yourListfile, column=yourDistFile, name=yourNameFile)

count

The count file is the count file output by get.oturep(fasta=yourFastaFile, list=yourListfile, column=yourDistFile, count=yourCountFile)

constaxonomy

The constaxonomy file is the taxonomy file output by classify.otu(list=yourListfile, name=yourNameFile, taxonomy=yourTaxonomyFile)
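
The consensus taxonomy ends up in the OTUConTaxonomy column of the database file as a semicolon-separated string, with a number in parentheses after each taxon. The Python sketch below splits such a string into (taxon, value) pairs. It assumes the format seen in the example output on this page; parse_constaxonomy is a hypothetical helper, not a mothur function.

```python
import re

# Matches a taxon followed by a parenthesized integer, e.g. "Bacteroidetes"(100)
TAXON = re.compile(r"^(.*)\((\d+)\)$")


def parse_constaxonomy(taxonomy):
    """Split a consensus taxonomy string into (taxon, value) tuples."""
    levels = []
    for part in taxonomy.split(";"):
        part = part.strip()
        if not part:
            continue  # skip the empty piece after a trailing semicolon
        match = TAXON.match(part)
        if match:
            levels.append((match.group(1).strip('"'), int(match.group(2))))
        else:
            # No parenthesized number: keep the name and record no value.
            levels.append((part.strip('"'), None))
    return levels


levels = parse_constaxonomy('Bacteria(100);"Bacteroidetes"(100);"Bacteroidia"(100);')
print(levels)
```

The surrounding double quotes are stripped from taxon names so that "Bacteroidetes" and Bacteroidetes compare equal.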

Options

group

The group file is optional. When it is provided, the database file reports the abundance of each OTU broken down by group.

mothur > create.database(list=final.an.list, label=0.03, repfasta=final.an.0.03.rep.fasta, repname=final.an.0.03.rep.names, constaxonomy=final.an.0.03.cons.taxonomy, group=final.groups)

If you open the final.an.database file you will see:

OTUNumber	F003D000 ...	F003D150	repSeqName	repSeq	OTUConTaxonomy
1	422	1012 ...	492	GQY1XT001C296C	A-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);...
2	413	186 ...	707	GQY1XT001A3TJI	G-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);...
3	279	238 ...	342	GQY1XT001CS2B8	G-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);...
4	255	194 ...	410	GQY1XT001CD9IB	G-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);...
...
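
With a group file, the per-group counts sit between the OTUNumber and repSeqName columns. The Python sketch below reads those columns and sums them per OTU. It assumes a tab-delimited file with the header layout shown above; group_abundances is a hypothetical name, and the sample keeps only the first and last groups from the example, so the totals cover just those two groups.

```python
import csv
import io


def group_abundances(handle):
    """Map each OTU number to (per-group counts, total across groups)."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    # Group columns are everything between OTUNumber and repSeqName.
    end = header.index("repSeqName")
    groups = header[1:end]
    result = {}
    for row in reader:
        counts = dict(zip(groups, (int(value) for value in row[1:end])))
        result[row[0]] = (counts, sum(counts.values()))
    return result


# Two-group sample modeled on the output above (sequences truncated).
sample = (
    "OTUNumber\tF003D000\tF003D150\trepSeqName\trepSeq\tOTUConTaxonomy\n"
    "1\t422\t492\tGQY1XT001C296C\tA-GC--GA\tBacteria(100);\n"
    "2\t413\t707\tGQY1XT001A3TJI\tG-GC--GA\tBacteria(100);\n"
)

abundances = group_abundances(io.StringIO(sample))
for otu, (counts, total) in abundances.items():
    print(otu, counts, total)
```

Locating the group columns from the header, rather than by fixed position, keeps the sketch independent of how many groups the file contains.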

label

The label parameter allows you to specify a label to be used from your list file.

Revisions

  • 1.25.0 - First Introduced
  • 1.26.0 - added shared parameter
  • 1.31.0 - added count parameter
  • 1.39.0 - made repfasta and repname parameters optional