Difference between revisions of "Create.database"

Revision as of 18:21, 17 February 2017

The create.database command reads a list file, *.cons.taxonomy, *.rep.fasta, *.rep.names or *.rep.count_table and optional group file, and creates a database file. To run the following tutorial please download: Example Files

Default Settings

The create.database command parameters are repfasta, repname, list, shared, count, constaxonomy, group and label. A list or shared file is required, along with the count and constaxonomy files. NOTE: Make sure the repfasta, repname or count, and constaxonomy files are for the same label as the list file.
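
The label check in the note above can be done by eye, but a small script helps when many files are involved. The following is a minimal Python sketch, not part of mothur. It assumes the file names embed the label the way the examples on this page do (for instance final.an.0.03.rep.fasta); the function name files_missing_label is hypothetical.

```python
def files_missing_label(label, filenames):
    """Return the file names that do not contain '.<label>.' in their name.

    Assumption: the label is embedded in the file name between dots,
    as in final.an.0.03.cons.taxonomy. This is a naming convention seen
    in the examples, not something mothur guarantees.
    """
    token = f".{label}."
    return [name for name in filenames if token not in name]


# Illustrative file names; the second one was produced for a different label.
suspect = files_missing_label(
    "0.03",
    ["final.an.0.03.rep.fasta", "final.an.0.01.cons.taxonomy"],
)
print(suspect)
```

An empty result only means the names look consistent; it does not inspect the file contents.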

mothur > get.oturep(list=final.an.list, label=0.03, fasta=final.fasta, column=final.dist, name=final.names) 
mothur > classify.otu(list=final.an.list, name=final.names, taxonomy=final.taxonomy, label=0.03)
mothur > create.database(list=final.an.list, label=0.03, repfasta=final.an.0.03.rep.fasta, repname=final.an.0.03.rep.names, constaxonomy=final.an.0.03.cons.taxonomy)

or with a count file:

mothur > get.oturep(list=final.an.unique_list, label=0.03, fasta=final.fasta, column=final.dist, count=final.count_table) 
mothur > classify.otu(list=final.an.unique_list, count=final.count_table, taxonomy=final.taxonomy, label=0.03)
mothur > create.database(list=final.an.unique_list, label=0.03, repfasta=final.an.0.03.rep.fasta, count=final.an.0.03.rep.count_table, constaxonomy=final.an.0.03.cons.taxonomy)

or with a shared file:

mothur > get.oturep(list=final.an.list, label=0.03, fasta=final.fasta, column=final.dist, name=final.names) 
mothur > classify.otu(list=final.an.list, name=final.names, taxonomy=final.taxonomy, label=0.03)
mothur > create.database(shared=final.an.shared, label=0.03, repfasta=final.an.0.03.rep.fasta, repname=final.an.0.03.rep.names, constaxonomy=final.an.0.03.cons.taxonomy)


If you open the final.an.database file you will see:

OTUNumber	Abundance	repSeqName	repSeq	OTUConTaxonomy
1	6307	GQY1XT001C296C	A-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);"Bacteroidia"(100);...
2	5124	GQY1XT001A3TJI	G-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);"Bacteroidia"(100);...
3	3177	GQY1XT001CS2B8	G-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);"Bacteroidia"(100);...
4	2947	GQY1XT001CD9IB	G-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);"Bacteroidia"(100);...
...
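
Because the database file is tab-separated with a header row, it can be read with general-purpose tools. The following is a minimal Python sketch, assuming the column names shown above. The function name read_database and the shortened sample text are illustrative and not part of mothur.

```python
import csv
import io


def read_database(handle):
    """Yield one dict per OTU from a tab-separated create.database file."""
    # QUOTE_NONE keeps the double quotes in the taxonomy column as plain text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Abundance is read as text; convert it so it can be summed or sorted.
        row["Abundance"] = int(row["Abundance"])
        yield row


# Illustrative sample mirroring the layout above (sequences and taxonomy shortened).
SAMPLE = (
    "OTUNumber\tAbundance\trepSeqName\trepSeq\tOTUConTaxonomy\n"
    '1\t6307\tGQY1XT001C296C\tA-GC--GA\tBacteria(100);"Bacteroidetes"(100);\n'
    '2\t5124\tGQY1XT001A3TJI\tG-GC--GA\tBacteria(100);"Bacteroidetes"(100);\n'
)

otus = list(read_database(io.StringIO(SAMPLE)))
total_abundance = sum(otu["Abundance"] for otu in otus)
print(total_abundance)
```

For a real file, replace the io.StringIO(SAMPLE) argument with an open file handle, for example open("final.an.database", newline="").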

list

The list parameter allows you to provide your list file. Mothur creates this file by running the cluster, cluster.split or phylotype commands.

shared

The shared parameter allows you to provide your shared file. Mothur creates this file by running the make.shared command.

count

The count parameter takes the count file output by get.oturep(fasta=yourFastaFile, list=yourListfile, column=yourDistFile, count=yourCountFile)

constaxonomy

The constaxonomy file is the taxonomy file output by classify.otu(list=yourListfile, name=yourNameFile, taxonomy=yourTaxonomyFile)
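
The consensus taxonomy strings that end up in the OTUConTaxonomy column follow a regular pattern: levels separated by semicolons, each with a number in parentheses. The following is a minimal Python sketch for splitting such a string, assuming that pattern holds. The function name parse_constaxonomy is hypothetical and not part of mothur.

```python
import re

# Each level looks like  name(number) ; names may be wrapped in double quotes.
LEVEL = re.compile(r"^(?P<name>.*)\((?P<score>\d+)\)$")


def parse_constaxonomy(text):
    """Split a consensus taxonomy string into (name, score) pairs."""
    levels = []
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue  # skip the empty piece after a trailing ';'
        match = LEVEL.match(part)
        if match is None:
            # Keep anything that does not fit the pattern, without a score.
            levels.append((part.strip('"'), None))
        else:
            levels.append((match.group("name").strip('"'), int(match.group("score"))))
    return levels


parsed = parse_constaxonomy('Bacteria(100);"Bacteroidetes"(100);"Bacteroidia"(100);')
print(parsed)
```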

Options

repfasta

The repfasta file is the fasta file output by get.oturep(fasta=yourFastaFile, list=yourListfile, column=yourDistFile, name=yourNameFile) and is optional.

repname

The repname file is the name file output by get.oturep(fasta=yourFastaFile, list=yourListfile, column=yourDistFile, name=yourNameFile) and is optional.

group

The group file is optional. When provided, the database file reports the abundance of each OTU broken down by group.

mothur > create.database(list=final.an.list, label=0.03, repfasta=final.an.0.03.rep.fasta, repname=final.an.0.03.rep.names, constaxonomy=final.an.0.03.cons.taxonomy, group=final.groups)

If you open the final.an.database file you will see:

OTUNumber	F003D000	...	F003D150	repSeqName	repSeq	OTUConTaxonomy
1	422	1012	...	492	GQY1XT001C296C	A-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);...
2	413	186	...	707	GQY1XT001A3TJI	G-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);...
3	279	238	...	342	GQY1XT001CS2B8	G-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);...
4	255	194	...	410	GQY1XT001CD9IB	G-GC--GA-G-A-A-G-T-A ... GT-GAA	Bacteria(100);"Bacteroidetes"(100);...
...
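
With a group file, the single Abundance column is replaced by one column per group. The following is a minimal Python sketch that totals each group across all OTUs. It assumes every column between OTUNumber and repSeqName is a group column, as the header above suggests. The function name group_totals and the two-group sample are illustrative, not mothur output.

```python
import csv
import io


def group_totals(handle):
    """Sum abundance per group across all OTUs in a grouped database file."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    # Assumption: every column between OTUNumber and repSeqName is a group.
    end = header.index("repSeqName")
    groups = header[1:end]
    totals = dict.fromkeys(groups, 0)
    for row in reader:
        for group, value in zip(groups, row[1:end]):
            totals[group] += int(value)
    return totals


# Illustrative sample with only two groups (sequences and taxonomy shortened).
SAMPLE = (
    "OTUNumber\tF003D000\tF003D150\trepSeqName\trepSeq\tOTUConTaxonomy\n"
    "1\t422\t492\tGQY1XT001C296C\tA-GC--GA\tBacteria(100);\n"
    "2\t413\t707\tGQY1XT001A3TJI\tG-GC--GA\tBacteria(100);\n"
)

totals = group_totals(io.StringIO(SAMPLE))
print(totals)
```

Locating the group columns from the header rather than by fixed position keeps the sketch independent of how many groups the file contains.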

label

The label parameter allows you to specify a label to be used from your list file.

Revisions

  • 1.25.0 - First Introduced
  • 1.26.0 - added shared parameter
  • 1.31.0 - added count parameter
  • 1.39.0 - made repfasta and repname parameters optional