Detailed `taxit` documentation

This section gives the detailed documentation on taxit’s subcommands, organized alphabetically.

add_nodes

usage: taxit add_nodes [-h] [--schema SCHEMA] [--source-name SOURCE_NAME]
                       [url] FILE

Add nodes and names to a database

The input file specifies new nodes (type: node) and names (type: name)
in yaml format (see
http://fhcrc.github.io/taxtastic/commands.html#add-nodes).

positional arguments:
  url                   Database string URI or filename. If no database scheme
                        specified "sqlite:///" will be prepended.
                        [sqlite:///ncbi_taxonomy.db]
  FILE                  yaml file specifying new nodes

options:
  -h, --help            show this help message and exit
  --source-name SOURCE_NAME
                        Provides the default source name for new nodes. The
                        value is overridden by "source_name" in the input
                        file. If not provided, "source_name" is required in
                        each node or name definition. This source name is
                        created if it does not exist.

database options:
  --schema SCHEMA       Name of SQL schema in database to query (if database
                        flavor supports this).

Add nodes or names to the taxonomy in the specified database. FILE should be a YAML file containing one or more records, each of which specifies a new node or name.

For a new node the following are required:

type: The value must be “node”
tax_id: The tax_id for this new node in the taxonomy, which must not conflict with an existing tax_id. NCBI’s tax_ids are all integers, so it works well to choose an alphabetic prefix for the tax_ids for all new nodes (e.g., name them AB1, AB2, AB3, etc.).
rank: The name of the rank at which this node falls in the taxonomy. Choose from among the ranks specified in table ranks.
parent_id: The tax_id which will be set as the parent of this node in the taxonomy.
names: One or more names to associate with the node. Minimally, must define a single taxonomic name. See description of a name record below.

Required unless a default source name is specified using the --source-name option:

source_name: A string describing the origin of these taxa so that it is easy to find them in the database.

The following optional field may also be specified:

children: A list of tax_ids which should be detached from their current parents and attached to this node as its children.

A minimal example of a record specifying a node (assuming --source-name is provided on the command line):

---
type: node
tax_id: "newid"
parent_id: "1279"
rank: species_group
names:
  - tax_name: between genus and species

A record providing source_name, multiple taxonomic names, plus child nodes:

---
type: node
tax_id: "newid"
parent_id: "1279"
rank: species_group
names:
  - tax_name: between genus and species
    is_primary: true
  - tax_name: another name
source_name: someplace
children:
  - "1280" # Staphylococcus aureus
  - "1281" # Staphylococcus carnosus

A record specifying names to be added to existing nodes has the following required fields:

type: The value must be “name”
tax_id: The tax_id to add names to.
names: A list of taxonomic names. If a single name is provided, requires only tax_name; if more than one, the primary name must be indicated (see examples below).

A minimal example (again, assuming source_name is defined from the command line):

---
type: name
tax_id: bar
names:
  - tax_name: a new name for bar

If there are multiple names:

---
type: name
tax_id: bar
names:
  - tax_name: a new name for bar
    is_primary: true
  - tax_name: another name

Multiple records are delimited by --- and may contain any combination of names and nodes:

---
type: node
tax_id: "newid"
parent_id: "1279"
rank: species_group
names:
  - tax_name: between genus and species
---
type: name
tax_id: bar
names:
  - tax_name: a new name for bar

Note that the nodes and names are added to the database in the order specified; be sure to add parent nodes before children.
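
The required fields listed above can be checked before a file is passed to taxit add_nodes. The following is a minimal sketch in Python, assuming each record has already been parsed into a dictionary (for example with a YAML library). The function name `missing_fields` and the `REQUIRED` table are hypothetical and are not part of taxtastic.

```python
# Hypothetical pre-flight check for add_nodes records; not part of taxtastic.
REQUIRED = {
    "node": ("type", "tax_id", "rank", "parent_id", "names"),
    "name": ("type", "tax_id", "names"),
}


def missing_fields(record, default_source=None):
    """Return the required fields that are absent from one record."""
    rec_type = record.get("type")
    if rec_type not in REQUIRED:
        raise ValueError(f"type must be 'node' or 'name', got {rec_type!r}")
    missing = [key for key in REQUIRED[rec_type] if key not in record]
    # source_name is required only when no default (--source-name) is given.
    if default_source is None and "source_name" not in record:
        missing.append("source_name")
    return missing


node = {
    "type": "node",
    "tax_id": "newid",
    "parent_id": "1279",
    "rank": "species_group",
    "names": [{"tax_name": "between genus and species"}],
}
print(missing_fields(node, default_source="someplace"))
print(missing_fields(node))
```

The second call reports `source_name` as missing because neither the record nor a default supplies it.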

add_to_taxtable

usage: taxit add_to_taxtable [-h] [-o CSV] CSV CSV

Add nodes to an existing taxtable csv

positional arguments:
  CSV                A taxtable to augment
  CSV                A CSV file containing nodes to add to taxtable. Must
                     contain columns 'tax_id', 'tax_name', 'rank', and
                     'parent_id'. Each record must have a parent_id already in
                     the taxtable, or defined on an earlier row.

options:
  -h, --help         show this help message and exit
  -o CSV, --out CSV  Destination for output taxtable [default: stdout]
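
The ordering rule for the second CSV (each parent_id must already be in the taxtable or be defined on an earlier row) can be sketched in Python as follows. This is an illustration under stated assumptions, not taxtastic's implementation: the function name `check_new_nodes` is hypothetical, the CSV contents are passed as strings, and the toy taxtable has only the four basic columns.

```python
import csv
import io

REQUIRED_COLUMNS = {"tax_id", "tax_name", "rank", "parent_id"}


def check_new_nodes(taxtable_csv, new_nodes_csv):
    """Return tax_ids of new rows whose parent_id is not yet known."""
    known = {row["tax_id"] for row in csv.DictReader(io.StringIO(taxtable_csv))}
    reader = csv.DictReader(io.StringIO(new_nodes_csv))
    absent = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if absent:
        raise ValueError(f"missing columns: {sorted(absent)}")
    orphans = []
    for row in reader:
        if row["parent_id"] not in known:
            orphans.append(row["tax_id"])
        # Every row read so far may serve as a parent for later rows.
        known.add(row["tax_id"])
    return orphans


# Toy data with hypothetical new tax_ids AB1..AB3.
taxtable = "tax_id,parent_id,rank,tax_name\n1,1,root,root\n1279,1,genus,Staphylococcus\n"
new_nodes = (
    "tax_id,tax_name,rank,parent_id\n"
    "AB1,group one,species_group,1279\n"
    "AB2,member,species,AB1\n"
    "AB3,stray,species,ZZ9\n"
)
print(check_new_nodes(taxtable, new_nodes))
```

Here AB2 is accepted because its parent AB1 appears on an earlier row, while AB3 is reported because ZZ9 is defined nowhere.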

check

usage: taxit check [-h] REFPKG

Validate a reference package

Checks whether ``REFPKG`` is a valid input for ``pplacer``, that is,
does it have a FASTA file of the reference sequences; a Stockholm file
of their multiple alignment; a Newick formatted tree built from the
aligned sequences; and all the necessary auxiliary information.

positional arguments:
  REFPKG      Path to Refpkg to check

options:
  -h, --help  show this help message and exit

composition

usage: taxit composition [-h] [-t csv file] [-i csv file] [-r RANK] [-o OUT]
                         [refpkg]

Show taxonomic composition of a reference package

positional arguments:
  refpkg                the reference package to operate on

options:
  -h, --help            show this help message and exit
  -t csv file, --taxonomy csv file
                        Path to taxtable (ignored if refpkg is provided,
                        required otherwise)
  -i csv file, --seq_info csv file
                        Path to seq_info (ignored if refpkg is provided,
                        required otherwise)
  -r RANK, --rank RANK  show composition at RANK. Use --rank=tax_id to show
                        original classifications [species]
  -o OUT, --out OUT     output file [stdout]

create

usage: taxit create [-h] [-c] -P PATH -l LOCUS [-a NAME] [-d TEXT]
                    [-r VERSION] [-f FILE] [-i file] [-m FILE] [-M FILE]
                    [-p FILE] [-R FILE] [-s FILE] [-S FILE] [-t FILE]
                    [-T FILE] [--stats-type {PhyML,FastTree,RAxML}]
                    [--frequency-type {empirical,model}] [--no-reroot]
                    [--rppr RPPR]

Create a reference package

Create a new refpkg at the location specified by the argument to
``-P`` with locus name ``-l``.  All other fields are used to specify
initial metadata and files to add to the refpkg.  If there is already
a refpkg at ``refpkg``, this command will fail unless you specify
``-c`` or ``--clobber``.

options:
  -h, --help            show this help message and exit
  -c, --clobber         Delete an existing reference package.

Required arguments:
  -P PATH, --package-name PATH
                        Name of refpkg to create
  -l LOCUS, --locus LOCUS
                        The locus described by the reference package

Package Metadata:
  -a NAME, --author NAME
                        Person who created the reference package
  -d TEXT, --description TEXT
                        An arbitrary description field
  -r VERSION, --package-version VERSION
                        Release version for the reference package

Input files:
  -f FILE, --aln-fasta FILE
                        Multiple alignment in fasta format
  -i file, --seq-info file
                        CSV format file describing the aligned reference
                        sequences, minimally containing the fields "seqname"
                        and "tax_id"
  -m FILE, --mask FILE  Text file containing a mask
  -M FILE, --model-file FILE
                        File containing model information usually the
                        .bestModel file
  -p FILE, --profile FILE
                        Alignment profile
  -R FILE, --readme FILE
                        README file describing the reference package
  -s FILE, --tree-stats FILE
                        File containing tree statistics (for example
                        "RAxML_info.whatever")
  -S FILE, --aln-sto FILE
                        Multiple alignment in Stockholm format
  -t FILE, --tree-file FILE
                        Phylogenetic tree in newick format
  -T FILE, --taxonomy FILE
                        CSV format file defining the taxonomy. Fields include
                        "tax_id","parent_id","rank","tax_name" followed by a
                        column defining tax_id at each rank starting with root

Tree information:
  --stats-type {PhyML,FastTree,RAxML}
                        stats file type [default: attempt to guess from file
                        contents]
  --frequency-type {empirical,model}
                        Residue frequency type from the model. Required for
                        PhyML Amino Acid alignments.

Taxonomic Rerooting:
  --no-reroot           Do not reroot the reference package using `rppr
                        reroot`. [default: reroot if `rppr` is available and a
                        taxonomy file is specified]
  --rppr RPPR           Name of the rppr executable. [default: rppr]

Input files

Input files are identified in the refpkg using the following keys (see, for example, taxit rp):

Option	File key	Description
`-f`, `--aln-fasta`	`aln_fasta`	Reference sequences in FASTA format
`-i`, `--seq-info`	`seq_info`	CSV describing aligned sequences
`-m`, `--mask`	`mask`	Text file containing sequence mask
`-p`, `--profile`	`profile`	Multiple alignment profile
`-R`, `--readme`	`readme`	A README file for the refpkg
`-s`, `--tree-stats`	`tree_stats`	Typically written by the tree builder
`-S`, `--aln-sto`	`aln_sto`	Stockholm file of reference sequences
`-t`, `--tree-file`	`tree`	Phylogenetic tree in Newick format
`-T`, `--taxonomy`	`taxonomy`	CSV file specifying taxonomy

Examples:

# Create a minimal refpkg
taxit create -P my_refpkg -l "Some locus name"

# Create a refpkg with lots of files in it
taxit create -P another_refpkg -l "Another locus" \
    --author "Boris the mad baboon" --package-version 0.3.1 \
    --aln-fasta seqs.fasta --aln-sto seqs.sto \
    --tree-file seqs.newick --seq-info seqs.csv \
    --profile cmalign.profile --tree-stats RAxML.info \
    --taxonomy taxtable.csv

findcompany

usage: taxit findcompany [-h] [-c] [-i INPUT] [-o OUT] taxdb [tax_ids ...]

Find company for lonely nodes

A command meant to follow ``lonelynodes``. Given a list of tax_ids
produced by ``taxit lonelynodes``, produces another list of species
tax_ids that can be added to the taxtable that would render those
tax_ids no longer lonely.

positional arguments:
  taxdb                 Taxonomy database to work from
  tax_ids               Tax IDs to look up

options:
  -h, --help            show this help message and exit
  -c, --cut             Produce only one output tax_id per input tax_id,
                        whether or not the output species would themselves be
                        lonely.
  -i INPUT, --input INPUT
                        Text file to read Tax IDs from, one per line
  -o OUT, --out OUT     Output file for new taxids

Examples:

taxit findcompany taxonomy.db -i taxids.txt -o newtaxids.txt
taxit findcompany taxonomy.db 31661 5213 564

info

usage: taxit info [-h] [-n] [-t] [-l] refpkg

Show information about reference packages.

positional arguments:
  refpkg           the reference package to operate on

options:
  -h, --help       show this help message and exit
  -n, --seq-names  print a list of sequence names
  -t, --tally      print a tally of sequences representing each taxon at rank
                   RANK
  -l, --lengths    print sequence lengths

lineage_table

usage: taxit lineage_table [-h] [--seqname-col NAME] [--tax-id-col NAME]
                           [-c FILE] [-t FILE]
                           FILE FILE

Create a table of lineages as taxonomic names for a collection of sequences

Minimal inputs are a taxtable and a file providing a mapping of
sequence names to tax_ids. Outputs are one or more of:

* a table of taxonomic lineages in csv format

* a "taxonomy" file formatted for MOTHUR
  (https://mothur.org/wiki/Taxonomy_File). Ranks are limited to the
  following, with corresponding abbreviations:

    ('species', 's'),
    ('genus', 'g'),
    ('family', 'f'),
    ('order', 'o'),
    ('class', 'c'),
    ('phylum', 'p'),
    ('domain', 'd'),

  Lineages are truncated to either the most specific defined rank or
  species, and missing tax_names at a given rank are replaced with the
  tax_name of the parent, eg

    "...;f__something;g__<None>;s__whatever"

  would become

    "...;f__something;g__something_unclassified;s__whatever"

options:
  -h, --help            show this help message and exit

input options:
  FILE                  output of "taxit taxtable" containing all tax_ids
                        represented in "seq_info"
  FILE                  csv file providing a mapping of sequence names to
                        tax_ids
  --seqname-col NAME    name of column in "seq_info" containing sequence names
  --tax-id-col NAME     name of column in "seq_info" containing tax_ids

Output options:
  -c FILE, --csv-table FILE
                        Output file containing lineages for each sequence name
                        in csv format
  -t FILE, --taxonomy-table FILE
                        "taxonomy" file formatted for MOTHUR

Examples:

taxit taxtable taxonomy.db -i seq_info.csv -o taxtable.csv
taxit lineage_table taxtable.csv seq_info.csv \
    --csv-table taxonomy.csv --taxonomy-table taxonomy.txt

taxonomy.txt looks like this:

s1    "pk__Bacteria";"ph__Firmicutes";"cl__Bacilli";"or__Bacillales";"fa__Staphylococcaceae";"ge__Staphylococcus";"sp__Staphylococcus aureus"
s2    "pk__Bacteria";"ph__Firmicutes";"cl__Bacilli";"or__Bacillales";"fa__Staphylococcaceae";"ge__Staphylococcus";"sp__Staphylococcus equorum"
s3    "pk__Bacteria";"ph__Firmicutes";"cl__Bacilli";"or__Bacillales";"fa__Staphylococcaceae";"ge__Staphylococcus";"sp__Staphylococcus equorum"
s4    "pk__Bacteria";"ph__Firmicutes";"cl__Bacilli";"or__Bacillales";"fa__Staphylococcaceae";"ge__Staphylococcus";"sp__unclassified"
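
The truncation and fill-in rule described in the help text for the MOTHUR output can be sketched in Python. This is one reading of that rule, not taxtastic's code: the function name `mothur_lineage` is hypothetical, the rank abbreviations are the ones listed in the help text, and the quoting and prefixes seen in the sample output above are not reproduced.

```python
# Ranks and abbreviations as listed in the help text, ordered root to tip.
RANKS = [
    ("domain", "d"), ("phylum", "p"), ("class", "c"), ("order", "o"),
    ("family", "f"), ("genus", "g"), ("species", "s"),
]


def mothur_lineage(names):
    """Build a lineage string from a dict mapping rank -> tax_name or None."""
    defined = [i for i, (rank, _) in enumerate(RANKS) if names.get(rank)]
    if not defined:
        return ""
    parts = []
    parent = None
    # Truncate at the most specific rank that has a name.
    for rank, abbrev in RANKS[: defined[-1] + 1]:
        name = names.get(rank)
        if name:
            parent = name
            label = name
        elif parent is not None:
            # Missing name: fall back to the nearest named ancestor.
            label = f"{parent}_unclassified"
        else:
            label = "unclassified"
        parts.append(f"{abbrev}__{label}")
    return ";".join(parts)


example = {
    "domain": "Bacteria", "phylum": "Firmicutes", "class": "Bacilli",
    "order": "Bacillales", "family": "Staphylococcaceae",
    "genus": None, "species": "Staphylococcus aureus",
}
print(mothur_lineage(example))
```

The missing genus is filled in from the family name, matching the `g__something_unclassified` pattern in the help text.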

lonelynodes

usage: taxit lonelynodes [-h] [-o OUT] [-r RANKS] taxtable_or_refpkg

Extracts tax ids of all lonely nodes in a taxtable

Find nodes in ``target`` (which can be a CSV file extracted by ``taxit
taxtable`` or a RefPkg containing such a file) which are lonely; that
is, whose parents have only one child. Print them, one per line, to
``stdout`` or to the file specified by the ``-o`` option.

positional arguments:
  taxtable_or_refpkg    A taxtable or a refpkg containing a taxtable

options:
  -h, --help            show this help message and exit
  -o OUT, --out OUT     Write output to given file [default: stdout]
  -r RANKS, --ranks RANKS
                        Comma separated list of ranks to consider [default:
                        all ranks]

Examples:

# Find lonely nodes in RefPkg mypkg-0.1.refpkg
taxit lonelynodes mypkg-0.1.refpkg
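
The definition of a lonely node (a node whose parent has only one child) can be illustrated with a short Python sketch. It assumes the taxtable has been reduced to a mapping of tax_id to parent_id and that the root is its own parent. The function name `lonely_nodes` and the toy tax_ids are hypothetical.

```python
from collections import defaultdict


def lonely_nodes(parents):
    """Return tax_ids whose parent has exactly one child.

    `parents` maps tax_id -> parent_id.
    """
    children = defaultdict(list)
    for tax_id, parent_id in parents.items():
        if tax_id != parent_id:  # skip a root that is its own parent
            children[parent_id].append(tax_id)
    return sorted(kids[0] for kids in children.values() if len(kids) == 1)


# Toy taxonomy: root -> genera A and B; A has two species, B has one.
toy = {"root": "root", "A": "root", "B": "root", "a1": "A", "a2": "A", "b1": "B"}
print(lonely_nodes(toy))
```

Only b1 is lonely in this toy taxonomy, because its parent B has no other children.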

new_database

usage: taxit new_database [-h] [--schema SCHEMA] [--no-clobber] [-n]
                          [-a {error,warn}] [-z FILE.zip] [-u URL] [-p PATH]
                          [--out OUT]
                          [url]

Download NCBI taxonomy and create a database

Download the current version of the NCBI taxonomy and load it into
``database_file`` as an SQLite3 database.  If ``database_file``
already exists it will be overwritten unless you specify ``--no-clobber``.
The NCBI taxonomy will be downloaded into
the same directory as ``database_file`` will be created in unless you
specify ``-p`` or ``--download-dir``.

positional arguments:
  url                   Database string URI or filename. If no database scheme
                        specified "sqlite:///" will be prepended.
                        [sqlite:///ncbi_taxonomy.db]

options:
  -h, --help            show this help message and exit
  --no-clobber          If database exists keep current data and append new
                        data. [False]
  -n, --no-load         Create schema and exit
  -a {error,warn}, --unknown-action {error,warn}
                        action to perform for unknown ranks [error]
  --out OUT             table sql

database options:
  --schema SCHEMA       Name of SQL schema in database to query (if database
                        flavor supports this).

download options:
  -z FILE.zip, --taxdump-file FILE.zip
                        Location of zipped taxdump file [taxdmp.zip]
  -u URL, --taxdump-url URL
                        Url to taxdump file
                        [https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdmp.zip]
  -p PATH, --download-dir PATH
                        Name of the directory into which to download the zip
                        archive. [default is the same directory as the
                        database file]

Examples:

Download the NCBI taxonomy and create taxonomy.db if it does not exist:

taxit new_database taxonomy.db

Force the creation of taxonomy.db in the parent directory, putting the downloaded NCBI data in /tmp/ncbi:

taxit new_database ../taxonomy.db -p /tmp/ncbi
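
The help text for the url argument says that "sqlite:///" is prepended when no database scheme is given. A minimal Python sketch of that rule follows. Detecting a scheme by looking for "://" is a simplifying assumption, and the function name `database_url` is hypothetical.

```python
def database_url(url="ncbi_taxonomy.db"):
    """Prepend "sqlite:///" when the string carries no database scheme."""
    # Assumption: any string containing "://" already names a scheme.
    if "://" in url:
        return url
    return "sqlite:///" + url


print(database_url("taxonomy.db"))
print(database_url("postgresql://localhost/taxonomy"))
```

Under this reading a bare filename such as taxonomy.db is treated as an SQLite file, while a full URI is passed through unchanged.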

refpkg_intersection

usage: taxit refpkg_intersection [-h] -c REFPKG -r RANKS [--all-ranks]
                                 [-o OUT]
                                 infile

Find the intersection of a taxtable and a refpkg's taxonomy.

positional arguments:
  infile                taxtable to compare against

options:
  -h, --help            show this help message and exit
  -c REFPKG, --refpkg REFPKG
                        refpkg to insert into
  -r RANKS, --ranks RANKS
                        ranks to list in the output
  --all-ranks           don't filter by the lowest rank; list all
                        intersections
  -o OUT, --out OUT     output file in csv format (default is stdout)

reroot

usage: taxit reroot [-h] [--rppr RPPR] [-p] refpkg

Taxonomically reroots a reference package

Calls ``rppr reroot`` to generate a rerooted tree from the tree in
``refpkg`` and writes it back to the refpkg.  The refpkg ``refpkg``
must contain the necessary inputs for ``pplacer`` for this to work.

positional arguments:
  refpkg         the reference package to operate on

options:
  -h, --help     show this help message and exit
  --rppr RPPR    specify the rppr binary to call to perform the rerooting
  -p, --pretend  don't save the rerooted tree; just attempt the rerooting.

Examples:

Reroot the tree in my_refpkg:

taxit reroot my_refpkg

Try running reroot without modifying the refpkg, using a particular version of rppr:

taxit reroot --rppr ~/local/bin/rppr -p my_refpkg

rollback

usage: taxit rollback [-h] [-n int] refpkg

Undo an operation performed on a refpkg

Rollback ``N`` operations on ``refpkg`` (default to 1 operation if
``-n`` is omitted).  This is equivalent to calling the ``rollback()``
method of ``taxtastic.refpkg.Refpkg``.  If there are not at least
``N`` operations that can be rolled back, an error is returned and no
changes are made to the refpkg.

positional arguments:
  refpkg      the reference package to operate on

options:
  -h, --help  show this help message and exit
  -n int      Number of operations to roll back

Examples:

Update the author on my_refpkg, then revert the change:

taxit update my_refpkg --metadata 'author=Boris the mad baboon'
taxit rollback my_refpkg

Roll back the last 3 operations on my_refpkg:

taxit rollback -n 3 my_refpkg

rollforward

usage: taxit rollforward [-h] [-n int] refpkg

Restore a change to a refpkg immediately after being reverted

Restore the last ``N`` rolled back operations on ``refpkg``, or the
last operation if ``-n`` is omitted.  If there are not at least ``N``
operations that can be rolled forward on this refpkg, then an error is
returned and no changes are made to the refpkg.

Note that operations can only be rolled forward immediately after
being rolled back.  If any operation besides a rollback occurs, all
roll forward information is removed.

positional arguments:
  refpkg      the reference package to operate on

options:
  -h, --help  show this help message and exit
  -n int      Number of operations to roll forward

Examples:

Roll back the last operation on my_refpkg, then restore it:

taxit rollback my_refpkg
taxit rollforward my_refpkg

Roll forward the last 3 rollbacks on my_refpkg:

taxit rollforward -n 3 my_refpkg
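
The interaction between rollback and rollforward can be modelled with two stacks. The Python sketch below is a toy model of the behaviour described above (an error with no changes when too few operations are available, and roll-forward information discarded by any other operation). The class `History` is hypothetical and is not the `taxtastic.refpkg.Refpkg` implementation.

```python
class History:
    """Toy undo/redo model; not taxtastic's Refpkg implementation."""

    def __init__(self, state):
        self.state = state
        self._undo = []  # earlier states, most recent last
        self._redo = []  # states undone by rollback, most recent last

    def apply(self, new_state):
        self._undo.append(self.state)
        self.state = new_state
        # Any operation other than a rollback discards roll-forward info.
        self._redo.clear()

    def rollback(self, n=1):
        # Check first so that a failure leaves the state untouched.
        if n > len(self._undo):
            raise ValueError("not enough operations to roll back")
        for _ in range(n):
            self._redo.append(self.state)
            self.state = self._undo.pop()

    def rollforward(self, n=1):
        if n > len(self._redo):
            raise ValueError("not enough operations to roll forward")
        for _ in range(n):
            self._undo.append(self.state)
            self.state = self._redo.pop()


history = History("v0")
history.apply("v1")
history.apply("v2")
history.rollback(2)
history.rollforward()
print(history.state)
```

After two updates, a rollback of two operations and one rollforward, the model sits at the state produced by the first update.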

rp (resolve path)

usage: taxit rp [-h] refpkg KEY

Resolve path; get the path to a file in the reference package

See online documentation for ``taxit create`` for a list of
permissible values for ``KEY``

For example, write the absolute path to the file containing the
phylogenetic tree in ``my.refpkg`` to stdout::

  taxit rp my.refpkg tree

Examine the contents of the seq_info file::

  less $(taxit rp my.refpkg seq_info)

positional arguments:
  refpkg      the reference package to operate on
  KEY         show the path for file identified by KEY

options:
  -h, --help  show this help message and exit

strip

usage: taxit strip [-h] refpkg

Remove rollback and rollforward information from a refpkg

Delete everything in the refpkg not relevant to the current state,
including all files no longer referred to, as well as all rollback and
rollforward information. The log is preserved, with a new entry
entered indicating that ``refpkg`` was stripped.

positional arguments:
  refpkg      the reference package to operate on

options:
  -h, --help  show this help message and exit

Examples:

Perform an update:

taxit update my_refpkg hilda=file1

Perform a second update. After it, file1 is still in the refpkg, but is referred to only by the rollback information:

taxit update my_refpkg hilda=file2

Now strip deletes file1, and the rollback and rollforward information:

taxit strip my_refpkg

taxids

usage: taxit taxids [-h] [--schema SCHEMA] [-f FILE | -n NAMES] [-o FILE]
                    [url]

Convert a list of taxonomic names into a recursive list of
species-level tax_ids.

The names to convert can be specified in a text file with one name
per line (the ``-f`` or ``--name-file`` options) or on the command
line as a comma delimited list (the ``-n`` or ``--name`` options).

positional arguments:
  url                   Database string URI or filename. If no database scheme
                        specified "sqlite:///" will be prepended.
                        [sqlite:///ncbi_taxonomy.db]

options:
  -h, --help            show this help message and exit

database options:
  --schema SCHEMA       Name of SQL schema in database to query (if database
                        flavor supports this).

Input options:
  -f FILE, --name-file FILE
                        file containing a list of taxonomic names, one per
                        line
  -n NAMES, --name NAMES
                        list of taxonomic names provided as a comma-delimited
                        list on the command line

Output options:
  -o FILE, --out FILE   output file

Examples:

Look up two species and print their tax_ids to stdout, one per line:

taxit taxids ncbi_database.db -n "Lactobacillus crispatus,Lactobacillus helveticus"

Read the species from some_names.txt and write their tax_ids to some_taxids.txt:

taxit taxids ncbi_database.db -f some_names.txt -o some_taxids.txt
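
The phrase "recursive list of species-level tax_ids" can be read as a walk down the taxonomy from each named node that collects every node of rank species. The Python sketch below illustrates that reading with in-memory dictionaries. The function name `species_below` and the toy tax_ids are hypothetical, and the actual command queries the database rather than dictionaries.

```python
def species_below(tax_id, children, ranks):
    """Collect species-level tax_ids at or below `tax_id`."""
    found = []
    stack = [tax_id]
    while stack:
        node = stack.pop()
        if ranks.get(node) == "species":
            found.append(node)  # stop descending once a species is reached
        else:
            stack.extend(children.get(node, []))
    return sorted(found)


# Toy data: one genus with a species group and three species.
children = {"G1": ["SG1", "S3"], "SG1": ["S1", "S2"]}
ranks = {
    "G1": "genus", "SG1": "species_group",
    "S1": "species", "S2": "species", "S3": "species",
}
print(species_below("G1", children, ranks))
```

A name that already resolves to a species yields just that one tax_id.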

taxtable

usage: taxit taxtable [-h] [--schema SCHEMA] [-t TAX_IDS [TAX_IDS ...]]
                      [-f FILE] [-i SEQ_INFO] [-a {error,warn}] [-o FILE]
                      [url]

Create a tabular representation of taxonomic lineages

Write a CSV file containing the minimal subset of the taxonomy in
``database_file`` representing all of the lineages specified by the
provided tax_ids. Duplicate tax_ids are ignored.

By default the CSV is written to ``stdout``, unless a file is
specified with ``-o/--outfile``.

positional arguments:
  url                   Database string URI or filename. If no database scheme
                        specified "sqlite:///" will be prepended.
                        [sqlite:///ncbi_taxonomy.db]

options:
  -h, --help            show this help message and exit
  -a {error,warn}, --unknown-action {error,warn}
                        action to perform for tax_ids not present in database
                        [error]

database options:
  --schema SCHEMA       Name of SQL schema in database to query (if database
                        flavor supports this).

input options:
  -t TAX_IDS [TAX_IDS ...], --tax-ids TAX_IDS [TAX_IDS ...]
                        one or more space-delimited tax_ids (eg "-t 47770
                        33945")
  -f FILE, --tax-id-file FILE
                        File containing a whitespace-delimited list of tax_ids
                        (ie, separated by tabs, spaces, or newlines).
  -i SEQ_INFO, --seq-info SEQ_INFO
                        Read tax_ids from sequence info file, minimally
                        containing a column named "tax_id"

Output options:
  -o FILE, --outfile FILE
                        Output file containing lineages for the specified taxa
                        in csv format; writes to stdout if unspecified

Examples:

Extract tax_ids 47770 and 33945 and all nodes connecting them to the root:

taxit taxtable taxonomy.db -t 47770 33945

The same as above, but write the output to subtax.csv instead of stdout:

taxit taxtable taxonomy.db -t 47770 33945 -o subtax.csv

Extract the same tax_ids, plus the tax_ids listed in taxids.txt:

taxit taxtable taxonomy.db -t 47770 33945 -f taxids.txt -o taxonomy_from_both.csv
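
The "minimal subset of the taxonomy" is the set of nodes lying on the paths from the requested tax_ids up to the root. A Python sketch of that idea follows, assuming a mapping of tax_id to parent_id in which the root is its own parent. The function name `lineage_subset` and the toy tax_ids are hypothetical.

```python
def lineage_subset(tax_ids, parents):
    """Return every tax_id on the paths from `tax_ids` up to the root."""
    keep = set()
    for tax_id in tax_ids:
        node = tax_id
        # Stop early when the walk joins a lineage that was already collected.
        while node not in keep:
            keep.add(node)
            parent = parents[node]
            if parent == node:  # the root is its own parent in this toy data
                break
            node = parent
    return keep


parents = {"root": "root", "A": "root", "B": "root", "a1": "A", "a2": "A", "b1": "B"}
print(sorted(lineage_subset(["a1", "a1", "a2"], parents)))
```

Duplicate inputs have no effect on the result, and the unrelated branch under B is left out.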

update

usage: taxit update [-h] [--metadata]
                    [--stats-type {PhyML,FastTree,RAxML,IQTREE}]
                    [--frequency-type {empirical,model}]
                    refpkg [key=value ...]

Add or modify files or metadata in a refpkg

Update ``refpkg`` to set ``key`` to ``some value``.  If ``--metadata``
is specified, the update is done to the metadata.  Otherwise ``some
value`` is treated as the path to a file, and that file is updated in
``refpkg``.  An arbitrary number of "key=value" pairs can be specified on the
command line.  If the same key is specified twice, the later
occurrence dominates.

All updates specified to an instance of this command are run as a
single operation, and will all be undone by a single rollback.

For example::

  taxit update my-refpkg meep=../otherdir/boris hilda=abcd

If a file already exists under a given key, it is overwritten.

The --metadata option causes a change to the metadata instead of
files.  For example, to set the author field to "Genghis Khan" and the
version to 0.4.3::

  taxit update my-refpkg --metadata "author=Genghis Khan" version=0.4.3

Other examples:

Set the author in my_refpkg::

  taxit update my_refpkg --metadata "author=Boris the mad baboon"

Set the author and version at once::

  taxit update my_refpkg --metadata "author=Bill" "package_version=1.7.2"

Insert a file into the refpkg::

  taxit update my_refpkg "aln_fasta=/path/to/a/file.fasta"

positional arguments:
  refpkg                the reference package to operate on
  key=value             keys to update, in key=some_file format

options:
  -h, --help            show this help message and exit
  --metadata            Update metadata instead of files

Tree inference log file parsing (for updating `tree_stats`):
  --stats-type {PhyML,FastTree,RAxML,IQTREE}
                        stats file type [default: attempt to guess from file
                        contents]
  --frequency-type {empirical,model}
                        Residue frequency type from the model. Required for
                        PhyML Amino Acid alignments.
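
The rule that a later occurrence of a key dominates an earlier one can be sketched in Python as a dictionary built from left to right. This illustrates the documented behaviour only; the function name `parse_pairs` is hypothetical and is not taxtastic's parser.

```python
def parse_pairs(args):
    """Parse "key=value" strings; a later duplicate key overrides an earlier one."""
    pairs = {}
    for arg in args:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        pairs[key] = value
    return pairs


print(parse_pairs(["hilda=file1", "meep=../otherdir/boris", "hilda=file2"]))
```

In this example the key hilda ends up pointing at file2, the value given last.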

update_taxids

usage: taxit update_taxids [-h] [--schema SCHEMA] [--delimiter]
                           [--taxid-column] [--unknowns]
                           [-a {drop,ignore,error}] [-o]
                           infile [url]

Update obsolete tax_ids

Replaces tax_ids as specified in table 'merged' in the taxonomy
database. Use in preparation for ``taxit taxtable``. Takes sequence
info file as passed to ``taxit create --seq-info``

positional arguments:
  infile                Input file with taxids. Use "-" for stdin.
  url                   Database string URI or filename. If no database scheme
                        specified "sqlite:///" will be prepended.
                        [sqlite:///ncbi_taxonomy.db]

options:
  -h, --help            show this help message and exit
  --delimiter           Infile columns delimiter [,]
  --taxid-column        name of column or index if headerless containing
                        tax_ids to be replaced [tax_id]
  --unknowns            optional output file containing rows with unknown
                        tax_ids having no replacements in merged table
  -a {drop,ignore,error}, --unknown-action {drop,ignore,error}
                        action to perform for tax_ids with no replacement in
                        merged table [error]
  -o , --outfile        Modified version of input file [stdout]

database options:
  --schema SCHEMA       Name of SQL schema in database to query (if database
                        flavor supports this).
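
The replacement step can be pictured as a lookup in a mapping from obsolete to current tax_ids, with the unknown action deciding what happens to ids found in neither place. The Python sketch below is an interpretation of the option names (drop, ignore, error), not taxtastic's code. The function name `update_tax_ids`, its arguments, and the toy ids are hypothetical.

```python
def update_tax_ids(tax_ids, merged, current, unknown_action="error"):
    """Replace obsolete ids using `merged` (old -> new).

    `current` is the set of ids present in the taxonomy.
    """
    updated = []
    for tax_id in tax_ids:
        if tax_id in current:
            updated.append(tax_id)
        elif tax_id in merged:
            updated.append(merged[tax_id])
        elif unknown_action == "drop":
            continue  # assumed meaning: leave the row out of the output
        elif unknown_action == "ignore":
            updated.append(tax_id)  # assumed meaning: keep the row unchanged
        else:
            raise ValueError(f"unknown tax_id: {tax_id}")
    return updated


current = {"100", "200"}
merged = {"150": "100"}
print(update_tax_ids(["100", "150", "999"], merged, current, unknown_action="drop"))
```

With the default action the same input raises an error, because 999 has no replacement.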
