dendropy.datamodel.taxonmodel: Taxonomic Namespace Reference and Management
The TaxonNamespace Class
- class dendropy.datamodel.taxonmodel.TaxonNamespace(*args, **kwargs)
A collection of Taxon objects representing a self-contained and complete domain of distinct operational taxonomic unit definitions. Provides the common semantic context in which operational taxonomic units referenced by various phylogenetic data objects (e.g., trees or alignments) can be related.
- Parameters:
*args (positional arguments, optional) – Accepts a single iterable as an optional positional argument. If a TaxonNamespace object is passed as the positional argument, then its member Taxon objects are added to this one by reference, giving a shallow copy (see Notes). If any other iterable is passed as the positional argument, then each string in the iterable will result in a new Taxon object being constructed and added to the namespace with the string as its label (name), while each Taxon object in the iterable will be added to the namespace directly.
**kwargs (keyword arguments) –
- label : string
The label or name for this namespace.
- is_mutable : boolean, optional (default = True)
If True (default), then Taxon objects can be added to this namespace. If False, then adding Taxon objects will result in an error.
- is_case_sensitive : boolean, optional (default = False)
Whether or not taxon names are considered case sensitive or insensitive.
Notes
An empty TaxonNamespace can be created (with an optional label) and Taxon objects added later:
>>> import dendropy
>>> from dendropy import Taxon
>>> tns = dendropy.TaxonNamespace(label="taxa")
>>> t1 = Taxon("a")
>>> tns.add_taxon(t1)
>>> t2 = Taxon("b")
>>> tns.add_taxon(t2)
>>> t3 = tns.new_taxon("c")
>>> tns
<TaxonNamespace 0x106509090 'taxa': [<Taxon 0x10661f050 'a'>, <Taxon 0x10651c590 'b'>, <Taxon 0x106642a90 'c'>]>
Alternatively, an iterable can be passed in as an initializer: every Taxon object in it is added directly, and for each string a new Taxon object is created and added. So, the below are all equivalent to the above:
>>> tns = dendropy.TaxonNamespace(["a", "b", "c"], label="taxa")

>>> taxa = [Taxon(n) for n in ["a", "b", "c"]]
>>> tns = dendropy.TaxonNamespace(taxa, label="taxa")

>>> t1 = Taxon("a")
>>> t2 = Taxon("b")
>>> taxa = [t1, t2, "c"]
>>> tns = dendropy.TaxonNamespace(taxa, label="taxa")
If a TaxonNamespace object is passed as the initializer argument, a shallow copy of the object is constructed:
>>> tns1 = dendropy.TaxonNamespace(["a", "b", "c"], label="taxa1")
>>> tns1
<TaxonNamespace 0x1097275d0 'taxa1': [<Taxon 0x109727610 'a'>, <Taxon 0x109727e10 'b'>, <Taxon 0x109727e90 'c'>]>
>>> tns2 = dendropy.TaxonNamespace(tns1, label="2")
>>> tns2
<TaxonNamespace 0x109727d50 'taxa1': [<Taxon 0x109727610 'a'>, <Taxon 0x109727e10 'b'>, <Taxon 0x109727e90 'c'>]>
Thus, while “tns1” and “tns2” are independent collections, and addition/deletion of Taxon instances to one will not affect the other, changing the label of a Taxon instance that is an element in one will of course affect the same instance if it is in the other:
>>> print(tns1[0].label)
a
>>> print(tns2[0].label)
a
>>> tns1[0].label = "Z"
>>> print(tns1[0].label)
Z
>>> print(tns2[0].label)
Z
In contrast to actual data (i.e., the Taxon objects), all metadata associated with “tns2” (i.e., the AnnotationSet object, in the TaxonNamespace.annotations attribute), will be a full, independent deep-copy.
If what is needed is a true deep-copy of the data of a particular TaxonNamespace object, including copies of the member Taxon instances, then this can be achieved using copy.deepcopy:
>>> import copy
>>> tns1 = dendropy.TaxonNamespace(["a", "b", "c"], label="taxa1")
>>> tns2 = copy.deepcopy(tns1)
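The difference between the shallow copy and the deep copy described above can be reproduced with plain Python objects. The following is only a sketch of the general semantics: it does not import DendroPy, and the `Tax` class is a hypothetical stand-in for a taxon, with a plain list standing in for the namespace.

```python
import copy


class Tax:
    """Hypothetical stand-in for a taxon; not DendroPy's Taxon class."""

    def __init__(self, label):
        self.label = label


original = [Tax("a"), Tax("b")]
shallow = list(original)            # new list, same Tax objects
deep = copy.deepcopy(original)      # new list, new Tax objects

original[0].label = "Z"             # relabel a shared member
shallow.append(Tax("c"))            # grows only the shallow copy

shared_label = shallow[0].label     # sees the relabelling
independent_label = deep[0].label   # unaffected by the relabelling
```

Membership changes stay local to each collection, while changes to a shared member are visible through every shallow copy that holds it.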
- __len__()
Returns number of Taxon objects in this TaxonNamespace.
- accession_index(taxon)
Returns the accession index of taxon. Note that this may not be the same as the list index of the taxon if taxa have been deleted from the namespace.
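To see why an accession index and a list index can diverge, consider the small model below. It is a sketch under stated assumptions rather than DendroPy's implementation: `MiniNamespace` and all of its methods are hypothetical, and it assumes accession indexes are handed out once, in order of addition, and never reused after a deletion.

```python
class MiniNamespace:
    """Hypothetical stand-in; not DendroPy's TaxonNamespace."""

    def __init__(self):
        self._labels = []        # current members, in list order
        self._accession = {}     # label -> accession index
        self._next_index = 0     # only ever increases (assumption)

    def add(self, label):
        self._labels.append(label)
        self._accession[label] = self._next_index
        self._next_index += 1

    def remove(self, label):
        self._labels.remove(label)
        del self._accession[label]

    def accession_index(self, label):
        return self._accession[label]

    def list_index(self, label):
        return self._labels.index(label)


ns = MiniNamespace()
for name in ("a", "b", "c"):
    ns.add(name)
ns.remove("a")

# "c" was the third label added, but is now the second list entry.
accession_c = ns.accession_index("c")
position_c = ns.list_index("c")
```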
- add_taxa(taxa)
Adds multiple Taxon objects to self.
Each Taxon object in taxa that is not already in the collection of Taxon objects in this namespace is added to it. If any of the Taxon objects are already in the collection, then nothing happens. If the namespace is immutable, then TypeError is raised when trying to add Taxon objects.
- add_taxon(taxon)
Adds a new Taxon object to self.
If taxon is not already in the collection of Taxon objects in this namespace, and this namespace is mutable, it is added to the collection. If it is already in the collection, then nothing happens. If it is not already in the collection, but the namespace is not mutable, then TypeError is raised.
- all_taxa_bitmask()
Returns mask of all taxa.
- Returns:
h (integer) – Bitmask spanning all Taxon objects in self.
- bitmask_as_newick_string(bitmask, preserve_spaces=False, quote_underscores=True)
Represents a split as a newick string.
- Parameters:
bitmask (integer) – Split hash bitmask value.
preserve_spaces (boolean, optional) – If False (default), then spaces in taxon labels will be replaced by underscores. If True, then taxon labels with spaces will be wrapped in quotes.
quote_underscores (boolean, optional) – If True (default), then taxon labels with underscores will be wrapped in quotes. If False, then the labels will not be wrapped in quotes.
- Returns:
s (string) – NEWICK representation of split specified by bitmask.
- bitmask_taxa_list(bitmask, index=0)
Returns list of Taxon objects represented by split bitmask.
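The bitmask methods above rest on the idea of representing a set of taxa as an integer with one bit per taxon. The sketch below illustrates that idea with plain integers. It assumes that the taxon at position i owns bit i; that mapping, and the helper names `taxon_bit`, `taxa_mask`, and `mask_to_labels`, are illustrative assumptions and not part of the DendroPy API.

```python
labels = ["a", "b", "c", "d"]


def taxon_bit(index):
    """Bitmask with only the bit for the taxon at `index` set."""
    return 1 << index


def taxa_mask(indexes):
    """Bitwise OR of the single-taxon masks for `indexes`."""
    mask = 0
    for i in indexes:
        mask |= taxon_bit(i)
    return mask


def mask_to_labels(mask, labels):
    """Labels whose bit is set in `mask`, in list order."""
    return [label for i, label in enumerate(labels) if mask & (1 << i)]


all_taxa = taxa_mask(range(len(labels)))   # every bit set: 0b1111
split = taxa_mask([0, 2])                  # bits for "a" and "c": 0b0101
members = mask_to_labels(split, labels)
complement = mask_to_labels(all_taxa & ~split, labels)
```

Masking the inverted split with the all-taxa mask yields the other side of the split without ever touching bits beyond the namespace.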
- description(depth=1, indent=0, itemize='', output=None, **kwargs)
Returns description of object, up to level depth.
- discard_taxon_label(label, is_case_sensitive=None, first_match_only=False)
Removes all Taxon objects with label matching label from the collection in this namespace.
- Parameters:
label (string or string-like) – The value of the Taxon object label to remove.
is_case_sensitive (None or bool) – By default, label lookup will use the is_case_sensitive attribute of self to decide whether or not to respect case when trying to match labels to operational taxonomic unit names represented by Taxon instances. This can be overridden by setting is_case_sensitive to True (forcing case-sensitivity) or False (forcing case-insensitivity).
first_match_only (bool) – If False, then the entire namespace will be searched and all Taxon objects with matching labels will be removed. If True, then only the first Taxon object with a matching label will be removed (i.e., the entire namespace is not searched). Setting this argument to True will be more efficient and should be preferred if there are no redundant or duplicate labels.
See also
TaxonNamespace.remove_taxon_label
Similar, but raises an error if no matching Taxon objects are found.
- findall(label, is_case_sensitive=None)
Return list of Taxon object(s) with label matching label.
- Parameters:
label (string or string-like) – The value which the label attribute of the Taxon object(s) to be returned must match.
is_case_sensitive (None or bool) – By default, label lookup will use the is_case_sensitive attribute of self to decide whether or not to respect case when trying to match labels to operational taxonomic unit names represented by Taxon instances. This can be overridden by setting is_case_sensitive to True (forcing case-sensitivity) or False (forcing case-insensitivity).
- Returns:
taxa (list[Taxon]) – A list containing zero or more Taxon objects with labels matching label.
- get_taxa(labels, is_case_sensitive=None, first_match_only=False)
Retrieves list of Taxon objects with given labels.
- Parameters:
labels (collections.Iterable[string]) – Any Taxon object in this namespace collection that has a label attribute that matches any value in labels will be included in the list returned.
is_case_sensitive (None or bool) – By default, label lookup will use the is_case_sensitive attribute of self to decide whether or not to respect case when trying to match labels to operational taxonomic unit names represented by Taxon instances. This can be overridden by setting is_case_sensitive to True (forcing case-sensitivity) or False (forcing case-insensitivity).
first_match_only (bool) – If False, then for each label in labels, the entire namespace will be searched and all matching Taxon objects will be added to the list. If True, then for each label in labels, only the first Taxon object with a matching label will be added to the list (i.e., the entire namespace is not searched). Setting this argument to True will be more efficient and should be preferred if there are no redundant or duplicate labels.
- Returns:
taxa (list[Taxon]) – A list containing zero or more Taxon objects with labels matching a value in labels.
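The interaction of is_case_sensitive and first_match_only in the lookup methods can be sketched over a plain list of strings. This is not DendroPy code: `find_labels` is a hypothetical helper, and using `str.lower()` to model case-insensitive comparison is an assumption of the sketch.

```python
def find_labels(labels, query, is_case_sensitive=False, first_match_only=False):
    """Return the stored labels that match `query`."""
    matches = []
    for label in labels:
        if is_case_sensitive:
            hit = label == query
        else:
            hit = label.lower() == query.lower()
        if hit:
            matches.append(label)
            if first_match_only:
                break   # stop scanning after the first hit
    return matches


stored = ["Foo", "bar", "FOO", "foo"]
insensitive = find_labels(stored, "foo")
sensitive = find_labels(stored, "foo", is_case_sensitive=True)
first_only = find_labels(stored, "foo", first_match_only=True)
```

With first_match_only the scan ends at the first hit, which is why it is cheaper when labels are known to be unique.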
- get_taxon(label, is_case_sensitive=None)
Retrieves a Taxon object with the given label.
If multiple Taxon objects exist with labels that match label, then only the first one is returned. If no Taxon object is found in this namespace with the specified criteria, None is returned.
- Parameters:
label (string or string-like) – The value which the label attribute of the Taxon object to be returned must match.
is_case_sensitive (None or bool) – By default, label lookup will use the is_case_sensitive attribute of self to decide whether or not to respect case when trying to match labels to operational taxonomic unit names represented by Taxon instances. This can be overridden by setting is_case_sensitive to True (forcing case-sensitivity) or False (forcing case-insensitivity).
- Returns:
taxon (Taxon object or None) – The first Taxon object in this namespace collection with a label matching label, or None if no such Taxon object exists.
- has_taxa_labels(labels, is_case_sensitive=None)
Checks for presence of Taxon objects with the given labels.
- Parameters:
labels (collections.Iterable[string]) – The values of the Taxon object labels to match.
is_case_sensitive (None or bool) – By default, label lookup will use the is_case_sensitive attribute of self to decide whether or not to respect case when trying to match labels to operational taxonomic unit names represented by Taxon instances. This can be overridden by setting is_case_sensitive to True (forcing case-sensitivity) or False (forcing case-insensitivity).
- Returns:
b (boolean) – True if, for every element in the iterable labels, there is at least one Taxon object whose label attribute matches it; False otherwise.
- has_taxon_label(label, is_case_sensitive=None)
Checks for presence of a Taxon object with the given label.
- Parameters:
label (string or string-like) – The value of the Taxon object label to match.
is_case_sensitive (None or bool) – By default, label lookup will use the is_case_sensitive attribute of self to decide whether or not to respect case when trying to match labels to operational taxonomic unit names represented by Taxon instances. This can be overridden by setting is_case_sensitive to True (forcing case-sensitivity) or False (forcing case-insensitivity).
- Returns:
b (boolean) – True if there is at least one Taxon object in this namespace with a label matching the value of label. Otherwise, False.
- label_taxon_map(is_case_sensitive=None)
Returns dictionary with taxon labels as keys and corresponding Taxon objects as values.
If the TaxonNamespace is currently case-insensitive, then the dictionary returned will have case-insensitive keys; otherwise the dictionary will be case-sensitive. You can override this by explicitly setting is_case_sensitive to False or True.
No attempt is made to handle collisions.
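What an unhandled collision means in practice can be shown with ordinary dictionaries. The sketch below does not use DendroPy: case-insensitive keys are modelled by lower-casing, and which entry survives a collision is a property of this sketch, not a documented guarantee of label_taxon_map.

```python
labels = ["Foo", "FOO", "bar"]

# Case-sensitive map: every distinct spelling gets its own key.
sensitive_map = {label: label for label in labels}

# Case-insensitive map, modelled here by lower-casing the keys.
# "Foo" and "FOO" collide; in this sketch the later entry silently wins.
insensitive_map = {}
for label in labels:
    insensitive_map[label.lower()] = label
```

If labels may differ only by case, it is safer to check for duplicates before relying on a label-keyed map.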
- labels()
Returns list of labels of all Taxon objects in self.
- Returns:
labels (list[string]) – List of Taxon.label values of Taxon objects in self.
- new_taxa(labels)
Creates and adds a new Taxon with corresponding label for each label in labels. Returns list of Taxon objects created.
- Parameters:
labels (collections.Iterable[string]) – The values of the label attributes of the new Taxon objects to be created, added to this namespace collection, and returned.
- Returns:
taxa (collections.Iterable[Taxon]) – A list of Taxon objects created and added.
- Raises:
TypeError – If this namespace is immutable (i.e. TaxonNamespace.is_mutable is False).
- new_taxon(label)
Creates, adds, and returns a new Taxon object with corresponding label.
- Parameters:
label (string or string-like) – The name or label of the new operational taxonomic unit concept.
- Returns:
taxon (Taxon) – The new Taxon object.
- remove_taxon(taxon)
Removes specified Taxon object from the collection in this namespace.
- Parameters:
taxon (Taxon) – The Taxon object to remove.
- Raises:
ValueError – If taxon is not in the collection of this namespace.
- remove_taxon_label(label, is_case_sensitive=None, first_match_only=False)
Removes all Taxon objects with label matching label from the collection in this namespace.
- Parameters:
label (string or string-like) – The value of the Taxon object label to remove.
is_case_sensitive (None or bool) – By default, label lookup will use the is_case_sensitive attribute of self to decide whether or not to respect case when trying to match labels to operational taxonomic unit names represented by Taxon instances. This can be overridden by setting is_case_sensitive to True (forcing case-sensitivity) or False (forcing case-insensitivity).
first_match_only (bool) – If False, then the entire namespace will be searched and all Taxon objects with matching labels will be removed. If True, then only the first Taxon object with a matching label will be removed (i.e., the entire namespace is not searched). Setting this argument to True will be more efficient and should be preferred if there are no redundant or duplicate labels.
- Raises:
LookupError – If no Taxon objects are found with matching label(s).
See also
TaxonNamespace.discard_taxon_label
Similar, but does not raise an error if no matching Taxon objects are found.
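The only behavioural difference the documentation states between discard_taxon_label and remove_taxon_label is what happens when nothing matches. A minimal sketch of that contrast over a plain list of strings follows; `discard_label` and `remove_label` are hypothetical helpers, not DendroPy functions, and matching here is exact for brevity.

```python
def discard_label(labels, target):
    """Remove every occurrence of `target`; return how many were removed."""
    before = len(labels)
    labels[:] = [x for x in labels if x != target]
    return before - len(labels)


def remove_label(labels, target):
    """Like discard_label, but a miss is reported as LookupError."""
    if discard_label(labels, target) == 0:
        raise LookupError(target)


names = ["a", "b", "a"]
discard_label(names, "z")      # no match: silently does nothing
remove_label(names, "a")       # removes both "a" entries

try:
    remove_label(names, "z")   # no match: raises
    missing_raised = False
except LookupError:
    missing_raised = True
```

This mirrors the familiar `set.discard` versus `set.remove` convention in Python.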
- require_taxon(label, is_case_sensitive=None)
Retrieves a Taxon object with the given label, creating it if necessary.
If multiple Taxon objects exist with labels that match label, then only the first one is returned. If no such Taxon object exists in the current namespace and the TaxonNamespace is NOT mutable, an exception is raised. If no such Taxon object exists in the current namespace and the TaxonNamespace is mutable, then a new Taxon is created, added, and returned.
- Parameters:
label (string or string-like) – The value which the label attribute of the Taxon object to be returned must match.
is_case_sensitive (None or bool) – By default, label lookup will use the is_case_sensitive attribute of self to decide whether or not to respect case when trying to match labels to operational taxonomic unit names represented by Taxon instances. This can be overridden by setting is_case_sensitive to True (forcing case-sensitivity) or False (forcing case-insensitivity).
- Returns:
taxon (Taxon object or None) – A Taxon object in this namespace collection with a label matching label.
- Raises:
TypeError – If no Taxon object is currently in the collection with a label matching the input label and the is_mutable attribute of self is False.
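The get-or-create behaviour described for require_taxon can be summarised in a few lines. The following is a sketch over a plain list of strings, not DendroPy's implementation: `require_label` is a hypothetical helper, and case-insensitive matching via `str.lower()` is an assumption.

```python
def require_label(labels, label, is_mutable=True):
    """Return the first stored label matching `label`, adding it if allowed."""
    for existing in labels:
        if existing.lower() == label.lower():   # case-insensitive match
            return existing
    if not is_mutable:
        raise TypeError("namespace is immutable; cannot add %r" % label)
    labels.append(label)
    return label


names = ["Homo sapiens"]
found = require_label(names, "homo sapiens")        # existing entry returned
created = require_label(names, "Pan troglodytes")   # new entry appended

try:
    require_label(names, "Gorilla gorilla", is_mutable=False)
    immutable_raised = False
except TypeError:
    immutable_raised = True
```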
- sort(key=None, reverse=False)
Sorts Taxon objects in collection. If key is not given, defaults to sorting by label (i.e., key = lambda x: x.label).
- Parameters:
key (key function object, optional) – Function that takes a Taxon object as an argument and returns the value that determines its sort order. Defaults to sorting by label.
reverse (boolean, optional) – If True, sort will be in reverse order.
- split_as_newick_string(split, preserve_spaces=False, quote_underscores=True)
Represents a split as a newick string.
- Parameters:
split (integer) – Split hash bitmask value.
preserve_spaces (boolean, optional) – If False (default), then spaces in taxon labels will be replaced by underscores. If True, then taxon labels with spaces will be wrapped in quotes.
quote_underscores (boolean, optional) – If True (default), then taxon labels with underscores will be wrapped in quotes. If False, then the labels will not be wrapped in quotes.
- Returns:
s (string) – NEWICK representation of the split specified by split.
- taxa_bipartition(**kwargs)
Returns a bipartition that represents all taxa specified by keyword-specified list of taxon objects (taxa=) or labels (labels=).
- Parameters:
**kwargs (keyword arguments) –
Requires one of:
- taxa : iterable of Taxon objects
- labels : iterable of Taxon label values
- Returns:
b (list[integer]) – List of split hash bitmask values for specified Taxon objects.
- taxa_bitmask(**kwargs)
Retrieves the list of split hash bitmask values representing all taxa specified by keyword-specified list of taxon objects (taxa=) or labels (labels=).
- Parameters:
**kwargs (keyword arguments) –
Requires one of:
- taxa : iterable of Taxon objects
- labels : iterable of Taxon label values
- Returns:
b (list[integer]) – List of split hash bitmask values for specified Taxon objects.
- taxon_bitmask(taxon)
Returns bitmask value of split hash for split subtending node with taxon.
- taxon_namespace_scoped_copy(memo=None)
Cloning level: 1. Taxon-namespace-scoped copy: All member objects are full independent instances, except for TaxonNamespace and Taxon objects: these are preserved as references.
The Taxon Class
- class dendropy.datamodel.taxonmodel.Taxon(label=None)
A taxon associated with a sequence or a node on a tree.
- Parameters:
label (string or Taxon object) – Label or name of this operational taxonomic unit concept. If a string, then the label attribute of self is set to this value. If a Taxon object, then the label attribute of self is set to the same value as the label attribute of the other Taxon object and all annotations/metadata are copied.
- description(depth=1, indent=0, itemize='', output=None, **kwargs)
Returns description of object, up to level depth.
- taxon_namespace_scoped_copy(memo=None)
Cloning level: 1. Taxon-namespace-scoped copy: All member objects are full independent instances, except for TaxonNamespace and Taxon objects: these are preserved as references.
The TaxonNamespaceAssociated Class
- class dendropy.datamodel.taxonmodel.TaxonNamespaceAssociated(taxon_namespace=None)
Provides infrastructure for the maintenance of references to taxa.
- migrate_taxon_namespace(taxon_namespace, unify_taxa_by_label=True, taxon_mapping_memo=None)
Move this object and all members to a new operational taxonomic unit concept namespace scope.
The current self.taxon_namespace value will be replaced with the value given in taxon_namespace if this is not None, or otherwise with a new TaxonNamespace object. Following this, reconstruct_taxon_namespace() will be called: each distinct Taxon object associated with self or members of self that is not already in taxon_namespace will be replaced with a new Taxon object that will be created with the same label and added to self.taxon_namespace. Calling this method results in the object (and all its member objects) being associated with a new, independent taxon namespace.
Label mapping case sensitivity follows the self.taxon_namespace.is_case_sensitive setting. If False and unify_taxa_by_label is also True, then the establishment of correspondence between Taxon objects in the old and new namespaces will be based on case-insensitive matching of labels. E.g., if there are four Taxon objects with labels ‘Foo’, ‘Foo’, ‘FOO’, and ‘FoO’ in the old namespace, then all objects that reference these will reference a single new Taxon object in the new namespace (with a label that is some existing casing variant of ‘foo’). If is_case_sensitive is True and unify_taxa_by_label is True, Taxon objects with labels identical except in case will be considered distinct.
- Parameters:
taxon_namespace (TaxonNamespace) – The TaxonNamespace into the scope of which this object will be moved.
unify_taxa_by_label (boolean, optional) – If True, then references to distinct Taxon objects with identical labels in the current namespace will be replaced with a reference to a single Taxon object in the new namespace. If False, references to distinct Taxon objects will remain distinct, even if the labels are the same.
taxon_mapping_memo (dictionary) – Similar to memo of deepcopy, this is a dictionary that maps Taxon objects in the old namespace to corresponding Taxon objects in the new namespace. Mostly for internal use when migrating complex data to a new namespace. Note that any mappings here take precedence over all other options: if a Taxon object in the old namespace is found in this dictionary, the counterpart in the new namespace will be whatever value is mapped, regardless of, e.g., label values.
Examples
Use this method to move an object from one taxon namespace to another.
For example, to get a copy of an object associated with another taxon namespace and associate it with a different namespace:
# Get handle to the new TaxonNamespace
other_taxon_namespace = some_other_data.taxon_namespace
# Get a taxon-namespace scoped copy of a tree
# in another namespace
t2 = Tree(t1)
# Replace taxon namespace of copy
t2.migrate_taxon_namespace(other_taxon_namespace)
You can also use this method to get a copy of a structure and then move it to a new namespace:
t2 = Tree(t1)
t2.migrate_taxon_namespace(TaxonNamespace())
# Note: the same effect can be achieved by:
t3 = copy.deepcopy(t1)
See also
- poll_taxa(taxa=None)
Returns a set populated with all of the Taxon instances associated with self.
- Parameters:
taxa (set()) – Set to populate. If not specified, a new one will be created.
- Returns:
taxa (set[Taxon]) – Set of taxa associated with self.
- purge_taxon_namespace()
Remove all Taxon instances in self.taxon_namespace that are not associated with self or any item in self.
- reconstruct_taxon_namespace(unify_taxa_by_label=True, taxon_mapping_memo=None)
Repopulates the current taxon namespace with new taxon objects, preserving labels. Each distinct Taxon object associated with self or members of self that is not already in self.taxon_namespace will be replaced with a new Taxon object that will be created with the same label and added to self.taxon_namespace.
Label mapping case sensitivity follows the self.taxon_namespace.is_case_sensitive setting. If False and unify_taxa_by_label is also True, then the establishment of correspondence between Taxon objects in the old and new namespaces will be based on case-insensitive matching of labels. E.g., if there are four Taxon objects with labels ‘Foo’, ‘Foo’, ‘FOO’, and ‘FoO’ in the old namespace, then all objects that reference these will reference a single new Taxon object in the new namespace (with a label that is some existing casing variant of ‘foo’). If is_case_sensitive is True and unify_taxa_by_label is True, Taxon objects with labels identical except in case will be considered distinct.
Note
Existing Taxon objects in self.taxon_namespace are not removed. This method should thus be called only when self.taxon_namespace has been changed. In fact, typical usage would not involve calling this method directly, but rather through
- Parameters:
unify_taxa_by_label (boolean, optional) – If True, then references to distinct Taxon objects with identical labels in the current namespace will be replaced with a reference to a single Taxon object in the new namespace. If False, references to distinct Taxon objects will remain distinct, even if the labels are the same.
taxon_mapping_memo (dictionary) – Similar to memo of deepcopy, this is a dictionary that maps Taxon objects in the old namespace to corresponding Taxon objects in the new namespace. Mostly for internal use when migrating complex data to a new namespace.
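The effect of unify_taxa_by_label and case sensitivity on the ‘Foo’, ‘Foo’, ‘FOO’, ‘FoO’ example can be traced with a small model. This is a sketch of the described behaviour only, not DendroPy's algorithm: the `Tax` class and the `rebuild` and `count_new` helpers are hypothetical, and lower-casing is assumed as the case-insensitive comparison.

```python
class Tax:
    """Hypothetical minimal taxon: identity matters, not just the label."""

    def __init__(self, label):
        self.label = label


def rebuild(old_taxa, unify_by_label=True, is_case_sensitive=False):
    """Map each old taxon object to a newly created counterpart."""
    mapping = {}     # id(old taxon) -> new taxon
    by_label = {}    # normalized label -> new taxon (used when unifying)
    for old in old_taxa:
        key = old.label if is_case_sensitive else old.label.lower()
        if unify_by_label and key in by_label:
            mapping[id(old)] = by_label[key]   # reuse existing counterpart
            continue
        new = Tax(old.label)
        by_label.setdefault(key, new)
        mapping[id(old)] = new
    return mapping


def count_new(mapping):
    """Number of distinct new taxon objects in a mapping."""
    return len({id(t) for t in mapping.values()})


old = [Tax("Foo"), Tax("Foo"), Tax("FOO"), Tax("FoO")]

unified = count_new(rebuild(old))
case_split = count_new(rebuild(old, is_case_sensitive=True))
kept_apart = count_new(rebuild(old, unify_by_label=False))
```

In this model the four old objects collapse to one new object when matching is case-insensitive, to three when it is case-sensitive, and stay four when unification is switched off.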
- reindex_subcomponent_taxa()
DEPRECATED: Use reconstruct_taxon_namespace instead. Derived classes should override this to ensure that their various components, attributes and members all refer to the same TaxonNamespace object as self.taxon_namespace, and that self.taxon_namespace has all the Taxon objects in the various members.
- reindex_taxa(taxon_namespace=None, clear=False)
DEPRECATED: Use migrate_taxon_namespace() instead. Rebuilds taxon_namespace from scratch, or assigns Taxon objects from given TaxonNamespace object taxon_namespace based on label values.


