DendroPy 4 Changes and Migration Primer
Introduction
Updated for full Python 3.x compatibility; of the Python 2 series, only Python 2.7 remains supported.
Faster, better, stronger! Core serialization/deserialization infrastructure rewritten from the ground up, with many optimizations for speed and reliability.
Python Version Compatibility
Python 3 is fully supported. The only version of Python 2 supported is Python 2.7.
- Python 2: Python 2.7
- Python 3: Python 3.1, 3.2, 3.3, 3.4
Library-Wide Changes
Public Module Reorganization
A number of modules have been renamed, moved, or split into multiple modules. Calls to the old modules should continue to work, albeit with warnings urging you to update to the new module paths.
dendropy.treecalc has been split into three submodules depending on whether the statistic or value being calculated is on a single tree, a single tree and a dataset, or two trees:
- dendropy.calculate.treemeasure: For calculation of statistics, metrics, and values on a single tree.
- dendropy.calculate.treecompare: For calculation of statistics, metrics, and values of two trees (e.g., Robinson-Foulds distances).
- dendropy.calculate.treescore: For calculation of statistics, metrics, and values of a tree with reference to a dataset under some criterion.
The functionality provided by dendropy.treesplit has been largely subsumed by the new Bipartition class.
The functionality provided by dendropy.treesum has been largely subsumed by the new TreeArray class, a high-performance class for efficiently managing and carrying out operations on large collections of large trees.
dendropy.reconcile has been moved to dendropy.model.reconcile.
dendropy.coalescent has been moved to dendropy.model.coalescent.
dendropy.popgenstat has been moved to dendropy.calculate.popgenstat.
dendropy.treesim has been moved to dendropy.simulate.treesim.
dendropy.popgensim has been moved to dendropy.simulate.popgensim.
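The one-to-one moves listed above lend themselves to a mechanical rewrite of dotted module paths. The following is a minimal sketch, assuming a hypothetical helper named update_module_paths that is not part of DendroPy. It deliberately leaves out dendropy.treecalc, dendropy.treesplit, and dendropy.treesum, because those have no single replacement module.

```python
import re

# One-to-one module moves taken from the list above.
MOVED_MODULES = {
    "dendropy.reconcile": "dendropy.model.reconcile",
    "dendropy.coalescent": "dendropy.model.coalescent",
    "dendropy.popgenstat": "dendropy.calculate.popgenstat",
    "dendropy.treesim": "dendropy.simulate.treesim",
    "dendropy.popgensim": "dendropy.simulate.popgensim",
}

# Match an old dotted path only when it is not embedded in a longer name.
_MODULE_PATTERN = re.compile(
    r"(?<![\w.])("
    + "|".join(re.escape(old) for old in MOVED_MODULES)
    + r")(?!\w)"
)


def update_module_paths(source):
    """Hypothetical helper: rewrite DendroPy 3 dotted module paths in source text."""
    return _MODULE_PATTERN.sub(lambda match: MOVED_MODULES[match.group(1)], source)


old_code = "import dendropy.treesim\nimport dendropy.coalescent as coalescent\n"
new_code = update_module_paths(old_code)
print(new_code)
```

The sketch only handles fully dotted paths; statements of the form "from dendropy import treesim" would still need to be updated by hand.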
Behind-the-Scenes Module Reorganization
In contrast to the above, the following changes should be opaque to most normal usage and client code. Most of the names (classes/methods/variables) in these modules were imported into the ‘dendropy’ namespace, and this is how all public code should be accessing them, or they were never exposed (or meant to be exposed) for public usage in the first place. A list of module changes, giving each DendroPy 3 module followed by its DendroPy 4 replacement:
- dendropy.dataobject.base: dendropy.datamodel.basemodel
- dendropy.dataobject.taxon: dendropy.datamodel.taxonmodel
- dendropy.dataobject.tree: dendropy.datamodel.treemodel, dendropy.datamodel.treecollectionmodel
- dendropy.dataobject.char: dendropy.datamodel.charstatemodel, dendropy.datamodel.charmatrixmodel
Unique Object Identifier (“oid”) Attributes Removed
The entire oid system (“object identifier”), i.e., the unique id assigned to every data object, has been removed. This was an implementation artifact from NEXML parsing that greatly slowed down a number of operations without any benefit or utility for most normal operations.
TaxonSet is now TaxonNamespace
The dendropy.TaxonSet class has been renamed TaxonNamespace (and the corresponding taxon_set attribute of phylogenetic data objects that reference a taxonomic context has been renamed taxon_namespace).
The TaxonNamespace class replaces the TaxonSet class as the manager for the Taxon objects.
The API is largely similar with the following differences:
- Calls to the __getitem__ and __delitem__ methods (e.g. TaxonNamespace[x]) now only accept integer values as arguments (representing indexes into the list of Taxon objects in the internal array).
- TaxonSet.has_taxon and TaxonSet.has_taxa have been replaced by TaxonNamespace.has_taxon_label and TaxonNamespace.has_taxa_labels respectively.
- Various new methods for accessing and managing the collection of Taxon objects (e.g., findall, remove_taxon, remove_taxon_label, discard_taxon_label, __delitem__, etc.)
- Numerous look-up methods took ‘case_insensitive’ as an argument that determined whether the look-up was case sensitive or not (when retrieving, for example, a Taxon object corresponding to a particular label). If not specified, it defaulted to False, i.e., case-sensitive matching. In all cases, this has been changed to ‘case_sensitive’ with a default of True. That is, searches by default are still case-sensitive, but now you will have to specify ‘case_sensitive=False’ instead of ‘case_insensitive=True’ to perform a case-insensitive search. This change was for consistency with the rest of the library.
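When many call sites still pass the old keyword, the inversion can be centralized rather than edited by hand at each call. The following is a minimal sketch, assuming a hypothetical helper named translate_case_kwarg that is not part of DendroPy; it only rewrites a dictionary of keyword arguments and makes no DendroPy calls itself.

```python
def translate_case_kwarg(kwargs):
    """Hypothetical helper: convert a legacy 'case_insensitive' keyword
    into the DendroPy 4 'case_sensitive' keyword (with inverted meaning)."""
    updated = dict(kwargs)  # leave the caller's dictionary untouched
    if "case_insensitive" in updated:
        if "case_sensitive" in updated:
            raise TypeError(
                "pass either 'case_sensitive' or 'case_insensitive', not both"
            )
        # The new flag is the logical negation of the old one.
        updated["case_sensitive"] = not updated.pop("case_insensitive")
    return updated


legacy_kwargs = {"label": "Homo sapiens", "case_insensitive": True}
print(translate_case_kwarg(legacy_kwargs))
```

Keyword arguments that already use ‘case_sensitive’, or that specify neither flag, pass through unchanged, so the library default (case-sensitive matching) still applies.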
In most cases, a simple global search-and-replace of “TaxonSet” with “TaxonNamespace” and “taxon_set” with “taxon_namespace” should be sufficient to bring existing code into line with DendroPy 4.
For legacy support, a class called TaxonSet exists. This derives with no modifications from TaxonNamespace. Instantiating objects of this class will result in warnings being emitted. As long as usage of TaxonSet conforms to the above API change notes, old or legacy code should continue to work unchanged (albeit with some warning noise). This support is temporary and will be removed in upcoming releases: code should be updated to use TaxonNamespace as soon as is expedient.
For legacy support, “taxon_set” continues to be accepted and processed as an attribute name and keyword argument synonymous with “taxon_namespace”. Usage of this will result in warnings being emitted, but code should continue to function as expected. This support is temporary and will be removed in upcoming releases: code should be updated to use “taxon_namespace” as soon as is expedient.
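The search-and-replace suggested in this section can be scripted. The following is a minimal sketch, assuming a hypothetical helper named rename_taxon_set that is not part of DendroPy. It is a slightly more conservative variant of a plain global replace: it only rewrites whole identifiers, so names that merely contain the old spelling are left alone for manual review.

```python
import re

# The two renames described in this section.
RENAMES = {
    "TaxonSet": "TaxonNamespace",
    "taxon_set": "taxon_namespace",
}

# \b anchors each match to a complete identifier.
_RENAME_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in RENAMES) + r")\b"
)


def rename_taxon_set(source):
    """Hypothetical helper: apply the TaxonSet -> TaxonNamespace renames to source text."""
    return _RENAME_PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


legacy = "taxa = dendropy.TaxonSet()\ntree = dendropy.Tree(taxon_set=taxa)\n"
migrated = rename_taxon_set(legacy)
print(migrated)
```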
The Node Class
Constructor now only accepts keyword arguments (and oid is not one of them!).
add_child no longer accepts pos as an argument to indicate the position in which a child should be inserted. Use insert_child, which takes a position specified by index and a node specified by node, for this functionality instead.
The Edge Class
Constructor now only accepts keyword arguments (and oid is not one of them!).
Because tail_node is no longer an independent attribute but a dynamic property, bound to the Node._parent_node attribute of the head_node (see below), the Edge constructor does not accept tail_node as an argument.
The tail_node of an Edge object is now a dynamic property, referencing the Node._parent_node attribute of the Edge._head_node of the Edge object. So, now updating Edge._tail_node of an Edge object will set the Node._parent_node of its Edge._head_node to the new value, and vice versa. This avoids the need for independent book-keeping logic to keep Node._parent_node and Edge._tail_node synchronized on the same Node object, along with the errors that arise when the two fall out of step.
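The idea of a tail node that is derived from the head node's parent, rather than stored separately, can be illustrated with two small stand-in classes. This is a minimal sketch of the pattern only: SketchNode and SketchEdge are hypothetical names, and the code is not DendroPy's implementation.

```python
class SketchEdge:
    """Stand-in edge: stores only its head node."""

    def __init__(self, head_node=None, length=None):
        self._head_node = head_node
        self.length = length

    @property
    def tail_node(self):
        # Derived on demand from the head node's parent; nothing is stored here.
        if self._head_node is None:
            return None
        return self._head_node._parent_node

    @tail_node.setter
    def tail_node(self, node):
        if self._head_node is None:
            raise ValueError("edge has no head node to attach a parent to")
        # Writing through the edge updates the head node's parent.
        self._head_node._parent_node = node


class SketchNode:
    """Stand-in node: owns the edge that subtends it."""

    def __init__(self, label=None):
        self.label = label
        self._parent_node = None
        self.edge = SketchEdge(head_node=self)


parent = SketchNode("parent")
other = SketchNode("other")
child = SketchNode("child")

child.edge.tail_node = parent   # set via the edge ...
print(child._parent_node.label)  # ... and read via the node

child._parent_node = other       # set via the node ...
print(child.edge.tail_node.label)  # ... and read via the edge
```

Because there is a single stored reference, the two views cannot disagree, which is the point of the design described above.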
The Tree Class
Constructor no longer supports the stream keyword argument to construct the new Tree object from a data source. Use the factory class method get_from_stream instead.
nodes: sorting option removed; use nodes() instead and sort the result as needed.
node_set: removed; use nodes() instead.
edge_set: removed; use edges() instead.
For consistency with preorder_node_iter and postorder_node_iter, a number of iteration methods have been renamed. Each DendroPy 3 name is followed by its DendroPy 4 name:
- Tree.level_order_node_iter(): Tree.levelorder_node_iter()
- Tree.level_order_edge_iter(): Tree.levelorder_edge_iter()
- Node.level_order_iter(): Node.levelorder_iter()
- Tree.age_order_node_iter(): Tree.ageorder_node_iter()
- Node.age_order_iter(): Node.ageorder_iter()
- Tree.leaf_iter(): Tree.leaf_node_iter()
The old names are still supported for now (with warnings being emitted), but new code should start using the newer names.
In addition, support for in-order or infix tree traversal has been added: inorder_node_iter, inorder_edge_iter.
Instead of tree_source_iter and multi_tree_source_iter, use yield_from_files.
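Keeping an old method name alive as a warning-emitting alias is a common transition pattern, and it can be sketched with the standard warnings module. Everything here is illustrative: SketchTree is a hypothetical stand-in rather than DendroPy's Tree, leaf_node_iter is taken to be the DendroPy 4 name for leaf_iter, and DeprecationWarning is an assumed category (DendroPy's own warning class may differ).

```python
import warnings


class SketchTree:
    """Stand-in tree holding only a list of leaf labels."""

    def __init__(self, leaves):
        self._leaves = list(leaves)

    def leaf_node_iter(self):
        # New-style name: does the actual work.
        return iter(self._leaves)

    def leaf_iter(self):
        # Old-style name: warn, then delegate to the new name.
        warnings.warn(
            "leaf_iter() is deprecated; use leaf_node_iter() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.leaf_node_iter()


tree = SketchTree(["a", "b", "c"])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    legacy_leaves = list(tree.leaf_iter())

print(legacy_leaves, len(caught))
```

During a migration, running a test suite with warnings escalated to errors (for example, warnings.simplefilter("error")) is one way to locate remaining calls to old names.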
NEWICK-format Reading
The suppress_external_taxon_labels and suppress_external_node_labels keyword arguments have been replaced by suppress_leaf_taxon_labels and suppress_leaf_node_labels, respectively. This is for consistency with the rest of the library (including writing in NEWICK-format), which uses the term “leaf” rather than “external”.
The various boolean rooting directive switches (as_rooted, default_as_rooted, etc.) have been replaced by a single argument: rooting. This can take on one of the following (string) values:
- rooting="default-unrooted": Interpret trees following rooting token ("[&R]" for rooted, "[&U]" for unrooted) if present; otherwise, interpret trees as unrooted.
- rooting="default-rooted": Interpret trees following rooting token ("[&R]" for rooted, "[&U]" for unrooted) if present; otherwise, interpret trees as rooted.
- rooting="force-unrooted": Unconditionally interpret all trees as unrooted.
- rooting="force-rooted": Unconditionally interpret all trees as rooted.
The value of the “rooting” argument defaults to "default-unrooted", i.e., all trees are assumed to be unrooted unless a rooting token is present that explicitly specifies the rooting state.
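The decision rule behind the four values can be written out as a small function. This is a minimal sketch of the logic as described above, not DendroPy's parser code: resolve_rooting is a hypothetical name, and the token is assumed to arrive as the string "[&R]" or "[&U]", or None when the tree statement carries no rooting token.

```python
ROOTING_POLICIES = (
    "default-unrooted",
    "default-rooted",
    "force-unrooted",
    "force-rooted",
)


def resolve_rooting(rooting="default-unrooted", token=None):
    """Hypothetical helper: return True if a tree should be treated as rooted."""
    if rooting not in ROOTING_POLICIES:
        raise ValueError("unrecognized rooting policy: %r" % (rooting,))
    # "force-*" policies ignore any rooting token.
    if rooting == "force-rooted":
        return True
    if rooting == "force-unrooted":
        return False
    # "default-*" policies honor the token when one is present ...
    if token is not None:
        normalized = token.upper()
        if normalized == "[&R]":
            return True
        if normalized == "[&U]":
            return False
        raise ValueError("unrecognized rooting token: %r" % (token,))
    # ... and fall back to the named default otherwise.
    return rooting == "default-rooted"


for policy in ROOTING_POLICIES:
    print(policy, [resolve_rooting(policy, tok) for tok in (None, "[&R]", "[&U]")])
```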
NEWICK-format Writing
Previously, if annotations_as_nhx was True, metadata annotations would be written out even if suppress_annotations was True. Now, suppress_annotations must be False for annotations to be written out, even if annotations_as_nhx is True.
The DataSet Class
Constructor no longer supports the stream keyword argument to construct the new DataSet object from a data source. Use the factory class method DataSet.get_from_stream instead.
Constructor only accepts one unnamed (positional) argument: either a DataSet instance to be cloned, or an iterable of TaxonNamespace, TreeList, or CharacterMatrix-derived instances to be composed (added) into the new DataSet instance.
TaxonNamespace no longer managed.


