DendroPy 4 Changes and Migration Primer
Introduction
Updated for full Python 3.x compatibility.
Faster, better, stronger! Core serialization/deserialization infrastructure rewritten from the ground up, with many optimizations for speed and reliability.
Python Version Compatibility
Compatibility: Python 3 is fully supported. The only version of Python 2 supported is Python 2.7.
Python 2: Python 2.7
Python 3: Python 3.1, 3.2, 3.3, 3.4
Library-Wide Changes
Public Module Reorganization
A number of modules have been renamed, moved, or split into multiple modules. Calls to the old modules should continue to work, albeit with warnings urging you to update to the new module locations.
dendropy.treecalc has been split into three submodules depending on whether the statistic or value being calculated is on a single tree, a single tree and a dataset, or two trees:
- dendropy.calculate.treemeasure: for calculation of statistics, metrics, and values on a single tree.
- dendropy.calculate.treecompare: for calculation of statistics, metrics, and values of two trees (e.g., Robinson-Foulds distances).
- dendropy.calculate.treescore: for calculation of statistics, metrics, and values of a tree with reference to a dataset under some criterion.
The functionality provided by dendropy.treesplit has been largely subsumed by the new Bipartition class.
The functionality provided by dendropy.treesum has been largely subsumed by the new TreeArray class, a high-performance class for efficiently managing and carrying out operations on large collections of large trees.
dendropy.reconcile has been moved to dendropy.model.reconcile.
dendropy.coalescent has been moved to dendropy.model.coalescent.
dendropy.popgenstat has been moved to dendropy.calculate.popgenstat.
dendropy.treesim has been moved to dendropy.simulate.treesim.
dendropy.popgensim has been moved to dendropy.simulate.popgensim.
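The one-to-one module moves listed above lend themselves to a mechanical rewrite of dotted module paths in existing code. The following is a minimal sketch, not a tool shipped with DendroPy: the names MOVED_MODULES and rewrite_module_paths are hypothetical, and the mapping contains only the moves stated in this section. dendropy.treecalc is left out because it was split three ways and cannot be mapped automatically.

```python
import re

# Old dotted path -> new dotted path, copied from the list of moves above.
# (Hypothetical helper data; dendropy.treecalc is omitted because it was
# split into three modules and needs a manual decision.)
MOVED_MODULES = {
    "dendropy.reconcile": "dendropy.model.reconcile",
    "dendropy.coalescent": "dendropy.model.coalescent",
    "dendropy.popgenstat": "dendropy.calculate.popgenstat",
    "dendropy.treesim": "dendropy.simulate.treesim",
    "dendropy.popgensim": "dendropy.simulate.popgensim",
}

# Match an old path only when it is not embedded in a longer identifier
# or a longer dotted path on its left-hand side.
_OLD_PATH = re.compile(
    r"(?<![\w.])("
    + "|".join(re.escape(old) for old in MOVED_MODULES)
    + r")(?!\w)"
)


def rewrite_module_paths(source):
    """Return source text with old dotted module paths replaced by new ones."""
    return _OLD_PATH.sub(lambda match: MOVED_MODULES[match.group(1)], source)
```

This sketch only handles fully dotted references such as "import dendropy.treesim"; a statement of the form "from dendropy import treesim" would still need to be updated by hand.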
Behind-the-Scenes Module Reorganization
In contrast to the above, the following changes should be invisible to most normal usage and client code. Most of the names (classes/methods/variables) in these modules were imported into the ‘dendropy’ namespace, and this is how all public code should be accessing them, or they were never exposed (or meant to be exposed) for public usage in the first place. A list of module changes:

DendroPy 3                   DendroPy 4
dendropy.dataobject.base
dendropy.dataobject.taxon
dendropy.dataobject.tree     dendropy.datamodel.treemodel, dendropy.datamodel.treecollectionmodel
dendropy.dataobject.char     dendropy.datamodel.charstatemodel, dendropy.datamodel.charmatrixmodel
Unique Object Identifier (“oid”) Attributes Removed
The entire oid system (“object identifier”), i.e., the unique id assigned to every data object, has been removed. This was an implementation artifact from NEXML parsing that greatly slowed down a number of operations without any benefit or utility for most normal operations.
TaxonSet is now TaxonNamespace
The dendropy.TaxonSet class has been renamed TaxonNamespace (and the corresponding taxon_set attribute of phylogenetic data objects that reference a taxonomic context has been renamed taxon_namespace).
The TaxonNamespace class replaces the TaxonSet class as the manager for the Taxon objects.
The API is largely similar with the following differences:
- Calls to the __getitem__ and __delitem__ methods (e.g. TaxonNamespace[x]) now only accept integer values as arguments (representing indexes into the list of Taxon objects in the internal array).
- TaxonSet.has_taxon and TaxonSet.has_taxa have been replaced by TaxonNamespace.has_taxon_label and TaxonNamespace.has_taxa_labels respectively.
- Various new methods for accessing and managing the collection of Taxon objects (e.g., findall, remove_taxon, remove_taxon_label, discard_taxon_label, __delitem__, etc.)
- Numerous look-up methods took ‘case_insensitive’ as an argument that determined whether the look-up was case sensitive or not (when retrieving, for example, a Taxon object corresponding to a particular label). If not specified, it defaulted to False, i.e., a case-sensitive matching criterion. In all cases, this has been changed to ‘case_sensitive’ with a default of True. That is, searches by default are still case-sensitive, but now you will have to specify ‘case_sensitive=False’ instead of ‘case_insensitive=True’ to perform a case-insensitive search. This change was for consistency with the rest of the library.
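Because the keyword was both renamed and inverted, a plain rename of ‘case_insensitive’ to ‘case_sensitive’ would silently flip the behaviour of existing calls. The sketch below shows the translation on a dictionary of keyword arguments. The function migrate_case_kwarg is a hypothetical helper written for this illustration and is not part of DendroPy; only the two keyword names come from the text above.

```python
def migrate_case_kwarg(kwargs):
    """Return a copy of kwargs using 'case_sensitive' instead of 'case_insensitive'.

    The old flag is inverted when it is translated:
    case_insensitive=True becomes case_sensitive=False, and vice versa.
    """
    migrated = dict(kwargs)
    if "case_insensitive" in migrated:
        if "case_sensitive" in migrated:
            # Both spellings at once would be ambiguous, so refuse to guess.
            raise TypeError(
                "pass only one of 'case_insensitive' and 'case_sensitive'"
            )
        migrated["case_sensitive"] = not migrated.pop("case_insensitive")
    return migrated
```

Calls that never passed the old keyword need no change, since the default behaviour (case-sensitive matching) is the same under both spellings.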
In most cases, a simple global search-and-replace of “TaxonSet” with “TaxonNamespace” and “taxon_set” with “taxon_namespace” should be sufficient to bring existing code into line with DendroPy 4.
For legacy support, a class called TaxonSet exists. This derives with no modifications from TaxonNamespace. Instantiating objects of this class will result in warnings being emitted. As long as usage of TaxonSet conforms to the above API change notes, old or legacy code should continue to work unchanged (albeit with some warning noise). This support is temporary and will be removed in upcoming releases: code should update to using TaxonNamespace as soon as expedient.
For legacy support, “taxon_set” continues to be accepted and processed as an attribute name and keyword argument synonymous with “taxon_namespace”. Usage of this will result in warnings being emitted, but code should continue to function as expected. This support is temporary and will be removed in upcoming releases: code should update to using “taxon_namespace” as soon as expedient.
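The search-and-replace described above can be scripted. The following is a minimal sketch, assuming the source code is available as a string; rename_taxon_set and RENAMES are hypothetical names introduced here. Unlike a raw global replace, the patterns match whole identifiers only, which is a deliberate narrowing so that unrelated names containing the same characters are left alone.

```python
import re

# (pattern, replacement) pairs for the two renames described above.
# \b restricts each match to a whole identifier.
RENAMES = [
    (re.compile(r"\bTaxonSet\b"), "TaxonNamespace"),
    (re.compile(r"\btaxon_set\b"), "taxon_namespace"),
]


def rename_taxon_set(source):
    """Return source text with TaxonSet/taxon_set renamed to the DendroPy 4 names."""
    for pattern, replacement in RENAMES:
        source = pattern.sub(replacement, source)
    return source
```

Identifiers that merely contain the old name, such as a local variable called my_taxon_sets, are not touched by this sketch and should be reviewed by hand.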
The Node Class
- Constructor now only accepts keyword arguments (and oid is not one of them!).
- add_child no longer accepts pos as an argument to indicate position in which a child should be inserted. Use insert_child, which takes a position specified by index and a node specified by node, for this functionality instead.
The Edge Class
- Constructor now only accepts keyword arguments (and oid is not one of them!).
- Because tail_node is no longer an independent attribute but a dynamic property, bound to the Node._parent_node attribute of the head_node (see below), the Edge constructor does not accept tail_node as an argument.
- The tail_node of an Edge object is now a dynamic property, referencing the Node._parent_node attribute of the Edge._head_node of the Edge object. So, now updating Edge._tail_node of an Edge object will set the Node._parent_node of its Edge._head_node to the new value, and vice versa. This avoids the need for independent book-keeping logic to ensure that Node._parent_node and Edge._tail_node are always synchronized to reference the same Node object and all the potential errors this might cause.
The Tree Class
- Constructor no longer supports the stream keyword argument to construct the new Tree object from a data source. Use the factory class method get_from_stream instead.
- nodes: sorting option removed; use sorted(tree.nodes()) instead.
- node_set: removed; use set(tree.nodes()) instead.
- edge_set: removed; use set(tree.edges()) instead.
- For consistency with preorder_node_iter and postorder_node_iter, a number of iteration methods have been renamed. The DendroPy 3 names affected are:
  Tree.level_order_node_iter()
  Tree.level_order_edge_iter()
  Node.level_order_iter()
  Tree.age_order_node_iter()
  Node.age_order_iter()
  Tree.leaf_iter()
  The old names are still supported for now (with warnings being emitted), but new code should start using the newer names.
- In addition, support for in-order or infix tree traversal has been added: inorder_node_iter, inorder_edge_iter.
- Instead of tree_source_iter and multi_tree_source_iter, use yield_from_files.
NEWICK-format Reading
The suppress_external_taxon_labels and suppress_external_node_labels keyword arguments have been replaced by suppress_leaf_taxon_labels and suppress_leaf_node_labels, respectively. This is for consistency with the rest of the library (including writing in NEWICK-format), which uses the term “leaf” rather than “external”.
The various boolean rooting directive switches (as_rooted, default_as_rooted, etc.) have been replaced by a single argument: rooting. This can take on one of the following (string) values:
- rooting="default-unrooted"
  Interpret trees following rooting token (“[&R]” for rooted, “[&U]” for unrooted) if present; otherwise, interpret trees as unrooted.
- rooting="default-rooted"
  Interpret trees following rooting token (“[&R]” for rooted, “[&U]” for unrooted) if present; otherwise, interpret trees as rooted.
- rooting="force-unrooted"
  Unconditionally interpret all trees as unrooted.
- rooting="force-rooted"
  Unconditionally interpret all trees as rooted.
The value of the “rooting” argument defaults to “default-unrooted”, i.e., all trees are assumed to be unrooted unless a rooting token is present that explicitly specifies the rooting state.
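The four rooting values amount to a small decision table over the policy and the token found in the tree statement. The sketch below restates that table as code. The function is_rooted and the constant ROOTING_POLICIES are hypothetical names used only for illustration; this is not how DendroPy's NEWICK reader is implemented, and a token other than “[&R]” or “[&U]” is treated here as absent, which is an assumption of the sketch.

```python
ROOTING_POLICIES = (
    "default-unrooted",
    "default-rooted",
    "force-unrooted",
    "force-rooted",
)


def is_rooted(rooting="default-unrooted", token=None):
    """Decide whether a tree is treated as rooted.

    rooting: one of the four policy strings described above.
    token:   "[&R]", "[&U]", or None if the tree statement has no rooting token.
    """
    if rooting not in ROOTING_POLICIES:
        raise ValueError("unrecognized rooting policy: %r" % (rooting,))
    # The 'force-*' policies ignore any token in the data.
    if rooting == "force-rooted":
        return True
    if rooting == "force-unrooted":
        return False
    # The 'default-*' policies honour an explicit token when one is present.
    if token == "[&R]":
        return True
    if token == "[&U]":
        return False
    # No recognized token: fall back to the default named by the policy.
    return rooting == "default-rooted"
```

With no arguments the function returns False, matching the stated default of “default-unrooted” for a tree that carries no rooting token.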
NEWICK-format Writing
Previously, if annotations_as_nhx was True, metadata annotations would be written out even if suppress_annotations was True. Now, suppress_annotations must be False for annotations to be written out, even if annotations_as_nhx is True.
The DataSet Class
- Constructor no longer supports the stream keyword argument to construct the new DataSet object from a data source. Use the factory class method DataSet.get_from_stream instead.
- Constructor only accepts one unnamed (positional) argument: either a DataSet instance to be cloned, or an iterable of TaxonNamespace, TreeList, or CharacterMatrix-derived instances to be composed (added) into the new DataSet instance.
- TaxonNamespace no longer managed.