Primary Phylogenetic Data Objects
Phylogenetic data in DendroPy is represented by one or more objects of the following classes:
Taxon
A representation of an operational taxonomic unit, with an attribute, label, corresponding to the taxon label.
TaxonNamespace
A collection of Taxon objects representing a distinct definition of taxa (for example, as specified explicitly in a NEXUS “TAXA” block, or implicitly in the set of all taxon labels used across a Newick tree file).
Tree
A collection of Node and Edge objects representing a phylogenetic tree. Each Tree object maintains a reference to a TaxonNamespace object in its attribute, taxon_namespace, which specifies the set of taxa that are referenced by the tree and its nodes. Each Node object has a taxon attribute (which points to a particular Taxon object if there is an operational taxonomic unit associated with this node, or is None if not), a parent_node attribute (which will be None if the Node has no parent, e.g., a root node), an edge attribute (referencing the Edge object that subtends the node), as well as a list of references to child nodes, a copy of which can be obtained by calling child_nodes(). In addition, advanced operations with tree data often make use of a Bipartition object associated with each Edge on a Tree (see “Bipartitions” for more information).
TreeList
A list of Tree objects. A TreeList object has an attribute, taxon_namespace, which specifies the set of taxa that are referenced by all member Tree elements. This is enforced when a Tree object is added to a TreeList, with the TaxonNamespace of the Tree object and all Taxon references of the Node objects in the Tree mapped to the TaxonNamespace of the TreeList.
CharacterMatrix
Representation of character data, with specializations for different data types: DnaCharacterMatrix, RnaCharacterMatrix, ProteinCharacterMatrix, StandardCharacterMatrix, ContinuousCharacterMatrix, etc. A CharacterMatrix can be treated very much like a dict object, with Taxon objects as keys and character data as values associated with those keys.
DataSet
A meta-collection of phylogenetic data, consisting of lists of multiple TaxonNamespace objects (DataSet.taxon_namespaces), TreeList objects (DataSet.tree_lists), and CharacterMatrix objects (DataSet.char_matrices).
TreeArray
A high-performance container designed to efficiently store and manage (potentially) large collections of structures of (potentially) large trees for processing.