dendropy.model.birthdeath: The Birth-Death and Related Processes
Models, modeling and model-fitting of birth-death processes.
Nee, S. 2001. Inferring speciation rates from phylogenies. Evolution 55:661-668.
Yule, G. U. 1924. A mathematical theory of evolution based on the conclusions of Dr. J. C. Willis. Phil. Trans. R. Soc. Lond. B 213:21-87.
Hoehna, S. 2015. The time-dependent reconstructed evolutionary process with a key-role for mass-extinction events. Journal of Theoretical Biology 380:321-331.
- dendropy.model.birthdeath.birth_death_likelihood(**kwargs)
Calculates the log-likelihood of a tree (or a set of internal nodes) under a birth-death model.
Requires either a Tree object or an iterable of internal node ages to be passed in via the keyword arguments tree or internal_node_ages, respectively. The former is more convenient for one-off calculations, while the latter is more efficient if the list of internal node ages is needed elsewhere and has already been calculated, as it avoids recalculating the ages here.
- Parameters:
**kwargs (keyword arguments, mandatory) –
Exactly one of the following must be specified:
- tree (a Tree object)
A Tree object. The tree needs to be ultrametric for the internal node ages (time from each internal node to the tips) to make sense. The precision with which ultrametricity is checked can be specified using the ultrametricity_precision keyword argument (see below). If tree is given, then internal_node_ages cannot be given, and vice versa. If tree is not given, then internal_node_ages must be given.
- internal_node_ages (iterable of numerical values)
Iterable of node ages of the internal nodes of a tree, i.e., for each internal node, the sum of the edge lengths between that node and the tips of the tree. If internal_node_ages is given, then tree cannot be given, and vice versa. If internal_node_ages is not given, then tree must be given.
The following keyword parameters are mandatory:
- birth_rate (float)
The birth rate.
- death_rate (float)
The death rate.
The following keyword parameters are optional:
- sampling_probability
The probability that a species is included in the sample. Defaults to 1.0 (all species sampled).
- sampling_strategy
The strategy by which samples were obtained. Options are: uniform|diversified|age.
- is_mrca_included
Whether the process starts with the most recent common ancestor.
- condition_on (string)
What the process is conditioned on: “time”, “survival”, or “taxa”.
The following are optional, and are only used if internal node ages need to be calculated (i.e., ‘tree’ is passed in):
- ultrametricity_precision (float)
When calculating the node ages, an error will be raised if the tree is not ultrametric. This error may be due to floating-point or numerical imprecision. You can set the precision of the ultrametricity validation with the ultrametricity_precision parameter. E.g., use ultrametricity_precision=0.01 for a more relaxed precision, down to 2 decimal places. Use ultrametricity_precision=False to disable checking of ultrametricity.
- ignore_likelihood_calculation_failure (bool [default: False])
In some cases (typically, abnormal trees, e.g., 1-tip), the likelihood calculation will fail. In this case a ValueError will be raised. If ignore_likelihood_calculation_failure is True, then the function call will still succeed, with the likelihood set to -inf.
The following are optional, and are only used if internal node ages are specified (i.e., ‘internal_node_ages’ is passed in):
- is_node_ages_presorted (bool)
By default, the vector of node ages is sorted. If this argument is specified as True, then this sorting will be skipped, in which case it is the client code’s responsibility to make sure that the node ages are given in REVERSE order (i.e., oldest nodes – nodes closer to the root – given first).
Notes
Lifted directly from the (fantastic!) TESS package for R:
Höhna, S. 2013. Fast simulation of reconstructed phylogenies under global time-dependent birth–death processes. Bioinformatics 29(11):1367-1374.
- Returns:
lnl (float)
The log-likelihood of the tree under the birth-death model.
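The relationship between a tree, its internal node ages, the ultrametricity check, and the oldest-first ordering can be illustrated without DendroPy. The following is a minimal sketch, assuming a hypothetical nested-tuple tree of the form (edge_length, [children]); the function name collect_internal_node_ages and this representation are illustrative assumptions, not DendroPy's API or implementation.

```python
def collect_internal_node_ages(node, precision=1e-5, ages=None):
    """Return (age_of_node, ages) for a hypothetical nested-tuple tree.

    A node is (edge_length, [children]); a tip has an empty child list.
    Pass precision=False to skip the ultrametricity check.
    """
    if ages is None:
        ages = []
    _, children = node
    if not children:
        return 0.0, ages  # tips have age zero
    # Age implied by each child: the child's own age plus the edge leading to it.
    implied = []
    for child in children:
        child_age, _ = collect_internal_node_ages(child, precision, ages)
        implied.append(child_age + child[0])
    # On an ultrametric tree, every child implies the same age for its parent.
    if precision is not False and max(implied) - min(implied) > precision:
        raise ValueError("tree is not ultrametric within the given precision")
    age = max(implied)
    ages.append(age)
    return age, ages


# Equivalent to the Newick string ((A:1,B:1):2,C:3);
tree = (0.0, [
    (2.0, [(1.0, []), (1.0, [])]),
    (3.0, []),
])
root_age, ages = collect_internal_node_ages(tree)
# Oldest nodes first: the order expected when is_node_ages_presorted is True.
ages_oldest_first = sorted(ages, reverse=True)
```

A list such as ages_oldest_first is the kind of value that would be passed as internal_node_ages; computing it once and reusing it avoids repeating the traversal.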
- dendropy.model.birthdeath.birth_death_tree(birth_rate, death_rate, birth_rate_sd=0.0, death_rate_sd=0.0, **kwargs)
Returns a birth-death tree with birth rate specified by birth_rate, and death rate specified by death_rate, with edge lengths in continuous (real) units.
Tree growth is controlled by one or more of the following arguments, of which at least one must be specified:
- If num_extant_tips is given as a keyword argument, the tree is grown until the number of EXTANT tips equals this number.
- If num_extinct_tips is given as a keyword argument, the tree is grown until the number of EXTINCT tips equals this number.
- If num_total_tips is given as a keyword argument, the tree is grown until the number of EXTANT plus EXTINCT tips equals this number.
- If max_time is given as a keyword argument, the tree is grown for a maximum of max_time.
- If gsa_ntax is given, then the tree will be simulated up to this number of EXTANT tips (or 0 tips), and a tree will then be randomly selected from the intervals which correspond to times at which the tree had exactly num_extant_tips leaves. This allows for simulations according to the “General Sampling Approach” of Hartmann et al. (2010). If this option is specified, then num_extant_tips MUST be specified, and num_extinct_tips and num_total_tips CANNOT be specified.
If more than one of the above is given, then tree growth will terminate when any one of the termination conditions is met.
- Parameters:
birth_rate (float) – The birth rate.
death_rate (float) – The death rate.
birth_rate_sd (float) – The standard deviation of the normally-distributed mutation added to the birth rate as it is inherited by daughter nodes; if 0, birth rate does not evolve on the tree.
death_rate_sd (float) – The standard deviation of the normally-distributed mutation added to the death rate as it is inherited by daughter nodes; if 0, death rate does not evolve on the tree.
- Keyword Arguments:
num_extant_tips (int) – If specified, branching process is terminated when number of EXTANT tips equals this number.
num_extinct_tips (int) – If specified, branching process is terminated when number of EXTINCT tips equals this number.
num_total_tips (int) – If specified, branching process is terminated when number of EXTINCT plus EXTANT tips equals this number.
max_time (float) – If specified, branching process is terminated when time reaches or exceeds this value.
gsa_ntax (int) – The General Sampling Approach threshold for number of taxa. See above for details.
tree (Tree instance) – If given, then this tree will be used; otherwise a new one will be created.
taxon_namespace (TaxonNamespace instance) – If given, then this will be assigned to the new tree, and, in addition, taxa assigned to tips will be sourced from or otherwise created with reference to this.
is_assign_extant_taxa (bool [default: True]) – If False, then taxa will not be assigned to extant tips. If True (default), then taxa will be assigned to extant tips. Taxa will be assigned from the specified taxon_namespace or tree.taxon_namespace. If the number of taxa required exceeds the number of taxa existing in the taxon namespace, new Taxon objects will be created as needed and added to the taxon namespace.
is_assign_extinct_taxa (bool [default: True]) – If False, then taxa will not be assigned to extinct tips. If True (default), then taxa will be assigned to extinct tips. Taxa will be assigned from the specified taxon_namespace or tree.taxon_namespace. If the number of taxa required exceeds the number of taxa existing in the taxon namespace, new Taxon objects will be created as needed and added to the taxon namespace. Note that this option only makes sense if extinct tips are retained (specified via the ‘is_retain_extinct_tips’ option), and will otherwise be ignored.
is_add_extinct_attr (bool [default: True]) – If True (default), add a boolean attribute indicating whether or not a node is an extinct tip. False will skip this. The name of the attribute is set by the ‘extinct_attr_name’ argument, defaulting to ‘is_extinct’. Note that this option only makes sense if extinct tips are retained (specified via the ‘is_retain_extinct_tips’ option), and will otherwise be ignored.
extinct_attr_name (str [default: 'is_extinct']) – Name of attribute to add to nodes indicating whether or not tip is extinct. Note that this option only makes sense if extinct tips are retained (specified via ‘is_retain_extinct_tips’ option), and will otherwise be ignored.
is_retain_extinct_tips (bool [default: False]) – If True, extinct tips will be retained on tree. Defaults to False: extinct lineages removed from tree.
repeat_until_success (bool [default: True]) – Under some conditions, it is possible for all lineages on a tree to go extinct. In this case, if this argument is given as True (the default), then a new branching process is initiated. If False, then a TreeSimTotalExtinctionException is raised.
rng (random.Random() or equivalent instance) – A Random() object or equivalent can be passed using the rng keyword; otherwise GLOBAL_RNG is used.
References
Hartmann, Wong, and Stadler “Sampling Trees from Evolutionary Models” Systematic Biology. 2010. 59(4). 465-476
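How the num_extant_tips and max_time termination conditions interact in a continuous-time process can be sketched with a simple event-driven simulation. This is an illustrative sketch only: it tracks the number of extant lineages rather than building a tree, it starts from a single lineage, and the function name simulate_extant_count is a hypothetical helper, not part of DendroPy or a description of its internals.

```python
import random


def simulate_extant_count(birth_rate, death_rate, num_extant_tips=None,
                          max_time=None, rng=None):
    """Return (number of extant lineages, elapsed time) at termination."""
    if num_extant_tips is None and max_time is None:
        raise ValueError("specify num_extant_tips and/or max_time")
    rng = rng or random.Random()
    n, t = 1, 0.0
    while n > 0:
        # Stop as soon as the target number of extant lineages is reached.
        if num_extant_tips is not None and n >= num_extant_tips:
            break
        # Waiting time to the next event is exponential with rate n * (b + d).
        wait = rng.expovariate(n * (birth_rate + death_rate))
        # Stop at max_time if the next event would fall at or after it.
        if max_time is not None and t + wait >= max_time:
            t = max_time
            break
        t += wait
        # The event is a birth with probability b / (b + d), otherwise a death.
        if rng.random() < birth_rate / (birth_rate + death_rate):
            n += 1
        else:
            n -= 1
    return n, t


n, t = simulate_extant_count(1.0, 0.0, num_extant_tips=5, rng=random.Random(1))
```

Whichever condition is met first ends the run. A returned count of 0 corresponds to the total-extinction case that repeat_until_success addresses by starting a new process.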
- dendropy.model.birthdeath.discrete_birth_death_tree(birth_rate, death_rate, birth_rate_sd=0.0, death_rate_sd=0.0, **kwargs)
Returns a birth-death tree with birth rate specified by birth_rate, and death rate specified by death_rate, with edge lengths in discrete (integer) units.
birth_rate_sd is the standard deviation of the normally-distributed mutation added to the birth rate as it is inherited by daughter nodes; if 0, birth rate does not evolve on the tree. death_rate_sd is the standard deviation of the normally-distributed mutation added to the death rate as it is inherited by daughter nodes; if 0, death rate does not evolve on the tree.
Tree growth is controlled by one or more of the following arguments, of which at least one must be specified:
- If ntax is given as a keyword argument, the tree is grown until the number of tips == ntax.
- If taxon_namespace is given as a keyword argument, the tree is grown until the number of tips == len(taxon_namespace), and the taxa are assigned randomly to the tips.
- If max_time is given as a keyword argument, the tree is grown for max_time number of generations.
If more than one of the above is given, then tree growth will terminate when any of the termination conditions (i.e., number of tips == ntax, or number of tips == len(taxon_namespace), or number of generations == max_time) is met.
Also accepts a Tree object (with valid branch lengths) as an argument passed using the keyword tree: if given, then this tree will be used; otherwise a new one will be created.
If assign_taxa is False, then taxa will not be assigned to the tips; otherwise (default), taxa will be assigned. If taxon_namespace is given (or tree.taxon_namespace, if tree is given), and the final number of tips on the tree after the termination condition is reached is less than the number of taxa in taxon_namespace (as will be the case, for example, when ntax < len(taxon_namespace)), then a random subset of taxa in taxon_namespace will be assigned to the tips of the tree. If the number of tips is more than the number of taxa in the taxon_namespace, new Taxon objects will be created and added to the taxon_namespace, unless the keyword argument create_required_taxa is given as False.
Under some conditions, it is possible for all lineages on a tree to go extinct. In this case, if the keyword argument repeat_until_success is True, then a new branching process is initiated. If False (default), then a TreeSimTotalExtinctionException is raised.
A Random() object or equivalent can be passed using the rng keyword; otherwise GLOBAL_RNG is used.
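The generation-by-generation growth and the ntax and max_time termination conditions can be sketched as follows. This is a simplified illustration under stated assumptions: it only counts tips, it treats birth_rate and death_rate as per-generation probabilities for each lineage, and it checks the termination conditions once per generation, so the tip count can overshoot ntax. The function name discrete_tip_counts is hypothetical and is not DendroPy's implementation.

```python
import random


def discrete_tip_counts(birth_rate, death_rate, ntax=None, max_time=None,
                        rng=None):
    """Return the number of tips after each generation, starting from one."""
    if ntax is None and max_time is None:
        raise ValueError("specify ntax and/or max_time")
    rng = rng or random.Random()
    tips, generation = 1, 0
    history = [tips]
    while tips > 0:
        if ntax is not None and tips >= ntax:
            break
        if max_time is not None and generation >= max_time:
            break
        next_tips = 0
        for _ in range(tips):
            u = rng.random()
            if u < birth_rate:
                next_tips += 2  # lineage splits into two daughters
            elif u < birth_rate + death_rate:
                pass            # lineage goes extinct
            else:
                next_tips += 1  # lineage persists for another generation
        tips = next_tips
        generation += 1
        history.append(tips)
    return history


# With birth_rate=1.0 and death_rate=0.0 every lineage splits each generation.
history = discrete_tip_counts(1.0, 0.0, max_time=3)
```

A history ending in 0 corresponds to the total-extinction case described above.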
- dendropy.model.birthdeath.fit_pure_birth_model(**kwargs)
Calculates the maximum-likelihood estimate of the birth rate of a set of internal node ages under a Yule (pure-birth) model.
Requires either a Tree object or an iterable of internal node ages to be passed in via the keyword arguments tree or internal_node_ages, respectively. The former is more convenient for one-off calculations, while the latter is more efficient if the list of internal node ages is needed elsewhere and has already been calculated, as it avoids recalculating the ages here.
- Parameters:
**kwargs (keyword arguments, mandatory) –
Exactly one of the following must be specified:
- tree (a Tree object)
A Tree object. The tree needs to be ultrametric for the internal node ages (time from each internal node to the tips) to make sense. The precision with which ultrametricity is checked can be specified using the ultrametricity_precision keyword argument (see below). If tree is given, then internal_node_ages cannot be given, and vice versa. If tree is not given, then internal_node_ages must be given.
- internal_node_ages (iterable of numerical values)
Iterable of node ages of the internal nodes of a tree, i.e., for each internal node, the sum of the edge lengths between that node and the tips of the tree. If internal_node_ages is given, then tree cannot be given, and vice versa. If internal_node_ages is not given, then tree must be given.
The following are optional, and are only used if internal node ages are specified (i.e., ‘internal_node_ages’ is passed in):
- is_node_ages_presorted (bool)
By default, the vector of node ages is sorted. If this argument is specified as True, then this sorting will be skipped, in which case it is the client code’s responsibility to make sure that the node ages are given in REVERSE order (i.e., oldest nodes – nodes closer to the root – given first).
- Returns:
m (dictionary)
A dictionary with keys being parameter names and values being estimates:
- “birth_rate”
The birth rate.
- “log_likelihood”
The log-likelihood of the model and given birth rate.
Examples
Birth rates can be estimated by passing in trees directly:
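    from dendropy.model import birthdeath

    # `trees` is assumed to be an iterable of ultrametric Tree objects.
    for idx, tree in enumerate(trees):
        m = birthdeath.fit_pure_birth_model(tree=tree)
        print("Tree {}: birth rate = {} (logL = {})".format(
            idx+1, m["birth_rate"], m["log_likelihood"]))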
Or by pre-calculating and passing in a list of node ages:
    for idx, tree in enumerate(trees):
        m = birthdeath.fit_pure_birth_model(
                internal_node_ages=tree.internal_node_ages())
        print("Tree {}: birth rate = {} (logL = {})".format(
            idx+1, m["birth_rate"], m["log_likelihood"]))
Notes
Adapted from the laser package for R:
Dan Rabosky and Klaus Schliep (2013). laser: Likelihood Analysis of Speciation/Extinction Rates from Phylogenies. R package version 2.4-1. http://CRAN.R-project.org/package=laser
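For intuition about what is being estimated, the standard closed-form Yule estimator can be written directly from the internal node ages: the number of speciation events after the root split, divided by the total branch length. The following is a minimal sketch, assuming a strictly bifurcating ultrametric tree; yule_mle is a hypothetical helper, and it is an assumption (not confirmed by this page) that its output matches fit_pure_birth_model exactly.

```python
import math


def yule_mle(internal_node_ages):
    """Closed-form pure-birth estimate from internal node ages (sketch)."""
    ages = sorted(internal_node_ages, reverse=True)  # oldest (root) first
    if len(ages) < 2:
        raise ValueError("need at least two internal nodes")
    num_tips = len(ages) + 1
    # Total branch length of a binary ultrametric tree: two lineages leave
    # the root, and each later split adds one more lineage down to the tips.
    tree_length = ages[0] + sum(ages)
    num_events = num_tips - 2  # speciation events after the root split
    birth_rate = num_events / tree_length
    # Log-likelihood conditioned on two lineages at the root.
    log_likelihood = (
        sum(math.log(k) for k in range(2, num_tips))
        + num_events * math.log(birth_rate)
        - birth_rate * tree_length
    )
    return {"birth_rate": birth_rate, "log_likelihood": log_likelihood}


m = yule_mle([3.0, 2.0, 1.0])
```

For the ages 3, 2 and 1 the tree has four tips and total branch length 9, so the estimated birth rate is 2/9.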
- dendropy.model.birthdeath.fit_pure_birth_model_to_tree(tree, ultrametricity_precision=1e-05)
Calculates the maximum-likelihood estimate of the birth rate of a tree under a Yule (pure-birth) model.
- Parameters:
tree (Tree object) – A tree to be fitted.
- Returns:
m (dictionary)
A dictionary with keys being parameter names and values being estimates:
- “birth_rate”
The birth rate.
- “log_likelihood”
The log-likelihood of the model and given birth rate.
Examples
    import dendropy
    from dendropy.model import birthdeath

    trees = dendropy.TreeList.get_from_path(
            "pythonidae.nex", "nexus")
    for idx, tree in enumerate(trees):
        m = birthdeath.fit_pure_birth_model_to_tree(tree)
        print("Tree {}: birth rate = {} (logL = {})".format(
            idx+1, m["birth_rate"], m["log_likelihood"]))