dendropy.model.discrete
: Discrete Character Evolution¶
Models and modeling of discrete character evolution.
- class dendropy.model.discrete.DiscreteCharacterEvolutionModel(state_alphabet, stationary_freqs=None, rng=None)[source]¶
Base class for discrete character substitution models.
__init__ initializes the state_alphabet to define the character type on which this model acts. The objects random number generator will be
rng
or ‘GLOBAL_RNG’
- class dendropy.model.discrete.DiscreteCharacterEvolver(seq_model=None, mutation_rate=None, seq_attr='sequences', seq_model_attr='seq_model', edge_length_attr='length', edge_rate_attr='mutation_rate', seq_label_attr='taxon')[source]¶
Evolves sequences on a tree.
__init__ sets up meta-data dealing with object nomenclature and semantics.
- evolve_states(tree, seq_len, root_states=None, simulate_root_states=True, in_place=True, rng=None)[source]¶
Appends a new sequence of length
seq_len
to a list at each node intree
. The attribute name of this list in each node is given byseq_attr
. Ifseq_model
is None,tree.seq_model
orseq_model
at each node must be specified. Ifin_place
is False, the tree is copied first, otherwise original tree is modified. Ifroot_states
is given, this will be used as the sequence for the root. If not, and ifsimulate_root_states
is True, then the sequence for the root will be drawn from the stationary distribution of the character model.
- extend_char_matrix_with_characters_on_tree(char_matrix, tree, include=None, exclude=None)[source]¶
Creates a character matrix with new sequences (or extends sequences of an existing character matrix if provided via
char_matrix
), where the the sequence for each taxon corresponds to the concatenation of all sequences in the list of sequences associated with tip that references the given taxon. Specific sequences to be included/excluded can be fine-tuned using theinclude
andexclude
args, whereinclude=None
means to include all by default, andexclude=None
means to exclude all by default.
- class dendropy.model.discrete.Hky85(kappa=1.0, base_freqs=None, state_alphabet=None, rng=None)[source]¶
Hasegawa et al. 1985 model. Implementation following Swofford et al., 1996.
__init__: if no arguments given, defaults to JC69.
- corrected_substitution_rate(rate)[source]¶
Returns the factor that we have to multiply to the branch length to make branch lengths proportional to # of substitutions per site.
- pij(state_i, state_j, tlen, rate=1.0)[source]¶
Returns probability, p_ij, of going from state i to state j over time tlen at given rate. (tlen * rate = nu, expected number of substitutions)
- pmatrix(tlen, rate=1.0)[source]¶
Returns a matrix of nucleotide substitution probabilities. Based on analytical solution by Swofford et al., 1996. (tlen * rate = nu, expected number of substitutions)
- class dendropy.model.discrete.Jc69(state_alphabet=None, rng=None)[source]¶
Jukes-Cantor 1969 model. Specializes HKY85 such that kappa = 1.0, and base frequencies = [0.25, 0.25, 0.25, 0.25].
__init__: uses Hky85.__init__
- class dendropy.model.discrete.NucleotideCharacterEvolutionModel(base_freqs=None, state_alphabet=None, rng=None)[source]¶
General nucleotide substitution model.
__init__ calls SeqModel.__init__ and sets the base_freqs field
- is_purine(state_index)[source]¶
Returns True if state_index represents a purine (A or G) row or column index: 0, 2
- is_purine_transition(state1_idx, state2_idx)[source]¶
Returns True if the change from state1 to state2, as represented by the row or column indices, is a purine transitional change.
- is_pyrimidine(state_index)[source]¶
Returns True if state_index represents a pyrimidine (C or T) row or column index: 1, 3
- is_pyrimidine_transition(state1_idx, state2_idx)[source]¶
Returns True if the change from state1 to state2, as represented by the row or column indices, is a pyrimidine transitional change.
- is_transition(state1_idx, state2_idx)[source]¶
Returns True if the change from state1 to state2, as represented by the row or column indices, is a transitional change.
- dendropy.model.discrete.hky85_chars(seq_len, tree_model, mutation_rate=1.0, kappa=1.0, base_freqs=[0.25, 0.25, 0.25, 0.25], root_states=None, char_matrix=None, retain_sequences_on_tree=False, rng=None)[source]¶
Convenience class to wrap generation of characters (as a CharacterBlock object) based on the HKY model.
- Parameters:
seq_len (int) – Length of sequence (number of characters).
tree_model (
Tree
) – Tree on which to simulate.mutation_rate (float) – Mutation modifier rate (should be 1.0 if branch lengths on tree reflect true expected number of changes).
root_states`` (list) – Vector of root states (length must equal
seq_len
).char_matrix (
DnaCharacterMatrix
) – If given, new sequences for taxa ontree_model
leaf_nodes will be appended to existing sequences of corresponding taxa in char_matrix; if not, a newDnaCharacterMatrix
object will be created.retain_sequences_on_tree (bool) – If
False
, sequence annotations will be cleared from tree after simulation. Set toTrue
if you want to, e.g., evolve and accumulate different sequences on tree, or retain information for other purposes.rng (random number generator) – If not given, ‘GLOBAL_RNG’ will be used.
- Returns:
d (|DnaCharacterMatrix|) – The simulated alignment.
Since characters will be appended to existing sequences, you can simulate a
sequences under a mixed model by calling this method multiple times with
different character model parameter values and/or different mutation
rates, passing in the same
char_matrix
object each time.
- dendropy.model.discrete.simulate_discrete_char_dataset(seq_len, tree_model, seq_model, mutation_rate=1.0, root_states=None, dataset=None, rng=None)[source]¶
Wrapper to conveniently generate a DataSet simulated under the given tree and character model.
- Parameters:
seq_len (int) – Length of sequence (number of characters).
tree_model (
Tree
) – Tree on which to simulate.seq_model (dendropy.model.discrete.NucleotideCharacterEvolutionModel) – The character substitution model under which to to evolve the characters.
mutation_rate (float) – Mutation modifier rate (should be 1.0 if branch lengths on tree reflect true expected number of changes).
root_states`` (list) – Vector of root states (length must equal
seq_len
).dataset (
DataSet
) – If given, the new dendropy.CharacterMatrix object will be added to this (along with a new taxon_namespace if required). Otherwise, a new dendropy.DataSet object will be created.rng (random number generator) – If not given, ‘GLOBAL_RNG’ will be used.
- Returns:
d (|DataSet|)
- dendropy.model.discrete.simulate_discrete_chars(seq_len, tree_model, seq_model, mutation_rate=1.0, root_states=None, char_matrix=None, retain_sequences_on_tree=False, rng=None)[source]¶
Wrapper to conveniently generate a characters simulated under the given tree and character model.
Since characters will be appended to existing sequences, you can simulate a sequences under a mixed model by calling this method multiple times with different character models and/or different mutation rates, passing in the same
char_matrix
object each time.- Parameters:
seq_len (int) – Length of sequence (number of characters).
tree_model (
Tree
) – Tree on which to simulate.seq_model (dendropy.model.discrete.NucleotideCharacterEvolutionModel) – The character substitution model under which to to evolve the characters.
mutation_rate (float) – Mutation modifier rate (should be 1.0 if branch lengths on tree reflect true expected number of changes).
root_states`` (list) – Vector of root states (length must equal
seq_len
).char_matrix (
DnaCharacterMatrix
) – If given, new sequences for taxa ontree_model
leaf_nodes will be appended to existing sequences of corresponding taxa in char_matrix; if not, a newDnaCharacterMatrix
object will be created.retain_sequences_on_tree (bool) – If
False
, sequence annotations will be cleared from tree after simulation. Set toTrue
if you want to, e.g., evolve and accumulate different sequences on tree, or retain information for other purposes.rng (random number generator) – If not given, ‘GLOBAL_RNG’ will be used.
- Returns:
d (a dendropy.datamodel.CharacterMatrix object.)