dendropy.calculate.popgenstat: Population Genetics Statistics

Population genetic statistics.

dendropy.calculate.popgenstat.average_number_of_pairwise_differences(char_matrix, ignore_uncertain=True)

Returns $k$, the average number of pairwise differences between sequences, calculated for a character block.
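The statistic can be illustrated with a minimal pure-Python sketch. It assumes the sequences are plain, equal-length strings and does not model `ignore_uncertain`; the helper name `mean_pairwise_differences` is hypothetical and this is not DendroPy's implementation.

```python
from itertools import combinations


def mean_pairwise_differences(sequences):
    """Average number of differing sites over all unordered sequence pairs.

    Hypothetical helper for illustration only; assumes equal-length strings
    and treats every character mismatch as one difference.
    """
    pairs = list(combinations(sequences, 2))
    if not pairs:
        return 0.0
    total = sum(
        sum(1 for a, b in zip(s1, s2) if a != b)
        for s1, s2 in pairs
    )
    return total / len(pairs)


example_sequences = ["GGCTAATCTGA", "GCTTTTTCTGA", "GCTCTCTCTTC"]
k = mean_pairwise_differences(example_sequences)
print(k)
```

With a real character matrix, the library function above would be called on the matrix object instead of a list of strings.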

dendropy.calculate.popgenstat.derived_state_matrix(char_matrix, ancestral_sequence=None, derived_state_alphabet=None, ignore_uncertain=True)

Given a list of CharDataSequence objects and a reference ancestral sequence, returns a list of strings, one for each CharDataSequence object, in which ‘0’ indicates the ancestral state and ‘1’ a derived state.

For example, given:

    GGCTAATCTGA
    GCTTTTTCTGA
    GCTCTCTCTTC

with ancestral sequence:

    GGTTAATCTGA

a site-by-site comparison gives:

    00100000000
    01001100000
    01011100011
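The encoding rule described above can be sketched in a few lines of pure Python. This assumes plain, equal-length strings, treats any mismatch with the ancestral sequence as a derived state, and ignores `derived_state_alphabet` and `ignore_uncertain`; the function name `derived_state_strings` is hypothetical and not part of DendroPy.

```python
def derived_state_strings(sequences, ancestral):
    """Encode each sequence as '0' (matches ancestral) or '1' (derived) per site.

    Hypothetical helper for illustration only; assumes every sequence has the
    same length as the ancestral sequence.
    """
    return [
        "".join("0" if base == anc else "1" for base, anc in zip(seq, ancestral))
        for seq in sequences
    ]


sample = ["GGCTAATCTGA", "GCTTTTTCTGA", "GCTCTCTCTTC"]
encoded = derived_state_strings(sample, "GGTTAATCTGA")
for row in encoded:
    print(row)
```

Each output string has the same length as its input sequence, since every site is compared once.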

dendropy.calculate.popgenstat.nucleotide_diversity(char_matrix, ignore_uncertain=True)

Returns $\pi$, the nucleotide diversity, calculated for a character block.
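As a hedged reading of the usual definition, and assuming the value is reported per site (the normalization by sequence length $L$ is an assumption, not something stated above), the quantity for $n$ sequences can be written as:

$$\pi = \frac{1}{L}\binom{n}{2}^{-1}\sum_{i<j} d_{ij}$$

where $d_{ij}$ is the number of sites at which sequences $i$ and $j$ differ.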

dendropy.calculate.popgenstat.num_segregating_sites(char_matrix, ignore_uncertain=True)

Returns the raw number of segregating sites (polymorphic sites).
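A segregating site is an alignment column with more than one distinct state. The sketch below counts such columns for plain, equal-length strings; the name `count_segregating_sites` is hypothetical, ambiguity codes are not handled, and this is not the library's code.

```python
def count_segregating_sites(sequences):
    """Count alignment columns that contain more than one distinct character.

    Hypothetical helper for illustration only; assumes equal-length strings.
    """
    return sum(1 for column in zip(*sequences) if len(set(column)) > 1)


alignment = ["GGCTAATCTGA", "GCTTTTTCTGA", "GCTCTCTCTTC"]
num_sites = count_segregating_sites(alignment)
print(num_sites)
```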

dendropy.calculate.popgenstat.tajimas_d(char_matrix, ignore_uncertain=True)

Returns Tajima’s D.
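For orientation, the standard textbook form of Tajima's D can be computed from three summary values: the number of sequences, the average number of pairwise differences, and the number of segregating sites. The sketch below follows that standard formula; the function name `tajimas_d_from_summary` and its parameters are hypothetical, and it is not taken from DendroPy's source.

```python
import math


def tajimas_d_from_summary(num_sequences, avg_pairwise_diffs, num_segregating):
    """Tajima's D from n, k and S, using the standard constants a1..e2.

    Hypothetical helper for illustration only. The variance term is zero for
    n < 4 or S < 1, so those inputs are rejected.
    """
    n, k, s = num_sequences, avg_pairwise_diffs, num_segregating
    if n < 4 or s < 1:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # Numerator: difference between the two estimators of theta.
    # Denominator: estimated standard deviation of that difference.
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


d_value = tajimas_d_from_summary(10, 3.888889, 16)
print(d_value)
```

A negative value means the average pairwise difference is smaller than expected from the number of segregating sites, and a positive value means the opposite.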

dendropy.calculate.popgenstat.unfolded_site_frequency_spectrum(char_matrix, ancestral_sequence=None, ignore_uncertain=False, pad=True)

Returns the site frequency spectrum of the list of CharDataSequence objects given by char_matrix, with reference to the ancestral sequence given by ancestral_sequence. If ancestral_sequence is None, the first sequence in char_matrix is taken to be the ancestral sequence.
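An unfolded site frequency spectrum tallies, for each possible count of derived copies, how many sites show that count. The sketch below is a rough illustration for plain, equal-length strings; the name `unfolded_sfs_counts` and the list-based return value are assumptions, and the library's return type, `pad` behaviour, and `ignore_uncertain` handling are not modelled.

```python
def unfolded_sfs_counts(sequences, ancestral):
    """Return a list where entry j is the number of sites at which exactly
    j sequences carry a state different from the ancestral one.

    Hypothetical helper for illustration only; any mismatch with the
    ancestral sequence is counted as derived.
    """
    spectrum = [0] * (len(sequences) + 1)
    for site, ancestral_state in enumerate(ancestral):
        derived = sum(1 for seq in sequences if seq[site] != ancestral_state)
        spectrum[derived] += 1
    return spectrum


samples = ["GGCTAATCTGA", "GCTTTTTCTGA", "GCTCTCTCTTC"]
sfs = unfolded_sfs_counts(samples, "GGTTAATCTGA")
print(sfs)
```

The entries always sum to the number of sites, because every site falls into exactly one count class.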

dendropy.calculate.popgenstat.wattersons_theta(char_matrix, ignore_uncertain=True)

Returns Watterson’s theta (per sequence).
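In its standard form, the per-sequence estimator divides the number of segregating sites by the harmonic number of n - 1, where n is the number of sequences. A minimal sketch of that formula follows; the name `theta_w_per_sequence` and its parameters are hypothetical, and this is not DendroPy's implementation.

```python
def theta_w_per_sequence(num_segregating, num_sequences):
    """Watterson's estimator per sequence: S / a1, with a1 = sum_{i=1}^{n-1} 1/i.

    Hypothetical helper for illustration only; requires at least two sequences.
    """
    if num_sequences < 2:
        raise ValueError("need at least two sequences")
    a1 = sum(1.0 / i for i in range(1, num_sequences))
    return num_segregating / a1


theta = theta_w_per_sequence(7, 3)
print(theta)
```

Dividing the result by the sequence length would give a per-site value instead.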