dendropy.calculate.statistics: General Statistics
Functions to calculate some general statistics.
- class dendropy.calculate.statistics.FishersExactTest(table)
Given a 2x2 table:
    a  b
    c  d
represented by a list of lists:
    [[a,b],[c,d]]
this calculates the sum of the probability of this table and all others more extreme, under the null hypothesis that there is no association between the categories represented by the vertical and horizontal axes.
- static probability_of_table(table)
Given a 2x2 table:
    a  b
    c  d
represented by a list of lists:
    [[a,b],[c,d]]
this returns the probability of this table under the null hypothesis of no association between rows and columns, which was shown by Fisher to follow a hypergeometric distribution:
    p = ( choose(a+b, a) * choose(c+d, c) ) / choose(a+b+c+d, a+c)
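As a rough illustration of the formula above and of the sum over more extreme tables described for the class, the following standalone sketch uses only Python's math.comb. It is not DendroPy's implementation. The helper names are hypothetical, and treating "more extreme" as "tables with the same margins that are no more probable than the observed one" (a two-sided reading) is an assumption made for this example.

```python
from math import comb


def table_probability(table):
    """Hypergeometric probability of a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)


def two_sided_p_value(table):
    """Sum the probabilities of all tables with the same margins that are
    no more probable than the observed table (assumed definition of
    'more extreme' for this sketch)."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x):
        # Integer numerator of the hypergeometric probability when the
        # top-left cell equals x; comparing integers avoids float ties.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lowest = max(0, col1 - row2)   # smallest feasible top-left cell
    highest = min(row1, col1)      # largest feasible top-left cell
    extreme = sum(
        weight(x)
        for x in range(lowest, highest + 1)
        if weight(x) <= observed
    )
    return extreme / comb(row1 + row2, col1)


example = [[1, 9], [11, 3]]
p_table = table_probability(example)
p_two_sided = two_sided_p_value(example)
```

The numerators are compared as integers so that tables with mathematically equal probabilities are not separated by floating-point rounding.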
- dendropy.calculate.statistics.empirical_cdf(values, v)
Returns the proportion of values in values that are <= v.
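The proportion described above can be sketched in a few lines of plain Python. This is an illustrative re-statement of the definition, not the library's code, and the function name proportion_at_most is hypothetical.

```python
def proportion_at_most(values, v):
    """Fraction of the elements of `values` that are <= v."""
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    return sum(1 for x in values if x <= v) / len(values)


sample = [1, 2, 3, 4]
half = proportion_at_most(sample, 2)
```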
- dendropy.calculate.statistics.empirical_hpd(values, conf=0.05)
Assuming a unimodal distribution, returns the 0.95 highest posterior density (HPD) interval for a set of samples from a posterior distribution. Adapted from emp.hpd in the “TeachingDemos” R package (Copyright Greg Snow; licensed under the Artistic License).
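One common way to estimate an HPD interval from samples of a unimodal distribution is to sort them and take the narrowest window that leaves out a fraction conf of the points. The sketch below follows that general idea. It is an assumption that this matches the behaviour described above, and the function name shortest_interval is hypothetical.

```python
def shortest_interval(samples, conf=0.05):
    """Narrowest interval over the sorted samples that excludes roughly
    a fraction `conf` of them (assumes a unimodal distribution)."""
    conf = min(conf, 1.0 - conf)
    x = sorted(samples)
    n = len(x)
    excluded = int(round(n * conf))  # number of points left outside
    if excluded == 0:
        raise ValueError("too few samples for the requested conf")
    # Width of each candidate window that drops `excluded` points.
    widths = [x[n - excluded + i] - x[i] for i in range(excluded)]
    best = widths.index(min(widths))
    return x[best], x[n - excluded + best]


samples = [0, 1, 2, 3, 4, 5, 6, 7, 8, 100]
interval = shortest_interval(samples, conf=0.2)
```

With these ten samples and conf=0.2, the two candidate windows have widths 8 and 99, so the narrower one that excludes the outlier at 100 is returned.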
- dendropy.calculate.statistics.mean_and_population_variance(values)
Returns the mean and population variance while only passing over the elements in values once.
- dendropy.calculate.statistics.mean_and_sample_variance(values)
Returns the mean and sample variance while only passing over the elements in values once.
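A single-pass computation of the mean and both variances can be sketched with Welford's online algorithm. The source does not say which one-pass method the library uses, so Welford's algorithm is a stand-in chosen for this example, and the function name one_pass_mean_and_variances is hypothetical.

```python
def one_pass_mean_and_variances(values):
    """Return (mean, population variance, sample variance) while
    iterating over `values` exactly once (Welford's algorithm)."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    if n < 2:
        raise ValueError("at least two values are required")
    return mean, m2 / n, m2 / (n - 1)


mean, pop_var, sample_var = one_pass_mean_and_variances(
    [2, 4, 4, 4, 5, 5, 7, 9]
)
```

The only difference between the two variances is the divisor: n for the population variance and n - 1 for the sample variance.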
- dendropy.calculate.statistics.median(pool)
Returns the median of the sample. From: http://wiki.python.org/moin/SimplePrograms
- dendropy.calculate.statistics.mode(values, bin_size=0.1)
Returns the mode of a set of values.
- dendropy.calculate.statistics.rank(value_to_be_ranked, value_providing_rank)
Returns the rank of value_to_be_ranked in a set of values, values (the value_providing_rank argument). Works even if values is an unordered collection (e.g., a set). A binary search would be an optimized way of doing this if we can constrain values to be an ordered collection.
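The trade-off mentioned above, a linear scan that works on any collection versus a binary search that needs sorted input, can be sketched as follows. The exact definition of rank used by the library is not stated here, so this example assumes that rank means the number of elements strictly smaller than the value. Both function names are hypothetical.

```python
from bisect import bisect_left


def rank_by_scan(value, values):
    """O(n) rank that works on any iterable, including a set.
    Assumed definition: count of elements strictly less than `value`."""
    return sum(1 for x in values if x < value)


def rank_by_bisect(value, sorted_values):
    """O(log n) rank under the same definition; requires that
    `sorted_values` is a sequence sorted in ascending order."""
    return bisect_left(sorted_values, value)


unordered = {7, 1, 5, 3}
scan_rank = rank_by_scan(5, unordered)
bisect_rank = rank_by_bisect(5, sorted(unordered))
```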
- dendropy.calculate.statistics.summarize(values)
Summarizes a sample of values:
    range: tuple pair representing minimum and maximum values
    mean: mean of sample
    median: median of sample
    var: (sample) variance
    sd: (sample) standard deviation
    hpd95: tuple pair representing 5% and 95% HPD
    quant_5_95: tuple pair representing 5% and 95% quantile
- dendropy.calculate.statistics.variance_covariance(data, population_variance=False)
Returns the variance-covariance matrix for data. From: http://www.python-forum.org/pythonforum/viewtopic.php?f=3&t=17441
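For orientation, a variance-covariance matrix can be computed in plain Python as sketched below. The layout of data expected by the library is not described here, so this example assumes each inner list holds the observations of one variable. The function name covariance_matrix_sketch is hypothetical, and only the population_variance flag mirrors the signature above.

```python
def covariance_matrix_sketch(columns, population_variance=False):
    """Variance-covariance matrix for a list of equally long columns.
    Assumption: columns[i] holds all observations of variable i."""
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all columns must have the same length")
    # Divide by n for population (co)variance, n - 1 for sample.
    divisor = n if population_variance else n - 1
    means = [sum(col) / n for col in columns]
    return [
        [
            sum((x - mi) * (y - mj) for x, y in zip(ci, cj)) / divisor
            for cj, mj in zip(columns, means)
        ]
        for ci, mi in zip(columns, means)
    ]


matrix = covariance_matrix_sketch([[1, 2, 3], [2, 4, 6]])
```

The diagonal holds each variable's variance and the off-diagonal entries hold the covariances, so the result is symmetric.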