API Reference

Core Methods
Simple Expectation-Maximization fitting of mixtures of probability densities.
mixem.em(data, distributions, initial_weights=None, max_iterations=100, tol=1e-15, tol_iters=10, progress_callback=<function simple_progress>)

Fit a mixture of probability distributions using the Expectation-Maximization (EM) algorithm.

Parameters:

- data (numpy.ndarray) – The data to fit the distributions to. Can be array-like or a numpy.ndarray.
- distributions (list of mixem.distribution.Distribution) – The list of distributions to fit to the data.
- initial_weights (numpy.ndarray) – Initial weights for the distributions. Must have the same length as distributions. If None, uniform initial weights are used for all distributions.
- max_iterations (int) – The maximum number of iterations to compute.
- tol (float) – The minimum relative increase in log-likelihood over tol_iters iterations; fitting stops once the increase falls below this threshold.
- tol_iters (int) – The number of iterations to look back when computing the relative change in log-likelihood.
- progress_callback (function or None) – A function called to report progress after every iteration.

Return type: tuple (weights, distributions, log_likelihood)
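The fitting loop behind em can be sketched in plain numpy for a two-component univariate normal mixture. This is an illustrative sketch of the E- and M-steps only, not mixem's actual implementation; the function name and the fixed iteration count are assumptions made for brevity:

```python
import numpy as np

def em_normal_mixture(data, mu, sigma, weights, max_iterations=100):
    """Minimal EM for a univariate normal mixture (illustrative sketch)."""
    for _ in range(max_iterations):
        # E-step: responsibilities gamma[n, k] proportional to
        # weight_k * N(x_n | mu_k, sigma_k)
        dens = np.stack([
            w * np.exp(-0.5 * ((data - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))
            for w, m, s in zip(weights, mu, sigma)
        ], axis=1)
        gamma = dens / dens.sum(axis=1, keepdims=True)

        # M-step: weighted maximum-likelihood parameter updates
        nk = gamma.sum(axis=0)
        weights = nk / len(data)
        mu = (gamma * data[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((gamma * (data[:, None] - mu) ** 2).sum(axis=0) / nk)
    return weights, mu, sigma

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 1, 500), rng.normal(3, 1, 500)])
weights, mu, sigma = em_normal_mixture(
    data, np.array([-1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5]))
```

With two well-separated modes, the estimated component means converge close to the generating means (-2 and 3). mixem.em follows the same E/M alternation but works against the Distribution interface described below and adds the tol/tol_iters stopping rule.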
Specifying Distributions

class mixem.distribution.Distribution

Base class for a mixEM probability distribution.

To define your own distribution, implement all methods of this class.
estimate_parameters(data, weights)

Estimate the distribution's parameters using weighted maximum-likelihood estimation and update the parameters in-place.

Parameters:

- data (numpy.ndarray) – The data \(x\) to estimate parameters for. A \((N \times D)\) numpy.ndarray where N is the number of examples and D is the dimensionality of the data.
- weights (numpy.ndarray) – The weights \(\gamma\) for the individual data points. An N-element numpy.ndarray where N is the number of examples.

Choose the parameters \(\phi\) that maximize the weighted log-likelihood function:

\[ll_\gamma(x|\phi) = \sum_{n=1}^N \gamma_{n} \log P(x_n|\phi)\]

Generally, this involves differentiating the log-likelihood function with respect to each parameter. You can set the derivative to zero and solve for the parameter to obtain a closed-form maximum-likelihood estimate, or use a numerical optimizer to find the maximum-likelihood parameters.

Once parameter estimates are found, update the attributes in place.
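For a univariate normal, for example, setting the derivatives of \(ll_\gamma\) to zero yields closed-form weighted estimates. A minimal sketch of what such an estimate_parameters step computes (the function name is hypothetical; mixem's own implementation may differ):

```python
import numpy as np

def weighted_normal_mle(data, weights):
    """Closed-form weighted MLE for a univariate normal distribution."""
    total = weights.sum()
    mu = (weights * data).sum() / total                           # weighted mean
    sigma = np.sqrt((weights * (data - mu) ** 2).sum() / total)   # weighted std
    return mu, sigma

data = np.array([0.0, 1.0, 2.0, 3.0])
weights = np.array([1.0, 1.0, 1.0, 1.0])
mu, sigma = weighted_normal_mle(data, weights)  # mu = 1.5
```

With uniform weights this reduces to the ordinary maximum-likelihood mean and standard deviation; inside EM, the weights are the responsibilities \(\gamma_n\) computed in the E-step.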
log_density(data)

Compute the log-probability density \(\log P(x|\phi)\).

Parameters: data (numpy.ndarray) – The data \(x\) to compute a probability density for. A \((N \times D)\) numpy.ndarray where N is the number of examples and D is the dimensionality of the data.

Returns: The log-probability of observing the data, given the probability distribution's parameters.

Return type: float
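Putting both methods together, a custom distribution can be sketched as a plain class exposing this two-method interface. The class below is a hypothetical illustration for an exponential distribution and is not mixem's built-in ExponentialDistribution:

```python
import numpy as np

class ExponentialSketch:
    """Hypothetical distribution with the log_density / estimate_parameters
    interface described above; for illustration only."""

    def __init__(self, lmbda):
        self.lmbda = lmbda

    def log_density(self, data):
        # log p(x | lambda) = log(lambda) - lambda * x, computed per data point
        return np.log(self.lmbda) - self.lmbda * data

    def estimate_parameters(self, data, weights):
        # Weighted MLE has the closed form lambda = sum(gamma) / sum(gamma * x);
        # update the attribute in place, as the interface requires.
        self.lmbda = weights.sum() / (weights * data).sum()

dist = ExponentialSketch(1.0)
dist.estimate_parameters(np.array([1.0, 3.0]), np.array([1.0, 1.0]))
```

After the update, lmbda equals 2 / 4 = 0.5: the reciprocal of the weighted sample mean, as expected for an exponential distribution.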
class mixem.distribution.ExponentialDistribution(lmbda)

Exponential distribution with parameter (lambda).

class mixem.distribution.GeometricDistribution(p)

Geometric distribution with parameter (p).

class mixem.distribution.NormalDistribution(mu, sigma)

Univariate normal distribution with parameters (mu, sigma).