[Serializable]
[DataContract]
[Quality(QualityBand.Mature)]
public class Dirichlet : IDistribution<Vector>, IDistribution, ICloneable, HasPoint<Vector>, CanGetLogProb<Vector>, SettableTo<Dirichlet>, SettableToProduct<Dirichlet>, SettableToProduct<Dirichlet, Dirichlet>, Diffable, SettableToUniform, SettableToRatio<Dirichlet>, SettableToRatio<Dirichlet, Dirichlet>, SettableToPower<Dirichlet>, SettableToWeightedSum<Dirichlet>, CanGetLogAverageOf<Dirichlet>, CanGetLogAverageOfPower<Dirichlet>, CanGetAverageLog<Dirichlet>, CanGetLogNormalizer, Sampleable<Vector>, CanGetMean<Vector>, CanGetVariance<Vector>, CanGetMeanAndVariance<Vector, Vector>, CanSetMeanAndVariance<Vector, Vector>, CanGetMode<Vector>

Remarks

The Dirichlet is a distribution on probability vectors. The formula for the distribution is p(x) = (Gamma(a)/prod_i Gamma(b_i)) prod_i x_i^{b_i-1} subject to the constraints x_i >= 0 and sum_i x_i = 1. The parameter a is the "total pseudo-count" and is shorthand for sum_i b_i. The vector b contains the pseudo-counts for each case i. The vector b can be sparse or dense; in many cases it is useful to give it a Sparsity specification of ApproximateWithTolerance(Double).
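For readability, the density stated above can be typeset as follows. This is only a restatement of the formula in the remarks, using the same symbols (a for the total pseudo-count, b_i for the per-case pseudo-counts); K for the dimension is a symbol introduced here.

```latex
p(x) \;=\; \frac{\Gamma(a)}{\prod_{i=1}^{K} \Gamma(b_i)} \,\prod_{i=1}^{K} x_i^{\,b_i - 1},
\qquad a = \sum_{i=1}^{K} b_i,
\qquad x_i \ge 0,\quad \sum_{i=1}^{K} x_i = 1 .
```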

The distribution is represented by the pair (TotalCount, PseudoCount). If TotalCount is infinity, the distribution is a point mass, and the Point property gives the mean. Otherwise TotalCount is always equal to PseudoCount.Sum(). The distribution is uniform when all PseudoCounts = 1. If any PseudoCount <= 0, the distribution is improper. In this case, the density is redefined to not include the Gamma terms, i.e. there is no normalizer.
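A minimal sketch of how these states can be inspected, using only members documented on this page. The `Microsoft.ML.Probabilistic.*` namespaces are an assumption (older releases used different namespaces), and the class name `DirichletStateExample` is hypothetical.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions; // assumed namespace for Dirichlet

public static class DirichletStateExample
{
    // Returns TotalCount for a Dirichlet built from explicit pseudo-counts.
    public static double TotalOf(params double[] pseudoCount)
    {
        return new Dirichlet(pseudoCount).TotalCount;
    }

    public static void Main()
    {
        // Proper distribution: TotalCount equals the sum of the pseudo-counts.
        var d = new Dirichlet(1.0, 2.0, 3.0);
        Console.WriteLine(d.TotalCount);

        // Uniform distribution: every pseudo-count is 1.
        Console.WriteLine(Dirichlet.Uniform(3).IsUniform());

        // Point mass: TotalCount is infinite and Point holds the location.
        var point = Dirichlet.PointMass(0.2, 0.3, 0.5);
        Console.WriteLine(point.IsPointMass);
        Console.WriteLine(double.IsPositiveInfinity(point.TotalCount));

        // Improper distribution: the product documented under SetToProduct
        // yields pseudo-counts of -0.8, so IsProper() should report false.
        var small = new Dirichlet(0.1, 0.1);
        Console.WriteLine((small * small).IsProper());
    }
}
```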

Constructors

Dirichlet(Dirichlet)

Copy constructor.

Declaration

public Dirichlet(Dirichlet that)

Parameters

Type	Name	Description
Dirichlet	that

Dirichlet(Vector)

Creates a Dirichlet distribution with the specified pseudo-counts. The pseudo-count vector can have any Sparsity specification. A specification of ApproximateWithTolerance(Double) is recommended for sparse problems, and the message functions used in inference will maintain that sparsity specification.

Declaration

[Construction(new string[]{"PseudoCount"})]
public Dirichlet(Vector pseudoCount)

Parameters

Type	Name	Description
Vector	pseudoCount	The vector of pseudo-counts

Dirichlet(Double[])

Creates a Dirichlet distribution with the specified pseudo-counts.

Declaration

public Dirichlet(params double[] pseudoCount)

Parameters

Type	Name	Description
Double[]	pseudoCount	An array of pseudo-counts

Dirichlet(Int32)

Creates a uniform Dirichlet distribution with unit pseudo-counts.

Declaration

protected Dirichlet(int dimension)

Parameters

Type	Name	Description
Int32	dimension	Dimension

Dirichlet(Int32, Sparsity)

Creates a uniform Dirichlet distribution with unit pseudo-counts and a given dimension and Sparsity.

Declaration

protected Dirichlet(int dimension, Sparsity sparsity)

Parameters

Type	Name	Description
Int32	dimension	Dimension
Sparsity	sparsity	The Sparsity specification. A specification of ApproximateWithTolerance(Double) is recommended for sparse problems.

Dirichlet(Int32, Double)

Creates a Dirichlet distribution with the specified initial pseudo-count for each index. The result is uniform only when initialCount is 1.

Declaration

protected Dirichlet(int dimension, double initialCount)

Parameters

Type	Name	Description
Int32	dimension	Dimension
Double	initialCount	Initial value for each pseudocount

Dirichlet(Int32, Double, Sparsity)

Creates a Dirichlet distribution with the specified dimension, initial pseudo-count and Sparsity. The result is uniform only when initialCount is 1.

Declaration

protected Dirichlet(int dimension, double initialCount, Sparsity sparsity)

Parameters

Type	Name	Description
Int32	dimension	Dimension
Double	initialCount	Initial value for each pseudocount
Sparsity	sparsity	The Sparsity specification. A specification of ApproximateWithTolerance(Double) is recommended for sparse problems.

Fields

AllowImproperSum

If true, SetToSum(Double, Dirichlet, Double, Dirichlet) will use moment matching as described by Minka and Lafferty (2002).

Declaration

public static bool AllowImproperSum

Field Value

Type	Description
Boolean

PseudoCount

Vector of pseudo-counts

Declaration

[DataMember]
public Vector PseudoCount

Field Value

Type	Description
Vector

TotalCount

Gets the total count. If infinite, the distribution is a point mass. Otherwise, this is the sum of the pseudo-counts.

Declaration

[DataMember]
public double TotalCount

Field Value

Type	Description
Double

Properties

Dimension

Gets the dimension of this Dirichlet

Declaration

public int Dimension { get; }

Property Value

Type	Description
Int32

IsPointMass

Whether this Dirichlet is a point mass

Declaration

[IgnoreDataMember]
public bool IsPointMass { get; }

Property Value

Type	Description
Boolean

Point

Sets/gets this distribution as a point mass

Declaration

[IgnoreDataMember]
public Vector Point { get; set; }

Property Value

Type	Description
Vector

Sparsity

Gets the Sparsity specification of this Distribution.

Declaration

public Sparsity Sparsity { get; }

Property Value

Type	Description
Sparsity

Methods

Clone()

Clones this Dirichlet.

Declaration

public object Clone()

Returns

Type	Description
Object	An object which is a clone of the current instance. This must be cast if you want to assign the result to a Dirichlet type

DirichletLn(Vector)

Computes the log Dirichlet function: sum_i GammaLn(pseudoCount[i]) - GammaLn(sum_i pseudoCount[i])

Declaration

public static double DirichletLn(Vector pseudoCount)

Parameters

Type	Name	Description
Vector	pseudoCount	Vector of pseudo-counts.

Returns

Type	Description
Double	`sum_i GammaLn(pseudoCount[i]) - GammaLn(sum_i pseudoCount[i])`

Remarks

If any pseudoCount <= 0, the result is defined to be 0.
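As a worked instance of the formula above, take the hypothetical pseudo-count vector (1, 2, 3), chosen only for illustration:

```latex
\mathrm{DirichletLn}\big((1,2,3)\big)
  = \ln\Gamma(1) + \ln\Gamma(2) + \ln\Gamma(3) - \ln\Gamma(6)
  = 0 + 0 + \ln 2 - \ln 120
  = -\ln 60 \approx -4.094 .
```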

Equals(Object)

Override of the Equals method

Declaration

public override bool Equals(object thatd)

Parameters

Type	Name	Description
Object	thatd	The instance to compare to

Returns

Type	Description
Boolean	True if the two distributions are the same in value, false otherwise

Overrides

Object.Equals(Object)

EstimateNewton(Vector, Vector)

Modifies PseudoCount to produce the given expected logarithms.

Declaration

public static void EstimateNewton(Vector PseudoCount, Vector meanLog)

Parameters

Type	Name	Description
Vector	PseudoCount	On input, the initial guess. On output, the converged solution.
Vector	meanLog	May be -infinity.

FromMeanLog(Vector)

Create a Dirichlet distribution with the given expected logarithms.

Declaration

public static Dirichlet FromMeanLog(Vector meanLog)

Parameters

Type	Name	Description
Vector	meanLog	Desired expectation E[log(pk)] for each k.

Returns

Type	Description
Dirichlet	A new Dirichlet where GetMeanLog == meanLog

Remarks

This function is equivalent to maximum-likelihood estimation of a Dirichlet distribution from data given by sufficient statistics. This function is significantly slower than the other constructors since it involves nonlinear optimization. Uses the Newton algorithm described in "Estimating a Dirichlet distribution" by T. Minka, 2000.
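The relationship between GetMeanLog and FromMeanLog can be sketched as a round trip. This is an illustration under assumptions: the `Microsoft.ML.Probabilistic.*` namespaces are assumed, the class name is hypothetical, and the recovered pseudo-counts should only be expected to match the originals up to the optimizer's tolerance.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions; // assumed namespace for Dirichlet
using Microsoft.ML.Probabilistic.Math;          // assumed namespace for Vector

public static class MeanLogRoundTripExample
{
    // Builds a Dirichlet, extracts E[log(pk)], and re-estimates the
    // distribution from those expected logarithms.
    public static double RoundTripDifference()
    {
        var original = new Dirichlet(2.0, 3.0, 5.0);
        Vector meanLog = original.GetMeanLog();
        Dirichlet recovered = Dirichlet.FromMeanLog(meanLog);
        // MaxDiff compares the parameters of the two distributions.
        return original.MaxDiff(recovered);
    }

    public static void Main()
    {
        // A small value indicates that the Newton iteration recovered
        // pseudo-counts close to (2, 3, 5).
        Console.WriteLine(RoundTripDifference());
    }
}
```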

GetAverageLog(Dirichlet)

The expected logarithm of that distribution under this distribution.

Declaration

public double GetAverageLog(Dirichlet that)

Parameters

Type	Name	Description
Dirichlet	that	The distribution to take the logarithm of.

Returns

Type	Description
Double	`sum_x this.Evaluate(x)*Math.Log(that.Evaluate(x))`

Remarks

This is the negative of the cross entropy.
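For two proper Dirichlets the expectation has a standard closed form, shown here as a sketch rather than as a description of the library's internals. The symbols are assumptions introduced for illustration: b and a are the pseudo-counts and total count of this distribution, b' and a' are those of `that`, and psi is the digamma function.

```latex
\mathbb{E}_{\text{this}}\!\left[\ln p_{\text{that}}(x)\right]
  = \ln\Gamma(a') - \sum_i \ln\Gamma(b'_i)
    + \sum_i (b'_i - 1)\,\big(\psi(b_i) - \psi(a)\big).
```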

GetHashCode()

Override of GetHashCode method

Declaration

public override int GetHashCode()

Returns

Type	Description
Int32	The hash code for this instance

Overrides

Object.GetHashCode()

GetLogAverageOf(Dirichlet)

The log of the integral of the product of this Dirichlet and that Dirichlet

Declaration

public double GetLogAverageOf(Dirichlet that)

Parameters

Type	Name	Description
Dirichlet	that	That Dirichlet

Returns

Type	Description
Double	The log inner product
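When both distributions are proper and the combined pseudo-counts stay positive, the log inner product can be written in terms of the DirichletLn function documented on this page. This is the standard identity, given as a sketch; b and b' are symbols introduced here for the pseudo-count vectors of this distribution and of `that`.

```latex
\ln \int p_{\text{this}}(x)\, p_{\text{that}}(x)\, dx
  = \mathrm{DirichletLn}(b + b' - 1) - \mathrm{DirichletLn}(b) - \mathrm{DirichletLn}(b').
```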

GetLogAverageOfPower(Dirichlet, Double)

Gets the log of the integral of this distribution times another distribution raised to a power.

Declaration

public double GetLogAverageOfPower(Dirichlet that, double power)

Parameters

Type	Name	Description
Dirichlet	that
Double	power

Returns

Type	Description
Double

GetLogNormalizer()

Gets the log normalizer for the distribution

Declaration

public double GetLogNormalizer()

Returns

Type	Description
Double

GetLogProb(Vector)

Evaluates the log of the Dirichlet density function at the given Vector value

Declaration

public double GetLogProb(Vector value)

Parameters

Type	Name	Description
Vector	value	Where to do the evaluation. Must be a vector of positive real numbers.

Returns

Type	Description
Double	log(Dir(value;a,b))
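A simple sanity check for GetLogProb uses the uniform case: with three unit pseudo-counts the density formula reduces to Gamma(3) = 2 everywhere on the simplex, so the log-density should be close to ln 2 (about 0.693) at any valid point. The sketch below assumes the `Microsoft.ML.Probabilistic.*` namespaces and the `Vector.FromArray` factory; the class name is hypothetical.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions; // assumed namespace for Dirichlet
using Microsoft.ML.Probabilistic.Math;          // assumed namespace for Vector

public static class LogProbExample
{
    // Evaluates the log-density of a 3-dimensional uniform Dirichlet
    // at an arbitrary probability vector.
    public static double UniformLogProb()
    {
        Dirichlet uniform = Dirichlet.Uniform(3);
        Vector x = Vector.FromArray(0.2, 0.3, 0.5); // sums to 1
        return uniform.GetLogProb(x);
    }

    public static void Main()
    {
        Console.WriteLine(UniformLogProb());
        Console.WriteLine(System.Math.Log(2.0)); // reference value ln(Gamma(3))
    }
}
```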

GetMean()

Gets the expected value E(x)

Declaration

public Vector GetMean()

Returns

Type	Description
Vector	E(x)

GetMean(Vector)

Gets the expected value E(x). Provide a vector in which to put the result.

Declaration

public Vector GetMean(Vector result)

Parameters

Type	Name	Description
Vector	result	Where to put E(x)

Returns

Type	Description
Vector	E(x)

GetMeanAndVariance(Vector, Vector)

Gets the mean E(p) = b/a and the variance var(p) = m*(1-m)/(1+a), where b is PseudoCount, a is TotalCount, and m = E(p). All operations are element-wise.

Declaration

public void GetMeanAndVariance(Vector mean, Vector variance)

Parameters

Type	Name	Description
Vector	mean	Where to put the mean
Vector	variance	Where to put the variance
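As a worked example, pseudo-counts (1, 2, 3) give a total count of 6, a mean of (1/6, 1/3, 1/2), and variances of mean*(1-mean)/7, i.e. (5/252, 2/63, 1/28), or roughly (0.0198, 0.0317, 0.0357). The sketch below shows one way to obtain these through the API; the namespaces and the `Vector.Zero` factory are assumptions, and the class name is hypothetical.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions; // assumed namespace for Dirichlet
using Microsoft.ML.Probabilistic.Math;          // assumed namespace for Vector

public static class MeanVarianceExample
{
    // Fills caller-allocated vectors with the mean and variance.
    public static void Compute(out Vector mean, out Vector variance)
    {
        var d = new Dirichlet(1.0, 2.0, 3.0);
        mean = Vector.Zero(d.Dimension);
        variance = Vector.Zero(d.Dimension);
        d.GetMeanAndVariance(mean, variance);
    }

    public static void Main()
    {
        Compute(out Vector mean, out Vector variance);
        Console.WriteLine(mean);
        Console.WriteLine(variance);
    }
}
```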

GetMeanCube()

Computes E[p(x)^3] for each x.

Declaration

public Vector GetMeanCube()

Returns

Type	Description
Vector

GetMeanLog()

Gets the expected log value E(log(x))

Declaration

public Vector GetMeanLog()

Returns

Type	Description
Vector	E(log(x))
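For a proper Dirichlet, the expected logarithm has a standard closed form in terms of the digamma function psi; it is stated here as background and not as a description of the implementation. The symbols b_i and a follow the class remarks.

```latex
\mathbb{E}[\ln x_i] = \psi(b_i) - \psi(a), \qquad a = \sum_i b_i .
% Worked case: b = (1, 1) is uniform on the 2-simplex, and
% \psi(1) - \psi(2) = -1, which agrees with \int_0^1 \ln x \, dx = -1 .
```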

GetMeanLog(Vector)

Gets the expected log value E(log(x)). Provide a vector in which to put the result.

Declaration

public Vector GetMeanLog(Vector result)

Parameters

Type	Name	Description
Vector	result	Where to put E(log(x))

Returns

Type	Description
Vector	E(log(x))

GetMeanLogAt(Int32)

Gets E[log prob[sample]], the expected logarithm of element sample of the probability vector prob.

Declaration

public double GetMeanLogAt(int sample)

Parameters

Type	Name	Description
Int32	sample	Index of the element of prob of interest

Returns

Type	Description
Double	E[log prob[sample]]

GetMeanSquare()

Computes E[p(x)^2] for each x.

Declaration

public Vector GetMeanSquare()

Returns

Type	Description
Vector

GetMode()

The most probable vector.

Declaration

public Vector GetMode()

Returns

Type	Description
Vector

GetMode(Vector)

The most probable vector.

Declaration

public Vector GetMode(Vector result)

Parameters

Type	Name	Description
Vector	result

Returns

Type	Description
Vector

GetVariance()

Gets the variance var(p) = m*(1-m)/(1+a), where m is the mean and a is TotalCount.

Declaration

public Vector GetVariance()

Returns

Type	Description
Vector	The variance

IsProper()

Whether the distribution is proper or not. It is proper if all pseudo-counts are > 0.

Declaration

public bool IsProper()

Returns

Type	Description
Boolean	true if proper, false otherwise

IsUniform()

Whether this instance is uniform (i.e. has unit pseudo-counts)

Declaration

public bool IsUniform()

Returns

Type	Description
Boolean	true if uniform, false otherwise

MaxDiff(Object)

The maximum difference between the parameters of this Dirichlet and that Dirichlet

Declaration

public double MaxDiff(object that)

Parameters

Type	Name	Description
Object	that	That Dirichlet

Returns

Type	Description
Double	The maximum difference

Remarks

a.MaxDiff(b) == b.MaxDiff(a)

PointMass(Vector)

Creates a point-mass Dirichlet at the specified location

Declaration

[Construction(new string[]{"Point"}, UseWhen = "IsPointMass")]
public static Dirichlet PointMass(Vector mean)

Parameters

Type	Name	Description
Vector	mean	Where to locate the point-mass. All elements of the Vector must be positive

Returns

Type	Description
Dirichlet	The created point mass Dirichlet

PointMass(Double[])

Creates a point-mass Dirichlet at the specified location

Declaration

public static Dirichlet PointMass(params double[] mean)

Parameters

Type	Name	Description
Double[]	mean	Where to locate the point-mass. All elements of the array must be positive

Returns

Type	Description
Dirichlet	The created point mass Dirichlet

Sample()

Samples from this Dirichlet distribution

Declaration

public Vector Sample()

Returns

Type	Description
Vector	The sample Vector

Sample(Vector)

Samples from this Dirichlet distribution. Provide a Vector in which to place the result.

Declaration

public Vector Sample(Vector result)

Parameters

Type	Name	Description
Vector	result	Where to place the resulting sample

Returns

Type	Description
Vector	result

Sample(Vector, Vector)

Sample from a Dirichlet with specified pseudo-counts

Declaration

public static Vector Sample(Vector pseudoCount, Vector result)

Parameters

Type	Name	Description
Vector	pseudoCount	The pseudo-count vector
Vector	result	Where to put the result

Returns

Type	Description
Vector	result

SampleFromPseudoCounts(Vector)

Sample from a Dirichlet with specified pseudo-counts

Declaration

public static Vector SampleFromPseudoCounts(Vector pseudoCount)

Parameters

Type	Name	Description
Vector	pseudoCount	The pseudo-count vector

Returns

Type	Description
Vector	A new Vector
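The sampling overloads above can be exercised as in the following sketch. The `Rand.Restart` call for seeding, the `Vector.FromArray` factory, and the namespaces are assumptions about the surrounding library, and the class name is hypothetical. No particular sample values are claimed; the only property relied on is that a draw is a probability vector.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions; // assumed namespace for Dirichlet
using Microsoft.ML.Probabilistic.Math;          // assumed namespace for Vector and Rand

public static class SamplingExample
{
    // Draws one sample from an instance and returns the sum of its elements.
    public static double SampleSum()
    {
        Rand.Restart(12345); // assumed seeding call, for repeatable draws
        var d = new Dirichlet(2.0, 3.0, 5.0);
        Vector sample = d.Sample();
        return sample.Sum();
    }

    public static void Main()
    {
        Console.WriteLine(SampleSum()); // should be 1 up to rounding

        // Static variant: sample directly from a pseudo-count vector.
        Vector pseudoCount = Vector.FromArray(2.0, 3.0, 5.0);
        Console.WriteLine(Dirichlet.SampleFromPseudoCounts(pseudoCount));
    }
}
```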

SetDerivatives(Vector, Vector, Vector, Boolean)

Sets the mean and precision to best match the given derivatives at a point.

Declaration

public void SetDerivatives(Vector x, Vector dLogP, Vector ddLogP, bool forceProper)

Parameters

Type	Name	Description
Vector	x	A probability vector
Vector	dLogP	Desired derivative of log-density at x
Vector	ddLogP	Desired second derivative of log-density at x
Boolean	forceProper	If true and both derivatives cannot be matched by a distribution with counts at least 1, match only the first.

SetMeanAndMeanSquare(Vector, Vector)

Sets the mean, and sets the precision to best match the given mean-squares.

Declaration

public void SetMeanAndMeanSquare(Vector mean, Vector meanSquare)

Parameters

Type	Name	Description
Vector	mean	Desired mean in each dimension. Must be in [0,1] and sum to 1.
Vector	meanSquare	Desired meanSquare in each dimension. Must be in [0,1].

Remarks

The resulting distribution will have the given mean but will only approximately match the meanSquare, since the Dirichlet does not have enough parameters. The moment matching formula comes from: "Expectation-Propagation for the Generative Aspect Model", Thomas Minka and John Lafferty, Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence, pp. 352-359, 2002 http://research.microsoft.com/~minka/papers/aspect/

SetMeanAndVariance(Vector, Vector)

Sets the mean, and sets the precision to best match the given variances.

Declaration

public void SetMeanAndVariance(Vector mean, Vector variance)

Parameters

Type	Name	Description
Vector	mean	Desired mean in each dimension. Must be in [0,1] and sum to 1.
Vector	variance	Desired variance in each dimension. Must be non-negative.

Remarks

The resulting distribution will have the given mean but will only approximately match the variance, since the Dirichlet does not have enough parameters. The moment matching formula comes from: "Expectation-Propagation for the Generative Aspect Model", Thomas Minka and John Lafferty, Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence, pp. 352-359, 2002 http://research.microsoft.com/~minka/papers/aspect/

SetMeanLog(Vector)

Set the Dirichlet parameters to produce the given expected logarithms.

Declaration

public void SetMeanLog(Vector meanLog)

Parameters

Type	Name	Description
Vector	meanLog	Desired expectation E[log(pk)] for each k.

Remarks

This function is equivalent to maximum-likelihood estimation of a Dirichlet distribution from data given by sufficient statistics. This function is significantly slower than the other setters since it involves nonlinear optimization. Uses the Newton algorithm described in "Estimating a Dirichlet distribution" by T. Minka, 2000.

SetTo(Dirichlet)

Sets this Dirichlet instance to have the parameter values of another Dirichlet instance

Declaration

public void SetTo(Dirichlet value)

Parameters

Type	Name	Description
Dirichlet	value

SetToPower(Dirichlet, Double)

Sets the parameters to represent a Dirichlet raised to some power.

Declaration

public void SetToPower(Dirichlet dist, double exponent)

Parameters

Type	Name	Description
Dirichlet	dist	The Dirichlet
Double	exponent	The exponent

SetToProduct(Dirichlet, Dirichlet)

Sets the parameters to represent the product of two Dirichlets.

Declaration

public void SetToProduct(Dirichlet a, Dirichlet b)

Parameters

Type	Name	Description
Dirichlet	a	The first Dirichlet. May refer to `this`.
Dirichlet	b	The second Dirichlet. May refer to `this`.

Remarks

The result may not be proper, i.e. its parameters may be negative. For example, if you multiply Dirichlet(0.1,0.1) by itself you get Dirichlet(-0.8, -0.8). No error is thrown in this case.
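The example in the remarks follows from the rule that multiplying two Dirichlet densities adds their pseudo-counts and subtracts 1 in each dimension (0.1 + 0.1 - 1 = -0.8). A minimal sketch, assuming the `Microsoft.ML.Probabilistic.Distributions` namespace and using a hypothetical class name:

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions; // assumed namespace for Dirichlet

public static class ProductExample
{
    // Multiplies Dirichlet(2, 3) by Dirichlet(4, 5) in place.
    public static Dirichlet Product()
    {
        var a = new Dirichlet(2.0, 3.0);
        var b = new Dirichlet(4.0, 5.0);
        var result = Dirichlet.Uniform(2); // any 2-dimensional instance will do
        result.SetToProduct(a, b);         // pseudo-counts 2+4-1 and 3+5-1
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(Product());

        // The improper case from the remarks: no exception is expected.
        var small = new Dirichlet(0.1, 0.1);
        Dirichlet improper = small * small;
        Console.WriteLine(improper.IsProper());
    }
}
```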

SetToRatio(Dirichlet, Dirichlet, Boolean)

Sets the parameters to represent the ratio of two Dirichlets.

Declaration

public void SetToRatio(Dirichlet numerator, Dirichlet denominator, bool forceProper = false)

Parameters

Type	Name	Description
Dirichlet	numerator	The numerator Dirichlet. Can be the same object as this.
Dirichlet	denominator	The denominator Dirichlet. Can be the same object as this.
Boolean	forceProper	If true, the PseudoCounts of the result are made >= 1, under the constraint that denominator*result has the same mean as numerator.
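With forceProper left at false, the ratio can be understood as the inverse of the product rule. The identity below is a sketch of that arithmetic; the symbols are introduced here for illustration, and the adjustment applied when forceProper is true is not covered.

```latex
\frac{\prod_i x_i^{\,b^{\text{num}}_i - 1}}{\prod_i x_i^{\,b^{\text{den}}_i - 1}}
  = \prod_i x_i^{\,(b^{\text{num}}_i - b^{\text{den}}_i + 1) - 1}
\quad\Longrightarrow\quad
b^{\text{result}}_i = b^{\text{num}}_i - b^{\text{den}}_i + 1 .
```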

SetToSum(Double, Dirichlet, Double, Dirichlet)

Set the parameters to match the moments of a mixture distribution.

Declaration

public void SetToSum(double weight1, Dirichlet dist1, double weight2, Dirichlet dist2)

Parameters

Type	Name	Description
Double	weight1	The first weight
Dirichlet	dist1	The first distribution. Can be the same object as `this`
Double	weight2	The second weight
Dirichlet	dist2	The second distribution. Can be the same object as `this`

SetToUniform()

Sets the distribution to be uniform

Declaration

public void SetToUniform()

Symmetric(Int32, Double)

Creates a Dirichlet distribution with all pseudo-counts equal to pseudoCount.

Declaration

public static Dirichlet Symmetric(int dimension, double pseudoCount)

Parameters

Type	Name	Description
Int32	dimension	Dimension
Double	pseudoCount	The value for each pseudo-count

Returns

Type	Description
Dirichlet	A new Dirichlet distribution

Symmetric(Int32, Double, Sparsity)

Creates a Dirichlet distribution of a given sparsity with all pseudo-counts equal to pseudoCount.

Declaration

public static Dirichlet Symmetric(int dimension, double pseudoCount, Sparsity sparsity)

Parameters

Type	Name	Description
Int32	dimension	Dimension
Double	pseudoCount	The value for each pseudo-count
Sparsity	sparsity	Sparsity specification

Returns

Type	Description
Dirichlet	A new Dirichlet distribution
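A sketch of creating a large symmetric Dirichlet with the sparsity specification recommended elsewhere on this page. The tolerance value 1e-6 and the dimension are arbitrary illustration choices, the namespaces are assumptions, and the class name is hypothetical.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions; // assumed namespace for Dirichlet
using Microsoft.ML.Probabilistic.Math;          // assumed namespace for Sparsity

public static class SparseSymmetricExample
{
    // Creates a 1000-dimensional symmetric Dirichlet whose pseudo-count
    // vector uses an approximate-sparsity specification.
    public static Dirichlet Create()
    {
        Sparsity sparsity = Sparsity.ApproximateWithTolerance(1e-6);
        return Dirichlet.Symmetric(1000, 0.1, sparsity);
    }

    public static void Main()
    {
        Dirichlet d = Create();
        Console.WriteLine(d.Dimension);
        Console.WriteLine(d.TotalCount); // about 1000 * 0.1
        Console.WriteLine(d.Sparsity);
    }
}
```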

ToString()

ToString override

Declaration

public override string ToString()

Returns

Type	Description
String	String representation of the instance

Overrides

Object.ToString()

ToString(String)

Declaration

public string ToString(string format)

Parameters

Type	Name	Description
String	format

Returns

Type	Description
String

ToString(String, String)

Declaration

public string ToString(string format, string delimiter)

Parameters

Type	Name	Description
String	format
String	delimiter

Returns

Type	Description
String

Uniform(Int32)

Instantiates a uniform Dirichlet distribution

Declaration

public static Dirichlet Uniform(int dimension)

Parameters

Type	Name	Description
Int32	dimension	Dimension

Returns

Type	Description
Dirichlet	A new uniform Dirichlet distribution

Uniform(Int32, Sparsity)

Instantiates a uniform Dirichlet distribution of a given sparsity

Declaration

[Construction(new string[]{"Dimension", "Sparsity"}, UseWhen = "IsUniform")]
public static Dirichlet Uniform(int dimension, Sparsity sparsity)

Parameters

Type	Name	Description
Int32	dimension	Dimension
Sparsity	sparsity	Sparsity

Returns

Type	Description
Dirichlet	A new uniform Dirichlet distribution

WeightedSum<T>(T, Int32, Double, T, Double, T, Sparsity)

Static weighted sum method for distribution types for which both mean and variance can be got/set as Vectors

Declaration

public static T WeightedSum<T>(T result, int dimension, double weight1, T dist1, double weight2, T dist2, Sparsity sparsity)
    where T : CanGetMeanAndVariance<Vector, Vector>, CanSetMeanAndVariance<Vector, Vector>, SettableToUniform, SettableTo<T>

Parameters

Type	Name	Description
T	result	The resulting distribution
Int32	dimension	The vector dimension
Double	weight1	First weight
T	dist1	First distribution instance
Double	weight2	Second weight
T	dist2	Second distribution instance
Sparsity	sparsity	Vector sparsity specification

Returns

Type	Description
T	Resulting distribution

Type Parameters

Name	Description
T	The distribution type

Operators

Division(Dirichlet, Dirichlet)

Creates a Dirichlet distribution which is the ratio of two Dirichlet distributions

Declaration

public static Dirichlet operator /(Dirichlet numerator, Dirichlet denominator)

Parameters

Type	Name	Description
Dirichlet	numerator	The numerator distribution
Dirichlet	denominator	The denominator distribution

Returns

Type	Description
Dirichlet	The resulting Dirichlet distribution

ExclusiveOr(Dirichlet, Double)

Raises a distribution to a power.

Declaration

public static Dirichlet operator ^(Dirichlet dist, double exponent)

Parameters

Type	Name	Description
Dirichlet	dist	The distribution.
Double	exponent	The power to raise to.

Returns

Type	Description
Dirichlet	`dist` raised to power `exponent`.

Multiply(Dirichlet, Dirichlet)

Creates a Dirichlet distribution which is the product of two Dirichlet distributions

Declaration

public static Dirichlet operator *(Dirichlet a, Dirichlet b)

Parameters

Type	Name	Description
Dirichlet	a	The first distribution
Dirichlet	b	The second distribution

Returns

Type	Description
Dirichlet	The resulting Dirichlet distribution
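The three operators can be combined as in the sketch below. Two points are worth noting. First, raising a density to a power e scales each exponent, so the pseudo-counts become e*(b_i - 1) + 1; for Dirichlet(2, 3) squared this gives (3, 5). Second, in C# the ^ operator binds more loosely than * and /, so parentheses are advisable when mixing them. The namespace is an assumption and the class name is hypothetical.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions; // assumed namespace for Dirichlet

public static class DirichletOperatorExample
{
    // Dividing a product by one of its factors should recover the other.
    public static Dirichlet RatioOfProduct()
    {
        var a = new Dirichlet(2.0, 3.0);
        var b = new Dirichlet(4.0, 5.0);
        return (a * b) / b; // (5, 7) divided by (4, 5) gives (2, 3)
    }

    // Raises Dirichlet(2, 3) to the power 2.
    public static Dirichlet Squared()
    {
        var a = new Dirichlet(2.0, 3.0);
        return a ^ 2.0; // pseudo-counts 2*(2-1)+1 and 2*(3-1)+1
    }

    public static void Main()
    {
        Console.WriteLine(RatioOfProduct());
        Console.WriteLine(Squared());
    }
}
```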

Implements

IDistribution<T>

IDistribution

System.ICloneable

HasPoint<T>

CanGetLogProb<T>

SettableTo<T>

SettableToProduct<T>

SettableToProduct<T, U>

Diffable

SettableToUniform

SettableToRatio<T>

SettableToRatio<T, U>

SettableToPower<T>

SettableToWeightedSum<T>

CanGetLogAverageOf<T>

CanGetLogAverageOfPower<T>

CanGetAverageLog<T>

CanGetLogNormalizer

Sampleable<T>

CanGetMean<MeanType>

CanGetVariance<VarType>

CanGetMeanAndVariance<MeanType, VarType>

CanSetMeanAndVariance<MeanType, VarType>

CanGetMode<ModeType>