Class Dirichlet
A Dirichlet distribution on probability vectors.
Namespace: Microsoft.ML.Probabilistic.Distributions
Assembly: Microsoft.ML.Probabilistic.dll
Syntax
[Serializable]
[DataContract]
[Quality(QualityBand.Mature)]
public class Dirichlet : IDistribution<Vector>, IDistribution, ICloneable, HasPoint<Vector>, CanGetLogProb<Vector>, SettableTo<Dirichlet>, SettableToProduct<Dirichlet>, SettableToProduct<Dirichlet, Dirichlet>, Diffable, SettableToUniform, SettableToRatio<Dirichlet>, SettableToRatio<Dirichlet, Dirichlet>, SettableToPower<Dirichlet>, SettableToWeightedSum<Dirichlet>, CanGetLogAverageOf<Dirichlet>, CanGetLogAverageOfPower<Dirichlet>, CanGetAverageLog<Dirichlet>, CanGetLogNormalizer, Sampleable<Vector>, CanGetMean<Vector>, CanGetVariance<Vector>, CanGetMeanAndVariance<Vector, Vector>, CanSetMeanAndVariance<Vector, Vector>, CanGetMode<Vector>
Remarks
The Dirichlet is a distribution on probability vectors. The formula for the distribution is p(x) = (Gamma(a)/prod_i Gamma(b_i)) prod_i x_i^{b_i-1} subject to the constraints x_i >= 0 and sum_i x_i = 1. The parameter a is the "total pseudo-count" and is shorthand for sum_i b_i. The vector b contains the pseudo-counts for each case i. The vector b can be sparse or dense; in many cases it is useful to give it a Sparsity specification of ApproximateWithTolerance(Double).
The distribution is represented by the pair (TotalCount, PseudoCount). If TotalCount is infinity, the distribution is a point mass, and the Point property gives the mean. Otherwise TotalCount is always equal to PseudoCount.Sum(). The distribution is uniform when all PseudoCounts = 1. If any PseudoCount <= 0, the distribution is improper. In this case, the density is redefined to not include the Gamma terms, i.e. there is no normalizer.
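As an illustration of the (TotalCount, PseudoCount) representation, the following minimal sketch builds one proper Dirichlet and one point mass and prints the properties discussed above. It assumes a C# 9+ project (top-level statements) that references the Microsoft.ML.Probabilistic package; the variable names and pseudo-count values are arbitrary.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;

// Pseudo-counts b = (2, 3, 5), so the total count a is expected to be 10.
var d = new Dirichlet(2.0, 3.0, 5.0);
Console.WriteLine(d.TotalCount);   // sum of the pseudo-counts
Console.WriteLine(d.IsProper());   // true: every pseudo-count is > 0
Console.WriteLine(d.IsUniform());  // false: the pseudo-counts are not all 1

// A point mass is represented by an infinite TotalCount.
var p = Dirichlet.PointMass(0.2, 0.3, 0.5);
Console.WriteLine(p.IsPointMass);
Console.WriteLine(double.IsPositiveInfinity(p.TotalCount));
```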
Constructors
Dirichlet(Dirichlet)
Copy constructor.
Declaration
public Dirichlet(Dirichlet that)
Parameters
Type | Name | Description |
---|---|---|
Dirichlet | that | The Dirichlet to copy |
Dirichlet(Vector)
Creates a Dirichlet distribution with the specified pseudo-counts. The pseudo-count vector can have any Sparsity specification. A specification of ApproximateWithTolerance(Double) is recommended for sparse problems, and the message functions used in inference will maintain that sparsity specification.
Declaration
[Construction(new string[]{"PseudoCount"})]
public Dirichlet(Vector pseudoCount)
Parameters
Type | Name | Description |
---|---|---|
Vector | pseudoCount | The vector of pseudo-counts |
Dirichlet(Double[])
Creates a Dirichlet distribution with the specified pseudo-counts.
Declaration
public Dirichlet(params double[] pseudoCount)
Parameters
Type | Name | Description |
---|---|---|
Double[] | pseudoCount | An array of pseudo-counts |
Dirichlet(Int32)
Creates a uniform Dirichlet distribution with unit pseudo-counts.
Declaration
protected Dirichlet(int dimension)
Parameters
Type | Name | Description |
---|---|---|
Int32 | dimension | Dimension |
Dirichlet(Int32, Sparsity)
Creates a uniform Dirichlet distribution with unit pseudo-counts and a given dimension and Sparsity.
Declaration
protected Dirichlet(int dimension, Sparsity sparsity)
Parameters
Type | Name | Description |
---|---|---|
Int32 | dimension | Dimension |
Sparsity | sparsity | The Sparsity specification. A specification of ApproximateWithTolerance(Double) is recommended for sparse problems. |
Dirichlet(Int32, Double)
Creates a Dirichlet distribution with every pseudo-count set to the specified initial value. The result is uniform only when that value is 1.
Declaration
protected Dirichlet(int dimension, double initialCount)
Parameters
Type | Name | Description |
---|---|---|
Int32 | dimension | Dimension |
Double | initialCount | Initial value for each pseudocount |
Dirichlet(Int32, Double, Sparsity)
Creates a Dirichlet distribution with the specified dimension and Sparsity, with every pseudo-count set to the specified initial value. The result is uniform only when that value is 1.
Declaration
protected Dirichlet(int dimension, double initialCount, Sparsity sparsity)
Parameters
Type | Name | Description |
---|---|---|
Int32 | dimension | Dimension |
Double | initialCount | Initial value for each pseudocount |
Sparsity | sparsity | The Sparsity specification. A specification of ApproximateWithTolerance(Double) is recommended for sparse problems. |
Fields
AllowImproperSum
If true, SetToSum(Double, Dirichlet, Double, Dirichlet) will use moment matching as described by Minka and Lafferty (2002).
Declaration
public static bool AllowImproperSum
Field Value
Type | Description |
---|---|
Boolean |
PseudoCount
Vector of pseudo-counts
Declaration
[DataMember]
public Vector PseudoCount
Field Value
Type | Description |
---|---|
Vector |
TotalCount
Gets the total count. If infinite, the distribution is a point mass. Otherwise, this is the sum of pseudo-counts
Declaration
[DataMember]
public double TotalCount
Field Value
Type | Description |
---|---|
Double |
Properties
Dimension
Gets the dimension of this Dirichlet
Declaration
public int Dimension { get; }
Property Value
Type | Description |
---|---|
Int32 |
IsPointMass
Whether this Dirichlet is a point mass
Declaration
[IgnoreDataMember]
public bool IsPointMass { get; }
Property Value
Type | Description |
---|---|
Boolean |
Point
Sets/gets this distribution as a point mass
Declaration
[IgnoreDataMember]
public Vector Point { get; set; }
Property Value
Type | Description |
---|---|
Vector |
Sparsity
Gets the Sparsity specification of this Distribution.
Declaration
public Sparsity Sparsity { get; }
Property Value
Type | Description |
---|---|
Sparsity |
Methods
Clone()
Clones this Dirichlet.
Declaration
public object Clone()
Returns
Type | Description |
---|---|
Object | An object which is a clone of the current instance. This must be cast if you want to assign the result to a Dirichlet type |
DirichletLn(Vector)
Computes the log Dirichlet function: sum_i GammaLn(pseudoCount[i]) - GammaLn(sum_i pseudoCount[i])
Declaration
public static double DirichletLn(Vector pseudoCount)
Parameters
Type | Name | Description |
---|---|---|
Vector | pseudoCount | Vector of pseudo-counts. |
Returns
Type | Description |
---|---|
Double |
Remarks
If any pseudoCount <= 0, the result is defined to be 0.
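A small worked case may help check the formula. For pseudo-counts (1, 1, 1) the expression above gives 3*GammaLn(1) - GammaLn(3) = 0 - ln(2) = -ln 2. The sketch below compares the method's result with that hand-computed value. It assumes a C# 9+ project referencing Microsoft.ML.Probabilistic; Vector.FromArray comes from the Microsoft.ML.Probabilistic.Math namespace.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Math;

// b = (1, 1, 1): sum_i GammaLn(1) - GammaLn(3) = 0 - ln(2!) = -ln 2.
double value = Dirichlet.DirichletLn(Vector.FromArray(1.0, 1.0, 1.0));
double expected = -System.Math.Log(2.0);

Console.WriteLine(value);     // result from the library
Console.WriteLine(expected);  // reference value derived from the formula
```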
Equals(Object)
Override of the Equals method
Declaration
public override bool Equals(object thatd)
Parameters
Type | Name | Description |
---|---|---|
Object | thatd | The instance to compare to |
Returns
Type | Description |
---|---|
Boolean | True if the two distributions are the same in value, false otherwise |
EstimateNewton(Vector, Vector)
Modifies PseudoCount to produce the given expected logarithms.
Declaration
public static void EstimateNewton(Vector PseudoCount, Vector meanLog)
Parameters
Type | Name | Description |
---|---|---|
Vector | PseudoCount | On input, the initial guess. On output, the converged solution. |
Vector | meanLog | May be -infinity. |
FromMeanLog(Vector)
Create a Dirichlet distribution with the given expected logarithms.
Declaration
public static Dirichlet FromMeanLog(Vector meanLog)
Parameters
Type | Name | Description |
---|---|---|
Vector | meanLog | Desired expectation E[log(pk)] for each k. |
Returns
Type | Description |
---|---|
Dirichlet | A new Dirichlet where GetMeanLog == meanLog |
Remarks
This function is equivalent to maximum-likelihood estimation of a Dirichlet distribution from data given by sufficient statistics. This function is significantly slower than the other constructors since it involves nonlinear optimization. Uses the Newton algorithm described in "Estimating a Dirichlet distribution" by T. Minka, 2000.
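One way to sanity-check FromMeanLog is a round trip: take the expected logarithms of a known Dirichlet and fit a new distribution to them. The sketch below does this, assuming a C# 9+ project referencing Microsoft.ML.Probabilistic. Because the fit is an iterative optimization, the recovered pseudo-counts should be close to the originals rather than bit-identical.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Math;

var original = new Dirichlet(2.0, 3.0, 5.0);

// E[log(pk)] for each k under the original distribution.
Vector meanLog = original.GetMeanLog();

// Fit a new Dirichlet whose expected logarithms match meanLog.
Dirichlet recovered = Dirichlet.FromMeanLog(meanLog);

Console.WriteLine(recovered);
// Largest parameter difference; expected to be small after convergence.
Console.WriteLine(original.MaxDiff(recovered));
```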
GetAverageLog(Dirichlet)
The expected logarithm of that distribution under this distribution.
Declaration
public double GetAverageLog(Dirichlet that)
Parameters
Type | Name | Description |
---|---|---|
Dirichlet | that | The distribution to take the logarithm of. |
Returns
Type | Description |
---|---|
Double |
Remarks
The negative of this quantity is known as the cross entropy.
GetHashCode()
Override of GetHashCode method
Declaration
public override int GetHashCode()
Returns
Type | Description |
---|---|
Int32 | The hash code for this instance |
GetLogAverageOf(Dirichlet)
The log of the integral of the product of this Dirichlet and that Dirichlet
Declaration
public double GetLogAverageOf(Dirichlet that)
Parameters
Type | Name | Description |
---|---|---|
Dirichlet | that | That Dirichlet |
Returns
Type | Description |
---|---|
Double | The log inner product |
GetLogAverageOfPower(Dirichlet, Double)
Gets the log of the integral of this distribution times another distribution raised to a power.
Declaration
public double GetLogAverageOfPower(Dirichlet that, double power)
Parameters
Type | Name | Description |
---|---|---|
Dirichlet | that | The other distribution |
Double | power | The exponent applied to that |
Returns
Type | Description |
---|---|
Double |
GetLogNormalizer()
Gets the log normalizer for the distribution
Declaration
public double GetLogNormalizer()
Returns
Type | Description |
---|---|
Double |
GetLogProb(Vector)
Evaluates the log of the Dirichlet density function at the given Vector value
Declaration
public double GetLogProb(Vector value)
Parameters
Type | Name | Description |
---|---|---|
Vector | value | Where to do the evaluation. Must be a vector of positive real numbers |
Returns
Type | Description |
---|---|
Double | log(Dir(value;a,b)) |
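For a uniform Dirichlet of dimension 3, the density formula in the class remarks reduces to the constant Gamma(3)/Gamma(1)^3 = 2, so the log-density is expected to be ln 2 at every interior point of the simplex. The sketch below checks one such point. It assumes a C# 9+ project referencing Microsoft.ML.Probabilistic; the evaluation point is arbitrary.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Math;

var uniform = Dirichlet.Uniform(3);

// Any probability vector with positive entries that sum to 1.
Vector x = Vector.FromArray(0.2, 0.3, 0.5);

double logProb = uniform.GetLogProb(x);
Console.WriteLine(logProb);               // result from the library
Console.WriteLine(System.Math.Log(2.0));  // reference value: ln(Gamma(3)) = ln 2
```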
GetMean()
Gets the expected value E(x)
Declaration
public Vector GetMean()
Returns
Type | Description |
---|---|
Vector | E(x) |
GetMean(Vector)
Gets the expected value E(x). Provide a vector to put the result
Declaration
public Vector GetMean(Vector result)
Parameters
Type | Name | Description |
---|---|---|
Vector | result | Where to put E(x) |
Returns
Type | Description |
---|---|
Vector | E(x) |
GetMeanAndVariance(Vector, Vector)
Gets the mean m = E(p) = b/s and variance var(p) = m*(1-m)/(1+s), where b is PseudoCount and s is TotalCount.
Declaration
public void GetMeanAndVariance(Vector mean, Vector variance)
Parameters
Type | Name | Description |
---|---|---|
Vector | mean | Where to put the mean |
Vector | variance | Where to put the variance |
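The moment formulas can be checked by hand against the method's output. The sketch below fills two pre-allocated vectors and recomputes each element from PseudoCount and TotalCount. It assumes a C# 9+ project referencing Microsoft.ML.Probabilistic; Vector.Zero is used only to allocate the result buffers.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Math;

var d = new Dirichlet(1.0, 2.0, 3.0);

// The method writes into caller-supplied vectors.
Vector mean = Vector.Zero(d.Dimension);
Vector variance = Vector.Zero(d.Dimension);
d.GetMeanAndVariance(mean, variance);

// Recompute each element from the formulas m = b/s and m*(1-m)/(1+s).
double s = d.TotalCount;
for (int i = 0; i < d.Dimension; i++)
{
    double m = d.PseudoCount[i] / s;
    double v = m * (1 - m) / (1 + s);
    Console.WriteLine($"mean[{i}] = {mean[i]} (by hand: {m}), variance[{i}] = {variance[i]} (by hand: {v})");
}
```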
GetMeanCube()
Computes E[p(x)^3] for each x.
Declaration
public Vector GetMeanCube()
Returns
Type | Description |
---|---|
Vector |
GetMeanLog()
Gets the expected log value E(log(x))
Declaration
public Vector GetMeanLog()
Returns
Type | Description |
---|---|
Vector | E(log(x)) |
GetMeanLog(Vector)
Gets the expected log value E(log(x)). Provide a vector to put the result
Declaration
public Vector GetMeanLog(Vector result)
Parameters
Type | Name | Description |
---|---|---|
Vector | result | Where to put E(log(x)) |
Returns
Type | Description |
---|---|
Vector | E(log(x)) |
GetMeanLogAt(Int32)
Gets E[log prob[sample]], the expected logarithm of a single element of the probability vector.
Declaration
public double GetMeanLogAt(int sample)
Parameters
Type | Name | Description |
---|---|---|
Int32 | sample | a dimension of prob of interest |
Returns
Type | Description |
---|---|
Double | E[log prob[sample]] |
GetMeanSquare()
Computes E[p(x)^2] for each x.
Declaration
public Vector GetMeanSquare()
Returns
Type | Description |
---|---|
Vector |
GetMode()
The most probable vector.
Declaration
public Vector GetMode()
Returns
Type | Description |
---|---|
Vector |
GetMode(Vector)
The most probable vector.
Declaration
public Vector GetMode(Vector result)
Parameters
Type | Name | Description |
---|---|---|
Vector | result | Where to put the mode |
Returns
Type | Description |
---|---|
Vector |
GetVariance()
Gets the variance var(p) = m*(1-m)/(1+s), where m is the mean and s is TotalCount.
Declaration
public Vector GetVariance()
Returns
Type | Description |
---|---|
Vector | The variance |
IsProper()
Whether the distribution is proper or not. It is proper if all pseudo-counts are > 0.
Declaration
public bool IsProper()
Returns
Type | Description |
---|---|
Boolean | true if proper, false otherwise |
IsUniform()
Whether this instance is uniform (i.e. has unit pseudo-counts)
Declaration
public bool IsUniform()
Returns
Type | Description |
---|---|
Boolean | true if uniform, false otherwise |
MaxDiff(Object)
The maximum difference between the parameters of this Dirichlet and that Dirichlet
Declaration
public double MaxDiff(object that)
Parameters
Type | Name | Description |
---|---|---|
Object | that | That Dirichlet |
Returns
Type | Description |
---|---|
Double | The maximum difference |
Remarks
a.MaxDiff(b) == b.MaxDiff(a)
PointMass(Vector)
Creates a point-mass Dirichlet at the specified location
Declaration
[Construction(new string[]{"Point"}, UseWhen = "IsPointMass")]
public static Dirichlet PointMass(Vector mean)
Parameters
Type | Name | Description |
---|---|---|
Vector | mean | Where to locate the point-mass. All elements of the Vector must be positive |
Returns
Type | Description |
---|---|
Dirichlet | The created point mass Dirichlet |
PointMass(Double[])
Creates a point-mass Dirichlet at the specified location
Declaration
public static Dirichlet PointMass(params double[] mean)
Parameters
Type | Name | Description |
---|---|---|
Double[] | mean | Where to locate the point-mass. All elements of the array must be positive |
Returns
Type | Description |
---|---|
Dirichlet | The created point mass Dirichlet |
Sample()
Samples from this Dirichlet distribution
Declaration
public Vector Sample()
Returns
Type | Description |
---|---|
Vector | The sample Vector |
Sample(Vector)
Samples from this Dirichlet distribution. Provide a Vector to place the result
Declaration
public Vector Sample(Vector result)
Parameters
Type | Name | Description |
---|---|---|
Vector | result | Where to place the resulting sample |
Returns
Type | Description |
---|---|
Vector | result |
Sample(Vector, Vector)
Sample from a Dirichlet with specified pseudo-counts
Declaration
public static Vector Sample(Vector pseudoCount, Vector result)
Parameters
Type | Name | Description |
---|---|---|
Vector | pseudoCount | The pseudo-count vector |
Vector | result | Where to put the result |
Returns
Type | Description |
---|---|
Vector | result |
SampleFromPseudoCounts(Vector)
Sample from a Dirichlet with specified pseudo-counts
Declaration
public static Vector SampleFromPseudoCounts(Vector pseudoCount)
Parameters
Type | Name | Description |
---|---|---|
Vector | pseudoCount | The pseudo-count vector |
Returns
Type | Description |
---|---|
Vector | A new Vector |
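The three sampling entry points can be compared side by side. The sketch below draws from an instance, draws into a caller-supplied buffer, and draws directly from a pseudo-count vector. It assumes a C# 9+ project referencing Microsoft.ML.Probabilistic. Samples are random, so only the sum-to-one property is predictable.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Math;

var d = new Dirichlet(2.0, 3.0, 5.0);

// Instance method: allocates and returns a new probability vector.
Vector sample = d.Sample();
Console.WriteLine(sample);
Console.WriteLine(sample.Sum());  // expected to be 1 up to rounding

// Instance method with a caller-supplied buffer, avoiding an allocation.
Vector buffer = Vector.Zero(d.Dimension);
d.Sample(buffer);
Console.WriteLine(buffer);

// Static method: sample without constructing a Dirichlet object.
Vector direct = Dirichlet.SampleFromPseudoCounts(Vector.FromArray(0.5, 0.5));
Console.WriteLine(direct);
```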
SetDerivatives(Vector, Vector, Vector, Boolean)
Sets the mean and precision to best match the given derivatives at a point.
Declaration
public void SetDerivatives(Vector x, Vector dLogP, Vector ddLogP, bool forceProper)
Parameters
Type | Name | Description |
---|---|---|
Vector | x | A probability vector |
Vector | dLogP | Desired derivative of log-density at x |
Vector | ddLogP | Desired second derivative of log-density at x |
Boolean | forceProper | If true and both derivatives cannot be matched by a distribution with counts at least 1, match only the first. |
SetMeanAndMeanSquare(Vector, Vector)
Sets the mean, and sets the precision to best match the given mean-squares.
Declaration
public void SetMeanAndMeanSquare(Vector mean, Vector meanSquare)
Parameters
Type | Name | Description |
---|---|---|
Vector | mean | Desired mean in each dimension. Must be in [0,1] and sum to 1. |
Vector | meanSquare | Desired meanSquare in each dimension. Must be in [0,1]. |
Remarks
The resulting distribution will have the given mean but will only approximately match the meanSquare, since the Dirichlet does not have enough parameters. The moment matching formula comes from: "Expectation-Propagation for the Generative Aspect Model", Thomas Minka and John Lafferty, Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence, pp. 352-359, 2002 http://research.microsoft.com/~minka/papers/aspect/
SetMeanAndVariance(Vector, Vector)
Sets the mean, and sets the precision to best match the given variances.
Declaration
public void SetMeanAndVariance(Vector mean, Vector variance)
Parameters
Type | Name | Description |
---|---|---|
Vector | mean | Desired mean in each dimension. Must be in [0,1] and sum to 1. |
Vector | variance | Desired variance in each dimension. Must be non-negative. |
Remarks
The resulting distribution will have the given mean but will only approximately match the variance, since the Dirichlet does not have enough parameters. The moment matching formula comes from: "Expectation-Propagation for the Generative Aspect Model", Thomas Minka and John Lafferty, Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence, pp. 352-359, 2002 http://research.microsoft.com/~minka/papers/aspect/
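A simple way to see the moment matching at work is to feed it the exact mean and variance of a known Dirichlet. The sketch below does so, assuming a C# 9+ project referencing Microsoft.ML.Probabilistic. Since the input moments here come from a genuine Dirichlet, the fitted parameters are expected to land close to the source parameters; for arbitrary variances only the mean is guaranteed to match.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Math;

var source = new Dirichlet(2.0, 3.0, 5.0);
Vector mean = source.GetMean();
Vector variance = source.GetVariance();

// Start from any distribution of the right dimension and overwrite it.
var target = Dirichlet.Uniform(3);
target.SetMeanAndVariance(mean, variance);

Console.WriteLine(target);            // fitted parameters
Console.WriteLine(target.GetMean());  // should equal the requested mean
```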
SetMeanLog(Vector)
Set the Dirichlet parameters to produce the given expected logarithms.
Declaration
public void SetMeanLog(Vector meanLog)
Parameters
Type | Name | Description |
---|---|---|
Vector | meanLog | Desired expectation E[log(pk)] for each k. |
Remarks
This function is equivalent to maximum-likelihood estimation of a Dirichlet distribution from data given by sufficient statistics. This function is significantly slower than the other setters since it involves nonlinear optimization. Uses the Newton algorithm described in "Estimating a Dirichlet distribution" by T. Minka, 2000.
SetTo(Dirichlet)
Sets this Dirichlet instance to have the parameter values of another Dirichlet instance
Declaration
public void SetTo(Dirichlet value)
Parameters
Type | Name | Description |
---|---|---|
Dirichlet | value |
SetToPower(Dirichlet, Double)
Sets the parameters to represent a Dirichlet raised to some power.
Declaration
public void SetToPower(Dirichlet dist, double exponent)
Parameters
Type | Name | Description |
---|---|---|
Dirichlet | dist | The Dirichlet |
Double | exponent | The exponent |
SetToProduct(Dirichlet, Dirichlet)
Sets the parameters to represent the product of two Dirichlets.
Declaration
public void SetToProduct(Dirichlet a, Dirichlet b)
Parameters
Type | Name | Description |
---|---|---|
Dirichlet | a | The first Dirichlet. May refer to this. |
Dirichlet | b | The second Dirichlet. May refer to this. |
Remarks
The result may not be proper, i.e. its parameters may be negative. For example, if you multiply Dirichlet(0.1,0.1) by itself you get Dirichlet(-0.8, -0.8). No error is thrown in this case.
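The example in the remark can be reproduced directly. Multiplying densities adds the exponents b_i - 1, so each product pseudo-count is 0.1 + 0.1 - 1 = -0.8. The sketch below assumes a C# 9+ project referencing Microsoft.ML.Probabilistic; the uniform starting value of the result variable is arbitrary, since SetToProduct overwrites it.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;

var a = new Dirichlet(0.1, 0.1);

// Any Dirichlet of matching dimension can receive the result.
var result = Dirichlet.Uniform(2);
result.SetToProduct(a, a);

Console.WriteLine(result.PseudoCount);  // per the remark: (-0.8, -0.8)
Console.WriteLine(result.IsProper());   // false: the pseudo-counts are negative
```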
SetToRatio(Dirichlet, Dirichlet, Boolean)
Sets the parameters to represent the ratio of two Dirichlets.
Declaration
public void SetToRatio(Dirichlet numerator, Dirichlet denominator, bool forceProper = false)
Parameters
Type | Name | Description |
---|---|---|
Dirichlet | numerator | The numerator Dirichlet. Can be the same object as this. |
Dirichlet | denominator | The denominator Dirichlet. Can be the same object as this. |
Boolean | forceProper | If true, the PseudoCounts of the result are made >= 1, under the constraint that denominator*result has the same mean as numerator. |
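Dividing densities subtracts the exponents b_i - 1, so with forceProper left at its default of false each ratio pseudo-count should be numerator - denominator + 1. The sketch below checks this on a small case, assuming a C# 9+ project referencing Microsoft.ML.Probabilistic; the parameter values are arbitrary.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;

var numerator = new Dirichlet(3.0, 4.0);
var denominator = new Dirichlet(2.0, 2.0);

var ratio = Dirichlet.Uniform(2);
ratio.SetToRatio(numerator, denominator);

// By the density formula: 3 - 2 + 1 = 2 and 4 - 2 + 1 = 3.
Console.WriteLine(ratio.PseudoCount);
```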
SetToSum(Double, Dirichlet, Double, Dirichlet)
Set the parameters to match the moments of a mixture distribution.
Declaration
public void SetToSum(double weight1, Dirichlet dist1, double weight2, Dirichlet dist2)
Parameters
Type | Name | Description |
---|---|---|
Double | weight1 | The first weight |
Dirichlet | dist1 | The first distribution. Can be the same object as this. |
Double | weight2 | The second weight |
Dirichlet | dist2 | The second distribution. Can be the same object as this. |
SetToUniform()
Sets the distribution to be uniform
Declaration
public void SetToUniform()
Symmetric(Int32, Double)
Creates a Dirichlet distribution with all pseudo-counts equal to pseudoCount.
Declaration
public static Dirichlet Symmetric(int dimension, double pseudoCount)
Parameters
Type | Name | Description |
---|---|---|
Int32 | dimension | Dimension |
Double | pseudoCount | The value for each pseudo-count |
Returns
Type | Description |
---|---|
Dirichlet | A new Dirichlet distribution |
Symmetric(Int32, Double, Sparsity)
Creates a Dirichlet distribution of a given sparsity with all pseudo-counts equal to pseudoCount.
Declaration
public static Dirichlet Symmetric(int dimension, double pseudoCount, Sparsity sparsity)
Parameters
Type | Name | Description |
---|---|---|
Int32 | dimension | Dimension |
Double | pseudoCount | The value for each pseudo-count |
Sparsity | sparsity | Sparsity specification |
Returns
Type | Description |
---|---|
Dirichlet | A new Dirichlet distribution |
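For large, mostly identical pseudo-count vectors, the Sparsity overload is the natural entry point. The sketch below creates a 1000-dimensional symmetric Dirichlet with an approximate sparsity specification. It assumes a C# 9+ project referencing Microsoft.ML.Probabilistic and that Sparsity.ApproximateWithTolerance is the static factory in Microsoft.ML.Probabilistic.Math; the tolerance 1e-6 and the other values are arbitrary choices for illustration.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Math;

// Assumed tolerance; values within it of the common value may be stored implicitly.
Sparsity sparsity = Sparsity.ApproximateWithTolerance(1e-6);

var d = Dirichlet.Symmetric(1000, 0.01, sparsity);

Console.WriteLine(d.Dimension);   // 1000
Console.WriteLine(d.TotalCount);  // about 1000 * 0.01 = 10
Console.WriteLine(d.Sparsity);    // the specification carried by the pseudo-count vector
```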
ToString()
ToString override
Declaration
public override string ToString()
Returns
Type | Description |
---|---|
String | String representation of the instance |
ToString(String)
Declaration
public string ToString(string format)
Parameters
Type | Name | Description |
---|---|---|
String | format |
Returns
Type | Description |
---|---|
String |
ToString(String, String)
Declaration
public string ToString(string format, string delimiter)
Parameters
Type | Name | Description |
---|---|---|
String | format | |
String | delimiter |
Returns
Type | Description |
---|---|
String |
Uniform(Int32)
Instantiates a uniform Dirichlet distribution
Declaration
public static Dirichlet Uniform(int dimension)
Parameters
Type | Name | Description |
---|---|---|
Int32 | dimension | Dimension |
Returns
Type | Description |
---|---|
Dirichlet | A new uniform Dirichlet distribution |
Uniform(Int32, Sparsity)
Instantiates a uniform Dirichlet distribution of a given sparsity
Declaration
[Construction(new string[]{"Dimension", "Sparsity"}, UseWhen = "IsUniform")]
public static Dirichlet Uniform(int dimension, Sparsity sparsity)
Parameters
Type | Name | Description |
---|---|---|
Int32 | dimension | Dimension |
Sparsity | sparsity | Sparsity |
Returns
Type | Description |
---|---|
Dirichlet | A new uniform Dirichlet distribution |
WeightedSum<T>(T, Int32, Double, T, Double, T, Sparsity)
Static weighted sum method for distribution types whose mean and variance can both be read and set as Vectors
Declaration
public static T WeightedSum<T>(T result, int dimension, double weight1, T dist1, double weight2, T dist2, Sparsity sparsity)
where T : CanGetMeanAndVariance<Vector, Vector>, CanSetMeanAndVariance<Vector, Vector>, SettableToUniform, SettableTo<T>
Parameters
Type | Name | Description |
---|---|---|
T | result | The resulting distribution |
Int32 | dimension | The vector dimension |
Double | weight1 | First weight |
T | dist1 | First distribution instance |
Double | weight2 | Second weight |
T | dist2 | Second distribution instance |
Sparsity | sparsity | Vector sparsity specification |
Returns
Type | Description |
---|---|
T | Resulting distribution |
Type Parameters
Name | Description |
---|---|
T | The distribution type |
Operators
Division(Dirichlet, Dirichlet)
Creates a Dirichlet distribution which is the ratio of two Dirichlet distributions
Declaration
public static Dirichlet operator /(Dirichlet numerator, Dirichlet denominator)
Parameters
Type | Name | Description |
---|---|---|
Dirichlet | numerator | The numerator distribution |
Dirichlet | denominator | The denominator distribution |
Returns
Type | Description |
---|---|
Dirichlet | The resulting Dirichlet distribution |
ExclusiveOr(Dirichlet, Double)
Raises a distribution to a power.
Declaration
public static Dirichlet operator ^(Dirichlet dist, double exponent)
Parameters
Type | Name | Description |
---|---|---|
Dirichlet | dist | The distribution. |
Double | exponent | The power to raise to. |
Returns
Type | Description |
---|---|
Dirichlet |
Multiply(Dirichlet, Dirichlet)
Creates a Dirichlet distribution which is the product of two Dirichlet distributions
Declaration
public static Dirichlet operator *(Dirichlet a, Dirichlet b)
Parameters
Type | Name | Description |
---|---|---|
Dirichlet | a | The first distribution |
Dirichlet | b | The second distribution |
Returns
Type | Description |
---|---|
Dirichlet | The resulting Dirichlet distribution |
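The three operators can be combined in ordinary expressions. The sketch below multiplies two Dirichlets, divides the product by one factor to recover the other, and raises a distribution to a power. It assumes a C# 9+ project referencing Microsoft.ML.Probabilistic; the parameter values are arbitrary.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;

var a = new Dirichlet(2.0, 3.0);
var b = new Dirichlet(4.0, 1.5);

// Product: pseudo-counts combine as a_i + b_i - 1.
Dirichlet product = a * b;

// Dividing the product by b should recover a.
Dirichlet back = product / b;
Console.WriteLine(a.MaxDiff(back));  // expected to be (close to) 0

// Power: exponents scale, so pseudo-counts become 2*(a_i - 1) + 1.
Dirichlet squared = a ^ 2.0;
Console.WriteLine(squared.PseudoCount);
```

Note that in C# the ^ operator has lower precedence than *, /, and the comparison operators, so parenthesizing powers inside larger expressions, as in (a ^ 2.0) * b, avoids surprises.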