Class Dirichlet
A Dirichlet distribution on probability vectors.
Namespace: Microsoft.ML.Probabilistic.Distributions
Assembly: Microsoft.ML.Probabilistic.dll
Syntax
[Serializable]
[DataContract]
[Quality(QualityBand.Mature)]
public class Dirichlet : IDistribution<Vector>, IDistribution, ICloneable, HasPoint<Vector>, CanGetLogProb<Vector>, SettableTo<Dirichlet>, SettableToProduct<Dirichlet>, SettableToProduct<Dirichlet, Dirichlet>, Diffable, SettableToUniform, SettableToRatio<Dirichlet>, SettableToRatio<Dirichlet, Dirichlet>, SettableToPower<Dirichlet>, SettableToWeightedSum<Dirichlet>, CanGetLogAverageOf<Dirichlet>, CanGetLogAverageOfPower<Dirichlet>, CanGetAverageLog<Dirichlet>, CanGetLogNormalizer, Sampleable<Vector>, CanGetMean<Vector>, CanGetVariance<Vector>, CanGetMeanAndVariance<Vector, Vector>, CanSetMeanAndVariance<Vector, Vector>, CanGetMode<Vector>
Remarks
The Dirichlet is a distribution on probability vectors. The formula for the distribution is p(x) = (Gamma(a)/prod_i Gamma(b_i)) prod_i x_i^{b_i-1} subject to the constraints x_i >= 0 and sum_i x_i = 1. The parameter a is the "total pseudo-count" and is shorthand for sum_i b_i. The vector b contains the pseudo-counts for each case i. The vector b can be sparse or dense; in many cases it is useful to give it a Sparsity specification of ApproximateWithTolerance(Double).
The distribution is represented by the pair (TotalCount, PseudoCount). If TotalCount is infinity, the distribution is a point mass, and the Point property gives the mean. Otherwise TotalCount is always equal to PseudoCount.Sum(). The distribution is uniform when all PseudoCounts = 1. If any PseudoCount <= 0, the distribution is improper. In this case, the density is redefined to not include the Gamma terms, i.e. there is no normalizer.
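The (TotalCount, PseudoCount) representation can be inspected directly. The following is a minimal sketch that uses only members listed on this page; the variable names are illustrative, and it assumes a project that references the Microsoft.ML.Probabilistic assembly and supports C# top-level statements.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;

// A proper Dirichlet over three outcomes with pseudo-counts b = (1, 2, 3).
Dirichlet d = new Dirichlet(1.0, 2.0, 3.0);
Console.WriteLine(d.TotalCount);    // sum of the pseudo-counts: 1 + 2 + 3
Console.WriteLine(d.IsProper());    // all pseudo-counts are > 0
Console.WriteLine(d.IsUniform());   // not uniform, since the counts are not all 1

// A point mass stores its location in Point and has an infinite TotalCount.
Dirichlet p = Dirichlet.PointMass(0.2, 0.3, 0.5);
Console.WriteLine(p.IsPointMass);
Console.WriteLine(double.IsPositiveInfinity(p.TotalCount));
Console.WriteLine(p.Point);
```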
Constructors
Dirichlet(Dirichlet)
Copy constructor.
Declaration
public Dirichlet(Dirichlet that)
Parameters
| Type | Name | Description |
|---|---|---|
| Dirichlet | that |
Dirichlet(Vector)
Creates a Dirichlet distribution with the specified pseudo-counts. The pseudo-count vector can have any Sparsity specification. A specification of ApproximateWithTolerance(Double) is recommended for sparse problems, and the message functions used in inference will maintain that sparsity specification.
Declaration
[Construction(new string[]{"PseudoCount"})]
public Dirichlet(Vector pseudoCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | pseudoCount | The vector of pseudo-counts |
Dirichlet(Double[])
Creates a Dirichlet distribution with the specified pseudo-counts.
Declaration
public Dirichlet(params double[] pseudoCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Double[] | pseudoCount | An array of pseudo-counts |
Dirichlet(Int32)
Creates a uniform Dirichlet distribution with unit pseudo-counts.
Declaration
protected Dirichlet(int dimension)
Parameters
| Type | Name | Description |
|---|---|---|
| Int32 | dimension | Dimension |
Dirichlet(Int32, Sparsity)
Creates a uniform Dirichlet distribution with unit pseudo-counts and a given dimension and Sparsity.
Declaration
protected Dirichlet(int dimension, Sparsity sparsity)
Parameters
| Type | Name | Description |
|---|---|---|
| Int32 | dimension | Dimension |
| Sparsity | sparsity | The Sparsity specification. A specification of ApproximateWithTolerance(Double) is recommended for sparse problems. |
Dirichlet(Int32, Double)
Creates a uniform Dirichlet distribution with the specified initial pseudo-count for each index.
Declaration
protected Dirichlet(int dimension, double initialCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Int32 | dimension | Dimension |
| Double | initialCount | Initial value for each pseudocount |
Dirichlet(Int32, Double, Sparsity)
Creates a uniform Dirichlet distribution with the specified dimension, initial pseudo-count and Sparsity.
Declaration
protected Dirichlet(int dimension, double initialCount, Sparsity sparsity)
Parameters
| Type | Name | Description |
|---|---|---|
| Int32 | dimension | Dimension |
| Double | initialCount | Initial value for each pseudocount |
| Sparsity | sparsity | The Sparsity specification. A specification of ApproximateWithTolerance(Double) is recommended for sparse problems. |
Fields
AllowImproperSum
If true, SetToSum(Double, Dirichlet, Double, Dirichlet) will use moment matching as described by Minka and Lafferty (2002).
Declaration
public static bool AllowImproperSum
Field Value
| Type | Description |
|---|---|
| Boolean |
PseudoCount
Vector of pseudo-counts
Declaration
[DataMember]
public Vector PseudoCount
Field Value
| Type | Description |
|---|---|
| Vector |
TotalCount
Gets the total count. If infinite, the distribution is a point mass. Otherwise, this is the sum of pseudo-counts
Declaration
[DataMember]
public double TotalCount
Field Value
| Type | Description |
|---|---|
| Double |
Properties
Dimension
Gets the dimension of this Dirichlet
Declaration
public int Dimension { get; }
Property Value
| Type | Description |
|---|---|
| Int32 |
IsPointMass
Whether this Dirichlet is a point mass
Declaration
[IgnoreDataMember]
public bool IsPointMass { get; }
Property Value
| Type | Description |
|---|---|
| Boolean |
Point
Sets/gets this distribution as a point mass
Declaration
[IgnoreDataMember]
public Vector Point { get; set; }
Property Value
| Type | Description |
|---|---|
| Vector |
Sparsity
Gets the Sparsity specification of this Distribution.
Declaration
public Sparsity Sparsity { get; }
Property Value
| Type | Description |
|---|---|
| Sparsity |
Methods
Clone()
Clones this Dirichlet.
Declaration
public object Clone()
Returns
| Type | Description |
|---|---|
| Object | An object which is a clone of the current instance. This must be cast if you want to assign the result to a Dirichlet type |
DirichletLn(Vector)
Computes the log Dirichlet function: sum_i GammaLn(pseudoCount[i]) - GammaLn(sum_i pseudoCount[i])
Declaration
public static double DirichletLn(Vector pseudoCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | pseudoCount | Vector of pseudo-counts. |
Returns
| Type | Description |
|---|---|
| Double | |
Remarks
If any pseudoCount <= 0, the result is defined to be 0.
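As a worked check of the formula in the summary (a sketch, not taken from the library's own documentation), consider the uniform pseudo-count vector (1, 1, 1):

```latex
\begin{aligned}
\mathrm{DirichletLn}\big((1,1,1)\big)
  &= \sum_{i=1}^{3} \ln\Gamma(1) - \ln\Gamma(3) \\
  &= 0 - \ln 2 \approx -0.693
\end{aligned}
```

Its negation, ln 2, is the log of the constant density Gamma(a)/prod_i Gamma(b_i) = 2 that the uniform Dirichlet assigns over three outcomes.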
Equals(Object)
Override of the Equals method
Declaration
public override bool Equals(object thatd)
Parameters
| Type | Name | Description |
|---|---|---|
| Object | thatd | The instance to compare to |
Returns
| Type | Description |
|---|---|
| Boolean | True if the two distributions are the same in value, false otherwise |
Overrides
EstimateNewton(Vector, Vector)
Modifies PseudoCount to produce the given expected logarithms.
Declaration
public static void EstimateNewton(Vector PseudoCount, Vector meanLog)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | PseudoCount | On input, the initial guess. On output, the converged solution. |
| Vector | meanLog | May be -infinity. |
FromMeanLog(Vector)
Create a Dirichlet distribution with the given expected logarithms.
Declaration
public static Dirichlet FromMeanLog(Vector meanLog)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | meanLog | Desired expectation E[log(pk)] for each k. |
Returns
| Type | Description |
|---|---|
| Dirichlet | A new Dirichlet where GetMeanLog == meanLog |
Remarks
This function is equivalent to maximum-likelihood estimation of a Dirichlet distribution from data given by sufficient statistics. This function is significantly slower than the other constructors since it involves nonlinear optimization. Uses the Newton algorithm described in "Estimating a Dirichlet distribution" by T. Minka, 2000.
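One way to see what this method does is a round trip: compute the expected logarithms of a known Dirichlet and fit a new one to them. This is a minimal sketch using members listed on this page; the variable names are illustrative, and how closely the fit matches depends on the convergence of the optimizer, so no exact output is claimed.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Math;

// Start from a known Dirichlet and compute its expected logarithms E[log(p_k)].
Dirichlet source = new Dirichlet(2.0, 3.0, 5.0);
Vector meanLog = source.GetMeanLog();

// Fit a new Dirichlet to those expected logarithms (iterative, so comparatively slow).
Dirichlet fitted = Dirichlet.FromMeanLog(meanLog);

// MaxDiff reports the largest parameter difference; it should be small if the fit converged.
Console.WriteLine(fitted);
Console.WriteLine(fitted.MaxDiff(source));
```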
GetAverageLog(Dirichlet)
The expected logarithm of that distribution under this distribution.
Declaration
public double GetAverageLog(Dirichlet that)
Parameters
| Type | Name | Description |
|---|---|---|
| Dirichlet | that | The distribution to take the logarithm of. |
Returns
| Type | Description |
|---|---|
| Double | |
Remarks
The negative of this quantity is known as the cross entropy.
GetHashCode()
Override of GetHashCode method
Declaration
public override int GetHashCode()
Returns
| Type | Description |
|---|---|
| Int32 | The hash code for this instance |
Overrides
GetLogAverageOf(Dirichlet)
The log of the integral of the product of this Dirichlet and that Dirichlet
Declaration
public double GetLogAverageOf(Dirichlet that)
Parameters
| Type | Name | Description |
|---|---|---|
| Dirichlet | that | That Dirichlet |
Returns
| Type | Description |
|---|---|
| Double | The log inner product |
GetLogAverageOfPower(Dirichlet, Double)
Gets the log of the integral of this distribution times another distribution raised to a power.
Declaration
public double GetLogAverageOfPower(Dirichlet that, double power)
Parameters
| Type | Name | Description |
|---|---|---|
| Dirichlet | that | |
| Double | power |
Returns
| Type | Description |
|---|---|
| Double |
GetLogNormalizer()
Gets the log normalizer for the distribution
Declaration
public double GetLogNormalizer()
Returns
| Type | Description |
|---|---|
| Double |
GetLogProb(Vector)
Evaluates the log of the Dirichlet density function at the given Vector value
Declaration
public double GetLogProb(Vector value)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | value | Where to do the evaluation. Must be a vector of positive real numbers |
Returns
| Type | Description |
|---|---|
| Double | log(Dir(value;a,b)) |
GetMean()
Gets the expected value E(x)
Declaration
public Vector GetMean()
Returns
| Type | Description |
|---|---|
| Vector | E(x) |
GetMean(Vector)
Gets the expected value E(x), writing it into the supplied vector.
Declaration
public Vector GetMean(Vector result)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | result | Where to put E(x) |
Returns
| Type | Description |
|---|---|
| Vector | E(x) |
GetMeanAndVariance(Vector, Vector)
Gets the mean m = E(p) = PseudoCount/s and the variance var(p) = m*(1-m)/(1+s), where s is TotalCount.
Declaration
public void GetMeanAndVariance(Vector mean, Vector variance)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | mean | Where to put the mean |
| Vector | variance | Where to put the variance |
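As a worked example, read the formulas in the summary as the standard Dirichlet moments: the mean m is the pseudo-count vector divided by the total count s, and the variance is computed elementwise from m. For the illustrative pseudo-counts (1, 2, 3):

```latex
\begin{aligned}
s &= 1 + 2 + 3 = 6, \qquad
m = \tfrac{1}{s}(1, 2, 3) = \left(\tfrac{1}{6}, \tfrac{1}{3}, \tfrac{1}{2}\right) \\
\operatorname{var}(p) &= \frac{m \odot (1 - m)}{1 + s}
  = \tfrac{1}{7}\left(\tfrac{5}{36}, \tfrac{2}{9}, \tfrac{1}{4}\right)
  = \left(\tfrac{5}{252}, \tfrac{2}{63}, \tfrac{1}{28}\right)
\end{aligned}
```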
GetMeanCube()
Computes E[p(x)^3] for each x.
Declaration
public Vector GetMeanCube()
Returns
| Type | Description |
|---|---|
| Vector |
GetMeanLog()
Gets the expected log value E(log(x))
Declaration
public Vector GetMeanLog()
Returns
| Type | Description |
|---|---|
| Vector | E(log(x)) |
GetMeanLog(Vector)
Gets the expected log value E(log(x)), writing it into the supplied vector.
Declaration
public Vector GetMeanLog(Vector result)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | result | Where to put E(log(x)) |
Returns
| Type | Description |
|---|---|
| Vector | E(log(x)) |
GetMeanLogAt(Int32)
E[log prob[sample]]
Declaration
public double GetMeanLogAt(int sample)
Parameters
| Type | Name | Description |
|---|---|---|
| Int32 | sample | The index of the element of prob of interest |
Returns
| Type | Description |
|---|---|
| Double | E[log prob[sample]] |
GetMeanSquare()
Computes E[p(x)^2] for each x.
Declaration
public Vector GetMeanSquare()
Returns
| Type | Description |
|---|---|
| Vector |
GetMode()
The most probable vector.
Declaration
public Vector GetMode()
Returns
| Type | Description |
|---|---|
| Vector |
GetMode(Vector)
The most probable vector.
Declaration
public Vector GetMode(Vector result)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | result |
Returns
| Type | Description |
|---|---|
| Vector |
GetVariance()
Gets the variance var(p) = m*(1-m)/(1+s), where m is the mean and s is TotalCount.
Declaration
public Vector GetVariance()
Returns
| Type | Description |
|---|---|
| Vector | The variance |
IsProper()
Whether the distribution is proper or not. It is proper if all pseudo-counts are > 0.
Declaration
public bool IsProper()
Returns
| Type | Description |
|---|---|
| Boolean | true if proper, false otherwise |
IsUniform()
Whether this instance is uniform (i.e. has unit pseudo-counts)
Declaration
public bool IsUniform()
Returns
| Type | Description |
|---|---|
| Boolean | true if uniform, false otherwise |
MaxDiff(Object)
The maximum difference between the parameters of this Dirichlet and that Dirichlet
Declaration
public double MaxDiff(object that)
Parameters
| Type | Name | Description |
|---|---|---|
| Object | that | That Dirichlet |
Returns
| Type | Description |
|---|---|
| Double | The maximum difference |
Remarks
a.MaxDiff(b) == b.MaxDiff(a)
PointMass(Vector)
Creates a point-mass Dirichlet at the specified location
Declaration
[Construction(new string[]{"Point"}, UseWhen = "IsPointMass")]
public static Dirichlet PointMass(Vector mean)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | mean | Where to locate the point-mass. All elements of the Vector must be positive |
Returns
| Type | Description |
|---|---|
| Dirichlet | The created point mass Dirichlet |
PointMass(Double[])
Creates a point-mass Dirichlet at the specified location
Declaration
public static Dirichlet PointMass(params double[] mean)
Parameters
| Type | Name | Description |
|---|---|---|
| Double[] | mean | Where to locate the point-mass. All elements of the array must be positive |
Returns
| Type | Description |
|---|---|
| Dirichlet | The created point mass Dirichlet |
Sample()
Samples from this Dirichlet distribution
Declaration
public Vector Sample()
Returns
| Type | Description |
|---|---|
| Vector | The sample Vector |
Sample(Vector)
Samples from this Dirichlet distribution, placing the result in the supplied Vector.
Declaration
public Vector Sample(Vector result)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | result | Where to place the resulting sample |
Returns
| Type | Description |
|---|---|
| Vector | result |
Sample(Vector, Vector)
Sample from a Dirichlet with specified pseudo-counts
Declaration
public static Vector Sample(Vector pseudoCount, Vector result)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | pseudoCount | The pseudo-count vector |
| Vector | result | Where to put the result |
Returns
| Type | Description |
|---|---|
| Vector | result |
SampleFromPseudoCounts(Vector)
Sample from a Dirichlet with specified pseudo-counts
Declaration
public static Vector SampleFromPseudoCounts(Vector pseudoCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | pseudoCount | The pseudo-count vector |
Returns
| Type | Description |
|---|---|
| Vector | A new Vector |
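The sampling overloads above can be exercised as follows. This is a sketch under stated assumptions: `Rand.Restart` and `Vector.FromArray` are taken to be available from the Microsoft.ML.Probabilistic.Math namespace, the seed value is arbitrary, and no particular sample values are claimed.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Math;

// Assumption: Rand.Restart seeds the library's random number generator,
// which makes the run repeatable. The seed itself is arbitrary.
Rand.Restart(12347);

Dirichlet d = new Dirichlet(1.0, 2.0, 3.0);

// Instance sampling: returns a newly allocated Vector.
Vector x = d.Sample();

// Static sampling from raw pseudo-counts, without building a Dirichlet first.
Vector y = Dirichlet.SampleFromPseudoCounts(Vector.FromArray(1.0, 2.0, 3.0));

// Each sample is a probability vector, so its elements should sum to approximately 1.
Console.WriteLine($"{x} sums to {x.Sum()}");
Console.WriteLine($"{y} sums to {y.Sum()}");
```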
SetDerivatives(Vector, Vector, Vector, Boolean)
Sets the mean and precision to best match the given derivatives at a point.
Declaration
public void SetDerivatives(Vector x, Vector dLogP, Vector ddLogP, bool forceProper)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | x | A probability vector |
| Vector | dLogP | Desired derivative of log-density at x |
| Vector | ddLogP | Desired second derivative of log-density at x |
| Boolean | forceProper | If true and both derivatives cannot be matched by a distribution with counts at least 1, match only the first. |
SetMeanAndMeanSquare(Vector, Vector)
Sets the mean, and sets the precision to best match the given mean-squares.
Declaration
public void SetMeanAndMeanSquare(Vector mean, Vector meanSquare)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | mean | Desired mean in each dimension. Must be in [0,1] and sum to 1. |
| Vector | meanSquare | Desired meanSquare in each dimension. Must be in [0,1]. |
Remarks
The resulting distribution will have the given mean but will only approximately match the meanSquare, since the Dirichlet does not have enough parameters. The moment matching formula comes from: "Expectation-Propagation for the Generative Aspect Model", Thomas Minka and John Lafferty, Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence, pp. 352-359, 2002 http://research.microsoft.com/~minka/papers/aspect/
SetMeanAndVariance(Vector, Vector)
Sets the mean, and sets the precision to best match the given variances.
Declaration
public void SetMeanAndVariance(Vector mean, Vector variance)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | mean | Desired mean in each dimension. Must be in [0,1] and sum to 1. |
| Vector | variance | Desired variance in each dimension. Must be non-negative. |
Remarks
The resulting distribution will have the given mean but will only approximately match the variance, since the Dirichlet does not have enough parameters. The moment matching formula comes from: "Expectation-Propagation for the Generative Aspect Model", Thomas Minka and John Lafferty, Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence, pp. 352-359, 2002 http://research.microsoft.com/~minka/papers/aspect/
SetMeanLog(Vector)
Set the Dirichlet parameters to produce the given expected logarithms.
Declaration
public void SetMeanLog(Vector meanLog)
Parameters
| Type | Name | Description |
|---|---|---|
| Vector | meanLog | Desired expectation E[log(pk)] for each k. |
Remarks
This function is equivalent to maximum-likelihood estimation of a Dirichlet distribution from data given by sufficient statistics. This function is significantly slower than the other setters since it involves nonlinear optimization. Uses the Newton algorithm described in "Estimating a Dirichlet distribution" by T. Minka, 2000.
SetTo(Dirichlet)
Sets this Dirichlet instance to have the parameter values of another Dirichlet instance
Declaration
public void SetTo(Dirichlet value)
Parameters
| Type | Name | Description |
|---|---|---|
| Dirichlet | value |
SetToPower(Dirichlet, Double)
Sets the parameters to represent a Dirichlet raised to some power.
Declaration
public void SetToPower(Dirichlet dist, double exponent)
Parameters
| Type | Name | Description |
|---|---|---|
| Dirichlet | dist | The Dirichlet |
| Double | exponent | The exponent |
SetToProduct(Dirichlet, Dirichlet)
Sets the parameters to represent the product of two Dirichlets.
Declaration
public void SetToProduct(Dirichlet a, Dirichlet b)
Parameters
| Type | Name | Description |
|---|---|---|
| Dirichlet | a | The first Dirichlet. May refer to this. |
| Dirichlet | b | The second Dirichlet. May refer to this. |
Remarks
The result may not be proper, i.e. its parameters may be negative. For example, if you multiply Dirichlet(0.1,0.1) by itself you get Dirichlet(-0.8, -0.8). No error is thrown in this case.
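The numbers in this example follow from multiplying the unnormalized densities. As a sketch, writing a_i and b_i for the pseudo-counts of the two factors (these symbols are introduced here for illustration) and ignoring normalizers:

```latex
\begin{aligned}
\prod_i x_i^{a_i - 1} \cdot \prod_i x_i^{b_i - 1}
  &= \prod_i x_i^{(a_i + b_i - 1) - 1} \\
a_i = b_i = 0.1 \;&\Rightarrow\; a_i + b_i - 1 = -0.8
\end{aligned}
```

The product therefore has pseudo-counts a_i + b_i - 1, which fall below zero whenever a_i + b_i < 1.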
SetToRatio(Dirichlet, Dirichlet, Boolean)
Sets the parameters to represent the ratio of two Dirichlets.
Declaration
public void SetToRatio(Dirichlet numerator, Dirichlet denominator, bool forceProper = false)
Parameters
| Type | Name | Description |
|---|---|---|
| Dirichlet | numerator | The numerator Dirichlet. Can be the same object as this. |
| Dirichlet | denominator | The denominator Dirichlet. Can be the same object as this. |
| Boolean | forceProper | If true, the PseudoCounts of the result are made >= 1, under the constraint that denominator*result has the same mean as numerator. |
SetToSum(Double, Dirichlet, Double, Dirichlet)
Set the parameters to match the moments of a mixture distribution.
Declaration
public void SetToSum(double weight1, Dirichlet dist1, double weight2, Dirichlet dist2)
Parameters
| Type | Name | Description |
|---|---|---|
| Double | weight1 | The first weight |
| Dirichlet | dist1 | The first distribution. Can be the same object as this. |
| Double | weight2 | The second weight |
| Dirichlet | dist2 | The second distribution. Can be the same object as this. |
SetToUniform()
Sets the distribution to be uniform
Declaration
public void SetToUniform()
Symmetric(Int32, Double)
Creates a Dirichlet distribution with all pseudo-counts equal to pseudoCount.
Declaration
public static Dirichlet Symmetric(int dimension, double pseudoCount)
Parameters
| Type | Name | Description |
|---|---|---|
| Int32 | dimension | Dimension |
| Double | pseudoCount | The value for each pseudo-count |
Returns
| Type | Description |
|---|---|
| Dirichlet | A new Dirichlet distribution |
Symmetric(Int32, Double, Sparsity)
Creates a Dirichlet distribution of a given sparsity with all pseudo-counts equal to pseudoCount.
Declaration
public static Dirichlet Symmetric(int dimension, double pseudoCount, Sparsity sparsity)
Parameters
| Type | Name | Description |
|---|---|---|
| Int32 | dimension | Dimension |
| Double | pseudoCount | The value for each pseudo-count |
| Sparsity | sparsity | Sparsity specification |
Returns
| Type | Description |
|---|---|
| Dirichlet | A new Dirichlet distribution |
ToString()
ToString override
Declaration
public override string ToString()
Returns
| Type | Description |
|---|---|
| String | String representation of the instance |
Overrides
ToString(String)
Declaration
public string ToString(string format)
Parameters
| Type | Name | Description |
|---|---|---|
| String | format |
Returns
| Type | Description |
|---|---|
| String |
ToString(String, String)
Declaration
public string ToString(string format, string delimiter)
Parameters
| Type | Name | Description |
|---|---|---|
| String | format | |
| String | delimiter |
Returns
| Type | Description |
|---|---|
| String |
Uniform(Int32)
Instantiates a uniform Dirichlet distribution
Declaration
public static Dirichlet Uniform(int dimension)
Parameters
| Type | Name | Description |
|---|---|---|
| Int32 | dimension | Dimension |
Returns
| Type | Description |
|---|---|
| Dirichlet | A new uniform Dirichlet distribution |
Uniform(Int32, Sparsity)
Instantiates a uniform Dirichlet distribution of a given sparsity
Declaration
[Construction(new string[]{"Dimension", "Sparsity"}, UseWhen = "IsUniform")]
public static Dirichlet Uniform(int dimension, Sparsity sparsity)
Parameters
| Type | Name | Description |
|---|---|---|
| Int32 | dimension | Dimension |
| Sparsity | sparsity | Sparsity |
Returns
| Type | Description |
|---|---|
| Dirichlet | A new uniform Dirichlet distribution |
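The Uniform and Symmetric factory methods can be used as in the sketch below. It assumes that `Sparsity` lives in the Microsoft.ML.Probabilistic.Math namespace and that `ApproximateWithTolerance` is a static factory on it, as the references on this page suggest; the dimension and tolerance are arbitrary illustrative values.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Math;

// Uniform over four outcomes: every pseudo-count is 1.
Dirichlet uniform = Dirichlet.Uniform(4);
Console.WriteLine(uniform.IsUniform());

// Symmetric distribution with pseudo-count 0.5 for each of the four outcomes.
Dirichlet symmetric = Dirichlet.Symmetric(4, 0.5);
Console.WriteLine(symmetric.TotalCount);   // 4 * 0.5

// The same uniform distribution backed by an approximately sparse vector.
// The tolerance 1e-6 is an arbitrary illustrative value.
Dirichlet sparseUniform = Dirichlet.Uniform(4, Sparsity.ApproximateWithTolerance(1e-6));
Console.WriteLine(sparseUniform.Sparsity);
Console.WriteLine(sparseUniform.Dimension);
```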
WeightedSum<T>(T, Int32, Double, T, Double, T, Sparsity)
Static weighted sum method for distribution types for which both mean and variance can be got/set as Vectors
Declaration
public static T WeightedSum<T>(T result, int dimension, double weight1, T dist1, double weight2, T dist2, Sparsity sparsity)
where T : CanGetMeanAndVariance<Vector, Vector>, CanSetMeanAndVariance<Vector, Vector>, SettableToUniform, SettableTo<T>
Parameters
| Type | Name | Description |
|---|---|---|
| T | result | The resulting distribution |
| Int32 | dimension | The vector dimension |
| Double | weight1 | First weight |
| T | dist1 | First distribution instance |
| Double | weight2 | Second weight |
| T | dist2 | Second distribution instance |
| Sparsity | sparsity | Vector sparsity specification |
Returns
| Type | Description |
|---|---|
| T | Resulting distribution |
Type Parameters
| Name | Description |
|---|---|
| T | The distribution type |
Operators
Division(Dirichlet, Dirichlet)
Creates a Dirichlet distribution which is the ratio of two Dirichlet distributions
Declaration
public static Dirichlet operator /(Dirichlet numerator, Dirichlet denominator)
Parameters
| Type | Name | Description |
|---|---|---|
| Dirichlet | numerator | The numerator distribution |
| Dirichlet | denominator | The denominator distribution |
Returns
| Type | Description |
|---|---|
| Dirichlet | The resulting Dirichlet distribution |
ExclusiveOr(Dirichlet, Double)
Raises a distribution to a power.
Declaration
public static Dirichlet operator ^(Dirichlet dist, double exponent)
Parameters
| Type | Name | Description |
|---|---|---|
| Dirichlet | dist | The distribution. |
| Double | exponent | The power to raise to. |
Returns
| Type | Description |
|---|---|
| Dirichlet | |
Multiply(Dirichlet, Dirichlet)
Creates a Dirichlet distribution which is the product of two Dirichlet distributions
Declaration
public static Dirichlet operator *(Dirichlet a, Dirichlet b)
Parameters
| Type | Name | Description |
|---|---|---|
| Dirichlet | a | The first distribution |
| Dirichlet | b | The second distribution |
Returns
| Type | Description |
|---|---|
| Dirichlet | The resulting Dirichlet distribution |
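The three operators can be combined as in the following sketch. The arithmetic in the comments is derived from the density p(x) proportional to prod_i x_i^{b_i-1}, not quoted from the library: a product adds exponents, a ratio subtracts them, and a power scales them. The variable names are illustrative.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;

Dirichlet a = new Dirichlet(2.0, 3.0);
Dirichlet b = new Dirichlet(4.0, 5.0);

// Product: pseudo-counts combine as a_i + b_i - 1.
Dirichlet product = a * b;
Console.WriteLine(product);          // pseudo-counts expected to be (5, 7)

// Ratio undoes the product: (a*b)/b has pseudo-counts (5 - 4 + 1, 7 - 5 + 1).
Dirichlet ratio = product / b;
Console.WriteLine(ratio.MaxDiff(a)); // expected to be 0, or very close to it

// Power: each exponent a_i - 1 is scaled, giving pseudo-counts e*(a_i - 1) + 1.
Dirichlet squared = a ^ 2.0;
Console.WriteLine(squared);          // pseudo-counts expected to be (3, 5)
```

As the SetToProduct remarks note, none of these operations checks that the result is proper, so IsProper() is worth calling when the inputs have small pseudo-counts.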