Learner API

There are two ways to use the Infer.NET Matchbox recommender. The simplest is the command line, which is explained in detail in the Command-line runners section. This section covers the other approach, the developer API, which is part of the Microsoft.ML.Probabilistic.Learners package. It walks through an overview example of the recommender API; the subsections fill in the details.

Before a recommender is created, a data mapping needs to be instantiated. A data mapping is simply an implementation of an interface which tells the recommender how to read data. This approach was preferred to passing fixed parameters to the system in order to avoid unnecessary data conversions. The mapping defines methods which the recommender calls during training or bulk prediction. Here is a sample implementation of a mapping which provides training instances from a comma-separated file. Each line of this file contains the rating that a user has given to an item, for example “Person Name, Movie Name, 5”.

using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.ML.Probabilistic.Learners;
using Microsoft.ML.Probabilistic.Learners.Mappings;
using Microsoft.ML.Probabilistic.Math;

[Serializable]
class CsvMapping : IStarRatingRecommenderMapping
    <string, Tuple<string, string, int>, string, string, int, NoFeatureSource, Vector>
{
    // Reads one (user, item, rating) triple per line of the comma-separated file.
    public IEnumerable<Tuple<string, string, int>> GetInstances(string instanceSource)
    {
        foreach (string line in File.ReadLines(instanceSource))
        {
            string[] split = line.Split(',');
            yield return Tuple.Create(split[0], split[1], Convert.ToInt32(split[2]));
        }
    }

    public string GetUser(string instanceSource, Tuple<string, string, int> instance)
    { return instance.Item1; }

    public string GetItem(string instanceSource, Tuple<string, string, int> instance)
    { return instance.Item2; }

    public int GetRating(string instanceSource, Tuple<string, string, int> instance)
    { return instance.Item3; }

    // Minimum and maximum rating values present in the data.
    public IStarRatingInfo<int> GetRatingInfo(string instanceSource)
    { return new StarRatingInfo(0, 5); }

    // Features are not used in this example.
    public Vector GetUserFeatures(NoFeatureSource featureSource, string user)
    { throw new NotImplementedException(); }

    public Vector GetItemFeatures(NoFeatureSource featureSource, string item)
    { throw new NotImplementedException(); }
}

This sample code can be found in CsvMapping.cs. The recommender invokes the GetInstances method to read the user-item-rating triples used for training. For each instance, it then obtains the user, item, and rating using the GetUser, GetItem, and GetRating methods. GetRatingInfo tells the system what the minimum and maximum rating values are (0 and 5 here). These values are considered to be data dependent, which is why they are supplied by the mapping rather than designed as a setting. Finally, user and item features are not used in this simple example, so GetUserFeatures and GetItemFeatures simply throw NotImplementedException.
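To make the expected file format concrete, the following is a minimal sketch that writes a tiny ratings file and reads it back with the same parsing logic as GetInstances. It uses only the .NET base class library, so it runs without Infer.NET. The class name RatingsFileSketch, the helper ReadRatings, and the sample rows are made up for illustration and are not part of the library or its samples.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class RatingsFileSketch
{
    // Same parsing steps as CsvMapping.GetInstances: split on commas,
    // take the first two fields as user and item, convert the third to int.
    public static IEnumerable<Tuple<string, string, int>> ReadRatings(string path)
    {
        foreach (string line in File.ReadLines(path))
        {
            string[] split = line.Split(',');
            yield return Tuple.Create(split[0], split[1], Convert.ToInt32(split[2]));
        }
    }

    public static void Main()
    {
        // Hypothetical sample data, written to a temporary location.
        string path = Path.Combine(Path.GetTempPath(), "Ratings.csv");
        File.WriteAllLines(path, new[]
        {
            "Person 1,Movie 1,5",
            "Person 1,Movie 2,3",
            "Person 2,Movie 1,4",
        });

        foreach (Tuple<string, string, int> instance in ReadRatings(path))
        {
            Console.WriteLine($"{instance.Item1} rated {instance.Item2}: {instance.Item3}");
        }
    }
}
```

Because the fields are not trimmed, any space after a comma becomes part of the user or item name, and a name that itself contains a comma would break this simple split. A real mapping may want to handle both cases.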

Once we have the data mapping, we can create a recommender, set relevant settings, and train the system:

var dataMapping = new CsvMapping();
var recommender = MatchboxRecommender.Create(dataMapping);
recommender.Settings.Training.TraitCount = 5;
recommender.Settings.Training.IterationCount = 20;
recommender.Train("Ratings.csv");

The recommender is instantiated using the MatchboxRecommender.Create factory method, which takes the data mapping. The next line sets the number of traits, which were discussed in the Introduction and typically range from 1 to 20. We then set the number of training iterations, which should be between 1 and 100 and is typically greater than 10. Both of these parameters depend on the data and should be tuned. Finally, the recommender is trained on the Ratings.csv file using the Train method. The system knows how to parse this input because the mapping specifies it.

Once training is completed, the recommender can optionally be serialized using the Save method and then deserialized using the MatchboxRecommender.Load static method:

recommender.Save("TrainedModel.bin");
// ... later, for example in a different program:
var recommender = MatchboxRecommender.Load<string, string, string, NoFeatureSource>("TrainedModel.bin");

Finally, recommendations are made using the Recommend method. It takes the user who should receive the recommendations and the number of items to recommend:

var recommendations = recommender.Recommend("Person 1", 10);
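The steps so far can be put together into a short end-to-end sketch that also consumes the result. It assumes that the CsvMapping class from this section is part of the same project, that the Microsoft.ML.Probabilistic.Learners package is referenced, and that Ratings.csv exists and contains ratings by "Person 1". It further assumes that Recommend returns an enumerable sequence of items, which are strings under this mapping. The class name RecommendSketch is made up for illustration.

```csharp
using System;
using Microsoft.ML.Probabilistic.Learners;

static class RecommendSketch
{
    static void Main()
    {
        // CsvMapping is the data mapping defined earlier in this section.
        var recommender = MatchboxRecommender.Create(new CsvMapping());
        recommender.Settings.Training.TraitCount = 5;
        recommender.Settings.Training.IterationCount = 20;
        recommender.Train("Ratings.csv");

        // Ask for up to 10 items for one user and print each of them.
        // Assumption: the result can be enumerated with foreach.
        var recommendations = recommender.Recommend("Person 1", 10);
        foreach (var item in recommendations)
        {
            Console.WriteLine(item);
        }
    }
}
```

The Prediction subsection describes the available prediction methods in more detail.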

An example of a more complex data mapping, one which includes user and item features, can be found in Mappings.StarRatingRecommender. This is the mapping used by the command-line runner. It takes a RecommenderDataset as its instance source, and a RecommenderDataset can be loaded from a file in the format used by the command-line runner.
Here is some example code using this mapping (also see CsvMapping.cs):

RecommenderDataset trainingDataset = RecommenderDataset.Load("RatingsDataset.csv");
var recommender = MatchboxRecommender.Create(Mappings.StarRatingRecommender);
recommender.Settings.Training.TraitCount = 5;
recommender.Settings.Training.IterationCount = 20;
recommender.Train(trainingDataset);
var recommendations = recommender.Recommend(new User("Person 1", Vector.FromArray(2.3)), 10);

Subsections: Data mappings | Setting up | Training | Prediction | Evaluation | Serialization