In this article, we show how to use LensKit to evaluate a recommender written in Python. We wrote this article to help people who want to use LensKit’s built-in evaluation capabilities and comparison algorithms but don’t want to implement their own algorithms in Java. Evaluating an external recommender, whether written in R, Python, or MATLAB, involves three primary steps:

  • Writing the recommender. We will need a simple recommender written in a language other than Java (Python in this case) that can take training data, build a simple model, and generate recommendations for a given list of test users.
  • Setting up a shim class. We will need to write a small class that teaches LensKit how to use our external algorithm.
  • Setting up LensKit evaluation. Finally, we show how to set up an experiment that uses the shim class in a LensKit eval script to evaluate the external recommender.

Note that the data we use to test this recommender is a MovieLens rating dataset. The data consists of movie ratings, with each row in the form <userId,itemId,rating>. You can read more about the dataset here.

Step 1: Install LensKit

You will need a copy of LensKit. This tutorial requires features that are included in LensKit 2.1; you can download the 2.1-M4 milestone release from the LensKit downloads page. Download the binary distribution and unpack it somewhere on your hard drive.

Step 2: Create a Groovy file (eval.groovy)

Before we move into the details of external algorithms, let’s start with a Groovy script that shows a basic recommender evaluation setup. Shown below is a very simple Groovy file that a) sets up a five-fold evaluation using the MovieLens 100k ratings dataset; b) specifies the recommendation algorithm(s), in this case a basic item-user mean algorithm, PersMean; and c) specifies metrics to evaluate the recommendations, in this case topNnDCG and RMSEPredictMetric.

Note: For more details on the LensKit evaluator and the configurations you can put in eval.groovy, see the evaluator manual page.
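For readers less familiar with the two metrics named above, here is a small Python sketch of what RMSE and nDCG measure. It is an illustration only, not LensKit's implementation: the function names are made up for this example, and the log2 rank discount is one common formulation that may differ in detail from what topNnDCG computes.

```python
import math


def rmse(pairs):
    """Root mean squared error over (predicted, actual) rating pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (predicted, actual) pair")
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


def dcg(gains):
    """Discounted cumulative gain; rank 1 is undiscounted since log2(2) = 1."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(ranked_items, true_ratings):
    """nDCG of a ranked item list, using the user's true ratings as gains.

    Items the user never rated contribute zero gain (an assumption of
    this sketch).
    """
    gains = [true_ratings.get(item, 0.0) for item in ranked_items]
    ideal = sorted(true_ratings.values(), reverse=True)[:len(ranked_items)]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0
```

Lower RMSE means the predicted ratings are closer to the held-out ratings, while nDCG approaches 1.0 as the recommender ranks a user's highest-rated items nearer the top of the list.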

To execute the script, save this file and navigate to the directory containing it in a terminal (or command prompt). From that directory, run the command lenskit eval and wait for the LensKit log output to finish; a “Build Successful” message at the end signals that the evaluation succeeded. The results of the experiment are written to eval-results.csv. This step helps you understand the basics and also makes it easy to demonstrate the changes required for an external algorithm.
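Once eval-results.csv exists, a quick way to compare algorithms is to average a metric over the folds. The Python sketch below assumes the file has a header row with an Algorithm column and one column per metric. Those column names, including the RMSE.ByUser used in the usage note, are assumptions, so check the header of your own output file before relying on them.

```python
import csv
from collections import defaultdict


def mean_metric_by_algorithm(rows, metric, algo_col="Algorithm"):
    """Average one metric column over all rows (folds), grouped by algorithm.

    `rows` is any iterable of dicts, such as a csv.DictReader.
    Rows with an empty or missing metric value are skipped.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        value = row.get(metric)
        if value is None or value == "":
            continue
        totals[row[algo_col]] += float(value)
        counts[row[algo_col]] += 1
    return {algo: totals[algo] / counts[algo] for algo in totals}


def summarize(path, metric):
    """Read a results CSV from disk and return the per-algorithm mean."""
    with open(path, newline="") as f:
        return mean_metric_by_algorithm(csv.DictReader(f), metric)
```

For example, summarize("eval-results.csv", "RMSE.ByUser") would return a dictionary mapping each algorithm name to its mean value for that column, provided a column with that name is present.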

Step 3: Create a simple recommender

Let’s write a simple Python script that can generate item recommendations for a list of users. The code below shows a simple recommender that calculates each item’s mean rating, offset by the global mean rating. Note that in this case every user receives the same list of items with the same predicted ratings, similar to the item-mean recommender in LensKit.

An important point to observe: the code requires the following two arguments:

  1. training file (rows of <userId,itemId,rating>): the training set from which a simple recommendation model is built.
  2. users file (rows of <userId>): the list of user IDs for which recommendations are to be generated.
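As a minimal sketch of a script with this interface, the following Python code reads the two files, computes an item-mean score for every item, and prints one line per user and item to standard output. The file name recommender.py, the function names, and the user,item,score output format are assumptions made for this illustration; check the LensKit documentation for the exact format the external scorer expects.

```python
import csv
import sys
from collections import defaultdict


def read_ratings(path):
    """Yield (user, item, rating) triples from a headerless CSV file."""
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if len(row) >= 3:  # tolerate extra columns such as a timestamp
                yield row[0], row[1], float(row[2])


def read_users(path):
    """Return the user IDs listed one per line in the users file."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def item_scores(ratings):
    """Score each item as the global mean plus the item's mean offset from it."""
    ratings = list(ratings)
    if not ratings:
        return {}
    global_mean = sum(r for _, _, r in ratings) / len(ratings)
    offsets = defaultdict(list)
    for _, item, rating in ratings:
        offsets[item].append(rating - global_mean)
    return {item: global_mean + sum(v) / len(v) for item, v in offsets.items()}


def main(train_path, users_path):
    scores = item_scores(read_ratings(train_path))
    out = csv.writer(sys.stdout)
    # Every user gets the same scores because the model is not personalized.
    for user in read_users(users_path):
        for item, score in scores.items():
            out.writerow([user, item, f"{score:.4f}"])


if __name__ == "__main__":
    if len(sys.argv) == 3:
        main(sys.argv[1], sys.argv[2])
    else:
        print("usage: recommender.py TRAIN_FILE USERS_FILE", file=sys.stderr)
```

Run by hand, this would look like python recommender.py train.csv users.csv, where both file names are placeholders for the files LensKit generates.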

Step 4: Update the Groovy file: create and configure the shim class

Now that we have an evaluation script and an external recommender, what we need is an agent to bind them together. The core of a typical LensKit recommender is the ItemScorer, computing individual item scores (typically rating predictions) that are then used for prediction and recommendation. We will use our Python script to pre-compute item scores (rating predictions) that will then be consumed by LensKit for the rest of the process.

To enable this, LensKit provides a PrecomputedItemScorer, an item scorer that just has an in-memory copy of fixed item scores. The ExternalProcessItemScorerBuilder utility class constructs a precomputed item scorer by running an external program (in this case, the Python script) to compute the scores, reading them from the program’s standard output and storing them in the precomputed item scorer. The evaluator needs to know how to run this program, so we need a simple class that implements Provider<ItemScorer> by using the builder to build a precomputed item scorer from our Python code. The class sets up the command-line arguments needed by the program and instructs the item scorer builder to collect its output. For convenience, we will put this class in eval.groovy; here is the full script with our class and a new algorithm block that hooks it into the evaluator:

Let’s have a look at some important code sections of the shim class:

  1. builder.setExecutable("python")

    specifies the executable for the language your code is written in, here Python. It could equally be Ruby, R, MATLAB, or any other language.

  2. builder.addRatingFileArgument(eventDAO)

    specifies the training file generated by LensKit (from the crossfold)

  3. builder.addUserFileArgument(userDAO)

    specifies the users file generated by LensKit, listing the users for whom recommendations are evaluated

  4. builder.addArgument("___") (not shown above)

    passes any additional arguments that your code may require
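To make the mechanism concrete, here is a rough Python analogue of what the builder and the precomputed scorer do together: run an external command, parse its standard output into a score table, and then answer score lookups from memory. This is not LensKit code. The names run_external_scorer and PrecomputedScores are hypothetical, and the user,item,score line format is an assumption of this sketch.

```python
import subprocess
import sys
from collections import defaultdict


def run_external_scorer(command):
    """Run `command` and parse 'user,item,score' lines from its stdout.

    Raises subprocess.CalledProcessError if the program exits with a
    non-zero status. Lines that do not have exactly three fields are skipped.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    scores = defaultdict(dict)
    for line in result.stdout.splitlines():
        parts = line.strip().split(",")
        if len(parts) != 3:
            continue
        user, item, score = parts
        scores[user][item] = float(score)
    return dict(scores)


class PrecomputedScores:
    """Serve fixed scores from memory, with no computation at query time."""

    def __init__(self, scores):
        self._scores = scores

    def score(self, user, item):
        """Return the stored score, or None if the pair was never scored."""
        return self._scores.get(user, {}).get(item)
```

In this analogue, the command list plays the role of the executable plus the arguments added on the builder, for example [sys.executable, "recommender.py", "train.csv", "users.csv"] with placeholder file names.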


Notice the algorithm("ExternalAlgorithm") block added to the Groovy file. In Step 2, we included only one algorithm to evaluate, PersMean. In this step, we add the external algorithm and bind the ItemScorer to the new shim class described above.

Step 5: Finally, run LensKit

To execute the Groovy file, follow the same procedure as in Step 2: navigate to the directory containing eval.groovy in a terminal (or command prompt) and run lenskit eval again.

