RecList is an open source library providing behavioral, "black-box" testing for recommender systems.

Overview

RecList

Documentation Status Contributors License Downloads

RecList

Overview

RecList is an open source library providing behavioral, "black-box" testing for recommender systems. Inspired by the pioneering work of Ribeiro et al. 2020 in NLP, we introduce a general plug-and-play procedure to scale up behavioral testing, with an easy-to-extend interface for custom use cases.

RecList ships with some popular datasets and ready-made behavioral tests: check the paper for more details on the relevant literature and the philosophical motivations behind the project.

If you are not familiar with the library, we suggest first taking our small tour to get acquainted with the main abstractions through ready-made models and public datasets.

Quick Links

  • Our paper, with in-depth analysis, detailed use cases and scholarly references.
  • A colab notebook (WIP), showing how to train a cart recommender model from scratch and use the library to test it.
  • Our blog post (forthcoming), with examples and practical tips.

Project updates

Nov. 2021: the library is currently in alpha. We welcome contributions and feedback, but please be advised that the package may change substantially in the upcoming months.

As the project is in active development, come back often for updates.

Summary

This doc is structured as follows:

Quick Start

If you want to see RecList in action, clone the repository, create and activate a virtual env, and install the required packages from root. If you prefer to experiment in an interactive, no-installation-required fashion, try out our colab notebook.

Sample scripts are divided by use-cases: similar items, complementary items or session-based recommendations. When executing one, a suitable public dataset will be downloaded, and a baseline ML model trained: finally, the script will run a pre-made suite of behavioral tests to show typical results.

git clone https://github.com/jacopotagliabue/reclist
cd reclist
python3 -m venv venv
source venv/bin/activate
pip install -e .
python examples/coveo_complementary_rec.py

Running your model on one of the supported dataset, leveraging the pre-made tests, is as easy as implementing a simple interface, RecModel.

Once you've run successfully the sample script, take the guided tour below to learn more about the abstractions and the out-of-the-box capabilities of RecList.

A Guided Tour

An instance of RecList represents a suite of tests for recommender systems: given a dataset (more appropriately, an instance of RecDataset) and a model (an instance of RecModel), it will run the specified tests on the target dataset, using the supplied model.

For example, the following code instantiates a pre-made suite of tests that contains sensible defaults for a cart recommendation use case:

rec_list = CoveoCartRecList(
    model=model,
    dataset=coveo_dataset
)
# invoke rec_list to run tests
rec_list(verbose=True)

Our library pre-packages standard recSys KPIs and important behavioral tests, divided by use cases, but it is built with extensibility in mind: you can re-use tests in new suites, or you can write new domain-specific suites and tests.

Any suite must inherit the RecList interface, and then declare with Pytonic decorators its tests: in this case, the test re-uses a standard function:

class MyRecList(RecList):

    @rec_test(test_type='stats')
    def basic_stats(self):
        """
        Basic statistics on training, test and prediction data
        """
        from reclist.metrics.standard_metrics import statistics
        return statistics(self._x_train,
            self._y_train,
            self._x_test,
            self._y_test,
            self._y_preds)

Any model can be tested, as long as its predictions are wrapped in a RecModel. This allows for pure "black-box" testings, a SaaS provider can be tested just by wrapping the proper API call in the method:

class MyCartModel(RecModel):

    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def predict(self, prediction_input: list, *args, **kwargs):
        """
        Implement the abstract method, accepting a list of lists, each list being
        the content of a cart: the predictions returned by the model are the top K
        items suggested to complete the cart.
        """

        return

While many standard KPIs are available in the package, the philosophy behind RecList is that metrics like Hit Rate provide only a partial picture of the expected behavior of recommenders in the wild: two models with very similar accuracy can have very different behavior on, say, the long-tail, or model A can be better than model B overall, but at the expense of providing disastrous performance on a set of inputs that are particularly important in production.

RecList recognizes that outside of academic benchmarks, some mistakes are worse than others, and not all inputs are created equal: when possible, it tries to operationalize through scalable code behavioral insights for debugging and error analysis; it also provides extensible abstractions when domain knowledge and custom logic are needed.

Once you run a suite of tests, results are dumped automatically and versioned in a local folder, structured as follows (name of the suite, name of the model, run timestamp):

.reclist/
  myList/
    myModel/
      1637357392/
      1637357404/

We provide a simple (and very WIP) UI to easily compare runs and models. After you run two times one of the example scripts, you can do:

cd app
python app.py

to start a local web app that lets you explore test results:

https://github.com/jacopotagliabue/reclist/blob/main/images/explorer.png

If you select more than model, the app will automatically build comparison tables:

https://github.com/jacopotagliabue/reclist/blob/main/images/comparison.png

If you start using RecList as part of your standard testings - either for research or production purposes - you can use the JSON report for machine-to-machine communication with downstream system (e.g. you may want to automatically fail the model pipeline if certain behavioral tests are not passed).

Capabilities

RecList provides a dataset and model agnostic framework to scale up behavioral tests. As long as the proper abstractions are implemented, all the out-of-the-box components can be re-used. For example:

  • you can use a public dataset provided by RecList to train your new cart recommender model, and then use the RecTests we provide for that use case;
  • you can use some baseline model on your custom dataset, to establish a baseline for your project;
  • you can use a custom model, on a private dataset and define from scratch a new suite of tests, mixing existing methods and domain-specific tests.

We list below what we currently support out-of-the-box, with particular focus on datasets and tests, as the models we provide are convenient baselines, but they are not meant to be SOTA research models.

Datasets

RecList features convenient wrappers around popular datasets, to help test models over known benchmarks in a standardized way.

Behavioral Tests

Coming soon!

Roadmap

To do:

  • the app is just a stub: improve the report "contract" and extend the app capabilities, possibly including it in the library itself;
  • continue adding default RecTests by use cases, and test them on public datasets;
  • improving our test suites and refactor some abstractions;
  • adding Colab tutorials, extensive documentation and a blog-like write-up to explain the basic usage.

We maintain a small Trello board on the project which we plan on sharing with the community: more details coming soon!

Contributing

We will update this repo with some guidelines for contributions as soon as the codebase becomes more stable. Check back often for updates!

Acknowledgments

The main contributors are:

If you have questions or feedback, please reach out to: jacopo dot tagliabue at tooso dot ai.

License and Citation

All the code is released under an open MIT license. If you found RecList useful, or you are using it to benchmark/debug your model, please cite our pre-print (forhtcoming):

@inproceedings{Chia2021BeyondNB,
  title={Beyond NDCG: behavioral testing of recommender systems with RecList},
  author={Patrick John Chia and Jacopo Tagliabue and Federico Bianchi and Chloe He and Brian Ko},
  year={2021}
}

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

Owner
Jacopo Tagliabue
I failed the Turing Test once, but that was many friends ago.
Jacopo Tagliabue
The official implementation of "DGCN: Diversified Recommendation with Graph Convolutional Networks" (WWW '21)

DGCN This is the official implementation of our WWW'21 paper: Yu Zheng, Chen Gao, Liang Chen, Depeng Jin, Yong Li, DGCN: Diversified Recommendation wi

FIB LAB, Tsinghua University 37 Dec 18, 2022
基于个性化推荐的音乐播放系统

MusicPlayer 基于个性化推荐的音乐播放系统 Hi, 这是我在大四的时候做的毕设,现如今将该项目开源。 本项目是基于Python的tkinter和pygame所著。 该项目总体来说,代码比较烂(因为当时水平很菜)。 运行的话安装几个基本库就能跑,只不过里面的数据还没有上传至Github。 先

Cedric Niu 6 Nov 19, 2022
Use Jupyter Notebooks to demonstrate how to build a Recommender with Apache Spark & Elasticsearch

Recommendation engines are one of the most well known, widely used and highest value use cases for applying machine learning. Despite this, while there are many resources available for the basics of

International Business Machines 793 Dec 18, 2022
It is a movie recommender web application which is developed using the Python.

Movie Recommendation 🍿 System Watch Tutorial for this project Source IMDB Movie 5000 Dataset Inspired from this original repository. Features Simple

Kushal Bhavsar 10 Dec 26, 2022
Code for ICML2019 Paper "Compositional Invariance Constraints for Graph Embeddings"

Dependencies NOTE: This code has been updated, if you were using this repo earlier and experienced issues that was due to an outaded codebase. Please

Avishek (Joey) Bose 43 Nov 25, 2022
RetaGNN: Relational Temporal Attentive Graph Neural Networks for Holistic Sequential Recommendation

RetaGNN: Relational Temporal Attentive Graph Neural Networks for Holistic Sequential Recommendation Pytorch based implemention of Relational Temporal

28 Dec 28, 2022
Recommender systems are the systems that are designed to recommend things to the user based on many different factors

Recommender systems are the systems that are designed to recommend things to the user based on many different factors. The recommender system deals with a large volume of information present by filte

Happy N. Monday 3 Feb 15, 2022
A framework for large scale recommendation algorithms.

A framework for large scale recommendation algorithms.

Alibaba Group - PAI 880 Jan 03, 2023
reXmeX is recommender system evaluation metric library.

A general purpose recommender metrics library for fair evaluation.

AstraZeneca 258 Dec 22, 2022
Cross Domain Recommendation via Bi-directional Transfer Graph Collaborative Filtering Networks

Bi-TGCF Tensorflow Implementation of BiTGCF: Cross Domain Recommendation via Bi-directional Transfer Graph Collaborative Filtering Networks. in CIKM20

17 Nov 30, 2022
A library of Recommender Systems

A library of Recommender Systems This repository provides a summary of our research on Recommender Systems. It includes our code base on different rec

MilaGraph 980 Jan 05, 2023
An open source movie recommendation WebApp build by movie buffs and mathematicians that uses cosine similarity on the backend.

Movie Pundit Find your next flick by asking the (almost) all-knowing Movie Pundit Jump to Project Source » View Demo · Report Bug · Request Feature Ta

Kapil Pramod Deshmukh 8 May 28, 2022
Learning Fair Representations for Recommendation: A Graph-based Perspective, WWW2021

FairGo WWW2021 Learning Fair Representations for Recommendation: A Graph-based Perspective As a key application of artificial intelligence, recommende

lei 39 Oct 26, 2022
fastFM: A Library for Factorization Machines

Citing fastFM The library fastFM is an academic project. The time and resources spent developing fastFM are therefore justified by the number of citat

1k Dec 24, 2022
NVIDIA Merlin is an open source library designed to accelerate recommender systems on NVIDIA’s GPUs.

NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in

420 Jan 04, 2023
Persine is an automated tool to study and reverse-engineer algorithmic recommendation systems.

Persine, the Persona Engine Persine is an automated tool to study and reverse-engineer algorithmic recommendation systems. It has a simple interface a

Jonathan Soma 87 Nov 29, 2022
Code for MB-GMN, SIGIR 2021

MB-GMN Code for MB-GMN, SIGIR 2021 For Beibei data, run python .\labcode.py For Tmall data, run python .\labcode.py --data tmall --rank 2 For IJCAI

32 Dec 04, 2022
Beyond Clicks: Modeling Multi-Relational Item Graph for Session-Based Target Behavior Prediction

MGNN-SPred This is our Tensorflow implementation for the paper: WenWang,Wei Zhang, Shukai Liu, Qi Liu, Bo Zhang, Leyu Lin, and Hongyuan Zha. 2020. Bey

Wen Wang 18 Jan 02, 2023
A TensorFlow recommendation algorithm and framework in Python.

TensorRec A TensorFlow recommendation algorithm and framework in Python. NOTE: TensorRec is not under active development TensorRec will not be receivi

James Kirk 1.2k Jan 04, 2023
Movies/TV Recommender

recommender Movies/TV Recommender. Recommends Movies, TV Shows, Actors, Directors, Writers. Setup Create file API_KEY and paste your TMDB API key in i

Aviem Zur 3 Apr 22, 2022