Welcome to the Choice-Learn documentation!
Large-scale choice modeling through the lens of machine learning
If you are not coming from GitHub, check it out.
You can also read our academic paper!
Choice-Learn is a Python package designed to help you estimate discrete choice models and use them (e.g., by plugging them into assortment optimization). The package provides ready-to-use datasets and models from the literature. It also offers a lower-level interface if you wish to customize any choice model or create your own from scratch. Choice-Learn handles data efficiently with the objective of limiting RAM usage, which makes it particularly easy to estimate choice models on your own large datasets.
Choice-Learn uses NumPy and pandas as data backend engines and TensorFlow for models.
In this documentation you will find examples to get started quickly, as well as some more in-depth examples.
What's in there?
Here is a quick overview of the different functionalities offered by Choice-Learn. Further details are given in the rest of the documentation.
Data
- Custom data handling for choice datasets with possible memory usage optimizations
- Some open-source, ready-to-use datasets are included with the package:
  - SwissMetro
  - ModeCanada
  - The Train dataset
  - The Heating & Electricity datasets from Kenneth Train
  - Stated car preferences
  - The TaFeng dataset from Kaggle
  - The ICDM-2013 Expedia dataset from Kaggle
  - London Passenger Mode Choice
Models
- Custom modeling
- Ready-to-use models:
  - Linear Models:
  - Non-Linear Models:
Tools
Examples
Diverse examples are provided in the How-To section. Give it a look!
Introduction - Discrete Choice Modeling
Discrete choice models aim to explain or predict choices over a set of alternatives. Well-known use cases include analyzing people's choice of mode of transport or product purchases in stores.
If you are new to choice modeling, you can check this resource. Otherwise, you can also take a look at the introductory example.
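To make the idea concrete, here is a minimal sketch of the multinomial logit (MNL) model, one of the simplest discrete choice models. It is written in plain Python and does not use the Choice-Learn API. The function name `mnl_probabilities`, the coefficients and the transport alternatives are all hypothetical values chosen for illustration.

```python
import math


def mnl_probabilities(utilities):
    """Return multinomial logit choice probabilities for a list of utilities."""
    # Subtracting the maximum utility avoids overflow and leaves the result unchanged.
    max_utility = max(utilities)
    exp_utilities = [math.exp(u - max_utility) for u in utilities]
    total = sum(exp_utilities)
    return [e / total for e in exp_utilities]


# Hypothetical example: utility = beta_cost * cost + beta_time * time
beta_cost, beta_time = -0.1, -0.05
alternatives = {
    "car": {"cost": 10.0, "time": 30.0},
    "train": {"cost": 8.0, "time": 40.0},
    "bike": {"cost": 0.0, "time": 60.0},
}

utilities = [
    beta_cost * features["cost"] + beta_time * features["time"]
    for features in alternatives.values()
]
probabilities = dict(zip(alternatives, mnl_probabilities(utilities)))

for name, probability in probabilities.items():
    print(f"{name}: {probability:.3f}")
```

In this sketch the alternative with the highest utility gets the highest choice probability, and the probabilities sum to one. Estimating a choice model amounts to finding the coefficients (here `beta_cost` and `beta_time`) that best explain observed choices.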
Installation
User installation
The easiest way is to install the package with pip, ideally in a virtual environment.
Otherwise, you can use the git repository to get the latest version.
Dependencies
Choice-Learn requires the following:
- Python (>=3.9)
- NumPy (>=1.24)
- pandas (>=1.5)
For modeling you need:
- TensorFlow (>=2.13)
Warning: If you are a Mac user with an M1 or M2 chip, importing TensorFlow might cause Python to crash. In that case, use Anaconda to install TensorFlow with:
conda install -c apple tensorflow
An optional requirement, used for statistical reporting and L-BFGS optimization, is:
- TensorFlow Probability (>=0.20.1)
Finally, for pricing or assortment optimization, you need either Gurobi or OR-Tools:
- gurobipy (>=11.0.0)
- ortools (>=9.6.2534)
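As a convenience, the core and modeling requirements can be checked with a short script. This is a minimal sketch that relies only on the standard library. The helper names `parse_version` and `check_requirements` are hypothetical, and the script assumes the installed distributions are named `numpy`, `pandas` and `tensorflow`, which may not hold for every installation method.

```python
from importlib import metadata

# Minimum versions copied from the requirements listed in this section.
MINIMUM_VERSIONS = {
    "numpy": (1, 24),
    "pandas": (1, 5),
    "tensorflow": (2, 13),
}


def parse_version(version):
    """Turn a version string such as '1.24.3' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Map each package name to 'ok', 'too old' or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = parse_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed >= minimum else "too old"
    return report


if __name__ == "__main__":
    for package, status in check_requirements().items():
        print(f"{package}: {status}")
```

The version parser is deliberately simple: it keeps only the leading integers of each dotted component, so pre-release suffixes such as `rc1` are ignored.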
Contributing
You are welcome to contribute to the project! You can help in various ways:
- raise issues
- resolve issues already opened
- develop new features
- provide additional examples of use
- fix typos, improve code quality
- develop new tests
We recommend opening an issue first to discuss your ideas.
Citation
If you find this package or any of its features useful for your research, please cite us:
@article{Auriau2024,
doi = {10.21105/joss.06899},
url = {https://doi.org/10.21105/joss.06899},
year = {2024},
publisher = {The Open Journal},
volume = {9},
number = {101},
pages = {6899},
author = {Vincent Auriau and Ali Aouad and Antoine Désir and Emmanuel Malherbe},
title = {Choice-Learn: Large-scale choice modeling for operational contexts through the lens of machine learning},
journal = {Journal of Open Source Software}
}
License
This software is released under the MIT license, with no limitation on usage, including for commercial applications.
Affiliations
Choice-Learn has been developed through a collaboration between researchers at the Artefact Research Center and the MICS laboratory at CentraleSupélec, Université Paris-Saclay.