
Introductory Example


import sys

# Make the choice_learn package importable when running the notebook locally
sys.path.append("../../")

Choice-Learn is a Python package designed to help build discrete choice models. In particular, you will find:

  • Optimized data handling with the ChoiceDataset object and ready-to-use datasets
  • Modelling tools with:
    • Efficient implementations of well-known choice models
    • A customizable ChoiceModel class to build your own model
    • Estimation options such as the choice of method (L-BFGS, gradient descent, etc.)
  • Diverse tools revolving around choice models, such as an assortment optimizer

Discrete Choice Modelling

Discrete choice models aim to explain or predict a choice from a set of alternatives. Well-known use cases include analyzing people's choice of transportation mode or their product purchases in stores.

If you are new to choice modelling, you can check this resource.

Tutorial

In this notebook, we will describe the estimation of a choice model step by step.

Data

Items, features and choices

The data structure for choice modelling is somewhat different from usual prediction use cases. We consider a set of alternatives of varying size, where each alternative is described by features and one alternative is chosen from the set. Context features (describing a customer or a point in time, for example) can also affect the choice. Let's take an example where we want to predict a customer's next purchase.

Three different items, i1, i2 and i3, are sold, and we have gathered a small dataset:

1st Purchase:

**Shelf:**

| Item | Price | Promotion |
| ---- | ----- | --------- |
| i1   | $100  | no        |
| i2   | $140  | no        |
| i3   | $200  | no        |

**Customer Purchase:** i1

2nd Purchase:

**Shelf:**

| Item | Price | Promotion |
| ---- | ----- | --------- |
| i1   | $100  | no        |
| i2   | $120  | yes       |
| i3   | $200  | no        |

**Customer Purchase:** i2

3rd Purchase:

**Shelf:**

| Item | Price        | Promotion    |
| ---- | ------------ | ------------ |
| i1   | $100         | no           |
| i2   | Out-Of-Stock | Out-Of-Stock |
| i3   | $180         | yes          |

**Customer Purchase:** i3

Indexing the items in the same order, we create the ChoiceDataset as follows:

choices = [0, 1, 2] # Indexes of the items chosen

items_features_by_choice =  [
    [
        [100., 0.], # choice 1, Item 1 [price, promotion]
        [140., 0.], # choice 1, Item 2 [price, promotion]
        [200., 0.], # choice 1, Item 3 [price, promotion]
    ],
    [
        [100., 0.], # choice 2, Item 1 [price, promotion]
        [120., 1.], # choice 2, Item 2 [price, promotion]
        [200., 0.], # choice 2, Item 3 [price, promotion]
    ],
    [
        [100., 0.], # choice 3, Item 1 [price, promotion]
        [120., 1.], # choice 3, Item 2 [price, promotion] (out of stock, placeholder values)
        [180., 1.], # choice 3, Item 3 [price, promotion]
    ],
]

Item i2 was out of stock during the last choice, so it could not have been chosen. To keep this information, we create a matrix indicating which items were available during each of the choices:

available_items_by_choice = [
    [1, 1, 1], # All items available for choice 1
    [1, 1, 1], # All items available for choice 2
    [1, 0, 1], # Item 2 not available for choice 3
]

And now let's create the ChoiceDataset! We can also specify the feature names if we want to.

from choice_learn.data import ChoiceDataset

dataset = ChoiceDataset(
    choices=choices,
    items_features_by_choice=items_features_by_choice,
    items_features_by_choice_names=["price", "promotion"],
    available_items_by_choice=available_items_by_choice,
)
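As a quick sanity check, we can verify how many choices were recorded. This is a minimal sketch, assuming ChoiceDataset supports Python's built-in len():

# Number of recorded choices in the dataset (assumes ChoiceDataset implements len())
print(len(dataset))  # 3 recorded purchases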

Modelling

Estimation and choice probabilities

A first and simple model to predict a customer choice is the Multinomial Logit.

We consider that customers attribute a utility to each product and choose the product with the highest utility.

We formulate the utility as a linear function of our features:

$$U(i) = \alpha_i + \beta \cdot \text{price}_i + \gamma \cdot \text{promotion}_i$$

Considering that this estimation is noisy, we use the softmax function over the available products to get the purchase probability. For example, using our first data sample we obtain:

$$\mathbb{P}(i_1) = \frac{e^{U(i_1)}}{e^{U(i_1)} + e^{U(i_2)} + e^{U(i_3)}}$$

For the third sample, only two items are still available, making the probability:

$$\mathbb{P}(i_3) = \frac{e^{U(i_3)}}{e^{U(i_1)} + e^{U(i_3)}}$$
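To make these formulas concrete, here is a small NumPy sketch that computes such probabilities, using made-up parameter values that are purely illustrative (not the ones estimated below):

import numpy as np

# Illustrative parameter values (NOT estimated): item intercepts alpha_i,
# price coefficient beta, promotion coefficient gamma
alpha = np.array([0.0, -1.0, 1.0])
beta, gamma = -0.03, 0.5

def choice_probabilities(prices, promotions, availability):
    # Linear utility, then softmax restricted to the available items
    utilities = alpha + beta * np.array(prices) + gamma * np.array(promotions)
    exp_u = np.exp(utilities) * np.array(availability)  # zero out unavailable items
    return exp_u / exp_u.sum()

# Third purchase: i2 is out of stock, so the softmax runs over i1 and i3 only
print(choice_probabilities([100., 120., 180.], [0., 1., 1.], [1, 0, 1]))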

The parameters $\alpha_i$, $\beta$ and $\gamma$ are estimated by minimizing the Negative Log-Likelihood. Here is how it goes with Choice-Learn:

from choice_learn.models import SimpleMNL

model = SimpleMNL(intercept="item")
history = model.fit(dataset)

To access the estimated weights:

print("Features coefficients are:")
print(model.trainable_weights[0])
print("Items intercepts:")
# The first item's intercept is fixed to 0 and serves as the reference
print([0], "and", model.trainable_weights[1])
Features coefficients are:
<tf.Variable 'Weights_items_features:0' shape=(2,) dtype=float32, numpy=array([-0.37710273, 40.983475  ], dtype=float32)>
Items intercepts:
[0] and <tf.Variable 'Intercept:0' shape=(2,) dtype=float32, numpy=array([-11.027451,  12.578588], dtype=float32)>

In order to compute the average Negative Log-Likelihood of the model, we can use the following code:

model.evaluate(dataset)
<tf.Tensor: shape=(), dtype=float32, numpy=1.001363e-05>

You can now access estimated choice probabilities using a ChoiceDataset:

probabilities = model.predict_probas(dataset)
print("Probabilities are:")
print(probabilities)
Probabilities are:
tf.Tensor(
[[9.9998999e-01 4.5697122e-12 1.2174261e-11]
 [1.8438762e-10 9.9998999e-01 2.2448054e-21]
 [6.9211727e-11 0.0000000e+00 9.9998999e-01]], shape=(3, 3), dtype=float32)
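As a cross-check, the average Negative Log-Likelihood returned by model.evaluate can be recomputed by hand from these probabilities. A sketch, converting the returned tensor with .numpy() as the TensorFlow outputs above suggest is possible:

import numpy as np

# Probability the model assigns to each actually-chosen item
probas = model.predict_probas(dataset).numpy()
chosen_probas = probas[np.arange(len(choices)), choices]

# Average Negative Log-Likelihood; should match model.evaluate(dataset)
print(np.mean(-np.log(chosen_probas)))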

Useful Jupyter Notebooks

If you want to go further, here are a few useful Jupyter Notebooks:

Data:

  • A more complete example here
  • A detailed use of FeaturesByIDs here if you want to minimize your RAM footprint

Modelling:

  • A more complete example using the Conditional-MNL here
  • An example to easily build custom models here

Tools:

  • An example of assortment optimization using a choice model and Gurobi here

Here are complementary Notebooks that might interest you:

  • A comparison with the R package mlogit here
  • A reconstruction of the experiments of the RUMnet paper here
  • An example of estimation of a Latent Class MNL here
  • An example of estimation of the Nested Logit model here
  • A reconstruction, using Choice-Learn, of scikit-learn's Logistic Regression tutorial here

Documentation

The full documentation also hosts a lot of useful details and information.

Additional Questions, Requests, etc.

If you have ideas, questions, feature requests or any other input, do not hesitate to reach out by opening an issue on GitHub.