AleaCarta model Usage
Introduction to modelling with AleaCarta
We use a synthetic dataset to demonstrate how to use the AleaCarta model [1].
# Install necessary packages that are not already installed in the environment
# !pip install matplotlib
import os
import sys
os.environ["CUDA_VISIBLE_DEVICES"] = ""
sys.path.append("./../../")
print(os.getcwd())
import matplotlib.pyplot as plt
import numpy as np
from synthetic_dataset import get_dataset_ace
from choice_learn.basket_models import AleaCarta
data = get_dataset_ace()
print(data)
print(f"\nThe TripDataset 'data' contains {data.n_items} distinct items that appear in {data.n_samples} transactions carried out at {data.n_stores} point(s) of sale with {data.n_assortments} different assortments.")
latent_sizes = {"preferences": 6, "price": 3, "season": 3}
n_negative_samples = 2
optimizer = "adam"
lr = 1e-2
epochs = 200
# epochs = 1000
batch_size = 32
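The `n_negative_samples` hyperparameter above sets how many negative items are contrasted with each observed purchase. As a rough illustration of the idea, the sketch below draws negatives uniformly among the available items that are absent from the basket. The helper `sample_negative_items`, the toy basket and the uniform scheme are assumptions made for this example; the sampling actually performed by the library may differ (see [1]).

```python
import random


def sample_negative_items(basket, available_items, n_negative_samples, rng):
    """Draw items that are available but absent from the basket.

    Hypothetical helper: uniform sampling without replacement.
    """
    candidates = [item for item in available_items if item not in basket]
    # Never ask for more negatives than there are candidates
    n_draws = min(n_negative_samples, len(candidates))
    return rng.sample(candidates, n_draws)


rng = random.Random(0)  # seeded for reproducibility
basket = [1, 2]  # items actually bought (toy values)
available_items = list(range(7))  # toy assortment of 7 items
negatives = sample_negative_items(basket, available_items, n_negative_samples=2, rng=rng)
print(negatives)
```

Each drawn item is guaranteed to be outside the basket, so it can serve as a "not chosen" counterpart to the purchased items during training.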
model = AleaCarta(
    # item_intercept=True,
    item_intercept=False,
    price_effects=False,
    seasonal_effects=False,
    latent_sizes=latent_sizes,
    n_negative_samples=n_negative_samples,
    optimizer=optimizer,
    lr=lr,
    epochs=epochs,
    batch_size=batch_size,
)
model.instantiate(n_items=7, n_stores=2)
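# Fit the model; the returned history holds the training loss plotted below
history = model.fit(data)
plt.plot(history["train_loss"], label="Training loss")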
plt.xlabel("Epoch")
plt.ylabel("Training Loss")
plt.legend()
plt.title("Training")
plt.show()
model.compute_batch_utility(
    item_batch=np.arange(2),
    basket_batch=np.array([[1, 2], [3, 4]]),
    store_batch=np.array([0, 0]),
    week_batch=np.array([0, 0]),
    price_batch=np.array([0, 0]),
    available_item_batch=np.ones((2, data.n_items)),
)
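The call above presumably returns one utility per (item, basket) pair. To build intuition for how utilities relate to choice probabilities, here is a minimal sketch of a softmax, the mapping used in multinomial logit models. The utility values are made up, and whether AleaCarta applies exactly this transformation internally is an assumption, not something stated in this notebook.

```python
import math


def softmax(utilities):
    """Map a list of utilities to probabilities that sum to 1."""
    # Subtract the maximum for numerical stability
    max_utility = max(utilities)
    exponentials = [math.exp(u - max_utility) for u in utilities]
    total = sum(exponentials)
    return [e / total for e in exponentials]


utilities = [0.5, -1.0, 2.0]  # hypothetical utilities of three candidate items
probabilities = softmax(utilities)
print(probabilities)
```

The ordering of the utilities is preserved: the item with the highest utility receives the highest probability.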
A more complex dataset
We will use the Badminton Dataset [1] to showcase how the AleaCarta model can capture complementarity and cannibalization effects.
We set up the data generator and define 4 nests of items whose pairwise interactions are either complementary or neutral. Items within the same nest are assumed to cannibalize each other's sales.
data_gen = SyntheticDataGenerator(
    proba_complementary_items=0.7,
    proba_neutral_items=0.3,
    noise_proba=0.15,
    items_nest={
        0: [0, 1, 2],
        1: [3, 4, 5],
        2: [6],
        3: [7],
    },
    nests_interactions=[
        ["", "compl", "neutral", "neutral"],
        ["compl", "", "neutral", "neutral"],
        ["neutral", "neutral", "", "neutral"],
        ["neutral", "neutral", "neutral", ""],
    ],
)
# Generate the dataset
trip_dataset = data_gen.generate_trip_dataset(n_baskets=1000, assortments_matrix=np.ones((1, 8)))
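To make the nest structure concrete, the sketch below derives the expected relation between any two items from `items_nest` and `nests_interactions`. The helper `item_relation` and the label "cannibalization" for same-nest pairs are assumptions introduced for this illustration, following the description of the nests given earlier; they are not part of the generator's API.

```python
items_nest = {0: [0, 1, 2], 1: [3, 4, 5], 2: [6], 3: [7]}
nests_interactions = [
    ["", "compl", "neutral", "neutral"],
    ["compl", "", "neutral", "neutral"],
    ["neutral", "neutral", "", "neutral"],
    ["neutral", "neutral", "neutral", ""],
]

# Reverse lookup: item index -> nest index
item_to_nest = {item: nest for nest, items in items_nest.items() for item in items}


def item_relation(item_a, item_b):
    """Return the expected relation between two distinct items (hypothetical helper)."""
    nest_a, nest_b = item_to_nest[item_a], item_to_nest[item_b]
    if nest_a == nest_b:
        # Items of the same nest compete with each other
        return "cannibalization"
    return nests_interactions[nest_a][nest_b]


# The interaction matrix should be symmetric
n_nests = len(nests_interactions)
is_symmetric = all(
    nests_interactions[a][b] == nests_interactions[b][a]
    for a in range(n_nests)
    for b in range(n_nests)
)

print(item_relation(0, 1), item_relation(0, 4), item_relation(2, 6), is_symmetric)
```

With this configuration, only nests 0 and 1 complement each other; items 6 and 7 are neutral with respect to everything else.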
It is possible to instantiate and train the model as follows:
model = AleaCarta(
    item_intercept=True,
    price_effects=False,
    seasonal_effects=False,
    latent_sizes=latent_sizes,
    n_negative_samples=n_negative_samples,
    optimizer=optimizer,
    lr=lr,
    epochs=epochs,
    batch_size=batch_size,
)
model.instantiate(n_items=8, n_stores=1)
hist = model.fit(trip_dataset)
Here is a way to observe the conditional probabilities $\mathbb{P}(i \mid \mathcal{B} = \{j\})$ of item $i$ given a basket that contains only item $j$:
proba_matrix = []
for item in range(8):
    # Likelihood of every item given a basket containing only `item`
    probabilities = model.compute_item_likelihood(
        basket=np.array([item]),
        available_items=np.ones((8,)),
        store=np.array(0),
        week=np.array(0),
        prices=np.zeros((8,)),
    )
    proba_matrix.append(probabilities)
# Row j holds the likelihoods of all items i given the basket {j}
proba_matrix = np.vstack(proba_matrix)
plt.imshow(proba_matrix, vmin=0., vmax=1., cmap="coolwarm")
plt.colorbar()
# Columns (x-axis) index the candidate item i, rows (y-axis) the basket item j
plt.xlabel("Item $i$")
plt.ylabel("Item $j$")
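One way to read such a matrix is to rank, for a given basket item, the other items by likelihood. The sketch below does this on a small hand-written matrix. The numbers and the helper `rank_next_items` are purely illustrative assumptions, not outputs of the fitted model.

```python
def rank_next_items(proba_row, basket_item):
    """Sort the other items by decreasing likelihood given the basket item."""
    candidates = [
        (item, proba) for item, proba in enumerate(proba_row) if item != basket_item
    ]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


# Toy matrix: row j holds hypothetical likelihoods of each item i given basket {j}
toy_matrix = [
    [0.05, 0.15, 0.80],
    [0.10, 0.05, 0.85],
    [0.45, 0.50, 0.05],
]

ranking = rank_next_items(toy_matrix[0], basket_item=0)
print(ranking)
```

In this toy example, item 2 comes first for a basket containing item 0, which is the pattern one would expect between complementary items, while low values point to cannibalization.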
For model evaluation with a TripDataset, one can simply do:
References
[1] Better Capturing Interactions between Products in Retail: Revisited Negative Sampling for Basket Choice Modeling, Désir, J.; Auriau, V.; Možina, M.; Malherbe, E. (2025), ECML-PKDD