artefactual.data

Data loading and processing utilities for artefactual.

class artefactual.data.Completion(**data)

Bases: BaseModel

Represents a single generated completion as a sequence of token logprobs.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model. Should be a dictionary conforming to pydantic.config.ConfigDict.

class artefactual.data.Result(**data)

Bases: BaseModel

Represents the full data for a single query.

Attributes:
    query_id: The unique identifier for the query.
    query: The query text.
    expected_answers: List of expected correct answers.
    generated_answers: List of generated answers with metadata.
    token_logprobs: Nested sequence of token log probabilities.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model. Should be a dictionary conforming to pydantic.config.ConfigDict.
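To make the Result attribute list above more concrete, the sketch below builds a plain dictionary with the documented field names and round-trips it through JSON using only the standard library. Only the five field names come from this page. The value types, the shape of each generated_answers entry, the nesting of token_logprobs (assumed here to be one inner list per generated answer), and the helper missing_fields are illustrative assumptions, not the library's API.

```python
import json

# Hypothetical record: the field names come from the attribute list,
# while all value types and nesting are assumptions for illustration.
record = {
    "query_id": "q-001",
    "query": "What is the capital of France?",
    "expected_answers": ["Paris"],
    "generated_answers": [{"text": "Paris"}, {"text": "Lyon"}],  # entry shape assumed
    "token_logprobs": [[-0.1, -0.2], [-1.5, -0.7]],  # one inner list per answer (assumed)
}

EXPECTED_FIELDS = {
    "query_id",
    "query",
    "expected_answers",
    "generated_answers",
    "token_logprobs",
}


def missing_fields(data):
    """Return the documented field names absent from a decoded record, sorted."""
    return sorted(EXPECTED_FIELDS - set(data))


# Serialize and decode again, as would happen when reading a JSON file.
decoded = json.loads(json.dumps(record))
```

This only checks that the documented keys are present. Because the real class derives from Pydantic's BaseModel, actual type validation would be done by the model itself.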

class artefactual.data.TokenLogprob(**data)

Bases: BaseModel

Represents a single token’s log probability.

Attributes:
    token: The token string.
    logprob: The log probability of the token.
    rank: The rank of the token in the probability distribution.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model. Should be a dictionary conforming to pydantic.config.ConfigDict.
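As a rough illustration of the three TokenLogprob attributes, the sketch below uses a standard-library dataclass as a stand-in for the real Pydantic model. The name TokenLogprobSketch, the field types (str, float, int), the use of natural logarithms, and the helper sequence_logprob are assumptions made for this example and are not confirmed by this page. It shows how per-token log probabilities can be summed into a sequence-level log probability and converted back to a probability.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenLogprobSketch:
    """Hypothetical stand-in mirroring the documented attributes."""

    token: str      # the token string
    logprob: float  # log probability of the token (natural log assumed)
    rank: int       # rank of the token in the distribution (integer assumed)


def sequence_logprob(tokens):
    """Sum per-token log probabilities to get the sequence log probability."""
    return sum(t.logprob for t in tokens)


# Two tokens with probabilities 0.5 and 0.25, stored as log probabilities.
tokens = [
    TokenLogprobSketch(token="Paris", logprob=math.log(0.5), rank=1),
    TokenLogprobSketch(token=".", logprob=math.log(0.25), rank=1),
]

total = sequence_logprob(tokens)
probability = math.exp(total)  # product of the per-token probabilities
```

Summing in log space avoids the numerical underflow that multiplying many small probabilities directly would cause.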

Modules

Pydantic models for representing the data in the generated JSON files.