artefactual.preprocessing
- artefactual.preprocessing.is_openai_responses_api(outputs)
Detects if the output follows the signature of the new OpenAI Responses API.
- Return type:
bool
- Args:
outputs (Any): The output object or dictionary to inspect.
- Returns:
bool: True if the output matches the OpenAI Responses API signature, False otherwise.
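The check described above can be illustrated with a minimal duck-typing sketch. It is not the library's implementation: the helper name `looks_like_responses_api` is hypothetical, and the rule it applies (a list-valued `output` field and no `choices` field) is an assumption based on the structure noted for `process_openai_responses_api`.

```python
from typing import Any


def looks_like_responses_api(outputs: Any) -> bool:
    """Hypothetical stand-in, not the artefactual implementation.

    Assumption: a Responses-style payload carries a list under `output`
    and has no `choices` field, unlike a ChatCompletion payload.
    """
    if isinstance(outputs, dict):
        return isinstance(outputs.get("output"), list) and "choices" not in outputs
    # Fall back to attribute access for SDK-style response objects.
    return isinstance(getattr(outputs, "output", None), list) and not hasattr(outputs, "choices")


# Hand-written payloads for illustration only.
responses_like = {"object": "response", "output": [{"type": "message", "content": []}]}
chat_like = {"object": "chat.completion", "choices": []}

print(looks_like_responses_api(responses_like))
print(looks_like_responses_api(chat_like))
```

Accepting both dictionaries and objects mirrors the documented argument, which may be either "the output object or dictionary".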
- artefactual.preprocessing.parse_model_outputs(outputs)
Parse different output formats to extract logprobs.
- Args:
- outputs: Model outputs. Can be:
List of vLLM RequestOutput objects.
OpenAI ChatCompletion object (or dict).
OpenAI Responses object (or dict).
- Returns:
List of dictionaries mapping token indices to lists of log probs.
- Raises:
TypeError: If the output format is not supported.
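The three accepted formats and the `TypeError` for anything else suggest a dispatch step. The sketch below shows one way such a dispatch could look. The function `detect_output_format`, the helper `_has`, and the order of the checks are all assumptions made for illustration; the real function returns parsed log probabilities rather than a label.

```python
from typing import Any


def _has(obj: Any, name: str) -> bool:
    """Return True if `obj` has `name` as a dict key or as an attribute."""
    return name in obj if isinstance(obj, dict) else hasattr(obj, name)


def detect_output_format(outputs: Any) -> str:
    """Hypothetical dispatcher: label the format instead of parsing it."""
    if isinstance(outputs, list):
        # Assumption: a list means vLLM RequestOutput objects.
        return "vllm"
    if _has(outputs, "output"):
        return "openai_responses"
    if _has(outputs, "choices"):
        return "openai_chat_completion"
    raise TypeError(f"Unsupported output format: {type(outputs).__name__}")


print(detect_output_format({"choices": []}))
print(detect_output_format({"output": []}))
```

Checking the container type first keeps the vLLM case from being probed for OpenAI-specific fields.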
- artefactual.preprocessing.process_openai_chat_completion(response, iterations)
Processes log probabilities from OpenAI Chat Completion (classic ‘choices’ format).
- Args:
response (Any): The response object or dictionary from OpenAI API (ChatCompletion).
iterations (int): The number of iterations (choices) to process.
- Returns:
list[dict[int, list[float]]]: A list of dictionaries, where each dictionary maps token indices to lists of log probabilities for a sequence.
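To make the return shape concrete, here is a minimal sketch that walks a ChatCompletion-style dictionary. It assumes that "token indices" means token positions in the sequence and that each list holds the `top_logprobs` values at that position; the function name `chat_completion_logprobs` and the sample payload are hypothetical, and the sketch handles dictionaries only.

```python
from typing import Any


def chat_completion_logprobs(response: dict[str, Any], iterations: int) -> list[dict[int, list[float]]]:
    """Hypothetical sketch: one dict per choice, position -> top-k log probs."""
    sequences = []
    for choice in response["choices"][:iterations]:
        # `logprobs` may be None when log probabilities were not requested.
        tokens = (choice.get("logprobs") or {}).get("content") or []
        sequences.append({
            pos: [alt["logprob"] for alt in tok["top_logprobs"]]
            for pos, tok in enumerate(tokens)
        })
    return sequences


# Hand-written payload in the classic `choices` layout, for illustration only.
response = {
    "choices": [
        {"logprobs": {"content": [
            {"token": "Hi", "logprob": -0.1, "top_logprobs": [
                {"token": "Hi", "logprob": -0.1}, {"token": "Hello", "logprob": -2.5}]},
            {"token": "!", "logprob": -0.3, "top_logprobs": [
                {"token": "!", "logprob": -0.3}, {"token": ".", "logprob": -1.4}]},
        ]}},
    ]
}

result = chat_completion_logprobs(response, iterations=1)
print(result)  # [{0: [-0.1, -2.5], 1: [-0.3, -1.4]}]
```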
- artefactual.preprocessing.process_openai_responses_api(response)
Parses the response from the ‘client.responses.create’ API to extract log probabilities.
Structure expected: response.output -> [item] -> item.content -> [part] -> part.logprobs
- Args:
response (Any): The response object from the OpenAI Responses API.
- Returns:
list[dict[int, list[float]]]: A list of dictionaries, where each dictionary maps token indices to lists of log probabilities for a sequence.
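The expected structure (`response.output` -> item -> `item.content` -> part -> `part.logprobs`) can be traversed as in the sketch below. It is written under stated assumptions: the input is a plain dictionary, each content part that carries log probabilities yields one sequence, and each log-probability entry exposes a `top_logprobs` list. The name `responses_api_logprobs` and the sample payload are hypothetical.

```python
from typing import Any


def responses_api_logprobs(response: dict[str, Any]) -> list[dict[int, list[float]]]:
    """Hypothetical sketch: one dict per content part that carries logprobs."""
    sequences = []
    for item in response.get("output", []):
        # Some output items (assumed here: non-message items) have no content.
        for part in item.get("content") or []:
            entries = part.get("logprobs")
            if not entries:
                continue
            sequences.append({
                pos: [alt["logprob"] for alt in entry.get("top_logprobs", [])]
                for pos, entry in enumerate(entries)
            })
    return sequences


# Hand-written payload following the documented nesting, for illustration only.
response = {
    "output": [
        {"type": "reasoning"},
        {"type": "message", "content": [
            {"type": "output_text", "text": "Hi", "logprobs": [
                {"token": "Hi", "logprob": -0.2, "top_logprobs": [
                    {"token": "Hi", "logprob": -0.2}, {"token": "Hey", "logprob": -1.9}]},
            ]},
        ]},
    ]
}

result = responses_api_logprobs(response)
print(result)  # [{0: [-0.2, -1.9]}]
```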
- artefactual.preprocessing.process_vllm_logprobs(outputs, iterations)
Processes log probabilities from vllm.chat outputs for a given number of iterations.
- Args:
outputs (list[RequestOutput]): A list containing model output objects, each with log probability data.
iterations (int): The number of iterations to process, corresponding to the number of output sequences.
- Returns:
list[dict[int, list[float]]]: A list of dictionaries mapping token indices to lists of log probs for each token in the sequence.
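The sketch below illustrates the same return shape for vLLM-style outputs without requiring vLLM to be installed. The `Fake*` dataclasses are stand-ins that mimic only the attributes used here (`RequestOutput.outputs`, `CompletionOutput.logprobs`, `Logprob.logprob`). The function `vllm_logprobs`, and the assumption that the sequences come from the first request's completions, are illustrative rather than the library's confirmed behaviour.

```python
from dataclasses import dataclass


@dataclass
class FakeLogprob:
    logprob: float


@dataclass
class FakeCompletion:
    logprobs: list  # one dict[token_id, FakeLogprob] per generated position


@dataclass
class FakeRequestOutput:
    outputs: list  # one FakeCompletion per sampled sequence


def vllm_logprobs(outputs, iterations):
    """Hypothetical sketch: one dict per sampled sequence, position -> log probs."""
    sequences = []
    # Assumption: a single prompt was sent, so only outputs[0] is inspected.
    for completion in outputs[0].outputs[:iterations]:
        sequences.append({
            pos: [lp.logprob for lp in candidates.values()]
            for pos, candidates in enumerate(completion.logprobs or [])
        })
    return sequences


# Two sampled sequences for one prompt; token ids are arbitrary.
outputs = [FakeRequestOutput(outputs=[
    FakeCompletion(logprobs=[{11: FakeLogprob(-0.1), 12: FakeLogprob(-2.0)}]),
    FakeCompletion(logprobs=[{11: FakeLogprob(-0.4), 13: FakeLogprob(-1.2)}]),
])]

result = vllm_logprobs(outputs, iterations=2)
print(result)  # [{0: [-0.1, -2.0]}, {0: [-0.4, -1.2]}]
```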
Modules
Module for parsing model outputs from various sources to extract log probabilities.