evaluate_file


This script provides an end-to-end example of running Lettuce’s evaluation framework. The JSON output from running it can be found in the repo.

Process

Load the example dataset

The file at evaluation/datasets/example.csv is loaded using a SingleInputCSVforLLM.
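A minimal sketch of this step. Only the class name and CSV path come from the description above; the import path is an assumption.

```python
from evaluation.evaltypes import SingleInputCSVforLLM  # hypothetical import path

# Wrap the example CSV so each row can be fed to the pipelines as a single input
dataset = SingleInputCSVforLLM("evaluation/datasets/example.csv")
```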

Define a prompt

In this example, every LLM pipeline uses the same prompt, but you can test alternative prompts.
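A sketch of a shared prompt. The template text below is purely illustrative and is not the wording used in the script.

```python
# Illustrative prompt template; the real script's wording may differ.
# The {informal_name} placeholder is a hypothetical input field.
prompt = (
    "You are an assistant that standardises medication names.\n"
    "Informal name: {informal_name}\n"
    "Reply with the formal name only."
)
```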

Define pipelines

A list of different test pipelines using LLMs is created.
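A sketch of how such a list might be built, assuming a hypothetical LLMPipeline class. The point is the pattern of varying one component (here, the model) while holding the prompt fixed, not the specific names.

```python
from evaluation.pipelines import LLMPipeline  # hypothetical class and import path

# Build one pipeline per model, all sharing the same prompt, so differences
# in the metrics can be attributed to the model alone
pipelines = [
    LLMPipeline(model=model_name, prompt=prompt)  # hypothetical signature
    for model_name in ["llama-3.1-8b", "gemma-2-9b"]  # hypothetical model ids
]
```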

Define tests

A list of PipelineTests is created from the list of pipelines.
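A sketch of the wrapping step. Beyond the PipelineTest class name, the import path and constructor arguments are assumptions.

```python
from evaluation.evaltypes import PipelineTest  # hypothetical import path

# Wrap each pipeline in a PipelineTest so the framework can run and score it
tests = [
    PipelineTest(name=f"pipeline_{i}", pipeline=p)  # hypothetical signature
    for i, p in enumerate(pipelines)
]
```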

Main function evaluates the pipelines

The main function creates an EvaluationFramework. Its run_evaluations() method then runs the data through the pipelines, calculates the metrics, and saves the results as a JSON file.
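A sketch of the entry point. EvaluationFramework and run_evaluations() are named above; the import path and constructor arguments shown are assumptions.

```python
from evaluation.evaluation import EvaluationFramework  # hypothetical import path

def main():
    framework = EvaluationFramework(
        dataset=dataset,             # hypothetical argument names
        pipeline_tests=tests,
        results_path="results.json",
    )
    # Runs the dataset through every test, computes the metrics,
    # and writes the results to the JSON file
    framework.run_evaluations()

if __name__ == "__main__":
    main()
```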