We also demonstrate how to prompt models to evaluate the content they have generated.
from edsl import Model, ModelList, ScenarioList, QuestionFreeText, QuestionLinearScale, Survey
m = ModelList([
    Model("claude-3-7-sonnet-20250219", service_name = "anthropic"),
    Model("gemini-1.5-flash", service_name = "google"),
    Model("gpt-4o", service_name = "openai")
])
s = ScenarioList.from_source("list", "topic", ["winter", "language models"])
q1 = QuestionFreeText(
    question_name = "haiku",
    question_text = "Please draft a haiku about {{ scenario.topic }}."
)

q2 = QuestionLinearScale(
    question_name = "originality",
    question_text = "On a scale from 1 to 5, please rate the originality of this haiku: {{ haiku.answer }}.",
    question_options = [1,2,3,4,5],
    option_labels = {1:"Totally unoriginal", 5:"Highly original"}
)

survey = Survey(questions = [q1, q2])

survey
Survey # questions: 2; question_name list: ['haiku', 'originality'];

| | option_labels | question_text | question_name | question_options | question_type |
|---|---|---|---|---|---|
| 0 | nan | Please draft a haiku about {{ scenario.topic }}. | haiku | nan | free_text |
| 1 | {1: 'Totally unoriginal', 5: 'Highly original'} | On a scale from 1 to 5, please rate the originality of this haiku: {{ haiku.answer }}. | originality | [1, 2, 3, 4, 5] | linear_scale |
results = survey.by(s).by(m).run()
results.select("model", "topic", "haiku", "originality")
| | model.model | scenario.topic | answer.haiku | answer.originality |
|---|---|---|---|---|
| 0 | claude-3-7-sonnet-20250219 | winter | Snowflakes drift downward Blanket of white hides the earth Silence embraces | 2 |
| 1 | gemini-1.5-flash | winter | White breath in the air, Frozen ground crunches below, Silence blankets all. | 2 |
| 2 | gpt-4o | winter | Snow blankets the earth, Silent whispers fill the air, Cold breath of winter. | 2 |
| 3 | claude-3-7-sonnet-20250219 | language models | Words dance in code, Patterns weave through silicon— Echoes of our thoughts. | 4 |
| 4 | gemini-1.5-flash | language models | Data flows like streams, Words bloom, a digital flower, Meaning takes its form. | 2 |
| 5 | gpt-4o | language models | Words dance in silence, Patterns weave through vast data— Machines learn to speak. | 4 |
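
The six rows above correspond to every topic paired with every model. As a rough illustration of that pairing (plain Python only, not a description of how EDSL schedules jobs internally), the combinations can be enumerated with `itertools.product`. The variable names `topics`, `models`, and `jobs` are chosen here for the sketch.

```python
from itertools import product

# Topics and model names copied from the survey setup above.
topics = ["winter", "language models"]
models = ["claude-3-7-sonnet-20250219", "gemini-1.5-flash", "gpt-4o"]

# One (topic, model) pair per row of the results table.
jobs = list(product(topics, models))

for topic, model in jobs:
    print(f"{model}: draft and rate a haiku about {topic}")

print(f"{len(jobs)} combinations")
```

Adding a third topic or a fourth model grows the number of rows multiplicatively, which is worth keeping in mind before running a larger survey.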

Next we prompt each model to rate every haiku

We modify the second question to use a scenario for each haiku instead of piping the answer from the first question (i.e., {{ haiku.answer }} is changed to {{ scenario.haiku }}):
new_q = QuestionLinearScale(
    question_name = "originality",
    question_text = "On a scale from 1 to 5, please rate the originality of this haiku: {{ scenario.haiku }}.",
    question_options = [1,2,3,4,5],
    option_labels = {1:"Totally unoriginal", 5:"Highly original"}
)
haikus = results.select("model", "topic", "haiku").to_scenario_list().rename({"model":"drafting_model"})
haikus
ScenarioList scenarios: 6; keys: ['haiku', 'drafting_model', 'topic'];

| | drafting_model | topic | haiku |
|---|---|---|---|
| 0 | claude-3-7-sonnet-20250219 | winter | Snowflakes drift downward Blanket of white hides the earth Silence embraces |
| 1 | gemini-1.5-flash | winter | White breath in the air, Frozen ground crunches below, Silence blankets all. |
| 2 | gpt-4o | winter | Snow blankets the earth, Silent whispers fill the air, Cold breath of winter. |
| 3 | claude-3-7-sonnet-20250219 | language models | Words dance in code, Patterns weave through silicon— Echoes of our thoughts. |
| 4 | gemini-1.5-flash | language models | Data flows like streams, Words bloom, a digital flower, Meaning takes its form. |
| 5 | gpt-4o | language models | Words dance in silence, Patterns weave through vast data— Machines learn to speak. |
new_results = new_q.by(haikus).by(m).run()
(
    new_results
    .sort_by("topic", "drafting_model", "model")
    .select("model", "drafting_model", "topic", "haiku", "originality")
)
| | model.model | scenario.drafting_model | scenario.topic | scenario.haiku | answer.originality |
|---|---|---|---|---|---|
| 0 | claude-3-7-sonnet-20250219 | claude-3-7-sonnet-20250219 | language models | Words dance in code, Patterns weave through silicon— Echoes of our thoughts. | 4 |
| 1 | gemini-1.5-flash | claude-3-7-sonnet-20250219 | language models | Words dance in code, Patterns weave through silicon— Echoes of our thoughts. | 3 |
| 2 | gpt-4o | claude-3-7-sonnet-20250219 | language models | Words dance in code, Patterns weave through silicon— Echoes of our thoughts. | 4 |
| 3 | claude-3-7-sonnet-20250219 | gemini-1.5-flash | language models | Data flows like streams, Words bloom, a digital flower, Meaning takes its form. | 3 |
| 4 | gemini-1.5-flash | gemini-1.5-flash | language models | Data flows like streams, Words bloom, a digital flower, Meaning takes its form. | 2 |
| 5 | gpt-4o | gemini-1.5-flash | language models | Data flows like streams, Words bloom, a digital flower, Meaning takes its form. | 3 |
| 6 | claude-3-7-sonnet-20250219 | gpt-4o | language models | Words dance in silence, Patterns weave through vast data— Machines learn to speak. | 4 |
| 7 | gemini-1.5-flash | gpt-4o | language models | Words dance in silence, Patterns weave through vast data— Machines learn to speak. | 3 |
| 8 | gpt-4o | gpt-4o | language models | Words dance in silence, Patterns weave through vast data— Machines learn to speak. | 4 |
| 9 | claude-3-7-sonnet-20250219 | claude-3-7-sonnet-20250219 | winter | Snowflakes drift downward Blanket of white hides the earth Silence embraces | 2 |
| 10 | gemini-1.5-flash | claude-3-7-sonnet-20250219 | winter | Snowflakes drift downward Blanket of white hides the earth Silence embraces | 2 |
| 11 | gpt-4o | claude-3-7-sonnet-20250219 | winter | Snowflakes drift downward Blanket of white hides the earth Silence embraces | 2 |
| 12 | claude-3-7-sonnet-20250219 | gemini-1.5-flash | winter | White breath in the air, Frozen ground crunches below, Silence blankets all. | 3 |
| 13 | gemini-1.5-flash | gemini-1.5-flash | winter | White breath in the air, Frozen ground crunches below, Silence blankets all. | 2 |
| 14 | gpt-4o | gemini-1.5-flash | winter | White breath in the air, Frozen ground crunches below, Silence blankets all. | 2 |
| 15 | claude-3-7-sonnet-20250219 | gpt-4o | winter | Snow blankets the earth, Silent whispers fill the air, Cold breath of winter. | 2 |
| 16 | gemini-1.5-flash | gpt-4o | winter | Snow blankets the earth, Silent whispers fill the air, Cold breath of winter. | 2 |
| 17 | gpt-4o | gpt-4o | winter | Snow blankets the earth, Silent whispers fill the air, Cold breath of winter. | 2 |
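
With every model rating every haiku, a natural follow-up is to compare how each drafting model's haikus were scored and whether models rate their own work differently from others' work. The sketch below is one possible way to do that in plain Python. It does not use EDSL: the scores are copied by hand from the table above, and the names `ratings`, `by_drafter`, `self_scores`, and `peer_scores` are introduced here only for illustration.

```python
from collections import defaultdict
from statistics import mean

CLAUDE = "claude-3-7-sonnet-20250219"
GEMINI = "gemini-1.5-flash"
GPT = "gpt-4o"

# (rating model, drafting model, topic, originality), copied from the table above.
ratings = [
    (CLAUDE, CLAUDE, "language models", 4), (GEMINI, CLAUDE, "language models", 3),
    (GPT, CLAUDE, "language models", 4), (CLAUDE, GEMINI, "language models", 3),
    (GEMINI, GEMINI, "language models", 2), (GPT, GEMINI, "language models", 3),
    (CLAUDE, GPT, "language models", 4), (GEMINI, GPT, "language models", 3),
    (GPT, GPT, "language models", 4),
    (CLAUDE, CLAUDE, "winter", 2), (GEMINI, CLAUDE, "winter", 2),
    (GPT, CLAUDE, "winter", 2), (CLAUDE, GEMINI, "winter", 3),
    (GEMINI, GEMINI, "winter", 2), (GPT, GEMINI, "winter", 2),
    (CLAUDE, GPT, "winter", 2), (GEMINI, GPT, "winter", 2),
    (GPT, GPT, "winter", 2),
]

# Collect the scores each drafting model's haikus received from all raters.
by_drafter = defaultdict(list)
for rater, drafter, topic, score in ratings:
    by_drafter[drafter].append(score)

for drafter, scores in by_drafter.items():
    print(f"{drafter}: mean originality {mean(scores):.2f} over {len(scores)} ratings")

# Split ratings into self-ratings (rater == drafter) and peer ratings.
self_scores = [score for rater, drafter, _, score in ratings if rater == drafter]
peer_scores = [score for rater, drafter, _, score in ratings if rater != drafter]

print(f"mean self-rating: {mean(self_scores):.2f}")
print(f"mean peer rating: {mean(peer_scores):.2f}")
```

With only two haikus per model, these averages are far too small a sample to support conclusions about any model's tendencies; the point is only to show the shape of the comparison.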

Posting this notebook to Coop

# from edsl import Notebook

# nb = Notebook(path = "models_scoring_models.ipynb")

# nb.push(
#     description = "Models scoring models",
#     alias = "models-scoring-models-notebook",
#     visibility = "public"
# )
Updating an object at Coop:
from edsl import Notebook

nb = Notebook(path = "models_scoring_models.ipynb") # resave

nb.patch("https://www.expectedparrot.com/content/RobinHorton/models-scoring-models-notebook", value = nb)
