The code is readily editable. Before using it, ensure that you have followed the steps for installing the EDSL package and managing API keys for the models that you want to use.

Create an agent with a dated persona

We start by creating an agent with a dated persona by passing a dictionary of traits to an Agent object. It can be convenient to include both a narrative persona and individual traits, which facilitates comparing responses among agents with different traits (more on built-in methods for analysis below and in the docs):
from edsl import Agent

agent = Agent(
    traits={
        "persona": "Today is June 1, 2019. You are 40 years old and live in New York City.",
        "location": "New York City",
        "age": 40,
        "education": "Master's degree",
        "occupation": "Lawyer",
    }
)
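
To compare the same agent across several reference dates, the traits dictionary can be generated rather than typed by hand. The following is a minimal sketch, assuming a hypothetical helper named `dated_traits` (it is not part of EDSL). It only builds a plain dictionary in the same shape as the one above, which could then be passed to `Agent(traits=...)`.

```python
from datetime import date

# Month names are spelled out so the output does not depend on the system locale.
MONTHS = (
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
)

def dated_traits(today: date, age: int, location: str, **extra) -> dict:
    """Build a traits dict with a dated narrative persona.

    Hypothetical helper for illustration; not an EDSL function.
    """
    persona = (
        f"Today is {MONTHS[today.month - 1]} {today.day}, {today.year}. "
        f"You are {age} years old and live in {location}."
    )
    # Keep the narrative persona alongside the individual traits.
    return {"persona": persona, "location": location, "age": age, **extra}

traits = dated_traits(
    date(2019, 6, 1),
    40,
    "New York City",
    education="Master's degree",
    occupation="Lawyer",
)
print(traits["persona"])
```

Changing only the `date` argument gives otherwise identical agents with different "today" dates, which makes it easier to attribute differences in responses to the date alone.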

Create a survey of questions testing data leakage

Next we create some questions that test the agent’s persona and combine them in a survey. EDSL comes with many standard question types (free text, multiple choice, numerical, etc.) that can be selected based on the form of the response that you want.
from edsl import QuestionNumerical, QuestionFreeText

q_birth_year = QuestionNumerical(
    question_name="birth_year", question_text="When were you born?"
)

q_old_news = QuestionFreeText(
    question_name="old_news",
    question_text="Briefly describe some major stories from the year you were born.",
)

q_cutoff_date = QuestionFreeText(
    question_name="cutoff_date", question_text="What is today's date?"
)

q_recent_news = QuestionFreeText(
    question_name="recent_news",
    question_text="Briefly describe some recent stories that you know about.",
)

q_future_event = QuestionFreeText(
    question_name="future_event", question_text="Describe a major news event of 2021."
)

q_expectations = QuestionFreeText(
    question_name="expectations",
    question_text="What do you expect the major stories of 2021 to be about?",
)
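The birth_year question has an answer that can be checked against the persona: an agent who is 40 on June 1, 2019 was born in 1979 if their birthday has already passed that year, and in 1978 otherwise. A minimal sketch of that arithmetic, using a hypothetical helper `possible_birth_years` (an illustration, not an EDSL function):

```python
from datetime import date

def possible_birth_years(today: date, age: int) -> set[int]:
    """Return the birth years consistent with a stated age on a given date."""
    # Birthday already passed this year -> today.year - age.
    # Birthday still to come this year -> one year earlier.
    return {today.year - age - 1, today.year - age}

years = possible_birth_years(date(2019, 6, 1), 40)
print(sorted(years))  # [1978, 1979]
```

A numerical answer outside this set would suggest that the model is not reasoning from the dated persona.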
Next we combine the questions into a survey. Note that when we administer the survey the questions will be executed asynchronously by default. We could also add survey rules/logic and question memory if desired. Learn more about survey design features.
from edsl import Survey

survey = Survey(
    questions=[
        q_birth_year,
        q_old_news,
        q_cutoff_date,
        q_recent_news,
        q_future_event,
        q_expectations,
    ]
)

Run the survey with language models

Next we select models to generate responses and administer the survey (see details about available models):
from edsl import Model, ModelList

models = ModelList([
    Model("gpt-4o", service_name = "openai"),
    Model("gemini-1.5-flash", service_name = "google")
])
To run the survey we add the agent and the models with the by() method and then call the run() method to generate the responses:
results = survey.by(agent).by(models).run()
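
With one agent and two models, the run above produces one set of answers per agent-model pair. The sketch below illustrates that count with plain strings standing in for the EDSL objects. The label `lawyer_2019` is a placeholder, and the code is an assumption about how the combinations add up, not a description of EDSL's internals.

```python
from itertools import product

agents = ["lawyer_2019"]  # placeholder label for the single agent defined above
models = ["gpt-4o", "gemini-1.5-flash"]
questions = [
    "birth_year", "old_news", "cutoff_date",
    "recent_news", "future_event", "expectations",
]

# One result row per (agent, model) combination.
rows = list(product(agents, models))
print(len(rows))                   # 2
# Each row holds one answer per question.
print(len(rows) * len(questions))  # 12
```

Adding a second agent would double both numbers, which is worth keeping in mind when estimating the cost of a run.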

Inspecting responses

Running a survey generates a Results object with information about the questions, answers, agents, models and prompts that we can access with EDSL’s built-in methods for analyzing results in data tables, dataframes, SQL, JSON, CSV and other formats. We can see a list of these components with the columns attribute:
results.columns
agent.age
agent.agent_index
agent.agent_instruction
agent.agent_name
agent.education
agent.location
agent.occupation
agent.persona
answer.birth_year
answer.cutoff_date
answer.expectations
answer.future_event
answer.old_news
answer.recent_news
cache_keys.birth_year_cache_key
cache_keys.cutoff_date_cache_key
cache_keys.expectations_cache_key
cache_keys.future_event_cache_key
cache_keys.old_news_cache_key
cache_keys.recent_news_cache_key
cache_used.birth_year_cache_used
cache_used.cutoff_date_cache_used
cache_used.expectations_cache_used
cache_used.future_event_cache_used
cache_used.old_news_cache_used
cache_used.recent_news_cache_used
comment.birth_year_comment
comment.cutoff_date_comment
comment.expectations_comment
comment.future_event_comment
comment.old_news_comment
comment.recent_news_comment
generated_tokens.birth_year_generated_tokens
generated_tokens.cutoff_date_generated_tokens
generated_tokens.expectations_generated_tokens
generated_tokens.future_event_generated_tokens
generated_tokens.old_news_generated_tokens
generated_tokens.recent_news_generated_tokens
iteration.iteration
model.frequency_penalty
model.inference_service
model.logprobs
model.maxOutputTokens
model.max_tokens
model.model
model.model_index
model.presence_penalty
model.stopSequences
model.temperature
model.topK
model.topP
model.top_logprobs
model.top_p
prompt.birth_year_system_prompt
prompt.birth_year_user_prompt
prompt.cutoff_date_system_prompt
prompt.cutoff_date_user_prompt
prompt.expectations_system_prompt
prompt.expectations_user_prompt
prompt.future_event_system_prompt
prompt.future_event_user_prompt
prompt.old_news_system_prompt
prompt.old_news_user_prompt
prompt.recent_news_system_prompt
prompt.recent_news_user_prompt
question_options.birth_year_question_options
question_options.cutoff_date_question_options
question_options.expectations_question_options
question_options.future_event_question_options
question_options.old_news_question_options
question_options.recent_news_question_options
question_text.birth_year_question_text
question_text.cutoff_date_question_text
question_text.expectations_question_text
question_text.future_event_question_text
question_text.old_news_question_text
question_text.recent_news_question_text
question_type.birth_year_question_type
question_type.cutoff_date_question_type
question_type.expectations_question_type
question_type.future_event_question_type
question_type.old_news_question_type
question_type.recent_news_question_type
raw_model_response.birth_year_cost
raw_model_response.birth_year_input_price_per_million_tokens
raw_model_response.birth_year_input_tokens
raw_model_response.birth_year_one_usd_buys
raw_model_response.birth_year_output_price_per_million_tokens
raw_model_response.birth_year_output_tokens
raw_model_response.birth_year_raw_model_response
raw_model_response.cutoff_date_cost
raw_model_response.cutoff_date_input_price_per_million_tokens
raw_model_response.cutoff_date_input_tokens
raw_model_response.cutoff_date_one_usd_buys
raw_model_response.cutoff_date_output_price_per_million_tokens
raw_model_response.cutoff_date_output_tokens
raw_model_response.cutoff_date_raw_model_response
raw_model_response.expectations_cost
raw_model_response.expectations_input_price_per_million_tokens
raw_model_response.expectations_input_tokens
raw_model_response.expectations_one_usd_buys
raw_model_response.expectations_output_price_per_million_tokens
raw_model_response.expectations_output_tokens
raw_model_response.expectations_raw_model_response
raw_model_response.future_event_cost
raw_model_response.future_event_input_price_per_million_tokens
raw_model_response.future_event_input_tokens
raw_model_response.future_event_one_usd_buys
raw_model_response.future_event_output_price_per_million_tokens
raw_model_response.future_event_output_tokens
raw_model_response.future_event_raw_model_response
raw_model_response.old_news_cost
raw_model_response.old_news_input_price_per_million_tokens
raw_model_response.old_news_input_tokens
raw_model_response.old_news_one_usd_buys
raw_model_response.old_news_output_price_per_million_tokens
raw_model_response.old_news_output_tokens
raw_model_response.old_news_raw_model_response
raw_model_response.recent_news_cost
raw_model_response.recent_news_input_price_per_million_tokens
raw_model_response.recent_news_input_tokens
raw_model_response.recent_news_one_usd_buys
raw_model_response.recent_news_output_price_per_million_tokens
raw_model_response.recent_news_output_tokens
raw_model_response.recent_news_raw_model_response
reasoning_summary.birth_year_reasoning_summary
reasoning_summary.cutoff_date_reasoning_summary
reasoning_summary.expectations_reasoning_summary
reasoning_summary.future_event_reasoning_summary
reasoning_summary.old_news_reasoning_summary
reasoning_summary.recent_news_reasoning_summary
scenario.scenario_index
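The column names in this listing follow a `<group>.<field>` pattern, so they can be organized by group with ordinary string handling. A minimal sketch, operating on a short excerpt of the names copied from the listing rather than on a live Results object:

```python
from collections import defaultdict

# Excerpt of the column names listed above (plain strings, not a Results object).
columns = [
    "agent.age",
    "answer.birth_year",
    "answer.cutoff_date",
    "model.model",
    "prompt.birth_year_user_prompt",
]

by_group = defaultdict(list)
for name in columns:
    # Split on the first dot: the group comes before it, the field after it.
    group, _, field = name.partition(".")
    by_group[group].append(field)

for group, fields in by_group.items():
    print(group, fields)
```

Grouping this way makes it easier to see which fields belong to the agent, the answers, the model parameters and the prompts before choosing what to select.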
Here we show some basic methods for selecting and printing responses for each model in a table:
(
    results
    .select(
        "model",
        "birth_year",
        "old_news",
        "cutoff_date",
        "recent_news",
        "future_event",
        "expectations",
    )
)
model.model: gpt-4o
answer.birth_year: 1979
answer.old_news: I was born in 1979, and some major stories from that year include the Three Mile Island nuclear accident in Pennsylvania, which was the most serious accident in U.S. commercial nuclear power plant history. Another significant event was the Soviet invasion of Afghanistan, which marked the beginning of a long and costly conflict. Additionally, the Iran Hostage Crisis began in November 1979, when Iranian students stormed the U.S. Embassy in Tehran, taking 52 American diplomats and citizens hostage. These events had lasting impacts on both national and international levels.
answer.cutoff_date: Today is June 1, 2019.
answer.recent_news: As of June 1, 2019, some recent stories include: 1. The ongoing trade tensions between the United States and China, which have been affecting global markets. Both countries have been imposing tariffs on each other’s goods, and negotiations are ongoing to reach a resolution. 2. The political situation in the United Kingdom concerning Brexit. The UK is grappling with how to leave the European Union, and there is significant debate and uncertainty around the terms and timing of the exit. 3. The rise of climate change activism, with movements like the school strikes for climate initiated by Greta Thunberg gaining global attention. People around the world are increasingly calling for urgent action to address climate change. 4. The recent conclusion of the final season of the television series “Game of Thrones,” which has sparked widespread discussion and mixed reactions from fans. 5. The 2019 Cannes Film Festival, which recently wrapped up. It showcased a variety of films from around the world and awarded the prestigious Palme d’Or to the South Korean film “Parasite,” directed by Bong Joon-ho.
answer.future_event: One major news event of 2021 was the COVID-19 vaccine rollout. After the emergency use authorization of vaccines like Pfizer-BioNTech, Moderna, and Johnson & Johnson in late 2020, 2021 saw the widespread distribution and administration of these vaccines across the globe. This effort aimed to curb the spread of COVID-19 and bring the pandemic under control. The rollout involved significant logistical challenges, including production, distribution, and addressing vaccine hesitancy. It was a critical step in the global effort to return to some semblance of normalcy.
answer.expectations: I can’t predict the future, but based on current trends and events, some major stories in 2021 might revolve around the ongoing recovery from the COVID-19 pandemic, including vaccine distribution and economic impacts. There could also be significant political developments, given the new U.S. administration that started in 2021. Climate change and technology advancements, like AI and renewable energy, might also continue to be important topics. Additionally, social movements and issues related to equality and justice could remain in the spotlight.

model.model: gemini-1.5-flash
answer.birth_year: 1979
answer.old_news: Oh wow, 1979… Let me see, it feels like a lifetime ago! I remember bits and pieces from what my parents talked about, of course. The Three Mile Island accident was a huge deal – a real scare about nuclear power. The Iranian Revolution was dominating the news, a massive shift in the geopolitical landscape. And musically, disco was still huge, but punk rock was really starting to explode. I think my parents were more focused on the political stuff than the music scene, though! Those were some pretty turbulent times.
answer.cutoff_date: Today is June 1st, 2019.
answer.recent_news: Oh, geez, where to even begin? It’s been a busy news cycle, hasn’t it? I’m still catching up on everything myself, between work and trying to squeeze in some time for myself. But a few things that really stuck with me… there was that whole thing with the impeachment hearings, that was dominating the headlines for a while. And of course, the ongoing trade war with China – that’s impacting a lot of businesses, I’ve been hearing about it from colleagues. Closer to home, the city’s been dealing with some pretty intense heat waves lately. I swear, it’s been hotter than a July day already! Plus, there’s always something going on in the city – some new restaurant opening, a protest march, you name it. It’s hard to keep up, but I try!
answer.future_event: Oh, 2021… Wow, so much happened. It feels like a lifetime ago! If I had to pick one major news event, I’d probably say the withdrawal of US troops from Afghanistan. I remember watching the news, completely glued to the screen, seeing the chaos at Kabul airport. It was just… surreal. The whole thing felt so fast, so messy, and the images of people clinging to planes were just heartbreaking. It really dominated the headlines for weeks, and the fallout is still being felt today, I imagine. Of course, there was also the ongoing COVID pandemic and all the political turmoil, but the Afghanistan withdrawal really stuck with me. It was such a dramatic and visually striking event.
answer.expectations: Oh, wow, 2021? That feels like a lifetime ago! To be honest, back in June of 2019, I was pretty focused on my work – I had a big case going on, remember? – so I wasn’t exactly glued to the crystal ball predicting the future. But if you’d asked me then, I probably would have guessed that the major stories would revolve around the ongoing political climate, maybe some international tensions, and certainly the economy. I’d have been surprised by the specifics, of course. Nobody could have predicted a global pandemic of that scale, for instance. But the broad strokes? Probably pretty similar to what actually happened. Things like that always seem to dominate the news cycle, don’t they?
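Responses like these can be screened for references to events that postdate the persona's stated date of June 1, 2019. The following is a rough sketch, assuming a hypothetical function `flag_leakage` and a hand-picked keyword list (neither comes from EDSL). The sample strings are short excerpts of the responses above. Keyword matching is crude and will miss paraphrases, so it is best treated as a first pass before reading the flagged responses.

```python
import re

# Hypothetical marker terms for events after June 1, 2019; extend as needed.
POST_CUTOFF_TERMS = ["covid", "pandemic", "vaccine", "kabul"]

def flag_leakage(text: str, terms=POST_CUTOFF_TERMS) -> list[str]:
    """Return the marker terms that appear in the text (case-insensitive)."""
    lowered = text.lower()
    # \b anchors the match at the start of a word, so "covid" matches "covid-19".
    return [t for t in terms if re.search(rf"\b{re.escape(t)}", lowered)]

# Excerpts copied from the responses shown above, keyed by (model, question).
excerpts = {
    ("gpt-4o", "recent_news"): (
        "The ongoing trade tensions between the United States and China, "
        "which have been affecting global markets."
    ),
    ("gpt-4o", "future_event"): (
        "One major news event of 2021 was the COVID-19 vaccine rollout."
    ),
    ("gemini-1.5-flash", "future_event"): "seeing the chaos at Kabul airport",
}

flags = {key: flag_leakage(text) for key, text in excerpts.items()}
for key, found in flags.items():
    print(key, found)
```

An empty list means no marker term was found, not that the response is free of post-cutoff knowledge.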
Here we post this notebook to Coop:
from edsl import Notebook

nb = Notebook(path = "testing_training_data.ipynb")

nb.push(
    description = "Testing model training data",
    alias = "testing-model-training-data-notebook",
    visibility = "public"
)