Intro to EDSL
This notebook provides example code for base components of EDSL, an open-source library for simulating surveys, experiments and research tasks with AI agents and large language models.
Before running the code below, please see the instructions on getting started.
Our documentation page also provides many more tips, tutorials and demo notebooks for using EDSL.
Simple example
We start by selecting a question type and constructing a question in the relevant template:
[1]:
from edsl import QuestionMultipleChoice
q = QuestionMultipleChoice(
    question_name = "marvel_movies",
    question_text = "Do you enjoy Marvel movies?",
    question_options = ["Yes", "No", "I do not know"]
)
We administer a question by calling the run() method. This generates a dataset of Results including the model's response to the question:
[2]:
results = q.run()
results.select("marvel_movies")
Service | Model | Input Tokens | Input Cost | Output Tokens | Output Cost | Total Cost | Total Credits |
---|---|---|---|---|---|---|---|
openai | gpt-4o | 64 | $0.0002 | 22 | $0.0003 | $0.0005 | 0.00 |
Totals | | 64 | $0.0002 | 22 | $0.0003 | $0.0005 | 0.00 |
You can obtain the total credit cost by multiplying the total USD cost by 100. A lower credit cost indicates that you saved money by retrieving responses from the universal remote cache.
[2]:
 | answer.marvel_movies |
---|---|
0 | I do not know |
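The credit conversion mentioned in the run output can be sketched with plain arithmetic. This is a minimal illustration, assuming only the stated rule of 100 credits per USD; the helper `usd_to_credits` is a hypothetical name and not part of EDSL.

```python
from decimal import Decimal

# Assumption: 1 USD corresponds to 100 credits, as stated in the run output.
CREDITS_PER_USD = Decimal("100")


def usd_to_credits(total_usd: str) -> Decimal:
    """Convert a USD cost given as a string (e.g. "0.0005") to credits."""
    return Decimal(total_usd) * CREDITS_PER_USD


# Total cost reported in the table above for the single gpt-4o call.
full_price_credits = usd_to_credits("0.0005")
print(f"Credits at full price: {full_price_credits}")
```

The table reports 0.00 total credits against this full-price figure, which is the situation the note describes: the response was retrieved from the cache rather than generated anew.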
Designing AI agents
We can create personas for agents to answer the question:
[3]:
from edsl import AgentList, Agent
personas = ["comic book collector", "movie critic"]
a = AgentList(
    Agent(traits = {"persona": p}) for p in personas
)
Selecting language models
We can select language models to generate the responses (in the example above we did not specify a model, so the default model, gpt-4o according to the cost table, was used):
[4]:
from edsl import ModelList, Model
m = ModelList([
    Model("gpt-4o", service_name = "openai"),
    Model("gemini-1.5-flash", service_name = "google")
])
Generating results
We add agents and models to a question when running it:
[5]:
results = q.by(a).by(m).run()
results.select("model", "persona", "marvel_movies")
Service | Model | Input Tokens | Input Cost | Output Tokens | Output Cost | Total Cost | Total Credits |
---|---|---|---|---|---|---|---|
openai | gpt-4o | 189 | $0.0005 | 64 | $0.0008 | $0.0013 | 0.00 |
google | gemini-1.5-flash | 187 | $0.0001 | 118 | $0.0001 | $0.0002 | 0.00 |
Totals | | 376 | $0.0006 | 182 | $0.0009 | $0.0015 | 0.00 |
[5]:
 | model.model | agent.persona | answer.marvel_movies |
---|---|---|---|
0 | gpt-4o | comic book collector | Yes |
1 | gemini-1.5-flash | comic book collector | Yes |
2 | gpt-4o | movie critic | Yes |
3 | gemini-1.5-flash | movie critic | Yes |
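The four rows above correspond to every pairing of the two personas with the two models. The following sketch illustrates that combinatorics with the standard library only; it is not how EDSL builds its jobs internally, and the row order it prints should not be read as a guarantee about EDSL output.

```python
from itertools import product

personas = ["comic book collector", "movie critic"]
models = ["gpt-4o", "gemini-1.5-flash"]

# One (persona, model) pair per expected result row: 2 personas x 2 models = 4 rows.
combinations = [
    {"persona": persona, "model": model}
    for persona, model in product(personas, models)
]

for row in combinations:
    print(row["model"], "|", row["persona"])
```

Adding a third component, such as a list of scenarios, multiplies the number of rows again in the same way.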
Parameterizing questions
We can use Scenario objects to add data or content to questions:
[6]:
q1 = QuestionMultipleChoice(
question_name = "politically_motivated",
question_text = """
Read the following movie review and determine whether it is politically motivated.
Movie: {{ scenario.title }}
Review: {{ scenario.review }}
""",
question_options = ["Yes", "No", "I do not know"]
)
EDSL comes with methods for generating scenarios from many data sources, including PDFs, CSVs, docs, images, tables, lists and dicts. Here we construct a single scenario from a dict:
[7]:
from edsl import Scenario
s = Scenario({
"year": 2014,
"title": "Captain America: The Winter Soldier",
"review": """
Part superhero flick, part 70s political thriller.
It's a bold mix that pays off, delivering a scathing
critique of surveillance states wrapped in spandex
and shield-throwing action.
"""
})
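To make the role of the `{{ scenario.title }}` and `{{ scenario.review }}` placeholders concrete, here is a minimal sketch of how scenario values could be substituted into the question text. It uses a regular expression as a stand-in for whatever template engine EDSL actually uses, and the `render` helper is a hypothetical name introduced for illustration.

```python
import re

# Matches placeholders of the form {{ scenario.key }} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*scenario\.(\w+)\s*\}\}")


def render(template: str, scenario: dict) -> str:
    """Replace each {{ scenario.key }} placeholder with the matching scenario value."""
    return PLACEHOLDER.sub(lambda match: str(scenario[match.group(1)]), template)


question_text = "Movie: {{ scenario.title }}\nReview: {{ scenario.review }}"
scenario = {
    "year": 2014,
    "title": "Captain America: The Winter Soldier",
    "review": "Part superhero flick, part 70s political thriller.",
}

rendered = render(question_text, scenario)
print(rendered)
```

Keys that the question text does not reference, such as `year` here, are simply carried along with the scenario and remain available as columns in the results.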
[8]:
results = q1.by(s).by(a).by(m).run()
Service | Model | Input Tokens | Input Cost | Output Tokens | Output Cost | Total Cost | Total Credits |
---|---|---|---|---|---|---|---|
openai | gpt-4o | 327 | $0.0009 | 49 | $0.0005 | $0.0014 | 0.00 |
google | gemini-1.5-flash | 345 | $0.0001 | 179 | $0.0001 | $0.0002 | 0.00 |
Totals | | 672 | $0.0010 | 228 | $0.0006 | $0.0016 | 0.00 |
[9]:
(
    results.filter("{{ agent.persona }} == 'movie critic'")
    .sort_by("model")
    .select("model", "year", "title", "politically_motivated")
)
[9]:
 | model.model | scenario.year | scenario.title | answer.politically_motivated |
---|---|---|---|---|
0 | gemini-1.5-flash | 2014 | Captain America: The Winter Soldier | No |
1 | gpt-4o | 2014 | Captain America: The Winter Soldier | Yes |
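The filter, sort and select chain behaves like ordinary row operations on a table. As a rough plain-Python analogue (not EDSL code), the same three steps can be applied to the rows of the earlier marvel_movies results, represented here as a list of dicts:

```python
# Rows copied from the marvel_movies results table earlier in this notebook.
rows = [
    {"model": "gpt-4o", "persona": "comic book collector", "marvel_movies": "Yes"},
    {"model": "gemini-1.5-flash", "persona": "comic book collector", "marvel_movies": "Yes"},
    {"model": "gpt-4o", "persona": "movie critic", "marvel_movies": "Yes"},
    {"model": "gemini-1.5-flash", "persona": "movie critic", "marvel_movies": "Yes"},
]

# filter: keep only rows for the movie critic persona
critic_rows = [row for row in rows if row["persona"] == "movie critic"]

# sort_by: order the remaining rows by model name
critic_rows.sort(key=lambda row: row["model"])

# select: keep only the columns of interest
columns = ("model", "marvel_movies")
selected = [{column: row[column] for column in columns} for row in critic_rows]

for row in selected:
    print(row)
```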
Comments
Questions automatically include a "comment" field. This can be useful for understanding the context of a response, or debugging a non-response.
[10]:
(
    results.filter("{{ agent.persona }} == 'movie critic'")
    .sort_by("model")
    .select("model", "politically_motivated", "politically_motivated_comment")
)
[10]:
 | model.model | answer.politically_motivated | comment.politically_motivated_comment |
---|---|---|---|
0 | gemini-1.5-flash | No | The review focuses on the film's genre blending and its thematic exploration of surveillance states. While the themes touched upon have political undertones, the review itself doesn't explicitly endorse or condemn any specific political ideology or party. It's a critique of a concept, not a political stance. |
1 | gpt-4o | Yes | The review mentions a "scathing critique of surveillance states," indicating that the movie's themes are politically motivated. |
Combining questions in a survey
We can combine questions in a Survey (https://docs.expectedparrot.com/en/latest/surveys.html) to administer them together. Here we create some variations on the above question to compare responses:
[11]:
from edsl import QuestionYesNo
q2 = QuestionYesNo(
question_name = "yn",
question_text = """
Read the following movie review and determine whether it is politically motivated.
Movie: {{ scenario.title }}
Review: {{ scenario.review }}
"""
)
[12]:
from edsl import QuestionLinearScale
q3 = QuestionLinearScale(
question_name = "ls",
question_text = """
Read the following movie review and indicate whether it is politically motivated.
Movie: {{ scenario.title }}
Review: {{ scenario.review }}
""",
question_options = [0,1,2,3,4,5],
option_labels = {0:"Not at all", 5:"Very much"}
)
[13]:
from edsl import QuestionList
q4 = QuestionList(
    question_name = "favorites",
    question_text = "List your favorite Marvel movies.",
    max_list_items = 3
)
Survey rules & logic
We can add skip, stop and other rules to a survey, as well as "memory" of other questions. Here a stop rule ends the survey after q3 if the linear scale answer is below 3:
[14]:
from edsl import Survey
survey = Survey(questions = [q2, q3, q4])
survey = survey.add_stop_rule(q3, "{{ ls.answer }} < 3")
[15]:
results = survey.by(s).by(a).by(m).run()
Service | Model | Input Tokens | Input Cost | Output Tokens | Output Cost | Total Cost | Total Credits |
---|---|---|---|---|---|---|---|
openai | gpt-4o | 889 | $0.0023 | 278 | $0.0029 | $0.0052 | 0.00 |
google | gemini-1.5-flash | 702 | $0.0001 | 238 | $0.0001 | $0.0002 | 0.00 |
Totals | | 1,591 | $0.0024 | 516 | $0.0030 | $0.0054 | 0.00 |
[16]:
(
    results.filter("{{ agent.persona }} == 'comic book collector'")
    .select("model", "persona", "yn", "ls", "favorites")
    .print(pretty_labels = {
        "answer.yn": "Yes/No version",
        "answer.ls": "Linear scale version",
        "answer.favorites": "Favorites"
    })
)
[16]:
 | model.model | agent.persona | Yes/No version | Linear scale version | Favorites |
---|---|---|---|---|---|
0 | gpt-4o | comic book collector | Yes | 3 | ['Iron Man', 'The Avengers', 'Guardians of the Galaxy'] |
1 | gemini-1.5-flash | comic book collector | Yes | 1 | nan |
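The nan in the Favorites column follows from the stop rule on q3: gemini-1.5-flash answered 1, which is below 3, so the survey ended before q4 was asked. The sketch below mimics that control flow in plain Python. The `administer` function and the canned answers are illustrative assumptions, not EDSL internals.

```python
def administer(questions, canned_answers, stop_rules):
    """Ask questions in order; stop after a question whose stop rule evaluates to True."""
    collected = {}
    for name in questions:
        collected[name] = canned_answers[name]
        rule = stop_rules.get(name)
        if rule is not None and rule(collected):
            break  # remaining questions are never asked
    # Questions that were never asked are reported as None.
    return {name: collected.get(name) for name in questions}


questions = ["yn", "ls", "favorites"]
# Mirrors the rule "{{ ls.answer }} < 3" attached to q3 above.
stop_rules = {"ls": lambda answers: answers["ls"] < 3}

gpt_4o = administer(
    questions,
    {"yn": "Yes", "ls": 3, "favorites": ["Iron Man", "The Avengers", "Guardians of the Galaxy"]},
    stop_rules,
)
gemini = administer(
    questions,
    # Hypothetical placeholder: this answer is never reached because the rule fires first.
    {"yn": "Yes", "ls": 1, "favorites": ["(never asked)"]},
    stop_rules,
)

print(gpt_4o)
print(gemini)
```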
Working with results as datasets
EDSL provides built-in methods for analyzing results, e.g., as SQL tables or dataframes:
[17]:
results.sql("""
select model, persona, yn, ls, favorites
from self
order by model, persona
""")
[17]:
 | model | persona | yn | ls | favorites |
---|---|---|---|---|---|
0 | gemini-1.5-flash | comic book collector | Yes | 1 | nan |
1 | gemini-1.5-flash | movie critic | Yes | 1 | nan |
2 | gpt-4o | comic book collector | Yes | 3 | ['Iron Man', 'The Avengers', 'Guardians of the Galaxy'] |
3 | gpt-4o | movie critic | Yes | 3 | ['Avengers: Endgame', 'Black Panther', 'Guardians of the Galaxy'] |
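For readers who want to experiment with the query outside EDSL, the same statement can be run against an in-memory SQLite table. This sketch assumes only what the query itself shows, namely that the results are exposed as a table named self with short column names; the rows are copied from the output above, with list answers stored as text and missing answers as NULL.

```python
import sqlite3

# Rows taken from the survey results shown above.
rows = [
    ("gpt-4o", "comic book collector", "Yes", 3,
     "['Iron Man', 'The Avengers', 'Guardians of the Galaxy']"),
    ("gemini-1.5-flash", "comic book collector", "Yes", 1, None),
    ("gpt-4o", "movie critic", "Yes", 3,
     "['Avengers: Endgame', 'Black Panther', 'Guardians of the Galaxy']"),
    ("gemini-1.5-flash", "movie critic", "Yes", 1, None),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table self (model text, persona text, yn text, ls integer, favorites text)"
)
conn.executemany("insert into self values (?, ?, ?, ?, ?)", rows)

ordered = conn.execute(
    """
    select model, persona, yn, ls, favorites
    from self
    order by model, persona
    """
).fetchall()
conn.close()

for row in ordered:
    print(row)
```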
[18]:
results.to_pandas()
[18]:
 | answer.ls | answer.yn | answer.favorites | scenario.year | scenario.scenario_index | scenario.title | scenario.review | agent.agent_index | agent.persona | agent.agent_instruction | ... | generated_tokens.ls_generated_tokens | cache_used.yn_cache_used | cache_used.favorites_cache_used | cache_used.ls_cache_used | cache_keys.yn_cache_key | cache_keys.ls_cache_key | cache_keys.favorites_cache_key | reasoning_summary.ls_reasoning_summary | reasoning_summary.favorites_reasoning_summary | reasoning_summary.yn_reasoning_summary |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3 | Yes | ['Iron Man', 'The Avengers', 'Guardians of the... | 2014 | 0 | Captain America: The Winter Soldier | \n Part superhero flick, part 70s political... | 0 | comic book collector | You are answering questions as if you were a h... | ... | 3\n\nThe review highlights the film's critique... | True | True | True | 28be7af4ca7266aeaaf6dfb1939f2830 | f34701a54317094ac2145039c54d2e82 | 4a65448cf57b2e2529af3a3fe11ad8b0 | NaN | NaN | NaN |
1 | 1 | Yes | NaN | 2014 | 0 | Captain America: The Winter Soldier | \n Part superhero flick, part 70s political... | 0 | comic book collector | You are answering questions as if you were a h... | ... | 1\n\nThe review explicitly mentions a "scathin... | True | NaN | True | 455b0c2040a35ed5ee2b5653a939e8ad | f97dfa012a93b0722bd0e173651a1523 | NaN | NaN | NaN | NaN |
2 | 3 | Yes | ['Avengers: Endgame', 'Black Panther', 'Guardi... | 2014 | 0 | Captain America: The Winter Soldier | \n Part superhero flick, part 70s political... | 1 | movie critic | You are answering questions as if you were a h... | ... | 3 \nThe review highlights a critique of surve... | True | True | True | 502a250f2082fec0421839e48967b5d6 | b24392669b17dbf2ddcaf21067eafae9 | 595a2e86d433057894b2609f4e058586 | NaN | NaN | NaN |
3 | 1 | Yes | NaN | 2014 | 0 | Captain America: The Winter Soldier | \n Part superhero flick, part 70s political... | 1 | movie critic | You are answering questions as if you were a h... | ... | 1\n\nThe review directly mentions a "scathing ... | True | NaN | True | 6a0c7ae0d01760ce2648dae44cbcfa82 | b980819394531b86e6bc4ac8e5e1add3 | NaN | NaN | NaN | NaN |
4 rows × 77 columns
[19]:
results.to_csv("marvel_movies_survey.csv")
File written to marvel_movies_survey.csv
Posting to the Coop
[ ]:
from edsl import Notebook
nb = Notebook(path = "edsl_intro.ipynb")
nb.push(
    description = "Example survey: Using EDSL to analyze content",
    alias = "example-edsl-notebook",
    visibility = "public"
)