Intro to EDSL

This notebook provides example code for base components of EDSL, an open-source library for simulating surveys, experiments and other research with AI agents and large language models. Details on the code below are provided in accompanying slides: How to use EDSL.

Technical setup

Before running the code below, please ensure that you have installed the EDSL library and either activated remote inference from your Coop account or stored API keys for the language models that you want to use with EDSL.

Documentation

Please also see our documentation page for tips, tutorials and more demo notebooks on using EDSL.

Simple example

We start by selecting a question type and constructing a question in the relevant template:

[1]:
from edsl import QuestionMultipleChoice

q = QuestionMultipleChoice(
    question_name = "marvel_movies",
    question_text = "Do you enjoy Marvel movies?",
    question_options = ["Yes", "No", "I do not know"]
)

We administer a question by calling the run() method. This generates a dataset of Results including the model’s response to the question:

[2]:
results = q.run()

results.select("marvel_movies")
Remote Job Log (2024-12-14 08:37:16)
Remote inference activated. Sending job to server...
Your survey is running at the Expected Parrot server...
Job sent to server. (Job uuid=d321721d-4e3a-42af-9bc2-9f492e5dcaaa).
Job status: queued - last update: 2024-12-14 08:37:03 AM
Job status: running - last update: 2024-12-14 08:37:06 AM
Job status: running - last update: 2024-12-14 08:37:09 AM
Job status: running - last update: 2024-12-14 08:37:12 AM
[2]:
answer.marvel_movies
I do not know

Designing AI agents

We can create personas for agents to answer the question:

[3]:
from edsl import AgentList, Agent

personas = ["comic book collector", "movie critic"]

a = AgentList(
    Agent(traits = {"persona": p}) for p in personas
)

Selecting language models

We can select the language models that generate the responses. (In the example above we did not specify a model, so the default, GPT-4 preview, was used.)

[4]:
from edsl import ModelList, Model

models = ["gpt-4o", "claude-3-5-sonnet-20240620"]

# Build one Model per name in the list defined above
m = ModelList(
    Model(name) for name in models
)

Generating results

We add agents and models to a question when running it:

[5]:
results = q.by(a).by(m).run()

results.select("model", "persona", "marvel_movies")
Remote Job Log (2024-12-14 08:39:59)
Remote inference activated. Sending job to server...
Your survey is running at the Expected Parrot server...
Job sent to server. (Job uuid=96b8fad4-e884-4370-8cf5-3ddc5ffa5f37).
Job status: queued - last update: 2024-12-14 08:39:43 AM
Job status: running - last update: 2024-12-14 08:39:46 AM
Job status: running - last update: 2024-12-14 08:39:50 AM
Job status: running - last update: 2024-12-14 08:39:53 AM
Job status: running - last update: 2024-12-14 08:39:56 AM
[5]:
model.model                  agent.persona          answer.marvel_movies
gpt-4o                       comic book collector   Yes
claude-3-5-sonnet-20240620   comic book collector   Yes
gpt-4o                       movie critic           Yes
claude-3-5-sonnet-20240620   movie critic           Yes
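
The output above has one row per agent-model pair. The plain-Python sketch below illustrates that pairing. It assumes that chaining by() calls amounts to taking the cross product of the lists; it is an illustration of the resulting combinations, not EDSL's internal code.

```python
from itertools import product

personas = ["comic book collector", "movie critic"]
models = ["gpt-4o", "claude-3-5-sonnet-20240620"]

# Each (persona, model) pair corresponds to one row in the results.
combinations = [
    {"persona": persona, "model": model}
    for persona, model in product(personas, models)
]

for combo in combinations:
    print(combo["model"], "|", combo["persona"])
```

Adding a third dimension, such as a list of scenarios, would multiply the number of rows in the same way.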

Parameterizing questions

We can use Scenario objects to add data or content to questions:

[6]:
q1 = QuestionMultipleChoice(
    question_name = "politically_motivated",
    question_text = """
    Read the following movie review and determine whether it is politically motivated.
    Movie: {{ title }}
    Review: {{ review }}
    """,
    question_options = ["Yes", "No", "I do not know"]
)

EDSL comes with methods for generating scenarios from many data sources, including PDFs, CSVs, docs, images, tables, lists and dicts:

[7]:
from edsl import Scenario

example_review = {
    "year": 2014,
    "title": "Captain America: The Winter Soldier",
    "review": """
    Part superhero flick, part 70s political thriller.
    It's a bold mix that pays off, delivering a scathing
    critique of surveillance states wrapped in spandex
    and shield-throwing action.
    """
}

s = Scenario.from_dict(example_review)
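
To make the placeholder mechanism concrete, here is a simplified stand-in written with the standard library only. The render() helper is hypothetical and is not part of EDSL; it only mimics how a {{ name }} placeholder in a question text could be filled from the keys of a scenario, while keys that the text does not mention (such as year) are simply not used in the prompt.

```python
import re

def render(template, values):
    """Replace each {{ name }} placeholder with values[name].

    Placeholders with no matching key are left untouched.
    """
    def substitute(match):
        key = match.group(1)
        return str(values[key]) if key in values else match.group(0)

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

question_text = "Movie: {{ title }}"
scenario = {"year": 2014, "title": "Captain America: The Winter Soldier"}

rendered = render(question_text, scenario)
print(rendered)
```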
[8]:
results = q1.by(s).by(a).by(m).run()

(
    results.filter("persona == 'movie critic'")
    .sort_by("model")
    .select("model", "year", "title", "politically_motivated")
)
Remote Job Log (2024-12-14 08:42:05)
Remote inference activated. Sending job to server...
Your survey is running at the Expected Parrot server...
Job sent to server. (Job uuid=8b1fcef8-684c-4696-a533-d2b7811fb281).
Job status: queued - last update: 2024-12-14 08:41:45 AM
Job status: queued - last update: 2024-12-14 08:41:49 AM
Job status: running - last update: 2024-12-14 08:41:52 AM
Job status: running - last update: 2024-12-14 08:41:55 AM
Job status: running - last update: 2024-12-14 08:41:58 AM
Job status: running - last update: 2024-12-14 08:42:01 AM
[8]:
model.model                  scenario.year   scenario.title                        answer.politically_motivated
claude-3-5-sonnet-20240620   2014            Captain America: The Winter Soldier   No
gpt-4o                       2014            Captain America: The Winter Soldier   Yes

Comments

Questions automatically include a “comment” field. This can be useful for understanding the context of a response, or debugging a non-response.

[9]:
(
    results.filter("persona == 'movie critic'")
    .sort_by("model")
    .select("model", "politically_motivated", "politically_motivated_comment")
)
[9]:
model.model                  answer.politically_motivated   comment.politically_motivated_comment
claude-3-5-sonnet-20240620   No                             While the review mentions political themes, it doesn't appear to be politically motivated itself. It's a straightforward assessment of the film's genre blend and thematic content without pushing any particular political agenda.
gpt-4o                       Yes                            The review explicitly mentions a "scathing critique of surveillance states," indicating a political perspective on the film's themes.

Combining questions in a survey

We can combine questions in a Survey (https://docs.expectedparrot.com/en/latest/surveys.html) to administer them together. Here we create some variations on the above question to compare responses:

[10]:
from edsl import QuestionYesNo

q2 = QuestionYesNo(
    question_name = "yn",
    question_text = """
    Read the following movie review and determine whether it is politically motivated.
    Movie: {{ title }}
    Review: {{ review }}
    """
)
[11]:
from edsl import QuestionLinearScale

q3 = QuestionLinearScale(
    question_name = "ls",
    question_text = """
    Read the following movie review and indicate whether it is politically motivated.
    Movie: {{ title }}
    Review: {{ review }}
    """,
    question_options = [0,1,2,3,4,5],
    option_labels = {0:"Not at all", 5:"Very much"}
)
[12]:
from edsl import QuestionList

q4 = QuestionList(
    question_name = "favorites",
    question_text = "List your favorite Marvel movies.",
    max_list_items = 3
)

Survey rules & logic

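We can add skip rules, stop rules and other logic to a survey, as well as “memory” of other questions. Here we end the survey after the linear scale question whenever its answer is below 3: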

[13]:
from edsl import Survey

survey = Survey(questions = [q2, q3, q4])

survey = survey.add_stop_rule(q3, "ls < 3")
[14]:
results = survey.by(s).by(a).by(m).run()
Remote Job Log (2024-12-14 08:46:18)
Remote inference activated. Sending job to server...
Your survey is running at the Expected Parrot server...
Job sent to server. (Job uuid=b239f5bb-27dc-4834-9741-babfc0b5f8c9).
Job status: queued - last update: 2024-12-14 08:45:59 AM
Job status: running - last update: 2024-12-14 08:46:02 AM
Job status: running - last update: 2024-12-14 08:46:05 AM
Job status: running - last update: 2024-12-14 08:46:09 AM
Job status: running - last update: 2024-12-14 08:46:12 AM
Job status: running - last update: 2024-12-14 08:46:15 AM
[15]:
(
    results.filter("persona == 'comic book collector'")
    .select("model", "persona", "yn", "ls", "favorites")
    .print(pretty_labels = {
        "answer.yn": "Yes/No version",
        "answer.ls": "Linear scale version",
        "answer.favorites": "Favorites"
    })
)
[15]:
model.model                  agent.persona          Yes/No version   Linear scale version   Favorites
gpt-4o                       comic book collector   Yes              3                      ['Guardians of the Galaxy', 'Spider-Man: Into the Spider-Verse', 'Avengers: Endgame']
claude-3-5-sonnet-20240620   comic book collector   No               2
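
The empty Favorites cell is the stop rule at work: the linear scale answer was below 3, so the last question was never asked. The sketch below reproduces that control flow in plain Python. The administer() function and the canned answers are hypothetical stand-ins for illustration (the answer values are copied from the output above); they are not how EDSL implements rules.

```python
def administer(question_names, answer_fn, stop_rules):
    """Ask questions in order and stop early when a stop rule fires.

    stop_rules maps a question name to a predicate over the answers so far.
    Questions that are never reached keep the answer None.
    """
    answers = {name: None for name in question_names}
    for name in question_names:
        answers[name] = answer_fn(name)
        rule = stop_rules.get(name)
        if rule is not None and rule(answers):
            break
    return answers

questions = ["yn", "ls", "favorites"]
stop_rules = {"ls": lambda answers: answers["ls"] < 3}

# Canned answers standing in for model responses.
gpt_canned = {
    "yn": "Yes",
    "ls": 3,
    "favorites": ["Guardians of the Galaxy", "Spider-Man: Into the Spider-Verse", "Avengers: Endgame"],
}
# No "favorites" entry here: a lookup would raise KeyError if that question were asked.
claude_canned = {"yn": "No", "ls": 2}

gpt_result = administer(questions, gpt_canned.__getitem__, stop_rules)
claude_result = administer(questions, claude_canned.__getitem__, stop_rules)

print(gpt_result)
print(claude_result)
```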

Working with results as datasets

EDSL provides built-in methods for analyzing results, e.g., as SQL tables or dataframes:

[16]:
results.sql("select model, persona, yn, ls, favorites from self")
[16]:
model                        persona                yn    ls   favorites
gpt-4o                       comic book collector   Yes   3    ['Guardians of the Galaxy', 'Spider-Man: Into the Spider-Verse', 'Avengers: Endgame']
claude-3-5-sonnet-20240620   comic book collector   No    2
gpt-4o                       movie critic           Yes   3    ['Avengers: Endgame', 'Black Panther', 'Guardians of the Galaxy']
claude-3-5-sonnet-20240620   movie critic           No    0
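
Any SQL that works on a flat table can be used in the same way. As a rough sketch, the standard library's sqlite3 module can run an aggregate query over the rows shown above outside EDSL. The table is named "self" here only to mirror the query in the previous cell; whether EDSL uses SQLite internally is an assumption, not something this notebook states.

```python
import sqlite3

# Rows copied from the output above (favorites omitted for brevity).
rows = [
    ("gpt-4o", "comic book collector", "Yes", 3),
    ("claude-3-5-sonnet-20240620", "comic book collector", "No", 2),
    ("gpt-4o", "movie critic", "Yes", 3),
    ("claude-3-5-sonnet-20240620", "movie critic", "No", 0),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "self" (model TEXT, persona TEXT, yn TEXT, ls INTEGER)')
conn.executemany('INSERT INTO "self" VALUES (?, ?, ?, ?)', rows)

# Average linear scale score per model.
query = 'SELECT model, AVG(ls) FROM "self" GROUP BY model ORDER BY model'
averages = conn.execute(query).fetchall()
conn.close()

print(averages)
```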
[17]:
results.to_pandas()
[17]:
answer.yn answer.ls answer.favorites scenario.year scenario.title scenario.review agent.agent_instruction agent.agent_name agent.persona model.frequency_penalty ... question_options.ls_question_options question_type.favorites_question_type question_type.yn_question_type question_type.ls_question_type comment.yn_comment comment.ls_comment comment.favorites_comment generated_tokens.ls_generated_tokens generated_tokens.yn_generated_tokens generated_tokens.favorites_generated_tokens
0 Yes 3 ['Guardians of the Galaxy', 'Spider-Man: Into ... 2014 Captain America: The Winter Soldier \n Part superhero flick, part 70s political... You are answering questions as if you were a h... Agent_9 comic book collector 0 ... [0, 1, 2, 3, 4, 5] list yes_no linear_scale While the review highlights the film's action ... The review suggests that the movie delivers a ... These movies capture the essence of what makes... 3 \nThe review suggests that the movie delive... Yes\n\nWhile the review highlights the film's ... ["Guardians of the Galaxy", "Spider-Man: Into ...
1 No 2 NaN 2014 Captain America: The Winter Soldier \n Part superhero flick, part 70s political... You are answering questions as if you were a h... Agent_10 comic book collector 0 ... [0, 1, 2, 3, 4, 5] list yes_no linear_scale Comment: As a comic book collector, I don't se... Task was cancelled. Task was cancelled. NaN No\n\nComment: As a comic book collector, I do... NaN
2 Yes 3 ['Avengers: Endgame', 'Black Panther', 'Guardi... 2014 Captain America: The Winter Soldier \n Part superhero flick, part 70s political... You are answering questions as if you were a h... Agent_11 movie critic 0 ... [0, 1, 2, 3, 4, 5] list yes_no linear_scale The review mentions a "scathing critique of su... The review acknowledges the film's critique of... These films stand out due to their compelling ... 3 \nThe review acknowledges the film's critiq... Yes\n\nThe review mentions a "scathing critiqu... ["Avengers: Endgame", "Black Panther", "Guardi...
3 No 0 NaN 2014 Captain America: The Winter Soldier \n Part superhero flick, part 70s political... You are answering questions as if you were a h... Agent_12 movie critic 0 ... [0, 1, 2, 3, 4, 5] list yes_no linear_scale Comment: This review does not appear to be pol... Task was cancelled. Task was cancelled. NaN No\n\nComment: This review does not appear to ... NaN

4 rows × 48 columns

[18]:
results.to_csv("marvel_movies_survey.csv")
[18]:

FileStore

key value
path marvel_movies_survey.csv

binary False
suffix csv
mime_type text/csv

Posting to the Coop

[19]:
from edsl import Notebook
[20]:
n = Notebook(path = "edsl_intro.ipynb")
[21]:
info = n.push(description = "Example survey: Using EDSL to analyze content", visibility = "public")
info
[21]:
{'description': 'Example survey: Using EDSL to analyze content',
 'object_type': 'notebook',
 'url': 'https://www.expectedparrot.com/content/2ed11d62-8e1a-44c3-976c-7de87cb28e6b',
 'uuid': '2ed11d62-8e1a-44c3-976c-7de87cb28e6b',
 'version': '0.1.39.dev1',
 'visibility': 'public'}

To update an object at the Coop:

[22]:
n = Notebook(path = "edsl_intro.ipynb") # resave
[23]:
n.patch(uuid = info["uuid"], value = n)
[23]:
{'status': 'success'}