Cheatsheet: Scenarios

This notebook provides quick examples of methods for using Scenario and ScenarioList objects to add data or other content to your EDSL survey questions.

EDSL is an open-source Python library for simulating surveys, experiments and other research with AI agents and large language models. It is built by Expected Parrot and available under the MIT License. Please see our documentation page for information and tutorials on getting started, and more details on methods for working with scenarios that are shown in this notebook.

Purpose

Scenarios allow you to efficiently administer multiple versions of questions at once. This can be useful in conducting experiments, data labeling and other tasks where you want to answer the same questions about many different things, such as every piece of data in a dataset, or a collection of texts, websites, images or other content.

Contents

In the examples below we demonstrate:

How to inspect, use, create, combine, replicate, modify, sample, slice, chunk, select and drop scenarios.
How to automatically generate scenarios from texts, PDFs, CSVs, webpages, images, dictionaries and lists.
How to use scenarios to add metadata to surveys.
How to generate code for recreating scenarios.
How to use scenarios to design AI agents to answer surveys.

Let us know if you do not see a method that you want to use! Please post a message at our Discord or send an email to info@expectedparrot.com.

How to use this notebook

The examples below can be rerun or modified to use your own questions and data or content. They also include code for posting the objects at the Coop: a new platform for creating, storing and sharing LLM-based research using EDSL.

Learn more about using the Coop to conduct and share research.

Technical setup

Before running the code below, ensure that you have:

1. Installed the EDSL library.
2. Created a Coop account to activate remote inference OR stored your own API keys for language models that you want to use with EDSL.

Importing the tools

We start by importing the relevant tools for working with scenarios:

[1]:

from edsl import Scenario, ScenarioList

Inspecting an example

A Scenario contains a dictionary of keys and values representing data or content to be added to (inserted in) the question_text field of a Question object. We can call the example() method to inspect an example scenario:

[2]:

example_scenario = Scenario.example()
example_scenario

[2]:

{
    "persona": "A reseacher studying whether LLMs can be used to generate surveys."
}

We can also see an example ScenarioList, which is a dictionary containing a list of scenarios:

[3]:

example_scenariolist = ScenarioList.example()
example_scenariolist

[3]:

{
    "scenarios": [
        {
            "persona": "A reseacher studying whether LLMs can be used to generate surveys."
        },
        {
            "persona": "A reseacher studying whether LLMs can be used to generate surveys."
        }
    ]
}

Using a Scenario

To use a scenario:

1. Create a Question (or a Survey of multiple questions) that includes a {{ placeholder }} in the question_text for each scenario key.
2. Call the by() method on the question (or survey) and pass it the scenario (or list of scenarios). A new version of a question is automatically created with each scenario value.
3. Call the run() method on the question (or survey) to send it to a language model. Responses are returned in a formatted dataset of Results that includes details on all components.
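The steps above can be pictured with a small plain-Python sketch. It is not EDSL's rendering code: the `render` helper and its regular expression are hypothetical stand-ins that only illustrate how one question template yields one rendered question per scenario.

```python
import re

def render(template, scenario):
    """Replace each {{ key }} placeholder with the matching scenario value."""
    def substitute(match):
        key = match.group(1)
        # Leave unknown placeholders untouched instead of raising an error.
        return str(scenario.get(key, match.group(0)))
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

template = "How do you feel about {{ weather }} days?"
scenarios = [{"weather": "sunny"}, {"weather": "rainy"}]

# One rendered question text per scenario.
rendered = [render(template, s) for s in scenarios]
print(rendered)
```

In EDSL itself this substitution happens automatically when by() and run() are called, so a helper like this is never needed in practice.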

Here we construct a survey of questions using the example scenario from above, send it to a language model and inspect the results. If remote inference is activated, the results are also automatically posted to the Coop. Otherwise, we can post any objects to the Coop by calling the push() method on them.

[4]:

# Import question types
from edsl import QuestionFreeText, QuestionList, Survey

# Create questions in the relevant templates with placeholders for content to be inserted
q1 = QuestionFreeText(
    question_name = "background",
    question_text = "Draft a sample bio for this researcher: {{ persona }}",
)
q2 = QuestionList(
    question_name = "interests",
    question_text = "Identify some potential interests of this researcher: {{ persona }}",
)

# Combine questions into a survey to administer them together
survey = Survey(questions=[q1, q2])

# Run the survey with the scenarios to generate a dataset of results
results = survey.by(example_scenario).run()

We have generated the results locally, and can post them and the survey to the Coop by calling the push() method, optionally passing a description and visibility status (the default is unlisted):

[5]:

survey.push(description = "Simple survey using the example scenario for a persona", visibility = "public")

[5]:

{'description': 'Simple survey using the example scenario for a persona',
 'object_type': 'survey',
 'url': 'https://www.expectedparrot.com/content/644a4aa8-79b9-4336-b281-4a15e8643ab8',
 'uuid': '644a4aa8-79b9-4336-b281-4a15e8643ab8',
 'version': '0.1.31.dev4',
 'visibility': 'public'}

[6]:

results.push(description = "Results of simple survey using the example scenario for a persona", visibility = "public")

[6]:

{'description': 'Results of simple survey using the example scenario for a persona',
 'object_type': 'results',
 'url': 'https://www.expectedparrot.com/content/edc790c1-d95b-4a51-8732-b36ef0977147',
 'uuid': 'edc790c1-d95b-4a51-8732-b36ef0977147',
 'version': '0.1.31.dev4',
 'visibility': 'public'}

We can analyze the results at the Coop or in our own workspace using the built-in methods for working with results:

[7]:

# Print a table of selected components of the results
results.select("persona", "background", "interests").print(format="rich")

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ scenario                            ┃ answer                              ┃ answer                              ┃
┃ .persona                            ┃ .background                         ┃ .interests                          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ A reseacher studying whether LLMs   │ Dr. Alex Rivera is a pioneering     │ ['natural language processing',     │
│ can be used to generate surveys.    │ researcher in the field of          │ 'survey design', 'machine           │
│                                     │ artificial intelligence, focusing   │ learning', 'human-computer          │
│                                     │ on the capabilities and             │ interaction', 'data collection      │
│                                     │ applications of Large Language      │ methods', 'artificial intelligence  │
│                                     │ Models (LLMs). With a Ph.D. in      │ ethics', 'response rate             │
│                                     │ Computer Science and a passion for  │ optimization', 'text generation']   │
│                                     │ advancing AI technology, Dr. Rivera │                                     │
│                                     │ has dedicated their career to       │                                     │
│                                     │ exploring how LLMs can be utilized  │                                     │
│                                     │ to automate and optimize the        │                                     │
│                                     │ creation of surveys. Their work     │                                     │
│                                     │ investigates the potential for      │                                     │
│                                     │ these models to understand nuanced  │                                     │
│                                     │ human inquiries and generate        │                                     │
│                                     │ questions that capture the depth    │                                     │
│                                     │ and complexity of various research  │                                     │
│                                     │ topics. Through their innovative    │                                     │
│                                     │ studies, Dr. Rivera aims to         │                                     │
│                                     │ revolutionize the way data is       │                                     │
│                                     │ collected, enhancing the efficiency │                                     │
│                                     │ and accuracy of survey-based        │                                     │
│                                     │ research.                           │                                     │
└─────────────────────────────────────┴─────────────────────────────────────┴─────────────────────────────────────┘

Creating a Scenario

We create a scenario by passing a dictionary to a Scenario object:

[8]:

weather_scenario = Scenario({"weather": "sunny"})
weather_scenario

[8]:

{
    "weather": "sunny"
}

Creating a ScenarioList

It can be useful to create a set of scenarios all at once. This can be done by constructing a list of Scenario objects or a ScenarioList. Compare a list of Scenario objects:

[9]:

weather_scenarios = [
    Scenario({"weather": w}) for w in ["sunny", "cloudy", "rainy", "snowy"]
]
weather_scenarios

[9]:

[Scenario({'weather': 'sunny'}),
 Scenario({'weather': 'cloudy'}),
 Scenario({'weather': 'rainy'}),
 Scenario({'weather': 'snowy'})]

Alternatively, we can create a ScenarioList which has a key scenarios and a list of scenarios as the values:

[10]:

example_scenariolist = ScenarioList.example()
example_scenariolist

[10]:

{
    "scenarios": [
        {
            "persona": "A reseacher studying whether LLMs can be used to generate surveys."
        },
        {
            "persona": "A reseacher studying whether LLMs can be used to generate surveys."
        }
    ]
}

[11]:

weather_scenariolist = ScenarioList(
    [Scenario({"weather": w}) for w in ["sunny", "cloudy", "rainy", "snowy"]]
)
weather_scenariolist

[11]:

{
    "scenarios": [
        {
            "weather": "sunny"
        },
        {
            "weather": "cloudy"
        },
        {
            "weather": "rainy"
        },
        {
            "weather": "snowy"
        }
    ]
}

[12]:

weather_scenariolist.push(description="Weather scenarios", visibility="public")

[12]:

{'description': 'Weather scenarios',
 'object_type': 'scenario_list',
 'url': 'https://www.expectedparrot.com/content/5e65491e-6bdf-48f6-b2e0-55e1d2dd02d2',
 'uuid': '5e65491e-6bdf-48f6-b2e0-55e1d2dd02d2',
 'version': '0.1.31.dev4',
 'visibility': 'public'}

Combining scenarios

We can add scenarios together to create a single new scenario with an extended dictionary:

[13]:

scenario1 = Scenario({"food": "apple"})
scenario2 = Scenario({"drink": "juice"})

snack_scenario = scenario1 + scenario2
snack_scenario

[13]:

{
    "food": "apple",
    "drink": "juice"
}
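For intuition, the result above looks like a dictionary merge. The following is a minimal plain-Python sketch using ordinary dictionaries as stand-ins for Scenario objects; it does not show how EDSL handles two scenarios that share a key, which this notebook does not cover.

```python
scenario1 = {"food": "apple"}
scenario2 = {"drink": "juice"}

# Keys from both dictionaries end up in one new dictionary;
# the two inputs are left unchanged.
snack = {**scenario1, **scenario2}
print(snack)
```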

Replicating scenarios

We can replicate a scenario to create a ScenarioList:

[14]:

personas_scenariolist = Scenario.example().replicate(n=3)
personas_scenariolist

[14]:

{
    "scenarios": [
        {
            "persona": "A reseacher studying whether LLMs can be used to generate surveys."
        },
        {
            "persona": "A reseacher studying whether LLMs can be used to generate surveys."
        },
        {
            "persona": "A reseacher studying whether LLMs can be used to generate surveys."
        }
    ]
}

Renaming scenarios

We can call the rename() method to rename the fields (keys) of a Scenario:

[15]:

role_scenario = Scenario.example().rename({"persona": "role"})
role_scenario

[15]:

{
    "role": "A reseacher studying whether LLMs can be used to generate surveys."
}

The method can also be called on a ScenarioList:

[16]:

scenariolist = ScenarioList(
    [
        Scenario({"name": "Apostolos"}),
        Scenario({"name": "John"}),
        Scenario({"name": "Robin"}),
    ]
)

renamed_scenariolist = scenariolist.rename({"name": "first_name"})
renamed_scenariolist

[16]:

{
    "scenarios": [
        {
            "first_name": "Apostolos"
        },
        {
            "first_name": "John"
        },
        {
            "first_name": "Robin"
        }
    ]
}

Sampling

We can call the sample() method to take a sample from a ScenarioList:

[17]:

weather_scenariolist = ScenarioList(
    [Scenario({"weather": w}) for w in ["sunny", "cloudy", "rainy", "snowy"]]
)

sample = weather_scenariolist.sample(n=2)
sample

[17]:

{
    "scenarios": [
        {
            "weather": "cloudy"
        },
        {
            "weather": "sunny"
        }
    ]
}
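Because sampling is random, rerunning the cell above may return different scenarios. As a plain-Python illustration of a repeatable draw, the sketch below uses the standard library's `random.Random` with a fixed seed on a list of dictionaries. It assumes sampling without replacement, which the notebook does not state, and it says nothing about whether EDSL's sample() accepts a seed.

```python
import random

weather_rows = [{"weather": w} for w in ["sunny", "cloudy", "rainy", "snowy"]]

# A dedicated, seeded generator makes the draw repeatable across runs.
rng = random.Random(42)
sampled_rows = rng.sample(weather_rows, k=2)  # two distinct rows
print(sampled_rows)
```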

Selecting and dropping scenarios

We can call the select() and drop() methods on a ScenarioList to include and exclude specified fields from the scenarios:

[18]:

snacks_scenariolist = ScenarioList(
    [
        Scenario({"food": "apple", "drink": "water"}),
        Scenario({"food": "banana", "drink": "milk"}),
    ]
)

food_scenariolist = snacks_scenariolist.select("food")
food_scenariolist

[18]:

{
    "scenarios": [
        {
            "food": "apple"
        },
        {
            "food": "banana"
        }
    ]
}

[19]:

drink_scenariolist = snacks_scenariolist.drop("food")
drink_scenariolist

[19]:

{
    "scenarios": [
        {
            "drink": "water"
        },
        {
            "drink": "milk"
        }
    ]
}
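The two operations above are complementary: select() keeps only the named fields, while drop() keeps everything except them. A rough plain-Python equivalent on a list of dictionaries follows; the `select` and `drop` helpers are hypothetical and not EDSL code.

```python
snacks = [
    {"food": "apple", "drink": "water"},
    {"food": "banana", "drink": "milk"},
]

def select(rows, *fields):
    """Keep only the named fields in every row."""
    return [{k: v for k, v in row.items() if k in fields} for row in rows]

def drop(rows, *fields):
    """Remove the named fields from every row."""
    return [{k: v for k, v in row.items() if k not in fields} for row in rows]

food_rows = select(snacks, "food")
drink_rows = drop(snacks, "food")
print(food_rows)
print(drink_rows)
```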

Adding metadata to scenarios

Note that we can create fields in scenarios without including them in the question_text. This will cause the fields to be present in the Results dataset, which can be useful for adding metadata to your questions and results without needing to recombine data sources. See more examples here.

Example usage:

[20]:

songs = [
    ["1999", "Prince", "pop"],
    ["1979", "The Smashing Pumpkins", "alt"],
    ["1901", "Phoenix", "indie"],
]
metadata_scenarios = ScenarioList(
    [Scenario({"title": t, "musician": m, "genre": g}) for [t, m, g] in songs]
)
metadata_scenarios

[20]:

{
    "scenarios": [
        {
            "title": "1999",
            "musician": "Prince",
            "genre": "pop"
        },
        {
            "title": "1979",
            "musician": "The Smashing Pumpkins",
            "genre": "alt"
        },
        {
            "title": "1901",
            "musician": "Phoenix",
            "genre": "indie"
        }
    ]
}

[21]:

q = QuestionFreeText(
    question_name="song",
    question_text="What is this song about: {{ title }}",  # optionally omitting other fields in the scenarios
)

results = q.by(metadata_scenarios).run()
results.select("scenario.*", "song").print(format="rich")  # all scenario fields will be present

┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ scenario              ┃ scenario ┃ scenario ┃ answer                                                            ┃
┃ .musician             ┃ .genre   ┃ .title   ┃ .song                                                             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Prince                │ pop      │ 1999     │ The song '1999' by Prince is about the celebration of life and    │
│                       │          │          │ the enjoyment of the present moment. It reflects the desire to    │
│                       │          │          │ party and have a good time without worrying about the problems of │
│                       │          │          │ the world, even with the looming threat of the millennium (the    │
│                       │          │          │ year 2000) and potential apocalyptic events. The song suggests    │
│                       │          │          │ that we should live life to the fullest and keep dancing as if    │
│                       │          │          │ it's the last night, embracing the 'party like it's 1999' ethos.  │
├───────────────────────┼──────────┼──────────┼───────────────────────────────────────────────────────────────────┤
│ The Smashing Pumpkins │ alt      │ 1979     │ The song '1979' by The Smashing Pumpkins is about the transition  │
│                       │          │          │ from youth to adulthood and the nostalgia and reflection that     │
│                       │          │          │ comes with it. It captures the feeling of freedom and the         │
│                       │          │          │ bittersweet nature of growing up, evoking memories of teenage     │
│                       │          │          │ years and the desire to hold onto those fleeting moments of       │
│                       │          │          │ youth. The song's title references the year 1979, which serves as │
│                       │          │          │ a symbolic marker for this period of change and maturation.       │
├───────────────────────┼──────────┼──────────┼───────────────────────────────────────────────────────────────────┤
│ Phoenix               │ indie    │ 1901     │ The song '1901' by the French indie pop band Phoenix is often     │
│                       │          │          │ interpreted as a nostalgic look back at the past, with references │
│                       │          │          │ to the early 20th century. The lyrics mention falling and folding │
│                       │          │          │ 'like an antiquated love,' which could suggest a reflection on    │
│                       │          │          │ times gone by and the changes that come with the passing of time. │
│                       │          │          │ The chorus of 'It's 20 seconds 'til the last call' might be       │
│                       │          │          │ metaphorical for the last moments of a certain era or the end of  │
│                       │          │          │ a personal period in someone's life. Overall, the song has a      │
│                       │          │          │ sense of looking back on a bygone era with a mix of fondness and  │
│                       │          │          │ the recognition that it is no longer attainable.                  │
└───────────────────────┴──────────┴──────────┴───────────────────────────────────────────────────────────────────┘

Chunking text

We can use the chunk() method to turn a Scenario into a ScenarioList with specified slice/chunk sizes based on num_words or num_lines. Note that the field _chunk is created automatically, and _original is added if optional parameter include_original is used:

[22]:

my_haiku = """
This is a long text.
Pages and pages, oh my!
I need to chunk it.
"""

text_scenario = Scenario({"my_text": my_haiku})

word_chunks_scenariolist = text_scenario.chunk(
    "my_text",
    num_words=5,  # use num_words or num_lines but not both
    include_original=True,  # optional
    hash_original=True,  # optional
)
word_chunks_scenariolist

[22]:

{
    "scenarios": [
        {
            "my_text": "This is a long text.",
            "my_text_chunk": 0,
            "my_text_original": "4aec42eda32b7f32bde8be6a6bc11125"
        },
        {
            "my_text": "Pages and pages, oh my!",
            "my_text_chunk": 1,
            "my_text_original": "4aec42eda32b7f32bde8be6a6bc11125"
        },
        {
            "my_text": "I need to chunk it.",
            "my_text_chunk": 2,
            "my_text_original": "4aec42eda32b7f32bde8be6a6bc11125"
        }
    ]
}

[23]:

line_chunks_scenariolist = text_scenario.chunk("my_text", num_lines=1)
line_chunks_scenariolist

[23]:

{
    "scenarios": [
        {
            "my_text": "",
            "my_text_chunk": 0
        },
        {
            "my_text": "This is a long text. ",
            "my_text_chunk": 1
        },
        {
            "my_text": "Pages and pages, oh my!",
            "my_text_chunk": 2
        },
        {
            "my_text": "I need to chunk it.",
            "my_text_chunk": 3
        },
        {
            "my_text": "",
            "my_text_chunk": 4
        }
    ]
}
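The empty first and last chunks in the line-based output come from the newline characters at the start and end of the triple-quoted string. The sketch below reproduces both kinds of chunking in plain Python to make this visible. The `chunk_by_words` and `chunk_by_lines` helpers are hypothetical and only approximate what chunk() does; whitespace handling in EDSL may differ in detail.

```python
def chunk_by_words(text, num_words):
    """Group whitespace-separated words into chunks of num_words."""
    words = text.split()
    return [" ".join(words[i:i + num_words]) for i in range(0, len(words), num_words)]

def chunk_by_lines(text, num_lines):
    """Group newline-separated lines into chunks of num_lines."""
    lines = text.split("\n")
    return ["\n".join(lines[i:i + num_lines]) for i in range(0, len(lines), num_lines)]

my_haiku = """
This is a long text.
Pages and pages, oh my!
I need to chunk it.
"""

# Attach a running index to each chunk, mirroring the my_text_chunk field.
word_chunks = [
    {"my_text": chunk, "my_text_chunk": i}
    for i, chunk in enumerate(chunk_by_words(my_haiku, 5))
]
line_chunks = chunk_by_lines(my_haiku, 1)
print(word_chunks)
print(line_chunks)
```

Calling `my_haiku.strip()` before chunking by lines would avoid the two empty chunks in this sketch.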

Tallying scenario values

We can call the tally() method on a ScenarioList to count how often each value of a specified key occurs. The output pairs each distinct value of the key with the number of scenarios that contain it:

[24]:

numeric_scenariolist = ScenarioList(
    [
        Scenario({"a": 1, "b": 1}),
        Scenario({"a": 1, "b": 2})
    ]
)

tallied_scenariolist = numeric_scenariolist.tally("b")
tallied_scenariolist

[24]:

[
    {
        "value": [
            1,
            2
        ]
    },
    {
        "count": [
            1,
            1
        ]
    }
]
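The same kind of count can be sketched in plain Python with `collections.Counter`. This is an illustration on ordinary dictionaries, not the EDSL implementation, and the `tally` dictionary layout simply mirrors the value/count output shown above.

```python
from collections import Counter

rows = [{"a": 1, "b": 1}, {"a": 1, "b": 2}]

# Count how many rows carry each distinct value of "b".
counts = Counter(row["b"] for row in rows)
tally = {"value": list(counts.keys()), "count": list(counts.values())}
print(tally)
```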

Expanding scenarios

We can call the expand() method on a ScenarioList to expand it by a specified field. For example, if the values of a scenario key are a list we can pass that key to the method to generate a Scenario for each item in the list:

[25]:

scenariolist = ScenarioList(
    [
        Scenario({"a":1, "b":[1, 2]})
    ]
)

expanded_scenariolist = scenariolist.expand("b")
expanded_scenariolist

[25]:

{
    "scenarios": [
        {
            "a": 1,
            "b": 1
        },
        {
            "a": 1,
            "b": 2
        }
    ]
}
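Conceptually, expanding produces one copy of the scenario per list item, with the list replaced by that item. A minimal plain-Python sketch follows; the `expand` helper is hypothetical and works on plain dictionaries rather than Scenario objects.

```python
def expand(rows, field):
    """Emit one row per item of the list stored under `field`."""
    expanded = []
    for row in rows:
        for item in row[field]:
            # Copy the row and overwrite the list with a single item.
            expanded.append({**row, field: item})
    return expanded

expanded_rows = expand([{"a": 1, "b": [1, 2]}], "b")
print(expanded_rows)
```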

Mutating scenarios

We can call the mutate() method on a ScenarioList to add a key/value to each Scenario, where the value is computed from an expression over the existing keys:

[26]:

scenariolist = ScenarioList(
    [
        Scenario({"a": 1, "b": 1}),
        Scenario({"a": 1, "b": 2})
    ]
)

mutated_scenariolist = scenariolist.mutate("c = a + b")
mutated_scenariolist

[26]:

{
    "scenarios": [
        {
            "a": 1,
            "b": 1,
            "c": 2
        },
        {
            "a": 1,
            "b": 2,
            "c": 3
        }
    ]
}
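To make the mechanics concrete, here is a plain-Python sketch that evaluates an expression string of the form "new_key = formula" against each row. The `mutate` helper is hypothetical, and its use of `eval` with builtins disabled is an assumption made for brevity; this notebook does not show how EDSL evaluates the expression, and `eval` should not be used on untrusted input.

```python
def mutate(rows, expression):
    """Add new_key to each row, computed from 'new_key = formula'."""
    new_key, formula = (part.strip() for part in expression.split("=", 1))
    result = []
    for row in rows:
        # Evaluate the formula with the row's fields as the only variables.
        value = eval(formula, {"__builtins__": {}}, dict(row))
        result.append({**row, new_key: value})
    return result

rows = [{"a": 1, "b": 1}, {"a": 1, "b": 2}]
mutated_rows = mutate(rows, "c = a + b")
print(mutated_rows)
```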

Ordering scenarios

We can call the order_by() method on a ScenarioList to order the scenarios by a field:

[27]:

unordered_scenariolist = ScenarioList(
    [
        Scenario({"a": 1, "b": 1}),
        Scenario({"a": 1, "b": 2})
    ]
)

ordered_scenariolist = unordered_scenariolist.order_by("b")
ordered_scenariolist

[27]:

{
    "scenarios": [
        {
            "a": 1,
            "b": 1
        },
        {
            "a": 1,
            "b": 2
        }
    ]
}

Filtering scenarios

We can call the filter() method on a ScenarioList to filter scenarios based on a conditional expression:

[28]:

unfiltered_scenariolist = ScenarioList(
    [
        Scenario({"a": 1, "b": 1}),
        Scenario({"a": 1, "b": 2})
    ]
)

filtered_scenariolist = unfiltered_scenariolist.filter("b == 2")
filtered_scenariolist

[28]:

{
    "scenarios": [
        {
            "a": 1,
            "b": 2
        }
    ]
}
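Ordering and filtering correspond to familiar list operations. The sketch below shows plain-Python equivalents on a list of dictionaries, using a key function and a comprehension in place of the field name and expression string that the EDSL methods accept; it is an analogy, not the library's code.

```python
rows = [{"a": 1, "b": 2}, {"a": 1, "b": 1}]

# Ordering: sort by the value of field "b" (ascending); returns a new list.
ordered_rows = sorted(rows, key=lambda row: row["b"])

# Filtering: keep only the rows where the condition b == 2 holds.
filtered_rows = [row for row in rows if row["b"] == 2]

print(ordered_rows)
print(filtered_rows)
```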

Creating scenarios from a list

We can call the from_list() method to create a ScenarioList from a list of values and a specified key:

[29]:

my_list = ["Apostolos", "John", "Robin"]

scenariolist = ScenarioList.from_list("name", my_list)
scenariolist

[29]:

{
    "scenarios": [
        {
            "name": "Apostolos"
        },
        {
            "name": "John"
        },
        {
            "name": "Robin"
        }
    ]
}

Adding a list of values to individual scenarios

We can call the add_list() method to add values to individual scenarios in a ScenarioList:

[30]:

scenariolist = ScenarioList(
    [
        Scenario({"weather": "sunny"}),
        Scenario({"weather": "rainy"})
    ]
)

added_scenariolist = scenariolist.add_list("preference", ["high", "low"])
added_scenariolist

[30]:

{
    "scenarios": [
        {
            "weather": "sunny",
            "preference": "high"
        },
        {
            "weather": "rainy",
            "preference": "low"
        }
    ]
}

Adding values to scenarios

We can call the add_value() method to add the same value to all scenarios in a ScenarioList:

[31]:

scenariolist = ScenarioList(
    [
        Scenario({"name": "Apostolos"}),
        Scenario({"name": "John"}),
        Scenario({"name": "Robin"}),
    ]
)

added_scenariolist = scenariolist.add_value("company", "Expected Parrot")
added_scenariolist

[31]:

{
    "scenarios": [
        {
            "name": "Apostolos",
            "company": "Expected Parrot"
        },
        {
            "name": "John",
            "company": "Expected Parrot"
        },
        {
            "name": "Robin",
            "company": "Expected Parrot"
        }
    ]
}
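The difference between add_list() in the previous section and add_value() here is that the former pairs one value with each scenario while the latter repeats a single value everywhere. A plain-Python sketch of the two behaviors follows. Both helpers are hypothetical, and the length check in `add_list` is an assumption, since the notebook does not say how EDSL handles a list of the wrong length.

```python
def add_list(rows, key, values):
    """Pair the i-th value with the i-th row."""
    if len(values) != len(rows):
        raise ValueError("need exactly one value per row")
    return [{**row, key: value} for row, value in zip(rows, values)]

def add_value(rows, key, value):
    """Attach the same value to every row."""
    return [{**row, key: value} for row in rows]

rows = [{"weather": "sunny"}, {"weather": "rainy"}]
with_preference = add_list(rows, "preference", ["high", "low"])
with_company = add_value(rows, "company", "Expected Parrot")
print(with_preference)
print(with_company)
```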

Creating scenarios from a pandas DataFrame

We can call the from_pandas() method to create a ScenarioList from a pandas DataFrame:

[32]:

import pandas as pd

df = pd.DataFrame(
    {
        "name": ["Apostolos", "John", "Robin"],
        "location": ["New York", "Cambridge", "Cambridge"],
    }
)

scenariolist = ScenarioList.from_pandas(df)
scenariolist

[32]:

{
    "scenarios": [
        {
            "name": "Apostolos",
            "location": "New York"
        },
        {
            "name": "John",
            "location": "Cambridge"
        },
        {
            "name": "Robin",
            "location": "Cambridge"
        }
    ]
}

Creating scenarios from a CSV

We can call the from_csv() method to create a ScenarioList from a CSV. Here we use the dataframe from above stored as a CSV:

[33]:

df.to_csv("scenarios_example.csv", index=False)

[34]:

scenariolist = ScenarioList.from_csv("scenarios_example.csv")
scenariolist

[34]:

{
    "scenarios": [
        {
            "name": "Apostolos",
            "location": "New York"
        },
        {
            "name": "John",
            "location": "Cambridge"
        },
        {
            "name": "Robin",
            "location": "Cambridge"
        }
    ]
}
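The mapping from CSV rows to scenarios is one dictionary per data row, keyed by the header. The standard library's `csv.DictReader` produces the same shape, as the sketch below shows with an in-memory copy of the file contents. This is an illustration only, not how from_csv() is implemented; note that `DictReader` returns every value as a string.

```python
import csv
import io

csv_text = "name,location\nApostolos,New York\nJohn,Cambridge\nRobin,Cambridge\n"

# Each data row becomes a dictionary keyed by the header row.
csv_rows = [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]
print(csv_rows)
```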

Creating scenarios from a dictionary

We can call the from_dict() method to create a Scenario or ScenarioList from a dictionary. Note that a dictionary used to create a ScenarioList must contain a key “scenarios”:

[35]:

ep_founders = {
    "name": ["Apostolos", "John", "Robin"],
    "location": ["New York", "Cambridge", "Cambridge"]
}

ep_founders_scenario = Scenario().from_dict(ep_founders)
ep_founders_scenario

[35]:

{
    "name": [
        "Apostolos",
        "John",
        "Robin"
    ],
    "location": [
        "New York",
        "Cambridge",
        "Cambridge"
    ]
}

[36]:

ep_founders_list = {
    "scenarios": [
        {
            "name":"Apostolos"
        },
        {
            "name":"John"
        },
        {
            "name":"Robin"
        }
    ]
}

ep_founders_scenariolist = ScenarioList().from_dict(ep_founders_list)
ep_founders_scenariolist

[36]:

{
    "scenarios": [
        {
            "name": "Apostolos"
        },
        {
            "name": "John"
        },
        {
            "name": "Robin"
        }
    ]
}

Turning scenarios into a dictionary

We can call the to_dict() method to turn a Scenario or ScenarioList into a dictionary:

[37]:

scenariolist = ScenarioList(
    [
        Scenario({"name": "Apostolos"}),
        Scenario({"name": "John"}),
        Scenario({"name": "Robin"}),
    ]
)

dict_scenariolist = scenariolist.to_dict()
dict_scenariolist

[37]:

{'scenarios': [{'name': 'Apostolos',
   'edsl_version': '0.1.31.dev4',
   'edsl_class_name': 'Scenario'},
  {'name': 'John',
   'edsl_version': '0.1.31.dev4',
   'edsl_class_name': 'Scenario'},
  {'name': 'Robin',
   'edsl_version': '0.1.31.dev4',
   'edsl_class_name': 'Scenario'}],
 'edsl_version': '0.1.31.dev4',
 'edsl_class_name': 'ScenarioList'}

Creating scenarios for webpages

We can call the from_url() method to create a scenario for a webpage. Note that this automatically creates two keys, url and text, which can be used in questions (and modified as desired):

[38]:

scenario = Scenario.from_url("https://www.expectedparrot.com/about")

scenario.keys()

[38]:

['url', 'text']

Here we use the scenario to run some questions extracting information from the webpage:

[39]:

from edsl import QuestionList, QuestionFreeText, Survey

q_team = QuestionList(
    question_name = "team",
    question_text = "Who is on the Expected Parrot team? {{ text }}"
)

q_purpose = QuestionFreeText(
    question_name = "purpose",
    question_text = "What is the purpose of Expected Parrot? {{ text }}"
)

survey = Survey([q_team, q_purpose])

results = survey.by(scenario).run()

[40]:

results.select("url", "team", "purpose").print(format="rich")

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ scenario                            ┃ answer                              ┃ answer                              ┃
┃ .url                                ┃ .team                               ┃ .purpose                            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ https://www.expectedparrot.com/abo… │ ['Robin Horton', 'John Horton',     │ Expected Parrot is a company        │
│                                     │ 'Apostolos Filippas']               │ focused on creating research-backed │
│                                     │                                     │ tools to advance computational      │
│                                     │                                     │ social science. They leverage AI to │
│                                     │                                     │ simulate surveys, experiments, and  │
│                                     │                                     │ other empirical studies, which can  │
│                                     │                                     │ be applied in social sciences,      │
│                                     │                                     │ businesses, and other               │
│                                     │                                     │ organizations. Their team is based  │
│                                     │                                     │ in Cambridge, Massachusetts, and    │
│                                     │                                     │ they have the backing of Bloomberg  │
│                                     │                                     │ Beta. They offer opportunities for  │
│                                     │                                     │ collaboration and are open to       │
│                                     │                                     │ inquiries through their provided    │
│                                     │                                     │ contact information.                │
└─────────────────────────────────────┴─────────────────────────────────────┴─────────────────────────────────────┘

Creating scenarios for PDF pages

We can call the from_pdf() method to turn the pages of a PDF or doc into a Scenario or ScenarioList. Here we use it for John’s paper “Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?” (link to paper). Note that the keys filename, page and text are automatically specified, so the question_text placeholder that we use for the scenarios must be {{ text }}:

[41]:

homo_silicus_scenariolist = ScenarioList.from_pdf("homo_silicus.pdf")

Here we inspect the first two pages:

[42]:

homo_silicus_scenariolist["scenarios"][0:2]

[42]:

[{'filename': 'homo_silicus.pdf',
  'page': 1,
  'text': 'Large Language Models as Simulated Economic Agents:\nWhat Can We Learn from Homo Silicus?∗\nJohn J. Horton\nMIT & NBER\nJanuary 19, 2023\nAbstract\nNewly-developed large language models (LLM)—because of how they are trained and\ndesigned—are implicit computational models of humans—a homo silicus. LLMs can be\nused like economists use homo economicus: they can be given endowments, information,\npreferences, and so on, and then their behavior can be explored in scenarios via simulation.\nExperiments using this approach, derived from Charness and Rabin (2002), Kahneman,\nKnetsch and Thaler (1986), and Samuelson and Zeckhauser (1988) show qualitatively\nsimilar results to the original, but it is also easy to try variations for fresh insights. LLMs\ncould allow researchers to pilot studies via simulation ﬁrst, searching for novel social sci-\nence insights to test in the real world.\n∗Thanks to the MIT Center for Collective Intelligence for generous oﬀer of funding, though all the ex-\nperiments here cost only about $50 to run. Thanks to Daniel Rock, Elliot Lipnowski, Hong-Yi TuYe, Daron\nAcemoglu, Shakked Noy, Jimbo Brand, David Autor, and Mohammed Alsobay for their helpful conversations\nand comments. Special thanks to Yo Shavit, who has been extremely generous with his time and thinking.\nThanks to GPT-3 for all this work and helping me describe the technology. Author contact information, code,\nand data are currently or will be available at http://www.john-joseph-horton.com/.\n1\narXiv:2301.07543v1  [econ.GN]  18 Jan 2023\n',
  'edsl_version': '0.1.31.dev4',
  'edsl_class_name': 'Scenario'},
 {'filename': 'homo_silicus.pdf',
  'page': 2,
  'text': '1\nIntroduction\nMost economic research takes one of two forms: (a) “What would homo economicus do?” and\nb) “What did homo sapiens actually do?” The (a)-type research takes a maintained model\nof humans, homo economicus, and subjects it to various economic scenarios, endowed with\ndiﬀerent resources, preferences, information, etc., and then deducing behavior; this behavior\ncan then be compared to the behavior of actual humans in (b)-type research.\nIn this paper, I argue that newly developed large language models (LLM)—because of\nhow they are trained and designed—can be thought of as implicit computational models of\nhumans—a homo silicus.\nThese models can be used the same way economists use homo\neconomicus: they can be given endowments, put in scenarios, and then their behavior can\nbe explored—though in the case of homo silicus, through computational simulation, not a\nmathematical deduction.1 This is possible because LLMs can now respond realistically to a\nwide range of textual inputs, giving responses similar to what we might expect from a human.\nIt is essential to note that this is a new possibility—that LLMs of slightly older vintage are\nunsuited for these tasks, as I will show.\nI consider the reasons the reasons why AI experiments might be helpful in understand-\ning actual humans. The core of the argument is that LLMs—by nature of their training\nand design—are (1) computational models of humans and (2) likely possess a great deal of\nlatent social information. For (1), the creators of LLMs have designed them to respond in\nways similar to how a human would react to prompts—including prompts that are economic\nscenarios. The design imperative to be “realistic”’ is why they can be thought of as com-\nputational models of humans. 
For (2), these models likely capture latent social information\nsuch as economic laws, decision-making heuristics, and common social preferences because\nthe LLMs are trained on a corpus that contains a great deal of written text where people\nreason about and discuss economic matters: What to buy, how to bargain, how to shop, how\nto negotiate a job oﬀer, how to make a job oﬀer, how many hours to work, what to do when\nprices increase, and so on.\nLike all models, any particular homo silicus is wrong, but that judgment is separate from\na decision about usefulness. To be clear, each homo silicus is a ﬂawed model and can often\ngive responses far away from what is rational or even sensical. But ultimately, what will\nmatter in practice is whether these AI experiments are practically valuable for generating\ninsights. As such, the majority of the paper focuses on GPT-3 experiments.\nEach experiment is motivated by a classic experiment in the behavioral economics lit-\nerature.\nI use Charness and Rabin (2002), Kahneman et al. (1986), and Samuelson and\n1Lucas (1980) writes, “One of the functions of theoretical economics is to provide fully articulated, artiﬁcial\neconomic systems that can serve as laboratories in which policies that would be prohibitively expensive to\nexperiment with in actual economies can be tested out at much lower cost.”\n2\n',
  'edsl_version': '0.1.31.dev4',
  'edsl_class_name': 'Scenario'}]
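
To make the role of the {{ text }} placeholder concrete, here is a minimal plain-Python sketch of filling a question template from page-level dictionaries like the ones above. The render() helper and the shortened page texts are hypothetical and only for illustration; this is not how EDSL renders prompts internally.

```python
import re

# Hypothetical stand-ins for two page-level scenarios (texts shortened for illustration).
pages = [
    {"filename": "homo_silicus.pdf", "page": 1, "text": "Large Language Models as Simulated Economic Agents"},
    {"filename": "homo_silicus.pdf", "page": 2, "text": "1 Introduction"},
]


def render(template: str, scenario: dict) -> str:
    """Replace each {{ key }} placeholder with the matching scenario value."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        return str(scenario[key])  # raises KeyError if the scenario lacks this key

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


question_text = "Summarize this page: {{ text }}"

# One prompt per page: the same question text, a different value for `text`.
prompts = [render(question_text, page) for page in pages]
for prompt in prompts:
    print(prompt)
```

A placeholder that names a key missing from the scenario (e.g., {{ body }}) raises a KeyError in this sketch, which is why the placeholder name has to match the automatically generated key.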

Example usage. Note that we can sort results by any component, filter results using conditional expressions, and limit how many rows are displayed:

[43]:

q = QuestionFreeText(
    question_name="summarize", question_text="Summarize this page: {{ text }}"
)
results = q.by(homo_silicus_scenariolist).run()

[44]:

(
    results.sort_by("page")
    .filter("page > 1")
    .select("page", "summarize")
    .print(format="rich", max_rows=3)
)

┏━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ scenario ┃ answer                                                                                               ┃
┃ .page    ┃ .summarize                                                                                           ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 2        │ This page introduces the concept of large language models (LLMs) as computational models of human    │
│          │ behavior, termed as 'homo silicus.' It discusses how these models can be used in economic research   │
│          │ to simulate human responses in various scenarios, similar to the theoretical 'homo economicus' used  │
│          │ in traditional economic models. The LLMs, trained on vast corpora of human language, can provide     │
│          │ insights into human economic behavior by capturing latent social information and economic reasoning. │
│          │ While acknowledging that LLMs are imperfect models, the paper emphasizes their potential usefulness  │
│          │ in generating valuable insights and focuses on experiments with GPT-3 that are motivated by classic  │
│          │ behavioral economics studies.                                                                        │
├──────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ 3        │ The page summarizes a series of experiments designed to understand how AI, particularly the GPT-3    │
│          │ model, responds to different economic scenarios and social preferences. The experiments include a    │
│          │ unilateral dictator game demonstrating that AI behavior changes based on whether it is endowed with  │
│          │ preferences for equity, efficiency, or self-interest. The AI tends to choose efficient outcomes by   │
│          │ default, but only the advanced GPT-3 text-davinci-003 model can adapt its choices based on the given │
│          │ preference. The paper also explores responses to price gouging, showing that AI views on fairness    │
│          │ are influenced by the amount of price increase and political leanings, but not so much by framing.   │
│          │ The status quo bias, where people prefer options presented as the current state, is replicated with  │
│          │ AI when endowed with baseline views on car or highway safety. Lastly, a hiring scenario shows that   │
│          │ imposing a minimum wage leads to AI hiring more experienced workers due to wage shifts, mirroring    │
│          │ human employer behavior. These experiments aim to compare AI behavior with human responses in        │
│          │ economic decision-making.                                                                            │
├──────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ 4        │ The page discusses the value of using large language models (LLMs) like GPT-3 for running economic   │
│          │ experiments. It suggests that LLMs can simulate human-like responses in experiments quickly and      │
│          │ cheaply, providing insights and guiding empirical work. This approach can explore parameter spaces,  │
│          │ test sensitivity to question wording, and predict actual data patterns. The paper compares this to   │
│          │ economists building 'toy models' to help think through problems. It contrasts with related work by   │
│          │ showing LLMs' use in economic theories and the role of foundational assumptions like rationality.    │
│          │ The page also provides a non-technical background on LLMs, arguing that a deep technical             │
│          │ understanding is not necessary for their use in economics, similar to how economists don't need to   │
│          │ study brain neurons. The page posits LLMs as a tool to indirectly study human behavior, rather than  │
│          │ studying LLMs themselves or using them in productive processes.                                      │
└──────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────┘
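
For readers who want to see what the chained calls above do, here is a rough plain-Python equivalent over a list of dictionaries. The rows are hypothetical stand-ins, and the sketch only mirrors the order of operations (sort, filter, select, limit); it does not reproduce the Results class.

```python
# Hypothetical stand-in rows; real results hold many more fields per row.
rows = [
    {"page": 3, "summarize": "summary of page 3"},
    {"page": 1, "summarize": "summary of page 1"},
    {"page": 4, "summarize": "summary of page 4"},
    {"page": 2, "summarize": "summary of page 2"},
    {"page": 5, "summarize": "summary of page 5"},
]

# Like sort_by("page"): order rows by page number.
ordered = sorted(rows, key=lambda row: row["page"])

# Like filter("page > 1"): keep only rows that satisfy the condition.
kept = [row for row in ordered if row["page"] > 1]

# Like select("page", "summarize"): keep only the named columns.
selected = [(row["page"], row["summarize"]) for row in kept]

# Like max_rows=3: display only the first three rows.
shown = selected[:3]

for page, summary in shown:
    print(page, summary)
```

As in the table above, the first three rows that remain after sorting and filtering are pages 2, 3 and 4.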

Creating scenarios for images

We can call the from_image() method to create a scenario for an image. Here we use it for Figure 1 in the Homo Silicus paper.

Note that this method must be used with a vision model (e.g., GPT-4o) and does not require the use of a {{ placeholder }} in the question text. The scenario keys file_path and encoded_image are generated automatically:

[45]:

from edsl import Model

model = Model("gpt-4o")

[46]:

image_scenario = Scenario.from_image("homo_silicus_figure1.png")

[47]:

image_scenario.keys()

[47]:

['file_path', 'encoded_image']
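
The notebook does not show what encoded_image contains. As a sketch only, assuming it holds a base64 text encoding of the file's bytes (a common way to pass images to vision models), a dictionary with the same two keys could be built with the standard library as follows. The image_to_dict() helper is hypothetical, and the temporary file holds placeholder bytes rather than a real image.

```python
import base64
import os
import tempfile


def image_to_dict(path: str) -> dict:
    """Read a file and return its path plus a base64 text encoding of its bytes."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return {"file_path": path, "encoded_image": encoded}


# Write placeholder bytes to a temporary file so the sketch runs without a real image.
sample_bytes = b"\x89PNG\r\n\x1a\n placeholder bytes, not a valid image"
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
    tmp.write(sample_bytes)
    tmp_path = tmp.name

image_dict = image_to_dict(tmp_path)
print(list(image_dict.keys()))

# Clean up the temporary file.
os.remove(tmp_path)
```

Because the image travels as its own field rather than as text inserted into the question, a setup like this is consistent with the question text needing no placeholder.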

Example usage:

[48]:

q = QuestionFreeText(
    question_name="figure",
    question_text="Explain the graphic on this page.",  # no scenario placeholder
)

results = q.by(image_scenario).by(model).run()
results.select("figure").print(format="rich")

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ answer                                                                                                          ┃
┃ .figure                                                                                                         ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ The graphic illustrates the choices made by different models (Charness & Rabin Population, GPT3 with various    │
│ endowments, Advanced GPT3, Human Brain, and Prior GPT3) in a series of simple tests. Each row represents a      │
│ different test scenario (Berk29, Berk26, etc.), and the columns show the fraction of subjects (either human or  │
│ AI models) choosing each option (Left or Right). The numbers in the boxes represent the proportion of subjects  │
│ choosing that option. The graphic compares how different models and endowments influence decision-making,       │
│ highlighting variations in choices across different scenarios and model types.                                  │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Generating code for scenarios

We can call the code() method to generate the code for producing scenarios (it can also be copied directly at the Coop):

[49]:

scenariolist = ScenarioList.example()

scenariolist_code = scenariolist.code()
scenariolist_code

[49]:

['from edsl.scenarios.Scenario import Scenario\nfrom edsl.scenarios.ScenarioList import ScenarioList',
 "scenario_0 = Scenario({'persona': 'A reseacher studying whether LLMs can be used to generate surveys.'})",
 "scenario_1 = Scenario({'persona': 'A reseacher studying whether LLMs can be used to generate surveys.'})",
 'scenarios = ScenarioList([scenario_0, scenario_1])']

[50]:

from edsl.scenarios.Scenario import Scenario
from edsl.scenarios.ScenarioList import ScenarioList

scenario_0 = Scenario(
    {"persona": "A reseacher studying whether LLMs can be used to generate surveys."}
)
scenario_1 = Scenario(
    {"persona": "A reseacher studying whether LLMs can be used to generate surveys."}
)
scenarios = ScenarioList([scenario_0, scenario_1])
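
Since code() returns a list of strings rather than a single script, one way to work with the output is to join the strings with newlines. The sketch below copies the list shown in the output above and uses the standard library's ast module to check that the joined text is valid Python without importing EDSL or running anything. The variable names code_lines and script are illustrative.

```python
import ast

# The list of strings from the code() output above, copied verbatim.
code_lines = [
    "from edsl.scenarios.Scenario import Scenario\nfrom edsl.scenarios.ScenarioList import ScenarioList",
    "scenario_0 = Scenario({'persona': 'A reseacher studying whether LLMs can be used to generate surveys.'})",
    "scenario_1 = Scenario({'persona': 'A reseacher studying whether LLMs can be used to generate surveys.'})",
    "scenarios = ScenarioList([scenario_0, scenario_1])",
]

# Join the pieces into one script, one statement group per line.
script = "\n".join(code_lines)

# ast.parse only checks syntax; it does not import edsl or execute the code.
tree = ast.parse(script)

# Collect the names that the script would assign when run.
assigned_names = [
    node.targets[0].id for node in tree.body if isinstance(node, ast.Assign)
]
print(assigned_names)
```

The joined script can then be saved to a .py file or pasted into a new cell, which is what the next cell does by hand.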

Converting a `ScenarioList` into an `AgentList`

We can call the to_agent_list() method to convert a ScenarioList into an AgentList. Note that agent traits cannot include a “name” key as agent_name is a separate optional field of Agent objects:

[51]:

from edsl import AgentList

scenariolist = ScenarioList(
    [
        Scenario({"first_name": "Apostolos", "location": "New York"}),
        Scenario({"first_name": "John", "location": "Cambridge"}),
        Scenario({"first_name": "Robin", "location": "Cambridge"}),
    ]
)

agentlist = scenariolist.to_agent_list()
agentlist

[51]:

[
    {
        "traits": {
            "first_name": "Apostolos",
            "location": "New York"
        },
        "edsl_version": "0.1.31.dev4",
        "edsl_class_name": "Agent"
    },
    {
        "traits": {
            "first_name": "John",
            "location": "Cambridge"
        },
        "edsl_version": "0.1.31.dev4",
        "edsl_class_name": "Agent"
    },
    {
        "traits": {
            "first_name": "Robin",
            "location": "Cambridge"
        },
        "edsl_version": "0.1.31.dev4",
        "edsl_class_name": "Agent"
    }
]
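
If your data does use a "name" key, one option is to rename it before building the scenarios. Here is a minimal plain-Python sketch on ordinary dictionaries; the rename_key() helper and the records are hypothetical, and the notebook's "Renaming scenarios" section covers doing this on scenarios directly.

```python
# Hypothetical records that use a "name" key, which agent traits cannot include.
records = [
    {"name": "Apostolos", "location": "New York"},
    {"name": "John", "location": "Cambridge"},
]


def rename_key(record: dict, old: str, new: str) -> dict:
    """Return a copy of `record` with key `old` renamed to `new`, keeping key order."""
    if old in record and new in record:
        raise ValueError(f"Cannot rename {old!r}: {new!r} already exists")
    return {(new if key == old else key): value for key, value in record.items()}


# Rename "name" to "first_name" in every record, leaving the originals untouched.
safe_records = [rename_key(record, "name", "first_name") for record in records]
print(safe_records)
```

The renamed dictionaries can then be wrapped in Scenario objects and converted as in the cell above.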

Note that scenarios work much like the traits dictionaries that we pass to AI agents, which we can then use to answer survey questions. Learn more about designing AI agents for simulating surveys and experiments.

Posting content to the Coop

Here we post this notebook to the Coop:

[52]:

from edsl import Notebook

notebook = Notebook(path="cheatsheet_scenarios.ipynb")

notebook.push(description="Cheatsheet: Scenarios", visibility="public")

[52]:

{'description': 'Cheatsheet: Scenarios',
 'object_type': 'notebook',
 'url': 'https://www.expectedparrot.com/content/a300c6b6-4972-447f-aa9c-4413f45419ec',
 'uuid': 'a300c6b6-4972-447f-aa9c-4413f45419ec',
 'version': '0.1.31.dev4',
 'visibility': 'public'}

To update an object at the Coop (e.g., an updated copy of this notebook):

[54]:

updated_notebook = Notebook(path="cheatsheet_scenarios.ipynb")

updated_notebook.patch(uuid="a300c6b6-4972-447f-aa9c-4413f45419ec", value=updated_notebook)

[54]:

{'status': 'success'}
