Skip logic & scenarios
This notebook provides example EDSL code for using a language model to simulate a survey that uses skip logic: rules for determining which questions are administered based on responses to other questions in the survey.
In the first example below we construct a survey of questions and then add a rule to skip one question based on the response to another question.
In the second example we add some complexity. We first create different “scenarios” (versions) of questions and combine them in a survey. Then we add multiple rules to skip specific versions of the questions based on responses to a particular version of a question.
EDSL is an open-source library for simulating surveys, experiments and other research with AI agents and large language models. Before running the code below, please ensure that you have installed the EDSL library and either activated remote inference from your Coop account or stored API keys for the language models that you want to use with EDSL. Please also see our documentation page for tips and tutorials on getting started using EDSL.
Example 1
In the first example below we construct questions, combine them in a survey, and add a rule to skip the second question based on the response to the first question. Then we create Scenario objects for content that will be added to the questions when the survey is run. As a result, the skip rule is applied separately for each scenario: the second question is skipped whenever the first question is answered “No” for that scenario.
We start by constructing questions:
[1]:
from edsl import QuestionYesNo, QuestionNumerical, QuestionMultipleChoice
q1 = QuestionYesNo(
    question_name = "recent_purchase",
    question_text = "In the last year have you or anyone in your household purchased any {{ item }}?",
)
q2 = QuestionNumerical(
    question_name = "amount",
    question_text = "In the last year, how much did your household spend on {{ item }} (in USD)?"
)
q3 = QuestionMultipleChoice(
    question_name = "next_purchase",
    question_text = "When do you next expect to purchase {{ item }}?",
    question_options = [
        "Never",
        "Within the next month",
        "Within the next year",
        "I do not know"
    ]
)
We combine the questions in a survey to administer them together:
[2]:
from edsl import Survey
survey = Survey(questions = [q1, q2, q3])
survey
[2]:
Survey # questions: 3; question_name list: ['recent_purchase', 'amount', 'next_purchase'];
question_options | question_text | question_name | question_type | |
---|---|---|---|---|
0 | ['No', 'Yes'] | In the last year have you or anyone in your household purchased any {{ item }}? | recent_purchase | yes_no |
1 | nan | In the last year, how much did your household spend on {{ item }} (in USD)? | amount | numerical |
2 | ['Never', 'Within the next month', 'Within the next year', 'I do not know'] | When do you next expect to purchase {{ item }}? | next_purchase | multiple_choice |
Here we add a rule to skip q2 based on the response to q1:
[3]:
survey = survey.add_skip_rule(q2, "recent_purchase == 'No'")
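To make the effect of this rule concrete, here is a minimal plain-Python sketch of skip logic. It is an illustration only and not how EDSL implements it: the `administer` helper, the predicate-style rule, and the canned answers are all hypothetical stand-ins, and we assume a skipped question is recorded as `None`.

```python
# Illustrative stand-in for skip logic; this is NOT EDSL's implementation.
def administer(question_names, skip_rules, answer_fn):
    """Walk the questions in order; a skipped question is recorded as None."""
    answers = {}
    for name in question_names:
        rule = skip_rules.get(name)
        # A rule is a predicate over the answers collected so far.
        if rule is not None and rule(answers):
            answers[name] = None
        else:
            answers[name] = answer_fn(name)
    return answers


questions = ["recent_purchase", "amount", "next_purchase"]

# Mirrors the rule above: skip "amount" when recent_purchase == 'No'.
rules = {"amount": lambda a: a.get("recent_purchase") == "No"}

# Hypothetical canned answers standing in for a model's responses.
canned_no = {"recent_purchase": "No", "amount": 0, "next_purchase": "Never"}
canned_yes = {"recent_purchase": "Yes", "amount": 500, "next_purchase": "Never"}

skipped = administer(questions, rules, canned_no.get)
asked = administer(questions, rules, canned_yes.get)

print(skipped)
print(asked)
```

In the first run the “amount” question is never asked, so its answer stays empty; in the second run all three questions are answered.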
Next we create scenarios for the “item” to be used with each question:
[4]:
from edsl import Scenario, ScenarioList
s = ScenarioList(
    Scenario({"item":item}) for item in ["electronics", "phones"]
)
Note that we could also construct the same scenario list with the from_list method, which is equivalent:
[5]:
s = ScenarioList.from_list("item", ["electronics", "phones"])
s
[5]:
ScenarioList scenarios: 2; keys: ['item'];
item | |
---|---|
0 | electronics |
1 | phones |
Next we create some agent personas to answer the questions:
[6]:
from edsl import Agent, AgentList
income_levels = ["under $100,000", "$100,000-250,000", "above $250,000"]
ages = [30, 50, 70]
a = AgentList(
    Agent({"annual_income":income, "age":age}) for income in income_levels for age in ages
)
a
[6]:
AgentList agents: 9;
annual_income | age | |
---|---|---|
0 | under $100,000 | 30 |
1 | under $100,000 | 50 |
2 | under $100,000 | 70 |
3 | $100,000-250,000 | 30 |
4 | $100,000-250,000 | 50 |
5 | $100,000-250,000 | 70 |
6 | above $250,000 | 30 |
7 | above $250,000 | 50 |
8 | above $250,000 | 70 |
Next we select a model to generate the responses (check available models and pricing):
[7]:
from edsl import Model
m = Model("gemini-1.5-flash")
We can inspect (or modify) the default parameters of the model that will be used:
[8]:
m
[8]:
key | value | |
---|---|---|
0 | model | gemini-1.5-flash |
1 | parameters:temperature | 0.500000 |
2 | parameters:topP | 1 |
3 | parameters:topK | 1 |
4 | parameters:maxOutputTokens | 2048 |
5 | parameters:stopSequences | [] |
6 | inference_service |
We run the survey by adding any scenarios, agents and models and then calling the run method:
[9]:
results = survey.by(s).by(a).by(m).run()
Job UUID | ff33201a-883b-43f8-9ef1-be05f5d07f24 |
Progress Bar URL | https://www.expectedparrot.com/home/remote-job-progress/ff33201a-883b-43f8-9ef1-be05f5d07f24 |
Exceptions Report URL | None |
Results UUID | ec83bd1b-6790-4a08-870c-8c6a2f135aa2 |
Results URL | https://www.expectedparrot.com/content/ec83bd1b-6790-4a08-870c-8c6a2f135aa2 |
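As a rough check on the size of this job, we can count the combinations ourselves. The sketch below uses plain Python rather than EDSL objects and assumes that each combination of scenario, agent and model produces one result, which matches the 18 rows in the results table below.

```python
from itertools import product

# The same values used to build the scenarios, agents and model above.
items = ["electronics", "phones"]
income_levels = ["under $100,000", "$100,000-250,000", "above $250,000"]
ages = [30, 50, 70]
models = ["gemini-1.5-flash"]

# One agent per (income, age) pair, as in the AgentList comprehension.
agents = [
    {"annual_income": income, "age": age}
    for income in income_levels
    for age in ages
]

# Assumption: one result per (scenario, agent, model) combination.
combinations = list(product(items, agents, models))

print(len(agents))        # 9
print(len(combinations))  # 18
```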
We can inspect a list of the columns of the dataset of results that has been generated:
[10]:
results.columns
[10]:
0 | |
---|---|
0 | agent.age |
1 | agent.agent_index |
2 | agent.agent_instruction |
3 | agent.agent_name |
4 | agent.annual_income |
5 | answer.amount |
6 | answer.next_purchase |
7 | answer.recent_purchase |
8 | cache_keys.amount_cache_key |
9 | cache_keys.next_purchase_cache_key |
10 | cache_keys.recent_purchase_cache_key |
11 | cache_used.amount_cache_used |
12 | cache_used.next_purchase_cache_used |
13 | cache_used.recent_purchase_cache_used |
14 | comment.amount_comment |
15 | comment.next_purchase_comment |
16 | comment.recent_purchase_comment |
17 | generated_tokens.amount_generated_tokens |
18 | generated_tokens.next_purchase_generated_tokens |
19 | generated_tokens.recent_purchase_generated_tokens |
20 | iteration.iteration |
21 | model.inference_service |
22 | model.maxOutputTokens |
23 | model.model |
24 | model.model_index |
25 | model.stopSequences |
26 | model.temperature |
27 | model.topK |
28 | model.topP |
29 | prompt.amount_system_prompt |
30 | prompt.amount_user_prompt |
31 | prompt.next_purchase_system_prompt |
32 | prompt.next_purchase_user_prompt |
33 | prompt.recent_purchase_system_prompt |
34 | prompt.recent_purchase_user_prompt |
35 | question_options.amount_question_options |
36 | question_options.next_purchase_question_options |
37 | question_options.recent_purchase_question_options |
38 | question_text.amount_question_text |
39 | question_text.next_purchase_question_text |
40 | question_text.recent_purchase_question_text |
41 | question_type.amount_question_type |
42 | question_type.next_purchase_question_type |
43 | question_type.recent_purchase_question_type |
44 | raw_model_response.amount_cost |
45 | raw_model_response.amount_one_usd_buys |
46 | raw_model_response.amount_raw_model_response |
47 | raw_model_response.next_purchase_cost |
48 | raw_model_response.next_purchase_one_usd_buys |
49 | raw_model_response.next_purchase_raw_model_response |
50 | raw_model_response.recent_purchase_cost |
51 | raw_model_response.recent_purchase_one_usd_buys |
52 | raw_model_response.recent_purchase_raw_model_response |
53 | scenario.item |
54 | scenario.scenario_index |
We can select and inspect any components of the results. A skipped question has no answer (“None”), which is displayed as nan in the table below:
[11]:
(
    results
    .sort_by("annual_income", "age", "item")
    .select("model", "annual_income", "age", "item", "recent_purchase", "amount", "next_purchase")
)
[11]:
model.model | agent.annual_income | agent.age | scenario.item | answer.recent_purchase | answer.amount | answer.next_purchase | |
---|---|---|---|---|---|---|---|
0 | gemini-1.5-flash | $100,000-250,000 | 30 | electronics | Yes | 2500.000000 | Within the next year |
1 | gemini-1.5-flash | $100,000-250,000 | 30 | phones | No | nan | Within the next year |
2 | gemini-1.5-flash | $100,000-250,000 | 50 | electronics | Yes | 2500.000000 | Within the next year |
3 | gemini-1.5-flash | $100,000-250,000 | 50 | phones | Yes | 1200.000000 | Within the next year |
4 | gemini-1.5-flash | $100,000-250,000 | 70 | electronics | Yes | 500.000000 | Within the next year |
5 | gemini-1.5-flash | $100,000-250,000 | 70 | phones | No | nan | Within the next year |
6 | gemini-1.5-flash | above $250,000 | 30 | electronics | Yes | 5000.000000 | Within the next year |
7 | gemini-1.5-flash | above $250,000 | 30 | phones | Yes | 0.000000 | Within the next year |
8 | gemini-1.5-flash | above $250,000 | 50 | electronics | Yes | 5000.000000 | Within the next year |
9 | gemini-1.5-flash | above $250,000 | 50 | phones | Yes | 2500.000000 | Within the next year |
10 | gemini-1.5-flash | above $250,000 | 70 | electronics | Yes | 5000.000000 | Within the next year |
11 | gemini-1.5-flash | above $250,000 | 70 | phones | No | nan | Never |
12 | gemini-1.5-flash | under $100,000 | 30 | electronics | Yes | 200.000000 | Within the next year |
13 | gemini-1.5-flash | under $100,000 | 30 | phones | No | nan | Within the next year |
14 | gemini-1.5-flash | under $100,000 | 50 | electronics | No | nan | Within the next year |
15 | gemini-1.5-flash | under $100,000 | 50 | phones | No | nan | Within the next year |
16 | gemini-1.5-flash | under $100,000 | 70 | electronics | No | nan | Within the next year |
17 | gemini-1.5-flash | under $100,000 | 70 | phones | No | nan | Never |
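One way to confirm that the rule behaved as intended is to check that “amount” is missing exactly when “recent_purchase” is “No”. The sketch below does this in plain Python on four rows transcribed by hand from the table above (rows 0, 1, 7 and 14), with `None` standing in for the missing values; the dictionary layout is our own and not an EDSL data structure.

```python
# Four rows copied from the results table above; None marks a missing answer.
rows = [
    {"item": "electronics", "recent_purchase": "Yes", "amount": 2500.0},  # row 0
    {"item": "phones", "recent_purchase": "No", "amount": None},          # row 1
    {"item": "phones", "recent_purchase": "Yes", "amount": 0.0},          # row 7
    {"item": "electronics", "recent_purchase": "No", "amount": None},     # row 14
]

# A row violates the rule if "No" and "missing amount" do not coincide.
violations = [
    row for row in rows
    if (row["recent_purchase"] == "No") != (row["amount"] is None)
]
skipped_count = sum(row["amount"] is None for row in rows)

print(violations)     # []
print(skipped_count)  # 2
```

Note that an amount of 0 (row 7) is an answered question, not a skipped one, so the check compares against `None` rather than testing for a falsy value.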
Example 2
In the next example, we use the same scenarios to create versions of the questions before we combine them in a survey. This allows us to add a skip rule based on a question/scenario combination, as opposed to skipping a question for all scenarios:
[12]:
q1 = QuestionYesNo(
    question_name = "recent_purchase_{{ item }}",
    question_text = "In the last year have you or anyone in your household purchased any {{ item }}?",
)
q2 = QuestionNumerical(
    question_name = "amount_{{ item }}",
    question_text = "In the last year, how much did your household spend on {{ item }} (in USD)?"
)
q3 = QuestionMultipleChoice(
    question_name = "next_purchase_{{ item }}",
    question_text = "When do you next expect to purchase {{ item }}?",
    question_options = [
        "Never",
        "Within the next month",
        "Within the next year",
        "I do not know"
    ]
)
The loop method creates a new version of a question for each scenario, with the scenario values already inserted into both the question name and the question text:
[13]:
questions = q1.loop(s) + q2.loop(s) + q3.loop(s)
questions
[13]:
[Question('yes_no', question_name = """recent_purchase_electronics""", question_text = """In the last year have you or anyone in your household purchased any electronics?""", question_options = ['No', 'Yes']),
Question('yes_no', question_name = """recent_purchase_phones""", question_text = """In the last year have you or anyone in your household purchased any phones?""", question_options = ['No', 'Yes']),
Question('numerical', question_name = """amount_electronics""", question_text = """In the last year, how much did your household spend on electronics (in USD)?""", min_value = None, max_value = None),
Question('numerical', question_name = """amount_phones""", question_text = """In the last year, how much did your household spend on phones (in USD)?""", min_value = None, max_value = None),
Question('multiple_choice', question_name = """next_purchase_electronics""", question_text = """When do you next expect to purchase electronics?""", question_options = ['Never', 'Within the next month', 'Within the next year', 'I do not know']),
Question('multiple_choice', question_name = """next_purchase_phones""", question_text = """When do you next expect to purchase phones?""", question_options = ['Never', 'Within the next month', 'Within the next year', 'I do not know'])]
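The expansion shown in this output can be approximated in a few lines of plain Python. This is only a sketch of the idea, not EDSL's code: the `render` and `loop` functions here are hypothetical helpers that operate on dictionaries, and the regular expression is a simple stand-in for real template rendering.

```python
import re


def render(template, scenario):
    """Fill each {{ key }} placeholder in a string with the scenario's value."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda match: str(scenario[match.group(1)]),
        template,
    )


def loop(question, scenarios):
    """Return one rendered copy of the question per scenario."""
    copies = []
    for scenario in scenarios:
        copy = {
            field: render(value, scenario) if isinstance(value, str) else value
            for field, value in question.items()
        }
        copies.append(copy)
    return copies


q1_template = {
    "question_name": "recent_purchase_{{ item }}",
    "question_text": "In the last year have you or anyone in your household purchased any {{ item }}?",
}
scenarios = [{"item": "electronics"}, {"item": "phones"}]

versions = loop(q1_template, scenarios)
names = [version["question_name"] for version in versions]
print(names)
```

Because the scenario value ends up in the question name, each version can be referenced individually in a skip rule.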
As before, we combine the questions in a survey to administer them together:
[14]:
survey = Survey(questions)
Here we add rules specifying that all of the questions for one scenario (phones) should be skipped when the answer to the first question for the other scenario (electronics) is “No”:
[15]:
survey = (
    survey
    .add_skip_rule("recent_purchase_phones", "recent_purchase_electronics == 'No'")
    .add_skip_rule("amount_phones", "recent_purchase_electronics == 'No'")
    .add_skip_rule("next_purchase_phones", "recent_purchase_electronics == 'No'")
)
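Since the three rules share the same condition and differ only in the target question, they could also be generated in a loop. The sketch below shows the pattern with a hypothetical `RuleRecorder` class that only records the rules it receives, so that it runs without EDSL; with a real survey we would call `add_skip_rule` on the survey object in the same way, as in the cell above.

```python
class RuleRecorder:
    """Hypothetical stand-in for a survey: it only records the rules it is given."""

    def __init__(self, rules=()):
        self.rules = list(rules)

    def add_skip_rule(self, question_name, expression):
        # Return a new object so that calls can be chained or reassigned,
        # mirroring the style used with the survey above.
        return RuleRecorder(self.rules + [(question_name, expression)])


condition = "recent_purchase_electronics == 'No'"
survey_stub = RuleRecorder()

# One rule per phones question, all sharing the same condition.
for prefix in ["recent_purchase", "amount", "next_purchase"]:
    survey_stub = survey_stub.add_skip_rule(f"{prefix}_phones", condition)

for target, expression in survey_stub.rules:
    print(target, "<-", expression)
```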
Here we run the survey with the agents and model. We do not add the scenarios at this step because they have already been inserted into the questions:
[16]:
results = survey.by(a).by(m).run()
Job UUID | 0091c2e7-ff4f-4e3d-be9e-087d0cf56925 |
Progress Bar URL | https://www.expectedparrot.com/home/remote-job-progress/0091c2e7-ff4f-4e3d-be9e-087d0cf56925 |
Exceptions Report URL | https://www.expectedparrot.com/home/remote-inference/error/306bf45f-4455-45d3-be90-cf24161f22c0 |
Results UUID | ba911ca1-436a-4465-ba18-6cc8ad9476a3 |
Results URL | https://www.expectedparrot.com/content/ba911ca1-436a-4465-ba18-6cc8ad9476a3 |
There is no “scenario” field in results because the scenarios were already added to questions. Instead, there are separate columns for each version of a question:
[17]:
(
    results
    .sort_by("annual_income", "age")
    .select(
        "model", "annual_income", "age",
        "recent_purchase_electronics", "amount_electronics", "next_purchase_electronics",
        "recent_purchase_phones", "amount_phones", "next_purchase_phones"
    )
)
[17]:
model.model | agent.annual_income | agent.age | answer.recent_purchase_electronics | answer.amount_electronics | answer.next_purchase_electronics | answer.recent_purchase_phones | answer.amount_phones | answer.next_purchase_phones | |
---|---|---|---|---|---|---|---|---|---|
0 | gemini-1.5-flash | $100,000-250,000 | 30 | Yes | 2500 | Within the next year | No | 1200.000000 | Within the next year |
1 | gemini-1.5-flash | $100,000-250,000 | 50 | Yes | 2500 | Within the next year | Yes | 1200.000000 | Within the next year |
2 | gemini-1.5-flash | $100,000-250,000 | 70 | Yes | 500 | Within the next year | No | 0.000000 | Within the next year |
3 | gemini-1.5-flash | above $250,000 | 30 | Yes | 5000 | Within the next year | Yes | 0.000000 | Within the next year |
4 | gemini-1.5-flash | above $250,000 | 50 | Yes | 5000 | Within the next year | Yes | 2500.000000 | Within the next year |
5 | gemini-1.5-flash | above $250,000 | 70 | Yes | 5000 | Within the next year | No | 0.000000 | Never |
6 | gemini-1.5-flash | under $100,000 | 30 | Yes | 200 | Within the next year | No | 600.000000 | Within the next year |
7 | gemini-1.5-flash | under $100,000 | 50 | No | 250 | Within the next year | nan | nan | nan |
8 | gemini-1.5-flash | under $100,000 | 70 | No | 0 | Within the next year | nan | nan | nan |
Posting to the Coop
Here we post this notebook to the Coop, a free platform for creating and sharing AI-based research (learn more about how it works):
[18]:
from edsl import Notebook
[19]:
n = Notebook(path = "skip_logic_scenarios.ipynb")
[20]:
info = n.push(description = "Using skip logic with question scenarios", visibility = "public")
info
[20]:
{'description': 'Using skip logic with question scenarios',
'object_type': 'notebook',
'url': 'https://www.expectedparrot.com/content/ba541892-b397-437f-a72c-c95f23750540',
'uuid': 'ba541892-b397-437f-a72c-c95f23750540',
'version': '0.1.43.dev1',
'visibility': 'public'}
Updating an object at the Coop:
[21]:
n = Notebook(path = "skip_logic_scenarios.ipynb") # resave
[22]:
n.patch(uuid = info["uuid"], value = n)
[22]:
{'status': 'success'}