Data labeling

This notebook shows how to conduct data labeling and content analysis using EDSL, an open-source library for simulating surveys, experiments and other research with AI agents and large language models.

Using a dataset of mock customer service tickets as an example, we demonstrate how to:

  1. Import data into EDSL

  2. Create questions about the data

  3. Design an AI agent to answer the questions

  4. Select a language model to generate responses

  5. Analyze results as a formatted dataset

This workflow can be visualized as follows: [figure: general_survey.png]

Technical setup

Before running the code below, please ensure that you have completed setup.

Our Starter Tutorial also provides examples of EDSL basic components.

Selecting data for review

First we identify some data for review. Data can be created using the EDSL tools or imported from other sources. For purposes of this demo we import a set of hypothetical customer tickets for a transportation app:

[1]:
tickets = [
    "I just realized I left my phone in the car on my last ride. Can you help me get it back?",
    "I'm unhappy with my recent experience. The driver was very rude and unprofessional.",
    "I was charged more than the estimated fare for my trip yesterday. Can you explain why?",
    "The car seat provided was not properly installed, and I felt my child was at risk. Please ensure driver training.",
    "My driver took a longer route than necessary, resulting in a higher fare. I request a fare adjustment.",
    "I had a great experience with my driver today! Very friendly and efficient service.",
    "I'm concerned about the vehicle's cleanliness. It was not up to the standard I expect.",
    "The app keeps crashing every time I try to book a ride. Please fix this issue.",
    "My driver was exceptional - safe driving, polite, and the car was spotless. Kudos!",
    "I felt unsafe during my ride due to the driver's erratic behavior. This needs to be addressed immediately.",
    "The driver refused to follow my preferred route, which is shorter. I'm not satisfied with the service.",
    "Impressed with the quick response to my ride request and the driver's professionalism.",
    "I was charged for a ride I never took. Please refund me as soon as possible.",
    "The promo code I tried to use didn't work. Can you assist with this?",
    "There was a suspicious smell in the car, and I'm worried about hygiene standards.",
    "My driver was very considerate, especially helping me with my luggage. Appreciate the great service!",
    "The app's GPS seems inaccurate. It directed the driver to the wrong pick-up location.",
    "I want to compliment my driver's excellent navigation and time management during rush hour.",
    "The vehicle didn't match the description in the app. It was confusing and concerning.",
    "I faced an issue with payment processing after my last ride. Can you look into this?",
]

Constructing questions about the data

Next we create some questions about the data. EDSL provides a variety of question types that we can choose from based on the form of the response that we want to get back from the model (multiple choice, free text, checkbox, linear scale, etc.). Learn more about question types.

Note that each question text contains a {{ ticket }} placeholder. This lets us parameterize the questions with the contents of each individual ticket in a later step:

[2]:
from edsl import (
    QuestionMultipleChoice,
    QuestionCheckBox,
    QuestionFreeText,
    QuestionList,
    QuestionYesNo,
    QuestionLinearScale,
)
[3]:
question_issues = QuestionCheckBox(
    question_name="issues",
    question_text="Check all of the issues mentioned in this ticket: {{ ticket }}",
    question_options=[
        "safety",
        "cleanliness",
        "driver performance",
        "GPS/route",
        "lost item",
        "other",
    ],
)
[4]:
question_primary_issue = QuestionFreeText(
    question_name="primary_issue",
    question_text="What is the primary issue in this ticket? Ticket: {{ ticket }}",
)
[5]:
question_accident = QuestionMultipleChoice(
    question_name="accident",
    question_text="If the primary issue in this ticket is safety, was there an accident where someone was hurt? Ticket: {{ ticket }}",
    question_options=["Yes", "No", "Not applicable"],
)
[6]:
question_sentiment = QuestionMultipleChoice(
    question_name="sentiment",
    question_text="What is the sentiment of this ticket? Ticket: {{ ticket }}",
    question_options=[
        "Very positive",
        "Somewhat positive",
        "Neutral",
        "Somewhat negative",
        "Very negative",
    ],
)
[7]:
question_refund = QuestionYesNo(
    question_name="refund",
    question_text="Does the customer ask for a refund in this ticket? Ticket: {{ ticket }}",
)
[8]:
question_priority = QuestionLinearScale(
    question_name="priority",
    question_text="On a scale from 0 to 5, what is the priority level of this ticket? Ticket: {{ ticket }}",
    question_options=[0, 1, 2, 3, 4, 5],
    option_labels={0: "Lowest", 5: "Highest"},
)
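
To make the role of the placeholder concrete, here is a minimal plain-Python sketch of how a {{ ticket }} placeholder could be filled in for each ticket. EDSL performs this rendering itself once scenarios are added; the render helper and its regular expression are illustrative assumptions, not EDSL's implementation.

```python
import re


def render(template: str, values: dict) -> str:
    """Replace each {{ name }} placeholder with values[name] (hypothetical helper)."""

    def substitute(match: re.Match) -> str:
        return str(values[match.group(1)])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


question_text = "What is the primary issue in this ticket? Ticket: {{ ticket }}"
sample_tickets = [
    "The app keeps crashing every time I try to book a ride. Please fix this issue.",
    "The promo code I tried to use didn't work. Can you assist with this?",
]

# One rendered question text per ticket.
rendered = [render(question_text, {"ticket": t}) for t in sample_tickets]
for prompt in rendered:
    print(prompt)
```

Each ticket yields its own version of the question text, which is the idea behind passing scenarios to a survey.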

Building a survey

We combine the questions into a survey in order to administer them together:

[9]:
from edsl import Survey

survey = Survey(
    questions=[
        question_issues,
        question_primary_issue,
        question_accident,
        question_sentiment,
        question_refund,
        question_priority,
    ]
)

Survey questions are administered asynchronously by default. Learn more about adding conditional logic and memory to your survey.

We can review our questions in a readable format, or export them as a survey to use with human respondents or on other survey platforms:

[10]:
# survey

Designing AI agents

A key feature of EDSL is the ability to create personas for AI agents that the language models are prompted to use in generating responses to the questions. This is done by passing a dictionary of traits to Agent objects:

[11]:
from edsl import Agent

agent = Agent(
    traits={
        "persona": "You are an expert customer service agent.",
        "years_experience": 15,
    }
)

Selecting language models

EDSL allows us to select the language models to use in generating results. To see all available models:

[12]:
from edsl import Model

# Model.available()

Here we select GPT-4o. If no model is specified, the default model is used; run Model() to check the current default:

[13]:
model = Model("gpt-4o")

Adding data to the questions

We add the contents of each ticket into each question as an independent “scenario” for review. This allows us to create versions of the questions for each ticket and deliver them to the model all at once:

[14]:
from edsl import ScenarioList

scenarios = ScenarioList.from_list("ticket", tickets)

Running the survey

We run the survey by adding the scenarios, agent and model with the by() method and then calling the run() method:

[15]:
results = survey.by(scenarios).by(agent).by(model).run()
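
As a rough mental model of what this call sets up, the sketch below enumerates the combinations in plain Python. It assumes that each combination of scenario, agent and model produces one result holding an answer to every question. The identifiers are stand-ins for the EDSL objects above, and the counts come from this notebook: 6 questions, 20 tickets, 1 agent and 1 model.

```python
from itertools import product

question_names = [
    "issues",
    "primary_issue",
    "accident",
    "sentiment",
    "refund",
    "priority",
]
scenario_ids = list(range(20))          # stand-ins for the 20 ticket scenarios
agent_ids = ["customer_service_agent"]  # hypothetical label for the single agent
model_ids = ["gpt-4o"]

# Assumed: one result per (scenario, agent, model) combination.
combinations = list(product(scenario_ids, agent_ids, model_ids))
num_results = len(combinations)

# Each result holds one answer per question.
num_answers = num_results * len(question_names)

print(num_results, num_answers)
```

Adding a second agent or model to the lists doubles both counts, which is worth keeping in mind when estimating cost.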

This generates a formatted dataset of Results that includes information about all the components, including the prompts and responses. We can see a list of all the components:

[16]:
# results.columns

Analyzing results

EDSL comes with built-in methods for analyzing results. Here we filter, sort, select and print components in a table:

[17]:
(results
 .filter("priority in [4, 5]")
 .sort_by("issues", "sentiment")
 .select("ticket", "issues", "primary_issue", "accident", "sentiment", "refund", "priority")
 .print(format="rich")
)
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━┓
┃ scenario          answer            answer           answer          answer            answer   answer    ┃
┃ .ticket           .issues           .primary_issue   .accident       .sentiment        .refund  .priority ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━┩
│ The app's GPS     ['GPS/route']     The primary      Not applicable  Somewhat          No       4         │
│ seems                               issue in this                    negative                             │
│ inaccurate. It                      ticket is that                                                        │
│ directed the                        the app's GPS                                                         │
│ driver to the                       is inaccurate,                                                        │
│ wrong pick-up                       which resulted                                                        │
│ location.                           in directing                                                          │
│                                     the driver to                                                         │
│                                     the wrong                                                             │
│                                     pick-up                                                               │
│                                     location.                                                             │
├──────────────────┼──────────────────┼─────────────────┼────────────────┼──────────────────┼─────────┼───────────┤
│ I'm unhappy with  ['driver          The primary      No              Very negative     No       4         │
│ my recent         performance']     issue in this                                                         │
│ experience. The                     ticket is the                                                         │
│ driver was very                     customer's                                                            │
│ rude and                            dissatisfaction                                                       │
│ unprofessional.                     due to the                                                            │
│                                     driver's rude                                                         │
│                                     and                                                                   │
│                                     unprofessional                                                        │
│                                     behavior.                                                             │
├──────────────────┼──────────────────┼─────────────────┼────────────────┼──────────────────┼─────────┼───────────┤
│ I just realized   ['lost item']     The primary      Not applicable  Somewhat          No       4         │
│ I left my phone                     issue in this                    negative                             │
│ in the car on my                    ticket is that                                                        │
│ last ride. Can                      the customer                                                          │
│ you help me get                     left their                                                            │
│ it back?                            phone in the                                                          │
│                                     car during                                                            │
│                                     their last ride                                                       │
│                                     and needs                                                             │
│                                     assistance                                                            │
│                                     retrieving it.                                                        │
├──────────────────┼──────────────────┼─────────────────┼────────────────┼──────────────────┼─────────┼───────────┤
│ The app keeps     ['other']         The primary      Not applicable  Somewhat          No       4         │
│ crashing every                      issue in this                    negative                             │
│ time I try to                       ticket is that                                                        │
│ book a ride.                        the app crashes                                                       │
│ Please fix this                     whenever the                                                          │
│ issue.                              user attempts                                                         │
│                                     to book a ride.                                                       │
├──────────────────┼──────────────────┼─────────────────┼────────────────┼──────────────────┼─────────┼───────────┤
│ I was charged     ['other']         The primary      Not applicable  Somewhat          Yes      4         │
│ for a ride I                        issue in this                    negative                             │
│ never took.                         ticket is that                                                        │
│ Please refund me                    the customer                                                          │
│ as soon as                          was charged for                                                       │
│ possible.                           a ride they                                                           │
│                                     claim they                                                            │
│                                     never took and                                                        │
│                                     they are                                                              │
│                                     requesting a                                                          │
│                                     refund.                                                               │
├──────────────────┼──────────────────┼─────────────────┼────────────────┼──────────────────┼─────────┼───────────┤
│ The car seat      ['safety',        The primary      No              Very negative     No       5         │
│ provided was not  'driver           issue in this                                                         │
│ properly          performance']     ticket is that                                                        │
│ installed, and I                    the car seat                                                          │
│ felt my child                       provided was                                                          │
│ was at risk.                        not properly                                                          │
│ Please ensure                       installed,                                                            │
│ driver training.                    which made the                                                        │
│                                     customer feel                                                         │
│                                     that their                                                            │
│                                     child's safety                                                        │
│                                     was at risk.                                                          │
│                                     The customer is                                                       │
│                                     requesting that                                                       │
│                                     drivers receive                                                       │
│                                     proper training                                                       │
│                                     to ensure car                                                         │
│                                     seats are                                                             │
│                                     installed                                                             │
│                                     correctly in                                                          │
│                                     the future.                                                           │
├──────────────────┼──────────────────┼─────────────────┼────────────────┼──────────────────┼─────────┼───────────┤
│ I felt unsafe     ['safety',        The primary      No              Very negative     No       5         │
│ during my ride    'driver           issue in this                                                         │
│ due to the        performance']     ticket is the                                                         │
│ driver's erratic                    customer's                                                            │
│ behavior. This                      concern about                                                         │
│ needs to be                         feeling unsafe                                                        │
│ addressed                           due to the                                                            │
│ immediately.                        driver's                                                              │
│                                     erratic                                                               │
│                                     behavior. This                                                        │
│                                     is a serious                                                          │
│                                     matter that                                                           │
│                                     needs to be                                                           │
│                                     addressed                                                             │
│                                     immediately to                                                        │
│                                     ensure the                                                            │
│                                     safety and                                                            │
│                                     well-being of                                                         │
│                                     the customer                                                          │
│                                     and to maintain                                                       │
│                                     service                                                               │
│                                     standards.                                                            │
└──────────────────┴──────────────────┴─────────────────┴────────────────┴──────────────────┴─────────┴───────────┘
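
For readers who want to see the same filter-then-sort logic outside EDSL, here is a minimal plain-Python sketch over six rows copied from the results in this notebook, keeping only three fields. It illustrates the logic only and makes no claim about how EDSL implements filter() or sort_by().

```python
rows = [
    {"issues": ["driver performance"], "sentiment": "Very negative", "priority": 4},
    {"issues": ["cleanliness"], "sentiment": "Somewhat negative", "priority": 2},
    {"issues": ["safety", "driver performance"], "sentiment": "Very negative", "priority": 5},
    {"issues": ["other"], "sentiment": "Somewhat negative", "priority": 4},
    {"issues": ["GPS/route"], "sentiment": "Somewhat negative", "priority": 4},
    {"issues": ["other"], "sentiment": "Neutral", "priority": 2},
]

# Keep only the high-priority tickets.
high_priority = [row for row in rows if row["priority"] in (4, 5)]

# Sort by issues first, then by sentiment to break ties.
ordered = sorted(high_priority, key=lambda row: (row["issues"], row["sentiment"]))

for row in ordered:
    print(row["priority"], row["issues"], row["sentiment"])
```

Python compares the issue lists element by element using code-point order, so uppercase "GPS/route" sorts ahead of the lowercase labels.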

We can apply some labels to our table:

[18]:
(
    results.select(
        "ticket",
        "issues",
        "primary_issue",
        "accident",
        "sentiment",
        "refund",
        "priority",
    ).print(
        pretty_labels={
            "scenario.ticket": "Ticket",
            "answer.issues": "Issues",
            "answer.primary_issue": "Primary issue",
            "answer.accident": "Accident",
            "answer.sentiment": "Sentiment",
            "answer.refund": "Refund request",
            "answer.priority": "Priority",
        }
    )
)
Ticket | Issues | Primary issue | Accident | Sentiment | Refund request | Priority
I want to compliment my driver's excellent navigation and time management during rush hour. | ['driver performance', 'GPS/route'] | The primary issue in this ticket is a compliment. The customer wants to commend their driver's excellent navigation and time management skills during rush hour. | Not applicable | Very positive | No | 1
I'm concerned about the vehicle's cleanliness. It was not up to the standard I expect. | ['cleanliness'] | The primary issue in this ticket is the cleanliness of the vehicle. The customer is expressing concern that the vehicle was not as clean as they expected. | Not applicable | Somewhat negative | No | 2
My driver was very considerate, especially helping me with my luggage. Appreciate the great service! | ['driver performance'] | The primary issue in this ticket is actually not an issue at all. It's a positive feedback from a customer who appreciated the considerate service provided by their driver, particularly with assistance on luggage. This is a commendation rather than a complaint. | Not applicable | Very positive | No | 0
The car seat provided was not properly installed, and I felt my child was at risk. Please ensure driver training. | ['safety', 'driver performance'] | The primary issue in this ticket is that the car seat provided was not properly installed, which made the customer feel that their child's safety was at risk. The customer is requesting that drivers receive proper training to ensure car seats are installed correctly in the future. | No | Very negative | No | 5
There was a suspicious smell in the car, and I'm worried about hygiene standards. | ['safety', 'cleanliness'] | The primary issue in this ticket is a concern about a suspicious smell in the car, which raises worries about hygiene standards. The customer is likely seeking reassurance that the vehicle meets cleanliness and hygiene expectations. | Not applicable | Somewhat negative | No | 3
The app keeps crashing every time I try to book a ride. Please fix this issue. | ['other'] | The primary issue in this ticket is that the app crashes whenever the user attempts to book a ride. | Not applicable | Somewhat negative | No | 4
The driver refused to follow my preferred route, which is shorter. I'm not satisfied with the service. | ['driver performance', 'GPS/route'] | The primary issue in this ticket is that the driver refused to follow the customer's preferred route, which the customer believes is shorter. This has led to the customer's dissatisfaction with the service. | Not applicable | Very negative | No | 2
I just realized I left my phone in the car on my last ride. Can you help me get it back? | ['lost item'] | The primary issue in this ticket is that the customer left their phone in the car during their last ride and needs assistance retrieving it. | Not applicable | Somewhat negative | No | 4
The vehicle didn't match the description in the app. It was confusing and concerning. | ['other'] | The primary issue in this ticket is that the vehicle provided did not match the description given in the app. This discrepancy caused confusion and concern for the customer. | Not applicable | Somewhat negative | No | 3
My driver took a longer route than necessary, resulting in a higher fare. I request a fare adjustment. | ['driver performance', 'GPS/route'] | The primary issue in this ticket is that the customer believes their driver took a longer route than necessary, which led to a higher fare. The customer is requesting a fare adjustment to rectify the situation. | No | Somewhat negative | No | 2
I'm unhappy with my recent experience. The driver was very rude and unprofessional. | ['driver performance'] | The primary issue in this ticket is the customer's dissatisfaction due to the driver's rude and unprofessional behavior. | No | Very negative | No | 4
Impressed with the quick response to my ride request and the driver's professionalism. | ['driver performance', 'other'] | It seems like there isn't an issue in this ticket. Instead, it appears to be positive feedback. The customer is expressing satisfaction with the quick response to their ride request and the professionalism of the driver. | Not applicable | Very positive | No | 0
I was charged more than the estimated fare for my trip yesterday. Can you explain why? | ['other'] | The primary issue in this ticket is that the customer was charged more than the estimated fare for their trip and is seeking an explanation for the discrepancy. | Not applicable | Somewhat negative | No | 3
I was charged for a ride I never took. Please refund me as soon as possible. | ['other'] | The primary issue in this ticket is that the customer was charged for a ride they claim they never took and they are requesting a refund. | Not applicable | Somewhat negative | Yes | 4
I had a great experience with my driver today! Very friendly and efficient service. | ['driver performance'] | It appears that there is no issue in this ticket. Instead, it is a positive feedback about the driver's friendly and efficient service. | Not applicable | Very positive | No | 0
I felt unsafe during my ride due to the driver's erratic behavior. This needs to be addressed immediately. | ['safety', 'driver performance'] | The primary issue in this ticket is the customer's concern about feeling unsafe due to the driver's erratic behavior. This is a serious matter that needs to be addressed immediately to ensure the safety and well-being of the customer and to maintain service standards. | No | Very negative | No | 5
I faced an issue with payment processing after my last ride. Can you look into this? | ['other'] | The primary issue in this ticket is a problem with payment processing following the customer's last ride. The customer is requesting assistance to investigate and resolve the payment issue. | Not applicable | Neutral | No | 3
The app's GPS seems inaccurate. It directed the driver to the wrong pick-up location. | ['GPS/route'] | The primary issue in this ticket is that the app's GPS is inaccurate, which resulted in directing the driver to the wrong pick-up location. | Not applicable | Somewhat negative | No | 4
The promo code I tried to use didn't work. Can you assist with this? | ['other'] | The primary issue in this ticket is that the customer is experiencing difficulty with a promo code that did not work when they attempted to use it. They are seeking assistance to resolve this issue. | Not applicable | Neutral | No | 2
My driver was exceptional - safe driving, polite, and the car was spotless. Kudos! | ['safety', 'cleanliness', 'driver performance'] | It looks like this ticket is actually a compliment rather than an issue. The customer is praising the driver for their exceptional service, including safe driving, politeness, and the cleanliness of the car. It’s always great to receive positive feedback! | Not applicable | Very positive | No | 0

EDSL also comes with methods for accessing results as a dataframe or SQL table. Here we convert selected columns to a pandas dataframe:

[19]:
df = (
    results
    .select(
        "issues",
        "primary_issue",
        "accident",
        "sentiment",
        "refund",
        "priority"
    )
    .to_pandas(remove_prefix=True)
)
df
[19]:
 | issues | primary_issue | accident | sentiment | refund | priority
0 | ['driver performance', 'GPS/route'] | The primary issue in this ticket is a complime... | Not applicable | Very positive | No | 1
1 | ['cleanliness'] | The primary issue in this ticket is the cleanl... | Not applicable | Somewhat negative | No | 2
2 | ['driver performance'] | The primary issue in this ticket is actually n... | Not applicable | Very positive | No | 0
3 | ['safety', 'driver performance'] | The primary issue in this ticket is that the c... | No | Very negative | No | 5
4 | ['safety', 'cleanliness'] | The primary issue in this ticket is a concern ... | Not applicable | Somewhat negative | No | 3
5 | ['other'] | The primary issue in this ticket is that the a... | Not applicable | Somewhat negative | No | 4
6 | ['driver performance', 'GPS/route'] | The primary issue in this ticket is that the d... | Not applicable | Very negative | No | 2
7 | ['lost item'] | The primary issue in this ticket is that the c... | Not applicable | Somewhat negative | No | 4
8 | ['other'] | The primary issue in this ticket is that the v... | Not applicable | Somewhat negative | No | 3
9 | ['driver performance', 'GPS/route'] | The primary issue in this ticket is that the c... | No | Somewhat negative | No | 2
10 | ['driver performance'] | The primary issue in this ticket is the custom... | No | Very negative | No | 4
11 | ['driver performance', 'other'] | It seems like there isn't an issue in this tic... | Not applicable | Very positive | No | 0
12 | ['other'] | The primary issue in this ticket is that the c... | Not applicable | Somewhat negative | No | 3
13 | ['other'] | The primary issue in this ticket is that the c... | Not applicable | Somewhat negative | Yes | 4
14 | ['driver performance'] | It appears that there is no issue in this tick... | Not applicable | Very positive | No | 0
15 | ['safety', 'driver performance'] | The primary issue in this ticket is the custom... | No | Very negative | No | 5
16 | ['other'] | The primary issue in this ticket is a problem ... | Not applicable | Neutral | No | 3
17 | ['GPS/route'] | The primary issue in this ticket is that the a... | Not applicable | Somewhat negative | No | 4
18 | ['other'] | The primary issue in this ticket is that the c... | Not applicable | Neutral | No | 2
19 | ['safety', 'cleanliness', 'driver performance'] | It looks like this ticket is actually a compli... | Not applicable | Very positive | No | 0
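
Once the labels are in tabular form, a common next step is to tally them. The sketch below counts how often each issue label appears, using the issues column exactly as displayed in this notebook. It assumes the cells may arrive either as Python lists or as their string form (for example after a round trip through CSV); the as_list helper is hypothetical and handles both.

```python
import ast
from collections import Counter

# The "issues" column as displayed, written as strings (assumed serialization).
issues_column = [
    "['driver performance', 'GPS/route']",
    "['cleanliness']",
    "['driver performance']",
    "['safety', 'driver performance']",
    "['safety', 'cleanliness']",
    "['other']",
    "['driver performance', 'GPS/route']",
    "['lost item']",
    "['other']",
    "['driver performance', 'GPS/route']",
    "['driver performance']",
    "['driver performance', 'other']",
    "['other']",
    "['other']",
    "['driver performance']",
    "['safety', 'driver performance']",
    "['other']",
    "['GPS/route']",
    "['other']",
    "['safety', 'cleanliness', 'driver performance']",
]


def as_list(value):
    """Return a list whether the cell is already a list or its string form."""
    return value if isinstance(value, list) else ast.literal_eval(value)


# Flatten all cells and count each label.
issue_counts = Counter(label for cell in issues_column for label in as_list(cell))

for label, count in issue_counts.most_common():
    print(f"{label}: {count}")
```

Because a ticket can carry several labels, the counts add up to more than the number of tickets.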

We can also query the results with SQL, referring to the results table as self:

[20]:
results.sql("""
select ticket, issues, primary_issue, accident, sentiment, refund, priority
from self
""", shape="wide")
[20]:
 | ticket | issues | primary_issue | accident | sentiment | refund | priority
0 | I want to compliment my driver's excellent nav... | ['driver performance', 'GPS/route'] | The primary issue in this ticket is a complime... | Not applicable | Very positive | No | 1
1 | I'm concerned about the vehicle's cleanliness.... | ['cleanliness'] | The primary issue in this ticket is the cleanl... | Not applicable | Somewhat negative | No | 2
2 | My driver was very considerate, especially hel... | ['driver performance'] | The primary issue in this ticket is actually n... | Not applicable | Very positive | No | 0
3 | The car seat provided was not properly install... | ['safety', 'driver performance'] | The primary issue in this ticket is that the c... | No | Very negative | No | 5
4 | There was a suspicious smell in the car, and I... | ['safety', 'cleanliness'] | The primary issue in this ticket is a concern ... | Not applicable | Somewhat negative | No | 3
5 | The app keeps crashing every time I try to boo... | ['other'] | The primary issue in this ticket is that the a... | Not applicable | Somewhat negative | No | 4
6 | The driver refused to follow my preferred rout... | ['driver performance', 'GPS/route'] | The primary issue in this ticket is that the d... | Not applicable | Very negative | No | 2
7 | I just realized I left my phone in the car on ... | ['lost item'] | The primary issue in this ticket is that the c... | Not applicable | Somewhat negative | No | 4
8 | The vehicle didn't match the description in th... | ['other'] | The primary issue in this ticket is that the v... | Not applicable | Somewhat negative | No | 3
9 | My driver took a longer route than necessary, ... | ['driver performance', 'GPS/route'] | The primary issue in this ticket is that the c... | No | Somewhat negative | No | 2
10 | I'm unhappy with my recent experience. The dri... | ['driver performance'] | The primary issue in this ticket is the custom... | No | Very negative | No | 4
11 | Impressed with the quick response to my ride r... | ['driver performance', 'other'] | It seems like there isn't an issue in this tic... | Not applicable | Very positive | No | 0
12 | I was charged more than the estimated fare for... | ['other'] | The primary issue in this ticket is that the c... | Not applicable | Somewhat negative | No | 3
13 | I was charged for a ride I never took. Please ... | ['other'] | The primary issue in this ticket is that the c... | Not applicable | Somewhat negative | Yes | 4
14 | I had a great experience with my driver today!... | ['driver performance'] | It appears that there is no issue in this tick... | Not applicable | Very positive | No | 0
15 | I felt unsafe during my ride due to the driver... | ['safety', 'driver performance'] | The primary issue in this ticket is the custom... | No | Very negative | No | 5
16 | I faced an issue with payment processing after... | ['other'] | The primary issue in this ticket is a problem ... | Not applicable | Neutral | No | 3
17 | The app's GPS seems inaccurate. It directed th... | ['GPS/route'] | The primary issue in this ticket is that the a... | Not applicable | Somewhat negative | No | 4
18 | The promo code I tried to use didn't work. Can... | ['other'] | The primary issue in this ticket is that the c... | Not applicable | Neutral | No | 2
19 | My driver was exceptional - safe driving, poli... | ['safety', 'cleanliness', 'driver performance'] | It looks like this ticket is actually a compli... | Not applicable | Very positive | No | 0
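
The same style of query can be extended with standard SQL clauses such as where and order by. The sketch below uses Python's built-in sqlite3 module with a hand-built table named self that holds four rows copied from the output above. Whether results.sql() accepts every clause shown here is an assumption to check against the EDSL documentation.

```python
import sqlite3

# Build a small in-memory stand-in for the results table.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "self" (sentiment TEXT, refund TEXT, priority INTEGER)')
conn.executemany(
    'INSERT INTO "self" VALUES (?, ?, ?)',
    [
        ("Very negative", "No", 5),
        ("Somewhat negative", "Yes", 4),
        ("Neutral", "No", 2),
        ("Very positive", "No", 0),
    ],
)

# Select only high-priority tickets, highest priority first.
query = """
select sentiment, refund, priority
from self
where priority >= 4
order by priority desc
"""
high_priority_rows = conn.execute(query).fetchall()
print(high_priority_rows)
conn.close()
```

Trying a query on a small stand-in table like this is a cheap way to check the SQL before running it on the full results.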

To export results to a CSV file:

[21]:
results.to_csv("data_labeling_example.csv")

Posting content to the Coop

We can post any EDSL object to the Coop, including this notebook. Objects can be updated or modified from your Coop account, and shared with others or stored privately (the default visibility is unlisted):

[22]:
results.push(description="Customer service tickets data labeling example", visibility="public")
[22]:
{'description': 'Customer service tickets data labeling example',
 'object_type': 'results',
 'url': 'https://www.expectedparrot.com/content/8347bb96-951f-4063-a863-a24da3990f52',
 'uuid': '8347bb96-951f-4063-a863-a24da3990f52',
 'version': '0.1.32.dev1',
 'visibility': 'public'}
[23]:
survey.push(description="Customer service tickets data labeling example survey", visibility="public")
[23]:
{'description': 'Customer service tickets data labeling example survey',
 'object_type': 'survey',
 'url': 'https://www.expectedparrot.com/content/7116a9c0-3bcf-419f-8989-6b1ed5df98c1',
 'uuid': '7116a9c0-3bcf-419f-8989-6b1ed5df98c1',
 'version': '0.1.32.dev1',
 'visibility': 'public'}
[24]:
from edsl import Notebook

n = Notebook(path="data_labeling_example.ipynb")

n.push(description="Data labeling example", visibility="public")
[24]:
{'description': 'Data labeling example',
 'object_type': 'notebook',
 'url': 'https://www.expectedparrot.com/content/830446e3-31e6-4d12-b9ec-ebd38a1ca26e',
 'uuid': '830446e3-31e6-4d12-b9ec-ebd38a1ca26e',
 'version': '0.1.32.dev1',
 'visibility': 'public'}

To update an object at the Coop:

[26]:
n = Notebook(path="data_labeling_example.ipynb")

n.patch(uuid="830446e3-31e6-4d12-b9ec-ebd38a1ca26e", value=n)
[26]:
{'status': 'success'}