Data labeling agents

This notebook shows how to conduct data labeling tasks using EDSL, an open-source library for simulating surveys, experiments, and other research with AI agents and large language models. The workflow consists of the following steps:

Import data into EDSL
Create questions about the data
Design an AI agent to answer the questions
Select a language model to generate responses
Analyze results as a formatted dataset

This workflow can be visualized as follows:

Conducting agent-specific tasks

We can add a layer of complexity to this generalized flow by creating different AI agents for subsets of the data to be reviewed. For example, we can design agents with specific “expertise” to review only the data that is relevant to that expertise. This can be useful if our data is sorted (or sortable) in some way that is important to our task. We can also use EDSL to prompt a language model to sort the data as needed.

This modified workflow can be visualized as follows:

Example task: Evaluating job posts

Using a dataset of job posts as an example, in the steps below we create AI agents with expertise in the relevant job categories and then prompt them to evaluate relevant job posts in a variety of ways. The steps are:

Import a dataset of job categories and job posts.
Construct questions about the job posts and combine them in a survey.
Design AI agents with job category expertise.
Administer the survey to each agent with job posts for the relevant category.
Inspect the results using built-in methods for analysis.
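
The routing idea behind these steps can be sketched without EDSL at all. The snippet below is a plain-Python illustration, not the library's API: the names `posts` and `assign_posts` are hypothetical, and the three sample rows are abbreviated from the dataset used later in this notebook.

```python
from collections import defaultdict

# Hypothetical, abbreviated sample rows: (job_category, job_title)
posts = [
    ("Content Writing", "Blog Post Writing"),
    ("Web Development", "API Integration"),
    ("Content Writing", "Press Release Writing"),
]

def assign_posts(posts):
    """Group job titles by category so each expert sees only its own posts."""
    by_category = defaultdict(list)
    for category, title in posts:
        by_category[category].append(title)
    return dict(by_category)

assignments = assign_posts(posts)
for category, titles in assignments.items():
    print(f"Expert in {category} reviews: {titles}")
```

Each key in `assignments` corresponds to one expert agent, and each value is the subset of posts that agent would be asked to label.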

Validating results

EDSL also provides methods for launching your surveys with human respondents to compare and validate LLM results. Learn more about these features.

Technical setup

Before running the code below, please see the instructions on getting started. An introductory data labeling example notebook is also available at our documentation page.

Import the tools

We start by selecting question types and survey components that we will use. Please see the EDSL Docs for examples of all question types and details on these basic components.

[1]:

from edsl import (
    QuestionMultipleChoice, QuestionList, QuestionNumerical,
    Survey, ScenarioList, AgentList, Agent, Model
)

Import data

Next we import a dataset for review, using Scenario objects to represent the individual data items that will be added to each of our data labeling questions. EDSL allows us to create data or import it from other sources (CSV, PDF, PNG, MP4, DOC, tables, lists, dicts, etc.).

For purposes of demonstration, we create a dataframe, post it to Coop with the FileStore module and then retrieve it. Note that FileStore works with many file types and automatically infers the file type. Learn more about posting and retrieving files.

[2]:

import pandas as pd

data = [
    ["job_category", "job_title", "job_post"],
    ["Content Writing", "Blog Post Writing", "Looking for a skilled writer to produce 5 blog posts on digital marketing topics. Each post should be 800-1000 words, well-researched, and SEO-optimized."],
    ["Content Writing", "Product Description Writing", "We need a writer to craft compelling product descriptions for our online store. Each description should highlight the key features and benefits of the product."],
    ["Content Writing", "Technical Writing for Software Documentation", "Seeking an experienced technical writer to create user manuals and API documentation for our software product. Must have a background in tech writing and be familiar with software development terminology."],
    ["Content Writing", "Website Copywriting", "Looking for a copywriter to create persuasive content for our company’s website. The content should be clear, concise, and align with our brand voice."],
    ["Content Writing", "Press Release Writing", "We need a writer to draft a press release for our upcoming product launch. The release should be attention-grabbing and follow industry standards."],
    ["Digital Marketing", "Social Media Management", "We are looking for a social media manager to handle our Instagram and Twitter accounts. Responsibilities include content creation, scheduling posts, and engaging with followers."],
    ["Digital Marketing", "SEO Optimization", "Need an SEO expert to optimize our website for search engines. The project includes keyword research, on-page optimization, and link-building strategies."],
    ["Digital Marketing", "Google Ads Campaign Management", "Looking for a PPC specialist to manage our Google Ads campaigns. The goal is to increase traffic and conversions for our online store."],
    ["Digital Marketing", "Email Marketing Campaign", "Seeking an email marketing expert to design and execute a series of email campaigns for our new product launch. Experience with Mailchimp is preferred."],
    ["Digital Marketing", "Content Marketing Strategy", "Seeking a content marketing strategist to develop a comprehensive plan to increase our online visibility. The strategy should include content creation, distribution, and performance tracking."],
    ["Graphic Design", "Logo Design for New Startup", "We are a new tech startup looking for a creative designer to create a unique logo for our brand. The logo should be modern and represent innovation. Please provide portfolio examples."],
    ["Graphic Design", "Brochure Design", "Looking for an experienced designer to create a professional brochure for our real estate company. The brochure should highlight our services and properties. Must be delivered in print-ready format."],
    ["Graphic Design", "Social Media Graphics", "Need a designer to create eye-catching social media graphics for our upcoming campaign. We need a set of 10 images optimized for Instagram and Facebook."],
    ["Graphic Design", "Website Banner Design", "Seeking a skilled designer to create a series of banners for our e-commerce website. Banners should be consistent with our brand’s aesthetic. Please include examples of previous work."],
    ["Graphic Design", "Infographic Design", "We need a designer to create a visually appealing infographic based on our provided data. The infographic should be easy to understand and shareable on social media."],
    ["Web Development", "WordPress Website Setup", "We need a developer to set up a WordPress site for our small business. The site should be responsive and include a contact form, blog, and e-commerce functionality. Experience with WooCommerce is a plus."],
    ["Web Development", "Custom Web Application Development", "Looking for a full-stack developer to build a custom web application for managing employee schedules. The app should include a login system, user roles, and reporting features."],
    ["Web Development", "Shopify Store Customization", "Seeking a Shopify expert to customize our online store. We need theme adjustments, product page enhancements, and integration with third-party tools."],
    ["Web Development", "API Integration", "Need a developer to integrate our existing CRM system with an external API. The integration should sync customer data in real-time. Previous experience with similar projects required."],
    ["Web Development", "Landing Page Development", "Looking for a developer to create a high-converting landing page for our marketing campaign. The page should be optimized for mobile and desktop users."]
]

df = pd.DataFrame(data[1:], columns=data[0])
df

[2]:

	job_category	job_title	job_post
0	Content Writing	Blog Post Writing	Looking for a skilled writer to produce 5 blog...
1	Content Writing	Product Description Writing	We need a writer to craft compelling product d...
2	Content Writing	Technical Writing for Software Documentation	Seeking an experienced technical writer to cre...
3	Content Writing	Website Copywriting	Looking for a copywriter to create persuasive ...
4	Content Writing	Press Release Writing	We need a writer to draft a press release for ...
5	Digital Marketing	Social Media Management	We are looking for a social media manager to h...
6	Digital Marketing	SEO Optimization	Need an SEO expert to optimize our website for...
7	Digital Marketing	Google Ads Campaign Management	Looking for a PPC specialist to manage our Goo...
8	Digital Marketing	Email Marketing Campaign	Seeking an email marketing expert to design an...
9	Digital Marketing	Content Marketing Strategy	Seeking a content marketing strategist to deve...
10	Graphic Design	Logo Design for New Startup	We are a new tech startup looking for a creati...
11	Graphic Design	Brochure Design	Looking for an experienced designer to create ...
12	Graphic Design	Social Media Graphics	Need a designer to create eye-catching social ...
13	Graphic Design	Website Banner Design	Seeking a skilled designer to create a series ...
14	Graphic Design	Infographic Design	We need a designer to create a visually appeal...
15	Web Development	WordPress Website Setup	We need a developer to set up a WordPress site...
16	Web Development	Custom Web Application Development	Looking for a full-stack developer to build a ...
17	Web Development	Shopify Store Customization	Seeking a Shopify expert to customize our onli...
18	Web Development	API Integration	Need a developer to integrate our existing CRM...
19	Web Development	Landing Page Development	Looking for a developer to create a high-conve...

[3]:

df.to_csv("data.csv")

Here we post the file to Coop and get the information for the object:

[4]:

from edsl import FileStore

fs = FileStore("data.csv")

fs.push(
    description = "Example CSV file: Job categories",
    alias = "filestore-csv-example",
    visibility = "public"
)

[4]:

{'description': 'Example CSV file: Job categories',
 'object_type': 'scenario',
 'url': 'https://www.expectedparrot.com/content/00b84308-166e-4828-9a3a-04cb93cd3222',
 'alias_url': 'https://www.expectedparrot.com/content/RobinHorton/filestore-csv-example',
 'uuid': '00b84308-166e-4828-9a3a-04cb93cd3222',
 'version': '0.1.62.dev1',
 'visibility': 'public'}

Next we retrieve the data file and use it to create scenarios (replace the URL in this code with the URL or UUID of any file you want to use):

[5]:

from edsl import FileStore, ScenarioList

fs = FileStore.pull("https://www.expectedparrot.com/content/RobinHorton/filestore-csv-example")

scenarios = ScenarioList.from_source("csv", fs.to_tempfile())
scenarios

[5]:

ScenarioList scenarios: 20; keys: ['', 'job_title', 'job_category', 'job_post'];

	Unnamed: 0	job_category	job_title	job_post
0	0	Content Writing	Blog Post Writing	Looking for a skilled writer to produce 5 blog posts on digital marketing topics. Each post should be 800-1000 words, well-researched, and SEO-optimized.
1	1	Content Writing	Product Description Writing	We need a writer to craft compelling product descriptions for our online store. Each description should highlight the key features and benefits of the product.
2	2	Content Writing	Technical Writing for Software Documentation	Seeking an experienced technical writer to create user manuals and API documentation for our software product. Must have a background in tech writing and be familiar with software development terminology.
3	3	Content Writing	Website Copywriting	Looking for a copywriter to create persuasive content for our company’s website. The content should be clear, concise, and align with our brand voice.
4	4	Content Writing	Press Release Writing	We need a writer to draft a press release for our upcoming product launch. The release should be attention-grabbing and follow industry standards.
5	5	Digital Marketing	Social Media Management	We are looking for a social media manager to handle our Instagram and Twitter accounts. Responsibilities include content creation, scheduling posts, and engaging with followers.
6	6	Digital Marketing	SEO Optimization	Need an SEO expert to optimize our website for search engines. The project includes keyword research, on-page optimization, and link-building strategies.
7	7	Digital Marketing	Google Ads Campaign Management	Looking for a PPC specialist to manage our Google Ads campaigns. The goal is to increase traffic and conversions for our online store.
8	8	Digital Marketing	Email Marketing Campaign	Seeking an email marketing expert to design and execute a series of email campaigns for our new product launch. Experience with Mailchimp is preferred.
9	9	Digital Marketing	Content Marketing Strategy	Seeking a content marketing strategist to develop a comprehensive plan to increase our online visibility. The strategy should include content creation, distribution, and performance tracking.
10	10	Graphic Design	Logo Design for New Startup	We are a new tech startup looking for a creative designer to create a unique logo for our brand. The logo should be modern and represent innovation. Please provide portfolio examples.
11	11	Graphic Design	Brochure Design	Looking for an experienced designer to create a professional brochure for our real estate company. The brochure should highlight our services and properties. Must be delivered in print-ready format.
12	12	Graphic Design	Social Media Graphics	Need a designer to create eye-catching social media graphics for our upcoming campaign. We need a set of 10 images optimized for Instagram and Facebook.
13	13	Graphic Design	Website Banner Design	Seeking a skilled designer to create a series of banners for our e-commerce website. Banners should be consistent with our brand’s aesthetic. Please include examples of previous work.
14	14	Graphic Design	Infographic Design	We need a designer to create a visually appealing infographic based on our provided data. The infographic should be easy to understand and shareable on social media.
15	15	Web Development	WordPress Website Setup	We need a developer to set up a WordPress site for our small business. The site should be responsive and include a contact form, blog, and e-commerce functionality. Experience with WooCommerce is a plus.
16	16	Web Development	Custom Web Application Development	Looking for a full-stack developer to build a custom web application for managing employee schedules. The app should include a login system, user roles, and reporting features.
17	17	Web Development	Shopify Store Customization	Seeking a Shopify expert to customize our online store. We need theme adjustments, product page enhancements, and integration with third-party tools.
18	18	Web Development	API Integration	Need a developer to integrate our existing CRM system with an external API. The integration should sync customer data in real-time. Previous experience with similar projects required.
19	19	Web Development	Landing Page Development	Looking for a developer to create a high-converting landing page for our marketing campaign. The page should be optimized for mobile and desktop users.
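
The extra first column in the table above (shown as `Unnamed: 0` in the table and as `''` in the keys) is most likely the dataframe index, which pandas writes to CSV by default. The sketch below is an illustration under that assumption; it uses a one-row stand-in dataframe and in-memory buffers instead of the `data.csv` file.

```python
import io
import pandas as pd

# One-row stand-in for the job posts dataframe
df = pd.DataFrame(
    {"job_category": ["Content Writing"], "job_title": ["Blog Post Writing"]}
)

# Default behaviour: the index is written as an extra, unnamed first column
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
cols_with_index = list(pd.read_csv(with_index).columns)

# index=False: only the data columns are written
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)
cols_without_index = list(pd.read_csv(without_index).columns)

print(cols_with_index)
print(cols_without_index)
```

The extra column is harmless for the questions below, since they reference only `job_category` and `job_post`, but writing the file with `index=False` keeps the scenario keys limited to the data columns.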

Construct questions about the data

Next we construct questions to ask about the job posts, selecting question types based on the form of the response that we want to get back from the language model (multiple choice, linear scale, free text, numerical, etc.; see examples of all question types). We include a {{ placeholder }} for each scenario key in order to parameterize each question with each job post and category when we run the survey:

[6]:

q_skills = QuestionList(
    question_name="skills",
    question_text="""
    Consider the following job category and job post at an online labor marketplace.
    Job category: {{ scenario.job_category }}
    Job post: {{ scenario.job_post }}
    What are some key skills required for this job?
    """,
)

q_experience = QuestionMultipleChoice(
    question_name="experience",
    question_text="""
    Consider the following job category and job post at an online labor marketplace.
    Job category: {{ scenario.job_category }}
    Job post: {{ scenario.job_post }}
    What level of experience is required for this job?
    """,
    question_options=["Entry-level", "Mid-level", "Senior-level"],
)

q_days = QuestionNumerical(
    question_name="days",
    question_text="""
    Consider the following job category and job post at an online labor marketplace.
    Job category: {{ scenario.job_category }}
    Job post: {{ scenario.job_post }}
    Estimate the number of days until this job post is fulfilled.
    """,
)
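
To make the parameterization concrete, here is a library-free sketch of what filling in the placeholders amounts to. It is not EDSL's rendering code: the `render` helper and the `PLACEHOLDER` pattern are hypothetical, and the sample job post is shortened for illustration.

```python
import re

# Matches placeholders of the form {{ scenario.key }}
PLACEHOLDER = re.compile(r"\{\{\s*scenario\.(\w+)\s*\}\}")

def render(template: str, scenario: dict) -> str:
    """Replace each {{ scenario.key }} with the matching scenario value."""
    return PLACEHOLDER.sub(lambda m: str(scenario[m.group(1)]), template)

template = (
    "Job category: {{ scenario.job_category }}\n"
    "Job post: {{ scenario.job_post }}"
)
scenario = {
    "job_category": "Graphic Design",
    "job_post": "Need a designer to create an infographic.",
}

rendered = render(template, scenario)
print(rendered)
```

Running the survey with 20 scenarios produces 20 such renderings of each question text, one per job post.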

Combining questions into a Survey

Next we combine our questions into a survey that will be administered to the AI agents. By default, the questions will be administered asynchronously. If desired, we can also specify survey rules (skip/stop logic) and within-survey memories of prior questions and responses. See the EDSL Docs for details on methods for applying survey rules.

[7]:

survey = Survey(questions=[q_skills, q_experience, q_days])

Creating personas for Agents

Next we draft personas for AI agents that will answer the questions. For each job category we construct an AI agent that is an expert in the category. Agents are constructed by passing a dictionary of traits to an Agent object. Learn more about designing AI agents to answer surveys.

To get the set of job categories from the scenarios:

[8]:

job_categories = list(set(scenarios.select("job_category").to_list()))
job_categories

[8]:

['Digital Marketing', 'Web Development', 'Graphic Design', 'Content Writing']
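
Because a `set` does not preserve order, the list above may come out in a different order on another run. If a stable order matters (for example, for reproducible logs), one option is to deduplicate with `dict.fromkeys`, which keeps the first occurrence of each value. The `categories` list below is a shortened stand-in for the values returned by `scenarios.select("job_category").to_list()`.

```python
# Shortened stand-in for the job_category column
categories = [
    "Content Writing",
    "Content Writing",
    "Digital Marketing",
    "Graphic Design",
    "Digital Marketing",
    "Web Development",
]

# dict keys are unique and keep insertion order
unique_in_order = list(dict.fromkeys(categories))
print(unique_in_order)
```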

Next we use them to create an agent for each job category:

[9]:

agents = AgentList(
    Agent(
        traits = {
            "persona": "You are an experienced freelancer on online labor marketplaces.",
            "job_category": job_category,
            "expertise": f"You regularly perform jobs in the following category: {job_category}."
        }
    ) for job_category in job_categories
)
agents

[9]:

AgentList agents: 4;

	persona	job_category	expertise
0	You are an experienced freelancer on online labor marketplaces.	Digital Marketing	You regularly perform jobs in the following category: Digital Marketing.
1	You are an experienced freelancer on online labor marketplaces.	Web Development	You regularly perform jobs in the following category: Web Development.
2	You are an experienced freelancer on online labor marketplaces.	Graphic Design	You regularly perform jobs in the following category: Graphic Design.
3	You are an experienced freelancer on online labor marketplaces.	Content Writing	You regularly perform jobs in the following category: Content Writing.

Selecting language models

EDSL works with many popular language models that we can select to generate the agents’ responses to the survey. Information about current models is available here. To check a list of service providers:

[10]:

Model.services()

[10]:

	Service Name
0	anthropic
1	azure
2	bedrock
3	deep_infra
4	deepseek
5	google
6	groq
7	mistral
8	ollama
9	openai
10	openai_v2
11	perplexity
12	together
13	xai

Here we specify a model to use to generate responses (if we do not specify a model, GPT-4o is used by default):

[11]:

model = Model("gemini-1.5-flash", service_name = "google")

Running the survey

We administer a survey by appending the components with the by() method and then calling the run() method. In the simplest case where we want a single agent or list of agents to answer all questions with the same scenarios, this takes the following form:

results = survey.by(scenarios).by(agents).by(models).run()
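
As a rough sense of scale, the sketch below counts agent and scenario pairings for this dataset, assuming that the chained form above combines every agent with every scenario (our reading of the simplest case). The names `agent_categories` and `scenario_keys` are plain-Python stand-ins, not EDSL objects.

```python
from itertools import product

categories = [
    "Content Writing",
    "Digital Marketing",
    "Graphic Design",
    "Web Development",
]
posts_per_category = 5

# Stand-ins: one agent per category, one (category, index) key per job post
agent_categories = categories
scenario_keys = [(c, i) for c in categories for i in range(posts_per_category)]

# Every agent paired with every job post
all_pairs = list(product(agent_categories, scenario_keys))

# Only pairs where the agent's category matches the post's category
matched_pairs = [(a, s) for a, s in all_pairs if a == s[0]]

print(len(all_pairs))      # 80
print(len(matched_pairs))  # 20
```

Under this assumption, only a quarter of the pairings involve an agent reviewing a post in its own category, which is the motivation for restricting each agent to its relevant posts.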

Here we have individual agents answer the questions only for category-specific job posts, and then combine the results:

[13]:

results = None

for job_category in job_categories:
    print("\n\nJob category: ", job_category)

    # Select the agent for the job category
    a = agents.filter(f"job_category == '{job_category}'")

    # Filter the relevant scenarios
    s = scenarios.filter(f"job_category == '{job_category}'")

    # Run the survey with the agent, scenarios and the model selected above
    job_category_results = survey.by(s).by(a).by(model).run()

    # Store the results
    if results is None:
        results = job_category_results
    else:
        results = results + job_category_results



Job category:  Digital Marketing

Job Status: Completed (5 completed, 0 failed)

Results UUID: 128694ba...e930
Use Results.pull(uuid) to fetch results.

Job UUID: 26d39b3e...82a9
Use Jobs.pull(uuid) to fetch job.
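
The loop above combines the per-category results with `+`, starting from a `None` placeholder. An alternative pattern is to collect the pieces in a list and fold them together at the end. The sketch below uses plain lists as stand-ins for the per-category results; it assumes only that the real objects support `+`, as the loop indicates.

```python
from functools import reduce
import operator

# Stand-ins for the per-category results (plain lists here)
per_category = [["dm_1", "dm_2"], ["wd_1"], ["gd_1", "gd_2"]]

# Fold the pieces together with +, left to right
combined = reduce(operator.add, per_category)
print(combined)
```

This avoids the special case for the first iteration, at the cost of holding all per-category pieces in memory until they are combined.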