Data labeling agents

This notebook shows how to conduct data labeling tasks using EDSL, an open-source library for simulating surveys, experiments and other research with AI agents and large language models. This workflow consists of the following steps:

  1. Import data into EDSL

  2. Create questions about the data

  3. Design an AI agent to answer the questions

  4. Select a language model to generate responses

  5. Analyze results as a formatted dataset

This workflow can be visualized as follows:

[Figure: general_survey.png]

Conducting agent-specific tasks

We can add a layer of complexity to this generalized flow by creating different AI agents for subsets of the data to be reviewed. For example, we can design agents with specific “expertise” to review only the data that is relevant to that expertise. This can be useful if our data is sorted (or sortable) in some way that is important to our task. We can also use EDSL to prompt a language model to sort the data as needed.

This modified workflow can be visualized as follows:

[Figure: agent_specific_survey.png]

Example task: Evaluating job posts

Using a dataset of job posts as an example, in the steps below we create AI agents with expertise in the relevant job categories and then prompt them to evaluate relevant job posts in a variety of ways. The steps are:

  1. Import a dataset of job categories and job posts.

  2. Construct questions about the job posts and combine them in a survey.

  3. Design AI agents with job category expertise.

  4. Administer the survey to each agent with job posts for the relevant category.

  5. Inspect the results using built-in methods for analysis.

Technical setup

Before running the code below, please ensure that you have completed the EDSL setup steps (installing the library and configuring access to language models).

Our Starter Tutorial provides examples of EDSL basic components. An introductory data labeling example notebook may also be useful to you.

Import the tools

We start by selecting question types and survey components that we will use. Please see the EDSL Docs for examples of all question types and details on these basic components.

[1]:
from edsl import (
    QuestionMultipleChoice, QuestionFreeText, QuestionLinearScale, QuestionList, QuestionNumerical,
    Survey, ScenarioList, Scenario, AgentList, Agent, ModelList, Model
)

Import data

Next we import a dataset for review, using Scenario objects to represent the individual data items that will be inserted into each of our data labeling questions.

[2]:
scenarios = ScenarioList.from_csv("job_posts.csv")

We can inspect the scenarios that have been created and edit them as desired. Learn more about working with scenarios.

[3]:
scenarios
[3]:
{
    "scenarios": [
        {
            "job_category": "Graphic Design",
            "job_title": "Logo Design for New Startup",
            "job_post": "We are a new tech startup looking for a creative designer to create a unique logo for our brand. The logo should be modern and represent innovation. Please provide portfolio examples."
        },
        {
            "job_category": "Graphic Design",
            "job_title": "Brochure Design",
            "job_post": "Looking for an experienced designer to create a professional brochure for our real estate company. The brochure should highlight our services and properties. Must be delivered in print-ready format."
        },
        {
            "job_category": "Graphic Design",
            "job_title": "Social Media Graphics",
            "job_post": "Need a designer to create eye-catching social media graphics for our upcoming campaign. We need a set of 10 images optimized for Instagram and Facebook."
        },
        {
            "job_category": "Graphic Design",
            "job_title": "Website Banner Design",
            "job_post": "Seeking a skilled designer to create a series of banners for our e-commerce website. Banners should be consistent with our brand\u2019s aesthetic. Please include examples of previous work."
        },
        {
            "job_category": "Web Development",
            "job_title": "WordPress Website Setup",
            "job_post": "We need a developer to set up a WordPress site for our small business. The site should be responsive and include a contact form, blog, and e-commerce functionality. Experience with WooCommerce is a plus."
        },
        {
            "job_category": "Web Development",
            "job_title": "Custom Web Application Development",
            "job_post": "Looking for a full-stack developer to build a custom web application for managing employee schedules. The app should include a login system, user roles, and reporting features."
        },
        {
            "job_category": "Web Development",
            "job_title": "Shopify Store Customization",
            "job_post": "Seeking a Shopify expert to customize our online store. We need theme adjustments, product page enhancements, and integration with third-party tools."
        },
        {
            "job_category": "Web Development",
            "job_title": "API Integration",
            "job_post": "Need a developer to integrate our existing CRM system with an external API. The integration should sync customer data in real-time. Previous experience with similar projects required."
        },
        {
            "job_category": "Content Writing",
            "job_title": "Blog Post Writing",
            "job_post": "Looking for a skilled writer to produce 5 blog posts on digital marketing topics. Each post should be 800-1000 words, well-researched, and SEO-optimized."
        },
        {
            "job_category": "Content Writing",
            "job_title": "Product Description Writing",
            "job_post": "We need a writer to craft compelling product descriptions for our online store. Each description should highlight the key features and benefits of the product."
        },
        {
            "job_category": "Content Writing",
            "job_title": "Technical Writing for Software Documentation",
            "job_post": "Seeking an experienced technical writer to create user manuals and API documentation for our software product. Must have a background in tech writing and be familiar with software development terminology."
        },
        {
            "job_category": "Content Writing",
            "job_title": "Website Copywriting",
            "job_post": "Looking for a copywriter to create persuasive content for our company\u2019s website. The content should be clear, concise, and align with our brand voice."
        },
        {
            "job_category": "Digital Marketing",
            "job_title": "Social Media Management",
            "job_post": "We are looking for a social media manager to handle our Instagram and Twitter accounts. Responsibilities include content creation, scheduling posts, and engaging with followers."
        },
        {
            "job_category": "Digital Marketing",
            "job_title": "SEO Optimization",
            "job_post": "Need an SEO expert to optimize our website for search engines. The project includes keyword research, on-page optimization, and link-building strategies."
        },
        {
            "job_category": "Digital Marketing",
            "job_title": "Google Ads Campaign Management",
            "job_post": "Looking for a PPC specialist to manage our Google Ads campaigns. The goal is to increase traffic and conversions for our online store."
        },
        {
            "job_category": "Digital Marketing",
            "job_title": "Email Marketing Campaign",
            "job_post": "Seeking an email marketing expert to design and execute a series of email campaigns for our new product launch. Experience with Mailchimp is preferred."
        },
        {
            "job_category": "Graphic Design",
            "job_title": "Infographic Design",
            "job_post": "We need a designer to create a visually appealing infographic based on our provided data. The infographic should be easy to understand and shareable on social media."
        },
        {
            "job_category": "Web Development",
            "job_title": "Landing Page Development",
            "job_post": "Looking for a developer to create a high-converting landing page for our marketing campaign. The page should be optimized for mobile and desktop users."
        },
        {
            "job_category": "Content Writing",
            "job_title": "Press Release Writing",
            "job_post": "We need a writer to draft a press release for our upcoming product launch. The release should be attention-grabbing and follow industry standards."
        },
        {
            "job_category": "Digital Marketing",
            "job_title": "Content Marketing Strategy",
            "job_post": "Seeking a content marketing strategist to develop a comprehensive plan to increase our online visibility. The strategy should include content creation, distribution, and performance tracking."
        }
    ]
}
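For reference, here is a minimal sketch of the file layout that the import step appears to rely on: one header row whose column names match the scenario keys shown above, followed by one row per job post. It uses only the Python standard library rather than EDSL, and the two rows are an abridged excerpt of the data above, so treat it as an illustration of the expected shape and not as the actual contents of job_posts.csv.

```python
import csv
import io

# Abridged, illustrative excerpt of job_posts.csv: a header row plus one row per job post.
csv_text = """job_category,job_title,job_post
Graphic Design,Brochure Design,"Looking for an experienced designer to create a professional brochure for our real estate company."
Web Development,API Integration,"Need a developer to integrate our existing CRM system with an external API."
"""

# Each row becomes a dictionary keyed by the column headers,
# mirroring the keys of the scenarios displayed above.
rows = list(csv.DictReader(io.StringIO(csv_text)))

for row in rows:
    print(row["job_category"], "|", row["job_title"])
```

If the column headers in a file differ from the placeholder names used in the questions, the placeholders will have nothing to match, so it is worth checking the headers before building the survey.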

Construct questions about the data

Next we construct questions to ask about the job posts, selecting question types based on the form of the response that we want to get back from the language model (multiple choice, linear scale, free text, numerical, etc.; see examples of all question types). We include a {{ placeholder }} for each scenario key in order to parameterize each question with each job post and category when we run the survey:

[4]:
q_skills = QuestionList(
    question_name="skills",
    question_text="""
    Consider the following job category and job post at an online labor marketplace.
    Job category: {{ job_category }}
    Job post: {{ job_post }}
    What are some key skills required for this job?
    """,
)

q_experience = QuestionMultipleChoice(
    question_name="experience",
    question_text="""
    Consider the following job category and job post at an online labor marketplace.
    Job category: {{ job_category }}
    Job post: {{ job_post }}
    What level of experience is required for this job?
    """,
    question_options=["Entry-level", "Mid-level", "Senior-level"],
)

q_days = QuestionNumerical(
    question_name="days",
    question_text="""
    Consider the following job category and job post at an online labor marketplace.
    Job category: {{ job_category }}
    Job post: {{ job_post }}
    Estimate the number of days until this job post is fulfilled.
    """,
)
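To make the parameterization concrete, the sketch below mimics what filling the {{ placeholder }} fields with one scenario produces. EDSL performs this substitution itself when the survey is run; the `render` helper here is a hypothetical stand-in written with the standard library, not part of EDSL.

```python
import re

def render(template: str, scenario: dict) -> str:
    """Replace each {{ key }} in the template with the matching scenario value."""
    def substitute(match):
        key = match.group(1)
        # Leave unknown placeholders untouched so that mistakes stay visible.
        return str(scenario.get(key, match.group(0)))
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

question_text = (
    "Job category: {{ job_category }}\n"
    "Job post: {{ job_post }}\n"
    "What level of experience is required for this job?"
)

scenario = {
    "job_category": "Web Development",
    "job_post": "Need a developer to integrate our existing CRM system with an external API.",
}

rendered = render(question_text, scenario)
print(rendered)
```

Each scenario yields its own rendered copy of every question, which is why a single question definition can cover the whole dataset.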

Combining questions into a Survey

Next we combine our questions into a survey that will be administered to the AI agents. By default, the questions will be administered asynchronously. If desired, we can also specify survey rules (skip/stop logic) and within-survey memories of prior questions and responses. See the EDSL Docs for details on methods for applying survey rules.

[5]:
survey = Survey(questions=[q_skills, q_experience, q_days])

Creating personas for Agents

Next we draft personas for AI agents that will answer the questions. For each job category we construct an AI agent that is an expert in the category. Agents are constructed by passing a dictionary of traits to an Agent object. Learn more about designing AI agents to answer surveys.

To get the set of job categories from the scenarios:

[6]:
job_categories = list(set(scenarios.select("job_category").to_list()))
job_categories
[6]:
['Digital Marketing', 'Graphic Design', 'Content Writing', 'Web Development']
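Note that a Python set does not guarantee a stable order, so the list above may come out in a different order on another run. If a reproducible order matters (for example, when comparing runs), either of the plain-Python options sketched below can be applied to the list of category values. The `raw_categories` list is an abridged, illustrative stand-in for the values selected from the scenarios.

```python
# Category values in the order they might appear in the dataset (abridged).
raw_categories = [
    "Graphic Design", "Graphic Design", "Web Development",
    "Content Writing", "Digital Marketing", "Graphic Design",
]

# Option 1: sort alphabetically for a reproducible order.
alphabetical = sorted(set(raw_categories))

# Option 2: keep first-appearance order (dicts preserve insertion order).
first_seen = list(dict.fromkeys(raw_categories))

print(alphabetical)
print(first_seen)
```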

Next we use them to create an agent for each job category:

[7]:
agents = AgentList(
    Agent(
        traits = {
            "persona": "You are an experienced freelancer on online labor marketplaces.",
            "job_category": job_category,
            "expertise": f"You regularly perform jobs in the following category: {job_category}."
        }
    ) for job_category in job_categories
)
agents
[7]:
[
    {
        "traits": {
            "persona": "You are an experienced freelancer on online labor marketplaces.",
            "job_category": "Digital Marketing",
            "expertise": "You regularly perform jobs in the following category: Digital Marketing."
        },
        "edsl_version": "0.1.32.dev1",
        "edsl_class_name": "Agent"
    },
    {
        "traits": {
            "persona": "You are an experienced freelancer on online labor marketplaces.",
            "job_category": "Graphic Design",
            "expertise": "You regularly perform jobs in the following category: Graphic Design."
        },
        "edsl_version": "0.1.32.dev1",
        "edsl_class_name": "Agent"
    },
    {
        "traits": {
            "persona": "You are an experienced freelancer on online labor marketplaces.",
            "job_category": "Content Writing",
            "expertise": "You regularly perform jobs in the following category: Content Writing."
        },
        "edsl_version": "0.1.32.dev1",
        "edsl_class_name": "Agent"
    },
    {
        "traits": {
            "persona": "You are an experienced freelancer on online labor marketplaces.",
            "job_category": "Web Development",
            "expertise": "You regularly perform jobs in the following category: Web Development."
        },
        "edsl_version": "0.1.32.dev1",
        "edsl_class_name": "Agent"
    }
]

Selecting language models

EDSL works with many popular language models that we can select to generate the agents’ responses to the survey. We can check a current list of available models:

[8]:
from edsl import Model

# Uncomment the next line to list the available models:
# Model.available()

If we do not specify a model to use, GPT-4 preview is used by default. Here we create a Model for GPT-4 that we will add to the survey when we run it:

[9]:
model = Model("gpt-4")

Running the survey

We administer a survey by appending the components with the by() method and then calling the run() method. In the simplest case, where we want a single agent or list of agents to answer all questions with the same scenarios, this takes the following form:

results = survey.by(scenarios).by(agents).by(model).run()

Here we have individual agents answer the questions only for category-specific job posts:

[10]:
results = {}

for job_category in job_categories:

    # Select the agent for the job category
    a = agents.filter(f"job_category == '{job_category}'")

    # Select the scenarios for the job category
    s = scenarios.filter(f"job_category == '{job_category}'")

    # Run the survey with the agent, scenarios and model
    job_category_results = survey.by(s).by(a).by(model).run()

    # Store the results
    results[job_category] = job_category_results
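As a rough check on the size of this job, the arithmetic below compares the category-specific runs with the simpler form in which every agent reviews every job post. It assumes a single model and one answer per question, agent and scenario combination; the post counts are taken from the dataset shown earlier.

```python
# Number of job posts per category in the dataset shown earlier.
posts_per_category = {
    "Graphic Design": 5,
    "Web Development": 5,
    "Content Writing": 5,
    "Digital Marketing": 5,
}
n_questions = 3  # skills, experience, days
n_agents = len(posts_per_category)  # one agent per category

# Category-specific runs: each agent sees only its own category's posts.
targeted = sum(n * n_questions for n in posts_per_category.values())

# Full cross: every agent answers every question for every post.
full_cross = sum(posts_per_category.values()) * n_agents * n_questions

print(targeted, full_cross)  # 60 240
```

Restricting each agent to its own category therefore requests a quarter of the answers that the full cross would, in addition to keeping each review within the agent's stated expertise.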

Accessing Results

In the previous step we created a separate Results object for each agent, stored in a dictionary keyed by job category. In the next steps we show how to access results with built-in print and analytical methods.

We first list the dictionary keys, then inspect the column names of one Results object to identify the fields that we want to select:

[11]:
results.keys()
[11]:
dict_keys(['Digital Marketing', 'Graphic Design', 'Content Writing', 'Web Development'])
[12]:
results["Graphic Design"].columns
[12]:
['agent.agent_instruction',
 'agent.agent_name',
 'agent.expertise',
 'agent.persona',
 'answer.days',
 'answer.experience',
 'answer.skills',
 'comment.days_comment',
 'comment.experience_comment',
 'comment.skills_comment',
 'model.frequency_penalty',
 'model.logprobs',
 'model.max_tokens',
 'model.model',
 'model.presence_penalty',
 'model.temperature',
 'model.top_logprobs',
 'model.top_p',
 'prompt.days_system_prompt',
 'prompt.days_user_prompt',
 'prompt.experience_system_prompt',
 'prompt.experience_user_prompt',
 'prompt.skills_system_prompt',
 'prompt.skills_user_prompt',
 'question_options.days_question_options',
 'question_options.experience_question_options',
 'question_options.skills_question_options',
 'question_text.days_question_text',
 'question_text.experience_question_text',
 'question_text.skills_question_text',
 'question_type.days_question_type',
 'question_type.experience_question_type',
 'question_type.skills_question_type',
 'raw_model_response.days_raw_model_response',
 'raw_model_response.experience_raw_model_response',
 'raw_model_response.skills_raw_model_response',
 'scenario.job_category',
 'scenario.job_post',
 'scenario.job_title']
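Each column name is a prefix (agent, answer, comment, scenario, etc.) followed by a field name, and the select calls in this notebook pass the field name alone. The sketch below shows one way such short names could be matched to full column names. The `resolve` helper is hypothetical and written only for illustration; EDSL's own matching logic may differ.

```python
# A subset of the column names listed above.
columns = [
    "agent.persona",
    "answer.days",
    "answer.experience",
    "answer.skills",
    "comment.experience_comment",
    "scenario.job_post",
]

def resolve(name: str, columns: list) -> str:
    """Return the full column name matching a short or full name (hypothetical helper)."""
    matches = [c for c in columns if c == name or c.split(".", 1)[1] == name]
    if len(matches) != 1:
        raise KeyError(f"{name!r} matches {len(matches)} columns")
    return matches[0]

print(resolve("job_post", columns))            # scenario.job_post
print(resolve("experience_comment", columns))  # comment.experience_comment
```

Note that the comment fields carry the comment prefix, not the answer prefix, which matters whenever a full column name has to be spelled out.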

We can select individual fields in a variety of ways:

[13]:
(results["Graphic Design"].select("job_post", "skills", "experience", "days").print())
scenario.job_post | answer.skills | answer.experience | answer.days
We are a new tech startup looking for a creative designer to create a unique logo for our brand. The logo should be modern and represent innovation. Please provide portfolio examples. | ['Creativity', 'Adobe Creative Suite proficiency', 'Branding', 'Typography', 'Color Theory', 'Vector Graphic Design', 'Sketching', 'Understanding of Modern Design Trends', 'Communication', 'Conceptual Thinking'] | Mid-level | 7
Looking for an experienced designer to create a professional brochure for our real estate company. The brochure should highlight our services and properties. Must be delivered in print-ready format. | ['Adobe InDesign', 'Adobe Photoshop', 'Layout Design', 'Typography', 'Print Design', 'Color Theory', 'Branding', 'Attention to Detail', 'Communication Skills', 'Time Management'] | Mid-level | 14
Need a designer to create eye-catching social media graphics for our upcoming campaign. We need a set of 10 images optimized for Instagram and Facebook. | ['Adobe Photoshop', 'Adobe Illustrator', 'Graphic Design', 'Social Media Branding', 'Creative Conceptualization', 'Layout and Composition', 'Typography', 'Color Theory', 'Image Optimization', 'Content Scheduling', 'Marketing', 'Visual Communication', 'Attention to Detail', 'Time Management'] | Mid-level | 7
Seeking a skilled designer to create a series of banners for our e-commerce website. Banners should be consistent with our brand’s aesthetic. Please include examples of previous work. | ['Adobe Photoshop', 'Adobe Illustrator', 'Graphic Design', 'Branding', 'Visual Communication', 'Creativity', 'Typography', 'Layout Design', 'Color Theory', 'Marketing', 'UX/UI Principles', 'Attention to Detail', 'Time Management'] | Mid-level | 7
We need a designer to create a visually appealing infographic based on our provided data. The infographic should be easy to understand and shareable on social media. | ['Data visualization', 'Graphic design', 'Creativity', 'Attention to detail', 'Typography', 'Color theory', 'Branding', 'Layout and composition', 'Software proficiency (e.g., Adobe Creative Suite)', 'Social media understanding'] | Mid-level | 7
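Once the labels are in tabular form, ordinary Python is enough for quick summaries. The sketch below uses the experience and days values from the "Graphic Design" table above, typed in by hand for illustration. In practice they would be extracted from the Results object, and that extraction step is not shown here.

```python
from collections import Counter
from statistics import mean, median

# Values copied from the "Graphic Design" table above.
days = [7, 14, 7, 7, 7]
experience = ["Mid-level"] * 5

print(mean(days))    # 8.4
print(median(days))  # 7
print(Counter(experience).most_common(1))  # [('Mid-level', 5)]
```

Here every post was labeled "Mid-level", and the single 14-day estimate pulls the mean above the median.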

We can apply some labels to our table for readability. Note that each question field also automatically includes a <question>_comment field for any commentary by the LLM on the question:

[14]:
(
    results["Graphic Design"]
    .select("job_post", "experience", "experience_comment")
    .print(
        pretty_labels={
            "scenario.job_post": "Job post description",
            "answer.experience": "Experience level",
            "comment.experience_comment": "Comment",
        }
    )
)
Job post description | Experience level | Comment
We are a new tech startup looking for a creative designer to create a unique logo for our brand. The logo should be modern and represent innovation. Please provide portfolio examples. | Mid-level | The job post specifies the need for a unique and modern logo that represents innovation, which implies a certain level of creative skill and understanding of brand identity. This typically requires more than just basic knowledge, suggesting that a mid-level designer with a solid portfolio would be suitable for this task. Entry-level might lack the experience in brand representation, and senior-level might be overqualified for just a logo design unless the startup is looking for a more comprehensive brand development strategy.
Looking for an experienced designer to create a professional brochure for our real estate company. The brochure should highlight our services and properties. Must be delivered in print-ready format. | Mid-level | Creating a professional brochure requires a designer who has a good understanding of layout, typography, and branding to effectively highlight the company's services and properties. Mid-level experience is suitable for this task as it requires someone who can deliver a polished, print-ready product without needing senior-level expertise, but is beyond the capabilities of most entry-level designers.
Need a designer to create eye-catching social media graphics for our upcoming campaign. We need a set of 10 images optimized for Instagram and Facebook. | Mid-level | Creating a set of eye-catching social media graphics requires someone with a good understanding of design principles and social media optimization. A mid-level designer would likely have the necessary skills and experience to execute this task effectively.
Seeking a skilled designer to create a series of banners for our e-commerce website. Banners should be consistent with our brand’s aesthetic. Please include examples of previous work. | Mid-level | Creating a series of banners that align with a brand's aesthetic requires a designer to have a good understanding of branding, design principles, and possibly some marketing knowledge. This typically goes beyond entry-level skills but doesn't necessarily require senior-level experience. Mid-level is appropriate as it assumes the designer has the necessary skills to create cohesive designs that can engage customers and represent the brand effectively.
We need a designer to create a visually appealing infographic based on our provided data. The infographic should be easy to understand and shareable on social media. | Mid-level | Creating an infographic that is both visually appealing and easy to understand requires a good grasp of design principles and the ability to present data in a clear and engaging way. This suggests that a mid-level designer with experience in creating infographics and understanding how to optimize them for social media would be well-suited for this job.

We can also access results as a SQL table (called self) with the .sql() method, choosing between a “wide” horizontal view of all fields and a “long” vertical view, and optionally removing the column name prefixes ‘agent’, ‘model’, ‘prompt’, etc.:

[15]:
results["Graphic Design"].sql("select * from self", shape="long")
[15]:
     id   data_type      key                       value
0    0    agent          persona                   You are an experienced freelancer on online la...
1    0    agent          job_category              Graphic Design
2    0    agent          expertise                 You regularly perform jobs in the following ca...
3    0    agent          agent_name                Agent_0
4    0    agent          agent_instruction         You are answering questions as if you were a h...
...  ...  ...            ...                       ...
210  4    question_type  experience_question_type  multiple_choice
211  4    question_type  days_question_type        numerical
212  4    comment        skills_comment            These skills are essential for creating an eff...
213  4    comment        experience_comment        Creating an infographic that is both visually ...
214  4    comment        days_comment              Based on my experience, graphic design jobs li...

215 rows × 4 columns
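To illustrate why the "long" shape is convenient for querying, the sketch below loads a few of the rows shown above into an in-memory SQLite table named self and filters on data_type. It uses Python's built-in sqlite3 module directly and is only an analogy for what the .sql() method exposes; how EDSL builds its table internally is not covered here.

```python
import sqlite3

# A few (id, data_type, key, value) rows in the "long" shape shown above.
long_rows = [
    (0, "agent", "job_category", "Graphic Design"),
    (0, "agent", "agent_name", "Agent_0"),
    (4, "question_type", "experience_question_type", "multiple_choice"),
    (4, "question_type", "days_question_type", "numerical"),
]

conn = sqlite3.connect(":memory:")
# "self" is quoted to make clear that it is used as a table name.
conn.execute('CREATE TABLE "self" (id INTEGER, data_type TEXT, key TEXT, value TEXT)')
conn.executemany('INSERT INTO "self" VALUES (?, ?, ?, ?)', long_rows)

# In the long shape, filtering on data_type picks out one group of fields.
question_types = conn.execute(
    'SELECT key, value FROM "self" WHERE data_type = ? ORDER BY key',
    ("question_type",),
).fetchall()
conn.close()

print(question_types)
```

In the "wide" shape the same information would instead be spread across one column per field, with one row per result.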

Posting content at the Coop

We can post any EDSL objects to the Coop, including this notebook:

[16]:
agents.push(description = "Agents for job posts data labeling task", visibility = "public")
[16]:
{'description': 'Agents for job posts data labeling task',
 'object_type': 'agent_list',
 'url': 'https://www.expectedparrot.com/content/48ca9a15-b503-4da4-8b98-ab7dd906571c',
 'uuid': '48ca9a15-b503-4da4-8b98-ab7dd906571c',
 'version': '0.1.32.dev1',
 'visibility': 'public'}
[17]:
survey.push(description = "Survey for job posts data labeling task", visibility = "public")
[17]:
{'description': 'Survey for job posts data labeling task',
 'object_type': 'survey',
 'url': 'https://www.expectedparrot.com/content/c0ec7ac9-dd25-4599-9544-4bede16060a2',
 'uuid': 'c0ec7ac9-dd25-4599-9544-4bede16060a2',
 'version': '0.1.32.dev1',
 'visibility': 'public'}
[18]:
for job_category in job_categories:
    results[job_category].push(description = f"Results for job posts data labeling task: {job_category}", visibility = "public")
[19]:
from edsl import Notebook

n = Notebook(path = "data_labeling_agent.ipynb")

n.push(description = "Example code for data labeling using agents", visibility = "public")
[19]:
{'description': 'Example code for data labeling using agents',
 'object_type': 'notebook',
 'url': 'https://www.expectedparrot.com/content/75e4f590-9b1f-482e-bd80-f9d1fb454f2e',
 'uuid': '75e4f590-9b1f-482e-bd80-f9d1fb454f2e',
 'version': '0.1.32.dev1',
 'visibility': 'public'}

To update an object:

[22]:
n = Notebook(path = "data_labeling_agent.ipynb")

n.patch(uuid = "75e4f590-9b1f-482e-bd80-f9d1fb454f2e", value = n)
[22]:
{'status': 'success'}
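The UUID passed to patch above is the one returned when the notebook was first pushed. A small sketch of reading it from the returned dictionary, instead of retyping it, is shown below. It assumes the push call returns a dictionary like the one displayed earlier; the `push_info` variable is introduced here only for illustration.

```python
import uuid

# Response copied from the notebook push above.
push_info = {
    "description": "Example code for data labeling using agents",
    "object_type": "notebook",
    "url": "https://www.expectedparrot.com/content/75e4f590-9b1f-482e-bd80-f9d1fb454f2e",
    "uuid": "75e4f590-9b1f-482e-bd80-f9d1fb454f2e",
    "version": "0.1.32.dev1",
    "visibility": "public",
}

object_uuid = push_info["uuid"]

# Sanity checks: the value parses as a UUID and matches the end of the URL.
assert str(uuid.UUID(object_uuid)) == object_uuid
assert push_info["url"].endswith(object_uuid)

print(object_uuid)
```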

Learn more about using the Coop to conduct LLM-based research.