Concept induction

This notebook provides sample EDSL code for using language models to identify concepts in unstructured texts, generate criteria for each concept, and then apply those criteria to evaluate the texts.

This idea is inspired by the recent paper: Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM

Technical setup

Before running the code below, please ensure that you have installed the EDSL library and either activated remote inference from your Coop account or stored API keys for the language models that you want to use with EDSL. Please also see our documentation page for tips and tutorials on getting started using EDSL.
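
As a quick sanity check before running the cells, the sketch below reports which API key variables are visible to the current Python process. The variable names in CANDIDATE_KEYS and the helper configured_keys are assumptions made for illustration; the EDSL documentation is the authority on the names your providers actually use.

```python
import os

# Assumed variable names, for illustration only; confirm them in the EDSL docs.
CANDIDATE_KEYS = ["EXPECTED_PARROT_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY"]


def configured_keys(env, candidates=CANDIDATE_KEYS):
    """Return the candidate variable names that have a non-empty value in env."""
    return [name for name in candidates if env.get(name, "").strip()]


found = configured_keys(os.environ)
if found:
    print("Found credentials for:", ", ".join(found))
else:
    print("No API keys found in the environment.")
```

An empty result does not necessarily mean the setup is wrong, since keys may also be supplied in other ways (for example, through a Coop account with remote inference activated).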

Identify concepts

We start by creating a general question prompting the respondent (a language model) to identify concepts in a given text.

EDSL comes with a variety of question types that we can choose from based on the form of the response that we want to get back from the model. QuestionList may be appropriate where we want the response to be formatted as a list of strings:

[1]:
from edsl import QuestionList

q_concepts = QuestionList(
    question_name="concepts",
    question_text="Identify the key concepts in the following text: {{ text }}",
    # Optional: pass max_list_items=<int> to cap the number of items returned
)

We might also want to ask some other questions about our data at the same time (a data labeling task). For example:

[2]:
from edsl import QuestionMultipleChoice

q_sentiment = QuestionMultipleChoice(
    question_name="sentiment",
    question_text="Identify the sentiment of this text: {{ text }}",
    question_options=["Negative", "Neutral", "Positive"],
)

The {{ text }} placeholder lets us run the same questions for each of our texts. We do this by creating Scenario objects for our data (here, some recent tweets by President Biden):

[3]:
# Replace with your data
texts = [  # POTUS recent tweets
    "Tune in as I deliver the keynote address at the U.S. Holocaust Memorial Museum’s Annual Days of Remembrance ceremony in Washington, D.C.",
    "We’re a nation of immigrants. A nation of dreamers. And as Cinco de Mayo represents, a nation of freedom.",
    "Medicare is stronger and Social Security remains strong. My economic plan has helped extend Medicare solvency by a decade. And I am committed to extending Social Security solvency by making the rich pay their fair share.",
    "Today, the Army Black Knights are taking home West Point’s 10th Commander-in-Chief Trophy. They should be proud. I’m proud of them too – not for the wins, but because after every game they hang up their uniforms and put on another: one representing the United States.",
    "This Holocaust Remembrance Day, we mourn the six million Jews who were killed by the Nazis during one of the darkest chapters in human history. And we recommit to heeding the lessons of the Shoah and realizing the responsibility of 'Never Again.'",
    "The recipients of the Presidential Medal of Freedom haven't just kept faith in freedom. They kept all of America's faith in a better tomorrow.",
    "Like Jill says, 'Teaching isn’t just a job. It’s a calling.' She knows that in her bones, and I know every educator who joined us at the White House for the first-ever Teacher State Dinner lives out that truth every day.",
    "Jill and I send warm wishes to Orthodox Christian communities around the world as they celebrate Easter. May the Lord bless and keep you this Easter Sunday and in the year ahead.",
    "Dreamers are our loved ones, nurses, teachers, and small business owners – they deserve the promise of health care just like all of us. Today, my Administration is making that real by expanding affordable health coverage through the Affordable Care Act to DACA recipients.",
    "With today’s report of 175,000 new jobs, the American comeback continues. Congressional Republicans are fighting to cut taxes for billionaires and let special interests rip folks off, I'm focused on job creation and building an economy that works for the families I grew up with.",
]
len(texts)
[3]:
10
[4]:
from edsl import ScenarioList

scenarios = ScenarioList.from_list("text", texts)
# scenarios
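
To make the parameterization concrete, here is a minimal plain-Python sketch of what "one scenario per text" amounts to: each scenario is a small key/value record, and the {{ text }} placeholder in the question is filled from it. The helper render_prompt and the names sample_texts and scenario_dicts are hypothetical, and the regular-expression substitution is only a stand-in; EDSL's own rendering is not shown in this notebook and may differ.

```python
import re

question_text = "Identify the key concepts in the following text: {{ text }}"

# Single sentences taken from two of the tweets above
sample_texts = [
    "Medicare is stronger and Social Security remains strong.",
    "May the Lord bless and keep you this Easter Sunday and in the year ahead.",
]


def render_prompt(template, scenario):
    """Fill each {{ key }} placeholder with the matching scenario value."""

    def substitute(match):
        return str(scenario[match.group(1)])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


# One record per text, keyed by the placeholder name used in the question
scenario_dicts = [{"text": t} for t in sample_texts]
prompts = [render_prompt(question_text, s) for s in scenario_dicts]

for prompt in prompts:
    print(prompt)
```

The key passed to ScenarioList.from_list ("text") has to match the placeholder name in the question text for the substitution to line up.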

Next we combine the questions into a survey in order to administer them together. By default the questions are administered asynchronously, but we can also add skip/stop rules or other logic (learn more about Survey methods in our documentation):

[5]:
from edsl import Survey

survey = Survey(questions=[q_concepts, q_sentiment])

We add the scenarios to the survey and then run it to generate a dataset of results:

[6]:
results = survey.by(scenarios).run()

EDSL comes with built-in methods for working with results in a variety of forms (data tables, SQL queries, dataframes, JSON, CSV). We can inspect the columns attribute of the results to see a list of all the components that we can analyze:

[7]:
results.columns
[7]:
['agent.agent_instruction',
 'agent.agent_name',
 'answer.concepts',
 'answer.sentiment',
 'comment.k_comment',
 'generated_tokens.concepts_generated_tokens',
 'generated_tokens.sentiment_generated_tokens',
 'iteration.iteration',
 'model.frequency_penalty',
 'model.logprobs',
 'model.max_tokens',
 'model.model',
 'model.presence_penalty',
 'model.temperature',
 'model.top_logprobs',
 'model.top_p',
 'prompt.concepts_system_prompt',
 'prompt.concepts_user_prompt',
 'prompt.sentiment_system_prompt',
 'prompt.sentiment_user_prompt',
 'question_options.concepts_question_options',
 'question_options.sentiment_question_options',
 'question_text.concepts_question_text',
 'question_text.sentiment_question_text',
 'question_type.concepts_question_type',
 'question_type.sentiment_question_type',
 'raw_model_response.concepts_cost',
 'raw_model_response.concepts_one_usd_buys',
 'raw_model_response.concepts_raw_model_response',
 'raw_model_response.sentiment_cost',
 'raw_model_response.sentiment_one_usd_buys',
 'raw_model_response.sentiment_raw_model_response',
 'scenario.text']
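
The column names above follow a component.field pattern (for example, answer.sentiment or scenario.text). The sketch below groups a handful of them by component, which can make a long column list easier to scan. It is plain Python operating on a copied subset of the names; group_by_prefix is a hypothetical helper, not an EDSL method.

```python
from collections import defaultdict

# A few of the column names listed above
columns = [
    "answer.concepts",
    "answer.sentiment",
    "model.model",
    "model.temperature",
    "prompt.concepts_user_prompt",
    "scenario.text",
]


def group_by_prefix(names):
    """Map each component prefix to the list of field names under it."""
    groups = defaultdict(list)
    for name in names:
        prefix, _, field = name.partition(".")
        groups[prefix].append(field)
    return dict(groups)


grouped = group_by_prefix(columns)
for prefix, fields in grouped.items():
    print(f"{prefix}: {fields}")
```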

We can select and print specific components to inspect in a table:

[8]:
results.select("text", "concepts", "sentiment").print(format="rich")
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ scenario                                         answer                                            answer     ┃
┃ .text                                            .concepts                                         .sentiment ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ With today’s report of 175,000 new jobs, the     ['175,000 new jobs', 'American comeback',         Positive   │
│ American comeback continues. Congressional       'Congressional Republicans', 'cut taxes for                  │
│ Republicans are fighting to cut taxes for        billionaires', 'special interests', 'job                     │
│ billionaires and let special interests rip       creation', 'economy that works for families']                │
│ folks off, I'm focused on job creation and                                                                    │
│ building an economy that works for the families                                                               │
│ I grew up with.                                                                                               │
├─────────────────────────────────────────────────┼──────────────────────────────────────────────────┼────────────┤
│ Like Jill says, 'Teaching isn’t just a job.      ['Teaching is a calling', 'Jill', 'Teacher State  Positive   │
│ It’s a calling.' She knows that in her bones,    Dinner', 'White House', 'educator']                          │
│ and I know every educator who joined us at the                                                                │
│ White House for the first-ever Teacher State                                                                  │
│ Dinner lives out that truth every day.                                                                        │
├─────────────────────────────────────────────────┼──────────────────────────────────────────────────┼────────────┤
│ Jill and I send warm wishes to Orthodox          ['Jill', 'Orthodox Christian communities',        Positive   │
│ Christian communities around the world as they   'Easter', 'Lord bless', 'Easter Sunday', 'year               │
│ celebrate Easter. May the Lord bless and keep    ahead']                                                      │
│ you this Easter Sunday and in the year ahead.                                                                 │
├─────────────────────────────────────────────────┼──────────────────────────────────────────────────┼────────────┤
│ Dreamers are our loved ones, nurses, teachers,   ['Dreamers', 'health care', 'Affordable Care      Positive   │
│ and small business owners – they deserve the     Act', 'DACA recipients']                                     │
│ promise of health care just like all of us.                                                                   │
│ Today, my Administration is making that real by                                                               │
│ expanding affordable health coverage through                                                                  │
│ the Affordable Care Act to DACA recipients.                                                                   │
├─────────────────────────────────────────────────┼──────────────────────────────────────────────────┼────────────┤
│ The recipients of the Presidential Medal of      ['Presidential Medal of Freedom', 'faith in       Positive   │
│ Freedom haven't just kept faith in freedom.      freedom', "America's faith", 'better tomorrow']              │
│ They kept all of America's faith in a better                                                                  │
│ tomorrow.                                                                                                     │
├─────────────────────────────────────────────────┼──────────────────────────────────────────────────┼────────────┤
│ This Holocaust Remembrance Day, we mourn the     ['Holocaust Remembrance Day', 'six million        Neutral    │
│ six million Jews who were killed by the Nazis    Jews', 'Nazis', 'darkest chapters in human                   │
│ during one of the darkest chapters in human      history', 'Shoah', 'Never Again']                            │
│ history. And we recommit to heeding the lessons                                                               │
│ of the Shoah and realizing the responsibility                                                                 │
│ of 'Never Again.'                                                                                             │
├─────────────────────────────────────────────────┼──────────────────────────────────────────────────┼────────────┤
│ Medicare is stronger and Social Security         ['Medicare', 'Social Security', 'economic plan',  Positive   │
│ remains strong. My economic plan has helped      'Medicare solvency', 'Social Security solvency',             │
│ extend Medicare solvency by a decade. And I am   'rich pay their fair share']                                 │
│ committed to extending Social Security solvency                                                               │
│ by making the rich pay their fair share.                                                                      │
├─────────────────────────────────────────────────┼──────────────────────────────────────────────────┼────────────┤
│ Today, the Army Black Knights are taking home    ['Army Black Knights', 'West Point', '10th        Positive   │
│ West Point’s 10th Commander-in-Chief Trophy.     Commander-in-Chief Trophy', 'pride', 'United                 │
│ They should be proud. I’m proud of them too –    States']                                                     │
│ not for the wins, but because after every game                                                                │
│ they hang up their uniforms and put on another:                                                               │
│ one representing the United States.                                                                           │
├─────────────────────────────────────────────────┼──────────────────────────────────────────────────┼────────────┤
│ We’re a nation of immigrants. A nation of        ['nation of immigrants', 'nation of dreamers',    Positive   │
│ dreamers. And as Cinco de Mayo represents, a     'Cinco de Mayo', 'nation of freedom']                        │
│ nation of freedom.                                                                                            │
├─────────────────────────────────────────────────┼──────────────────────────────────────────────────┼────────────┤
│ Tune in as I deliver the keynote address at the  ['keynote address', 'U.S. Holocaust Memorial      Neutral    │
│ U.S. Holocaust Memorial Museum’s Annual Days of  Museum', 'Annual Days of Remembrance',                       │
│ Remembrance ceremony in Washington, D.C.         'Washington, D.C.']                                          │
└─────────────────────────────────────────────────┴──────────────────────────────────────────────────┴────────────┘
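
Once the labels are in a plain list, a quick tally gives an overview of the labeling task. The sketch below hard-codes the sentiment column from the table above so that it runs on its own; with live results, the same list could presumably be pulled with results.select("sentiment").to_list(), following the pattern used elsewhere in this notebook.

```python
from collections import Counter

# Sentiment labels copied from the table above, top to bottom
sentiments = [
    "Positive", "Positive", "Positive", "Positive", "Positive",
    "Neutral", "Positive", "Positive", "Positive", "Neutral",
]

counts = Counter(sentiments)
for label, n in counts.most_common():
    print(f"{label}: {n} of {len(sentiments)}")
# Positive: 8 of 10
# Neutral: 2 of 10
```

Note that "Negative" was offered as an option but was not selected for any of these texts.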

If the lists of concepts are too long, we can run another question prompting a model to condense them. We can also specify the maximum number of concepts that we want to get back:

[9]:
# Flattening our list of lists for all the texts to use in a follow-on question:
concepts_list = results.select("concepts").to_list(flatten=True)
# concepts_list
[10]:
q_condense = QuestionList(
    question_name="condense",
    question_text="Return a condensed list of the following list of concepts: "
    + ", ".join(concepts_list),
    max_list_items=10,
)
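
The flattened list can contain repeats (for example, 'Jill' appears in two of the per-text lists above), and these lengthen the prompt without adding information. As an optional extra step that is not part of the notebook's own workflow, the sketch below flattens nested lists and drops case-insensitive duplicates while preserving order. The helper flatten_unique is hypothetical and the input is copied from two rows of the results table.

```python
# Two of the per-text concept lists from the results table above
nested = [
    ["Teaching is a calling", "Jill", "Teacher State Dinner", "White House", "educator"],
    ["Jill", "Orthodox Christian communities", "Easter", "Lord bless", "Easter Sunday", "year ahead"],
]


def flatten_unique(list_of_lists):
    """Flatten nested lists, keeping the first occurrence of each item (case-insensitive)."""
    seen = set()
    flat = []
    for sublist in list_of_lists:
        for item in sublist:
            key = item.casefold()
            if key not in seen:
                seen.add(key)
                flat.append(item)
    return flat


unique_concepts = flatten_unique(nested)
prompt = (
    "Return a condensed list of the following list of concepts: "
    + ", ".join(unique_concepts)
)
print(len(unique_concepts), "unique concepts")
print(prompt)
```

Whether to deduplicate is a judgment call: repeated concepts may signal importance to the model, so leaving them in can be a reasonable choice too.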

Note that we can call the run() method on either a survey of questions or an individual question:

[11]:
results = q_condense.run()
[12]:
results.select("condense").print(format="rich")
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ answer                                                                                                          ┃
┃ .condense                                                                                                       ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ ['Job creation', 'Economy for families', 'Teaching as a calling', 'Orthodox Christian Easter', 'Affordable Care │
│ Act', 'DACA recipients', 'Holocaust Remembrance', 'Medicare and Social Security solvency', 'Nation of           │
│ immigrants', 'U.S. Holocaust Memorial Museum']                                                                  │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Identify criteria for each concept

As in the first step, we next run a question prompting the model to generate criteria for each concept. Here we use QuestionFreeText to get the criteria back as unstructured narrative text:

[13]:
from edsl import QuestionFreeText

q_criteria = QuestionFreeText(
    question_name="criteria",
    # Adjacent string literals are joined, so the prompt has no stray newline or indentation
    question_text=(
        "Describe key criteria for determining whether a text is primarily about the "
        "following concept: {{ concept }}"
    ),
)

For this question, the scenarios are the concepts that we generated:

[14]:
condensed_concepts_list = results.select("condense").to_list(flatten=True)

scenarios = ScenarioList.from_list("concept", condensed_concepts_list)
scenarios
[14]:
{
    "scenarios": [
        {
            "concept": "Job creation"
        },
        {
            "concept": "Economy for families"
        },
        {
            "concept": "Teaching as a calling"
        },
        {
            "concept": "Orthodox Christian Easter"
        },
        {
            "concept": "Affordable Care Act"
        },
        {
            "concept": "DACA recipients"
        },
        {
            "concept": "Holocaust Remembrance"
        },
        {
            "concept": "Medicare and Social Security solvency"
        },
        {
            "concept": "Nation of immigrants"
        },
        {
            "concept": "U.S. Holocaust Memorial Museum"
        }
    ]
}
[15]:
results = q_criteria.by(scenarios).run()
[16]:
results.select("concept", "criteria").print(format="rich")
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ scenario                               answer                                                                  ┃
┃ .concept                               .criteria                                                               ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Job creation                           To determine whether a text is primarily about the concept of job       │
│                                        creation, you can look for several key criteria:                        │
│                                                                                                                │
│                                        1. **Main Topic Focus**: The text should have job creation as its       │
│                                        central theme. This means the bulk of the content discusses how jobs    │
│                                        are being created, the factors influencing job creation, or the         │
│                                        outcomes of job creation efforts.                                       │
│                                                                                                                │
│                                        2. **Economic Indicators**: The text should reference economic          │
│                                        indicators related to job creation, such as employment rates,           │
│                                        unemployment rates, new business startups, and job growth statistics.   │
│                                                                                                                │
│                                        3. **Policy and Programs**: Look for discussions about government       │
│                                        policies, initiatives, or programs designed to stimulate job creation.  │
│                                        This could include tax incentives for businesses, workforce development │
│                                        programs, or infrastructure projects.                                   │
│                                                                                                                │
│                                        4. **Case Studies and Examples**: The text should provide examples or   │
│                                        case studies of successful job creation efforts. This could involve     │
│                                        specific industries, regions, or companies that have created a          │
│                                        significant number of jobs.                                             │
│                                                                                                                │
│                                        5. **Challenges and Solutions**: The text might explore challenges to   │
│                                        job creation, such as economic downturns, automation, or regulatory     │
│                                        hurdles, and propose solutions to these challenges.                     │
│                                                                                                                │
│                                        6. **Stakeholder Involvement**: The text should mention key             │
│                                        stakeholders involved in job creation, including government entities,   │
│                                        private sector companies, non-profits, and educational institutions.    │
│                                                                                                                │
│                                        7. **Future Projections**: Look for discussions about future trends in  │
│                                        job creation, such as emerging industries, technological advancements,  │
│                                        or demographic changes that could impact job creation.                  │
│                                                                                                                │
│                                        8. **Impact on Society**: The text should address the broader impact of │
│                                        job creation on society, such as reducing poverty, improving living     │
│                                        standards, and fostering economic growth.                               │
├───────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────┤
│ Nation of immigrants                   Determining whether a text is primarily about the concept of a "Nation  │
│                                        of Immigrants" involves examining several key criteria:                 │
│                                                                                                                │
│                                        1. **Historical Context**: Look for references to the history of        │
│                                        immigration in the nation being discussed. This includes significant    │
│                                        waves of immigration, key dates, and legislative acts that shaped       │
│                                        immigration policy.                                                     │
│                                                                                                                │
│                                        2. **Diversity and Demographics**: The text should discuss the diverse  │
│                                        origins of the population, highlighting the various ethnic, cultural,   │
│                                        and national backgrounds of the people who have immigrated to the       │
│                                        nation.                                                                 │
│                                                                                                                │
│                                        3. **Immigration Policy**: Analysis of past and present immigration     │
│                                        laws, policies, and reforms is crucial. The text should explore how     │
│                                        these policies have influenced the composition and integration of       │
│                                        immigrants.                                                             │
│                                                                                                                │
│                                        4. **Cultural Impact**: The text should address how immigration has     │
│                                        influenced the culture, traditions, and societal norms of the nation.   │
│                                        This includes contributions to cuisine, language, art, and other        │
│                                        cultural facets.                                                        │
│                                                                                                                │
│                                        5. **Economic Contributions**: Look for discussions on how immigrants   │
│                                        have contributed to the economy, including labor force participation,   │
│                                        entrepreneurship, and innovation.                                       │
│                                                                                                                │
│                                        6. **Challenges and Debates**: The text should cover challenges         │
│                                        associated with immigration, such as integration, discrimination, and   │
│                                        political debates surrounding immigration policy.                       │
│                                                                                                                │
│                                        7. **Identity and National Narrative**: The concept of a "Nation of     │
│                                        Immigrants" often ties into the national identity and narrative. The    │
│                                        text should reflect on how immigration shapes the collective            │
│                                        understanding of what it means to belong to that nation.                │
│                                                                                                                │
│                                        8. **Personal Stories and Testimonies**: Inclusion of personal stories  │
│                                        or testimonies from immigrants can highlight the human aspect of        │
│                                        immigration and its impact on individuals and communities.              │
├───────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────┤
│ Medicare and Social Security solvency  When evaluating whether a text is primarily about Medicare and Social   │
│                                        Security solvency, consider the following key criteria:                 │
│                                                                                                                │
│                                        1. **Central Theme**: The text should focus on the financial health and │
│                                        sustainability of Medicare and Social Security programs. Look for       │
│                                        discussions on funding, expenditures, and the long-term viability of    │
│                                        these programs.                                                         │
│                                                                                                                │
│                                        2. **Statistics and Projections**: The text should include data,        │
│                                        statistics, or projections related to the solvency of Medicare and      │
│                                        Social Security. This might involve actuarial reports, trustee reports, │
│                                        or financial forecasts.                                                 │
│                                                                                                                │
│                                        3. **Policy Discussions**: The text should address policy measures or   │
│                                        reforms aimed at ensuring the solvency of these programs. This can      │
│                                        include proposed changes to benefits, taxes, eligibility criteria, or   │
│                                        other policy interventions.                                             │
│                                                                                                                │
│                                        4. **Challenges and Risks**: The text should outline the challenges and │
│                                        risks to the solvency of Medicare and Social Security. This could       │
│                                        involve demographic changes, economic factors, or healthcare cost       │
│                                        trends that impact the financial stability of these programs.           │
│                                                                                                                │
│                                        5. **Expert Opinions and Analysis**: Look for insights from economists, │
│                                        policymakers, or experts in the field of social insurance and public    │
│                                        finance. Their analysis and opinions can be a key indicator that the    │
│                                        text is centered on solvency issues.                                    │
│                                                                                                                │
│                                        6. **Legislative and Regulatory Context**: The text should reference    │
│                                        relevant legislative or regulatory frameworks that govern Medicare and  │
│                                        Social Security. Discussions about past, current, or proposed laws      │
│                                        impacting these programs are important.                                 │
│                                                                                                                │
│                                        7. **Comparative Analysis**: The text may compare the solvency of       │
│                                        Medicare and Social Security with other social insurance programs,      │
│                                        either within the same country or internationally, to provide context   │
│                                        and highlight solvency issues.                                          │
├───────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────┤
│ Economy for families                   When determining whether a text is primarily about the economy for      │
│                                        families, you should look for key criteria such as:                     │
│                                                                                                                │
│                                        1. **Financial Management**: Discussions on how families manage their   │
│                                        finances, including budgeting, saving, and investing.                   │
│                                                                                                                │
│                                        2. **Income and Employment**: Information on family income levels,      │
│                                        employment rates, job stability, and wage trends that affect family     │
│                                        economic well-being.                                                    │
│                                                                                                                │
│                                        3. **Cost of Living**: Analysis of expenses that families typically     │
│                                        face, such as housing, utilities, groceries, transportation, and        │
│                                        healthcare.                                                             │
│                                                                                                                │
│                                        4. **Government Policies and Benefits**: Examination of policies,       │
│                                        programs, and benefits designed to support families, like tax credits,  │
│                                        subsidies, social security, and child care assistance.                  │
│                                                                                                                │
│                                        5. **Education and Childcare Costs**: Insights into the costs           │
│                                        associated with raising children, including education expenses from     │
│                                        preschool through college, and childcare services.                      │
│                                                                                                                │
│                                        6. **Debt and Credit**: Discussion on family debt levels, credit usage, │
│                                        and financial products like mortgages, student loans, and credit cards. │
│                                                                                                                │
│                                        7. **Economic Challenges**: Identification of specific economic         │
│                                        challenges that families encounter, such as unemployment, inflation,    │
│                                        and economic downturns.                                                 │
│                                                                                                                │
│                                        8. **Wealth Inequality**: Consideration of how economic disparities and │
│                                        wealth inequality impact families differently based on socioeconomic    │
│                                        status.                                                                 │
├───────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────┤
│ Orthodox Christian Easter              To determine whether a text is primarily about Orthodox Christian       │
│                                        Easter, you can look for several key criteria:                          │
│                                                                                                                │
│                                        1. **Date and Timing**: The text should reference the specific timing   │
│                                        of Orthodox Easter, which often differs from Western Christian Easter.  │
│                                        It typically involves the Julian calendar and falls after the Jewish    │
│                                        Passover.                                                               │
│                                                                                                                │
│                                        2. **Liturgical Practices**: Descriptions of unique Orthodox liturgical │
│                                        practices such as the Paschal Vigil, the midnight service, and the use  │
│                                        of the Paschal greeting "Christ is Risen" (and its response "Indeed He  │
│                                        is Risen") would indicate a focus on Orthodox Easter.                   │
│                                                                                                                │
│                                        3. **Cultural Traditions**: Look for mentions of cultural and regional  │
│                                        traditions specific to Orthodox Christian communities, such as the      │
│                                        preparation and blessing of Paschal foods, the dyeing of red eggs, and  │
│                                        festive meals.                                                          │
│                                                                                                                │
│                                        4. **Theological Emphasis**: The text should emphasize the theological  │
│                                        significance of the Resurrection of Jesus Christ within the context of  │
│                                        Orthodox Christianity, including references to the Paschal Troparion    │
│                                        and themes of victory over death.                                       │
│                                                                                                                │
│                                        5. **Iconography and Symbols**: References to Orthodox Christian        │
│                                        iconography related to Easter, such as the icon of the Resurrection or  │
│                                        the Harrowing of Hell, can be a strong indicator.                       │
│                                                                                                                │
│                                        6. **Historical Context**: The text might include historical context    │
│                                        about the development and significance of Orthodox Easter within the    │
│                                        broader history of Christianity, particularly in Eastern Orthodox       │
│                                        traditions.                                                             │
├───────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────┤
│ Holocaust Remembrance                  To determine whether a text is primarily about Holocaust Remembrance,   │
│                                        consider the following key criteria:                                    │
│                                                                                                                │
│                                        1. **Subject Matter**: The text should focus on events, experiences,    │
│                                        and memories related to the Holocaust, including the systematic         │
│                                        persecution and genocide of six million Jews and millions of others by  │
│                                        Nazi Germany during World War II.                                       │
│                                                                                                                │
│                                        2. **Purpose**: The primary intent of the text should be to remember,   │
│                                        honor, and educate about the Holocaust. This can include commemorative  │
│                                        activities, reflections on the impact of the Holocaust, and efforts to  │
│                                        ensure such atrocities are never repeated.                              │
│                                                                                                                │
│                                        3. **Historical Context**: The text should provide historical details   │
│                                        and context about the Holocaust, such as dates, locations, key figures, │
│                                        and significant events like Kristallnacht, concentration camps, and     │
│                                        liberation.                                                             │
│                                                                                                                │
│                                        4. **Personal Accounts and Testimonies**: The inclusion of survivor     │
│                                        stories, witness testimonies, and personal narratives that highlight    │
│                                        individual experiences during the Holocaust is a strong indicator of a  │
│                                        focus on remembrance.                                                   │
│                                                                                                                │
│                                        5. **Memorials and Commemorations**: The text may discuss various       │
│                                        Holocaust memorials, museums, remembrance days (such as Yom HaShoah),   │
│                                        and other forms of commemoration that aim to keep the memory of the     │
│                                        Holocaust alive.                                                        │
│                                                                                                                │
│                                        6. **Educational Content**: The presence of educational materials,      │
│                                        programs, and initiatives designed to teach about the Holocaust and its │
│                                        lessons for future generations is another key criterion.                │
│                                                                                                                │
│                                        7. **Themes of Memory and Legacy**: Themes related to the preservation  │
│                                        of memory, the legacy of the Holocaust, and the moral and ethical       │
│                                        lessons drawn from it are central to texts focused on Holocaust         │
│                                        Remembrance.                                                            │
│                                                                                                                │
│                                        8. **Calls to Action**: The text may include calls to action for        │
│                                        remembrance, such as participating in memorial events, supporting       │
│                                        Holocaust education, and combating Holocaust denial and anti-Semitism.  │
├───────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────┤
│ Teaching as a calling                  To determine whether a text is primarily about the concept of "Teaching │
│                                        as a calling," you can look for several key criteria:                   │
│                                                                                                                │
│                                        1. **Passion and Dedication**: The text should highlight the teacher's  │
│                                        deep passion and dedication to the profession. This often includes      │
│                                        descriptions of teachers going above and beyond their duties, investing │
│                                        extra time and effort to ensure their students succeed.                 │
│                                                                                                                │
│                                        2. **Intrinsic Motivation**: Look for indications that the teacher is   │
│                                        motivated by internal rewards rather than external incentives such as   │
│                                        salary or job security. The text may discuss the personal fulfillment   │
│                                        and joy derived from teaching and making a difference in students'      │
│                                        lives.                                                                  │
│                                                                                                                │
│                                        3. **Sense of Purpose**: The text should convey a strong sense of       │
│                                        purpose and mission. It often includes narratives where teachers feel a │
│                                        profound responsibility and commitment to educating and nurturing their │
│                                        students.                                                               │
│                                                                                                                │
│                                        4. **Impact on Students**: There should be an emphasis on the positive  │
│                                        impact the teacher has on their students. This can include stories of   │
│                                        students' academic and personal growth, as well as testimonials or      │
│                                        anecdotes that showcase the teacher's influence.                        │
│                                                                                                                │
│                                        5. **Personal Sacrifice**: The text might mention the sacrifices        │
│                                        teachers make, such as working long hours, spending their own money on  │
│                                        classroom supplies, or prioritizing their students' needs over their    │
│                                        own.                                                                    │
│                                                                                                                │
│                                        6. **Community and Relationships**: Look for discussions about the      │
│                                        strong relationships teachers build with their students, parents, and   │
│                                        the community. This can include involvement in extracurricular          │
│                                        activities, mentoring, and fostering a supportive learning environment. │
│                                                                                                                │
│                                        7. **Spiritual or Ethical Dimensions**: Sometimes, the text may touch   │
│                                        on spiritual or ethical dimensions of teaching, framing it as a         │
│                                        vocation or calling with a higher purpose beyond just a job.            │
├───────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────┤
│ Affordable Care Act                    To determine whether a text is primarily about the Affordable Care Act  │
│                                        (ACA), you should look for several key criteria:                        │
│                                                                                                                │
│                                        1. **Title and Headings**: Check if the title or headings of the text   │
│                                        explicitly mention the Affordable Care Act, ACA, or related terms like  │
│                                        "Obamacare."                                                            │
│                                                                                                                │
│                                        2. **Main Focus**: Identify if the central theme of the text revolves   │
│                                        around healthcare reform, insurance coverage, health policy changes, or │
│                                        the specific provisions and impacts of the ACA.                         │
│                                                                                                                │
│                                        3. **Legislative Details**: Look for discussions about the legislative  │
│                                        history, implementation, amendments, or legal challenges related to the │
│                                        ACA.                                                                    │
│                                                                                                                │
│                                        4. **Healthcare Coverage**: See if the text addresses changes in        │
│                                        healthcare coverage, such as the expansion of Medicaid, the creation of │
│                                        health insurance marketplaces, or the individual mandate.               │
│                                                                                                                │
│                                        5. **Impact Analysis**: Determine if the text analyzes the effects of   │
│                                        the ACA on various stakeholders, including patients, healthcare         │
│                                        providers, insurers, and the economy.                                   │
│                                                                                                                │
│                                        6. **Policy Debates**: Check if there are debates or discussions about  │
│                                        the merits, drawbacks, or future of the ACA, including political        │
│                                        perspectives and public opinion.                                        │
│                                                                                                                │
│                                        7. **Statistical Data**: Look for statistics or data related to         │
│                                        insurance coverage rates, healthcare costs, or health outcomes that are │
│                                        directly linked to the ACA.                                             │
│                                                                                                                │
│                                        8. **Expert Opinions**: Identify if the text includes opinions or       │
│                                        analyses from healthcare experts, policymakers, or economists           │
│                                        specifically about the ACA.                                             │
├───────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────┤
│ U.S. Holocaust Memorial Museum         To determine whether a text is primarily about the U.S. Holocaust       │
│                                        Memorial Museum, you should look for the following key criteria:        │
│                                                                                                                │
│                                        1. **Central Focus**: The text should have the U.S. Holocaust Memorial  │
│                                        Museum as its main subject. This means that the majority of the content │
│                                        should revolve around the museum itself rather than mentioning it in    │
│                                        passing.                                                                │
│                                                                                                                │
│                                        2. **Historical Context**: The text should provide background           │
│                                        information on the establishment of the museum, including its founding, │
│                                        mission, and significance in preserving the memory of the Holocaust.    │
│                                                                                                                │
│                                        3. **Exhibits and Collections**: Detailed descriptions of the museum's  │
│                                        exhibits, collections, and artifacts should be present. This can        │
│                                        include permanent and temporary exhibitions, as well as any notable     │
│                                        items or displays that are housed within the museum.                    │
│                                                                                                                │
│                                        4. **Educational Programs**: Information about the educational          │
│                                        programs, workshops, and resources offered by the museum should be      │
│                                        highlighted. This can include outreach programs, school visits, and     │
│                                        online educational materials.                                           │
│                                                                                                                │
│                                        5. **Visitor Experience**: The text might describe the visitor          │
│                                        experience, including the layout of the museum, notable features, and   │
│                                        any personal accounts or testimonials from visitors.                    │
│                                                                                                                │
│                                        6. **Events and Commemorations**: The text should discuss any           │
│                                        significant events, commemorations, or ceremonies that are held at the  │
│                                        museum, such as Holocaust Remembrance Day events.                       │
│                                                                                                                │
│                                        7. **Impact and Influence**: The text should cover the broader impact   │
│                                        and influence of the museum on public understanding and awareness of    │
│                                        the Holocaust. This can include its role in promoting historical        │
│                                        research, human rights, and genocide prevention.                        │
│                                                                                                                │
│                                        8. **Location and Accessibility**: Practical information about the      │
│                                        museum's location, hours of operation, and accessibility for visitors   │
│                                        can also indicate that the text is focused on the museum.               │
├───────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────┤
│ DACA recipients                        To determine if a text is primarily about DACA recipients, you can look │
│                                        for the following key criteria:                                         │
│                                                                                                                │
│                                        1. **Mention of DACA**: The text should explicitly mention Deferred     │
│                                        Action for Childhood Arrivals (DACA) multiple times. This includes      │
│                                        references to the program itself, its policies, and its impact.         │
│                                                                                                                │
│                                        2. **Focus on Recipients**: There should be a clear focus on the        │
│                                        individuals who are DACA recipients. This includes their experiences,   │
│                                        challenges, and stories.                                                │
│                                                                                                                │
│                                        3. **Legal and Policy Context**: The text should discuss the legal and  │
│                                        policy framework surrounding DACA, including eligibility criteria,      │
│                                        application processes, and any changes or challenges to the program.    │
│                                                                                                                │
│                                        4. **Impact and Outcomes**: Look for discussions about the impact of    │
│                                        DACA on recipients' lives, such as access to education, employment      │
│                                        opportunities, and protection from deportation.                         │
│                                                                                                                │
│                                        5. **Advocacy and Support**: The text may include information about     │
│                                        advocacy efforts, support organizations, and community responses        │
│                                        related to DACA recipients.                                             │
│                                                                                                                │
│                                        6. **Statistical Data**: The presence of statistical data or research   │
│                                        findings specifically about DACA recipients can indicate that the text  │
│                                        is focused on this group.                                               │
│                                                                                                                │
│                                        7. **Comparative Analysis**: If the text compares the experiences of    │
│                                        DACA recipients with other immigrant groups or discusses their unique   │
│                                        position within the broader immigration system, it is likely focused on │
│                                        DACA recipients.                                                        │
└───────────────────────────────────────┴─────────────────────────────────────────────────────────────────────────┘

Identify the concepts in each text and evaluate based on the criteria

Finally, we can use the concepts and the criteria to run another question where we prompt the model to evaluate each text. The question types QuestionLinearScale, QuestionRank, or QuestionNumerical may be appropriate where we want to return a score:

[17]:
from edsl import QuestionLinearScale

q_score = QuestionLinearScale(
    question_name="score",
    question_text="""Consider the following concept and criteria for determining whether
    a given text addresses this concept. Then score how well the following text satisfies
    the criteria for the concept.
    Concept: {{ concept }}
    Criteria: {{ criteria }}
    Text: {{ text }}""",
    question_options=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    option_labels={0: "Not at all", 10: "Very well"},  # Optional
)

To score every text against every concept, we combine each text with each concept and its corresponding criteria in the scenarios for the question:

[18]:
# Pair each concept with its criteria from the previous results
concepts_criteria = [
    list(pair)
    for pair in zip(
        results.select("concept").to_list(), results.select("criteria").to_list()
    )
]
len(concepts_criteria)
[18]:
10
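
To make the pairing step concrete, here is a minimal sketch that uses plain Python lists in place of the EDSL results. The concept names and criteria strings are placeholder values chosen for illustration, not the actual model output:

```python
# Hypothetical stand-ins for results.select("concept").to_list() and
# results.select("criteria").to_list()
concepts = ["Job creation", "Affordable Care Act"]
criteria = ["Criteria text for job creation", "Criteria text for the ACA"]

# zip pairs the i-th concept with the i-th criteria string
pairs = [list(pair) for pair in zip(concepts, criteria)]
print(pairs)
```

Note that zip stops at the shorter input, so if the two lists ever differ in length, the extra items are silently dropped. On Python 3.10 or later, zip(concepts, criteria, strict=True) raises an error instead.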
[19]:
from edsl import ScenarioList, Scenario

scenarios = ScenarioList(
    Scenario({"text": text, "concept": concept, "criteria": criteria})
    for text in texts
    for concept, criteria in concepts_criteria
)
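
The nested comprehension above builds one scenario for every combination of text and concept, so the number of scenarios (and model calls) is the number of texts multiplied by the number of concepts. A rough sketch of the same combination logic with plain dictionaries and itertools.product, using placeholder texts and concepts rather than the notebook's data:

```python
from itertools import product

# Placeholder inputs (assumptions for illustration only)
texts = ["first text", "second text", "third text"]
concepts_criteria = [
    ["Concept A", "Criteria for A"],
    ["Concept B", "Criteria for B"],
]

# Every text is combined with every [concept, criteria] pair,
# in the same order as the nested comprehension
combos = [
    {"text": text, "concept": concept, "criteria": criteria}
    for text, (concept, criteria) in product(texts, concepts_criteria)
]

print(len(combos))  # 3 texts x 2 pairs = 6
```

Because the count grows multiplicatively, it can be worth checking the size of the scenario list before running the question on a large dataset.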
[20]:
results = q_score.by(scenarios).run()

We can filter the results based on the responses. For example, here we show only the non-zero scores:

[21]:
(
    results.filter("score > 0")
    .select("text", "concept", "score")
    .print(format="rich")
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ scenario                                                        scenario                               answer ┃
┃ .text                                                           .concept                               .score ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ Dreamers are our loved ones, nurses, teachers, and small        Economy for families                   2      │
│ business owners – they deserve the promise of health care just                                                │
│ like all of us. Today, my Administration is making that real                                                  │
│ by expanding affordable health coverage through the Affordable                                                │
│ Care Act to DACA recipients.                                                                                  │
├────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼────────┤
│ With today’s report of 175,000 new jobs, the American comeback  Economy for families                   3      │
│ continues. Congressional Republicans are fighting to cut taxes                                                │
│ for billionaires and let special interests rip folks off, I'm                                                 │
│ focused on job creation and building an economy that works for                                                │
│ the families I grew up with.                                                                                  │
├────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼────────┤
│ With today’s report of 175,000 new jobs, the American comeback  Job creation                           2      │
│ continues. Congressional Republicans are fighting to cut taxes                                                │
│ for billionaires and let special interests rip folks off, I'm                                                 │
│ focused on job creation and building an economy that works for                                                │
│ the families I grew up with.                                                                                  │
├────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼────────┤
│ Medicare is stronger and Social Security remains strong. My     Economy for families                   3      │
│ economic plan has helped extend Medicare solvency by a decade.                                                │
│ And I am committed to extending Social Security solvency by                                                   │
│ making the rich pay their fair share.                                                                         │
├────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼────────┤
│ Dreamers are our loved ones, nurses, teachers, and small        Affordable Care Act                    6      │
│ business owners – they deserve the promise of health care just                                                │
│ like all of us. Today, my Administration is making that real                                                  │
│ by expanding affordable health coverage through the Affordable                                                │
│ Care Act to DACA recipients.                                                                                  │
├────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼────────┤
│ Dreamers are our loved ones, nurses, teachers, and small        DACA recipients                        5      │
│ business owners – they deserve the promise of health care just                                                │
│ like all of us. Today, my Administration is making that real                                                  │
│ by expanding affordable health coverage through the Affordable                                                │
│ Care Act to DACA recipients.                                                                                  │
├────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼────────┤
│ Jill and I send warm wishes to Orthodox Christian communities   Orthodox Christian Easter              1      │
│ around the world as they celebrate Easter. May the Lord bless                                                 │
│ and keep you this Easter Sunday and in the year ahead.                                                        │
├────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼────────┤
│ Tune in as I deliver the keynote address at the U.S. Holocaust  U.S. Holocaust Memorial Museum         1      │
│ Memorial Museum’s Annual Days of Remembrance ceremony in                                                      │
│ Washington, D.C.                                                                                              │
├────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼────────┤
│ Medicare is stronger and Social Security remains strong. My     Medicare and Social Security solvency  4      │
│ economic plan has helped extend Medicare solvency by a decade.                                                │
│ And I am committed to extending Social Security solvency by                                                   │
│ making the rich pay their fair share.                                                                         │
├────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼────────┤
│ Dreamers are our loved ones, nurses, teachers, and small        Nation of immigrants                   3      │
│ business owners – they deserve the promise of health care just                                                │
│ like all of us. Today, my Administration is making that real                                                  │
│ by expanding affordable health coverage through the Affordable                                                │
│ Care Act to DACA recipients.                                                                                  │
├────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼────────┤
│ This Holocaust Remembrance Day, we mourn the six million Jews   Holocaust Remembrance                  6      │
│ who were killed by the Nazis during one of the darkest                                                        │
│ chapters in human history. And we recommit to heeding the                                                     │
│ lessons of the Shoah and realizing the responsibility of                                                      │
│ 'Never Again.'                                                                                                │
├────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼────────┤
│ Like Jill says, 'Teaching isn’t just a job. It’s a calling.'    Teaching as a calling                  3      │
│ She knows that in her bones, and I know every educator who                                                    │
│ joined us at the White House for the first-ever Teacher State                                                 │
│ Dinner lives out that truth every day.                                                                        │
├────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼────────┤
│ We’re a nation of immigrants. A nation of dreamers. And as      Nation of immigrants                   1      │
│ Cinco de Mayo represents, a nation of freedom.                                                                │
├────────────────────────────────────────────────────────────────┼───────────────────────────────────────┼────────┤
│ Tune in as I deliver the keynote address at the U.S. Holocaust  Holocaust Remembrance                  3      │
│ Memorial Museum’s Annual Days of Remembrance ceremony in                                                      │
│ Washington, D.C.                                                                                              │
└────────────────────────────────────────────────────────────────┴───────────────────────────────────────┴────────┘
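
Because each text appears once per concept, it can help to reduce the table to the best-scoring concept for each text. The following is a minimal sketch in plain Python with no EDSL calls. It assumes the three columns are the text, the concept, and a numeric score, and that a higher score means a stronger match. The rows are copied from the table above, with each text shortened to its opening words, and the names rows and top_concept are illustrative only.

```python
# Rows copied from the table above as (text, concept, score).
# Assumption: a higher score means the concept fits the text better.
# Texts are shortened to their opening words for brevity.
rows = [
    ("Medicare is stronger", "Economy for families", 3),
    ("Dreamers are our loved ones", "Affordable Care Act", 6),
    ("Dreamers are our loved ones", "DACA recipients", 5),
    ("Medicare is stronger", "Medicare and Social Security solvency", 4),
    ("Dreamers are our loved ones", "Nation of immigrants", 3),
    ("Tune in as I deliver", "U.S. Holocaust Memorial Museum", 1),
    ("Tune in as I deliver", "Holocaust Remembrance", 3),
]

# Keep the highest-scoring (concept, score) pair seen for each text.
top_concept = {}
for text, concept, score in rows:
    current = top_concept.get(text)
    if current is None or score > current[1]:
        top_concept[text] = (concept, score)

for text, (concept, score) in top_concept.items():
    print(f"{text}: {concept} ({score})")
```

The same loop would work on the full set of rows. Ties are resolved in favor of the first row encountered.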

Posting to the Coop

The Coop is a platform for creating, storing and sharing LLM-based research. It is fully integrated with EDSL and accessible from your workspace or Coop account page. Learn more about creating an account and using the Coop.

Here we post the scenarios, survey and results from above:

[22]:
scenarios.push(description = "Example scenarios", visibility = "public")
[22]:
{'description': 'Example scenarios',
 'object_type': 'scenario_list',
 'url': 'https://www.expectedparrot.com/content/5c1f6856-32e4-4473-97e6-928541759637',
 'uuid': '5c1f6856-32e4-4473-97e6-928541759637',
 'version': '0.1.33.dev1',
 'visibility': 'public'}
[23]:
survey.push(description = "Example survey", visibility = "public")
[23]:
{'description': 'Example survey',
 'object_type': 'survey',
 'url': 'https://www.expectedparrot.com/content/2cf3c5fd-e6c1-4135-af96-0ce866dc28bb',
 'uuid': '2cf3c5fd-e6c1-4135-af96-0ce866dc28bb',
 'version': '0.1.33.dev1',
 'visibility': 'public'}
[24]:
results.push(description = "Example results", visibility = "public")
[24]:
{'description': 'Example results',
 'object_type': 'results',
 'url': 'https://www.expectedparrot.com/content/4d7b3230-575e-47d0-b321-81b59d2df16f',
 'uuid': '4d7b3230-575e-47d0-b321-81b59d2df16f',
 'version': '0.1.33.dev1',
 'visibility': 'public'}

We can also post this notebook:

[25]:
from edsl import Notebook
[26]:
n = Notebook(path = "concept_induction.ipynb")
[27]:
n.push(description = "Example code for concept induction", visibility = "public")
[27]:
{'description': 'Example code for concept induction',
 'object_type': 'notebook',
 'url': 'https://www.expectedparrot.com/content/6f29a7b3-6a2e-460b-bf39-baeb7d6c39a1',
 'uuid': '6f29a7b3-6a2e-460b-bf39-baeb7d6c39a1',
 'version': '0.1.33.dev1',
 'visibility': 'public'}

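To update an object at the Coop, reload it from the saved file and patch it using the uuid that was returned when it was first posted: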

[28]:
n = Notebook(path = "concept_induction.ipynb") # reload the notebook after saving changes
[29]:
n.patch(uuid = "6f29a7b3-6a2e-460b-bf39-baeb7d6c39a1", value = n)
[29]:
{'status': 'success'}
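
Each push call above returns a dictionary that includes the uuid needed for a later patch. The following is a minimal sketch, in plain Python with no EDSL calls, of keeping those responses together so that the uuid can be looked up by object type instead of copied by hand. The dictionaries are trimmed copies of the outputs shown above, and the names push_responses and uuid_by_type are illustrative, not part of EDSL.

```python
# Responses copied from the push outputs above (trimmed to the fields used here).
push_responses = [
    {"object_type": "scenario_list", "uuid": "5c1f6856-32e4-4473-97e6-928541759637"},
    {"object_type": "survey", "uuid": "2cf3c5fd-e6c1-4135-af96-0ce866dc28bb"},
    {"object_type": "results", "uuid": "4d7b3230-575e-47d0-b321-81b59d2df16f"},
    {"object_type": "notebook", "uuid": "6f29a7b3-6a2e-460b-bf39-baeb7d6c39a1"},
]

# Index the uuids by object type; this assumes one posted object per type.
uuid_by_type = {r["object_type"]: r["uuid"] for r in push_responses}

# The uuid to pass to patch when updating the notebook.
notebook_uuid = uuid_by_type["notebook"]
print(notebook_uuid)
```

If several objects of the same type are posted, a list per type or a key based on the description would be needed instead.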