Concept induction
This notebook offers sample EDSL code for using language models to identify concepts in unstructured texts, then generate criteria for the concepts, and then apply the criteria to evaluate the texts.
This idea is inspired by the recent paper: Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM
Technical setup
Before running the code below, please ensure that you have installed the EDSL library and either activated remote inference from your Coop account or stored API keys for the language models that you want to use with EDSL. Please also see our documentation page for tips and tutorials on getting started using EDSL.
Identify concepts
We start by creating a general question prompting the respondent (a language model) to identify concepts in a given text.
EDSL comes with a variety of question types that we can choose from based on the form of the response that we want to get back from the model. QuestionList may be appropriate where we want the response to be formatted as a list of strings:
[1]:
from edsl import QuestionList

q_concepts = QuestionList(
    question_name="concepts",
    question_text="Identify the key concepts in the following text: {{ scenario.text }}",
    # max_list_items = # Optional
)
We might also want to ask some other questions about our data at the same time (a data labeling task). For example:
[2]:
from edsl import QuestionMultipleChoice

q_sentiment = QuestionMultipleChoice(
    question_name="sentiment",
    question_text="Identify the sentiment of this text: {{ scenario.text }}",
    question_options=["Negative", "Neutral", "Positive"],
)
We parameterize the questions in order to run them for each of our texts. This is done with Scenario objects that we create for our data (here, some recent tweets by President Biden):
[3]:
# Replace with your data
texts = [ # POTUS recent tweets
"Tune in as I deliver the keynote address at the U.S. Holocaust Memorial Museum’s Annual Days of Remembrance ceremony in Washington, D.C.",
"We’re a nation of immigrants. A nation of dreamers. And as Cinco de Mayo represents, a nation of freedom.",
"Medicare is stronger and Social Security remains strong. My economic plan has helped extend Medicare solvency by a decade. And I am committed to extending Social Security solvency by making the rich pay their fair share.",
"Today, the Army Black Knights are taking home West Point’s 10th Commander-in-Chief Trophy. They should be proud. I’m proud of them too – not for the wins, but because after every game they hang up their uniforms and put on another: one representing the United States.",
"This Holocaust Remembrance Day, we mourn the six million Jews who were killed by the Nazis during one of the darkest chapters in human history. And we recommit to heeding the lessons of the Shoah and realizing the responsibility of 'Never Again.'",
"The recipients of the Presidential Medal of Freedom haven't just kept faith in freedom. They kept all of America's faith in a better tomorrow.",
"Like Jill says, 'Teaching isn’t just a job. It’s a calling.' She knows that in her bones, and I know every educator who joined us at the White House for the first-ever Teacher State Dinner lives out that truth every day.",
"Jill and I send warm wishes to Orthodox Christian communities around the world as they celebrate Easter. May the Lord bless and keep you this Easter Sunday and in the year ahead.",
"Dreamers are our loved ones, nurses, teachers, and small business owners – they deserve the promise of health care just like all of us. Today, my Administration is making that real by expanding affordable health coverage through the Affordable Care Act to DACA recipients.",
"With today’s report of 175,000 new jobs, the American comeback continues. Congressional Republicans are fighting to cut taxes for billionaires and let special interests rip folks off, I'm focused on job creation and building an economy that works for the families I grew up with.",
]
len(texts)
[3]:
10
[4]:
from edsl import ScenarioList
scenarios = ScenarioList.from_list("text", texts)
scenarios
[4]:
ScenarioList scenarios: 10; keys: ['text'];
text | |
---|---|
0 | Tune in as I deliver the keynote address at the U.S. Holocaust Memorial Museum’s Annual Days of Remembrance ceremony in Washington, D.C. |
1 | We’re a nation of immigrants. A nation of dreamers. And as Cinco de Mayo represents, a nation of freedom. |
2 | Medicare is stronger and Social Security remains strong. My economic plan has helped extend Medicare solvency by a decade. And I am committed to extending Social Security solvency by making the rich pay their fair share. |
3 | Today, the Army Black Knights are taking home West Point’s 10th Commander-in-Chief Trophy. They should be proud. I’m proud of them too – not for the wins, but because after every game they hang up their uniforms and put on another: one representing the United States. |
4 | This Holocaust Remembrance Day, we mourn the six million Jews who were killed by the Nazis during one of the darkest chapters in human history. And we recommit to heeding the lessons of the Shoah and realizing the responsibility of 'Never Again.' |
5 | The recipients of the Presidential Medal of Freedom haven't just kept faith in freedom. They kept all of America's faith in a better tomorrow. |
6 | Like Jill says, 'Teaching isn’t just a job. It’s a calling.' She knows that in her bones, and I know every educator who joined us at the White House for the first-ever Teacher State Dinner lives out that truth every day. |
7 | Jill and I send warm wishes to Orthodox Christian communities around the world as they celebrate Easter. May the Lord bless and keep you this Easter Sunday and in the year ahead. |
8 | Dreamers are our loved ones, nurses, teachers, and small business owners – they deserve the promise of health care just like all of us. Today, my Administration is making that real by expanding affordable health coverage through the Affordable Care Act to DACA recipients. |
9 | With today’s report of 175,000 new jobs, the American comeback continues. Congressional Republicans are fighting to cut taxes for billionaires and let special interests rip folks off, I'm focused on job creation and building an economy that works for the families I grew up with. |
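To make the parameterization concrete, here is a minimal plain-Python sketch of how a `{{ scenario.text }}` placeholder can be filled in once per text. It is only an illustration of the idea: `render_prompt` is a hypothetical helper built on a regular expression, not EDSL's own rendering code, and the two sample texts are taken from the list above.

```python
import re

QUESTION_TEXT = "Identify the key concepts in the following text: {{ scenario.text }}"


def render_prompt(template: str, scenario: dict) -> str:
    """Replace each {{ scenario.<key> }} placeholder with the scenario's value.

    Hypothetical helper for illustration only; EDSL handles this internally.
    """
    def substitute(match: re.Match) -> str:
        # group(1) is the key that follows "scenario." inside the braces
        return str(scenario[match.group(1)])

    return re.sub(r"\{\{\s*scenario\.(\w+)\s*\}\}", substitute, template)


texts = [
    "We’re a nation of immigrants. A nation of dreamers. And as Cinco de Mayo represents, a nation of freedom.",
    "The recipients of the Presidential Medal of Freedom haven't just kept faith in freedom. They kept all of America's faith in a better tomorrow.",
]

# One scenario (a dict with a "text" key) per text, then one prompt per scenario
scenarios = [{"text": t} for t in texts]
prompts = [render_prompt(QUESTION_TEXT, s) for s in scenarios]

for prompt in prompts:
    print(prompt)
```

Each scenario yields its own copy of the question, which is why a single question definition can be run against every text in the list.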
Next we combine the questions into a survey in order to administer them together (asynchronously by default, or according to any skip/stop rules or other logic that we want to add; learn more about Survey methods in our documentation):
[5]:
from edsl import Survey
survey = Survey(questions=[q_concepts, q_sentiment])
We add the scenarios to the survey and then run it to generate a dataset of results:
[6]:
results = survey.by(scenarios).run()
Job UUID | 82ff6cb7-c8a9-475d-941c-7e5c8b954190 |
Progress Bar URL | https://www.expectedparrot.com/home/remote-job-progress/82ff6cb7-c8a9-475d-941c-7e5c8b954190 |
Exceptions Report URL | None |
Results UUID | b59e3038-ebea-4715-b6d2-8578279eea4a |
Results URL | https://www.expectedparrot.com/content/b59e3038-ebea-4715-b6d2-8578279eea4a |
EDSL comes with built-in methods for working with results in a variety of forms (data tables, SQL queries, dataframes, JSON, CSV). We can call the columns method to see a list of all the components that we can analyze:
[7]:
results.columns
[7]:
0 | |
---|---|
0 | agent.agent_index |
1 | agent.agent_instruction |
2 | agent.agent_name |
3 | answer.concepts |
4 | answer.sentiment |
5 | cache_keys.concepts_cache_key |
6 | cache_keys.sentiment_cache_key |
7 | cache_used.concepts_cache_used |
8 | cache_used.sentiment_cache_used |
9 | comment.concepts_comment |
10 | comment.sentiment_comment |
11 | generated_tokens.concepts_generated_tokens |
12 | generated_tokens.sentiment_generated_tokens |
13 | iteration.iteration |
14 | model.frequency_penalty |
15 | model.inference_service |
16 | model.logprobs |
17 | model.max_tokens |
18 | model.model |
19 | model.model_index |
20 | model.presence_penalty |
21 | model.temperature |
22 | model.top_logprobs |
23 | model.top_p |
24 | prompt.concepts_system_prompt |
25 | prompt.concepts_user_prompt |
26 | prompt.sentiment_system_prompt |
27 | prompt.sentiment_user_prompt |
28 | question_options.concepts_question_options |
29 | question_options.sentiment_question_options |
30 | question_text.concepts_question_text |
31 | question_text.sentiment_question_text |
32 | question_type.concepts_question_type |
33 | question_type.sentiment_question_type |
34 | raw_model_response.concepts_cost |
35 | raw_model_response.concepts_one_usd_buys |
36 | raw_model_response.concepts_raw_model_response |
37 | raw_model_response.sentiment_cost |
38 | raw_model_response.sentiment_one_usd_buys |
39 | raw_model_response.sentiment_raw_model_response |
40 | scenario.scenario_index |
41 | scenario.text |
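The column names above follow a `<type>.<name>` pattern (for example `answer.concepts` or `scenario.text`). As a rough illustration of that naming convention, the following plain-Python sketch groups a handful of the names by their prefix. The `columns` list is a hand-copied subset of the output above, and the grouping logic is our own; it does not reflect how EDSL stores results internally.

```python
from collections import defaultdict

# A hand-copied subset of the column names listed above
columns = [
    "answer.concepts",
    "answer.sentiment",
    "comment.concepts_comment",
    "prompt.concepts_user_prompt",
    "scenario.scenario_index",
    "scenario.text",
]

# Group the short names under their prefix (answer, comment, prompt, scenario)
by_type = defaultdict(list)
for column in columns:
    data_type, name = column.split(".", 1)
    by_type[data_type].append(name)

for data_type, names in by_type.items():
    print(f"{data_type}: {names}")
```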
We can select and print specific components to inspect in a table:
[8]:
results.select("text", "concepts", "sentiment")
[8]:
scenario.text | answer.concepts | answer.sentiment | |
---|---|---|---|
0 | Tune in as I deliver the keynote address at the U.S. Holocaust Memorial Museum’s Annual Days of Remembrance ceremony in Washington, D.C. | ['keynote address', 'U.S. Holocaust Memorial Museum', 'Annual Days of Remembrance', 'Washington, D.C.'] | Neutral |
1 | We’re a nation of immigrants. A nation of dreamers. And as Cinco de Mayo represents, a nation of freedom. | ['nation of immigrants', 'nation of dreamers', 'Cinco de Mayo', 'freedom'] | Positive |
2 | Medicare is stronger and Social Security remains strong. My economic plan has helped extend Medicare solvency by a decade. And I am committed to extending Social Security solvency by making the rich pay their fair share. | ['Medicare', 'Social Security', 'economic plan', 'Medicare solvency', 'Social Security solvency', 'fair share'] | Positive |
3 | Today, the Army Black Knights are taking home West Point’s 10th Commander-in-Chief Trophy. They should be proud. I’m proud of them too – not for the wins, but because after every game they hang up their uniforms and put on another: one representing the United States. | ['Army Black Knights', 'West Point', 'Commander-in-Chief Trophy', 'United States'] | Positive |
4 | This Holocaust Remembrance Day, we mourn the six million Jews who were killed by the Nazis during one of the darkest chapters in human history. And we recommit to heeding the lessons of the Shoah and realizing the responsibility of 'Never Again.' | ['Holocaust Remembrance Day', 'six million Jews', 'Nazis', 'darkest chapters', 'lessons of the Shoah', 'responsibility', 'Never Again'] | Neutral |
5 | The recipients of the Presidential Medal of Freedom haven't just kept faith in freedom. They kept all of America's faith in a better tomorrow. | ['Presidential Medal of Freedom', 'faith in freedom', "America's faith", 'better tomorrow'] | Positive |
6 | Like Jill says, 'Teaching isn’t just a job. It’s a calling.' She knows that in her bones, and I know every educator who joined us at the White House for the first-ever Teacher State Dinner lives out that truth every day. | ['Teaching', 'Calling', 'Educator', 'White House', 'Teacher State Dinner'] | Positive |
7 | Jill and I send warm wishes to Orthodox Christian communities around the world as they celebrate Easter. May the Lord bless and keep you this Easter Sunday and in the year ahead. | ['Jill and I', 'warm wishes', 'Orthodox Christian communities', 'Easter', 'Lord bless and keep you', 'Easter Sunday', 'year ahead'] | Positive |
8 | Dreamers are our loved ones, nurses, teachers, and small business owners – they deserve the promise of health care just like all of us. Today, my Administration is making that real by expanding affordable health coverage through the Affordable Care Act to DACA recipients. | ['Dreamers', 'health care', 'Affordable Care Act', 'DACA recipients', 'affordable health coverage', 'Administration'] | Positive |
9 | With today’s report of 175,000 new jobs, the American comeback continues. Congressional Republicans are fighting to cut taxes for billionaires and let special interests rip folks off, I'm focused on job creation and building an economy that works for the families I grew up with. | ['American comeback', '175,000 new jobs', 'Congressional Republicans', 'cut taxes', 'billionaires', 'special interests', 'job creation', 'economy', 'families'] | Positive |
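For a quick summary of the labeling question, the sentiment answers can be tallied with the standard library. The sketch below hard-codes the ten labels from the table above so that it runs on its own; in a live session the same list could presumably be obtained from the results with a call such as `results.select("sentiment").to_list()`.

```python
from collections import Counter

# Sentiment labels copied from the table above, in row order (0-9)
sentiments = [
    "Neutral", "Positive", "Positive", "Positive", "Neutral",
    "Positive", "Positive", "Positive", "Positive", "Positive",
]

counts = Counter(sentiments)

# Report every answer option, including those that were never chosen
for label in ["Negative", "Neutral", "Positive"]:
    print(f"{label}: {counts[label]}")
```

For this run the tally is 0 negative, 2 neutral, and 8 positive.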
If our lists of concepts are too long, we can run another question prompting a model to condense them. We can specify the maximum number of concepts that we want to get back:
[9]:
# Flattening our list of lists for all the texts to use in a follow-on question:
concepts_list = results.select("concepts").to_list(flatten=True)
concepts_list
[9]:
['keynote address',
'U.S. Holocaust Memorial Museum',
'Annual Days of Remembrance',
'Washington, D.C.',
'nation of immigrants',
'nation of dreamers',
'Cinco de Mayo',
'freedom',
'Medicare',
'Social Security',
'economic plan',
'Medicare solvency',
'Social Security solvency',
'fair share',
'Army Black Knights',
'West Point',
'Commander-in-Chief Trophy',
'United States',
'Holocaust Remembrance Day',
'six million Jews',
'Nazis',
'darkest chapters',
'lessons of the Shoah',
'responsibility',
'Never Again',
'Presidential Medal of Freedom',
'faith in freedom',
"America's faith",
'better tomorrow',
'Teaching',
'Calling',
'Educator',
'White House',
'Teacher State Dinner',
'Jill and I',
'warm wishes',
'Orthodox Christian communities',
'Easter',
'Lord bless and keep you',
'Easter Sunday',
'year ahead',
'Dreamers',
'health care',
'Affordable Care Act',
'DACA recipients',
'affordable health coverage',
'Administration',
'American comeback',
'175,000 new jobs',
'Congressional Republicans',
'cut taxes',
'billionaires',
'special interests',
'job creation',
'economy',
'families']
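Flattening here means turning one list of concepts per text into a single list. A minimal plain-Python equivalent, shown with the first two rows of the results table, might look like the following. This only illustrates the effect of `flatten=True`; it is not EDSL's implementation.

```python
from itertools import chain

# One list of concepts per text (first two rows of the results table)
nested = [
    ["keynote address", "U.S. Holocaust Memorial Museum", "Annual Days of Remembrance", "Washington, D.C."],
    ["nation of immigrants", "nation of dreamers", "Cinco de Mayo", "freedom"],
]

# chain.from_iterable concatenates the inner lists in order
flat = list(chain.from_iterable(nested))

print(len(flat))
print(flat)
```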
[10]:
q_condense = QuestionList(
    question_name="condense",
    question_text="Return a condensed list of the following list of concepts: "
    + ", ".join(concepts_list),
    max_list_items=10,
)
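Because the question text is assembled by joining the concepts into one string, repeated concepts make the prompt longer without adding information. One optional step, sketched below, is to drop case-insensitive duplicates before joining. The helper `dedupe_case_insensitive` and the sample `concepts` list are made up for illustration; neither is part of EDSL or of the data above.

```python
def dedupe_case_insensitive(items):
    """Return items with case-insensitive duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for item in items:
        key = item.casefold()
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


# Hypothetical sample with repeats that differ only in capitalization
concepts = ["Easter", "freedom", "Easter Sunday", "Freedom", "easter"]

unique_concepts = dedupe_case_insensitive(concepts)

# Same prompt construction as in the cell above, applied to the shortened list
question_text = (
    "Return a condensed list of the following list of concepts: "
    + ", ".join(unique_concepts)
)
print(question_text)
```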
Note that we can call the run() method on either a survey of questions or an individual question:
[11]:
results = q_condense.run()
Job UUID | 6f2034dd-b6d3-489f-8b72-c58cf927781c |
Progress Bar URL | https://www.expectedparrot.com/home/remote-job-progress/6f2034dd-b6d3-489f-8b72-c58cf927781c |
Exceptions Report URL | None |
Results UUID | 91274751-8a6a-4ecc-a2d8-3bb2b7fd7af8 |
Results URL | https://www.expectedparrot.com/content/91274751-8a6a-4ecc-a2d8-3bb2b7fd7af8 |
[12]:
results.select("condense")
[12]:
answer.condense | |
---|---|
0 | ['Holocaust Remembrance', 'Nation of Immigrants', 'Freedom and Social Programs', 'Economic Plan and Job Creation', 'U.S. Military and Honors', 'Education and Teaching', 'Faith and Easter', 'American Comeback', 'Health Care and DACA', 'Taxation and Special Interests'] |
Identify criteria for each concept
As in our first step, we can next run a question prompting the model to generate criteria for each concept. We could use QuestionFreeText to generate criteria in an unstructured narrative:
[13]:
from edsl import QuestionFreeText

q_criteria = QuestionFreeText(
    question_name="criteria",
    question_text="""Describe key criteria for determining whether a text is primarily about the
following concept: {{ scenario.concept }}""",
)
For this question, the scenarios are the concepts that we generated:
[14]:
condensed_concepts_list = results.select("condense").to_list(flatten=True)
scenarios = ScenarioList.from_list("concept", condensed_concepts_list)
scenarios
[14]:
ScenarioList scenarios: 10; keys: ['concept'];
concept | |
---|---|
0 | Holocaust Remembrance |
1 | Nation of Immigrants |
2 | Freedom and Social Programs |
3 | Economic Plan and Job Creation |
4 | U.S. Military and Honors |
5 | Education and Teaching |
6 | Faith and Easter |
7 | American Comeback |
8 | Health Care and DACA |
9 | Taxation and Special Interests |
[15]:
results = q_criteria.by(scenarios).run()
Job UUID | 40d2e0cf-74d9-40ba-a9e1-ab7c08bcb203 |
Progress Bar URL | https://www.expectedparrot.com/home/remote-job-progress/40d2e0cf-74d9-40ba-a9e1-ab7c08bcb203 |
Exceptions Report URL | None |
Results UUID | f74845ba-26d6-4e0b-abd2-1ee187de086c |
Results URL | https://www.expectedparrot.com/content/f74845ba-26d6-4e0b-abd2-1ee187de086c |
[16]:
results.select("concept", "criteria")
[16]:
scenario.concept | answer.criteria | |
---|---|---|
0 | Holocaust Remembrance | Determining whether a text is primarily about Holocaust Remembrance involves evaluating several key criteria: 1. **Subject Focus**: The text should center on the Holocaust, specifically emphasizing the importance of remembering the events, victims, and survivors. It should discuss commemorative practices, memorials, or anniversaries related to the Holocaust. 2. **Purpose and Intent**: The text's purpose should be to educate, memorialize, or promote awareness of the Holocaust. It may aim to prevent future atrocities through remembrance and reflection on past events. 3. **Language and Tone**: The language used should convey a sense of solemnity, respect, and reflection. The tone might be educational, commemorative, or advocacy-oriented, urging readers to remember and learn from the past. 4. **Content and Themes**: Key themes should include memory, history, education, and the moral and ethical lessons of the Holocaust. The text might discuss survivor testimonies, historical accounts, or the impact of Holocaust education on contemporary society. 5. **Contextual References**: The text may reference specific Holocaust Remembrance events, such as Yom HaShoah (Holocaust Remembrance Day), International Holocaust Remembrance Day, or other national and international commemorations. 6. **Audience Engagement**: The text might encourage reader participation in remembrance activities, such as attending memorial services, visiting Holocaust museums, or engaging in educational programs. 7. **Historical Accuracy**: The text should be grounded in historical facts about the Holocaust, providing accurate information about the events, figures, and outcomes associated with it. 8. **Cultural and Social Impact**: The text should reflect on the broader cultural and social significance of remembering the Holocaust, including its impact on Jewish communities and global human rights discourse. 
By considering these criteria, one can assess whether a text is primarily focused on Holocaust Remembrance and its associated themes. |
1 | Nation of Immigrants | Determining whether a text is primarily about the concept of a "Nation of Immigrants" involves evaluating several key criteria: 1. **Historical Context**: The text discusses the history of immigration in the country, emphasizing waves of immigrants from various regions and how they have shaped the nation’s development. 2. **Cultural Diversity**: It highlights the cultural contributions of different immigrant groups, showcasing the multicultural makeup of the nation and how these diverse cultures coexist and interact. 3. **Immigration Policy**: The text examines the laws, policies, and governmental attitudes towards immigration, including changes over time and their impact on the nation’s demographic and social landscape. 4. **Economic Impact**: Discussion of the economic roles that immigrants have played, such as filling labor shortages, starting businesses, and contributing to innovation and economic growth. 5. **Social Integration and Challenges**: The text addresses how immigrants have integrated into society, including challenges they face such as discrimination, assimilation, and identity issues. 6. **National Identity**: It explores the idea of national identity in the context of being a "Nation of Immigrants," including debates about what it means to belong to the nation and how immigration shapes national values and identity. 7. **Personal Narratives and Case Studies**: The inclusion of personal stories or case studies of immigrants, illustrating their journeys, struggles, and successes, and how these individual experiences reflect broader trends. 8. **Symbolic References**: Use of symbols or references commonly associated with the concept, such as the Statue of Liberty in the United States, which is often linked to welcoming immigrants. 9. 
**Comparative Analysis**: The text may compare the nation in question with other countries regarding their immigration histories and policies, highlighting what makes the nation uniquely a "Nation of Immigrants." 10. **Public Discourse and Perception**: Examination of how immigration is perceived by the public and portrayed in media and political discourse, reflecting its significance in national conversations. By assessing these criteria, one can determine whether a text is primarily focused on the concept of a "Nation of Immigrants." |
2 | Freedom and Social Programs | Determining whether a text is primarily about the concept of "Freedom and Social Programs" involves analyzing the content for specific themes, ideas, and discussions that align with both freedom and social programs. Here are key criteria to consider: 1. **Discussion of Freedom:** - **Individual Rights:** The text should explore themes related to personal liberties, autonomy, and the rights of individuals. - **Choice and Agency:** Look for discussions on the ability of individuals to make choices without undue restriction or coercion. - **Political Freedom:** Consider whether the text addresses democratic principles, free speech, or civil liberties. - **Economic Freedom:** Analyze if there is a focus on free markets, entrepreneurship, and the ability to engage in economic activities without excessive regulation. 2. **Examination of Social Programs:** - **Types of Programs:** Identify discussions on government or community initiatives designed to support individuals, such as healthcare, education, unemployment benefits, or welfare. - **Purpose and Impact:** Look for analysis of the goals, effectiveness, and outcomes of these programs in addressing social issues. - **Funding and Resources:** Consider whether the text discusses how these programs are funded and managed, including taxation and budget allocation. 3. **Interplay Between Freedom and Social Programs:** - **Balancing Act:** The text should explore the tension or balance between maintaining individual freedoms and implementing social programs. - **Policy Debates:** Look for discussions on how social programs can enhance or restrict freedom, such as debates over government intervention versus personal responsibility. - **Social Justice and Equity:** Consider whether the text addresses how social programs can promote equitable access to opportunities, thus enhancing freedom for marginalized groups. 4. 
**Philosophical and Ethical Considerations:** - **Moral Arguments:** Identify any ethical discussions around the obligation of society to provide for its members versus the right to personal freedom. - **Ideological Perspectives:** Analyze the text for different ideological viewpoints, such as liberal, conservative, or libertarian perspectives on the relationship between freedom and social programs. 5. **Case Studies or Examples:** - **Real-world Applications:** Look for specific examples or case studies of social programs and their impact on freedom, either positively or negatively. - **Comparative Analysis:** Consider whether the text compares different countries or regions in terms of their approaches to balancing freedom and social programs. By examining these criteria, you can determine whether a text is primarily focused on the concept of "Freedom and Social Programs" and understand the nuances of how these two elements interact. |
3 | Economic Plan and Job Creation | Determining whether a text is primarily about "Economic Plan and Job Creation" involves evaluating several key criteria: 1. **Explicit Mention of Economic Strategies**: The text should explicitly discuss strategies, policies, or proposals aimed at economic development. This includes references to government plans, fiscal policies, or initiatives designed to stimulate economic growth. 2. **Focus on Job Creation**: There should be a clear emphasis on creating jobs. This can include discussions about reducing unemployment, increasing employment opportunities, or specific programs aimed at workforce development. 3. **Goals and Objectives**: The text should outline specific goals related to economic improvement and job creation, such as targets for reducing unemployment rates, increasing GDP, or other measurable economic indicators. 4. **Stakeholder Involvement**: Look for mentions of key stakeholders involved in economic planning and job creation, such as government agencies, private sector partners, or community organizations. 5. **Analysis of Economic Impact**: The text should provide analysis or predictions about the economic impact of the proposed plans or policies. This can include potential benefits or challenges associated with the initiatives. 6. **Use of Economic Terminology**: The presence of economic terminology and concepts, such as "economic growth," "labor market," "investment," "infrastructure development," etc., can indicate a focus on economic planning and job creation. 7. **Case Studies or Examples**: The inclusion of case studies, examples, or historical references to similar economic plans or job creation efforts can help establish the text’s focus on these topics. 8. **Discussion of Funding and Resources**: There should be mention of how the economic plans will be funded, including budget allocations, investments, or other financial resources necessary for implementation. 9. 
**Policy and Legislative Context**: The text might discuss relevant legislation or policy frameworks that support or hinder economic plans and job creation efforts. 10. **Public and Expert Opinion**: Look for sections that include opinions or perspectives from economists, policy makers, or the general public regarding the economic plans and their potential to create jobs. By assessing these criteria, one can determine whether a text is primarily concerned with "Economic Plan and Job Creation." |
4 | U.S. Military and Honors | Determining whether a text is primarily about the U.S. Military and Honors involves evaluating several key criteria. Here are some important aspects to consider: 1. **Subject Matter**: The text should focus on topics related to the U.S. Military, such as branches of the armed forces (Army, Navy, Air Force, Marine Corps, Coast Guard, and Space Force), military operations, strategy, or history. It should also cover aspects of military honors, including awards, medals, and recognition ceremonies. 2. **Mentions of Military Honors**: The text should include discussions about specific military honors and awards, such as the Medal of Honor, Purple Heart, Silver Star, Bronze Star, or other commendations given for bravery, service, or achievement. It may also describe the criteria for receiving these honors or notable recipients. 3. **Focus on Personnel**: The text should highlight individuals or groups within the U.S. Military, particularly those who have been recognized for exemplary service or acts of valor. This could include stories of heroism, biographies of decorated service members, or profiles of military leaders. 4. **Ceremonial Context**: The text may describe ceremonies related to the awarding of military honors, such as investiture ceremonies, parades, or memorial services. It might also explore the significance and traditions surrounding these events. 5. **Historical and Cultural Context**: The text should provide historical background or cultural significance of military honors within the U.S. Military. This could include the evolution of awards, historical instances of valor, or the role of honors in military culture. 6. **Language and Terminology**: The use of specific military terminology and jargon related to ranks, units, operations, and honors can indicate a focus on the U.S. Military. The presence of detailed descriptions of military protocols or the process of awarding honors is also a key indicator. 7. 
**Intent and Purpose**: The text should aim to inform, educate, or commemorate aspects of the U.S. Military and its honors. This could be through news articles, historical analyses, personal narratives, or official military communications. By examining these criteria, one can determine if a text is primarily about the U.S. Military and Honors, as opposed to other related or tangential topics. |
5 | Education and Teaching | Determining whether a text is primarily about Education and Teaching involves analyzing several key criteria: 1. **Subject Matter**: The text should focus on topics related to education systems, teaching methods, learning theories, curriculum development, educational policies, classroom management, or teacher-student interactions. 2. **Terminology**: The presence of specific educational terminology and jargon, such as "pedagogy," "curriculum," "lesson plans," "assessment," "learning outcomes," and "educational psychology," is indicative of a focus on education and teaching. 3. **Purpose and Objectives**: The text should aim to inform, analyze, or discuss aspects of education, such as improving teaching practices, enhancing learning experiences, or evaluating educational outcomes. 4. **Audience**: The intended audience might include educators, students, policymakers, educational researchers, or parents, which can indicate the text's focus on education and teaching. 5. **Content and Themes**: The text should explore themes such as educational equity, teacher training, student engagement, educational technology, or instructional strategies. 6. **Examples and Case Studies**: The use of examples, case studies, or anecdotes from educational settings, such as schools, universities, or training programs, can help identify the text's focus. 7. **Authors and Contributors**: Authors with backgrounds in education, such as teachers, professors, educational researchers, or policymakers, can lend credibility to the text's focus on education and teaching. 8. **Structure and Format**: The structure may include sections typical of educational discourse, such as literature reviews, methodology, discussion of results, or recommendations for practice. 9. **References and Citations**: The text may reference educational theories, landmark studies, or influential educators, indicating a basis in educational discourse. 10. 
**Impact and Implications**: The text should address the implications of educational practices or policies on learners, educators, and educational institutions. By examining these criteria, one can determine whether a text primarily centers on the concept of Education and Teaching. |
6 | Faith and Easter | Determining whether a text is primarily about the concept of Faith and Easter involves analyzing several key criteria: 1. **Thematic Focus**: The text should prominently feature themes related to faith, such as belief, trust, devotion, and spirituality. It should also discuss Easter, which is a central celebration in Christianity commemorating the resurrection of Jesus Christ. Look for discussions on the significance of Easter in the context of faith. 2. **Religious Context**: The text should be set within a religious or spiritual context, particularly Christianity, as Easter is a Christian holiday. It should reference Christian beliefs, practices, or teachings related to the resurrection of Jesus and how these are linked to faith. 3. **Symbols and Imagery**: Easter is associated with specific symbols such as the cross, the empty tomb, Easter eggs, and lilies. The presence of these symbols, along with imagery that evokes themes of renewal, resurrection, and hope, can indicate a focus on Easter and faith. 4. **Narrative Elements**: The text may include stories or narratives from the Bible, particularly those related to the events of Holy Week, Good Friday, and Easter Sunday. It might also include personal testimonies or reflections on how Easter strengthens or challenges one's faith. 5. **Rituals and Traditions**: The text might describe Easter rituals and traditions, such as attending church services, participating in Easter vigils, or engaging in prayer and fasting. These practices are usually tied to expressions of faith. 6. **Language and Tone**: The language used in the text should convey a sense of reverence, hope, and spirituality. The tone might be reflective, celebratory, or contemplative, focusing on the deeper meanings of faith and Easter. 7. **Purpose and Message**: The primary purpose of the text should be to explore, explain, or celebrate the relationship between faith and Easter. 
It might aim to inspire, educate, or provide insights into how Easter impacts one's faith journey. By examining these criteria, you can assess whether a text is primarily about Faith and Easter, ensuring that both elements are integrally woven into the content and purpose of the text. |
7 | American Comeback | Determining whether a text is primarily about the concept of an "American Comeback" involves looking for several key criteria and themes that are commonly associated with this idea. Here are some of the main criteria to consider: 1. **Economic Recovery**: The text should discuss aspects of economic revival in the United States. This might include topics like job growth, GDP increases, or the resurgence of key industries. 2. **Political Rhetoric**: The phrase "American Comeback" is often used in political contexts, so the text might include speeches, policy proposals, or campaign messages that emphasize restoring America's strength or prominence. 3. **Cultural Renewal**: Look for discussions on cultural revitalization, including shifts in societal values, renewed national pride, or a resurgence in cultural or artistic contributions. 4. **Historical Context**: The text might reference past periods of decline or challenge in American history, followed by recovery or improvement. This could include comparisons to previous eras of economic depression, war, or social upheaval. 5. **Innovation and Technology**: A focus on advancements in technology or innovation that are driving a resurgence in American competitiveness or leadership on the global stage. 6. **Social Progress**: The text may address improvements in social issues, such as education, healthcare, or civil rights, which contribute to a broader narrative of national improvement. 7. **Global Positioning**: Discussions on how the United States is regaining its influence or leadership role internationally, potentially after a period of decline or isolation. 8. **Challenges and Solutions**: A narrative that outlines the challenges faced by the country and the solutions or strategies being implemented to overcome them, leading to a "comeback." 
By examining these criteria, you can determine whether the text is primarily focused on the concept of an "American Comeback" or if it merely touches on related themes without making it the central focus. |
8 | Health Care and DACA | To determine whether a text is primarily about the concept of "Health Care and DACA," you should evaluate the content against several key criteria: 1. **Focus on DACA**: - The text should explicitly discuss the Deferred Action for Childhood Arrivals (DACA) program. Look for mentions of DACA recipients, commonly referred to as "Dreamers," and any policies or changes affecting their status. 2. **Health Care Context**: - There should be a significant focus on health care issues, policies, or systems. This could include discussions about access to health care services, health insurance, or specific health care programs. 3. **Intersection of Health Care and DACA**: - The text should explore the relationship between health care and DACA. This could involve examining how DACA recipients access health care, the challenges they face in obtaining health insurance, or the impact of DACA status on health care eligibility. 4. **Policy and Legislation**: - Look for discussions on legislation or policy proposals that affect both health care and DACA recipients. This could include state or federal policies that impact the health care rights or access for DACA recipients. 5. **Challenges and Barriers**: - The text should address specific challenges or barriers that DACA recipients face in the health care system. This could include issues like lack of access to Medicaid, difficulties in obtaining private health insurance, or the impact of immigration status on health care access. 6. **Advocacy and Support**: - Consider whether the text discusses advocacy efforts, support systems, or organizations that assist DACA recipients with health care needs. This could include community health initiatives or legal support related to health care access. 7. **Personal Stories and Case Studies**: - The inclusion of personal stories or case studies about DACA recipients navigating the health care system can indicate a focus on this intersection. 
These narratives can highlight real-world implications and challenges. 8. **Statistical and Research Data**: - The presence of data or research findings related to health care access or outcomes for DACA recipients can indicate a focus on this topic. This might include studies on health disparities or access to services. By evaluating the text against these criteria, you can determine if it is primarily focused on the intersection of health care and DACA, rather than addressing these topics separately or in a broader context. |
9 | Taxation and Special Interests | Determining whether a text is primarily about "Taxation and Special Interests" involves evaluating several key criteria. Here are some important aspects to consider: 1. **Focus on Taxation:** - **Policy Discussion:** The text should discuss tax policies, including changes, reforms, or proposals related to taxation. - **Types of Taxes:** It may cover specific types of taxes (e.g., income tax, corporate tax, sales tax) and their implications. - **Tax Rates and Structures:** Analysis of tax rates, brackets, and structures, including progressive, regressive, or flat tax systems. - **Economic Impact:** Discussion on how taxation affects economic behavior, government revenue, and public services. 2. **Special Interests:** - **Influence on Policy:** The text should explore how special interest groups influence tax policy, including lobbying efforts and political contributions. - **Beneficiaries of Tax Policies:** Examination of which groups or industries benefit from specific tax policies or loopholes. - **Regulatory Capture:** Discussion on how certain interests may dominate regulatory bodies to shape favorable tax outcomes. 3. **Interconnection Between Taxation and Special Interests:** - **Case Studies or Examples:** The text might provide examples of legislation where special interests have significantly influenced tax outcomes. - **Conflict of Interest:** Analysis of conflicts between public interest and the interests of powerful groups in tax legislation. - **Historical Context:** Historical perspective on how special interests have shaped tax policy over time. 4. **Stakeholder Perspectives:** - **Government and Politicians:** Insights into how government officials and politicians interact with special interest groups regarding tax issues. - **Public Opinion:** Consideration of public sentiment toward tax policies influenced by special interests. 
- **Economic Theories or Models:** Reference to economic theories that explain the relationship between taxation, special interests, and economic outcomes. 5. **Language and Tone:** - **Keywords and Phrases:** Frequent use of terms like "lobbying," "tax breaks," "corporate interests," "tax reform," and "policy influence." - **Analytical Tone:** The text should have an analytical or critical tone, examining the implications and motivations behind tax policies. By evaluating these criteria, one can determine whether a text is primarily focused on the concept of Taxation and Special Interests, assessing both the direct content and the underlying themes. |
Identify the concepts in each text and evaluate based on the criteria
Finally, we can use the concepts and the criteria to run another question where we prompt the model to evaluate each text. Question types QuestionLinearScale, QuestionRank or QuestionNumerical may be appropriate where we want to return a score:
[17]:
from edsl import QuestionLinearScale
q_score = QuestionLinearScale(
question_name="score",
question_text="""Consider the following concept and criteria for determining whether
a given text addresses this concept. Then score how well the following text satisfies
the criteria for the concept.
Concept: {{ scenario.concept }}
Criteria: {{ scenario.criteria }}
Text: {{ scenario.text }}""",
question_options=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
option_labels={0: "Not at all", 10: "Very well"}, # Optional
)
Here we combine each text with each concept and its criteria, so that every (text, concept) pair becomes a scenario for the question:
[18]:
concepts_criteria = [
list(pair)
for pair in zip(
results.select("concept").to_list(), results.select("criteria").to_list()
)
]
len(concepts_criteria)
[18]:
10
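As a minimal sketch of the pairing step above, the snippet below uses two short placeholder lists (hypothetical values, not the notebook's actual results) in place of the selected concept and criteria columns. It also checks that the lists have equal length, because zip() silently stops at the shorter input:

```python
# Hypothetical stand-ins for results.select("concept").to_list()
# and results.select("criteria").to_list()
concepts = ["Concept 1", "Concept 2", "Concept 3"]
criteria = ["Criteria 1", "Criteria 2", "Criteria 3"]

# zip() truncates to the shorter input, so confirm the lists line up first
assert len(concepts) == len(criteria)

# Each pair is a [concept, criteria] list, as in the cell above
pairs = [list(pair) for pair in zip(concepts, criteria)]

print(len(pairs))
print(pairs[0])
```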
[19]:
from edsl import ScenarioList, Scenario
scenarios = ScenarioList(
Scenario({"text": text, "concept": concept, "criteria": criteria})
for text in texts
for [concept, criteria] in concepts_criteria
)
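The nested loops above build a full cross product: one scenario for every combination of text and concept. A minimal sketch of the same idea in plain Python, using placeholder data (hypothetical values) and ordinary dictionaries instead of Scenario objects:

```python
from itertools import product

# Hypothetical stand-ins for the notebook's `texts` and `concepts_criteria`
texts = ["text A", "text B", "text C"]
concepts_criteria = [
    ["Concept 1", "Criteria for concept 1"],
    ["Concept 2", "Criteria for concept 2"],
]

# Pair every text with every [concept, criteria] entry
rows = [
    {"text": text, "concept": concept, "criteria": criteria}
    for text, (concept, criteria) in product(texts, concepts_criteria)
]

# The number of scenarios is the product of the two list lengths
assert len(rows) == len(texts) * len(concepts_criteria)
print(len(rows))
```

Because the scenario count grows multiplicatively, it can be worth checking this number before running the question on a larger set of texts or concepts.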
[20]:
results = q_score.by(scenarios).run()
Job UUID | 9b690c88-c601-49d1-bd13-fe957f3aabbf |
Progress Bar URL | https://www.expectedparrot.com/home/remote-job-progress/9b690c88-c601-49d1-bd13-fe957f3aabbf |
Exceptions Report URL | None |
Results UUID | b15feb5c-4fea-43a2-ab23-b71af2a46c07 |
Results URL | https://www.expectedparrot.com/content/b15feb5c-4fea-43a2-ab23-b71af2a46c07 |
We can filter the results based on the responses. For example, here we show only the non-zero scores:
[21]:
(
results.filter("score > 0")
.select("text", "concept", "score")
)
[21]:
 | scenario.text | scenario.concept | answer.score |
---|---|---|---|
0 | Tune in as I deliver the keynote address at the U.S. Holocaust Memorial Museum’s Annual Days of Remembrance ceremony in Washington, D.C. | Holocaust Remembrance | 5 |
1 | We’re a nation of immigrants. A nation of dreamers. And as Cinco de Mayo represents, a nation of freedom. | Nation of Immigrants | 1 |
2 | We’re a nation of immigrants. A nation of dreamers. And as Cinco de Mayo represents, a nation of freedom. | Freedom and Social Programs | 1 |
3 | Medicare is stronger and Social Security remains strong. My economic plan has helped extend Medicare solvency by a decade. And I am committed to extending Social Security solvency by making the rich pay their fair share. | Freedom and Social Programs | 3 |
4 | Medicare is stronger and Social Security remains strong. My economic plan has helped extend Medicare solvency by a decade. And I am committed to extending Social Security solvency by making the rich pay their fair share. | Economic Plan and Job Creation | 1 |
5 | Medicare is stronger and Social Security remains strong. My economic plan has helped extend Medicare solvency by a decade. And I am committed to extending Social Security solvency by making the rich pay their fair share. | American Comeback | 3 |
6 | Medicare is stronger and Social Security remains strong. My economic plan has helped extend Medicare solvency by a decade. And I am committed to extending Social Security solvency by making the rich pay their fair share. | Taxation and Special Interests | 3 |
7 | Today, the Army Black Knights are taking home West Point’s 10th Commander-in-Chief Trophy. They should be proud. I’m proud of them too – not for the wins, but because after every game they hang up their uniforms and put on another: one representing the United States. | U.S. Military and Honors | 3 |
8 | This Holocaust Remembrance Day, we mourn the six million Jews who were killed by the Nazis during one of the darkest chapters in human history. And we recommit to heeding the lessons of the Shoah and realizing the responsibility of 'Never Again.' | Holocaust Remembrance | 9 |
9 | The recipients of the Presidential Medal of Freedom haven't just kept faith in freedom. They kept all of America's faith in a better tomorrow. | Freedom and Social Programs | 1 |
10 | The recipients of the Presidential Medal of Freedom haven't just kept faith in freedom. They kept all of America's faith in a better tomorrow. | American Comeback | 1 |
11 | Like Jill says, 'Teaching isn’t just a job. It’s a calling.' She knows that in her bones, and I know every educator who joined us at the White House for the first-ever Teacher State Dinner lives out that truth every day. | Education and Teaching | 3 |
12 | Jill and I send warm wishes to Orthodox Christian communities around the world as they celebrate Easter. May the Lord bless and keep you this Easter Sunday and in the year ahead. | Faith and Easter | 3 |
13 | Dreamers are our loved ones, nurses, teachers, and small business owners – they deserve the promise of health care just like all of us. Today, my Administration is making that real by expanding affordable health coverage through the Affordable Care Act to DACA recipients. | Nation of Immigrants | 2 |
14 | Dreamers are our loved ones, nurses, teachers, and small business owners – they deserve the promise of health care just like all of us. Today, my Administration is making that real by expanding affordable health coverage through the Affordable Care Act to DACA recipients. | Freedom and Social Programs | 5 |
15 | Dreamers are our loved ones, nurses, teachers, and small business owners – they deserve the promise of health care just like all of us. Today, my Administration is making that real by expanding affordable health coverage through the Affordable Care Act to DACA recipients. | American Comeback | 1 |
16 | Dreamers are our loved ones, nurses, teachers, and small business owners – they deserve the promise of health care just like all of us. Today, my Administration is making that real by expanding affordable health coverage through the Affordable Care Act to DACA recipients. | Health Care and DACA | 6 |
17 | With today’s report of 175,000 new jobs, the American comeback continues. Congressional Republicans are fighting to cut taxes for billionaires and let special interests rip folks off, I'm focused on job creation and building an economy that works for the families I grew up with. | Freedom and Social Programs | 1 |
18 | With today’s report of 175,000 new jobs, the American comeback continues. Congressional Republicans are fighting to cut taxes for billionaires and let special interests rip folks off, I'm focused on job creation and building an economy that works for the families I grew up with. | Economic Plan and Job Creation | 3 |
19 | With today’s report of 175,000 new jobs, the American comeback continues. Congressional Republicans are fighting to cut taxes for billionaires and let special interests rip folks off, I'm focused on job creation and building an economy that works for the families I grew up with. | American Comeback | 3 |
20 | With today’s report of 175,000 new jobs, the American comeback continues. Congressional Republicans are fighting to cut taxes for billionaires and let special interests rip folks off, I'm focused on job creation and building an economy that works for the families I grew up with. | Taxation and Special Interests | 3 |
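One way to summarize a table like this is to keep only the highest-scoring concept for each text. The sketch below works on plain tuples rather than an EDSL results object. The concept names and scores are copied from the rows above, while the short text labels (`dreamers_tweet`, `holocaust_tweet`) are hypothetical abbreviations of the full texts:

```python
# (text label, concept, score) rows taken from the table above;
# the labels are hypothetical shorthand for the full tweet texts
rows = [
    ("dreamers_tweet", "Nation of Immigrants", 2),
    ("dreamers_tweet", "Freedom and Social Programs", 5),
    ("dreamers_tweet", "American Comeback", 1),
    ("dreamers_tweet", "Health Care and DACA", 6),
    ("holocaust_tweet", "Holocaust Remembrance", 9),
]

# Track the best (concept, score) seen so far for each text
best = {}
for text, concept, score in rows:
    if text not in best or score > best[text][1]:
        best[text] = (concept, score)

for text, (concept, score) in best.items():
    print(text, "->", concept, score)
```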
Posting to the Coop
The Coop is a platform for creating, storing and sharing LLM-based research. It is fully integrated with EDSL and accessible from your workspace or Coop account page. Learn more about creating an account and using the Coop.
The cells below show how to post the scenarios, survey and results from above (the calls are commented out here; uncomment them to run):
[22]:
# scenarios.push(description = "Example scenarios", alias = "example-scenarios", visibility = "public")
[23]:
# survey.push(description = "Example survey", alias = "example-survey", visibility = "public")
[24]:
# results.push(description = "Example results", alias = "example-results", visibility = "public")
We can also post this notebook:
[25]:
from edsl import Notebook
[26]:
nb = Notebook(path = "concept_induction.ipynb")
if refresh := False:
    nb.push(
        description = "Example code for concept induction",
        alias = "concept-induction-notebook",
        visibility = "public"
    )
else:
    nb.patch("https://www.expectedparrot.com/content/RobinHorton/concept-induction-notebook", value = nb)