Analyzing course evaluations

This notebook provides sample EDSL code for using a language model to analyze course evaluations. The analysis is designed as a survey of questions about the evaluations that we prompt an AI agent to answer, using a language model to generate the responses as a dataset.

EDSL is an open-source library for simulating surveys, experiments and other research with AI agents and large language models. Before running the code below, please ensure that you have installed the EDSL library and either activated remote inference from your Coop account or stored API keys for the language models that you want to use with EDSL. Please also see our documentation page for tips and tutorials on getting started using EDSL.

Create questions

We start by creating questions about a set of course evaluations for an agent to answer. EDSL comes with a variety of question types that we can choose from based on the form of the response that we want to get back from a model (multiple choice, linear scale, checkbox, free text, etc.). We can use a {{ placeholder }} in the question texts to parameterize them with each evaluation. This allows us to create different “scenarios” of the questions that we can administer at once.

First we import the question types that we need and compose the questions in the relevant templates (see examples of all types in the docs):

[1]:
from edsl import QuestionList, QuestionMultipleChoice
[2]:
q_sentiment = QuestionMultipleChoice(
    question_name="sentiment",
    question_text="What is the overall sentiment of this evaluation: {{ evaluation }}",
    question_options=["Positive", "Neutral", "Negative"],
)

q_themes = QuestionList(
    question_name="themes",
    question_text="Summarize the key points of this evaluation: {{ evaluation }}",
    max_list_items=3,  # Optional
)

q_improvements = QuestionList(
    question_name="improvements",
    question_text="Identify areas for improvement based on this evaluation: {{ evaluation }}",
    max_list_items=3,
)
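
To make the role of the {{ evaluation }} placeholder concrete, here is a minimal plain-Python sketch of the substitution. It is only an illustration: the render helper is hypothetical and not part of EDSL, which does its own template rendering when the survey is run with scenarios.

```python
import re

def render(template: str, values: dict) -> str:
    """Replace each {{ name }} with values[name] (hypothetical helper, not EDSL)."""
    def substitute(match: re.Match) -> str:
        return str(values[match.group(1)])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

question_text = "What is the overall sentiment of this evaluation: {{ evaluation }}"
prompt = render(question_text, {"evaluation": "Excellent introductory course!"})
print(prompt)
```

Each evaluation produces its own version of the question text in this way, which is what allows the same question to be administered for many evaluations at once.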

Construct a survey

Next we combine our questions into a survey. This allows us to administer the questions asynchronously (by default), or according to any desired survey logic or rules that we want to add, such as skip/stop rules or giving an agent “memories” of other questions in the survey. Here we create a simple asynchronous survey by passing the list of questions to a Survey object:

[3]:
from edsl import Survey

survey = Survey(questions=[q_sentiment, q_themes, q_improvements])

Select data for review

Next we identify the data to be analyzed. Here we use some mock evaluations for an Econ 101 course stored as a list of texts:

[4]:
evaluations = [
    "I found the course very engaging and informative. The professor did an excellent job breaking down complex concepts, making them accessible to those of us new to economics. However, the pace was a bit fast, and I sometimes struggled to keep up with the weekly readings.",
    "This class was a struggle for me. The material felt dry and difficult to connect with real-world applications, which I think could have made it more interesting. More examples from current events would definitely have helped spark my interest.",
    "Excellent introductory course! The professor was enthusiastic and always willing to offer extra help during office hours. The interactive lectures and the practical assignments made the theory much more digestible and engaging.",
    "As someone with a strong background in math, I appreciated the analytical rigor of this course. However, I wish there had been more discussions that connected the theories we learned to everyday economic issues. It felt a bit isolated from practical realities at times.",
    "I enjoyed the course, especially the group projects, which were both challenging and rewarding. It was great to apply economic concepts to solve real-life problems. I did feel, however, that the feedback on assignments could be more detailed to help us understand our mistakes.",
    "The course content was well-organized, but the lectures were somewhat monotonous and hard to follow. I would suggest incorporating more visual aids and maybe some guest lectures from industry professionals to liven up the sessions.",
    "This was my favorite class this semester! The mix of theory and case studies was perfect, and the exams were fair. I also really appreciated the diversity of perspectives we explored in class, especially in terms of global economic policies.",
    "I found the textbook to be overly complex for an introductory course. It often used jargon that hadn't been explained in lectures, which was confusing. Simpler reading materials or more explanatory lectures would make a big difference for newcomers to economics.",
    "The professor was knowledgeable and clearly passionate about economics, but I felt the course relied too heavily on tests rather than more creative forms of assessment. More varied assignments would make the course more accessible to students with different learning styles.",
    "This class was a solid introduction to economics, though it leaned heavily on theoretical aspects. I would have liked more opportunities to discuss the real-world implications of economic theories, which I believe would enhance understanding and retention of the material.",
]

Add data to the questions

Next we create a ScenarioList containing one Scenario per evaluation, each holding a key/value pair that will be added to the questions when we run the survey. EDSL provides methods for generating scenarios from many data sources (PDFs, CSVs, images, tables, dicts, etc.); here we import a list and use a key that matches the placeholder in our question texts:

[5]:
from edsl import ScenarioList

scenarios = ScenarioList.from_list("evaluation", evaluations)
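
Because the scenario key has to match the placeholder name exactly, a quick sanity check before running can help. The sketch below is plain Python and makes no EDSL calls; the placeholder_names helper and the plain dicts standing in for scenarios are hypothetical and for illustration only.

```python
import re

def placeholder_names(question_text: str) -> set:
    """Return the names used in {{ ... }} placeholders (hypothetical helper)."""
    return set(re.findall(r"\{\{\s*(\w+)\s*\}\}", question_text))

question_texts = [
    "What is the overall sentiment of this evaluation: {{ evaluation }}",
    "Summarize the key points of this evaluation: {{ evaluation }}",
]

# Plain dicts standing in for scenarios: one key/value pair per evaluation.
sample_evaluations = ["Excellent introductory course!", "This class was a struggle for me."]
scenario_dicts = [{"evaluation": text} for text in sample_evaluations]

# Every placeholder used by the questions should be a key in every scenario.
needed = set().union(*(placeholder_names(t) for t in question_texts))
missing = [needed - set(d) for d in scenario_dicts]
print(needed, missing)
```

An empty set for every scenario means that each placeholder has a value to fill it.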

Design AI agents

Next we design agents with relevant traits and personas for a language model to use in answering the questions. This can be useful if we want to compare responses among different audiences. We do this by passing dictionaries of traits to Agent objects. We can also choose whether to give an agent additional instructions for answering the survey (independent of individual question texts). Please see the documentation for more details and example code for creating agents to use with surveys.

Here we create a persona for the professor of the course and pass it some special instructions:

[6]:
from edsl import Agent

persona = "You are a professor reviewing student evaluations for your recent Econ 101 course."
instruction = "Be very specific and constructive in providing feedback and suggestions."

agent = Agent(traits={"persona": persona}, instruction=instruction)

Select language models

EDSL works with many popular language models that we can use to generate responses for our survey. We can see a current list of all available models:

[7]:
from edsl import Model
[8]:
Model.available()
[8]:
Model Name Service Name Code
Austism/chronos-hermes-13b-v2 deep_infra 0
BAAI/bge-base-en-v1.5 together 1
BAAI/bge-large-en-v1.5 together 2
Gryphe/MythoMax-L2-13b deep_infra 3
Gryphe/MythoMax-L2-13b together 4
Gryphe/MythoMax-L2-13b-Lite together 5
Meta-Llama/Llama-Guard-7b together 6
NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO together 7
NousResearch/Nous-Hermes-2-Yi-34B together 8
Qwen/Qwen1.5-110B-Chat together 9
Qwen/Qwen1.5-72B-Chat together 10
Qwen/Qwen2-72B-Instruct deep_infra 11
Qwen/Qwen2-72B-Instruct together 12
Qwen/Qwen2-7B-Instruct deep_infra 13
Qwen/Qwen2.5-72B-Instruct deep_infra 14
Salesforce/Llama-Rank-V1 together 15
Sao10K/L3-70B-Euryale-v2.1 deep_infra 16
Sao10K/L3.1-70B-Euryale-v2.2 deep_infra 17
WhereIsAI/UAE-Large-V1 together 18
amazon.titan-text-express-v1 bedrock 19
amazon.titan-text-lite-v1 bedrock 20
anthropic.claude-3-5-sonnet-20240620-v1:0 bedrock 21
anthropic.claude-3-haiku-20240307-v1:0 bedrock 22
anthropic.claude-3-opus-20240229-v1:0 bedrock 23
anthropic.claude-3-sonnet-20240229-v1:0 bedrock 24
anthropic.claude-instant-v1 bedrock 25
anthropic.claude-v2 bedrock 26
anthropic.claude-v2:1 bedrock 27
azure:gpt-4o azure 28
azure:gpt-4o-mini azure 29
chatgpt-4o-latest openai 30
claude-3-5-sonnet-20240620 anthropic 31
claude-3-haiku-20240307 anthropic 32
claude-3-opus-20240229 anthropic 33
claude-3-sonnet-20240229 anthropic 34
codellama/CodeLlama-34b-Instruct-hf together 35
codestral-2405 mistral 36
cohere.command-light-text-v14 bedrock 37
cohere.command-r-plus-v1:0 bedrock 38
cohere.command-r-v1:0 bedrock 39
cohere.command-text-v14 bedrock 40
curie:ft-emeritus-2022-11-30-12-58-24 openai 41
curie:ft-emeritus-2022-12-01-01-04-36 openai 42
curie:ft-emeritus-2022-12-01-01-51-20 openai 43
curie:ft-emeritus-2022-12-01-14-16-46 openai 44
curie:ft-emeritus-2022-12-01-14-28-00 openai 45
curie:ft-emeritus-2022-12-01-14-49-45 openai 46
curie:ft-emeritus-2022-12-01-15-29-32 openai 47
curie:ft-emeritus-2022-12-01-15-42-25 openai 48
curie:ft-emeritus-2022-12-01-15-52-24 openai 49
curie:ft-emeritus-2022-12-01-16-40-12 openai 50
cursor/Llama-3-8b-hf together 51
databricks/dbrx-instruct together 52
davinci:ft-emeritus-2022-11-30-14-57-33 openai 53
deepseek-ai/deepseek-llm-67b-chat together 54
gemini-1.0-pro google 55
gemini-1.5-flash google 56
gemini-1.5-pro google 57
gemini-pro google 58
gemma-7b-it groq 59
gemma2-9b-it groq 60
google/gemma-2-27b-it deep_infra 61
google/gemma-2-27b-it together 62
google/gemma-2-9b-it deep_infra 63
google/gemma-2-9b-it together 64
google/gemma-2b-it together 65
gpt-3.5-turbo openai 66
gpt-3.5-turbo-0125 openai 67
gpt-3.5-turbo-1106 openai 68
gpt-3.5-turbo-16k openai 69
gpt-4 openai 70
gpt-4-0125-preview openai 71
gpt-4-0613 openai 72
gpt-4-1106-preview openai 73
gpt-4-turbo openai 74
gpt-4-turbo-2024-04-09 openai 75
gpt-4-turbo-preview openai 76
gpt-4o openai 77
gpt-4o-2024-05-13 openai 78
gpt-4o-2024-08-06 openai 79
gpt-4o-2024-11-20 openai 80
gpt-4o-audio-preview openai 81
gpt-4o-audio-preview-2024-10-01 openai 82
gpt-4o-mini openai 83
gpt-4o-mini-2024-07-18 openai 84
gpt-4o-realtime-preview openai 85
gpt-4o-realtime-preview-2024-10-01 openai 86
lizpreciatior/lzlv_70b_fp16_hf deep_infra 87
llama-3.1-70b-versatile groq 88
llama-3.1-8b-instant groq 89
llama-3.1-sonar-huge-128k-online perplexity 90
llama-3.1-sonar-large-128k-online perplexity 91
llama-3.1-sonar-small-128k-online perplexity 92
llama-guard-3-8b groq 93
llama3-70b-8192 groq 94
llama3-8b-8192 groq 95
llama3-groq-70b-8192-tool-use-preview groq 96
llama3-groq-8b-8192-tool-use-preview groq 97
meta-llama/Llama-2-13b-chat-hf together 98
meta-llama/Llama-2-70b-hf together 99
meta-llama/Llama-2-7b-chat-hf together 100
meta-llama/Llama-3-70b-chat-hf together 101
meta-llama/Llama-3-8b-chat-hf together 102
meta-llama/LlamaGuard-2-8b together 103
meta-llama/Meta-Llama-3-70B-Instruct deep_infra 104
meta-llama/Meta-Llama-3-70B-Instruct-Lite together 105
meta-llama/Meta-Llama-3-70B-Instruct-Turbo together 106
meta-llama/Meta-Llama-3-8B-Instruct deep_infra 107
meta-llama/Meta-Llama-3-8B-Instruct-Lite together 108
meta-llama/Meta-Llama-3-8B-Instruct-Turbo together 109
meta-llama/Meta-Llama-3.1-405B-Instruct deep_infra 110
meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo together 111
meta-llama/Meta-Llama-3.1-70B-Instruct deep_infra 112
meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo together 113
meta-llama/Meta-Llama-3.1-8B-Instruct deep_infra 114
meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo together 115
meta-llama/Meta-Llama-Guard-3-8B together 116
meta.llama3-1-405b-instruct-v1:0 bedrock 117
meta.llama3-1-70b-instruct-v1:0 bedrock 118
meta.llama3-1-8b-instruct-v1:0 bedrock 119
meta.llama3-70b-instruct-v1:0 bedrock 120
meta.llama3-8b-instruct-v1:0 bedrock 121
microsoft/Phi-3-medium-4k-instruct deep_infra 122
microsoft/WizardLM-2-7B deep_infra 123
microsoft/WizardLM-2-8x22B deep_infra 124
microsoft/WizardLM-2-8x22B together 125
mistral-embed mistral 126
mistral-large-2407 mistral 127
mistral-medium-latest mistral 128
mistral-small-2409 mistral 129
mistral-small-latest mistral 130
mistral.mistral-7b-instruct-v0:2 bedrock 131
mistral.mistral-large-2402-v1:0 bedrock 132
mistral.mixtral-8x7b-instruct-v0:1 bedrock 133
mistralai/Mistral-7B-Instruct-v0.1 together 134
mistralai/Mistral-7B-Instruct-v0.2 together 135
mistralai/Mistral-7B-Instruct-v0.3 deep_infra 136
mistralai/Mistral-7B-Instruct-v0.3 together 137
mistralai/Mistral-7B-v0.1 together 138
mistralai/Mistral-Nemo-Instruct-2407 deep_infra 139
mistralai/Mixtral-8x22B-Instruct-v0.1 together 140
mistralai/Mixtral-8x7B-Instruct-v0.1 deep_infra 141
mistralai/Mixtral-8x7B-Instruct-v0.1 together 142
mistralai/Mixtral-8x7B-v0.1 together 143
mixtral-8x7b-32768 groq 144
o1-mini openai 145
o1-mini-2024-09-12 openai 146
o1-preview openai 147
o1-preview-2024-09-12 openai 148
omni-moderation-2024-09-26 openai 149
omni-moderation-latest openai 150
open-mistral-7b mistral 151
open-mistral-nemo-2407 mistral 152
open-mixtral-8x22b mistral 153
open-mixtral-8x7b mistral 154
openbmb/MiniCPM-Llama3-V-2_5 deep_infra 155
openchat/openchat_3.5 deep_infra 156
pixtral-12b-2409 mistral 157
test test 158
togethercomputer/StripedHyena-Nous-7B together 159
togethercomputer/m2-bert-80M-2k-retrieval together 160
togethercomputer/m2-bert-80M-32k-retrieval together 161
togethercomputer/m2-bert-80M-8k-retrieval together 162
upstage/SOLAR-10.7B-Instruct-v1.0 together 163

We select models to use with a survey by creating Model objects for them. The default model is GPT 4 Preview, meaning that EDSL will use it to run our survey if we do not specify a different model. Here we’ll specify that GPT 4o should be used:

[9]:
model = Model("gpt-4o")
model
[9]:

LanguageModel

key value
model gpt-4o
parameters:temperature 0.5
parameters:max_tokens 1000
parameters:top_p 1
parameters:frequency_penalty 0
parameters:presence_penalty 0
parameters:logprobs False
parameters:top_logprobs 3

Run the survey

Next we add the scenarios and agent to the survey, and then run it with the specified model. This will generate a dataset of Results that we can store and begin analyzing:

[10]:
results = survey.by(scenarios).by(agent).by(model).run()
Remote Job Log (2024-12-14 10:08:22)
Remote inference activated. Sending job to server...
Your survey is running at the Expected Parrot server...
Job sent to server. (Job uuid=5689db09-0cfb-492d-8397-94f967573f93).
Job status: queued - last update: 2024-12-14 10:08:06 AM
Job status: running - last update: 2024-12-14 10:08:09 AM
Job status: running - last update: 2024-12-14 10:08:12 AM
Job status: running - last update: 2024-12-14 10:08:15 AM
Job status: running - last update: 2024-12-14 10:08:19 AM
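
As a rough mental model for how many results to expect, we can assume one result per combination of scenario, agent and model, with all of the survey questions answered inside it. The sketch below only illustrates that counting with itertools.product; it makes no EDSL calls, and the identifier strings are hypothetical stand-ins for the actual objects.

```python
from itertools import product

scenario_ids = [f"evaluation_{i}" for i in range(10)]  # the 10 mock evaluations
agent_ids = ["professor"]                              # one agent
model_ids = ["gpt-4o"]                                 # one model

# One combination per (scenario, agent, model) triple.
combinations = list(product(scenario_ids, agent_ids, model_ids))
print(len(combinations))
```

Adding a second agent or model to the lists doubles the number of combinations, which is worth keeping in mind when estimating cost.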

Analyzing results

EDSL comes with built-in methods for analyzing results in data tables, dataframes, SQL queries and other formats. We can print a list of all the components that can be accessed:

[11]:
results.columns
[11]:
0
agent.agent_instruction
agent.agent_name
agent.persona
answer.improvements
answer.sentiment
answer.themes
comment.improvements_comment
comment.sentiment_comment
comment.themes_comment
generated_tokens.improvements_generated_tokens
generated_tokens.sentiment_generated_tokens
generated_tokens.themes_generated_tokens
iteration.iteration
model.frequency_penalty
model.logprobs
model.max_tokens
model.model
model.presence_penalty
model.temperature
model.top_logprobs
model.top_p
prompt.improvements_system_prompt
prompt.improvements_user_prompt
prompt.sentiment_system_prompt
prompt.sentiment_user_prompt
prompt.themes_system_prompt
prompt.themes_user_prompt
question_options.improvements_question_options
question_options.sentiment_question_options
question_options.themes_question_options
question_text.improvements_question_text
question_text.sentiment_question_text
question_text.themes_question_text
question_type.improvements_question_type
question_type.sentiment_question_type
question_type.themes_question_type
raw_model_response.improvements_cost
raw_model_response.improvements_one_usd_buys
raw_model_response.improvements_raw_model_response
raw_model_response.sentiment_cost
raw_model_response.sentiment_one_usd_buys
raw_model_response.sentiment_raw_model_response
raw_model_response.themes_cost
raw_model_response.themes_one_usd_buys
raw_model_response.themes_raw_model_response
scenario.evaluation

Here we select the responses to the questions, together with the raw generated tokens for the themes question, and display them in a table:

[12]:
results.select("sentiment", "themes", "themes_generated_tokens", "improvements")
[12]:
answer.sentiment answer.themes generated_tokens.themes_generated_tokens answer.improvements
Positive ['Course was engaging and informative', 'Professor effectively simplified complex concepts', 'Pace of the course was too fast for some students'] ["Course was engaging and informative", "Professor effectively simplified complex concepts", "Pace of the course was too fast for some students"] These points highlight the strengths of the course in terms of engagement and clarity, while also noting a common concern regarding the course's pace, which is valuable for adjusting future syllabi or providing additional support. ['Adjust the pacing of the lectures', 'Provide additional resources or summaries for weekly readings', 'Incorporate more review sessions or Q&A periods']
Negative ['Material felt dry', 'Difficult to connect with real-world applications', 'More current event examples needed'] ["Material felt dry", "Difficult to connect with real-world applications", "More current event examples needed"] The student found the course material unengaging, struggled to see its relevance to real-world situations, and suggested incorporating current events to enhance interest and understanding. ['Incorporate more real-world applications', 'Use current events to illustrate concepts', 'Enhance engagement with interactive activities']
Positive ["Professor's enthusiasm and availability for extra help", 'Interactive lectures', 'Practical assignments enhanced understanding'] ["Professor's enthusiasm and availability for extra help", "Interactive lectures", "Practical assignments enhanced understanding"] The evaluation highlights your enthusiasm and willingness to assist students, the effectiveness of interactive lectures, and the practical assignments that helped students grasp theoretical concepts better. ['Increase variety of teaching methods', 'Provide more real-world examples', 'Enhance feedback on assignments']
Neutral ['Appreciation for analytical rigor', 'Desire for more practical discussions', 'Course felt isolated from real-world issues'] ["Appreciation for analytical rigor", "Desire for more practical discussions", "Course felt isolated from real-world issues"] These points highlight the student's appreciation for the mathematical depth of the course, their wish for more application of theories to real-world contexts, and the sense that the course lacked practical relevance. ['Incorporate real-world case studies into lectures', 'Facilitate class discussions on current economic events', 'Assign projects that apply theories to contemporary issues']
Positive ['Enjoyed the course, especially group projects', 'Appreciated applying economic concepts to real-life problems', 'Desire for more detailed feedback on assignments'] ["Enjoyed the course, especially group projects", "Appreciated applying economic concepts to real-life problems", "Desire for more detailed feedback on assignments"] These points encapsulate the student's positive experience with the group projects and the practical application of course content, while also highlighting a constructive suggestion for improvement regarding feedback on assignments. ['Provide more detailed feedback on assignments', 'Incorporate more real-life problem-solving exercises', 'Ensure group projects are effectively facilitated']
Neutral ['Course content was well-organized', 'Lectures were monotonous and hard to follow', 'Suggested incorporating visual aids and guest lectures'] ["Course content was well-organized", "Lectures were monotonous and hard to follow", "Suggested incorporating visual aids and guest lectures"] The feedback highlights the organization of the course content as a positive aspect, while pointing out that the lectures lacked engagement. The student suggests using visual aids and inviting guest speakers to enhance the learning experience and make sessions more dynamic. ['Incorporate more visual aids into lectures', 'Invite guest speakers from industry', 'Engage students with interactive activities']
Positive ['Favorite class this semester', 'Perfect mix of theory and case studies', 'Appreciation for diversity in global economic perspectives'] ["Favorite class this semester", "Perfect mix of theory and case studies", "Appreciation for diversity in global economic perspectives"] The student's evaluation highlights their overall enjoyment of the course, the effective balance between theoretical content and practical case studies, and the value they found in exploring diverse global economic perspectives. ['Increase opportunities for student participation', 'Enhance feedback on assignments', 'Incorporate more interactive digital tools']
Negative ['Textbook is too complex for an introductory course', 'Textbook uses unexplained jargon', 'Simpler materials or more explanatory lectures needed'] ["Textbook is too complex for an introductory course", "Textbook uses unexplained jargon", "Simpler materials or more explanatory lectures needed"] These points directly address the student's concerns about the complexity and accessibility of the course materials, offering a clear direction for potential improvements. ['Select a more accessible textbook for introductory students', 'Provide supplemental materials that clarify textbook jargon', 'Incorporate more lecture time to explain complex concepts']
Neutral ['Professor is knowledgeable and passionate', 'Course relies too heavily on tests', 'More varied assignments needed for different learning styles'] ["Professor is knowledgeable and passionate", "Course relies too heavily on tests", "More varied assignments needed for different learning styles"] The evaluation highlights your expertise and enthusiasm for the subject, but suggests that the assessment methods could be diversified to accommodate different learning preferences, which could enhance accessibility and engagement. ['Incorporate diverse assessment methods', 'Include project-based assignments', 'Offer optional creative assignments']
Neutral ['Solid introduction to economics', 'Heavily focused on theoretical aspects', 'Desire for more real-world discussion'] ["Solid introduction to economics", "Heavily focused on theoretical aspects", "Desire for more real-world discussion"] The student appreciates the course as a foundational introduction but notes an overemphasis on theory. They suggest incorporating more discussions on real-world applications to improve understanding and retention. ['Incorporate more real-world case studies', 'Facilitate class discussions on current economic events', 'Include practical applications in assignments']

We can do a quick tally of the sentiments:

[13]:
results.select("sentiment").tally()
[13]:
answer.sentiment count
Positive 4
Neutral 4
Negative 2
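
For readers who want to see what the tally amounts to, here is a plain-Python equivalent using collections.Counter. This is a sketch outside of EDSL: the sentiments list is copied by hand from the table above, in row order, and the shares dictionary is an extra added for illustration.

```python
from collections import Counter

# Sentiment answers copied from the results table, in row order.
sentiments = [
    "Positive", "Negative", "Positive", "Neutral", "Positive",
    "Neutral", "Positive", "Negative", "Neutral", "Neutral",
]

counts = Counter(sentiments)
# Share of evaluations carrying each sentiment label.
shares = {label: n / len(sentiments) for label, n in counts.items()}

print(counts.most_common())
print(shares)
```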

We can also transform the results into a dataframe:

[14]:
df = results.to_pandas()
df.head()
[14]:
answer.themes answer.sentiment answer.improvements scenario.evaluation agent.agent_name agent.persona agent.agent_instruction model.temperature model.logprobs model.frequency_penalty ... question_options.improvements_question_options question_type.improvements_question_type question_type.themes_question_type question_type.sentiment_question_type comment.sentiment_comment comment.themes_comment comment.improvements_comment generated_tokens.sentiment_generated_tokens generated_tokens.improvements_generated_tokens generated_tokens.themes_generated_tokens
0 ['Course was engaging and informative', 'Profe... Positive ['Adjust the pacing of the lectures', 'Provide... I found the course very engaging and informati... Agent_0 You are a professor reviewing student evaluati... Be very specific and constructive in providing... 0.5 False 0 ... NaN list list multiple_choice The evaluation is largely positive, highlighti... These points highlight the strengths of the co... The feedback highlights that while the course ... Positive\n\nThe evaluation is largely positive... ["Adjust the pacing of the lectures", "Provide... ["Course was engaging and informative", "Profe...
1 ['Material felt dry', 'Difficult to connect wi... Negative ['Incorporate more real-world applications', '... This class was a struggle for me. The material... Agent_1 You are a professor reviewing student evaluati... Be very specific and constructive in providing... 0.5 False 0 ... NaN list list multiple_choice The evaluation expresses dissatisfaction with ... The student found the course material unengagi... By integrating more real-world applications an... Negative\n\nThe evaluation expresses dissatisf... ["Incorporate more real-world applications", "... ["Material felt dry", "Difficult to connect wi...
2 ["Professor's enthusiasm and availability for ... Positive ['Increase variety of teaching methods', 'Prov... Excellent introductory course! The professor w... Agent_2 You are a professor reviewing student evaluati... Be very specific and constructive in providing... 0.5 False 0 ... NaN list list multiple_choice The evaluation highlights several positive asp... The evaluation highlights your enthusiasm and ... While the evaluation is positive, incorporatin... Positive\n\nThe evaluation highlights several ... ["Increase variety of teaching methods", "Prov... ["Professor's enthusiasm and availability for ...
3 ['Appreciation for analytical rigor', 'Desire ... Neutral ['Incorporate real-world case studies into lec... As someone with a strong background in math, I... Agent_3 You are a professor reviewing student evaluati... Be very specific and constructive in providing... 0.5 False 0 ... NaN list list multiple_choice The evaluation contains both positive and nega... These points highlight the student's appreciat... These suggestions aim to bridge the gap betwee... Neutral\n\nThe evaluation contains both positi... ["Incorporate real-world case studies into lec... ["Appreciation for analytical rigor", "Desire ...
4 ['Enjoyed the course, especially group project... Positive ['Provide more detailed feedback on assignment... I enjoyed the course, especially the group pro... Agent_4 You are a professor reviewing student evaluati... Be very specific and constructive in providing... 0.5 False 0 ... NaN list list multiple_choice The overall sentiment is positive because the ... These points encapsulate the student's positiv... The student appreciated the group projects and... Positive\n\nThe overall sentiment is positive ... ["Provide more detailed feedback on assignment... ["Enjoyed the course, especially group project...

5 rows × 46 columns

Once the results are in a dataframe we can use pandas methods on them directly, for example to count the sentiments:

[15]:
df_sentiment = results.to_pandas()["answer.sentiment"]
df_sentiment.value_counts()
[15]:
answer.sentiment
Positive    4
Neutral     4
Negative    2
Name: count, dtype: int64

Use responses to construct new questions

We can use the responses to our initial questions to construct more questions about the texts. For example, we can prompt a model to condense the individual lists of themes and areas for improvement into short lists, and then use the new lists to quantify the topics across the set of evaluations.

Here we take the lists of themes in each evaluation, flatten them into a (duplicative) list, and then create a new question prompting a model to condense it for us:

[16]:
results.select("themes", "themes_generated_tokens")
[16]:
answer.themes generated_tokens.themes_generated_tokens
['Course was engaging and informative', 'Professor effectively simplified complex concepts', 'Pace of the course was too fast for some students'] ["Course was engaging and informative", "Professor effectively simplified complex concepts", "Pace of the course was too fast for some students"] These points highlight the strengths of the course in terms of engagement and clarity, while also noting a common concern regarding the course's pace, which is valuable for adjusting future syllabi or providing additional support.
['Material felt dry', 'Difficult to connect with real-world applications', 'More current event examples needed'] ["Material felt dry", "Difficult to connect with real-world applications", "More current event examples needed"] The student found the course material unengaging, struggled to see its relevance to real-world situations, and suggested incorporating current events to enhance interest and understanding.
["Professor's enthusiasm and availability for extra help", 'Interactive lectures', 'Practical assignments enhanced understanding'] ["Professor's enthusiasm and availability for extra help", "Interactive lectures", "Practical assignments enhanced understanding"] The evaluation highlights your enthusiasm and willingness to assist students, the effectiveness of interactive lectures, and the practical assignments that helped students grasp theoretical concepts better.
['Appreciation for analytical rigor', 'Desire for more practical discussions', 'Course felt isolated from real-world issues'] ["Appreciation for analytical rigor", "Desire for more practical discussions", "Course felt isolated from real-world issues"] These points highlight the student's appreciation for the mathematical depth of the course, their wish for more application of theories to real-world contexts, and the sense that the course lacked practical relevance.
['Enjoyed the course, especially group projects', 'Appreciated applying economic concepts to real-life problems', 'Desire for more detailed feedback on assignments'] ["Enjoyed the course, especially group projects", "Appreciated applying economic concepts to real-life problems", "Desire for more detailed feedback on assignments"] These points encapsulate the student's positive experience with the group projects and the practical application of course content, while also highlighting a constructive suggestion for improvement regarding feedback on assignments.
['Course content was well-organized', 'Lectures were monotonous and hard to follow', 'Suggested incorporating visual aids and guest lectures'] ["Course content was well-organized", "Lectures were monotonous and hard to follow", "Suggested incorporating visual aids and guest lectures"] The feedback highlights the organization of the course content as a positive aspect, while pointing out that the lectures lacked engagement. The student suggests using visual aids and inviting guest speakers to enhance the learning experience and make sessions more dynamic.
['Favorite class this semester', 'Perfect mix of theory and case studies', 'Appreciation for diversity in global economic perspectives'] ["Favorite class this semester", "Perfect mix of theory and case studies", "Appreciation for diversity in global economic perspectives"] The student's evaluation highlights their overall enjoyment of the course, the effective balance between theoretical content and practical case studies, and the value they found in exploring diverse global economic perspectives.
['Textbook is too complex for an introductory course', 'Textbook uses unexplained jargon', 'Simpler materials or more explanatory lectures needed'] ["Textbook is too complex for an introductory course", "Textbook uses unexplained jargon", "Simpler materials or more explanatory lectures needed"] These points directly address the student's concerns about the complexity and accessibility of the course materials, offering a clear direction for potential improvements.
['Professor is knowledgeable and passionate', 'Course relies too heavily on tests', 'More varied assignments needed for different learning styles'] ["Professor is knowledgeable and passionate", "Course relies too heavily on tests", "More varied assignments needed for different learning styles"] The evaluation highlights your expertise and enthusiasm for the subject, but suggests that the assessment methods could be diversified to accommodate different learning preferences, which could enhance accessibility and engagement.
['Solid introduction to economics', 'Heavily focused on theoretical aspects', 'Desire for more real-world discussion'] ["Solid introduction to economics", "Heavily focused on theoretical aspects", "Desire for more real-world discussion"] The student appreciates the course as a foundational introduction but notes an overemphasis on theory. They suggest incorporating more discussions on real-world applications to improve understanding and retention.
[17]:
themes = results.select("themes").to_list(flatten = True)

Next we construct a question to condense the list into a new list:

[18]:
q_condensed_themes = QuestionList(
    question_name="condensed_themes",
    question_text="""Combine the following list of themes extracted from the evaluations
    into a consolidated, non-redundant list: """
    + ", ".join(themes),
    max_list_items=10,
)
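
The question text above is ordinary string concatenation: a flat list of themes is joined with commas and appended to the instruction. As a minimal sketch in plain Python (no EDSL calls), assuming a hypothetical nested list `themes_per_evaluation` that stands in for the per-evaluation answers, the flattening and joining steps look roughly like this. The sketch uses a single-line instruction, whereas the triple-quoted string above also carries its line break and indentation into the prompt.

```python
# Hypothetical stand-in for the per-evaluation "themes" answers:
# one list of themes per evaluation (values copied from the results above).
themes_per_evaluation = [
    ["Solid introduction to economics", "Heavily focused on theoretical aspects"],
    ["Favorite class this semester", "Perfect mix of theory and case studies"],
]

# Flatten the nested lists into a single list of strings.
flat_themes = [theme for themes in themes_per_evaluation for theme in themes]

# Build the question text: instruction followed by the comma-joined themes.
instruction = (
    "Combine the following list of themes extracted from the evaluations "
    "into a consolidated, non-redundant list: "
)
question_text = instruction + ", ".join(flat_themes)
print(question_text)
```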

Now we run the question and select the new list. Note that the agent is optional: we can run this question without it by simply not adding the agent when we call run():

[19]:
condensed_themes = q_condensed_themes.run().select("condensed_themes").to_list()[0]
condensed_themes
Remote Job Log (2024-12-14 10:09:15)
Remote inference activated. Sending job to server...
Your survey is running at the Expected Parrot server...
Job sent to server. (Job uuid=f00681f8-c998-4130-889e-2c81d73c3f84).
Job status: queued - last update: 2024-12-14 10:09:02 AM
Job status: running - last update: 2024-12-14 10:09:05 AM
Job status: running - last update: 2024-12-14 10:09:08 AM
Job status: running - last update: 2024-12-14 10:09:11 AM
[19]:
['Engaging and informative course',
 'Effective simplification of complex concepts',
 'Need for more real-world applications',
 'Desire for more practical and varied assignments',
 "Appreciation for professor's enthusiasm and support",
 'Course content organization and clarity',
 'Need for more interactive and dynamic lectures',
 'Appreciation for diversity in perspectives',
 'Challenges with textbook complexity',
 'Balance between theory and practical examples']

Now we can create a question to identify all the themes in the list that appear in each evaluation (our new list becomes the list of answer options):

[20]:
from edsl import QuestionCheckBox

q_themes_list = QuestionCheckBox(
    question_name="themes_list",
    question_text="Select all of the themes that are mentioned in this evaluation: {{ evaluation }}",
    question_options=condensed_themes,
)

Here we run the question and show a table listing all the themes for each evaluation in the results:

[21]:
themes_lists = q_themes_list.by(scenarios).by(agent).run()
themes_lists.select("evaluation", "themes_list")
Remote Job Log (2024-12-14 10:09:40)
Remote inference activated. Sending job to server...
Your survey is running at the Expected Parrot server...
Job sent to server. (Job uuid=99cfb536-2588-48d5-b5e8-70b365dbad03).
Job status: queued - last update: 2024-12-14 10:09:27 AM
Job status: running - last update: 2024-12-14 10:09:30 AM
Job status: running - last update: 2024-12-14 10:09:33 AM
Job status: running - last update: 2024-12-14 10:09:37 AM
[21]:
scenario.evaluation answer.themes_list
I found the course very engaging and informative. The professor did an excellent job breaking down complex concepts, making them accessible to those of us new to economics. However, the pace was a bit fast, and I sometimes struggled to keep up with the weekly readings. ['Engaging and informative course', 'Effective simplification of complex concepts']
This class was a struggle for me. The material felt dry and difficult to connect with real-world applications, which I think could have made it more interesting. More examples from current events would definitely have helped spark my interest. ['Need for more real-world applications', 'Need for more interactive and dynamic lectures']
Excellent introductory course! The professor was enthusiastic and always willing to offer extra help during office hours. The interactive lectures and the practical assignments made the theory much more digestible and engaging. ['Engaging and informative course', 'Effective simplification of complex concepts', "Appreciation for professor's enthusiasm and support"]
As someone with a strong background in math, I appreciated the analytical rigor of this course. However, I wish there had been more discussions that connected the theories we learned to everyday economic issues. It felt a bit isolated from practical realities at times. ['Need for more real-world applications', 'Balance between theory and practical examples']
I enjoyed the course, especially the group projects, which were both challenging and rewarding. It was great to apply economic concepts to solve real-life problems. I did feel, however, that the feedback on assignments could be more detailed to help us understand our mistakes. ['Engaging and informative course', 'Desire for more practical and varied assignments']
The course content was well-organized, but the lectures were somewhat monotonous and hard to follow. I would suggest incorporating more visual aids and maybe some guest lectures from industry professionals to liven up the sessions. ['Course content organization and clarity', 'Need for more interactive and dynamic lectures']
This was my favorite class this semester! The mix of theory and case studies was perfect, and the exams were fair. I also really appreciated the diversity of perspectives we explored in class, especially in terms of global economic policies. ['Engaging and informative course', 'Appreciation for diversity in perspectives', 'Balance between theory and practical examples']
I found the textbook to be overly complex for an introductory course. It often used jargon that hadn't been explained in lectures, which was confusing. Simpler reading materials or more explanatory lectures would make a big difference for newcomers to economics. ['Course content organization and clarity', 'Challenges with textbook complexity']
The professor was knowledgeable and clearly passionate about economics, but I felt the course relied too heavily on tests rather than more creative forms of assessment. More varied assignments would make the course more accessible to students with different learning styles. ['Desire for more practical and varied assignments', "Appreciation for professor's enthusiasm and support"]
This class was a solid introduction to economics, though it leaned heavily on theoretical aspects. I would have liked more opportunities to discuss the real-world implications of economic theories, which I believe would enhance understanding and retention of the material. ['Need for more real-world applications', 'Balance between theory and practical examples']
[22]:
wide_evaluation_themes = (
    themes_lists
    .select("evaluation", "themes_list")
    .to_scenario_list()
    .expand("themes_list")
    .rename({"themes_list": "theme"})
)
wide_evaluation_themes
[22]:

ScenarioList scenarios: 22; keys: ['theme', 'evaluation'];

evaluation theme
I found the course very engaging and informative. The professor did an excellent job breaking down complex concepts, making them accessible to those of us new to economics. However, the pace was a bit fast, and I sometimes struggled to keep up with the weekly readings. Engaging and informative course
I found the course very engaging and informative. The professor did an excellent job breaking down complex concepts, making them accessible to those of us new to economics. However, the pace was a bit fast, and I sometimes struggled to keep up with the weekly readings. Effective simplification of complex concepts
This class was a struggle for me. The material felt dry and difficult to connect with real-world applications, which I think could have made it more interesting. More examples from current events would definitely have helped spark my interest. Need for more real-world applications
This class was a struggle for me. The material felt dry and difficult to connect with real-world applications, which I think could have made it more interesting. More examples from current events would definitely have helped spark my interest. Need for more interactive and dynamic lectures
Excellent introductory course! The professor was enthusiastic and always willing to offer extra help during office hours. The interactive lectures and the practical assignments made the theory much more digestible and engaging. Engaging and informative course
Excellent introductory course! The professor was enthusiastic and always willing to offer extra help during office hours. The interactive lectures and the practical assignments made the theory much more digestible and engaging. Effective simplification of complex concepts
Excellent introductory course! The professor was enthusiastic and always willing to offer extra help during office hours. The interactive lectures and the practical assignments made the theory much more digestible and engaging. Appreciation for professor's enthusiasm and support
As someone with a strong background in math, I appreciated the analytical rigor of this course. However, I wish there had been more discussions that connected the theories we learned to everyday economic issues. It felt a bit isolated from practical realities at times. Need for more real-world applications
As someone with a strong background in math, I appreciated the analytical rigor of this course. However, I wish there had been more discussions that connected the theories we learned to everyday economic issues. It felt a bit isolated from practical realities at times. Balance between theory and practical examples
I enjoyed the course, especially the group projects, which were both challenging and rewarding. It was great to apply economic concepts to solve real-life problems. I did feel, however, that the feedback on assignments could be more detailed to help us understand our mistakes. Engaging and informative course
I enjoyed the course, especially the group projects, which were both challenging and rewarding. It was great to apply economic concepts to solve real-life problems. I did feel, however, that the feedback on assignments could be more detailed to help us understand our mistakes. Desire for more practical and varied assignments
The course content was well-organized, but the lectures were somewhat monotonous and hard to follow. I would suggest incorporating more visual aids and maybe some guest lectures from industry professionals to liven up the sessions. Course content organization and clarity
The course content was well-organized, but the lectures were somewhat monotonous and hard to follow. I would suggest incorporating more visual aids and maybe some guest lectures from industry professionals to liven up the sessions. Need for more interactive and dynamic lectures
This was my favorite class this semester! The mix of theory and case studies was perfect, and the exams were fair. I also really appreciated the diversity of perspectives we explored in class, especially in terms of global economic policies. Engaging and informative course
This was my favorite class this semester! The mix of theory and case studies was perfect, and the exams were fair. I also really appreciated the diversity of perspectives we explored in class, especially in terms of global economic policies. Appreciation for diversity in perspectives
This was my favorite class this semester! The mix of theory and case studies was perfect, and the exams were fair. I also really appreciated the diversity of perspectives we explored in class, especially in terms of global economic policies. Balance between theory and practical examples
I found the textbook to be overly complex for an introductory course. It often used jargon that hadn't been explained in lectures, which was confusing. Simpler reading materials or more explanatory lectures would make a big difference for newcomers to economics. Course content organization and clarity
I found the textbook to be overly complex for an introductory course. It often used jargon that hadn't been explained in lectures, which was confusing. Simpler reading materials or more explanatory lectures would make a big difference for newcomers to economics. Challenges with textbook complexity
The professor was knowledgeable and clearly passionate about economics, but I felt the course relied too heavily on tests rather than more creative forms of assessment. More varied assignments would make the course more accessible to students with different learning styles. Desire for more practical and varied assignments
The professor was knowledgeable and clearly passionate about economics, but I felt the course relied too heavily on tests rather than more creative forms of assessment. More varied assignments would make the course more accessible to students with different learning styles. Appreciation for professor's enthusiasm and support
This class was a solid introduction to economics, though it leaned heavily on theoretical aspects. I would have liked more opportunities to discuss the real-world implications of economic theories, which I believe would enhance understanding and retention of the material. Need for more real-world applications
This class was a solid introduction to economics, though it leaned heavily on theoretical aspects. I would have liked more opportunities to discuss the real-world implications of economic theories, which I believe would enhance understanding and retention of the material. Balance between theory and practical examples
[23]:
wide_evaluation_themes.tally("theme")
[23]:
theme count
Engaging and informative course 4
Need for more real-world applications 3
Balance between theory and practical examples 3
Effective simplification of complex concepts 2
Need for more interactive and dynamic lectures 2
Appreciation for professor's enthusiasm and support 2
Desire for more practical and varied assignments 2
Course content organization and clarity 2
Appreciation for diversity in perspectives 1
Challenges with textbook complexity 1
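
To make the expand and tally steps above concrete, here is a minimal sketch in plain Python of the same reshaping: one row per (evaluation, theme) pair, then a count per theme. It does not use EDSL and is not how the library implements these methods. The `rows` data is copied from two rows of the results table, with the evaluation texts replaced by hypothetical short labels.

```python
from collections import Counter

# Two rows from the results table above; the evaluation texts are
# shortened to hypothetical labels for brevity.
rows = [
    {
        "evaluation": "evaluation 1",
        "themes_list": [
            "Engaging and informative course",
            "Effective simplification of complex concepts",
        ],
    },
    {
        "evaluation": "evaluation 3",
        "themes_list": [
            "Engaging and informative course",
            "Effective simplification of complex concepts",
            "Appreciation for professor's enthusiasm and support",
        ],
    },
]

# "Expand": emit one row per (evaluation, theme) pair.
long_rows = [
    {"evaluation": row["evaluation"], "theme": theme}
    for row in rows
    for theme in row["themes_list"]
]

# "Tally": count how many rows mention each theme, most frequent first.
theme_counts = Counter(row["theme"] for row in long_rows)
for theme, count in theme_counts.most_common():
    print(theme, count)
```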

We can do the same thing with the areas of improvement:

[24]:
improvements = results.select("improvements").to_list(flatten=True)
improvements
[24]:
0
Adjust the pacing of the lectures
Provide additional resources or summaries for weekly readings
Incorporate more review sessions or Q&A periods
Incorporate more real-world applications
Use current events to illustrate concepts
Enhance engagement with interactive activities
Increase variety of teaching methods
Provide more real-world examples
Enhance feedback on assignments
Incorporate real-world case studies into lectures
Facilitate class discussions on current economic events
Assign projects that apply theories to contemporary issues
Provide more detailed feedback on assignments
Incorporate more real-life problem-solving exercises
Ensure group projects are effectively facilitated
Incorporate more visual aids into lectures
Invite guest speakers from industry
Engage students with interactive activities
Increase opportunities for student participation
Enhance feedback on assignments
Incorporate more interactive digital tools
Select a more accessible textbook for introductory students
Provide supplemental materials that clarify textbook jargon
Incorporate more lecture time to explain complex concepts
Incorporate diverse assessment methods
Include project-based assignments
Offer optional creative assignments
Incorporate more real-world case studies
Facilitate class discussions on current economic events
Include practical applications in assignments
[25]:
q_condensed_improvements = QuestionList(
    question_name="condensed_improvements",
    question_text="""Combine the following list of areas for improvement from the evaluations
    into a consolidated, non-redundant list: """
    + ", ".join(improvements),
    max_list_items=10,
)
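
The flattened improvements list above contains exact repeats (for example, "Enhance feedback on assignments" appears twice). The model is asked to remove redundancy anyway, so this step is optional, but dropping exact duplicates before joining keeps the prompt shorter. A minimal sketch in plain Python, not part of the original notebook, with `improvements_sample` as a hypothetical subset of the list:

```python
# A few items copied from the improvements list above, including repeats.
improvements_sample = [
    "Enhance feedback on assignments",
    "Incorporate real-world case studies into lectures",
    "Facilitate class discussions on current economic events",
    "Enhance feedback on assignments",
    "Facilitate class discussions on current economic events",
]

# dict.fromkeys keeps the first occurrence of each item and preserves order.
unique_improvements = list(dict.fromkeys(improvements_sample))

# The deduplicated items can then be joined into the question text as before.
prompt_items = ", ".join(unique_improvements)
print(prompt_items)
```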
[26]:
condensed_improvements = (
    q_condensed_improvements.run().select("condensed_improvements").to_list()[0]
)
condensed_improvements
Remote Job Log (2024-12-14 10:10:29)
Remote inference activated. Sending job to server...
Your survey is running at the Expected Parrot server...
Job sent to server. (Job uuid=c4cc28ae-eedc-45e0-ac21-b75bda5d4050).
Job status: queued - last update: 2024-12-14 10:10:04 AM
Job status: running - last update: 2024-12-14 10:10:07 AM
Job status: running - last update: 2024-12-14 10:10:13 AM
[26]:
['Adjust lecture pacing and incorporate interactive activities',
 'Provide additional resources and summaries for readings',
 'Enhance feedback and provide detailed assignment critiques',
 'Incorporate real-world applications and examples',
 'Facilitate discussions and projects on current events',
 'Increase variety and use of teaching methods and visual aids',
 'Incorporate diverse and project-based assessments',
 'Invite guest speakers and use real-world case studies',
 'Select accessible and clear textbooks',
 'Include practical applications and problem-solving exercises']
[27]:
q_improvements_list = QuestionCheckBox(
    question_name="improvements_list",
    question_text="Select all of the improvements that are mentioned in this evaluation: {{ evaluation }}",
    question_options=condensed_improvements,
)
[28]:
improvements_lists = q_improvements_list.by(scenarios).by(agent).run()
improvements_lists.select("evaluation", "improvements_list")
Remote Job Log (2024-12-14 10:11:12)
Remote inference activated. Sending job to server...
Your survey is running at the Expected Parrot server...
Job sent to server. (Job uuid=c0140c60-e385-4eb1-8cf9-309f5759e8bf).
Job status: queued - last update: 2024-12-14 10:10:55 AM
Job status: queued - last update: 2024-12-14 10:10:58 AM
Job status: running - last update: 2024-12-14 10:11:02 AM
Job status: running - last update: 2024-12-14 10:11:05 AM
Job status: running - last update: 2024-12-14 10:11:08 AM
[28]:
scenario.evaluation answer.improvements_list
I found the course very engaging and informative. The professor did an excellent job breaking down complex concepts, making them accessible to those of us new to economics. However, the pace was a bit fast, and I sometimes struggled to keep up with the weekly readings. ['Adjust lecture pacing and incorporate interactive activities', 'Provide additional resources and summaries for readings']
This class was a struggle for me. The material felt dry and difficult to connect with real-world applications, which I think could have made it more interesting. More examples from current events would definitely have helped spark my interest. ['Incorporate real-world applications and examples', 'Facilitate discussions and projects on current events']
Excellent introductory course! The professor was enthusiastic and always willing to offer extra help during office hours. The interactive lectures and the practical assignments made the theory much more digestible and engaging. []
As someone with a strong background in math, I appreciated the analytical rigor of this course. However, I wish there had been more discussions that connected the theories we learned to everyday economic issues. It felt a bit isolated from practical realities at times. ['Incorporate real-world applications and examples', 'Facilitate discussions and projects on current events', 'Include practical applications and problem-solving exercises']
I enjoyed the course, especially the group projects, which were both challenging and rewarding. It was great to apply economic concepts to solve real-life problems. I did feel, however, that the feedback on assignments could be more detailed to help us understand our mistakes. ['Enhance feedback and provide detailed assignment critiques']
The course content was well-organized, but the lectures were somewhat monotonous and hard to follow. I would suggest incorporating more visual aids and maybe some guest lectures from industry professionals to liven up the sessions. ['Increase variety and use of teaching methods and visual aids', 'Invite guest speakers and use real-world case studies']
This was my favorite class this semester! The mix of theory and case studies was perfect, and the exams were fair. I also really appreciated the diversity of perspectives we explored in class, especially in terms of global economic policies. []
I found the textbook to be overly complex for an introductory course. It often used jargon that hadn't been explained in lectures, which was confusing. Simpler reading materials or more explanatory lectures would make a big difference for newcomers to economics. ['Provide additional resources and summaries for readings', 'Select accessible and clear textbooks']
The professor was knowledgeable and clearly passionate about economics, but I felt the course relied too heavily on tests rather than more creative forms of assessment. More varied assignments would make the course more accessible to students with different learning styles. ['Incorporate diverse and project-based assessments']
This class was a solid introduction to economics, though it leaned heavily on theoretical aspects. I would have liked more opportunities to discuss the real-world implications of economic theories, which I believe would enhance understanding and retention of the material. ['Incorporate real-world applications and examples', 'Facilitate discussions and projects on current events', 'Include practical applications and problem-solving exercises']
[29]:
wide_themes = (
    improvements_lists
    .select("evaluation", "improvements_list")
    .to_scenario_list()
    .expand("improvements_list")
    .rename({"improvements_list": "theme"})
)
[30]:
wide_themes.tally("theme")
[30]:
theme count
Incorporate real-world applications and examples 3
Facilitate discussions and projects on current events 3
Provide additional resources and summaries for readings 2
Include practical applications and problem-solving exercises 2
Adjust lecture pacing and incorporate interactive activities 1
Enhance feedback and provide detailed assignment critiques 1
Increase variety and use of teaching methods and visual aids 1
Invite guest speakers and use real-world case studies 1
Select accessible and clear textbooks 1
Incorporate diverse and project-based assessments 1
[31]:
improvements_summary = wide_themes.tally("theme")
[32]:
summary_string = improvements_summary.print(format = "markdown", return_string = True)
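
The notebook does not display the contents of summary_string, so the exact text EDSL returns is not shown here. As an illustration only, the sketch below builds a markdown table from tally counts in plain Python. The helper `to_markdown_table` is hypothetical and is not an EDSL function; the three rows are copied from the tally above.

```python
# Three rows copied from the improvements tally above.
improvement_counts = [
    ("Incorporate real-world applications and examples", 3),
    ("Facilitate discussions and projects on current events", 3),
    ("Provide additional resources and summaries for readings", 2),
]


def to_markdown_table(rows, headers=("theme", "count")):
    """Render (theme, count) pairs as a markdown table string (hypothetical helper)."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for theme, count in rows:
        lines.append(f"| {theme} | {count} |")
    return "\n".join(lines)


markdown_summary = to_markdown_table(improvement_counts)
print(markdown_summary)
```

A string like this can be pasted into a report or passed along as context in a follow-up question.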

Other examples

Please check out the EDSL Docs for examples of other methods and templates for use cases, and join our Discord channel to ask questions and connect with other users!

Posting to the Coop

The Coop is a platform for creating, storing and sharing LLM-based research. It is fully integrated with EDSL and accessible from your workspace or Coop account page. Learn more about creating an account and using the Coop.

We can post any EDSL object to the Coop by calling the push() method on it, including this notebook:

[33]:
from edsl import Notebook
[34]:
n = Notebook(path = "analyze_evaluations.ipynb")
[35]:
info = n.push(description = "Example code for analyzing course evaluations", visibility = "public")
info
[35]:
{'description': 'Example code for analyzing course evaluations',
 'object_type': 'notebook',
 'url': 'https://www.expectedparrot.com/content/fd042fd0-b29a-4721-89b4-3b6f418e94a6',
 'uuid': 'fd042fd0-b29a-4721-89b4-3b6f418e94a6',
 'version': '0.1.39.dev1',
 'visibility': 'public'}

To update an object:

[36]:
n = Notebook(path = "analyze_evaluations.ipynb") # resave
[37]:
n.patch(uuid = info["uuid"], value = n)
[37]:
{'status': 'success'}