Interviews

Interview class

This module contains the Interview class, which is responsible for conducting an interview asynchronously.

class edsl.jobs.interviews.Interview.Interview(agent: Agent, survey: Survey, scenario: Scenario, model: Type[LanguageModel], debug: bool | None = False, iteration: int = 0, cache: Cache | None = None, sidecar_model: LanguageModel | None = None, skip_retry: bool = False, raise_validation_errors: bool = True)[source]

Bases: object

An ‘interview’ is one agent answering one survey, with one language model, for a given scenario.

The main method is async_conduct_interview, which conducts the interview asynchronously. Most of the class is dedicated to creating the tasks for each question in the survey, and then running them.

async _answer_question_and_record_task(*, question: QuestionBase, task=None) → AgentResponseDict[source]

Answer a question and record the task.

_build_question_tasks(model_buckets: ModelBuckets) → list[Task][source]

Create a task for each question, with dependencies on the questions that must be answered before this one can be answered.

Parameters:

debug – whether to use debug mode, in which case InvigilatorDebug is used (not in the signature above; debug is a constructor argument).
model_buckets – the model buckets used to track and control usage rates.

_cancel_skipped_questions(current_question: QuestionBase) → None[source]

Cancel the tasks for questions that are skipped.

Parameters:

current_question – the question that was just answered.

It first determines the next question, given the current question and the current answers. If the next question is the end of the survey, it cancels all remaining tasks. If the next question is after the current question, it cancels all tasks between the current question and the next question.

_create_question_task(*, question: QuestionBase, tasks_that_must_be_completed_before: list[Task], model_buckets: ModelBuckets, iteration: int = 0) → Task[source]

Create a task that depends on the passed-in dependencies that are awaited before the task is run.

Parameters:

question – the question to be answered. This is the question we are creating a task for.
tasks_that_must_be_completed_before – the tasks that must be completed before the focal task is run.
model_buckets – the model buckets used to track and control usage rates.
debug – whether to use debug mode, in which case InvigilatorDebug is used (not in the signature above; debug is a constructor argument).
iteration – the iteration number for the interview.

The task is created by a QuestionTaskCreator, which is responsible for creating the task and managing its dependencies. It is passed a reference to the function that will be called to answer the question, along with the list tasks_that_must_be_completed_before. Those tasks are added as dependencies of the focal task and are awaited before it runs.
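A minimal sketch of the "await the dependencies, then answer" pattern, using only asyncio. It does not use QuestionTaskCreator; run_after and make_answer are hypothetical helpers standing in for the real machinery.

```python
import asyncio


async def run_after(dependencies, answer_question):
    """Await every dependency task, then call the answering coroutine."""
    if dependencies:
        await asyncio.gather(*dependencies)
    return await answer_question()


async def demo():
    order = []  # records the order in which questions are actually answered

    def make_answer(name):
        async def answer():
            await asyncio.sleep(0.01)  # stands in for a model call
            order.append(name)
            return f"answer to {name}"
        return answer

    # q1 depends on q0; q2 depends on both q0 and q1.
    q0 = asyncio.create_task(run_after([], make_answer("q0")))
    q1 = asyncio.create_task(run_after([q0], make_answer("q1")))
    q2 = asyncio.create_task(run_after([q0, q1], make_answer("q2")))

    # Awaiting in reverse order does not change the execution order.
    results = [await q2, await q1, await q0]
    return order, results


order, results = asyncio.run(demo())
```

The dependency list, not the order in which the tasks are awaited, determines when each question runs.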

_extract_valid_results() → Generator[Answers, None, None][source]

Extract the valid results from the list of results.

It iterates through the tasks and invigilators and yields the results of the tasks that are done. If a task is not done, it raises a ValueError. If a task raised an exception, the exception is recorded on the Interview instance, unless the task was cancelled, which is expected behavior.

>>> i = Interview.example()
>>> result, _ = asyncio.run(i.async_conduct_interview())
>>> results = list(i._extract_valid_results())
>>> len(results) == len(i.survey)
True
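The done / cancelled / failed distinction can be sketched with standard asyncio.Task methods. This is an assumption-laden illustration: extract_valid_results below is a free function over a dict of tasks, and it simply skips cancelled tasks, which may differ from what the real method yields for them.

```python
import asyncio


def extract_valid_results(tasks, exceptions):
    """Yield (name, result) for finished tasks; record failures in exceptions.

    tasks maps question names to asyncio.Task objects.
    """
    for name, task in tasks.items():
        if not task.done():
            raise ValueError(f"Task {name} is not done.")
        if task.cancelled():
            # Cancellation is expected for skipped questions (assumption:
            # this sketch yields nothing for them).
            continue
        error = task.exception()
        if error is not None:
            exceptions[name] = error
            continue
        yield name, task.result()


async def demo():
    async def ok():
        return "yes"

    async def boom():
        raise RuntimeError("model call failed")

    async def slow():
        await asyncio.sleep(10)

    tasks = {
        "q0": asyncio.create_task(ok()),
        "q1": asyncio.create_task(boom()),
        "q2": asyncio.create_task(slow()),
    }
    tasks["q2"].cancel()
    # Wait for every task to settle without propagating errors.
    await asyncio.gather(*tasks.values(), return_exceptions=True)

    exceptions = {}
    results = dict(extract_valid_results(tasks, exceptions))
    return results, exceptions


results, exceptions = asyncio.run(demo())
```

Note that task.cancelled() is checked before task.exception(), because calling exception() on a cancelled task raises CancelledError.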

_get_estimated_request_tokens(question) → float[source]

Estimate the number of tokens that will be required to run the focal task.

_get_invigilator(question: QuestionBase) → InvigilatorBase[source]

Return an invigilator for the given question.

Parameters:

question – the question to be answered
debug – whether to use debug mode, in which case InvigilatorDebug is used (not in the signature above; debug is a constructor argument).

_get_tasks_that_must_be_completed_before(*, tasks: list[Task], question: QuestionBase) → Generator[Task, None, None][source]

Return the tasks that must be completed before the given question can be answered.

Parameters:

tasks – a list of tasks that have been created so far.
question – the question for which we are determining dependencies.

If a question has no dependencies, the generator yields nothing (an empty list, [], once collected).
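As a rough sketch of the lookup, assume the dependencies are available as a mapping from question name to the set of names it depends on, and that tasks are stored by question name. Both assumptions, and the function name, are hypothetical; the real method takes a list of tasks and a question object.

```python
from typing import Dict, Generator, Set


def tasks_that_must_be_completed_before(
    dag: Dict[str, Set[str]],
    tasks_by_name: Dict[str, str],
    question_name: str,
) -> Generator[str, None, None]:
    """Yield the task of every question that question_name depends on."""
    # Sorting only makes the output order deterministic for the example.
    for parent in sorted(dag.get(question_name, set())):
        yield tasks_by_name[parent]


# Strings stand in for real task objects.
dag = {"q2": {"q0"}, "q1": {"q0"}}
tasks_by_name = {"q0": "task-q0", "q1": "task-q1"}

no_deps = list(tasks_that_must_be_completed_before(dag, tasks_by_name, "q0"))
q2_deps = list(tasks_that_must_be_completed_before(dag, tasks_by_name, "q2"))
```

Because the function is a generator, a question without dependencies produces no items, and collecting the output gives an empty list.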

_handle_exception(e: Exception, invigilator: InvigilatorBase, task=None)[source]

_skip_this_question(current_question: QuestionBase) → bool[source]

Determine if the current question should be skipped.

Parameters:

current_question – the question to be answered.

async async_conduct_interview(model_buckets: ModelBuckets | None = None, stop_on_exception: bool = False, sidecar_model: LanguageModel | None = None, raise_validation_errors: bool = True) → tuple[Answers, List[dict[str, Any]]][source]

Conduct an interview asynchronously. Returns a tuple of the answers and a list of valid results.

Parameters:

model_buckets – a dictionary of token buckets for the model.
debug – run without calls to the LLM (not in the signature above; debug is a constructor argument).
stop_on_exception – if True, stops the interview if an exception is raised.
sidecar_model – a sidecar model used to answer questions.

Example usage:

>>> i = Interview.example()
>>> result, _ = asyncio.run(i.async_conduct_interview())
>>> result['q0']
'yes'

>>> i = Interview.example(throw_exception = True)
>>> result, _ = asyncio.run(i.async_conduct_interview())
>>> i.exceptions
{'q0': ...

>>> i = Interview.example()
>>> result, _ = asyncio.run(i.async_conduct_interview(stop_on_exception = True))
Traceback (most recent call last):
...
asyncio.exceptions.CancelledError

property dag: DAG[source]

Return the directed acyclic graph for the survey.

The DAG, or directed acyclic graph, is a dictionary that maps question names to their dependencies. It is used to determine the order in which questions should be answered. This reflects both agent ‘memory’ considerations and ‘skip’ logic. The ‘textify’ parameter is set to True, so that the question names are returned as strings rather than integer indices.

>>> i = Interview.example()
>>> i.dag == {'q2': {'q0'}, 'q1': {'q0'}}
True
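To see how a dependency mapping of this shape determines answer order, the standard-library graphlib module (Python 3.9+) can group the example DAG into batches of questions that could run concurrently. This is an illustration only; nothing here implies that Interview uses graphlib, and concurrency_batches is a hypothetical helper.

```python
from graphlib import TopologicalSorter


def concurrency_batches(dag):
    """Group nodes into batches whose dependencies are already satisfied.

    dag maps each node to the set of nodes it depends on, which is the
    same shape TopologicalSorter expects.
    """
    sorter = TopologicalSorter(dag)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable display
        batches.append(ready)
        sorter.done(*ready)
    return batches


dag = {"q2": {"q0"}, "q1": {"q0"}}
batches = concurrency_batches(dag)
print(batches)  # [['q0'], ['q1', 'q2']]
```

Here q0 must be answered first, after which q1 and q2 have no unmet dependencies and can proceed together.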

duplicate(iteration: int, cache: Cache) → Interview[source]

Duplicate the interview, but with a new iteration number and cache.

>>> i = Interview.example()
>>> i2 = i.duplicate(1, None)
>>> i.iteration + 1 == i2.iteration
True

classmethod example(throw_exception: bool = False) → Interview[source]

Return an example Interview instance.

classmethod from_dict(d: dict[str, Any]) → Interview[source]

Return an Interview instance from a dictionary.

property has_exceptions: bool[source]

Return True if there are exceptions.

property interview_status: InterviewStatusDictionary[source]

Return a dictionary mapping task status codes to counts.

property task_status_logs: InterviewStatusLog[source]

Return the task status logs for the interview.

The keys are the question names; the values are the lists of status log changes for each task.

to_dict(include_exceptions=True, add_edsl_version=True) → dict[str, Any][source]

Return a dictionary representation of the Interview instance. This is just for hashing purposes.

>>> i = Interview.example()
>>> hash(i)
1217840301076717434

property token_usage: InterviewTokenUsage[source]

Determine how many tokens were used for the interview.