Estimating costs
Before running a survey, you can estimate the cost of running it (in USD and credits) in two different ways:
1. Call the estimate_job_cost() method on the Job object (a survey combined with one or more models). This will return the total estimated cost in USD, the total estimated input and output tokens, and estimated costs and tokens for each inference service and model used.
2. Call the remote_inference_cost() method on a Coop client object and pass it the job. This will return the estimated cost in credits and USD. (Credits are required to run surveys remotely. Learn more about using credits in the Credits section of the docs.)
Calculations
The above-mentioned methods use the following calculation for each question in a survey to estimate the total cost of the job:
Estimate the input tokens.
Compute the number of characters in the user_prompt and system_prompt, with any Agent and Scenario data piped in. (Note: Previous answers cannot be piped in because they are not available until the survey is run; they are left as Jinja-bracketed variables in the prompts for purposes of estimating tokens and costs.)
Apply a piping multiplier of 2 to the number of characters in the user prompt if it has an answer piped in from a previous question (i.e., if the question has Jinja braces). Otherwise, apply a multiplier of 1.
Convert the number of characters into the number of input tokens using a conversion factor of 4 characters per token, rounding down to the nearest whole number. (This approximation was established by OpenAI.)
Estimate the output tokens.
Apply a multiplier of 0.75 to the number of input tokens, rounding up to the nearest whole number.
Apply the token rates for the model and inference service.
Find the model and inference service for the question in the Pricing page: Total cost = (input tokens * input token rate) + (output tokens * output token rate)
If the model and inference service are not found, the following fallback token rates (for a low-cost OpenAI model) are used, and a warning message that a model price was not found is displayed:
USD 0.60 per 1M input tokens
USD 0.15 per 1M output tokens
Convert the total cost in USD to credits.
Total cost in credits = total cost in USD * 100, rounded up to the nearest 1/100th credit.
Then sum the costs for all question prompts to get the total cost of the job.
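Putting these steps together, a rough sketch of the per-question calculation in Python (the function name and structure are illustrative, not part of the EDSL API; the fallback rates above are used as defaults):

from math import ceil

def estimate_question_cost(
    user_prompt: str,
    system_prompt: str,
    has_piping: bool = False,
    input_rate: float = 0.60 / 1_000_000,   # fallback USD per input token
    output_rate: float = 0.15 / 1_000_000,  # fallback USD per output token
) -> tuple[float, float]:
    # Piping multiplier: 2 if an answer is piped into the user prompt, else 1
    chars = len(user_prompt) * (2 if has_piping else 1) + len(system_prompt)
    input_tokens = chars // 4                  # ~4 characters per token, rounded down
    output_tokens = ceil(0.75 * input_tokens)  # output estimated at 0.75x input, rounded up
    usd = input_tokens * input_rate + output_tokens * output_rate
    credits = ceil(usd * 100 * 100) / 100      # USD * 100, rounded up to nearest 1/100 credit
    return usd, credits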
Example
Here we create an example survey and agent, select a model, and combine them to create a job. Then we call the above-mentioned methods for estimating costs and show the underlying calculations. For more details, please see the Credits section of the docs.
[1]:
from edsl import QuestionFreeText, Survey, Agent, Model, Coop
from math import floor, ceil
[2]:
q0 = QuestionFreeText(
question_name = "favorite_flower",
question_text = "What is the name of your favorite flower?"
)
q1 = QuestionFreeText(
question_name = "flower_color",
question_text = "What color is {{ favorite_flower.answer }}?"
)
survey = Survey(questions = [q0, q1])
[3]:
a = Agent(traits = {"persona":"You are a botanist on Cape Cod."})
[4]:
m = Model("gpt-4o")
[5]:
job = survey.by(a).by(m)
[6]:
job.estimate_job_cost()
/Users/a16174/edsl/.venv/lib/python3.11/site-packages/edsl/agents/PromptConstructor.py:202: UserWarning: Question instructions still has variables: ['answer'].
warnings.warn(msg)
[6]:
{'estimated_total_cost': 0.0009175000000000001,
'estimated_total_input_tokens': 91,
'estimated_total_output_tokens': 69,
'model_costs': [{'inference_service': 'openai',
'model': 'gpt-4o',
'estimated_cost': 0.0009175000000000001,
'estimated_input_tokens': 91,
'estimated_output_tokens': 69}]}
[7]:
c = Coop()
[8]:
c.remote_inference_cost(job)
[8]:
{'credits': 0.1, 'usd': 0.00092}
[9]:
job.show_prompts()
/Users/a16174/edsl/.venv/lib/python3.11/site-packages/edsl/agents/PromptConstructor.py:202: UserWarning: Question instructions still has variables: ['answer'].
warnings.warn(msg)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ user_prompt                               ┃ system_prompt                                                    ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ What is the name of your favorite flower? │ You are answering questions as if you were a human. Do not break │
│                                           │ character. Your traits: {'persona': 'You are a botanist on Cape  │
│                                           │ Cod.'}                                                           │
├───────────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
│ What color is {{ answer }}?               │ You are answering questions as if you were a human. Do not break │
│                                           │ character. Your traits: {'persona': 'You are a botanist on Cape  │
│                                           │ Cod.'}                                                           │
└───────────────────────────────────────────┴──────────────────────────────────────────────────────────────────┘
Count the characters in each user prompt and system prompt:
[10]:
q0_user_prompt_characters = len("What is the name of your favorite flower?")
q0_user_prompt_characters
[10]:
41
[11]:
q0_system_prompt_characters = len("You are answering questions as if you were a human. Do not break character. Your traits: {'persona': 'You are a botanist on Cape Cod.'}")
q0_system_prompt_characters
[11]:
135
Apply the piping multiplier to the q1 user prompt:
[12]:
q1_user_prompt_characters = len("What color is {{ answer }}?") * 2
q1_user_prompt_characters
[12]:
54
The system prompt characters are identical because the same single agent is used for both questions:
[13]:
q1_system_prompt_characters = len("You are answering questions as if you were a human. Do not break character. Your traits: {'persona': 'You are a botanist on Cape Cod.'}")
q1_system_prompt_characters
[13]:
135
Estimate the input and output tokens for each set of prompts:
[14]:
q0_input_tokens = (q0_user_prompt_characters + q0_system_prompt_characters) // 4
q0_input_tokens
[14]:
44
[15]:
q0_output_tokens = ceil(0.75 * q0_input_tokens)
q0_output_tokens
[15]:
33
[16]:
q1_input_tokens = (q1_user_prompt_characters + q1_system_prompt_characters) // 4
q1_input_tokens
[16]:
47
[17]:
q1_output_tokens = ceil(0.75 * q1_input_tokens)
q1_output_tokens
[17]:
36
Apply the token rates for the model (gpt-4o: USD 2.50 per 1M input tokens and USD 10.00 per 1M output tokens at the time of writing):
[18]:
q0_tokens_cost = (2.50/1000000 * q0_input_tokens) + (10.00/1000000 * q0_output_tokens)
q0_tokens_cost
[18]:
0.00044000000000000007
[19]:
q1_tokens_cost = (2.50/1000000 * q1_input_tokens) + (10.00/1000000 * q1_output_tokens)
q1_tokens_cost
[19]:
0.00047750000000000006
[20]:
total_cost_usd = q0_tokens_cost + q1_tokens_cost
total_cost_usd
[20]:
0.0009175000000000001
Convert the USD costs to credits, rounding up to the nearest 1/100th credit:
[21]:
q0_credits = ceil(q0_tokens_cost * 100 * 100) / 100
q0_credits
[21]:
0.05
[22]:
q1_credits = ceil(q1_tokens_cost * 100 * 100) / 100
q1_credits
[22]:
0.05
[23]:
total_credits = q0_credits + q1_credits
total_credits
[23]:
0.1
Posting to Coop
[1]:
from edsl import Notebook
n = Notebook(path = "estimating_costs.ipynb")
n.push(description = "Estimating job costs", visibility = "public")
[1]:
{'description': 'Estimating job costs',
'object_type': 'notebook',
'url': 'https://www.expectedparrot.com/content/c379241a-7039-4505-8d42-4c909a54c6e0',
'uuid': 'c379241a-7039-4505-8d42-4c909a54c6e0',
'version': '0.1.37.dev1',
'visibility': 'public'}
Updating content at Coop:
[2]:
n = Notebook(path = "estimating_costs.ipynb")
n.patch(uuid = "c379241a-7039-4505-8d42-4c909a54c6e0", value = n)
[2]:
{'status': 'success'}