Remote Caching
Remote caching allows you to store responses from language models on the Expected Parrot server.
Activating remote caching
1. Log into your Coop account.
2. Navigate to API Settings. Toggle on the slider for Remote caching and copy your API key.
3. Add the following line to the .env file in your EDSL working directory (replace your_api_key_here with your actual API key):
EXPECTED_PARROT_API_KEY='your_api_key_here'
You can regenerate your key (and update your .env file) at any time.
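EDSL reads this key from your environment automatically, so no extra code is needed. Purely as an illustration of what a .env file provides, here is a minimal, hypothetical parser that loads KEY='value' lines into the environment (EDSL's actual loading mechanism may differ; this sketch is not part of the EDSL API):

```python
import os
import tempfile

def load_env_file(path):
    """Minimal illustrative .env parser: put KEY='value' lines into os.environ."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comments; split on the first '='.
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip().strip("'\""))

# Demo with a throwaway .env file.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as f:
    f.write("EXPECTED_PARROT_API_KEY='your_api_key_here'\n")
    path = f.name

load_env_file(path)
print(os.environ["EXPECTED_PARROT_API_KEY"])  # your_api_key_here
os.remove(path)
```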
Using remote caching
When remote caching is on, the results of any question or survey that you run will be stored automatically on the Expected Parrot server.
We can use remote caching by passing a Cache object to the run method of a survey.
Example
Here we import the Cache module together with Survey and Question types, create a survey, and pass a Cache() object when we call the run method on the survey. Note that we use an empty in-memory cache for demonstration purposes; the code can also be used with an existing local cache. See Caching LLM Calls for more details on caching results locally.
from edsl import QuestionMultipleChoice, QuestionFreeText, Survey, Cache
survey = Survey(questions=[QuestionMultipleChoice.example(), QuestionFreeText.example()])
result = survey.run(cache=Cache(), remote_cache_description="Example survey #1")
Remote cache logs
We can inspect Coop remote cache logs to verify that our results were cached successfully. The logs will show that we have 2 remote cache entries:
If you see more than 2 uploaded entries in your own logs, it may be that your local cache already contained some entries (see details about Syncing below).
We can inspect the details of individual entries by clicking on View entries.
Bulk remote cache operations
The remote cache logs page allows you to perform bulk operations on your cache entries. We currently support two bulk operations:
Send to cache: This creates unlisted cache objects on Coop that will appear at your My Caches page. After an object has been created you can change the visibility to public.
Delete: This deletes entries from your remote cache. This operation is currently irreversible, so use with caution!
When performing a bulk remote cache operation, you can select from one of three targets:
Selected entries: The entries you’ve selected via checkbox.
Search results: The entries that match your search query. Search queries are case insensitive. They match either the raw model output or the cache entry description.
Remote cache: All of the entries in your remote cache.
Clearing the cache programmatically
You are currently allowed to store a maximum of 50,000 entries in the remote cache.
Trying to exceed this limit will raise an APIRemoteCacheError.
If you need to clear the remote cache, you can do so with the following command:
# Remove all entries from the remote cache
coop.remote_cache_clear()
Output:
{'status': 'success', 'deleted_entry_count': 2}
You can also clear the logs shown on Coop as follows:
coop.remote_cache_clear_log()
Syncing
When you run a survey with remote caching enabled, the local and remote caches are synced.
How it works
Behind the scenes, remote caching involves the following steps:
Identify local cache entries not present in the remote cache, and vice versa.
Update the local cache with entries from the remote cache.
Run the EDSL survey.
Update the remote cache with entries from the local cache, along with the new entries from the survey.
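The four steps above can be sketched in plain Python, modeling each cache as a dict keyed by cache key. The function and variable names here are illustrative only, not EDSL's actual implementation:

```python
def sync_caches(local: dict, remote: dict, new_entries: dict):
    """Illustrative sketch of the remote cache sync steps."""
    # Step 1: identify entries each side is missing.
    remote_only = {k: v for k, v in remote.items() if k not in local}
    local_only = {k: v for k, v in local.items() if k not in remote}
    # Step 2: update the local cache with entries from the remote cache.
    local.update(remote_only)
    # Step 3: the survey runs here, producing new_entries; add them locally.
    local.update(new_entries)
    # Step 4: update the remote cache with local-only entries
    # plus the new entries from the survey.
    remote.update(local_only)
    remote.update(new_entries)
    return local, remote

# Mirroring the example that follows: 10 local + 15 remote + 2 new = 27 each.
local = {f"local-{i}": i for i in range(10)}
remote = {f"remote-{i}": i for i in range(15)}
new = {f"new-{i}": i for i in range(2)}
local, remote = sync_caches(local, remote, new)
print(len(local), len(remote))  # 27 27
```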
Let’s take a closer look at how syncing works. To start, we’ll create a local cache with some example entries. We’ll also add examples to the remote cache.
from edsl import CacheEntry, Cache, Coop
local_entries = [CacheEntry.example(randomize=True) for _ in range(10)]
remote_entries = [CacheEntry.example(randomize=True) for _ in range(15)]
# Add entries to local cache
c = Cache()
c.add_from_dict({entry.key: entry for entry in local_entries})
# Add entries to remote cache
coop = Coop()
coop.remote_cache_create_many(remote_entries, description="Set of 15 example entries")
We now have 10 entries in the local cache and 15 in the remote cache. We can verify this by looking at the remote cache logs:
Now, let’s run a survey:
from edsl import Survey, QuestionCheckBox, QuestionNumerical
survey = Survey(questions=[QuestionCheckBox.example(), QuestionNumerical.example()])
result = survey.run(cache=c, remote_cache_description="Example survey #2", verbose=True)
Setting the verbose flag to True provides us with some helpful output:
Updating local cache with 15 new entries from remote...
Local cache updated!
Running job...
Job completed!
Updating remote cache with 12 new entries... # 10 from local, 2 from survey
Remote cache updated!
There are 27 entries in the local cache.
We now have 27 entries in both caches:
To recap, our 27 entries come from:
- 15 entries in the remote cache (from calling coop.remote_cache_create_many)
- 10 entries in the local cache (from calling c.add_from_dict)
- 2 entries from the survey (from calling survey.run)
Remote cache methods
When remote caching is activated, EDSL will automatically send your LLM responses to the server when you run a job (i.e., you do not need to execute methods manually).
If you want to interact with the remote cache programmatically, you can use the following methods:
Coop class
- class edsl.coop.coop.Coop(api_key: str = None, url: str = None)
Bases: object
Client for the Expected Parrot API.
- remote_cache_clear() → dict
Clear all remote cache entries.
>>> entries = [CacheEntry.example(randomize=True) for _ in range(10)]
>>> coop.remote_cache_create_many(cache_entries=entries)
>>> coop.remote_cache_clear()
{'status': 'success', 'deleted_entry_count': 10}
- remote_cache_clear_log() → dict
Clear all remote cache log entries.
>>> coop.remote_cache_clear_log()
{'status': 'success'}
- remote_cache_create(cache_entry: CacheEntry, visibility: Literal['private', 'public', 'unlisted'] = 'private', description: str | None = None) → dict
Create a single remote cache entry. If an entry with the same key already exists in the database, update it instead.
- Parameters:
cache_entry – The cache entry to send to the server.
visibility – The visibility of the cache entry.
description (optional) – A description for this entry in the remote cache.
>>> entry = CacheEntry.example()
>>> coop.remote_cache_create(cache_entry=entry)
{'status': 'success', 'created_entry_count': 1, 'updated_entry_count': 0}
- remote_cache_create_many(cache_entries: list[CacheEntry], visibility: Literal['private', 'public', 'unlisted'] = 'private', description: str | None = None) → dict
Create many remote cache entries. If an entry with the same key already exists in the database, update it instead.
- Parameters:
cache_entries – The list of cache entries to send to the server.
visibility – The visibility of the cache entries.
description (optional) – A description for these entries in the remote cache.
>>> entries = [CacheEntry.example(randomize=True) for _ in range(10)]
>>> coop.remote_cache_create_many(cache_entries=entries)
{'status': 'success', 'created_entry_count': 10, 'updated_entry_count': 0}