File Store

FileStore is a module for storing and sharing data files on the Coop for use in EDSL projects, such as survey data, PDFs, CSVs, docs or images. It is particularly useful for storing data that will be used with surveys as Scenario objects, such as in data labeling tasks. Because the code for retrieving and processing the files can be included in your EDSL project, it also facilitates collaboration and replication of results.

File types

The following file types are currently supported by the FileStore:

  • CSV

  • PDF

  • PNG (image)

Posting a file

To post a file, import the relevant FileStore type (CSVFileStore, PDFFileStore or PNGFileStore) and create an object with the path to the file. Then call the push method to store the file on the Coop and get back a URL and uuid for accessing it. You can optionally pass description and visibility parameters to the push method. Coop objects can be public, private or unlisted; the default is unlisted.
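As a minimal sketch of passing the optional parameters, the helper below checks the visibility value before calling push. The names push_with_metadata, ALLOWED_VISIBILITY and StubFileStore are hypothetical and not part of EDSL. The sketch assumes only the push(description, visibility) signature listed in the FileStore class reference, and the stub stands in for a real FileStore object so that the code runs without a Coop account.

```python
ALLOWED_VISIBILITY = {"public", "private", "unlisted"}


def push_with_metadata(file_store, description=None, visibility="unlisted"):
    """Validate visibility, then push any object exposing push(description, visibility)."""
    if visibility not in ALLOWED_VISIBILITY:
        raise ValueError(f"visibility must be one of {sorted(ALLOWED_VISIBILITY)}")
    return file_store.push(description=description, visibility=visibility)


class StubFileStore:
    """Hypothetical stand-in for a FileStore object (no network access)."""

    def push(self, description=None, visibility="unlisted"):
        # A real push() returns more fields; only the two passed in are echoed here.
        return {"description": description, "visibility": visibility}


info = push_with_metadata(StubFileStore(), description="Survey data", visibility="private")
print(info)
```

With a real file, an object such as CSVFileStore("example.csv") would take the place of the stub.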

CSV example

from edsl.scenarios.FileStore import CSVFileStore

fs = CSVFileStore("example.csv")
info = fs.push()
print(info) # display the URL and Coop uuid of the stored file for retrieving it later

Example output (showing the default description and visibility setting):

{'description': 'File: example.csv',
'object_type': 'scenario',
'url': 'https://www.expectedparrot.com/content/4531d6ac-5425-4c93-aa02-07c1fa64aaa3',
'uuid': '4531d6ac-5425-4c93-aa02-07c1fa64aaa3',
'version': '0.1.33.dev1',
'visibility': 'unlisted'}
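Since the uuid is needed to retrieve the file later, it can be convenient to keep it alongside the project code. The following sketch writes the relevant fields of the dictionary shown above to a small JSON file using only the standard library. The helper names save_push_info and load_uuid and the file name are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Dictionary in the shape of the example output above.
info = {
    "description": "File: example.csv",
    "object_type": "scenario",
    "url": "https://www.expectedparrot.com/content/4531d6ac-5425-4c93-aa02-07c1fa64aaa3",
    "uuid": "4531d6ac-5425-4c93-aa02-07c1fa64aaa3",
    "version": "0.1.33.dev1",
    "visibility": "unlisted",
}


def save_push_info(info, path):
    """Store the fields needed to find the file again."""
    record = {key: info[key] for key in ("uuid", "url", "description")}
    Path(path).write_text(json.dumps(record, indent=2))
    return record


def load_uuid(path):
    """Read back the Coop uuid saved by save_push_info."""
    return json.loads(Path(path).read_text())["uuid"]


# A temporary directory is used here; a project would pick its own location.
record_path = Path(tempfile.mkdtemp()) / "example_csv_coop.json"
save_push_info(info, record_path)
print(load_uuid(record_path))
```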

PDF example

from edsl.scenarios.FileStore import PDFFileStore

fs = PDFFileStore("top_secret.pdf")
info = fs.push()
print(info) # display the URL and Coop uuid of the stored file for retrieving it later

Example output:

{'description': 'File: top_secret.pdf',
'object_type': 'scenario',
'url': 'https://www.expectedparrot.com/content/a6231668-3166-4741-93d8-f3248b91660f',
'uuid': 'a6231668-3166-4741-93d8-f3248b91660f',
'version': '0.1.33.dev1',
'visibility': 'unlisted'}

PNG example

from edsl.scenarios.FileStore import PNGFileStore

fs = PNGFileStore("parrot_logo.png")
info = fs.push()
print(info) # display the URL and Coop uuid of the stored file for retrieving it later

Example output:

{'description': 'File: parrot_logo.png',
'object_type': 'scenario',
'url': 'https://www.expectedparrot.com/content/148e6320-5642-486c-9332-a6d30be0daae',
'uuid': '148e6320-5642-486c-9332-a6d30be0daae',
'version': '0.1.33.dev1',
'visibility': 'unlisted'}

Retrieving and using a file

To retrieve a file, call the pull method of the relevant FileStore type (CSVFileStore, PDFFileStore or PNGFileStore), passing it the Coop uuid of the file you want to retrieve and the Expected Parrot URL. The method returns the file from the Coop as a FileStore object.

Once retrieved, a file can be converted into scenarios by calling the relevant method:

  • ScenarioList.from_csv() for CSV files

  • ScenarioList.from_pdf() for PDF files

  • Scenario.from_image() for PNG files (creates a single Scenario object)
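The pairing of file type, FileStore type and conversion method can be summarized as a lookup table. This is only an illustrative sketch: the SUPPORTED table and the lookup helper are hypothetical, and the table stores names as strings rather than importing EDSL.

```python
from pathlib import Path

# File suffix -> (FileStore type, conversion method name), as listed above.
SUPPORTED = {
    ".csv": ("CSVFileStore", "from_csv"),
    ".pdf": ("PDFFileStore", "from_pdf"),
    ".png": ("PNGFileStore", "from_image"),
}


def lookup(filename):
    """Return the FileStore type and conversion method names for a file."""
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED:
        raise ValueError(f"Unsupported file type: {filename!r}")
    return SUPPORTED[suffix]


print(lookup("example.csv"))
print(lookup("parrot_logo.PNG"))
```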

CSV example

Here we retrieve the CSV file posted above and then convert it into a ScenarioList object with the from_csv() method. The keys are the column names of the CSV file, which can be modified with the rename method.

from edsl.scenarios.FileStore import CSVFileStore
from edsl import ScenarioList

csv_file = CSVFileStore.pull("4531d6ac-5425-4c93-aa02-07c1fa64aaa3", expected_parrot_url="https://www.expectedparrot.com")

scenarios = ScenarioList.from_csv(csv_file.to_tempfile())
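To illustrate what it means for the keys to be the column names, here is a standard-library analogy. It is not how EDSL implements from_csv() or rename. The CSV content and the rename_keys helper are hypothetical. Each row becomes one dictionary keyed by the header row, and renaming maps old keys to new ones.

```python
import csv
import io

# Hypothetical CSV content with column names that contain spaces and capitals.
csv_text = "Respondent ID,Comment\n1,Great service\n2,Too slow\n"

# Each row becomes a dict whose keys are the column names.
rows = list(csv.DictReader(io.StringIO(csv_text)))


def rename_keys(row, mapping):
    """Return a copy of row with keys replaced according to mapping."""
    return {mapping.get(key, key): value for key, value in row.items()}


mapping = {"Respondent ID": "respondent_id", "Comment": "comment"}
renamed = [rename_keys(row, mapping) for row in rows]
print(renamed[0])
```

Short, identifier-like keys are presumably easier to reference later in a question text placeholder.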

PDF example

Here we retrieve the PDF file posted above and then convert it into a ScenarioList object with the from_pdf() method. The default keys are filename, page, text, which can be modified with the rename method.

from edsl.scenarios.FileStore import PDFFileStore
from edsl import ScenarioList

pdf_file = PDFFileStore.pull("a6231668-3166-4741-93d8-f3248b91660f", expected_parrot_url="https://www.expectedparrot.com")

scenarios = ScenarioList.from_pdf(pdf_file.to_tempfile())

To inspect the keys:

scenarios.parameters

Output:

{'filename', 'page', 'text'}

PNG example

Here we retrieve the PNG file posted above and then convert it into a Scenario object with the from_image() method. We can optionally pass the name of a key to use for the scenario object, or edit the key later.

from edsl.scenarios.FileStore import PNGFileStore
from edsl import Scenario

png_file = PNGFileStore.pull("148e6320-5642-486c-9332-a6d30be0daae", expected_parrot_url="https://www.expectedparrot.com")

scenario = Scenario.from_image(png_file.to_tempfile(), "parrot_logo") # including a key for the scenario object

Working with scenarios

Before using the scenario, we can verify the key and value of the scenario object (e.g., by printing), and rename the key as desired to use in survey questions.

For a single Scenario we can check the key:

scenario.keys()

(For a ScenarioList object, we can use the parameters attribute to get the keys.)

If the key is parrot_logo and you want to rename it logo:

scenario = scenario.rename({"parrot_logo": "logo"})

To use it in a question, the question should be parameterized with the key:

from edsl import QuestionFreeText

q = QuestionFreeText(
    question_name = "test",
    question_text = "What is the logo of the company? {{ logo }}"
)

results = q.by(scenario).run()
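Before running a survey, it can help to confirm that every placeholder in the question text has a matching scenario key. The sketch below does this with a regular expression from the standard library. The helpers placeholders and missing_keys are hypothetical and only handle simple {{ name }} placeholders, not everything a templating engine may accept.

```python
import re

# Matches simple placeholders such as {{ logo }} or {{logo}}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def placeholders(question_text):
    """Return the set of placeholder names used in the question text."""
    return set(PLACEHOLDER.findall(question_text))


def missing_keys(question_text, scenario_keys):
    """Return placeholder names that have no matching scenario key."""
    return placeholders(question_text) - set(scenario_keys)


question_text = "What is the logo of the company? {{ logo }}"

# After renaming the key to "logo", nothing is missing.
print(missing_keys(question_text, ["logo"]))

# Before renaming, the question refers to a key the scenario does not have.
print(missing_keys(question_text, ["parrot_logo"]))
```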

Example notebook

The following notebook at the Coop includes the above code examples: https://www.expectedparrot.com/content/e1a00873-dfc6-4383-9426-cc032296bab1

FileStore class

class edsl.scenarios.FileStore.CSVFileStore(path: str | None = None, mime_type: str | None = None, binary: bool | None = None, suffix: str | None = None, base64_string: str | None = None, external_locations: Dict[str, str] | None = None, **kwargs)

Bases: FileStore

classmethod example()

Returns an example Scenario instance.

Parameters:

randomize – If True, adds a random string to the value of the example key.

view()

class edsl.scenarios.FileStore.FileStore(path: str | None = None, mime_type: str | None = None, binary: bool | None = None, suffix: str | None = None, base64_string: str | None = None, external_locations: Dict[str, str] | None = None, **kwargs)

Bases: Scenario

__init__(path: str | None = None, mime_type: str | None = None, binary: bool | None = None, suffix: str | None = None, base64_string: str | None = None, external_locations: Dict[str, str] | None = None, **kwargs)

Initialize a new Scenario.

# :param data: A dictionary of keys/values for parameterizing questions. #

static base64_to_file(base64_string, is_binary=True)

static base64_to_text_file(base64_string) → IO

encode_file_to_base64_string(file_path: str)

classmethod example(example_type='text')

Returns an example Scenario instance.

Parameters:

randomize – If True, adds a random string to the value of the example key.

classmethod from_dict(d)

Convert a dictionary to a scenario.

Example:

>>> Scenario.from_dict({"food": "wood chips"})
Scenario({'food': 'wood chips'})

classmethod from_url(url: str, download_path: str | None = None, mime_type: str | None = None) → FileStore

Parameters:
  • url – The URL of the file to download.

  • download_path – The path to save the downloaded file.

  • mime_type – The MIME type of the file. If None, it will be guessed from the file extension.
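Guessing a MIME type from a file extension can be done with Python's standard mimetypes module. The sketch below shows the general idea under that assumption. It is not taken from the EDSL source, and guess_mime_type is a hypothetical helper. An explicitly supplied mime_type takes precedence, and the URL's path (without any query string) is used for the guess.

```python
import mimetypes
from urllib.parse import urlparse


def guess_mime_type(url, mime_type=None):
    """Return mime_type if given, else guess it from the extension in the URL path."""
    if mime_type is not None:
        return mime_type
    path = urlparse(url).path  # drops the query string and fragment
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed  # None when the extension is missing or unknown


print(guess_mime_type("https://example.com/files/report.pdf"))
print(guess_mime_type("https://example.com/files/logo.png?size=large"))
print(guess_mime_type("https://example.com/download", mime_type="text/csv"))
```

When the URL has no recognizable extension, the guess is None, which is a case where passing mime_type explicitly is useful.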

open() → IO

property path: str

Property that returns a valid path to the file content. If the original path doesn’t exist, generates a temporary file from the base64 content.
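The fallback described above can be sketched with the standard library: the file content travels as a base64 string, and a temporary file is written from it when the original path is not available. This is an illustration of the idea only, not the EDSL implementation. The helpers encode_file and resolve_path are hypothetical.

```python
import base64
import os
import tempfile


def encode_file(file_path):
    """Read a file and return its content as a base64 string."""
    with open(file_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")


def resolve_path(original_path, base64_string, suffix=""):
    """Return original_path if it exists, else write the content to a temp file."""
    if original_path and os.path.exists(original_path):
        return original_path
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    with tmp:
        tmp.write(base64.b64decode(base64_string))
    return tmp.name


# Create a small source file and encode it.
src = tempfile.NamedTemporaryFile(suffix=".csv", delete=False)
with src:
    src.write(b"a,b\n1,2\n")
encoded = encode_file(src.name)

# Simulate a machine where the original path does not exist.
os.remove(src.name)

restored = resolve_path(src.name, encoded, suffix=".csv")
with open(restored, "rb") as f:
    print(f.read())
```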

classmethod pull(uuid: str, expected_parrot_url: str | None = None) → FileStore

Parameters:
  • uuid – The UUID of the object to pull.

  • expected_parrot_url – The URL of the Parrot server to use.

Returns:

The object pulled from the Parrot server.
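A common mistake is to pass the full content URL where the uuid is expected. As a hedged sketch, the helper below checks the uuid format with the standard uuid module before calling pull. The names is_valid_uuid, pull_checked and StubFileStore are hypothetical, and the stub replaces a real FileStore type so that the example runs offline. Only the pull(uuid, expected_parrot_url) signature documented above is assumed.

```python
import uuid


def is_valid_uuid(value):
    """Return True if value is a canonical, hyphenated UUID string."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


def pull_checked(file_store_cls, coop_uuid,
                 expected_parrot_url="https://www.expectedparrot.com"):
    """Validate the uuid, then call pull on the given FileStore type."""
    if not is_valid_uuid(coop_uuid):
        raise ValueError(f"Not a valid uuid: {coop_uuid!r}")
    return file_store_cls.pull(coop_uuid, expected_parrot_url=expected_parrot_url)


class StubFileStore:
    """Hypothetical stand-in for a FileStore type (no network access)."""

    @classmethod
    def pull(cls, uuid, expected_parrot_url=None):
        return {"uuid": uuid, "server": expected_parrot_url}


print(pull_checked(StubFileStore, "4531d6ac-5425-4c93-aa02-07c1fa64aaa3"))
```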

push(description: str | None = None, visibility: str = 'unlisted') → dict

Push the object to Coop.

Parameters:
  • description – The description of the object to push.

  • visibility – The visibility of the object to push.

property size: int

to_tempfile(suffix=None)

upload_google(refresh: bool = False) → None

view(max_size: int = 300) → None

class edsl.scenarios.FileStore.HTMLFileStore(path: str | None = None, mime_type: str | None = None, binary: bool | None = None, suffix: str | None = None, base64_string: str | None = None, external_locations: Dict[str, str] | None = None, **kwargs)

Bases: FileStore

classmethod example()

Returns an example Scenario instance.

Parameters:

randomize – If True, adds a random string to the value of the example key.

view()

class edsl.scenarios.FileStore.PDFFileStore(path: str | None = None, mime_type: str | None = None, binary: bool | None = None, suffix: str | None = None, base64_string: str | None = None, external_locations: Dict[str, str] | None = None, **kwargs)

Bases: FileStore

classmethod example()

Returns an example Scenario instance.

Parameters:

randomize – If True, adds a random string to the value of the example key.

view()

class edsl.scenarios.FileStore.PNGFileStore(path: str | None = None, mime_type: str | None = None, binary: bool | None = None, suffix: str | None = None, base64_string: str | None = None, external_locations: Dict[str, str] | None = None, **kwargs)

Bases: FileStore

classmethod example()

Returns an example Scenario instance.

Parameters:

randomize – If True, adds a random string to the value of the example key.

view()

class edsl.scenarios.FileStore.SQLiteFileStore(path: str | None = None, mime_type: str | None = None, binary: bool | None = None, suffix: str | None = None, base64_string: str | None = None, external_locations: Dict[str, str] | None = None, **kwargs)

Bases: FileStore

classmethod example()

Returns an example Scenario instance.

Parameters:

randomize – If True, adds a random string to the value of the example key.

view()

edsl.scenarios.FileStore.view_csv(csv_path)

edsl.scenarios.FileStore.view_html(html_path)

edsl.scenarios.FileStore.view_pdf(pdf_path)