Extracting information from PDFs

This notebook provides sample EDSL code demonstrating the from_pdf() method, which imports a PDF and automatically creates a Scenario object for each page. The page texts can then be used as parameters of survey questions. This is helpful when using EDSL to extract qualitative information from a long document efficiently.

EDSL is an open-source library for simulating surveys and experiments with AI agents and large language models. Please see our documentation page for tips and tutorials on getting started.

How it works

EDSL comes with a variety of question types that we can select from based on the desired form of the response (multiple choice, free text, etc.). We can also parameterize questions with textual content in order to ask questions about it. We do this by creating a {{ placeholder }} in a question text, e.g., What are the key themes of this text: {{ text }}, and then creating Scenario objects for the content to be inserted in the placeholder when we run the survey. This allows us to administer multiple versions of a question with different inputs all at once. A common use case for this is performing data labeling tasks designed as questions about one or more pieces of textual data that can be inserted into the survey question texts. Learn more about using scenarios.
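
To make the placeholder idea concrete, here is a minimal plain-Python sketch of the substitution step. It is not EDSL's implementation: the render() helper and the two sample scenarios are hypothetical, and it only handles simple {{ key }} placeholders.

```python
import re


def render(question_text, scenario):
    """Replace each {{ key }} placeholder with the matching scenario value."""

    def substitute(match):
        key = match.group(1)
        return str(scenario[key])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, question_text)


question_text = "What are the key themes of this text: {{ text }}"

# Hypothetical scenarios: one dict per piece of content to insert.
scenarios = [
    {"text": "First passage."},
    {"text": "Second passage."},
]

# One rendered prompt per scenario, all from the same question text.
prompts = [render(question_text, s) for s in scenarios]
for prompt in prompts:
    print(prompt)
```

Each scenario yields its own version of the question, which is the sense in which a single question can be administered with many inputs at once.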

Example

For demonstration purposes we use a PDF copy of the first page of the recent paper Automated Social Science: Language Models as Scientist and Subjects and conduct a survey consisting of several questions about its contents.

Importing the tools:

[1]:
# pip install edsl
[2]:
from edsl.questions import QuestionFreeText, QuestionList
from edsl import ScenarioList, Survey

Here we create a survey of questions to administer for each page of the PDF. Note that the placeholder in each question must be {{ text }}, because from_pdf() stores the content of each page under the key text (for scenarios that you construct yourself, you can use any placeholder name that you like):

[3]:
q_summary = QuestionFreeText(
    question_name="summary",
    question_text="Briefly summarize the abstract of this paper: {{ text }}",
)

q_authors = QuestionList(
    question_name="authors",
    question_text="List the names of all the authors of the following paper: {{ text }}",
)

q_thanks = QuestionList(
    question_name="thanks",
    question_text="List the names of the people thanked in the following paper: {{ text }}",
)

survey = Survey([q_summary, q_authors, q_thanks])

Next we create a ScenarioList for the PDF using the from_pdf() method. It automatically creates one Scenario object per page of the PDF, and the text of each page is inserted in our questions when the survey runs (in our example, there is just one page, the first page of the paper):

[4]:
automated_social_scientist = ScenarioList.from_pdf("automated_social_scientist.pdf")
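
As a rough illustration of the structure that results, the sketch below builds one plain dictionary per page with the filename, page, and text keys that appear in the scenario output later in this notebook. The pages_to_scenarios() helper and the sample page texts are hypothetical; EDSL's own extraction code is not shown here and may work differently.

```python
def pages_to_scenarios(filename, page_texts):
    """Build one plain dict per page, numbering pages from 1."""
    return [
        {"filename": filename, "page": number, "text": text}
        for number, text in enumerate(page_texts, start=1)
    ]


# Hypothetical page texts standing in for text extracted from a PDF.
page_texts = ["Title and abstract ...", "1 Introduction ..."]

scenarios = pages_to_scenarios("example.pdf", page_texts)
for scenario in scenarios:
    print(scenario["page"], scenario["text"])
```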

Now we can add the list of scenarios to the survey and run it:

[5]:
results = survey.by(automated_social_scientist).run()

We can see a list of all the components of results that are directly accessible:

[6]:
results.columns
[6]:
['agent.agent_instruction',
 'agent.agent_name',
 'answer.authors',
 'answer.summary',
 'answer.thanks',
 'comment.authors_comment',
 'comment.thanks_comment',
 'iteration.iteration',
 'model.frequency_penalty',
 'model.logprobs',
 'model.max_tokens',
 'model.model',
 'model.presence_penalty',
 'model.temperature',
 'model.top_logprobs',
 'model.top_p',
 'prompt.authors_system_prompt',
 'prompt.authors_user_prompt',
 'prompt.summary_system_prompt',
 'prompt.summary_user_prompt',
 'prompt.thanks_system_prompt',
 'prompt.thanks_user_prompt',
 'question_options.authors_question_options',
 'question_options.summary_question_options',
 'question_options.thanks_question_options',
 'question_text.authors_question_text',
 'question_text.summary_question_text',
 'question_text.thanks_question_text',
 'question_type.authors_question_type',
 'question_type.summary_question_type',
 'question_type.thanks_question_type',
 'raw_model_response.authors_raw_model_response',
 'raw_model_response.summary_raw_model_response',
 'raw_model_response.thanks_raw_model_response',
 'scenario.edsl_class_name',
 'scenario.edsl_version',
 'scenario.filename',
 'scenario.page',
 'scenario.text']
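
The column names follow a prefix.name pattern (answer, comment, prompt, scenario, and so on). The plain-Python sketch below groups a few of the names above by prefix to make that pattern easier to see; it works on a hand-copied subset of the list and does not call EDSL.

```python
from collections import defaultdict

# A hand-copied subset of the column names listed above.
columns = [
    "answer.authors",
    "answer.summary",
    "answer.thanks",
    "comment.authors_comment",
    "scenario.page",
    "scenario.text",
]

# Split each name at the first dot and collect the names under their prefix.
grouped = defaultdict(list)
for column in columns:
    prefix, _, name = column.partition(".")
    grouped[prefix].append(name)

for prefix, names in grouped.items():
    print(prefix, names)
```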

We can select components of the results to inspect and print:

[7]:
results.select("summary", "authors", "thanks").print(format="rich")
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ answer                               answer                               answer                              ┃
┃ .summary                             .authors                             .thanks                             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ The abstract describes a novel       ['Benjamin S. Manning', 'Kehang      ['Drew Houston', 'Jordan            │
│ method for generating and testing    Zhu', 'John J. Horton']              Ellenberg', 'Benjamin Lira          │
│ social science hypotheses using                                           Luttges', 'David Holtz', 'Bruce     │
│ large language models (LLMs) and                                          Sacerdote', 'Paul Röttger',         │
│ structural causal models (SCMs).                                          'Mohammed Alsobay', 'Ray Duch',     │
│ SCMs help articulate hypotheses,                                          'Matt Schwartz', 'David Autor',     │
│ construct LLM-based agents, design                                        'Dean Eckles']                      │
│ experiments, and analyze data. The                                                                            │
│ research applies this method to                                                                               │
│ various scenarios such as                                                                                     │
│ negotiations, bail hearings, job                                                                              │
│ interviews, and auctions, proposing                                                                           │
│ and evaluating causal                                                                                         │
│ relationships. The findings                                                                                   │
│ indicate that while LLMs can                                                                                  │
│ predict the direction of causal                                                                               │
│ effects, they struggle with the                                                                               │
│ magnitude unless informed by a                                                                                │
│ fitted SCM. The study suggests that                                                                           │
│ LLMs have latent knowledge that can                                                                           │
│ be better utilized when paired with                                                                           │
│ SCMs. The paper includes thanks for                                                                           │
│ support and feedback from various                                                                             │
│ individuals and mentions that the                                                                             │
│ authors' contact information, code,                                                                           │
│ and data will be available online.                                                                            │
└─────────────────────────────────────┴─────────────────────────────────────┴─────────────────────────────────────┘

Another example

Let’s try another example: the complete paper Owning, Using and Renting: Some Simple Economics of the “Sharing Economy”.

Here we import it and verify that all the pages have been turned into scenarios:

[8]:
sharing_economy = ScenarioList.from_pdf("sharing_economy.pdf")
[9]:
len(sharing_economy)
[9]:
56
[10]:
sharing_economy[0:2]
[10]:
{
    "scenarios": [
        {
            "filename": "sharing_economy.pdf",
            "page": 1,
            "text": "Owning, Using and Renting:\nSome Simple Economics of the \u201cSharing Economy\u201d\u2217\nApostolos Filippas\u2020\nJohn J. Horton\u2021\nRichard J. Zeckhauser\u00a7\nMay 10, 2019\nAbstract\nNew Internet-based \u201csharing economy\u201d markets enable consumer-owners to rent out\ntheir durable goods to non-owners. We model such markets, and explore their equilib-\nria both in the short-run, in which ownership decisions are \ufb01xed, and in the long-run,\nin which ownership decisions can be changed. We \ufb01nd that \u201csharing economy\u201d markets\nalways expand consumption and increase surplus, but may increase or decrease owner-\nship. Regardless, ownership is decoupled from individual preferences in the long-run,\nas the rental rates and the purchase prices of goods become equal. If there are costs\nof bringing unused capacity to the market, they are partially passed through, creat-\ning a bias towards ownership. To test our theoretical work empirically, we conduct a\nsurvey of consumers, \ufb01nding broad support for our modeling assumptions. The survey\nalso allows us to o\ufb00er a partial decomposition of the bring-to-market costs, based on\nattributes that make a good more or less amenable to being shared.\n\u2217Thanks to Andrey Fradkin, Samuel Fraiberger, Joe Golden, Ramesh Johari, Arun Sundararajan, and\nHal Varian for helpful discussions and comments. Author contact information and code are currently or will\nbe available at http://www.john-joseph-horton.com/.\n\u2020Fordham University, Gabelli School of Business\n\u2021New York University, Stern School of Business\n\u00a7Harvard University, Kennedy School of Government\n1\n",
            "edsl_version": "0.1.24",
            "edsl_class_name": "Scenario"
        },
        {
            "filename": "sharing_economy.pdf",
            "page": 2,
            "text": "1\nIntroduction\nIn traditional rental markets owners hold assets to rent them out. In recent years, \ufb01rms\nhave created a new kind of rental market, in which owners sometimes use their assets for\npersonal consumption, and sometimes rent them out. Such markets are commonly referred\nto as peer-to-peer (P2P) rental or \u201csharing economy\u201d markets. To be sure, some renting\nby consumer-owners has long existed, but given the high transaction cost per rental, it was\nlargely con\ufb01ned to expensive, infrequently used goods, such as vacation homes and pleasure\nboats, usually with rental periods of longer duration. More often, goods were shared among\nfamily and friends, often without explicit payment. In contrast, these new P2P rental markets\nare open markets, and the good is \u201cshared\u201d in exchange for payment.\nAirbnb is a prominent example of a P2P rental market, enabling individuals to rent out\nspare bedrooms, apartments, or entire homes. Airbnb and platforms like it have been her-\nalded by many, as they promise to expand access to goods, diversify individual consumption,\nbolster e\ufb03ciency by increasing asset utilization, and provide income to owners (Botsman and\nRogers, 2010; Edelman and Geradin, 2015; Sundararajan, 2016). The business interest in\nthese platforms has been intense.1\nCompanies organizing \u201csharing economy\u201d markets have also attracted substantial policy\ninterest, much of it negative (Malhotra and Van Alstyne, 2014; Avital et al., 2015; Slee, 2015;\nFilippas and Horton, 2018). Critics charge that the primary competitive advantage of these\nplatforms is their ability to duck costly regulations\u2014regulations that protect third-parties\nand remedy market failures.2 However, the counter-argument is often made that existing\nregulations were designed to solve market problems that these \u201csharing economy\u201d platforms\nsolve in an innovative fashion, primarily with better information provision and reputation\nsystems, thereby making top-down regulation unnecessary (Koopman et al., 2014).\nProgress in designing and operating P2P rental markets, as well as in advancing the\ncorresponding policy debate, requires a better understanding of these markets. More speci\ufb01-\ncally, what are the economic problems that P2P rental markets address, what are the drivers\nbehind their recent emergence, and what are the likely short- and the long-run properties\nand e\ufb00ects of these markets? The goal of this paper is to provide answers to these questions.\n1Airbnb alone has attracted nearly $4.4 billion in venture capital investment, and was valued at $31 billion\nduring its most recent funding round. Uber, which also has a P2P rental market\u2014albeit with a substantial\nlabor component\u2014was valued at $62.5 billion in its last funding round (see also http://www.crunchbase.\ncom/organization/airbnb, and http://www.crunchbase.com/organization/uber).\n2For example, Dean Baker, in an opinion piece for the Guardian characterizes Airbnb and Uber as be-\ning primarily based on \u201cevading regulations and breaking the law\u201d (see also http://www.theguardian.\ncom/commentisfree/2014/may/27/airbnb-uber-taxes-regulation). Edelman and Geradin (2015) dis-\ncuss both the promised e\ufb03ciencies of \u201csharing economy\u201d platforms, and the regulatory issues they raise.\nCannon and Summers (2014) o\ufb00er a playbook for \u201csharing economy\u201d companies to win over regulators.\n2\n",
            "edsl_version": "0.1.24",
            "edsl_class_name": "Scenario"
        }
    ]
}

Let’s see which pages are the most important. We start by generating a summary of the paper based on the abstract, using only the first scenario (the first page of the paper). We can also create an agent with a relevant persona for the model to use in answering the questions (learn more about creating AI agents to answer survey questions):

[11]:
from edsl.questions import QuestionFreeText, QuestionList, QuestionLinearScale
from edsl import Agent, Survey
[12]:
social_scientist_agent = Agent(
    {"persona": "You are an experienced social scientist."},
    instruction="You are evaluating the contents of a research paper.",
)
[13]:
q_summary = QuestionFreeText(
    question_name="summary",
    question_text="Draft a summary of the paper based on the abstract: {{ text }}",
)

q_authors = QuestionList(
    question_name="authors", question_text="List the authors of this paper: {{ text }}"
)

survey = Survey([q_summary, q_authors])
[14]:
results = survey.by(sharing_economy[0]).by(social_scientist_agent).run()

results.select("summary", "authors").print(format="rich")
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ answer                                                  answer                                                 ┃
┃ .summary                                                .authors                                               ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ The paper by Filippas, Horton, and Zeckhauser examines  ['Apostolos Filippas', 'John J. Horton', 'Richard J.   │
│ the economic implications of the sharing economy,       Zeckhauser']                                           │
│ where consumer-owners can rent out durable goods to                                                            │
│ non-owners through Internet-based markets. The authors                                                         │
│ develop a model to analyze the market equilibria in                                                            │
│ both short-run (with fixed ownership decisions) and                                                            │
│ long-run (with variable ownership decisions)                                                                   │
│ scenarios. They find that sharing economy markets                                                              │
│ invariably increase consumption and overall surplus.                                                           │
│ However, the effect on ownership is ambiguous; it may                                                          │
│ rise or fall. In the long run, the model predicts that                                                         │
│ ownership preferences become irrelevant as rental                                                              │
│ rates and purchase prices converge. The paper also                                                             │
│ acknowledges that costs associated with bringing                                                               │
│ unused capacity to the market can lead to a partial                                                            │
│ pass-through, which biases the market towards                                                                  │
│ ownership. The authors' empirical work, which includes                                                         │
│ a consumer survey, supports the theoretical                                                                    │
│ assumptions and offers insight into the costs of                                                               │
│ bringing goods to the sharing economy, highlighting                                                            │
│ characteristics that influence an item's shareability.                                                         │
└────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────┘

We’ll use the summary as context for a new set of questions. First we prompt the agent to quote the most important sentence on each page, then we ask it to summarize the five most important ideas, and finally we ask it to rate the relative importance of each page of the paper:

[15]:
summary = results.select("summary").first()
[16]:
q_idea = QuestionFreeText(
    question_name="idea",
    question_text="Paper summary: "
    + summary
    + " Quote the most important sentence on this page: {{ text }}",
)
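
A note on why the question text above is built with + rather than an f-string: in a Python f-string, doubled braces are an escape for a single literal brace, so {{ text }} would come out as { text } and no longer look like a scenario placeholder. The short sketch below shows the difference using a hypothetical stand-in for the summary; it does not call EDSL.

```python
summary = "A short summary."  # hypothetical stand-in for the model's answer

# Concatenation leaves the double braces untouched.
concatenated = (
    "Paper summary: "
    + summary
    + " Quote the most important sentence on this page: {{ text }}"
)

# In an f-string, each doubled brace collapses to a single literal brace.
naive_fstring = f"Paper summary: {summary} Quote the most important sentence on this page: {{ text }}"

# Quadrupled braces survive f-string formatting as double braces.
escaped_fstring = f"Paper summary: {summary} Quote the most important sentence on this page: {{{{ text }}}}"

print(concatenated)
print(naive_fstring)
print(escaped_fstring)
```
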
[17]:
ideas = (
    q_idea.by(sharing_economy).by(social_scientist_agent).run().select("idea").to_list()
)
ideas[0:10]
[17]:
['We find that “sharing economy” markets always expand consumption and increase surplus, but may increase or decrease ownership. Regardless, ownership is decoupled from individual preferences in the long-run, as the rental rates and the purchase prices of goods become equal.',
 'The goal of this paper is to provide answers to these questions.',
 'Our first major question is why P2P rental markets only became a force in the 21st century, despite the fact that the economic problem these markets are able to solve—under-utilization of durable goods—is hardly new.',
 'While ownership may increase or decrease in the long-run, the option of renting out an owned good makes ownership more valuable. As such, a P2P rental market can have a market-expanding effect, in the sense that it allows a previously infeasible product market to emerge.',
 'owners with lower valuations are the biggest beneficiaries, as they consume the good less of the time, and hence they have more excess capacity to rent. Similarly, non-owners with higher valuations see the largest increase in surplus. As such, the greatest gains in surplus are obtained when original non-owners value the good nearly as highly as owners, suggesting that goods where income (rather than taste or planned usage) primarily explains ownership could offer the greatest increase in surplus.',
 'Our main \xadfinding is that income is only important in determining ownership for a small number of the goods we asked about (e.g., vacation homes); for most goods, planned usage was the primary driver, thus supporting our basic modeling framework.',
 'The economic rationale for P2P rental markets is that owners of most durable goods use them substantially less than 100% of the time. This under-utilization generates excess capacity that can then be rented out to non-owners who would like to use the good, but not enough to purchase it.',
 'The second, often understated, reason behind the decrease in transaction costs and the subsequent proliferation of P2P rental markets, is that these markets have stood on the shoulders of their electronic commerce predecessors, such as eBay and Amazon. There are now more than 20 years of accumulated industrial experience in building online marketplaces and solving their fundamental problems. The creator of a potential P2P rental market can easily draw upon this experience. At the same time, several aspects of these fundamental problems are different in the P2P context, requiring innovative solutions.',
 'To reduce transaction costs, and close these gaps, P2P rental markets give individual owners resources that are available to traditional firms. Platforms can do this because they enjoy scale economies for many costly tasks compared to individual owners.',
 'An example is found in the case of home-sharing, where residential houses now become mixed-use real estate, creating negative externalities that can lead to market failure, and which previous public policy responses are not fit to address (Filippas and Horton, 2018).']
[18]:
q_important = QuestionFreeText(
    question_name="important",
    question_text=f"""Paper summary: {summary}
    Consider the following ideas that are mentioned in the paper and
    summarize the 5 most important of them: {ideas}.""",
)
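
Because ideas is a Python list, the f-string above inserts its printed representation (brackets, quotes, and commas) into the question text. That works, but if a cleaner prompt is preferred, one possible alternative is to join the items into a numbered list first. The sketch below uses two hypothetical ideas in place of the real list.

```python
ideas = ["First idea.", "Second idea."]  # hypothetical stand-in for the real list

# What an f-string inserts for a list: its printed representation.
as_repr = f"{ideas}"

# One alternative: a numbered list, one idea per line.
numbered = "\n".join(f"{i}. {idea}" for i, idea in enumerate(ideas, start=1))

print(as_repr)
print(numbered)
```
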
[19]:
results = q_important.by(social_scientist_agent).run()
results.select("important").print(format="rich")
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ answer                                                                                                          ┃
┃ .important                                                                                                      ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ The paper by Filippas, Horton, and Zeckhauser provides a comprehensive analysis of the sharing economy's impact │
│ on consumption, ownership, and market equilibria. The five most important ideas from the paper are: (1) The     │
│ sharing economy invariably increases consumption and surplus, with the ambiguous effect on ownership, which may │
│ rise or fall. In the long run, ownership preferences become irrelevant as rental rates and purchase prices      │
│ converge. (2) Peer-to-peer (P2P) rental markets have become significant due to decreased transaction costs and  │
│ the accumulated experience from e-commerce platforms. These markets utilize under-utilized durable goods by     │
│ allowing owners to rent them out, increasing the value of ownership. (3) The sharing economy benefits both      │
│ owners with lower valuations, who can monetize their excess capacity, and non-owners with higher valuations,    │
│ who gain access to goods they value without the need to own them. (4) The economic model developed in the paper │
│ indicates that bringing unused capacity to the market incurs costs, which leads to partial pass-through in      │
│ rental rates and biases the market towards ownership. (5) Empirical evidence from consumer surveys supports the │
│ theoretical model, showing that planned usage rather than income is the primary determinant of ownership for    │
│ most goods, and that characteristics such as predictability and chunkiness of use influence an item's           │
│ shareability and the likelihood of ownership.                                                                   │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
[20]:
important_ideas = results.select("important").first()
[21]:
q_relative = QuestionLinearScale(
    question_name="relative",
    question_text="Consider the following paper summary: "
    + important_ideas
    + " What is the relative importance of this page of the paper: {{ text }}",
    question_options=[0, 1, 2, 3, 4, 5],
    option_labels={0: "Unimportant", 3: "Important", 5: "Most important"},
)
[22]:
results = q_relative.by(sharing_economy).by(social_scientist_agent).run()

We can filter and sort pages based on the responses, and inspect the agent’s comments on its answers:

[23]:
(
    results.sort_by("page")
    .filter("relative == '5'")
    .select("page", "relative", "relative_comment")
    .print(format="rich")
)
┏━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ scenario  answer     comment                                                                                  ┃
┃ .page     .relative  .relative_comment                                                                        ┃
┡━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 5         5          This page of the paper appears to be quite important as it provides key insights into    │
│                      how bring-to-market (BTM) costs influence the dynamics of the sharing economy. It        │
│                      discusses the implications of these costs on the rental market's viability, the rental   │
│                      rates, and the decision-making process regarding ownership versus renting. The page      │
│                      connects the theoretical model with practical considerations and outlines the conditions │
│                      under which the sharing economy is most beneficial. The impact of BTM costs on consumer  │
│                      behavior, market equilibrium, and the potential for profit in rental businesses are all  │
│                      critical elements for understanding the sharing economy. Therefore, this page is likely  │
│                      to be of high importance in the context of the paper.                                    │
├──────────┼───────────┼──────────────────────────────────────────────────────────────────────────────────────────┤
│ 14        5          This page of the paper appears to be crucial as it outlines the mathematical model that  │
│                      underpins the theoretical framework of the paper. It provides the utility functions for  │
│                      both owners and renters within the sharing economy, which is essential for understanding │
│                      the economic implications of peer-to-peer rental markets. The model also delineates the  │
│                      conditions under which owners and non-owners decide to rent or rent out goods,           │
│                      respectively. Furthermore, the section on short-run equilibrium introduces important     │
│                      concepts for understanding market dynamics in the presence of fixed ownership. The       │
│                      complexity and relevance of the mathematical model to the paper's aims make this page    │
│                      highly important.                                                                        │
├──────────┼───────────┼──────────────────────────────────────────────────────────────────────────────────────────┤
│ 20        5          The page provides a detailed analysis of how the P2P rental market affects the utility   │
│                      of both owners and non-owners, showing the differential impact on consumption and        │
│                      utility based on the valuation of the goods. This analysis is crucial for understanding  │
│                      the distributional consequences of the sharing economy and the dynamics of ownership and │
│                      renting. It also touches upon the long-term equilibrium effects and the behavioral       │
│                      adjustments of consumers near the extensive margin, which are significant insights for   │
│                      comprehending the broader economic implications of the sharing economy. Therefore, the   │
│                      content of this page is highly relevant and integral to the paper's overall argument and │
│                      findings.                                                                                │
├──────────┼───────────┼──────────────────────────────────────────────────────────────────────────────────────────┤
│ 23        5          This page is crucial as it discusses the dynamics of ownership decisions in relation to  │
│                      consumer surplus, which is a key aspect of the sharing economy's impact on market        │
│                      equilibria. It provides a mathematical framework for understanding how changes in        │
│                      ownership due to P2P rental markets affect consumer surplus, and it illustrates the      │
│                      conditions under which surplus is maximized. The page also acknowledges the broader      │
│                      implications of increased consumption due to the sharing economy, such as the impact on  │
│                      complementary goods and labor, as well as potential negative externalities. Overall, it  │
│                      encapsulates significant theoretical and practical insights into the functioning of the  │
│                      sharing economy, making it of high importance in the context of the paper.               │
└──────────┴───────────┴──────────────────────────────────────────────────────────────────────────────────────────┘
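
For readers who prefer to see the logic spelled out, the filter-then-sort step can be sketched in plain Python over a list of dictionaries. The rows below are made-up values for illustration only, not the actual results, and the sketch does not use the EDSL Results API.

```python
# Made-up rows standing in for (page, rating) pairs from the results.
rows = [
    {"page": 20, "relative": 5},
    {"page": 3, "relative": 2},
    {"page": 5, "relative": 5},
    {"page": 14, "relative": 5},
]

# Keep only the pages rated 5, then order them by page number.
top_pages = sorted(
    (row for row in rows if row["relative"] == 5),
    key=lambda row: row["page"],
)

for row in top_pages:
    print(row["page"], row["relative"])
```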

The text of the selected pages:

[24]:
(
    results.sort_by("page")
    .filter("relative == '5'")
    .select("page", "text")
    .print(format="rich")
)
┏━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ scenario  scenario                                                                                             ┃
┃ .page     .text                                                                                                ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 5         owners with lower valuations are the biggest beneficiaries, as they consume the good less             │
│           of the time, and hence they have more excess capacity to rent. Similarly, non-owners with            │
│           higher valuations see the largest increase in surplus. As such, the greatest gains in surplus        │
│           are obtained when original non-owners value the good nearly as highly as owners, suggesting          │
│           that goods where income (rather than taste or planned usage) primarily explains ownership            │
│           could offer the greatest increase in surplus.                                                         │
│           Although we began by assuming that owners can rent out their unused capacity costlessly,             │
│           in practice, making a good available for rentals is at least somewhat costly—as we argued, one       │
│           could conceptualize the rise of the “sharing economy” as caused by a significant decrease in          │
│           these costs. Some of these bring-to-market (BTM) costs are straightforward, such as labor,           │
│           depreciation, and complementary consumables. For example, driving with Uber requires                 │
│           labor, increases the car’s mileage, and consumes gas.                                                │
│           In our model, when we assume that owners do face BTM costs, the predictions change                   │
│           in several important ways. We find that if BTM costs are sufficiently high relative to the              │
│           purchase price of the good, a P2P rental market cannot be supported at all. If the market            │
│           can exist, BTM costs lower the quantity of the good transacted in the market and raise the           │
│           rental rate, both in the short- and the long-run. In particular, we show that BTM costs              │
│           do get incorporated into rental rates, being the equivalent of a per-unit sales tax. As with         │
│           a sales tax, they are not fully incorporated in the rental rate—the magnitude of the pass-           │
│           through depends on the elasticity of the supply (owners) and the elasticity of the demand            │
│           (renters). An implication of the incomplete pass-through is that both owning and renting             │
│           become less compelling as BTM costs increase. Furthermore, total ownership may either                │
│           increase or decrease in the long-run as BTM costs change, depending on the incidence of the          │
│           BTM costs.                                                                                           │
│           When making a good available to be rented is costless, the rentals option decouples in-              │
│           dividual preferences from ownership. However, when BTM costs are introduced, the incom-              │
│           plete pass-through of BTM costs couples preferences and ownership again, tilting consumers           │
│           with higher valuations towards ownership. The reason consumers with higher valuations—               │
│           and hence more planned usage—find ownership relatively more attractive than owners with               │
│           low valuations, is that consumers bear no BTM costs for own-usage. This is similar to the            │
│           inefficient bias towards home production that a labor market wedge creates.                            │
│           The incomplete pass-through finding implies that, when BTM costs are positive, it be-                 │
│           comes loss-making to buy the good solely to rent it out—if BTM costs are zero, it is merely          │
│           zero-profit. This result has important managerial implications for would-be rental firms.              │
│           However, in the presence of large setup costs, or significant economies of scale in offering           │
│           rental services, for-profit firms can compete.                                                         │
│           5                                                                                                    │
│                                                                                                                │
├──────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ 14        Assume that owners of the good can provide their unused quantities to the market at                  │
│           a rental rate r, where the rental period is the lifetime of the asset.12 An owner’s problem          │
│           becomes to select the optimal personal usage                                                         │
│           x_O(r; \alpha) = \arg\max_{x \in [0,1]} \underbrace{2\alpha x - x^2}_{\text{consumption util.}}      │
│               + \underbrace{r(1 - x)}_{\text{rental income}} = \max\{0, \alpha - r/2\}, \tag{2}                │
│           which yields utility                                                                                 │
│           u_O(r; \alpha) = \alpha^2 - \alpha r + r^2/4 + r - p. \tag{3}                                        │
│           In the presence of the rental market, owners of the good reduce their usage to gain the              │
│           benefits of “sharing.” Owners are never worse off than they were before the rental option,           │
│           as they can choose not to participate in the P2P rental market.                                      │
│           Non-owners can choose to become renters. At rental rate r, a renter’s problem is                     │
│           x_R(r; \alpha) = \arg\max_{x \in [0,1]} 2\alpha x - x^2 - \underbrace{r x}_{\text{rental cost}}      │
│               = \max\{0, \alpha - r/2\}, \tag{4}                                                               │
│           through which a renter obtains utility                                                               │
│           u_R(r; \alpha) = \alpha^2 - \alpha r + r^2/4. \tag{5}                                                │
│           With P2P rentals, non-owners can consume the good some of the time, and hence reap                   │
│           higher utility. However, not all non-owners benefit, as those with valuations $\alpha < r/2$ remain  │
│           excluded from consumption, and their utility remains unchanged.                                      │
│           3.3 Short-run equilibrium (fixed ownership)                                                          │
│           To examine the short-run effects of the emergence of a P2P rental market, we assume that              │
│           original purchase decisions are fixed: owners cannot become renters, and non-owners cannot            │
│           buy the good to become owners.                                                                       │
│           With ownership being fixed, the short-run equilibrium is characterized by the equilibrium             │
│           market rental rate rS. The highest-valuation potential renter is the one who was previously          │
│           indifferent between owning and not owning the good, and hence for any quantity to be                  │
│           rented, rS ≤2√p. As owners can make their capacity available on the market without costs,            │
│           owners have an incentive to rent out their good if rS ≥0. The short-run rental market is             │
│           12We examine the case where making the excess capacity available in the market is costly in Section  │
│           5.                                                                                                   │
│           14                                                                                                   │
│                                                                                                                │
├──────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ 20        4 Economic effects for consumers                                                                     │
│           4.1 Using, renting, and the distributional consequences                                              │
│           Following the introduction of the P2P rental option, owners decrease their consumption, from         │
│           $x^*(\alpha) = \alpha$ to $x_O(r; \alpha) = \alpha - r/2$. While owners’ utility from consuming the  │
│           good decreases, their utility from renting increases by a greater amount. The net increase equals    │
│           \Delta u_O = (1 - \alpha) r + r^2/4. \tag{12}                                                        │
│           From Equation 12, we see that if r > 0 then ∆uO > 0; hence, the utility of all owners                │
│           increases. As $d\Delta u_O / d\alpha < 0$, owners with low valuations obtain the greatest benefits   │
│           from renting                                                                                         │
│           out their goods: as usage is analogous to valuation, owners with low valuations have more            │
│           excess capacity to rent out. If the short-run rental rate rS is lower than the purchase price        │
│           p, then the rental rate increases in the long-run equilibrium, that is, rL > rS. Consequently,       │
│           owners see their utility further increase in the long-run, as they rent out their excess capacity    │
│           at a higher price. The opposite holds in the case where the short-run rental rate exceeds the        │
│           purchase price.                                                                                      │
│           With P2P rentals, non-owners—who previously obtained zero utility—can become renters                 │
│           and consume some of the good, increasing their consumption from 0 to                                 │
│           $x_R(r; \alpha) = \alpha - r/2$, and obtaining utility                                               │
│           \Delta u_R = (\alpha - r/2)^2. \tag{13}                                                              │
│           Unlike owners, it is the higher-valuation renters who benefit most from the introduction of           │
│           the P2P rental option. As such, if the short-run rental rate rS is lower than the purchase           │
│           price p, renters see their utility decrease in the long-run (and vice versa), relative to the        │
│           short-run equilibrium, but their utility is still higher than the pre-“sharing” status quo. The      │
│           rental option does not benefit every non-owner: non-owners with very low valuations will              │
│           still not consume any of the good.                                                                   │
│           The biggest beneficiaries from the emergence of the P2P rental market are consumers                   │
│           near the extensive margin, i.e., the breakeven point for ownership. In the short-run, these          │
│           consumers see their utilities increase the most, as they constitute the highest-valuation non-       │
│           owners and the lowest-valuation owners (see Equation 12 and 13). In the long-run, at-the-            │
│           margin consumers who revise their ownership see the largest utility gains: maintaining their         │
│           ownership status-quo is made more attractive than without the P2P rental option, and hence           │
│           consumers who revise their ownership decisions are made even better off.                              │
│           It is worthwhile noting that owners never rent out their entire capacity. This commonly              │
│           20                                                                                                   │
│                                                                                                                │
├──────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ 23        in the long-run equilibrium of the P2P rental market.                                                │
│           To calculate the change in total                                                                     │
│           consumer surplus, we can ignore changes in rental rates, as the corresponding changes in             │
│           rental incomes and expenditures for consumers that did not revise their ownership, as these          │
│           changes simply constitute transfers. As such, we need focus only on consumers who revise             │
│           their ownership decisions.                                                                           │
│           Consumers revise their ownership decisions because doing so                                          │
│           increases their utilities. Hence, their actions are surplus-improving.                               │
│           The highest short- and long-run surplus gains are obtained when rS = rL = p. Let                     │
│           ∆U(p) = UL(p) −U0(p). Clearly, ∆U(0) = 0, as the P2P rental option is of no benefit when              │
│           a good can be purchased at no cost. The first order derivative of the surplus gains yields            │
│           \frac{d}{dp} \Delta U(p) =                                                                           │
│               \underbrace{\int_{\sqrt{p}}^{1} dF(\alpha)}_{\text{pre-P2P rental capacity}}                     │
│               - \underbrace{\int_{p/2}^{1} (\alpha - p/2) \, dF(\alpha)}                                       │
│                 _{\text{long-run consumption with P2P rentals}} \tag{17}                                       │
│           If rS > p, ownership decreases in the long-run equilibrium, which implies that the pre-P2P           │
│           rental market capacity exceeds the total consumption of the good in the long-run market              │
│           equilibrium, and hence $\frac{d}{dp} \Delta U(p) > 0$. Similarly, if rS < p, then                    │
│           $\frac{d}{dp} \Delta U(p) < 0$. Therefore,                                                           │
│           Equation 17 implies that the long-run surplus gains are maximized when no consumers                  │
│           revise their ownership decisions in the long-run. Furthermore, as US ≥UL, with the equality          │
│           holding only when no consumers revise their ownership decisions, short-run surplus gains             │
│           are also maximized when rS = p.                                                                      │
│           These results are graphically depicted in the bottom panel of Figure 3c, for the case                │
│           of uniformly distributed consumer valuations. There exist consumer surplus gains in the              │
│           short-run equilibrium, and these gains further increase in the long-run equilibrium. Equality        │
│           holds only when no consumers revise their ownership decisions in the long-run, that is, when         │
│           rS = p, in which case both the maximum short- and long-run surplus gains are obtained.               │
│           An additional source of surplus is found in the market-expanding region, depicted in the             │
│           shaded area, where demand would be zero in the absence of a P2P rental market. This                  │
│           kind of surplus is fundamentally different from the previous cases, and is not captured in the        │
│           formulation of Equation 16.                                                                          │
│           A complete assessment of the surplus implications of P2P rentals would necessarily con-              │
│           sider industry-specific factors. As the consumption of the focal good increases, so will the          │
│           consumption of complementary goods and labor, which further increase surplus. However,               │
│           increased consumption of goods with negative externalities—say, an Airbnb rental in a build-         │
│           ing creates unwanted disturbance to neighbors—may lead to a decrease in surplus, and pos-            │
│           sibly to a market failure (Filippas and Horton, 2018). Another assumption in our surplus             │
│           calculations is that there exist no pecuniary externalities, that is, that the purchase price of     │
│           23                                                                                                   │
│                                                                                                                │
└──────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────┘
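
The page text shown above keeps the PDF's hard line breaks, its end-of-line hyphenation (for example "in-" followed by "dividual"), and the printed page number as a final line. If cleaner text is wanted before it is inserted into question texts, a small post-processing step may help. The sketch below is plain Python and is not part of EDSL: the helper name `clean_page_text` is hypothetical, and it assumes that the page number is the last non-empty line and consists only of digits.

```python
import re


def clean_page_text(text: str) -> str:
    """Tidy raw text from one PDF page (hypothetical helper, not an EDSL function)."""
    # Strip surrounding whitespace and drop empty lines left by the page layout.
    lines = [line.strip() for line in text.splitlines()]
    lines = [line for line in lines if line]
    # Assumption: a final line made only of digits is the printed page number.
    if lines and lines[-1].isdigit():
        lines.pop()
    joined = "\n".join(lines)
    # Rejoin words split by end-of-line hyphenation: "in-\ndividual" -> "individual".
    joined = re.sub(r"(\w)-\n(\w)", r"\1\2", joined)
    # Turn the remaining hard line breaks into single spaces.
    return re.sub(r"\s*\n\s*", " ", joined)


sample = "the rentals option decouples in-\ndividual preferences from ownership.\n5\n"
print(clean_page_text(sample))
```

One known limitation of this approach: a genuine hyphenated compound that happens to break at a line end (such as "pass-" followed by "through") is also merged into a single word, so the output may need a manual check when exact wording matters.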

Please see our documentation page for examples of other survey methods and use cases!