Do time preferences explain low health insurance take-up?

Low insurance take-up in low-income populations is not easily explained by the standard single-period expected utility model of insurance that overlooks the relevance of time preference when liquidity is constrained. We design field survey instruments to elicit quasi-hyperbolic time preferences, as well as prospect theory risk preferences, and use them to examine whether time preferences explain health insurance behavior of low-income Filipinos. Consistent with theory, those who exhibit stronger time preference are less likely to insure and the partial association is most pronounced at low wealth where liquidity is most likely to be constrained. Among those with better understanding of insurance, lower take-up is also associated with present bias. We do not find that insurance is significantly associated with risk preferences. JEL Classification: D03, D81, D90, I13


Introduction
Low take-up of insurance against substantial risks related to healthcare and agriculture in developing countries is difficult to explain with the standard single-period expected utility model of insurance. That may be because that model affords no role to time preference despite insurance often requiring upfront payment of a premium to secure entitlement to compensation for a future loss. This paper examines whether time preference can partly explain low insurance take-up.
Evidence that the timing of premium payments influences demand for insurance in low-income settings (Casaburi & Willis, 2018; Belissa et al., 2019; Liu et al., 2020) is consistent with the time dimension of insurance acquiring importance when liquidity is constrained (Ericson & Sydnor, 2018). Both discounting of future compensation when credit cannot be used to pay a premium up front and present bias may reduce demand. However, the explanation for low insurance take-up does not necessarily lie only in time preferences. Compared with expected utility maximization, a more descriptively accurate model of decision under risk may provide greater insight into why so many low-income individuals do not insure. A prospect theory model of insurance in which the premium is perceived as a loss predicts that convex utility over losses will drive demand down (Wakker et al., 1997). Any motivation to insure would then have to come from distorted transformation of loss probabilities into decision weights (Barseghyan et al., 2013; Jaspersen et al., 2021).
To assess empirical support for these explanations of low insurance take-up, we elicit quasi-hyperbolic time preferences and prospect theory risk preferences in a nationwide survey of low-income individuals in the Philippines and examine whether these preferences are associated with the decision to voluntarily insure medical expenses rather than remain uninsured. As far as we know, no previous study has elicited constant discounting, present bias, utility curvature, and probability weighting in a general population survey and estimated associations between insurance and these four dimensions of preferences.
To motivate the empirical analysis, we obtain predictions from a simple model of the decision whether or not to enroll in health insurance. A contract that stipulates a minimum period between payment of a premium and entitlement to make a claim leaves scope for time preference to impinge on the decision to insure if credit cannot be used to smooth the cost of the premium across periods. We show that this intuition is correct irrespective of whether the time dimension of the decision is embedded in an expected utility model of insurance or a prospect theory model. The insurance probability is predicted to decrease with stronger time preference when there is a binding liquidity constraint. Constrained liquidity is likely to be common in the Filipino population we study and previous evidence, consistent with theory (Ericson & Sydnor, 2018), suggests it is associated with lower demand for insurance (Platteau et al., 2017): poorer households are less likely to insure (Giné & Yang, 2009; Cole et al., 2013); take-up responds positively to cash handouts given immediately before insurance is offered (Cole et al., 2013); and the opportunity to delay payment of a premium to harvest time raises enrollment substantially (Casaburi & Willis, 2018; Belissa et al., 2019; Liu et al., 2020). 1 Constant discounting is not necessarily the best description of the time dimension of economic behavior (Laibson, 1997; Frederick et al., 2002; Anderson et al., 2008; Dupas, 2011; Burks et al., 2012; Wang et al., 2016). We show that present bias compounds the negative effect of time preference on the demand for insurance unless there is an instrument available to commit to the purchase of insurance.
Consistent with the first prediction, we find that those who discount more aggressively are less likely to purchase health insurance, and this relationship is strongest among the least wealthy who are most likely to be liquidity constrained. In the full sample, insurance take-up is not related to present bias. However, when we restrict attention to respondents who display some understanding of how health insurance operates and what it covers, there is clear evidence that the more present biased are less likely to insure.
Our finding of a negative association between health insurance and discounting is consistent with evidence that an option to delay payment of the premium raises take-up of agricultural insurance (Casaburi & Willis, 2018; Belissa et al., 2019; Liu et al., 2020). 2 Consistent with theory that identifies illiquidity as the mechanism through which the time dimension of insurance acquires relevance (Ericson & Sydnor, 2018) and with our finding that the insurance-discounting association is stronger at lower wealth, Casaburi and Willis (2018) find that the effect of delayed payment is stronger among poorer and liquidity constrained farmers. They also demonstrate that an opportunity to pre-commit can raise take-up, which suggests that present bias partly explains the large effect of delayed payment. 3 Unlike these experiment-based studies, this paper does not deliver evidence of the causal effect of an intervention from which preferences can be inferred subject to the validity of assumptions about the mechanism. Rather, we directly elicit preferences in a nationwide survey and examine their associations with insurance behavior.
We do not find evidence of consistent and statistically significant associations between insurance and risk preferences. A point estimate indicates that, in the sample, greater risk seeking (convex utility) over losses is associated with a higher probability of insurance, and this association is significant when the sample is restricted to respondents who show better understanding of the concept and content of insurance. While inconsistent with theory, the direction of this association is consistent with evidence, based on preferences elicited from Holt and Laury (2002) lotteries over gains, that insurance demand is weak among predominantly risk averse individuals (Chemin, 2018) and that demand is decreasing with increasing risk aversion (Giné et al., 2008; Giné & Yang, 2009; Giesbert et al., 2011; Cole et al., 2013; Clarke, 2016; Liu et al., 2020). It may be that insurance is viewed as a risky prospect by many with little experience of it.
Besides the finding that low take-up of health insurance in the Philippines is associated with time preference and, to a lesser extent, present bias, the paper introduces a new instrument that makes it feasible to elicit four dimensions of preferences in a field survey of the general population. In developing country settings, even standard preferences are usually elicited from samples of students, farmers, or households in small communities (Binswanger, 1980; Cardenas & Carpenter, 2008, 2013; Tanaka et al., 2010; Vieider et al., 2015; Vieider et al., 2019).
Elicitation of quasi-hyperbolic time preferences and prospect theory risk preferences is usually thought to require a laboratory setting and some degree of sophistication on the part of respondents (Charness et al., 2013). To make the exercise feasible in a time-limited field survey, we asked (mostly poorly educated) respondents to make only four sets of choices. From the implied indifference points, we derive non-parametric preference measures. We also identify four preference parameters that rest on assumptions about the nature of utility and probability weighting functions. We obtain a non-parametric estimate of the bivariate relationship between insurance and each dimension of preferences (captured by both the non-parametric measure and the respective parameter) and estimate partial associations from models that include all four preferences and covariates.
Section 2 provides the theoretical motivation. Section 3 presents the preference elicitation instrument and derives the preference measures and parameters. Section 4 provides background on health insurance in the Philippines. Section 5 describes the data. Section 6 describes distributions of the preference measures and parameters before presenting estimates of their bivariate and partial associations with insurance. The final section discusses implications and limitations.

Theory
We consider a consumer deciding whether to purchase insurance that offers a fixed quantity of cover at a given price. This corresponds to the voluntary social health insurance enrollment decision in the Philippines and elsewhere. Although our focus is on time preferences, we begin with the standard assumption that risk is resolved in the same period that the premium is paid and consider the role of risk preferences according to both expected utility (EU) and (cumulative) prospect theory (PT). We then allow for delay between payment of the premium and resolution of the risk and examine the role of time preferences in each model. 4

Risk preferences
Under EU, an offer of a fixed quantity of insurance (≤ loss) will be accepted provided the price is actuarially fair, or subsidized, and utility is concave in the level of consumption. Consider the offer to insure a fraction, between 0 and 1, of a loss $L$ incurred with probability $p$ at a given price. Since insurance at a price that is not unfair reduces the variance of consumption and does not reduce its expected value, expected utility is greater with insurance provided utility is concave. If the price is unfair, then a higher price necessitates a higher degree of concavity for insurance to be purchased.
4 Jaspersen et al. (2021) compare the predictive performance of 17 models of insurance. We restrict attention to EU and PT to explore the consequences of introducing time into two of the main models and because we elicit preferences deemed relevant by these models.
In PT, the motivation to insure differs from that identified with EU. Consider the preferences of an individual facing risky outcomes, which are real numbers denoted $x$ or $z$ that are defined as changes from some reference situation. Let $x_p z$ denote a prospect yielding $x$ with probability $p$ and $z$ otherwise. Risk attitudes are represented through a utility function $u$, which is anticipated to be concave for gains but convex for losses, and a probability weighting function $w(p)$ that is increasing and maps cumulative probabilities into decision weights (Kahneman & Tversky, 1979; Tversky & Kahneman, 1992). The prospect is valued as $w(p)u(x) + (1 - w(p))u(z)$, with $x \le z \le 0$ in the loss domain.
Applied to insurance, the reference point is usually taken to be consumption in the state in which insurance is not purchased and the bad outcome does not occur (Wakker et al., 1997). 5 In that case, the choice is between a small certain loss (the premium) and a larger uncertain loss (medical expenses). If utility is indeed convex for losses, then utility curvature does not motivate insurance. It would generate risk seeking and constrain demand for insurance that carries no default risk. The motivation for insurance can come from the weighting of probabilities. If, as is often found in laboratory experiments (Wakker, 2010, p. 204), the weighting function is inverse S-shaped, overweighting smaller probabilities ($p < 1/3$) and underweighting moderate and larger ones, then this would raise demand for insurance against a fixed loss that occurs with a smallish probability. 6 Inverse S-types would be more likely to insure than both EU-types who weight probabilities linearly and S-types who underweight smaller probabilities. 7 When, as with medical expenses, the loss is not fixed, extreme outcomes in each direction can occur with small probability. Inverse S-shaped probability weighting can then result in the overweighting of both unusually large losses and unusually small ones. Willingness to pay for insurance then depends on the precise shape of both the distribution of losses and the weighting function (Baillon et al., 2021).
5 See Jaspersen et al. (2021) for alternatives.
6 This is because the weight placed on the probability of the loss rises, $w(p) > p$, while the weight attached to the prospect of paying the insurance premium is the same as that given under EU, $w(1) = 1$.
7 Exactly how low the loss probability must be for these predictions to hold depends on the precise shape of the weighting function. From previous evidence (Wakker, 2010, p. 204), if the probability of loss is 10%, then we would expect that probability to be overweighted.

Time preferences
Under EU, time preference is irrelevant to insurance demand if there is no constraint on liquidity; otherwise, time preference reduces demand (Liu & Myers, 2016;Casaburi & Willis, 2018). 8 When liquidity is constrained, insurance must be adjusted to compensate for the reduced opportunity to use credit to make transfers between periods. Purchasing insurance creates imbalance in consumption across periods if the premium must be paid up front and cannot be credit-financed. It also ties resources up in an illiquid asset, raising exposure to (formally) uninsured risks that (constrained) borrowing is insufficient to cope with. Through these direct and indirect mechanisms, insurance affects cross-period, and not only cross-state, consumption levels and so becomes dependent on time preference. Unlike the previous literature that assumes EU maximization, we analyze the influence of time preference on insurance without confining attention to a particular theory of decision under risk.

Constant discounting
First, consider a two-period model with no opportunity to borrow or save. With some probability, a fixed loss is incurred in the second period. We assume decisions can be represented by discounted utility, which is a function of consumption determined by time- and state-invariant per-period income and the risk, with the second-period value $V$ weighted by the discount factor $\delta$. 9 $V$ can be either EU or PT; in the PT case, we take income to be the reference point. 10
8 Liu and Myers (2016) show that when liquidity is constrained full insurance is not optimal at an actuarially fair premium. Their equation (11) implies that the optimal insurance quantity depends on the discount factor. Casaburi and Willis (2018) show (equation (4)) that the insurance probability is decreasing with both the degree of time discounting and present bias.
9 We assume the discount factor is bounded at 1 in order to examine how insurance varies with the intensity of positive time preference. We do not constrain elicited time preference to be positive.
10 In general, discounting V need not be the same as applying PT on discounted utilities. However, given there is no risk in the first period and income is the reference point in each period, the two approaches are consistent in the problem we analyze. The relevant reference point is that contemplated at the time the individual decides whether or not to insure. At this time, the individual compares a prospect with insurance with a prospect without insurance. Evaluation of each prospect is made with respect to the same reference point, which we assume is no insurance-no loss. By the time the risk is resolved and the insurance is paid out (if there is a loss), the reference point will have shifted. But that is irrelevant to the insurance decision, as modelled.
In the first period, there is a take-it-or-leave-it offer to purchase a fixed quantity of insurance at a given price that must be paid up front. Taking the offer lowers first-period consumption by the premium and improves the second-period prospect. The offer is accepted if the discounted gain in second-period value from being insured is at least as large as the first-period utility cost of the premium. Since the discounted gain is increasing in $\delta$, stronger discounting (smaller $\delta$) reduces the propensity to insure.
OBSERVATION: When it is not possible to borrow or save, individuals who discount the future more heavily are less likely to insure irrespective of whether they behave in accordance with expected utility or prospect theory.
For some combinations of risk attitudes and price, even someone who does not discount the future will decline the offer of insurance. In such cases, variation in the discount factor will obviously have no influence on the decision. However, provided risk aversion and the price are such that insurance is purchased by someone with no time preference, then an otherwise identical individual will be less likely to insure the more they discount the future. This is because insuring involves making a sacrifice now to cushion a possible loss in the future and there are no other instruments available to offset the intertemporal redistribution.
The analysis assumes there is no opportunity to borrow. Lack of credit to purchase health insurance is not a particularly strong assumption in the context of our empirical analysis. It is critical to the result. The constrained opportunity to borrow to pay the insurance premium creates the dependence of insurance on time preference. 11 To simplify the exposition, we have confined attention to the extreme case in which there is no opportunity to borrow whatsoever. But if a less extreme liquidity constraint were imposed, the logic tying the demand for insurance to the discount factor would still hold. Given a large fraction of the Filipino households in our sample are likely to be liquidity constrained, we expect to find a positive relationship between insurance and the discount factor.
11 Ericson and Sydnor (2018) show that predictions obtained from the standard single-period expected utility model of insurance differ from those obtained from a multi-period model when liquidity is constrained. By restricting attention to a two-period model with the risk of incurring a loss in the second period, we rule out the possibility of borrowing to smooth consumption over that loss. Allowing for the possibility to self-insure through credit would reduce the demand for formal insurance offered at an actuarially unfair price (Gollier, 2003) but would not change our prediction that the propensity to insure falls together with the discount factor unless credit is available to pay the premium.
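The observation that heavier discounting lowers the propensity to insure can be illustrated numerically. The following is a minimal sketch rather than the model analyzed here: it assumes expected utility with CRRA utility, full cover of the loss, and purely illustrative values for income, loss, loss probability, and premium. All function and parameter names are hypothetical.

```python
import math


def crra(c, gamma=2.0):
    """CRRA utility of consumption c; gamma is an assumed curvature parameter."""
    if gamma == 1.0:
        return math.log(c)
    return c ** (1.0 - gamma) / (1.0 - gamma)


def net_gain_from_insuring(delta, income=100.0, loss=60.0, p=0.2,
                           premium=13.0, gamma=2.0):
    """Discounted-utility gain from accepting cover when borrowing and saving are ruled out."""
    # Period 1: the premium is paid out of current income (no credit available).
    cost_now = crra(income, gamma) - crra(income - premium, gamma)
    # Period 2: with (assumed) full cover the loss is reimbursed in every state.
    v_insured = crra(income, gamma)
    # Period 2 without cover: the loss is borne with probability p.
    v_uninsured = p * crra(income - loss, gamma) + (1.0 - p) * crra(income, gamma)
    # Only the second-period gain is discounted.
    return delta * (v_insured - v_uninsured) - cost_now


def insures(delta, **kwargs):
    """True if the take-it-or-leave-it offer is accepted at discount factor delta."""
    return net_gain_from_insuring(delta, **kwargs) >= 0
```

Under these illustrative values, an individual with no time preference (`delta = 1.0`) accepts the offer, while an otherwise identical individual with `delta = 0.3` declines it, because the net gain is linear and increasing in `delta`.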

Quasi-hyperbolic discounting
Extending the two-period model to allow for quasi-hyperbolic discounting (QHD) (Laibson, 1997), the condition for accepting a take-it-or-leave-it offer of insurance is the same as under constant discounting except that the second period is discounted by $\beta\delta$, where $\beta$ is the present bias parameter. In this case, present bias only increases the degree to which the future is discounted. The propensity to insure is increasing in $\beta$ and so more present-biased people (smaller $\beta$) are less likely to insure. 12
The effect of present bias can change in a three-period model. In period 0, the decision to insure is taken. In period 1, the premium must be paid. In period 2, the loss is incurred or not. In period 0, the agent would compare the discounted value of paying the premium in period 1 and being insured in period 2 with the discounted value of remaining uninsured. Because $\beta$ scales the period-1 cost and the period-2 benefit alike, the difference has the same sign as it would have in the absence of present bias. Present bias then has no effect on the decision to insure, provided that decision is binding. If the decision taken in period 0 is not binding, then some present-biased individuals, those for whom insuring is worthwhile when viewed from period 0 but not when viewed from period 1, would change their mind and refuse to pay for the insurance. 13
To summarize, we expect present bias to reduce the likelihood of enrollment if payment of the premium is simultaneous with the decision to insure (two-period model) or if the initial decision can be revised when payment is due (three-period model without commitment). In the insurance program we study, part of the premium is due upfront and the remainder is paid at intervals over the course of the year. Present bias may therefore reduce the propensity to enroll initially and increase the propensity to drop out before payment is due for all of the premium.
12 Provided the present biased are naïve, allowing saving does not change the argument. With a saving option, present bias could increase the insurance of sophisticated agents. Upfront payment of a one-off premium offers a means of committing to insurance. It is less prone to procrastination than self-insurance through saving (without a saving commitment device). Recognizing this, those with self-awareness of their present bias may opt for formal insurance (Ito & Kono, 2010). We might therefore expect the relationship between insurance and present bias to vary between sophisticates and naïfs, and across contexts distinguished by opportunities to commit to saving. Given the context and our data, we are not able to explore this empirically.
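The role of commitment in the three-period case can be sketched as follows. This is an illustration under stated assumptions, not the derivation in the text: `gain` stands for the second-period value of being insured rather than uninsured and `cost` for the utility cost of the premium in the period it is paid. Both are hypothetical scalar inputs, as are the function names.

```python
def plans_to_insure(delta, gain, cost):
    """Period-0 view: beta multiplies the premium cost and the benefit alike, so it cancels."""
    # beta * delta * (-cost) + beta * delta**2 * gain >= 0  reduces to  delta * gain >= cost
    return delta * gain - cost >= 0


def pays_when_due(beta, delta, gain, cost):
    """Period-1 view: the premium is now immediate, so the benefit is discounted by beta * delta."""
    return beta * delta * gain - cost >= 0


def reneges(beta, delta, gain, cost):
    """True for someone who plans to insure in period 0 but refuses to pay in period 1."""
    return plans_to_insure(delta, gain, cost) and not pays_when_due(beta, delta, gain, cost)
```

For example, with `delta = 0.9`, `gain = 1.0`, and `cost = 0.7`, an individual with `beta = 0.6` plans to insure but reneges when the premium falls due, whereas one with `beta = 1.0` follows through. Without a binding commitment, only the former drops out.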

Preference elicitation
This section describes instruments we designed to elicit time and risk preferences in a general population field survey. Appendices A and B contain the instruments. We first describe elicitation of risk preferences and identification of respective parameters because one of those parameters is subsequently used to identify time preference parameters.

Risk preferences
According to PT with a reference point of no insurance-no loss, the insurance decision is between outcomes in the loss domain. We aim to elicit PT risk preferences in that domain. Loss aversion is omitted because it is identified (and relevant) only if there are both gains and losses.
We use two independent sets of hypothetical lotteries implemented in a respondent-enumerator interview. Respondents are asked to choose between two jars each containing four balls of two colors. They are told that one ball will be drawn randomly from the chosen jar and the color of that ball will determine the magnitude of the loss (if any) supposedly incurred.
In the first set of lottery choices, we elicit a sure loss $x$ that leaves the respondent indifferent compared with a 50% chance of losing 400 PHP (pesos) or incurring no loss. Elicitation was conducted with a bisection method. It starts with a choice between two prospects with the same expected value. 14 Depending on the answer, we either increase or decrease the expected value of the second option while keeping the first option constant until the respondent switches from one to the other. If there is no switching, then the procedure ends after offering four choices (Appendix A Figure A4). From the switching point, we infer a range in which the value corresponding to indifference lies and use the midpoint as the estimate of that value.
14 The expected value (-200 pesos) coincides with the insurance premium per month of the program we study.
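The logic of narrowing a range and taking its midpoint can be sketched as below. This is a stylized pure bisection, not the survey instrument: it assumes the sure loss is the option that is varied, always runs a fixed number of rounds, and represents the respondent by a hypothetical callback `prefers_sure_loss`. The actual sequence of amounts is given in Appendix A.

```python
def bisect_sure_loss(prefers_sure_loss, low=-400.0, high=0.0, max_choices=4):
    """Return the midpoint of the interval that brackets the indifference point.

    prefers_sure_loss(offer) should return True if the respondent chooses the
    sure loss `offer` (a negative amount) over the 50% chance of losing 400.
    """
    for _ in range(max_choices):
        # The first offer is -200, the expected value of the lottery.
        offer = (low + high) / 2.0
        if prefers_sure_loss(offer):
            # The sure loss is accepted, so indifference lies at a larger loss.
            high = offer
        else:
            # The lottery is chosen, so indifference lies at a smaller loss.
            low = offer
    return (low + high) / 2.0


# A hypothetical respondent who accepts any sure loss smaller than 150 pesos.
estimate = bisect_sure_loss(lambda offer: offer > -150.0)
```

After four choices the bracketing interval for this respondent is 25 pesos wide, which indicates the precision attainable with so few questions.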
From the estimated points of indifference $x$ and $z$, we infer components of PT risk preferences. We do this both with and without the imposition of parametric functions for utility and probability weighting. In the non-parametric case, we use the average of the indifference points, $(x + z)/2$, as a measure of risk tolerance that reflects both utility curvature and optimism. Larger values (not magnitudes) indicate lower risk tolerance.
We infer probability weighting from the difference between the two indifference points. overweighted. For all weighting functions that cross the diagonal between 0.25 and 0.5, x − z > 0 implies that the function is inverse S-shaped, while x − z < 0 implies that it is S-shaped. For EU types who weight probabilities linearly, x − z = 0. We use x − z as a measure of probability weighting. Larger absolute values indicate greater nonlinear weighting of probabilities. In some analyses, we distinguish between the three types: inverse S-shaped, S-shaped, and linear weighting.
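The two non-parametric risk measures and the three-way classification can be computed directly from a pair of indifference points. The sketch below is illustrative only. The function and field names are hypothetical, and it assumes the sign convention stated above, in which x − z > 0 indicates inverse S-shaped weighting.

```python
def nonparametric_risk_measures(x, z):
    """Compute the average and difference of two elicited indifference points."""
    average = (x + z) / 2.0    # measure reflecting utility curvature and optimism
    difference = x - z         # measure of probability weighting
    if difference > 0:
        weighting_type = "inverse S-shaped"
    elif difference < 0:
        weighting_type = "S-shaped"
    else:
        weighting_type = "linear"
    return {
        "average": average,
        "difference": difference,
        "nonlinearity": abs(difference),  # larger values mean more nonlinear weighting
        "type": weighting_type,
    }
```

Because each indifference point is the midpoint of an elicited range, exact equality of x and z (the linear case) arises only when both points fall in matching ranges.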
For (tractable) parametric identification of utility curvature and probability weighting, we restrict attention to the single parameter forms of the respective functions that have been demonstrated to perform best among (combinations of) the most popular specifications (Stott 2006 Table A1).

Time preferences
We elicit time preferences in the gain domain, which is expected to reduce the cognitive burden on respondents. The canonical discounted utility model does not prescribe any difference between discounting of gains and losses. There is evidence that gains are discounted more than losses (Yates & Watts, 1975; Thaler, 1981; Benzion et al., 1989). However, this is not a concern since we are interested in the association between insurance and time preference, not the [...]. We use this measure to examine the association between insurance and time preference without making any assumption about the nature of discounting or the functional form of utility. 16 The measure, like all those we use, potentially captures influences of liquidity and borrowing costs on temporal choices, in addition to pure time preference (Epper, 2017; Cohen et al., 2020; Dean & Sautmann, 2021). Clearly, this is limiting with respect to identifying mechanisms through which temporal choice impacts the demand for insurance. But it does not prevent examination of how discounting overall (not each of its sources) affects insurance, which, in any case, is hypothesized to rest on constrained liquidity (Liu & Myers, 2016; Casaburi & Willis, 2018; Ericson & Sydnor, 2018).
16 While we refer to this quantity as a measure of time preference (for shorthand), it does not correspond to a discount factor. For analysis of associations, it is sufficient that the measure is monotonically related to the strength of time preference/discounting.
The difference between the indifference points ( )  (Halevy, 2015). We cannot test this assumption because we elicit the preferences of each respondent on only one date. With constrained liquidity, anticipated fluctuation in cash flow may cause non-stationarity without violation of time consistency (Halevy, 2015;Epper, 2017). There is, indeed, evidence of liquidity constrained individuals behaving as if they are present biased (Carvalho et al., 2016;Janssens et al., 2017). Again, from the perspective of this study, this is only a partial limitation since elicitation task choices that appear to reflect present bias, but derive from constrained liquidity, could still help explain low insurance take-up that is caused by the combination of (short-term) discounting and limited opportunity to borrow to pay the premium.
Under the assumption that respondents anticipate immediate consumption of any money [...]
Pretesting of the preference elicitation instruments on convenience samples (in Manila) suggested reasonably good comprehension of the tasks and confirmed the effectiveness of visual aids. 20 As is the norm in field surveys, respondents were not paid to participate. Because we elicited risk preferences in the loss domain, we used hypothetical lotteries without incentives. Such lotteries are sometimes found to increase risk seeking (e.g. Holt & Laury, 2002), although Etchart-Vincent & L'Haridon (2011) found no differences in preferences elicited using hypothetical losses, losses from an initial endowment, and real losses. 21 This is less of a concern because we are not primarily interested in the absolute values of the preference parameters but in their associations with insurance. The hypothetical nature of the questions could, however, increase noise and obfuscate relationships.

Health insurance in the Philippines
The National Health Insurance Program (aka PhilHealth) covers most households in the Philippines. Salaried employees are covered by a mandatory employment-based program. Senior citizens are entitled to cover at no charge and the indigent are covered (subject to a means-test conducted infrequently) by another fully subsidized program. Other disadvantaged groups also get cover without charge. The remainder (informal workers and the self-employed considered insufficiently poor to qualify for the indigent program) can enroll in PhilHealth voluntarily through the Individual Paying Program (IPP). 22 Only one third of the eligible population joins this program (Manasan, 2011; Capuno et al., 2016). We examine whether the decision to enroll rather than remain uninsured is associated with preferences.
18 Since we elicit risk preferences in the loss domain, we are assuming that utility curvature is the same in both domains. As previously mentioned, loss aversion is irrelevant under the assumptions made.
19 The indifference points from Choice 1 and Choice 2 imply [...] Table B1.
20 Comprehension was demonstrated by most pretest respondents switching at some point and the others being able to rationalize extreme choices. Those opting for a small amount of money rather than wait for a much larger amount explained that they needed money immediately. Those opting for a smaller amount in the future explained they were taking the opportunity to save. Those making choices that implied extreme risk seeking explained that they enjoyed the thrill of gambling. Effectiveness of the visual aids was evident from finding a greater propensity to switch when they were used. After pretesting, adjustments were made to some of the amounts used in the Choices/Lotteries to increase the scope for respondents to switch.
The IPP premium is 2,400 PHP per year (~$50) if average monthly income is no more than 25,000 PHP ($540) and is 3,600 PHP otherwise. In practice, given the difficulty of verifying informal sector incomes, almost all who enroll do so at the lower premium. At the time of application, the premium for one month or a quarter must be paid. Subsequently, the premium is paid at monthly or quarterly intervals. A claim can only be made if the premium has been paid for four months continuously in the last six months. Hence, anyone enrolling in the program for the first time would have to wait for half a year before filing a claim. This creates substantial delay between paying for the insurance and benefiting from it. 23 It also leaves scope for enrolling but not following through with payment of all installments of the premium, and so losing cover. There is potential for time preferences (discounting and present bias) to influence the decisions to insure and to maintain cover through continued payment of the premium.
As with all PhilHealth programs, anyone who becomes a member of the IPP obtains cover for their spouse, children (<21 years old), and parents (≥ 65 years). The insurance benefit package includes inpatient treatments at accredited hospitals as well as some outpatient services and primary care. Limited coverage of ambulatory care and medicines, as well as reimbursement ceilings, means that the insured are still exposed to the risk of incurring medical expenses that will not be reimbursed (Bredenkamp & Buisman, 2016).
22 The means-test that gives cover through the indigent program is conducted nationwide only infrequently. Hence, a household impoverished through medical expenses would not immediately qualify for this program. Years may pass before entitlement is acquired. This limits the extent to which the indigent program crowds out insurance through the IPP.
23 Ericson and Sydnor (2018) show that a liquidity constraint increases the value of insurance that can be paid for smoothly over the contract period because it provides a means of smoothing the consumption impact of a loss that cannot be self-insured through borrowing. However, not allowing a claim to be filed for some time after payment of the first installment of a premium shuts down this mechanism for the duration of the waiting period. Hence, there is a large upfront element to payment for the IPP that in combination with a constraint on liquidity is expected to reduce the value of the insurance.
In 2011, after stratification by 15 regions, 243 municipalities were randomly sampled and randomly assigned to treatment (n=179) and control (n=64) sites. Within the sampled municipalities, a random sample of 2,950 households was drawn. In the treatment sites, uninsured households were offered a 50% discount on the IPP premium for one year, along with information on this program. A randomly selected half of those who initially did not take up this offer were offered one-time assistance with application (Baillon et al., 2019).
The 2015 survey aimed to re-interview all 1,975 households that were not covered by mandatory, employment-based health insurance in 2011. Interviews were conducted with 1,513 (77%) of the targeted households. Attrition is higher for urban, younger, and better educated households (Appendix Table D1). We control for these characteristics. To increase the size of the sample, a random selection of 267 households that did have mandatory insurance in 2011 was added.
In 2011, the head of each household or their spouse was targeted for interview. If neither could be interviewed, another adult (≥ 21 years) was selected as the respondent. In 2015, enumerators were instructed to interview the same person in each household. If that person was unavailable, the spouse was to be interviewed.
For each of the 1780 (1513+267) households in the 2015 sample, we define insurance status by the cover of the respondent, the person whose preferences were elicited. The sample used to examine associations between preferences and insurance consists of 807 households that either voluntarily purchased health insurance (n=176) or were uninsured (n=631). 24 The remaining 973 households are excluded because they did not face an insurance decision: they had mandatory or fully subsidized insurance. The 807 households in the analysis sample are spread over all 15 regions and 222 of the 243 municipalities originally sampled. Most of the voluntarily insured were covered by the IPP (158/176), while the others had private insurance, and most had enrolled themselves (108/176), while the others were covered through enrollment of a relative. We find some evidence that insurance obtained through self enrollment is more strongly associated with time preference than is insurance acquired through enrollment of a relative (Appendix E). Table 1 reports means of covariates used as controls in multivariable analysis of the association between insurance and preferences. A majority of respondents are female; the proportion female is lower among the uninsured. More than a third of the respondents are heads of their households. Almost a fifth of the uninsured, and around an eighth of the insured, have not completed elementary schooling.
24 Of the 1780 household respondents interviewed in 2015, 48 reported not knowing if they had health insurance. The insurance status of 31 of these respondents could be determined using that reported for another household member. We assume that the remaining 17 were uninsured.

Sample characteristics
To capture risk perceptions, we asked each respondent to compare their household's risk of incurring at least 8000 PHP of out-of-pocket (OOP) medical expenses next year with the risk faced by other households (Appendix C). We use an indicator of perceived own risk lower than the risk of other households. The uninsured, despite their lack of protection, are more likely to report facing a lower risk, which may reflect optimism bias or adverse selection.
We asked respondents whether each of five statements about the operation of health insurance was true or false (Appendix C). Those giving three or more incorrect answers are categorized as having low health insurance literacy. We also asked respondents whether the social health insurance program covers 18 treatments and categorize those giving seven or more incorrect answers as having poor knowledge of insurance benefits (Appendix C). The uninsured are slightly more likely to display low insurance literacy and poor knowledge of insurance benefits, but not significantly.
To control for health, we use indicators of whether anyone in the household a) was sick in the last 30 days, b) has a disability or a chronic illness, and c) was admitted to hospital in the last year. In the sample, b) is more prevalent among the insured, which may also indicate adverse selection.
Notes: Sample is restricted to those voluntarily insured or uninsured. Standard deviations of continuous variables in brackets. Right-hand column gives p-values from a t-test of no difference in the means between the insured and uninsured. Adjustment is made for clustering at the municipality level. The uninsured and the insured are drawn from 206 and 101 clusters, respectively. Definitions of variables related to perceptions of out-of-pocket (OOP) medical expenditure risk, health insurance literacy and knowledge of insurance benefits in Appendix C.
We control for measures of both wealth and household income per capita. Wealth is proxied by the first principal component from a factor analysis of housing materials, sanitation, water source, and possession of durable assets (Filmer & Pritchett, 2001). In the analysis, we control for quartile group of this wealth index. In Table 1, we show means of a score that goes from 1 for the least wealthy quartile group to 4 for the wealthiest. The mean is 2.5 in the full sample by construction. The uninsured are less wealthy than the insured. Income is reported household annual income from all sources divided by the number of persons in the household.
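The construction of the wealth score can be sketched as follows. This is an illustration on simulated data, not the authors' procedure: it takes the first principal component of the correlation matrix of four made-up binary indicators (obtained by power iteration) and then forms rank-based quartile groups. The indicator thresholds, sample size, and function names are all assumptions.

```python
import random
import statistics


def standardize(column):
    """Centre a column and scale it to unit (population) standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)
    return [(v - mean) / sd for v in column]


def first_component_scores(rows, iterations=200):
    """Score each row on the first principal component of standardized columns."""
    cols = [standardize(list(c)) for c in zip(*rows)]
    n, k = len(rows), len(cols)
    corr = [[sum(a * b for a, b in zip(cols[i], cols[j])) / n for j in range(k)]
            for i in range(k)]
    weights = [1.0] * k
    for _ in range(iterations):  # power iteration towards the leading eigenvector
        weights = [sum(corr[i][j] * weights[j] for j in range(k)) for i in range(k)]
        norm = sum(w * w for w in weights) ** 0.5
        weights = [w / norm for w in weights]
    return [sum(weights[j] * cols[j][i] for j in range(k)) for i in range(n)]


def quartile_scores(index):
    """Return 1 (least wealthy quartile group) to 4 (wealthiest); ties broken by order."""
    order = sorted(range(len(index)), key=lambda i: index[i])
    scores = [0] * len(index)
    for rank, i in enumerate(order):
        scores[i] = 1 + (4 * rank) // len(index)
    return scores


# Simulated households: four hypothetical asset indicators driven by latent wealth.
rng = random.Random(0)
latent = [rng.gauss(0, 1) for _ in range(400)]
households = [[1 if w + rng.gauss(0, 1) > cut else 0 for cut in (-0.5, 0.0, 0.5, 1.0)]
              for w in latent]

wealth_index = first_component_scores(households)
scores = quartile_scores(wealth_index)
print(statistics.fmean(scores))  # equal-sized quartile groups give a mean of 2.5
```

Because each quartile group holds a quarter of the sample, the score averages (1+2+3+4)/4 = 2.5 whatever the underlying index looks like.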
The mean incomes presented in Table 1 confirm that uninsured households are poorer. In the analysis, we control for income quartile groups. Insured households are more than ten percentage points more likely to be located in municipalities randomly selected as treatment sites in the experiment (Baillon et al., 2019).
Table 2 [...] choices, implying stationary preferences. Around a fifth of respondents appear to be more impatient in the future than in the present, which is consistent with several studies that allow for future bias (Loewenstein, 1987; Scholten & Read, 2006; Sayman & Öncüler, 2009; Attema et al., 2010; Takeuchi, 2011; Bleichrodt et al., 2016; Delaney & Lades, 2017).

Figure 1. Histograms of non-parametric preference measures
There are 250 respondents who consistently display either extreme patience or extreme impatience by not switching on both Choice 1 and Choice 2. 25 While these responses could reflect extreme negative and positive time preference, respectively, they may also arise from misunderstanding or low cognitive effort. Respondents who consistently go for the extremes account for 61% of all those who apparently have stationary preferences, and so dropping those consistently taking extreme options reduces the sample proportion of this preference category and increases the prevalence of present bias (Table 2). 26
On the risk preference elicitation task, the median point of indifference is about -50 for Lottery 1 (x) and -125 for Lottery 2 (z) (Table 2). Both the median and the mean of the individual-specific averages of the indifference points, our non-parametric measure of risk tolerance, [...] Lottery 2 (p-value<0.001), implying that there is inverse S-shaped probability weighting, on average. The bottom right panel of Table 2 shows that more than two fifths of the sample respondents make choices consistent with linear weighting, more than a third are classified as inverse S-shaped types, and less than a fifth are S-shaped (see also Figure 1).
In Lottery 1, 15 out of the 807 respondents choose a dominated prospect, preferring a loss of 400 pesos for sure over a 50% chance to lose the same amount (or nothing) (Appendix Table D4). In the more cognitively demanding Lottery 2, a greater number of respondents (71) violate monotonicity by choosing a dominated prospect. The risk preference parameters are not defined for the 77 respondents who opt for a dominated prospect in either lottery, forcing us to drop these cases from analyses that use these parameters. 27 Since a large majority of those who choose dominated prospects do so only on Lottery 2 (62/77), most are classified as exhibiting inverse S-shaped probability distortion. Dropping them reduces the prevalence of this type.
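The counts in this paragraph can be reconciled by inclusion-exclusion. The short check below uses only numbers reported in the text; the variable names are ours.

```python
ANALYSIS_SAMPLE = 807
dominated_lottery1 = 15   # chose a dominated prospect in Lottery 1
dominated_lottery2 = 71   # chose a dominated prospect in Lottery 2
dominated_either = 77     # chose a dominated prospect in at least one lottery

# |A and B| = |A| + |B| - |A or B|
dominated_both = dominated_lottery1 + dominated_lottery2 - dominated_either
only_lottery1 = dominated_lottery1 - dominated_both
only_lottery2 = dominated_lottery2 - dominated_both
parameter_sample = ANALYSIS_SAMPLE - dominated_either

print(dominated_both, only_lottery1, only_lottery2, parameter_sample)
```

The implied 62 respondents who violate monotonicity only on Lottery 2 match the 62/77 reported above, and the 730 remaining respondents match the sample size given in the notes to Figure 2.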
Around a quarter of respondents consistently make choices that imply extreme risk seeking: on both lotteries, they never switch to a less risky prospect. 28 Dropping them reduces the proportion categorized as exhibiting linear probability weighting. Figure 2 shows distributions of the derived preference parameters.

Figure 2. Histograms of preference parameters
Notes. Distributions of δ, γ, and α are censored at the 99th percentile. Distribution of β is censored at the 97th percentile. N=730
27 The proportion insured and covariate means do not differ significantly between those taking a dominated prospect and those not (Appendix Table D5). Less educated are more likely to take a dominated prospect.
28 These respondents do not differ significantly from others, except they are less likely to have poor knowledge of insurance benefits and more likely to perceive their OOP risk as lower than average (Table D6).
29 The conditional mean functions are robust to estimation by local constant regression (Watson, 1964; Nadaraya, 1965) rather than local linear regression (Appendix Figure D1).
Figure 3. Probability of insurance as function of each non-parametric preference measure
Notes: The graph in each panel is obtained from a bivariate local linear regression of an insurance indicator on the respective preference measure. Dots indicate point estimates of conditional means at the 5th, 10th, ..., 95th percentiles of the respective measure. The number of dots is less than 19 in some panels because of equal percentile values. The Epanechnikov kernel function is used. Bandwidth is selected by the plugin estimator of the asymptotically optimal constant bandwidth because cross-validation (Li & Racine, 2004) produced very wide bandwidths for the non-stationarity and risk tolerance measures. To aid convergence, each measure was scaled through division by 1000, but the x-axes in the figure are on the original scales. Whiskers show 95% confidence intervals (CI) obtained from a bootstrap percentile method (Cattaneo & Jansson, 2018) with 500 replications.
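The estimator described in the notes to Figure 3 can be sketched in a few lines. The version below is a generic local linear regression with an Epanechnikov kernel and a fixed, hand-picked bandwidth. It does not implement the plug-in bandwidth selector or the bootstrap intervals, and the data are synthetic; function names and the bandwidth value are assumptions.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on (-1, 1), zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def local_linear(x0, xs, ys, bandwidth):
    """Kernel-weighted least squares of y on (x - x0).

    Returns (conditional mean at x0, slope at x0).
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = epanechnikov(d / bandwidth)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    det = s0 * s2 - s1 * s1
    if det <= 1e-12:
        raise ValueError("too few observations within the bandwidth")
    mean = (s2 * t0 - s1 * t1) / det
    slope = (s0 * t1 - s1 * t0) / det
    return mean, slope


def average_slope(xs, ys, bandwidth):
    """Average the local slope over the observed values of x."""
    return sum(local_linear(x, xs, ys, bandwidth)[1] for x in xs) / len(xs)


# Synthetic check: when the true conditional mean is linear, the fit is exact.
xs = [i / 10 for i in range(101)]
ys = [0.2 + 0.05 * x for x in xs]
fitted_mean, fitted_slope = local_linear(5.0, xs, ys, bandwidth=1.5)
avg_slope = average_slope(xs, ys, bandwidth=1.5)
print(round(fitted_mean, 4), round(fitted_slope, 4), round(avg_slope, 4))
```

Averaging the local slopes over the sample, as `average_slope` does, is one way to summarize a nonlinear conditional mean function by a single marginal change.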

Bivariate associations of insurance with preferences
The top row of Table 3 panel A shows estimates of the marginal change in the conditional probability of insurance with respect to each non-parametric preference measure calculated at each value of that measure and averaged. Each estimate is from the bivariate local linear regression used to estimate the conditional mean function in the respective panel of Figure 3. For shorthand, we refer to these estimates as "effects" without inferring causality. We scale all measures through division by 1000. Hence, the estimate in the first column indicates that a standard deviation increase in the time preference measure is associated with a 1.5 percentage point (0.186×(82.4/1000)×100) increase in the probability of insurance, on average.
That is 7% of an estimated insurance probability of 22.1 percent. However, this averaged estimate is not remotely significant and should be interpreted bearing in mind the nonlinearity observed in Figure 3. The estimated associations of insurance with the other non-parametric preference measures are also not at all significant. The negative point estimates for nonstationarity and probability weighting indicate that, in the sample, the negative relationships observed on the left of the respective graphs in Figure 3 predominate: the insurance probability decreases with weaker present bias and with less intense S-shaped probability weighting.
[...] It is around 9 percentage points (pp) higher at the point where there is no (long-term) discounting (δ=1). Thereafter, over a range of the parameter distribution that is much less dense (Figure 2), the insurance probability declines as discounting becomes increasingly negative. The statistically significant estimate in the first column of Table 3 panel B indicates that a standard deviation increase in δ is associated with a 5.7 pp (7.92×0.7163) increase in the probability of insurance, on average. There is no clear relationship between insurance and the present bias parameter. As with the non-parametric measure, compared with stationary types, both those who display more present bias and (up to a point) those more future biased are more likely to be insured. 31
The bottom right graph in Figure 3 shows that the probability of being insured is lowest for those who are close to having linear utility (γ = 1). From this point, the probability rises, although not continuously, with more convex utility in the domain of losses (γ < 1). That is, risk seekers in this domain are more likely to be insured than the risk neutral. However, the confidence intervals are wide and overlapping, and the negative point estimate of the average effect of utility curvature on the probability is not remotely significant (Table 3). There is little or no variation in the probability of insurance as probability sensitivity varies over the range of inverse S-shaped distortion up to linear weighting (α = 1) (Figure 4). Thereafter, the probability increases with more S-shaped weighting, although not at the extreme and the average effect is not significant (Table 3).
30 The conditional mean functions are generally robust to estimation by local constant regression, rather than local linear regression. Two differences are that the local constant regression estimator gives a slightly flatter function of insurance with respect to the present bias parameter and a function that displays somewhat more of a negative slope for the utility curvature parameter (Appendix Figure D2).
31 Conditional mean functions with respect to time preference parameters derived under the restriction of linear utility have similar shapes to those shown in the top row of Figure 4 (Appendix Figure D3), except that the mean is a less smooth function of δ, which is more right-skewed under this restriction. The estimated marginal change in the insurance probability with respect to δ derived under linear utility is about one third of the respective estimate shown in Table 3 (Table D7), which could be due to confounding by utility curvature.
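The per-standard-deviation magnitudes quoted in this section follow from simple rescaling, reproduced below with the figures given in the text. The helper name is ours, and the δ estimate is entered as a probability change (0.0792) rather than in percentage points (7.92) so that one function covers both cases.

```python
def per_sd_effect_pp(marginal_effect, sd, scale=1.0):
    """Percentage-point change in the insurance probability per standard deviation.

    `scale` undoes any rescaling applied to the regressor before estimation.
    """
    return marginal_effect * (sd / scale) * 100


# Non-parametric time preference measure (scaled through division by 1000).
time_pref_pp = per_sd_effect_pp(0.186, 82.4, scale=1000)
share_of_baseline = time_pref_pp / 22.1   # relative to a 22.1 percent insurance probability

# Long-term discount factor δ (Table 3 panel B).
delta_pp = per_sd_effect_pp(0.0792, 0.7163)

print(round(time_pref_pp, 1), round(share_of_baseline * 100), round(delta_pp, 1))
```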

Main estimates
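We estimate partial associations of insurance with all preferences simultaneously using probit models of a binary insurance indicator, $y_i = \mathbb{1}(y_i^* > 0)$, specified as follows,
$$y_i^* = \mathbf{P}_i'\boldsymbol{\theta} + \mathbf{X}_i'\boldsymbol{\lambda} + \mathbf{S}_i'\boldsymbol{\mu} + \varepsilon_i \qquad (4)$$
where $\mathbf{P}_i$ is a vector of preference measures, $\mathbf{X}_i$ is a vector of the covariates in Table 1 (with age in quadratic form, and wealth and income each represented by respective quartile group indicators), $\mathbf{S}_i$ is a vector of strata (province) indicators, and $\varepsilon_i$ is a standard normal distributed error that possibly exhibits dependence within sample clusters.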
While controlling for covariates reduces the risk of confounding, it increases the risk of introducing bad controls. Preferences potentially influence insurance through education, health behavior, and health, for example. For this reason, we also present estimates obtained with no covariate controls (other than strata indicators). We continue to refer to estimates as "effects" but interpret them only as partial associations. Table 4 panel A gives estimates of average partial effects from models that use the non-parametric preference measures. In columns (1)-(3), each measure enters the linear index function (4) linearly. In column (1), where there is no control for covariates (other than strata), none of the preference measures has a remotely significant partial effect. As with the bivariate estimates (Table 3) [...]
Notes: Partial effects are derived from probit estimates and are averaged over the respective sample. Robust standard errors adjusted for clustering at the municipality level in parentheses. Except for column (1), models include covariates in Table 1 (with age quadratic and quartile group indicators for each of wealth and income). All models include strata indicators. Partial effects of covariates in Appendix D Table D8. x0, z1/2, x, and z are the indifference points elicited from Choice 1, Choice 2, Lottery 1, and Lottery 2, respectively. In column (4), reference groups are stationary preferences (x0 = z1/2 and β = 1) and linear probability weighting (x = z and α = 1). Dominated refers to respondents who choose a dominated prospect on either lottery. δ specification refers to how the long-term discount factor enters the latent index. *** p<0.01, ** p<0.05, * p<0.1
Dropping respondents who choose a dominated prospect on the risk elicitation task (column (3)) increases the estimated effects of time preference and, to a greater extent, risk tolerance. The positive point estimate of the effect of the latter implies that, in the sample, a standard deviation increase in the direction of greater risk seeking in the loss domain is associated with a 2.1 pp increase in the conditional probability of being insured.
In column (4), we replace the continuous measure of non-stationarity with indicators of present bias and future bias, and we use indicators of inverse S-shaped and S-shaped probability weighting instead of the respective continuous measure. Consistent with Figure 3, the point estimates suggest that both the present and future biased are more likely to insure than the unbiased, and both inverse S-shaped and S-shaped weighting types are more likely to insure than the linear weighting (EU) types. However, none of these differences is statistically significant.
Panel B presents estimates from models that use the preference parameters, and so exclude respondents who opt for dominated prospects. In columns (1) and (2), each parameter enters the latent index linearly. Without control for covariates (column (1)), there are marginally significant estimated effects of the parameters for long-term discounting (δ), utility curvature (γ), and probability sensitivity (α). The respective signs imply that the conditional probability of insurance is higher for those who exhibit a) less discounting of future returns, b) more risk seeking over losses, and c) less inclination toward inverse S-shaped probability weighting.
With the addition of covariates (column (2)), the discounting effect strengthens in magnitude and gains some precision, while the other two effects weaken. A standard deviation increase in the long-term discount factor, which for given present bias implies weaker time preference, is associated with a 2.7 pp increase in the probability of insurance. This is around half the magnitude of the bivariate estimate (Table 3), which may be because the specification used in column (2) does not allow sufficiently for the nonlinearity observed in Figure 4. When the index function is specified as a quadratic function of δ, the average partial effect of this parameter increases by 50% (column (3)). With this specification, we estimate that a standard deviation increase in δ is associated with a 4 pp increase in the insurance probability.
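To make the distinction between the linear and quadratic specifications concrete, the sketch below computes average partial effects of δ for a stripped-down probit index that contains only δ. The coefficients and the δ values are hypothetical, and the covariates and other preference parameters of the actual models are omitted; the point is only that under a quadratic index the derivative of the index, b1 + 2·b2·δ, varies across respondents.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ape_linear(deltas, b0, b1):
    """Average over the sample of d Phi(b0 + b1*δ) / dδ."""
    return sum(normal_pdf(b0 + b1 * d) * b1 for d in deltas) / len(deltas)


def ape_quadratic(deltas, b0, b1, b2):
    """Average over the sample of d Phi(b0 + b1*δ + b2*δ²) / dδ."""
    return sum(normal_pdf(b0 + b1 * d + b2 * d * d) * (b1 + 2.0 * b2 * d)
               for d in deltas) / len(deltas)


# Hypothetical discount factors and coefficients, for illustration only.
deltas = [0.2, 0.5, 0.8, 1.0, 1.1, 1.5, 2.5]
B0, B1, B2 = -1.2, 0.6, -0.15

linear_ape = ape_linear(deltas, B0, 0.2)
quadratic_ape = ape_quadratic(deltas, B0, B1, B2)
print(round(linear_ape, 4), round(quadratic_ape, 4))
```

Multiplying an average partial effect by the sample standard deviation of δ gives the per-standard-deviation magnitudes reported in the text, for example 0.0378 × 0.7163 ≈ 0.027, or 2.7 pp.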
In column (4), we return to a linear specification for δ and allow nonlinearity (of the index function) in both β and α by replacing those parameters with indicators of present/future bias and inverse S-shaped/S-shaped probability weighting, respectively. This again shows that, in the sample, both present bias and future bias are associated with a higher probability of insurance compared with stationarity, and that S-shaped weighting types are more likely to insure, although neither difference is significant. The estimated effect of the discount factor is robust to this specification.

Robustness
We assess robustness of the estimates given in Table 4 column (2) (see Appendix Table D9).
Linear probability model (LPM) estimates differ little in size and significance from the probit estimates, particularly for the model that uses parameters. The LPM estimate of the partial effect of the long-term discounting parameter δ (0.0388, SE=0.0162) is extremely close to the respective probit estimate (0.0378, SE=0.0155).
Using probit and controlling for a more limited set of covariates (sex, age, and random assignment to an experiment treatment site) that are almost certainly not determined by preferences produces estimates that generally lie between those given in column (2) (full controls) and column (1) (no controls) of Table 4. The effects of the utility curvature and probability sensitivity parameters are marginally significant with this specification.
Excluding respondents who make choices that imply extremely strong or extremely weak time preference (because they do not switch in the elicitation task) reduces the estimated effect of the non-parametric time preference measure by 46% but reduces the effect of δ by 15%. The standard error of the latter estimate increases and it loses statistical significance because of a 30% reduction in the sample size. Exclusion of respondents who make choices that imply extreme risk seeking has relatively little impact on the estimates.
To further assess robustness of the finding that the probability of insurance rises with the discounting factor δ, we estimate probit models with alternative transformations of the parameters (Table D10). First, we confirm that the effect is not driven by outliers by censoring (winsorizing) δ, and the other parameters, at the 99 th percentile. This substantially increases the magnitude of the partial effect of δ, bringing it closer to the estimate obtained with a quadratic specification shown in Table 4 column (3). Allowing for nonlinearity by entering the inverse hyperbolic sine of δ, and each of the other parameters, into the probit latent index function also produces a larger and significant average partial effect of δ similar to the Table 4 column (3) estimate. 32 This is also true if we replace the value of δ with its relative rank, except that the average effect is no longer significant in this case. 32 We use the inverse hyperbolic sine rather than the log transformation because there are some extremely small (close to zero) values of δ. Because of these values, the log transformation turns a right-skewed distribution into a left-skewed one. Further, if the log transformation is used, then the expression for the partial effect of δ on the probability of insurance includes division by δ. This produces huge partial effects for a few respondents and distorts the average partial effect.
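The transformations discussed here can be illustrated with a short sketch. The winsorizing helper uses a nearest-rank percentile, which is an assumption and may differ from the rule used for Table D10, and the δ values are invented. The two chain-rule factors show why the log transformation inflates partial effects for respondents with δ close to zero while the inverse hyperbolic sine does not.

```python
import math


def winsorize_upper(values, percentile=99):
    """Cap values at the given upper percentile (nearest-rank definition)."""
    ordered = sorted(values)
    rank = (percentile * len(ordered) + 99) // 100   # integer ceiling of p*n/100
    cap = ordered[max(rank, 1) - 1]
    return [min(v, cap) for v in values]


def chain_factor_asinh(delta):
    """d asinh(δ) / dδ, which multiplies the coefficient in the partial effect."""
    return 1.0 / math.sqrt(1.0 + delta * delta)


def chain_factor_log(delta):
    """d log(δ) / dδ; this blows up as δ approaches zero."""
    return 1.0 / delta


# Invented sample: 198 ordinary values, one value close to zero, one large outlier.
deltas = [0.5 + 0.005 * i for i in range(198)] + [0.001, 40.0]
capped = winsorize_upper(deltas, percentile=99)

print(max(deltas), max(capped))
print(chain_factor_log(0.001), round(chain_factor_asinh(0.001), 6))
```

Near zero, asinh(δ) is approximately δ, so its chain-rule factor stays close to one, whereas the log factor 1/δ reaches 1000 at δ = 0.001.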
Using time preference parameters derived under the assumption of linear utility reduces the average effect of δ to 0.0226 and renders it insignificant. This may be due to confounding by utility curvature. With this one caveat, which may itself be due to a misspecification, we conclude that the finding of a positive association between insurance and the long-term discount factor (weaker time preference) does not appear to be sensitive to the empirical specification or the treatment of extreme/outlier responses.

Heterogeneity
Theory predicts that time preference influences insurance when liquidity is constrained. While we do not observe liquidity constraints, we do have a proxy measure of wealth. Less wealthy households are more likely to be liquidity constrained. Insurance is expected to be more strongly associated with the time preference of these households. To test this, we extend the model used to produce the estimates in Table 4 Panel B column (2) by including interactions between each preference parameter and an indicator of being below the median of the wealth index. 33 Consistent with the theory, estimates in Table 5 columns (1) and (2) show that the effect of the discount factor δ is much larger, and only significant, at lower wealth. A standard deviation increase in δ is associated with a 4.9 pp increase in the probability that a low-wealth household insures, which is a 36% increase from an enrollment rate of 13.6%. At low wealth, there is even a significant partial association between insurance and the non-parametric time preference measure that also suggests that stronger time preference reduces the probability of insurance among households that are more likely to be liquidity constrained. 34 Time and risk preferences may have little impact on the insurance decision of someone who does not fully comprehend that insurance involves making an up-front non-refundable payment to reduce, but not eliminate, liability for losses that they are not certain to incur in the future. To test this hypothesis, we allow interactions between each preference parameter and an indicator of low health insurance literacy or poor knowledge of the medical expenses covered by the insurance program. 
The estimates in Table 5 columns (3)-(4) reveal that the positive effect of the discount factor δ is much stronger and only significant when insurance literacy and knowledge are high. In that case, the estimated effect of δ is almost 2.5 times larger than its average effect over all individuals.
33 We also explore heterogeneity in the partial effects of the non-parametric preference measures by estimating extended versions of the model used to produce the Table 4 Panel A column (2) estimates that include interactions between each of those measures and characteristics, such as low wealth. See Appendix Table D11.
34 The stronger partial association between insurance and time preference at low wealth is also evident in LPM estimates. See Table D12, which gives LPM estimates for all the heterogeneous effects.
There is a positive and significant effect of the present bias parameter β only for those who do not lack insurance literacy and knowledge. For this group, a standard deviation increase in β, which corresponds to less present bias, is associated with an estimated 6.7 pp increase in the probability of insurance. The relationship has the opposite sign and is weaker when insurance literacy/knowledge is low.
Notes: Average partial effects from probit models like that used for Table 4 Panel B column (2) but extended to include interactions with each preference parameter. Model for columns (1) and (2) includes interactions with indicator of below (low) wealth index median. Model for columns (3) and (4) has interactions with indicator of low health insurance literacy/knowledge (Appendix C). Model for columns (5) and (6) [...]
The significant, negative relationship between insurance and utility curvature in column (4) suggests that among people with reasonable understanding of the concept and content of insurance, those with more intense risk seeking preferences (conditional on all else) are more likely to insure. This is similar to puzzling evidence from other studies that greater risk aversion in the domain of gains is associated with a smaller likelihood of insurance (Giné et al., 2008; Giesbert et al., 2011; Cole et al., 2013; Dercon et al., 2015). One possible explanation is that those with little experience of insurance view it as a risky prospect. Another is that the risk of default on the payment of compensation deters the risk averse. This is a plausible scenario in the Philippines, where healthcare providers are permitted to charge in excess of reimbursement rates paid by the insurer.
Associations between preferences and insurance may also be obscured by low education if this impedes comprehension of the preference elicitation tasks and results in noisier measures of preferences. To test this, we allow interactions between each preference parameter and an indicator that distinguishes high school and college graduates (high) from those with less education (low). 35 Estimates in columns (5)-(6) confirm that effects of the time preference parameters are larger in magnitude when education is high, and both of these effects are positive and significant only for this group. The puzzling association between insurance and risk seeking is much weaker and not significant among the better educated.
Almost four fifths [...] (Table D8). We find stronger relationships between insurance and preferences in the control sites (Table D13), which suggests that our central finding that insurance is associated with time preference is not attributable to using a sample that was previously incentivized to insure. If anything, this may have weakened the association.
There is some support for the hypothesis that elicited preferences are more strongly associated with insurance when respondents obtain it through enrolling in the program themselves rather than through enrollment of a family member (Appendix E). In particular, the discount factor is significantly associated only with the probability of being insured directly and the point estimate of this effect is twice as large as the estimated effect on the probability of being insured indirectly (Table E1).

Conclusion
Finding that stronger time preference is associated with a lower probability of insurance confirms the theoretical prediction (of both expected utility and prospect theory) obtained when the newly insured must wait before making a claim and are unable to borrow (at a reasonable interest rate) to pay the premium. The waiting period is six months in the health insurance program we study, many of the low-income Filipinos in our sample are likely to be liquidity constrained, and we find an even stronger association between insurance and time preference among the least wealthy. While the study does not have a design capable of delivering causal effects, the estimated positive association between insurance and elicited discount factors is consistent with time preference influencing the decision to insure, and theory gives grounds for this interpretation. If it were accepted, it would imply that one of the reasons many low-income, potentially liquidity constrained Filipinos do not insure medical expenses is that they value the premium due at enrollment above highly discounted benefits they would enjoy at least six months later, if at all. In addition to the premium, up-front indirect effort costs of applying for insurance may deter low-income individuals with strong time preference and little cognitive bandwidth to contemplate future circumstances (Mullainathan & Shafir, 2014; Schilbach et al., 2016). Randomized experiments, including in the Philippines, demonstrate that assistance that reduces application costs can be highly effective (at least initially) in increasing insurance take-up (Thornton et al., 2010; Capuno et al., 2016; Banerjee et al., 2021; Baillon et al., 2019). Constrained liquidity can cause the propensity to take up-front payment insurance to decrease with the intensity of time preference and it can raise the value of a more flexible insurance contract that allows the premium to be paid at times of liquidity (Casaburi & Willis, 2018; Belissa et al., 2019; Liu et al., 2020). Redesigning contracts with respect to the timing of premium payments and claim entitlements may increase take-up of health insurance in countries, such as the Philippines, that are striving for Universal Health Coverage. The welfare gains from such contracts potentially extend beyond smoothing consumption over medical expense shocks to health improvements produced by more affordable healthcare. This inference is supported by US evidence that more comprehensive, subsidized health insurance weakens the extent to which the utilization of medicines is sensitive to the liquidity of low-income households (Gross et al., 2021). Liquidity constraints are barriers to both healthcare consumption and health insurance.
35 Among those who did not graduate from high school or college, 49.3% score low on insurance literacy or knowledge, compared with 48.4% among graduates. Hence, insurance literacy/knowledge and education are distinct dimensions of potential heterogeneity.
We find that the probability of insurance is negatively associated with present bias only among those who appear to understand how insurance operates and what it delivers. While a negative association is consistent with predictions obtained from the quasi-hyperbolic discounting model, it does not necessarily follow that insurance take-up would be raised by redesigning contracts to counter time inconsistency. We infer present bias from violation of stationarity, which implies time inconsistency only if preferences are time invariant (Halevy, 2015). Constrained liquidity may induce time varying preferences (Janssens et al., 2017) and cause people to behave as if their preferences are present biased (Epper, 2017), including in preference elicitation tasks (Dean & Sautmann, 2021). Consequently, attempts to raise insurance demand by offering commitment devices may fail not only because of a lack of sophistication but also because some of those who appear to be present biased are actually time consistent. If illiquidity is the source of the problem, then there will be little interest in committing any liquid assets that are at hand. It would be more effective to either loosen the constraint on liquidity or redesign insurance such that illiquidity is less of a barrier to its purchase.
A combination of the two would not necessarily be even more effective since illiquidity raises the value of insurance that can be paid for smoothly over a contract period (Ericson & Sydnor, 2018).
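To make the inference of present bias from a stationarity violation concrete, the following minimal sketch backs out quasi-hyperbolic parameters from two indifference points. It rests on assumptions that are not taken from the survey instrument: that Choice 1 trades an amount now against a fixed later amount in half a year, that Choice 2 trades an amount in half a year against the same later amount in one year, and that the peso amounts shown are purely illustrative. The function and argument names are hypothetical. Under these assumptions linear utility gives δ = y/X and β = x0/y, and, in line with the note to Table B1, parameters under power utility follow by raising each to the power γ.

```python
def quasi_hyperbolic_parameters(x0, y, later_amount, gamma=1.0):
    """Return (beta, delta) implied by two indifference points.

    x0: amount now judged equal to `later_amount` in half a year (Choice 1)
    y:  amount in half a year judged equal to `later_amount` in one year (Choice 2)
    gamma: power-utility curvature, u(m) = m**gamma; gamma = 1 is linear utility
    The task design assumed here is illustrative, not the authors' exact design.
    """
    if min(x0, y, later_amount) <= 0:
        raise ValueError("amounts must be positive")
    # Choice 2: beta*delta*u(y) = beta*delta**2*u(X)  =>  delta = u(y)/u(X)
    delta_linear = y / later_amount
    # Choice 1: u(x0) = beta*delta*u(X)  =>  beta = u(x0)/u(y)
    beta_linear = x0 / y
    return beta_linear ** gamma, delta_linear ** gamma


def violates_stationarity(x0, y):
    """Stationarity requires x0 == y; x0 < y is read as present bias (beta < 1)."""
    return x0 != y


# Hypothetical respondent: 300 now ~ 500 in half a year; 400 in half a year ~ 500 in a year
beta, delta = quasi_hyperbolic_parameters(x0=300, y=400, later_amount=500)
print(beta, delta, violates_stationarity(300, 400))
```

In this illustration the respondent demands less to wait when both dates lie in the future than when one of them is today, which is the pattern read as present bias. As the paragraph above stresses, the same pattern can arise from constrained liquidity rather than from time-inconsistent preferences.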
We do not find strong associations between insurance and risk preferences. Barseghyan et al. (2013) and Sydnor (2010) find that concave utility over final consumption cannot account for patterns of demand for home and automobile insurance in the US. The first study infers from revealed preferences that small probabilities are overweighted and that this drives insurance demand. Jaspersen et al. (2021) find that each of three dimensions of elicited risk preferences (utility curvature, probability weighting, and loss aversion) correlates with insurance behavior in a laboratory experiment. However, collectively these three components of risk preferences can explain only a small fraction of the variation in insurance. Our study adds to accumulating evidence (Casaburi & Willis, 2018; Belissa et al., 2019; Liu et al., 2020) that time preferences may be an important missing ingredient to explain the (low) demand for insurance.
We treat time preferences as distinct from risk preferences. An alternative view is that time preferences are inherently connected to risk preferences because any future prospect carries risk (Epper & Fehr-Duda, 2021). One might wonder whether this explains our main finding. That is, insurance is associated with elicited time preference because the latter captures risk attitudes. This interpretation is inconsistent with the direction of the association we find. If insurance is associated with risk aversion and the more risk averse display stronger time preference, then the positive association between insurance and the elicited discount factor will underestimate the strength of the association between insurance and pure time preference.

I am now going to ask you some questions which are a little bit similar to "pera o bayong", but they are about losing money as opposed to winning money. There is no right or wrong answer, I am just curious to hear what you prefer.

A.
I am going to ask you to make the choice between two options, represented by these two jars. The first jar contains 4 balls. If you choose the first jar, one ball will be drawn from the jar. If a black ball is drawn you will lose 400 pesos and if a white ball is drawn you will lose nothing. This means there is an equal chance of losing 400 pesos and losing nothing (fifty-fifty). If you choose the second jar you will lose 200 pesos for sure. If you were to choose between these two jars, which one would you choose?
1. An equal chance (fifty-fifty) of losing 400 pesos and losing nothing (CONTINUE TO B)
2. Lose 200 pesos for sure (SKIP TO E)

B.
NOTE TO THE INTERVIEWER: Change the visual support for only the second option
What if the first jar stays the same, but if you choose the second jar you will now lose 150 pesos for sure. If you were to choose between these two jars, which one will you choose?
1. Again, an equal chance (fifty-fifty) of losing 400 pesos and losing nothing (CONTINUE TO C)
2. Lose 150 pesos for sure (PROCEED TO NEXT SET)

C.
NOTE TO THE INTERVIEWER: Change the visual support for only the second option
What if the first jar stays the same, but if you choose the second jar you will now lose 100 pesos for sure. If you were to choose between these two jars, which one will you choose?
1. Again, an equal chance (fifty-fifty) of losing 400 pesos and losing nothing (CONTINUE TO D)
2. Lose 100 pesos for sure (PROCEED TO NEXT SET)

D.
NOTE TO THE INTERVIEWER: Change the visual support for only the second option
What if the first jar stays the same, but if you choose the second jar you will now lose 1 peso for sure. If you were to choose between these two jars, which one will you choose?
1. Again, an equal chance (fifty-fifty) of losing 400 pesos and losing nothing (PROCEED TO NEXT SET)
2. Lose 1 peso for sure (PROCEED TO NEXT SET)

Figure A1. Questionnaire for Lottery 1 used to elicit risk preferences

E.
NOTE TO THE INTERVIEWER: Change the visual support for only the second option
What if the first jar stays the same, but if you choose the second jar you will now lose 250 pesos for sure. If you were to choose between these two jars, which one will you choose?
1. Again, an equal chance (fifty-fifty) of losing 400 pesos and losing nothing (PROCEED TO NEXT SET)
2. Lose 250 pesos for sure (CONTINUE TO F)

F.
NOTE TO THE INTERVIEWER: Change the visual support for only the second option
What if the first jar stays the same, but if you choose the second jar you will now lose 300 pesos for sure. If you were to choose between these two jars, which one will you choose?
1. Again, an equal chance (fifty-fifty) of losing 400 pesos and losing nothing (PROCEED TO NEXT SET)
2. Lose 300 pesos for sure (CONTINUE TO G)

G.
NOTE TO THE INTERVIEWER: Change the visual support for only the second option
What if the first jar stays the same, but if you choose the second jar you will now lose 400 pesos for sure. If you were to choose between these two jars, which one will you choose?
1. Again, an equal chance (fifty-fifty) of losing 400 pesos and losing nothing (PROCEED TO NEXT SET)
2. Lose 400 pesos for sure (PROCEED TO NEXT SET)
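
The skip pattern of Lottery 1 can be read as a short staircase search over the sure loss. The sketch below is an illustration under stated assumptions rather than the authors' procedure: it encodes the routing printed in the questionnaire (A to G) and returns the bracket formed by the largest sure loss a respondent accepts and the smallest sure loss rejected in favour of the lottery. Reading that bracket as bounds on an indifference point is an interpretation made here; the names `ROUTING`, `run_lottery1`, and `threshold_respondent` are hypothetical, and the conversion of brackets into preference parameters is not reproduced.

```python
LOTTERY, SURE = 1, 2  # answer codes used in the questionnaire

# question -> (sure loss in pesos, next question if lottery chosen, next if sure loss chosen)
# None means "PROCEED TO NEXT SET".
ROUTING = {
    "A": (200, "B", "E"),
    "B": (150, "C", None),
    "C": (100, "D", None),
    "D": (1, None, None),
    "E": (250, None, "F"),
    "F": (300, None, "G"),
    "G": (400, None, None),
}


def run_lottery1(choose):
    """Walk the skip pattern; `choose(sure_loss)` must return LOTTERY or SURE.

    Returns (lower, upper, path): the largest sure loss accepted (or None),
    the smallest sure loss rejected in favour of the lottery (or None),
    and the list of questions that were asked.
    """
    lower, upper, path = None, None, []
    question = "A"
    while question is not None:
        sure_loss, next_if_lottery, next_if_sure = ROUTING[question]
        path.append(question)
        if choose(sure_loss) == LOTTERY:
            upper = sure_loss if upper is None else min(upper, sure_loss)
            question = next_if_lottery
        else:
            lower = sure_loss if lower is None else max(lower, sure_loss)
            question = next_if_sure
    return lower, upper, path


def threshold_respondent(indifference_point):
    """Hypothetical respondent who accepts any sure loss below a threshold."""
    return lambda sure_loss: SURE if sure_loss < indifference_point else LOTTERY


print(run_lottery1(threshold_respondent(175)))  # (150, 200, ['A', 'B'])
```

A respondent who still picks the lottery at D has no lower bound (the extreme risk seeking case), and one who accepts the sure loss of 400 pesos at G has no upper bound (the dominated choice).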

RISK AVERSION (2) [Lottery 2]
NOTE TO THE INTERVIEWER: Present the respondent with the visual support. The visual support for option 1 stays the same throughout the question, but the visual support for option 2 needs to be changed for every sub question.

A.
In this question I am again going to ask you to make the choice between two options, represented by these two jars. The first jar again contains 4 balls. If you choose the first jar, one ball will be drawn from the jar. If a black ball is drawn you will lose 400 pesos and if a white ball is drawn you will lose nothing. This means there is a chance of 1 out of 4 (25%) of losing 400 pesos and 3 out of 4 (75%) losing nothing. The second jar also contains 4 balls. If you choose the second jar also one ball will be drawn from the jar. If a black ball is drawn you will lose 200 pesos and if a white ball is drawn you will lose nothing. This means there is an equal chance of losing 200 pesos and losing nothing (fifty-fifty). If you were to choose between these two jars, which one would you choose?
1. A chance of 1 out of 4 (25%) of losing 400 pesos and 3 out of 4 (75%) losing nothing (CONTINUE TO B)
2. An equal chance (fifty-fifty) of losing 200 pesos and losing nothing (SKIP TO E)

B.
NOTE TO THE INTERVIEWER: Change the visual support for only the second option
What if the first jar stays the same but the second jar changes? If a black ball is drawn from the second jar it now means you lose 150 pesos. If you were to choose between these two jars, which one will you choose?
1. Again, a chance of 1 out of 4 (25%) of losing 400 pesos and 3 out of 4 (75%) losing nothing (CONTINUE TO C)
2. An equal chance (fifty-fifty) of losing 150 pesos and losing nothing (PROCEED TO NEXT SET)

C.
NOTE TO THE INTERVIEWER: Change the visual support for only the second option
What if the first jar stays the same but the second jar changes? If a black ball is drawn from the second jar it now means you lose 100 pesos. If you were to choose between these two jars, which one will you choose?
1. Again, a chance of 1 out of 4 (25%) of losing 400 pesos and 3 out of 4 (75%) losing nothing (CONTINUE TO D)
2. An equal chance (fifty-fifty) of losing 100 pesos and losing nothing (PROCEED TO NEXT SET)

Figure A2. Questionnaire for Lottery 2 used to elicit risk preferences

D.
NOTE TO THE INTERVIEWER: Change the visual support for only the second option
What if the first jar stays the same but the second jar changes? If a black ball is drawn from the second jar it now means you lose 1 peso. If you were to choose between these two jars, which one will you choose?
1. Again, a chance of 1 out of 4 (25%) of losing 400 pesos and 3 out of 4 (75%) losing nothing (PROCEED TO NEXT SET)
2. An equal chance (fifty-fifty) of losing 1 peso and losing nothing (PROCEED TO NEXT SET)

E.
NOTE TO THE INTERVIEWER: Change the visual support for only the second option
What if the first jar stays the same but the second jar changes? If a black ball is drawn from the second jar it now means you lose 250 pesos. If you were to choose between these two jars, which one will you choose?

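1. Again, a chance of 1 out of 4 (25%) of losing 400 pesos and 3 out of 4 (75%) losing nothing (PROCEED TO NEXT SET)
2. An equal chance (fifty-fifty) of losing 250 pesos and losing nothing (CONTINUE TO F)

F.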
NOTE TO THE INTERVIEWER: Change the visual support for only the second option
What if the first jar stays the same but the second jar changes? If a black ball is drawn from the second jar it now means you lose 300 pesos. If you were to choose between these two jars, which one will you choose?
1. Again, a chance of 1 out of 4 (25%) of losing 400 pesos and 3 out of 4 (75%) losing nothing (PROCEED TO NEXT SET)
2. An equal chance (fifty-fifty) of losing 300 pesos and losing nothing (CONTINUE TO G)

G.
NOTE TO THE INTERVIEWER: Change the visual support for only the second option
What if the first jar stays the same but the second jar changes? If a black ball is drawn from the second jar it now means you lose 400 pesos. If you were to choose between these two jars, which one will you choose?
1. Again, a chance of 1 out of 4 (25%) of losing 400 pesos and 3 out of 4 (75%) losing nothing (PROCEED TO NEXT SET)
2. An equal chance (fifty-fifty) of losing 400 pesos and losing nothing (PROCEED TO NEXT SET)

Figure A3. Example of visual support for risk elicitation task

From (1) and (2), we have [expression not recoverable]

[Table residue. Recoverable headings: Lottery 2 (z) indifference points; utility curvature (γ); probability weighting (α)]

TIME PREFERENCE (2) [Choice 2] SHOWCARD.
Again, I will ask you to make some choices between receiving money at different points in time, now they concern choices between half a year from now and one year from now. Again, there is no right or wrong answer. If you were to choose between the following two options, which one will you choose?

Figure B2. Questionnaire for Choice 2 used to elicit time preferences

Table B1: Time preference parameters δ_l and β_l for all combinations of Choice 1 (x0) and Choice 2 (y) indifference points assuming linear utility
Note: To obtain the parameters under power utility, raise each entry in the tables above to a power given by any value of utility curvature (γ) given in Table A1.

Figure D2. Conditional probability of insurance as function of each preference parameter: local constant regression estimates
Notes: The graph in each panel is obtained from a bivariate local constant regression of an insurance indicator on the respective preference parameter. It shows estimates of conditional means at the 5th, 10th, ..., 95th percentiles of the respective parameter. The number of estimates is less than 19 in some panels because of equal percentile values. The Epanechnikov kernel function is used. For δ, β and α, the bandwidth is selected optimally by cross-validation (Li & Racine, 2004). This method produced a very wide bandwidth for γ, and so for this parameter, the bandwidth is selected by the plug-in estimator of the asymptotically optimal constant bandwidth. Whiskers show 95% confidence intervals (CI).
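
As a rough illustration of the estimator named in the notes to Figure D2, the sketch below computes a local constant (Nadaraya-Watson) regression with an Epanechnikov kernel and evaluates it at the distinct 5th, 10th, ..., 95th percentile values of the regressor. It is a minimal sketch under assumptions: the data are made up, the bandwidth is fixed by hand rather than chosen by cross-validation or a plug-in rule, percentiles use a simple nearest-rank rule, and no confidence intervals are computed. All function names are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def local_constant(x, y, grid, bandwidth):
    """Kernel-weighted mean of y around each grid point (Nadaraya-Watson)."""
    estimates = []
    for g in grid:
        weights = [epanechnikov((xi - g) / bandwidth) for xi in x]
        total = sum(weights)
        if total == 0.0:
            estimates.append(float("nan"))  # no observations within the bandwidth
        else:
            estimates.append(sum(w * yi for w, yi in zip(weights, y)) / total)
    return estimates


def percentile_grid(x, step=5):
    """Distinct values of x at the 5th, 10th, ..., 95th percentiles (nearest rank)."""
    ordered = sorted(x)
    n = len(ordered)
    points = []
    for p in range(step, 100, step):
        rank = max(1, (p * n + 99) // 100)  # integer ceiling of p * n / 100
        value = ordered[rank - 1]
        if value not in points:  # equal percentile values collapse to one point
            points.append(value)
    return points


# Made-up discount factors and insurance indicators, for illustration only
delta = [0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
insured = [0, 0, 0, 1, 0, 1, 1, 1]
grid = percentile_grid(delta)
print(list(zip(grid, local_constant(delta, insured, grid, bandwidth=0.25))))
```

Dropping repeated percentile values in `percentile_grid` mirrors the remark in the notes that fewer than 19 estimates appear in some panels.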

Figure D3. Conditional probability of insurance as function of time preference parameters: comparison of parameters derived assuming power (left) and linear (right) utility
Notes: The graph in each panel is obtained from a bivariate local linear regression of an insurance indicator on the respective preference parameter. The graphs in the left-hand column are the same as those in the top row of Figure 4. They use time preference parameters that are derived under the assumption of power utility, using individual-specific estimates of utility curvature. The graphs in the right-hand column use parameters derived under the assumption of linear utility. Dots indicate point estimates of conditional means at the 5th, 10th, ..., 95th percentiles of the respective measure. The number of dots is less than 19 in some panels because of equal percentile values. The Epanechnikov kernel function is used. Bandwidths are selected optimally by cross-validation (Li & Racine, 2004). Whiskers show 95% confidence intervals (CI) obtained from a bootstrap percentile method (Cattaneo & Jansson, 2018) with 500 replications.

Notes: Sample includes respondents voluntarily insured or uninsured at follow up. Choosing the same extreme option corresponds to not switching either at a) D in both Choice 1 and Choice 2 (extreme impatience) or b) I in both Choice 1 and Choice 2 (extreme patience) (see Appendix B Figure B3). All covariates measured at follow-up. p-values adjusted for clustering at municipality level.

Notes: Sample includes respondents voluntarily insured or uninsured at follow up. Choosing a dominated prospect corresponds to selecting option G in either Lottery 1 or Lottery 2 (see Appendix A Figure A4). All covariates measured at follow-up. p-values adjusted for clustering at municipality level.

Notes: Sample includes respondents voluntarily insured or uninsured at follow up.
Choosing the same extreme risk seeking prospect corresponds to not switching at D in both Lottery 1 and Lottery 2 (see Appendix A Figure A4). All covariates measured at follow-up. p-values adjusted for clustering at municipality level.

Notes: The rows labelled "Effect" show estimates of the marginal change in the probability of insurance associated with a unit increase in the respective time preference parameter. The marginal changes are estimated at all values of the respective parameter and averaged. The columns headed "Power utility" reproduce estimates from Table 3 that correspond to the conditional mean functions shown in Figure 4. They utilize estimates of the time preference parameters derived under the assumption of power utility, using individual-specific estimates of utility curvature. The estimates in the columns headed "Linear utility" use preference parameter estimates derived under the assumption of linear utility. They correspond to the conditional mean functions shown in the right-hand column of Figure D3. All estimates are from bivariate local linear regressions. See notes to the respective figures for details of computation. The bandwidths shown in the table are those used for estimation of derivatives of the mean function. They are obtained by cross-validation (Henderson et al., 2015). Bootstrap standard errors in parentheses (500 replications). Statistical significance indicated by * p<0.1, ** p<0.05, *** p<0.01.

Notes: Probit estimates of (averaged) partial effects on probability of being insured for covariates used as controls in models used to produce Table 4. Estimates under "Non-parametric preference measures" go with Panel A, column (2) estimates in Table 4. Estimates under "Preference parameters" go with Panel B, column (2) estimates in Table 4. Estimates for 14 strata indicators also included in the models are not presented. Age enters all models with a quadratic specification.
Reference categories: education (college graduate); household size (no. persons = 1); wealth index quartile (richest 25%); income per capita quartile group (richest 25%). Definitions of variables related to perceptions of out-of-pocket (OOP) medical expenditure risk, health insurance literacy and knowledge of insurance benefits in Appendix C. Robust standard errors adjusted for clustering at the municipality level in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Table 4, column (2). Column (1) shows estimates obtained using the same covariate specification and samples as Table 4, column (2) but estimated as linear probability model by ordinary least squares. Column (2) uses same (probit) estimator as Table 4 but with a more restricted set of controls comprising sex, age (quadratic), experiment treatment site indicator and strata indicators. Column (3) estimated by probit with covariates as Table 4, column (2) but excluding respondents who do not switch in both Choice 1 & Choice 2 of the time preference elicitation task. Column (4) excludes respondents who do not switch in both Lottery 1 & Lottery 2 of the risk preference elicitation task. Robust standard errors adjusted for clustering at the municipality level in parentheses. x0, z1/2, x, and z are the indifference points elicited from Choice 1, Choice 2, Lottery 1, and Lottery 2, respectively. Dominated refers to respondents who choose a dominated prospect on either lottery. *** p<0.01, ** p<0.05, * p<0.1

Table 4 columns (2)-(3) but differs in how the preferences parameters enter the latent index function. In column (1), all parameters are winsorized at the respective 99th percentile. In column (2), the inverse hyperbolic sine (IHS) of each parameter is used. In column (3), the relative rank of each parameter is used. In column (4), we enter the time preference parameters (δ and β) that are derived in the money domain under the assumption of linear utility.
In columns (2) and (3), we still show the average partial effect of a unit change in each parameter, not the effect of the transformation of the parameter that enters the probit latent index. For the IHS, the estimated partial effect of δ, for example, is $\phi(\hat{y}_i^{*})\,\hat{b}/\sqrt{\delta_i^{2}+1}$, where $\phi(\cdot)$ is the standard normal probability density function, $\hat{y}_i^{*}$ is the predicted value of the latent index from the probit estimate of (4), and $\hat{b}$ is the coefficient on the IHS transformation of δ included in the latent index. For the relative rank, we obtain the partial effect of a unit change in the relative rank, which corresponds to moving from the minimum to the maximum value, and scale this by the relative rank (percentile) change that corresponds to a unit change in the parameter from the minimum value. Robust standard errors adjusted for clustering at the municipality level in parentheses. Dominated refers to respondents who choose a dominated prospect on either lottery. *** p<0.01, ** p<0.05, * p<0.1

Notes: Table shows average partial effects from probit models like that used to produce the estimates in Panel A column (2) of Table 4 but extended to include interactions with each preference measure. Model used for columns (1) and (2) includes interactions with an indicator of low health insurance literacy and/or knowledge (Appendix C for definitions). Model for columns (3) and (4) has interactions with an indicator that distinguishes high school and college graduates (high) from those with less education (low). Model for columns (5) and (6) has interactions with an indicator of below (low) and above (high) median of wealth index. Models include the covariates used for Table 4, column (2) estimates, except in columns (1)-(2) one indicator of low literacy or low knowledge replaces separate indicators of low insurance literacy and knowledge, in columns (3)-(4) one indicator of high school/college graduate replaces separate indicators of four levels of education, and in (5)-(6) one indicator of below/above median wealth replaces indicators for wealth quartile groups. All models include indicators of the sample strata. Robust standard errors adjusted for clustering at the municipality level in parentheses.
N indicates the size of the respective group, although the full sample is used to estimate each model. *** p<0.01, ** p<0.05, * p<0.1

Notes: Table shows OLS estimates of partial effects from LPM like those used to produce the estimates in column (1) of Table D9 but extended to include interactions with each preference measure. Models used for columns (1) and (2) include interactions with an indicator of below (low) and above (high) median of wealth index. Models for columns (3) and (4) have interactions with an indicator of low health insurance literacy and/or knowledge (Appendix C for definitions). Models for columns (5) and (6) have interactions with an indicator that distinguishes high school and college graduates (high) from those with less education (low). Models include the covariates used for Table D9 column (1) (and Table 4, column (2)) estimates, except in columns (1)-(2) one indicator of below/above median wealth replaces indicators for wealth quartile groups, in columns (3)-(4) one indicator of low literacy or low knowledge replaces separate indicators of low insurance literacy and knowledge, and in (5)-(6) one indicator of high school/college graduate replaces separate indicators of four levels of education. All models include indicators of the sample strata. Robust standard errors adjusted for clustering at the municipality level in parentheses. N indicates the size of the respective group, although the full sample is used to estimate each model. *** p<0.01, ** p<0.05, * p<0.1

Notes: Table shows Probit and LPM estimates of partial effects from models like those used to produce the estimates in column (2) of Table 4 and column (1) of Table D9, respectively, but extended to include interactions between each preference measure/parameter and an indicator of being located in a treatment site of the insurance experiment conducted in 2011/12. Models include the covariates used for Table 4 column (2) and Table D9 column (1) estimates.
All models include indicators of the sample strata. Robust standard errors adjusted for clustering at the municipality level in parentheses. N indicates the size of the respective group, although the full sample is used to estimate each model. *** p<0.01, ** p<0.05, * p<0.1

Appendix E Partial associations of preferences with insurance obtained as a program member and as a member's dependent
A respondent can obtain insurance directly through their own membership of a PhilHealth program (or a private policy) or indirectly through the membership of someone else in their household. Over three fifths of the insured respondents in our sample are direct members of a program. While a respondent's (elicited) preferences need not be irrelevant to becoming insured indirectly as a dependent (insurance may be a household decision, particularly since cover is extended to a member's household), the influence of those preferences would be weakened if the respondent negotiates over whether to insure with a spouse, who possibly holds different preferences. In that case, the associations of the elicited preferences with the acquisition of insurance through direct membership would be stronger than the associations with insurance obtained as a dependent.
To explore this potential heterogeneity, we estimate multinomial probit models of a three-category outcome: i) insured as a member, ii) insured as a dependent, and iii) uninsured. Average partial effects derived from these models are given in Table E1.
As expected, the partial effect of the discount factor on the probability of being insured as a member is greater than the effect on the probability of being insured as a dependent, although the difference appears to arise from a much larger effect of extreme negative time preference on direct insurance as a member. The parameter β has a positive partial effect only on the probability of being insured as a member, indicating that this probability increases with decreasing present bias, although not significantly. This parameter has a puzzling nonlinear relationship only with the probability of insurance as a dependent. The effect of present bias is more in line with expectations when attention is confined to direct insurance as a member and to respondents who have greater insurance literacy and knowledge (see Table 5).
The point estimate of the partial effect of extreme risk seeking is larger on the probability of direct insurance. Without controlling for this, utility curvature is estimated to have a stronger negative effect on the probability of insurance as a member. Both types of probability weighting are more strongly associated with the probability of being insured as a dependent, which may cast doubt on the extent to which these associations do reflect effects of probability weighting on the decision to insure.

Table E1. Partial effects of preference measures and parameters on probability of being uninsured, insured as a program member, and insured as a member's dependent
Notes: Table gives average partial effects from multinomial logit models of being i) uninsured, ii) insured as a program member, and iii) insured as a dependent of a program member. Effects on the three probabilities sum to zero. Covariate specification as for Table 4 column (2). Strata indicators are not included because doing so resulted in a highly singular covariance matrix for the panel A model that is likely due to small cell sizes. Robust standard errors adjusted for clustering at the municipality level in parentheses. *** p<0.01, ** p<0.05, * p<0.1
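
The adding-up property stated in the notes to Table E1 (effects on the three probabilities sum to zero) can be checked with a small numerical sketch. The example below uses the multinomial logit form named in those notes with a single regressor; because the three probabilities always sum to one, the same property holds for a multinomial probit. The coefficients and discount factor values are invented for illustration and are not estimates from the paper, and the function names are hypothetical.

```python
import math


def choice_probabilities(x, coefficients):
    """Multinomial logit probabilities for one regressor value x.

    `coefficients` maps outcome -> (intercept, slope); the base outcome
    carries (0.0, 0.0).
    """
    index = {k: a + b * x for k, (a, b) in coefficients.items()}
    top = max(index.values())  # subtract the maximum for numerical stability
    expo = {k: math.exp(v - top) for k, v in index.items()}
    total = sum(expo.values())
    return {k: e / total for k, e in expo.items()}


def average_partial_effects(xs, coefficients):
    """Sample average of dP_k/dx = P_k * (b_k - sum_j P_j * b_j)."""
    ape = {k: 0.0 for k in coefficients}
    for x in xs:
        p = choice_probabilities(x, coefficients)
        mean_slope = sum(p[j] * coefficients[j][1] for j in coefficients)
        for k in coefficients:
            ape[k] += p[k] * (coefficients[k][1] - mean_slope) / len(xs)
    return ape


# Invented coefficients: (intercept, slope on the discount factor)
coefficients = {
    "uninsured": (0.0, 0.0),
    "member": (-1.0, 1.5),
    "dependent": (-1.2, 0.4),
}
discount_factors = [0.3, 0.5, 0.7, 0.9, 1.0]
effects = average_partial_effects(discount_factors, coefficients)
print(effects)
print(sum(effects.values()))  # zero up to floating-point rounding
```

With these invented slopes, a higher discount factor shifts probability away from being uninsured and towards membership, while the three effects cancel exactly.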