New Economics Papers on Econometrics
By: | Andrey Simonov; Jean-Pierre H. Dubé; Günter J. Hitsch; Peter E. Rossi |
Abstract: | We analyze the initial conditions bias in the estimation of brand choice models with structural state dependence. Using a combination of Monte Carlo simulations and empirical case studies of shopping panels, we show that popular, simple solutions that mis-specify the initial conditions are likely to lead to bias even in relatively long panel datasets. The magnitude of the bias in the state dependence parameter can be as large as a factor of 2 to 2.5. We propose a solution to the initial conditions problem that samples the initial states as auxiliary variables in an MCMC procedure. The approach assumes that the joint distribution of prices and consumer choices, and hence the distribution of initial states, is in equilibrium. This assumption is plausible for the mature consumer packaged goods products used in this and the majority of prior empirical applications. In Monte Carlo simulations, we show that the approach recovers the true parameter values even in relatively short panels. Finally, we propose a diagnostic tool that uses common, biased approaches to bound the value of the state dependence parameter and to construct a computationally light test for state dependence. |
JEL: | D12 L66 M3 |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:26217&r=all |
By: | Ivan Korolev |
Abstract: | This paper develops a consistent series-based specification test for semiparametric panel data models with fixed effects. The test statistic resembles the Lagrange Multiplier (LM) test statistic in parametric models and is based on a quadratic form in the restricted model residuals. The use of series methods facilitates both estimation of the null model and computation of the test statistic. The asymptotic distribution of the test statistic is standard normal, so that appropriate critical values can easily be computed. The projection property of series estimators allows me to develop a degrees of freedom correction. This correction makes it possible to account for the estimation variance and obtain refined asymptotic results. It also substantially improves the finite sample performance of the test. |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1909.05649&r=all |
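A minimal cross-sectional sketch of the kind of LM-type series test described in the abstract above: fit the restricted (null) model, project the restricted residuals onto added series terms, and standardize the resulting quadratic form. The polynomial basis, the simple homoskedastic variance estimate, and the name `series_lm_test` are illustrative assumptions; the paper's panel version with fixed effects and its degrees-of-freedom correction are more involved.

```python
# Sketch of an LM-type series specification test (cross-sectional simplification).
import numpy as np

def series_lm_test(y, x, order=4):
    n = len(y)
    X0 = np.column_stack([np.ones(n), x])                # null (restricted) regressors
    e = y - X0 @ np.linalg.lstsq(X0, y, rcond=None)[0]   # restricted residuals
    W = np.column_stack([x**j for j in range(2, order + 1)])  # added series terms
    W = W - X0 @ np.linalg.lstsq(X0, W, rcond=None)[0]   # partial out null regressors
    P = W @ np.linalg.pinv(W.T @ W) @ W.T                # projection on added terms
    K = np.linalg.matrix_rank(W)
    sigma2 = e @ e / n                                   # simple variance estimate
    return (e @ P @ e / sigma2 - K) / np.sqrt(2 * K)     # approx N(0,1) under the null

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 1000)
y_lin = 1 + 2 * x + rng.normal(size=1000)                # null model is correct
y_nl = 1 + 2 * x + 0.5 * x**2 + rng.normal(size=1000)    # null model is misspecified
print(series_lm_test(y_lin, x), series_lm_test(y_nl, x))
```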
By: | Ayden Higgins; Federico Martellosio |
Abstract: | This paper explores the estimation of a panel data model with cross-sectional interaction that is flexible both in its approach to specifying the network of connections between cross-sectional units, and in controlling for unobserved heterogeneity. It is assumed that there are different sources of information available on a network, which can be represented in the form of multiple weights matrices. These matrices may reflect observed links, different measures of connectivity, groupings or other network structures, and the number of matrices may be increasing with sample size. A penalised quasi-maximum likelihood estimator is proposed which aims to alleviate the risk of network misspecification by shrinking the coefficients of irrelevant weights matrices to exactly zero. Moreover, controlling for unobserved factors in estimation provides a safeguard against the misspecification that might arise from unobserved heterogeneity. The estimator is shown to be consistent and selection consistent as both $n$ and $T$ tend to infinity, and its limiting distribution is characterised. Finite sample performance is assessed by means of a Monte Carlo simulation, and the method is applied to study the prevalence of network spillovers in determining growth rates across countries. |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1909.02823&r=all |
By: | Violetta Dalla (National and Kapodistrian University of Athens); Liudas Giraitis (Queen Mary, University of London); Peter C.B. Phillips (Cowles Foundation, Yale University) |
Abstract: | Commonly used tests to assess evidence for the absence of autocorrelation in a univariate time series or serial cross-correlation between time series rely on procedures whose validity holds for i.i.d. data. When the series are not i.i.d., the size of correlogram and cumulative Ljung-Box tests can be significantly distorted. This paper adapts standard correlogram and portmanteau tests to accommodate hidden dependence and non-stationarities involving heteroskedasticity, thereby uncoupling these tests from limiting assumptions that reduce their applicability in empirical work. To enhance the Ljung-Box test for non-i.i.d. data, a new cumulative test is introduced. The asymptotic size of these tests is unaffected by hidden dependence and heteroskedasticity in the series. Related extensions are provided for testing cross-correlation at various lags in bivariate time series. Tests for the i.i.d. property of a time series are also developed. An extensive Monte Carlo study confirms good performance in both size and power for the new tests. Applications to real data reveal that standard tests frequently produce spurious evidence of serial correlation. |
Keywords: | Serial correlation, Cross-correlation, Heteroskedasticity, Martingale differences |
JEL: | C12 |
Date: | 2019–04 |
URL: | http://d.repec.org/n?u=RePEc:cwl:cwldpp:2194&r=all |
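A minimal sketch of a heteroskedasticity-robust test for zero autocorrelation at a single lag, in the spirit of the abstract above: the sample autocovariance is self-normalized so that its null distribution is standard normal under a martingale-difference assumption rather than an i.i.d. one. The ARCH-style example series and all names below are illustrative; the paper's cumulative (portmanteau) versions additionally account for dependence across lags.

```python
# Robust t-statistic for lag-k autocorrelation under a martingale-difference null.
import numpy as np

def robust_acf_t(x, k):
    x = np.asarray(x, dtype=float) - np.mean(x)
    num = np.sum(x[k:] * x[:-k])                         # sample autocovariance (unscaled)
    den = np.sqrt(np.sum((x[k:] * x[:-k]) ** 2))         # self-normalisation
    return num / den                                     # compare with N(0,1) critical values

rng = np.random.default_rng(1)
z = rng.standard_normal(2000)
# Uncorrelated but conditionally heteroskedastic (ARCH-like) series
het = z * np.sqrt(0.2 + 0.7 * np.concatenate(([1.0], z[:-1] ** 2)))
print(robust_acf_t(het, 1))                              # should typically lie within +/-1.96
```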
By: | Peter C.B. Phillips (Cowles Foundation, Yale University); Ying Wang (The University of Auckland) |
Abstract: | Behavior at the individual level in panels or at the station level in spatial models is often influenced by aspects of the system in aggregate. In particular, the nature of the interaction between individual-specific explanatory variables and an individual dependent variable may be affected by ‘global’ variables that are relevant in decision making and shared communally by all individuals in the sample. To capture such behavioral features, we employ a functional coefficient panel model in which certain communal covariates may jointly influence panel interactions by means of their impact on the model coefficients. Two classes of estimation procedures are proposed, one based on station-averaged data and the other on the full panel, and their asymptotic properties are derived. Inference regarding the functional coefficient is also considered. The finite sample performance of the proposed estimators and tests is examined by simulation. An empirical spatial model illustration is provided in which the climate sensitivity of temperature to atmospheric CO_2 concentration is studied at both station and global levels. |
Keywords: | Climate modeling, Communal covariates, Fixed effects, Functional coefficients, Panel data, Spatial modeling |
JEL: | C14 C23 |
Date: | 2019–05 |
URL: | http://d.repec.org/n?u=RePEc:cwl:cwldpp:2193&r=all |
By: | Qi Wang; José E. Figueroa-López; Todd Kuffner |
Abstract: | Volatility estimation based on high-frequency data is key to accurately measure and control the risk of financial assets. A Lévy process with infinite jump activity and microstructure noise is considered one of the simplest, yet accurate enough, models for financial data at high-frequency. Utilizing this model, we propose a "purposely misspecified" posterior of the volatility obtained by ignoring the jump-component of the process. The misspecified posterior is further corrected by a simple estimate of the location shift and re-scaling of the log likelihood. Our main result establishes a Bernstein-von Mises (BvM) theorem, which states that the proposed adjusted posterior is asymptotically Gaussian, centered at a consistent estimator, and with variance equal to the inverse of the Fisher information. In the absence of microstructure noise, our approach can be extended to inferences of the integrated variance of a general Itô semimartingale. Simulations are provided to demonstrate the accuracy of the resulting credible intervals, and the frequentist properties of the approximate Bayesian inference based on the adjusted posterior. |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1909.04853&r=all |
By: | Susan Athey; Guido Imbens; Jonas Metzger; Evan Munro |
Abstract: | Researchers often use artificial data to assess the performance of new econometric methods. In many cases the data generating processes used in these Monte Carlo studies do not resemble real data sets and instead reflect many arbitrary decisions made by the researchers. As a result, potential users of the methods are rarely persuaded by these simulations that the new methods are as attractive as the simulations make them out to be. We discuss the use of Wasserstein Generative Adversarial Networks (WGANs) as a method for systematically generating artificial data that closely mimic any given real data set without the researcher having many degrees of freedom. We apply the methods to compare twelve different estimators for average treatment effects under unconfoundedness in three different settings. We conclude in this example that (i) there is not one estimator that outperforms the others in all three settings, and (ii) systematic simulation studies can be helpful for selecting among competing methods. |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1909.02210&r=all |
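A compact sketch of how a Wasserstein GAN can be fitted to a real tabular data set in order to generate artificial data, in the spirit of the abstract above. For brevity this uses the original weight-clipping form of the WGAN critic (the authors likely use a gradient-penalty variant); the network sizes, learning rates, and iteration counts are illustrative assumptions rather than their settings.

```python
# Minimal WGAN (weight clipping) for tabular data, as a sketch only.
import torch
import torch.nn as nn

def make_mlp(d_in, d_out, width=64):
    return nn.Sequential(nn.Linear(d_in, width), nn.ReLU(),
                         nn.Linear(width, width), nn.ReLU(),
                         nn.Linear(width, d_out))

def train_wgan(real, noise_dim=8, iters=2000, clip=0.01, n_critic=5, lr=5e-5, batch=128):
    n, d = real.shape
    G, D = make_mlp(noise_dim, d), make_mlp(d, 1)        # generator and critic
    opt_g = torch.optim.RMSprop(G.parameters(), lr=lr)
    opt_d = torch.optim.RMSprop(D.parameters(), lr=lr)
    for _ in range(iters):
        for _ in range(n_critic):                        # critic updates
            idx = torch.randint(0, n, (batch,))
            z = torch.randn(batch, noise_dim)
            loss_d = D(G(z).detach()).mean() - D(real[idx]).mean()
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()
            for p in D.parameters():                     # crude Lipschitz constraint
                p.data.clamp_(-clip, clip)
        z = torch.randn(batch, noise_dim)
        loss_g = -D(G(z)).mean()                         # generator raises critic score
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return G

# usage (illustrative): G = train_wgan(torch.tensor(X_real, dtype=torch.float32))
#                       X_fake = G(torch.randn(1000, 8)).detach().numpy()
```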
By: | Victor Chernozhukov; Iván Fernández-Val; Blaise Melly |
Abstract: | The widespread use of quantile regression methods depends crucially on the existence of fast algorithms. Despite numerous algorithmic improvements, the computation time is still non-negligible because researchers often estimate many quantile regressions and use the bootstrap for inference. We suggest two new fast algorithms for the estimation of a sequence of quantile regressions at many quantile indexes. The first algorithm applies the preprocessing idea of Portnoy and Koenker (1997) but exploits a previously estimated quantile regression to guess the sign of the residuals. This step allows for a reduction of the effective sample size. The second algorithm starts from a previously estimated quantile regression at a similar quantile index and updates it using a single Newton-Raphson iteration. The first algorithm is exact, while the second is only asymptotically equivalent to the traditional quantile regression estimator. We also apply the preprocessing idea to the bootstrap by using the sample estimates to guess the sign of the residuals in the bootstrap sample. Simulations show that our new algorithms provide very large improvements in computation time without significant (if any) cost in the quality of the estimates. For instance, we divide by 100 the time required to estimate 99 quantile regressions with 20 regressors and 50,000 observations. |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1909.05782&r=all |
By: | Mohammad Arshad Rahman; Angela Vossmeyer |
Abstract: | This paper develops a framework for quantile regression in binary longitudinal data settings. A novel Markov chain Monte Carlo (MCMC) method is designed to fit the model and its computational efficiency is demonstrated in a simulation study. The proposed approach is flexible in that it can account for common and individual-specific parameters, as well as multivariate heterogeneity associated with several covariates. The methodology is applied to study female labor force participation and home ownership in the United States. The results offer new insights at the various quantiles, which are of interest to policymakers and researchers alike. |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1909.05560&r=all |
By: | Fiona Burlig; Louis Preonas; Matt Woerman |
Abstract: | How should researchers design panel data experiments? We analytically derive the variance of panel estimators, informing power calculations in panel data settings. We generalize Frison and Pocock (1992) to fully arbitrary error structures, thereby extending McKenzie (2012) to allow for non-constant serial correlation. Using Monte Carlo simulations and real world panel data, we demonstrate that failing to account for arbitrary serial correlation ex ante yields experiments that are incorrectly powered under proper inference. By contrast, our “serial-correlation-robust” power calculations achieve correctly powered experiments in both simulated and real data. We discuss the implications of these results, and introduce a new software package to facilitate proper power calculations in practice. |
JEL: | B4 C23 C9 O1 Q4 |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:26250&r=all |
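A minimal simulation-based sketch of a power calculation that respects within-unit serial correlation, in the spirit of the abstract above: AR(1) errors are generated within units, treatment is assigned at the unit level, and the rejection rate for the difference-in-differences interaction is computed under cluster-robust inference. The design parameters, effect size, and AR(1) error structure are illustrative assumptions; the authors provide dedicated software for the analytic calculations.

```python
# Simulation-based power for a panel experiment with serially correlated errors.
import numpy as np
import statsmodels.api as sm

def simulated_power(n_units=100, pre=5, post=5, effect=0.2, rho=0.5, sims=500, alpha=0.05):
    rng = np.random.default_rng(0)
    T = pre + post
    rejections = 0
    for _ in range(sims):
        treat = rng.permutation(np.repeat([0, 1], n_units // 2))   # unit-level assignment
        e = rng.standard_normal((n_units, T))
        for t in range(1, T):                                      # AR(1) errors within unit
            e[:, t] = rho * e[:, t - 1] + np.sqrt(1 - rho**2) * e[:, t]
        post_ind = np.tile((np.arange(T) >= pre).astype(float), n_units)
        d = np.repeat(treat, T) * post_ind                         # treated x post interaction
        y = effect * d + e.ravel()
        X = sm.add_constant(np.column_stack([np.repeat(treat, T), post_ind, d]))
        fit = sm.OLS(y, X).fit(cov_type="cluster",
                               cov_kwds={"groups": np.repeat(np.arange(n_units), T)})
        rejections += fit.pvalues[3] < alpha
    return rejections / sims

print(simulated_power())
```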
By: | Hao Dong (Southern Methodist University); Taisuke Otsu (London School of Economics and Political Science); Luke Taylor (Aarhus University) |
Abstract: | We propose a semi-parametric estimator for varying coefficient models when the regressors in the nonparametric component are measured with error. Varying coefficient models are an extension of other popular semiparametric models, including partially linear and nonparametric additive models, and deliver an attractive solution to the curse-of-dimensionality. We use deconvolution kernel estimation in a two-step procedure and show that the estimator is consistent and asymptotically normally distributed. We do not assume that we know the distribution of the measurement error a priori, nor do we assume that the error is symmetrically distributed. Instead, we suppose we have access to a repeated measurement of the noisy regressor and use the approach of Li and Vuong (1998) based on Kotlarski's (1967) identity. We show that the convergence rate of the estimator is significantly reduced when the distribution of the measurement error is assumed unknown and possibly asymmetric. Finally, we study the small sample behavior of our estimator in a simulation study. |
Keywords: | Varying coefficient models, deconvolution, classical measurement error, unknown error distribution. |
JEL: | C14 |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:smu:ecowpa:1905&r=all |
By: | Nicholas Illenberger; Dylan S. Small; Pamela A. Shaw |
Abstract: | To make informed policy recommendations from observational data, we must be able to discern true treatment effects from random noise and effects due to confounding. Difference-in-Differences techniques that match treated units to control units based on pre-treatment outcomes, such as the synthetic control approach, have been presented as principled methods to account for confounding. However, we show that use of synthetic controls or other matching procedures can introduce regression to the mean (RTM) bias into estimates of the average treatment effect on the treated. Through simulations, we show RTM bias can lead to inflated type I error rates as well as decreased power in typical policy evaluation settings. Further, we provide a novel correction for RTM bias which can reduce bias and attain appropriate type I error rates. This correction can be used to perform a sensitivity analysis which determines how results may be affected by RTM. We use our proposed correction and sensitivity analysis to reanalyze data concerning the effects of California's Proposition 99, a large-scale tobacco control program, on statewide smoking rates. |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1909.04706&r=all |
By: | Mike Tsionas (Department of Economics, Lancaster University Management School); Marwan Izzeldin (Department of Economics, Lancaster University Management School); Arne Henningsen (Department of Food and Resource Economics, University of Copenhagen); Evaggelos Paravalos (Department of Economics, Athens University of Economics and Business (Greece)) |
Abstract: | In this paper, we consider the stochastic ray production function that has been revived recently by Henningsen et al. (2017). We use a profit-maximizing framework to resolve endogeneity problems that are likely to arise, as in all distance functions, and we derive the system of equations after incorporating technical inefficiency. As technical inefficiency enters non-trivially into the system of equations and the Jacobian is highly complicated, we propose Monte Carlo methods of inference. We illustrate the new approach using US banking data and we also address the problems of missing prices and selection of ordering for outputs. |
Keywords: | Stochastic ray production frontier, Technical inefficiency, Profit maximization, Bayesian inference |
JEL: | C11 C13 D24 |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:foi:wpaper:2019_06&r=all |
By: | Christophe BELLEGO (CREST; ENSAE.); Louis-Daniel PAPE (CREST; Ecole Polytechnique.) |
Abstract: | Log-linear and log-log regressions are among the most widely used statistical models. However, how to handle zeros in the dependent and independent variables has remained obscure despite the prevalence of the situation. In this paper, we discuss how to deal with this issue. We show that using Pseudo-Poisson Maximum Likelihood (PPML) is good practice compared to other approximate solutions. We then introduce a new complementary solution for dealing with zeros, which consists in adding a positive value specific to each observation and avoids some numerical issues faced by the former. |
Keywords: | Log(0), Log of zero, Log-log, Bias, Elasticity, PPML |
Date: | 2019–08–28 |
URL: | http://d.repec.org/n?u=RePEc:crs:wpaper:2019-13&r=all |
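A small sketch illustrating the point of the abstract above: when the outcome contains genuine zeros and the conditional mean is log-linear, Pseudo-Poisson Maximum Likelihood recovers the elasticity while OLS on log(y+1) does not. The data generating process and parameter values are illustrative assumptions, not those of the paper.

```python
# PPML versus log(y+1) OLS when the outcome has zeros.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000
logx = rng.normal(size=n)
mu = np.exp(0.5 + 1.0 * logx)                 # true elasticity of y w.r.t. x is 1.0
y = rng.poisson(mu)                           # count outcome with genuine zeros

X = sm.add_constant(logx)
ppml = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC1")
ols_log1p = sm.OLS(np.log1p(y), X).fit(cov_type="HC1")

print("PPML elasticity:   ", ppml.params[1])       # close to 1.0
print("log(y+1) OLS slope:", ols_log1p.params[1])  # typically biased away from 1.0
```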
By: | Christian Francq; Jean-Michel Zakoian |
Abstract: | In order to estimate the conditional risk of a portfolio's return, two strategies can be advocated. A multivariate strategy requires estimating a dynamic model for the vector of risk factors, which is often challenging, when at all possible, for large portfolios. A univariate approach based on a dynamic model for the portfolio's return seems more attractive. However, when the combination of the individual returns is time-varying, the portfolio's return series is typically nonstationary, which may invalidate statistical inference. An alternative approach consists in reconstituting a "virtual portfolio", whose returns are built using the current composition of the portfolio and for which a stationary dynamic model can be estimated. This paper establishes the asymptotic properties of this method, which we call Virtual Historical Simulation. Numerical illustrations on simulated and real data are provided. |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1909.04661&r=all |
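A minimal sketch of the "virtual portfolio" construction described above: past individual returns are re-weighted by the portfolio's current composition to obtain a return series to which a risk measure can be applied. The empirical-quantile VaR used here is purely illustrative; the paper's point is that a stationary dynamic model can then be fitted to this virtual series.

```python
# Virtual portfolio returns from current weights, with a simple historical VaR.
import numpy as np

def virtual_var(asset_returns, current_weights, level=0.01):
    """asset_returns: (T, d) array of past individual returns;
    current_weights: (d,) portfolio weights held today."""
    virtual_returns = asset_returns @ np.asarray(current_weights)  # virtual portfolio series
    return -np.quantile(virtual_returns, level)                    # 1% Value-at-Risk (loss)

rng = np.random.default_rng(0)
R = rng.multivariate_normal([0, 0], [[1e-4, 5e-5], [5e-5, 4e-4]], size=1000)
print(virtual_var(R, [0.6, 0.4]))
```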
By: | Harold D. Chiang; Kengo Kato; Yukun Ma; Yuya Sasaki |
Abstract: | This paper investigates double/debiased machine learning (DML) under multiway clustered sampling environments. We propose a novel multiway cross fitting algorithm and a multiway DML estimator based on this algorithm. Simulations indicate that the proposed procedure has favorable finite sample performance. |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1909.03489&r=all |
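A minimal sketch of two-way cross fitting for DML in a partially linear model with two-way clustered data, in the spirit of the abstract above: nuisance functions are trained only on observations that share neither cluster fold with the evaluation cell. The learners, fold counts, and final aggregation below are illustrative assumptions rather than the authors' exact algorithm.

```python
# Two-way cross fitting for a partially linear model Y = D*theta + g(X) + e.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def twoway_dml(Y, D, X, cid1, cid2, K=2, seed=0):
    """cid1, cid2: integer cluster ids (starting at 0) for the two clustering dimensions."""
    rng = np.random.default_rng(seed)
    f1 = rng.integers(K, size=cid1.max() + 1)[cid1]      # fold of each obs via 1st cluster id
    f2 = rng.integers(K, size=cid2.max() + 1)[cid2]      # fold of each obs via 2nd cluster id
    num = den = 0.0
    for k in range(K):
        for l in range(K):
            train = (f1 != k) & (f2 != l)                # leave out both cluster folds
            test = (f1 == k) & (f2 == l)
            if train.sum() == 0 or test.sum() == 0:
                continue
            mY = RandomForestRegressor(random_state=0).fit(X[train], Y[train])
            mD = RandomForestRegressor(random_state=0).fit(X[train], D[train])
            rY = Y[test] - mY.predict(X[test])           # residualise outcome
            rD = D[test] - mD.predict(X[test])           # residualise treatment
            num += np.sum(rD * rY)
            den += np.sum(rD * rD)
    return num / den                                     # orthogonalised estimate of theta

# usage (illustrative): theta_hat = twoway_dml(Y, D, X, firm_id, market_id)
```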
By: | Tatiana Komarova; Javier Hidalgo |
Abstract: | We describe and examine a test for shape constraints, such as monotonicity, convexity (or both simultaneously), U-shape, S-shape and others, in a nonparametric framework using partial sums empirical processes. We show that, after a suitable transformation, its asymptotic distribution is a functional of the standard Brownian motion, so that critical values are available. However, due to the possible poor approximation of the asymptotic critical values to the finite sample ones, we also describe a valid bootstrap algorithm. |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1909.01675&r=all |
By: | Rahul Singh; Liyang Sun |
Abstract: | Instrumental variable identification is a concept in causal statistics for estimating the counterfactual effect of treatment D on output Y controlling for covariates X using observational data. Even when measurements of (Y,D) are confounded, the treatment effect on the subpopulation of compliers can nonetheless be identified if an instrumental variable Z is available, which is independent of (Y,D) conditional on X and the unmeasured confounder. We introduce a de-biased machine learning (DML) approach to estimating complier parameters with high-dimensional data. Complier parameters include local average treatment effect, average complier characteristics, and complier counterfactual outcome distributions. In our approach, the de-biasing is itself performed by machine learning, a variant called de-biased machine learning via regularized Riesz representers (DML-RRR). We prove our estimator is consistent, asymptotically normal, and semi-parametrically efficient. In experiments, our estimator outperforms state of the art alternatives. We use it to estimate the effect of 401(k) participation on the distribution of net financial assets. |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1909.05244&r=all |
By: | Nibbering, D.; Paap, R. |
Abstract: | This paper proposes an asymmetric grouping estimator for panel data forecasting. The estimator relies on the observation that the bias-variance trade-off in potentially heterogeneous panel data may be different across individuals. Hence, the group of individuals used for parameter estimation that is optimal in terms of forecast accuracy may be different for each individual. For a specific individual, the estimator uses cross-validation to estimate the bias-variance trade-off of all individual groupings, and uses the parameter estimates of the optimal grouping to produce the individual-specific forecast. Integer programming and screening methods deal with the combinatorial problem of a large number of individuals. A simulation study and an application to market leverage forecasts of U.S. firms demonstrate the promising performance of our new estimators. |
Keywords: | Panel data, forecasting, parameter heterogeneity |
Date: | 2019–09–01 |
URL: | http://d.repec.org/n?u=RePEc:ems:eureir:119521&r=all |
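A simplified sketch of the individual-specific grouping idea described above: for each individual, a small set of candidate groupings is compared by pseudo-out-of-sample forecast error, and the winning group's pooled estimates produce that individual's forecast. The AR(1) forecasting model, the holdout scheme, and the restriction to a handful of candidate groups are illustrative simplifications; the paper handles the full combinatorial problem with integer programming and screening.

```python
# Individual-specific choice of estimation group by pseudo-out-of-sample error.
import numpy as np

def pooled_ar1(y_block):
    """Pooled AR(1) fit on an (n_individuals, T) block; returns (intercept, slope)."""
    x, y = y_block[:, :-1].ravel(), y_block[:, 1:].ravel()
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

def group_forecast(Y, candidate_groups, holdout=3):
    """Y: (N, T) panel; candidate_groups: list of index arrays.
    Every individual is assumed to appear in at least one candidate group."""
    N, T = Y.shape
    forecasts = np.empty(N)
    for i in range(N):
        best, best_err = None, np.inf
        for g in candidate_groups:
            if i not in g:
                continue
            a, b = pooled_ar1(Y[g, : T - holdout])       # estimate on training window
            preds = a + b * Y[i, T - holdout - 1: T - 1] # one-step forecasts on holdout
            err = np.mean((Y[i, T - holdout:] - preds) ** 2)
            if err < best_err:
                best, best_err = (a, b), err
        a, b = best
        forecasts[i] = a + b * Y[i, -1]                  # one-step-ahead forecast for i
    return forecasts
```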
By: | Martin Huber; Mark Schelker; Anthony Strittmatter |
Abstract: | We propose a novel approach for causal mediation analysis based on changes-in-changes assumptions restricting unobserved heterogeneity over time. This allows disentangling the causal effect of a binary treatment on a continuous outcome into an indirect effect operating through a binary intermediate variable (called mediator) and a direct effect running via other causal mechanisms. We identify average and quantile direct and indirect effects for various subgroups under the condition that the outcome is monotonic in the unobserved heterogeneity and that the distribution of the latter does not change over time conditional on the treatment and the mediator. We also provide a simulation study and an empirical application to the Jobs II programme. |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1909.04981&r=all |
By: | Nathalie Gimenes; Emmanuel Guerre |
Abstract: | The paper proposes a sieve quantile regression approach for first-price auctions with symmetric risk-neutral bidders under the independent private value paradigm. It is first shown that a private value quantile regression model generates a quantile regression for the bids. The private value quantile regression can be easily estimated from the bid quantile regression and its derivative with respect to the quantile level. A new local polynomial technique is proposed to estimate the latter over the whole quantile level interval. Plug-in estimation of functionals is also considered, as needed for the expected revenue or the case of CRRA risk-averse bidders, which is amenable to our framework. A quantile regression analysis of USFS timber auctions is found to be more appropriate than the homogenized bid methodology and illustrates the contribution of each explanatory variable to the private value distribution. |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1909.05542&r=all |
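For context, the identity underlying this approach for a first-price auction with I symmetric risk-neutral bidders and independent private values relates the private-value conditional quantile V to the bid conditional quantile B and its derivative in the quantile level (the notation here is generic rather than the paper's):

V(\alpha \mid x) \;=\; B(\alpha \mid x) \;+\; \frac{\alpha}{I-1}\,\frac{\partial B(\alpha \mid x)}{\partial \alpha}, \qquad \alpha \in (0,1),

which is why estimating the bid quantile regression and its derivative with respect to the quantile level suffices to recover the private value quantile regression.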
By: | Bruno Ferman |
Abstract: | We analyze the conditions in which ignoring spatial correlation is problematic for inference in differences-in-differences (DID) models. Assuming that the spatial correlation structure follows a linear factor model, we show that inference ignoring such correlation remains reliable when either (i) the second moment of the difference between the pre- and post-treatment averages of common factors is low, or (ii) the distribution of factor loadings has the same expected values for treated and control groups and does not exhibit significant spatial correlation. We present simulations with real datasets that corroborate these conclusions. Our results provide important guidelines on how to minimize inference problems due to spatial correlation in DID applications. |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1909.01782&r=all |
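One way to make the abstract's assumptions concrete (the notation below is an illustrative paraphrase, not necessarily the paper's) is to write the DID outcome with a linear factor structure in the error term:

Y_{it} = \alpha_i + \delta_t + \beta D_{it} + \lambda_i' F_t + u_{it},

where F_t are common factors, \lambda_i are unit-specific loadings, and u_{it} is idiosyncratic. Condition (i) then concerns the second moment of the difference between pre- and post-treatment averages of F_t being small, while condition (ii) concerns the loadings \lambda_i having the same expected value for treated and control units and limited spatial correlation.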
By: | Robert W. Dimand (Department of Economics, Brock University) |
Abstract: | This paper explores the development of dynamic modelling of macroeconomic fluctuations at the Cowles Commission from Roos, Dynamic Economics (Cowles Monograph No. 1, 1934) and Davis, Analysis of Economic Time Series (Cowles Monograph No. 6, 1941) to Koopmans, ed., Statistical Inference in Dynamic Economic Models (Cowles Monograph No. 10, 1950) and Klein’s Economic Fluctuations in the United States, 1921-1941 (Cowles Monograph No. 11, 1950), emphasizing the emergence of a distinctive Cowles Commission approach to structural modelling of macroeconomic fluctuations influenced by Cowles Commission work on structural estimation of simultaneous equations models, as advanced by Haavelmo (“A Probability Approach to Econometrics,” Cowles Commission Paper No. 4, 1944) and in Cowles Monographs Nos. 10 and 14. This paper is part of a larger project, a history of the Cowles Commission and Foundation commissioned by the Cowles Foundation for Research in Economics at Yale University. Presented at the Association Charles Gide workshop “Macroeconomics: Dynamic Histories. When Statics is no longer Enough,” Colmar, May 16-19, 2019. |
Keywords: | Macroeconomic dynamics, Cowles Commission, Business cycles, Lawrence R. Klein, Tjalling C. Koopmans |
Date: | 2019–04 |
URL: | http://d.repec.org/n?u=RePEc:cwl:cwldpp:2195&r=all |