On Econometrics
By: | Felix Chan; Laszlo Matyas; Agoston Reguly |
Abstract: | This paper deals with econometric models in which the dependent variable, some explanatory variables, or both are observed as censored interval data. This discretization often happens due to the confidentiality of sensitive variables like income. Models using these variables cannot point-identify regression parameters, as the conditional moments are unknown, which has led the literature to use interval estimates. Here, we propose a discretization method through which the regression parameters can be point-identified while preserving data confidentiality. We demonstrate the asymptotic properties of the OLS estimator for the parameters in multivariate linear regressions for cross-sectional data. The theoretical findings are supported by Monte Carlo experiments and illustrated with an application to the Australian gender wage gap.
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.15220&r=ecm |
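A minimal Python sketch of the setting (illustrative only, not the authors' proposed discretization): the latent outcome is observed only as an income bracket, and a naive regression on bracket midpoints generally cannot recover the true parameters, which is precisely the point-identification problem the paper addresses.

import numpy as np

# Hypothetical simulation: latent income observed only as one of 10 brackets.
rng = np.random.default_rng(0)
n = 5_000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=1.5, size=n)   # latent outcome

bins = np.quantile(y, np.linspace(0, 1, 11))        # bracket boundaries
idx = np.clip(np.digitize(y, bins) - 1, 0, 9)
y_mid = 0.5 * (bins[idx] + bins[idx + 1])           # only midpoints released

X = np.column_stack([np.ones(n), x])
beta_latent = np.linalg.lstsq(X, y, rcond=None)[0]
beta_midpoint = np.linalg.lstsq(X, y_mid, rcond=None)[0]
print(beta_latent, beta_midpoint)   # midpoint OLS is biased in general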
By: | Kimoto, Ryo; Otsu, Taisuke |
Abstract: | A seminal work by Domínguez and Lobato (2004) proposed a consistent estimation method for conditional moment restrictions, which does not rely on additional identification assumptions, as the GMM estimator using unconditional moments does, and is free from any user-chosen number. Their methodology was further extended by Domínguez and Lobato (2015, 2020) to consistent specification testing of conditional moment restrictions, which may involve generated variables. We follow up on this literature and derive the asymptotic distribution of Domínguez and Lobato's (2004) estimator when it involves generated variables. Our simulation results illustrate that ignoring proxy errors in the generated variables may cause severe distortions in the coverage or size properties of statistical inference on parameters.
Keywords: | conditional moment restriction; generated variable; GMM |
JEL: | C14 |
Date: | 2022–06–01 |
URL: | http://d.repec.org/n?u=RePEc:ehl:lserod:114264&r=ecm |
By: | Luis E. Candelaria; Yichong Zhang |
Abstract: | This paper introduces a methodology to conduct robust inference in bipartite networks under local misspecification. We focus on a class of dyadic network models with misspecified conditional moment restrictions. The misspecification is local, in the sense that its effect shrinks with the sample size. We utilize this local asymptotic approach to construct a robust estimator that is minimax optimal for the mean square error within a neighborhood of misspecification. Additionally, we introduce bias-aware confidence intervals that account for the effect of the local misspecification. These confidence intervals have the correct asymptotic coverage for the true parameter of interest under sparse network asymptotics. Monte Carlo experiments demonstrate that the robust estimator performs well in finite samples and sparse networks. As an empirical illustration, we study the formation of a scientific collaboration network among economists.
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.13725&r=ecm |
By: | Danyang Huang; Ziyi Kong; Shuyuan Wu; Hansheng Wang |
Abstract: | Spatial autoregressive (SAR) models are important tools for studying network effects. However, with an increasing emphasis on data privacy, data providers often implement privacy protection measures that make classical SAR models inapplicable. In this study, we introduce a privacy-protected SAR model with noise-added response and covariates to meet privacy-protection requirements. In this scenario, the traditional quasi-maximum likelihood estimator becomes infeasible because the likelihood function cannot be formulated. To address this issue, we first consider an explicit expression for the likelihood function with only noise-added responses. Its derivatives, however, are biased owing to the noise in the covariates. We therefore develop techniques that can correct the biases introduced by the noise. Correspondingly, a Newton-Raphson-type algorithm is proposed to obtain the estimator, leading to a corrected likelihood estimator. To further enhance computational efficiency, we introduce a corrected least squares estimator based on the idea of bias correction. These two estimation methods ensure both data security and the attainment of statistically valid estimators. A theoretical analysis of both estimators is carefully conducted, and statistical inference methods are discussed. The finite-sample performance of the different methods is demonstrated through extensive simulations and the analysis of a real dataset.
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.16773&r=ecm |
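The bias-correction idea can be previewed in a plain linear model (a simplification; the paper handles the full SAR likelihood): when the covariance of the noise added for privacy is known, the contaminated Gram matrix can be debiased before solving the normal equations.

import numpy as np

rng = np.random.default_rng(1)
n, p, sd_noise = 10_000, 2, 0.8
X = rng.normal(size=(n, p))
beta = np.array([1.0, -0.5])
y = X @ beta + rng.normal(size=n)

X_obs = X + rng.normal(scale=sd_noise, size=(n, p))   # noise-added covariates
Sigma_noise = sd_noise**2 * np.eye(p)                 # assumed known

beta_naive = np.linalg.solve(X_obs.T @ X_obs, X_obs.T @ y)
beta_corrected = np.linalg.solve(X_obs.T @ X_obs - n * Sigma_noise, X_obs.T @ y)
print(beta_naive, beta_corrected)   # naive estimate is attenuated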
By: | Jens Klooster; Mikhail Zhelonkin |
Abstract: | The classical tests in the instrumental variable model can behave arbitrarily badly if the data are contaminated. For instance, a single outlying observation can be enough to change the outcome of a test. We develop a framework to construct testing procedures that are robust to weak instruments, outliers, and heavy-tailed errors in the instrumental variable model. The framework is built upon M-estimators. By deriving the influence functions of the classical weak-instrument-robust tests, such as the Anderson-Rubin test, the K-test, and the conditional likelihood ratio (CLR) test, we prove their unbounded sensitivity to infinitesimal contamination. We therefore construct contamination-robust alternatives. In particular, we show how to construct a robust CLR statistic based on Mallows-type M-estimators and show that its asymptotic distribution is the same as that of the (classical) CLR statistic. The theoretical results are corroborated by a simulation study. Finally, we revisit three empirical studies affected by outliers and demonstrate how the new robust tests can be used in practice.
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.16844&r=ecm |
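For reference, a sketch of the classical (non-robust) Anderson-Rubin test that the paper takes as its starting point, assuming no included exogenous regressors; the paper's contribution, the M-estimation-based robustification, is not attempted here.

import numpy as np
from scipy import stats

def anderson_rubin(y, X, Z, beta0):
    # Under H0: beta = beta0, the residual should be orthogonal to Z.
    n, k = Z.shape
    e0 = y - X @ beta0
    Pz_e0 = Z @ np.linalg.solve(Z.T @ Z, Z.T @ e0)
    rss_fit = e0 @ Pz_e0                    # variation explained by instruments
    rss_res = e0 @ e0 - rss_fit
    ar = (n - k) / k * rss_fit / rss_res    # F(k, n - k) under H0
    return ar, 1.0 - stats.f.cdf(ar, k, n - k)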
By: | Kirill S. Evdokimov; Andrei Zeleneev |
Abstract: | This paper considers nonparametric identification and estimation of the regression function when a covariate is mismeasured. The measurement error need not be classical. Employing the small measurement error approximation, we establish nonparametric identification under weak and easy-to-interpret conditions on the instrumental variable. The paper also provides nonparametric estimators of the regression function and derives their rates of convergence. |
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.11309&r=ecm |
By: | Cui Rui; Li Yuhao |
Abstract: | This paper introduces a novel goodness-of-fit test technique for parametric conditional distributions. The proposed tests are based on a residual marked empirical process, for which we develop a conditional Principal Component Analysis. The obtained components provide a basis for various types of new tests in addition to the omnibus one. Component tests, each based on a single component, serve as experts in detecting departures in particular directions. Smooth tests that assemble a few components are also of great use in practice. To further improve testing efficiency, we introduce a component selection approach, aiming to identify the components that contribute most. The finite-sample performance of the proposed tests is illustrated through Monte Carlo experiments.
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.10352&r=ecm |
By: | Supriya Tiwari; Pallavi Basu |
Abstract: | Many classical inferential approaches fail to hold when interference exists among the population units, that is, when the treatment status of one unit affects the potential outcomes of other units in the population. Testing for such spillover effects in this setting makes the null hypothesis non-sharp. An interesting approach to tackling the non-sharp nature of the null hypothesis in this setup is constructing conditional randomization tests such that the null is sharp on the restricted population. In randomized experiments, conditional randomization tests retain finite-sample validity. Such approaches can pose computational challenges, as finding these appropriate sub-populations based on the experimental design can involve solving an NP-hard problem. In this paper, we view the network amongst the population as a random variable instead of being fixed. We propose a new approach that builds a conditional quasi-randomization test. Our main idea is to build the (non-sharp) null distribution of no spillover effects using random graph null models. We show that our method is exactly valid in finite samples under mild assumptions. Our method displays enhanced power over other methods, with substantial improvements in complex experimental designs. We highlight that the method reduces to a simple permutation test, making it easy to implement in practice. We conduct a simulation study to verify the finite-sample validity of our approach and illustrate our methodology by testing for interference in a weather insurance adoption experiment run in rural China.
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.16673&r=ecm |
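Since the method ultimately reduces to a permutation test, a generic skeleton looks as follows; the paper's key ingredient, conditioning on draws from a random-graph null model for the network, is elided here.

import numpy as np

def permutation_pvalue(outcome, treat, stat_fn, n_perm=2_000, seed=0):
    # stat_fn measures evidence of spillovers, e.g. the association between
    # a unit's outcome and the treated share among its neighbours.
    rng = np.random.default_rng(seed)
    t_obs = stat_fn(outcome, treat)
    t_null = np.array([stat_fn(outcome, rng.permutation(treat))
                       for _ in range(n_perm)])
    return (1 + np.sum(t_null >= t_obs)) / (1 + n_perm)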
By: | Gustavo Fruet Dias (School of Economics, University of East Anglia); Karsten Schweiker (University of Hohenheim) |
Abstract: | In this paper, we identify a novel form of multiplicative market microstructure noise, referred to as fragmentation noise, which arises when the same asset is traded across multiple venues. We demonstrate that conventional estimators, such as realized variance and other well-established noise-robust methods, yield inconsistent estimates in the presence of fragmentation noise. To address this estimation issue, we propose a two-step estimator. In the first step, we model prices in different trading venues using a vector error correction model, leveraging its common trend representation to estimate the efficient price of the asset. In the second step, we compute the realized variance estimator using the estimates of the efficient price. We derive the asymptotic distribution of our proposed two-step estimator and conduct comprehensive simulation experiments. An application to the constituents of the DJIA reveals that our two-step estimator outperforms or performs on par with the univariate estimators under consideration. |
Keywords: | High-frequency data, Ornstein-Uhlenbeck process, Cointegration, Realized variance, Realized kernel estimators, Market microstructure, Price discovery |
JEL: | C12 C15 G14 |
Date: | 2024–04 |
URL: | http://d.repec.org/n?u=RePEc:uea:ueaeco:2024-04&r=ecm |
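The second step of the proposed estimator is the standard realized variance, applied to the efficient price recovered from the VECM's common-trend representation; a sketch, taking the estimated efficient log-price as given:

import numpy as np

def realized_variance(efficient_log_price):
    # Sum of squared log-returns of the estimated efficient price.
    returns = np.diff(efficient_log_price)
    return np.sum(returns**2)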
By: | Mikkel Bennedsen; Kim Christensen; Peter Christensen |
Abstract: | We develop a framework for composite likelihood inference on parametric continuous-time stationary Gaussian processes. We derive the asymptotic theory of the associated maximum composite likelihood estimator. We implement our approach on a pair of models that have been proposed to describe the random log-spot variance of financial asset returns. A simulation study shows that our approach delivers good performance in these settings and improves upon method-of-moments estimation. In an application, we inspect the dynamics of an intraday measure of spot variance computed with high-frequency data from the cryptocurrency market. The empirical evidence supports a mechanism in which the short- and long-term correlation structures of stochastic volatility are decoupled in order to capture its properties at different time scales.
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.12653&r=ecm |
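A toy Python sketch of the composite-likelihood idea for an OU-type stationary Gaussian process with autocovariance c(h) = s2*exp(-kappa*h), summing bivariate Gaussian log-densities over consecutive pairs; the models the paper fits to log-spot variance are richer than this.

import numpy as np
from scipy.optimize import minimize

def neg_pairwise_cl(log_params, y, dt):
    s2, kappa = np.exp(log_params)          # positivity via log-scale
    rho = np.exp(-kappa * dt)               # lag-dt autocorrelation
    y0, y1 = y[:-1], y[1:]
    quad = (y0**2 - 2 * rho * y0 * y1 + y1**2) / (s2 * (1 - rho**2))
    ll = -np.log(2 * np.pi) - 0.5 * (np.log(s2**2 * (1 - rho**2)) + quad)
    return -np.sum(ll)

# usage: res = minimize(neg_pairwise_cl, np.log([1.0, 0.5]), args=(y, dt))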
By: | Dante Amengual (CEMFI, Centro de Estudios Monetarios y Financieros); Gabriele Fiorentini (Università di Firenze and RCEA); Enrique Sentana (CEMFI, Centro de Estudios Monetarios y Financieros)
Abstract: | We show that the influence functions of the information matrix test for the multinomial logit model are the Kronecker product of two terms: the outer product of the generalised residuals minus their covariance matrix conditional on the explanatory variables, and the outer product of those variables. Thus, the test resembles a multivariate heteroskedasticity test à la White (1980), which confirms Chesher's (1984) unobserved heterogeneity interpretation. Our simulation experiments indicate that using theoretical expressions for the conditional covariance matrices involved substantially reduces size distortions, while the parametric bootstrap practically eliminates them. We also show that the test has good power against several relevant alternatives.
Keywords: | Hessian matrix, outer product of the score, specification test, unobserved heterogeneity. |
JEL: | C35 C25 |
Date: | 2024–06 |
URL: | http://d.repec.org/n?u=RePEc:cmf:wpaper:wp2024_2406&r=ecm |
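In LaTeX, the stated structure of the influence functions reads schematically as follows, with notation assumed here: $\varepsilon$ collects the generalised residuals and $\Omega(\mathbf{x}) = E(\varepsilon\varepsilon' \mid \mathbf{x})$ is their conditional covariance matrix.

\mathrm{IF}(y, \mathbf{x}; \theta)
  = \bigl[\varepsilon\varepsilon' - \Omega(\mathbf{x})\bigr]
    \otimes \mathbf{x}\mathbf{x}'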
By: | Tibor Szendrei; Arnab Bhattacharjee; Mark E. Schaffer |
Abstract: | Quantile crossing has been an ever-present thorn in the side of quantile regression. This has spurred research into obtaining densities and coefficients that obey the quantile monotonicity property. While these are important contributions, they do not provide insight into how exactly the constraints influence the estimated coefficients. This paper extends non-crossing constraints and shows that by varying a single hyperparameter ($\alpha$) one can obtain commonly used quantile estimators. Namely, we obtain the quantile regression estimator of Koenker and Bassett (1978) when $\alpha=0$, the non-crossing quantile regression estimator of Bondell et al. (2010) when $\alpha=1$, and the composite quantile regression estimator of Koenker (1984) and Zou and Yuan (2008) when $\alpha\rightarrow\infty$. As such, we show that non-crossing constraints are simply a special type of fused shrinkage.
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.14036&r=ecm |
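Schematically, the unifying objective can be written as a check-loss fit plus a fused-shrinkage term with weight $\alpha$; the exact constraint formulation is in the paper.

\min_{\beta_1, \dots, \beta_K}\;
  \sum_{k=1}^{K} \sum_{i=1}^{n} \rho_{\tau_k}\bigl(y_i - x_i'\beta_k\bigr)
  + \alpha \sum_{k=2}^{K} P\bigl(\beta_k - \beta_{k-1}\bigr)

With $\alpha=0$ the quantiles decouple into separate Koenker-Bassett regressions, while $\alpha\rightarrow\infty$ fuses the coefficients across quantiles, yielding the composite quantile regression estimator.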
By: | Sunny Karim; Matthew D. Webb; Nichole Austin; Erin Strumpf |
Abstract: | In this study, we identify and relax the assumption of data "poolability" in difference-in-differences (DID) estimation. Poolability, the ability to combine observations from treated and control units into one dataset, is often lacking due to data privacy concerns. For instance, administrative health data stored in secure facilities are often not combinable across jurisdictions. We propose an innovative approach to estimate DID with unpoolable data: UN-DID. Our method incorporates adjustments for additional covariates, multiple groups, and staggered adoption. Without covariates, UN-DID and conventional DID give identical estimates of the average treatment effect on the treated (ATT). With covariates, we show mathematically and through simulations that UN-DID and conventional DID provide different, but equally informative, estimates of the ATT. An empirical example further underscores the utility of our methodology. The UN-DID method paves the way for more comprehensive analyses of policy impacts, even under data poolability constraints.
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.15910&r=ecm |
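In the no-covariate case, where UN-DID coincides with conventional DID, the mechanics reduce to exchanging group-level summaries so that microdata never leave their silo; a minimal sketch:

import numpy as np

def silo_change(y_pre, y_post):
    # Computed inside each silo; only this scalar is ever shared.
    return np.mean(y_post) - np.mean(y_pre)

def undid_att(treated_change, control_change):
    # ATT as the difference of the two silo-level before/after changes.
    return treated_change - control_change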
By: | Aparajithan Venkateswaran; Anirudh Sankar; Arun G. Chandrasekhar; Tyler H. McCormick |
Abstract: | Many statistical analyses, in both observational data and randomized control trials, ask: how does the outcome of interest vary with combinations of observable covariates? How do various drug combinations affect health outcomes, or how does technology adoption depend on incentives and demographics? Our goal is to partition this factorial space into "pools" of covariate combinations where the outcome differs across the pools (but not within a pool). Existing approaches (i) search for a single "optimal" partition under assumptions about the association between covariates or (ii) sample from the entire set of possible partitions. Both of these approaches ignore the reality that, especially with correlation structure in covariates, many ways of partitioning the covariate space may be statistically indistinguishable, despite having very different implications for policy or science. We develop an alternative perspective, called Rashomon Partition Sets (RPSs). Each item in the RPS partitions the space of covariates using a tree-like geometry. RPSs incorporate all partitions that have posterior values near the maximum a posteriori partition, even if they offer substantively different explanations, and do so using a prior that makes no assumptions about associations between covariates. This prior is the $\ell_0$ prior, which we show is minimax optimal. Given the RPS, we calculate the posterior of any measurable function of the feature effects vector on outcomes, conditional on being in the RPS. We also characterize the approximation error relative to the entire posterior and provide bounds on the size of the RPS. Simulations demonstrate that this framework allows for robust conclusions relative to conventional regularization techniques. We apply our method to three empirical settings: price effects on charitable giving, chromosomal structure (telomere length), and the introduction of microfinance.
Date: | 2024–04 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2404.02141&r=ecm |
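A toy sketch of the Rashomon idea: rather than reporting only the best-scoring partition into pools, retain every candidate whose log-posterior is within eps of the maximum. Here candidates and log_posterior are placeholders for the paper's tree-based enumeration and $\ell_0$-prior posterior.

def rashomon_partition_set(candidates, log_posterior, eps):
    # Keep all partitions whose score is near the maximum a posteriori one.
    scores = [log_posterior(c) for c in candidates]
    best = max(scores)
    return [c for c, s in zip(candidates, scores) if s >= best - eps]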
By: | Dmitry Arkhangelsky; Kazuharu Yanagimoto; Tom Zohar |
Abstract: | We provide a practical toolkit for analyzing effect heterogeneity in event studies. We develop an estimation algorithm and adapt existing econometric results to provide its theoretical justification. We apply these tools to Dutch administrative data to study individual heterogeneity in the child-penalty (CP) context in three ways. First, we document significant heterogeneity in the individual-level CP trajectories, emphasizing the importance of going beyond the average CP. Second, we use individual-level estimates to examine the impact of childcare supply expansion policies. Our approach uncovers nonlinear treatment effects, challenging conventional policy evaluation methods constrained to less flexible specifications. Third, we use the individual-level estimates as a right-hand-side regressor to study the intergenerational elasticity of the CP between mothers and daughters. After adjusting for measurement error bias, we find an elasticity of 24%. Our methodological framework contributes to empirical practice by offering a flexible approach tailored to specific research questions and contexts. We provide an open-source package ('unitdid') to facilitate widespread adoption.
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.19563&r=ecm |
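A hedged sketch of an individual-level event-study estimate in this spirit: impute each unit's counterfactual from its own pre-event mean plus the contemporaneous change among not-yet-treated units. This is an imputation-style simplification; the paper's algorithm and the 'unitdid' package implement the rigorous version.

import numpy as np

def unit_level_effect(y_pre, y_post, y_ctrl_pre, y_ctrl_post):
    # y_pre / y_post: one unit's outcomes before and after the event;
    # y_ctrl_*: outcomes of not-yet-treated units in the same periods.
    trend = np.mean(y_ctrl_post) - np.mean(y_ctrl_pre)
    counterfactual = np.mean(y_pre) + trend
    return np.mean(y_post) - counterfactual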
By: | Markku Lanne; Savi Virolainen |
Abstract: | We introduce a new smooth transition vector autoregressive model with a Gaussian conditional distribution and transition weights that, for a $p$th order model, depend on the full distribution of the preceding $p$ observations. Specifically, the transition weight of each regime increases in its relative weighted likelihood. This data-driven approach facilitates capturing complex switching dynamics, enhancing the identification of gradual regime shifts. In an empirical application to the macroeconomic effects of a severe weather shock, we find that in monthly U.S. data from 1961:1 to 2022:3, the impacts of the shock are stronger in the regime prevailing in the early part of the sample and in certain crisis periods than in the regime dominating the latter part of the sample. This suggests overall adaptation of the U.S. economy to increased severe weather over time. |
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.14216&r=ecm |
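Schematically, the transition weights take the form below (notation assumed here): regime m receives weight proportional to a regime weight $w_m$ times the Gaussian density $f_m$ of the preceding $p$ observations under that regime, so a regime's weight increases in its relative weighted likelihood.

\alpha_{m,t}
  = \frac{w_m \, f_m(y_{t-1}, \dots, y_{t-p})}
         {\sum_{n=1}^{M} w_n \, f_n(y_{t-1}, \dots, y_{t-p})}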
By: | David Ardia; Sébastien Laurent; Rosnel Sessinou
Abstract: | We introduce a new framework for the mean-variance spanning (MVS) hypothesis testing. The procedure can be applied to any test-asset dimension and only requires stationary asset returns and the number of benchmark assets to be smaller than the number of time periods. It involves individually testing moment conditions using a robust Student-t statistic based on the batch-mean method and combining the p-values using the Cauchy combination test. Simulations demonstrate the superior performance of the test compared to state-of-the-art approaches. For the empirical application, we look at the problem of domestic versus international diversification in equities. We find that the advantages of diversification are influenced by economic conditions and exhibit cross-country variation. We also highlight that the rejection of the MVS hypothesis originates from the potential to reduce variance within the domestic global minimum-variance portfolio. |
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.17127&r=ecm |
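The aggregation step uses the Cauchy combination test (Liu and Xie, 2020), which maps each p-value to a Cauchy variate and averages; a sketch with equal weights assumed:

import numpy as np

def cauchy_combination(pvals, weights=None):
    p = np.asarray(pvals, dtype=float)
    w = np.full(p.size, 1.0 / p.size) if weights is None else np.asarray(weights)
    t = np.sum(w * np.tan((0.5 - p) * np.pi))   # standard Cauchy under H0
    return 0.5 - np.arctan(t) / np.pi           # combined p-value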
By: | Yechan Park; Yuya Sasaki |
Abstract: | This paper addresses the challenge of estimating the Average Treatment Effect on the Treated Survivors (ATETS; Vikstrom et al., 2018) in the absence of long-term experimental data, utilizing available long-term observational data instead. We establish two theoretical results. First, it is impossible to obtain informative bounds for the ATETS with no model restrictions and no auxiliary data. Second, to overturn this negative result, we explore a promising avenue: recent econometric developments in combining experimental and observational data (e.g., Athey et al., 2020, 2019). We indeed find that exploiting short-term experimental data can be informative without imposing classical model restrictions. Furthermore, building on Chesher and Rosen (2017), we explore how to systematically derive sharp identification bounds, exploiting both the novel data-combination principles and classical model restrictions. Applying the proposed method, we explore what can be learned about the long-run effects of job training programs on employment without long-term experimental data.
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.16177&r=ecm |
By: | Dai, Yongsheng; Wang, Hui; Rafferty, Karen; Spence, Ivor; Quinn, Barry |
Abstract: | Time series anomaly detection plays a critical role in various applications, from finance to industrial monitoring. Effective models need to capture both the inherent characteristics of time series data and the unique patterns associated with anomalies. While traditional forecasting-based and reconstruction-based approaches have been successful, they tend to struggle with complex and evolving anomalies. For instance, stock market data exhibits complex and ever-changing fluctuation patterns that defy straightforward modelling. In this paper, we propose a novel approach called TDSRL (Time Series Dual Self-Supervised Representation Learning) for robust anomaly detection. TDSRL leverages synthetic anomaly segments which are artificially generated to simulate real-world anomalies. The key innovation lies in dual self-supervised pretext tasks: one task characterises anomalies in relation to the entire time series, while the other focuses on local anomaly boundaries. Additionally, we introduce a data degradation method that operates in both the time and frequency domains, creating a more natural simulation of real-world anomalies compared to purely synthetic data. Consequently, TDSRL is expected to achieve more accurate predictions of the location and extent of anomalous segments. Our experiments demonstrate that TDSRL outperforms state-of-the-art methods, making it a promising avenue for time series anomaly detection. |
Keywords: | Time series anomaly detection, self-supervised representation learning, contrastive learning, synthetic anomaly |
Date: | 2024 |
URL: | http://d.repec.org/n?u=RePEc:zbw:qmsrps:202403&r=ecm |
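A sketch of time-domain synthetic anomaly injection; the paper's degradation method also perturbs the frequency domain, which this toy version omits.

import numpy as np

def inject_spikes(series, n_spikes=3, scale=5.0, seed=0):
    rng = np.random.default_rng(seed)
    y = np.asarray(series, dtype=float).copy()
    labels = np.zeros(y.size, dtype=int)
    pos = rng.choice(y.size, size=n_spikes, replace=False)
    y[pos] += scale * np.std(y) * rng.choice([-1.0, 1.0], size=n_spikes)
    labels[pos] = 1          # pseudo-labels for the self-supervised task
    return y, labels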
By: | Andrea Gazzani (Bank of Italy); Fabrizio Venditti (Bank of Italy); Giovanni Veronese (Bank of Italy) |
Abstract: | Oil prices contain information on global shocks of key relevance for monetary policy decisions. We propose a novel approach to identify these shocks at the daily frequency in a Structural Vector Autoregression (SVAR). Our method is devised to be used in real time to interpret developments in the oil market and their implications for the macroeconomy, circumventing the problem of publication lags that plagues monthly data used in workhorse SVAR models. This method proves particularly valuable for monetary policymakers at times when macroeconomic conditions evolve rapidly, like during the COVID-19 pandemic or the invasion of Ukraine by Russia. |
Keywords: | oil prices, VAR, real time, monetary policy |
JEL: | Q43 C32 E32 C53 |
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:bdi:wptemi:td_1448_24&r=ecm |