nep-ecm New Economics Papers
on Econometrics
Issue of 2021‒03‒15
35 papers chosen by
Sune Karlsson
Örebro universitet

  1. Uniform Inference after Pretesting for Exogeneity with Heteroskedastic Data By Doko Tchatoka, Firmin; Wang, Wenjie
  2. Theory of Evolutionary Spectra for Heteroskedasticity and Autocorrelation Robust Inference in Possibly Misspecified and Nonstationary Models By Alessandro Casini
  3. Minimax MSE Bounds and Nonlinear VAR Prewhitening for Long-Run Variance Estimation Under Nonstationarity By Alessandro Casini; Pierre Perron
  4. Dynamic covariate balancing: estimating treatment effects over time By Davide Viviano; Jelena Bradic
  5. Extension of the Lagrange multiplier test for error cross-section independence to large panels with non normal errors By Zhaoyuan Li; Jianfeng Yao
  6. Improved Estimation of Dynamic Models of Conditional Means and Variances By Wang, Weining; Wooldridge, Jeffrey M.; Xu, Mengshan
  7. Estimating the causal effect of an intervention in a time series setting: the C-ARIMA approach By Fiammetta Menchetti; Fabrizio Cipollini; Fabrizia Mealli
  8. High-dimensional estimation of quadratic variation based on penalized realized variance By Kim Christensen; Mikkel Slot Nielsen; Mark Podolskij
  9. The Kernel Trick for Nonlinear Factor Modeling By Varlam Kutateladze
  10. Causal inference with misspecified exposure mappings By Fredrik S\"avje
  11. Bayes estimates of multimodal density features using DNA and Economic Data By Nalan Basturk; Lennart Hoogerheide; Herman K. van Dijk
  12. A Bayesian Graphical Approach for Large-Scale Portfolio Management with Fewer Historical Data By Sakae Oya
  13. A combinatorial optimization approach to scenario filtering in portfolio selection By Justo Puerto; Federica Ricca; Mois\'es Rodr\'iguez-Madrena; Andrea Scozzari
  14. Kernel Estimation: the Equivalent Spline Smoothing Method By Härdle, Wolfgang Karl; Nussbaum, Michael
  15. Factor-Based Imputation of Missing Values and Covariances in Panel Data of Large Dimensions By Ercument Cahan; Jushan Bai; Serena Ng
  16. Theory of Low Frequency Contamination from Nonstationarity and Misspecification: Consequences for HAR Inference By Alessandro Casini; Taosong Deng; Pierre Perron
  17. Simultaneous Bandwidths Determination for DK-HAC Estimators and Long-Run Variance Estimation in Nonparametric Settings By Federico Belotti; Alessandro Casini; Leopoldo Catania; Stefano Grassi; Pierre Perron
  18. Non-Parametric Estimation of Spot Covariance Matrix with High-Frequency Data By Mustafayeva, Konul; Wang, Weining
  19. Bootstrap Inference for Partially Linear Model with Many Regressors By Wang, Wenjie
  20. A penalized two-pass regression to predict stock returns with time-varying risk premia By Gaetan Bakalli; Stéphane Guerrier; Olivier Scaillet
  21. Dynamic Spatial Network Quantile Autoregression By Xu, Xiu; Wang, Weining; Shin, Yongcheol
  22. Improved Estimation of Poisson Rate Distributions through a Multi-Mode Survey Design By Marcin Hitczenko
  23. Confronting Machine Learning With Financial Research By Kristof Lommers; Ouns El Harzli; Jack Kim
  24. Standing on the Shoulders of Machine Learning: Can We Improve Hypothesis Testing? By Gary Cornwall; Jeff Chen; Beau Sauley
  25. Are Most Published Research Findings False In A Continuous Universe? By Neves, Kleber; Tan, Pedro Batista; Amaral, Olavo Bohrer
  26. Prediction of Attrition in Large Longitudinal Studies: Tree-based methods versus Multinomial Logistic Models By Best, Katherine Laura; Speyer, Lydia Gabriela; Murray, Aja Louise; Ushakova, Anastasia
  27. A new criterion for selecting valid instruments By Ratbek Dzhumashev; Ainura Tursunalieva
  28. A data-driven P-spline smoother and the P-Spline-GARCH models By Feng, Yuanhua; Härdle, Wolfgang Karl
  29. Are Bartik Regressions Always Robust to Heterogeneous Treatment Effects? By Cl\'ement de Chaisemartin; Ziteng Lei
  30. Extreme Value Statistics in Semi-Supervised Models By Ahmed, Hanan; Einmahl, John; Zhou, Chen
  31. Some Finite Sample Properties of the Sign Test By Yong Cai
  32. Testing for Granger Causality in Quantiles Between the Wage Share and Capacity Utilization By André M. Marques; Gilberto Tadeu Lima
  33. Local asymptotic normality of general conditionally heteroskedastic and score-driven time-series models By Francq, Christian; Zakoian, Jean-Michel
  34. The money-inflation nexus revisited By Leopold Ringwald; Thomas O. Zörner
  35. Discrete-Continuous Dynamic Choice Models: Identification and Conditional Choice Probability Estimation By Bruneel-Zupanc, Christophe Alain

  1. By: Doko Tchatoka, Firmin; Wang, Wenjie
    Abstract: Pretesting for exogeneity has become routine in many empirical applications involving instrumental variables (IVs) to decide whether the ordinary least squares (OLS) or the two-stage least squares (2SLS) method is appropriate. Guggenberger (2010) shows that the second-stage t-test – based on the outcome of a Durbin-Wu-Hausman type pretest for exogeneity in the first-stage – has extreme size distortion with asymptotic size equal to 1 when the standard asymptotic critical values are used. In this paper, we first show that the standard wild bootstrap procedures (with either independent or dependent draws of disturbances) are not viable solutions to this extreme size-distortion problem. We then propose a novel hybrid bootstrap approach, which combines the residual-based bootstrap with an adjusted Bonferroni size-correction method. We establish uniform validity of this hybrid bootstrap under conditional heteroskedasticity in the sense that it yields a two-stage test with correct asymptotic size. Monte Carlo simulations confirm our theoretical findings. In particular, our proposed hybrid method achieves remarkable power gains over the 2SLS-based t-test, especially when IVs are not very strong.
    Keywords: DWH Pretest; Instrumental Variable; Asymptotic Size; Bootstrap; Bonferroni-based Size-correction; Uniform Inference.
    JEL: C12 C13 C26
    Date: 2021–03–04
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:106408&r=all
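    A minimal Python sketch of the standard pretest-then-estimate routine whose size properties the paper studies (a control-function Durbin-Wu-Hausman pretest with heteroskedasticity-robust standard errors, followed by OLS or 2SLS) — not the authors' hybrid bootstrap; the simulated data and all names are illustrative:
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(0)
      n = 500
      z = rng.normal(size=n)                                  # instrument
      u, v = rng.multivariate_normal([0, 0], [[1, .3], [.3, 1]], size=n).T
      x = 0.5 * z + v                                         # endogenous regressor
      y = 1.0 + 0.8 * x + u
      Z, X = sm.add_constant(z), sm.add_constant(x)

      # DWH-type pretest via the control-function regression: a robust t-test
      # on the first-stage residual tests exogeneity of x.
      v_hat = sm.OLS(x, Z).fit().resid
      cf = sm.OLS(y, np.column_stack([X, v_hat])).fit(cov_type="HC0")
      pretest_reject = abs(cf.tvalues[-1]) > 1.96

      if pretest_reject:
          # Just-identified 2SLS with an HC0-robust t-statistic, computed directly.
          A = np.linalg.inv(Z.T @ X)
          b = A @ Z.T @ y
          e = y - X @ b
          V = A @ (Z.T @ (Z * (e ** 2)[:, None])) @ A.T
          t_stat = b[1] / np.sqrt(V[1, 1])
      else:
          t_stat = sm.OLS(y, X).fit(cov_type="HC0").tvalues[1]
      print("pretest rejects exogeneity:", pretest_reject, "| second-stage t:", round(t_stat, 2))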
  2. By: Alessandro Casini
    Abstract: We develop a theory of evolutionary spectra for heteroskedasticity and autocorrelation robust (HAR) inference when the data may not satisfy second-order stationarity. Nonstationarity is a common feature of economic time series which may arise either from parameter variation or model misspecification. In such a context, the theories that support HAR inference are either not applicable or do not provide accurate approximations. HAR tests standardized by existing long-run variance estimators then may display size distortions and little or no power. This issue can be more severe for methods that use long bandwidths (i.e., fixed-b HAR tests). We introduce a class of nonstationary processes that have a time-varying spectral representation which evolves continuously except at a finite number of time points. We present an extension of the classical heteroskedasticity and autocorrelation consistent (HAC) estimators that applies two smoothing procedures. One is over the lagged autocovariances, akin to classical HAC estimators, and the other is over time. The latter element is important to flexibly account for nonstationarity. We name them double kernel HAC (DK-HAC) estimators. We show the consistency of the estimators and obtain an optimal DK-HAC estimator under the mean squared error (MSE) criterion. Overall, HAR tests standardized by the proposed DK-HAC estimators are competitive with fixed-b HAR tests, when the latter work well, with regards to size control even when there is strong dependence. Notably, in those empirically relevant situations in which previous HAR tests are undersized and have little or no power, the DK-HAC estimator leads to tests that have good size and power.
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.02981&r=all
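    A stylized Python sketch of the double-smoothing structure described above (one kernel over lags, one over time), intended only to convey the idea; it is not the paper's DK-HAC estimator, and the kernels and bandwidths are arbitrary:
      import numpy as np

      def dk_lrv_sketch(x, lag_bw=8, time_bw=0.1):
          """Double-kernel long-run variance sketch for a scalar series."""
          x = np.asarray(x, dtype=float)
          T = len(x)
          x = x - x.mean()
          tt = np.arange(T) / T
          lrv = 0.0
          for k in range(-lag_bw, lag_bw + 1):
              w_lag = max(0.0, 1.0 - abs(k) / (lag_bw + 1))   # Bartlett weight in the lag direction
              if w_lag == 0.0:
                  continue
              a = abs(k)
              prod = x[a:] * x[: T - a]                       # products at lag |k|
              s = tt[a:]                                      # time points of those products
              # Local (time-smoothed) autocovariance at each reference point r,
              # then averaged over r -- the "smoothing over time" step.
              local = np.empty(T)
              for i, r in enumerate(tt):
                  w_time = np.exp(-0.5 * ((s - r) / time_bw) ** 2)
                  local[i] = np.sum(w_time * prod) / np.sum(w_time)
              lrv += w_lag * local.mean()
          return lrv

      rng = np.random.default_rng(1)
      e = rng.normal(size=400)
      x = np.empty(400)
      x[0] = e[0]
      for t in range(1, 400):                                 # AR(1) with a variance break at mid-sample
          scale = 1.0 if t < 200 else 2.0
          x[t] = 0.5 * x[t - 1] + scale * e[t]
      print("double-kernel LRV sketch:", round(dk_lrv_sketch(x), 2))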
  3. By: Alessandro Casini; Pierre Perron
    Abstract: We establish new mean-squared error (MSE) bounds for long-run variance (LRV) estimation, valid for both stationary and nonstationary sequences, that are sharper than those previously established. The key element in constructing such bounds is the use of restrictions on the degree of nonstationarity. Unlike previously established bounds, whether derived under stationarity or nonstationarity, the new bounds show how nonstationarity influences the bias-variance trade-off and depend on the form of nonstationarity. The bounds are established for double kernel long-run variance estimators. The corresponding bounds for classical long-run variance estimators follow as a special case. We use them to construct new data-dependent methods for the selection of bandwidths for (double) kernel heteroskedasticity autocorrelation consistent (DK-HAC) estimators. These account more flexibly for nonstationarity. The new MSE bounds and associated bandwidths help to improve finite-sample performance, especially power, when existing LRV estimators lead to tests having little or no power. The second contribution is to introduce a nonparametric nonlinear VAR prewhitened LRV estimator. This accounts explicitly for nonstationarity, unlike previous prewhitening procedures which are known to be unstable. Its consistency, rate of convergence and MSE bounds are established. The prewhitened DK-HAC estimators lead to tests with good finite-sample size while maintaining good monotonic power.
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.02235&r=all
  4. By: Davide Viviano; Jelena Bradic
    Abstract: This paper discusses the problem of estimation and inference on time-varying treatments. We propose a method for inference on treatment histories, by introducing a \textit{dynamic} covariate balancing method. Our approach allows for (i) treatments to propagate arbitrarily over time; (ii) non-stationarity and heterogeneity of treatment effects; (iii) high-dimensional covariates, and (iv) unknown propensity score functions. We study the asymptotic properties of the estimator, and we showcase the parametric convergence rate of the proposed procedure. We illustrate in simulations and an empirical application the advantage of the method over state-of-the-art competitors.
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.01280&r=all
  5. By: Zhaoyuan Li; Jianfeng Yao
    Abstract: This paper reexamines the seminal Lagrange multiplier test for cross-section independence in a large panel model where both the number of cross-sectional units n and the number of time series observations T can be large. The first contribution of the paper is an enlargement of the test with two extensions: first, the new asymptotic normality is derived in a simultaneous limiting scheme where the two dimensions (n, T) tend to infinity with comparable magnitudes; second, the result is valid for general error distributions (not necessarily normal). The second contribution of the paper is a new test statistic based on the sum of the fourth powers of cross-section correlations from OLS residuals, instead of their squares used in the Lagrange multiplier statistic. This new test is generally more powerful, and the improvement is particularly visible against alternatives with weak or sparse cross-section dependence. Both a simulation study and a real data analysis demonstrate the advantages of the enlarged Lagrange multiplier test and the power-enhanced test in comparison with existing procedures.
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.06075&r=all
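    In Python, the two building blocks compared above are easy to compute from the T x n matrix of OLS residuals; the scaling of the LM version below is the standard one, while the fourth-power sum is left unscaled because its proper centering and variance are derived in the paper:
      import numpy as np

      def cross_section_stats(resid):
          """resid: T x n matrix of OLS residuals (one column per cross-section unit)."""
          T, n = resid.shape
          R = np.corrcoef(resid, rowvar=False)            # n x n residual correlations
          rho = R[np.triu_indices(n, k=1)]                # pairwise correlations, i < j
          lm = T * np.sum(rho ** 2)                       # classical LM statistic
          lm_scaled = np.sum(T * rho ** 2 - 1) / np.sqrt(n * (n - 1))   # standardized version
          fourth_power_sum = np.sum(rho ** 4)             # building block of the new test
          return lm, lm_scaled, fourth_power_sum

      rng = np.random.default_rng(2)
      resid = rng.normal(size=(100, 30))                  # independent units: no cross-section dependence
      print(cross_section_stats(resid))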
  6. By: Wang, Weining; Wooldridge, Jeffrey M.; Xu, Mengshan
    Abstract: Modelling dynamic conditional heteroscedasticity is the daily routine in time series econometrics. We propose a weighted conditional moment estimation to potentially improve the efficiency of the QMLE (quasi maximum likelihood estimation). The weights of conditional moments are selected based on the analytical form of optimal instruments, and we nominally decide the optimal instrument based on the third and fourth moments of the underlying error term. This approach is motivated by the idea of generalized estimating equations (GEE). We also provide an analysis of the efficiency of QMLE for the location and variance parameters. Simulations and applications are conducted to demonstrate the improved performance of our estimators.
    JEL: C00
    Date: 2020
    URL: http://d.repec.org/n?u=RePEc:zbw:irtgdp:2020021&r=all
  7. By: Fiammetta Menchetti; Fabrizio Cipollini; Fabrizia Mealli
    Abstract: The potential outcomes approach to causal inference, most commonly known as the Rubin Causal Model (RCM), is a framework that allows to define the causal effect of a treatment (or "intervention") as a contrast of potential outcomes, to discuss assumptions enabling to identify such causal effects from available data, as well as to develop methods for estimating causal effects under these assumptions. In recent years, several methods have been developed under the RCM to estimate causal effects in time series settings. However, none of these make use of ARIMA models, which are instead very common in the econometrics literature. In this paper, we propose a novel approach, C-ARIMA, to define and estimate the causal effect of an intervention in a time series setting under the RCM. We check the validity of the proposed method with an extensive simulation study, comparing its performance against a standard intervention analysis approach. In the empirical application, we use C-ARIMA to assess the causal effect of a new price policy on supermarket sales.
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.06740&r=all
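    A minimal Python sketch of the counterfactual-forecast logic behind this kind of intervention analysis (fit an ARIMA on pre-intervention data, forecast the post-intervention path, take observed minus forecast); it is not the C-ARIMA procedure itself, and the simulated data and ARIMA order are illustrative:
      import numpy as np
      from statsmodels.tsa.arima.model import ARIMA

      rng = np.random.default_rng(3)
      T0, T1 = 200, 30                       # pre- and post-intervention lengths
      e = rng.normal(size=T0 + T1)
      y = np.zeros(T0 + T1)
      for t in range(1, T0 + T1):
          y[t] = 0.7 * y[t - 1] + e[t]
      y[T0:] += 2.0                          # a level shift caused by the "intervention"

      pre, post = y[:T0], y[T0:]
      fit = ARIMA(pre, order=(1, 0, 0)).fit()
      forecast = fit.forecast(steps=T1)      # counterfactual path absent the intervention

      effect = post - forecast               # pointwise causal-effect estimates
      print("average post-intervention effect:", round(effect.mean(), 2))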
  8. By: Kim Christensen; Mikkel Slot Nielsen; Mark Podolskij
    Abstract: In this paper, we develop a penalized realized variance (PRV) estimator of the quadratic variation (QV) of a high-dimensional continuous It\^{o} semimartingale. We adapt the principal idea of regularization from linear regression to covariance estimation in a continuous-time high-frequency setting. We show that under a nuclear norm penalization, the PRV is computed by soft-thresholding the eigenvalues of realized variance (RV). It therefore encourages sparsity of singular values or, equivalently, low rank of the solution. We prove our estimator is minimax optimal up to a logarithmic factor. We derive a concentration inequality, which reveals that the rank of PRV is -- with a high probability -- the number of non-negligible eigenvalues of the QV. Moreover, we also provide the associated non-asymptotic analysis for the spot variance. We suggest an intuitive data-driven bootstrap procedure to select the shrinkage parameter. Our theory is supplemented by a simulation study and an empirical application. The PRV detects about three to five factors in the equity market, with a notable rank decrease during times of distress in financial markets. This is consistent with most standard asset pricing models, where a limited amount of systematic factors driving the cross-section of stock returns are perturbed by idiosyncratic errors, rendering the QV -- and also RV -- of full rank.
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.03237&r=all
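    The soft-thresholding step described above can be sketched in a few lines of Python: form the realized variance from high-frequency returns, shrink its eigenvalues towards zero and rebuild; the shrinkage level below is an arbitrary placeholder for the data-driven bootstrap choice proposed in the paper, and the data are simulated:
      import numpy as np

      def penalized_realized_variance(returns, lam):
          """returns: m x d matrix of high-frequency returns; lam: shrinkage level."""
          rv = returns.T @ returns                       # realized variance (d x d)
          vals, vecs = np.linalg.eigh(rv)
          vals = np.maximum(vals - lam, 0.0)             # soft-threshold the eigenvalues
          return vecs @ np.diag(vals) @ vecs.T

      rng = np.random.default_rng(4)
      m, d, k = 390, 50, 3
      factors = rng.normal(scale=0.01, size=(m, k))      # low-rank (factor) component
      loadings = rng.normal(size=(k, d))
      returns = factors @ loadings + rng.normal(scale=0.001, size=(m, d))
      prv = penalized_realized_variance(returns, lam=0.005)
      print("rank of PRV:", np.linalg.matrix_rank(prv))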
  9. By: Varlam Kutateladze
    Abstract: Factor modeling is a powerful statistical technique that makes it possible to capture the common dynamics in a large panel of data with a few latent variables, or factors, thus alleviating the curse of dimensionality. Despite its popularity and widespread use for various applications ranging from genomics to finance, this methodology has predominantly remained linear. This study estimates factors nonlinearly through the kernel method, which allows flexible nonlinearities while still avoiding the curse of dimensionality. We focus on factor-augmented forecasting of a single time series in a high-dimensional setting, known as diffusion index forecasting in the macroeconomics literature. Our main contribution is twofold. First, we show that the proposed estimator is consistent and that it nests the linear PCA estimator as well as some nonlinear estimators introduced in the literature as specific examples. Second, our empirical application to a classical macroeconomic dataset demonstrates that this approach can offer substantial advantages over mainstream methods.
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.01266&r=all
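    A compact Python sketch of the kernelized diffusion-index idea (extract a few kernel principal components from a large predictor panel, then use them in a one-step-ahead forecasting regression); the kernel, the number of factors and the simulated data are illustrative choices, not those of the paper:
      import numpy as np
      from sklearn.decomposition import KernelPCA
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(5)
      T, N, r = 240, 120, 3
      F = rng.normal(size=(T, r))                              # latent factors
      X = F @ rng.normal(size=(r, N)) + rng.normal(scale=0.5, size=(T, N))
      # Target depends nonlinearly on the current factors: y_next[t] plays y_{t+1}.
      y_next = F[:, 0] ** 2 + 0.5 * F[:, 1] + rng.normal(scale=0.3, size=T)

      kpca = KernelPCA(n_components=4, kernel="rbf", gamma=1.0 / N)
      factors = kpca.fit_transform(X)                          # nonlinear factor estimates
      reg = LinearRegression().fit(factors[:-1], y_next[:-1])  # regress y_{t+1} on factors at t
      forecast = reg.predict(factors[[-1]])                    # forecast for the period after the sample
      print("one-step-ahead forecast:", round(float(forecast[0]), 3))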
  10. By: Fredrik S\"avje
    Abstract: Exposure mappings facilitate investigations of complex causal effects when units interact in experiments. Current methods assume that the exposures are correctly specified, but such an assumption cannot be verified, and its validity is often questionable. This paper describes conditions under which one can draw inferences about exposure effects when the exposures are misspecified. The main result is a proof of consistency under mild conditions on the errors introduced by the misspecification. The rate of convergence is determined by the dependence between units' specification errors, and consistency is achieved even if the errors are large as long as they are sufficiently weakly dependent. In other words, exposure effects can be precisely estimated also under misspecification as long as the units' exposures are not misspecified in the same way. The limiting distribution of the estimator is discussed. Asymptotic normality is achieved under stronger conditions than those needed for consistency. Similar conditions also facilitate conservative variance estimation.
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.06471&r=all
  11. By: Nalan Basturk (Maastricht University); Lennart Hoogerheide (Vrije Universiteit Amsterdam); Herman K. van Dijk (Erasmus University Rotterdam)
    Abstract: In several scientific fields, like bioinformatics, financial and macro-economics, important theoretical and practical issues exist that involve multimodal data distributions. We propose a Bayesian approach using mixture distributions to approximate such data distributions accurately. Shape and other features of the mixture approximations are estimated, including their uncertainty. For discrete data, we introduce a novel mixture of shifted Poisson distributions with an unknown number of components, which overcomes the equidispersion restriction of the standard Poisson and accommodates a wide range of shapes such as multimodality and long tails. Our simulation-based Bayesian inference treats the density features as random variables, and highest credibility regions around features are easily obtained. For discrete data we develop an adapted version of the Reversible Jump Markov Chain Monte Carlo (RJMCMC) method, which allows for an unknown number of components instead of the more restrictive approach of choosing a particular number of mixture components using information criteria. Using simulated data, we show that our approach works successfully for three issues that one encounters during the estimation of mixtures: label switching; mixture complexity and prior information; and mode membership versus component membership. The proposed method is applied to three empirical data sets: the count data method yields a novel perspective on the DNA tandem repeats data of \cite{DNA_leiden}; the bimodal distribution of payment details of clients obtaining a loan from a financial institution in Spain in 1990 gives insight into the repayment ability of individual clients; and the distribution of the modes of real GDP growth data from the Penn World Tables and their evolution over time explores possible world-wide economic convergence as well as group convergence between the US and European countries. The results of our descriptive analysis may be used as input for forecasting and policy analysis.
    Keywords: Multimodality, mixtures, Markov Chain Monte Carlo, Bayesian Inference
    JEL: C11 C14 C63
    Date: 2021–02–10
    URL: http://d.repec.org/n?u=RePEc:tin:wpaper:20210017&r=all
  12. By: Sakae Oya
    Abstract: Managing a large-scale portfolio with many assets is one of the most challenging tasks in the field of finance. It is partly because estimation of either the covariance or precision matrix of asset returns tends to be unstable or even infeasible when the number of assets $p$ exceeds the number of observations $n$. For this reason, most of the previous studies on portfolio management have focused on the case of $p < n$. To handle the case of $p > n$, we propose to use a new Bayesian framework based on adaptive graphical LASSO for estimating the precision matrix of asset returns in a large-scale portfolio. Unlike the previous studies on graphical LASSO in the literature, our approach utilizes a Bayesian estimation method for the precision matrix proposed by Oya and Nakatsuma (2020) so that the positive definiteness of the precision matrix is always guaranteed. As an empirical application, we construct the global minimum variance portfolio of $p=100$ for various values of $n$ with the proposed approach as well as the non-Bayesian graphical LASSO approach, and compare their out-of-sample performance with the equal weight portfolio as the benchmark. In this comparison, the proposed approach produces more stable results than the non-Bayesian approach in terms of Sharpe ratio, portfolio composition and turnover. Furthermore, the proposed approach succeeds in estimating the precision matrix even if $n$ is much smaller than $p$, whereas the non-Bayesian approach fails to do so.
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.05880&r=all
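    For the non-Bayesian graphical LASSO benchmark mentioned in the abstract, a Python sketch is short: estimate the precision matrix with scikit-learn's GraphicalLasso and plug it into the global-minimum-variance weights $w = \Omega 1 / (1'\Omega 1)$. The penalty level and simulated returns are illustrative, and the paper's Bayesian adaptive estimator is not reproduced here:
      import numpy as np
      from sklearn.covariance import GraphicalLasso

      rng = np.random.default_rng(6)
      n, p = 60, 100                                  # fewer observations than assets (p > n)
      common = rng.normal(size=(n, 3)) @ rng.normal(size=(3, p))
      returns = common + rng.normal(size=(n, p))      # standardized-scale returns with a factor structure

      gl = GraphicalLasso(alpha=0.5, max_iter=500).fit(returns)
      omega = gl.precision_                           # regularized precision matrix
      ones = np.ones(p)
      w = omega @ ones / (ones @ omega @ ones)        # global minimum variance weights
      print("weights sum to one:", round(w.sum(), 6), "| max |weight|:", round(np.abs(w).max(), 4))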
  13. By: Justo Puerto; Federica Ricca; Mois\'es Rodr\'iguez-Madrena; Andrea Scozzari
    Abstract: Recent studies stressed the fact that covariance matrices computed from empirical financial time series appear to contain a high amount of noise. This makes the classical Markowitz Mean-Variance Optimization model unable to correctly evaluate the performance associated to selected portfolios. Since the Markowitz model is still one of the most used practitioner-oriented tool, several filtering methods have been proposed in the literature to fix the problem. Among them, the two most promising ones refer to the Random Matrix Theory or to the Power Mapping strategy. The basic idea of these methods is to transform the correlation matrix maintaining the Mean-Variance Optimization model. However, experimental analysis shows that these two strategies are not adequately effective when applied to real financial datasets. In this paper we propose an alternative filtering method based on Combinatorial Optimization. We advance a new Mixed Integer Quadratic Programming model to filter those observations that may influence the performance of a portfolio in the future. We discuss the properties of this new model and we test it on some real financial datasets. We compare the out-of-sample performance of our portfolios with the one of the portfolios provided by the two above mentioned alternative strategies. We show that our method outperforms them. Although our model can be solved efficiently with standard optimization solvers the computational burden increases for large datasets. To overcome this issue we also propose a heuristic procedure that empirically showed to be both efficient and effective.
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.01123&r=all
  14. By: Härdle, Wolfgang Karl; Nussbaum, Michael
    Abstract: Among nonparametric smoothers, there is a well-known correspondence between kernel and Fourier series methods, pivoted by the Fourier transform of the kernel. This suggests a similar relationship between kernel and spline estimators. A known special case is the result of Silverman (1984) on the effective kernel for the classical Reinsch-Schoenberg smoothing spline in the nonparametric regression model. We present an extension by showing that a large class of kernel estimators have a spline equivalent, in the sense of identical asymptotic local behaviour of the weighting coefficients. This general class of spline smoothers includes also the minimax linear estimator over Sobolev ellipsoids. The analysis is carried out for piecewise linear splines and equidistant design.
    Keywords: Kernel estimator,spline smoothing,filtering coefficients,differential operator,Green's function approximation,asymptotic minimax spline
    JEL: C00
    Date: 2020
    URL: http://d.repec.org/n?u=RePEc:zbw:irtgdp:2020010&r=all
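    For readers unfamiliar with the Silverman (1984) result cited above, the effective (equivalent) kernel of the classical cubic smoothing spline, which this correspondence generalizes, has the closed form
      $K(u) = \tfrac{1}{2}\exp\!\big(-|u|/\sqrt{2}\big)\sin\!\big(|u|/\sqrt{2} + \pi/4\big)$,
    with a local bandwidth that varies (up to constants and the normalization of $\lambda$) like $(\lambda / f(x))^{1/4}$, where $\lambda$ is the smoothing parameter and $f$ the design density.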
  15. By: Ercument Cahan; Jushan Bai; Serena Ng
    Abstract: Economists are blessed with a wealth of data for analysis, but more often than not, values in some entries of the data matrix are missing. Various methods have been proposed to impute missing observations with iterative estimates of their unconditional or conditional means. We exploit the factor structure in panel data of large dimensions. We first use a series of projections of variable specific length to impute the common component associated with missing observations and show that this immediately yields a consistent estimate without further iteration. But setting the idiosyncratic errors to zero will underestimate variability. Hence in a second imputation, we inject the missing idiosyncratic noise by resampling to obtain a consistent estimator for the covariance matrix, which plays an important role in risk management. Simulations calibrated to CRSP returns over the sample 1990-2018 are used to show that the double imputation methodology significantly improves various performance measures over single imputation. Implications for using ...
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.03045&r=all
  16. By: Alessandro Casini; Taosong Deng; Pierre Perron
    Abstract: We establish theoretical and analytical results about the low frequency contamination induced by general nonstationarity for estimates such as the sample autocovariance and the periodogram, and deduce consequences for heteroskedasticity and autocorrelation robust (HAR) inference. We show that for short memory nonstationary data these estimates exhibit features akin to long memory. We present explicit expressions for the asymptotic bias of these estimates. This bias increases with the degree of heterogeneity in the data and is responsible for generating low frequency contamination, or simply for making the time series exhibit long memory features. The sample autocovariances display hyperbolic rather than exponential decay while the periodogram becomes unbounded near the origin. We distinguish cases where this contamination only occurs as a small-sample problem and cases where the contamination continues to hold asymptotically. We show theoretically that nonparametric smoothing over time is robust to low frequency contamination, in that the sample local autocovariance and the local periodogram are unlikely to exhibit long memory features. Simulations confirm that our theory provides useful approximations. Since the autocovariances and the periodogram are key elements for HAR inference, our results provide new insights on the debate between consistent and inconsistent (i.e., small versus long/fixed-b bandwidth) long-run variance (LRV) estimation-based inference.
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.01604&r=all
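    The long-memory-like behaviour described above is easy to reproduce in a few lines of Python: white noise with a neglected mean shift has sample autocorrelations that stay away from zero even at long lags (illustrative parameters):
      import numpy as np

      def sample_acf(x, max_lag):
          x = x - x.mean()
          denom = np.dot(x, x)
          return np.array([np.dot(x[k:], x[:len(x) - k]) / denom for k in range(1, max_lag + 1)])

      rng = np.random.default_rng(7)
      T = 1000
      noise = rng.normal(size=T)
      shifted = noise + np.where(np.arange(T) < T // 2, 0.0, 1.5)   # neglected mean shift

      print("ACF of white noise, lags 1-5:   ", np.round(sample_acf(noise, 5), 2))
      print("ACF with a mean shift, lags 1-5:", np.round(sample_acf(shifted, 5), 2))
      # The shifted series stays far from zero even at long lags: spurious "long memory".
      print("ACF with a mean shift, lag 50:  ", round(sample_acf(shifted, 50)[-1], 2))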
  17. By: Federico Belotti; Alessandro Casini; Leopoldo Catania; Stefano Grassi; Pierre Perron
    Abstract: We consider the derivation of data-dependent simultaneous bandwidths for double kernel heteroskedasticity and autocorrelation consistent (DK-HAC) estimators. In addition to the usual smoothing over lagged autocovariances for classical HAC estimators, the DK-HAC estimator also applies smoothing over the time direction. We obtain the optimal bandwidths that jointly minimize the global asymptotic MSE criterion and discuss the trade-off between bias and variance with respect to smoothing over lagged autocovariances and over time. Unlike the MSE results of Andrews (1991), we establish how nonstationarity affects the bias-variance trade-off. We use the plug-in approach to construct data-dependent bandwidths for the DK-HAC estimators and compare them with the DK-HAC estimators from Casini (2021) that use data-dependent bandwidths obtained from a sequential MSE criterion. The former performs better in terms of size control, especially with stationary and close to stationary data. Finally, we consider long-run variance estimation under the assumption that the series is a function of a nonparametric estimator rather than of a semiparametric estimator that enjoys the usual T^(1/2) rate of convergence. Thus, we also establish the validity of consistent long-run variance estimation in nonparametric parameter estimation settings.
    Date: 2021–02
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.00060&r=all
  18. By: Mustafayeva, Konul; Wang, Weining
    Abstract: Estimating spot covariance is an important issue to study, especially with the increasing availability of high-frequency financial data. We study the estimation of spot covariance using a kernel method for high-frequency data. In particular, we consider first the kernel weighted version of realized covariance estimator for the price process governed by a continuous multivariate semimartingale. Next, we extend it to the threshold kernel estimator of the spot covariances when the underlying price process is a discontinuous multivariate semimartingale with finite activity jumps. We derive the asymptotic distribution of the estimators for both fixed and shrinking bandwidth. The estimator in a setting with jumps has the same rate of convergence as the estimator for diffusion processes without jumps. A simulation study examines the finite sample properties of the estimators. In addition, we study an application of the estimator in the context of covariance forecasting. We discover that the forecasting model with our estimator outperforms a benchmark model in the literature.
    Keywords: high-frequency data,kernel estimation,jump,forecasting covariance matrix
    JEL: C00
    Date: 2020
    URL: http://d.repec.org/n?u=RePEc:zbw:irtgdp:2020025&r=all
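    A stripped-down Python sketch of the kernel-weighted estimator in the continuous (no-jump) case: weight the outer products of high-frequency returns by a kernel centred at the time of interest. The jump threshold, bandwidth choice and asymptotics are in the paper; the simulated data are illustrative:
      import numpy as np

      def spot_cov_kernel(returns, times, tau, h):
          """returns: m x d matrix of intraday returns observed at 'times' (in [0,1])."""
          u = (times - tau) / h
          w = np.where(np.abs(u) <= 1, 0.75 * (1 - u ** 2), 0.0)   # Epanechnikov weights
          w = w / (w.sum() * (times[1] - times[0]))                # weights integrate to one over time
          return (returns * w[:, None]).T @ returns

      rng = np.random.default_rng(8)
      m, d = 2340, 3                                               # e.g. 10-second returns, 3 assets
      dt = 1.0 / m
      times = np.arange(m) * dt
      vol = 0.2 + 0.1 * np.sin(2 * np.pi * times)                  # time-varying volatility
      returns = vol[:, None] * rng.normal(size=(m, d)) * np.sqrt(dt)
      print(np.round(spot_cov_kernel(returns, times, tau=0.5, h=0.05), 4))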
  19. By: Wang, Wenjie
    Abstract: In this note, for the case that the disturbances are conditionally homoskedastic, we show that a properly re-scaled residual bootstrap procedure is able to consistently estimate the limiting distribution of a series estimator in the partially linear model even when the number of regressors is of the same order as the sample size. Monte Carlo simulations show that the bootstrap procedure has better finite sample performance than asymptotic approximations when the sample size is small and the number of regressors is close to the sample size.
    Keywords: Bootstrap approximation, Partially linear model, Many regressors asymptotics
    JEL: C12 C26
    Date: 2021–03–03
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:106391&r=all
  20. By: Gaetan Bakalli (University of Geneva); Stéphane Guerrier (University of Geneva); Olivier Scaillet (University of Geneva and Swiss Finance Institute)
    Abstract: We develop a penalized two-pass regression with time-varying factor loadings. The penalization in the first pass enforces sparsity for the time-variation drivers while also maintaining compatibility with the no arbitrage restrictions by regularizing appropriate groups of coefficients. The second pass delivers risk premia estimates to predict equity excess returns. Our Monte Carlo results and our empirical results on a large cross-sectional data set of US individual stocks show that penalization without grouping can lead to nearly all estimated time-varying models violating the no arbitrage restrictions. Moreover, our results demonstrate that the proposed method reduces the prediction errors compared to a penalized approach without appropriate grouping or a time-invariant factor model.
    Keywords: two-pass regression, predictive modeling, large panel, factor model, LASSO penalization.
    JEL: C13 C23 C51 C52 C53 C55 C58 G12 G17
    Date: 2021–01
    URL: http://d.repec.org/n?u=RePEc:chf:rpseri:rp2109&r=all
  21. By: Xu, Xiu; Wang, Weining; Shin, Yongcheol
    Abstract: This paper proposes a dynamic spatial autoregressive quantile model. Using predetermined network information, we study dynamic tail event driven risk using a system of conditional quantile equations. Extending Zhu, Wang, Wang and Härdle (2019), we allow the contemporaneous dependency of nodal responses by incorporating a spatial lag in our model. For example, this is to allow a firm’s tail behavior to be connected with a weighted aggregation of the simultaneous returns of the other firms. In addition, we control for the common factor effects. The instrumental variable quantile regressive method is used for our model estimation, and the associated asymptotic theory for estimation is also provided. Simulation results show that our model performs well at various quantile levels with different network structures, especially when the node size increases. Finally, we illustrate our method with an empirical study. We uncover significant network effects in the spatial lag among financial institutions.
    Keywords: Network,Quantile autoregression,Instrumental variables,Dynamic models
    JEL: C32 C51 G17
    Date: 2020
    URL: http://d.repec.org/n?u=RePEc:zbw:irtgdp:2020024&r=all
  22. By: Marcin Hitczenko
    Abstract: Researchers interested in studying the frequency of events or behaviors among a population must rely on count data provided by sampled individuals. Often, this involves a decision between live event counting, such as a behavioral diary, and recalled aggregate counts. Diaries are generally more accurate, but their greater cost and respondent burden generally yield less data. The choice of survey mode, therefore, involves a potential tradeoff between bias and variance of estimators. I use a case study comparing inferences about payment instrument use based on different survey designs to illustrate this dilemma. I then use a simulation study to show how and under what conditions a hybrid survey design can improve efficiency of estimation, in terms of mean-squared error. Overall, this work suggests that such a hybrid design can have considerable benefits as long as there is nontrivial overlap in the diary and recall samples.
    Keywords: recall surveys; diaries; bias; mean-squared error; multi-level models
    JEL: C15 C81 C83
    Date: 2021–02–03
    URL: http://d.repec.org/n?u=RePEc:fip:fedawp:90080&r=all
  23. By: Kristof Lommers; Ouns El Harzli; Jack Kim
    Abstract: This study aims to examine the challenges and applications of machine learning for financial research. Machine learning algorithms have been developed for certain data environments which substantially differ from the one we encounter in finance. Not only do difficulties arise due to some of the idiosyncrasies of financial markets, there is a fundamental tension between the underlying paradigm of machine learning and the research philosophy in financial economics. Given the peculiar features of financial markets and the empirical framework within social science, various adjustments have to be made to the conventional machine learning methodology. We discuss some of the main challenges of machine learning in finance and examine how these could be accounted for. Despite some of the challenges, we argue that machine learning could be unified with financial research to become a robust complement to the econometrician's toolbox. Moreover, we discuss the various applications of machine learning in the research process such as estimation, empirical discovery, testing, causal inference and prediction.
    Date: 2021–02
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.00366&r=all
  24. By: Gary Cornwall; Jeff Chen; Beau Sauley
    Abstract: In this paper we update the hypothesis testing framework by drawing upon modern computational power and classification models from machine learning. We show that a simple classification algorithm such as a boosted decision stump can be used to fully recover the size-power trade-off for any single test statistic. This recovery implies an equivalence, under certain conditions, between the basic building block of modern machine learning and hypothesis testing. Second, we show that more complex algorithms such as the random forest and gradient boosted machine can serve as mapping functions in place of the traditional null distribution. This allows multiple test statistics and other information to be evaluated simultaneously and thus form a pseudo-composite hypothesis test. Moreover, we show how practitioners can make explicit the relative costs of Type I and Type II errors to contextualize the test into a specific decision framework. To illustrate this approach we revisit the case of testing for unit roots, a difficult problem in time series econometrics for which existing tests are known to exhibit low power. Using a simulation framework common to the literature, we show that this approach can improve the overall accuracy of the traditional unit root test(s) by seventeen percentage points, and the sensitivity by thirty-six percentage points.
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.01368&r=all
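    The size-power recovery claimed above is easy to visualize in Python: simulate a test statistic under the null and under an alternative, fit a boosted decision stump on the two labelled samples, and read its ROC curve as (size, power) pairs. A Gaussian location example is sketched below (illustrative only, not the paper's unit-root application):
      import numpy as np
      from scipy.stats import norm
      from sklearn.ensemble import GradientBoostingClassifier
      from sklearn.metrics import roc_curve

      rng = np.random.default_rng(10)
      m = 5000
      stat_h0 = rng.normal(loc=0.0, size=m)          # test statistic under the null
      stat_h1 = rng.normal(loc=1.5, size=m)          # ... and under a fixed alternative
      X = np.concatenate([stat_h0, stat_h1]).reshape(-1, 1)
      y = np.concatenate([np.zeros(m), np.ones(m)])  # label 1 = drawn under the alternative

      clf = GradientBoostingClassifier(max_depth=1, n_estimators=100).fit(X, y)
      scores = clf.predict_proba(X)[:, 1]
      fpr, tpr, _ = roc_curve(y, scores)             # false positive rate = size, true positive rate = power

      idx = np.searchsorted(fpr, 0.05)               # read power off the curve at (roughly) 5% size
      print("power at 5% size from the stump ensemble:", round(float(tpr[idx]), 2))
      print("analytic power of the one-sided 5% test:  ", round(float(1 - norm.cdf(1.645 - 1.5)), 2))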
  25. By: Neves, Kleber; Tan, Pedro Batista; Amaral, Olavo Bohrer
    Abstract: Diagnostic screening models for the interpretation of null hypothesis significance test (NHST) results have been influential in highlighting the effect of selective publication on the reproducibility of the published literature, leading to John Ioannidis’ much-cited claim that most published research findings are false. These models, however, are typically based on the assumption that hypotheses are dichotomously true or false, without considering that effect sizes for different hypotheses are not the same. To address this limitation, we develop a simulation model that overcomes it by modeling effect sizes explicitly using different continuous distributions, while retaining other aspects of previous models such as publication bias and the pursuit of statistical significance. Our results show that the combination of selective publication, bias, low statistical power and unlikely hypotheses consistently leads to high proportions of false positives, irrespective of the effect size distribution assumed. Using continuous effect sizes also allows us to evaluate the degree of effect size overestimation and the prevalence of estimates with the wrong sign in the literature, showing that the same factors that drive false-positive results also lead to errors in estimating effect size direction and magnitude. Nevertheless, the relative influence of these factors on different metrics varies depending on the distribution assumed for effect sizes. The model is made available as an R ShinyApp interface, allowing one to explore features of the literature in various scenarios.
    Date: 2021–03–05
    URL: http://d.repec.org/n?u=RePEc:osf:metaar:jk7sa&r=all
  26. By: Best, Katherine Laura; Speyer, Lydia Gabriela; Murray, Aja Louise; Ushakova, Anastasia
    Abstract: Identifying predictors of attrition is essential for designing longitudinal studies such that attrition bias can be minimised, and for identifying the variables that can be used as auxiliary in statistical techniques to help correct for non-random drop-out. This paper provides a comparative overview of predictive techniques that can be used to model attrition and identify important risk factors that help in its prediction. Logistic regression and several tree-based machine learning methods were applied to Wave 2 dropout in an illustrative sample of 5000 individuals from a large UK longitudinal study, Understanding Society. Each method was evaluated based on accuracy, AUC-ROC, plausibility of key assumptions and interpretability. Our results suggest a 10% improvement in accuracy for random forest compared to logistic regression methods. However, given the differences in estimation procedures we suggest that both models could be used in conjunction to provide the most comprehensive understanding of attrition predictors.
    Date: 2021–03–02
    URL: http://d.repec.org/n?u=RePEc:osf:socarx:tyszr&r=all
  27. By: Ratbek Dzhumashev; Ainura Tursunalieva
    Abstract: This paper develops a new criterion for selecting a valid instrumental variable (IV). This criterion imposes directional restrictions on an instrument informed by the sign of the covariance between the error term and the endogenous variable, cov(X,u). We show that a valid IV for the case when cov(X,u) > 0 is not suitable for the case when cov(X,u) < 0.
    Keywords: two-stage least squares estimation, instrumental variable, validity test
    JEL: C26 C18
    Date: 2019–06
    URL: http://d.repec.org/n?u=RePEc:mos:moswps:2019-04&r=all
  28. By: Feng, Yuanhua; Härdle, Wolfgang Karl
    Abstract: Penalized spline smoothing of time series and its asymptotic properties are studied. A data-driven algorithm for selecting the smoothing parameter is developed. The proposal is applied to define a semiparametric extension of the well-known Spline-GARCH, called a P-Spline-GARCH, based on the log-data transformation of the squared returns. It is shown that the error process is then exponentially strong mixing with finite moments of all orders. Asymptotic normality of the P-spline smoother in this context is proved. Practical relevance of the proposal is illustrated by data examples and simulation. The proposal is further applied to value at risk and expected shortfall.
    Keywords: P-spline smoother,smoothing parameter selection,P-Spline-GARCH,strong mixing,value at risk,expected shortfall
    JEL: C14 C51
    Date: 2020
    URL: http://d.repec.org/n?u=RePEc:zbw:irtgdp:2020016&r=all
  29. By: Cl\'ement de Chaisemartin; Ziteng Lei
    Abstract: Bartik regressions use locations' differential exposure to nationwide sector-level shocks as an instrument to estimate the effect of a location-level treatment on an outcome. In the canonical Bartik design, locations' differential exposure to industry-level employment shocks are used as an instrument to measure the effect of their employment evolution on their wage evolution. Some recent papers studying Bartik designs have assumed that the sector-level shocks are exogenous and all have the same expectation. This second assumption may sometimes be implausible. For instance, there could be industries whose employment is more likely to grow than that of other industries. We replace that second assumption by parallel trends assumptions. Under our assumptions, Bartik regressions identify weighted sums of location-specific effects, with weights that may be negative. Accordingly, such regressions may be misleading in the presence of heterogeneous effects, an issue that was not present under the assumptions maintained in previous papers. Estimating the weights attached to Bartik regressions is a way to assess their robustness to heterogeneous effects. We also propose an alternative estimator that is robust to location-specific effects. Finally, we revisit two applications. In both cases, Bartik regressions have fairly large negative weights attached to them. Our alternative estimator is substantially different from the Bartik regression coefficient in one application.
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.06437&r=all
  30. By: Ahmed, Hanan (Tilburg University, Center For Economic Research); Einmahl, John (Tilburg University, Center For Economic Research); Zhou, Chen
    Keywords: Asymptotic normality; extreme value index; semi-supervised inference; tail dependence; variance reduction
    Date: 2021
    URL: http://d.repec.org/n?u=RePEc:tiu:tiucen:ad83a546-fb09-408e-80cc-b4b2db763d37&r=all
  31. By: Yong Cai
    Abstract: This paper contains two finite-sample results about the sign test. First, we show that the sign test is unbiased against two-sided alternatives even when observations are not identically distributed. Second, we provide simple theoretical counterexamples to show that correlation that is unaccounted for leads to size distortion and over-rejection. Our results have implication for practitioners, who are increasingly employing randomization tests for inference.
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2103.01412&r=all
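    For reference, the sign test discussed above is an exact binomial test on the number of positive observations; a minimal Python sketch with illustrative data:
      import numpy as np
      from scipy.stats import binom

      rng = np.random.default_rng(11)
      x = rng.normal(loc=0.3, size=40)              # observations; H0: median = 0
      n = np.sum(x != 0)                            # discard exact zeros
      k = np.sum(x > 0)                             # number of positive signs
      # Two-sided exact p-value: probability of a count at least as extreme as k.
      p_value = 2 * min(binom.cdf(k, n, 0.5), binom.sf(k - 1, n, 0.5))
      print("positives:", int(k), "of", int(n), "| sign-test p-value:", round(min(1.0, p_value), 4))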
  32. By: André M. Marques; Gilberto Tadeu Lima
    Abstract: This paper tests for Granger causality in quantiles between the wage share and capacity utilization in twelve advanced countries using annual data ranging from 1960 to 2019. Instead of focusing only on the conditional mean, we test for causality in the full conditional distribution of the variables of interest. This interestingly allows detecting causal relations in both the mean and the entire conditional distribution. Based on confidence intervals generated by bootstrap resampling and the Wald test for joint significance, our main statistically significant results are the following. Capacity utilization positively causes the wage share in seven out of the twelve sample countries. In these countries, the Granger causal effect of capacity utilization on the wage share is strong and heterogeneous across quantiles, it being larger for more extreme quantiles. Capacity utilization positively Granger causes the wage share in all conditional quantiles in the U.S. The wage share negatively Granger causes capacity utilization in most conditional quantiles in Spain. There is no significant Granger causality in either direction between capacity utilization and the wage share in Norway, Canada, Portugal, and Greece.
    Keywords: Granger causality in distribution; quantile regression; bootstrap resampling; wage share; capacity utilization
    JEL: C32 C12 E22 E25
    Date: 2021–03–08
    URL: http://d.repec.org/n?u=RePEc:spa:wpaper:2021wpecon03&r=all
  33. By: Francq, Christian; Zakoian, Jean-Michel
    Abstract: The paper establishes the Local Asymptotic Normality (LAN) property for general conditionally heteroskedastic time series models of multiplicative form, $\epsilon_t=\sigma_t(\theta_0)\eta_t$, where the volatility $\sigma_t(\theta_0)$ is a parametric function of $\{\epsilon_{s}, s < t\}$.
    Keywords: APARCH; Asymmetric Student-$t$ distribution; Beta-$t$-GARCH; Conditional heteroskedasticity; LAN in time series; Quadratic mean differentiability.
    JEL: C51
    Date: 2021
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:106542&r=all
  34. By: Leopold Ringwald (Department of Economics, Vienna University of Economics and Business); Thomas O. Zörner (Department of Economics, Vienna University of Economics and Business)
    Abstract: This paper proposes a Bayesian Logistic Smooth Transition Autoregressive (LSTAR) model with stochastic volatility (SV) to model inflation dynamics in a nonlinear fashion. Inflationary regimes are determined by smoothed money growth which serves as a transition variable that governs the transition between regimes. We apply this approach on quarterly data from the US, the UK and Canada and are able to identify well-known, high inflation periods in the samples. Moreover, our results suggest that the role of money growth is specific to the economy under scrutiny. Finally, we analyse a variety of different model specifications and are able to confirm that adjusted money growth still has leading indicator properties on inflation regimes.
    Keywords: Money-inflation link, Nonlinear modeling, Bayesian inference, LSTAR-SV model
    JEL: C11 C32 E31 E51
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:wiw:wiwwuw:wuwp310&r=all
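    For readers unfamiliar with the LSTAR class used above, a generic two-regime logistic smooth transition autoregression has the form
      $y_t = \phi' x_t\,[1 - G(s_t;\gamma,c)] + \theta' x_t\, G(s_t;\gamma,c) + \varepsilon_t$, with $G(s_t;\gamma,c) = \big(1 + \exp\{-\gamma (s_t - c)\}\big)^{-1}$,
    where $x_t$ collects an intercept and lags of $y_t$, $\gamma > 0$ governs the speed of transition and $c$ is the threshold between regimes; in the paper smoothed money growth plays the role of the transition variable $s_t$, and the authors' specification additionally includes stochastic volatility. This is the textbook form, not their exact model.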
  35. By: Bruneel-Zupanc, Christophe Alain
    Abstract: This paper develops a general framework for models, static or dynamic, in which agents simultaneously make both discrete and continuous choices. I show that such models are nonparametrically identified. Based on the constructive identification arguments, I build a novel two-step estimation method in the lineage of Hotz and Miller (1993) but extended to discrete and continuous choice models. The method is especially attractive for complex dynamic models because it significantly reduces the computational burden associated with their estimation. To illustrate my new method, I estimate a dynamic model of female labor supply and consumption.
    Date: 2021–02–03
    URL: http://d.repec.org/n?u=RePEc:tse:wpaper:125232&r=all

This nep-ecm issue is ©2021 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.