on Econometrics
By: | Zhaoxing Gao; Ruey S. Tsay |
Abstract: | This paper proposes a hierarchical approximate-factor approach to analyzing high-dimensional, large-scale heterogeneous time series data using distributed computing. The new method employs a multiple-fold dimension reduction procedure using Principal Component Analysis (PCA) and shows great promise for modeling large-scale data that cannot be stored or analyzed by a single machine. Each computer at the basic level performs a PCA to extract common factors among the time series assigned to it and transfers those factors to one and only one node of the second level. Each 2nd-level computer collects the common factors from its subordinates and performs another PCA to select the 2nd-level common factors. This process is repeated until the central server is reached, which collects common factors from its direct subordinates and performs a final PCA to select the global common factors. The noise terms of the 2nd-level approximate factor model are the unique common factors of the 1st-level clusters. We focus on the case of 2 levels in our theoretical derivations, but the idea can easily be generalized to any finite number of hierarchies. We discuss some clustering methods when the group memberships are unknown and introduce a new diffusion index approach to forecasting. We further extend the analysis to unit-root nonstationary time series. Asymptotic properties of the proposed method are derived for the diverging dimension of the data in each computing unit and the sample size $T$. We use both simulated data and real examples to assess the performance of the proposed method in finite samples, and compare our method with the commonly used ones in the literature concerning the forecastability of extracted factors.
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2103.14626&r=all |
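The two-level hierarchical PCA procedure described in the entry above can be illustrated with a brief sketch on simulated data. The group sizes, factor counts, and use of scikit-learn's PCA are illustrative assumptions, not the authors' distributed implementation.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
T, n_groups, series_per_group = 200, 5, 40

# simulate one global factor plus a group-specific factor per cluster
global_factor = rng.standard_normal((T, 1))
panels = []
for g in range(n_groups):
    group_factor = rng.standard_normal((T, 1))
    load_global = rng.standard_normal((1, series_per_group))
    load_group = rng.standard_normal((1, series_per_group))
    noise = 0.5 * rng.standard_normal((T, series_per_group))
    panels.append(global_factor @ load_global + group_factor @ load_group + noise)

# level 1: each "computer" extracts a few common factors from its own block of series
level1_factors = [PCA(n_components=2).fit_transform(X) for X in panels]

# level 2 / central server: stack the transmitted factors and run one more PCA
stacked = np.hstack(level1_factors)
global_estimate = PCA(n_components=1).fit_transform(stacked)
print(global_estimate.shape)  # (200, 1): estimated global common factor
```

In a genuinely distributed setting, only the small T-by-r factor matrices would be transmitted upward, which is the point of the multiple-fold dimension reduction.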
By: | Greta Goracci; Simone Giannerini; Kung-Sik Chan; Howell Tong |
Abstract: | We present supremum Lagrange Multiplier tests to compare a linear ARMA specification against its threshold ARMA extension. We derive the asymptotic distribution of the test statistics both under the null hypothesis and under contiguous local alternatives. Moreover, we prove the consistency of the tests. The Monte Carlo study shows that the tests enjoy good finite-sample properties, are robust against model mis-specification, and that their performance is not affected if the order of the model is unknown. The tests have a low computational burden and do not suffer from some of the drawbacks that affect the quasi-likelihood ratio setting. Lastly, we apply our tests to a time series of standardized tree-ring growth indexes, an application that can open new lines of research in climate studies.
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2103.13977&r=all |
By: | Torben G. Andersen; Rasmus T. Varneskov |
Abstract: | This paper develops parameter instability and structural change tests within predictive regressions for economic systems governed by persistent vector autoregressive dynamics. Specifically, in a setting where all – or a subset – of the variables may be fractionally integrated and the predictive relation may feature cointegration, we provide sup-Wald break tests that are constructed using the Local speCtruM (LCM) approach. The new tests cover both parameter variation and multiple structural changes with unknown break dates, with the number of breaks being either known or unknown. We establish asymptotic limit theory for the tests, showing that it coincides with that of standard testing procedures. As a consequence, existing critical values for tied-down Bessel processes may be applied without modification. We implement the new structural change tests to explore the stability of the fractionally cointegrating relation between implied and realized volatility (IV and RV). Moreover, we assess the relative efficiency of IV forecasts against a challenging time-series benchmark constructed from high-frequency data. Unlike existing studies, we find evidence that the IV-RV cointegrating relation is unstable, and that carefully constructed time-series forecasts are more efficient than IV in capturing low-frequency movements in RV.
JEL: | G12 G17 |
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:28570&r=all |
By: | Torben G. Andersen; Rasmus T. Varneskov |
Abstract: | We study standard predictive regressions in economic systems governed by persistent vector autoregressive dynamics for the state variables. In particular, all – or a subset – of the variables may be fractionally integrated, which induces a spurious regression problem. We propose a new inference and testing procedure – the Local speCtruM (LCM) approach – for the joint significance of the regressors, which is robust to the variables having different integration orders and remains valid regardless of whether the predictors are significant and whether they induce cointegration. Specifically, the LCM procedure is based on fractional filtering and band-spectrum regression using a suitably selected set of frequency ordinates. Contrary to existing procedures, we establish a uniform Gaussian limit theory and a standard χ2-distributed test statistic. Using LCM inference and testing techniques, we explore predictive regressions for the realized return variation. Standard least squares inference indicates that popular financial and macroeconomic variables convey valuable information about future return volatility. In contrast, we find no significant evidence using our robust LCM procedure. If anything, our tests support a reverse chain of causality: rising financial volatility predates adverse innovations to macroeconomic variables. Simulations illustrate the relevance of the theoretical arguments for finite-sample inference.
JEL: | G12 G17 |
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:28568&r=all |
By: | Javier Alejo; Antonio F. Galvao; Gabriel Montes-Rojas |
Abstract: | This paper develops inference procedures to evaluate the validity of instruments in instrumental variables (IV) quantile regression (QR) models. We first derive a first-stage regression for the IVQR model, analogous to the least squares case, which is a weighted least-squares regression. The weights are given by the density function of the conditional distribution of the innovation term in the QR structural model, conditional on the exogenous covariates and the instruments. The first-stage regression is a natural framework to evaluate the instruments since we can test for their statistical significance. In the QR case, the instruments could be relevant at some quantiles but not at others, or at the mean. Monte Carlo finite-sample experiments show that the tests work as expected in terms of empirical size and power. Two applications illustrate that checking for the statistical significance of the instruments at different quantiles is important.
Keywords: | quantile regression, instrumental variables, first-stage |
JEL: | C13 C14 C21 C51 C53 |
Date: | 2020–11 |
URL: | http://d.repec.org/n?u=RePEc:aep:anales:4304&r=all |
By: | Patrik Guggenberger; Frank Kleibergen; Sophocles Mavroeidis |
Abstract: | We introduce a new test for a two-sided hypothesis involving a subset of the structural parameter vector in the linear instrumental variables (IVs) model. Guggenberger et al. (2019), GKM19 from now on, introduce a subvector Anderson-Rubin (AR) test with data-dependent critical values that has asymptotic size equal to nominal size for a parameter space that allows for arbitrary strength or weakness of the IVs, and has uniformly nonsmaller power than the projected AR test studied in Guggenberger et al. (2012). However, GKM19 imposes the restrictive assumption of conditional homoskedasticity. The main contribution here is to robustify the procedure in GKM19 to arbitrary forms of conditional heteroskedasticity. We first adapt the method in GKM19 to a setup where a certain covariance matrix has an approximate Kronecker product (AKP) structure, which nests conditional homoskedasticity. The new test equals this adaptation when the data are consistent with AKP structure, as decided by a model selection procedure. Otherwise, the test equals the AR/AR test in Andrews (2017), which is fully robust to conditional heteroskedasticity but less powerful than the adapted method. We show theoretically that the new test has asymptotic size bounded by the nominal size and document improved power relative to the AR/AR test in a wide array of Monte Carlo simulations when the covariance matrix is not too far from AKP.
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2103.11371&r=all |
By: | Pang Du; Christopher F. Parmeter; Jeffrey S. Racine |
Abstract: | We consider shape-constrained kernel-based probability density function (PDF) and probability mass function (PMF) estimation. Our approach is of widespread potential applicability and includes, separately or simultaneously, constraints on the PDF (PMF) itself, its integral (sum), and its derivatives (finite differences) of any order. We also allow for pointwise upper and lower bounds (i.e., inequality constraints) on the PDF and PMF in addition to the more popular equality constraints, and the approach handles a range of transformations of the PDF and PMF including, for example, logarithmic transformations (which allow for the imposition of log-concave or log-convex constraints that are popular with practitioners). Theoretical underpinnings for the procedures are provided. A simulation-based comparison of our proposed approach with those obtained using Grenander-type methods is favourable to our approach when the DGP is itself smooth. As far as we know, ours is also the only smooth framework that handles PDFs and PMFs in the presence of inequality bounds, equality constraints, and other popular constraints such as those mentioned above. An implementation in R exists that incorporates constraints such as monotonicity (both increasing and decreasing), convexity and concavity, and log-convexity and log-concavity, among others, while respecting finite-support boundaries via explicit use of boundary kernel functions.
Keywords: | nonparametric; density; restricted estimation |
JEL: | C14 |
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:mcm:deptwp:2021-05&r=all |
By: | del Barrio Castro, Tomás |
Abstract: | Cointegration between Periodically Integrated (PI) processes has been analyzed, among others, by Birchenhall, Bladen-Hovell, Chui, Osborn, and Smith (1989), Boswijk and Franses (1995), Franses and Paap (2004), Kleibergen and Franses (1999) and del Barrio Castro and Osborn (2008). However, so far no method published in an academic journal allows us to determine the cointegration rank between PI processes. This paper fills the gap: a method to determine the cointegration rank among a set of PI processes is proposed, based on the idea of pseudo-demodulation introduced in the context of Seasonal Cointegration by del Barrio Castro, Cubadda and Osborn (2020). Once a pseudo-demodulated time series is obtained, the Johansen (1995) procedure can be applied to determine the cointegration rank. A Monte Carlo experiment shows that the proposed approach works satisfactorily for small samples.
Keywords: | Reduced Rank Regression, Periodic Cointegration, Periodically Integrated Processes
JEL: | C32 |
Date: | 2021 |
URL: | http://d.repec.org/n?u=RePEc:pra:mprapa:106603&r=all |
By: | Tengyuan Liang |
Abstract: | We propose a computationally efficient method to construct nonparametric, heteroskedastic prediction bands for uncertainty quantification, with or without any user-specified predictive model. The data-adaptive prediction band is universally applicable with minimal distributional assumptions, has strong non-asymptotic coverage properties, and is easy to implement using standard convex programs. Our approach can be viewed as a novel variance interpolation with confidence and further leverages techniques from semi-definite programming and sum-of-squares optimization. The theoretical and numerical performance of the proposed approach to uncertainty quantification is analyzed.
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2103.17203&r=all |
By: | Dake Li; Mikkel Plagborg-Møller; Christian K. Wolf
Abstract: | We conduct a simulation study of Local Projection (LP) and Vector Autoregression (VAR) estimators of structural impulse responses across thousands of data generating processes (DGPs), designed to mimic the properties of the universe of U.S. macroeconomic data. Our analysis considers various structural identification schemes and several variants of LP and VAR estimators, and we pay particular attention to the role of the researcher's loss function. A clear bias-variance trade-off emerges: Because our DGPs are not exactly finite-order VAR models, LPs have lower bias than VAR estimators; however, the variance of LPs is substantially higher than that of VARs at intermediate or long horizons. Unless researchers are overwhelmingly concerned with bias, shrinkage via Bayesian VARs or penalized LPs is attractive. |
Date: | 2021–04 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2104.00655&r=all |
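As a rough illustration of the two estimators compared in the study above, the sketch below traces an impulse response by local projections and by an estimated VAR on one simulated bivariate process; the DGP, lag length, horizons, and the assumption that the structural shock is directly observed are all illustrative simplifications, not the paper's design.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(1)
T, H, p = 400, 12, 4
eps = rng.standard_normal((T, 2))
y = np.zeros((T, 2))
A = np.array([[0.6, 0.2], [0.1, 0.5]])
for t in range(1, T):
    y[t] = A @ y[t - 1] + eps[t]

shock = eps[:, 0]  # pretend the structural shock is observed, for simplicity

# local projections: one regression per horizon of y1_{t+h} on shock_t plus lags of y
lp_irf = []
for h in range(H + 1):
    Y = y[p + h:, 0]
    X = np.column_stack([shock[p:T - h]] + [y[p - l:T - h - l] for l in range(1, p + 1)])
    lp_irf.append(sm.OLS(Y, sm.add_constant(X)).fit().params[1])

# VAR: fit once, then iterate the estimated dynamics to trace the response
var_irf = VAR(y).fit(p).irf(H).irfs[:, 0, 0]
print(np.round(lp_irf, 3))
print(np.round(var_irf, 3))
```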
By: | Yiyan Huang; Cheuk Hang Leung; Xing Yan; Qi Wu |
Abstract: | Most existing studies on the double/debiased machine learning method concentrate on estimating the causal parameter recovered from the first-order orthogonal score function. In this paper, we construct the $k^{\mathrm{th}}$-order orthogonal score function for estimating the average treatment effect (ATE) and present an algorithm that enables us to obtain the debiased estimator recovered from the score function. Such a higher-order orthogonal estimator is more robust to misspecification of the propensity score than the first-order one. In addition, it has the merit of being applicable with many machine learning methodologies such as Lasso, Random Forests, Neural Nets, etc. We also conduct comprehensive experiments to test the power of the estimator constructed from the score function, using both simulated and real datasets.
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2103.11869&r=all |
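For context, the sketch below implements the first-order orthogonal (doubly robust, AIPW-type) score with cross-fitting that the paper generalizes to k-th order; the random-forest nuisance learners and the simulated design are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(2)
n, p = 2000, 5
X = rng.standard_normal((n, p))
D = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))   # treatment with confounding
Y = 2.0 * D + X[:, 0] + rng.standard_normal(n)    # true ATE = 2

scores = np.zeros(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    # nuisance functions are fit on the training fold only (cross-fitting)
    ps = RandomForestClassifier(n_estimators=200, random_state=0).fit(X[train], D[train])
    m1 = RandomForestRegressor(n_estimators=200, random_state=0).fit(
        X[train][D[train] == 1], Y[train][D[train] == 1])
    m0 = RandomForestRegressor(n_estimators=200, random_state=0).fit(
        X[train][D[train] == 0], Y[train][D[train] == 0])
    e = np.clip(ps.predict_proba(X[test])[:, 1], 0.01, 0.99)
    mu1, mu0 = m1.predict(X[test]), m0.predict(X[test])
    # first-order orthogonal (AIPW) score evaluated on the held-out fold
    scores[test] = (mu1 - mu0
                    + D[test] * (Y[test] - mu1) / e
                    - (1 - D[test]) * (Y[test] - mu0) / (1 - e))

print(scores.mean())  # debiased ATE estimate, close to 2
```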
By: | Augusteijn, Hilde Elisabeth Maria (Tilburg University); van Aert, Robbie Cornelis Maria; van Assen, Marcel A. L. M. |
Abstract: | Publication bias remains a great challenge when conducting a meta-analysis. It may result in overestimated effect sizes, an increased frequency of false positives, and over- or underestimation of the effect-size heterogeneity parameter. A new method is introduced, the Bayesian Meta-Analytic Snapshot (BMAS), which evaluates both the effect size and its heterogeneity and corrects for potential publication bias. It evaluates the probability of the true effect size being zero, small, medium or large, and the probability of the true heterogeneity being zero, small, medium or large. This approach, which provides an intuitive evaluation of uncertainty in the evaluation of effect size and heterogeneity, is illustrated with a real-data example, a simulation study, and a Shiny web application of BMAS.
Date: | 2021–03–18 |
URL: | http://d.repec.org/n?u=RePEc:osf:osfxxx:avkgj&r=all |
By: | Lukas Boer; Helmut Lütkepohl |
Abstract: | A major challenge for proxy vector autoregressive analysis is the construction of a suitable external instrument variable or proxy for identifying a shock of interest. Some authors construct sophisticated proxies that account for the dating and size of the shock while other authors consider simpler versions that use only the dating and signs of particular shocks. It is shown that such qualitative (sign-)proxies can lead to impulse response estimates of the impact effects of the shock of interest that are nearly as efficient as or even more efficient than estimators based on more sophisticated quantitative proxies that also reflect the size of the shock. Moreover, the sign-proxies tend to provide more precise impulse response estimates than an approach based merely on the higher volatility of the shocks of interest on event dates. |
Keywords: | GMM, heteroskedastic VAR, instrumental variable estimation, proxy VAR, structural vector autoregression |
JEL: | C32 |
Date: | 2021 |
URL: | http://d.repec.org/n?u=RePEc:diw:diwwpp:dp1940&r=all |
By: | Yinchu Zhu |
Abstract: | We consider the setting in which a strong binary instrument is available for a binary treatment. The traditional LATE approach assumes the monotonicity condition stating that there are no defiers (or compliers). Since this condition is not always obvious, we investigate the sensitivity and testability of this condition. In particular, we focus on the question: does a slight violation of monotonicity lead to a small problem or a big problem? We find a phase transition for the monotonicity condition. On one side of the phase-transition boundary, it is easy to learn the sign of the LATE; on the other side, it is impossible to learn the sign of the LATE. Unfortunately, the impossible side of the phase transition includes data-generating processes under which the proportion of defiers tends to zero. This phase-transition boundary is explicitly characterized in the case of binary outcomes. Outside a special case, it is impossible to test whether the data-generating process is on the nice side of the boundary. However, in the special case in which non-compliance is almost one-sided, such a test is possible. We also provide simple alternatives to monotonicity.
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2103.13369&r=all |
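The core concern above, that even a small share of defiers can move the usual IV estimand away from the complier effect, can be seen in a short simulation; the compliance shares and effect sizes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
Z = rng.binomial(1, 0.5, n)
types = rng.choice(["complier", "always", "never", "defier"], n,
                   p=[0.55, 0.20, 0.23, 0.02])           # 2% defiers
D = np.where(types == "always", 1,
     np.where(types == "never", 0,
      np.where(types == "complier", Z, 1 - Z)))           # defiers do the opposite of Z
Y = D * (1.0 + 2.0 * (types == "defier")) + rng.standard_normal(n)

itt_y = Y[Z == 1].mean() - Y[Z == 0].mean()
itt_d = D[Z == 1].mean() - D[Z == 0].mean()
print(itt_y / itt_d)  # Wald ratio: no longer equals the complier effect of 1.0
```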
By: | Riccardo (Jack) Lucchetti (Dipartimento di Scienze Economiche e Sociali (DiSES), Università Politecnica delle Marche); Claudia Pigini (Dipartimento di Scienze Economiche e Sociali (DiSES), Università Politecnica delle Marche) |
Abstract: | Estimation of random-effects dynamic probit models for panel data entails the so-called “initial conditions problem”. We argue that the relative finite-sample performance of the two main competing solutions is driven by the magnitude of the individual unobserved heterogeneity and/or of the state dependence in the data. We investigate our conjecture by means of a comprehensive Monte Carlo experiment and offer useful indications for the practitioner. |
Keywords: | Panel data, dynamic probit, initial conditions |
JEL: | C23 C25 |
Date: | 2020 |
URL: | http://d.repec.org/n?u=RePEc:ven:wpaper:2020:27&r=all |
By: | Torben G. Andersen; Rasmus T. Varneskov |
Abstract: | This paper studies the properties of predictive regressions for asset returns in economic systems governed by persistent vector autoregressive dynamics. In particular, we allow for the state variables to be fractionally integrated, potentially of different orders, and for the returns to have a latent persistent conditional mean, whose memory is difficult to estimate consistently by standard techniques in finite samples. Moreover, the predictors may be endogenous and “imperfect”. In this setting, we provide a cointegration rank test to determine the predictive model framework as well as the latent persistence of returns. This motivates a rank-augmented Local Spectrum (LCM) procedure, which is consistent and delivers asymptotic Gaussian inference. Simulations illustrate the theoretical arguments. Finally, in an empirical application concerning monthly S&P 500 return prediction, we provide evidence for a fractionally integrated conditional mean component. Moreover, using the rank-augmented LCM procedure, we document significant predictive power for key state variables such as the price-earnings ratio and the default spread. |
JEL: | G12 G17 |
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:28569&r=all |
By: | O'Brien, Martin (Central Bank of Ireland); Velasco, Sofia (Central Bank of Ireland) |
Abstract: | This paper develops a multivariate filter based on an unobserved component trend-cycle model. It incorporates stochastic volatility and relies on specific formulations for the cycle component. We test the performance of this algorithm within a Monte Carlo experiment and apply this decomposition tool to study the evolution of the financial cycle (estimated as the cycle of the credit-to-GDP ratio) for the United States, the United Kingdom and Ireland. We compare our credit cycle measure to the Basel III credit-to-GDP gap, prominent for its role in informing the setting of countercyclical capital buffers. The Basel gap employs the Hodrick-Prescott filter for trend extraction. Filtering methods reliant on similar duration assumptions suffer from endpoint bias or spurious cycles. These shortcomings might bias the shape of the credit cycle and thereby limit the precision of policy assessments that rely on its evolution to target financial distress. By allowing for a flexible law of motion of the variance-covariance matrix and informing the estimation of the cycle via economic fundamentals, we are able to improve the statistical properties and to find a more economically meaningful measure of the build-up of cyclical systemic risks. Additionally, we find large heterogeneity in the drivers of the credit cycles across time and countries. This result stresses the relevance for macroprudential policy of considering flexible approaches that can be tailored to country characteristics, in contrast to standardized indicators.
Keywords: | Credit imbalances, cyclical systemic risk, financial cycle, macroprudential analysis, multivariate unobserved-components models, stochastic volatility
JEL: | C32 E32 E58 G01 G28 |
Date: | 2020–12 |
URL: | http://d.repec.org/n?u=RePEc:cbi:wpaper:09/rt/20&r=all |
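For reference, the Basel-type benchmark the authors compare against is a Hodrick-Prescott trend of the credit-to-GDP ratio with smoothing parameter 400,000 for quarterly data. The sketch below shows the mechanics on a simulated series using the standard two-sided filter; the actual Basel gap is computed one-sided (recursively), and the series here is an illustrative assumption.

```python
import numpy as np
from statsmodels.tsa.filters.hp_filter import hpfilter

rng = np.random.default_rng(4)
credit_to_gdp = 100 + np.cumsum(0.3 + rng.standard_normal(160))  # quarterly ratio, in %

# HP filter with the Basel smoothing parameter for quarterly credit data
cycle, trend = hpfilter(credit_to_gdp, lamb=400_000)
basel_gap = credit_to_gdp - trend        # identical to `cycle`
print(np.round(basel_gap[-4:], 2))       # gap readings for the last four quarters
```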
By: | Joanna Morais; Christine Thomas-Agnan (TSE - Toulouse School of Economics - UT1 - Université Toulouse 1 Capitole - EHESS - École des hautes études en sciences sociales - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement) |
Abstract: | In the framework of Compositional Data Analysis, vectors carrying relative information, also called compositional vectors, can appear in regression models either as dependent or as explanatory variables. In some situations, they can be on both sides of the regression equation. Measuring the marginal impacts of covariates in these types of models is not straightforward since a change in one component of a closed composition automatically affects the rest of the composition. Previous work by the authors has shown how to measure, compute and interpret these marginal impacts in the case of linear regression models with compositions on both sides of the equation. The resulting natural interpretation is in terms of an elasticity, a quantity commonly used in econometrics and marketing applications. They also demonstrate the link between these elasticities and simplicial derivatives. The aim of this contribution is to extend these results to other situations, namely when the compositional vector is on a single side of the regression equation. In these cases, the marginal impact is related to a semi-elasticity and also linked to a simplicial derivative. Moreover, we consider the possibility that a total variable is used as an explanatory variable, with several possible interpretations of this total, and we derive the elasticity formulas in that case.
Keywords: | compositional regression model, marginal effects, simplicial derivative, elasticity, semi-elasticity
Date: | 2021–01 |
URL: | http://d.repec.org/n?u=RePEc:hal:journl:hal-03180682&r=all |
By: | Mohammadreza Ghanbari; Mahdi Goldani |
Abstract: | Support vector machine modeling is an approach in machine learning for classification that shows good performance on forecasting problems with small samples and high dimensions. It was later extended to Support Vector Regression (SVR) for regression problems. A big challenge for achieving reliable results is the choice of appropriate parameters. Here, a novel Golden Sine Algorithm (GSA)-based SVR is proposed for proper selection of the parameters. The performance of the proposed algorithm is compared with that of eleven other meta-heuristic algorithms on historical stock prices of technological companies from the Yahoo Finance website, based on Mean Squared Error and Mean Absolute Percent Error. The results demonstrate that the given algorithm is efficient for tuning the parameters and is indeed competitive in terms of accuracy and computing time.
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2103.11459&r=all |
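The parameter-selection problem the paper addresses can be sketched as follows, with a generic randomized search standing in for the Golden Sine Algorithm; the lag-feature construction, search ranges, and use of scikit-learn are illustrative assumptions.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.model_selection import RandomizedSearchCV, TimeSeriesSplit
from sklearn.svm import SVR

rng = np.random.default_rng(5)
prices = 100 * np.exp(np.cumsum(0.001 + 0.02 * rng.standard_normal(500)))  # toy price path
lags = 5
X = np.column_stack([prices[i:len(prices) - lags + i] for i in range(lags)])  # lagged prices
y = prices[lags:]                                                             # next price

# tune C, epsilon and gamma with time-series cross-validation
search = RandomizedSearchCV(
    SVR(kernel="rbf"),
    param_distributions={"C": loguniform(1e-1, 1e3),
                         "epsilon": loguniform(1e-3, 1e0),
                         "gamma": loguniform(1e-4, 1e0)},
    n_iter=50,
    cv=TimeSeriesSplit(n_splits=5),
    scoring="neg_mean_squared_error",
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, -search.best_score_)  # tuned parameters and cross-validated MSE
```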
By: | Joachim Freyberger |
Abstract: | An important class of structural models investigates the determinants of skill formation and the optimal timing of interventions. To achieve point identification of the parameters, researchers typically normalize the scale and location of the unobserved skills. This paper shows that these seemingly innocuous restrictions can severely impact the interpretation of the parameters and counterfactual predictions. For example, simply changing the units of measurement of the observed variables might yield ineffective investment strategies and misleading policy recommendations. To tackle these problems, this paper provides a new identification analysis, which pools all restrictions of the model, characterizes the identified set of all parameters without normalizations, illustrates which features depend on these normalizations, and introduces a new set of important policy-relevant parameters that are identified under weak assumptions and yield robust conclusions. As a byproduct, this paper also presents a general and formal definition of when restrictions are truly normalizations.
Date: | 2021–04 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2104.00473&r=all |
By: | Jonas Meier |
Abstract: | This paper introduces multivariate distribution regression (MDR), a semi-parametric approach to estimating the joint distribution of outcomes. The method allows studying complex dependence structures and distributional treatment effects without making strong parametric assumptions. I show that the MDR coefficient process converges to a Gaussian process and that the bootstrap is consistent for the asymptotic distribution of the estimator. Methodologically, MDR contributes by offering the analysis of many functionals of the CDF, including, for instance, counterfactual distributions. Compared to copula models, MDR achieves the same accuracy but is (i) more robust to misspecification and (ii) able to condition on many covariates, thus ensuring a high degree of flexibility. Finally, an application analyzes shifts in spousal labor supply in response to a health shock. I find that if low-income individuals receive disability insurance benefits, their spouses respond by increasing their labor supply, whereas the opposite holds for high-income households, likely because they are well insured and can afford to work fewer hours.
Keywords: | Distribution regression, joint distribution, decomposition analysis, distributional treatment effects
JEL: | C14 C21 |
Date: | 2020–12 |
URL: | http://d.repec.org/n?u=RePEc:ube:dpvwib:dp2023&r=all |
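As background for MDR, the sketch below shows ordinary (univariate) distribution regression, the building block being extended to joint outcomes: a binary regression of 1{Y <= y} on covariates at each threshold of a grid traces out the conditional CDF. The DGP, grid, and logit link are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
n = 5000
X = rng.standard_normal((n, 2))
Y = 1.0 + X[:, 0] - 0.5 * X[:, 1] + rng.standard_normal(n)

grid = np.quantile(Y, np.linspace(0.05, 0.95, 19))   # thresholds at which the CDF is modelled
x0 = np.array([[0.5, -0.5]])                         # covariate value of interest

cdf_at_x0 = []
for y_thr in grid:
    fit = LogisticRegression(max_iter=1000).fit(X, (Y <= y_thr).astype(int))
    cdf_at_x0.append(fit.predict_proba(x0)[0, 1])    # estimate of P(Y <= y_thr | X = x0)

print(np.round(cdf_at_x0, 2))  # estimated conditional CDF, increasing along the grid
```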
By: | V. A. Kalyagin; A. P. Koldanov; P. A. Koldanov |
Abstract: | The maximum spanning tree (MST) is a popular tool in market network analysis. A large number of publications is devoted to MST calculation and its interpretation for particular stock markets. However, much less attention is paid in the literature to the analysis of the uncertainty of the obtained results. In the present paper we suggest a general framework to measure the uncertainty of MST identification. We study uncertainty within the concept of a random variable network (RVN). We consider different correlation-based networks in the large class of elliptical distributions. We show that the true MST is the same in three networks: the Pearson correlation network, the Fechner correlation network, and the Kendall correlation network. We argue that among different measures of uncertainty the FDR (False Discovery Rate) is the most appropriate for MST identification. We investigate the FDR of the Kruskal algorithm for MST identification and show that the reliability of MST identification differs across these three networks. In particular, for the Pearson correlation network the FDR depends essentially on the distribution of stock returns. We prove that for the market network with Fechner correlation the FDR is insensitive to the assumption on the distribution of stock returns. Some interesting phenomena are discovered for the Kendall correlation network. Our experiments show that the FDR of the Kruskal algorithm for MST identification in the Kendall correlation network depends only weakly on the distribution and, at the same time, the value of the FDR is almost the best in comparison with MST identification in the other networks. These facts are important in practical applications.
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2103.14593&r=all |
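A minimal sketch of MST identification in a Pearson correlation network follows; since the mapping rho -> sqrt(2(1 - rho)) is decreasing, the maximum spanning tree over correlations coincides with the minimum spanning tree over this distance. The simulated one-factor returns are an illustrative assumption, and no uncertainty (FDR) analysis is attempted here.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

rng = np.random.default_rng(7)
T, n_stocks = 500, 10
common = rng.standard_normal((T, 1))
returns = 0.7 * common + rng.standard_normal((T, n_stocks))   # one-factor return panel

rho = np.corrcoef(returns, rowvar=False)
dist = np.sqrt(2.0 * (1.0 - rho))      # standard correlation-based distance
np.fill_diagonal(dist, 0.0)

mst = minimum_spanning_tree(dist)      # sparse matrix holding the n_stocks - 1 tree edges
edges = np.transpose(mst.nonzero())
print(edges)                           # pairs of stocks joined by the estimated MST
```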
By: | Kilian Huber |
Abstract: | Researchers use (quasi-)experimental methods to estimate how shocks affect directly treated firms and households. Such methods typically do not account for general equilibrium spillover effects. I outline a method that estimates spillovers operating among groups of firms and households. I argue that the presence of multiple types of spillovers, measurement error, and nonlinear effects can severely bias estimates. I show how instrumental variables, heterogeneity tests, and flexible functional forms can overcome different sources of bias. The analysis is particularly relevant to the estimation of spillovers following large-scale financial and business cycle shocks. |
Keywords: | general equilibrium effects, spillovers, estimation, macroeconomic shocks, financial shocks |
Date: | 2021 |
URL: | http://d.repec.org/n?u=RePEc:ces:ceswps:_8955&r=all |
By: | Wenyang Huang; Huiwen Wang; Shanshan Wang |
Abstract: | Open-high-low-close (OHLC) data are the most common data form in the field of finance and the object of investigation of various technical analyses. As increasingly many features of OHLC data are collected, the issue of extracting their useful information in a comprehensible way for visualization and easy interpretation must be resolved. The inherent constraints of OHLC data also pose a challenge for this issue. This paper proposes a novel approach to characterize the features of OHLC data in a dataset and then perform dimension reduction, which integrates a feature-information extraction method and principal component analysis. We refer to it as the pseudo-PCA method. Specifically, we first propose a new way to represent the OHLC data, which frees the inherent constraints and provides convenience for further analysis. Moreover, there is a one-to-one match between the original OHLC data and their feature-based representations, which means that the analysis of the feature-based data can be mapped back to the original OHLC data. Next, we develop the pseudo-PCA procedure for OHLC data, which can effectively identify important information and perform dimension reduction. Finally, the effectiveness and interpretability of the proposed method are investigated through finite simulations and the spot data of China's agricultural product market.
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2103.16908&r=all |
By: | Hanjo Odendaal (Department of Economics, Stellenbosch University) |
Abstract: | This paper aims to offer an alternative to the labour-intensive manual process of constructing a domain-specific lexicon or dictionary through the operationalization of subjective information processing. This paper builds on the current empirical literature by (a) constructing a domain-specific dictionary for various economic confidence indices, (b) introducing a novel weighting schema for text tokens that accounts for time dependence; and (c) operationalising subjective information processing of text data using machine learning. The results show that sentiment indices constructed from machine-generated dictionaries have a better fit with multiple indicators of economic activity than the manually constructed dictionary of Loughran and McDonald (2011). The analysis shows a lower RMSE for the domain-specific dictionaries in a five-year holdout sample period from 2012 to 2017. The results also justify the time-series weighting design used to overcome the p>>n problem commonly found when working with economic time series and text data.
Keywords: | Sentometrics, Machine learning, Domain-specific dictionaries |
JEL: | C32 C45 C53 C55 |
Date: | 2021 |
URL: | http://d.repec.org/n?u=RePEc:sza:wpaper:wpapers366&r=all |
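One simple way to machine-generate a domain-specific dictionary, in the spirit of the paper, is to learn token weights by regularized regression of an economic indicator on document-term counts; the toy corpus, target values, and elastic-net choice below are illustrative assumptions and do not reproduce the paper's time-dependent weighting schema.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import ElasticNet

docs = ["growth outlook improves strongly",
        "recession fears weigh on confidence",
        "unemployment falls and exports rise",
        "crisis deepens as demand collapses",
        "investment and hiring pick up",
        "defaults rise amid weak demand"]
confidence = np.array([1.2, -0.8, 0.9, -1.5, 1.0, -1.1])   # toy confidence-index readings

vec = CountVectorizer().fit(docs)
X = vec.transform(docs).toarray().astype(float)             # document-term matrix

# token weights learned from the data become the machine-generated lexicon
model = ElasticNet(alpha=0.01).fit(X, confidence)
lexicon = dict(zip(vec.get_feature_names_out(), model.coef_))
print({w: round(c, 2) for w, c in lexicon.items() if abs(c) > 1e-6})
```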
By: | Yijian Chuan; Chaoyi Zhao; Zhenrui He; Lan Wu |
Abstract: | We develop a novel approach to explain why AdaBoost is a successful classifier. By introducing a measure of the influence of the noise points (ION) in the training data for the binary classification problem, we prove that there is a strong connection between the ION and the test error. We further identify that the ION of AdaBoost decreases as the iteration number or the complexity of the base learners increases. We confirm that it is impossible to obtain a consistent classifier without deep trees as the base learners of AdaBoost in some complicated situations. We apply AdaBoost in portfolio management via empirical studies in the Chinese market, which corroborates our theoretical propositions. |
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2103.12345&r=all |
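The role of base-learner depth emphasized above can be checked with a small experiment: AdaBoost with stumps versus deeper trees on a noisy binary classification problem. The simulated data and chosen depths are illustrative assumptions, not the paper's portfolio application.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# noisy binary problem: flip_y injects label noise ("noise points")
X, y = make_classification(n_samples=4000, n_features=20, flip_y=0.15, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for depth in (1, 4, 8):   # increasing base-learner complexity
    clf = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=depth),
                             n_estimators=300, random_state=0).fit(X_tr, y_tr)
    print(depth, round(clf.score(X_te, y_te), 3))   # test accuracy by tree depth
```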
By: | Sophocles Mavroeidis |
Abstract: | I show that the Zero Lower Bound (ZLB) on interest rates can be used to identify the causal effects of monetary policy. Identification depends on the extent to which the ZLB limits the efficacy of monetary policy. I develop a general econometric methodology for the identification and estimation of structural vector autoregressions (SVARs) with an occasionally binding constraint. The method provides a simple way to test the efficacy of unconventional policies, modelled via a `shadow rate'. I apply this method to U.S. monetary policy using a three-equation SVAR model of inflation, unemployment and the federal funds rate. I reject the null hypothesis that unconventional monetary policy has no effect at the ZLB, but find some evidence that it is not as effective as conventional monetary policy. |
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2103.12779&r=all |
By: | S. Borağan Aruoba; Marko Mlikota; Frank Schorfheide; Sergio Villalvazo |
Abstract: | We develop a structural VAR in which an occasionally-binding constraint generates censoring of one of the dependent variables. Once the censoring mechanism is triggered, we allow some of the coefficients for the remaining variables to change. We show that a necessary condition for a unique reduced form is that regression functions for the non-censored variables are continuous at the censoring point and that parameters satisfy some mild restrictions. In our application the censored variable is a nominal interest rate constrained by an effective lower bound (ELB). According to our estimates based on U.S. data, once the ELB becomes binding, the coefficients in the inflation equation change significantly, which translates into a change of the inflation responses to (unconventional) monetary policy and demand shocks. Our results suggest that the presence of the ELB is indeed empirically relevant for the propagation of shocks. We also obtain a shadow interest rate that shows a significant accommodation in the early parts of the Great Recession, followed by a mild and steady accommodation until liftoff in 2016. |
JEL: | C11 C22 C34 E32 E52 |
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:28571&r=all |
By: | Pamela Jakiela |
Abstract: | Difference-in-differences estimation is a widely used method of program evaluation. When treatment is implemented in different places at different times, researchers often use two-way fixed effects to control for location-specific and period-specific shocks. Such estimates can be severely biased when treatment effects change over time within treated units. I review the sources of this bias and propose several simple diagnostics for assessing its likely severity. I illustrate these tools through a case study of free primary education in Sub-Saharan Africa. |
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2103.13229&r=all |
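The bias discussed above can be reproduced in a few lines: under staggered adoption with treatment effects that grow over time within treated units, the two-way fixed effects coefficient need not recover the average effect on the treated. The DGP below is an illustrative assumption, not one of the paper's diagnostics.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)
units, periods = 50, 10
adopt = rng.choice([3, 6, 11], size=units, p=[0.4, 0.4, 0.2])  # 11 means never treated
rows = []
for i in range(units):
    for t in range(1, periods + 1):
        treated = int(t >= adopt[i])
        dynamic_effect = treated * 0.5 * (t - adopt[i] + 1)    # effect grows with exposure
        rows.append({"unit": i, "time": t, "D": treated,
                     "y": dynamic_effect + rng.standard_normal()})
df = pd.DataFrame(rows)

twfe = smf.ols("y ~ D + C(unit) + C(time)", data=df).fit()
print(round(twfe.params["D"], 2))                 # TWFE coefficient
print(round(df.loc[df.D == 1, "y"].mean(), 2))    # crude average treated outcome under this DGP
```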
By: | Huiwen Wang; Wenyang Huang; Shanshan Wang |
Abstract: | Forecasting the open-high-low-close (OHLC) data contained in a candlestick chart is of great practical importance, as exemplified by applications in the field of finance. Typically, the existence of inherent constraints in OHLC data poses a great challenge to its prediction; e.g., forecasting models may yield unrealistic values if these constraints are ignored. To address this, a novel transformation approach is proposed to relax these constraints, along with its explicit inverse transformation, which ensures that the forecasting models yield meaningful open-high-low-close values. A flexible and efficient framework for forecasting the OHLC data is also provided. As an example, the detailed procedure of modelling the OHLC data via the vector autoregression (VAR) model and vector error correction (VEC) model is given. The new approach has high practical utility on account of its flexibility, simple implementation and straightforward interpretation. Extensive simulation studies are performed to assess the effectiveness and stability of the proposed approach. Three financial data sets, for Kweichow Moutai, the CSI 100 index and the 50 ETF of the Chinese stock market, are employed to document the empirical effect of the proposed methodology.
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2104.00581&r=all |
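One possible unconstraining transformation with an explicit inverse, in the spirit of the approach above, maps (O, H, L, C) to four unrestricted reals via logs and logits; this particular parametrization is an illustrative assumption and not necessarily the paper's transformation.

```python
import numpy as np

def ohlc_to_unconstrained(o, h, l, c):
    # positivity of L, H > L, and O, C inside [L, H] are removed by logs and logits
    logit = lambda p: np.log(p / (1 - p))
    return np.array([np.log(l), np.log(h - l),
                     logit((o - l) / (h - l)), logit((c - l) / (h - l))])

def unconstrained_to_ohlc(z):
    # explicit inverse: any real 4-vector maps back to a valid OHLC tuple
    expit = lambda x: 1 / (1 + np.exp(-x))
    low = np.exp(z[0])
    width = np.exp(z[1])
    return low + width * expit(z[2]), low + width, low, low + width * expit(z[3])  # (O, H, L, C)

z = ohlc_to_unconstrained(o=101.0, h=103.5, l=100.2, c=102.8)
print(np.round(unconstrained_to_ohlc(z), 4))  # recovers (101.0, 103.5, 100.2, 102.8)
```

A VAR or VEC model could then be fitted to the four transformed series, with forecasts mapped back through the inverse so that predicted values automatically satisfy the OHLC constraints.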
By: | Apostolos Chalkis; Emmanouil Christoforou; Theodore Dalamagkas; Ioannis Z. Emiris |
Abstract: | We exploit a recent computational framework to model and detect financial crises in stock markets, as well as shock events in cryptocurrency markets, which are characterized by a sudden or severe drop in prices. Our method manages to detect all past crises in the French industrial stock market starting with the crash of 1929, including financial crises after 1990 (e.g. dot-com bubble burst of 2000, stock market downturn of 2002), and all past crashes in the cryptocurrency market, namely in 2018, and also in 2020 due to covid-19. We leverage copulae clustering, based on the distance between probability distributions, in order to validate the reliability of the framework; we show that clusters contain copulae from similar market states such as normal states, or crises. Moreover, we propose a novel regression model that can detect successfully all past events using less than 10% of the information that the previous framework requires. We train our model by historical data on the industry assets, and we are able to detect all past shock events in the cryptocurrency market. Our tools provide the essential components of our software framework that offers fast and reliable detection, or even prediction, of shock events in stock and cryptocurrency markets of hundreds of assets. |
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2103.13294&r=all |
By: | Shoya Ishimaru |
Abstract: | This paper shows that a two-way fixed effects (TWFE) estimator is a weighted average of first-difference (FD) estimators with different gaps between periods, generalizing a well-known equivalence theorem in a two-period panel. Exploiting the identity, I clarify required conditions for the causal interpretation of the TWFE estimator. I highlight its several limitations and propose a generalized estimator that overcomes the limitations. An empirical application on the estimates of the minimum wage effects illustrates that recognizing the numerical equivalence and making use of the generalized estimator enable more transparent understanding of what we get from the TWFE estimator. |
Date: | 2021–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2103.12374&r=all |
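The well-known two-period equivalence that the paper generalizes can be verified numerically: with T = 2, the two-way fixed effects coefficient coincides with the first-difference estimator. The simulated panel below is an illustrative assumption.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(9)
n = 200
alpha = rng.standard_normal(n)                                    # unit fixed effects
d = np.column_stack([np.zeros(n), rng.binomial(1, 0.5, n)])       # treatment only in period 2
y = alpha[:, None] + np.array([0.0, 0.3]) + 1.5 * d + rng.standard_normal((n, 2))

df = pd.DataFrame({"unit": np.repeat(np.arange(n), 2),
                   "time": np.tile([0, 1], n),
                   "d": d.flatten(), "y": y.flatten()})

# two-way fixed effects on the stacked panel
twfe = smf.ols("y ~ d + C(unit) + C(time)", data=df).fit().params["d"]
# first-difference regression (with intercept absorbing the time effect)
fd = smf.ols("dy ~ dd", data=pd.DataFrame({"dy": y[:, 1] - y[:, 0],
                                           "dd": d[:, 1] - d[:, 0]})).fit().params["dd"]
print(round(twfe, 6), round(fd, 6))  # identical up to numerical precision
```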