|
on Econometrics |
By: | Anoek Castelein (Erasmus University Rotterdam); Dennis Fok (Erasmus University Rotterdam); Richard Paap (Erasmus University Rotterdam) |
Abstract: | In this paper, we develop a general method for heterogeneous variable selection in Bayesian nonlinear panel data models. Heterogeneous variable selection refers to the possibility that subsets of units are unaffected by certain variables. It may be present in applications as diverse as health treatments, consumer choice-making, macroeconomics, and operations research. Our method additionally allows for other forms of cross-sectional heterogeneity. We consider a two-group approach for the model's unit-specific parameters: each unit-specific parameter is either equal to zero (heterogeneous variable selection) or comes from a Dirichlet process (DP) mixture of multivariate normals (other cross-sectional heterogeneity). We develop our approach for general nonlinear panel data models, encompassing multinomial logit and probit models, Poisson and negative binomial count models, and exponential models, among many others. For inference, we develop an efficient Bayesian MCMC sampler. In a Monte Carlo study, we find that our approach is able to capture heterogeneous variable selection whereas a "standard" DP mixture is not. In an empirical application, we find that accounting for heterogeneous variable selection and non-normality of the continuous heterogeneity leads to improved in-sample and out-of-sample performance and interesting insights. These findings illustrate the usefulness of our approach. |
Keywords: | Individualized variable selection, Dirichlet process, Stochastic search, Heterogeneity, Attribute non-attendance, Feature selection, Bayesian |
JEL: | C23 C11 |
Date: | 2020–09–22 |
URL: | http://d.repec.org/n?u=RePEc:tin:wpaper:20200061&r=all |
By: | Cui, Guowei; Norkute, Milda; Sarafidis, Vasilis; Yamagata, Takashi |
Abstract: | This paper puts forward a new instrumental variables (IV) approach for linear panel data models with interactive effects in the error term and regressors. The instruments are transformed regressors and so it is not necessary to search for external instruments. The proposed method asymptotically eliminates the interactive effects in the error term and in the regressors separately in two stages. We propose a two-stage IV (2SIV) and a mean-group IV (MGIV) estimator for homogeneous and heterogeneous slope models, respectively. The asymptotic analysis for the models with homogeneous slopes reveals that: (i) the $\sqrt{NT}$-consistent 2SIV estimator is free from asymptotic bias that could arise due to the correlation between the regressors and the estimation error of the interactive effects; (ii) under the same set of assumptions, existing popular estimators, which eliminate interactive effects either jointly in the regressors and the error term, or only in the error term, can suffer from asymptotic bias; (iii) the proposed 2SIV estimator is asymptotically as efficient as the bias-corrected version of estimators that eliminate interactive effects jointly in the regressors and the error; whilst (iv) the relative efficiency of the estimators that eliminate interactive effects only in the error term is indeterminate. A Monte Carlo study confirms good approximation quality of our asymptotic results and competent performance of 2SIV and MGIV in comparison with existing estimators. Furthermore, it demonstrates that the bias-corrections can be imprecise and noticeably inflate the dispersion of the estimators in finite samples. |
Keywords: | Large panel data; interactive effects; common factors; principal components analysis; instrumental variables. |
JEL: | C13 C15 C23 C26 |
Date: | 2020–09–09 |
URL: | http://d.repec.org/n?u=RePEc:pra:mprapa:102827&r=all |
By: | Alfelt, Gustav (Stockholm University); Mazur, Stepan (Örebro University School of Business) |
Abstract: | In this paper, we consider the sample estimator of the tangency portfolio (TP) weights, where the inverse of the sample covariance matrix plays an important role. We assume that the number of observations is less than the number of assets in the portfolio, and the returns are independent and identically multivariate normally distributed. Under these assumptions, the sample covariance matrix follows a singular Wishart distribution and, therefore, the regular inverse cannot be taken. This paper delivers bounds and approximations for the first two moments of the estimated TP weights, as well as exact results when the population covariance matrix is equal to the identity matrix, employing the Moore-Penrose inverse. Moreover, exact moments based on the reflexive generalized inverse are provided. The properties of the bounds are investigated in a simulation study, where they are compared to the sample moments. The difference between the moments based on the reflexive generalized inverse and the sample moments based on the Moore-Penrose inverse is also studied. |
Keywords: | Tangency portfolio; Singular inverse Wishart; Moore-Penrose inverse; Reflexive generalized inverse; Estimator moments. |
JEL: | C13 C58 |
Date: | 2020–09–25 |
URL: | http://d.repec.org/n?u=RePEc:hhs:oruesi:2020_008&r=all |
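The setting described in the abstract above (fewer observations than assets, so the sample covariance matrix is singular) can be illustrated with a minimal NumPy sketch. The weight formula $w = \alpha^{-1}\Sigma^{-1}(\mu - r_f \mathbf{1})$, the risk-aversion value `alpha`, the risk-free rate, and the simulated returns are all assumptions made for illustration; the paper's exact definitions and moment results are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(42)
n_obs, n_assets = 5, 10          # fewer observations than assets
risk_free, alpha = 0.0, 3.0      # hypothetical risk-free rate and risk aversion

# i.i.d. multivariate normal returns with identity covariance (illustrative only)
returns = rng.normal(loc=0.05, scale=1.0, size=(n_obs, n_assets))

mean_hat = returns.mean(axis=0)
S = np.cov(returns, rowvar=False)   # singular: rank is at most n_obs - 1

# The regular inverse does not exist, so use the Moore-Penrose inverse instead
S_pinv = np.linalg.pinv(S, rcond=1e-10)

# Assumed form of the TP weights: w = (1 / alpha) * S^+ (mean - r_f * 1)
w_hat = S_pinv @ (mean_hat - risk_free) / alpha

rank_S = np.linalg.matrix_rank(S, tol=1e-10)
```

Here `rank_S` equals `n_obs - 1`, which is why `np.linalg.inv(S)` would not be usable, and `S_pinv` satisfies the defining identities of the Moore-Penrose inverse up to floating-point error.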
By: | Alexander Klein; Guy Tchuente |
Abstract: | This paper derives identification, estimation, and inference results using spatial differencing in sample selection models with unobserved heterogeneity. We show that under the assumption of smooth changes across space of the unobserved sub-location specific heterogeneities and inverse Mills ratio, key parameters of a sample selection model are identified. The smoothness of the sub-location specific heterogeneities implies a correlation in the outcomes. We assume that the correlation is restricted within a location or cluster and derive asymptotic results showing that as the number of independent clusters increases, the estimators are consistent and asymptotically normal. We also propose a formula for standard error estimation. A Monte-Carlo experiment illustrates the small sample properties of our estimator. The application of our procedure to estimate the determinants of the municipality tax rate in Finland shows the importance of accounting for unobserved heterogeneity. |
Date: | 2020–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2009.06570&r=all |
By: | Martin, William J. |
Abstract: | The gravity model is now widely used for policy analysis and hypothesis testing, but different estimators give sharply different parameter estimates and popular estimators are likely biased because dependent variables are limited-dependent, error variances are nonconstant, and missing data are frequently reported as zeros. Monte Carlo analysis based on real-world parameters for aggregate trade shows that the traditional Ordinary Least Squares estimator in logarithms is strongly biased downwards. The popular Poisson Pseudo Maximum Likelihood model also suffers from downward bias. An Eaton-Kortum maximum-likelihood approach dealing with the identified sources of bias provides unbiased parameter estimates. |
Keywords: | International Trade and Trade Rules,Trade and Services,Economic Conditions and Volatility,Financial Sector Policy,Transport Services |
Date: | 2020–09–09 |
URL: | http://d.repec.org/n?u=RePEc:wbk:wbrwps:9391&r=all |
By: | Harold D. Chiang; Kengo Kato; Yuya Sasaki |
Abstract: | We consider inference for high-dimensional exchangeable arrays where the dimension may be much larger than the cluster sizes. Specifically, we consider separately and jointly exchangeable arrays that correspond to multiway clustered and polyadic data, respectively. Such exchangeable arrays have seen a surge of applications in empirical economics. However, both exchangeability concepts induce highly complicated dependence structures, which poses a significant challenge for inference in high dimensions. In this paper, we first derive high-dimensional central limit theorems (CLTs) over the rectangles for the exchangeable arrays. Building on the high-dimensional CLTs, we develop novel multiplier bootstraps for the exchangeable arrays and derive their finite sample error bounds in high dimensions. The derivations of these theoretical results rely on new technical tools such as Hoeffding-type decomposition and maximal inequalities for the degenerate components in the Hoeffding-type decomposition for the exchangeable arrays. We illustrate applications of our bootstrap methods to robust inference in demand analysis, robust inference in extended gravity analysis, and penalty choice for $\ell_1$-penalized regression under multiway cluster sampling. |
Date: | 2020–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2009.05150&r=all |
By: | Kumbhakar, Subal C.; Peresetsky, Anatoly; Shchetynin, Yevgenii; Zaytsev, Alexey |
Abstract: | This paper formally proves that if inefficiency ($u$) is modelled through the variance of $u$, which is a function of $z$, then the marginal effects of $z$ on technical inefficiency ($TI$) and technical efficiency ($TE$) have opposite signs. This is true in the typical setup with normally distributed random error $v$ and exponentially or half-normally distributed $u$ for both conditional and unconditional $TI$ and $TE$. We also provide an example to show that signs of the marginal effects of $z$ on $TI$ and $TE$ may coincide for some ranges of $z$. If the real data come from a bimodal distribution of $u$, and we estimate a model with an exponential or half-normal distribution for $u$, the estimated efficiency and the marginal effect of $z$ on $TE$ would be wrong. Moreover, the rank correlations between the true and the estimated values of $TE$ could be small and even negative for some subsamples of data. This result is a warning that the interpretation of the results of applying standard models to real data should take into account this possible problem. The results are demonstrated by simulations. |
Keywords: | Productivity and competitiveness, stochastic frontier analysis, model misspecification, efficiency, inefficiency |
JEL: | C21 C51 D22 D24 M11 |
Date: | 2020–09 |
URL: | http://d.repec.org/n?u=RePEc:pra:mprapa:102797&r=all |
By: | Rico Krueger; Prateek Bansal; Michel Bierlaire; Thomas Gasos |
Abstract: | Models that are robust to aberrant choice behaviour have received limited attention in discrete choice analysis. In this paper, we analyse two robust alternatives to the multinomial probit (MNP) model. Both alternative models belong to the family of robit models, whose kernel error distributions are heavy-tailed t-distributions. The first model is the multinomial robit (MNR) model in which a generic degrees of freedom parameter controls the heavy-tailedness of the kernel error distribution. The second alternative, the generalised multinomial robit (Gen-MNR) model, has not been studied in the literature before and is more flexible than MNR, as it allows for alternative-specific marginal heavy-tailedness of the kernel error distribution. For both models, we devise scalable and gradient-free Bayes estimators. We compare MNP, MNR and Gen-MNR in a simulation study and a case study on transport mode choice behaviour. We find that both MNR and Gen-MNR deliver significantly better in-sample fit and out-of-sample predictive accuracy than MNP. Gen-MNR outperforms MNR due to its more flexible kernel error distribution. Also, Gen-MNR gives more reasonable elasticity estimates than MNP and MNR, in particular regarding the demand for under-represented alternatives in a class-imbalanced dataset. |
Date: | 2020–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2009.06383&r=all |
By: | del Barrio Castro, Tomás; Rachinger, Heiko |
Abstract: | To understand the impact of temporal aggregation on the properties of a seasonal long-memory process, the effects of skip and cumulation sampling on both stationary and nonstationary processes with poles at several potential frequencies are analyzed. By allowing for several poles in the disaggregated process, their interaction in the aggregated series is investigated. Further, by defining the process according to the truncated Type II definition, the proposed approach encompasses both stationary and nonstationary processes without requiring prior knowledge of the case. The frequencies in the aggregated series to which the poles in the disaggregated series are mapped can be directly deduced. Specifically, unlike cumulation sampling, skip sampling can impact non-seasonal memory properties. Moreover, with cumulation sampling, seasonal long-memory can vanish in some cases. Using simulations, the mapping of the frequencies implied by temporal aggregation is illustrated and the estimation of the memory at the different frequencies is analyzed. |
Keywords: | Aggregation, cumulation sampling, skip sampling, seasonal long memory. |
JEL: | C12 C22 |
Date: | 2020–04 |
URL: | http://d.repec.org/n?u=RePEc:pra:mprapa:102890&r=all |
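The two temporal aggregation schemes named in the abstract above can be sketched on a toy series. The code follows the common definitions (skip sampling keeps every m-th observation, cumulation sampling sums non-overlapping blocks of m observations); the series and the aggregation period `m` are hypothetical, and no long-memory process is simulated here.

```python
import numpy as np

m = 3                                  # aggregation period (hypothetical)
x = np.arange(1, 13, dtype=float)      # toy disaggregated series 1, 2, ..., 12

# Skip sampling: keep the last observation of each block of m (stock variables)
skip_sampled = x[m - 1::m]

# Cumulation sampling: sum each non-overlapping block of m (flow variables)
cumulated = x.reshape(-1, m).sum(axis=1)

print(skip_sampled)   # [ 3.  6.  9. 12.]
print(cumulated)      # [ 6. 15. 24. 33.]
```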
By: | Carolina Caetano; Gregorio Caetano; Eric R. Nielsen |
Abstract: | We show that in models with endogeneity, bunching at the lower or upper boundary of the distribution of the treatment variable may be used to build a correction for endogeneity. We derive the asymptotic distribution of the parameters of the corrected model, provide an estimator of the standard errors, and prove the consistency of the bootstrap. An empirical application reveals that time spent watching television, corrected for endogeneity, has roughly no net effect on cognitive skills and a significant negative net effect on non-cognitive skills in children. |
Keywords: | Bunching; Endogeneity; Bootstrap; Cross-sectional models; Childhood skill development; Clustering |
JEL: | C20 C21 C24 |
Date: | 2020–09–18 |
URL: | http://d.repec.org/n?u=RePEc:fip:fedgfe:2020-80&r=all |
By: | Sayar Karmakar; Arkaprava Roy |
Abstract: | Conditional heteroscedastic (CH) models are routinely used to analyze financial datasets. The classical models such as ARCH-GARCH with time-invariant coefficients are often inadequate to describe frequent changes over time due to market variability. However, we can achieve significantly better insight by considering the time-varying analogues of these models. In this paper, we propose a Bayesian approach to the estimation of such models and develop a computationally efficient MCMC algorithm based on Hamiltonian Monte Carlo (HMC) sampling. We also establish posterior contraction rates with increasing sample size in terms of the average Hellinger metric. The performance of our method is compared with frequentist estimates and estimates from the time-constant analogues. To conclude the paper, we obtain time-varying parameter estimates for some popular Forex (currency conversion rate) and stock market datasets. |
Date: | 2020–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2009.06007&r=all |
By: | Kirill Borusyak; Peter Hull |
Abstract: | We develop new tools for causal inference in settings where exogenous shocks affect the treatment status of multiple observations jointly, to different extents. In these settings researchers may construct treatments or instruments that combine the shocks with predetermined measures of shock exposure. Examples include measures of spillovers in social and transportation networks, simulated eligibility instruments, and shift-share instruments. We show that leveraging the exogeneity of shocks for identification generally requires a simple but nonstandard recentering, derived from the specification of counterfactual shocks that might as well have been realized. We further show how specification of counterfactual shocks can be used for finite-sample inference and specification tests, and we characterize the recentered instruments that are asymptotically efficient. We use this framework to estimate the employment effects of Chinese market access growth due to high-speed rail construction and the insurance coverage effects of expanded Medicaid eligibility. |
JEL: | C21 C26 F14 I13 R40 |
Date: | 2020–09 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:27845&r=all |
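The recentering idea in the abstract above can be sketched for a shift-share-style instrument. This is a minimal illustration under strong assumptions that are not taken from the paper: shocks are treated as exchangeable, so random permutations of the realized shocks serve as the counterfactual shocks, and all names (`exposure`, `g`, `n_draws`) and the simulated data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_shocks, n_draws = 50, 20, 2000

g = rng.normal(size=n_shocks)                            # realized shocks
shares = rng.dirichlet(np.ones(n_shocks), size=n_units)  # rows sum to one
scale = rng.uniform(0.3, 1.0, size=(n_units, 1))         # incomplete exposure
exposure = shares * scale                                # predetermined exposure

z = exposure @ g                                         # constructed instrument

# Expected instrument: average over counterfactual shocks (here, permutations)
draws = np.array([exposure @ rng.permutation(g) for _ in range(n_draws)])
mu_sim = draws.mean(axis=0)

# Recentered instrument: subtract each unit's expected instrument
z_recentered = z - mu_sim

# Under permutation counterfactuals the expectation has a closed form
mu_exact = exposure.sum(axis=1) * g.mean()
```

Because total exposure differs across units, the expected instrument `mu_exact` differs across units too; subtracting it removes the part of `z` that reflects exposure rather than the shocks themselves.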
By: | Yanqin Fan; Marc Henry |
Abstract: | This paper introduces vector copulas and establishes a vector version of Sklar's theorem. The latter provides a theoretical justification for the use of vector copulas to characterize nonlinear or rank dependence between a finite number of random vectors (robust to within vector dependence), and to construct multivariate distributions with any given non-overlapping multivariate marginals. We construct Elliptical, Archimedean, and Kendall families of vector copulas and present algorithms to generate data from them. We introduce a concordance ordering for two random vectors with given within-dependence structures and generalize Spearman's rho to random vectors. Finally, we construct empirical vector copulas and show their consistency under mild conditions. |
Date: | 2020–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2009.06558&r=all |
By: | Marc-Aurèle Divernois (EPFL; Swiss Finance Institute) |
Abstract: | This paper proposes a machine learning approach to estimate physical forward default intensities. Default probabilities are computed using artificial neural networks to estimate the intensities of the inhomogeneous Poisson processes governing the default process. The major contribution to previous literature is to allow the estimation of non-linear forward intensities by using neural networks instead of classical maximum likelihood estimation. The model specification allows an easy replication of previous literature using a linear assumption and shows the improvement that can be achieved. |
Keywords: | Bankruptcy, Credit Risk, Default, Machine Learning, Neural Networks, Doubly Stochastic, Forward Poisson Intensities |
JEL: | C22 C23 C53 C58 G33 G34 |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:chf:rpseri:rp2079&r=all |
By: | KAINOU Kazunari |
Abstract: | This paper presents a new treatment effect evaluation methodology for cases where the Stable Unit Treatment Value Assumption (SUTVA) may not hold, experimental methods are hard to apply, and only limited observational data are available. A survey of the preceding research shows that treatment effect evaluation rests on roughly four assumptions, including SUTVA, and confirms that only one experimental methodology is currently available for SUTVA violations with interference effects. This paper shows that four additional assumptions are necessary to identify the interference effects of treatment on the outcome of the control group. Under those four identification assumptions, the method prepares many sets of three independent random numbers and estimates average coefficients by iterating regressions of the before-after difference by difference-in-differences, using these random number sets with the same random number multiplied into their after-treatment samples. This enables estimation of the interference-induced bias, which in turn makes it possible to correct the bias contained in the conventional treatment effect estimated by difference-in-differences with the synthetic control group methodology. These results are verified by Monte Carlo simulation, which confirms the estimation accuracy and precision. This paper also presents a practical application procedure and the relevant placebo-study procedure for the new methodology. Lastly, this paper reports the results of a road test of the new methodology using the case of rice wholesale price changes, by Japanese domestic origin and brand, caused by the Great East Japan Earthquake and the Fukushima Number One Plant Nuclear Accident. |
Date: | 2020–09 |
URL: | http://d.repec.org/n?u=RePEc:eti:rdpsjp:20035&r=all |
By: | Damjana Kokol Bukovšek; Tomaž Košir; Blaž Mojškerc; Matjaž Omladič |
Abstract: | Copulas are becoming an essential tool in analyzing data and knowing local copula bounds with a fixed value of a given measure of association is turning into a prerequisite in the early stage of exploratory data analysis. These bounds have been computed for Spearman's rho, Kendall's tau, and Blomqvist's beta. The importance of another two measures of association, Spearman's footrule and Gini's gamma, has been reconfirmed recently. It is the main purpose of this paper to fill in the gap and present the mentioned local bounds for these two measures as well. It turns out that this is a quite non-trivial endeavor as the bounds are quasi-copulas that are not copulas for certain values of the two measures. We also give relations between these two measures of association and Blomqvist's beta. |
Date: | 2020–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2009.06221&r=all |
By: | Alejandro Puerta; Andrés Ramírez-Hassan |
Abstract: | This paper proposes a Bayesian approach to perform inference regarding the size of hidden populations at the analytical region level using reported statistics. To do so, we propose a specification taking into account one-sided error components and spatial effects within a panel data structure. Our simulation exercises suggest good finite sample performance. We analyze rates of crime suspects living per neighborhood in Medellín (Colombia) associated with four crime activities. Our proposal seems to identify hot spots or "crime communities", potential neighborhoods where under-reporting is more severe, and also drivers of crime schools. Statistical evidence suggests a high level of interaction between homicides and drug dealing on the one hand, and motorcycle and car thefts on the other hand. |
Date: | 2020–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2009.05360&r=all |
By: | Andrea Carriero; Todd E. Clark; Massimiliano Marcellino |
Abstract: | We derive a Bayesian prior from a no-arbitrage affine term structure model and use it to estimate the coefficients of a vector autoregression of a panel of government bond yields, specifying a common time-varying volatility for the disturbances. Results based on US data show that this method improves the precision of both point and density forecasts of the term structure of government bond yields, compared to a fully fledged term structure model with time-varying volatility and to a no-change random walk forecast. Further analysis reveals that the approach might work better than an exact term structure model because it relaxes the requirements that yields obey a strict factor structure and that the factors follow a Markov process. Instead, the cross-equation no-arbitrage restrictions on the factor loadings play a marginal role in producing forecasting gains. |
Keywords: | Term structure; volatility; density forecasting; no arbitrage |
JEL: | C32 C53 E43 E47 G12 |
Date: | 2020–09–22 |
URL: | http://d.repec.org/n?u=RePEc:fip:fedcwq:88748&r=all |
By: | Peng Ding |
Abstract: | The Frisch–Waugh–Lovell Theorem states the equivalence of the coefficients from the full and partial regressions. I further show the equivalence between various standard errors. Applying the new result to stratified experiments reveals the discrepancy between model-based and design-based standard errors. |
Date: | 2020–09 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2009.06621&r=all |
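The classical coefficient equivalence stated in the abstract above can be checked numerically. The sketch below uses simulated data and hypothetical names (`X1` for the regressors of interest, `X2` for the controls); it illustrates only the well-known Frisch–Waugh–Lovell result for coefficients and residuals, not the paper's new results on standard errors.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X1 = rng.normal(size=(n, 2))                                  # regressors of interest
X2 = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])   # controls with intercept
y = (X1 @ np.array([1.0, -0.5])
     + X2 @ np.array([0.3, 0.2, 0.0, -1.0])
     + rng.normal(size=n))

def ols(X, y):
    """Least-squares coefficients of y on X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def residualize(A, B):
    """Residuals from regressing (each column of) A on B."""
    return A - B @ np.linalg.lstsq(B, A, rcond=None)[0]

# Full regression: y on [X1, X2]; keep the coefficients on X1
X_full = np.column_stack([X1, X2])
coef_full = ols(X_full, y)
beta_full = coef_full[:2]
resid_full = y - X_full @ coef_full

# Partial regression: residualize y and X1 on X2, then regress
y_tilde = residualize(y, X2)
X1_tilde = residualize(X1, X2)
beta_partial = ols(X1_tilde, y_tilde)
resid_partial = y_tilde - X1_tilde @ beta_partial
```

Both `beta_full` and `beta_partial`, and both residual vectors, agree up to floating-point error.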