nep-cmp 2022-05-30 papers

on Computational Economics

Issue of 2022‒05‒30
fifteen papers chosen by

Stacking machine-learning models for anomaly detection: comparing AnaCredit to other banking datasets By Pasquale Maddaloni; Davide Nicola Continanza; Andrea del Monaco; Daniele Figoli; Marco di Lucido; Filippo Quarta; Giuseppe Turturiello
Supervised machine learning classification for short straddles on the S&P500 By Alexander Brunhuemer; Lukas Larcher; Philipp Seidl; Sascha Desmettre; Johannes Kofler; Gerhard Larcher
Discovering material information using hierarchical Reformer model on financial regulatory filings By Francois Mercier; Makesh Narsimhan
Is your machine better than you? You may never know By Francis de Véricourt; Huseyin Gurkan
A Neural Network Approach to the Environmental Kuznets Curve By Mikkel Bennedsen; Eric Hillebrand; Sebastian Jensen
Application of the XGBoost algorithm and Bayesian optimization for the Bitcoin price prediction during the COVID-19 period By Jakub Drahokoupil
Policy Gradient Stock GAN for Realistic Discrete Order Data Generation in Financial Markets By Masanori Hirano; Hiroki Sakaji; Kiyoshi Izumi
High Performance Export Portfolio: Design Growth-Enhancing Export Structure with Machine Learning By Ms. Natasha X Che; Xuege Zhang
Modeling dynamic volatility under uncertain environment with fuzziness and randomness By Xianfei Hui; Baiqing Sun; Yan Zhou
Fair Governance with Humans and Machines By Yoan Hermstrüwer; Pascal Langenbach
Assessment of Support Vector Machine performance for default prediction and credit rating By Karim Amzile; Mohamed Habachi
Sequence-Based Target Coin Prediction for Cryptocurrency Pump-and-Dump By Sihao Hu; Zhen Zhang; Shengliang Lu; Bingsheng He; Zhao Li
Fuzzy Expert System for Stock Portfolio Selection: An Application to Bombay Stock Exchange By Gour Sundar Mitra Thakur; Rupak Bhattacharyyab; Seema Sarkar
Adaptive Multi-Strategy Market-Making Agent For Volatile Markets By Ali Raheman; Anton Kolonin; Alexey Glushchenko; Arseniy Fokin; Ikram Ansari
Pareto Optimization in Categories By Matilde Marcolli

Stacking machine-learning models for anomaly detection: comparing AnaCredit to other banking datasets

By:	Pasquale Maddaloni (Bank of Italy); Davide Nicola Continanza (Bank of Italy); Andrea del Monaco (Bank of Italy); Daniele Figoli (Bank of Italy); Marco di Lucido (Bank of Italy); Filippo Quarta (Bank of Italy); Giuseppe Turturiello (Bank of Italy)
Abstract:	This paper addresses the issue of assessing the quality of granular datasets reported by banks via machine learning models. In particular, it investigates how supervised and unsupervised learning algorithms can exploit patterns that can be recognized in other data sources dealing with similar phenomena (although these phenomena are available at a different level of aggregation), in order to detect potential outliers to be submitted to banks for their own checks. The above machine learning algorithms are finally stacked in a semi-supervised fashion in order to enhance their individual outlier detection ability. The described methodology is applied to compare the granular AnaCredit dataset, firstly with the Balance Sheet Items statistics (BSI), and secondly with the harmonised supervisory statistics of the Financial Reporting (FinRep), which are compiled for the Eurosystem and the Single Supervisory Mechanism, respectively. In both cases, we show that the performance of the stacking technique, in terms of F1-score, is higher than in each algorithm alone.
Keywords:	banking data, data quality management, outlier and anomaly detection, machine learning, auto-encoder, robust regression, pseudo labelling
JEL:	C18 C81 G21
Date:	2022–04
URL:	http://d.repec.org/n?u=RePEc:bdi:opques:qef_689_22&r=

Supervised machine learning classification for short straddles on the S&P500

By:	Alexander Brunhuemer; Lukas Larcher; Philipp Seidl; Sascha Desmettre; Johannes Kofler; Gerhard Larcher
Abstract:	In this working paper we present our current progress in training machine learning models to execute short option strategies on the S&P500. As a first step, this paper breaks the problem down into a supervised classification task that decides, on a daily basis, whether a short straddle on the S&P500 should be executed. We describe the framework we use and present an overview of our evaluation metrics for different classification models. In this preliminary work, using standard machine learning techniques and without hyperparameter search, we find no statistically significant outperformance over a simple "trade always" strategy, but gain additional insights on how we could proceed in further experiments.
Date:	2022–04
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2204.13587&r=

Discovering material information using hierarchical Reformer model on financial regulatory filings

By:	Francois Mercier; Makesh Narsimhan
Abstract:	Most applications of machine learning for finance are related to forecasting tasks for investment decisions. Instead, we aim to promote a better understanding of financial markets with machine learning techniques. Leveraging the tremendous progress in deep learning models for natural language processing, we construct a hierarchical Reformer ([15]) model capable of processing a large document-level dataset, SEDAR, from Canadian financial regulatory filings. Using this model, we show that it is possible to predict trade volume changes using regulatory filings. We adapt the pretraining task of HiBERT ([36]) to obtain good sentence-level representations using a large unlabelled document dataset. Finetuning the model to successfully predict trade volume changes indicates that the model captures a view from financial markets and that processing regulatory filings is beneficial. Analyzing the attention patterns of our model reveals that it is able to detect some indications of material information without explicit training, which is highly relevant for investors and also for the market surveillance mandate of financial regulators.
Date:	2022–03
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2204.05979&r=

Is your machine better than you? You may never know

By:	Francis de Véricourt (ESMT European School of Management and Technology GmbH); Huseyin Gurkan (ESMT European School of Management and Technology GmbH)
Abstract:	Artificial intelligence systems are increasingly demonstrating their capacity to make better predictions than human experts. Yet, recent studies suggest that professionals sometimes doubt the quality of these systems and overrule machine-based prescriptions. This paper explores the extent to which a decision maker (DM) supervising a machine to make high-stake decisions can properly assess whether the machine produces better recommendations. To that end, we study a set-up, in which a machine performs repeated decision tasks (e.g., whether to perform a biopsy) under the DM’s supervision. Because stakes are high, the DM primarily focuses on making the best choice for the task at hand. Nonetheless, as the DM observes the correctness of the machine’s prescriptions across tasks, she updates her belief about the machine. However, the DM observes the machine’s correctness only if she ultimately decides to act on the task. Further, the DM sometimes overrides the machine depending on her belief, which affects learning. In this set-up, we characterize the evolution of the DM’s belief and overruling decisions over time. We identify situations under which the DM hesitates forever whether the machine is better, i.e., she never fully ignores but regularly overrules it. Moreover, the DM sometimes wrongly believes with positive probability that the machine is better. We fully characterize the conditions under which these learning failures occur and explore how mistrusting the machine affects them. Our results highlight some fundamental limitations in determining whether machines make better decisions than experts and provide a novel explanation for human-machine complementarity.
Keywords:	machine accuracy, decision making, human-in-the-loop, algorithm aversion, dynamic learning
Date:	2022–05–23
URL:	http://d.repec.org/n?u=RePEc:esm:wpaper:esmt-22-02&r=

A Neural Network Approach to the Environmental Kuznets Curve

By:	Mikkel Bennedsen (Aarhus University and CREATES); Eric Hillebrand (Aarhus University and CREATES); Sebastian Jensen (Aarhus University and CREATES)
Abstract:	We investigate the relationship between per capita gross domestic product and per capita carbon dioxide emissions using national-level panel data for the period 1960-2018. We propose a novel semiparametric panel data methodology that combines country and time fixed effects with a nonparametric neural network regression component. Globally and for the regions OECD and Asia, we find evidence of an inverse U-shaped relationship, often referred to as an environmental Kuznets curve (EKC). For OECD, the EKC-shape disappears when using consumption-based emissions data, suggesting the EKC-shape observed for OECD is driven by emissions exports. For Asia, the EKC-shape becomes even more pronounced when using consumption-based emissions data and exhibits an earlier turning point.
Keywords:	Territorial carbon dioxide emissions, Consumption-based carbon dioxide emissions, Environmental Kuznets curve, Climate econometrics, Panel data, Machine learning, Neural networks
JEL:	C14 C23 C45 C51 C52 C53
Date:	2022–05–24
URL:	http://d.repec.org/n?u=RePEc:aah:create:2022-09&r=

Application of the XGBoost algorithm and Bayesian optimization for the Bitcoin price prediction during the COVID-19 period

By:	Jakub Drahokoupil
Abstract:	The aim of this paper is to use the XGBoost machine learning algorithm, developed by Tianqi Chen and Carlos Guestrin in 2016, to predict the future development of the Bitcoin (BTC) price and to build an algorithmic trading strategy based on the model's predictions. For the final algorithmic strategy, six XGBoost models are estimated in total, predicting the BTC Close price 1, 2, 5, 10, 20 and 30 days ahead. Bayesian optimization techniques are used twice during the development of the trading strategy: first, to select appropriate hyperparameters of the XGBoost model; second, to optimize the weight of each model's prediction in order to obtain the most profitable trading strategy. The paper shows that, even though the XGBoost model has several limitations, it can predict the future development of the BTC price fairly accurately, even at longer horizons. The paper focuses specifically on the potential of algorithmic trading during the COVID-19 period, when BTC went through an extremely volatile period, reaching new all-time highs as well as 50% losses within a few consecutive months. The applied trading strategy shows promising results, as it beats the buy-and-hold (B&H) strategy in terms of total profit, Sharpe ratio and Sortino ratio.
Keywords:	XGBoost, Bayesian Optimization, Bitcoin, Algorithmic trading
JEL:	C11 C39 C61 G11
Date:	2022–03–24
URL:	http://d.repec.org/n?u=RePEc:prg:jnlwps:v:4:y:2022:id:4.006&r=
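
The abstract above compares strategies by Sharpe and Sortino ratio. As a rough illustration of what these two metrics measure, here is a minimal sketch under stated assumptions: per-period returns, a zero risk-free rate and zero downside target by default, and no annualization. The function names and the return series are made up for illustration and are not taken from the paper.

```python
import math

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(var)

def sortino_ratio(returns, target=0.0):
    """Mean return above the target divided by the downside deviation.

    Only returns below the target contribute to the denominator.
    """
    excess = [r - target for r in returns]
    mean = sum(excess) / len(excess)
    downside = math.sqrt(sum(min(x, 0.0) ** 2 for x in excess) / len(excess))
    return mean / downside

# Hypothetical per-period returns, invented for this example only.
strategy = [0.02, -0.01, 0.03, 0.01, -0.02]
buy_hold = [0.05, -0.06, 0.04, 0.03, -0.05]

for name, series in (("strategy", strategy), ("buy-and-hold", buy_hold)):
    print(name, round(sharpe_ratio(series), 3), round(sortino_ratio(series), 3))
```

The Sortino ratio penalizes only downside moves, so a strategy with large upside swings can score better on it than on the Sharpe ratio.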

Policy Gradient Stock GAN for Realistic Discrete Order Data Generation in Financial Markets

By:	Masanori Hirano; Hiroki Sakaji; Kiyoshi Izumi
Abstract:	This study proposes a new generative adversarial network (GAN) for generating realistic orders in financial markets. In some previous works, GANs for financial markets generated fake orders in continuous spaces because of the learning limitations of GAN architectures. However, in reality, orders are discrete: order prices, for example, have a minimum price unit, and order types are discrete as well. Thus, in this study we change the generation method to place the generated fake orders into discrete spaces. Because this change disabled the ordinary GAN learning algorithm, this study employed a policy gradient, frequently used in reinforcement learning, as the learning algorithm. Through our experiments, we show that our proposed model outperforms previous models in generated order distribution. As an additional benefit of introducing the policy gradient, the entropy of the generated policy can be used to check the GAN's learning status. In the future, higher performance GANs, better evaluation methods, or the applications of our GANs can be addressed.
Date:	2022–04
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2204.13338&r=

High Performance Export Portfolio: Design Growth-Enhancing Export Structure with Machine Learning

By:	Ms. Natasha X Che; Xuege Zhang
Abstract:	This paper studies the relationship between export structure and growth performance. We design an export recommendation system using a collaborative filtering algorithm based on countries' revealed comparative advantages. The system is used to produce export portfolio recommendations covering over 190 economies and over 30 years. We find that economies with their export structure more aligned with the recommended export structure achieve better growth performance, in terms of both higher GDP growth rate and lower growth volatility. These findings demonstrate that export structure matters for obtaining high and stable growth. Our recommendation system can serve as a practical tool for policymakers seeking actionable insights on their countries’ export potential and diversification strategies that may be complex and hard to quantify.
Keywords:	export diversification, comparative advantage, machine learning, collaborative filtering, economic growth, international trade; export structure; export portfolio recommendation; export recommendation system; performance export portfolio; export potential; Exports; Comparative advantage; Export diversification; Human capital; Total factor productivity; Global; East Asia
Date:	2022–04–29
URL:	http://d.repec.org/n?u=RePEc:imf:imfwpa:2022/075&r=
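
The recommendation system in the abstract above builds on countries' revealed comparative advantages (RCA). A minimal sketch of one common RCA definition, the Balassa index, is given below. Whether the paper uses exactly this index is an assumption, and the collaborative filtering step itself is not reproduced. The function name and the toy export figures are hypothetical.

```python
def balassa_rca(exports):
    """Balassa index per country and product.

    exports: dict mapping country -> dict mapping product -> export value.
    RCA = (product share in the country's exports) / (product share in world exports).
    A value above 1 is usually read as a revealed comparative advantage.
    """
    world_total = sum(sum(products.values()) for products in exports.values())
    product_total = {}
    for products in exports.values():
        for product, value in products.items():
            product_total[product] = product_total.get(product, 0.0) + value
    rca = {}
    for country, products in exports.items():
        country_total = sum(products.values())
        rca[country] = {
            product: (value / country_total) / (product_total[product] / world_total)
            for product, value in products.items()
        }
    return rca

# Toy two-country, two-product world, invented for this example only.
exports = {
    "A": {"wheat": 80.0, "chips": 20.0},
    "B": {"wheat": 20.0, "chips": 80.0},
}
rca = balassa_rca(exports)
print(rca["A"])
```

In this toy world, country A's wheat share (0.8) is well above the world wheat share (0.5), so its wheat RCA exceeds 1.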

Modeling dynamic volatility under uncertain environment with fuzziness and randomness

By:	Xianfei Hui; Baiqing Sun; Yan Zhou
Abstract:	Predicting the dynamic volatility in financial markets provides a promising method for risk prediction, asset pricing and market supervision. The Barndorff-Nielsen and Shephard (BN-S) model, used to capture the stochastic behavior of high-frequency time series, is an accepted stochastic volatility model with a Lévy process. Although this model is attractive and successful in theory, it needs to be improved in application. We build a new generalized BN-S model suitable for an uncertain environment with fuzziness and randomness. This new model considers the delay phenomenon between price fluctuation and volatility changes, and solves the problem of the lack of long-range dependence of classic models. Calculation results show that the new model outperforms the classic model in volatility forecasting. Experiments on Dow Jones Industrial Average futures price data are conducted to verify the feasibility and practicability of our proposed approach. Numerical examples are provided to illustrate the theoretical result. Three machine learning algorithms are applied to estimate the new model parameter. Compared with the classical model, our method effectively combines the uncertain environmental characteristics, which makes the prediction of dynamic volatility more flexible and has ideal performance.
Date:	2022–04
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2204.12657&r=

Fair Governance with Humans and Machines

By:	Yoan Hermstrüwer (Max Planck Institute for Research on Collective Goods, Bonn); Pascal Langenbach (Max Planck Institute for Research on Collective Goods, Bonn)
Abstract:	How fair are government decisions based on algorithmic predictions? And to what extent can the government delegate decisions to machines without sacrificing procedural fairness? Using a set of vignettes in the context of predictive policing, school admissions, and refugee-matching, we explore how different degrees of human-machine interaction affect fairness perceptions and procedural preferences. We implement four treatments varying the extent of responsibility delegation to the machine and the degree of human involvement in the decision-making process, ranging from full human discretion, machine-based predictions with high human involvement, machine-based predictions with low human involvement, and fully machine-based decisions. We find that machine-based predictions with high human involvement yield the highest and fully machine-based decisions the lowest fairness scores. Different accuracy assessments can partly explain these differences. Fairness scores follow a similar pattern across contexts, with a negative level effect and lower fairness perceptions of human decisions in the context of predictive policing. Our results shed light on the behavioral foundations of several legal human-in-the-loop rules.
Keywords:	algorithms, predictive policing, school admissions, refugee-matching, fairness
Date:	2022–05–24
URL:	http://d.repec.org/n?u=RePEc:mpg:wpaper:2022_04&r=

Assessment of Support Vector Machine performance for default prediction and credit rating

By:	Karim Amzile (Université Mohammed V); Mohamed Habachi (Université Mohammed V)
Abstract:	Predicting the creditworthiness of bank customers is a major concern for banking institutions, as modeling the probability of default is a key focus of the Basel regulations. Practitioners propose different default modeling techniques such as linear discriminant analysis, logistic regression, Bayesian approach, and artificial intelligence techniques. The performance of the default prediction is evaluated by the Receiver Operating Characteristic (ROC) curve using three types of kernels, namely, the polynomial kernel, the linear kernel and the Gaussian kernel. To justify the performance of the model, the study compares the default predictions of the support vector machine with those of logistic regression, using data from a portfolio of particular bank customers. The results show that the model based on the Support Vector Machine approach with the Radial Basis Function kernel predicts better than the logistic regression model, with a value of the ROC curve equal to 98%, against 71.7% for the logistic regression model. This paper also presents the design of a support vector machine-based rating tool intended to classify bank customers and determine their probability of default. This probability has been computed empirically and represents the proportion of defaulting customers in each class.
Keywords:	bank,credit risk,data mining,probability of default,scoring,artificial intelligence
Date:	2022
URL:	http://d.repec.org/n?u=RePEc:hal:journl:halshs-03643738&r=

Sequence-Based Target Coin Prediction for Cryptocurrency Pump-and-Dump

By:	Sihao Hu; Zhen Zhang; Shengliang Lu; Bingsheng He; Zhao Li
Abstract:	As the pump-and-dump schemes (P&Ds) proliferate in the cryptocurrency market, it becomes imperative to detect such fraudulent activities in advance, to inform potentially susceptible investors before they become victims. In this paper, we focus on the target coin prediction task, i.e., to predict the pump probability of all coins listed in the target exchange before a pump. We conduct a comprehensive study of the latest P&Ds, investigate 709 events organized in Telegram channels from Jan. 2019 to Jan. 2022, and unearth some abnormal yet interesting patterns of P&Ds. Empirical analysis demonstrates that pumped coins exhibit intra-channel homogeneity and inter-channel heterogeneity, which inspires us to develop a novel sequence-based neural network named SNN. Specifically, SNN encodes each channel's pump history as a sequence representation via a positional attention mechanism, which filters useful information and alleviates the noise introduced when the sequence length is long. We also identify and address the coin-side cold-start problem in a practical setting. Extensive experiments show a lift of 1.6% AUC and 41.0% Hit Ratio@3 brought by our method, making it well-suited for real-world application. As a side contribution, we release the source code of our entire data science pipeline on GitHub, along with the dataset tailored for studying the latest P&Ds.
Date:	2022–04
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2204.12929&r=
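
The abstract above reports results in terms of Hit Ratio@3. As a minimal sketch of how such a ranking metric is commonly computed: an event counts as a hit when the coin that was actually pumped is among the k coins with the highest predicted pump probability. The function name, coin tickers and scores below are hypothetical, and the paper's exact evaluation protocol may differ.

```python
def hit_ratio_at_k(events, k=3):
    """Share of events whose true target coin is among the top-k predictions.

    events: list of (scores, target) pairs, where scores maps
    coin -> predicted pump probability and target is the pumped coin.
    """
    hits = 0
    for scores, target in events:
        top_k = sorted(scores, key=scores.get, reverse=True)[:k]
        hits += target in top_k
    return hits / len(events)

# Two invented pump events, for this example only.
events = [
    ({"AAA": 0.9, "BBB": 0.5, "CCC": 0.3, "DDD": 0.1}, "BBB"),  # hit: BBB ranks 2nd
    ({"AAA": 0.2, "BBB": 0.1, "CCC": 0.6, "DDD": 0.8}, "BBB"),  # miss: BBB ranks 4th
]
print(hit_ratio_at_k(events, k=3))
```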

Fuzzy Expert System for Stock Portfolio Selection: An Application to Bombay Stock Exchange

By:	Gour Sundar Mitra Thakur (Mondal); Rupak Bhattacharyyab (Mondal); Seema Sarkar (Mondal)
Abstract:	Selecting proper stocks before allocating investment ratios is always a crucial task for investors. The presence of many influencing factors in stock performance has motivated researchers to adopt various Artificial Intelligence (AI) techniques to make this challenging task easier. In this paper a novel fuzzy expert system model is proposed to evaluate and rank the stocks listed on the Bombay Stock Exchange (BSE). Dempster-Shafer (DS) evidence theory is used for the first time to automatically generate the consequents of the fuzzy rule base, reducing the effort of developing the expert system's knowledge base. A portfolio optimization model is then constructed in which the objective function is the ratio of the difference between the fuzzy portfolio return and the risk-free return to the weighted mean semi-variance of the assets used. The model is solved by applying the Ant Colony Optimization (ACO) algorithm, giving preference to the top-ranked stocks. The performance of the model proved to be satisfactory for a short-term investment period when compared with the recent performance of the stocks.
Date:	2022–04
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2204.13385&r=

Adaptive Multi-Strategy Market-Making Agent For Volatile Markets

By:	Ali Raheman; Anton Kolonin; Alexey Glushchenko; Arseniy Fokin; Ikram Ansari
Abstract:	Crypto-currency market uncertainty drives the need to find adaptive solutions to maximise gain or at least to avoid loss throughout the periods of trading activity. Given the high dimensionality and complexity of the state-action space in this domain, it can be treated as a "Narrow AGI" problem with the scope of goals and environments bound to financial markets. Adaptive Multi-Strategy Agent approach for market-making introduces a new solution to maximise positive "alpha" in long-term handling limit order book (LOB) positions by using multiple sub-agents implementing different strategies with a dynamic selection of these agents based on changing market conditions. AMSA provides no specific strategy of its own while being responsible for segmenting the periods of market-making activity into smaller execution sub-periods, performing internal backtesting on historical data on each of the sub-periods, doing sub-agent performance evaluation and re-selection of them at the end of each sub-period, and collecting returns and losses incrementally. With this approach, the return becomes a function of hyper-parameters such as market data granularity (refresh rate), the execution sub-period duration, number of active sub-agents, and their individual strategies. Sub-agent selection for the next trading sub-period is made based on return/loss and alpha values obtained during internal backtesting as well as real trading. Experiments with the AMSA have been performed under different market conditions relying on historical data and proved a high probability of positive alpha throughout the periods of trading activity in the case of properly selected hyper-parameters.
Date:	2022–04
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2204.13265&r=

Pareto Optimization in Categories

By:	Matilde Marcolli
Abstract:	We propose a model of Pareto optimization (multi-objective programming) in the context of a categorical theory of resources. We describe how to adapt multi-objective swarm intelligence algorithms to this categorical formulation.
Date:	2022–04
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2204.11931&r=

General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.

NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.