nep-big New Economics Papers
on Big Data
Issue of 2024‒04‒15
twenty-one papers chosen by
Tom Coupé, University of Canterbury


  1. Introducing a Global Dataset on Conflict Forecasts and News Topics By Mueller, H.; Rauh, C.; Seimon, B.
  2. Study of the Impact of the Big Data Era on Accounting and Auditing By Yuxiang Sun; Jingyi Li; Mengdie Lu; Zongying Guo
  3. Predicting Re-Employment: Machine Learning Versus Assessments by Unemployed Workers and by Their Caseworkers By Berg, Gerard J. van den; Kunaschk, Max; Lang, Julia; Stephan, Gesine; Uhlendorff, Arne
  4. From Factor Models to Deep Learning: Machine Learning in Reshaping Empirical Asset Pricing By Junyi Ye; Bhaskar Goswami; Jingyi Gu; Ajim Uddin; Guiling Wang
  5. Predicting IMF-Supported Programs: A Machine Learning Approach By Tsendsuren Batsuuri; Shan He; Ruofei Hu; Jonathan Leslie; Flora Lutz
  6. Understanding online purchases with explainable machine learning By João A. Bastos; Maria Inês Bernardes
  7. 빅데이터 기반의 국제거시경제 전망모형 개발 연구(Developing an International Macroeconomic Forecasting Model Based on Big Data) By Baek, Yaein; Yoon, Sang-Ha; Kim, Hyun Hak; Lee, Jiyun
  8. Generative Probabilistic Forecasting with Applications in Market Operations By Xinyi Wang; Lang Tong
  9. Generative Adversarial Networks Applied to Synthetic Financial Scenarios Generation By Matteo Rizzato; Julien Wallart; Christophe Geissler; Nicolas Morizet; Noureddine Boumlaik
  10. Measuring Unemployment Risk By Brendan J. Chapuis; John Coglianese
  11. Application of Deep Learning to Emulate an Agent-Based Model By Njiru, Ruth; Appel, Franziska; Dong, Changxing; Balmann, Alfons
  12. Pre-Publication Revisions of Bank Financial Statements: a novel way to monitor banks? By Andre Guettler; Mahvish Naeem; Lars Norden; Bernardus Van Doornik
  13. Prediction of Corporate Credit Ratings with Machine Learning: Simple Interpretative Models By Koresh Galil; Ami Hauptman; Rosit Levy Rosenboim
  14. Triple/Debiased Lasso for Statistical Inference of Conditional Average Treatment Effects By Masahiro Kato
  15. Movies By Stelios Michalopoulos; Christopher Rauh
  16. Old but Gold or New and Shiny? Comparing Tree Ensembles for Ordinal Prediction with a Classic Parametric Approach By Buczak, Philip; Horn, Daniel; Pauly, Markus
  17. Remotely measuring rural economic activity and poverty: Do we just need better sensors? By GIBSON, John; ZHANG, Xiaoxuan; PARK, Albert; YI, Jiang; XI, Li
  18. Option pricing in the Heston model with Physics inspired neural networks By Hainaut, Donatien; Casas, Alex
  19. Four Facts about International Central Bank Communication By Bertsch, Christoph; Hull, Isaiah; Lumsdaine, Robin L.; Zhang, Xin
  20. Talking in a language that everyone can understand? Clarity of speeches by the ECB Executive Board By Glas, Alexander; Müller, Lena
  21. Language-based game theory in the age of artificial intelligence By Valerio Capraro; Roberto Di Paolo; Matjaz Perc; Veronica Pizziol

  1. By: Mueller, H.; Rauh, C.; Seimon, B.
    Abstract: This article provides a structured description of openly available news topics and forecasts for armed conflict at the national and grid cell level starting January 2010. The news topics as well as the forecasts are updated monthly at conflictforecast.org and provide coverage for more than 170 countries and about 65,000 grid cells of size 55x55km worldwide. The forecasts rely on Natural Language Processing (NLP) and machine learning techniques to leverage a large corpus of newspaper text for predicting sudden onsets of violence in peaceful countries. Our goals are to: a) support conflict prevention efforts by making our risk forecasts available to practitioners and research teams worldwide, b) facilitate additional research that can utilise risk forecasts for causal identification, and c) provide an overview of the news landscape.
    Keywords: Civil War, Conflict, Forecasting, Machine Learning, News Topics, Random Forest, Topic Models
    Date: 2024–02–02
    URL: http://d.repec.org/n?u=RePEc:cam:camdae:2404&r=big
  2. By: Yuxiang Sun; Jingyi Li; Mengdie Lu; Zongying Guo
    Abstract: Big data revolutionizes accounting and auditing, offering deep insights but also introducing challenges like data privacy and security. With data from IoT, social media, and transactions, traditional practices are evolving. Professionals must adapt to these changes, utilizing AI and machine learning for efficient data analysis and anomaly detection. Key to overcoming these challenges are enhanced analytics tools, continuous learning, and industry collaboration. By addressing these areas, the accounting and auditing fields can harness big data's potential while ensuring accuracy, transparency, and integrity in financial reporting.
    Keywords: Big Data, Accounting, Audit, Data Privacy, AI, Machine Learning, Transparency
    Date: 2024–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2403.07180&r=big
  3. By: Berg, Gerard J. van den (University of Groningen, University Medical Center Groningen ; IFAU Uppsala ; ZEW ; IZA ; CEPR); Kunaschk, Max (Institute for Employment Research (IAB), Nuremberg, Germany); Lang, Julia (Institute for Employment Research (IAB), Nuremberg, Germany); Stephan, Gesine (Institute for Employment Research (IAB), Nuremberg, Germany); Uhlendorff, Arne (Institute for Employment Research (IAB), Nuremberg, Germany)
    Abstract: We analyze unique data on three sources of information on the probability of re-employment within 6 months (RE6) for the same individuals, sampled from the inflow into unemployment. First, they were asked for their perceived probability of RE6. Second, their caseworkers revealed whether they expected RE6. Third, random-forest machine learning methods are trained on administrative data on the full inflow to predict individual RE6. We compare the predictive performance of these measures and consider how combinations improve this performance. We show that self-reported (and to a lesser extent caseworker) assessments sometimes contain information not captured by the machine learning algorithm.
    Keywords: Federal Republic of Germany ; IAB open-access publication ; occupational reintegration ; external assessment ; Integrated Employment Biographies ; long-term unemployment ; profiling ; forecast accuracy ; risk assessment ; self-assessment ; caseworkers ; machine learning ; unemployed persons ; unemployment insurance ; unemployment duration ; labor market prospects ; 2012-2013
    JEL: C21 C41 C53 J64 J65 C55
    Date: 2024–02–08
    URL: http://d.repec.org/n?u=RePEc:iab:iabdpa:202403&r=big
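The head-to-head comparison of forecast sources described in this abstract can be illustrated with a toy calculation. The sketch below uses invented numbers, not the paper's data or models (the study trains random forests on German administrative data); it only shows how probability forecasts from different sources can be scored and combined:

```python
# Hypothetical sketch: scoring re-employment-within-6-months (RE6) forecasts
# from two sources and a simple combination. All numbers are invented.

def brier(probs, outcomes):
    """Mean squared error of probability forecasts (lower is better)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

# Synthetic outcomes (1 = re-employed within 6 months) and two forecasts.
outcomes    = [1, 0, 1, 1, 0, 1, 0, 0]
self_report = [0.9, 0.2, 0.7, 0.8, 0.4, 0.6, 0.3, 0.5]  # worker's own view
ml_score    = [0.8, 0.1, 0.6, 0.9, 0.3, 0.7, 0.2, 0.4]  # e.g. a random forest

# One simple combination: average the two probability forecasts.
combined = [(a + b) / 2 for a, b in zip(self_report, ml_score)]

for name, p in [("self", self_report), ("ml", ml_score), ("both", combined)]:
    print(name, round(brier(p, outcomes), 4))
```

In this toy setup the combination improves on the self-reports, mirroring the paper's question of whether the sources carry complementary information.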
  4. By: Junyi Ye; Bhaskar Goswami; Jingyi Gu; Ajim Uddin; Guiling Wang
    Abstract: This paper comprehensively reviews the application of machine learning (ML) and AI in finance, specifically in the context of asset pricing. It starts by summarizing the traditional asset pricing models and examining their limitations in capturing the complexities of financial markets. It explores how 1) ML models, including supervised, unsupervised, semi-supervised, and reinforcement learning, provide versatile frameworks to address these complexities, and 2) the incorporation of advanced ML algorithms into traditional financial models enhances return prediction and portfolio optimization. These methods can adapt to changing market dynamics by modeling structural changes and incorporating heterogeneous data sources, such as text and images. In addition, this paper explores challenges in applying ML in asset pricing, addressing the growing demand for explainability in decision-making and mitigating overfitting in complex models. This paper aims to provide insights into novel methodologies showcasing the potential of ML to reshape the future of quantitative finance.
    Date: 2024–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2403.06779&r=big
  5. By: Tsendsuren Batsuuri; Shan He; Ruofei Hu; Jonathan Leslie; Flora Lutz
    Abstract: This study applies state-of-the-art machine learning (ML) techniques to forecast IMF-supported programs, analyzes the ML prediction results relative to traditional econometric approaches, explores non-linear relationships among predictors indicative of IMF-supported programs, and evaluates model robustness with regard to different feature sets and time periods. ML models consistently outperform traditional methods in out-of-sample prediction of new IMF-supported arrangements with key predictors that align well with the literature and show consensus across different algorithms. The analysis underscores the importance of incorporating a variety of external, fiscal, real, and financial features as well as institutional factors like membership in regional financing arrangements. The findings also highlight the varying influence of data processing choices such as feature selection, sampling techniques, and missing data imputation on the performance of different ML models and therefore indicate the usefulness of a flexible, algorithm-tailored approach. Additionally, the results reveal that models that are most effective in near and medium-term predictions may tend to underperform over the long term, thus illustrating the need for regular updates or more stable – albeit potentially near-term suboptimal – models when frequent updates are impractical.
    Keywords: Early warning systems; IMF Lending; Machine Learning
    Date: 2024–03–08
    URL: http://d.repec.org/n?u=RePEc:imf:imfwpa:2024/054&r=big
  6. By: João A. Bastos; Maria Inês Bernardes
    Abstract: Customer profiling in e-commerce is a powerful tool that enables organizations to create personalized offers through direct marketing. One crucial objective of customer profiling is to predict whether a website visitor will make a purchase, thereby generating revenue. Machine learning models are the most accurate means to achieve this objective. However, the opaque nature of these models may deter companies from adopting them. Instead, they may prefer simpler models that allow for a clear understanding of the customer attributes that contribute to a purchase. In this study, we show that companies need not compromise on prediction accuracy to understand their online customers. By leveraging website data from a multinational communications service provider, we establish that the most pertinent customer attributes can be readily extracted from a black-box model. Specifically, we show that features measuring customer activity within the e-commerce platform are the most reliable predictors of conversions. Moreover, we uncover significant non-linear relationships between customer features and the likelihood of conversion.
    Keywords: Customer Profiling; Conversion; Direct marketing; Explainable artificial intelligence; SHAP value; Accumulated local effects.
    Date: 2024–03
    URL: http://d.repec.org/n?u=RePEc:ise:remwps:wp03132024&r=big
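A minimal sketch of the idea behind the SHAP values mentioned in the keywords: each feature's contribution is its average marginal effect over all orderings in which features are revealed to the model. The toy model, feature names, and values below are invented for illustration; the paper applies SHAP to a black-box model trained on real website data:

```python
# Exact Shapley values for a tiny, invented purchase-propensity model.
from itertools import permutations
from math import factorial

def score(x):
    # Hypothetical black box: site activity matters most, plus an interaction.
    return 0.5 * x["activity"] + 0.2 * x["visits"] + 0.1 * x["activity"] * x["visits"]

def shapley(model, x, baseline):
    """Average marginal contribution of each feature over all orderings."""
    features = list(x)
    phi = {f: 0.0 for f in features}
    for order in permutations(features):
        current = dict(baseline)
        prev = model(current)
        for f in order:
            current[f] = x[f]        # reveal feature f
            cur = model(current)
            phi[f] += cur - prev     # marginal contribution in this ordering
            prev = cur
    return {f: v / factorial(len(features)) for f, v in phi.items()}

customer = {"activity": 4, "visits": 2}
baseline = {"activity": 0, "visits": 0}
phi = shapley(score, customer, baseline)
# Defining property: contributions sum to score(customer) - score(baseline).
```

This exact enumeration is only feasible for a handful of features; libraries such as SHAP approximate the same quantity efficiently for real models.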
  7. By: Baek, Yaein (KOREA INSTITUTE FOR INTERNATIONAL ECONOMIC POLICY (KIEP)); Yoon, Sang-Ha (KOREA INSTITUTE FOR INTERNATIONAL ECONOMIC POLICY (KIEP)); Kim, Hyun Hak (KOREA INSTITUTE FOR INTERNATIONAL ECONOMIC POLICY (KIEP)); Lee, Jiyun (KOREA INSTITUTE FOR INTERNATIONAL ECONOMIC POLICY (KIEP))
    Abstract: This study produces short-term forecasts of economic growth using big data and compares their predictive performance against forecasts from traditional statistical models and structural macroeconomic models. To forecast GDP growth in the United States and Korea, we use large sets of macroeconomic and financial indicators together with machine learning, and we forecast Korea's growth using Naver search data with dynamic model averaging and selection. Through this we examine how big data affects the accuracy of growth forecasts. Finally, we combine the big data-based forecasts with those of a small open economy dynamic stochastic general equilibrium model and draw implications and directions for future research on economic forecasting. The economic uncertainties arising from recent global inflation and the Covid-19 pandemic have significantly amplified the importance of accuracy and timeliness in macroeconomic forecasts. To enhance the predictive abilities of models, harnessing all potentially relevant information is crucial. The advent of big data has spurred active exploration in economic forecasting research, leveraging additional data dimensions. Notably, text data such as online searches and news articles are widely employed to extract sentiments of economic agents, thereby monitoring economic and financial conditions. Additionally, machine learning has emerged as a pivotal tool in macroeconomic forecasting because it efficiently processes and analyzes big data. Given the potential benefits of big data for forecasting and the ongoing development of new methodologies, a collective analysis of forecasts based on big data and traditional macroeconomic models is essential. In this study, we analyze the predictive ability of short-term GDP growth rate forecasts based on big data against those generated by traditional statistical and structural macroeconomic models. Given the contrasting characteristics between big data-based forecasting models and structural models, we comprehensively analyze the results of each model and discuss implications for future economic forecasting research. This study largely consists of four parts. In Chapter 2, we utilize a small open economy dynamic stochastic general equilibrium model (SOE-DSGE) to forecast Korea's GDP growth. This theoretical model serves as a benchmark for comparing against big data-based forecasts. 
Using a Bayesian framework, the model examines the impacts of various shocks, such as those related to total factor productivity, government spending, monetary policy, foreign demand, and foreign monetary policy. The findings reveal that the responses of model variables to external shocks align with real-world outcomes. One of the strengths of the SOE-DSGE model is that it explicitly includes structural shocks, allowing us to analyze not only forecasts but also the effects of economic policies. However, a limitation is its inability to fully leverage available data due to inherent model constraints. (the rest omitted)
    Keywords: Economic outlook; economic growth; big data; machine learning; macroeconomic outlook
    Date: 2023–12–29
    URL: http://d.repec.org/n?u=RePEc:ris:kieppa:2023_024&r=big
  8. By: Xinyi Wang; Lang Tong
    Abstract: This paper presents a novel generative probabilistic forecasting approach derived from the Wiener-Kallianpur innovation representation of nonparametric time series. Under the paradigm of generative artificial intelligence, the proposed forecasting architecture includes an autoencoder that transforms nonparametric multivariate random processes into canonical innovation sequences, from which future time series samples are generated according to their probability distributions conditioned on past samples. A novel deep-learning algorithm is proposed that constrains the latent process to be an independent and identically distributed sequence with matching autoencoder input-output conditional probability distributions. Asymptotic optimality and structural convergence properties of the proposed generative forecasting approach are established. Three applications involving highly dynamic and volatile time series in real-time market operations are considered: (i) locational marginal price forecasting for merchant storage participants, (ii) interregional price spread forecasting for interchange markets, and (iii) area control error forecasting for frequency regulations. Numerical studies based on market data from multiple independent system operators demonstrate superior performance against leading traditional and machine learning-based forecasting techniques under both probabilistic and point forecast metrics.
    Date: 2024–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2403.05743&r=big
  9. By: Matteo Rizzato (Advestis); Julien Wallart (Fujitsu Systems Europe); Christophe Geissler (Advestis); Nicolas Morizet (Advestis); Noureddine Boumlaik
    Abstract: The finance industry is producing an increasing amount of datasets that investment professionals can consider to be influential on the price of financial assets. These datasets were initially mainly limited to exchange data, namely price, capitalization and volume. Their coverage has now considerably expanded to include, for example, macroeconomic data, supply and demand of commodities, balance sheet data and, more recently, extra-financial data such as ESG scores. This broadening of the factors retained as influential constitutes a serious challenge for statistical modeling. Indeed, the instability of the correlations between these factors makes it practically impossible to identify the joint laws needed to construct scenarios. Fortunately, spectacular advances in Deep Learning in recent years have given rise to GANs. GANs are a type of generative machine learning model that produces new data samples with the same characteristics as a training data distribution in an unsupervised way, avoiding data assumptions and human-induced biases. In this work, we explore the use of GANs for synthetic financial scenario generation. This pilot study is the result of a collaboration between Fujitsu and Advestis and will be followed by a thorough exploration of the use cases that can benefit from the proposed solution. We propose a GANs-based algorithm that allows the replication of multivariate data representing several properties (including, but not limited to, price, market capitalization, ESG score, and controversy score) of a set of stocks. This approach differs from examples in the financial literature, which are mainly focused on the reproduction of temporal asset price scenarios. We also propose several metrics to evaluate the quality of the data generated by the GANs. This approach is well suited to the generation of scenarios, the time direction simply arising as a subsequent (possibly conditioned) generation of data points drawn from the learned distribution. Our method will allow the simulation of high-dimensional scenarios (compared to the ≲ 10 features currently employed in most recent use cases) in which network complexity is reduced thanks to careful feature engineering and selection. Complete results will be presented in a forthcoming study.
    Keywords: Data Augmentation, Financial Scenarios, Risk Management, Generative Adversarial Networks
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:hal:journl:hal-03716692&r=big
  10. By: Brendan J. Chapuis; John Coglianese
    Abstract: In this note, we introduce a measure of unemployment risk, the likelihood of a worker becoming unemployed within the next twelve months. By using nonparametric machine learning applied to data on millions of workers in the US, we can estimate how unemployment risk varies across individuals and over time.
    Date: 2024–03–08
    URL: http://d.repec.org/n?u=RePEc:fip:fedgfn:2024-03-08-1&r=big
  11. By: Njiru, Ruth; Appel, Franziska; Dong, Changxing; Balmann, Alfons
    Abstract: In light of the dynamic challenges facing agricultural land markets, conventional analytical frameworks fall short in capturing the intricate interplay of strategic decisions and evolving complexities. This necessitates a novel method, integrating deep learning into agent-based modelling, to provide a more realistic and nuanced understanding of land market dynamics, enabling informed policy assessments and contributing to a comprehensive discourse on agricultural structural change. In this paper, different deep learning models are tested and evaluated as emulators of AgriPoliS (Agricultural Policy Simulator), an agent-based model used to simulate the evolution of structural change in agriculture resulting from changes in the policy environment. This study is part of preliminary work towards integrating deep learning methods and predictions with AgriPoliS to capture strategic decision-making and actions of agents in land markets. The paper tests the models on their suitability, computational requirements and run-time complexities. The output from AgriPoliS serves as the input features for the deep learning models. Models are evaluated using a combination of the coefficient of determination (R2 score), mean absolute error, visual displays and runtime. The models were able to replicate the variable of interest with a high degree of accuracy, with R2 scores above 90%. The CNN was best suited for replicating the data. Through this work, we learned the complexities and the computational and training effort needed to integrate deep learning and AgriPoliS to capture strategic decision-making.
    Keywords: Land Economics/Use
    Date: 2024–03–26
    URL: http://d.repec.org/n?u=RePEc:ags:bokufo:340874&r=big
  12. By: Andre Guettler; Mahvish Naeem; Lars Norden; Bernardus Van Doornik
    Abstract: We investigate whether pre-publication revisions of bank financial statements contain forward-looking information about bank risk. Using 7.4 million observations of monthly financial reports from all banks in Brazil during 2007-2019, we show that 78% of all revisions occur before the publication of these statements. The frequency and severity of revisions, as well as missed reporting deadlines, are positively related to future bank risk. Using machine learning techniques, we provide evidence on the mechanisms through which revisions affect bank risk. Our findings suggest that private information about pre-publication revisions is useful for supervisors to monitor banks.
    Date: 2024–03
    URL: http://d.repec.org/n?u=RePEc:bcb:wpaper:590&r=big
  13. By: Koresh Galil (BGU); Ami Hauptman (Computer Science Department of Sapir College); Rosit Levy Rosenboim (Applied Economics Department of Sapir College)
    Keywords: Corporate Ratings, Machine Learning, Classification and Regression Tree, Support Vector Regression, CART, SVR, Size
    JEL: C45 C53 G24 G32
    Date: 2023
    URL: http://d.repec.org/n?u=RePEc:bgu:wpaper:2308&r=big
  14. By: Masahiro Kato
    Abstract: This study investigates the estimation and the statistical inference about Conditional Average Treatment Effects (CATEs), which have garnered attention as a metric representing individualized causal effects. In our data-generating process, we assume linear models for the outcomes associated with binary treatments and define the CATE as a difference between the expected outcomes of these linear models. This study allows the linear models to be high-dimensional, and our interest lies in consistent estimation and statistical inference for the CATE. In high-dimensional linear regression, one typical approach is to assume sparsity. However, in our study, we do not assume sparsity directly. Instead, we consider sparsity only in the difference of the linear models. We first use a doubly robust estimator to approximate this difference and then regress the difference on covariates with Lasso regularization. Although this regression estimator is consistent for the CATE, we further reduce the bias using the techniques in double/debiased machine learning (DML) and debiased Lasso, leading to $\sqrt{n}$-consistency and confidence intervals. We refer to the debiased estimator as the triple/debiased Lasso (TDL), applying both DML and debiased Lasso techniques. We confirm the soundness of our proposed method through simulation studies.
    Date: 2024–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2403.03240&r=big
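The two-stage construction sketched in the abstract can be written out. The display below is the standard doubly robust (DR-learner-style) pseudo-outcome followed by the Lasso regression step; the paper's exact construction and the subsequent debiasing terms may differ, so treat this as an illustrative reconstruction. With binary treatment D, outcome Y, covariates X, outcome models mu_d(X), and propensity score e(X):

```latex
\hat{\phi}_i
  = \hat{\mu}_1(X_i) - \hat{\mu}_0(X_i)
  + \frac{D_i\,\bigl(Y_i - \hat{\mu}_1(X_i)\bigr)}{\hat{e}(X_i)}
  - \frac{(1-D_i)\,\bigl(Y_i - \hat{\mu}_0(X_i)\bigr)}{1-\hat{e}(X_i)},
\qquad
\hat{\beta}
  = \arg\min_{\beta}\;
    \frac{1}{n}\sum_{i=1}^{n}\bigl(\hat{\phi}_i - X_i^{\top}\beta\bigr)^2
  + \lambda \lVert \beta \rVert_1 .
```

The CATE estimate is then tau-hat(x) = x'beta-hat, which the paper further debiases to obtain root-n consistency and confidence intervals.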
  15. By: Stelios Michalopoulos; Christopher Rauh
    Abstract: Why are certain movies more successful in some markets than others? Are the entertainment products we consume reflective of our core values and beliefs? These questions drive our investigation into the relationship between a society’s oral tradition and the financial success of films. We combine a unique catalog of local tales, myths, and legends around the world with data on international movie screenings and revenues. First, we quantify the similarity between movies’ plots and traditional motifs employing machine learning techniques. Comparing the same movie across different markets, we establish that films that resonate more with local folklore systematically accrue higher revenue and are more likely to be screened. Second, we document analogous patterns within the US. Google Trends data reveal a pronounced interest in markets where ancestral narratives align more closely with a movie’s theme. Third, we delve into the explicit values transmitted by films, concentrating on the depiction of risk and gender roles. Films that promote risk-taking sell more in entrepreneurial societies today, rooted in traditions where characters pursue dangerous tasks successfully. Films portraying women in stereotypical roles continue to find a robust audience in societies with similar gender stereotypes in their folklore and where women today continue being relegated to subordinate positions. These findings underscore the enduring influence of traditional storytelling on entertainment patterns in the 21st century, highlighting a profound connection between movie consumption and deeply ingrained cultural narratives and values.
    JEL: N0 O10 P0 Z00 Z1 Z10 Z11 Z13
    Date: 2024–03
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:32220&r=big
  16. By: Buczak, Philip; Horn, Daniel; Pauly, Markus
    Abstract: There is a long tradition of modeling ordinal response data with parametric models such as the proportional odds model. With the advent of machine learning (ML), however, the classical stream of parametric models has been increasingly challenged by a more recent stream of tree ensemble (TE) methods extending popular ML algorithms such as random forest to ordinal response data. Despite selective efforts, the current literature lacks an encompassing comparison between the two methodological streams. In this work, we fill this gap by investigating under which circumstances a proportional odds model is competitive with TE methods regarding its predictive performance, and when TE should be preferred. Additionally, we study whether the optimization of the numeric scores assigned to ordinal response categories, as in Ordinal Forest (OF; Hornung, 2019), is worth the associated computational burden. To this end, we further contribute to the literature by proposing the Ordinal Score Optimization Algorithm (OSOA). Similar to OF, OSOA optimizes the numeric scores assigned to the ordinal response categories, but aims to enhance the optimization procedure used in OF by employing a non-linear optimization algorithm. Our comparison results show that while TE approaches outperformed the proportional odds model in the presence of strong non-linear effects, the latter was competitive for small sample sizes even under medium non-linear effects. Regarding the TE methods, only subtle differences emerged between the individual methods, showing that the benefit of score optimization was situational. We analyze potential reasons for the mixed benefits of score optimization to motivate further methodological research. Based on our results, we derive practical recommendations for researchers and practitioners.
    Date: 2024–03–29
    URL: http://d.repec.org/n?u=RePEc:osf:osfxxx:v7bcf&r=big
  17. By: GIBSON, John; ZHANG, Xiaoxuan; PARK, Albert; YI, Jiang; XI, Li
    Abstract: It is difficult and expensive to measure rural economic activity and poverty in developing countries. The usual survey-based approach is less informative than often realized, due to the combined effects of the clustered samples dictated by survey logistics and the spatial autocorrelation in rural livelihoods. Administrative data, like sub-national GDP for lower-level spatial units, are often unavailable, and the informality and seasonality of many rural activities raise doubts about the accuracy of such measures. A recent literature argues that high-resolution satellite imagery can overcome these barriers to the measurement of rural economic activity and rural living standards and poverty. Potential advantages of satellite data include greater comparability between countries irrespective of their varying levels of statistical capacity, cheaper and more timely data availability, and the possibility of extending estimates to spatial units below the level at which GDP data or survey data are reported. While there are many types of remote sensing data, economists have particularly seized upon satellite-detected nighttime lights (NTL) as a proxy for local economic activity. Yet there are growing doubts about the universal usefulness of this proxy, with recent evidence suggesting that NTL data are a poor proxy in low-density rural areas of developing countries. This study examines performance in predicting rural sector economic activity and poverty in China with different types of satellite-detected NTL data that come from three generations of sensors of varying resolution. We include the most popular NTL source in economics, the Defense Meteorological Satellite Program data, whose resolution is, at best, 2.7 km; two data sources from the Visible Infrared Imaging Radiometer Suite (VIIRS) on the Suomi/NPP satellite, with spatial resolution of 0.74 km; and data from the Luojia-01 satellite that is even more spatially precise, with resolution of 0.13 km. The sensors also vary in their ability to detect feeble light and in the time of night at which they observe the earth. With this variation we can ascertain whether better sensors lead to better predictions. We supplement this statistical assessment with a set of ground-truthing exercises. Overall, our study may help to inform decisions about future data directions for studying rural economic activity and poverty in developing countries.
    Date: 2024–03
    URL: http://d.repec.org/n?u=RePEc:hit:hitcei:2023-08&r=big
  18. By: Hainaut, Donatien (Université catholique de Louvain, LIDAM/ISBA, Belgium); Casas, Alex (Detralytics)
    Abstract: In the absence of a closed-form expression, as in the Heston model, option pricing is computationally intensive when calibrating a model to market quotes. This article proposes an alternative to standard pricing methods based on physics-inspired neural networks (PINNs). A PINN integrates principles from physics into its learning process to enhance its efficiency in solving complex problems. In this article, the driving principle is the Feynman-Kac (FK) equation, a partial differential equation (PDE) governing the derivative price in the Heston model. We focus on the valuation of European options and show that PINNs constitute an efficient alternative for pricing options with various specifications and parameters without the need for retraining.
    Keywords: Neural networks ; options ; Heston model ; Feynman-Kac equation
    Date: 2024–02–01
    URL: http://d.repec.org/n?u=RePEc:aiz:louvad:2024002&r=big
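For orientation, the Feynman-Kac PDE that such a network is trained to satisfy takes the following textbook form in the Heston model (price S, variance v, rate r, mean-reversion speed kappa, long-run variance theta, vol-of-vol sigma, correlation rho); this is the standard Heston pricing PDE reproduced from general theory, not an equation taken from the paper:

```latex
\frac{\partial V}{\partial t}
+ r S \frac{\partial V}{\partial S}
+ \kappa(\theta - v)\frac{\partial V}{\partial v}
+ \tfrac{1}{2}\, v S^{2} \frac{\partial^{2} V}{\partial S^{2}}
+ \rho \sigma v S \frac{\partial^{2} V}{\partial S\,\partial v}
+ \tfrac{1}{2}\,\sigma^{2} v \frac{\partial^{2} V}{\partial v^{2}}
- r V = 0,
\qquad
V(T, S, v) = \max(S - K, 0),
```

where the terminal condition is the payoff of a European call with strike K. A PINN minimizes the residual of this PDE (plus boundary and terminal conditions) over sampled points in (t, S, v).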
  19. By: Bertsch, Christoph (Research Department, Central Bank of Sweden); Hull, Isaiah (BI Norwegian Business School; CogniFrame); Lumsdaine, Robin L. (Kogod School of Business, American University; Erasmus University Rotterdam; National Bureau of Economic Research (NBER); Tinbergen Institute; Center for Financial Stability); Zhang, Xin (Research Department, Central Bank of Sweden)
    Abstract: This paper introduces a novel database of text features extracted from the speeches of 53 central banks from 1996 to 2023 using state-of-the-art NLP methods. We establish four facts: (1) central banks with floating and pegged exchange rates communicate differently, and these differences are particularly pronounced in discussions about exchange rates and the dollar, (2) communication spillovers from the Federal Reserve are prominent in exchange rate and dollar-related topics for dollar peggers and in hawkish sentiment for others, (3) central banks engage in FX intervention guidance, and (4) more transparent institutions are less responsive to political pressure in their communication.
    Keywords: Exchange Rates; Natural Language Processing (NLP); International Spillovers; Monetary Policy
    JEL: C55 E42 E50 F31 F42
    Date: 2024–03–01
    URL: http://d.repec.org/n?u=RePEc:hhs:rbnkwp:0432&r=big
  20. By: Glas, Alexander; Müller, Lena
    Abstract: We use data on speeches given by members of the European Central Bank's (ECB) Executive Board to analyze whether the clarity of central bank communication has increased over time. Employing readability measures as proxy variables, we find that the clarity of information provision has trended upward since the inception of the ECB. The increase is gradual, rather than being induced by changes in the board composition or major macroeconomic events. Clarity is higher for speeches aimed at general audiences and for speeches by female speakers. We also show that media sentiment about the ECB is negatively related to complexity.
    Keywords: Central Bank Communication, Monetary Policy Transparency, Clarity, Readability
    JEL: E52 E58
    Date: 2023
    URL: http://d.repec.org/n?u=RePEc:zbw:zewdip:283611&r=big
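A readability proxy of the kind this paper employs can be sketched in a few lines. The syllable counter below is a crude vowel-group heuristic and the formula is the standard Flesch Reading Ease; this is an illustrative approximation with invented example sentences, not the authors' implementation:

```python
import re

def syllables(word):
    # Crude heuristic: count runs of consecutive vowels (minimum one syllable).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    """Standard Flesch formula: higher scores indicate clearer text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syl = sum(syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syl / len(words)))

plain = "We cut rates. Prices will fall."
jargon = ("The Governing Council contemplated macroprudential recalibration "
          "of collateral eligibility frameworks.")
print(flesch_reading_ease(plain), flesch_reading_ease(jargon))
```

On these two invented sentences the plain statement scores far higher than the jargon-laden one, which is the kind of contrast the paper tracks over time.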
  21. By: Valerio Capraro; Roberto Di Paolo; Matjaz Perc; Veronica Pizziol
    Abstract: Understanding human behaviour in decision problems and strategic interactions has wide-ranging applications in economics, psychology, and artificial intelligence. Game theory offers a robust foundation for this understanding, based on the idea that individuals aim to maximize a utility function. However, the exact factors influencing strategy choices remain elusive. While traditional models try to explain human behaviour as a function of the outcomes of available actions, recent experimental research reveals that linguistic content significantly impacts decision-making, thus prompting a paradigm shift from outcome-based to language-based utility functions. This shift is more urgent than ever, given the advancement of generative AI, which has the potential to support humans in making critical decisions through language-based interactions. We propose sentiment analysis as a fundamental tool for this shift and take an initial step by analyzing 61 experimental instructions from the dictator game, an economic game capturing the balance between self-interest and the interest of others, which is at the core of many social interactions. Our meta-analysis shows that sentiment analysis can explain human behaviour beyond economic outcomes. We discuss future research directions. We hope this work sets the stage for a novel game theoretical approach that emphasizes the importance of language in human decisions.
    Date: 2024–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2403.08944&r=big
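The lexicon-based sentiment scoring underlying this line of work can be sketched simply. The word list and example instructions below are invented toys; the paper analyzes 61 real dictator-game instructions with proper sentiment-analysis tooling:

```python
# Toy lexicon-based sentiment scorer; lexicon and texts are hypothetical.
LEXICON = {"give": 1, "share": 1, "help": 1, "keep": -1, "take": -1}

def sentiment(text):
    """Mean polarity of lexicon words found in the text (0.0 if none)."""
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

prosocial = "you may give or share the money to help the other participant"
selfish = "you may keep the money or take more for yourself"
```

Two instructions describing the same monetary choice can thus receive very different sentiment scores, which is the language-based signal the authors argue should enter the utility function.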

This nep-big issue is ©2024 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.