nep-big New Economics Papers
on Big Data
Issue of 2023‒11‒13
twenty-two papers chosen by
Tom Coupé, University of Canterbury


  1. Machine learning for economics research: when, what and how By Ajit Desai
  2. Macroeconomic Forecasting with the Use of News Data By Mikhaylov, Dmitry
  3. American Option Pricing using Self-Attention GRU and Shapley Value Interpretation By Yanhui Shen
  4. A systematic review of early warning systems in finance By Ali Namaki; Reza Eyvazloo; Shahin Ramtinnia
  5. Evolution of poverty in Bolivia's communities between 2012 and 2022: A machine learning and remote sensing approach By Bolivar, Osmar
  6. Quantum-Enhanced Forecasting: Leveraging Quantum Gramian Angular Field and CNNs for Stock Return Predictions By Zhengmeng Xu; Hai Lin
  7. Application of Artificial Intelligence for Monetary Policy-Making By Mariam Dundua; Otar Gorgodze
  8. Enhancing Financial Sentiment Analysis via Retrieval Augmented Large Language Models By Boyu Zhang; Hongyang Yang; Tianyu Zhou; Ali Babar; Xiao-Yang Liu
  9. A New Weighted Food CPI from Scanner Big Data in China By Zhenkun Zhou; Zikun Song; Tao Ren
  10. Multi-Industry Simplex : A Probabilistic Extension of GICS By Maksim Papenkov; Chris Meredith; Claire Noel; Jai Padalkar; Temple Hendrickson; Daniel Nitiutomo; Thomas Farrell
  11. Integrating Stock Features and Global Information via Large Language Models for Enhanced Stock Return Prediction By Yujie Ding; Shuai Jia; Tianyi Ma; Bingcheng Mao; Xiuze Zhou; Liuliu Li; Dongming Han
  12. Evaluation of feature selection performance for identification of best effective technical indicators on stock market price prediction By Fatemeh Moodi; Amir Jahangard-Rafsanjani
  13. FinGPT: Instruction Tuning Benchmark for Open-Source Large Language Models in Financial Datasets By Neng Wang; Hongyang Yang; Christina Dan Wang
  14. Implementing a Hierarchical Deep Learning Approach for Simulating multilevel Auction Data By Igor Sadoune; Marcelin Joanis; Andrea Lodi
  15. Integration of Fractional Order Black-Scholes Merton with Neural Network By Sarit Maitra
  16. Applying Reinforcement Learning to Option Pricing and Hedging By Zoran Stoiljkovic
  17. Assessing the data challenges of climate-related disclosures in european banks. A text mining study By Ángel Iván Moreno; Teresa Caminero
  18. Variational autoencoder for synthetic insurance data By Jamotton, Charlotte; Hainaut, Donatien
  19. Determinants of renewable energy consumption in Madagascar: Evidence from feature selection algorithms By Ramaharo, Franck Maminirina; RANDRIAMIFIDY, Michael Fitiavana
  20. Valuation of guaranteed minimum accumulation benefits (GMAB) with physics inspired neural networks By Hainaut, Donatien
  21. A hybrid SEM/ANN analysis to understand youtube video content's influence on university students' eLearning acceptance behavior By Phan Cong Thao Tien; Tran Thien Phuc; Nguyen Thi Hai Binh
  22. Financial Stress and Economic Activity: Evidence from a New Worldwide Index By Hites Ahir; Mr. Giovanni Dell'Ariccia; Davide Furceri; Mr. Chris Papageorgiou; Hanbo Qi

  1. By: Ajit Desai
    Abstract: This article reviews selected papers that use machine learning for economics research and policy analysis. Our review highlights when machine learning is used in economics, the commonly preferred models and how those models are used.
    Keywords: Central bank research; Econometric and statistical methods; Economic models
    JEL: A1 A10 B2 B23 C4 C45 C5 C55
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:bca:bocsan:23-16&r=big
  2. By: Mikhaylov, Dmitry (The Russian Presidential Academy of National Economy and Public Administration)
    Abstract: Over the last decade, many academic papers have considered the possibility of predicting economic fluctuations and macroeconomic volatility with news data, driven by the development of new machine learning techniques and the enhancement of existing methods. Our study investigates whether the predictive power of macroeconomic forecasts can be improved with news data in the context of Russia. We apply natural language understanding (NLU) algorithms and topic modeling techniques. In particular, we implement Latent Dirichlet Allocation (LDA), since this approach has proven effective in the published literature on this problem. Frequency news indexes and sentiment news indexes are then constructed from the modeled topics. Finally, we forecast a set of macroeconomic variables [CPI (π), Business Confidence Index (BCI), Consumer Confidence Index (CCI), Export (EX), Import (IM), Net Export (NX)], supplemented by the frequency and sentiment news indexes, in order to evaluate the improvement in predictive power. We show that including LDA-based frequency news indexes and sentiment news indexes in the forecast models can improve the quality of the predictions and increase predictive power for some variables.
    Keywords: Macroeconomic Forecasting, Natural Language Processing, Machine Learning
    JEL: E27 E37
    Date: 2023–04–14
    URL: http://d.repec.org/n?u=RePEc:rnp:wpaper:w20220250&r=big
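A minimal sketch of the pipeline this abstract describes (not the authors' code): fit LDA on a toy news corpus and aggregate per-document topic shares into a frequency index by period. The corpus, topic count, and month labels are illustrative assumptions.

```python
# Sketch: LDA topic modeling over news text, then a per-period
# topic-frequency index; toy corpus and month labels are hypothetical.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "central bank raises key interest rate to fight inflation",
    "consumer prices climb as inflation accelerates",
    "exports fall while imports rise widening the trade deficit",
    "trade deficit narrows on stronger exports",
]

counts = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
doc_topics = lda.transform(counts)          # per-document topic shares

# Frequency index for topic k in month m: mean topic share over that month's news
months = np.array([0, 0, 1, 1])             # hypothetical publication months
freq_index = np.array([doc_topics[months == m].mean(axis=0) for m in (0, 1)])
```

A sentiment index would be built analogously, weighting each document's topic shares by a sentiment score instead of averaging raw shares.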
  3. By: Yanhui Shen
    Abstract: Options, serving as a crucial financial instrument, are used by investors to manage and mitigate their investment risks within the securities market. Precisely predicting the present price of an option enables investors to make informed and efficient decisions. In this paper, we propose a machine learning method for forecasting the prices of SPY (ETF) options based on a gated recurrent unit (GRU) and a self-attention mechanism. We first partitioned the raw dataset into 15 subsets according to moneyness and days-to-maturity criteria. For each subset, we matched the corresponding U.S. government bond rates and Implied Volatility Indices. This segmentation allows for a more insightful exploration of the impacts of risk-free rates and underlying volatility on option pricing. Next, we built four different machine learning models, including multilayer perceptron (MLP), long short-term memory (LSTM), self-attention LSTM, and self-attention GRU, in comparison to the traditional binomial model. The empirical result shows that self-attention GRU with historical data outperforms the other models due to its ability to capture complex temporal dependencies and leverage the contextual information embedded in the historical data. Finally, in order to unveil the "black box" of artificial intelligence, we employed the SHapley Additive exPlanations (SHAP) method to interpret and analyze the prediction results of the self-attention GRU model with historical data. This provides insights into the significance and contributions of different input features to the pricing of American-style options.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.12500&r=big
  4. By: Ali Namaki; Reza Eyvazloo; Shahin Ramtinnia
    Abstract: Early warning systems (EWSs) are critical for forecasting and preventing economic and financial crises. EWSs are designed to provide early warning signs of financial troubles, allowing policymakers and market participants to intervene before a crisis expands. The 2008 financial crisis highlighted the importance of detecting financial distress early and taking preventive measures to mitigate its effects. In this bibliometric review, we look at the research and literature on EWSs in finance. Our methodology included a comprehensive examination of academic databases and a stringent selection procedure, which resulted in the final selection of 616 articles published between 1976 and 2023. Our findings show that more than 90% of the papers were published after 2006, indicating the growing importance of EWSs in financial research. According to our findings, recent research has shifted toward machine learning techniques, and EWSs are constantly evolving. We discovered that research in this area could be divided into four categories: bankruptcy prediction, banking crisis, currency crisis and emerging markets, and machine learning forecasting. Each cluster offers distinct insights into the approaches and methodologies used for EWSs. To improve predictive accuracy, our review emphasizes the importance of incorporating both macroeconomic and microeconomic data into EWS models. To improve their predictive performance, we recommend more research into incorporating alternative data sources into EWS models, such as social media data, news sentiment analysis, and network analysis.
    Date: 2023–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.00490&r=big
  5. By: Bolivar, Osmar
    Abstract: This study aims to forecast the incidence of poverty at the community level in Bolivia for 2022 using machine learning algorithms and remote sensing, and to contrast these forecasts with 2012 data. We processed 2012 census data to build a poverty indicator based on Unsatisfied Basic Needs (NBI) at the community level and selected 953 of these communities as units of analysis. Generating geospatial variables, training and validating machine learning algorithms, and then applying these models revealed a general decline in poverty, with roughly 50% of communities projected below the 42.5% threshold in 2022, indicating significant improvements since 2012. Poverty fell unevenly, with a more pronounced impact in communities with lower initial poverty levels. Regional disparities emerged, with lower poverty rates in urban areas, underscoring the need to address regional inequalities. The methodology proposed in this study also proved more effective than that of similar research, highlighting its usefulness for predicting poverty at the community level.
    Keywords: poverty; machine learning; remote sensing
    JEL: C8 I3 I32 O31
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:118932&r=big
  6. By: Zhengmeng Xu; Hai Lin
    Abstract: We propose a time series forecasting method named Quantum Gramian Angular Field (QGAF). This approach merges the advantages of quantum computing technology with deep learning, aiming to enhance the precision of time series classification and forecasting. We successfully transformed stock return time series data into two-dimensional images suitable for Convolutional Neural Network (CNN) training by designing specific quantum circuits. Distinct from the classical Gramian Angular Field (GAF) approach, QGAF's uniqueness lies in eliminating the need for data normalization and inverse cosine calculations, simplifying the transformation process from time series data to two-dimensional images. To validate the effectiveness of this method, we conducted experiments on datasets from three major stock markets: the China A-share market, the Hong Kong stock market, and the US stock market. Experimental results revealed that compared to the classical GAF method, the QGAF approach significantly improved time series prediction accuracy, reducing prediction errors by an average of 25% for Mean Absolute Error (MAE) and 48% for Mean Squared Error (MSE). This research confirms the potential and promising prospects of integrating quantum computing with deep learning techniques in financial time series forecasting.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.07427&r=big
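For context, the classical Gramian Angular Field baseline that QGAF is compared against can be sketched in a few lines: min-max rescaling to [-1, 1], an inverse-cosine step, then pairwise angle sums — exactly the normalization and arccos steps the abstract says QGAF avoids. The toy return series is illustrative.

```python
# Sketch: classical Gramian Angular Summation Field (GASF) transform,
# the baseline QGAF is benchmarked against.
import numpy as np

def gasf(series):
    x = np.asarray(series, dtype=float)
    # rescale to [-1, 1] (the normalization step QGAF eliminates)
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x, -1.0, 1.0))      # inverse-cosine step
    # GASF[i, j] = cos(phi_i + phi_j)
    return np.cos(phi[:, None] + phi[None, :])

returns = np.array([0.01, -0.02, 0.005, 0.03, -0.01])  # toy stock returns
image = gasf(returns)   # 2-D "image" that a CNN can be trained on
```

The resulting symmetric matrix is what gets stacked into image channels for CNN training; QGAF replaces this classical encoding with quantum-circuit outputs.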
  7. By: Mariam Dundua (Financial and Supervisory Technology Development Department, National Bank of Georgia); Otar Gorgodze (Head of Financial and Supervisory Technologies Department, National Bank of Georgia)
    Abstract: The recent advances in Artificial Intelligence (AI), in particular the development of reinforcement learning (RL) methods, are specifically suited for application to complex economic problems. We formulate a new approach to searching for optimal monetary policy rules using RL. Analysis of AI-generated monetary policy rules indicates that optimal policy rules exhibit significant nonlinearities. This could explain why simple monetary rules based on traditional linear modeling toolkits lack the robustness needed for practical application. Analysis of the generated transition equations allows us to estimate the neutral policy rate, which came out to be 6.5 percent. We discuss the potential combination of the method with state-of-the-art FinTech developments in digital finance, like DeFi and CBDC, and the feasibility of a MonetaryTech approach to monetary policy.
    Keywords: Artificial Intelligence; Reinforcement Learning; Monetary policy
    JEL: C60 C61 C63 E17 C45 E52
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:aez:wpaper:02/2022&r=big
  8. By: Boyu Zhang; Hongyang Yang; Tianyu Zhou; Ali Babar; Xiao-Yang Liu
    Abstract: Financial sentiment analysis is critical for valuation and investment decision-making. Traditional NLP models, however, are limited by their parameter size and the scope of their training datasets, which hampers their generalization capabilities and effectiveness in this field. Recently, Large Language Models (LLMs) pre-trained on extensive corpora have demonstrated superior performance across various NLP tasks due to their commendable zero-shot abilities. Yet, directly applying LLMs to financial sentiment analysis presents challenges: The discrepancy between the pre-training objective of LLMs and predicting the sentiment label can compromise their predictive performance. Furthermore, the succinct nature of financial news, often devoid of sufficient context, can significantly diminish the reliability of LLMs' sentiment analysis. To address these challenges, we introduce a retrieval-augmented LLM framework for financial sentiment analysis. This framework includes an instruction-tuned LLM module, which ensures LLMs behave as predictors of sentiment labels, and a retrieval-augmentation module, which retrieves additional context from reliable external sources. Benchmarked against traditional models and LLMs like ChatGPT and LLaMA, our approach achieves a 15% to 48% performance gain in accuracy and F1 score.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.04027&r=big
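The retrieval-augmentation step can be illustrated with a minimal sketch (an assumption on our part, not the paper's pipeline): retrieve the most similar background passage for a terse headline via TF-IDF similarity, then prepend it as context in the sentiment prompt. The corpus, headline, and prompt wording are hypothetical.

```python
# Sketch: TF-IDF retrieval of extra context for a short financial headline
# before passing it to a sentiment-predicting LLM. Corpus is hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [  # stand-ins for "reliable external sources"
    "The chipmaker beat earnings estimates and raised full-year guidance.",
    "Regulators opened an antitrust probe into the retailer's pricing.",
    "The airline cut capacity amid rising fuel costs.",
]
query = "Chipmaker shares jump after results"

vec = TfidfVectorizer().fit(corpus + [query])
sims = cosine_similarity(vec.transform([query]), vec.transform(corpus))[0]
context = corpus[sims.argmax()]             # best-matching background passage

# Assemble a prompt for the instruction-tuned sentiment module
prompt = (f"Context: {context}\nHeadline: {query}\n"
          "Sentiment (positive/negative/neutral):")
```

A production system would use dense embeddings and a vector store rather than TF-IDF, but the retrieve-then-prompt structure is the same.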
  9. By: Zhenkun Zhou; Zikun Song; Tao Ren
    Abstract: Scanner big data has the potential to support construction of the Consumer Price Index (CPI). This study introduces a new weighted price index, S-FCPIw, constructed from scanner big data on retail sales in China. We address the limitations of China's CPI, especially its high cost and untimely release, and demonstrate the reliability of S-FCPIw by comparing it with existing price indices. S-FCPIw reflects changes in goods prices at higher frequency and in richer dimensions, and the analysis results show that S-FCPIw has a significant and strong relationship with CPI and Food CPI. The findings suggest that scanner big data can supplement traditional CPI calculations in China and provide new insights into macroeconomic trends and inflation prediction. We have made S-FCPIw publicly available and update it on a weekly basis to facilitate further study in this field.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.04242&r=big
  10. By: Maksim Papenkov; Chris Meredith; Claire Noel; Jai Padalkar; Temple Hendrickson; Daniel Nitiutomo; Thomas Farrell
    Abstract: Accurate industry classification is a critical tool for many asset management applications. While the current industry gold standard, GICS (Global Industry Classification Standard), has proven to be reliable and robust in many settings, it has limitations that cannot be ignored. Fundamentally, GICS is a single-industry model, in which every firm is assigned to exactly one group - regardless of how diversified that firm may be. This approach breaks down for large conglomerates like Amazon, which have risk exposure spread out across multiple sectors. We attempt to overcome these limitations by developing MIS (Multi-Industry Simplex), a probabilistic model that can flexibly assign a firm to as many industries as can be supported by the data. In particular, we utilize topic modeling, a natural language processing approach that uses business descriptions to extract and identify corresponding industries. Each identified industry comes with a relevance probability, allowing for high interpretability and easy auditing, circumventing the black-box nature of alternative machine learning approaches. We describe this model in detail and provide two use-cases that are relevant to asset management - thematic portfolios and nearest neighbor identification. While our approach has limitations of its own, we demonstrate the viability of probabilistic industry classification and hope to inspire future research in this field.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.04280&r=big
  11. By: Yujie Ding; Shuai Jia; Tianyi Ma; Bingcheng Mao; Xiuze Zhou; Liuliu Li; Dongming Han
    Abstract: The remarkable achievements and rapid advancements of Large Language Models (LLMs) such as ChatGPT and GPT-4 have showcased their immense potential in quantitative investment. Traders can effectively leverage these LLMs to analyze financial news and predict stock returns accurately. However, integrating LLMs into existing quantitative models presents two primary challenges: the insufficient utilization of semantic information embedded within LLMs and the difficulties in aligning the latent information within LLMs with pre-existing quantitative stock features. We propose a novel framework consisting of two components to surmount these challenges. The first component, the Local-Global (LG) model, introduces three distinct strategies for modeling global information. These approaches are grounded respectively on stock features, the capabilities of LLMs, and a hybrid method combining the two paradigms. The second component, Self-Correlated Reinforcement Learning (SCRL), focuses on aligning the embeddings of financial news generated by LLMs with stock features within the same semantic space. By implementing our framework, we have demonstrated superior performance in Rank Information Coefficient and returns, particularly compared to models relying only on stock features in the China A-share market.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.05627&r=big
  12. By: Fatemeh Moodi; Amir Jahangard-Rafsanjani
    Abstract: Because many factors, including technical indicators, influence stock market prediction, feature selection is important for choosing the best indicators. Wrapper feature selection methods are selection methods that take model performance into account during feature selection. The aim of this research is to identify, through feature selection, the combination of stock market indicators that predicts the stock market price with the least error. To evaluate the impact of wrapper feature selection techniques on stock market prediction, this paper examines SFS and SBS with 10 estimators and 123 technical indicators on the last 13 years of Apple Company data. In addition, with the proposed method, data created by a 3-day time window were converted into appropriate input for regression methods. Based on the results observed: (1) Each wrapper feature selection method produces different results with different machine learning methods, and each method is more correlated with a specific set of technical indicators of the stock market. (2) The Ridge and LR estimators, alone and with the two wrapper feature selection methods SFS and SBS, had the best results on all assessment criteria for market forecasting. (3) Ridge and LR had the best stock market prediction results on all of R2, MSE, RMSE, MAE and MAPE; the MLP regression method, along with sequential forward selection and MSE, also performed best, and SVR regression with SFS and MSE improved greatly compared to SVR regression with all indicators. (4) Different features are selected by different ML methods with different evaluation parameters. (5) Most ML methods used the Squeeze_pro, Percentage Price Oscillator, Thermo, Decay, Archer On-Balance Volume, Bollinger Bands, Squeeze and Ichimoku indicators.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.09903&r=big
  13. By: Neng Wang; Hongyang Yang; Christina Dan Wang
    Abstract: In the swiftly expanding domain of Natural Language Processing (NLP), the potential of GPT-based models for the financial sector is increasingly evident. However, the integration of these models with financial datasets presents challenges, notably in determining their adeptness and relevance. This paper introduces a distinctive approach anchored in the Instruction Tuning paradigm for open-source large language models, specifically adapted for financial contexts. Through this methodology, we capitalize on the interoperability of open-source models, ensuring a seamless and transparent integration. We begin by explaining the Instruction Tuning paradigm, highlighting its effectiveness for immediate integration. The paper presents a benchmarking scheme designed for end-to-end training and testing, employing a cost-effective progression. Firstly, we assess basic competencies and fundamental tasks, such as Named Entity Recognition (NER) and sentiment analysis to enhance specialization. Next, we delve into a comprehensive model, executing multi-task operations by amalgamating all instructional tunings to examine versatility. Finally, we explore the zero-shot capabilities by earmarking unseen tasks and incorporating novel datasets to understand adaptability in uncharted terrains. Such a paradigm fortifies the principles of openness and reproducibility, laying a robust foundation for future investigations in open-source financial large language models (FinLLMs).
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.04793&r=big
  14. By: Igor Sadoune; Marcelin Joanis; Andrea Lodi
    Abstract: We present a deep learning solution to address the challenges of simulating realistic synthetic first-price sealed-bid auction data. The complexities encountered in this type of auction data include high-cardinality discrete feature spaces and a multilevel structure arising from multiple bids associated with a single auction instance. Our methodology combines deep generative modeling (DGM) with an artificial learner that predicts the conditional bid distribution based on auction characteristics, contributing to advancements in simulation-based research. This approach lays the groundwork for creating realistic auction environments suitable for agent-based learning and modeling applications. Our contribution is twofold: we introduce a comprehensive methodology for simulating multilevel discrete auction data, and we underscore the potential of DGM as a powerful instrument for refining simulation techniques and fostering the development of economic models grounded in generative AI.
    Keywords: simulation crafting, discrete deep generative modeling, multilevel discrete data, auction data, simulation
    Date: 2023–10–02
    URL: http://d.repec.org/n?u=RePEc:cir:cirwor:2023s-23&r=big
  15. By: Sarit Maitra
    Abstract: This study enhances option pricing by presenting a novel pricing model, fractional-order Black-Scholes-Merton (FOBSM), which is based on the Black-Scholes-Merton (BSM) model. The main goal is to improve the precision and realism of option prices, matching them more closely with the financial landscape. The approach integrates the strengths of both the BSM model and neural networks (NN) with complex diffusion dynamics. This study emphasizes the need to take fractional derivatives into account when analyzing financial market dynamics. Since FOBSM captures memory characteristics in sequential data, it is better at simulating real-world systems than integer-order models. Findings reveal that under complex diffusion dynamics, this hybridization approach improves the accuracy of option price predictions. The key contribution of this work lies in the development of a novel option pricing model (FOBSM) that leverages fractional calculus and neural networks to enhance accuracy in capturing complex diffusion dynamics and memory effects in financial data.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.04464&r=big
  16. By: Zoran Stoiljkovic
    Abstract: This thesis provides an overview of the recent advances in reinforcement learning for pricing and hedging financial instruments, with a primary focus on a detailed explanation of the Q-Learning Black-Scholes approach introduced by Halperin (2017). This reinforcement learning approach bridges the traditional Black and Scholes (1973) model with novel artificial intelligence algorithms, enabling option pricing and hedging in a completely model-free and data-driven way. The thesis also explores the algorithm's performance under different state variables and scenarios for a European put option. The results reveal that the model is an accurate estimator under different levels of volatility and hedging frequency. Moreover, the method exhibits robust performance across various levels of option moneyness. Lastly, the algorithm incorporates proportional transaction costs, which have diverse impacts on profit and loss depending on the statistical properties of the state variables.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.04336&r=big
  17. By: Ángel Iván Moreno (Banco de España); Teresa Caminero (Banco de España)
    Abstract: The Intergovernmental Panel on Climate Change (IPCC) estimates that global net-zero should be achieved by 2050. To this end, many private firms are pledging to reach net-zero emissions by 2050. The Climate Data Steering Committee (CDSC) is working on an initiative to create a global central digital repository of climate disclosures, which aims to address the current data challenges. This paper assesses the progress within European financial institutions towards overcoming the data challenges outlined by the CDSC. Using a text-mining approach, coupled with the application of commercial Large Language Models (LLM) for context verification, we calculate a Greenhouse Gas Disclosure Index (GHGDI), by analysing 23 highly granular disclosures in the ESG reports between 2019 and 2021 of most of the significant banks under the ECB’s direct supervision. This index is then compared with the CDP score. The results indicate a moderate correlation between institutions not reporting to CDP upon request and a low GHGDI. Institutions with a high CDP score do not necessarily correlate with a high GHGDI.
    Keywords: ESG, sustainability, environment, climate change, carbon emissions, natural language processing, climate data challenges, OpenAI’s ChatGPT, Google’s text-bison
    JEL: C88 G32 Q56
    Date: 2023–09
    URL: http://d.repec.org/n?u=RePEc:bde:wpaper:2326&r=big
  18. By: Jamotton, Charlotte (Université catholique de Louvain, LIDAM/ISBA, Belgium); Hainaut, Donatien (Université catholique de Louvain, LIDAM/ISBA, Belgium)
    Abstract: This article explores the application of variational autoencoders (VAEs) to insurance data. Previous research has demonstrated the successful use of generative models, particularly VAEs, in various domains such as image recognition, text classification, and recommender systems. However, their application to insurance data, specifically heterogeneous insurance portfolios with mixed continuous and discrete attributes, remains unexplored. This study introduces novel insights into utilizing VAEs for unsupervised learning tasks in the actuarial field, including dimensionality reduction and synthetic data generation. We propose a VAE model with a quantile transformation of continuous data and a reconstruction loss that combines categorical cross-entropy and mean squared error, along with a KL divergence-based regularization term. The architecture of our VAE model eliminates the need for pre-training layers to fine-tune categorical features representations. We analyze our VAE's ability to reconstruct complex insurance data and generate synthetic insurance policies using a motor portfolio. Our experimental results and analysis highlight the potential of VAEs for addressing challenges related to privacy and anti-discriminatory regulations, bias correction, and data availability in the insurance industry.
    Keywords: Autoencoder ; variational inference ; synthetic data generation ; heterogeneous insurance data ; dimensionality reduction
    Date: 2023–06–29
    URL: http://d.repec.org/n?u=RePEc:aiz:louvad:2023025&r=big
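The composite loss described in the abstract - categorical cross-entropy plus MSE plus a KL regularizer - can be written down directly. This is a sketch of that objective under our own reading of the abstract, not the authors' implementation; the toy inputs are hypothetical.

```python
# Sketch: VAE loss for mixed insurance data = MSE (continuous attributes)
# + categorical cross-entropy (discrete attributes) + Gaussian KL term.
import numpy as np

def vae_loss(cont_true, cont_pred, cat_true, cat_probs, mu, logvar, beta=1.0):
    mse = np.mean((cont_true - cont_pred) ** 2)      # continuous reconstruction
    ce = -np.mean(np.sum(cat_true * np.log(cat_probs + 1e-12),
                         axis=1))                    # categorical reconstruction
    # KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions
    kl = 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)
    return mse + ce + beta * kl

# Perfect reconstruction with a standard-normal posterior: loss ~ 0
cont = np.array([[0.5, 1.0]])
cats = np.array([[0.0, 1.0, 0.0]])       # one-hot categorical attribute
loss = vae_loss(cont, cont, cats, cats, mu=np.zeros(2), logvar=np.zeros(2))
```

In training, `cont_pred`, `cat_probs`, `mu` and `logvar` would come from the decoder and encoder networks; `beta` weights the KL regularizer.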
  19. By: Ramaharo, Franck Maminirina (Ministry of Economy and Finance (Ministère de l'Economie et des Finances)); RANDRIAMIFIDY, Michael Fitiavana
    Abstract: The aim of this note is to identify the factors influencing renewable energy consumption in Madagascar. We tested 12 features covering macroeconomic, financial, social, and environmental aspects, including economic growth, domestic investment, foreign direct investment, financial development, industrial development, inflation, income distribution, trade openness, exchange rate, tourism development, environmental quality, and urbanization. To assess their significance, we assumed a linear relationship between renewable energy consumption and these features over the 1990–2021 period. Next, we applied different machine learning feature selection algorithms classified as filter-based (relative importance for linear regression, correlation method), embedded (LASSO), and wrapper-based (best subset regression, stepwise regression, recursive feature elimination, iterative predictor weighting partial least squares, Boruta, simulated annealing, and genetic algorithms) methods. Our analysis revealed that the five most influential drivers stem from macroeconomic aspects. We found that domestic investment, foreign direct investment, and inflation positively contribute to the adoption of renewable energy sources. On the other hand, industrial development and trade openness negatively affect renewable energy consumption in Madagascar.
    Date: 2023–10–26
    URL: http://d.repec.org/n?u=RePEc:osf:africa:pfrhx&r=big
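Among the methods the note compares, the embedded (LASSO) step is easy to illustrate: standardize the candidate drivers, fit a LASSO regression, and keep the features with nonzero coefficients. The synthetic data below stands in for the 12 macro drivers over 1990-2021 and is an illustrative assumption.

```python
# Sketch: embedded feature selection via LASSO; synthetic data stands in
# for the 12 candidate drivers of renewable energy consumption.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.normal(size=(32, 12))               # 32 years x 12 candidate features
# target actually driven by features 1 and 4
y = 1.2 * X[:, 1] - 0.8 * X[:, 4] + rng.normal(scale=0.1, size=32)

Xs = StandardScaler().fit_transform(X)      # LASSO needs comparable scales
lasso = Lasso(alpha=0.1).fit(Xs, y)
kept = np.flatnonzero(lasso.coef_)          # features with nonzero coefficients
```

Filter methods (correlation ranking) and wrapper methods (stepwise, RFE, Boruta) would be run on the same matrix and their selected sets compared, as the note does.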
  20. By: Hainaut, Donatien (Université catholique de Louvain, LIDAM/ISBA, Belgium)
    Abstract: Guaranteed Minimum Accumulation Benefits (GMAB) are retirement savings vehicles which protect the policyholder against downside market risk. This article proposes a valuation method for these contracts based on physics-inspired neural networks (PINNs), in the presence of multiple financial and biometric risk factors. A PINN integrates principles from physics into its learning process to enhance its efficiency in solving complex problems. In this article, the driving principle is the Feynman-Kac (FK) equation, a partial differential equation (PDE) governing the GMAB price in an arbitrage-free market. In our context, the FK PDE depends on multiple variables and is hard to solve by classical finite difference approximations. In comparison, PINNs constitute an efficient alternative which, furthermore, can evaluate GMABs with various specifications without retraining. To illustrate this, we consider a market with four risk factors. We first find a closed-form expression for the GMAB that serves as a benchmark for the PINN. Next, we propose a scaled version of the FK equation that we solve with a PINN. Pricing errors are analyzed in a numerical illustration.
    Date: 2023–09–18
    URL: http://d.repec.org/n?u=RePEc:aiz:louvad:2023029&r=big
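    The Feynman-Kac link the abstract relies on equates the solution of the pricing PDE with a discounted risk-neutral expectation, which can also be estimated by plain Monte Carlo. The sketch below prices a stylized GMAB-like payoff max(S_T, G) under a single geometric Brownian motion factor; the parameters are hypothetical and the paper's actual contract has four risk factors, including biometric ones.

```python
# Monte Carlo estimate of the Feynman-Kac expectation e^{-rT} E[max(S_T, G)]
# for a single GBM risk factor (toy parameters, not the paper's model).
import math
import random

def gmab_mc_price(s0, guarantee, r, sigma, maturity, n_paths, seed=42):
    """Discounted expected payoff of max(S_T, guarantee) under risk-neutral GBM."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)
    disc = math.exp(-r * maturity)
    total = 0.0
    for _ in range(n_paths):
        s_T = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += max(s_T, guarantee)
    return disc * total / n_paths

price = gmab_mc_price(s0=100.0, guarantee=100.0, r=0.02, sigma=0.2,
                      maturity=10.0, n_paths=50_000)
print(round(price, 2))
```

    Since max(S_T, G) = S_T + (G − S_T)^+, the price must exceed S0 = 100: the guarantee is the embedded put option. With several risk factors this expectation becomes expensive to simulate per contract specification, which is the motivation the abstract gives for solving the FK PDE with a PINN instead.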
  21. By: Phan Cong Thao Tien; Tran Thien Phuc; Nguyen Thi Hai Binh
    Abstract: A hybrid analysis combining Structural Equation Modeling (SEM), Artificial Neural Networks (ANN), and Importance-Performance Map Analysis (IPMA) was used to examine how YouTube videos affect university students’ acceptance of eLearning in Ho Chi Minh City (HCMC). Performance expectancy was the most important component in both the ANN and IPMA assessments, and the proposed model offers several explanations for how individual determinants shape the intention to use eLearning via Internet services. The findings support earlier research showing that performance and effort expectancy strongly influence eLearning adoption. The report encourages academics in HCMC to make use of YouTube; respondents expressed a desire to employ modern technologies in their teaching. The findings are discussed in light of the UTAUT and TAM frameworks.
    Keywords: Social media, YouTube, Higher education, eLearning, Ho Chi Minh City’s students, TAM, UTAUT
    Date: 2023
    URL: http://d.repec.org/n?u=RePEc:zbw:esconf:279146&r=big
  22. By: Hites Ahir; Mr. Giovanni Dell'Ariccia; Davide Furceri; Mr. Chris Papageorgiou; Hanbo Qi
    Abstract: This paper uses text analysis to construct a continuous financial stress index (FSI) for 110 countries over each quarter during the period 1967-2018. It relies on a computer algorithm along with human expert oversight and is thus easy to update. The new indicator has a larger country and time coverage and higher frequency than similar measures focusing on advanced economies. And it complements existing binary chronologies in that it can assess the severity of financial crises. We use the indicator to assess the impact of financial stress on the economy using both country- and firm-level data. Our main findings are fivefold: i) consistent with existing literature, we show an economically significant and persistent relationship between financial stress and output; ii) the effect is larger in emerging markets and developing economies and iii) for higher levels of financial stress; iv) we deal with simultaneous causality by constructing a novel instrument—financial stress originating from other countries—using information from the text analysis, and show that, while there is clear evidence that financial stress harms economic activity, OLS estimates tend to overestimate the magnitude of this effect; v) we confirm the presence of an exogenous effect of financial stress through a difference-in-differences exercise and show that effects are larger for firms that are more financially constrained and less profitable.
    Keywords: Financial stress; text analysis; country reports; continuous indicator; country and firm economic activity.
    Date: 2023–10–20
    URL: http://d.repec.org/n?u=RePEc:imf:imfwpa:2023/217&r=big
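    A text-based stress indicator of the kind described above can be sketched as a normalized count of stress-related terms in a country report. This is only a toy illustration: the vocabulary, the weighting, and the handling of context in the paper's actual algorithm (which also involves human expert oversight) are far richer.

```python
# Toy dictionary-based stress score: stress-term hits per 100 words.
# The term list is illustrative, not the paper's lexicon.
import re

STRESS_TERMS = {"crisis", "default", "contagion", "insolvency",
                "bank run", "credit crunch", "liquidity squeeze"}

def stress_index(text):
    """Stress-term occurrences per 100 words of a report excerpt."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    joined = " ".join(words)
    # Substring counting is crude (no word-boundary check); fine for a sketch.
    hits = sum(joined.count(term) for term in STRESS_TERMS)
    return 100.0 * hits / len(words)

calm = "Growth remained robust and inflation stayed within the target band."
tense = ("The banking crisis deepened as a credit crunch and fears of "
         "contagion triggered a bank run and sovereign default concerns.")
print(stress_index(calm), stress_index(tense))
```

    Aggregating such scores by country and quarter yields a continuous panel, which is what lets the index grade the severity of episodes rather than merely flag crisis/no-crisis as binary chronologies do.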

This nep-big issue is ©2023 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.