9th Annual Bloomberg-Columbia Machine Learning in Finance Workshop 2023
The workshop is organized by:
VQ-TR: Vector Quantized Attention for Time Series Forecasting
Modern time series datasets can easily contain hundreds or thousands of temporal points. However, Transformer-based models scale poorly due to quadratic complexity in sequence length, constraining their context size in the sequence-to-sequence setting. To this end, we introduce VQ-TR, which maps large sequences to a discrete set of latent representations as part of the Attention module. This allows us to attend over larger context windows with linear complexity in sequence length. We evaluate VQ-TR on a variety of public time series datasets from different domains and report performance using both probabilistic and point forecasting metrics. We show that VQ-TR performs better than or comparably to a wide range of competitive deep learning (including Transformer-based) and classical univariate probabilistic models while being computationally efficient.
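The core idea, replacing the full set of keys with a small learned codebook so that attention cost grows linearly rather than quadratically in sequence length, can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: a plain k-means codebook and mean-pooled values stand in for a learned vector quantizer.

```python
import numpy as np

rng = np.random.default_rng(0)

def vq_attention(Q, K, V, num_codes=8, iters=10):
    """Attend over a small codebook instead of all N keys.

    Keys are quantized to `num_codes` centroids (plain k-means here), so the
    attention matrix is (N x num_codes) rather than (N x N): linear, not
    quadratic, in sequence length N.
    """
    # --- build a k-means codebook for the keys ---
    codes = K[rng.choice(len(K), num_codes, replace=False)]
    for _ in range(iters):
        assign = np.argmin(((K[:, None] - codes[None]) ** 2).sum(-1), axis=1)
        for c in range(num_codes):
            if (assign == c).any():
                codes[c] = K[assign == c].mean(0)
    # Pool values per code so each codeword carries an aggregate value.
    pooled_V = np.stack([V[assign == c].mean(0) if (assign == c).any()
                         else np.zeros(V.shape[1]) for c in range(num_codes)])
    # --- standard softmax attention, but over the codebook ---
    scores = Q @ codes.T / np.sqrt(Q.shape[1])           # (N, num_codes)
    weights = np.exp(scores - scores.max(1, keepdims=True))
    weights /= weights.sum(1, keepdims=True)
    return weights @ pooled_V                            # (N, d_v)

N, d = 512, 16
out = vq_attention(rng.normal(size=(N, d)), rng.normal(size=(N, d)),
                   rng.normal(size=(N, d)))
print(out.shape)  # (512, 16)
```

Note that the score matrix has `num_codes` columns regardless of how long the input sequence is, which is what buys the linear scaling.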
Anderson is an Executive Director at the Morgan Stanley Machine Learning Research Department. He joined Morgan Stanley in 2019. Anderson was previously a quantitative researcher/trader at Infinium Capital and Tower Research Capital and a senior quantitative researcher at Graham Capital. Anderson has authored and co-authored conference and journal papers. He received his Ph.D. in Economics from the University of Minnesota.
Applying Deep Hedging to Commodity Heat Rate Options
Deep hedging is an exciting new machine learning application in front office derivatives risk management, where a neural network replaces traditional risk neutral pricing and risk formulas. It has the advantage of reproducing traditional derivative pricing when the assumptions behind risk neutral pricing hold, but extends naturally to real-world cases where those assumptions are broken. I'll discuss the technique and look at an application of it to the problem of managing a heat rate option, which is a financial version of a gas-fired power plant.
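The training objective behind deep hedging, choosing a state-dependent hedge ratio to minimize a risk measure of the hedged P&L over simulated paths, can be illustrated compactly. The toy below hedges a vanilla call on a single GBM underlying with a hedge ratio that is linear in hand-picked features, so the quadratic hedging error has a closed-form minimizer; a real deep hedger replaces the linear model with a neural network and the toy dynamics with a market simulator for the heat rate option.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate GBM paths for the underlying (a stand-in for the power/gas spread).
n_paths, n_steps, S0, sigma, dt = 20000, 30, 100.0, 0.2, 1.0 / 30
Z = rng.standard_normal((n_paths, n_steps))
S = S0 * np.exp(np.cumsum(-0.5 * sigma**2 * dt + sigma * np.sqrt(dt) * Z, axis=1))
S = np.hstack([np.full((n_paths, 1), S0), S])
payoff = np.maximum(S[:, -1] - 100.0, 0.0)          # vanilla call as a toy target

# Hedge ratio at each step is a function of (moneyness, time). A deep hedger
# uses a neural network; here it is linear in simple features, so minimizing
# the quadratic hedging error reduces to least squares.
def features(s, t):
    m = np.log(s / 100.0)
    return np.stack([np.ones_like(s), m, m**2, np.full_like(s, t)], axis=-1)

dS = np.diff(S, axis=1)                              # (n_paths, n_steps)
X = np.concatenate([features(S[:, t], t * dt) * dS[:, t:t + 1]
                    for t in range(n_steps)], axis=1)  # P&L is linear in theta
y = payoff - payoff.mean()
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
hedged_pnl = y - X @ theta
print(hedged_pnl.std() < y.std())  # hedging cuts P&L variability: True
```

When the simulator matches the assumptions of risk-neutral pricing, the learned hedge recovers the classical delta; the appeal is that the same training loop still works when those assumptions are broken.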
Mark Higgins is the co-founder and Chief Analytics Officer at Beacon Platform Inc, a financial technology company that provides an enterprise technology platform for capital markets, aimed at quants, data scientists, and front office developers. Prior to co-founding Beacon in 2014, Dr. Higgins spent eight years at JPMorgan Chase, where he launched the Athena project and co-headed the Quantitative Research group for the Investment Bank. He spent eight years at Goldman Sachs as a quant on the FX and interest rate desks, and has a PhD in astrophysics from Queen's University in Canada.
Scientific Machine Learning (SciML): Fast Solving and Automated Construction of Stochastic Models
Scientific machine learning (SciML) methods allow for the automatic discovery of mechanistic models by infusing neural network training into the simulation process. In this talk we will start by showcasing some of the ways that SciML methods, like universal differential equations (UDEs), are being used. Demonstrations ranging from the automated discovery of relativistic corrections to black hole physics to the construction of earthquake-safe buildings showcase the successes of these techniques across scientific domains. From there, we will demonstrate how many difficult issues in stochastic modeling can be handled with SciML: the solution of high-dimensional Black-Scholes equations, the automated discovery of stochastic differential equation models, unbiased surrogates and automatic differentiation of jump diffusion models, and the accelerated convergence of Dynamic Stochastic General Equilibrium (DSGE) and stochastic-on-stochastic simulations. We will then show how these SciML methods are transitioning to industrial use, explaining how SciML led to improved trajectory planning in track-side computers for Formula 1 races, accelerated the clinical trials of the Covid vaccine, and improved computational crash testing. The audience will leave with an understanding of how these latest SciML ideas for incorporating prior mechanistic knowledge into deep learning can greatly improve the ability to predict from small data relative to purely data-driven machine learning techniques.
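The UDE idea, keep the known physics and learn only the residual mechanism from data, can be sketched in miniature. A real UDE trains a neural network inside an ODE solver; in this hypothetical example the hidden mechanism is a polynomial, so the learned component can be fit by least squares on derivatives estimated from the trajectory.

```python
import numpy as np

# True dynamics: du/dt = -u + 0.5*u^2, where the physics term -u is known and
# the mechanism 0.5*u^2 must be discovered from data.
dt = 0.01
t = np.arange(0.0, 2.0 + dt / 2, dt)
u = np.empty_like(t); u[0] = 0.5
for i in range(len(t) - 1):                   # simulate the true model (Euler)
    u[i + 1] = u[i] + dt * (-u[i] + 0.5 * u[i] ** 2)

dudt = np.gradient(u, t)                      # derivatives estimated from data
residual = dudt - (-u)                        # the part known physics misses
basis = np.stack([u, u ** 2, u ** 3], axis=1) # candidate terms for the mechanism
coef, *_ = np.linalg.lstsq(basis, residual, rcond=None)
recovered = basis @ coef
rel_err = np.linalg.norm(recovered - 0.5 * u ** 2) / np.linalg.norm(0.5 * u ** 2)
print(rel_err < 0.1)  # the learned term closely matches the hidden 0.5*u^2: True
```

The same structure, known terms plus a trainable component, is what carries over to the stochastic differential equation and DSGE settings mentioned in the abstract.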
Chris is the VP of Modeling and Simulation at JuliaHub, the Director of Scientific Research at Pumas-AI, Co-PI of the Julia Lab at MIT, and the lead developer of the SciML Open Source Software Organization. He is the lead developer of the Pumas project and has received a top presentation award at every ACoP in the last 3 years for improving methods for uncertainty quantification, automated GPU acceleration of nonlinear mixed effects modeling (NLME), and machine learning assisted construction of NLME models with DeepNLME. For these achievements, Chris received the Emerging Scientist award from ISoP. His work in mechanistic machine learning is credited with a 15,000x acceleration of NASA Launch Services simulations and recently demonstrated a 60x-570x acceleration over Modelica tools in HVAC simulation, earning Chris the US Air Force Artificial Intelligence Accelerator Scientific Excellence Award.
Similarity Learning in Finance
The financial literature contains ample research on the similarity and comparison of financial assets and securities such as stocks, bonds, and mutual funds. However, going beyond correlations or aggregate statistics has been arduous because financial datasets are noisy, lack useful features, contain missing data, and often lack ground truth or annotated labels.
Though similarity measures heuristically extrapolated from these traditional models may work well at an aggregate level, such as in risk management of large portfolios, they often fail when used for portfolio construction and trading, which require a local and dynamic measure of similarity on top of a global one. In this talk, I will start by demonstrating the importance of our research program on similarity learning in the financial domain by presenting many different potential applications. I will then describe various similarity learning methods with their advantages and disadvantages, and finally focus on tree-based distance metric learning applied to corporate bonds as an application.
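One concrete instance of tree-based distance metric learning is the ensemble "proximity" measure: two securities count as similar when a fitted tree ensemble routes them to the same leaves. A minimal sketch follows; the leaf-assignment matrix is simulated here for illustration, whereas in practice it would come from a forest trained on, say, bond characteristics and spreads.

```python
import numpy as np

rng = np.random.default_rng(3)

# `leaves[i, t]` is the leaf index bond i reaches in tree t of a fitted
# ensemble. Simulated: bonds within a latent sector usually land together.
n_bonds, n_trees = 6, 100
sector = np.array([0, 0, 0, 1, 1, 1])
leaves = np.where(rng.random((n_bonds, n_trees)) < 0.9,
                  sector[:, None],                       # usually sorted by sector
                  rng.integers(0, 2, (n_bonds, n_trees)))  # occasional noise

# Tree-based similarity: the fraction of trees in which two bonds share a leaf.
proximity = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)
distance = 1.0 - proximity
print(distance[0, 1] < distance[0, 3])  # same-sector bonds are closer: True
```

Unlike a global correlation, this distance is local and adapts to whatever target the trees were trained on, which is the property the talk's portfolio-construction applications rely on.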
Dr. Dhagash Mehta is the Head of Applied Machine Learning Research (Investment Management) at BlackRock Inc. and an Editorial Board Member at the Journal of Financial Data Science (https://jfds.pm-research.com/) and Journal of ESG and Impact Investing (https://jesg.pm-research.com/).
Previously he was a Senior Manager, Investment Strategist (Machine Learning – Asset Allocation) in the Investment Strategy Group at The Vanguard Group. Before joining Vanguard, he was a Senior Research Scientist at the United Technologies (now Raytheon Technologies) Research Center. Prior to that, he was a Research Professor in the Department of Applied and Computational Mathematics and Statistics, in conjunction with the Department of Chemical and Biomolecular Engineering, at the University of Notre Dame. He was a Postdoctoral Fellow for the Thematic Program on Computer Algebra at the Fields Institute, Toronto, in Fall 2015 and a Visiting Fellow at the Simons Institute for the Theory of Computing at Berkeley in Fall 2014. Previously, he held various research positions at the University of Cambridge (UK), Imperial College London (UK), the University of Adelaide (Australia), Syracuse University (USA), and the National University of Ireland Maynooth (Ireland).
Dr. Mehta’s research areas are machine/deep learning, quantitative finance, and computational mathematics, science, and engineering. In particular, he has worked on optimization (convex and nonconvex), computational algebraic geometry, numerical analysis, network science, and machine learning to solve problems arising in financial services and wealth/asset management (and, in the past, in power systems and control theory, theoretical and computational physics, jet engines, HVAC and building systems, chemistry, and biology).
Google Scholar page: https://scholar.google.com/citations?user=J7fyX_sAAAAJ&hl=en&oi=ao
A transformer-based model for default prediction in mid-cap corporate markets
We study mid-cap companies, i.e. publicly traded companies with less than US$10 billion in market capitalization. Using a large dataset of US mid-cap companies observed over 30 years, we look to predict the default probability term structure over the short to medium term and understand which data sources (i.e. fundamental, market or pricing data) contribute most to the default risk. Whereas existing methods typically require that data from different time periods are first aggregated and turned into cross-sectional features, we frame the problem as a multi-label panel data classification problem. To tackle it, we then employ transformer models, a state-of-the-art deep learning model emanating from the natural language processing domain. To make this approach suitable to the given credit risk setting, we use a multi-label classification loss function to deal with the term structure and propose a multi-channel architecture with differential training that allows the model to use all input data efficiently. Our results show that the proposed deep learning architecture produces superior performance, resulting in a sizeable improvement in AUC (Area Under the receiver operating characteristic Curve) over traditional models. In order to interpret the model, we also demonstrate how to produce importance rankings for the different data sources and their temporal relationships, using a Shapley approach for feature groups.
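The multi-label framing of the term structure can be made concrete: each horizon gets its own default-within-horizon label, and the loss is binary cross-entropy summed across horizons. A sketch with illustrative numbers only; the paper's model is a transformer, and only the loss shape is shown here.

```python
import numpy as np

# Multi-label term-structure loss: for each firm the model emits one
# probability per horizon (default within 1, 2, ..., H periods) and the loss
# is binary cross-entropy summed over horizons, averaged over firms.
def multilabel_bce(p, y, eps=1e-9):
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).sum(axis=1).mean()

# Two firms, four horizons. Firm 1 defaults between horizons 2 and 3, so its
# labels are monotone: once in default, in default at every longer horizon.
y = np.array([[0, 0, 1, 1],
              [0, 0, 0, 0]], dtype=float)
p_good = np.array([[0.10, 0.20, 0.80, 0.90],
                   [0.05, 0.05, 0.10, 0.10]])
p_bad = np.array([[0.80, 0.70, 0.20, 0.10],
                  [0.50, 0.50, 0.60, 0.60]])
print(multilabel_bce(p_good, y) < multilabel_bce(p_bad, y))  # True
```

Treating the horizons jointly lets one panel observation supervise the whole term structure at once, rather than training a separate classifier per horizon.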
Dr. Cristián Bravo is Associate Professor and Canada Research Chair in Banking and Insurance Analytics at the University of Western Ontario, Canada, where he serves as Director of the Banking Analytics Lab. His research focuses on the development and application of data science methodologies in the context of credit risk analytics, in areas such as deep learning, text analytics, image processing, causal inference, and social network analysis. He has over 50 publications in high-impact journals and conferences in operational research, finance, and computer science. He also serves as an editorial board member of Applied Soft Computing and the Journal of Business Analytics and is the co-author of the book “Profit Driven Business Analytics”, with over 6,000 copies sold to date. He has been quoted by The Wall Street Journal, WIRED, CTV, The Toronto Star, The Globe and Mail, and Global News, and is a regular panelist at the CBC News Weekend Business Panel and other national news shows discussing the latest news in Banking, Finance and Artificial Intelligence. He can be reached via LinkedIn, on Twitter at @CrBravoR, or through his lab website at https://thebal.ai.
Market Microstructure in the Big-data Era: Improving High-frequency Price Prediction via Machine Learning
Traditional empirical market microstructure models use linear models with a small set of features (e.g., trade imbalance and the best bid and ask prices) from a single market to study the price discovery process. However, in the big data era, high-frequency full limit-order-book (LOB) data are available from multiple markets, allowing for a much richer set of features. We construct machine learning models such as boosted decision trees and random forests, which endogenously pick the most relevant features from a large universe of LOB variables. We demonstrate that these models outperform classical linear models in high-frequency price prediction and information attribution. We highlight how the gains achieved by these models can be explained by the nonlinear price impact of LOB features, which is missing in traditional models.
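A toy version of the point about nonlinear price impact: simulate a saturating response of the mid-price to queue imbalance and compare a linear regression with the piecewise-constant fit that a regression tree's leaves produce. Quantile bins stand in for learned splits here; the talk's models are boosted trees and random forests on full LOB data.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy LOB snapshot: queue imbalance at the touch. The mid-price response
# saturates for extreme imbalance, a nonlinearity that linear microstructure
# regressions miss but tree-based models capture automatically.
n = 5000
bid_sz, ask_sz = rng.exponential(100.0, (2, n))
imbalance = (bid_sz - ask_sz) / (bid_sz + ask_sz)             # in (-1, 1)
dmid = np.tanh(3 * imbalance) + 0.1 * rng.standard_normal(n)  # saturating impact

# Classical linear microstructure regression.
slope, intercept = np.polyfit(imbalance, dmid, 1)
mse_linear = ((slope * imbalance + intercept - dmid) ** 2).mean()

# Piecewise-constant fit on quantile bins: the functional form a regression
# tree's leaves produce, standing in for boosted trees / random forests.
edges = np.quantile(imbalance, np.linspace(0, 1, 11))
idx = np.clip(np.digitize(imbalance, edges[1:-1]), 0, 9)
leaf_means = np.array([dmid[idx == k].mean() for k in range(10)])
mse_tree = ((leaf_means[idx] - dmid) ** 2).mean()
print(mse_tree < mse_linear)  # the nonlinear fit explains more: True
```

The same logic extends to hundreds of LOB features across venues: tree ensembles handle the feature selection and the nonlinearity in one pass.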
Agostino Capponi is an Associate Professor in the IEOR Department at Columbia University. His research interests are in financial technology, machine learning in finance, market microstructure, and financial networks.
Agostino's research has been funded by major government agencies and private corporations, including NSF, DARPA, DOE, IBM, GRI, Ripple, and Ethereum. His research has been recognized with the 2018 NSF CAREER award, and with the inaugural JP Morgan AI Research Faculty award. His research findings have attracted attention from major media outlets, including Bloomberg, Thomson Reuters, Politico, and the Financial Times. Agostino is a fellow of the crypto and blockchain economics research forum, an academic fellow of Alibaba's Luohan academy, and an external fellow of the FinTech Initiative at Cornell. He serves as an editor of Management Science in the Finance Department, co-editor of Mathematics and Financial Economics, and area editor of Operations Research Letters, and as an associate editor of many premier journals of his field. Agostino is the former Chair of the SIAG/FME Activity Group and of the INFORMS Finance Section, and the founding director of the Columbia Center for Digital Finance and Technologies. He is co-editor of the book "Machine Learning and Data Sciences for Financial Markets: A Guide to Contemporary Practices", recently published by Cambridge University Press.
Optimal Transport-based Distributionally Robust Decision Making and Applications to Portfolio Selection
We will discuss recent investigations on the development of distributionally robust portfolio selection strategies using optimal transport methods and martingale constraints. Our investigations are motivated by experiments showing that the out-of-sample performance of reasonable rolling-horizon strategies on a reasonably selected universe of stocks falls short of in-sample predictions. We discuss various optimal transport adversarial formulations with martingale constraints that are tractable and powerful (in the sense of being able to simultaneously both reweight and perturb samples in a way that is informed by current market information).
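A small sketch of the flavor of these formulations, much simpler than the talk's setting: for the linear (expected-return) part of a portfolio loss, the worst case over an optimal-transport (Wasserstein) ball of radius delta around the empirical distribution has a dual form that reduces to a norm penalty on the weights. Here that penalty is applied to the mean term only, on a two-asset long-only frontier; all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy in-sample returns for two assets; sample means are noisy, so a nominal
# mean-variance optimizer tends to chase estimation error.
R = rng.normal(loc=[0.010, 0.008], scale=0.05, size=(250, 2))
mu, Sigma = R.mean(axis=0), np.cov(R.T)

# Wasserstein-robust adjustment: worst-case expected return over a ball of
# radius delta (norm transport cost) equals the empirical value minus
# delta * ||w||, so robustness shows up as a norm penalty on the weights.
gamma, delta = 5.0, 0.004
w1 = np.linspace(0.0, 1.0, 1001)              # long-only two-asset frontier
W = np.stack([w1, 1.0 - w1], axis=1)
nominal = W @ mu - 0.5 * gamma * np.einsum('ij,jk,ik->i', W, Sigma, W)
robust = nominal - delta * np.linalg.norm(W, axis=1)
w_nom, w_rob = W[nominal.argmax()], W[robust.argmax()]
print(abs(w_rob[0] - 0.5) <= abs(w_nom[0] - 0.5))  # robust tilt is milder: True
```

The adversary's budget delta controls how aggressively in-sample estimates are discounted; the martingale constraints in the talk restrict the adversary further so the perturbations remain consistent with current market information.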
Jose Blanchet is a Professor of Management Science and Engineering (MS&E) at Stanford. Prior to joining MS&E, he was a professor at Columbia (Industrial Engineering and Operations Research, and Statistics, 2008-2017), and before that he taught at Harvard (Statistics, 2004-2008). Jose is a recipient of the 2010 Erlang Prize and several best publication awards in areas such as applied probability, simulation, operations management, and revenue management. He also received a Presidential Early Career Award for Scientists and Engineers in 2010. He worked as an analyst at Protego Financial Advisors, a leading investment bank in Mexico. He has research interests in applied probability and Monte Carlo methods. He is the Area Editor of Stochastic Models in Mathematics of Operations Research. He has served on the editorial board of Advances in Applied Probability, Bernoulli, Extremes, Insurance: Mathematics and Economics, Journal of Applied Probability, Queueing Systems: Theory and Applications, and Stochastic Systems, among others.
BloombergGPT: A Large Language Model for Finance
The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks; however, no LLM specialized for the financial domain has been reported in the literature. In this work, we present BloombergGPT, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg's extensive data sources, perhaps the largest domain-specific dataset yet, augmented with 345 billion tokens from general-purpose datasets. We validate BloombergGPT on standard LLM benchmarks, open financial benchmarks, and a suite of internal benchmarks that most accurately reflect our intended usage. Our mixed dataset training leads to a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks. Additionally, we explain our modeling choices, training process, and evaluation methodology.
Gideon Mann is the head of the ML Product and Research team in the Office of the CTO at Bloomberg LP. At Bloomberg, he guides corporate strategy for machine learning, natural language processing (NLP), information retrieval, and alternative data. His mandate includes building AI infrastructure (from GPUs to NLP libraries), incubating new technology (e.g., large language models), and new businesses (e.g., Bloomberg Second Measure).
He has over 30 publications and more than 20 patents in machine learning and NLP. He’s served as a founding member of the Data for Good Exchange (D4GX). Before joining Bloomberg in 2014, he worked at Google Research NY, where his team carried out basic research and developed machine learning products such as Colaboratory. He holds a Ph.D. from The Johns Hopkins University.
Early registration is available until Friday, May 5, 2023, after which regular registration rates will apply. The early (regular) registration rates are:
Corporate delegates: $200 ($250)
Academics, Alumni, & Non-Columbia students*: $50 ($75)
Current Columbia students*: $40 ($50)
No refund after May 5, 2023
*Those availing themselves of student rates will be required to show a valid student ID at the event.