Abstract: Drafting prospectuses entails significant costs for mutual funds, yet the SEC has not demonstrated that prospectuses are actually relevant for investors. We use machine learning to study the importance of the textual information contained in the prospectuses of a broad sample of US active equity mutual funds, available via EDGAR (the SEC’s online reporting system). Using supervised learning on ex-ante prospectuses, we predict which funds are more likely to engage in destructive, agency-related risk-shifting behavior up to three years ahead. We also use unsupervised learning to group funds into distinct clusters based on the similarity of their prospectus text, uncovering groups with natural interpretations such as “quantitative”, “macro-focused”, “sector-timing”, “regulatory/political risk”, and “derivatives risk”. We find that membership in particular clusters is predictive of the shape of funds’ future return distributions.
Bio: Professor Simona Abis joined Columbia Business School in 2017. She holds a PhD from INSEAD. Before joining the PhD program, Simona worked as a quantitative researcher for a systematic hedge fund. Her research interests span information economics, empirical and theoretical asset pricing, machine learning, mutual funds, and hedge funds. More broadly, Simona is interested in the impact of technology on financial markets. Her current research focuses on the impact of technological change on investment management through the rise of quantitative investment, and on identifying the informational content of funds’ mandatory disclosures to the regulator.