The analysis of qualitative information has a long tradition in computer science (natural language processing, NLP) and linguistics (corpus linguistics). Analyzing language, both spoken and written, can provide powerful insights into economic consequences, yet these methods have only recently started to gain traction in accounting and finance. Estimates suggest that 90% of all available data created in the last 10 years, 80% of it in a business context, is qualitative or unstructured.
The literature in accounting and finance has only scratched the surface of what textual analysis can do: it relies primarily on basic NLP techniques such as “bag-of-words” methods and makes little use of corpus methods. Assessments suggest that the state of affairs in accounting and finance lags significantly behind developments in computational linguistics and machine learning. Surprisingly little is known about how corporate disclosures interact with alternative information sources to influence investor behavior. Recently, however, hedge funds have started to make increasing use of textual analysis for trading purposes, which makes it even more important for corporations to use language strategically.
The amount of corporate information is increasing exponentially, and most of it is non-numerical data such as text, images, and video. Regulatory innovations in financial and non-financial reporting require corporations to provide rich information not only on their financial activities but also on their corporate governance and their environmental and social activities. Information provided by financial analysts, the media, and users of social networks adds to the mix.
How does this new informational landscape shape corporate transparency? Are financial markets still capable of sifting through all this data and pricing firms correctly? How do institutional investors deal with these questions? Is regulatory intervention needed? These were the motivating questions for the Center for Financial Reporting and Auditing at ESMT Berlin to organize a one-day workshop on natural language processing in financial markets.