Harry Lansley

Micro/Subtle Expressions and Behaviour Analysis

In this blog we explore the compelling case for using linguistic cues as a tool for predicting and identifying fraudulent accounts.

I was talking to a friend of mine about performance drags on active equity portfolios. It seems that portfolio managers often see one or two of the stocks in a portfolio “blow up”, exerting downward pressure on overall performance.

Unquestionably, the reasons for these stocks’ plummeting share prices can be many, but more often than not it is because something unfavourable has happened that analysts and investors did not detect. Is it at all possible to address this? I am sure that there are as many views on the matter as there are fund managers. But it would not be unfair to say that in some instances such price reactions are related to profit warnings or, even worse, to the discovery of fraudulent reporting or obfuscation. So what can be done to avoid investing in such stocks?

Since I have a keen interest in the area of deception and fraud detection, I quickly recalled an article that I had recently come across about using forensic accounting and language characteristics to predict financial statement fraud. It seems that traditional forensic accounting methods do have some merit in identifying financial statements that are likely to be fraudulent. There is, however, an issue with these methods: they also indicate fraud in statements that are later found to be truthful. In other words, the so-called false positive rate is relatively high.

The forensic accounting profession has long been testing the predictive power of various variables and ratios. Linguistics, for its part, has also worked hard at finding indicators that might have predictive power in the identification of fraudulent accounts. The contribution of Purda and Skillicorn (2012) is their innovative way of addressing the issues of high false positives and low predictive power by combining the accounting and the linguistic approaches.

Encouragingly, the results suggest that the rate of false positives comes down if the data (financial statements) is analysed independently by each method and the findings are then cross-referenced. With a false positive rate of 9% and a classification accuracy above 82%, the applications for accounting and finance look promising, I would say.
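To make the cross-referencing idea concrete, here is a minimal sketch. It assumes that cross-referencing means a statement is flagged only when both methods flag it, which is one possible reading and not a description of Purda and Skillicorn's actual procedure. The flags and labels are made-up toy data, and all function names are hypothetical.

```python
def cross_reference(flags_a, flags_b):
    """Flag a statement only if BOTH methods flag it (assumed rule)."""
    return [a and b for a, b in zip(flags_a, flags_b)]


def false_positive_rate(predicted, actual):
    """Share of truthful statements that were wrongly flagged as fraud."""
    false_positives = sum(p and not a for p, a in zip(predicted, actual))
    truthful = sum(not a for a in actual)
    return false_positives / truthful if truthful else 0.0


def accuracy(predicted, actual):
    """Share of statements classified correctly (fraud or truthful)."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy data, invented for illustration: True means "fraudulent".
actual = [True, True, False, False, False, False]
accounting_flags = [True, True, True, False, True, False]
language_flags = [True, False, False, True, True, False]

combined = cross_reference(accounting_flags, language_flags)

print("FPR, accounting only:", false_positive_rate(accounting_flags, actual))
print("FPR, cross-referenced:", false_positive_rate(combined, actual))
print("Accuracy, cross-referenced:", accuracy(combined, actual))
```

In this toy data the false positive rate falls from 0.5 to 0.25 once both methods have to agree, but one genuinely fraudulent statement slips through because only one method flagged it. That trade-off comes with any rule that requires agreement.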

Another indication of what linguistics can do in fraud detection is given in Goel et al. (2010), a study that achieved a predictive accuracy of 89% by examining the narrative parts of annual reports. The study found that:

Fraudulent annual reports contained more passive-voice sentences, used more uncertainty markers, had a higher type-token ratio lexical variety [I struggle to figure out what that means], and were more difficult to read and comprehend than non-fraudulent annual reports.
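For what it is worth, the type-token ratio is usually defined as the number of distinct words (types) divided by the total number of words (tokens), so a higher value means a more varied vocabulary. Below is a minimal sketch of that definition. The tokenisation rule (lowercase runs of letters and apostrophes) is an assumption made for illustration, and Goel et al. may well have computed the measure differently.

```python
import re


def type_token_ratio(text):
    """Distinct words divided by total words; 0.0 for empty text."""
    # Assumed tokenisation: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# "the" appears twice, so there are 5 types among 6 tokens.
print(type_token_ratio("The cat sat on the mat"))
```

Bear in mind that the ratio tends to fall as a text gets longer, because common words keep repeating, so it is only meaningful when comparing texts of similar length.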

However, such studies often need to be taken with a healthy dose of critical thinking, as they are susceptible to data limitations, data selection biases and other methodological issues that limit the ability to achieve the same results in real-life applications. Not to mention the need to acquire software solutions and the time needed to become proficient in using them and interpreting the output.

From my own experience as a fund manager, the time available for a deep-dive analysis of individual investments is relatively limited. Admittedly, Bloomberg has a few basic forensic accounting ratios in its arsenal to give a very rough indication of the “credibility” of the accounts (e.g. Altman Z-score), and I am sure there are a number of apps on its platform that can add more measurements to the assessment.
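As an illustration of how rough such a ratio is, here is a sketch of the original Altman (1968) Z-score for publicly traded manufacturing companies. Strictly speaking it was designed as a predictor of financial distress, not of fraud. The function and parameter names are hypothetical, the input figures are invented, and nothing here reflects how Bloomberg calculates or presents the score.

```python
def altman_z(working_capital, retained_earnings, ebit,
             market_value_equity, sales, total_assets, total_liabilities):
    """Original Altman (1968) Z-score for public manufacturing firms."""
    return (1.2 * working_capital / total_assets
            + 1.4 * retained_earnings / total_assets
            + 3.3 * ebit / total_assets
            + 0.6 * market_value_equity / total_liabilities
            + 1.0 * sales / total_assets)


def zone(z):
    """Conventional cut-offs for the original model."""
    if z > 2.99:
        return "safe"
    if z >= 1.81:
        return "grey"
    return "distress"


# Invented figures, all in the same currency unit.
z = altman_z(working_capital=20, retained_earnings=30, ebit=10,
             market_value_equity=80, sales=150,
             total_assets=100, total_liabilities=40)
print(round(z, 2), zone(z))  # 3.69 safe
```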

But what do you do when you have a “gut feeling” or a suspicion that something is amiss, you have a meeting with the company management, and you would really like to get a better handle on their credibility? It would be great to record the meeting, then come back, feed the file into an app and get a “green” for truth and a “red” for lie, highlighting what was wrong. Two issues:

  • Even though I really would like to record the meetings (Oh boy! Did I feel tempted a few times!), it would be extremely unethical to do it covertly. And if you ask the company executives for permission to record the meeting, it is bye-bye to rapport and to colourful answers to your questions.
  • No such reliable app exists, nor, for that matter, a scientific basis for one. So what do you do?

Linguistic indicators can actually be helpful in such a situation, and there are some that you will be able to notice in real time if you are trained to observe them. Researchers have found, for example, that the use of distancing and more tentative words, specific usage of pronouns, emotional terms, and qualifiers or modifying language are related to deception. It has also been found that combinations of such indicators, and even better, combinations of indicators across multiple communication channels, result in more accurate veracity assessments. For more details see the article about the SCAnR system, which has yet to be tested on financial language but, luckily for us, research is already under way.

This is turning out to be a really long piece, and there are so many other aspects of linguistic analysis in the world of finance still to be touched upon. I suppose we’ll have to discuss Natural Language Processing, and researchers’ obsession with sentiment analysis for predicting the direction of share price movements, in a future post.

About the author

Harry Lansley

Specialist in Micro/Subtle Expressions and Behaviour Analysis. Harry is certified to the highest level in the Facial Action Coding System (FACS) used for the objective measurement of facial muscle movement.