Finding the Words

I had lunch with a long-time colleague the other day, and when the conversation turned to our May training event next week, he commented that when conducting a fraud examination he had always found it helpful to draw up a list of words specifically associated with the type of fraud scenario he was working on. He found the exercise useful when scanning the piles of textual material he frequently had to plow through during complex examinations.

Data analysis in the traditional sense involves running rule-based queries on structured data, such as that contained in transactional databases or financial accounting systems. This type of analysis can yield valuable insight into potential frauds. But a more complete analysis requires that fraud examiners (like my friend) also consider unstructured textual data. Data are either structured or unstructured. Structured data are the type found in a database, consisting of recognizable and predictable structures. Examples include sales records, payment or expense details, and financial reports. Unstructured data, by contrast, are data that would not be found in a traditional spreadsheet or database. They are typically text based.

Our clients’ employees are sending and receiving more email messages each year, retaining ever more electronic source documents, and using more social media tools. Today, we can expect unstructured data to come from numerous sources, including:

• Social media posts
• Instant messages
• Videos
• Voice files
• User documents
• Mobile phone software applications
• News feeds
• Sales and marketing material
• Presentations

Textual analytics is a method of using software to extract usable information from unstructured text data. Through the application of linguistic technologies and statistical techniques, including weighted fraud indicators (e.g., my friend’s fraud keywords) and scoring algorithms, textual analytics software can categorize data to reveal patterns, sentiments, and relationships indicative of fraud. For example, an analysis of email communications might help a fraud examiner gauge the pressures/incentives, opportunities, and rationalizations to commit fraud that exist in a client organization.

According to my colleague, as a prelude to textual analytics (and depending on the type of fraud risk present in the investigation), the examiner will frequently profit by coming up with a list of fraud keywords that are likely to point to suspicious activity. This list will depend on the industry of the client, the suspected fraud schemes, and the data set the fraud examiner has available. In other words, if s/he is running a search through journal entry detail, s/he will likely search for different fraud keywords than if s/he were running a search of emails. The factors identified in the ACFE’s fraud triangle are a helpful starting point for building the list. Consider how someone in the entity under investigation might have the opportunity to commit fraud, be under pressure to commit fraud, or be able to rationalize the commission of fraud.

Many people commit fraud because of something that has happened in their life that motivates them to steal. Maybe they find themselves in debt, or perhaps they must meet a certain goal to qualify for a performance-based bonus. Keywords that might indicate pressure include deadline, quota, trouble, short, problem, and concern. Think of words that would indicate that someone has the opportunity or ability to commit fraud. Examples include override, write-off, recognize revenue, adjust, discount, and reserve/provision.

Since most fraudsters do not have a criminal background, justifying their actions is a key part of committing fraud. Some keywords that might indicate a fraudster is rationalizing his or her actions include reasonable, deserve, and temporary.
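A keyword list organized by the three fraud triangle factors can be applied to a piece of text with very little code. The following is a minimal sketch, not a substitute for textual analytics software: it assumes plain whole-word matching, and the function name `count_keyword_hits` and the sample sentence are hypothetical. The keywords themselves are the examples given above.

```python
import re
from collections import Counter

# Keyword examples from the text, grouped by fraud triangle factor.
FRAUD_KEYWORDS = {
    "pressure": ["deadline", "quota", "trouble", "short", "problem", "concern"],
    "opportunity": ["override", "write-off", "recognize revenue", "adjust",
                    "discount", "reserve", "provision"],
    "rationalization": ["reasonable", "deserve", "temporary"],
}

def count_keyword_hits(text):
    """Count whole-word keyword hits in text for each fraud triangle factor."""
    lowered = text.lower()
    hits = Counter()
    for factor, keywords in FRAUD_KEYWORDS.items():
        for keyword in keywords:
            # \b restricts matches to whole words or phrases.
            pattern = r"\b" + re.escape(keyword) + r"\b"
            hits[factor] += len(re.findall(pattern, lowered))
    return hits

# Hypothetical email sentence used only for illustration.
sample = "We will miss the quota unless we override the control. It is only temporary."
print(count_keyword_hits(sample))
```

Whole-word matching keeps "short" from firing on "shortly", but it also misses inflected forms such as "adjusted"; an examiner would likely want to add variants or stemming for a real data set.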

Although the concepts embodied in the fraud triangle are a good place to start when developing a keyword list, it’s also important to consider the nature of the client entity’s industry and the types of payments it makes or is suspected of making. Think about the fraud scenarios that are likely to have occurred. Does the entity do a significant amount of work overseas or have many contractors? If so, there might be an elevated risk of bribery. Focus on the payment text descriptions in journal entries or in work-related documentation, since no one calls it “bribe expense.” Some examples of word combinations in payment descriptions that might merit special attention include:

• Goodwill payment
• Consulting fee
• Processing fee
• Incentive payment
• Donation
• Special commission
• One-time payment
• Special payment
• Friend fee
• Volume contract incentive

Any payment descriptions bearing these or similar terms warrant extra scrutiny to check for reasonableness. Also, examiners should always be wary of large cash disbursements that have a blank journal payment description.
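Both checks, suspicious wording and large cash disbursements with no description, can be sketched as a simple screening pass over journal entries. This is an illustration under stated assumptions: the entry fields (`description`, `amount`, `method`), the sample entries, and the `LARGE_CASH_THRESHOLD` value are all hypothetical and would need to match the client's actual ledger layout and materiality level.

```python
# Word combinations from the list above.
SUSPICIOUS_TERMS = [
    "goodwill payment", "consulting fee", "processing fee", "incentive payment",
    "donation", "special commission", "one-time payment", "special payment",
    "friend fee", "volume contract incentive",
]

# Hypothetical cutoff for a "large" cash disbursement.
LARGE_CASH_THRESHOLD = 10_000

def flag_payment(entry):
    """Return the reasons an entry merits extra scrutiny (empty list if none)."""
    description = (entry.get("description") or "").strip().lower()
    reasons = []
    for term in SUSPICIOUS_TERMS:
        if term in description:
            reasons.append(f"description contains '{term}'")
    is_large_cash = (entry.get("method") == "cash"
                     and entry.get("amount", 0) >= LARGE_CASH_THRESHOLD)
    if not description and is_large_cash:
        reasons.append("large cash disbursement with blank description")
    return reasons

# Hypothetical journal entries used only for illustration.
entries = [
    {"id": 1, "description": "Consulting fee - Q3", "amount": 4_500, "method": "wire"},
    {"id": 2, "description": "", "amount": 25_000, "method": "cash"},
    {"id": 3, "description": "Office supplies", "amount": 300, "method": "card"},
]
for entry in entries:
    print(entry["id"], flag_payment(entry))
```

A flag here only means the entry deserves a closer look for reasonableness; plenty of consulting fees and donations are legitimate.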

Beyond keyword lists, the ACFE tells us that another way to discover fraud clues hidden in text is to consider the emotional tone of employee correspondence. In emails and instant messages, for instance, a fraud examiner should identify derogatory, surprised, secretive, or worried communications. In one example, former Enron CEO Ken Lay’s emails were analyzed, revealing that as the company came closer to filing for bankruptcy, his email correspondence grew increasingly derogatory, confused, and angry. This type of analysis provided powerful evidence that he knew something was wrong at the company.

While advanced textual analytics can be extremely revealing and can provide clues for potential frauds that might otherwise go unnoticed, the successful application of such analytics requires the use of sophisticated software, as well as a thorough understanding of the legal environment of employee rights and workplace searches. Consequently, fraud examiners who are considering adding textual analytics to their fraud detection arsenal should consult with technological and legal experts before undertaking such techniques.

Even with sophisticated data analysis techniques, some data are so vast or complex that they remain difficult to analyze using traditional means. Visually representing data via graphs, link diagrams, time-series charts, and other illustrative representations can bring clarity to a fraud examination. The utility of visual representations increases as data grow in volume and complexity. Visual analytics build on humans’ natural ability to absorb a greater volume of information in visual rather than numeric form and to perceive certain patterns, shapes, and shades more easily than others.

Link analysis software is used by fraud examiners to create visual representations (e.g., charts with lines showing connections) of data from multiple data sources to track the movement of money; demonstrate complex networks; and discover communications, patterns, trends, and relationships. Link analysis is very effective for identifying indirect relationships and relationships with several degrees of separation. For this reason, link analysis is particularly useful when conducting a money laundering investigation because it can track the placement, layering, and integration of money as it moves through unexpected sources. It could also be used to detect a fictitious vendor (shell company) scheme. For instance, the investigator could map visual connections between a variety of entities that share an address and bank account number to reveal a fictitious vendor created to embezzle funds from a company. The following are some other examples of the analyses and actions fraud examiners can perform using link analysis software:

• Associate communications, such as email, instant messages, and internal phone records, with events and individuals to reveal connections.
• Uncover indirect relationships, including those that are connected through several intermediaries.
• Show connections between entities that share an address, bank account number, government identification number (e.g., Social Security number), or other characteristics.
• Demonstrate complex networks (including social networks).
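The idea behind connecting entities through shared attributes, including indirect connections through intermediaries, can be illustrated without dedicated link analysis software. The sketch below is an assumption-laden toy: the records, field names, and the functions `build_links` and `connected` are hypothetical, and a real tool would add visualization on top of this kind of graph.

```python
from collections import defaultdict

# Hypothetical master-file records; names and field layout are assumptions.
records = [
    {"name": "Acme Supplies", "kind": "vendor", "address": "12 Elm St", "bank_account": "111"},
    {"name": "J. Doe", "kind": "employee", "address": "12 Elm St", "bank_account": "222"},
    {"name": "Northwind LLC", "kind": "vendor", "address": "9 Oak Ave", "bank_account": "222"},
    {"name": "Globex", "kind": "vendor", "address": "40 Pine Rd", "bank_account": "333"},
]

def build_links(records, attributes=("address", "bank_account")):
    """Link every pair of entities that share a value for any listed attribute."""
    by_value = defaultdict(set)
    for rec in records:
        for attr in attributes:
            by_value[(attr, rec[attr])].add(rec["name"])
    graph = defaultdict(set)
    for names in by_value.values():
        for name in names:
            graph[name] |= names - {name}
    return graph

def connected(graph, start):
    """Return every entity reachable from start, directly or via intermediaries."""
    seen, stack = {start}, [start]
    while stack:
        for neighbor in graph[stack.pop()]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen - {start}

graph = build_links(records)
print(sorted(connected(graph, "Acme Supplies")))
```

In this made-up data, Acme Supplies shares an address with an employee, who in turn shares a bank account with Northwind LLC, so both vendors surface in the same cluster even though they have no attribute in common with each other.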

Imagine a listing of vendors, customers, employees, or financial transactions of a global company. Most of the time, these records will contain a reference to a location, including country, state, city, and possibly specific street address. By visually analyzing the site or frequency of events in different geographical areas, a fraud investigator has yet another variable with which s/he can make inferences.

Finally, timeline analysis software aids fraud examiners in transforming their data into visual timelines. These visual timelines enable fraud examiners to:

• Highlight key times, dates, and facts.
• More readily determine a sequence of events.
• Analyze multiple or concurrent sequences of events.
• Track unaccounted-for time.
• Identify inconsistencies or impossibilities in data.
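Several of these timeline checks reduce to sorting events and comparing neighbors. The following rough sketch assumes each event is a hypothetical `(start, end, description)` tuple for a single person; the function name `analyze_timeline` and the one-hour gap tolerance are illustrative choices, not features of any particular timeline product.

```python
from datetime import datetime, timedelta

# Hypothetical events for one individual: (start, end, description).
events = [
    (datetime(2024, 3, 4, 13, 0), datetime(2024, 3, 4, 14, 0), "Vendor meeting"),
    (datetime(2024, 3, 4, 9, 0), datetime(2024, 3, 4, 10, 30), "Badge-in and approvals"),
    (datetime(2024, 3, 4, 13, 30), datetime(2024, 3, 4, 15, 0), "Bank branch visit"),
]

def analyze_timeline(events, max_gap=timedelta(hours=1)):
    """Sort events, then report unaccounted-for gaps and overlapping events."""
    ordered = sorted(events, key=lambda e: e[0])
    gaps, overlaps = [], []
    if not ordered:
        return ordered, gaps, overlaps
    latest_end, latest_label = ordered[0][1], ordered[0][2]
    for start, end, label in ordered[1:]:
        if start - latest_end > max_gap:
            # Nothing recorded between the last known end and this start.
            gaps.append((latest_end, start))
        elif start < latest_end:
            # Two events claim the same time: a possible impossibility.
            overlaps.append((latest_label, label))
        if end > latest_end:
            latest_end, latest_label = end, label
    return ordered, gaps, overlaps

ordered, gaps, overlaps = analyze_timeline(events)
print([label for _, _, label in ordered])
print(gaps)
print(overlaps)
```

An overlap is only a lead: two records placing the same person in different places at once may point to a falsified record, or simply to a data entry error.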
