Needles & Haystacks

A long-time acquaintance of mine told me recently that, fresh out of the University of Virginia and new to forensic accounting, his first assignment consisted in searching, at the height of summer, through two unairconditioned trailers full of thousands of savings and loan records for what turned out to be just two documents critical to proving a loan fraud. He told me that he thought then that his job would always consist of finding needles in haystacks. Our profession and our tools have, thankfully, come a long way since then!

Today, digital analysis techniques afford the forensic investigator the ability to perform cost-effective financial forensic investigations. This is achieved through the following:

— The ability to test or analyze 100 percent of a data set, rather than merely sampling the data set.
–Massive amounts of data can be imported into working files, which allows for the processing of complex transactions and the profiling of certain case-specific characteristics.
–Anomalies within databases can be quickly identified, thereby reducing the number of transactions that require review and analysis.
–Digital analysis can be easily customized to address the scope of the engagement.

Overall, digital analysis can streamline investigations that involve a large number of transactions, often turning a needle-in-the-haystack search into a refined and efficient investigation. Digital analysis is not designed to replace the pick-and-shovel aspect of an investigation. However, the proper application of digital analysis will permit the forensic operator to efficiently identify those specific transactions that require further investigation or follow up.

As every CFE knows, there are an ever-growing number of software applications that can assist the forensic investigator with digital analysis. A few such examples are CaseWare International Inc.’s IDEA, ACL Services Ltd.’s ACL Desktop Edition, and the ActiveData plug-in, which can be added to Excel.

So, whether using the Internet in an investigation or using software to analyze data, fraud examiners can today rely heavily on technology to aid them in almost any investigation. More data is stored electronically than ever before; financial data, marketing data, customer data, vendor listings, sales transactions, email correspondence, and more, and evidence of fraud can be located within that data. Unfortunately, fraudulent data often looks like legitimate data when viewed in the raw. Taking a sample and testing it might or might not uncover evidence of fraudulent activity. Fortunately, fraud examiners now have the ability to sort through piles of information by using special software and data analysis techniques. These methods can identify future trends within a certain industry, and they can be configured to identify breaks in audit control programs and anomalies in accounting records.

In general, fraud examiners perform two primary functions to explore and analyze large amounts of data: data mining and data analysis. Data mining is the science of searching large volumes of data for patterns. Data analysis refers to any statistical process used to analyze data and draw conclusions from the findings. These terms are often used interchangeably.

If properly used, data analysis processes and techniques are powerful resources. They can systematically identify red flags and perform predictive modeling, detecting a fraudulent situation long before many traditional fraud investigation techniques would be able to do so.

Big data is now a buzzword in the worlds of business, audit, and fraud investigation. Big data are high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization. Simply put, big data is information of extreme size, diversity, and complexity.

In addition to thinking of big data as a single set of data, fraud investigators should think about the way data grow when different data sets are connected together that might not normally be connected. Big data represents the continuous expansion of data sets, the size, variety, and speed of generation of which makes it difficult to manage and analyze.

Big data can be instrumental to fact gathering during an investigation. Distilled down to its core, how do fraud examiners gather data in an investigation? We look at documents and financial or operational data, and we interview people. The challenge is that people often gravitate to the areas with which they are most comfortable. Attorneys will look at documents and email messages and then interview individuals. Forensic accounting professionals will look at the accounting and financial data (structured data). Some people are strong interviewers. The key is to consider all three data sources in unison. Big data helps to make it all work together to tell the complete picture. With the ever-increasing size of data sets, data analytics has never been more important or useful. Big data requires the use of creative and well-planned analytics due to its size and complexity. One of the main advantages of using data analytics in a big data environment is, as indicated above, that it allows the investigator to analyze an entire population of data rather than having to choose a sample and risk drawing conclusions in the event of a sampling error.

To conduct an effective data analysis, a fraud examiner must take a comprehensive approach. Any direction can (and should) be taken when applying analytical tests to available data. The more creative fraudsters get in hiding their schemes, the more creative the fraud examiner must become in analyzing data to detect these schemes. For this reason, it is essential that fraud investigators consider both structured and unstructured data when planning their engagements.
Data are either structured or unstructured. Structured data is the type of data found in a database, consisting of recognizable and predictable structures. Examples of structured data include sales records, payment or expense details, and financial reports.

Unstructured data, by contrast, is data not found in a traditional spreadsheet or database. Examples of unstructured data include vendor invoices, email and user documents, human resources files, social media activity, corporate document repositories, and news feeds.

When using data analysis to conduct a fraud examination, the fraud examiner might use structured data, unstructured data, or a combination of the two. For example, conducting an analysis on email correspondence (unstructured data) among employees might turn up suspicious activity in the purchasing department. Upon closer inspection of the inventory records (structured data), the fraud examiner might uncover that an employee has been stealing inventory and covering her tracks in the records.

Data mining has roots in statistics, machine learning, data management and databases, pattern recognition, and artificial intelligence. All of these are concerned with certain aspects of data analysis, so they have much in common; yet they each have a distinct and individual flavor, emphasizing particular problems and types of solutions.

Although data mining technologies provide key advantages to marketing and business activities, they can also manipulate financial data that was previously hidden within a company’s database, enabling fraud examiners to detect potential fraud.

Data mining software provides an easy to use process that gives the fraud examiner the ability to get to data at a required level of detail. Data mining combines several different techniques essential to detecting fraud, including the streamlining of raw data into understandable patterns.

Data mining can also help prevent fraud before it happens. For example, computer manufacturers report that some of their customers use data mining tools and applications to develop anti-fraud models that score transactions in real-time. The scoring is customized for each business, involving factors such as locale and frequency of the order, and payment history, among others. Once a transaction is assigned a high-risk score, the merchant can decide whether to accept the transaction, deny it, or investigate further.

Often, companies use data warehouses to manage data for analysis. Data warehouses are repositories of a company’s electronic data designed to facilitate reporting and analysis. By storing data in a data warehouse, data users can query and analyze relevant data stored in a single location. Thus, a company with a data warehouse can perform various types of analytic operations (e.g., identifying red flags, transaction trends, patterns, or anomalies) to assist management with its decision making responsibilities.

In conclusion, after the fraud examiner has identified the data sources, s/he should identify how the information is stored by reviewing the database schema and technical documentation. Fraud examiners must be ready to face a number of pitfalls when attempting to identify how information is stored, from weak or nonexistent documentation to limited collaboration from the IT department.

Moreover, once collected, it’s critical to ensure that the data is complete and appropriate for the analysis to be performed. Depending on how the data was collected and processed, it could require some manual work to make it usable for analysis purposes; it might be necessary to modify certain field formats (e.g., date, time, or currency) to make the information usable.

Leave a Reply