
PICALO - Frauddetectionusingpicalo 091205200125 Phpapp02

The document discusses developing "detectlets" which are encoded algorithms that can detect specific fraud indicators or schemes by analyzing transactional data, with the goal of making fraud detection more accessible to auditors. It provides an example detectlet for detecting bid rigging by analyzing similarities between bids. The detectlets are presented as a potential way to more effectively integrate artificial intelligence into fraud detection performed during audits.

Uploaded by

cvaca

Detectlets for Better Fraud Detection

Conan C. Albrecht, PhD


Marriott School of Management
Brigham Young University
Today’s Presentation

• Give a few fraud stories
• Outline the Detectlet vision and Picalo architecture
• Show example code and working products
• Describe future research directions and solicit help
Two Types of Fraud

• Fraud on behalf of an organization
– Financial statement manipulation to make the company look better to stockholders
– Also called management fraud
• Fraud against an organization
– Stealing assets, information, etc.
– Also called employee or consumer fraud
ACFE Report to the Nation: Occupational Fraud and Abuse

• 2 1/2 year study of 2,608 frauds totaling $15 million
– Fraud costs U.S. organizations more than $400 billion annually
– Fraud and abuse cost employers an average of $9 a day per employee
– The average organization loses about 6 percent of its total annual revenue to fraud and abuse admitted to by its own employees
Ernst & Young Fraud Study 2002 (Europe)

• One in five workers is aware of fraud in their workplace
• 80% would be willing to turn in a colleague, but only 43% have
• Employers lost 20 cents on every dollar to workplace fraud
• Types of fraud
– Theft of office items: 37%
– Claiming extra hours worked: 16%
– Inflating expense accounts: 7%
– Taking kickbacks from suppliers: 6%
Cost of Fraud

• Fraud losses reduce net income dollar for dollar
• If profit margin is 10%, revenues must increase by 10 times losses to recover the effect on net income
– Losses: $1 Million
– Revenue: $1 Billion

Revenues     $100   100%
Expenses       90    90%
Net Income   $ 10    10%
Fraud           1
Remaining    $  9

To restore income to $10, need $10 more dollars of revenue to generate $1 more dollar of income.
Fraud Cost….Two Examples

• Automobile Manufacturer
– $436 Million Fraud
– Profit Margin = 10%
– $4.36 Billion in Revenues Needed
– At $20,000 per Car, 218,000 Cars
• Large Bank
– $100 Million Fraud
– Profit Margin = 10%
– $1 Billion in Revenues Needed
– At $100 per year per Checking Account, 10 Million New Accounts
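The arithmetic behind these two examples can be checked with a few lines of Python. This is only a sketch: the function name `revenue_needed` is made up for illustration, and the figures are the ones quoted on the slide.

```python
def revenue_needed(fraud_loss, profit_margin_pct):
    """Revenue required to earn back a fraud loss at a given profit margin (in percent)."""
    return fraud_loss * 100 / profit_margin_pct


# Automobile manufacturer: $436 million fraud, 10% margin, $20,000 per car
auto_revenue = revenue_needed(436_000_000, 10)
cars_to_sell = auto_revenue / 20_000

# Large bank: $100 million fraud, 10% margin, $100 per year per checking account
bank_revenue = revenue_needed(100_000_000, 10)
new_accounts = bank_revenue / 100

print(f"Auto: ${auto_revenue:,.0f} in revenue, {cars_to_sell:,.0f} cars")
print(f"Bank: ${bank_revenue:,.0f} in revenue, {new_accounts:,.0f} accounts")
```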
A Recent Fraud

• Large fraud of $2.6 billion over 9 years
– Year 1 $600K
– Year 3 $4 million
– Year 5 $80 million
– Year 7 $600 million
– Year 9 $2.6 billion
• In years 8 and 9, four of the world’s largest banks were involved and lost over $500 million

[Chart: fraud amount by year, Year 1 through Year 9, vertical axis from 0 to 3,000,000,000]

Some of the organizations involved: Merrill Lynch, Chase, J.P. Morgan, Union Bank of Switzerland, Crédit Lyonnais, Sumitomo, and others.
Every Person Has A Price

• Abraham Lincoln once threw a man out of his office, angrily turning down a substantial bribe. “Every man has his price,” explained Lincoln, “and he was getting close to mine.”
Examples of Data-Based Detection
Superhuman Workers

• Summed all hours (normal, OT, DT) per two-week period, regardless of invoice or timecard
• Workers were logging hours on two timecards for simultaneous jobs
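A minimal sketch of this kind of test, in plain Python rather than Picalo: sum every hour type per worker and pay period across all timecards, then flag totals above a plausibility ceiling. The sample rows, field layout, and the 12-hours-a-day ceiling are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical timecard rows: (worker, pay_period, timecard_id, normal, overtime, doubletime)
timecards = [
    ("W001", "2004-P01", "TC-10", 80, 10, 0),
    ("W001", "2004-P01", "TC-11", 80, 20, 8),   # second timecard in the same period
    ("W002", "2004-P01", "TC-12", 80, 4, 0),
    ("W003", "2004-P01", "TC-13", 72, 0, 0),
]

MAX_PLAUSIBLE_HOURS = 14 * 12  # assumed ceiling: 12 hours a day for 14 days


def total_hours_by_worker(rows):
    """Sum normal, OT, and DT hours per (worker, period), ignoring the timecard they came from."""
    totals = defaultdict(float)
    for worker, period, _timecard, normal, overtime, doubletime in rows:
        totals[(worker, period)] += normal + overtime + doubletime
    return dict(totals)


def flag_superhuman(rows, limit=MAX_PLAUSIBLE_HOURS):
    """Return {(worker, period): hours} for totals above the limit."""
    return {key: hours for key, hours in total_hours_by_worker(rows).items() if hours > limit}


flagged = flag_superhuman(timecards)
print(flagged)
```

Only worker W001 is flagged here, because the two timecards together exceed the assumed ceiling even though neither does alone.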
The Family Business

[Chart: Work Orders Authorized By Purchaser]
[Chart: Invoice Charges Authorized By Purchaser]
[Chart: Work Orders Given To Contractor Crew]

• Tip stated that kickbacks were occurring with a certain company
• We researched the company and determined which purchaser authorized the work
• A contractor crew and company purchaser were family
Systematic Increases In Spending
Unexpected Peaks In Spending
Increases In Only Part Of A Trend
Caught by his Pool…
Research Background
Accounting History

• 1940 SEC Statement: “Accountants can be expected to detect gross overstatements of assets and profits whether resulting from collusive fraud or otherwise” (Accounting Series Release 1940)
• 1961: “If the ten (auditing) standards now accepted were
satisfactory for their purpose we would not have the
pleas for guidance on the extent of (auditors’)
responsibility for the detection of irregularities we now
find in our professional literature.” (Mautz & Sharaf 1961)
• 1997 - SAS 82
• 2002 - SAS 99

Expectation Gap
Historical Fraud Research

• Excellent literature review by Nieschwietz, Shultz, & Zimbelman (2000)
– Who commits fraud
– Red flags
– Expectation gap
– Auditor expectations
– Game theory between auditors and management
– Auditor-client relationships
– Risk assessment, decision aids
– Management factors affecting fraud
FS Fraud using Ratio Analysis

• Hansen et al. (1996) developed a generalized qualitative-response model from internal sources
• Green and Choi (1997) used neural networks to classify fraudulent cases
• Summers and Sweeney (1998) identified FS fraud using external and internal information
• Beneish (1999) developed a probit model using ratios for fraud identification
• Bell and Carcello (2000) developed a logistic regression model to identify fraud
• Current work by McKee and by Cecchini and by Albrecht
• None have found the “silver bullet” in using external information to identify fraud
– Management (FS) fraud is very difficult to find
What are the Big 4 Doing?

• Each firm seems to have different groups working on fraud detection
– No best practices model has emerged
• IT auditors perform control testing on
company systems, not fraud detection
• Meeting with Bill Titera of EY
Why Don’t “They” Find Fraud?

• Limited time
– Our most precious resource is our attention
• History
– Heavy use of sampling - lack of detail
– Lack of historical fraud detection instruction
• Lack of fraud symptom expertise
• Lack of fraud-specific tools
• Lack of analysis skills
• Lack of expertise in technology
• Auditors do find 20-30 percent of fraud
» ACFE 2004 Report to the Nation
Isn’t there a better way?
• Reasonable time requirements
• Integrate AI and auto-detection
• Within reach of most auditors (highly technical skills not required)
• Integrate easily into different database schemas
• Cost effective
Initial Thoughts

• A small “manual” about frauds
– CliffsNotes-style summaries of different types of fraud
– Describes the scheme
– Describes the indicators of the scheme
• Worldwide repository with contributions from many different industries
• Primary focus was training
Detectlets

• A detectlet encodes:
– Background information on a scheme
– Detail on a specific indicator of the scheme
– Wizard interface to walk the user through
input selection
– Algorithm coded in standard format
– “How to interpret results” follow-up
• Input is one or more table objects
• Output is one or more table objects
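The parts listed above could be bundled roughly as follows. This is a hypothetical sketch, not Picalo's actual detectlet format: the `Detectlet` class, its field names, the list-of-dictionaries `Table` stand-in, and the toy duplicate-invoice algorithm are all assumptions chosen to mirror the bullet points.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Stand-in for a table object: a list of row dictionaries (not Picalo's Table class)
Table = List[Dict[str, object]]


@dataclass
class Detectlet:
    background: str                        # background information on the scheme
    indicator: str                         # the specific indicator being tested
    wizard_prompts: List[str]              # questions a wizard would ask to select inputs
    algorithm: Callable[..., List[Table]]  # one or more tables in, one or more tables out
    interpretation: str                    # "how to interpret results" follow-up

    def run(self, *tables: Table) -> List[Table]:
        """Apply the encoded algorithm to the selected input tables."""
        return self.algorithm(*tables)


def repeated_invoice_numbers(invoices: Table) -> List[Table]:
    """Toy algorithm: return the rows whose invoice number appears more than once."""
    counts: Dict[object, int] = {}
    for row in invoices:
        counts[row["invoice_no"]] = counts.get(row["invoice_no"], 0) + 1
    return [[row for row in invoices if counts[row["invoice_no"]] > 1]]


detectlet = Detectlet(
    background="A vendor may bill the same invoice twice.",
    indicator="The same invoice number appears on more than one row.",
    wizard_prompts=["Which table holds invoices?", "Which column is the invoice number?"],
    algorithm=repeated_invoice_numbers,
    interpretation="Each returned row shares its invoice number with another row.",
)

invoices = [
    {"invoice_no": "A-1", "amount": 500.0},
    {"invoice_no": "A-2", "amount": 125.0},
    {"invoice_no": "A-1", "amount": 500.0},
]
results = detectlet.run(invoices)
print(results[0])
```

Because both input and output are tables, the result of one detectlet can be passed straight into another.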
Detectlet Demonstration

• Bid rigging where one person prepares all bids

Item BidderAUnit BidderATotal BidderBUnit BidderBTotal BidderCUnit BidderCTotal
1.1.10 1829.85 1829.65 2100.00 1895.00
1.1.20 1256.99 1256.99 1380.00 1301.88
1.1.30 3467.52 3467.52 3900.00 3591.36
1.1.40 4.21 421.00 4.65 465.00 4.36 436.00
1.1.50 1.91 229.20 2.10 252.00 1.98 237.00
1.1.60 13328.00 13328.00 15100.00 13804.00
1.1.70 3360.00
1.2.10 32.48 162.40 35.60 178.00 33.62 168.20
1.2.20 13.22 661.00 14.50 725.00 13.69 684.50
1.2.30 13.89 694.00 15.25 762.50 14.38 719.00
1.2.40 9.97 229.10 10.95 328.50 10.32 309.60
1.3.10 124.43 373.29 136.65 409.95 128.88 386.64
1.3.20 139.63 279.26 153.35 306.70 144.62 289.24
1.3.30 34.12 102.36 37.45 112.35 35.34 106.02
1.3.40 124.43 622.15 136.65 683.25 128.88 644.40
1.3.50 26.82 536.40 29.45 589.00 27.78 655.60
1.3.60 20.80 416.00 22.85 457.00 21.54 430.80
1.3.70 39.66 793.20 43.55 871.00 41.08 821.60
1.3.80 51.48 1287.00 56.55 1413.75 53.32 1333.00
1.3.90 52.96 1324.00 58.10 1452.60 54.85 1371.25
1.3.100 52.96 847.36 58.10 929.60 54.85 877.60
1.3.110 277.28 11091.20 304.50 12180.00 287.19 11487.60
1.3.120 203.53 223.50 210.80
1.3.130 45.99 2759.40 50.50 3030.00 47.63 2857.80
1.3.140 12.19 487.60 13.40 536.00 12.63 505.20
1.3.150 11.70 468.00 12.85 514.00 12.12 484.80
1.3.160 12.49 249.80 13.70 274.00 12.94 258.80
1.3.170 2.45 24.50 2.70 27.00 2.54 25.40
1.3.180 326.39 326.39 358.00 338.05
1.4.10 9541.68 9541.62 10480.00 10480.00 9882.46 9882.46
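One way such a similarity test might work, offered as a sketch rather than as the algorithm the demonstrated detectlet uses: for each pair of bidders, compute the ratio of their unit prices item by item and measure how much that ratio varies. A nearly constant ratio suggests one bid was derived from the other. The five rows below are copied from fully populated rows of the table; the function names and the 0.02 threshold are assumptions.

```python
from itertools import combinations
from statistics import mean, stdev

# Unit prices from five fully populated rows of the bid table
unit_prices = {
    "1.1.40": {"A": 4.21, "B": 4.65, "C": 4.36},
    "1.1.50": {"A": 1.91, "B": 2.10, "C": 1.98},
    "1.2.10": {"A": 32.48, "B": 35.60, "C": 33.62},
    "1.2.20": {"A": 13.22, "B": 14.50, "C": 13.69},
    "1.3.10": {"A": 124.43, "B": 136.65, "C": 128.88},
}


def ratio_variation(prices, first, second):
    """Coefficient of variation of the second/first unit-price ratio across items."""
    ratios = [row[second] / row[first] for row in prices.values()]
    return stdev(ratios) / mean(ratios)


def suspicious_pairs(prices, bidders=("A", "B", "C"), threshold=0.02):
    """Return {(bidder, bidder): variation} for pairs whose ratio is nearly constant."""
    result = {}
    for first, second in combinations(bidders, 2):
        variation = ratio_variation(prices, first, second)
        if variation < threshold:
            result[(first, second)] = variation
    return result


flagged = suspicious_pairs(unit_prices)
for pair, variation in flagged.items():
    print(pair, round(variation, 4))
```

In these rows Bidder B's unit prices sit at roughly 1.10 times Bidder A's and Bidder C's at roughly 1.035 times, which is the kind of pattern that would warrant a closer look.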
Potential Supporting Platforms

• MS Access
• ACL or IDEA
• Build a ground-up application
– Allows total control over platform
– Stays with open source rather than tying the program
to a particular platform
• For example, consider PowerBuilder
– Supports Windows, Unix, Linux, Mac
– Allows embedded use within a greater platform
– Personal preference was Python
Picalo: The Supporting Platform
Central Detectlet Repository
How Detectlets Address the Problem

• Limited Time: Detectlets provide a wizard interface for quick execution; they can be chained and automated into a larger system
• High Cost: Detectlets are based in open source software, putting them within reach of small and large accounting firms; they also create a community environment for fraud detection
How Detectlets Address the Problem

• Lack of fraud symptom expertise: Detectlets provide a large library of available routines to both train and walk auditors through the detection process
• Lack of fraud-specific tools: Picalo provides an open solution that we can improve over time; it puts a fraud-specific toolkit in the hands of auditors
How Detectlets Address the Problem

• Lack of analysis skills: Detectlets encode full algorithms and code, allowing the auditor to stay at the conceptual level rather than the implementation level
• Lack of expertise in technology: Detectlets provide a wizard-based solution that is easy to use; Picalo provides an Excel-like user interface
Picalo Level 1 API
Data Structures

The Table object is the basic data structure. Nearly all routines both take and return tables, allowing them to be chained. Its methods include sorting, column operations, row operations, and import/export from delimited text and Excel formats.

Column types include Boolean, Integer, Floating Point, Date, DateTime, String, etc.
Simple Module

Provides joining, matching, fuzzy matching, and selection.

col_join, col_left_join, col_right_join, col_match, col_match_same, col_match_diff, compare_records, custom_match, custom_match_same, custom_match_diff, describe, expression_match, find_duplicates, find_gaps, fuzzysearch, fuzzymatch, fuzzycoljoin, get_unordered, join, left_join, right_join, select, select_by_value, select_outliers, select_outliers_z, select_nonoutliers, select_nonoutliers_z, select_records, soundex, soundexcol, sort, etc.
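To give a feel for what routines like gap finding, duplicate finding, and z-score outlier selection do, here is a plain-Python sketch. The function names and signatures below are illustrative assumptions and are not the Picalo functions listed above; the sample data is invented.

```python
from statistics import mean, pstdev


def gaps_in_sequence(numbers):
    """Return the values missing between the smallest and largest number."""
    present = set(numbers)
    return [n for n in range(min(present), max(present) + 1) if n not in present]


def duplicates(values):
    """Return values that occur more than once, in first-seen order."""
    seen, repeated = set(), []
    for value in values:
        if value in seen and value not in repeated:
            repeated.append(value)
        seen.add(value)
    return repeated


def outliers_by_z(values, z_limit=2.0):
    """Return values whose z-score magnitude exceeds z_limit."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > z_limit]


check_numbers = [1001, 1002, 1004, 1005, 1005, 1008]
amounts = [120, 135, 110, 125, 130, 118, 122, 5000]

print(gaps_in_sequence(check_numbers))
print(duplicates(check_numbers))
print(outliers_by_z(amounts))
```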
Benfords Module

calc_benford: Calculates probability for a single digit
get_expected: Calculates probability for a full number
analyze: Analyzes an entire data set and calculates summarized results
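For reference, Benford's Law gives the expected probability that a number begins with the digits d as log10(1 + 1/d). The sketch below computes that expectation and compares it with observed first digits. The function names are assumptions for illustration and may not match how the module's own routines behave.

```python
import math
from collections import Counter


def benford_probability(leading_digits):
    """Expected probability that a number begins with the given digits (e.g. 1 or 12)."""
    return math.log10(1 + 1 / leading_digits)


def first_digit(value):
    """First significant digit of a nonzero number."""
    return int(f"{abs(value):.15e}"[0])  # scientific notation starts with that digit


def compare_to_benford(values):
    """Return {digit: (observed share, expected share)} for first digits 1 through 9."""
    nonzero = [v for v in values if v != 0]
    counts = Counter(first_digit(v) for v in nonzero)
    return {d: (counts[d] / len(nonzero), benford_probability(d)) for d in range(1, 10)}


sample = [1829.85, 1256.99, 3467.52, 4.21, 1.91, 13328.00, 32.48, 13.22]
for digit, (observed, expected) in compare_to_benford(sample).items():
    print(digit, round(observed, 3), round(expected, 3))
```

A sample this small is only for demonstration; a meaningful comparison needs a large population of naturally occurring amounts.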
Crosstable Module

pivot: Similar to Excel’s pivot table function
pivot_table: Pivots and keeps detail in each cell
pivot_map: Pivots and keeps results in a dictionary rather than a grid
pivot_map_detail: Pivots and keeps results in a very detailed fashion using a dictionary
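The idea of pivoting into a dictionary rather than a grid can be sketched in plain Python as follows. The function `pivot_to_map`, its parameters, and the sample purchase rows are hypothetical and only illustrate the concept.

```python
from collections import defaultdict


def pivot_to_map(rows, row_key, col_key, value_key, summarize=sum):
    """Group values by (row_key, col_key) and summarize each cell into a nested dict."""
    cells = defaultdict(lambda: defaultdict(list))
    for row in rows:
        cells[row[row_key]][row[col_key]].append(row[value_key])
    return {r: {c: summarize(vals) for c, vals in cols.items()} for r, cols in cells.items()}


purchases = [
    {"purchaser": "Jones", "vendor": "Acme", "amount": 100},
    {"purchaser": "Jones", "vendor": "Acme", "amount": 250},
    {"purchaser": "Jones", "vendor": "Bolt", "amount": 75},
    {"purchaser": "Smith", "vendor": "Acme", "amount": 40},
]

totals = pivot_to_map(purchases, "purchaser", "vendor", "amount")
counts = pivot_to_map(purchases, "purchaser", "vendor", "amount", summarize=len)
print(totals)
print(counts)
```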
Database Module

OdbcConnection: Connects to any ODBC-compliant database
PostgreSQLConnection: Connects to PostgreSQL
MySQLConnection: Connects to MySQL

Also includes various query helper functions, such as query creation, results analysis, etc.
Financial Module

Calculates various financial ratios to help in financial statement analysis:

Current ratio
Quick ratio
Net working capital
Return on assets
Return on equity
Return on common equity
Profit margin
Earnings per share
Asset turnover
Inventory turnover
Debt to equity
Price earnings
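A few of these ratios are sketched below using their common textbook definitions. The function names and sample figures are assumptions, and the quick ratio is shown in its simplified form (current assets less inventory); the module itself may define these differently.

```python
def current_ratio(current_assets, current_liabilities):
    return current_assets / current_liabilities


def quick_ratio(current_assets, inventory, current_liabilities):
    # Simplified form: excludes inventory from current assets
    return (current_assets - inventory) / current_liabilities


def net_working_capital(current_assets, current_liabilities):
    return current_assets - current_liabilities


def profit_margin(net_income, revenue):
    return net_income / revenue


def debt_to_equity(total_liabilities, total_equity):
    return total_liabilities / total_equity


# Hypothetical balances, in thousands of dollars
print(current_ratio(500, 250))        # current assets / current liabilities
print(quick_ratio(500, 200, 250))
print(net_working_capital(500, 250))
print(profit_margin(10, 100))
print(debt_to_equity(600, 400))
```

Computing the same ratios for several periods makes sudden shifts easier to spot than reading the statements line by line.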
Grouping Module

Stratification gives the details behind SQL GROUP BY. It keeps the detail tables rather than summarizing them.

stratify: Stratifies a table into N tables
stratify_by_expression: Stratifies a table using an arbitrary expression
stratify_by_value: Stratifies on unique values
stratify_by_step: Stratifies based on a set numerical range
stratify_by_date: Stratifies based on a date range

Summarizing is similar to SQL GROUP BY, but it allows any type of function to be used for summarization (GROUP BY generally only allows sum, stdev, mean, etc.). This can be done in the same ways as stratification.
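The difference between stratifying and summarizing can be sketched in plain Python. The two functions below are illustrative assumptions, not the module's own signatures: the first keeps the detail rows for each unique value, and the second applies an arbitrary function (here, the median) to each group.

```python
from collections import defaultdict
from statistics import median


def stratify_rows_by_value(rows, key):
    """Split rows into one detail list per unique value of key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def summarize_rows_by_value(rows, key, value_key, func):
    """Apply any function to the value_key values of each group."""
    groups = stratify_rows_by_value(rows, key)
    return {k: func([r[value_key] for r in group]) for k, group in groups.items()}


invoices = [
    {"vendor": "Acme", "amount": 100},
    {"vendor": "Acme", "amount": 300},
    {"vendor": "Bolt", "amount": 50},
    {"vendor": "Acme", "amount": 200},
    {"vendor": "Bolt", "amount": 70},
]

strata = stratify_rows_by_value(invoices, "vendor")
medians = summarize_rows_by_value(invoices, "vendor", "amount", median)
print(len(strata["Acme"]), len(strata["Bolt"]))
print(medians)
```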
Trending Module

Various ways of analyzing trends and patterns over time.

cusum, highlow_slope, average_slope, regression, handshake_slope
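As a rough illustration of two of these ideas, the sketch below computes a cumulative sum of deviations from the mean and a least-squares slope over evenly spaced periods. These are generic formulations under assumed function names; the module's own routines may differ in inputs and output.

```python
def cusum_of_deviations(values):
    """Running total of each value's deviation from the series mean."""
    avg = sum(values) / len(values)
    running, totals = 0.0, []
    for value in values:
        running += value - avg
        totals.append(running)
    return totals


def least_squares_slope(values):
    """Least-squares slope of values against their position 0, 1, 2, ..."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


monthly_spend = [100, 110, 120, 130, 140]  # hypothetical spending by month
print(cusum_of_deviations(monthly_spend))
print(least_squares_slope(monthly_spend))
```

A steadily positive slope in one purchaser's spending, with flat spending elsewhere, is the kind of systematic increase shown in the earlier charts.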


Python Libraries

Powerful yet easy language with a significant online community
Full object-oriented support (classes, inheritance, etc.)
Text manipulation and analysis routines
Web site spidering routines
Email analysis routines
Random number generation
Connection to nearly all databases
Web site development and maintenance
Countless libraries available online (almost all are open source)
Research Directions
Level 1 Research

• Foundation routines for fraud detection
– Development, testing, empirical use, field studies
• Connections to production software
– Standard SAP, Oracle, PeopleSoft, JD Edwards, etc. modules
• Application of CS, statistics, other techniques to fraud detection
– Time series analysis
– Pattern recognition for fraud detection
Level 2 Research

• Studies about detectlet presentation, user interface
• Creation and testing of detectlets for industries, data schemas, etc.
• Detectlets for financial statement fraud detection
• Testing of detectlet vs. traditional ACL-type fraud detection
• Patterns of detectlet development, best practices
Level 3 Research

• Automatic mapping of field schemas to a common schema
• Application of expert system, learning models for automatic detection
– Decision trees
– Classification models
• Meta-detectlets to combine various Level 2 detectlets into higher-level logic
Other Research

• Group-oriented processes for the central repository
– Searching, categorization
– Testing, rating systems
• Marketplaces for detectlets
• Development of Picalo itself
My Hope

• In 5 years we’ll have a large repository of detectlets to:
– Support both external and internal auditors
– Teach students in fraud classes
– Conduct theoretical and empirical research

http://www.picalo.org/
