This repository contains a fraud detection analysis that predicts fraudulent transactions made to retail shops from their transaction history. It identifies the best combination of sampling technique and machine learning model for fraud detection by comparing their quality with precision-recall (PR) curves.
Link: fraud transaction analysis
-
Increased the detection accuracy for genuine transactions by 47% after selecting the best model and data pre-processing pipeline.
-
Analyzed, cleaned, and pre-processed a dataset of 41,989 records, then assessed it using various combinations of sampling techniques and machine learning models.
-
Instead of relying on a single model, combined 3 distinct models with 3 sampling methods, creating 9 combinations from which to select the best one.
-
Imported the required libraries and prepared the dataset for analysis: filtered the important variables and structured the data in MS Excel.
-
Started with data cleaning: instead of removing incomplete records, replaced missing values with the median of each column to preserve records and yield more accurate insights.
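A minimal sketch of this median-imputation step with pandas; the column names and values are illustrative, not the project's actual schema:

```python
import pandas as pd
import numpy as np

# Hypothetical transaction records with missing values.
df = pd.DataFrame({
    "amount": [120.0, np.nan, 75.5, 310.0, np.nan],
    "items":  [2, 4, np.nan, 1, 3],
})

# Replace each missing value with the median of its column
# instead of dropping the incomplete record.
df = df.fillna(df.median(numeric_only=True))
```

Median imputation is preferred over mean imputation here because transaction amounts are typically skewed, and the median is robust to outliers.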
-
During pre-processing, converted categorical data into numerical data for modelling; 5 categorical variables were converted.
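One way to sketch the categorical-to-numerical step with pandas; the two columns below are hypothetical stand-ins for the project's 5 categorical variables:

```python
import pandas as pd

# Illustrative categorical columns (not the project's actual variables).
df = pd.DataFrame({
    "payment_type": ["card", "cash", "card", "online"],
    "store_region": ["north", "south", "north", "east"],
})

# factorize() assigns each distinct category an integer code per column.
for col in ["payment_type", "store_region"]:
    df[col] = pd.factorize(df[col])[0]
```

For models sensitive to ordinal relationships, one-hot encoding via `pd.get_dummies` is a common alternative to integer codes.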
-
Applied sampling to compensate for the under-represented fraud class; the project makes use of over-sampling, under-sampling, and SMOTE.
-
Each of the 3 resampled datasets is processed with 3 models to detect fraudulent transactions, yielding 9 sets of predictions.
-
Used PR curves to study each model's behaviour; precision, recall, and F1 score are compared for each model.
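The PR-curve comparison can be sketched with scikit-learn's metrics; the data and classifier below are illustrative assumptions, not the project's actual setup:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (precision_recall_curve, precision_score,
                             recall_score, f1_score)

# Synthetic imbalanced data and a stand-in classifier.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# PR curve: precision/recall at every probability threshold.
scores = model.predict_proba(X_te)[:, 1]
precision, recall, thresholds = precision_recall_curve(y_te, scores)

# Point metrics at the default 0.5 threshold, for side-by-side comparison.
y_pred = model.predict(X_te)
p = precision_score(y_te, y_pred)
r = recall_score(y_te, y_pred)
f1 = f1_score(y_te, y_pred)
```

PR curves suit fraud detection better than ROC curves because, with a rare positive class, precision is far more sensitive to false alarms than the false-positive rate is.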