Ecommerce fraud

Fraudulent purchases are expensive for online retailers: they lose the merchandise and never receive the funds. Excessive chargeback rates carry further economic consequences. Identifying fraudulent transactions efficiently is key to managing these losses. However, blocking legitimate transactions creates a poor customer experience and can quickly drive those customers to a competitor.

Overview

Accurately identify fraudulent transactions.

Target

Is this transaction fraudulent?

Challenge

Fraud is thankfully rare, which means fraudulent transactions are vastly outnumbered by legitimate ones. Simply applying traditional machine learning techniques (along with standard metrics like accuracy, sensitivity, and specificity) will tend to result in poor models. The data will need to be downsampled, and metrics that are less susceptible to imbalanced datasets will need to be selected.
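To illustrate the point above, here is a minimal sketch in plain Python. The numbers are made up (10 frauds in 1,000 transactions) and the helper names `downsample_majority` and `precision_recall` are hypothetical. It shows why accuracy alone is misleading on rare-event data, and one simple way to downsample the legitimate class; other approaches are possible.

```python
import random


def downsample_majority(rows, label_key="is_fraud", ratio=1.0, seed=0):
    """Keep every fraud row and a random sample of legitimate rows.

    `ratio` is the number of legitimate rows kept per fraud row.
    """
    rng = random.Random(seed)
    fraud = [r for r in rows if r[label_key] == 1]
    legit = [r for r in rows if r[label_key] == 0]
    n_keep = min(len(legit), int(len(fraud) * ratio))
    return fraud + rng.sample(legit, n_keep)


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive (fraud) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Synthetic labels: 10 frauds among 1,000 transactions (an assumed 1% rate).
y_true = [1] * 10 + [0] * 990
# A useless "model" that never flags anything as fraud.
y_pred = [0] * 1000

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall = precision_recall(y_true, y_pred)
print(accuracy, precision, recall)  # 0.99 0.0 0.0

# Downsample to one legitimate row per fraud row.
rows = [{"id": i, "is_fraud": y} for i, y in enumerate(y_true)]
balanced = downsample_majority(rows, ratio=1.0)
print(len(balanced))  # 20
```

The do-nothing model scores 99% accuracy while catching no fraud at all, which is why precision and recall on the fraud class are more informative here. In practice, downsampling is usually applied to the training split only, so that evaluation still reflects the real fraud rate.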

First, the fraud database needs to be matched to the transactional data. Each record then needs to be enriched with broader fraud trends, derived from aggregations of the fraud database combined with aggregations of the transaction database. Next, the customer's history needs to be added to the record (again from the transaction database). All of this data needs to be cleaned, joined, and transformed into valuable ML features before it goes into model training. This pre-modeling prep process can be frustrating and time-consuming. We are here to help.
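The preparation steps above can be sketched with pandas. This is only an illustration under stated assumptions: the two tiny tables, their column names (`transaction_id`, `customer_id`, `state`, `amount`), and the derived feature names are all hypothetical, and a real pipeline would involve far more cleaning.

```python
import pandas as pd

# Hypothetical transaction records (column names are assumptions).
transactions = pd.DataFrame({
    "transaction_id": [1, 2, 3, 4, 5, 6],
    "customer_id": ["a", "a", "b", "b", "c", "c"],
    "state": ["NY", "NY", "CA", "CA", "NY", "NY"],
    "amount": [20.0, 40.0, 100.0, 300.0, 15.0, 25.0],
})
# Hypothetical fraud database: one row per confirmed fraudulent transaction.
fraud_db = pd.DataFrame({"transaction_id": [4, 6]})

# 1. Match the fraud database to the transactions to create the target.
labeled = transactions.merge(
    fraud_db.assign(is_fraud=1), on="transaction_id", how="left"
)
labeled["is_fraud"] = labeled["is_fraud"].fillna(0).astype(int)

# 2. Broader fraud trends: fraud rate aggregated by state.
state_rate = (
    labeled.groupby("state")["is_fraud"].mean()
    .rename("fraud_rate_by_state")
    .reset_index()
)

# 3. Customer history: average transaction amount per customer.
customer_avg = (
    labeled.groupby("customer_id")["amount"].mean()
    .rename("avg_transaction_amount")
    .reset_index()
)

# 4. Join everything back onto the transaction record.
features = (
    labeled.merge(state_rate, on="state", how="left")
    .merge(customer_avg, on="customer_id", how="left")
)
features["all_customer_avg_transaction_amount"] = labeled["amount"].mean()

print(features)
```

One caution on this sketch: it computes the fraud rates from the same rows it labels, which leaks the target into the features. A production pipeline would typically compute such aggregates over an earlier time window than the transactions being scored.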

Modeling techniques and libraries

Machine learning analysis

Build machine learning models that predict whether a transaction is fraudulent as a function of the independent variables. Use model interpretability packages to evaluate the impact of each independent variable on the prediction.

Packages:
  • scikit-learn (sklearn)
  • ELI5
  • LIME
  • SHAP
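As a rough sketch of the workflow described above, the example below trains a scikit-learn classifier on synthetic data and then measures how much each feature contributes. Everything about the data is an assumption made for illustration (the feature names, the fraud rates, and the link between address mismatch and fraud). For interpretability it uses scikit-learn's built-in `permutation_importance` as a stand-in; ELI5, LIME, or SHAP would serve the same purpose with their own APIs.

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 5000

# Synthetic stand-ins for features (all values are made up).
address_match = rng.integers(0, 2, size=n)   # 1 = billing matches shipping
avg_amount = rng.normal(50, 10, size=n)      # average transaction amount
unrelated = rng.normal(size=n)               # pure noise, no signal

# Assumed relationship: fraud is more likely when addresses do not match.
fraud_prob = np.where(address_match == 0, 0.20, 0.01)
y = rng.binomial(1, fraud_prob)

X = np.column_stack([address_match, avg_amount, unrelated])
feature_names = ["address_match", "avg_amount", "unrelated"]

# Stratify so both splits keep the same fraud rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# class_weight="balanced" reweights the rare fraud class during training.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
model.fit(X_train, y_train)

# Average precision is less sensitive to class imbalance than accuracy.
scores = model.predict_proba(X_test)[:, 1]
ap = average_precision_score(y_test, scores)
print(f"average precision: {ap:.3f} (fraud rate: {y_test.mean():.3f})")

# Shuffle each feature in turn and record how much the score drops.
result = permutation_importance(
    model, X_test, y_test,
    scoring="average_precision", n_repeats=10, random_state=0,
)
for name, importance in zip(feature_names, result.importances_mean):
    print(f"{name}: {importance:.4f}")
```

Because only `address_match` carries signal in this synthetic setup, it should dominate the importance ranking, while the other two features should score close to zero.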

Data features

Feature                                  Source            Data Type    Target
# of Website Sessions                    Web Analytics DB  Continuous   No
All customer average transaction amount  POS               Continuous   No
All customer average transaction size    POS               Continuous   No
Average transaction amount               POS               Continuous   No
Billing and Shipping Address Match       POS               Binary       No
Customer Age                             CRM               Continuous   No
Customer Gender                          CRM               Categorical  No
Customer State                           CRM               Categorical  No
Customer Zip                             CRM               Categorical  No
Fraud rate by demographic                Fraud DB          Continuous   No
Fraud rate by state                      Fraud DB          Continuous   No
Fraud rate by zip                        Fraud DB          Continuous   No
