New account fraud

Fraudulent new accounts can be extremely costly to organizations, because fraudsters can run up large charges before they are identified and the account is shut down. In addition, because little information is gathered at the time of account creation, risky accounts are hard to identify. Supplementing internal data with external data can improve an organization's ability to identify these fraudulent new accounts.

Overview

Identify new accounts at high risk of being fraudulent.

Target

Is the account fraudulent?

Challenge

At the time of account opening, the internal data available to the organization can be limited. Additional information can be gleaned from the account's own history, but only after enough of it has accumulated. Ideally, third-party data about the customer would be available to help identify fraudulent accounts. Finally, fraud data is often kept at the transaction level and may not distinguish account takeover from new account fraud. All this data needs to be cleaned, joined, and transformed into useful ML features before model training. This pre-modeling preparation can be frustrating and time-consuming. We are here to help.
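As a rough illustration of the preparation step, the sketch below rolls transaction-level fraud records up to an account-level label. The records, field names, and the 90-day window used to separate new account fraud from likely account takeover are all assumptions made for this example, not rules from any particular fraud database.

```python
from datetime import date

# Hypothetical account records (e.g. from a CRM): account_id -> opening date and state.
accounts = {
    "A1": {"opened": date(2023, 1, 5), "state": "CA"},
    "A2": {"opened": date(2023, 1, 9), "state": "NY"},
    "A3": {"opened": date(2021, 6, 1), "state": "CA"},
}

# Hypothetical transaction-level fraud records: (account_id, transaction date).
fraud_transactions = [
    ("A1", date(2023, 1, 20)),
    ("A1", date(2023, 1, 22)),
    ("A3", date(2023, 2, 1)),  # long-standing account: more likely account takeover
    ("A9", date(2023, 2, 3)),  # no matching account record
]

# Assumption: fraud seen within this many days of opening counts as new account fraud.
NEW_ACCOUNT_WINDOW_DAYS = 90


def label_accounts(accounts, fraud_transactions, window_days=NEW_ACCOUNT_WINDOW_DAYS):
    """Return {account_id: 0 or 1}; 1 means the first fraud fell inside the window."""
    first_fraud = {}
    for account_id, txn_date in fraud_transactions:
        if account_id not in accounts:
            continue  # skip fraud records that cannot be joined to an account
        if account_id not in first_fraud or txn_date < first_fraud[account_id]:
            first_fraud[account_id] = txn_date

    labels = {}
    for account_id, info in accounts.items():
        if account_id in first_fraud:
            days = (first_fraud[account_id] - info["opened"]).days
            labels[account_id] = int(0 <= days <= window_days)
        else:
            labels[account_id] = 0
    return labels


labels = label_accounts(accounts, fraud_transactions)
print(labels)
```

The window length is a modeling choice; in practice it would be set with the fraud team and validated against cases that have already been investigated.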

Modeling techniques and libraries

Machine learning analysis

Build machine learning models to predict whether a new account is fraudulent as a function of the independent variables. Use model interpretability packages to evaluate the impact of each independent variable on the prediction.

Packages:
  • scikit-learn (sklearn)
  • ELI5
  • LIME
  • SHAP
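A minimal sketch of this workflow using scikit-learn alone is shown here. The data is synthetic and the feature names are placeholders chosen for illustration. Permutation importance from `sklearn.inspection` stands in for the interpretability step; ELI5, LIME, or SHAP could be used in its place.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; real features would come from the prepared training table.
rng = np.random.default_rng(0)
n = 2000
feature_names = ["num_website_sessions", "customer_age", "fraud_rate_by_zip"]
X = np.column_stack([
    rng.poisson(5, n),          # number of website sessions
    rng.integers(18, 80, n),    # customer age
    rng.uniform(0.0, 0.1, n),   # historical fraud rate for the customer's zip
])

# Assumed relationship for the synthetic label: higher zip fraud rate and
# fewer sessions make fraud more likely; age has no effect.
logit = -3.0 + 40.0 * X[:, 2] - 0.3 * X[:, 0]
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Predicted probability that each held-out account is fraudulent.
proba = model.predict_proba(X_test)[:, 1]

# Permutation importance: drop in ROC AUC when one feature is shuffled.
result = permutation_importance(
    model, X_test, y_test, scoring="roc_auc", n_repeats=5, random_state=0
)
for name, mean_drop in zip(feature_names, result.importances_mean):
    print(f"{name}: {mean_drop:.4f}")
```

Fraud labels are usually heavily imbalanced, so a ranking metric such as ROC AUC or average precision tends to be more informative than accuracy here.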

Data features

Feature | Source | Data Type | Target
# of Website Sessions | Web Analytics DB | Continuous | No
Account is fraudulent | Fraud DB | Binary | Yes
Customer Age | CRM | Continuous | No
Customer Gender | CRM | Categorical | No
Customer State | CRM | Categorical | No
Customer Zip | CRM | Categorical | No
Fraud rate by demographic | Fraud DB | Continuous | No
Fraud rate by state | Fraud DB | Continuous | No
Fraud rate by zip | Fraud DB | Continuous | No
Fraud rate from IP address | Fraud DB | Continuous | No
Fraud rate from subnet | Fraud DB | Continuous | No
IP Address | Web Analytics DB | Categorical | No
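Several of these features are aggregates of past fraud outcomes. As one possible way to derive a subnet-level fraud rate from raw IP addresses, the sketch below uses Python's standard `ipaddress` module. The sample records and the /24 prefix length are assumptions made for illustration.

```python
import ipaddress
from collections import defaultdict

# Hypothetical historical accounts: (IP address at sign-up, 1 if fraudulent else 0).
history = [
    ("203.0.113.10", 1),
    ("203.0.113.77", 1),
    ("203.0.113.200", 0),
    ("198.51.100.4", 0),
    ("198.51.100.9", 0),
]


def subnet_of(ip, prefix_len=24):
    """Return the enclosing IPv4 network as a string, e.g. '203.0.113.0/24'."""
    return str(ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False))


def fraud_rate_by_subnet(history, prefix_len=24):
    """Map each subnet to the share of its historical accounts that were fraudulent."""
    counts = defaultdict(lambda: [0, 0])  # subnet -> [fraud count, total count]
    for ip, is_fraud in history:
        key = subnet_of(ip, prefix_len)
        counts[key][0] += is_fraud
        counts[key][1] += 1
    return {subnet: fraud / total for subnet, (fraud, total) in counts.items()}


rates = fraud_rate_by_subnet(history)

# Feature value for a new account; None marks a subnet with no history.
new_account_rate = rates.get(subnet_of("203.0.113.55"))
print(rates, new_account_rate)
```

To keep the target from leaking into the features, rates like these would normally be computed only from accounts opened before the one being scored.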
