In this art­icle we will go through a fraud detec­tion and pre­ven­tion solu­tion using Machine Learn­ing (ML) and Tableau dash­boards. We will show you some of the deliv­er­ables enabled by this tech­no­logy and look at the advant­ages of using this solu­tion, rather than the tra­di­tional rule-based approach, in terms of pro­ductiv­ity, robust­ness, and over­all results.

Intro­duc­tion

Fraud is rife around the world and on the rise, affecting a wide range of sectors such as banking, insurance, tax, and intellectual property.

Fraud can have a series of impacts:

  • Fin­an­cial: Not only the money lost through fraud, but also the asso­ci­ated costs: assess­ment, detec­tion, invest­ig­a­tion, and response.
  • Reputation, brand, and employee morale: Fraud damages the corporate image, can ruin careers by association, and can deter employees, investors, suppliers, and customers.
  • Account­ing and cap­ital: Once fraud is detec­ted, there may be pro­gramme reviews or audits; attract­ing fund­ing could also be affected.
  • Digital dis­rup­tion: New sys­tems may need to be put in place, as well as a secur­ity overhaul.

In Experian’s 2020 Global Iden­tity and Fraud Report, we can see that around 50% of busi­nesses report the highest fraud losses to be asso­ci­ated with account open­ing; and even though the major­ity know that Advanced Ana­lyt­ics must be a part of their strategy, only half think that it is import­ant for iden­tity and authentication.

So, we decided to do some research and come up with an ML solu­tion to improve fraud detec­tion and pre­ven­tion in those com­pan­ies exper­i­en­cing it.

The Prob­lem with Fraud

At ClearPeaks, we are aware of the dif­fi­culty of detect­ing and pre­vent­ing fraud, as well as the import­ance of hav­ing a set of tools to ease the work of fraud depart­ments and specialists.

For legal reasons, fraud detection is rarely fully automated, so there is a heavy human workload involved in checking suspicious operations. Instead of automating this work, we decided to reduce the number of cases that need analysis and to give the people who analyse them better tools.

Without a model, or with models based on logic rules, which are prone to missing fraud, decisions are not driven by insights from the data. The result is a large number of false positives and many profiles for workers to review, with the human error this implies.

Our model-based approach can learn the relationships within the data and find fraud patterns that would be impossible for a human to spot, classifying operations as potentially fraudulent or licit. Our solution significantly decreases the number of false positives and gives a precise fraud probability score, showing these results in several dashboards designed for the separate roles in a company’s fraud department.

Inter­act­ive Tool

We believed it was important to offer an interactive dashboard capable of running our ML model in real time on parameters entered in the dashboard itself. Sadly, this is still not a standard feature in dashboard technology.

To do so we opted for TabPy, a tool that lets Tableau run Python code on its data. TabPy is officially supported by Tableau, and it allows us to pass the data entered in the dashboard to Python, execute the ML model, and display its results.
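As a rough illustration of this pattern, the sketch below shows how a scoring function might be published to a TabPy server so that a Tableau calculated field can call it. The toy logistic scorer, its weights, the feature names (amount, account_age_days), the endpoint name fraud_probability, and the server URL are all assumptions made for the example; the actual model and deployment used in the solution are not described in this article.

```python
import math

# Hypothetical weights for a toy logistic scorer; NOT the model from the article.
WEIGHTS = {"amount": 0.0004, "account_age_days": -0.01}
BIAS = -1.0


def fraud_probability(amounts, account_ages):
    """Score the rows passed in by Tableau.

    TabPy hands each argument over as a list (one entry per row) and expects
    a list of the same length back.
    """
    scores = []
    for amount, age in zip(amounts, account_ages):
        z = BIAS + WEIGHTS["amount"] * amount + WEIGHTS["account_age_days"] * age
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores


def deploy(url="http://localhost:9004/"):
    """Publish the scorer to a running TabPy server (the URL is an assumption)."""
    # Imported here so the scorer itself can be used without TabPy installed.
    from tabpy.tabpy_tools.client import Client

    client = Client(url)
    client.deploy(
        "fraud_probability",
        fraud_probability,
        "Toy fraud probability scorer (illustrative only)",
        override=True,
    )
```

Once deployed, a Tableau calculated field along the lines of SCRIPT_REAL("return tabpy.query('fraud_probability', _arg1, _arg2)['response']", SUM([Amount]), SUM([Account Age Days])) could call the endpoint, where the field names are again hypothetical.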

For the other parts of the solution we used a conventional Azure architecture, with Azure Data Factory as the orchestrator and Azure Databricks handling all the model-related tasks.

Res­ult­ing Dashboards

When design­ing this tool we had mul­tiple user roles in mind, each with their own user stor­ies. We wanted to provide a min­imum num­ber of dash­boards to help these par­tic­u­lar profiles.

For our spe­cific­a­tions we con­sidered the fol­low­ing user roles:

  • Fraud Analyst: The key role for this iter­a­tion of the solu­tion, the fraud analyst is tasked with check­ing all the trans­ac­tions flagged as fraud­u­lent and with car­ry­ing out a deep ana­lysis of each, determ­in­ing which are fraud­u­lent while clear­ing the legit­im­ate ones.
  • Busi­ness Analyst: Tak­ing a more BI-based approach, this user is tasked with ana­lys­ing the state of the busi­ness, detect­ing trends, alert­ing to pos­sible com­plic­a­tions, and com­ing up with plans to adapt the busi­ness to mar­ket requirements.
  • Data Sci­ent­ist: This user main­tains and improves the fraud detec­tion model: refresh­ing the model when data drift is detec­ted, and also ensur­ing that the model is as impact­ful as pos­sible. To do so, they need detailed inform­a­tion about exactly how the model is work­ing with recent data.
  • Fin­ance User: This user is tasked with fraud pre­ven­tion, using the tool as an assist­ant and a guideline when decid­ing whether to approve new credit requests. This user’s ana­lysis is more time-crit­ical than the oth­ers, so we want to high­light just one or two points from the full ana­lysis for them.

To satisfy the requirements of these users, we decided on three different dashboards, each tackling a different part of the problem. These can be seen in the figure below, together with the user roles they aim to help:

Now that we have established the users and objectives for each dashboard, we can look at each one in turn.

Data Dash­board

In this dash­board we aim to offer a quick over­view of the busi­ness, dis­play­ing mul­tiple KPIs indic­at­ing how fraud is affect­ing us and where exactly it is affect­ing us the most. This dash­board is apt for high-level decision-mak­ing, such as decid­ing what kind of fraud to tackle next.

Oper­a­tional Dashboard

This dash­board provides some­thing more akin to a tool: you can input the data of a new or exist­ing cus­tomer and get a real-time pre­dic­tion of fraud prob­ab­il­ity, as well as an explan­a­tion of the reas­ons behind this clas­si­fic­a­tion. This can high­light the ris­kier parts of the trans­ac­tion early in the con­ver­sa­tion, or be used by a fraud analyst in order to know what to look out for, allow­ing the oper­ator to gather more details.
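One simple way to produce both a probability and a ranked list of reasons is to use a linear model, where each feature’s contribution to the score can be read off directly. The sketch below is only an illustration under that assumption: the article does not say which explanation technique the dashboard uses, and the feature names, weights, and baseline values are hypothetical.

```python
import math

# Hypothetical logistic-regression parameters and portfolio averages.
WEIGHTS = {"amount": 0.0004, "account_age_days": -0.01, "prior_chargebacks": 0.8}
BASELINE = {"amount": 1200.0, "account_age_days": 400.0, "prior_chargebacks": 0.1}
BIAS = -2.0


def explain(customer, top_n=2):
    """Return (fraud probability, top_n features by absolute contribution)."""
    z = BIAS + sum(WEIGHTS[name] * customer[name] for name in WEIGHTS)
    probability = 1.0 / (1.0 + math.exp(-z))
    # Contribution of a feature = weight * deviation from the baseline value.
    contributions = {
        name: WEIGHTS[name] * (customer[name] - BASELINE[name]) for name in WEIGHTS
    }
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return probability, ranked[:top_n]


probability, reasons = explain(
    {"amount": 9000.0, "account_age_days": 30.0, "prior_chargebacks": 2}
)
print(f"fraud probability: {probability:.2f}")
for name, contribution in reasons:
    print(f"{name}: {contribution:+.2f}")
```

A positive contribution pushes the score towards fraud and a negative one away from it, which is the kind of signal an operator could use to decide what to ask about next.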

Model Dash­board

In this dashboard we can take an overall look at the model metrics, which makes it the place to verify that the information the model feeds to the other dashboards is reliable. With this objective in mind, we have also included some explainability metrics that explore the features and the evolution of the model’s accuracy. These offer the data scientist insights into ways to improve the model, and give the business user statistics to support certain business actions.

We also provide the revi­sion spend­ing met­ric for the busi­ness user, indic­at­ing the amount saved by using the model in revis­ing cases.
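To make the idea of tracking model quality over time concrete, here is a minimal sketch that computes precision and recall per month from reviewed cases. The record layout (month, flagged by the model, confirmed as fraud) and the sample values are hypothetical and are only there to make the example runnable; the metrics actually shown in the dashboard may differ.

```python
from collections import defaultdict

# Hypothetical reviewed cases: (month, model flagged as fraud, confirmed fraud).
CASES = [
    ("2021-01", True, True),
    ("2021-01", True, False),
    ("2021-01", False, True),
    ("2021-02", True, True),
    ("2021-02", True, True),
    ("2021-02", False, False),
]


def monthly_metrics(cases):
    """Return {month: (precision, recall)}; a metric is None when undefined."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for month, predicted, actual in cases:
        c = counts[month]  # make sure every month gets an entry
        if predicted and actual:
            c["tp"] += 1
        elif predicted:
            c["fp"] += 1
        elif actual:
            c["fn"] += 1
    metrics = {}
    for month, c in sorted(counts.items()):
        flagged = c["tp"] + c["fp"]
        frauds = c["tp"] + c["fn"]
        precision = c["tp"] / flagged if flagged else None
        recall = c["tp"] / frauds if frauds else None
        metrics[month] = (precision, recall)
    return metrics


for month, (precision, recall) in monthly_metrics(CASES).items():
    print(month, precision, recall)
```

A falling precision means more false positives reaching the analysts, which is the quantity that drives revision spending.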

Meas­ur­ing Cost Benefit

To measure the total cost benefit of this solution, we must separate and measure the two sources that contribute to it:

Fraud Revi­sion Reduction:

This is the reduction in cost gained by reviewing fewer false positives: if we reduce the number of transactions wrongly marked as fraudulent, we can reduce spending. This is the major component in measuring the cost benefit, as the savings here can exceed the losses due to fraudulent transactions. To calculate this, we can define “Revision Spending” as:

Of course, this just tells us what we are spending on reviewing cases that are not really fraudulent. To see the real benefit of the tool, we must compare it with something. As a baseline we used a rule-based model, as is suggested by law and commonly used, and we then compared the revision spending of both models, which gives us the amount saved in a month by using the tool.
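The comparison described above can be sketched in a few lines. Since the original formula is not reproduced in this text, the definition used here (number of false positives multiplied by an average cost per review) is an assumption based on the description, and every figure below is a hypothetical placeholder rather than a result from the project.

```python
def revision_spending(false_positives, cost_per_review):
    """Assumed definition: money spent reviewing cases that were not fraud."""
    return false_positives * cost_per_review


def monthly_saving(baseline_fp, model_fp, cost_per_review):
    """Saving of the ML model relative to the rule-based baseline."""
    return revision_spending(baseline_fp, cost_per_review) - revision_spending(
        model_fp, cost_per_review
    )


# Hypothetical monthly figures, chosen only to make the example runnable.
saving = monthly_saving(baseline_fp=1000, model_fp=400, cost_per_review=25.0)
print(f"Estimated monthly saving: {saving:.2f}")
```

A negative result would mean the model produces more false positives than the rule-based baseline, so the same calculation also works as a regression check.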

Reducing the number of reviewed cases not only saves time and resources; it also improves the quality of the analysis and frees up time for the cases that are flagged.

Fraud Pre­ven­tion:

Fraud prevention comes into play before accepting the customer as a client, giving us deep insights into potentially fraudulent customers and allowing us to avoid them, thus reducing the number of fraud cases and improving the overall quality of the service.

Sadly, this component is hard to quantify, as without an ablation study we cannot isolate the tool’s contribution to helping operators prevent fraud. Nevertheless, while we do not know exactly how much the tool helps, we believe that it brings a net benefit to the operator’s user experience.

Con­clu­sions

Fraud is a growing crime and a big problem for many businesses, which need to invest heavily to be able to deal with it. At ClearPeaks, we believe that the use of Advanced Analytics offers a major step forward in the fight against fraud, and in this article we have shown how this help may come in the form of fraud revision reduction. Some of the key factors behind its success are the ability of the model to explain the rationale behind its decisions, and the ease of interaction with the model to explore what it has learned.

For more details and interesting cases like this, please do not hesitate to contact us at info@clearpeaks.com and we will get back to you as soon as we can. If you have enjoyed what you have read here, please follow the ClearPeaks blog as well as our YouTube channel for more interesting articles and topics.