Presentation: Real-Time Fraud Detection with Graphs

Duration: 10:35am - 11:25am

Key Takeaways

  • Understand how graph technology is being used in finance. 
  • Hear use cases of detecting fraudulent activity using graph databases. 
  • Understand the capability of graph technology to help a modern security organization.

Abstract

Finance is awash with data, but much of it is discrete items locked up in silos waiting to be joined up to provide insights. Graph data is different: it's joined by default and oozes domain-specific insight.

In this talk we'll discuss several kinds of fraud common in financial services and see how each naturally decomposes into a straightforward graph use-case. To demonstrate the power of connected data, we'll explore use-cases using Neo4j and the (now open standard) Cypher query language to showcase just how performant, pleasant and powerful graphs can be, and how the fraudsters need to beware!

Interview

Question: 
QCon: Can you tell me about what you are doing today?
Answer: 
Jim: I am Chief Scientist at Neo4j. I mostly look at future versions of the database, and because my background is historically in transaction processing and fault tolerance, that’s what I look at within the context of a graph database.
Question: 
QCon: What do you mean when you say in your abstract that “finance is awash with data, but much of it is discrete items locked up in silos waiting to be joined up to provide insights,” and that graph data is “joined by default and oozes domain-specific insight”?
Answer: 
Jim: I think in one way finance is well ahead of the curve in terms of data. Where there is money, there is innovation, right? I think the banks in particular are very good at harvesting data and using it. But, like any large enterprise, they’ve ended up with a bunch of siloed systems, and they’ve invested heavily in technology which doesn’t necessarily allow them to see the joined-up picture. Just like most enterprises, finance companies have loads of very robust SQL databases. They’ve also got plenty of NoSQL databases, particularly document and column stores, but none of those data models allows you to connect facts together.
The closest we have is the join in SQL, but that is the poor man’s relation. What does a join even mean in SQL? Can you give a name to a join? Informally perhaps, yes, but data is manifestly connected. We have connections between each other that are historical, current, and evolving. Capturing the richness and nuance of that in a bunch of tables with joins is very difficult. Say you try to describe our relationship now during this interview: it’s something to do with QCon, something to do with Skype, something to do with technology, track hosting, or maybe speaking. It is kind of complex, right? In finance, it’s even more complex because those domains are increasingly arcane.
The people who build financial products are building highly sophisticated entities, and being able to understand the relationships between them is key. Disconnected data tech doesn’t give you the ability to join the dots, and fraudsters are going to exploit that. If they can see that you can’t detect when they take a particular path through your system, then they are going to exploit that path. There is kind of an arms race going on, and I think that graph data is the next evolution in this race for the good guys. The good guys will be able to understand user behavior far better and categorize it in a much richer context.
For example, suppose the rule is “Jim just used his credit card in the USA. Deny the transaction.” This is irritating, because I’m not a fraudster. But if you could see my context of being a frequent traveller to the United States, along with some other details, then suddenly these false positives don’t appear. When it appears that I have used my card in New Zealand, on something I don’t normally buy, in a currency I don’t normally use, you can bring these facts together and say “Perhaps that one is a bit dodgy.”
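As a rough illustration of "bringing these facts together", the sketch below checks which attributes of a new card transaction have never appeared in the cardholder's history. The data, the field names (`country`, `currency`, `category`) and the `unfamiliar_facts` helper are all hypothetical; a real system would weigh far more context than this.

```python
# Hypothetical purchase history for one cardholder (illustrative data only).
history = [
    {"country": "GB", "currency": "GBP", "category": "groceries"},
    {"country": "US", "currency": "USD", "category": "hotels"},
    {"country": "US", "currency": "USD", "category": "restaurants"},
    {"country": "GB", "currency": "GBP", "category": "restaurants"},
]


def unfamiliar_facts(transaction, history, keys=("country", "currency", "category")):
    """Return the attributes of a transaction never seen in the history."""
    unfamiliar = []
    for key in keys:
        seen_values = {past[key] for past in history}
        if transaction[key] not in seen_values:
            unfamiliar.append(key)
    return unfamiliar


# A frequent traveller using the card in the USA: nothing unusual in context.
us_trip = {"country": "US", "currency": "USD", "category": "hotels"}

# New country, new currency, new kind of purchase: worth a closer look.
nz_purchase = {"country": "NZ", "currency": "NZD", "category": "jewellery"}

print(unfamiliar_facts(us_trip, history))
print(unfamiliar_facts(nz_purchase, history))
```

A single unfamiliar fact is weak evidence on its own; it is the combination of several that makes a transaction look "a bit dodgy".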
In the talk I am going to go into some specific use cases where financial services people have used graph technology, and the insights you get from joining facts together, to solve some billion-scale problems. I will pay particular attention to cases where fraudsters have been able to exploit financial institutions for absolutely eye-watering amounts of money.
Question: 
QCon: Can you give me an idea of the use cases you will discuss?
Answer: 
Jim: Since the focus of this track is security, I am going to focus very much on the fraud stuff. There are cases where a user of the system doesn’t set any triggers off lower down the stack. They are not hacking in. They are not trying buffer overflows. They are simply trying to use the application within normal parameters. By using certain applications (for example consumer credit), fraudsters can start to gain access to large aggregate pools of credit which the bank doesn’t really notice.
A few thousand dollars on a credit card over here, a few thousand dollars on a consumer checking account overdraft over there, and so on. If you sum these lines of credit up, and then multiply them by a bunch of synthetic identities, you can see that the fraudsters can amass a large line of credit. Then one day they simply walk out the door with it. There are two problems inherent here. One is that you can’t connect all those different lines of credit, which are carefully built up and nursed until the fraud actually happens. The second is that you actually have no idea about the amount of the fraud. You don’t know the boundary around it because it is all just different bits of data dotted around.
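To make the two problems concrete, here is a minimal Python sketch, not taken from the talk, that treats account holders and the contact details they supply as nodes in a graph. It assumes synthetic identities tend to reuse identifiers such as phone numbers and addresses; the sample data, field names and the `find_rings` helper are hypothetical. Holders linked through any shared identifier are grouped together, and their credit lines are summed to put a boundary around the potential exposure.

```python
from collections import defaultdict

# Hypothetical applications: who applied, with which details, for how much credit.
accounts = [
    {"holder": "A", "phone": "555-0100", "address": "1 Main St", "credit": 3000},
    {"holder": "B", "phone": "555-0100", "address": "9 Oak Ave", "credit": 4000},
    {"holder": "C", "phone": "555-0199", "address": "9 Oak Ave", "credit": 2500},
    {"holder": "D", "phone": "555-0142", "address": "7 Elm Rd", "credit": 5000},
]


def find_rings(accounts):
    """Group holders connected via shared identifiers and total their credit."""
    # Undirected graph linking each holder node to its identifier nodes.
    neighbours = defaultdict(set)
    credit = defaultdict(int)
    for acc in accounts:
        holder = ("holder", acc["holder"])
        credit[acc["holder"]] += acc["credit"]
        for key in ("phone", "address"):
            ident = (key, acc[key])
            neighbours[holder].add(ident)
            neighbours[ident].add(holder)

    seen = set()
    rings = []
    for acc in accounts:
        start = ("holder", acc["holder"])
        if start in seen:
            continue
        # Depth-first walk over everything reachable from this holder.
        stack = [start]
        seen.add(start)
        members = []
        while stack:
            kind, value = node = stack.pop()
            if kind == "holder":
                members.append(value)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        rings.append({
            "members": sorted(members),
            "exposure": sum(credit[m] for m in members),
        })
    return rings


rings = find_rings(accounts)
# Groups of more than one holder are candidates for a closer look.
suspicious = [ring for ring in rings if len(ring["members"]) > 1]
print(suspicious)
```

In this toy data, A and B share a phone number and B and C share an address, so all three land in one group even though A and C have nothing directly in common. That transitive link is what separate, siloed records would miss.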
To keep it a bit light-hearted, I am also going to talk about whiplash fraudsters who operate an ingenious secondary market in passenger seats. This is something that you can see when you start to connect the dots together. You can see fraudsters crashing cars deliberately, but they also have this clever secondary market where, as a passenger, you can pay $2000 to be in the car that is going to be crashed. You get your whiplash injury and your compensation pays out 20 times that. So there are all these layers upon layers of criminality that you can uncover once you find one part of the network. A fraudulent passenger is an example.
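One way to picture how finding one part of such a network leads to the rest is to look for people, and pairs of people, who keep turning up in the same crashed cars. The following is only a sketch under that assumption; the claims data and both helper functions are hypothetical and not taken from the talk.

```python
from collections import Counter
from itertools import combinations

# Hypothetical injury claims: accident id -> people recorded in the car.
claims = {
    "accident-1": ["dana", "eli", "fay"],
    "accident-2": ["dana", "gus", "hal"],
    "accident-3": ["eli", "dana", "ivy"],
    "accident-4": ["jon", "kim"],
}


def repeat_participants(claims, min_claims=2):
    """People who appear in at least `min_claims` separate accidents."""
    counts = Counter(
        person for people in claims.values() for person in set(people)
    )
    return {person: n for person, n in counts.items() if n >= min_claims}


def repeat_pairs(claims, min_shared=2):
    """Pairs of people who shared a car in at least `min_shared` accidents."""
    pair_counts = Counter()
    for people in claims.values():
        for pair in combinations(sorted(set(people)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_shared}


print(repeat_participants(claims))
print(repeat_pairs(claims))
```

Starting from one flagged passenger, the same co-occurrence idea can be followed outward to the drivers and other passengers they have travelled with.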
