When geeks grow up
Speakers' Corner Jeff Jonas, IBM Entity Analytics
In 1979, when Jeff Jonas was 14 years old, the LA school district paid him a few hundred dollars for a word processor he'd written for the Commodore PET. It was an epiphany: "People will pay you? To do your hobby? Everything since has really been centered around my love for information systems."
He started his first company at 17, and went bankrupt owing over $100,000 by the time he was 20. (About five years ago he paid it all back, plus three per cent compound interest, except to a few people he couldn't find.) Shortly afterwards he started his next company from the back of his car, "based on what I had learned".
In January 2005, he sold that company, Systems Research and Development, to IBM, which merged it into an existing unit and renamed it IBM Entity Analytics. "The phrase I use is semantic reconciliation – recognising when two things are the same despite being described differently."
SRD was aimed at "people with unique problems and needing help." Jonas devised a genealogy system for the North American Llama Society and modelled sewers under cities to predict flows if the population grew as expected.
Most Las Vegas casinos, he says, use technology his company invented, and Vegas proved a useful laboratory. "Vegas really wants customers to have a high degree of anonymity." Plus, "The false negatives and false positives have to be very low." False positive: an innocent customer is asked to leave. False negative: a criminal is only spotted after he's gone. Both: expensive. "So it's a good proving ground to try technologies – how to protect the organisation yet still give people a sense of freedom."
Entity analytics, he says, "detects low signal". For example, "How do you detect it when the person who is stealing from you has become an expert at lying?" Take Jérôme Kerviel, the rogue trader who cost Société Générale billions of euros. "They trusted that guy. The amount of damage a single person can do is increasing. As networks open up, you end up with your adversaries on the same network as you, on your same infrastructure."
Someone who has six identities with no similar elements across those identities and who never visits the same location using more than one of those identities is impossible to detect. But entity analytics can help put obscure common elements together. Jonas likens the process to putting together a jigsaw puzzle with large areas of red and white.
"If you don't have pieces with both colours you have no way to put it together." Say you're a bank and have one piece of data that's an email address, and another that's an IP address and a postal address, but there are no overlapping features between the two. "Entity analytics is designed for the moment that you discover somewhere that you have a name, address, and email address."
The key, he says, is not throwing out data, as many business processes do. "If you're trying to stitch a puzzle together and discover context, it turns out that if you try to polish all your pieces into perfection you lose some of the features that allow matching. One of the unique things about entity analytics is that it remembers all the subtle little things that are different." Every misspelled name, confused date of birth, or old address might be a key later.
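The jigsaw idea can be made concrete with a small sketch. It is not IBM's implementation: the `EntityIndex` class, the record ids and the exact-match rule are all assumptions made for illustration. Records that share any feature value are merged into one entity, and every raw value is kept rather than cleaned away, so a later record can still bridge two pieces that had nothing in common.

```python
from collections import defaultdict


class EntityIndex:
    """Toy entity resolver: records sharing any exact feature value are merged."""

    def __init__(self):
        self.parent = {}    # union-find parent pointers, keyed by record id
        self.records = {}   # record id -> raw features, stored unmodified
        self.seen = {}      # (feature, value) -> first record id that carried it

    def _find(self, rid):
        # Walk to the root of the set, halving the path along the way.
        while self.parent[rid] != rid:
            self.parent[rid] = self.parent[self.parent[rid]]
            rid = self.parent[rid]
        return rid

    def add(self, rid, **features):
        self.parent[rid] = rid
        self.records[rid] = dict(features)
        for item in features.items():
            if item in self.seen:
                # A shared value links this record to an existing entity.
                self.parent[self._find(rid)] = self._find(self.seen[item])
            else:
                self.seen[item] = rid

    def same_entity(self, a, b):
        return self._find(a) == self._find(b)

    def observed(self, rid):
        """Every value seen for the entity containing rid, variants included."""
        root = self._find(rid)
        values = defaultdict(set)
        for other, features in self.records.items():
            if self._find(other) == root:
                for name, value in features.items():
                    values[name].add(value)
        return dict(values)


index = EntityIndex()
index.add("r1", email="j.doe@example.com")
index.add("r2", ip="203.0.113.7", address="12 Elm St")
print(index.same_entity("r1", "r2"))   # False: no overlapping feature yet

# The "piece with both colours": one record carrying address and email.
index.add("r3", name="Jon Doe", address="12 Elm St", email="j.doe@example.com")
print(index.same_entity("r1", "r2"))   # True: r3 bridges the two
```

A production system would presumably use fuzzy comparison and scoring rather than exact equality, but the sketch shows why discarding an old address or a misspelled name can remove the only link between two records.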
In the last five years, Jonas has come to recognise the privacy implications of this approach. Some of his recent work includes an encryption scheme that uses one-way hashes to allow an airline and a government agency to cooperatively operate a no-fly list without either ever seeing the other's data.
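As a rough illustration of matching on one-way hashes, the sketch below has each side digest its records with a keyed hash and compares digests only. The details of Jonas's scheme are not given here, so everything in the sketch is an assumption: the use of HMAC-SHA-256, the shared key, the `normalise` and `blind` helpers, and the sample names.

```python
import hashlib
import hmac


def normalise(name, dob):
    # Digests match only on identical input, so both sides agree a canonical form.
    return f"{' '.join(name.lower().split())}|{dob}".encode("utf-8")


def blind(name, dob, key):
    # Keyed one-way hash (HMAC-SHA-256) of the canonical record.
    return hmac.new(key, normalise(name, dob), hashlib.sha256).hexdigest()


# Hypothetical key, assumed to be agreed between the parties out of band.
SHARED_KEY = b"hypothetical-key-agreed-out-of-band"

# Agency side: only digests of the list are shared.
watch_digests = {blind("Alex Example", "1970-01-01", SHARED_KEY)}

# Airline side: digest each passenger, keeping the clear record locally.
passengers = [("alex  example", "1970-01-01"), ("Sam Sample", "1985-06-15")]
by_digest = {blind(n, d, SHARED_KEY): (n, d) for n, d in passengers}

# Matching touches digests only; clear records surface only for hits.
hits = [by_digest[d] for d in by_digest.keys() & watch_digests]
print(hits)   # [('alex  example', '1970-01-01')]
```

One limit of this simple version: names and dates of birth are easy to guess, so anyone holding the key could test candidate values against the digests. A real deployment would need to keep the key away from both parties or use a stronger protocol.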
Matching is done on the encrypted data, so that the government sees only passenger records that match the no-fly list. Elsewhere, he's written about creating immutable logs for large database systems, where the logs cannot themselves be data-mined. Nothing has, however, identified the stranger who saved his life 20 years ago when he broke his neck in a car crash. That remains a mystery.
