Real Time Analytics

Oscar Kilo Ltd provides data analysis services and tools. These include:

  • Complex Event Processing
  • Data Screening and Classification Services
  • Data Mining
  • Statistical Analysis, AI
  • Software development
  • Database services
  • Geographical Information Systems

OK was formed to deploy some unique and proven techniques that had been developed for our adaptive real-time classification system (DETECT). The principles behind DETECT were originally developed to address ‘difficult’ classification problems, in which there are very few exemplars in the data on which to base predictions. Dealing with such rare-event classification problems raises many technical difficulties, some of which are described later.

Data Screening and Classification Services

The colloquial example of a hard search problem is 'finding a needle in a haystack'. However, it is even harder to find a particular type of grass in a haystack. The needle is very different from the hay and can be easily identified. Suppose, however, that the hay contains hundreds of different species of grass, and that the task is to identify a particular rare species of grass that is subtly different from all the others. To make things even more difficult, you have previously only ever seen a few thousand examples of the special grass, and this is all that classifications can be based on.

This contrived example illustrates a type of problem that is common. A typical real-life example is the credit-card fraud detection problem; here the data consists of millions of transactions which have a multiplicity of features such that no two are exactly the same. Most are perfectly legal. Only a tiny percentage are known to be fraudulent and there is almost as much variation amongst these as there is in the legal transactions.

Data like this is so sparse and lacking in consistency that spotting future fraud using only the sample data as the basis of each decision seems an impossible task. However, while there simply is not enough information to make black-and-white decisions, we can perform powerful screening, in real time, to flag which transactions should be investigated further by human experts, who would otherwise be overwhelmed by the sheer quantity of data. The name of the game is to flag likely fraud without ‘crying wolf’ too often.
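The trade-off between catching fraud and 'crying wolf' can be illustrated with a minimal Python sketch. It is not the DETECT method: it assumes that each transaction has already been given a risk score by some upstream model, and that the human experts can review only a fixed number of alerts. The names `screen`, `threshold_for_budget` and `ScreeningResult`, and all the scores below, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScreeningResult:
    threshold: float
    flagged: int       # records passed to human experts
    true_alerts: int   # flagged records that really were fraud
    false_alarms: int  # flagged records that were legal ('crying wolf')
    missed: int        # fraud that was not flagged


def threshold_for_budget(scores, budget):
    """Return the lowest score threshold that flags at most `budget` records."""
    best = float("inf")  # flag nothing if even the top score exceeds the budget
    for t in sorted(set(scores), reverse=True):
        if sum(s >= t for s in scores) > budget:
            break
        best = t
    return best


def screen(scores, is_fraud, threshold):
    """Flag every record scoring at or above the threshold and tally the outcome."""
    flagged = true_alerts = missed = 0
    for score, fraud in zip(scores, is_fraud):
        if score >= threshold:
            flagged += 1
            true_alerts += fraud
        elif fraud:
            missed += 1
    return ScreeningResult(threshold, flagged, true_alerts,
                           flagged - true_alerts, missed)


# Hypothetical risk scores and known outcomes for eight transactions.
scores = [0.10, 0.90, 0.20, 0.80, 0.05, 0.70, 0.30, 0.60]
is_fraud = [False, True, False, False, False, True, False, False]

# Assume the experts can investigate at most three transactions.
result = screen(scores, is_fraud, threshold_for_budget(scores, budget=3))
print(result)
```

Raising the budget lowers the threshold, which catches more fraud at the cost of more false alarms; lowering it does the opposite.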

We use Complex Event Processing and statistical techniques to build systems which can classify large numbers of events or other kinds of data record or document. Such data typically has the following characteristics:
  • Each record has many different attributes (fields).
  • There are logical relationships between records, such as time order.
  • Large range of possible variation between records.
  • Sparse occurrence of the kind of record we are interested in identifying (a fraction of a percent).
  • Insufficient data to make positive IDs from the information contained in the records alone.
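
The effect of such sparse occurrence can be shown with a short worked calculation. The sketch below applies Bayes' rule to a hypothetical classifier; the prevalence, detection rate and false alarm rate are illustrative assumptions, not measurements from any real system.

```python
def alert_precision(prevalence, detection_rate, false_alarm_rate):
    """Fraction of flagged records that are genuinely of interest (Bayes' rule)."""
    true_alerts = prevalence * detection_rate
    false_alerts = (1.0 - prevalence) * false_alarm_rate
    return true_alerts / (true_alerts + false_alerts)


# Assumed figures: 0.1% of records are of interest, 90% of those are caught,
# and 1% of ordinary records are flagged by mistake.
precision = alert_precision(prevalence=0.001,
                            detection_rate=0.90,
                            false_alarm_rate=0.01)
print(f"{precision:.1%} of alerts are genuine")
```

Under these assumptions only about one alert in twelve is genuine, which is why the output is better treated as a screening list for further investigation than as a positive identification.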

Some typical applications are:
  • Market Abuse Detection: detecting suspicious patterns of trading in securities markets in real-time.
  • Credit Card Fraud Detection: detecting fraudulent transactions in real-time.
  • Insurance Fraud Detection: assessing which claims to investigate.
  • Delinquent Spending on Store Cards: spotting indicators of behaviour likely to lead to runaway spending.
  • Mobile phone fraud: identifying calls from stolen or cloned phones or other network misuse.
  • Medical: Occasionally patients have adverse reactions to otherwise commonly used drugs. It may be possible to identify patients who might benefit from closer monitoring. A system like ours which can be constantly re-calibrated might pick up on geographical or other changing risk factors which might otherwise be discovered later or missed altogether.
  • Shipping container security: identifying which containers to inspect from bills of lading data.
  • Tax fraud: selecting which tax returns to investigate. There may be many analogous applications in central and local government as more and more business (both C2G and B2G) is conducted online. The E-government agenda is that all transactions will be conducted online within 18 months.
  • Police and Security Services: these may have access to many of the data sources from other application areas but be looking from another perspective: deciding which organisations or individuals to investigate further when combining the data with previously known criminal or terrorist activity.
  • Sales, Marketing and Business Intelligence: Modern call centre operations accumulate large quantities of data. There may be applications in deciding which calls to follow up on the basis of data from previous successful sales. The ability to keep up with changing trends might give our system the edge over traditional techniques.

All of these examples require a very different kind of search from the one that Google (and OK’s Octopus) is adept at, where large numbers of documents are searched for a very specific feature set (key words). It is much more similar to the problem of filtering email messages for junk mail on the basis of examples of junk and non-junk, except that junk email is not a rare event and email is largely 'unstructured' data.

Contact us:

If you feel we can help your business, please contact us to discuss your requirements.

Copyright © 2011 Oscar Kilo Ltd. All rights reserved.