Silence is Golden
by Deborah Volk on July 21st, 2009

Even with more daylight, I struggle to find enough time to juggle family, work, and blog (not necessarily in that order, but pretty close most days). As a result of this increased activity, I have been silent on the blogging front. This is not to say that I have not been thinking about all the interesting things to write about. As the workload increases, so does the number of topics I would like to discuss in an online forum. Unfortunately, entropy is hard to beat and, without a perpetuum mobile as a source of energy, I have to find that elusive equilibrium between space and time. Thus we arrive at: How Authoritative is Authoritative?
If you have been reading our blog for a while, you will know this is not a new theme. It seems that a heretofore undiscovered sequence exists in Nature: skeptics -> identity management folks -> Identigral. For evidence, see Seek and Destroy (and its counterparts), Spring Cleaning, and Through the Looking Glass (our data quality entries). In earlier posts we looked at these issues from the perspective of specific use cases with one common underlying theme: checks and balances. If the business tells the IDM solution "this is our authoritative source of information," is the IDM team supposed to take everything at face value, with no inspection?
At Identigral, we are proponents of checks and balances (ergo, the common theme in all of the aforementioned posts) ... but can the IDM program be proactive when it comes to "bad data" in the authoritative systems? After all, based on extensive, peer-reviewed identity management research that runs many volumes and comes with serious statistical analysis (read: anecdotal evidence gathered by Identigral), bad data is expensive. It is a broken spoke that stops the wheel from turning smoothly, causing exceptions to bubble up in an otherwise automated process. When exceptions happen, people become part of the process and that's expensive. In the IDM world, multiple teams of people spanning groups, applications and continents might become part of the process and that's VERY expensive.

Let's segregate the data into different types of "badness":

1) Typo or data entry error
2) Deterministic error in systems which causes generation of bad data (e.g., a script that terminates everyone)
3) Malicious data change (e.g., an IT support person promoting himself 3 pay grades during a change implementation)

The first type of error is very hard to detect. It typically affects a few records, and there might be very little correlation to any other event or data related to the user who was typoed (yes, that's a new verb). Obviously, some rules might be applied to certain data fields: certain addresses are invalid, some job code/department code combinations are invalid, etc. But, for the most part, these errors will go unchecked and create the need for "spring cleaning" and "checks and balances."
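Those field-level rules can be sketched as a small validation pass over incoming records. This is a minimal illustration, not a production validator; the field names and the valid job-code/department table below are invented placeholders.

```python
# Hypothetical table of job codes considered valid within each department.
# In a real deployment this would come from the authoritative HR system.
VALID_JOB_CODES = {
    "ENG": {"E1", "E2", "E3"},
    "HR": {"H1", "H2"},
}

def validate_record(record):
    """Return a list of rule violations for one identity record."""
    violations = []
    dept = record.get("department")
    job = record.get("job_code")
    if dept not in VALID_JOB_CODES:
        violations.append(f"unknown department: {dept}")
    elif job not in VALID_JOB_CODES[dept]:
        violations.append(f"job code {job} invalid for department {dept}")
    return violations

# A record with a plausible typo: an HR job code in an engineering department.
print(validate_record({"department": "ENG", "job_code": "H1"}))
```

Checks like these only catch the typos that happen to violate a known constraint; a typo that produces another valid value sails right through, which is why the clean-up posture still matters.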

The second type of error is usually caught, and the cost associated with these errors isn't in a failed audit or damages but rather in the clean-up operation. The clean-up is a very manual and labor-intensive process because IDM solutions don't account for the mis-hiring or mis-firing scenarios. They're not resilient to failure. There are two things the IDM solution designers can do: 1) put in processes to deal with mis-hires, mis-fires, and mis-updates, and 2) with some relatively simple analytics, a system could tell if 100 terminations in 1 hour is out of the norm, or if people are being hired in California on Christmas Day (and you are a bank, as opposed to a grocery store). Fancy analytics is expensive, and it's usually applied in scenarios with a lot of data, e.g. modeling the fraud risk on a credit-card transaction based on patterns detected in 100 million other transactions. But there are some simple threshold-based checks that could be put in an IDM solution to lower the cost of cleanup efforts when something goes wrong. If you want to get fancy, you can make your thresholds adaptive by using a linear transformation. This technique (albeit in a different domain) is described in our Traffic Jams posts.
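A threshold check of the "100 terminations in 1 hour" flavor can be sketched in a few lines. This is a hedged illustration only: the event shape (a timestamp plus an event type), the one-hour window, and the threshold of 20 are all assumptions, not anything a particular IDM product ships with.

```python
from datetime import datetime, timedelta

TERMINATION_THRESHOLD = 20   # invented: max terminations tolerated per window
WINDOW = timedelta(hours=1)

def flag_termination_burst(events, now):
    """Return True if terminations inside the window exceed the threshold."""
    recent = [ts for (ts, kind) in events
              if kind == "terminate" and now - ts <= WINDOW]
    return len(recent) > TERMINATION_THRESHOLD

# A burst of 100 terminations within the last hour trips the check.
now = datetime(2009, 7, 21, 12, 0)
burst = [(now - timedelta(minutes=i % 60), "terminate") for i in range(100)]
print(flag_termination_burst(burst, now))
```

The same skeleton extends to "hires on a holiday" by swapping the predicate, and the constant threshold can be replaced with one recomputed from a trailing average if you want the adaptive variant.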

The third type of error is the hardest to detect, especially given the ubiquitous assumption that if you have the right to do something, you are allowed to do it. So if a support person can adjust his pay grade, then voila! he has been adjusted. Segregation of Duties and, broadly speaking, GRC solutions are supposed to prevent this from occurring, but how many companies out there have a working Segregation of Duties or GRC implementation? While you wait for that, you can catch some malicious events in IDM land by analyzing the events coming from the data sources. In fact, the bulk of the work in a GRC implementation is coming up with credible detection rules that reflect the business rather than just flagging generic "write a check, approve a check" scenarios. If you can come up with these rules, you can put them to work in IDM by looking at incoming events.
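One such detection rule, applied to incoming change events, is flagging sensitive attributes that a user changes on his own record (the support person promoting himself). This is a sketch under assumed conventions: the event fields (actor, target, attribute) and the list of sensitive attributes are hypothetical.

```python
# Invented list of attributes worth flagging when self-modified.
SENSITIVE_ATTRIBUTES = {"pay_grade", "salary", "role"}

def flag_self_change(event):
    """Flag a sensitive attribute changed by the very user it belongs to."""
    return (event["actor_id"] == event["target_id"]
            and event["attribute"] in SENSITIVE_ATTRIBUTES)

# The "IT support person promoting himself" scenario trips the rule.
print(flag_self_change(
    {"actor_id": "jdoe", "target_id": "jdoe", "attribute": "pay_grade"}))
```

Rules that reflect the business would add more context (who approved the change, whether the actor is in the target's management chain), but even this crude actor-equals-target check catches the blunt version of the attack.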

Posted in Data Quality

