Finding a Needle in a Haystack – Child’s Play!
March 17, 2011
“Finding a Needle in a Haystack” is without doubt one of the most overused analogies in IT security. After seeing it repeatedly at RSA, I offer the following analysis of the analogy:
Finding a needle in a haystack is child’s play. A walk in the park. Enormous oversimplification. Luxury (yes, this is a reference to Monty Python’s “Four Yorkshiremen” sketch – look it up, it fits).
First, the “needle” component is all wrong because it presumes that what we are looking for is known. The problem in malware detection lies in seeing those attacks for which there is no prior knowledge. We don’t know what we are looking for, but we are expected to find it – whatever “it” is. Unfortunately, the vast majority of malware detection software relies upon prior knowledge to detect malware, leaving a wide detection gap for unknown attacks such as zero-day attacks and advanced persistent threats. This is problematic because unknown attacks increase in volume and complexity daily, and, as we saw in the recent report of the Nasdaq breach, some needles remain undetected in the haystack for over a year.
The analogy has now degraded to “Finding a <unknown something> in a Haystack”. Obviously, not knowing what you are looking for causes some complication, but this problem is addressable given that we are looking for an unknown thing in a well-defined, consistent population, namely the hay in the haystack. Logic dictates that if you were able to remove everything that was hay, what is left should be the thing you are looking for, even if you do not know what that thing is.
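The "remove the hay" logic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the identifiers in `KNOWN_HAY` are made-up placeholders standing in for something like file hashes from a trusted baseline, and `find_unknowns` is a hypothetical helper, not any product's API.

```python
# Hypothetical known-good ("hay") identifiers, e.g. hashes from a trusted baseline.
# These values are placeholders invented for the example.
KNOWN_HAY = {"a1f3", "b2c4", "c9d0"}


def find_unknowns(observed, known_good):
    """Return the observed items that are not known-good, in their original order."""
    return [item for item in observed if item not in known_good]


# Everything observed on a hypothetical endpoint.
observed = ["a1f3", "zz99", "b2c4", "q7q7"]

# Whatever survives the filter is "not hay", without ever defining a needle.
unknowns = find_unknowns(observed, KNOWN_HAY)
print(unknowns)
```

Note that the filter needs no description of the needle at all; it depends entirely on how complete and trustworthy the definition of hay is.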
The trouble with that solution is that the people hiding stuff in your haystack do not want you to find it. So they will make their stuff look, feel, and smell like hay, making it very difficult to distinguish the hay from the unknown. You also face the very real possibility of discovering multiple non-hay unknowns after the hay is removed. Do you assume they are all undesirable? If not, how do you differentiate the benign unknowns from the undesirable unknowns?
None of that matters anyway, because the analogy breaks down even further: endpoint populations are most definitely not haystacks. In fact, unless you have instituted the most epically draconian lock-down process of all time, there is very little homogeneity in any endpoint population. Consequently, you are not looking for something in a well-defined, consistent population; you are looking for something in a confusing maelstrom of one-off configurations that changes daily. It is the polar opposite of homogeneous.
We are now left with “Finding an <unknown> in an <ill-defined, shifting maelstrom>”. Makes you yearn for the luxury of “Finding a Needle in a Haystack”, doesn’t it?
Finally, no one really suffers from a needle in a haystack, unless they metaphorically jump into the metaphorical hay and, through a turn of enormous bad luck, get inadvertently, metaphorically stuck. Needles do not exfiltrate intellectual property or financial data. Needles do not turn computers into spam-spewing Conficker zombies. Moreover, needles do not land your organization on the front page of the New York Times and create reputational risk that can lower the market cap of the company. Of course, the <unknown> in your <ill-defined, shifting maelstrom> certainly can.
The obvious question: how do I find an <unknown> in my <ill-defined, shifting maelstrom>? I will offer you one solution tomorrow.