In my recent blog “Does Automation Really Eliminate Human Error?” I wrote about the importance of observability in automated systems to ensure that the “automation” at hand does not rely on flawed programming to produce its results. An automated system, after all, is only as good as the code with which it is built.
In this second blog focused on automated systems, I start with an example from the world of finance to demonstrate how interactions between those systems can yield unforeseen consequences.
On Monday, October 19, 1987, the Dow Jones Industrial Average fell by 22.6 percent and the S&P 500 by 20.4 percent in a single day of trading. By the time the dust settled the following day, an estimated $1.7 trillion had been wiped from the value of markets worldwide, with $500 billion lost in the United States alone.
Most observers agree there was a distinct bubble in the markets of early October 1987, and those markets were certainly due a correction. However, many also believe there was a second factor that contributed to the crash: it was the first time there was a significant number of automated trading algorithms in play when a major shift in the markets started.
By 1987, a situation had developed in which market traders were a mix of humans and automated trading algorithms. Each algorithm was developed in isolation to meet a specific goal, but perhaps more importantly, each algorithm was permitted to operate autonomously when interacting with other traders, including other algorithms.
We shouldn’t expect trading algorithms to cooperate and prevent a market crash—their job is to maximize profits and minimize losses for their operators. After all, human traders would behave no differently. Indeed, as the markets started to slide on October 19, the trading algorithms behaved quite logically—and sold heavily.
The Importance of Feedback Loops
In my previous blog, I discussed how automation doesn’t exclude the possibility of human error. However, in the case of the 1987 crash, the problem wasn’t trading algorithms behaving erroneously; it was that the collective result of their interactions hadn’t been fully anticipated. And this is my key point here: as we interconnect systems that were independently designed to complete specific tasks, we should not be surprised if we elicit unexpected outcomes.
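To make that point concrete, here is a deliberately simplified toy model, not a reconstruction of what happened in 1987. It assumes each algorithm follows a single hypothetical rule: sell once the price falls to its own threshold. It also assumes each sale pushes the price down by a fixed amount. The function name, thresholds, and price-impact figure are all invented for illustration.

```python
def run_cascade(thresholds, start_price=100.0, shock=2.0, impact=3.0):
    """Toy model: each algorithm sells once when price <= its threshold.

    thresholds -- hypothetical sell trigger for each independent algorithm
    shock      -- initial external price drop
    impact     -- assumed price drop caused by each individual sale
    Returns (final_price, number_of_algorithms_that_sold).
    """
    price = start_price - shock
    sold = set()
    triggered = True
    while triggered:
        triggered = False
        for index, threshold in enumerate(thresholds):
            if index not in sold and price <= threshold:
                sold.add(index)   # this algorithm acts on its own rule...
                price -= impact   # ...and in doing so moves the market
                triggered = True
    return price, len(sold)


thresholds = [98.0, 96.0, 93.0, 90.0, 87.0]
small_dip = run_cascade(thresholds, shock=1.0)   # no threshold is reached
larger_dip = run_cascade(thresholds, shock=2.0)  # first sale triggers the rest
print(small_dip, larger_dip)
```

In this sketch, every algorithm behaves exactly as designed. The difference between a one-point dip that goes nowhere and a seventeen-point slide comes entirely from how the individual rules interact.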
The concept of feedback is familiar to all of us: feedback occurs when we create a chain of connections, and the end of the chain is connected back to its start. At this point, we have—either by design or by accident—created a feedback loop.
Negative feedback is an enormously useful mechanism that is present in almost all complex systems, whether mechanical, electrical, software, or natural. For example, negative feedback keeps our drones in the air and our bodies at the correct temperature. By contrast, positive feedback often results in an unstable and undesirable outcome. Most of us have experienced the horrible howling that occurs when a microphone is held too close to the speaker in a public address system.
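The difference between the two kinds of feedback can be sketched in a few lines. This is a minimal illustration that assumes a single value being nudged toward a target on each step; the function name, the gain values, and the temperature-like numbers are my own choices rather than figures from any real control system.

```python
def simulate(gain, setpoint=20.0, start=25.0, steps=20):
    """Repeatedly apply value += gain * (setpoint - value).

    A positive gain opposes the error (negative feedback).
    A negative gain reinforces the error (positive feedback).
    Returns the list of values, starting with the initial one.
    """
    value = start
    history = [value]
    for _ in range(steps):
        error = setpoint - value
        value += gain * error
        history.append(value)
    return history


negative = simulate(gain=0.5)    # deviation halves on every step
positive = simulate(gain=-0.5)   # deviation grows by 50% on every step

print(f"negative feedback ends at {negative[-1]:.4f}")
print(f"positive feedback ends at {positive[-1]:.1f}")
```

With the correcting gain, the value settles onto the setpoint. With the sign flipped, the same loop drives it further away on every pass, which is the numerical equivalent of the howling microphone.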
In complex systems such as IT environments, it’s increasingly common to allow systems to share data. For example, the owners of System A may decide their system can benefit from information produced by System B. Because most modern systems make such data sharing straightforward, it should be easy to allow System A to ingest and act on data from System B. However, the owners of System A may be blissfully unaware of what else System B is connected to.
It’s therefore important to recognize that whenever we connect two systems together, we are creating a new “supersystem.” If we do not understand how the original systems are already connected to more distant systems, the supersystem we create may be substantially more extensive than we realize. We may even create unanticipated feedback loops.
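One way to check for such loops, assuming we can write down which system feeds data to which, is to treat the environment as a directed graph and search it for cycles. The sketch below uses a standard depth-first search; the system names and the `flows` mapping are hypothetical, and a real environment would need that map to be discovered rather than typed in by hand.

```python
def find_feedback_loop(flows):
    """Return one cycle as a list of system names, or None if there is none.

    flows maps each system to the list of systems that consume its data.
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    path = []

    def visit(node):
        state[node] = IN_PROGRESS
        path.append(node)
        for consumer in flows.get(node, ()):
            if state.get(consumer, UNSEEN) == IN_PROGRESS:
                # We have reached a system already on the current path.
                return path[path.index(consumer):] + [consumer]
            if state.get(consumer, UNSEEN) == UNSEEN:
                found = visit(consumer)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in list(flows):
        if state.get(node, UNSEEN) == UNSEEN:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical environment: System A ingests data from System B.
flows = {
    "system_b": ["system_a"],
    "system_a": ["system_c"],
    "system_c": ["system_d"],
    "system_d": [],
}
before = find_feedback_loop(flows)

# Another team later connects System D back to System B.
flows["system_d"] = ["system_b"]
after = find_feedback_loop(flows)
print(before, after)
```

The owners of System A made one local change, and so did the owners of System D. Neither change looks risky on its own, but together they close a loop that only becomes visible when the whole supersystem is mapped.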
If we are to truly realize our digital transformation ambitions, we will need to rely on systems running largely autonomously—for example, under the guidance of artificial intelligence operations (AIOps) algorithms. We will also need to allow more and more systems to share data with one another.
It is therefore essential that we design pervasive observability into our IT environments so we can understand the interdependencies between different systems and subsystems, maintain the independent oversight needed to identify emerging anomalous behavior, and retain control, thereby avoiding our own IT stock market crash.