How to predict market crashes

By Marco Lagi | August 26, 2015

As I woke up recently to the news of a 1000-point drop of the Dow Jones, I thought back to a paper we recently published about predicting stock market crashes.

Over the last few decades, the US stock market has had several crashes, none of which compares to the 2007-2008 financial crisis, when the market lost about 50% of its value in 6 months. Such a massive loss can only be explained by a combination of bad news and crowd behavior.

When big (or many) investors start selling, they drive the price down, which stokes fear of further losses in other investors, who in turn start selling. If this loop gets out of control, the market experiences panic.

Panic, in economics as in every other aspect of life, may be due not only to real threats (external to the system), but also to self-generated nervousness (internal to the system). Whether the threat is real or imaginary, a living collective system panics when it’s overwhelmed by agitation, anxiety and fear. The larger the collection of organisms in the system, the more catastrophic the effects of panic can be:

  • Oct, 1929: the US stock market loses 23% of its value in 2 days
  • Mar, 1999: run on Malaysian Central Bank deposits for $4.5B
  • Aug, 2005: 953 people killed in the Baghdad bridge stampede

Trying to predict the behavior of economic and social systems is usually a mess. They are composed of human agents, all with their own individual interests, strategies and goals. But there are cases in which the interactions of individuals give rise to collective behavior. In these cases, prediction is possible because when everyone moves together, the space of possible outcomes of the system shrinks. So if panic is really the root of financial crashes, maybe by quantifying it we can predict when these will occur.

Let’s try to answer two questions:

  • can panic be quantified?
  • can measures of panic be used to predict crashes?

TL;DR: yup, and yup!

How panicky is the market?

In traditional economics, prices reflect expectations based upon news: only external information triggers decisions. But in reality, markets are networks of influence (Fig. 1): people talk to each other, and look to each other when making decisions. This means that in order to have a complete picture of the system, internal mimicry has to be considered, and the influence from outside the system has to be quantified rather than taken for granted.

Fig. 1 Comparison between a traditional economic perspective of market dynamics, where only external news influence agent decisions, and a complex systems/behavioral economics perspective, where agent interactions are taken into account.

The model

So we built a model that includes both factors, and tested it on financial data to quantify the two effects.

Without going into the math, the model can be represented as a fully connected network of N nodes, where each node can only assume binary values, +1 or -1.

At every step, every node looks at its neighbors, picks one at random and, with a certain probability, copies whatever the neighbor is doing. There are also a few nodes that are fixed and don’t change their value. You can probably guess where this is going: the fluctuating nodes represent individual stocks, the fixed ones economic news from the media.

Sometimes stocks take their value from news, other times they just copy their neighbors. The binary value that the stocks can assume represents the sign of their return. We call the number of fixed nodes influencing in a positive direction U (up, positive news), and the number influencing in a negative direction D (down, negative news).
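To make the copying rule concrete, here is a minimal sketch of it in Python. It is an illustration under a few assumptions of mine, not the code behind the paper: nodes are updated one at a time instead of all at once, and the names (simulate_up_fraction, copy_prob and so on) are made up for this example. Because the network is fully connected, it's enough to track how many free nodes are currently at +1.

```python
import random
from statistics import pvariance

def simulate_up_fraction(n_free, n_up, n_down, steps, copy_prob=1.0, seed=0):
    """Return the fraction of free nodes at +1 after each single-node update.

    n_free: number of fluctuating nodes (stocks), must be at least 2
    n_up:   number of nodes fixed at +1 (U, positive news)
    n_down: number of nodes fixed at -1 (D, negative news)
    Assumption: one randomly chosen node is updated per step.
    """
    rng = random.Random(seed)
    k = n_free // 2                         # free nodes currently at +1
    neighbours = n_free - 1 + n_up + n_down  # everyone except the node itself
    history = []
    for _ in range(steps):
        # Pick a random free node; it is "up" with probability k / n_free.
        node_is_up = rng.random() < k / n_free
        if rng.random() < copy_prob:
            # Pick a random neighbour (free or fixed) and copy its state.
            ups_seen = k - (1 if node_is_up else 0) + n_up
            neighbour_is_up = rng.random() < ups_seen / neighbours
            if neighbour_is_up and not node_is_up:
                k += 1
            elif node_is_up and not neighbour_is_up:
                k -= 1
        history.append(k / n_free)
    return history

# Weak news (U = D = 1) versus strong news (U = D = 50) for 100 stocks.
weak = simulate_up_fraction(100, 1, 1, steps=100_000, seed=1)
strong = simulate_up_fraction(100, 50, 50, steps=100_000, seed=1)
print("spread with weak news:  ", pvariance(weak))
print("spread with strong news:", pvariance(strong))
```

With few fixed nodes the free nodes mostly copy each other, so the fraction of "up" nodes wanders over a much wider range than it does when the fixed nodes dominate.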

This model is similar to the Ising model, in the sense that it can describe order-disorder transitions. It’s a very general model and it has been used before in different contexts (e.g. the Moran model in population genetics). In all cases, there’s an ordered behavior (when nodes do the same thing) and a disordered one (when nodes do different things).

There are only 2 important parameters that determine the behavior of this model:

  • the ratio of external to internal links (i.e. the influence of news, (U+D)/N)
  • the fraction of positive fixed entities (i.e. the external bias, (U-D)/N).

Evaluating the model using real data

The model can be solved analytically to get the distribution of stocks with positive return. And this is something we can compare to real data!
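The closed form isn't spelled out in this post. As a hedged sketch, one distribution that matches the behavior described here (flat when U = D = 1, sharply peaked at 0.5 when U = D >> 1) is the beta-binomial with parameters U and D. Treat that choice, and the function names below, as assumptions made for illustration:

```python
from math import lgamma, exp

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def up_count_distribution(n_free, u, d):
    """P(k of n_free stocks have positive return), for k = 0..n_free.

    Assumption: the distribution is beta-binomial with parameters (u, d).
    """
    probs = []
    for k in range(n_free + 1):
        log_choose = lgamma(n_free + 1) - lgamma(k + 1) - lgamma(n_free - k + 1)
        log_p = log_choose + log_beta(k + u, n_free - k + d) - log_beta(u, d)
        probs.append(exp(log_p))
    return probs

flat = up_count_distribution(100, 1, 1)      # every outcome equally likely
peaked = up_count_distribution(100, 50, 50)  # mass concentrated near 50 of 100
```

Working with log-gamma values avoids overflow in the factorials when the number of stocks is in the thousands.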

Let’s take the underlying constituents of the Russell 3000 index and plot the distribution of the fraction of stocks with positive return (Fig. 2). The solid lines are the experimental distributions, while the dashed lines are the best fit using the model. It fits pretty well!

Fig. 2 Plotted is the fraction of trading days during the year (vertical axis) in which a certain fraction of stocks (horizontal axis) moved up. Empirical data are shown (solid lines) along with theoretical fits (dashed lines) for the years indicated.

A few observations:

  • If there is an equal amount of positive and negative news, U = D, the distribution is centered at 0.5 and the model reduces to only one parameter, U. This is what we see empirically for all years.

  • If the traditional economic view were right, i.e. if bad news were the driving force of the financial crisis, the peak would shift away from the middle as we approach 2008. Instead, the market goes up one day and down the next, while the distribution is always centered at 0.5.

  • In the year 2000, markets displayed disordered behavior (U >> 1): everyone was doing their own thing. But as time goes by, the distribution gets flatter (U ~ 1) and the system becomes more ordered: people stop listening to the news and start following each other.

  • Fast forward to 2008 and the distribution is basically flat. The probability that a large fraction of the market moves in the same direction, either up or down, on any given day, increases dramatically. We have reached the critical value of the model, the transition between disordered and ordered states, and the market has reached the financial crisis.
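For a rough feel of how a value of U could be read off the data, here is a small sketch. It is not the fitting procedure used for Fig. 2: it assumes U = D, treats the number of stocks as large enough that the daily fraction of rising stocks follows a Beta(U, U) shape, and matches the variance, which for that shape is 1 / (4(2U + 1)). The function names are hypothetical.

```python
from statistics import pvariance

def fraction_up(daily_returns):
    """For each day (a list of per-stock returns), the fraction that are positive."""
    return [sum(r > 0 for r in day) / len(day) for day in daily_returns]

def estimate_u_symmetric(fractions):
    """Moment-matching estimate of U, assuming U = D and a Beta(U, U) shape.

    Var = 1 / (4 * (2U + 1))  =>  U = (1 / (4 * Var) - 1) / 2
    """
    var = pvariance(fractions)
    if var == 0:
        raise ValueError("fractions have zero variance; U is not identifiable")
    return (1.0 / (4.0 * var) - 1.0) / 2.0

# Two toy days with two stocks each: one stock up, then both up.
print(fraction_up([[0.1, -0.2], [0.3, 0.4]]))

# A perfectly flat set of daily fractions gives an estimate close to U = 1.
flat_year = [i / 1000 for i in range(1001)]
print(estimate_u_symmetric(flat_year))
```

A year in which the daily fractions cluster tightly around 0.5 yields a large U, while a year in which they spread evenly between 0 and 1 yields U close to 1.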

At this point, we have answered the first question: we have a signature of increased mimicry approaching the 2008 financial crisis, i.e. a decrease of our model parameter U (influence of the news on the system). Now to the second question!

Is prediction of a market crash possible?

In the top panel of Fig. 3 you see the time dependence of U. While the 1990s were a relatively healthy period for financial markets, nowadays we follow each other much more. But nothing in the plot suggests that U itself is predictive of financial crashes, apart from its reaching U ~ 1 during the 2008 crisis. That value must then be characteristic only of major dislocations.

Let’s consider the 20 largest single-day drops of the US stock market (i.e. what can be fairly confidently called a “panic”). Of those, 8 are in the time period of the lower panel of Fig. 3 (1985-2010, represented by red vertical lines) clustered in 4 windows (blue shaded regions). How do we use U to postdict them?

Fig. 3 Top panel is the model parameter U as a function of time. Bottom panel is the annual change of U as a fraction of its standard deviation, computed over the previous year. Four year-long windows (blue shading) follow two standard deviation drops in the model parameter after periods of increase.

If U = 1 is not predictive, then what is? As it turns out, significant changes in U are predictive. In the bottom panel of Fig. 3 we show a measure of exactly this: the change in U from one year to the next, divided by its own uncertainty (the standard deviation of its fluctuations over the previous year). A simple signature pattern precedes each drop. Let’s start the clock when U falls by more than 2 standard deviations in this plot: within the next year, a major crash will occur. We then reset the clock when the relative change becomes positive again. If we follow this pattern, we get 8 true positives. No false positives, no false negatives.
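The clock rule can be sketched in a few lines of Python. This is one reading of the description, with assumptions of mine labeled: a window of 252 trading days stands in for "one year", the standard deviation is taken over that same trailing window, and a warning stays open until the normalized change turns positive. The names are hypothetical.

```python
from statistics import pstdev

def normalized_annual_change(u_series, window=252):
    """Change in U over `window` steps, divided by the std of U over that window.

    Assumption: `window` = 252 trading days approximates one year.
    """
    z = []
    for t in range(window, len(u_series)):
        past = u_series[t - window:t]
        sd = pstdev(past)
        change = u_series[t] - u_series[t - window]
        z.append(change / sd if sd > 0 else 0.0)
    return z

def warning_windows(z, threshold=-2.0):
    """Return (start, end) index pairs of warning periods.

    A warning starts when z drops below `threshold` and is reset
    when z becomes positive again (end index is exclusive).
    """
    windows, start = [], None
    for i, value in enumerate(z):
        if start is None and value < threshold:
            start = i            # start the clock
        elif start is not None and value > 0:
            windows.append((start, i))
            start = None         # reset the clock
    if start is not None:
        windows.append((start, len(z)))  # warning still open at the end
    return windows

# Toy series of normalized changes, not real market data.
toy_z = [0.5, -1.0, -2.5, -3.0, -1.0, 0.2, 1.0, -2.1]
print(warning_windows(toy_z))
```

On a real series, each returned window would then be checked against the dates of the large single-day drops to count true and false positives.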

Collective behavior of large complex systems is usually like this. Given that it takes time to build, there has to be a period when the system is practicing, so to speak, a period of time before the widespread financial panic when people are just watching one another and slowly building coherence. And this creates enough advance warning to predict a crash.

As a final note, large jumps also happen in the same blue windows because of the role of mimicry, but always after a single-day crash. There’s no one-day out-of-the-blue increase. So the situation is not completely symmetric, probably because in humans the fear of loss is more powerful than the desire for gain. When people start following one another, it’s mostly because they are afraid.


It turns out that large single-day panics in the US stock market are often, if not always, preceded by long periods of quantifiable mimicry and weak influence of external news. This feature allows us, in principle, to anticipate intra-day crashes (not due to fat fingers or market manipulation, that is).

Building models is fun and, even though all models are wrong, some are definitely useful. Part of what we do at Kemvi is exactly this. But instead of modeling why investors buy stocks, we model why companies buy your product. And instead of predicting market indexes, we predict whether you’ll be able to close that one big account this quarter. If you have a Salesforce account, try it out for free!