The Correlation Of Gold Versus Other Markets
( Part 1 of 2 )
John Peterson
jcp@eskimo.com
04 June 2007
Abstract
The question of how closely Copper and Gold are correlated was addressed in a recent article penned by Adrian Douglas [1]. While I found his article interesting in its own right, it also raised some questions in my own mind. The quantitative approach that he used addressed correlation over long time scales (years). I was very curious about the nature of this relationship over much shorter (daily) time scales.

To be more specific, the question I set out to answer is the following: If on any given day Copper closes in a particular direction (up or down, relative to the previous day's close), on that same day does Gold tend to follow in the same direction, move in the opposite direction, or just do its own thing? Daily end of day data, which is readily available, is adequate to provide some insight into this question.

After going to the trouble of dusting off my analytical tool belt and taking a look at Copper, I decided to expand my analysis and estimate the correlation for a variety of markets and instruments versus Gold. For Gold, I used two different price series: the London PM Gold FIX and the StreetTracks Gold Trust (GLD). The sum total of my investigation was intriguing enough to inspire me to write up the results.

Transforming the Data

To address such short time scale correlations, the "trick" is to first convert or transform the data from daily closing prices into units that are unambiguously related to the daily returns. Then compute the sample correlation statistic using the converted data. I will give more specific details of what I mean by daily returns after providing some motivation for the best transformation to use in this situation.

One obvious transformation is to subtract yesterday's quote (e.g. the close) from today's quote. This is just the net change as reported by price quote services. The flaw with using the net change is that it does not quantify the profit potential or how much one would earn when a fixed dollar amount is hypothetically invested overnight (or over whatever the sample period of the data is) in two different instruments with generally different price levels [2]. One quantitative measure that correctly reflects this is the ratio of today's quote divided by yesterday's quote, which is what I will call the daily return in this article.

A simple but much more elegant measure is the logarithm of the daily return. A curious property of the logarithm [3] allows this transformation to be written in two ways which at first might appear to be different but are actually identical. The two forms of the transformation are:

log( today's quote / yesterday's quote )
or equivalently
log( today's quote ) - log( yesterday's quote )

The first form shows that it correctly quantifies profit potential as described above regardless of the general price level. For example, inspection shows that a move from 1 to 2 produces the same number as a move from 50 to 100, etc. If you are not familiar with logarithms you might want to grab a calculator and punch in a few examples. You will find that a price gain (a return greater than one) yields a number greater than zero, while a price decline (a return less than one) will yield a number less than zero.
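For readers who would rather use a few lines of code than a calculator, here is a minimal sketch of the transformation in Python. The function name and the quotes are hypothetical, chosen only to illustrate the properties described above. They are not taken from the data used in this article.

```python
import math

def log_returns(quotes):
    """Return log(today / yesterday) for each consecutive pair of quotes."""
    return [math.log(today / yesterday)
            for yesterday, today in zip(quotes, quotes[1:])]

# Hypothetical quotes: a doubling at a low price level and at a high one.
doubling_low = log_returns([1.0, 2.0])[0]
doubling_high = log_returns([50.0, 100.0])[0]

# The two forms of the transformation give the same number.
form_one = math.log(100.0 / 50.0)
form_two = math.log(100.0) - math.log(50.0)

print(doubling_low, doubling_high)     # the two doublings yield the same value
print(form_one, form_two)              # both forms agree
print(log_returns([10.0, 11.0, 9.9]))  # a gain is positive, a decline is negative
```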

While this transformation might seem rather odd to some of you, there is nothing newfangled about it at all. From the second form above, the transformation can be seen as the moral equivalent of measuring the vertical distance between two points plotted on a chart that uses logarithmic scaling of the price axis. The advantages of using logarithmic charts were mentioned as early as 1932 in the first edition of the classic text on chart patterns by Richard W. Schabacker [4]. (Although I'm relatively certain that the use of such charts goes back even further.)

If you have a log scaled chart and a ruler handy you will find that when you measure the vertical distance of a move from say 1 to 2, this distance will also correspond to a doubling of the price anywhere on the chart. If you are careful you can even meaningfully compare the log charts of any two arbitrary instruments for profit potential. In the days of my youth when charts were constructed with pencil and paper that simply meant using the same kind of graph paper (something you bought at a book or stationery store) for both charts. With today's computer generated log scaled charts it's often possible to plot both on the same chart. If not, plot them separately, then verify that they have identical physical dimensions and that the ratio of the price at the top boundary divided by the price at the bottom boundary is the same for both.

It may seem like I am jumping up and down on a dead horse, but the important point here is that each data point will get equal weighting in the computation of the sample correlation statistic when this transformation is used. Factors not associated with what we want to measure (day to day correlation) will not bias our estimates. Different average price levels will not favor one instrument over the other. A sustained up or down trend in one or both of the instruments will not cause some of the samples to be weighted differently than others [5].
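To make the point about bias concrete, here is a small illustrative sketch in Python. It is a contrived example of my own construction, not the calculation used for the figures in this article. Two hypothetical instruments share the same steady up trend but their daily wiggles are exactly opposite. The raw prices look strongly correlated because the trend dominates, while the logarithm of the daily returns reveals that on a day to day basis they move against each other.

```python
import math

def pearson(x, y):
    """Ordinary sample (Pearson) correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def prices_from_log_returns(first_price, returns):
    """Build a price series from a starting price and a list of log returns."""
    prices = [first_price]
    for r in returns:
        prices.append(prices[-1] * math.exp(r))
    return prices

# Hypothetical inputs: same drift, opposite daily wiggle, different price levels.
days, drift, wiggle = 200, 0.01, 0.02
returns_a = [drift + wiggle * (-1) ** t for t in range(days)]
returns_b = [drift - wiggle * (-1) ** t for t in range(days)]
prices_a = prices_from_log_returns(10.0, returns_a)
prices_b = prices_from_log_returns(500.0, returns_b)

corr_raw = pearson(prices_a, prices_b)         # dominated by the shared trend
corr_returns = pearson(returns_a, returns_b)   # reflects day to day behavior
print(corr_raw, corr_returns)
```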

To give the reader a feel for the transformed data, a plot of the London PM Gold FIX after conversion is shown in figure 1 below. To make it easier to mentally digest the values, the tick marks on the vertical axis correspond to units of percent change, something most readers will readily interpret.

Figure 1. London PM Gold FIX - log(daily returns)

One thing you will notice from figure 1 above is that any visual clues of medium or long term trends are largely gone. If your goal is to identify such trends this transformation will greatly hinder rather than help you. Actually, the trend information is still there but it is in a subtle form that isn't easily processed by your eyes and brain [7].

As an aside it is worth pointing out that the transformation retains almost all the information contained in the data (with the exception of the general price levels). If you have saved the first, last, or any other single raw price quote it is possible to reverse the transformation and completely recover all the original raw price quotes [8].
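A minimal sketch of reversing the transformation, assuming Python. The function names and the quotes are hypothetical. Given the transformed series and one saved raw quote (at any position), the code walks forward and backward from that quote to rebuild the rest.

```python
import math

def to_log_returns(quotes):
    """Transform raw quotes into the logarithm of the daily returns."""
    return [math.log(b / a) for a, b in zip(quotes, quotes[1:])]

def from_log_returns(returns, anchor_index, anchor_quote):
    """Rebuild all quotes given one saved quote at position anchor_index."""
    n = len(returns) + 1
    quotes = [0.0] * n
    quotes[anchor_index] = anchor_quote
    # Walk forward from the saved quote, applying each day's return.
    for i in range(anchor_index + 1, n):
        quotes[i] = quotes[i - 1] * math.exp(returns[i - 1])
    # Walk backward from the saved quote, undoing each day's return.
    for i in range(anchor_index - 1, -1, -1):
        quotes[i] = quotes[i + 1] / math.exp(returns[i])
    return quotes

# Hypothetical raw quotes, for illustration only.
quotes = [660.50, 664.25, 659.00, 671.75, 668.10]
transformed = to_log_returns(quotes)

# Recover the full series from the first, a middle, or the last quote.
recovered_first = from_log_returns(transformed, 0, quotes[0])
recovered_middle = from_log_returns(transformed, 2, quotes[2])
recovered_last = from_log_returns(transformed, 4, quotes[4])
print(recovered_middle)
```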

One thing that does jump out at you from figure 1 above is the volatile, random looking character of day to day price action. The figure also highlights the non-stationary nature of financial time series. (For the purposes of this article take that to mean that its characteristics vary with time). A visual inspection shows that there are quiet periods where the market makes relatively small day to day swings and other periods that exhibit much wider swings, and with greater frequency. Such "clumps" of high volatility are observed in almost all financial time series.

Some examples of high volatility include the weeks following the announcement of the Washington Central Bank Gold Agreement in September 1999. The ferocious nature of the correction from the May 2006 swing high was gut wrenching for those who were riding it out and that is also evident upon inspection of figure 1.

Speaking subjectively, I think the good news for Gold bulls here is that the wild downside volatility seems to be subsiding. While a disappointment to many, the recent correction has been fairly orderly and without wild volatility.

To quantify volatility I computed a 200 day moving standard deviation (see reference [9]) of the data using a time centered sampling window (half the samples are to the left of the plotted value, the other half are to the right). The light blue curves correspond to plus and minus two times the estimates of standard deviation.
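A rough sketch of a time centered moving standard deviation, assuming Python. The exact placement of an even sized window around the plotted value and the choice of the sample (n - 1) estimator are assumptions of this sketch, and the input series is a contrived stand-in rather than real market data.

```python
import statistics

def centered_moving_std(values, window):
    """Standard deviation over a window centered on each sample.

    Assumption: for an even window, the window covers the samples
    from i - window//2 up to (but not including) i + window//2.
    Positions where the full window does not fit are left as None.
    """
    half = window // 2
    out = [None] * len(values)
    for i in range(half, len(values) - half + 1):
        out[i] = statistics.stdev(values[i - half:i + half])
    return out

# Hypothetical transformed data: a quiet period followed by a volatile one.
quiet = [0.001 * (-1) ** t for t in range(300)]
wild = [0.01 * (-1) ** t for t in range(300)]
series = quiet + wild

std = centered_moving_std(series, 200)

# The plus and minus two standard deviation bands.
upper_band = [None if s is None else 2.0 * s for s in std]
lower_band = [None if s is None else -2.0 * s for s in std]
print(std[150], std[450])
```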

Estimates of standard deviation are very helpful in portfolio risk assessment. The estimates from any two financial instruments can be directly compared to get a measure of their relative volatility. I will tabulate some values for a selection of markets later in this article.

Another component of risk management that can be addressed from a plot such as figure 1 is the frequency of wild points or outliers. Namely those days when the price makes a very substantial move up or down. They can be particularly important in situations where some sort of leverage is being used (e.g. purchasing stocks on margin or futures contracts).

One can compare the character of the outliers of two different instruments by comparing each of them to the normal or Gaussian probability distribution [10]. For data from a normal probability distribution, one would expect the number of samples that exceed two standard deviations to be approximately 2.275 percent of the total number of samples. The same is expected for the number of samples less than minus two standard deviations. Comparing the actual counts to these values can give some insight into the nature of the outliers or tails of the distribution. Figure 1 shows that for the entire period going back to 1999, London Gold exhibits tails that are not all that different from a normal distribution.
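The comparison can be sketched as follows, assuming Python. For simplicity this sketch uses a single standard deviation for the whole sample and measures deviations from the sample mean, which is an assumption of the sketch and may differ from how the bands in the figures were applied. The stand-in data are evenly spaced quantiles of a normal distribution, not market data.

```python
import math
import statistics

def tail_fractions(values, num_std=2.0):
    """Fractions of samples above +num_std and below -num_std deviations."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)
    upper = sum(1 for v in values if v - mean > num_std * std)
    lower = sum(1 for v in values if v - mean < -num_std * std)
    return upper / len(values), lower / len(values)

def normal_tail_probability(num_std=2.0):
    """One-sided tail probability of a normal distribution beyond num_std."""
    return 0.5 * math.erfc(num_std / math.sqrt(2.0))

# Hypothetical stand-in data: evenly spaced quantiles of a standard normal.
n = 2000
normal = statistics.NormalDist()
values = [normal.inv_cdf((k + 0.5) / n) for k in range(n)]

upper_frac, lower_frac = tail_fractions(values)
expected = normal_tail_probability()   # approximately 0.02275
print(upper_frac, lower_frac, expected)
```

Fractions well above the expected value would suggest fatter tails than a normal distribution, which matters most when leverage is involved.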

Figure 2. Standard and Poor's 500 Stock Index - log(daily returns)

A number of authors have commented on the relative lack of volatility in the US stock markets over the last few years. An inspection of figure 2 above shows that since late 2003, the S&P500 has been relatively quiet when compared to the period right before it. That calm was recently punctuated with a dose of some wild downside volatility during the late February, early March stock market correction. Just a guess, but given that wild points often come in "clumps", there is a good possibility there may be more in store in the months ahead.

Correlation of Copper versus Gold

In this section I will present correlation estimates for the logarithm of the daily returns of COMEX High Grade Copper versus Gold. I used two different price series for Gold, each having its own quirks and limitations. In this article I used the ordinary sample correlation statistic, sometimes called the Pearson correlation statistic. In the interests of brevity I will refer those who are unfamiliar with the details to any good statistics text or the online reference [11].

There is one special consideration that is worth mentioning. When correlating the returns of two financial price series, care must be taken to be sure that the dates associated with each pair of samples match up correctly (e.g. accounting for differences in exchange holidays, etc.).
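One way to handle the date matching is sketched below, assuming Python. The choice made here, keeping only the dates on which both series have a quote and then computing the returns between consecutive common dates, is an assumption of this sketch and not necessarily the procedure used for the figures. All dates and quotes are made up for illustration.

```python
import math

def pearson(x, y):
    """Ordinary sample (Pearson) correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def align_by_date(series_a, series_b):
    """Keep only the dates present in both series, in date order."""
    common = sorted(set(series_a) & set(series_b))
    return common, [series_a[d] for d in common], [series_b[d] for d in common]

def log_returns(quotes):
    return [math.log(b / a) for a, b in zip(quotes, quotes[1:])]

# Hypothetical closes keyed by ISO date; the second series has one extra day.
copper = {"2007-05-24": 3.30, "2007-05-25": 3.28, "2007-05-29": 3.35,
          "2007-05-30": 3.31, "2007-05-31": 3.40}
gold = {"2007-05-24": 653.0, "2007-05-25": 655.0, "2007-05-28": 656.0,
        "2007-05-29": 660.0, "2007-05-30": 652.0, "2007-05-31": 661.0}

dates, copper_quotes, gold_quotes = align_by_date(copper, gold)
r = pearson(log_returns(copper_quotes), log_returns(gold_quotes))
print(dates, r)
```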

Figure 3. Correlation of COMEX High Grade Copper versus
London PM Gold FIX
Left: Normalized Scattergram, Right: Correlation versus Lead / Lag
Date Range: 01 Jul, 2004 to 01 Jun, 2007

In figure 3 above, the left hand plot is a scattergram of COMEX High Grade Copper plotted against the corresponding London PM Gold FIX. The data points (the logarithm of the returns) have been normalized by their respective standard deviations for plotting purposes. This guarantees you will see a fuzzy ball when the two variables are uncorrelated, and a football (American football, not soccer ball) or cigar like shape when the variables are correlated.

The right hand plot shows the correlation for various time shifts or leads and lags applied to the first series (Copper in this case) before computing the correlation between the two. The dashed blue lines show the uncertainty in the estimates that can be attributed to the number of samples used. The more samples used, the less uncertainty. (Unfortunately, my database for COMEX Copper only goes back to mid 2004). Roughly speaking, in the case where the two variables are not correlated with each other, one would expect almost all of the estimates to be contained within the dashed blue lines.
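A sketch of the lead, lag calculation, assuming Python. Two things here are assumptions of the sketch rather than statements about how the figures were produced: the sign convention (a shift of minus one pairs yesterday's value of the first series with today's value of the second) and the uncertainty band, for which I use the common rule of thumb of 1.96 divided by the square root of the number of samples. The data are pseudo-random stand-ins in which the second series simply repeats the first one day later.

```python
import math
import random

def pearson(x, y):
    """Ordinary sample (Pearson) correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def lagged_correlation(first, second, shift):
    """Correlate first[t + shift] with second[t] over the overlapping samples."""
    n = len(first)
    if shift >= 0:
        a, b = first[shift:], second[:n - shift]
    else:
        a, b = first[:n + shift], second[-shift:]
    return pearson(a, b), len(a)

# Hypothetical stand-in data: second[t] repeats first[t - 1].
rng = random.Random(0)
first = [rng.gauss(0.0, 0.01) for _ in range(500)]
second = [0.0] + first[:-1]

results = {}
for shift in range(-3, 4):
    corr, count = lagged_correlation(first, second, shift)
    band = 1.96 / math.sqrt(count)   # rule-of-thumb uncertainty (an assumption)
    results[shift] = (corr, band)
    print(shift, round(corr, 3), round(band, 3))
```

In this contrived case the correlation at a shift of minus one stands far outside the band, while the other shifts should mostly fall inside it.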

Not surprisingly, there is a positive correlation of Copper versus Gold on a day to day basis that is well outside the uncertainty bands. There is also a positive correlation of Copper with a lead, lag of minus one day versus London Gold. In the latter situation, when COMEX Copper closes in a given direction (either up or down), there is a slight tendency for the London PM Gold FIX to move in the same direction (up or down) the following day.

The mechanism responsible for the positive correlation of the daily returns of COMEX Copper with the returns of the London PM Gold FIX for the following day isn't too difficult to understand. It does take some explaining, so bear with me.

The trading sessions for the COMEX Copper daily returns correspond to the time intervals from the settlement at 1 PM New York time from one day to the next. The trading sessions associated with the London PM Gold FIX correspond to the time intervals from 3 PM London time (11 AM New York) from one day to the next. If I adopt a New York centric view of time, every 24 hour period can be divided into two distinct periods; interval 1 from 11 AM (3 PM London) to 1 PM, and interval 2 from 1 PM to 11 AM the following day. The price discovery that occurs during interval 1 is part of today's Copper session and tomorrow's PM FIX. Similarly, the price discovery that occurs during interval 2 is part of tomorrow's Copper session and tomorrow's PM FIX. So the observed correlations can in part be explained by the fact that the two series are not sampled at exactly the same time of day. A mental image of this effect that is not too off the mark is one of a bleeding or smearing of the positive correlation between the two lags.

This quirk is certainly a bit of an annoyance in one sense. It makes it a little harder to get a real handle on how Copper and Gold are correlated. However, one gets a teeny bit of extra information in exchange. Whatever causal force is there which is moving both Copper and Gold is fairly active during the 2 hour long window from 11 AM to 1 PM New York.

Figure 4. Correlation of COMEX High Grade Copper versus StreetTracks Gold Trust
Left: Normalized Scattergram, Right: Correlation versus Lead / Lag
Date Range: 19 Nov, 2004 to 01 Jun, 2007

Figure 4 above shows the same calculations for COMEX Copper versus the StreetTracks Gold Trust. The StreetTracks ETF closes at the end of the New York day at 4 PM along with all the stock exchanges. In this case, COMEX Copper does not exhibit positive correlation that is statistically significant for the case of a lead, lag of plus or minus one. The scattergram also exhibits more of a football-like shape that is associated with two variables that are correlated.

Once again, I get a teeny bit of extra information from the overlap of the trading sessions. In this case I can divide every 24 hour period into two disjoint periods; interval 1 from COMEX settlement at 1 PM to the settlement of StreetTracks at 4 PM, and interval 2 from 4 PM to 1 PM the following day. Whatever forces are driving both markets to move together are active during interval 2, but not particularly so during interval 1.

Correlation of Gold shares versus Gold

Given the interest of the Gold community in the behavior of Gold shares versus Gold, it seemed appropriate to have a closer look at them. In this section I will present estimates of the correlation of the logarithm of the daily returns of the AMEX Gold BUGS Index versus both the London PM Gold FIX and the StreetTracks Gold Trust.

The scattergram and correlation estimates for the London PM Gold FIX are shown in figure 5 below. In a somewhat arbitrary fashion I chose to estimate the correlation using data starting from September 2000 (which roughly corresponds to a couple months before the start of the Gold share bull market) to the present.

Figure 5. Correlation of AMEX Gold BUGS Index versus London PM Gold FIX
Left: Normalized Scattergram, Right: Correlation versus Lead / Lag
Date Range: 01 Sep, 2000 to 01 Jun, 2007

The quirk associated with the series being sampled at different times of the day once again makes it more difficult to get a real handle on the correlation of Gold shares versus Gold. In this case, the correlation observed for a lead, lag of minus one is actually slightly stronger than that observed for no lead, lag. To be totally objective, they don't differ by more than the uncertainty of the estimates. But it is safe to say that the correlations for lead, lags of minus one and zero are comparable.

In exchange for this impairment, once again I get a small bit of extra information. Whatever forces are moving Gold shares and Gold together are slightly more active (on average) during the period from 11 AM New York through the close of the stock markets at 4 PM as compared to the "overnight" period from 4 PM through 11 AM the following day. For what it is worth, the period from 11 AM to 4 PM New York includes all but the first hour and one half of the entire trading session for the stocks that make up the BUGS Index. The observed correlations are not really too surprising given this.

As noted before, one of the trademarks of financial series is their non-stationary (changing) character. I thought it might be revealing to estimate the correlation for these lead, lag values applied to the BUGS Index over a much shorter time window of just 50 days. Figure 6 below shows the results of calculating the correlations over a trailing moving window of 50 samples over the same range of dates. (The date associated with a trailing window is that of the last or most recent sample.)

Figure 6. Moving Sample Correlation (Trailing 50 Day Window)
Solid: Lead / Lag = 0, Dashed: Lead / Lag = -1
Date Range: 01 Sep, 2000 to 01 Jun, 2007

I must caution you not to read too much into the very short time scale fluctuations seen in figure 6 above. A sample size of 50 days is on the small side and the associated uncertainty is quite large (plus or minus 0.279 to be precise, just a tad more than the spacing of the grid lines on the vertical axis in graphical terms). For small sample sizes the estimates can also exhibit large one day jumps when "heavy" wild points move in or out of the sampling window. In exchange for this additional fluctuation in the estimates you get more sensitivity to non-stationary behavior. A closer inspection of figure 6 above shows that whatever forces move Gold shares and Gold together do ebb and flow over time.
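A sketch of a trailing moving correlation, assuming Python. The function names and the test series are hypothetical. The series is contrived so that the two inputs move together at first and then against each other, which shows how a short window responds to a change in behavior. The last lines compute the rule of thumb uncertainty of 1.96 divided by the square root of the window length, which comes to roughly 0.277 for 50 samples. That is close to, but not the same as, the figure quoted above, and the rule of thumb is my assumption here rather than the formula behind that figure.

```python
import math

def pearson(x, y):
    """Ordinary sample (Pearson) correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def trailing_correlation(x, y, window):
    """Correlation over the most recent `window` samples ending at each index.

    Each estimate is stored at the index of the last sample in its window.
    Positions without a full window are left as None.
    """
    out = [None] * len(x)
    for end in range(window - 1, len(x)):
        start = end - window + 1
        out[end] = pearson(x[start:end + 1], y[start:end + 1])
    return out

# Hypothetical series: y follows x for 100 samples, then moves opposite to it.
x = [math.sin(1.3 * t) for t in range(200)]
y = [x[t] if t < 100 else -x[t] for t in range(200)]

window = 50
moving = trailing_correlation(x, y, window)
approx_band = 1.96 / math.sqrt(window)   # rule of thumb, an assumption
print(moving[99], moving[124], moving[199], approx_band)
```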

Moving on to the StreetTracks Gold Trust (GLD) price series, the correlation estimates for it against various lead, lags applied to the AMEX Gold BUGS Index are shown below in figure 7.

Figure 7. Correlation of AMEX Gold BUGS Index versus StreetTracks Gold Trust
Left: Normalized Scattergram, Right: Correlation versus Lead / Lag
Date Range: 19 Nov, 2004 to 01 Jun, 2007

An inspection of figure 7 above shows that only the BUGS Index for a lead, lag of zero exhibits positive correlation with the StreetTracks Gold Trust. The absence of significant correlation for other values of lead, lag is not surprising since these two price series are sampled at precisely the same time of day (4 PM New York). This makes the StreetTracks Gold Trust series an attractive choice when estimating the correlation of stocks that trade on the New York exchanges.

********

(Part 2 to follow)

In part 1 of this article I presented a simple transformation for daily closing data that produces a series suitable for the estimation of day to day correlation of two different markets, instruments. The technique was used to estimate the correlation of COMEX HG Copper versus Gold, and the AMEX Gold BUGS Index versus Gold. In part 2 of this article I will present correlation estimates for a variety of markets, instruments versus Gold in tabular form.


Footnotes, References

  1. Gold and Copper - An Astonishing Relationship, by Adrian Douglas, 6 April 2007, Le Metropole Cafe. [www.lemetropolecafe.com/pfv.cfm?pfvID=6015] (subscription required)
  2. Consider the following hypothetical situation. A sum of $1000 is used to purchase 1000 shares of stock A which moves from a price of 1 to 2. Another sum of $1000 is used to purchase 100 shares of stock B which moves from a price of 10 to 11. In both cases, the net change is just one, suggesting they are somehow the same. But in the former case we walk away with a tidy profit of $1000, and in the latter case only $100. Most certainly not the same situation.
  3. Wikipedia: Logarithm, [http://en.wikipedia.org/wiki/Logarithm]
  4. Technical Analysis and Stock Market Profits, by Richard W. Schabacker, 1997, Pearson Professional Limited, ISBN 0-273-63095-4.
  5. When you compute the Pearson or sample correlation statistic using raw price data, the price samples that are farthest away from their sample means get weighted most heavily in the correlation sum as compared to those near their average which introduces an unwanted bias. There are some situations in finance where having an intentional weighting can be meaningful. Many (perhaps most) stock indices are such that the individual components are weighted in the summation by their market capitalization (e.g. the AMEX Gold BUGS Index). This is perfectly fine provided you understand what the index represents, and interpret it appropriately. Cap weighted indices roughly reflect movement of money in terms of dollars in and out of the aggregate pool of all the components, but they provide a biased view of how the individual components are performing in terms of their returns. I presented an unbiased index (with respect to returns) of gold and silver shares using geometric or logarithmic averaging in reference [6].
  6. The Bigger They Are, The Harder They Fall, by John Peterson, 30 October, 2001, Gold-Eagle.com, [www.gold-eagle.com/editorials_01/peterson103001pv.html]
  7. If you compute an arithmetic mean value using N contiguous samples of the transformed data, then the anti-log of that number, you get an average daily return for the N sample period. (An instrument that gave a fixed return of that value every day would provide the same total compounded return over the N sample period). If you calculate some typical numbers, you will quickly realize the mean value is too small to be easily perceived by eyeballing a chart.
  8. As a further aside, the transformation to the logarithm of daily returns can be used to elegantly splice together futures data for testing trading systems, etc. First transform each of the individual contract months separately. Now splice the transformed pieces into a single data series, then invert the transformation to get back a raw price series. The resulting series will not exhibit any unwanted spikes due to changes in time premiums at the contract roll over dates.
  9. Wikipedia: Standard Deviation, [http://en.wikipedia.org/wiki/Standard_deviation]
  10. Wikipedia: Normal distribution, [http://en.wikipedia.org/wiki/Normal_distribution]
  11. Wikipedia: Correlation statistic, [http://en.wikipedia.org/wiki/Correlation]