Arbitrage on cross-section returns in Brazilian stock market

A lot of research has already been done on how to predict stock market returns. On a daily time frame, neither trend-following nor mean-reversion strategies applied to single stocks sustain a stable Sharpe ratio over time, which leads us to believe that even if the random walk hypothesis is wrong, we still can’t find a model that precisely describes how prices move. But, as discussed in my previous article (Technical Analysis for intraday stocks trading? FORGET IT!), the relationship between the stock index future and the individual stocks allows us to gather some Alpha.

The trading idea presented here is quite simple and was first proposed by Andrew W. Lo and A. Craig MacKinlay in their paper When Are Contrarian Profits Due to Stock Market Overreaction? (1990). The same pattern was also documented for the French stock market in the paper Contrarian Strategies and Cross-Autocorrelations in Stock Returns (1998).

The idea

The starting point of this trading strategy is that stocks do overreact. To test this, it’s natural to look for autocorrelation in stock returns. For those who don’t know, correlation is a number that measures the strength of the linear statistical relationship between two series. Let X and Y be two random variables; the correlation between them is:

\rho _{X,Y}=\mathrm {corr} (X,Y)={\mathrm {cov} (X,Y) \over \sigma _{X}\sigma _{Y}}={E[(X-\mu _{X})(Y-\mu _{Y})] \over \sigma _{X}\sigma _{Y}},

Where \boldsymbol{\sigma_{X}} and \boldsymbol{\sigma_{Y}} are the standard deviations of the series and \boldsymbol{\mu_{X}} and \boldsymbol{\mu_{Y}} are their means, all estimated from the sample in practice. Autocorrelation applies the same idea to X and its own past values instead of another random variable.

So, the obvious starting point for testing stock overreaction is to evaluate the autocorrelation between today’s return and past returns.
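As a minimal sketch of that definition, the lag-k autocorrelation is just the correlation between a series and a copy of itself shifted by k observations. The function name autocorr and the toy series below are illustrative assumptions, not part of the original study.

```python
import numpy as np

def autocorr(x, lag=1):
    """Pearson correlation between x[t] and x[t - lag]."""
    x = np.asarray(x, dtype=float)
    if lag < 1 or lag >= len(x) - 1:
        raise ValueError("lag must be between 1 and len(x) - 2")
    # Pair every observation with the one `lag` steps before it
    return np.corrcoef(x[lag:], x[:-lag])[0, 1]

# Toy series that flips sign every day: each move is followed by the opposite move
zigzag = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
print(autocorr(zigzag, lag=1))  # close to -1: perfect reversal
print(autocorr(zigzag, lag=2))  # close to +1: same sign two days apart
```

A strongly negative value at lag 1 is the signature of overreaction: a move in one direction tends to be followed by a move in the opposite direction. Library estimators such as the acf function in statsmodels use the full-sample mean and variance, so their numbers can differ slightly from this pairwise version on short series.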

The Dataset

In this article, I’m going to test this hypothesis for Brazilian stocks. The data used here came directly from the BM&F Bovespa FTP. This database contains more than 2 years of tick-by-tick data, but for this study it was resampled to daily frequency. The data span from 25/11/2014 to 20/01/2017.

The data was adjusted for dividends, splits, reverse splits (inplits) and subscription rights according to Bloomberg’s database.

Results

As mentioned before, the first basic idea for testing for overreaction is to check whether there is any autocorrelation in the return series of individual stocks. The language I’ll be using for this is Python.

Let’s retrieve the data first. I’ve put all my data in a MongoDB server. Access to it is done via an excellent library called Arctic. For those who don’t know the Arctic project, you can check it here.

from arctic import Arctic

store = Arctic('localhost:27017') # Accessing the server
lib = store['CSSA'] # Selecting the library
equities = lib.read('data').data # Reading the data

Now, equities is a Pandas DataFrame containing the 97 most liquid stocks in the Brazilian equity market. We can also do this using the Yahoo/Google Finance databases; you’ll find the same results.

import numpy as np

equities['data'] = equities.index.date # Group days
close = equities.groupby(['data']).last().copy() # Get the last information per day, since equities was retrieved from an intraday database

ret_eq = np.log(close) - np.log(close.shift(1)) # Create the series of daily close-to-close log returns

From now on, ret_eq will be our return series. It’s a 535 x 97 DataFrame (535 days for 97 stocks). The autocorrelation is simply computed with the acf function, contained in the statsmodels.tsa.stattools module.

import statsmodels.tsa.stattools

list_returns = [ret_eq.iloc[1:, i] for i in range(0, ret_eq.shape[1])] # Create a list with all the individual series (the first row has no return)
acf_individual = map(statsmodels.tsa.stattools.acf, list_returns) # Autocorrelation function of each stock
acf_individual = np.array(list(acf_individual)) # Converting to numpy.ndarray

In this test, we’re going to use an 80% confidence interval. It’s not a high confidence level, but the main purpose here is to find Alpha, not to prove that our models have very high statistical significance. When it comes to the stock market, few things have that, and most of the Alpha tends to exist in situations where the confidence level is lower than the usual 95%.

When we average our individual results and plot them, what we get is not so exciting:
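To make the 80% threshold concrete, here is a rough sketch of the usual large-sample band for a single autocorrelation estimate. It assumes the returns are white noise under the null hypothesis, so the estimate is approximately normal with standard deviation 1/sqrt(N). The helper name acf_band is hypothetical, and this approximation is an assumption for illustration; it is not necessarily the exact interval used for the plots.

```python
from math import sqrt
from statistics import NormalDist

def acf_band(n_obs, confidence=0.80):
    """Half-width of the two-sided band for a sample autocorrelation under white noise."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # two-sided critical value
    return z / sqrt(n_obs)

n_days = 534  # 535 daily rows minus the first one, which has no return
print(round(acf_band(n_days, 0.80), 4))  # band at 80% confidence
print(round(acf_band(n_days, 0.95), 4))  # wider band at the usual 95%
```

An autocorrelation whose absolute value stays inside the band cannot be distinguished from zero at that confidence level, so lowering the confidence from 95% to 80% narrows the band and lets weaker effects show up.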

[Figure: acf_individual]

This graph simply shows the mean of the autocorrelation values between each return series and itself lagged by a number of days. As you can see, no lag gives us any statistical confidence, even at 80%. So, can we conclude that stocks do not overreact? NO.

As mentioned in my previous post, stocks have a lot of cross-correlation. Why don’t we try to measure the overreaction using this concept? The idea now is to measure whether there is some kind of overreaction when a stock outperforms (or underperforms) the mean return of the market (instead of just seeing whether it went up or down, like before).

To do this, I used the same methodology as Lo and MacKinlay. Instead of using the index to track market performance, they simply used the equally-weighted return of all stocks.

rel_ret = ret_eq.sub(ret_eq.mean(axis=1), axis=0) # Return of each stock relative to the equally-weighted market return
list_returns = [rel_ret.iloc[1:, i] for i in range(0, rel_ret.shape[1])] # Create a list with all the relative return series
acf_group = map(statsmodels.tsa.stattools.acf, list_returns)
acf_group = np.array(list(acf_group)) # Converting to numpy.ndarray

Quite simple.

Now, using the same calculation as before, our results change a little:

[Figure: acf_group]

Now we can see that there is some negative autocorrelation between the relative return series and itself lagged by 1 day. That means that on most days when a stock outperforms the market, it tends to underperform the next day.

If we simply sell the outperformers at the market close and buy the underperformers, allocating the same capital to each, we get this result curve:

[Figure: result]
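The rule described above can be sketched roughly as follows. This is only an illustration on synthetic data, not the backtest behind the curve: the function name contrarian_pnl, the random returns and the choice of splitting one unit of gross capital equally across all stocks are assumptions, and no costs are included.

```python
import numpy as np
import pandas as pd

def contrarian_pnl(returns):
    """Daily return of an equal-capital contrarian portfolio.

    returns: DataFrame of daily returns (rows = days, columns = stocks).
    """
    market = returns.mean(axis=1)           # equally-weighted market return
    relative = returns.sub(market, axis=0)  # out/underperformance vs the market
    # Short today's outperformers, buy today's underperformers,
    # same capital in every stock (gross exposure of 1).
    weights = -np.sign(relative) / returns.shape[1]
    # Positions taken at one day's close earn the next day's return
    return (weights.shift(1) * returns).sum(axis=1).iloc[1:]

# Synthetic example: 250 days, 10 hypothetical stocks of pure noise
rng = np.random.default_rng(0)
fake_returns = pd.DataFrame(rng.normal(0.0, 0.02, size=(250, 10)),
                            columns=[f"STOCK{i}" for i in range(10)])
pnl = contrarian_pnl(fake_returns)
equity_curve = pnl.cumsum()  # cumulative (non-compounded) result curve
print(equity_curve.tail())
```

On pure noise the curve should wander around zero; any persistent upward drift on real data would come from the negative lag-1 autocorrelation of the relative returns. With the DataFrame from this article the call would presumably be contrarian_pnl(ret_eq.iloc[1:]).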

Easy, huh?

Obviously, it’s not that simple. No costs were included in this backtest. When dealing with stocks that’s critical, which is why I won’t measure the performance of this trading strategy yet: it is still a rough stone. There’s a lot to mold. We can test other, more intelligent types of position sizing, filters, stops… We can test the same idea on higher and lower frequencies. And that’s what I intend to do, but in the next articles.

The idea here is to show that there’s still Alpha around, even with pretty old ideas.

12 thoughts on “Arbitrage on cross-section returns in Brazilian stock market”

  1. i guess i would try and split the results into today's underperformers outperforming tomorrow vs today's outperformers underperforming tomorrow (i.e. split the strategy into a long only and a short only strategy rather than a long short strategy)

    because, sometimes, shorting stocks is hard. and constructing an equal weighted rather than cap weighted index as the baseline would tend to skew your shorts to illiquid stocks that are hard to borrow to effectuate the short sale

      1. Hello, TC. First of all, thanks for your comment. In the second part of this article, I’ll study the performance of this strategy applied on an intraday time frame, aiming to avoid the need to borrow stocks and to escape the overnight risk. The first results seem promising.

  2. A person essentially assist to make significantly articles I’d state. This is the first time I frequented your website page and up to now? I surprised with the analysis you made to create this particular put up extraordinary. Fantastic job!

  3. Hi fantastic blog! Does running a blog such as this take a massive amount work? I’ve virtually no understanding of coding however I was hoping to start my own blog in the near future. Anyhow, if you have any ideas or tips for new blog owners please share. I understand this is off topic but I simply needed to ask. Thank you!

    1. Thanks for the praise! Well, I think the hardest part is finding interesting content. There are a lot of blogs around covering the basic concepts of Machine Learning and quantitative trading, but when it comes to presenting something really new, there aren’t so many. This article took me a whole day to create the graphs, the text and the code. But the good part is that when you have to explain something to someone, you also learn.

  4. Howdy just wanted to give you a brief heads up and let you know a few of the images aren’t loading properly. I’m not sure why but I think its a linking issue. I’ve tried it in two different internet browsers and both show the same results.
