pair trading strategy in r
Background
Those of you who have been following my blog posts for the last 6 months will know that I have taken part in the Executive Programme in Algorithmic Trading offered by QuantInsti.
It's been a journey and this article serves as a report on my final project focusing on statistical arbitrage, coded in R. This article is a combination of my class notes and my source code.
I uploaded everything to GitHub in order to welcome readers to contribute, improve, use, or work on this project. It will also form part of my Open Source Hedge Fund project on my blog QuantsPortal.
I would like to say a special thank you to the team at QuantInsti. Thank you for all the revisions of my final project, for going out of your way to help me learn, and the very high level of client services.
History of Statistical Arbitrage
First developed and used in the mid-1980s by Nunzio Tartaglia's quantitative group at Morgan Stanley.
- Pair Trading is a "contrarian strategy" designed to harness the mean-reverting behavior of the pair ratio
- David Shaw, founder of D.E. Shaw & Co, left Morgan Stanley and started his own "Quant" trading firm in the late 1980s dealing mainly in pair trading
What is Pair Trading?
Statistical arbitrage trading, or pairs trading as it is commonly known, is defined as trading one financial instrument or a basket of financial instruments against another, in most cases to create a value neutral basket.
It is the idea that a co-integrated pair is mean reverting in nature. There is a spread between the instruments and the further it deviates from its mean, the greater the probability of a reversal.
Note however that statistical arbitrage is not a risk free strategy. Say for example that you have entered positions for a pair and then the spread picks up a trend rather than mean reverting.
The Concept
Step 1: Find 2 related securities
Find two securities that are in the same sector / industry. They should have similar market capitalisation and average volume traded.
An example of this is Anglo Gold and Harmony Gold.
Step 2: Calculate the spread
In the code to follow I used the pair ratio to indicate the spread. It is simply the price of asset A / price of asset B.
Step 3: Calculate the mean, standard deviation, and z-score of the pair ratio / spread.
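Steps 2 and 3 can be sketched in a few lines of base R. This is a minimal illustration, not the project's code: the prices are made up, `rolling_zscore` is a hypothetical helper, and I am assuming the rolling window includes the current observation.

```r
# Hypothetical closing prices for two related shares (made-up numbers).
price_a <- c(100, 102, 101, 105, 107, 106, 110, 108, 111, 115)
price_b <- c( 50,  51,  50,  52,  52,  53,  53,  54,  54,  55)

# Step 2: the spread is expressed as the pair ratio.
pair_ratio <- price_a / price_b

# Step 3: rolling mean, standard deviation and z-score over the last
# n_obs observations (assumption: the current observation is included).
rolling_zscore <- function(x, n_obs) {
  z <- rep(NA_real_, length(x))
  for (i in seq_along(x)) {
    if (i >= n_obs) {
      win <- x[(i - n_obs + 1):i]
      z[i] <- (x[i] - mean(win)) / sd(win)
    }
  }
  z
}

z_score <- rolling_zscore(pair_ratio, n_obs = 5)
print(round(z_score, 2))
```

The first `n_obs - 1` values are NA because there is not yet enough history to fill the window.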
Step 4: Test for co-integration
In the code to follow I use the Augmented Dickey Fuller Test (ADF Test) to test for co-integration. I set up three tests, each with a different number of observations (120, 90, 60). All three tests have to reject the null hypothesis that the pair is not co-integrated.
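To make the three-window idea concrete, here is a rough base R sketch. It is not the project's implementation: `adf_statistic` and `passes_adf` are hypothetical helpers, the test is applied to the spread itself (an assumption), the regression uses a constant and one lagged difference, and -2.86 is the approximate 5% Dickey-Fuller critical value for that specification. Packaged implementations exist (for example `adf.test` in the tseries package) and would normally be preferred.

```r
# Simplified ADF regression: diff(y)_t = a + g * y_(t-1) + lagged diffs + e_t.
# The test statistic is the t value of g; very negative values reject a unit root.
adf_statistic <- function(y, n_lags = 1) {
  stopifnot(n_lags >= 1)
  dy <- diff(y)
  lagged <- embed(dy, n_lags + 1)          # column 1 = dy_t, others = its lags
  dy_t <- lagged[, 1]
  lag_terms <- lagged[, -1, drop = FALSE]
  y_lag1 <- y[(n_lags + 1):(length(y) - 1)] # level of the series one step back
  fit <- lm(dy_t ~ y_lag1 + lag_terms)
  summary(fit)$coefficients["y_lag1", "t value"]
}

# All three look-back windows must reject the null hypothesis.
passes_adf <- function(spread, windows = c(120, 90, 60), critical_value = -2.86) {
  all(vapply(windows, function(n) {
    adf_statistic(tail(spread, n)) < critical_value
  }, logical(1)))
}

# Simulated mean-reverting spread (illustrative data only).
set.seed(1)
stationary_spread <- 2 + rnorm(200, sd = 0.05)
stat_full <- adf_statistic(stationary_spread)
cointegrated <- passes_adf(stationary_spread)
print(stat_full)
print(cointegrated)
```

Requiring all three windows to pass is strict by design, which matches the later observation that few pairs meet the criteria for long.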
Step 5: Generate trading signals
Trading signals are based on the z-score, given they pass the test for co-integration. In my project I used a z-score of 1 as I noticed that other algorithms that I was competing with were using very low parameters. (I would have preferred a z-score of 2, as it better matches the literature, however it is less profitable)
Step 6: Process transactions based on signals
Step 7: Reporting
Article write up for my project
Import packages and set directory
The first step is always to import the packages necessary.
This strategy will be run on shares listed on the Johannesburg Stock Exchange (JSE); because of this I won't be using the quantmod package to pull data from Yahoo finance. Instead I have already gotten and cleaned the data that I stored in a SQL database and moved to csv files on the Desktop. (I did this so that readers could import the CSV files instead of needing my SQL database. I didn't use quantmod because I wanted to show that I could build the backtester from first principles)
I added all the pairs used in the strategy to a folder which I now set to be the working directory.
Functions that will be called from within other functions (No user interaction)
Next: Create all the functions that will be needed. The functions below will be called from within other functions, so you don't need to worry about the arguments.
AddColumns
The AddColumns function is used to add columns to the dataframe that will be needed to store variables.
PrepareData
The PrepareData function calculates the pair ratio and the log prices of the pair. It also calls the AddColumns function within it.
GenerateRowValue
The GenerateRowValue function calculates the mean, standard deviation and the z-score for a given row in the dataframe.
GenerateSignal
The GenerateSignal function creates a long, short, or close signal based on the z-score. You can manually change the z-score. I have set it to 1 and -1 for entry signals and any z-score between 0.5 and -0.5 will create a close/exit signal.
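The thresholds above can be sketched as a small function. This is an illustration under assumptions, not the project's GenerateSignal: `generate_signal` is a hypothetical name, I assume a high z-score means shorting the ratio (sell A, buy B) and a low z-score means going long the ratio, and I assume z-scores between the exit and entry bands leave any open position unchanged ("hold").

```r
# Hypothetical helper illustrating the entry (+/-1) and exit (+/-0.5) bands.
generate_signal <- function(z, entry = 1, exit = 0.5) {
  if (is.na(z)) return(NA_character_)
  if (z >= entry) return("short")     # ratio looks rich: sell A, buy B (assumption)
  if (z <= -entry) return("long")     # ratio looks cheap: buy A, sell B (assumption)
  if (abs(z) <= exit) return("close") # ratio is back near its mean
  "hold"                              # between the bands: keep the current state
}

signals <- vapply(c(-1.4, -0.7, 0.2, 0.8, 1.3), generate_signal, character(1))
print(signals)
```

Changing `entry` to 2 gives the more conservative setting mentioned in Step 5.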
GenerateTransactions
The GenerateTransactions function is responsible for setting the entry and exit prices for the respective long and short positions needed to create a pair.
Note: QuantInsti taught us a very specific way of backtesting a trading strategy. They used excel to teach strategies and when I coded this strategy I used a large part of the excel methodology.
Going forward however I would explore other ways of storing variables. One of the great things about this method is that you can pull the entire dataframe and analyse why a trade was successful and all the details pertaining to that.
GetReturnsDaily
GetReturnsDaily calculates the daily returns on each position and then calculates the total returns and adds slippage.
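As a rough sketch of that calculation (not the project's GetReturnsDaily): the helper name `daily_pair_returns` is hypothetical, I assume equal capital in the long and short legs, and I assume slippage is charged in basis points only on the days a trade is made.

```r
# Hypothetical helper: daily return of a pair that is long one share and
# short the other, with slippage (in basis points) deducted on trade days.
daily_pair_returns <- function(price_long, price_short,
                               slippage_bps = 0, trade_days = integer(0)) {
  ret_long  <- c(0, diff(price_long) / head(price_long, -1))
  ret_short <- -c(0, diff(price_short) / head(price_short, -1))
  total <- (ret_long + ret_short) / 2  # assumption: equal weight per leg
  total[trade_days] <- total[trade_days] - slippage_bps / 10000
  total
}

# Made-up prices: the long leg gains 10%, the short leg is flat.
pair_ret <- daily_pair_returns(c(100, 110), c(50, 50),
                               slippage_bps = 10, trade_days = 2)
print(pair_ret)
```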
GenerateReports
The next two functions are used to generate reports. A report includes the following:
Charting:
1. An equity curve
2. Drawdown curve
3. Daily returns bar graph
Statistics:
1. Annual Returns
2. Annualized Sharpe Ratio
3. Maximum Drawdown
Table:
1. Top 5 drawdowns and their length
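The three statistics can be computed from a vector of daily returns in base R. This is a minimal sketch, not the project's report code: `performance_stats` is a hypothetical name, and I assume 252 trading days per year, a zero risk-free rate for the Sharpe ratio, and starting capital of 1.

```r
# Hypothetical helper computing the report statistics from daily returns.
performance_stats <- function(daily_returns, periods_per_year = 252) {
  n <- length(daily_returns)
  equity <- c(1, cumprod(1 + daily_returns)) # equity curve, starting at 1
  drawdown <- equity / cummax(equity) - 1    # distance below the running peak
  list(
    annual_return = equity[n + 1]^(periods_per_year / n) - 1,
    sharpe = sqrt(periods_per_year) * mean(daily_returns) / sd(daily_returns),
    max_drawdown = min(drawdown)
  )
}

# Made-up returns: up 10%, down 50%, up 20%.
perf <- performance_stats(c(0.10, -0.50, 0.20))
print(perf$max_drawdown)
```

The `equity` and `drawdown` vectors inside the function are the same series one would plot for the equity and drawdown curves.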
Note: If you have some extra time then you can further break this function down into smaller functions in order to reduce the lines of code and improve usability. Less code = Less Bugs
Functions that the user will pass parameters to
The next two functions are the only functions that the user should fiddle with.
BacktestPair
BacktestPair is used when you want to run a backtest on a trading pair (the pair is passed in via the csv file)
Function arguments:
- pairData = the csv file data
- mean = the number of observations used to calculate the mean of the spread.
- slippage = the amount of basis points that act as brokerage as well as slippage
- adfTest = a boolean value - if the backtest should test for co-integration
- criticalValue = Critical Value used in the ADF Test to test for co-integration
- generateReport = a boolean value - if a report must be generated
BacktestPortfolio
BacktestPortfolio accepts a vector of csv files and then generates an equally weighted portfolio.
Function arguments:
- names = an atomic vector of csv file names, example: c('DsyLib.csv', 'OldSanlam.csv')
- mean = the number of observations used to calculate the mean of the spread.
- leverage = how much leverage you want to apply to the portfolio
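The equal weighting and leverage can be illustrated with a short base R sketch. It is not the project's BacktestPortfolio: `portfolio_returns` is a hypothetical helper, the pair returns are made up, and I assume leverage simply scales the daily return with no financing cost.

```r
# Hypothetical helper: equally weighted portfolio of pair returns.
# Each column of pair_returns holds the daily returns of one pair.
portfolio_returns <- function(pair_returns, leverage = 1) {
  leverage * rowMeans(pair_returns) # equal weight = simple average per day
}

pair_returns <- cbind(
  pair_1 = c(0.01, -0.02,  0.00),
  pair_2 = c(0.03,  0.00, -0.01)
)
port <- portfolio_returns(pair_returns, leverage = 3)
print(port)
```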
Running Backtests
Now we can start testing strategies using our code.
Pure arbitrage on the JSE
When starting this project the main focus was on using statistical arbitrage to find pairs that were co-integrated and then to trade those. However, I very quickly realised that the same code could be used to trade shares that had both their primary listing and their secondary listing on the same exchange.
If both listings are found on the same exchange, it opens the door for a pure arbitrage strategy because both listings refer to the same asset. Therefore you don't need to test for co-integration.
There are two very obvious examples on the JSE.
First Example Investec:
Primary = Investec Ltd : Secondary = Investec PLC
Investec In-Sample Test (2005-01-01 - 2012-11-23)
Test the following parameters
- The Investec ltd / plc pair
- mean = 35
- Set adfTest = F (Don't test for co-integration)
- Leverage of x3
Statistical Arbitrage on the JSE
Next we will look at a pair trading strategy.
Typically a pair consists of 2 shares that:
- Share a market sector
- Have a similar market cap
- Have a similar business model and clients
- Are co-integrated
In all of the portfolios below I use 3x leverage
Conclusion:
At the end of all my testing, and trust me, there is much more testing I did than what is in this report, I came to the conclusion that the Pure Arbitrage Strategy has great hope of being used as a strategy with real money, but the Pair Trading Strategy on portfolios of stocks in a given sector is strained and not likely to be used in production in its current form.
There are many things that I think could be added to improve the performance. Going forward I will investigate using Kalman filters.
More on the Pure Arbitrage Trading Strategy:
I have only found two shares that have dual listings on the same exchange; this means that we can't allocate large sums of money to the strategy as it will have a high market impact. However, we could use multiple exchanges and increase the number of shares used.
More on the Pair Trading Strategy:
- The number of observations used in the ADF Tests is largely to blame. The problem is that a test for co-integration has to be done in order to make a claim for statistical arbitrage, however by using 120, 90, and 60 as parameters to the three tests, it is very difficult to find pairs that match the criteria and that will continue in that form for the near future. (Kalman filtering may be useful here)
- I haven't spent a lot of time changing the different parameters like the number of observations in the mean calculation. (This requires further exploration)
- From the above sector portfolios, we can see that the early years are very profitable but the further down the timeline we go, the lower returns get. I have spoken to a few people in the industry as well as my friends doing stat arb projects at the University of Cape Town, and the local lore has it that in 2009 Goldman switched on their stat arb package with regard to the JSE listed securities.
- The same is noticed with other portfolios that I didn't include in this report but that are in the R Code file.
- I believe that this is due to large institutions using the same bread and butter strategy. You will note (if you spend enough time testing all the strategies) that in 2009 there seems to be a sudden shift in the data to lower returns.
- I feel that the end of day data I am using is limiting me and if I were to test the strategy on intraday data then profits would be higher. (I ran one test on intraday data on Mondi and the results were much higher, but I am still to test it on sector portfolios)
- This is one of the simpler statistical arbitrage strategies and I believe that if we were to improve on the way we calculate the spread and change some of the entry and exit rules, the strategy would become more profitable.
If you made it to the end of this article, I thank you and hope that it added some value. This is actually a much better read from my Github account.
Github repository: https://github.com/Jackal08/QuantInsti-Final-Project-Statistical-Arbitrage
Source: https://www.linkedin.com/pulse/statistical-arbitrage-strategy-r-jacques-joubert