Backtesting a trading system amounts to simulating its trading rules on historical data. All trading rules depend to some extent on a set of parameters. These parameters can be the lookback periods used for defining technical indicators or the hyperparameters of a complex machine learning model.
It is very important to study how the key statistical indicators, for example the Sharpe ratio, depend on these parameters. Choosing the parameters that maximize the Sharpe ratio on past data is a source of backtest overfitting and often leads to poor performance on live data.
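As a reminder of what is being scanned, here is a minimal sketch of an annualized Sharpe ratio computed from daily returns. It assumes a zero risk-free rate and 252 trading days per year. The helper name annualized_sharpe is hypothetical, and the conventions used by qnt.stats may differ.

```python
import math
import statistics


def annualized_sharpe(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate (illustrative convention)."""
    mean = statistics.fmean(daily_returns)
    std = statistics.stdev(daily_returns)  # sample standard deviation
    if std == 0:
        return float("nan")  # undefined when returns do not vary
    return mean / std * math.sqrt(periods_per_year)


# made-up daily returns, for illustration only
returns = [0.01, -0.005, 0.007, 0.002, -0.003, 0.004]
sharpe = annualized_sharpe(returns)
print(round(sharpe, 2))
```

Because the ratio divides the mean return by its volatility, rescaling all returns by a constant factor leaves it unchanged. A parameter scan therefore compares risk-adjusted performance rather than raw profit.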
In this template we provide a tool for studying the parameter dependence of the statistical indicators used for assessing the quality of a trading system.
We recommend optimizing your strategy in a separate notebook because a parametric scan is a time-consuming task.
Alternatively, you can mark the cells that perform scans with the #DEBUG# tag. When you submit your notebook, the backtesting engine that performs the evaluation on the Quantiacs server will skip these cells.
You can also use the optimizer in your local environment. There you can use more workers and take advantage of parallelization to speed up the grid scan.
%%javascript
IPython.OutputArea.prototype._should_scroll = function(lines) { return false; }
// disable widget scrolling
import qnt.data as qndata
import qnt.ta as qnta
import qnt.output as qnout
import qnt.stats as qns
import qnt.log as qnlog
import qnt.optimizer as qnop
import qnt.backtester as qnbt
import xarray as xr
To define the strategy we use a single-pass implementation, in which all data are accessed at once. This implementation is very fast and speeds up the parametric scan.
Before submission, make sure that your strategy is not implicitly forward looking; see how to prevent forward looking.
The strategy goes long only when the rate of change over the last roc_period trading days (in this case 10) of the linear-weighted moving average over the last wma_period trading days (in this case 20) is positive.
def single_pass_strategy(data, wma_period=20, roc_period=10):
    wma = qnta.lwma(data.sel(field='close'), wma_period)
    sroc = qnta.roc(wma, roc_period)
    weights = xr.where(sroc > 0, 1, 0)
    weights = weights / len(data.asset)  # equal allocation per asset; the sum is 1 only when every asset is long
    with qnlog.Settings(info=False, err=False):  # suppress log messages
        weights = qnout.clean(weights, data)  # check for problems
    return weights
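To make the two indicators used above concrete, here is a plain-Python sketch of a linear-weighted moving average and a rate of change on a short, made-up price list. The helpers lwma, roc and long_signal are illustrative assumptions. The exact conventions of qnta.lwma and qnta.roc (missing-value handling, scaling of the rate of change) may differ.

```python
def lwma(values, period):
    """Linear-weighted moving average: the newest value gets weight `period`, the oldest weight 1."""
    weights = range(1, period + 1)
    total = sum(weights)
    out = [None] * len(values)  # None while there is not enough history
    for i in range(period - 1, len(values)):
        window = values[i - period + 1 : i + 1]
        out[i] = sum(w * v for w, v in zip(weights, window)) / total
    return out


def roc(values, period):
    """Rate of change in percent over `period` steps; None during warmup."""
    out = [None] * len(values)
    for i in range(period, len(values)):
        prev, cur = values[i - period], values[i]
        if prev is not None and cur is not None and prev != 0:
            out[i] = 100.0 * (cur - prev) / prev
    return out


def long_signal(closes, wma_period, roc_period):
    """1 when the smoothed price has risen over the last `roc_period` steps, else 0."""
    change = roc(lwma(closes, wma_period), roc_period)
    return [1 if c is not None and c > 0 else 0 for c in change]


closes = [10, 11, 12, 13, 14, 15, 14, 13, 12, 11]  # made-up prices
signal = long_signal(closes, wma_period=3, roc_period=2)
print(signal)  # [0, 0, 0, 0, 1, 1, 1, 0, 0, 0]
```

The first four entries are zero because both indicators need warmup data. This is why the notebook loads data starting before the evaluation period.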
Let us first check the performance of the strategy with the chosen parameters:
#DEBUG#
# evaluator will remove all cells with this tag before evaluation
data = qndata.futures.load_data(min_date='2004-01-01') # indicators need warmup, so prepend data
single_pass_output = single_pass_strategy(data)
single_pass_stat = qns.calc_stat(data, single_pass_output.sel(time=slice('2006-01-01', None)))
display(single_pass_stat.to_pandas().tail())
A parametric scan over pre-defined ranges of wma_period and roc_period can be performed with the Quantiacs optimizer function:
#DEBUG#
# evaluator will remove all cells with this tag before evaluation
data = qndata.futures.load_data(min_date='2004-01-01') # indicators need warmup, so prepend data
result = qnop.optimize_strategy(
    data,
    single_pass_strategy,
    qnop.full_range_args_generator(
        wma_period=range(10, 150, 5),  # start, stop (exclusive), step
        roc_period=range(5, 100, 5)  # start, stop (exclusive), step
    ),
    workers=1  # you can set more workers when you run this code on your local PC to speed it up
)
qnop.build_plot(result) # interactive chart in the notebook
print("---")
print("Best iteration:")
display(result['best_iteration']) # as a reference, display the iteration with the highest Sharpe ratio
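To get a feeling for the size of the scan above, the following sketch enumerates the same parameter grid with the standard library. The helper full_grid is hypothetical. It assumes that a full-range scan visits every combination of the given ranges, and it is not the implementation of qnop.full_range_args_generator.

```python
import itertools


def full_grid(**param_ranges):
    """Yield one dict of keyword arguments per point of the Cartesian product."""
    names = list(param_ranges)
    for combo in itertools.product(*(param_ranges[n] for n in names)):
        yield dict(zip(names, combo))


grid = list(full_grid(
    wma_period=range(10, 150, 5),  # 10, 15, ..., 145 (the stop value is excluded)
    roc_period=range(5, 100, 5),   # 5, 10, ..., 95
))
print(len(grid))  # 28 * 19 = 532 parameter combinations
print(grid[0], grid[-1])
```

Each combination requires one full run of the strategy, which is why the scan is slow with a single worker and benefits from parallelization.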
The arguments of the iteration with the highest Sharpe ratio can then be set manually in the final strategy, or read from result['best_iteration']['args']. Note that cells with the tag #DEBUG# are skipped during evaluation, so the scan result is not available in the submitted notebook.
The final multi-pass backtest of the optimized strategy is very simple: it amounts to calling the single-pass implementation with the desired parameters and keeping the weights for the last time step:
best_args = dict(wma_period=20, roc_period=80)  # highest Sharpe ratio iteration (not recommended, overfitting!)

def best_strategy(data):
    return single_pass_strategy(data, **best_args).isel(time=-1)

weights = qnbt.backtest(
    competition_type="futures",
    lookback_period=2 * 365,
    start_date='2006-01-01',
    strategy=best_strategy,
    analyze=True,
    build_plots=True
)
import qnt.data as qndata
import qnt.ta as qnta
import qnt.log as qnlog
import qnt.backtester as qnbt
import qnt.output as qnout
import xarray as xr
best_args = dict(wma_period=20, roc_period=80) # highest Sharpe ratio iteration (not recommended, overfit!)
def single_pass_strategy(data, wma_period=20, roc_period=10):
    wma = qnta.lwma(data.sel(field='close'), wma_period)
    sroc = qnta.roc(wma, roc_period)
    weights = xr.where(sroc > 0, 1, 0)
    weights = weights / len(data.asset)
    with qnlog.Settings(info=False, err=False):  # suppress log messages
        weights = qnout.clean(weights, data)  # check for problems
    return weights

def best_strategy(data):
    return single_pass_strategy(data, **best_args).isel(time=-1)

weights = qnbt.backtest(
    competition_type="futures",
    lookback_period=2 * 365,
    start_date='2006-01-01',
    strategy=best_strategy,
    analyze=True,
    build_plots=True
)
You can use this code snippet to check for forward looking. A large difference between the two Sharpe ratios is a sign that the single-pass implementation used for the parametric scan is forward looking.
#DEBUG#
# evaluator will remove all cells with this tag before evaluation
# single pass
data = qndata.futures.load_data(min_date='2004-01-01') # warmup period for indicators, prepend data
single_pass_output = single_pass_strategy(data)
single_pass_stat = qns.calc_stat(data, single_pass_output.sel(time=slice('2006-01-01', None)))
# multi pass
multi_pass_output = qnbt.backtest(
    competition_type="futures",
    lookback_period=2*365,
    start_date='2006-01-01',
    strategy=single_pass_strategy,
    analyze=False,
)
multi_pass_stat = qns.calc_stat(data, multi_pass_output.sel(time=slice('2006-01-01', None)))
print('''
---
Compare multi-pass and single pass performance to be sure that there is no forward looking. Small differences can arise because of numerical accuracy issues and differences in the treatment of missing values.
---
''')
print("Single-pass result:")
display(single_pass_stat.to_pandas().tail())
print("Multi-pass result:")
display(multi_pass_stat.to_pandas().tail())
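The idea behind this comparison can be sketched in plain Python on a toy price list. All names below (single_pass, multi_pass, causal, peeking) are hypothetical and only illustrate the principle; they do not describe how qnbt.backtest works internally. A multi-pass run feeds the strategy only the data available up to each step, so a strategy that peeks into the future gives different results in the two modes.

```python
def single_pass(prices, strategy):
    """Run the strategy once on the full history."""
    return strategy(prices)


def multi_pass(prices, strategy, start):
    """Re-run the strategy at every step on data up to that step; keep only the last value."""
    return [strategy(prices[: t + 1])[-1] for t in range(start, len(prices))]


def causal(prices):
    # long if today's price is above yesterday's: uses past and present data only
    return [0] + [1 if prices[i] > prices[i - 1] else 0 for i in range(1, len(prices))]


def peeking(prices):
    # long if tomorrow's price is above today's: forward looking
    return [1 if prices[i + 1] > prices[i] else 0 for i in range(len(prices) - 1)] + [0]


prices = [10, 11, 10, 12, 13, 12, 14]  # made-up prices
start = 2

causal_match = single_pass(prices, causal)[start:] == multi_pass(prices, causal, start)
peeking_match = single_pass(prices, peeking)[start:] == multi_pass(prices, peeking, start)
print(causal_match, peeking_match)  # True False
```

The causal strategy produces identical weights in both modes, while the peeking one does not. This mismatch is what the difference between the single-pass and multi-pass Sharpe ratios reveals.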