Trading System Optimization

Backtesting a trading system amounts to simulating its trading rules on historical data. All trading rules depend to some extent on a set of parameters. These parameters can be the lookback periods used for defining technical indicators or the hyperparameters of a complex machine learning model.

It is very important to study how the key statistical indicators, for example the Sharpe ratio, depend on these parameters. Choosing the parameters that maximize the Sharpe ratio in a simulation on past data is a source of backtest overfitting and typically leads to poor performance on live data.
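The effect of picking the best in-sample Sharpe ratio can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not part of the Quantiacs toolbox: the "strategies" are hypothetical zero-mean random return series with no edge at all, and the Sharpe ratio uses the common definition (mean over standard deviation, annualized with 252 trading days, risk-free rate taken as zero).

```python
import math
import random
import statistics


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns (risk-free rate assumed 0)."""
    mean = statistics.fmean(returns)
    std = statistics.stdev(returns)
    return mean / std * math.sqrt(periods_per_year)


def random_returns(rng, n_days, daily_vol=0.01):
    """Zero-mean Gaussian daily returns: a 'strategy' with no edge by construction."""
    return [rng.gauss(0.0, daily_vol) for _ in range(n_days)]


rng = random.Random(0)
n_candidates, n_days = 200, 500

# Every candidate gets an in-sample history and an independent out-of-sample history.
candidates = [
    (random_returns(rng, n_days), random_returns(rng, n_days))
    for _ in range(n_candidates)
]

in_sample = [sharpe_ratio(ins) for ins, _ in candidates]

# Select the candidate with the highest in-sample Sharpe ratio, as a naive optimizer would.
best = max(range(n_candidates), key=in_sample.__getitem__)
best_in = in_sample[best]
best_out = sharpe_ratio(candidates[best][1])

print(f"best in-sample Sharpe ratio:       {best_in:.2f}")
print(f"same candidate, out-of-sample:     {best_out:.2f}")
```

Although no candidate has any real edge, the maximum over many trials looks attractive in-sample. The out-of-sample figure for the same candidate is just another draw from a distribution centered on zero.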

In this template we provide a tool for studying the parameter dependence of the statistical indicators used for assessing the quality of a trading system.

We recommend optimizing your strategy in a separate notebook because a parametric scan is a time-consuming task.

Alternatively, you can mark the cells that perform scans with the #DEBUG# tag. When you submit your notebook, the backtesting engine that performs the evaluation on the Quantiacs server will skip these cells.

You can also use the optimizer in a local environment on your own machine, where you can use more workers and take advantage of parallelization to speed up the grid scan.

%%javascript
IPython.OutputArea.prototype._should_scroll = function(lines) { return false; }
// disable widget scrolling

import qnt.data as qndata
import qnt.ta as qnta
import qnt.output as qnout
import qnt.stats as qns
import qnt.log as qnlog
import qnt.optimizer as qnop
import qnt.backtester as qnbt

import xarray as xr

For defining the strategy we use a single-pass implementation where all data are accessed at once. This implementation is very fast and will speed up the parametric scan.

Before submission, make sure that your strategy is not implicitly forward looking; see the section "Preventing forward-looking" at the end of this template.

The strategy goes long only when the rate of change over the last roc_period trading days (in this case 10) of the linear-weighted moving average over the last wma_period trading days (in this case 20) is positive.

def single_pass_strategy(data, wma_period=20, roc_period=10):
    wma = qnta.lwma(data.sel(field='close'), wma_period)
    sroc = qnta.roc(wma, roc_period)
    weights = xr.where(sroc > 0, 1, 0)
    weights = weights / len(data.asset) # normalize weights so that sum=1, fully invested
    with qnlog.Settings(info=False, err=False): # suppress log messages
        weights = qnout.clean(weights, data) # check for problems
    return weights
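To make the rule concrete, the two indicators can be sketched in plain Python on a short price list. This is only an illustration under assumptions: the helper functions below are hypothetical stand-ins, not the qnt.ta implementations, and they use the common textbook definitions (linearly increasing weights for the moving average, percentage change for the rate of change). Only the sign of the rate of change matters for the signal.

```python
def lwma(series, period):
    """Linear-weighted moving average: the newest value in the window gets
    weight `period`, the oldest gets weight 1. Returns None during warm-up."""
    weights = range(1, period + 1)
    norm = sum(weights)
    out = []
    for i in range(len(series)):
        if i + 1 < period:
            out.append(None)  # not enough history yet
        else:
            window = series[i + 1 - period : i + 1]
            out.append(sum(w * x for w, x in zip(weights, window)) / norm)
    return out


def roc(series, period):
    """Rate of change in percent versus the value `period` steps earlier.
    Returns None where either value is missing."""
    out = []
    for i, x in enumerate(series):
        prev = series[i - period] if i >= period else None
        if x is None or prev is None or prev == 0:
            out.append(None)
        else:
            out.append(100.0 * (x / prev - 1.0))
    return out


def long_signal(close, wma_period, roc_period):
    """1 when the rate of change of the smoothed price is positive, else 0."""
    change = roc(lwma(close, wma_period), roc_period)
    return [1 if c is not None and c > 0 else 0 for c in change]


close = [10, 11, 12, 13, 12, 11, 10, 9]
signal = long_signal(close, wma_period=3, roc_period=2)
print(signal)  # [0, 0, 0, 0, 1, 0, 0, 0]
```

The first wma_period + roc_period - 1 entries carry no signal because the indicators are still warming up, which is why the examples in this template load data from before the evaluation start date.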

Let us first check the performance of the strategy with the chosen parameters:

#DEBUG#
# evaluator will remove all cells with this tag before evaluation

data = qndata.futures.load_data(min_date='2004-01-01') # indicators need warmup, so prepend data
single_pass_output = single_pass_strategy(data)
single_pass_stat = qns.calc_stat(data, single_pass_output.sel(time=slice('2006-01-01', None)))
display(single_pass_stat.to_pandas().tail())

A parametric scan over pre-defined ranges of wma_period and roc_period can be performed with the Quantiacs optimizer function:

#DEBUG#
# evaluator will remove all cells with this tag before evaluation

data = qndata.futures.load_data(min_date='2004-01-01') # indicators need warmup, so prepend data

result = qnop.optimize_strategy(
    data,
    single_pass_strategy,
    qnop.full_range_args_generator(
        wma_period=range(10, 150, 5), # start, stop (exclusive), step
        roc_period=range(5, 100, 5)   # start, stop (exclusive), step
    ),
    workers=1 # you can set more workers when you run this code on your local PC to speed it up
)

qnop.build_plot(result) # interactive chart in the notebook

print("---")
print("Best iteration:")
display(result['best_iteration']) # as a reference, display the iteration with the highest Sharpe ratio

100% (532 of 532) |######################| Elapsed Time: 0:04:51 Time:  0:04:51

---
Best iteration:

{'args': {'wma_period': 20, 'roc_period': 80},
 'result': {'equity': 1.3625438679785011,
  'relative_return': -0.002916734055067205,
  'volatility': 0.045709692043196824,
  'underwater': -0.04049080046998221,
  'max_drawdown': -0.15125006247921802,
  'sharpe_ratio': 0.43744047376391276,
  'mean_return': 0.019995269342978572,
  'bias': 1.0,
  'instruments': 75.0,
  'avg_turnover': 0.016955481472089345,
  'avg_holding_time': 84.18966408268821},
 'weight': 0.43744047376391276,
 'exception': None}
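The 532 iterations reported by the progress bar correspond to every combination of the two ranges. Assuming that full_range_args_generator enumerates the full Cartesian product (which is consistent with that count), the grid can be reproduced with the standard library. The function full_grid below is a hypothetical stand-in written for illustration, not the Quantiacs implementation.

```python
from itertools import product


def full_grid(**ranges):
    """Hypothetical stand-in: list every combination of the supplied
    parameter ranges as a dict of keyword arguments."""
    names = list(ranges)
    return [
        dict(zip(names, values))
        for values in product(*(ranges[name] for name in names))
    ]


grid = full_grid(
    wma_period=range(10, 150, 5),  # 10, 15, ..., 145 -> 28 values
    roc_period=range(5, 100, 5),   # 5, 10, ..., 95   -> 19 values
)

print(len(grid))  # 532
print(grid[0])    # {'wma_period': 10, 'roc_period': 5}
print(grid[-1])   # {'wma_period': 145, 'roc_period': 95}
```

Because the number of combinations is the product of the range lengths, halving the step of both parameters roughly quadruples the run time of the scan.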

The arguments of the iteration with the highest Sharpe ratio can be read from result['best_iteration']['args'] or written out manually in the final strategy. Because cells tagged #DEBUG# are removed before evaluation, result will not be available on the server, so the final strategy should define the arguments explicitly.

The final multi-pass backtest of the optimized strategy is simple: at each step the backtester calls the single-pass implementation with the desired parameters and keeps only the weights for the last available day:

best_args = dict(wma_period=20, roc_period=80) # highest Sharpe ratio iteration (not recommended, overfitting!)

def best_strategy(data):
    return single_pass_strategy(data, **best_args).isel(time=-1)

weights = qnbt.backtest(
    competition_type="futures",
    lookback_period=2 * 365,
    start_date='2006-01-01',
    strategy=best_strategy,
    analyze=True,
    build_plots=True
)

Run last pass...
Load data...
Run pass...
Ok.
---
Run first pass...
Load data...
Run pass...
Ok.
---
Load full data...
---
Run iterations...

100% (3922 of 3922) |####################| Elapsed Time: 0:01:40 Time:  0:01:40

Merge outputs...
Load data for cleanup and analysis...
ffill if the current price is None...
Check missed dates...
Ok.
Normalization...
Done.
Write output: /root/fractions.nc.gz
---
Analyze results...
Check...
Check missed dates...
Ok.
Check the sharpe ratio...
Period: 2006-01-01 - 2021-03-04
Sharpe Ratio = 0.4386617181385554

ERROR! The sharpe ratio is too low. 0.4386617181385554 < 1

Check correlation.

Ok. This strategy does not correlate with other strategies.
---
Calc global stats...
---
Calc stats per asset...
Build plots...
---

The full code for the optimized strategy

import qnt.data as qndata
import qnt.ta as qnta
import qnt.log as qnlog
import qnt.backtester as qnbt
import qnt.output as qnout

import xarray as xr


best_args = dict(wma_period=20, roc_period=80) # highest Sharpe ratio iteration (not recommended, overfit!)


def single_pass_strategy(data, wma_period=20, roc_period=10):
    wma = qnta.lwma(data.sel(field='close'), wma_period)
    sroc = qnta.roc(wma, roc_period)
    weights = xr.where(sroc > 0, 1, 0)
    weights = weights / len(data.asset)
    with qnlog.Settings(info=False, err=False): # suppress log messages
        weights = qnout.clean(weights, data) # check for problems
    return weights


def best_strategy(data):
    return single_pass_strategy(data, **best_args).isel(time=-1)


weights = qnbt.backtest(
    competition_type="futures",
    lookback_period=2 * 365,
    start_date='2006-01-01',
    strategy=best_strategy,
    analyze=True,
    build_plots=True
)

Preventing forward-looking

You can use the following code snippet to check for forward looking. It runs the same strategy in single-pass and in multi-pass mode; a large difference between the two Sharpe ratios is a sign that the single-pass implementation used for the parametric scan is forward looking.

#DEBUG#
# evaluator will remove all cells with this tag before evaluation

# single pass
data = qndata.futures.load_data(min_date='2004-01-01') # warmup period for indicators, prepend data
single_pass_output = single_pass_strategy(data)
single_pass_stat = qns.calc_stat(data, single_pass_output.sel(time=slice('2006-01-01', None)))

# multi pass
multi_pass_output = qnbt.backtest(
    competition_type="futures",
    lookback_period=2*365,
    start_date='2006-01-01',
    strategy=single_pass_strategy,
    analyze=False,
)
multi_pass_stat = qns.calc_stat(data, multi_pass_output.sel(time=slice('2006-01-01', None)))

print('''
---
Compare multi-pass and single-pass performance to make sure that there is no forward looking. Small differences can arise because of numerical accuracy issues and differences in the treatment of missing values.
---
''')

print("Single-pass result:")
display(single_pass_stat.to_pandas().tail())

print("Multi-pass result:")
display(multi_pass_stat.to_pandas().tail())

time	equity	relative_return	volatility	underwater	max_drawdown	sharpe_ratio	mean_return	bias	instruments	avg_turnover	avg_holding_time
2021-02-26	1.123324	-0.009606	0.045465	-0.085132	-0.215389	0.164476	0.007478	1.0	75.0	0.043735	26.242790
2021-03-01	1.126595	0.002912	0.045465	-0.082468	-0.215389	0.168561	0.007664	1.0	75.0	0.043733	26.250796
2021-03-02	1.127687	0.000969	0.045460	-0.081579	-0.215389	0.169912	0.007724	1.0	75.0	0.043733	26.262064
2021-03-03	1.125862	-0.001619	0.045456	-0.083066	-0.215389	0.167584	0.007618	1.0	75.0	0.043724	26.262064
2021-03-04	1.122304	-0.003160	0.045457	-0.085963	-0.215389	0.163046	0.007412	1.0	75.0	0.043717	26.356442
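The idea behind this check can be illustrated with a toy example in plain Python. This is a sketch under assumptions rather than the Quantiacs backtester: the two rules and the multi_pass helper are hypothetical, and the comparison is made directly on the weights instead of on the Sharpe ratio. In multi-pass mode the rule only ever sees data up to the current day, so a rule that peeks at the next day cannot reproduce its single-pass output.

```python
def causal_rule(prices):
    """Long (1) when today's price is above yesterday's.
    Uses only past and present data."""
    return [0] + [1 if prices[t] > prices[t - 1] else 0 for t in range(1, len(prices))]


def leaky_rule(prices):
    """Long (1) when tomorrow's price is above today's.
    Forward looking by construction."""
    n = len(prices)
    return [1 if t + 1 < n and prices[t + 1] > prices[t] else 0 for t in range(n)]


def single_pass(rule, prices):
    """Evaluate the rule once on the full history."""
    return rule(prices)


def multi_pass(rule, prices):
    """Evaluate the rule on data truncated at each day and keep only the
    weight for the last available day."""
    return [rule(prices[: t + 1])[-1] for t in range(len(prices))]


prices = [100, 101, 99, 102, 103, 101]

for rule in (causal_rule, leaky_rule):
    same = single_pass(rule, prices) == multi_pass(rule, prices)
    print(f"{rule.__name__}: single-pass equals multi-pass -> {same}")
```

The causal rule gives identical weights in both modes, while the leaky rule does not. For a real strategy the same mismatch shows up as a single-pass Sharpe ratio that is clearly better than the multi-pass one.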