@darwinps Hi,
There are two key elements missing in the single pass approach that lead to the discrepancy:
- Insufficient Data Loaded
Most strategies need an initialization period, for example to compute the technical indicators they use. Here that period is one day: to produce weights for a given date, the strategy needs the previous trading day's open and close prices. For example, accurate weights for '2006-01-03' require prices from the previous trading day, '2005-12-30'.
The statistics depend on relative returns, which cannot be computed without accounting for slippage. Slippage is approximated as the 14-day Average True Range, ATR(14), multiplied by 0.04 for futures (4% of ATR(14)) or 0.05 for stocks. Accurate stats from '2006-01-01' therefore require data starting at least 14 trading days earlier. The backtester handles this through lookback_period (365 days in this case), which loads data starting 365 days before start_date.
- Manually Filtering Data to a Single Asset
When using the backtester, the strategy’s output is aligned with all assets in the dataset. Although this doesn’t affect the weights—assets not included in the strategy will simply have their weights set to zero—there may be days in the full dataset that don’t appear in the filtered data. To ensure proper alignment in single pass mode, the clean() and align() functions from qnt.output should be run manually.
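To make the 14-day warm-up concrete, here is a minimal pure-Python sketch of the slippage approximation described above. It is an illustration under assumptions, not the qnt implementation: the helper names (true_range, atr, slippage) are hypothetical, and ATR is taken as a simple moving average of the true range, which may differ from the variant the library actually uses.

```python
def true_range(high, low, prev_close):
    """True range of one bar, given the previous bar's close."""
    return max(high - low, abs(high - prev_close), abs(low - prev_close))


def atr(highs, lows, closes, window=14):
    """Simple moving average of the true range (assumed ATR variant).

    Returns None for bars that do not yet have `window` true-range values.
    """
    # The first bar has no previous close, so its true range is undefined.
    trs = [None]
    for i in range(1, len(closes)):
        trs.append(true_range(highs[i], lows[i], closes[i - 1]))

    out = []
    for i in range(len(trs)):
        recent = trs[max(0, i - window + 1): i + 1]
        if len(recent) < window or None in recent:
            out.append(None)  # still inside the warm-up period
        else:
            out.append(sum(recent) / window)
    return out


def slippage(atr_values, factor=0.04):
    """Slippage as a fraction of ATR: 0.04 for futures, 0.05 for stocks."""
    return [None if a is None else a * factor for a in atr_values]


# Hypothetical flat price series: 20 bars, close 100, high 101, low 99.
closes = [100.0] * 20
highs = [101.0] * 20
lows = [99.0] * 20

slip = slippage(atr(highs, lows, closes, window=14))
first_valid = next(i for i, s in enumerate(slip) if s is not None)
print(first_valid, slip[first_valid])
```

In this sketch the first bar with a defined slippage value is the one preceded by 14 trading days of data, which is why stats that start at the first loaded bar cannot match stats computed with a lookback.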
The code below should produce identical results:
# 365-day lookback period + 60 additional days (the minimum tail for alignment, hardcoded in the backtester), therefore min_date='2004-11-05'
data = qndata.futures_load_data(min_date='2004-11-05', max_date='2007-01-01')
f_es_data = qndata.futures_load_data(min_date='2005-12-01', max_date='2007-01-01', assets=['F_ES'])  # one month before the start: enough for stats and weights
### OR ###
# f_es_data = data.sel(asset=['F_ES'])
weights = strategy(f_es_data)
weights = qnout.clean(weights, data, 'futures') # in the backtester, clean() uses the same data as strategy() (f_es_data)
weights = qnout.align(weights, data, start='2006-01-01') # assign weights from 2006-01-01
stats = qnst.calc_stat(data, weights)
display(stats.sel(time=slice("2006-01-01", None)).to_pandas().head(15)) # show stats from 2006-01-01
In general, it's acceptable to generate signals from specific assets, but manually selecting the assets that receive weights is not allowed. Weight allocation should be dynamic across the entire dataset. It's therefore recommended to load the entire dataset for the corresponding competition, which the strategy function receives as its "data" parameter.
The following example uses the same strategy logic and outputs as the initial one, but applies it to the full dataset. It produced identical statistics for the single-pass and multi-pass approaches (though it is still not compliant with the rules because of the hand-picked asset).
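import xarray as xr
import qnt.backtester as qnbt
import qnt.data as qndata
import qnt.output as qnout
import qnt.stats as qnst
import qnt.ta as qnta

def load_data(period):
    return qndata.futures_load_data(tail=period)

def strategy(data):
    # The signal comes from a single hand-picked asset
    _data = data.sel(asset=['F_ES'])
    close = _data.sel(field='close')
    close_one_day_ago = qnta.shift(close, periods=1)
    _open = _data.sel(field='open')
    open_one_day_ago = qnta.shift(_open, periods=1)
    # Go long if the previous day closed above its open, otherwise stay flat
    weights = xr.where(open_one_day_ago < close_one_day_ago, 1, 0)
    return weights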
weights_multi = qnbt.backtest(
    competition_type='futures',
    load_data=load_data,
    lookback_period=365,
    start_date='2006-01-01',
    end_date='2007-01-01',
    strategy=strategy,
)
### For single pass
data = qndata.futures_load_data(min_date='2004-11-05', max_date='2007-01-01')
weights_single = strategy(data).sel(time=slice('2006-01-01', None))
weights_single = qnout.clean(weights_single, data, 'futures')
stats = qnst.calc_stat(data, weights_single)
display(stats.sel(time=slice("2006-01-01", None)).to_pandas().head(15))
The greatest advantage of the single pass is its execution speed, which is especially important during the optimization process. However, it requires more attention to ensure that all aspects are handled properly. For instance, it’s quite easy to incorporate forward-looking information in a single pass, which is precisely what the multi-pass approach aims to prevent.
To see this, try reversing the shift direction for the main signal variables in the strategy:
close_one_day_ago = qnta.shift(close, periods=-1)
open_one_day_ago = qnta.shift(_open, periods=-1)
Run it in both single-pass and multi-pass mode and compare the results.
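The effect can also be reproduced without the platform. The sketch below is a toy model in plain Python, not qnt code: shift, signal, single_pass, and multi_pass are hypothetical helpers, the prices are made up, and missing values are treated as "no signal" (weight 0), mirroring how a comparison against NaN evaluates to false.

```python
def shift(values, periods):
    """Shift a list by `periods` positions, padding with None (like NaN)."""
    n = len(values)
    if periods >= 0:
        return [None] * min(periods, n) + values[: max(n - periods, 0)]
    k = min(-periods, n)
    return values[k:] + [None] * k


def signal(opens, closes, periods):
    """Weight 1 where the shifted open is below the shifted close, else 0."""
    o = shift(opens, periods)
    c = shift(closes, periods)
    return [1 if (a is not None and b is not None and a < b) else 0
            for a, b in zip(o, c)]


def single_pass(opens, closes, periods):
    # One call on the full history: every day can "see" the whole series.
    return signal(opens, closes, periods)


def multi_pass(opens, closes, periods):
    # One call per day on data truncated at that day; keep only the last weight.
    return [signal(opens[: t + 1], closes[: t + 1], periods)[-1]
            for t in range(len(opens))]


# Made-up prices for five trading days.
opens = [10, 11, 12, 11, 13]
closes = [11, 10, 13, 12, 12]

# Backward-looking shift: both modes agree.
print(single_pass(opens, closes, 1), multi_pass(opens, closes, 1))
# Forward-looking shift: single pass uses tomorrow's prices, multi pass cannot.
print(single_pass(opens, closes, -1), multi_pass(opens, closes, -1))
```

With periods=1 the two modes return the same weights. With periods=-1 the single pass produces signals from the next day's prices, while the multi pass never has those prices available on the day it runs, so the outputs diverge.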