Quantiacs Community

stefanm

@magenta-kabuto Hi, maybe this can help with your strategy also:
In your regime_trade() function, convert signals_df to a pandas.Series before returning it, and name it after the corresponding asset (e.g. "NAS:AAPL").

series = signals_df.squeeze().rename(asset_name)
return series

After storing each series in the trades dict, you can concatenate trades.values(), which creates a pandas.DataFrame of signals with one column per asset:

pd_signals = pd.concat(trades.values(), axis=1)

Then convert it to an xarray.DataArray and pass it to the backtester as weights.

import xarray as xr

xr_signals = xr.DataArray(pd_signals, dims=('time', 'asset'))
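
Putting the three steps together, here is a minimal self-contained sketch. The toy_signal helper, the dates, and the signal values are made up for illustration; they only stand in for whatever your regime_trade() returns.

```python
import pandas as pd
import xarray as xr

dates = pd.date_range("2020-01-01", periods=4, freq="D", name="time")

def toy_signal(asset_name, values):
    # Hypothetical stand-in for regime_trade(): builds a one-column DataFrame,
    # then squeezes it into a Series named after the asset.
    signals_df = pd.DataFrame({"signal": values}, index=dates)
    return signals_df.squeeze(axis=1).rename(asset_name)

trades = {
    "NAS:AAPL": toy_signal("NAS:AAPL", [1, 0, 1, 1]),
    "NAS:MSFT": toy_signal("NAS:MSFT", [0, 0, 1, 0]),
}

# One column per asset, one row per date.
pd_signals = pd.concat(list(trades.values()), axis=1)

# The DataFrame index becomes the "time" coordinate, the columns become "asset".
xr_signals = xr.DataArray(pd_signals, dims=("time", "asset"))
print(xr_signals)
```

Naming each Series after its asset is what makes the column labels, and therefore the "asset" coordinate, come out right.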

stefanm

@magenta-kabuto Hi, yes, that's right. It seems that an xarray.DataArray is being passed to a function that expects a pandas.DataFrame (this is an assumption, since your entire algorithm is not visible, which is fine). Maybe you can try this:

### use your regime_trade(Stockdata, param_2=0.15) as a helper function
import pandas as pd
import xarray as xr


def strategy(data):
    # data: the xarray.DataArray returned by the load_data function
    Stockdata = ...  # prepare data for regime_trade input, like you did for single pass
    trades = {}
    for j in logopenmod.keys():  # logopenmod comes from your own code
        trades[j] = regime_trade(Stockdata[j].iloc[:, 3], 0.15)
    pd_signals = pd.concat(trades.values(), axis=1)
    xr_signals = xr.DataArray(pd_signals, dims=('time', 'asset'))
    is_liquid = data.sel(field="is_liquid")  # assumes your "stocks" variable is exactly the same as data
    return xr_signals * is_liquid
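
To see what the final multiplication by is_liquid does, here is a small sketch with made-up values (the asset names and numbers are illustrative only). xarray multiplies by coordinate label rather than by position, and by default keeps only the labels present in both arrays.

```python
import pandas as pd
import xarray as xr

time = pd.date_range("2020-01-01", periods=3, freq="D")

signals = xr.DataArray(
    [[1, 1], [1, 0], [1, 1]],
    dims=("time", "asset"),
    coords={"time": time, "asset": ["NAS:AAPL", "NAS:MSFT"]},
)

# The liquidity flags cover one extra asset and list the assets in another order.
is_liquid = xr.DataArray(
    [[1.0, 0.0, 1.0], [1.0, 1.0, 1.0], [0.0, 1.0, 1.0]],
    dims=("time", "asset"),
    coords={"time": time, "asset": ["NAS:MSFT", "NAS:AAPL", "NAS:GOOG"]},
)

# Aligned by label: a signal survives only on days when its asset is liquid.
weights = signals * is_liquid
print(weights.to_pandas())
```

This is also why the asset names in your signals must match the names in the dataset exactly: an asset with a mismatched label drops out of the product.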

Try it with Multi-pass, and change the name in competition_type. Set the start date as below:

weights = qnbk.backtest(
  competition_type = "stocks_nasdaq100",
  load_data = load_data, # if omitted it loads data by competition_type
  lookback_period = 365,
  start_date = "2006-01-01", # set start_date 
  strategy = strategy,
  analyze = True,
)

Regards

stefanm

@magenta-kabuto No problem, you're welcome. Can you please change the start_date to '2006-01-01' when running backtester, and let us know if it worked?

stefanm

@darwinps Hi,
There are two key elements missing in the single pass approach that lead to the discrepancy:

Insufficient Data Loaded
Most strategies require an initialization period to set everything up, such as calculating the technical indicators used in the strategy. In this case, it's one day: to produce weights for a specific date, you need the previous day's close and open prices. For example, to get accurate weights for '2006-01-03', you'll need the prices from the previous trading day, '2005-12-30'.

For calculating statistics, a key factor is Relative Returns, which cannot be computed without considering slippage. Slippage is approximated as the 14-day Average True Range (ATR(14)) multiplied by 0.04 for futures (4% of ATR(14)) or by 0.05 for stocks. Therefore, to get accurate stats from '2006-01-01', the loaded data must start at least 14 trading days earlier. The backtester handles this with lookback_period (365 days in this case), which means that data is loaded from 365 days before start_date.

Manually Filtering Data to a Single Asset
When using the backtester, the strategy's output is aligned with all assets in the dataset. This doesn't affect the weights (assets not included in the strategy simply have their weights set to zero), but there may be days in the full dataset that don't appear in the filtered data. To ensure proper alignment in single pass mode, the clean() and align() functions from qnt.output should be run manually.
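
As a rough illustration of why the slippage estimate needs 14 earlier trading days, here is a pure-Python sketch. Only the 0.04 and 0.05 multipliers come from the explanation above; the simple-average ATR and the helper names are assumptions and not necessarily what the qnt library does internally.

```python
def true_range(high, low, prev_close):
    # Largest of the three classic true-range candidates.
    return max(high - low, abs(high - prev_close), abs(low - prev_close))

def atr(highs, lows, closes, period=14):
    # Simple-average ATR (an assumption); None until `period` true ranges exist.
    trs = [true_range(highs[i], lows[i], closes[i - 1]) for i in range(1, len(closes))]
    out = [None] * len(closes)
    for i in range(period, len(closes)):
        # trs[j] belongs to day j + 1, so this window covers days i-period+1 .. i.
        out[i] = sum(trs[i - period:i]) / period
    return out

def slippage(atr_value, asset_type):
    # Multipliers as stated above: 4% of ATR(14) for futures, 5% for stocks.
    factor = {"futures": 0.04, "stocks": 0.05}[asset_type]
    return atr_value * factor

# Toy prices: flat close at 100 with a constant 2-point daily range.
closes = [100.0] * 16
highs = [c + 1.0 for c in closes]
lows = [c - 1.0 for c in closes]

atr_values = atr(highs, lows, closes)
first_valid = next(i for i, v in enumerate(atr_values) if v is not None)
print(first_valid, atr_values[first_valid], slippage(atr_values[first_valid], "futures"))
```

The first usable ATR value appears only on the 15th bar (index 14), so statistics for any earlier day have no slippage estimate to work with.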

The code below should produce identical results:

import qnt.data as qndata
import qnt.output as qnout
import qnt.stats as qnst

# 365-day lookback period + 60 additional days (the minimum tail for data alignment, hardcoded in the backtester), therefore min_date='2004-11-05'
data = qndata.futures_load_data(min_date='2004-11-05', max_date='2007-01-01') 
f_es_data = qndata.futures_load_data(min_date='2005-12-01', max_date='2007-01-01', assets= ['F_ES']) # enough for stats and weights, one month before
### OR ###
# f_es_data = data.sel(asset=['F_ES'])

weights = strategy(f_es_data)
weights = qnout.clean(weights, data, 'futures') # in the backtester, clean() uses the same data as strategy() (f_es_data)
weights = qnout.align(weights, data, start='2006-01-01') # assign weights from 2006-01-01

stats = qnst.calc_stat(data, weights)
display(stats.sel(time=slice("2006-01-01", None)).to_pandas().head(15)) # show stats from 2006-01-01
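
Conceptually, the alignment step puts the weights on the full dataset's time and asset grid. The sketch below imitates that idea with plain xarray on made-up data. It is not the qnt.output implementation, and filling missing entries with 0 is an assumption made for illustration.

```python
import pandas as pd
import xarray as xr

# Grid of the full dataset (illustrative dates and asset names).
full_time = pd.to_datetime(["2006-01-03", "2006-01-04", "2006-01-05"])
full_assets = ["F_ES", "F_GC", "F_CL"]

# Weights computed from filtered data: a single asset, with one day missing.
weights = xr.DataArray(
    [[1], [0]],
    dims=("time", "asset"),
    coords={"time": full_time[[0, 2]], "asset": ["F_ES"]},
)

# Expand to the full grid; entries that did not exist before become 0.
aligned = weights.reindex(time=full_time, asset=full_assets, fill_value=0)
print(aligned.to_pandas())
```

After this step the weights and the full dataset share the same coordinates, which is what the statistics calculation relies on.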

In general, it's acceptable to generate some signals based on specific assets, but manually selecting assets for weight allocation is not allowed. Weight allocation should be dynamic across the entire dataset. Therefore, it’s recommended to load the entire dataset for the corresponding competition, which the strategy function will use as "data" parameter.

The following example uses the same strategy foundation and outputs as the initial one but applied to the full dataset. This resulted in identical statistics for both single and multi-pass approaches (though it’s still not compliant with the rules due to the hand-picked asset).

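import xarray as xr
import qnt.data as qndata
import qnt.ta as qnta
import qnt.backtester as qnbt
import qnt.output as qnout
import qnt.stats as qnst

def load_data(period):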
    return qndata.futures_load_data(tail=period)

def strategy(data):
    _data = data.sel(asset=['F_ES'])
    close = _data.sel(field='close')
    close_one_day_ago = qnta.shift(close, periods=1)
    _open = _data.sel(field='open')
    open_one_day_ago = qnta.shift(_open, periods=1)
    weights = xr.where(open_one_day_ago < close_one_day_ago, 1, 0)
    return weights

weights_multi = qnbt.backtest(
    competition_type= 'futures',
    load_data= load_data,
    lookback_period= 365,
    start_date= '2006-01-01',
    end_date= '2007-01-01',
    strategy= strategy,
)


### For single pass
data = qndata.futures_load_data(min_date='2004-11-05', max_date='2007-01-01')
weights_single = strategy(data).sel(time=slice('2006-01-01', None))
weights_single = qnout.clean(weights_single, data, 'futures')

stats = qnst.calc_stat(data, weights_single)
display(stats.sel(time=slice("2006-01-01", None)).to_pandas().head(15))

The greatest advantage of the single pass is its execution speed, which is especially important during the optimization process. However, it requires more attention to ensure that all aspects are handled properly. For instance, it’s quite easy to incorporate forward-looking information in a single pass, which is precisely what the multi-pass approach aims to prevent.

To see this, try reversing the shift direction in the strategy for the main variables that produce the signals:

close_one_day_ago = qnta.shift(close, periods=-1)

open_one_day_ago = qnta.shift(_open, periods=-1)

Run it in both single pass and multi-pass modes and compare the results.
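
The effect can be reproduced without the backtester. In this sketch, pandas shift() stands in for qnta.shift, and the multi-pass mode is imitated by re-running the signal on data truncated at each day. The prices and helper names are invented for illustration.

```python
import pandas as pd

idx = pd.date_range("2006-01-02", periods=5, freq="B")
open_ = pd.Series([9.5, 10.5, 10.0, 11.0, 12.5], index=idx)
close = pd.Series([10.0, 11.0, 9.0, 12.0, 13.0], index=idx)

def signal(open_, close, periods):
    # Long (1) when the shifted open is below the shifted close, else flat (0).
    return (open_.shift(periods) < close.shift(periods)).astype(int)

def single_pass(periods):
    # The whole history is visible at once.
    return signal(open_, close, periods)

def multi_pass(periods):
    # For each day, only data up to that day is visible; keep the last value.
    last = [
        signal(open_.iloc[: i + 1], close.iloc[: i + 1], periods).iloc[-1]
        for i in range(len(idx))
    ]
    return pd.Series(last, index=idx)

print("periods=1 :", single_pass(1).tolist(), multi_pass(1).tolist())
print("periods=-1:", single_pass(-1).tolist(), multi_pass(-1).tolist())
```

With periods=1 both modes agree. With periods=-1 the single pass quietly reads tomorrow's prices, while the day-by-day run has no future value to read and stays flat.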

stefanm

@angusslq Hi,

The qnbt.backtest function you used returns either the weights alone (as an xarray.DataArray) or a tuple (weights, state). Which one you get depends on whether your strategy() function takes additional arguments besides "data". The state doesn't affect your weights if the strategy you passed doesn't use it, and in that case the state is most likely None.
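
If you want code that works with either return shape, one option is a small helper like the one below. The name split_backtest_result is hypothetical and not part of qnt, and the strings in the example only stand in for real weights and state objects.

```python
def split_backtest_result(result):
    """Return (weights, state) for either a bare result or a (weights, state) tuple."""
    if isinstance(result, tuple):
        weights, state = result
        return weights, state
    # Only the weights were returned, so there is no state.
    return result, None

# Stand-in values instead of real backtester output.
print(split_backtest_result("weights-only"))
print(split_backtest_result(("weights", {"counter": 3})))
```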

Anyway, it's not necessary to call the write() function from the qnt.output module after calculating weights with qnt.backtester: the backtest() function already calls write(), so the weights have been written automatically.

stefanm

@darwinps Hi,

Your change should produce exactly the same weights, provided you used the same input (data). If the f_es_data that you passed to the strategy() function differs from the data you use later on (e.g. close = data.sel(field='close')), you will get a different output.

If you still get a discrepancy, please share the entire code you used.
Best regards,

stefanm

@nosaai Hi,
Can you try another progressbar2 version in your environment? It should work with 3.55.0:

pip install progressbar2==3.55.0

Please let us know if it worked.

stefanm

@captain-dog Hi,

This is the total number of different companies that have been constituents of the Nasdaq-100 index at some point from 2006-01-01 onward. At any given time there are 100 companies in the index (sometimes more, up to 108), and they are considered liquid ("is_liquid" = 1.0) for as long as they are part of the index.
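
The difference between the two counts can be checked on a toy is_liquid array (made-up assets and values). With real data you would start from data.sel(field="is_liquid") instead.

```python
import pandas as pd
import xarray as xr

time = pd.date_range("2006-01-02", periods=3, freq="B")
is_liquid = xr.DataArray(
    [[1.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
    dims=("time", "asset"),
    coords={"time": time, "asset": ["A", "B", "C"]},
)

# Assets that were liquid on at least one day.
ever_liquid = int((is_liquid.max("time") > 0).sum())

# Assets that are liquid on each individual day.
liquid_per_day = is_liquid.sum("asset")

print(ever_liquid, liquid_per_day.values.tolist())
```

The total over the whole period (3 here) is larger than the count on any single day (at most 2), which mirrors the gap between the number of assets in the dataset and the roughly 100 index members on a given date.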
