sliding 3d array

wool.dewgong

I am building a strategy based on reinforcement learning.
I need to create a sliding 3d array with instruments * history * ohlc.
Can you help or guide me to achieve that with qnt?

support

@wool-dewgong

Hello.

Can you give me more details?

What do you mean when you talk about a sliding 3d array?

Do you want to use rolling like in pandas?

Or do you mean an algorithm that uses a sliding window? Well, in this case, multi-pass backtesting suits.

If you give me an example of such an algorithm, I will be more precise.

Regards.

support

@wool-dewgong To add some context, the Quantiacs toolbox uses native 3d data structures, xarray.DataArray. For example:

https://quantiacs.com/documentation/en/reference/data_load_functions.html#loading-futures-data

The loaded data is a 3d array with coordinates (time, asset, field)
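For the sliding window asked about in the original question, here is a minimal sketch. It assumes NumPy is available and uses synthetic numbers rather than data loaded through qnt. The (time, asset, field) layout mirrors the one described above, and names such as `history` are illustrative only. With real data, the `.values` of the loaded xarray.DataArray could be passed in the same way, assuming its dimensions are ordered (time, asset, field).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Synthetic stand-in for loaded data: 10 days, 3 instruments, 4 fields (o, h, l, c).
n_days, n_assets, n_fields = 10, 3, 4
data = np.arange(n_days * n_assets * n_fields, dtype=float)
data = data.reshape(n_days, n_assets, n_fields)

history = 5  # hypothetical look-back length, in days

# Slide along the time axis. NumPy appends the window axis last, so the
# result has shape (n_days - history + 1, n_assets, n_fields, history).
windows = sliding_window_view(data, window_shape=history, axis=0)

# Reorder to (window, instrument, history, field). This is still a read-only
# view of `data`; call .copy() if the model needs a writable array.
windows = windows.transpose(0, 1, 3, 2)

# The most recent window covers the last `history` days for every instrument.
latest = windows[-1]
print(windows.shape, latest.shape)
```

Each `windows[i]` is then one instruments * history * ohlc block that ends on day `i + history - 1`, so it contains no data from later days.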

wool.dewgong

I am actually not sure how my strategy could be backtested, since I need to train it multiple times on historical data, for example the last 2000 days (and build a 3d array of historical data), choose the best performing model, then run it for some time (a week, for example), retrain it on fresh data, re-run it, etc.
Can you advise how to test a model based on machine learning (in my case reinforcement learning)?
"Or do you mean an algorithm that uses a sliding window? Well, in this case, multi-pass backtesting suits."
What do you mean by that? How exactly does multi-pass work, and how do you ensure that forward testing is done on data the model was not trained on?

support

@wool-dewgong

Hello.

There are 2 options for how to test this model:

The first option is to split the data into 2 pieces: "training" and "testing".

The training data will contain all the data except the last few years (1-3). The test data will contain the remaining piece of data. You train the ML model using the "training" piece and evaluate its performance using the "testing" piece (you can use the backtester for this). It will give you a rough estimate of how your model will perform in the contest. Before submitting, you can train your model using all available data.
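A minimal sketch of this split, using only the Python standard library and synthetic (date, value) records. The helper name `holdout_split` and the sample data are assumptions made for illustration; they are not part of the qnt toolbox.

```python
from datetime import date, timedelta

def holdout_split(records, test_years):
    """Split (date, value) records: the last `test_years` years become the test piece."""
    last_day = max(d for d, _ in records)
    # date.replace is enough here because the sample end date is not Feb 29.
    cutoff = last_day.replace(year=last_day.year - test_years)
    train = [(d, v) for d, v in records if d <= cutoff]
    test = [(d, v) for d, v in records if d > cutoff]
    return train, test

# Synthetic daily series of about six years, starting 2015-01-01.
start = date(2015, 1, 1)
records = [(start + timedelta(days=i), float(i)) for i in range(6 * 365)]

train, test = holdout_split(records, test_years=2)
print(train[-1][0], test[0][0])
```

The model is fitted on `train` only, and every date in `test` is later than every date in `train`, so the evaluation uses data the model has not seen.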

The second approach is trickier. You can slice the data into multiple training and testing pieces.

For example, if you want to evaluate a model for futures, you need output that covers 16 years of data. Obviously, few models will work for that long without retraining. So you can decide that your model works properly for 4 years. In that case, you need to train and evaluate your model at least 4 times. Suppose your model needs 10 years for training; then you have to perform these passes:

Pass 1:

training piece: 1995-04-18 - 2005-04-18
testing piece: 2005-04-19 - 2009-04-19

Pass 2:

training piece: 1999-04-18 - 2009-04-18
testing piece: 2009-04-19 - 2013-04-19

Pass 3:

training piece: 2003-04-18 - 2013-04-18
testing piece: 2013-04-19 - 2017-04-19

Pass 4:

training piece: 2007-04-18 - 2017-04-18
testing piece: 2017-04-19 - 2021-04-19

When you finish these passes, you will get 4 outputs. You can join these outputs and estimate the performance of your model.
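The pass schedule above can be generated programmatically. The sketch below uses only the Python standard library. The function names and parameters are hypothetical, chosen for illustration, and the training and backtesting steps themselves are left out.

```python
from datetime import date, timedelta

def shift_years(d, years):
    # date.replace is enough here because the example dates are never Feb 29.
    return d.replace(year=d.year + years)

def walk_forward_passes(first_train_start, train_years, test_years, n_passes):
    """Return (train_start, train_end, test_start, test_end) for each pass."""
    passes = []
    for i in range(n_passes):
        # Each pass moves the training window forward by one testing period.
        train_start = shift_years(first_train_start, i * test_years)
        train_end = shift_years(train_start, train_years)
        # Testing starts the day after training ends, so the two never overlap.
        test_start = train_end + timedelta(days=1)
        test_end = shift_years(test_start, test_years)
        passes.append((train_start, train_end, test_start, test_end))
    return passes

passes = walk_forward_passes(date(1995, 4, 18), train_years=10,
                             test_years=4, n_passes=4)
for number, (tr0, tr1, te0, te1) in enumerate(passes, start=1):
    print(f"Pass {number}: train {tr0} - {tr1}, test {te0} - {te1}")
```

In this schedule, consecutive testing pieces share one boundary day (for example 2009-04-19), so when joining the outputs it is probably safest to keep only one copy of that day.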

Before submitting your model, you can train it using the last 10 years of data.

I guess you need the second approach. I am working on the ML example right now and will add the necessary code for this option.
I will notify you when I finish and publish this example.

Regards.

support

@wool-dewgong Hello! We added a template that should address your issue and allow you to perform fast rolling ML training with retraining. It is available in your user space in the Examples section, and you can also read it in the public docs:

https://quantiacs.com/documentation/en/examples/machine_learning_with_a_voting_classifier.html