How does "qnbt.backtest_ml" really work?

EDDIEE

Specifically, I am interested in the exact time period of the first train period.

In this example, the train period equals 4x365 days and the backtest starts at 2006-01-01.
Does it mean that the first train period spans from 2006-01-01 to 2009-12-31?
If so, are the first four years of the backtest period actually in-sample predictions?

When the model is retrained every 5x365 days, is the train period always 4x365 days long (rolling window)? Is it possible to implement the expanding window approach for model retraining?

weights = qnbt.backtest_ml(
    train=train_model,
    predict=predict,
    train_period=4 * 365,  # the data length for training in calendar days
    retrain_interval=5 * 365,  # how often we have to retrain models (calendar days)
    retrain_interval_after_submit=50,  # how often retrain models after submission during evaluation (calendar days)
    predict_each_day=True,  # Is it necessary to call prediction for every day during backtesting?
    competition_type='stocks_nasdaq100',  # competition type
    lookback_period=365,  # how many calendar days are needed by the predict function to generate the output
    start_date='2006-01-01',  # backtest start date
    build_plots=True  # do you need the chart?
)

support

Dear Eddie, the training takes place on a rolling basis. The prediction at time "t" uses the defined training period (until "t-1"). If you choose the backtest to start at "2006-01-01", then this will be the first "predicted" date. The first training window therefore ends before that date and does not overlap the backtest period.

As training can be computationally expensive, the retraining option lets you freeze the model and retrain it only every "retrain_interval" days. As you correctly say, the rolling window still has the length of "train_period".
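
To make the dates concrete, here is a minimal sketch of the rolling schedule described above. It is plain date arithmetic and not the internal code of "qnbt.backtest_ml". The function name "rolling_train_windows", the end date 2016-01-01, and the assumption that each window covers exactly "train_period" calendar days ending the day before a retrain date (both bounds inclusive) are illustrative choices, so the library may place the boundaries slightly differently.

```python
from datetime import date, timedelta


def rolling_train_windows(start_date, end_date, train_period, retrain_interval):
    """Return (retrain_date, train_start, train_end) for a fixed-size rolling window.

    Assumption: each window covers the `train_period` calendar days that end
    the day before the retrain date, with both bounds inclusive.
    """
    windows = []
    retrain_date = start_date
    while retrain_date <= end_date:
        train_end = retrain_date - timedelta(days=1)
        train_start = retrain_date - timedelta(days=train_period)
        windows.append((retrain_date, train_start, train_end))
        retrain_date += timedelta(days=retrain_interval)
    return windows


# Values from the question: 4*365 days of training, retraining every 5*365 days.
windows = rolling_train_windows(
    start_date=date(2006, 1, 1),
    end_date=date(2016, 1, 1),  # hypothetical end of the backtest
    train_period=4 * 365,
    retrain_interval=5 * 365,
)

for retrain_date, train_start, train_end in windows:
    print(f"retrain on {retrain_date}: train on {train_start} .. {train_end}")
```

Under these assumptions the first window runs from 2002-01-02 to 2005-12-31, so it lies entirely before the first predicted date.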

If by "expanding window" you mean retraining that uses more and more data as time goes on, then no, this is not currently implemented; we use a fixed-size rolling window.
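
For comparison, the sketch below shows what an expanding window would look like in terms of dates only. It is not a feature of "qnbt.backtest_ml" and does not call the library. The function name "expanding_train_windows", the first data date 2002-01-02, and the end date 2016-01-01 are hypothetical values chosen for illustration.

```python
from datetime import date, timedelta


def expanding_train_windows(first_data_date, start_date, end_date, retrain_interval):
    """Return (retrain_date, train_start, train_end) for an expanding window.

    The start of the training data stays fixed at `first_data_date`, while the
    end moves forward to the day before each retrain date.
    """
    windows = []
    retrain_date = start_date
    while retrain_date <= end_date:
        train_end = retrain_date - timedelta(days=1)
        windows.append((retrain_date, first_data_date, train_end))
        retrain_date += timedelta(days=retrain_interval)
    return windows


expanding = expanding_train_windows(
    first_data_date=date(2002, 1, 2),  # hypothetical first day of available data
    start_date=date(2006, 1, 1),
    end_date=date(2016, 1, 1),  # hypothetical end of the backtest
    retrain_interval=5 * 365,
)

# Window length in calendar days (inclusive bounds) grows at every retrain.
lengths = [(end - start).days + 1 for _, start, end in expanding]
print(lengths)
```

With a rolling window every entry of "lengths" would equal "train_period"; here each retraining sees "retrain_interval" more days than the previous one.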