Question about the Q17 Machine Learning Example Algo
-
Hi guys,
I was just checking out the Q17 Machine Learning Algo (With Retraining). I don't know if it's just me, but I find the following strange:
The initial algorithm (which has look-ahead bias and so on) uses 54 different instruments during the backtest. As far as I can see, there is no "is liquid" filter anywhere, since this is just for educational purposes.
-
When the ML algorithm is passed through the backtester, it only trades 8 instruments over the same timeframe. What gives? Is there some parameter that is tuned when using the backtester instead of the whole data? Is this an error? I'd love to keep testing and exploring ML algorithms, but I think that the total number of traded instruments over 8 years should be more than 8, right?
Please let me know what I can change (the code, the data, the competition type, or other backtester parameters), or whether this is by design.
Full data "test":
Backtester:
-
-
@cespadilla Hi, sorry for the late answer. We are checking and will let you know soon.
-
@cespadilla Hello.
The reason is in the "train_model" function.
    def train_model(data):
        asset_name_all = data.coords['asset'].values
        features_all = get_features(data)
        target_all = get_target_classes(data)

        models = dict()

        for asset_name in asset_name_all:
            # drop missing values:
            target_cur = target_all.sel(asset=asset_name).dropna('time', 'any')
            features_cur = features_all.sel(asset=asset_name).dropna('time', 'any')

            target_for_learn_df, feature_for_learn_df = xr.align(target_cur, features_cur, join='inner')

            if len(features_cur.time) < 10:
                continue

            model = get_model()

            try:
                model.fit(feature_for_learn_df.values, target_for_learn_df)
                models[asset_name] = model
            except:
                logging.exception('model training failed')

        return models
If an asset has fewer than 10 time samples of features left after dropping missing values, no model is created for it (if len(features_cur.time) < 10), so that asset is never traded.
This condition makes sense. I would not remove it.
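To see which assets that check would skip, one can count the non-missing samples per asset in the training window. Below is a minimal sketch in plain Python with made-up data. "count_valid" and "trainable_assets" are hypothetical helper names, not part of the Quantiacs toolbox, and the real template works on xarray objects rather than dicts of lists.

```python
import math

MIN_SAMPLES = 10  # same threshold as the check in train_model


def count_valid(values):
    """Count entries that are not NaN (days with usable data)."""
    return sum(1 for v in values if not math.isnan(v))


def trainable_assets(series_by_asset, min_samples=MIN_SAMPLES):
    """Return the assets that have at least `min_samples` non-missing values."""
    return [
        asset
        for asset, values in series_by_asset.items()
        if count_valid(values) >= min_samples
    ]


# Made-up 30-day training window for three hypothetical assets.
nan = math.nan
window = {
    "BTC": [1.0] * 30,              # has data for the whole window
    "ETH": [nan] * 25 + [1.0] * 5,  # has data only for the last 5 days
    "SOL": [nan] * 30,              # no data in this window
}

print(trainable_assets(window))
```

Running the same kind of count on the real features for the training window should show which instruments get a model and which are skipped.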
The second thing that can affect the number of traded instruments is the retraining interval of the model ("retrain_interval").
    weights = qnbt.backtest_ml(
        train=train_model,
        predict=predict_weights,
        train_period=2 * 365,  # the data length for training in calendar days
        retrain_interval=10 * 365,  # how often we have to retrain models (calendar days)
        retrain_interval_after_submit=1,  # how often retrain models after submission during evaluation (calendar days)
        predict_each_day=False,  # Is it necessary to call prediction for every day during backtesting?
                                 # Set it to true if you suspect that get_features is looking forward.
        competition_type='crypto_daily_long_short',  # competition type
        lookback_period=365,  # how many calendar days are needed by the predict function to generate the output
        start_date='2014-01-01',  # backtest start date
        analyze=True,
        build_plots=True  # do you need the chart?
    )
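A rough way to see why the interval matters is to list the dates on which training would happen. The sketch below assumes that the backtester trains on the start date and then again every "retrain_interval" calendar days. That is an assumption about how qnbt.backtest_ml schedules retraining, not something confirmed here. "retrain_dates" is a hypothetical helper, and the end date is made up.

```python
from datetime import date, timedelta


def retrain_dates(start, end, retrain_interval_days):
    """List the dates on which a model would be (re)trained.

    Assumption: training happens on `start` and then every
    `retrain_interval_days` calendar days until `end`.
    """
    dates = []
    current = start
    while current <= end:
        dates.append(current)
        current += timedelta(days=retrain_interval_days)
    return dates


start = date(2014, 1, 1)  # start_date from the call above
end = date(2022, 1, 1)    # hypothetical end of the backtest

once = retrain_dates(start, end, 10 * 365)  # retrain_interval from the call above
yearly = retrain_dates(start, end, 365)     # hypothetical alternative

print(len(once))    # number of trainings with retrain_interval=10 * 365
print(len(yearly))  # number of trainings with retrain_interval=365
```

Under that assumption, a 10-year interval means the models are fit only once over an 8-year backtest, so an asset with fewer than 10 samples in that single training window would never get a model. Trying a shorter "retrain_interval" is one way to check whether more instruments start trading.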