    Question about the Q17 Machine Learning Example Algo

    • cespadilla
      cespadilla last edited by

      Hi guys,
      I was just checking out the Q17 Machine Learning Algo (With Retraining). I don't know if it's just me, but I find the following strange:

      • The initial algorithm (which has look-ahead bias and so on) uses 54 different instruments during the backtest. As far as I can see, there is no "is liquid" filter anywhere, since this is just for educational purposes.

      • When the ML algorithm is passed through the backtester, it only trades 8 instruments in the same timeframe. What gives? Is there some parameter that is tuned when using the backtester instead of the whole data? Is this an error? I'd love to keep testing and exploring ML algorithms, but I think that the total number of traded instruments over 8 years should be more than 8, right?

      Please let me know what I can change (the code, the data, the competition type, or other backtester parameters), or whether this is by design.

      Full data "test":
      54 instruments

      Backtester:
      8 instruments

      • support
        support @cespadilla last edited by

        @cespadilla Hi, sorry for the late answer. We are checking and will let you know soon.

        • Vyacheslav_B
          Vyacheslav_B @cespadilla last edited by

          @cespadilla Hello.

          The reason is in the "train_model" function.

          import logging
          import xarray as xr

          # get_features, get_target_classes and get_model are defined elsewhere in the example notebook.
          def train_model(data):
              """Train one model per asset and return them as {asset_name: model}."""
              asset_name_all = data.coords['asset'].values
              features_all = get_features(data)
              target_all = get_target_classes(data)

              models = dict()

              for asset_name in asset_name_all:
                  # drop missing values:
                  target_cur = target_all.sel(asset=asset_name).dropna('time', 'any')
                  features_cur = features_all.sel(asset=asset_name).dropna('time', 'any')

                  # keep only the time points present in both target and features
                  target_for_learn_df, feature_for_learn_df = xr.align(target_cur, features_cur, join='inner')

                  # skip assets with too little history to train on
                  if len(features_cur.time) < 10:
                      continue

                  model = get_model()
                  try:
                      model.fit(feature_for_learn_df.values, target_for_learn_df)
                      models[asset_name] = model
                  except Exception:
                      logging.exception('model training failed')

              return models
          

          If an asset has fewer than 10 time points of feature data left after dropping missing values, no model is created for it (if len(features_cur.time) < 10).

          This condition makes sense. I would not remove it.
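          To see which assets this check drops, a small diagnostic can help. The sketch below is a simplified, pure-Python stand-in for the xarray logic above: it assumes the features are given as a plain dict of rows per asset (a hypothetical layout, as are the helper names and the toy data), and it counts only rows without missing values, which is what dropna('time', 'any') keeps.

```python
import math

MIN_SAMPLES = 10  # same threshold as the check in train_model


def count_valid_samples(rows):
    """Count rows in which no feature value is missing (None or NaN)."""
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))
    return sum(1 for row in rows if not any(is_missing(v) for v in row))


def split_by_threshold(features_by_asset, min_samples=MIN_SAMPLES):
    """Return (trainable, skipped) asset names based on valid sample counts."""
    trainable, skipped = [], []
    for asset, rows in features_by_asset.items():
        if count_valid_samples(rows) < min_samples:
            skipped.append(asset)
        else:
            trainable.append(asset)
    return trainable, skipped


# Hypothetical toy data: one feature per day, NaN where the asset has no history yet.
nan = float("nan")
features_by_asset = {
    "BTC": [[float(i)] for i in range(30)],    # 30 valid days
    "NEWCOIN": [[nan]] * 25 + [[1.0]] * 5,     # only 5 valid days
}
trainable, skipped = split_by_threshold(features_by_asset)
print("trainable:", trainable)
print("skipped:", skipped)
```

          In the real notebook, the same count per asset would be len(features_all.sel(asset=asset_name).dropna('time', 'any').time), so logging that number inside the loop should show which assets never get a model.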

          The second thing that can affect the result is the retraining interval of the model ("retrain_interval").

          
          import qnt.backtester as qnbt

          weights = qnbt.backtest_ml(
              train=train_model,
              predict=predict_weights,
              train_period=2 * 365,  # the data length for training in calendar days
              retrain_interval=10 * 365,  # how often we have to retrain models (calendar days)
              retrain_interval_after_submit=1,  # how often retrain models after submission during evaluation (calendar days)
              predict_each_day=False,  # Is it necessary to call prediction for every day during backtesting?
              # Set it to true if you suspect that get_features is looking forward.
              competition_type='crypto_daily_long_short',  # competition type
              lookback_period=365,  # how many calendar days are needed by the predict function to generate the output
              start_date='2014-01-01',  # backtest start date
              analyze=True,
              build_plots=True  # do you need the chart?
          )
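
          To get a feel for why the retraining interval matters, here is a rough sketch of the training schedule these parameters imply. It assumes the first training happens at start_date and is repeated every retrain_interval calendar days, each time on the preceding train_period days; the actual scheduling inside qnbt.backtest_ml is not shown in this thread and may differ. The function name and the end date are hypothetical.

```python
from datetime import date, timedelta


def training_schedule(start_date, end_date, train_period, retrain_interval):
    """Yield (train_from, train_to) windows, one per (re)training date.

    Assumption: the first training happens at start_date and the next ones
    every retrain_interval calendar days after it.
    """
    retrain_at = start_date
    while retrain_at <= end_date:
        yield retrain_at - timedelta(days=train_period), retrain_at
        retrain_at += timedelta(days=retrain_interval)


start = date(2014, 1, 1)
end = date(2021, 12, 31)  # hypothetical backtest end date

rare = list(training_schedule(start, end, train_period=2 * 365, retrain_interval=10 * 365))
yearly = list(training_schedule(start, end, train_period=2 * 365, retrain_interval=365))

print(len(rare), "training run(s) with retrain_interval=10*365:", rare)
print(len(yearly), "training run(s) with retrain_interval=365")
```

          Under that assumption, retrain_interval=10 * 365 trains the models only once, on data from before 2014-01-01. Any asset with fewer than 10 usable time points in that window would never get a model, which may be why so few instruments are traded. A shorter retrain_interval would let assets that appear later be picked up at a later retraining.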
          
            Copyright © 2014 - 2021 Quantiacs LLC.