Quantiacs Community
    How to do sliding window of test/train

    Strategy help
    • Penrose-Moore

      Hi,
      I am starting from the simple EMA futures example and trying to get comfortable with the API. I want to move my training window forward incrementally, and also move my test period forward.

      I started by trying to set min_date and max_date in the load_data function, and start_date in the backtest function. This leaves a slightly confusing test_period, which I would prefer to calculate myself by simply specifying the train and test dates; but I notice that the backtest code uses pd.Timestamp.today(), which means the backtester anchors the test interval to today's calendar date.

      Can someone help me parameterize these functions so that I can perform iterative, windowed backtesting? I want to start far in the past to do some pre-training without burning too much data.
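      Roughly, by a sliding window I mean something like this (a plain pandas sketch of my own; the names are mine, not the qnt API):

```python
import pandas as pd

def sliding_windows(first_date, last_date, train_days, test_days, step_days):
    """Yield (train_start, train_end, test_start, test_end) tuples,
    stepping the whole train/test window forward by step_days."""
    train_start = pd.Timestamp(first_date)
    last = pd.Timestamp(last_date)
    while True:
        train_end = train_start + pd.Timedelta(days=train_days)
        test_end = train_end + pd.Timedelta(days=test_days)
        if test_end > last:
            break
        yield train_start, train_end, train_end, test_end
        train_start += pd.Timedelta(days=step_days)

# a year of training, a quarter of testing, stepping a quarter at a time
for tr0, tr1, te0, te1 in sliding_windows("2014-01-01", "2016-01-01",
                                          train_days=365, test_days=90, step_days=90):
    print(tr0.date(), "→", tr1.date(), "|", te0.date(), "→", te1.date())
```

      The idea is to train on each train range and then backtest only on the test range that immediately follows it.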

      • Penrose-Moore

        And to be clearer, sorry: I want the test period to also be in the distant past, not pinned to recent calendar time.

        • support

          Hello.

          @penrose-moore said in How to do sliding window of test/train:

          started by trying to set min_date and max_date in the load_data function, and start_date in the backtest function. This leaves a slightly confusing test_period, which I would prefer to calculate myself by simply specifying the train and test dates; but I notice that the backtest code uses pd.Timestamp.today(), which means the backtester anchors the test interval to today's calendar date.

          I guess you were confused by this code:

          if start_date is None:
              start_date = pd.Timestamp.today().to_datetime64() - np.timedelta64(test_period-1, 'D')
          else:
              start_date = pd.Timestamp(start_date).to_datetime64()
              test_period = (pd.Timestamp.today().to_datetime64() - start_date) / np.timedelta64(1, 'D')
          

          Let me clarify.

          As you can see, when you specify start_date, you don't need to specify test_period.
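          Concretely, when start_date is given, test_period is simply derived as the number of days from start_date to today; mirroring the quoted logic:

```python
import pandas as pd
import numpy as np

# Mirrors the quoted backtester logic: with a start_date supplied,
# test_period becomes the number of days from start_date to today.
start_date = pd.Timestamp("2014-01-01").to_datetime64()
test_period = (pd.Timestamp.today().to_datetime64() - start_date) / np.timedelta64(1, "D")
print(int(test_period))  # days elapsed since 2014-01-01
```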

          The backtester is designed to optimize execution time both on our servers and on the client side. For this purpose, load_data receives a period parameter and performs these calculations.

          On the server side, the evaluator runs the strategy day by day. For every day it runs these steps:

          1. the backtester calls load_data with period=lookback_period
          2. the backtester runs the strategy
          3. the backtester saves the output
          4. the evaluator takes the last day from the output

          On the client side it works in a different way:

          1. the backtester calls load_data with period=(today - start_date) + lookback_period
          2. for every day between start_date and today it does these steps:
            2.1 it calls the window function to cut the data for the iteration
            2.2 it calls the strategy with this data fragment
            2.3 it saves the last day from the output
          3. when all iterations finish, it joins the outputs from 2.3 and calculates statistics.

          This multi-pass backtester with data isolation is designed to detect and prevent look-ahead issues.
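          As a rough sketch, the client-side loop behaves like this (toy pandas code to show the mechanics, not the actual qnt internals):

```python
import pandas as pd

def run_backtest(load_data, strategy, start_date, end_date, lookback_period):
    """Toy sketch of the client-side multi-pass loop: one load_data call,
    then the strategy re-runs on an isolated lookback window per day, and
    only each iteration's last output day is kept (look-ahead prevention)."""
    start, end = pd.Timestamp(start_date), pd.Timestamp(end_date)
    # step 1: a single load_data call covering (end - start) + lookback_period
    data = load_data((end - start).days + lookback_period)
    outputs = []
    for day in pd.date_range(start, end):                # step 2: day by day
        fragment = data.loc[:day].tail(lookback_period)  # 2.1 cut the data
        outputs.append(strategy(fragment).iloc[[-1]])    # 2.2 + 2.3 last day only
    return pd.concat(outputs)                            # step 3: join outputs

# toy data and strategy standing in for qndata / a real strategy
prices = pd.Series(range(100), index=pd.date_range(end="2020-03-10", periods=100))
weights = run_backtest(
    load_data=lambda n: prices.tail(n),
    strategy=lambda frag: frag.rolling(5).mean(),
    start_date="2020-03-01", end_date="2020-03-10", lookback_period=20,
)
print(weights)
```

          Because each iteration sees only its own lookback window, the strategy cannot peek at future data even accidentally.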

          • support

            And yes, this backtester is for testing, not for training.

            I suggest that you pretrain your model, save it to a file, and load it for backtesting.

            Model training is a slow process, and your model could exceed the time limit during evaluation.

            This is an example of how to do it:

            train.ipynb:

            
            # ... train your model here ...
            # ... or pretrain the model on your PC ...

            import pickle, gzip
            pickle.dump(model, gzip.open('model.pickle.gz', 'wb'))
            

            strategy.ipynb:

            import xarray as xr
            
            import qnt.ta as qnta
            import qnt.backtester as qnbt
            import qnt.data as qndata
            
            import pickle, gzip
            
            model = pickle.load(gzip.open('model.pickle.gz', 'rb'))
            
            
            def load_data(period):
                return qndata.cryptofutures.load_data(tail=period, dims=("time","field","asset"))
            
            def strategy(data):
                prediction = model.predict(data)
                ...  # turn the prediction into portfolio weights here
            
            weights = qnbt.backtest(
                competition_type="cryptofutures",
                load_data=load_data,
                lookback_period=365,
                start_date="2014-01-01",
                strategy=strategy,
                analyze=True,
                build_plots=True
            )
            
            • support

              If this example is not relevant for you, could you give me your code example and tell me what you expect?

              I will modify it.

              Notice: This forum is public and all posts are public here.

              Regards.

              • Penrose-Moore

                @support These clarifying details are helpful, and relevant. I started with that minimal example, and the API has parameters that are confusing to me; I only realized that the backtest function itself performs a loop after going into the code.

                I don't have any futures data of my own anymore, so I will just focus on porting some models to this framework locally using your data and see if anything good happens. I have worked at a CTA, but on daily data the out-of-sample Sharpe ratios over a 10- or 15-year period are lower than 1.0 on average. I have been working more with equities, but it is all an uphill battle with one person and limited time and compute. Models that look good in the short term were never the ones that were best long term, and this is a bit perverse, because a greedy strategy of model selection favors the short-term winners. I am not sure I can come up with futures models that place in the competition and are good to trade long term. I might be able to come up with models that are good in an ensemble setting; maybe your scoring knows how to judge models on their marginal contribution in a leave-one-out kind of way? Or maybe you could run gradient boosting on all the user models?

                Thanks, I might have more questions soon.

                • support

                  @penrose-moore Hello, you are correct: competition winners will be systems which do well over a timespan of some months, while there can be very good systems with a moderate Sharpe ratio over 4 months but good performance over a long timespan. Winning the contest is one way of getting allocations, but we are interested in the long-term performance of all submitted systems. Currently, systems which did not win contests but are doing well in the long term after submission are being traded, and the quants who developed them are getting a fee.

                  • support

                    @penrose-moore We finally released a template which allows you to perform retraining. It is available in the "Examples" section of your private space, or publicly in the Documentation:

                    https://quantiacs.com/documentation/en/examples/machine_learning_with_a_voting_classifier.html

                    Copyright © 2014 - 2021 Quantiacs LLC.