Q18 Machine Learning on a Rolling Basis

This example shows how to make a submission to the stock contest using machine learning and retraining.

You can clone and edit this example there (tab Examples).


In this example we predict whether the price will rise or fall using supervised learning (Bayesian Ridge Regression). This template is a starting point for developing a system that can take part in the Q18 NASDAQ-100 Stock Long-Short contest.

It consists of two parts.

  • In the first part we train once on the full time series. We disregard the sequential nature of the data, so the model that predicts a given day has also been fitted on data from later days.

  • In the second part we use the built-in backtester and perform training and prediction on a rolling basis in order to avoid forward looking. Please note that we are using a specialized version of the Quantiacs backtester which dramatically speeds up the backtesting process by retraining your model only at regular intervals.
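The rolling scheme of the second part can be pictured as a sequence of train and predict windows. The following is a minimal, library-free sketch and not the backtester's actual implementation; the function name rolling_schedule, the dates and the window lengths are assumptions chosen only for illustration.

```python
from datetime import date, timedelta

def rolling_schedule(start, end, train_days, retrain_days):
    """Yield (train_start, train_end, predict_end) triples.

    Each model is trained on [train_start, train_end) and then used for
    predictions on [train_end, predict_end), so training never sees
    data from the period it predicts.
    """
    train_end = start + timedelta(days=train_days)
    while train_end < end:
        predict_end = min(train_end + timedelta(days=retrain_days), end)
        yield train_end - timedelta(days=train_days), train_end, predict_end
        train_end = predict_end

# hypothetical dates and window lengths, for illustration only
schedule = list(rolling_schedule(date(2020, 1, 1), date(2021, 1, 1),
                                 train_days=180, retrain_days=90))
for train_start, train_end, predict_end in schedule:
    print(f"train {train_start}..{train_end}  ->  predict {train_end}..{predict_end}")
```

Every predict window starts where its training window ends, which is the property that the global training of the first part does not have.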

Features for learning: we will use several technical indicators trying to capture different features. You can have a look at Technical Indicators.

Please note that:

  • Your trading algorithm can open short and long positions.

  • At each point in time your algorithm can trade all or a subset of the stocks which at that point of time are or were part of the NASDAQ-100 stock index. Note that the composition of this set changes in time, and Quantiacs provides you with an appropriate filter function for selecting them.

  • The Sharpe ratio of your system since January 1st, 2006, has to be larger than 1.

  • Your system cannot be a copy of the current examples. We run a correlation filter on the submissions and detect duplicates.

  • For simplicity, the first part of this example uses only two assets (NAS:AAPL and NAS:AMZN). It pays off to use more assets, ideally uncorrelated, and to diversify your positions for a more stable Sharpe ratio.
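As a minimal sketch of the first note above (long and short positions), per-asset scores can be turned into signed weights. The helper to_long_short, the neutral level of 0.5 and the example scores are assumptions made for illustration; the notebook itself uses the raw model outputs as weights.

```python
def to_long_short(predictions, neutral=0.5):
    """Map per-asset scores to weights whose absolute values sum to 1.

    Scores above `neutral` become long (positive) weights, scores below
    become short (negative) weights. `neutral=0.5` is an assumption that
    fits a target encoded as 1 (up) / 0 (down).
    """
    centered = {asset: score - neutral for asset, score in predictions.items()}
    gross = sum(abs(w) for w in centered.values())
    if gross == 0:
        return {asset: 0.0 for asset in centered}
    return {asset: w / gross for asset, w in centered.items()}

# hypothetical scores for one day
weights = to_long_short({"NAS:AAPL": 0.56, "NAS:AMZN": 0.47, "NAS:ADBE": 0.50})
print(weights)
```

Whether such a centering helps the Sharpe ratio depends on the model; it only illustrates how a short position can arise from a score.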

More details on the rules can be found here.

Need help? Check the Documentation and find solutions/report problems in the Forum section.

More help with Jupyter? Check the official Jupyter page.

Once you are done, click on Submit to the contest and take part in our competitions.

API reference:

  • data: check how to work with data;

  • backtesting: read how to run the simulation and check the results.

Need to use the optimizer function to automate tedious tasks?

  • optimization: read more in our article.
In [1]:
%%javascript
IPython.OutputArea.prototype._should_scroll = function(lines) { return false; }
// disable widget scrolling
In [2]:
import logging

import xarray as xr  # xarray for data manipulation

import qnt.data as qndata     # functions for loading data
import qnt.backtester as qnbt # built-in backtester
import qnt.ta as qnta         # technical analysis library
import qnt.stats as qnstats   # statistical functions

import pandas as pd
import numpy as np

import matplotlib.pyplot as plt

np.seterr(divide = "ignore")

from qnt.ta.macd import macd
from qnt.ta.rsi  import rsi
from qnt.ta.stochastic import stochastic_k, stochastic, slow_stochastic

from sklearn import linear_model
from sklearn.metrics import r2_score
from sklearn.metrics import explained_variance_score
from sklearn.metrics import mean_absolute_error
In [3]:
# loading nasdaq-100 stock data

stock_data = qndata.stocks.load_ndx_data(tail = 365 * 5, assets = ["NAS:AAPL", "NAS:AMZN"])
100% (367973 of 367973) |################| Elapsed Time: 0:00:00 Time:  0:00:00
100% (39443 of 39443) |##################| Elapsed Time: 0:00:00 Time:  0:00:00
100% (186804 of 186804) |################| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 1/1 0s
Data loaded 0s
In [4]:
def get_features(data):
    """Builds the features used for learning:
       * a trend indicator;
       * the moving average convergence divergence;
       * a volatility measure; 
       * the stochastic oscillator;
       * the relative strength index;
       * the logarithm of the closing price.
       These features can be modified and new ones can be added easily.
    """
   
    # trend:
    trend = qnta.roc(qnta.lwma(data.sel(field="close"), 60), 1)
     
    # moving average convergence  divergence (MACD):
    macd = qnta.macd(data.sel(field="close"))
    macd2_line, macd2_signal, macd2_hist = qnta.macd(data, 12, 26, 9)

    # volatility:
    volatility = qnta.tr(data.sel(field="high"), data.sel(field="low"), data.sel(field="close"))
    volatility = volatility / data.sel(field="close")
    volatility = qnta.lwma(volatility, 14)

    # the stochastic oscillator:
    k, d = qnta.stochastic(data.sel(field="high"), data.sel(field="low"), data.sel(field="close"), 14)
    
    # the relative strength index: 
    rsi = qnta.rsi(data.sel(field="close"))
    
    # the logarithm of the closing price:
    price = data.sel(field="close").ffill("time").bfill("time").fillna(0) # fill NaN
    price = np.log(price)
    
    # combine the six features:
    result = xr.concat(
        [trend, macd2_signal.sel(field="close"), volatility,  d, rsi, price],
        pd.Index(
            ["trend",  "macd", "volatility", "stochastic_d", "rsi", "price"],
            name = "field"
        )
    )

    return result.transpose("time", "field", "asset")
In [5]:
# displaying the features:
my_features = get_features(stock_data)
display(my_features.sel(field="trend").to_pandas())
asset NAS:AAPL NAS:AMZN
time
2019-04-26 NaN NaN
2019-04-29 NaN NaN
2019-04-30 NaN NaN
2019-05-01 NaN NaN
2019-05-02 NaN NaN
... ... ...
2024-04-18 -0.213908 0.092568
2024-04-19 -0.243996 0.001076
2024-04-22 -0.219416 0.043393
2024-04-23 -0.190867 0.079823
2024-04-24 -0.142910 0.019401

1258 rows × 2 columns
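The leading NaN rows in the table above come from the 60-day window of the trend feature. The following library-free sketch shows the idea behind a linearly weighted moving average followed by a one-step rate of change. The weighting scheme (weights 1 to period, newest value weighted most) and the percent scaling are assumptions about how such indicators are commonly defined, and are not taken from the qnt.ta source.

```python
def lwma(values, period):
    """Linearly weighted moving average: the newest value gets weight
    `period`, the oldest gets weight 1. Returns None until enough data."""
    weights = range(1, period + 1)
    norm = sum(weights)
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)
        else:
            window = values[i + 1 - period : i + 1]
            out.append(sum(w * v for w, v in zip(weights, window)) / norm)
    return out

def roc(values, period=1):
    """Rate of change in percent versus the value `period` steps earlier."""
    out = []
    for i, v in enumerate(values):
        prev = values[i - period] if i >= period else None
        out.append(None if v is None or prev is None else (v / prev - 1) * 100)
    return out

closes = [10.0, 11.0, 12.0, 13.0, 14.0]   # made-up prices
smoothed = lwma(closes, 3)
trend = roc(smoothed, 1)
print(smoothed)
print(trend)
```

With a window of 3 the first two smoothed values and the first three trend values are missing, just as the first rows are missing for a window of 60.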

In [6]:
def get_target_classes(data):
    """ Target classes for predicting if price goes up or down."""
    
    price_current = data.sel(field="close")
    price_future  = qnta.shift(price_current, -1)

    class_positive = 1 # price goes up
    class_negative = 0 # price goes down

    target_price_up = xr.where(price_future > price_current, class_positive, class_negative)

    return target_price_up
In [7]:
# displaying the target classes:
my_targetclass = get_target_classes(stock_data)
display(my_targetclass.to_pandas())
asset NAS:AAPL NAS:AMZN
time
2019-04-26 1 0
2019-04-29 0 0
2019-04-30 1 0
2019-05-01 0 0
2019-05-02 1 1
... ... ...
2024-04-18 0 0
2024-04-19 1 1
2024-04-22 1 1
2024-04-23 1 0
2024-04-24 0 0

1258 rows × 2 columns
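The construction of the target can be reproduced on a plain list. This is a sketch with made-up prices and a hypothetical helper target_classes. Unlike the notebook code, which appears to assign class 0 to the last day because a comparison with a missing future price is false, the sketch marks the last day as None so that it can be excluded explicitly.

```python
def target_classes(closes):
    """1 if the next close is higher than the current one, else 0.

    The last element has no next close; it is returned as None here so
    that it can be excluded from training explicitly.
    """
    labels = [1 if nxt > cur else 0 for cur, nxt in zip(closes, closes[1:])]
    return labels + [None]

closes = [100.0, 101.5, 101.0, 101.0, 103.2]   # made-up prices
labels = target_classes(closes)
print(labels)   # [1, 0, 0, 1, None]
```

Note that an unchanged price counts as class 0, exactly as in the strict comparison used in get_target_classes.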

In [8]:
def get_model():
    """This is a constructor for the ML model (Bayesian Ridge) which can be easily 
       modified for using different models.
    """
    
    model = linear_model.BayesianRidge()
    return model
In [9]:
# Create and train the models working on an asset-by-asset basis.

asset_name_all = stock_data.coords["asset"].values

models = dict()

for asset_name in asset_name_all:

    # drop missing values:
    target_cur   = my_targetclass.sel(asset=asset_name).dropna("time", "any")
    features_cur = my_features.sel(asset=asset_name).dropna("time", "any")

    # align features and targets:
    target_for_learn_df, feature_for_learn_df = xr.align(target_cur, features_cur, join="inner")

    if len(features_cur.time) < 10:
        # not enough points for training
        continue

    model = get_model()

    try:
        model.fit(feature_for_learn_df.values, target_for_learn_df)
        models[asset_name] = model

    except:
        logging.exception("model training failed")
            
print(models)
{'NAS:AAPL': BayesianRidge(), 'NAS:AMZN': BayesianRidge()}
In [10]:
# Showing which features are more important in predicting:

importance = models["NAS:AAPL"].coef_
importance

for i,v in enumerate(importance):
    print('Feature: %0d, Score: %.5f' % (i,v))
    
plt.bar([x for x in range(len(importance))], importance)
plt.show()
Feature: 0, Score: -0.00002
Feature: 1, Score: -0.00028
Feature: 2, Score: -0.00000
Feature: 3, Score: 0.00010
Feature: 4, Score: -0.00034
Feature: 5, Score: -0.00011
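The coefficients printed above belong to features measured in very different units (for example, an RSI lies between 0 and 100, while the normalized volatility is a small fraction), so their raw magnitudes are not directly comparable. The sketch below illustrates this with a one-feature ordinary least-squares fit, which is used here in place of Bayesian Ridge only to keep the example library-free; all data and helper names are made up.

```python
def ols_slope(x, y):
    """Slope of a one-feature least-squares fit: cov(x, y) / var(x)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def standardize(x):
    """Rescale to zero mean and unit (population) standard deviation."""
    m = sum(x) / len(x)
    s = (sum((a - m) ** 2 for a in x) / len(x)) ** 0.5
    return [(a - m) / s for a in x]

y = [0.0, 1.0, 1.0, 0.0, 1.0, 1.0]        # made-up targets
x_small = [0.1, 0.4, 0.5, 0.2, 0.6, 0.7]  # made-up feature
x_large = [v * 1000 for v in x_small]     # same information, other units

# raw slopes differ by the unit factor of 1000
print(ols_slope(x_small, y), ols_slope(x_large, y))
# after standardization both versions give the same slope
print(ols_slope(standardize(x_small), y), ols_slope(standardize(x_large), y))
```

Standardizing the features before fitting is one possible way to make a coefficient plot easier to read as a ranking.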
In [11]:
# Performs prediction and generates output weights:

asset_name_all = stock_data.coords["asset"].values
weights = xr.zeros_like(stock_data.sel(field="close"))
    
for asset_name in asset_name_all:
    if asset_name in models:
        model = models[asset_name]
        features_all = my_features
        features_cur = features_all.sel(asset=asset_name).dropna("time", "any")
        if len(features_cur.time) < 1:
            continue
        try:
            weights.loc[dict(asset=asset_name, time=features_cur.time.values)] = model.predict(features_cur.values)
        except KeyboardInterrupt as e:
            raise e
        except:
            logging.exception("model prediction failed")
            
print(weights)
<xarray.DataArray 'stocks_nasdaq100' (time: 1258, asset: 2)>
array([[0.        , 0.        ],
       [0.        , 0.        ],
       [0.        , 0.        ],
       ...,
       [0.52352907, 0.54051222],
       [0.52321886, 0.53408119],
       [0.52254874, 0.53292474]])
Coordinates:
  * asset    (asset) <U8 'NAS:AAPL' 'NAS:AMZN'
  * time     (time) datetime64[ns] 2019-04-26 2019-04-29 ... 2024-04-24
    field    <U5 'close'
In [12]:
def get_sharpe(stock_data, weights):
    """Calculates the Sharpe ratio"""
    rr = qnstats.calc_relative_return(stock_data, weights)
    sharpe = qnstats.calc_sharpe_ratio_annualized(rr).values[-1]
    return sharpe

sharpe = get_sharpe(stock_data, weights)
sharpe
Out[12]:
0.6887842673556639

The Sharpe ratio obtained with the method above benefits from forward looking: predictions for, say, 2020 already know about the relation between features and targets in 2023. Let us visualize the results:

In [13]:
import qnt.graph as qngraph

statistics = qnstats.calc_stat(stock_data, weights)

display(statistics.to_pandas().tail())

performance = statistics.to_pandas()["equity"]
qngraph.make_plot_filled(performance.index, performance, name="PnL (Equity)", type="log")

display(statistics[-1:].sel(field = ["sharpe_ratio"]).transpose().to_pandas())

# check for correlations with existing strategies:
qnstats.print_correlation(weights,stock_data)
field equity relative_return volatility underwater max_drawdown sharpe_ratio mean_return bias instruments avg_turnover avg_holding_time
time
2024-04-18 2.517230 -0.008553 0.294664 -0.048797 -0.414514 0.691766 0.203839 1.0 2.0 0.018622 299.009961
2024-04-19 2.469567 -0.018935 0.294681 -0.066808 -0.414514 0.675474 0.199049 1.0 2.0 0.018616 299.301347
2024-04-22 2.494129 0.009946 0.294591 -0.057526 -0.414514 0.683180 0.201259 1.0 2.0 0.018615 299.552648
2024-04-23 2.518522 0.009780 0.294501 -0.048309 -0.414514 0.690760 0.203429 1.0 2.0 0.018606 299.558616
2024-04-24 2.513475 -0.002004 0.294387 -0.050216 -0.414514 0.688784 0.202769 1.0 2.0 0.018600 342.260322
time 2024-04-24
field
sharpe_ratio 0.688784
WARNING! Can't calculate correlation.
ERROR:root:Correlation check failed.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/qnt/stats.py", line 823, in check_correlation
    cr_list = calc_correlation(rr, False)
  File "/usr/local/lib/python3.7/site-packages/qnt/stats.py", line 923, in calc_correlation
    raise e
  File "/usr/local/lib/python3.7/site-packages/qnt/stats.py", line 886, in calc_correlation
    with request.urlopen(ENGINE_CORRELATION_URL + "?participantId=" + PARTICIPANT_ID) as response:
  File "/usr/local/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/local/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/local/lib/python3.7/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/local/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python3.7/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: 
In [14]:
"""R2 (coefficient of determination) regression score function."""
r2_score(my_targetclass, weights, multioutput="variance_weighted")
Out[14]:
-0.04681516905457273
In [15]:
"""The explained variance score explains the dispersion of errors of a given dataset"""
explained_variance_score(my_targetclass, weights, multioutput="uniform_average")
Out[15]:
-0.04452586396549041
In [16]:
"""The mean absolute error is the average absolute difference between targets and predictions"""
mean_absolute_error(my_targetclass, weights)
Out[16]:
0.4989112506979233
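The three scores above treat the problem as a regression. Since the target is a binary class, a complementary view is the share of days on which the thresholded output points in the right direction. The helper hit_rate, the threshold of 0.5 and the data below are assumptions for illustration and are not part of the notebook.

```python
def hit_rate(targets, scores, threshold=0.5):
    """Share of days on which the thresholded score matches the class."""
    pairs = list(zip(targets, scores))
    hits = sum(1 for t, s in pairs if (s > threshold) == (t == 1))
    return hits / len(pairs)

targets = [1, 0, 1, 1, 0, 0]                      # made-up classes
scores  = [0.53, 0.52, 0.49, 0.55, 0.48, 0.51]    # made-up model outputs
print(hit_rate(targets, scores))   # 0.5
```

A hit rate close to 0.5 on balanced classes would indicate that the outputs carry little directional information, which is consistent with scores that stay near 0.5.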

Let us now use the Quantiacs backtester to avoid forward looking.

The backtester performs some transformations: it trains the model on one slice of data (using only data from the past) and predicts the weights for the following slice on a rolling basis:

In [17]:
def train_model(data):
    """Create and train the model working on an asset-by-asset basis."""
    
    asset_name_all = data.coords["asset"].values
    features_all   = get_features(data)
    target_all     = get_target_classes(data)

    models = dict()

    for asset_name in asset_name_all:

        # drop missing values:
        target_cur   = target_all.sel(asset=asset_name).dropna("time", "any")
        features_cur = features_all.sel(asset=asset_name).dropna("time", "any")
        
        target_for_learn_df, feature_for_learn_df = xr.align(target_cur, features_cur, join="inner")
        
        if len(features_cur.time) < 10:
            continue
                
        model = get_model()
        
        try:
            model.fit(feature_for_learn_df.values, target_for_learn_df)
            models[asset_name] = model
                
        except:
            logging.exception("model training failed")

    return models
In [18]:
def predict_weights(models, data):
    """The model predicts if the price is going up or down.
       The prediction is performed for several days in order to speed up the evaluation."""
    
    asset_name_all = data.coords["asset"].values
    weights = xr.zeros_like(data.sel(field="close"))
    
    for asset_name in asset_name_all:
        if asset_name in models:
            model = models[asset_name]
            features_all = get_features(data)
            features_cur = features_all.sel(asset=asset_name).dropna("time", "any")

            if len(features_cur.time) < 1:
                continue

            try:
                weights.loc[dict(asset=asset_name, time=features_cur.time.values)] = model.predict(features_cur.values)

            except KeyboardInterrupt as e:
                raise e
            
            except:
                logging.exception("model prediction failed")                

    return weights
In [19]:
# Calculate weights using the backtester:
weights = qnbt.backtest_ml(
    train                         = train_model,
    predict                       = predict_weights,
    train_period                  =  2 *365,  # the data length for training in calendar days
    retrain_interval              = 10 *365,  # how often we have to retrain models (calendar days)
    retrain_interval_after_submit = 1,        # how often retrain models after submission during evaluation (calendar days)
    predict_each_day              = False,    # Is it necessary to call prediction for every day during backtesting?
                                              # Set it to True if you suspect that get_features is looking forward.
    competition_type              = "stocks_nasdaq100",  # competition type
    lookback_period               = 365,                 # how many calendar days are needed by the predict function to generate the output
    start_date                    = "2005-01-01",        # backtest start date
    analyze                       = True,
    build_plots                   = True  # do you need the chart?
)
Run the last iteration...
100% (39443 of 39443) |##################| Elapsed Time: 0:00:00 Time:  0:00:00
100% (9023624 of 9023624) |##############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 1/1 3s
Data loaded 3s
100% (756972 of 756972) |################| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 1/1 2s
Data loaded 3s
Output cleaning...
fix uniq
ffill if the current price is None...
Check liquidity...
WARNING! Strategy trades non-liquid assets.
Fix liquidity...
Ok.
Check missed dates...
Ok.
Normalization...
Output cleaning is complete.
Write output: /root/fractions.nc.gz
State saved.
---
Run First Iteration...
100% (39443 of 39443) |##################| Elapsed Time: 0:00:00 Time:  0:00:00
100% (9041556 of 9041556) |##############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 1/1 3s
Data loaded 3s
---
Run all iterations...
Load data...
100% (39443 of 39443) |##################| Elapsed Time: 0:00:00 Time:  0:00:00
100% (14585384 of 14585384) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 1/7 1s
100% (14587980 of 14587980) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 2/7 2s
100% (14587980 of 14587980) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 3/7 3s
100% (14585360 of 14585360) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 4/7 4s
100% (14585288 of 14585288) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 5/7 4s
100% (14585288 of 14585288) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 6/7 5s
100% (13371836 of 13371836) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 7/7 6s
Data loaded 7s
100% (14717216 of 14717216) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 1/6 1s
100% (14720244 of 14720244) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 2/6 2s
100% (14717184 of 14717184) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 3/6 3s
100% (14717100 of 14717100) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 4/6 4s
100% (14717100 of 14717100) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 5/6 5s
100% (13667388 of 13667388) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 6/6 6s
Data loaded 6s
Backtest...
100% (39443 of 39443) |##################| Elapsed Time: 0:00:00 Time:  0:00:00
100% (14838336 of 14838336) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 1/6 1s
100% (14841364 of 14841364) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 2/6 2s
100% (14838304 of 14838304) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 3/6 3s
100% (14838220 of 14838220) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 4/6 4s
100% (14838220 of 14838220) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 5/6 5s
100% (13779868 of 13779868) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 6/6 6s
Data loaded 6s
Output cleaning...
fix uniq
ffill if the current price is None...
Check liquidity...
WARNING! Strategy trades non-liquid assets.
Fix liquidity...
Ok.
Check missed dates...
Ok.
Normalization...
Output cleaning is complete.
Write output: /root/fractions.nc.gz
State saved.
---
Analyze results...
Check...
Check liquidity...
Ok.
Check missed dates...
Ok.
Check the sharpe ratio...
Period: 2006-01-01 - 2024-04-24
Sharpe Ratio = 0.49088000820772354
ERROR! The Sharpe Ratio is too low. 0.49088000820772354 < 1
Improve the strategy and make sure that the in-sample Sharpe Ratio more than 1.
Check correlation.
WARNING! Can't calculate correlation.
Correlation check failed.
---
Align...
Calc global stats...
---
Calc stats per asset...
Build plots...
---
Output:
asset NAS:AAL NAS:AAPL NAS:ABNB NAS:ADBE NAS:ADI NAS:ADP NAS:ADSK NAS:AEP NAS:AKAM NAS:ALGN
time
2024-04-11 0.0 0.006786 0.0 0.008203 0.006704 0.007646 0.006652 0.006762 0.0 0.0
2024-04-12 0.0 0.006649 0.0 0.008287 0.006779 0.007278 0.006618 0.006722 0.0 0.0
2024-04-15 0.0 0.006582 0.0 0.008185 0.006773 0.007110 0.006607 0.006702 0.0 0.0
2024-04-16 0.0 0.006669 0.0 0.007704 0.006734 0.007077 0.006609 0.006712 0.0 0.0
2024-04-17 0.0 0.006730 0.0 0.007773 0.006742 0.007155 0.006613 0.006655 0.0 0.0
2024-04-18 0.0 0.006769 0.0 0.007889 0.006767 0.007131 0.006605 0.006621 0.0 0.0
2024-04-19 0.0 0.006781 0.0 0.008078 0.006837 0.006879 0.006550 0.006589 0.0 0.0
2024-04-22 0.0 0.006807 0.0 0.007920 0.006793 0.006688 0.006577 0.006614 0.0 0.0
2024-04-23 0.0 0.006816 0.0 0.007564 0.006735 0.006941 0.006595 0.006636 0.0 0.0
2024-04-24 0.0 0.006803 0.0 0.007654 0.006631 0.007186 0.006632 0.006655 0.0 0.0
Stats:
field equity relative_return volatility underwater max_drawdown sharpe_ratio mean_return bias instruments avg_turnover avg_holding_time
time
2024-04-11 5.437522 0.004081 0.182414 -0.010877 -0.560319 0.503962 0.091930 1.0 214.0 0.021807 134.416262
2024-04-12 5.386552 -0.009374 0.182409 -0.020149 -0.560319 0.500941 0.091376 1.0 214.0 0.021805 134.414901
2024-04-15 5.351978 -0.006419 0.182397 -0.026438 -0.560319 0.498867 0.090992 1.0 214.0 0.021803 134.453023
2024-04-16 5.348156 -0.000714 0.182379 -0.027133 -0.560319 0.498588 0.090932 1.0 214.0 0.021801 134.478083
2024-04-17 5.319675 -0.005325 0.182364 -0.032314 -0.560319 0.496862 0.090610 1.0 214.0 0.021799 134.477549
2024-04-18 5.300878 -0.003533 0.182348 -0.035733 -0.560319 0.495702 0.090390 1.0 214.0 0.021798 134.492917
2024-04-19 5.281341 -0.003686 0.182332 -0.039287 -0.560319 0.494495 0.090162 1.0 214.0 0.021796 134.500569
2024-04-22 5.304151 0.004319 0.182315 -0.035138 -0.560319 0.495770 0.090386 1.0 214.0 0.021794 134.507261
2024-04-23 5.336400 0.006080 0.182301 -0.029272 -0.560319 0.497583 0.090710 1.0 214.0 0.021791 134.503439
2024-04-24 5.352938 0.003099 0.182283 -0.026263 -0.560319 0.498484 0.090865 1.0 214.0 0.021791 139.211534
---
100% (4861 of 4861) |####################| Elapsed Time: 0:03:03 Time:  0:03:03

The Sharpe ratio is smaller because training is now performed on a rolling basis: each model is fitted only on past data and no longer looks forward, as it did when the data were processed globally.

May I import libraries?

Yes, please refer to the file init.ipynb in your home directory. You can for example use:

! conda install -y scikit-learn

How to load data?

Daily stock data for the Q18 Nasdaq-100 contest can be loaded using:

data = qndata.stocks.load_ndx_data(tail = 17*365, dims = ("time", "field", "asset"))

Cryptocurrency daily data used for the Q16/Q17 contests can be loaded using:

data = qndata.cryptodaily.load_data(tail = 17*365, dims = ("time", "field", "asset"))

Futures data for the Q15 contest can be loaded using:

data= qndata.futures.load_data(tail = 17*365, dims = ("time", "field", "asset"))

BTC Futures data for the Q15 contest can be loaded using:

data= qndata.cryptofutures.load_data(tail = 17*365, dims = ("time", "field", "asset"))

How to view a list of all tickers?

data.asset.to_pandas().to_list()

How to see which fields are available?

data.field.to_pandas().to_list()

How to load specific tickers?

data = qndata.stocks.load_ndx_data(tail=17 * 365, assets=["NAS:AAPL", "NAS:AMZN"])

How to select specific tickers after loading all data?

def get_data_filter(data, assets):
    filtered = data.sel(asset=assets)
    return filtered

get_data_filter(data, ["NAS:AAPL", "NAS:AMZN"])

How to get the prices for the previous day?

qnta.shift(data.sel(field="open"), periods=1)

or:

data.sel(field="open").shift(time=1)

How to get the Sharpe ratio?

import qnt.stats as qnstats

def get_sharpe(market_data, weights):
    rr = qnstats.calc_relative_return(market_data, weights)
    sharpe = qnstats.calc_sharpe_ratio_annualized(rr).values[-1]
    return sharpe

sharpe = get_sharpe(data, weights) # weights.sel(time=slice("2006-01-01",None))
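For intuition, an annualized Sharpe ratio can be computed by hand from daily returns. The sketch below uses a common textbook convention (arithmetic mean, sample standard deviation, 252 trading days, zero risk-free rate); these choices are assumptions, and qnstats may use a different convention, so the numbers need not match.

```python
import math
import statistics

def sharpe_annualized(daily_returns, periods_per_year=252):
    """Annualized mean return divided by annualized volatility.
    The risk-free rate is taken as zero."""
    mean = statistics.fmean(daily_returns)
    vol = statistics.stdev(daily_returns)
    return (mean * periods_per_year) / (vol * math.sqrt(periods_per_year))

daily_returns = [0.01, -0.005, 0.002, 0.007, -0.003]   # made-up returns
print(sharpe_annualized(daily_returns))
```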

How do I get a list of the top 3 assets ranked by Sharpe ratio?

import qnt.stats as qnstats

data = qndata.stocks.load_ndx_data(tail = 17*365, dims = ("time", "field", "asset"))

def get_best_instruments(data, weights, top_size):
    # compute statistics:
    stats_per_asset = qnstats.calc_stat(data, weights, per_asset=True)
    # calculate ranks of assets by "sharpe_ratio":
    ranks = (-stats_per_asset.sel(field="sharpe_ratio")).rank("asset")
    # select top assets by rank "top_period" days ago:
    top_period = 1
    rank = ranks.isel(time=-top_period)
    top = rank.where(rank <= top_size).dropna("asset").asset

    # select top stats:
    top_stats = stats_per_asset.sel(asset=top.values)

    # print results:
    print("SR tail of the top assets:")
    display(top_stats.sel(field="sharpe_ratio").to_pandas().tail())

    print("avg SR = ", top_stats[-top_period:].sel(field="sharpe_ratio").mean("asset")[-1].item())
    display(top_stats)
    return top_stats.coords["asset"].values

get_best_instruments(data, weights, 3)

How can I check the results for only the top 3 assets ranked by Sharpe ratio?

Select the top assets and then load their data:

best_assets= get_best_instruments(data, weights, 3)

data= qndata.stocks.load_ndx_data(tail = 17*365, assets=best_assets)

How can prices be processed?

Simply import standard libraries, for example numpy:

import numpy as np

high= np.log(data.sel(field="high"))

How can you reduce slippage impact when trading?

Just apply some technique to reduce turnover:

def get_lower_slippage(weights, rolling_time=6):
    return weights.rolling({"time": rolling_time}).max()

improved_weights = get_lower_slippage(weights, rolling_time=6)
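The effect of such a rolling maximum on turnover can be checked on a toy series. This is a library-free sketch with made-up weights and hypothetical helpers; unlike the default xarray rolling window, it uses shorter windows at the start of the series instead of returning missing values there.

```python
def rolling_max(series, window):
    """Trailing maximum over the last `window` values (shorter at the start)."""
    return [max(series[max(0, i + 1 - window) : i + 1]) for i in range(len(series))]

def turnover(series):
    """Sum of absolute day-to-day changes in the weight."""
    return sum(abs(b - a) for a, b in zip(series, series[1:]))

raw = [0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0]   # made-up, noisy weight
smooth = rolling_max(raw, 3)
print(turnover(raw), turnover(smooth))   # 6.0 1.0
```

The smoothed series trades far less, at the price of holding positions longer and at their recent maximum size.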

How to use technical analysis indicators?

For available indicators see the source code of the library: /qnt/ta

ATR

def get_atr(data, days=14):
    high = data.sel(field="high") * 1.0 
    low  = data.sel(field="low") * 1.0 
    close= data.sel(field="close") * 1.0

    return qnta.atr(high, low, close, days)

atr= get_atr(data, days=14)

EMA

prices= data.sel(field="high")
prices_ema= qnta.ema(prices, 15)

TRIX

prices= data.sel(field="high")
prices_trix= qnta.trix(prices, 15)

ADL and EMA

adl= qnta.ad_line(data.sel(field="close")) * 1.0 
adl_ema= qnta.ema(adl, 18)

How can you check the quality of your strategy?

import qnt.output as qnout
qnout.check(weights, data, "stocks_nasdaq100")

or

stat= qnstats.calc_stat(data, weights)
display(stat.to_pandas().tail())

or

import qnt.graph   as qngraph
statistics= qnstats.calc_stat(data, weights)
display(statistics.to_pandas().tail())

performance= statistics.to_pandas()["equity"]
qngraph.make_plot_filled(performance.index, performance, name="PnL (Equity)", type="log")

display(statistics[-1:].sel(field = ["sharpe_ratio"]).transpose().to_pandas())
qnstats.print_correlation(weights, data)

An example using pandas

One can work with pandas DataFrames at intermediate steps and at the end convert them to xarray data structures:

def get_price_pct_change(prices):
    prices_pandas = prices.to_pandas()
    assets = prices.coords["asset"].values  # use the argument, not the global data
    for asset in assets:
        prices_pandas[asset] = prices_pandas[asset].pct_change()
    return prices_pandas

prices = data.sel(field="close") * 1.0
prices_pct_change = get_price_pct_change(prices).unstack().to_xarray()

How to submit a strategy to the competition?

Check that weights are fine:

import qnt.output as qnout
qnout.check(weights, data, "stocks_nasdaq100")

If everything is ok, write the weights to file:

qnout.write(weights)

In your personal account:

  • choose a strategy;
  • click on the Submit button;
  • select the type of competition.

At the beginning you will find the strategy under the Checking area:

  • Sent strategies > Checking.

If technical checks are successful, the strategy will go under the Candidates area:

  • Sent strategies > Candidates.

Otherwise it will be Filtered:

  • Sent strategies > Filtered

and you should inspect error and warning messages.

Note that a strategy under the Candidates area must have a Sharpe ratio larger than 1 to be eligible for a prize. Please check warning messages in your Candidates area!