
Q18 Ridge Classifier

This strategy opens positions on stocks using a Ridge classifier.

You can clone and edit this example from the Examples tab.


This template shows you how to use a Ridge Classifier to develop a trading algorithm and take part in the Q18 NASDAQ-100 Stock Long-Short contest.

Please note that:

  • Your trading algorithm can open short and long positions.

  • At each point in time your algorithm can trade all or a subset of the stocks which at that point in time are or were part of the NASDAQ-100 stock index. Note that the composition of this set changes over time, and Quantiacs provides you with an appropriate filter function for selecting them.

  • The Sharpe ratio of your system since January 1st, 2006, has to be larger than 1.

  • Your system cannot be a copy of the current examples. We run a correlation filter on the submissions and detect duplicates.

  • For simplicity this template uses a single asset. It pays off to use more assets, ideally uncorrelated, and to diversify your positions for a more robust Sharpe ratio.
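
The diversification remark in the last bullet can be made concrete with a small calculation. The sketch below assumes two hypothetical, perfectly uncorrelated assets with identical expected return and volatility (illustrative numbers, not market data) and a zero risk-free rate. Under those assumptions an equally weighted portfolio has a Sharpe ratio about sqrt(2) times that of either asset alone.

```python
import numpy as np

# Hypothetical annualized figures for two uncorrelated assets (illustrative only).
mean_returns = np.array([0.10, 0.10])
volatilities = np.array([0.20, 0.20])
correlation = np.array([[1.0, 0.0],
                        [0.0, 1.0]])

# Covariance matrix built from volatilities and correlations.
covariance = np.outer(volatilities, volatilities) * correlation


def portfolio_sharpe(weights):
    """Sharpe ratio of a fixed-weight portfolio, assuming a zero risk-free rate."""
    weights = np.asarray(weights, dtype=float)
    mean = weights @ mean_returns
    vol = np.sqrt(weights @ covariance @ weights)
    return mean / vol


single_asset = portfolio_sharpe([1.0, 0.0])
diversified = portfolio_sharpe([0.5, 0.5])
print("single asset:", single_asset)
print("two uncorrelated assets:", diversified)
```

With positively correlated assets the gain is smaller, which is why uncorrelated instruments are the ideal case.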

More details on the rules can be found here.

Need help? Check the Documentation and find solutions/report problems in the Forum section.

More help with Jupyter? Check the official Jupyter page.

Once you are done, click on Submit to the contest and take part in our competitions.

API reference:

  • data: check how to work with data;

  • backtesting: read how to run the simulation and check the results.

Need to use the optimizer function to automate tedious tasks?

  • optimization: read more in our article.
In [1]:
%%javascript
IPython.OutputArea.prototype._should_scroll = function(lines) { return false; }
// disable widget scrolling
In [2]:
import xarray as xr

import qnt.backtester as qnbt
import qnt.data as qndata
import numpy as np
import logging
In [3]:
# We will open positions on Amazon as predicted by the Ridge Classifier. For learning we use the logarithm of
# closing prices for the last 18 days. The window length is set by lookback_period in the backtester call.



def load_data(period):
    return qndata.stocks.load_ndx_data(tail=period, assets=["NAS:AMZN"])



def predict_weights(market_data):

    def get_ml_model():
        # you can use any machine learning model here
        from sklearn.linear_model import RidgeClassifier
        model = RidgeClassifier(random_state=18)
        return model

    def get_features(data):
        # define here features for learning
        def take_log(prices_pandas_):
            prices_pandas = prices_pandas_.copy(True)
            assets = prices_pandas.columns
            for asset in assets:
                prices_pandas[asset] = np.log(prices_pandas[asset])
            return prices_pandas
        price = data.sel(field="close").ffill("time").bfill("time").fillna(0) # fill NaN
        for_result = price.to_pandas()
        features_df = take_log(for_result)
        return features_df

    def get_target_classes(data):
        # define categorical targets
        price_current = data.sel(field="close").dropna("time") # rm NaN
        price_future = price_current.shift(time=-1).dropna("time")

        class_positive = 1
        class_negative = 0

        target_is_price_up = xr.where(price_future > price_current, class_positive, class_negative)
        return target_is_price_up.to_pandas()

    data = market_data.copy(True)

    asset_name_all = data.coords["asset"].values
    features_all_df = get_features(data)
    target_all_df = get_target_classes(data)

    predict_weights_next_day_df = data.sel(field="close").isel(time=-1).to_pandas()

    for asset_name in asset_name_all:
        target_for_learn_df = target_all_df[asset_name]
        feature_for_learn_df = features_all_df[asset_name][:-1] # last value reserved for prediction

        # align features and targets
        target_for_learn_df, feature_for_learn_df = target_for_learn_df.align(feature_for_learn_df, axis=0, join="inner")

        model = get_ml_model()
        try:
            model.fit(feature_for_learn_df.values.reshape(-1, 1), target_for_learn_df)

            feature_for_predict_df = features_all_df[asset_name][-1:]

            predict = model.predict(feature_for_predict_df.values.reshape(-1, 1))
            predict_weights_next_day_df[asset_name] = predict
        except Exception:
            logging.exception("model failed")
            # if there is exception, return zero values
            return xr.zeros_like(data.isel(field=0, time=0))

    return predict_weights_next_day_df.to_xarray()



weights = qnbt.backtest(
    competition_type = "stocks_nasdaq100",
    load_data        = load_data,
    lookback_period  = 18,
    start_date       = "2005-01-01",
    strategy         = predict_weights,
    analyze          = True,
    build_plots      = True
)
Run last pass...
Load data...
100% (367973 of 367973) |################| Elapsed Time: 0:00:00 Time:  0:00:00
100% (39443 of 39443) |##################| Elapsed Time: 0:00:00 Time:  0:00:00
100% (1676 of 1676) |####################| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 1/1 0s
Data loaded 2s
Run strategy...
Load data for cleanup...
100% (3880 of 3880) |####################| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 1/1 0s
Data loaded 0s
Output cleaning...
fix uniq
Check liquidity...
Ok.
Normalization...
Output cleaning is complete.
Write result...
Write output: /root/fractions.nc.gz
---
Run first pass...
Load data...
100% (1600 of 1600) |####################| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 1/1 0s
Data loaded 0s
Run strategy...
---
Load full data...
100% (370352 of 370352) |################| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 1/1 0s
Data loaded 0s
---
Run iterations...

100% (4852 of 4852) |####################| Elapsed Time: 0:00:53 Time:  0:00:53
Merge outputs...
Load data for cleanup and analysis...
100% (39443 of 39443) |##################| Elapsed Time: 0:00:00 Time:  0:00:00
100% (14856504 of 14856504) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 1/6 1s
100% (14859532 of 14859532) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 2/6 3s
100% (14856472 of 14856472) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 3/6 4s
100% (14856388 of 14856388) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 4/6 6s
100% (14856388 of 14856388) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 5/6 8s
100% (13796740 of 13796740) |############| Elapsed Time: 0:00:00 Time:  0:00:00
fetched chunk 6/6 8s
Data loaded 9s
Output cleaning...
fix uniq
ffill if the current price is None...
Check liquidity...
Ok.
Check missed dates...
Ok.
Normalization...
Output cleaning is complete.
Write result...
Write output: /root/fractions.nc.gz
---
Analyze results...
Check...
Check liquidity...
Ok.
Check missed dates...
Ok.
Check the sharpe ratio...
Period: 2006-01-01 - 2024-04-12
Sharpe Ratio = 0.07459631916422892
ERROR! The Sharpe Ratio is too low. 0.07459631916422892 < 1
Improve the strategy and make sure that the in-sample Sharpe Ratio more than 1.
Check correlation.
WARNING! Can't calculate correlation.
Correlation check failed.
---
Align...
Calc global stats...
---
Calc stats per asset...
Build plots...
---
Output:
asset       NAS:AAL  NAS:AAPL  NAS:ABNB  NAS:ADBE  NAS:ADI  NAS:ADP  NAS:ADSK  NAS:AEP  NAS:AKAM  NAS:ALGN
time
2024-04-01      0.0       0.0       0.0       0.0      0.0      0.0       0.0      0.0       0.0       0.0
2024-04-02      0.0       0.0       0.0       0.0      0.0      0.0       0.0      0.0       0.0       0.0
2024-04-03      0.0       0.0       0.0       0.0      0.0      0.0       0.0      0.0       0.0       0.0
2024-04-04      0.0       0.0       0.0       0.0      0.0      0.0       0.0      0.0       0.0       0.0
2024-04-05      0.0       0.0       0.0       0.0      0.0      0.0       0.0      0.0       0.0       0.0
2024-04-08      0.0       0.0       0.0       0.0      0.0      0.0       0.0      0.0       0.0       0.0
2024-04-09      0.0       0.0       0.0       0.0      0.0      0.0       0.0      0.0       0.0       0.0
2024-04-10      0.0       0.0       0.0       0.0      0.0      0.0       0.0      0.0       0.0       0.0
2024-04-11      0.0       0.0       0.0       0.0      0.0      0.0       0.0      0.0       0.0       0.0
2024-04-12      0.0       0.0       0.0       0.0      0.0      0.0       0.0      0.0       0.0       0.0
Stats:
field         equity  relative_return  volatility  underwater  max_drawdown  sharpe_ratio  mean_return  bias  instruments  avg_turnover  avg_holding_time
time
2024-04-01  1.240775         0.003271    0.250615   -0.273902     -0.617344      0.045035     0.011286   1.0          1.0      0.144530          7.695783
2024-04-02  1.238728        -0.001650    0.250590   -0.275100     -0.617344      0.044684     0.011197   1.0          1.0      0.144502          7.695783
2024-04-03  1.250495         0.009499    0.250573   -0.268214     -0.617344      0.046662     0.011692   1.0          1.0      0.144473          7.695783
2024-04-04  1.233923        -0.013252    0.250566   -0.277912     -0.617344      0.043854     0.010988   1.0          1.0      0.144451          7.695783
2024-04-05  1.268678         0.028167    0.250621   -0.257573     -0.617344      0.049664     0.012447   1.0          1.0      0.144429          7.695783
2024-04-08  1.269501         0.000648    0.250596   -0.257091     -0.617344      0.049795     0.012478   1.0          1.0      0.144404          7.695783
2024-04-09  1.272791         0.002592    0.250570   -0.255166     -0.617344      0.050333     0.012612   1.0          1.0      0.144374          7.695783
2024-04-10  1.274351         0.001226    0.250545   -0.254253     -0.617344      0.050585     0.012674   1.0          1.0      0.144350          7.695783
2024-04-11  1.295494         0.016591    0.250547   -0.241880     -0.617344      0.054030     0.013537   1.0          1.0      0.144321          7.695783
2024-04-12  1.275553        -0.015393    0.250546   -0.253550     -0.617344      0.050761     0.012718   1.0          1.0      0.144296          7.717718
---
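
The way `get_target_classes` and `get_features` line up training rows can be illustrated on a toy series. This is a minimal sketch in plain pandas with made-up prices, not the xarray code used by the strategy. It shows that the label for each day is whether the next close is strictly higher, and that the last day has no label and is kept aside for prediction.

```python
import numpy as np
import pandas as pd

# Toy closing prices for five consecutive days (made-up values).
close = pd.Series(
    [10.0, 11.0, 10.5, 12.0, 12.0],
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)

# Price on the following day; the last day has no "future" and is dropped.
close_future = close.shift(-1).dropna()
close_current = close.loc[close_future.index]

# Class 1 if the next close is strictly higher, otherwise class 0.
target = (close_future > close_current).astype(int)

# Features: log prices without the last day, which is reserved for prediction.
features = np.log(close)[:-1]
feature_for_prediction = np.log(close)[-1:]

print(pd.DataFrame({"log_close": features, "target": target}))
```

Note that an unchanged price (the last pair here) falls into class 0, exactly as with the strict comparison in the strategy.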

How do I use multiple features to train a model?

Strategy idea: We will open positions on stocks as predicted by the RidgeClassifier.

Features for learning: the logarithm of closing prices for the last 18 days for the futures "F_ES" and "F_DX".
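
Before the full strategy, the shape of a multi-feature training set can be sketched on its own. The example assumes scikit-learn is installed and uses made-up numbers in place of real log prices; only the column names are borrowed from the futures above. With several features the matrix passed to `fit` already has one column per feature, so no `reshape(-1, 1)` is needed.

```python
import pandas as pd
from sklearn.linear_model import RidgeClassifier

# Made-up feature values; each column plays the role of one futures log price.
features = pd.DataFrame(
    {
        "F_ES": [0.0, 0.0, 3.0, 3.0, 3.0],
        "F_DX": [0.0, 1.0, 3.0, 4.0, 3.5],
    }
)
# One label per training row (the last feature row has no label yet).
target = pd.Series([0, 0, 1, 1])

X_train = features[:-1].values  # shape (n_samples, n_features)
X_next = features[-1:].values   # the row reserved for prediction

model = RidgeClassifier(random_state=18)
model.fit(X_train, target.values)
prediction = model.predict(X_next)

print(X_train.shape, prediction)
```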

import xarray as xr

import qnt.backtester as qnbt
import qnt.data as qndata
import numpy as np
import pandas as pd
import logging  # needed for logging.exception in predict_weights



def load_data(period):
    futures = qndata.futures.load_data(tail=period, assets=["F_ES", "F_DX"])
    stocks = qndata.stocks.load_ndx_data(tail=period, assets=["NAS:AMZN"])
    return {"futures": futures, "stocks": stocks}, futures.time.values



def build_data_for_one_step(data, max_date: np.datetime64, lookback_period: int):
    min_date = max_date - np.timedelta64(lookback_period, "D")
    return {
        "futures": data["futures"].sel(time=slice(min_date, max_date)),
        "stocks": data["stocks"].sel(time=slice(min_date, max_date)),
    }



def predict_weights(market_data):

    def get_ml_model():
        from sklearn.linear_model import RidgeClassifier
        model = RidgeClassifier(random_state=18)
        return model

    def get_features(data):
        def take_log(prices_pandas_):
            prices_pandas = prices_pandas_.copy(True)
            assets = prices_pandas.columns
            for asset in assets:
                prices_pandas[asset] = np.log(prices_pandas[asset])
            return prices_pandas

        price = data.sel(field="close").ffill("time").bfill("time").fillna(0) # fill NaN
        for_result = price.to_pandas()
        features_df = take_log(for_result)
        return features_df

    def get_target_classes(data):

        price_current = data.sel(field="close").dropna("time")
        price_future = price_current.shift(time=-1).dropna("time")

        class_positive = 1
        class_negative = 0

        target_is_price_up = xr.where(price_future > price_current, class_positive, class_negative)
        return target_is_price_up.to_pandas()

    futures = market_data["futures"].copy(True)
    stocks = market_data["stocks"].copy(True)

    asset_name_all = stocks.coords["asset"].values
    features_all_df = get_features(futures)
    target_all_df = get_target_classes(stocks)

    predict_weights_next_day_df = stocks.sel(field="close").isel(time=-1).to_pandas()

    for asset_name in asset_name_all:
        target_for_learn_df = target_all_df[asset_name]
        feature_for_learn_df = features_all_df[:-1] # last value reserved for prediction

        # align features and targets
        target_for_learn_df, feature_for_learn_df = target_for_learn_df.align(feature_for_learn_df, axis=0, join="inner")

        model = get_ml_model()

        try:
            model.fit(feature_for_learn_df.values, target_for_learn_df)

            feature_for_predict_df = features_all_df[-1:]

            predict = model.predict(feature_for_predict_df.values)
            predict_weights_next_day_df[asset_name] = predict
        except Exception:
            logging.exception("model failed")
            # if there is exception, return zero values
            return xr.zeros_like(stocks.isel(field=0, time=0))

    return predict_weights_next_day_df.to_xarray()



weights = qnbt.backtest(
    competition_type = "stocks_nasdaq100",
    load_data        = load_data,
    lookback_period  = 18,
    start_date       = "2005-01-01",
    strategy         = predict_weights,
    window           = build_data_for_one_step,
    analyze          = True,
    build_plots      = True
)
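
One detail of `build_data_for_one_step` is worth checking: it subtracts `lookback_period` calendar days, so the window holds fewer trading days than the number suggests. The sketch below reproduces only the slicing logic on a hypothetical business-day index in pandas; real exchange calendars also skip holidays.

```python
import pandas as pd

# Hypothetical business-day index standing in for the "time" coordinate.
time_index = pd.bdate_range("2024-01-01", periods=40)
prices = pd.Series(range(40), index=time_index, dtype=float)

lookback_period = 18
max_date = time_index[-1]
min_date = max_date - pd.Timedelta(days=lookback_period)

# Label-based slicing is inclusive on both ends, like .sel(time=slice(min_date, max_date)).
window = prices.loc[min_date:max_date]

print("calendar days requested:", lookback_period)
print("trading days in window:", len(window))
```

In this toy calendar the 18-day window contains 15 rows, so increase `lookback_period` if the model needs a fixed number of trading days.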

What libraries are available?

Our library makes extensive use of xarray, pandas and numpy.

Function definitions can be found in the qnt folder in your private root directory.

# Import basic libraries.
import xarray as xr
import pandas as pd
import numpy as np

# Import quantnet libraries.
import qnt.data    as qndata  # load and manipulate data
import qnt.output  as qnout   # manage output
import qnt.backtester as qnbt # backtester
import qnt.stats   as qnstats # statistical functions for analysis
import qnt.graph   as qngraph # graphical tools
import qnt.ta      as qnta    # indicators library

May I import libraries?

Yes, please refer to the file init.ipynb in your home directory. You can for example use:

! conda install -y scikit-learn

How to load data?

Daily stock data for the Q18 Nasdaq-100 contest can be loaded using:

data = qndata.stocks.load_ndx_data(tail = 17*365, dims = ("time", "field", "asset"))

Cryptocurrency daily data used for the Q16/Q17 contests can be loaded using:

data = qndata.cryptodaily.load_data(tail = 17*365, dims = ("time", "field", "asset"))

Futures data for the Q15 contest can be loaded using:

data = qndata.futures.load_data(tail = 17*365, dims = ("time", "field", "asset"))

BTC Futures data for the Q15 contest can be loaded using:

data = qndata.cryptofutures.load_data(tail = 17*365, dims = ("time", "field", "asset"))

How to view a list of all tickers?

data.asset.to_pandas().to_list()

How to see which fields are available?

data.field.to_pandas().to_list()

How to load specific tickers?

data = qndata.stocks.load_ndx_data(tail=17 * 365, assets=["NAS:AAPL", "NAS:AMZN"])

How to select specific tickers after loading all data?

def get_data_filter(data, assets):
    filtered = data.sel(asset=assets)
    return filtered

get_data_filter(data, ["NAS:AAPL", "NAS:AMZN"])

How to get the prices for the previous day?

qnta.shift(data.sel(field="open"), periods=1)

or:

data.sel(field="open").shift(time=1)
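
The selection and shifting calls above can be tried without the platform on a small synthetic array. This sketch assumes xarray is installed and builds a made-up DataArray with the same `("time", "field", "asset")` layout; the prices are invented.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic data with the same dimension layout as the loaded market data.
time = pd.date_range("2024-01-01", periods=3, freq="D")
data = xr.DataArray(
    np.array(
        [
            [[10.0, 20.0], [11.0, 21.0]],
            [[12.0, 22.0], [13.0, 23.0]],
            [[14.0, 24.0], [15.0, 25.0]],
        ]
    ),
    dims=("time", "field", "asset"),
    coords={
        "time": time,
        "field": ["open", "close"],
        "asset": ["NAS:AAPL", "NAS:AMZN"],
    },
)

# Select one field, then move every value one step forward in time.
open_prices = data.sel(field="open")
previous_open = open_prices.shift(time=1)

# Select a subset of assets by label.
amazon_only = data.sel(asset=["NAS:AMZN"])

print(previous_open.to_pandas())
```

The first row of the shifted array is NaN because there is no earlier day to take the value from.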

How to get the Sharpe ratio?

import qnt.stats as qnstats

def get_sharpe(market_data, weights):
    rr = qnstats.calc_relative_return(market_data, weights)
    sharpe = qnstats.calc_sharpe_ratio_annualized(rr).values[-1]
    return sharpe

sharpe = get_sharpe(data, weights)  # for the in-sample period only, pass weights.sel(time=slice("2006-01-01", None))
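
As a sanity check, the figures in the Stats output earlier in this notebook are consistent with the Sharpe ratio being the annualized mean return divided by the annualized volatility. The first part of the sketch below verifies this on the last row of that table. The helper `annualized_sharpe` is a hypothetical textbook version (arithmetic mean, 252 periods per year, zero risk-free rate) and is not claimed to match the annualization used inside `qnstats`.

```python
import numpy as np

# Values copied from the last row (2024-04-12) of the Stats output.
mean_return = 0.012718
volatility = 0.250546
reported_sharpe = 0.050761

ratio = mean_return / volatility
print("mean_return / volatility =", round(ratio, 6))


def annualized_sharpe(daily_returns, periods_per_year=252):
    """Textbook annualized Sharpe ratio with a zero risk-free rate (assumed convention)."""
    daily_returns = np.asarray(daily_returns, dtype=float)
    mean = daily_returns.mean() * periods_per_year
    vol = daily_returns.std(ddof=1) * np.sqrt(periods_per_year)
    return mean / vol


# Made-up daily returns, for illustration only.
example = annualized_sharpe([0.01, -0.005, 0.002, 0.004])
print("example Sharpe:", example)
```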

How do I get a list of the top 3 assets ranked by Sharpe ratio?

import qnt.stats as qnstats

data = qndata.stocks.load_ndx_data(tail = 17*365, dims = ("time", "field", "asset"))

def get_best_instruments(data, weights, top_size):
    # compute statistics:
    stats_per_asset = qnstats.calc_stat(data, weights, per_asset=True)
    # calculate ranks of assets by "sharpe_ratio":
    ranks = (-stats_per_asset.sel(field="sharpe_ratio")).rank("asset")
    # select top assets by rank "top_period" days ago:
    top_period = 1
    rank = ranks.isel(time=-top_period)
    top = rank.where(rank <= top_size).dropna("asset").asset

    # select top stats:
    top_stats = stats_per_asset.sel(asset=top.values)

    # print results:
    print("SR tail of the top assets:")
    display(top_stats.sel(field="sharpe_ratio").to_pandas().tail())

    print("avg SR = ", top_stats[-top_period:].sel(field="sharpe_ratio").mean("asset")[-1].item())
    display(top_stats)
    return top_stats.coords["asset"].values

get_best_instruments(data, weights, 3)
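
The ranking step inside `get_best_instruments` negates the Sharpe ratios so that rank 1 goes to the highest value. A minimal pandas sketch of that idea, with made-up Sharpe ratios for a handful of tickers:

```python
import pandas as pd

# Made-up per-asset Sharpe ratios (illustrative values only).
sharpe_by_asset = pd.Series(
    {
        "NAS:AAPL": 0.9,
        "NAS:AMZN": 1.4,
        "NAS:ADBE": 0.3,
        "NAS:ADP": 1.1,
        "NAS:AEP": -0.2,
    }
)

# Ranking the negated values gives rank 1 to the largest Sharpe ratio.
ranks = (-sharpe_by_asset).rank()

top_size = 3
top_assets = ranks[ranks <= top_size].sort_values().index.tolist()
print(top_assets)
```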

How can I check the results for only the top 3 assets ranked by Sharpe ratio?

Select the top assets and then load their data:

best_assets = get_best_instruments(data, weights, 3)

data = qndata.stocks.load_ndx_data(tail = 17*365, assets=best_assets)

How can prices be processed?

Simply import standard libraries, for example numpy:

import numpy as np

high= np.log(data.sel(field="high"))

How can you reduce slippage impact when trading?

Apply a technique that reduces turnover, for example a rolling maximum of the weights:

def get_lower_slippage(weights, rolling_time=6):
    return weights.rolling({"time": rolling_time}).max()

improved_weights = get_lower_slippage(weights, rolling_time=6)
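
The effect of the rolling maximum on turnover can be checked on a toy weight series. This sketch uses pandas rather than xarray, a window of 3 instead of 6, and a hypothetical `total_turnover` helper that simply sums absolute day-to-day weight changes; it is not the turnover definition used in the platform statistics.

```python
import pandas as pd

# A toy position that flips between 0 and 1 almost every day.
raw_weights = pd.Series([0, 1, 0, 1, 0, 1, 0, 0, 0, 0], dtype=float)

# Rolling maximum keeps the position open while any recent signal was 1.
# The first window-1 values are NaN and are treated here as "no position".
smoothed_weights = raw_weights.rolling(3).max().fillna(0.0)


def total_turnover(weights):
    """Sum of absolute day-to-day changes in the weight (hypothetical helper)."""
    return weights.diff().abs().sum()


raw_turnover = total_turnover(raw_weights)
smoothed_turnover = total_turnover(smoothed_weights)
print("raw:", raw_turnover, "smoothed:", smoothed_turnover)
```

The trade-off is that positions are held longer than the raw signal asks for, so the smoothed weights should be backtested again.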

How to use technical analysis indicators?

For available indicators see the source code of the library: /qnt/ta

ATR

def get_atr(data, days=14):
    high = data.sel(field="high") * 1.0 
    low  = data.sel(field="low") * 1.0 
    close= data.sel(field="close") * 1.0

    return qnta.atr(high, low, close, days)

atr= get_atr(data, days=14)
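
For intuition about what an ATR measures, here is a minimal pandas sketch of the usual definition: the true range is the largest of high minus low, the absolute gap from the previous close to the high, and the absolute gap from the previous close to the low. The averaging below is a plain moving average over 2 days on made-up prices; the smoothing used by `qnta.atr` may differ.

```python
import pandas as pd

# Made-up prices for three days.
high = pd.Series([10.0, 11.0, 12.0])
low = pd.Series([9.0, 10.0, 10.0])
close = pd.Series([9.5, 10.5, 11.0])

prev_close = close.shift(1)

# True range: the widest of the three candidate ranges (NaN candidates are skipped).
true_range = pd.concat(
    [high - low, (high - prev_close).abs(), (low - prev_close).abs()],
    axis=1,
).max(axis=1)

# Average true range as a simple moving average (assumed smoothing).
atr = true_range.rolling(2).mean()
print(pd.DataFrame({"true_range": true_range, "atr": atr}))
```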

EMA

prices= data.sel(field="high")
prices_ema= qnta.ema(prices, 15)
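
As a point of comparison, an exponential moving average with the common smoothing factor 2 / (n + 1) can be written with pandas alone. This is a generic sketch on made-up prices; `qnta.ema` may treat the warm-up period differently, so the first values are not guaranteed to match.

```python
import pandas as pd

# Made-up prices.
prices = pd.Series([1.0, 2.0, 3.0])

# span=3 corresponds to a smoothing factor of 2 / (3 + 1) = 0.5.
# adjust=False applies the recursive form: ema[t] = 0.5 * price[t] + 0.5 * ema[t-1].
ema = prices.ewm(span=3, adjust=False).mean()
print(ema.tolist())
```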

TRIX

prices= data.sel(field="high")
prices_trix= qnta.trix(prices, 15)

ADL and EMA

adl= qnta.ad_line(data.sel(field="close")) * 1.0 
adl_ema= qnta.ema(adl, 18)

How can you check the quality of your strategy?

import qnt.output as qnout
qnout.check(weights, data, "stocks_nasdaq100")

or

stat= qnstats.calc_stat(data, weights)
display(stat.to_pandas().tail())

or

import qnt.graph   as qngraph
statistics= qnstats.calc_stat(data, weights)
display(statistics.to_pandas().tail())

performance= statistics.to_pandas()["equity"]
qngraph.make_plot_filled(performance.index, performance, name="PnL (Equity)", type="log")

display(statistics[-1:].sel(field = ["sharpe_ratio"]).transpose().to_pandas())
qnstats.print_correlation(weights, data)

An example using pandas

One can work with pandas DataFrames at intermediate steps and at the end convert them to xarray data structures:

def get_price_pct_change(prices):
    prices_pandas = prices.to_pandas()
    assets = prices.coords["asset"].values  # use the argument, not the global data
    for asset in assets:
        prices_pandas[asset] = prices_pandas[asset].pct_change()
    return prices_pandas

prices = data.sel(field="close") * 1.0
prices_pct_change = get_price_pct_change(prices).unstack().to_xarray()
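
The per-asset loop is not strictly necessary, because `DataFrame.pct_change` already works column by column. A small pandas-only sketch on made-up prices (the conversion back to xarray is left out):

```python
import pandas as pd

# Made-up closing prices, one column per asset.
prices_pandas = pd.DataFrame(
    {
        "NAS:AAPL": [100.0, 110.0, 121.0],
        "NAS:AMZN": [50.0, 45.0, 54.0],
    },
    index=pd.date_range("2024-01-01", periods=3, freq="D"),
)

# Relative change from one row to the next, computed for every column at once.
# fill_method=None avoids filling missing prices before the calculation.
pct_change = prices_pandas.pct_change(fill_method=None)
print(pct_change)
```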

How to submit a strategy to the competition?

Check that weights are fine:

import qnt.output as qnout
qnout.check(weights, data, "stocks_nasdaq100")

If everything is ok, write the weights to file:

qnout.write(weights)

In your personal account:

  • choose a strategy;
  • click on the Submit button;
  • select the type of competition.

Initially you will find the strategy under the Checking area:

  • Sent strategies > Checking.

If technical checks are successful, the strategy will go under the Candidates area:

  • Sent strategies > Candidates.

Otherwise it will be Filtered:

  • Sent strategies > Filtered

and you should inspect error and warning messages.

Note that a strategy under the Candidates area must have a Sharpe ratio larger than 1 to be eligible for a prize. Please check warning messages in your Candidates area!