Machine Learning Strategy

Reply to Machine Learning Strategy on Mon, 19 Apr 2021 15:42:30 GMT

spancham — Mon, 19 Apr 2021 15:42:30 GMT

Reply to Machine Learning Strategy on Mon, 19 Apr 2021 13:29:59 GMT

Vyacheslav_B — Mon, 19 Apr 2021 13:29:59 GMT

@spancham Hello. Try this

import xarray as xr

import qnt.backtester as qnbt
import qnt.data as qndata
import numpy as np
import pandas as pd
import logging


def load_data(period):
    return qndata.cryptofutures.load_data(tail=period)


def predict_weights(market_data):

    def get_ml_model():
        # you can use any machine learning model
        from sklearn.linear_model import RidgeClassifier
        model = RidgeClassifier(random_state=18)
        return model

    def get_features_dict(data):
        def get_features_for(asset_name):
            data_for_instrument = data.copy(True).sel(asset=[asset_name])

            # Feature 1
            price = data_for_instrument.sel(field="close").ffill('time').bfill('time').fillna(0)  # fill NaN
            price_df = price.to_dataframe()

            # Feature 2
            vol = data_for_instrument.sel(field="vol").ffill('time').bfill('time').fillna(0)  # fill NaN
            vol_df = vol.to_dataframe()

            # Merge dataframes
            for_result = pd.merge(price_df, vol_df, on='time')
            for_result = for_result.drop(['field_x', 'field_y'], axis=1)

            return for_result

        features_all_assets = {}

        asset_all = data.asset.to_pandas().to_list()
        for asset in asset_all:
            features_all_assets[asset] = get_features_for(asset)

        return features_all_assets

    def get_target_classes(data):
        # for classifiers, you need to set classes
        # if 1 then the price will rise tomorrow

        price_current = data.sel(field="close").dropna('time')  # rm NaN
        price_future = price_current.shift(time=-1).dropna('time')

        class_positive = 1
        class_negative = 0

        target_is_price_up = xr.where(price_future > price_current, class_positive, class_negative)
        return target_is_price_up.to_pandas()

    data = market_data.copy(True)

    asset_name_all = data.coords['asset'].values
    features_all_df = get_features_dict(data)
    target_all_df = get_target_classes(data)

    predict_weights_next_day_df = data.sel(field="close").isel(time=-1).to_pandas()

    for asset_name in asset_name_all:
        target_for_learn_df = target_all_df[asset_name]
        feature_for_learn_df = features_all_df[asset_name][:-1]  # last value reserved for prediction

        # align features and targets
        target_for_learn_df, feature_for_learn_df = target_for_learn_df.align(feature_for_learn_df, axis=0,
                                                                              join='inner')

        model = get_ml_model()
        try:
            model.fit(feature_for_learn_df.values, target_for_learn_df)

            feature_for_predict_df = features_all_df[asset_name][-1:]

            predict = model.predict(feature_for_predict_df.values)
            predict_weights_next_day_df[asset_name] = predict
        except Exception:
            logging.exception("model failed")
            # if there is exception, return zero values
            return xr.zeros_like(data.isel(field=0, time=0))

    return predict_weights_next_day_df.to_xarray()


weights = qnbt.backtest(
    competition_type="cryptofutures",
    load_data=load_data,
    lookback_period=18,
    start_date='2014-01-01',
    strategy=predict_weights,
    analyze=True,
    build_plots=True
)
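
As a sanity check on the labelling step in get_target_classes, here is a minimal sketch on a toy pandas Series. The prices are made up, and plain pandas stands in for the xarray calls used above, so treat it as an illustration rather than the platform's own code.

```python
import pandas as pd

# Toy closing prices for one asset (hypothetical values).
close = pd.Series(
    [100.0, 101.0, 99.0, 102.0, 102.0],
    index=pd.date_range("2021-01-01", periods=5, freq="D"),
    name="close",
)

# Tomorrow's price aligned to today's date; the last day has no future price.
future = close.shift(-1).dropna()

# Class 1 if tomorrow's close is strictly higher than today's, else class 0.
target = (future > close.loc[future.index]).astype(int)

# Training features exclude the last row, which is reserved for prediction.
features_train = close.iloc[:-1]
features_predict = close.iloc[-1:]
```

The training features and the targets end up on the same dates, which is what the align(..., join='inner') call in the strategy relies on.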

Here is an example with indicators (Sharpe Ratio = 0.8)

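import qnt.ta as qnta  # needed for roc; "import qnt.data as qndata" does not bind the name qnt


def get_features_for(asset_name):
    data_for_instrument = data.copy(True).sel(asset=[asset_name])

    # Feature 1: one-day rate of change of the closing price
    price = data_for_instrument.sel(field="close")
    price = qnta.roc(price, 1)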
    price = price.ffill('time').bfill('time').fillna(0)
    price_df = price.to_pandas()

    # Feature 2
    vol = data_for_instrument.sel(field="vol")
    vol = vol.ffill('time').bfill('time').fillna(0)  # fill NaN
    vol_df = vol.to_pandas()

    # Merge dataframes
    for_result = pd.merge(price_df, vol_df, on='time')

    return for_result

Reply to Machine Learning Strategy on Sun, 18 Apr 2021 17:45:10 GMT

spancham — Sun, 18 Apr 2021 17:45:10 GMT

@support
ok guys, I tried what you suggested and I am running into all sorts of problems.
I want to pass several features altogether in one dataframe.
Are you guys thinking that I want to 'test' one feature at a time and that is why you are suggesting working with more than one dataframe?
Here is an example of some code I tried, but I would still have to merge the dataframes in order to pass the feature set to the classifier:

def get_features(data):
        # let's come up with features for machine learning
        # take the logarithm of closing prices
        def remove_trend(prices_pandas_):
            prices_pandas = prices_pandas_.copy(True)
            assets = prices_pandas.columns
            print(assets)
            for asset in assets:
                print(prices_pandas[asset])
                prices_pandas[asset] = np.log(prices_pandas[asset])
            return prices_pandas
        
        # Feature 1
        price = data.sel(field="close").ffill('time').bfill('time').fillna(0) # fill NaN
        price_df = price.to_dataframe()
        
        # Feature 2
        vol = data.sel(field="vol").ffill('time').bfill('time').fillna(0) # fill NaN
        vol_df = vol.to_dataframe()
        
        # Merge dataframes
        for_result = pd.merge(price_df, vol_df, on='time')
        for_result = for_result.drop(['field_x', 'field_y'], axis=1)
            
        features_no_trend_df = remove_trend(for_result)
        return features_no_trend_df

Can you help with some code as to what you are suggesting?
Thanks

Reply to Machine Learning Strategy on Mon, 12 Apr 2021 19:01:13 GMT

spancham — Mon, 12 Apr 2021 19:01:13 GMT

Hi @support
Ok, let me think about what you are suggesting & see if I can get that to work.
Will let you know.
Thanks.

Reply to Machine Learning Strategy on Mon, 12 Apr 2021 18:33:38 GMT

support — Mon, 12 Apr 2021 18:33:38 GMT

@spancham Hello, could you elaborate on your request? In principle, you could just repeat the procedure you use for "close" with each additional field, which means working with several dataframes.
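
One way to read this suggestion is sketched below: select each field separately for a single asset, convert each one to a pandas Series, and join them into one feature DataFrame. The synthetic array and the helper name features_for are assumptions made for illustration; they are not part of the qnt library. Selecting one field and one asset at a time keeps each piece one-dimensional, which matters because DataArray.to_pandas only handles arrays with up to two dimensions.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the market data, with dims (field, time, asset).
times = pd.date_range("2021-01-01", periods=4, freq="D")
data = xr.DataArray(
    np.arange(16, dtype=float).reshape(2, 4, 2),
    dims=("field", "time", "asset"),
    coords={"field": ["close", "vol"], "time": times, "asset": ["BTC", "ETH"]},
)


def features_for(data, asset_name, fields=("close", "vol")):
    """Hypothetical helper: one DataFrame per asset, one column per field."""
    columns = {}
    for field in fields:
        # One field and one asset leaves a 1-D array over time.
        series = data.sel(field=field, asset=asset_name).to_pandas()
        # Fill gaps on the pandas side: forward, then backward, then zero.
        columns[field] = series.ffill().bfill().fillna(0)
    # The dict keys become the column names.
    return pd.concat(columns, axis=1)


btc_features = features_for(data, "BTC")
```

The resulting frame has the time index as rows and one column per field, so it can be passed to a classifier as btc_features.values.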

Reply to Machine Learning Strategy on Tue, 06 Apr 2021 14:18:12 GMT

spancham — Tue, 06 Apr 2021 14:18:12 GMT

@support

Can you please help with an example of how to include more than one feature, such as the OHLCV fields?
And also features from the qnt.ta library?
I am running into a problem converting the feature set to pandas when there is more than one feature.

price = data.sel(field="close").ffill('time').bfill('time').fillna(0)  # fill NaN
for_result = price.to_pandas()

Thank you.

Reply to Machine Learning Strategy on Thu, 01 Apr 2021 20:55:46 GMT

spancham — Thu, 01 Apr 2021 20:55:46 GMT

@support
Yaay! I got one accepted
I know the SR is at the bottom of the barrel on the Leaderboard, but I'm still grateful I got one accepted.

Ok, I'm inspired that this is doable for me.
Btw, thanks to everyone on your team for responding to my support requests & helping me understand the Quantiacs platform in a few short weeks.

Reply to Machine Learning Strategy on Thu, 01 Apr 2021 20:06:29 GMT

support — Thu, 01 Apr 2021 20:06:29 GMT

@spancham

Yes, you can continue. The system saves a copy when you submit the strategy.

Reply to Machine Learning Strategy on Thu, 01 Apr 2021 19:58:08 GMT

spancham — Thu, 01 Apr 2021 19:58:08 GMT

@support
ok, the system let me submit a new strategy:

I hope this one works.
I can keep working on getting a higher Sharpe Ratio, and update the strategy, right?
Thanks.

Reply to Machine Learning Strategy on Thu, 01 Apr 2021 17:35:25 GMT

spancham — Thu, 01 Apr 2021 17:35:25 GMT

@support
Thank you. I'll try that.

Reply to Machine Learning Strategy on Thu, 01 Apr 2021 15:08:42 GMT

support — Thu, 01 Apr 2021 15:08:42 GMT

Hello.

This strategy correlates with the examples.

The correlation factor must be lower than 0.9, or the Sharpe Ratio of your strategy must be higher (for the last 3 years).

Try to use the other features: volume, ROC(rate of change), or other technical indicators.

Regards.
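
To make the two ideas in this reply concrete, here is a small sketch in plain pandas. It computes a rate-of-change feature and compares the daily returns of two strategies. The thread does not say how the platform calculates its correlation factor, so the Pearson correlation of daily returns is used here only as an assumed stand-in. The function names and the return series are hypothetical.

```python
import pandas as pd


def rate_of_change(prices: pd.Series, periods: int = 1) -> pd.Series:
    """Percentage change over `periods` steps: (p_t / p_{t-periods} - 1) * 100."""
    return prices.pct_change(periods) * 100


def returns_correlation(returns_a: pd.Series, returns_b: pd.Series) -> float:
    """Pearson correlation of two return series on their shared index."""
    a, b = returns_a.align(returns_b, join="inner")
    return float(a.corr(b))


prices = pd.Series([100.0, 110.0, 99.0, 108.9])
roc = rate_of_change(prices)  # first value is NaN: no earlier price to compare with

# Hypothetical daily returns of a new strategy and of an example strategy.
mine = pd.Series([0.01, -0.02, 0.03, 0.00, 0.01])
example = pd.Series([0.02, -0.04, 0.06, 0.00, 0.02])  # exactly twice `mine`

# Scaling returns does not change their correlation, so these two are fully correlated.
too_similar = returns_correlation(mine, example) >= 0.9
```

Scaling the position size alone leaves the correlation unchanged, which is why the reply points towards different features rather than different weights.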