Trading System Optimization by Asset

This template demonstrates how to find optimal parameters for each asset.

You can clone and edit this example from the Examples tab.


It allows you to develop a strategy which will pass the filters easily.

When you start writing strategies, the easiest approach is technical analysis. Pure technical analysis with one shared parameter set, however, will probably not work. You have to adjust the parameters of the technical indicators for each asset.

However, such an optimization carries a big risk of producing an overfitted strategy. There is a way to reduce the impact of overfitting: instead of using a single optimal parameter set per asset, you can use several parameter sets for each asset. This example demonstrates how to do it.

Base strategy

In [1]:
%%javascript
window.IPython && (IPython.OutputArea.prototype._should_scroll = function(lines) { return false; })
// disable widget scrolling
In [2]:
import json

import xarray as xr
import numpy as np
import pandas as pd

import qnt.data as qndata          # data loading and manipulation
import qnt.stats as qnstats        # key statistics
import qnt.graph as qngraph        # graphical tools
import qnt.ta as qnta              # technical analysis indicators
import qnt.output as qnout         # for writing output
import qnt.log as qnlog            # log configuration
import qnt.optimizer as qno        # optimizer

# display function for fancy displaying:
from IPython.display import display
# lib for charts
import plotly.graph_objs as go
In [3]:
data = qndata.futures_load_data(min_date='2005-01-01')
100% (35745172 of 35745172) |############| Elapsed Time: 0:00:00 Time:  0:00:00

Let's start with a simple trend-based strategy.

In [4]:
def strategy_long(data, asset=None, ma_period=150):
    # filter by asset, we need it for optimization
    if asset is not None:
        data = data.sel(asset=[asset])
        
    close = data.sel(field='close')

    ma = qnta.lwma(close, ma_period)
    ma_roc = qnta.roc(ma, 1) 
    
    # define signals
    buy_signal = ma_roc > 0
    stop_signal = ma_roc < 0
    
    # rsi = qnta.rsi(close, rsi_period)
    # buy_signal = np.logical_and(rsi < 30, ma_roc > 0)
    # stop_signal = np.logical_or(rsi > 90, ma_roc < 0)

    # transform signals to positions    
    position = xr.where(buy_signal, 1, np.nan)
    position = xr.where(stop_signal, 0, position)
    position = position.ffill('time').fillna(0)
    
    # clean the output (not necessary)
    # with qnlog.Settings(info=False,err=False): # suppress logging
    #     position = qnout.clean(position, data)
    return position
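
The way strategy_long turns signals into positions can be traced on a toy price series without the qnt library. The sketch below is a plain-Python stand-in: `lwma` and `roc` are hypothetical helpers that assume a linearly weighted average (newest price weighted most) and a percentage rate of change. They are not the qnta implementations, which may differ in details such as NaN handling.

```python
def lwma(values, period):
    """Linearly weighted moving average; None until `period` values exist (assumed behaviour)."""
    weights = list(range(1, period + 1))  # oldest -> newest
    total = sum(weights)
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)
        else:
            window = values[i + 1 - period:i + 1]
            out.append(sum(w * v for w, v in zip(weights, window)) / total)
    return out


def roc(values, lag=1):
    """Percentage rate of change over `lag` steps; None where it is undefined."""
    out = []
    for i, v in enumerate(values):
        prev = values[i - lag] if i >= lag else None
        out.append(None if v is None or prev is None else (v / prev - 1) * 100)
    return out


def positions(close, ma_period):
    ma_roc = roc(lwma(close, ma_period), 1)
    pos, current = [], 0      # flat until the first signal (mirrors fillna(0))
    for r in ma_roc:
        if r is not None and r > 0:
            current = 1       # buy signal: the moving average is rising
        elif r is not None and r < 0:
            current = 0       # stop signal: the moving average is falling
        # r == 0 or undefined: keep the previous position (mirrors ffill)
        pos.append(current)
    return pos


close = [10, 11, 12, 13, 12, 11, 10, 11, 12]  # made-up prices
toy_positions = positions(close, ma_period=3)
print(toy_positions)
```

The position stays at 1 while the average is rising or flat after a buy, and drops to 0 only when the average turns down.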

Next, let's look at the performance of the strategy.

In [5]:
#DEBUG#
# evaluator will remove cells with such marks before evaluation

output = strategy_long(data)
stats = qnstats.calc_stat(data, output.sel(time=slice('2006-01-01',None)))
display(stats.to_pandas().tail())
field equity relative_return volatility underwater max_drawdown sharpe_ratio mean_return bias instruments avg_turnover avg_holding_time
time
2024-06-26 0.943915 -0.002607 0.077133 -0.328811 -0.355086 -0.040407 -0.003117 1.0 71.0 0.119781 16.471514
2024-06-27 0.944746 0.000881 0.077125 -0.328220 -0.355086 -0.039787 -0.003069 1.0 71.0 0.119801 16.468758
2024-06-28 0.944302 -0.000470 0.077117 -0.328536 -0.355086 -0.040112 -0.003093 1.0 71.0 0.119803 16.471822
2024-07-01 0.943616 -0.000726 0.077109 -0.329023 -0.355086 -0.040615 -0.003132 1.0 71.0 0.119804 16.469084
2024-07-02 0.944662 0.001109 0.077102 -0.328279 -0.355086 -0.039837 -0.003071 1.0 71.0 0.119806 16.515069
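
When reading this table, note that sharpe_ratio is the ratio of mean_return to volatility. You can check this from the printed last row. The snippet below only repeats that arithmetic on the displayed (rounded) numbers; it does not reproduce how qnstats derives mean_return or volatility from daily returns.

```python
# Values copied from the last row (2024-07-02) of the statistics table above.
mean_return = -0.003071
volatility = 0.077102
reported_sharpe = -0.039837

sharpe = mean_return / volatility
print(round(sharpe, 4), round(reported_sharpe, 4))
```

The small difference in the last digits comes from the rounding of the displayed values.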

Search for optimal parameters for all assets

Let's try to optimize the strategy for all assets and see the performance.

In [6]:
#DEBUG#
# evaluator will remove cells with such marks before evaluation

result_for_all = qno.optimize_strategy(
    data,
    strategy_long,
    qno.full_range_args_generator(ma_period=range(10, 200, 10)),
    workers=1 # you can set more workers on your local PC to speed up
)
100% (19 of 19) |########################| Elapsed Time: 0:00:11 Time:  0:00:11
In [7]:
#DEBUG#
# evaluator will remove cells with such marks before evaluation

# chart
scatter = go.Scatter(
    x=[i['args']['ma_period'] for i in result_for_all['iterations']],
    y=[i['result']['sharpe_ratio'] for i in  result_for_all['iterations']],
    mode="markers",
    name="optimization result",
    marker_size=9,
    marker_color='orange'
)
fig = go.Figure(data=scatter)
# fig.update_yaxes(fixedrange=False) # unlock vertical scrolling
fig.show()


print("---")
print("Best iteration:")
display(result_for_all['best_iteration'])
---
Best iteration:
{'args': {'ma_period': 160},
 'result': {'equity': 0.9996725004912969,
  'relative_return': 0.0010884630833658537,
  'volatility': 0.07731458230688791,
  'underwater': -0.3098868713608455,
  'max_drawdown': -0.33900177133090303,
  'sharpe_ratio': -0.00022893378950237717,
  'mean_return': -1.7699920311309292e-05,
  'bias': 1.0,
  'instruments': 71.0,
  'avg_turnover': 0.11657958070466869,
  'avg_holding_time': 16.985527030742244},
 'weight': -0.00022893378950237717,
 'exception': None}

As you can see, the result is still bad: the best Sharpe ratio is approximately zero. That is why you need to optimize the parameters for each asset.

Search for optimal parameters for each asset

This strategy has one parameter: ma_period.

We will perform a full range scan over ma_period for every asset. With one worker it will take about 11 minutes.

In [8]:
#DEBUG#
# evaluator will remove cells with such marks before evaluation

result_long = qno.optimize_strategy(
    data,
    strategy_long,
    qno.full_range_args_generator(ma_period=range(10, 200, 10),
                                  asset=data.asset.values.tolist()),
    workers=1 # you can set more workers on your local PC to speed up
)
100% (1349 of 1349) |####################| Elapsed Time: 0:11:39 Time:  0:11:39
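
The 1349 iterations reported by the progress bar follow from the size of the grid: 19 values of ma_period times 71 assets. The sketch below enumerates such a grid in plain Python. `full_range_args` is a hypothetical stand-in written for this illustration, not the qno implementation, and the asset names are placeholders.

```python
from itertools import product


def full_range_args(**ranges):
    """Hypothetical stand-in: yield one kwargs dict per point of the Cartesian grid."""
    names = list(ranges)
    for combo in product(*(ranges[n] for n in names)):
        yield dict(zip(names, combo))


ma_periods = range(10, 200, 10)                  # 10, 20, ..., 190 -> 19 values
assets = ["ASSET_%02d" % i for i in range(71)]   # placeholder names; 71 instruments as in the stats table

grid = list(full_range_args(ma_period=ma_periods, asset=assets))
print(len(grid))   # 19 * 71
print(grid[0])
```

Because the grid grows multiplicatively, adding a second strategy parameter with 10 values would multiply the run time by roughly 10.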

Observe the results:

###DEBUG###
# evaluator will remove cells with such marks before evaluation

from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets

assets_perf = [dict(
    asset=asset,
    sharpe=max(i['result']['sharpe_ratio'] for i in result_long['iterations'] if i['args']['asset'] == asset)
) for asset in data.asset.values.tolist()]


assets_perf.sort(key=lambda i: -i['sharpe'])


@interact(asset=[(a['asset'] + (" - %.2f" % a['sharpe']), a['asset']) for a in assets_perf])
def display_results(asset=assets_perf[0]['asset']):
    asset_iterations = [i for i in result_long['iterations'] if i['args']['asset'] == asset]

    scatter = go.Scatter(
        x=[i['args']['ma_period'] for i in asset_iterations],
        y=[i['result']['sharpe_ratio'] for i in asset_iterations],
        mode="markers",
        name="optimization result",
        marker_size=9,
        marker_color='orange'
    )

    fig = go.Figure(data=scatter)
    # fig.update_yaxes(fixedrange=False) # unlock vertical scrolling
    fig.show()

Select assets and optimal parameters

Now you can select the best parameters for each asset. There is a big chance that you will overfit your strategy, so we will select several suitable parameter sets for every asset.

We will select the 15 best assets for the strategy and the 3 best parameter sets for each selected asset. The more parameter sets you average, the less likely your strategy is to be overfitted.

In [9]:
#DEBUG#
# evaluator will remove cells with such marks before evaluation

def find_best_parameters(result, asset_count, parameter_set_count):
    assets = data.asset.values.tolist()
    assets.sort(key=lambda a: -asset_weight(result, a, parameter_set_count))
    assets = assets[:asset_count]
    params = []
    for a in assets:
        params += get_best_parameters_for_asset(result, a, parameter_set_count)
    return params


def asset_weight(result, asset, parameter_set_count):
    asset_iterations = [i for i in result['iterations'] if i['args']['asset'] == asset]
    asset_iterations.sort(key=lambda i: -i['result']['sharpe_ratio'])
    # weight is the sum of the Sharpe ratios of the best parameter_set_count iterations
    return sum(i['result']['sharpe_ratio'] for i in asset_iterations[:parameter_set_count])


def get_best_parameters_for_asset(result, asset, parameter_set_count):
    asset_iterations = [i for i in result['iterations'] if i['args']['asset'] == asset]
    asset_iterations.sort(key=lambda i: -i['result']['sharpe_ratio'])
    return [i['args'] for i in asset_iterations[:parameter_set_count]]


config = find_best_parameters(result=result_long, asset_count=15, parameter_set_count=3)
# If you change asset_count and/or parameter_set_count, you will get a new strategy.

# save the config; the context manager closes the file after writing
with open('config.json', 'w') as f:
    json.dump(config, f, indent=2)

display(config)
[{'ma_period': 110, 'asset': 'F_QT'},
 {'ma_period': 70, 'asset': 'F_QT'},
 {'ma_period': 90, 'asset': 'F_QT'},
 {'ma_period': 160, 'asset': 'F_NQ'},
 {'ma_period': 180, 'asset': 'F_NQ'},
 {'ma_period': 170, 'asset': 'F_NQ'},
 {'ma_period': 100, 'asset': 'F_EB'},
 {'ma_period': 180, 'asset': 'F_EB'},
 {'ma_period': 130, 'asset': 'F_EB'},
 {'ma_period': 50, 'asset': 'F_RB'},
 {'ma_period': 60, 'asset': 'F_RB'},
 {'ma_period': 40, 'asset': 'F_RB'},
 {'ma_period': 20, 'asset': 'F_C'},
 {'ma_period': 30, 'asset': 'F_C'},
 {'ma_period': 40, 'asset': 'F_C'},
 {'ma_period': 20, 'asset': 'F_W'},
 {'ma_period': 30, 'asset': 'F_W'},
 {'ma_period': 150, 'asset': 'F_W'},
 {'ma_period': 80, 'asset': 'F_ES'},
 {'ma_period': 90, 'asset': 'F_ES'},
 {'ma_period': 190, 'asset': 'F_ES'},
 {'ma_period': 100, 'asset': 'F_DM'},
 {'ma_period': 110, 'asset': 'F_DM'},
 {'ma_period': 120, 'asset': 'F_DM'},
 {'ma_period': 40, 'asset': 'F_NH'},
 {'ma_period': 70, 'asset': 'F_NH'},
 {'ma_period': 30, 'asset': 'F_NH'},
 {'ma_period': 150, 'asset': 'F_DT'},
 {'ma_period': 140, 'asset': 'F_DT'},
 {'ma_period': 110, 'asset': 'F_DT'},
 {'ma_period': 110, 'asset': 'F_SS'},
 {'ma_period': 100, 'asset': 'F_SS'},
 {'ma_period': 130, 'asset': 'F_SS'},
 {'ma_period': 90, 'asset': 'F_MD'},
 {'ma_period': 80, 'asset': 'F_MD'},
 {'ma_period': 70, 'asset': 'F_MD'},
 {'ma_period': 70, 'asset': 'F_ED'},
 {'ma_period': 60, 'asset': 'F_ED'},
 {'ma_period': 100, 'asset': 'F_ED'},
 {'ma_period': 190, 'asset': 'F_YM'},
 {'ma_period': 180, 'asset': 'F_YM'},
 {'ma_period': 170, 'asset': 'F_YM'},
 {'ma_period': 20, 'asset': 'F_CF'},
 {'ma_period': 10, 'asset': 'F_CF'},
 {'ma_period': 80, 'asset': 'F_CF'}]
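
To see how this ranking behaves, you can run the same logic on a tiny hand-made result. The sketch below is a compact, self-contained rewrite of the selection functions above; it takes the asset list as an argument instead of reading the global `data`, and all Sharpe ratios in it are made up for illustration.

```python
def best_iterations(result, asset, n):
    """Return the n iterations with the highest Sharpe ratio for one asset."""
    rows = [i for i in result['iterations'] if i['args']['asset'] == asset]
    return sorted(rows, key=lambda i: -i['result']['sharpe_ratio'])[:n]


def select(result, assets, asset_count, parameter_set_count):
    """Pick the top assets by summed Sharpe ratio, then their best parameter sets."""
    def weight(asset):
        top = best_iterations(result, asset, parameter_set_count)
        return sum(i['result']['sharpe_ratio'] for i in top)

    chosen = sorted(assets, key=lambda a: -weight(a))[:asset_count]
    return [i['args'] for a in chosen
            for i in best_iterations(result, a, parameter_set_count)]


# Made-up Sharpe ratios, for illustration only.
toy = {'iterations': [
    {'args': {'ma_period': p, 'asset': a}, 'result': {'sharpe_ratio': s}}
    for a, p, s in [
        ('A', 10, 0.9), ('A', 20, 0.1), ('A', 30, 0.0),
        ('B', 10, 0.5), ('B', 20, 0.6), ('B', 30, 0.4),
        ('C', 10, -0.2), ('C', 20, 0.1), ('C', 30, 0.0),
    ]
]}

toy_config = select(toy, ['A', 'B', 'C'], asset_count=2, parameter_set_count=2)
print(toy_config)
```

Asset B is ranked above asset A even though A has the single best iteration, because B performs well across several parameter values. This is the property that makes the selection less sensitive to one lucky parameter.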

We save the config to a file and then load it again, because all cells marked with #DEBUG# will be removed before evaluation.

In [10]:
# load the saved config; the context manager closes the file after reading
with open('config.json', 'r') as f:
    config = json.load(f)

Define the resulting strategy:

In [11]:
def optmized_strategy(data, config):
    results = []
    for c in config:
        results.append(strategy_long(data, **c))
    # align and join results
    results = xr.align(*results, join='outer')
    results = [r.fillna(0) for r in results]
    output = sum(results) / len(results)
    return output
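
The effect of the averaging step can be shown with plain lists. Each config entry produces positions for a single asset, missing assets count as 0, and the sum is divided by the number of config entries. The sketch below assumes this reading of the function above; `average_positions` is a hypothetical helper and the position series are made up.

```python
def average_positions(per_config_positions, assets, n_days):
    """Average single-asset 0/1 position series over all config entries.

    per_config_positions: list of (asset, [position per day]) pairs, one per config entry.
    Assets absent from an entry contribute 0, mirroring the outer align + fillna(0) step.
    """
    n = len(per_config_positions)
    out = {a: [0.0] * n_days for a in assets}
    for asset, series in per_config_positions:
        for day, p in enumerate(series):
            out[asset][day] += p / n
    return out


toy_results = [
    ('A', [1, 1, 0]),  # asset A, first parameter set
    ('A', [1, 0, 0]),  # asset A, second parameter set
    ('B', [0, 1, 1]),  # asset B, first parameter set
    ('B', [0, 1, 1]),  # asset B, second parameter set
]
weights = average_positions(toy_results, ['A', 'B'], n_days=3)
print(weights)
```

An asset gets its full share of the portfolio only when all of its parameter sets agree on a long position; when they disagree, the exposure is reduced proportionally.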

Result

Let's see the performance of the optimized strategy.

In [12]:
output = optmized_strategy(data, config)
output = qnout.clean(output, data) # fix common issues

qnout.check(output, data) 
qnout.write(output)

stats = qnstats.calc_stat(data, output.sel(time=slice('2006-01-01',None)))
display(stats.to_pandas().tail())
qngraph.make_major_plots(stats)
Output cleaning...
fix uniq
ffill if the current price is None...
Check missed dates...
Ok.
Normalization...
Output cleaning is complete.
Check missed dates...
Ok.
Check the sharpe ratio...
Period: 2006-01-01 - 2024-07-02
Sharpe Ratio = 1.2670397424410722
Ok.
Check correlation.
WARNING! Can't calculate correlation.
Correlation check failed.
Write output: /root/fractions.nc.gz
field equity relative_return volatility underwater max_drawdown sharpe_ratio mean_return bias instruments avg_turnover avg_holding_time
time
2024-06-26 2.634625 -0.000116 0.042452 -0.017048 -0.069466 1.267051 0.053788 1.0 15.0 0.047717 24.882457
2024-06-27 2.636502 0.000712 0.042447 -0.016348 -0.069466 1.267864 0.053817 1.0 15.0 0.047717 24.879099
2024-06-28 2.634608 -0.000718 0.042443 -0.017055 -0.069466 1.266745 0.053765 1.0 15.0 0.047707 24.879099
2024-07-01 2.636842 0.000848 0.042439 -0.016221 -0.069466 1.267736 0.053802 1.0 15.0 0.047698 24.879099
2024-07-02 2.635759 -0.000411 0.042435 -0.016625 -0.069466 1.267040 0.053767 1.0 15.0 0.047688 25.348695