@dark-pidgeot Hi! Since the release of qnt version 0.0.402, the issue with data loading in the local environment has been resolved. The library now uses newer dependencies, including pandas 2.2.2.
Posts made by Vyacheslav_B
-
RE: Struggle creating local dev environment
-
RE: Accessing both market and index data in strategy()
@buyers_are_back Hello.
Here is a new example of stock prediction using index data.
I recommend using the single-pass version.
https://quantiacs.com/documentation/en/data/indexes.html
-
RE: Acess previous weights
@blackpearl Hello. I don’t use machine learning in trading, and I don’t have similar examples. If you know Python and know how to develop such systems, or if you use ChatGPT (or similar tools) for development, you should not have difficulties modifying existing examples. You will need to change the model training and prediction functions.
One of the competitive advantages of the Quantiacs platform is the ability to test machine learning models from a financial performance perspective.
I haven’t encountered similar tools. Typically, models are evaluated using metrics like F1 score and cross-validation (for example, in the classification task of predicting whether the price will rise tomorrow).
However, there are several problems:
- It is unclear how much profit this model can generate. In real trading, there will be commissions, slippage, data errors, and the F1 score doesn’t account for these factors.
- It is possible to inadvertently look into the future. For instance, data preprocessing techniques like standardization can leak future information into the past. If you subtract the mean or maximum value from each point in the time series, the maximum value reached in 2021 would be known in 2015, which is unacceptable.
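The look-ahead problem with standardization can be made concrete with a minimal plain-Python sketch. The price series and the function names standardize_full_sample and standardize_expanding are hypothetical and not part of qnt; the only point is that full-sample statistics make early values depend on later data, while an expanding window does not.

```python
from statistics import mean, pstdev


def standardize_full_sample(series):
    """Leaky: every point uses the mean/std of the WHOLE series, including the future."""
    mu, sigma = mean(series), pstdev(series)
    return [(x - mu) / sigma for x in series]


def standardize_expanding(series, min_periods=2):
    """Causal: point t uses only the mean/std of series[0..t]."""
    out = []
    for t in range(len(series)):
        window = series[: t + 1]
        sigma = pstdev(window) if len(window) >= min_periods else 0.0
        out.append((series[t] - mean(window)) / sigma if sigma > 0 else 0.0)
    return out


# Hypothetical prices: the same first three days, followed by two different futures.
history = [10.0, 11.0, 12.0]
calm_future = history + [12.5, 13.0]
rally_future = history + [30.0, 50.0]

# Full-sample scaling: the first three values change when the future changes.
leaky_calm = standardize_full_sample(calm_future)[:3]
leaky_rally = standardize_full_sample(rally_future)[:3]

# Expanding scaling: the first three values are identical in both scenarios.
causal_calm = standardize_expanding(calm_future)[:3]
causal_rally = standardize_expanding(rally_future)[:3]

print("full-sample values depend on the future:", leaky_calm != leaky_rally)
print("expanding values do not:", causal_calm == causal_rally)
```

If a model is trained on the full-sample version, its inputs for 2015 already encode what happened in 2021, so a good backtest score says little about live performance.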
The Quantiacs platform provides a tool for evaluating models from a financial performance perspective.
However, practice shows that finding a good machine learning model requires significant computational resources and time for training and testing. My results when testing strategies on real data have not been very good.
-
RE: Acess previous weights
@machesterdragon Hello. I have already answered this question for you; see a few posts above.
Single-pass Version for Participation in the Contest
This code helps submissions get processed faster in the contest. The backtest system calculates the weights for each day, while the provided function calculates weights for only one day.
-
RE: Acess previous weights
https://github.com/quantiacs/strategy-ml_lstm_state/blob/master/strategy.ipynb
This repository provides an example of using state, calculating complex indicators, dynamically selecting stocks for trading, and implementing basic risk management measures, such as normalizing and reducing large positions. It also includes recommendations for submitting strategies to the competition.
-
RE: Does evaluation only start from one year back?
@buyers_are_back Hello. Look at the bottom of the table. Only 5 rows are displayed there. At the bottom right, you can click a button to scroll to the first row.
-
RE: Acess previous weights
@illustrious-felice Hello.
Show me an example of the code.
I don't quite understand what you are trying to do.
Maybe you just don't have enough data in the functions to get the value.
Please note that in the lines below I intentionally reduce the data to 1 day, so that only the last day is predicted.

    last_time = data.time.values[-1]
    data_last = data.sel(time=slice(last_time, None))

Calculate your indicators before this code, and then slice the values.
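
To illustrate why the order matters, here is a minimal plain-Python sketch; sma is a hypothetical helper, not a qnt function, and the prices are made up. A 3-day moving average computed after slicing to the last day has no history to work with, while computing it first and slicing afterwards gives the expected value.

```python
def sma(values, window):
    """Simple moving average; None where fewer than `window` points are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window: i + 1]) / window)
    return out


close = [10.0, 11.0, 12.0, 13.0, 14.0]  # hypothetical closing prices

# Wrong order: slice to the last day first, then compute the indicator.
last_day_only = close[-1:]
sma_after_slice = sma(last_day_only, window=3)[-1]

# Right order: compute the indicator on the full history, then keep the last day.
sma_before_slice = sma(close, window=3)[-1]

print(sma_after_slice)   # None: one data point cannot fill a 3-day window
print(sma_before_slice)  # 13.0: average of 12, 13 and 14
```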
-
RE: WARNING: some dates are missed in the portfolio_history
@multi_byte-wildebeest Hi. Without an example, it's unclear what the problem might be.
If you use a state and a function that returns the prediction for one day, you will not get correct results with precheck.
This was discussed here: https://quantiacs.com/community/topic/555/access-previous-weights/18
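
One possible way to avoid depending on state, sketched here in plain Python with hypothetical weights and helper names: recompute the smoothed weight from a window of history on every run instead of carrying it between runs. The averaging rule (previous result plus current weight, divided by two) is an assumption chosen for illustration.

```python
def smooth_with_state(raw_weight, state):
    """One-day step: average the current weight with the previously stored result."""
    previous = 0.0 if state is None else state["previous_weight"]
    averaged = (previous + raw_weight) / 2
    return averaged, {"previous_weight": averaged}


def smooth_stateless(raw_weights):
    """Single pass over a window of history; nothing is kept between runs."""
    averaged = 0.0
    for w in raw_weights:
        averaged = (averaged + w) / 2
    return averaged


# Hypothetical raw daily weights for one asset.
raw = [0.2, 0.8, 0.5, 0.1, 0.9, 0.4, 0.6, 0.3, 0.7, 0.5, 0.2, 0.8]

# Multi-pass style: call the one-day function every day and carry the state forward.
state = None
for w in raw:
    with_state, state = smooth_with_state(w, state)

# Single-pass style: recompute from the full history, or from the last 10 days only.
from_full_history = smooth_stateless(raw)
from_last_10_days = smooth_stateless(raw[-10:])

print(with_state, from_full_history, from_last_10_days)
```

Replaying the full history gives exactly the state-based value. The 10-day window differs only slightly, because each step halves the influence of older values, and its result does not depend on the order in which days are evaluated.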
-
RE: Accessing both market and index data in strategy()
@buyers_are_back Hello.
Here is an example: example link.
You can view the list of available indexes here.
If you want to use the load_data function, take a look at this example. You can implement the index download by analogy:
-
RE: Acess previous weights
@machesterdragon
That's how it should be. This code is needed so that submissions are processed faster when sent to the contest. The backtest system will calculate the weights for each day. The function I provided calculates weights for only one day.
-
RE: Acess previous weights
@machesterdragon Hello.
If you use a state and a function that returns the prediction for one day, you will not get correct results with precheck.
Theoretically, you can specify the number of partitions as all available days, or you can return all predictions.
I have not checked how the precheck works.
If it runs in parallel, the result is even less likely to be correct.
State in a strategy limits you. I recommend not using it.
Here is an example of a version for one pass; I couldn't test it because my submission did not calculate even one day.
init.ipynb

    ! pip install torch==2.2.1

strategy.ipynb

    import gzip
    import pickle

    from qnt.data import get_env
    from qnt.log import log_err, log_info


    def state_write(state, path=None):
        if path is None:
            path = get_env("OUT_STATE_PATH", "state.out.pickle.gz")
        try:
            with gzip.open(path, 'wb') as gz:
                pickle.dump(state, gz)
            log_info("State saved: " + str(state))
        except Exception as e:
            log_err(f"Error saving state: {e}")


    def state_read(path=None):
        if path is None:
            path = get_env("OUT_STATE_PATH", "state.out.pickle.gz")
        try:
            with gzip.open(path, 'rb') as gz:
                state = pickle.load(gz)
            log_info("State loaded.")
            return state
        except Exception as e:
            log_err(f"Can't load state: {e}")
            return None


    state = state_read()
    print(state)


    # separate cell
    def print_stats(data, weights):
        stats = qns.calc_stat(data, weights)
        display(stats.to_pandas().tail())
        performance = stats.to_pandas()["equity"]
        qngraph.make_plot_filled(performance.index, performance, name="PnL (Equity)", type="log")


    data_train = load_data(train_period)
    models = train_model(data_train)

    data_predict = load_data(lookback_period)

    last_time = data_predict.time.values[-1]

    if last_time < np.datetime64('2006-01-02'):
        print("The first state should be None")
        state_write(None)

    state = state_read()
    print(state)
    weights_predict, state_new = predict(models, data_predict, state)

    print_stats(data_predict, weights_predict)
    state_write(state_new)
    print(state_new)

    qnout.write(weights_predict)  # To participate in the competition, save this code in a separate cell.

But I hope it will work correctly.
Do not expect any responses from me during this week.
-
RE: Acess previous weights
@magenta-kabuto Hello. Use the following example.
Note that the backtest parameters are set for daily prediction of values.
The prediction function is designed to return a value for one day. Later, I will show how to create a single-pass version.

    import xarray as xr
    import qnt.data as qndata
    import qnt.backtester as qnbt
    import qnt.ta as qnta
    import qnt.stats as qns
    import qnt.graph as qngraph
    import qnt.output as qnout
    import numpy as np
    import pandas as pd
    import torch
    from torch import nn, optim
    import random

    asset_name_all = ['NAS:AAPL', 'NAS:GOOGL']
    lookback_period = 155
    train_period = 100


    class LSTM(nn.Module):
        """
        Class to define our LSTM network.
        """

        def __init__(self, input_dim=3, hidden_layers=64):
            super(LSTM, self).__init__()
            self.hidden_layers = hidden_layers
            self.lstm1 = nn.LSTMCell(input_dim, self.hidden_layers)
            self.lstm2 = nn.LSTMCell(self.hidden_layers, self.hidden_layers)
            self.linear = nn.Linear(self.hidden_layers, 1)

        def forward(self, y):
            outputs = []
            n_samples = y.size(0)
            h_t = torch.zeros(n_samples, self.hidden_layers, dtype=torch.float32)
            c_t = torch.zeros(n_samples, self.hidden_layers, dtype=torch.float32)
            h_t2 = torch.zeros(n_samples, self.hidden_layers, dtype=torch.float32)
            c_t2 = torch.zeros(n_samples, self.hidden_layers, dtype=torch.float32)

            for time_step in range(y.size(1)):
                x_t = y[:, time_step, :]  # Ensure x_t is [batch, input_dim]

                h_t, c_t = self.lstm1(x_t, (h_t, c_t))
                h_t2, c_t2 = self.lstm2(h_t, (h_t2, c_t2))
                output = self.linear(h_t2)
                outputs.append(output.unsqueeze(1))

            outputs = torch.cat(outputs, dim=1).squeeze(-1)
            return outputs


    def get_model():
        def set_seed(seed_value=42):
            """Set seed for reproducibility."""
            random.seed(seed_value)
            np.random.seed(seed_value)
            torch.manual_seed(seed_value)
            torch.cuda.manual_seed(seed_value)
            torch.cuda.manual_seed_all(seed_value)  # if you are using multi-GPU.
            torch.backends.cudnn.deterministic = True
            torch.backends.cudnn.benchmark = False

        set_seed(42)
        model = LSTM(input_dim=3)
        return model


    def get_features(data):
        close_price = data.sel(field="close").ffill('time').bfill('time').fillna(1)
        open_price = data.sel(field="open").ffill('time').bfill('time').fillna(1)
        high_price = data.sel(field="high").ffill('time').bfill('time').fillna(1)
        log_close = np.log(close_price)
        log_open = np.log(open_price)
        features = xr.concat([log_close, log_open, high_price], "feature")
        return features


    def get_target_classes(data):
        price_current = data.sel(field='open')
        price_future = qnta.shift(price_current, -1)

        class_positive = 1  # prices goes up
        class_negative = 0  # price goes down

        target_price_up = xr.where(price_future > price_current, class_positive, class_negative)
        return target_price_up


    def load_data(period):
        return qndata.stocks.load_ndx_data(tail=period, assets=asset_name_all)


    def train_model(data):
        features_all = get_features(data)
        target_all = get_target_classes(data)
        models = dict()

        for asset_name in asset_name_all:
            model = get_model()
            target_cur = target_all.sel(asset=asset_name).dropna('time', 'any')
            features_cur = features_all.sel(asset=asset_name).dropna('time', 'any')
            target_for_learn_df, feature_for_learn_df = xr.align(target_cur, features_cur, join='inner')
            criterion = nn.MSELoss()
            optimiser = optim.LBFGS(model.parameters(), lr=0.08)
            epochs = 1
            for i in range(epochs):
                def closure():
                    optimiser.zero_grad()
                    feature_data = feature_for_learn_df.transpose('time', 'feature').values
                    in_ = torch.tensor(feature_data, dtype=torch.float32).unsqueeze(0)
                    out = model(in_)
                    target = torch.zeros(1, len(target_for_learn_df.values))
                    target[0, :] = torch.tensor(np.array(target_for_learn_df.values))
                    loss = criterion(out, target)
                    loss.backward()
                    return loss

                optimiser.step(closure)
            models[asset_name] = model
        return models


    def predict(models, data, state):
        last_time = data.time.values[-1]
        data_last = data.sel(time=slice(last_time, None))

        weights = xr.zeros_like(data_last.sel(field='close'))
        for asset_name in asset_name_all:
            features_all = get_features(data_last)
            features_cur = features_all.sel(asset=asset_name).dropna('time', 'any')
            if len(features_cur.time) < 1:
                continue
            feature_data = features_cur.transpose('time', 'feature').values
            in_ = torch.tensor(feature_data, dtype=torch.float32).unsqueeze(0)
            out = models[asset_name](in_)
            prediction = out.detach()[0]
            weights.loc[dict(asset=asset_name, time=features_cur.time.values)] = prediction

        weights = weights * data_last.sel(field="is_liquid")

        # state may be null, so define a default value
        if state is None:
            default = xr.zeros_like(data_last.sel(field='close')).isel(time=-1)
            state = {
                "previus_weights": default,
            }

        previus_weights = state['previus_weights']

        # align the arrays to prevent problems in case the asset list changes
        previus_weights, weights = xr.align(previus_weights, weights, join='right')

        weights_avg = (previus_weights + weights) / 2

        next_state = {
            "previus_weights": weights_avg.isel(time=-1),
        }

        # print(last_time)
        # print("previus_weights")
        # print(previus_weights)
        # print(weights)
        # print("weights_avg")
        # print(weights_avg.isel(time=-1))

        return weights_avg, next_state


    weights = qnbt.backtest_ml(
        load_data=load_data,
        train=train_model,
        predict=predict,
        train_period=train_period,
        retrain_interval=360,
        retrain_interval_after_submit=1,
        predict_each_day=True,
        competition_type='stocks_nasdaq100',
        lookback_period=lookback_period,
        start_date='2006-01-01',
        build_plots=True
    )

I recommend not using state at all, but rather the approach I mentioned above, because it is faster.
If you need a single-pass version, it is better to load more data, calculate the weights for the previous days as well, and then combine them. That way you will already have the weights for the previous days.
-
RE: Machine Learning - LSTM strategy seems to be forward-looking
@black-magmar Hello. It may look strange at first, but it is not quite what it seems.
Notice how the target classes are derived — a shift into the future is used.
For the last available date in this data, there are no target classes. In each iteration, the backtester operates with the latest forecast. Although the entire series is forecasted, only the last value without an available target class is relevant.
This can be verified by running the function in a single-pass mode and examining the final forecast.
I assume this is done in such a way that the strategy can be run in both single-pass and multi-pass modes.
I became curious, so I will double-check what I have written here.
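
The point about the shifted target can be checked with a small plain-Python sketch. The prices are hypothetical, and make_targets only imitates the idea of comparing each day with the next one; it is not the platform code.

```python
def make_targets(open_prices):
    """Class 1 if the next day's open is higher, 0 otherwise; None when there is no next day."""
    targets = []
    for i, price in enumerate(open_prices):
        if i + 1 < len(open_prices):
            targets.append(1 if open_prices[i + 1] > price else 0)
        else:
            targets.append(None)  # the future price is not known yet
    return targets


open_prices = [100.0, 101.0, 99.0, 102.0]  # hypothetical daily opens
targets = make_targets(open_prices)
print(targets)  # [1, 0, 1, None]

# Training can only use days that have a target, i.e. everything except the last day.
train_days = [i for i, t in enumerate(targets) if t is not None]

# The forecast that matters for trading is the one for the last day,
# which never had a target to learn from.
forecast_day = len(open_prices) - 1
print(train_days, forecast_day)  # [0, 1, 2] 3
```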
-
RE: Acess previous weights
@magenta-kabuto Hello. I didn't understand your question. If you need portfolio weights for the previous day, you can use weights.shift(time=1):

    import qnt.data as qndata

    data = qndata.stocks.load_ndx_data(min_date="2005-01-01", assets=['NAS:GOOGL'])
    close = data.sel(field="close")
    weights = close - close.shift(time=1)
    weights_previous = weights.shift(time=1)

-
RE: Data loading failures
@newbiequant9696 Hello. Everything works correctly in the online environment. Where do you run your code? Try recloning your strategy.

    import qnt.data as qndata
    import qnt.stats as qns
    import qnt.graph as qngraph

    data = qndata.stocks.load_ndx_data(min_date="2005-01-01", assets=['NAS:GOOGL'])

-
RE: Please provide more examples
@machesterdragon Hello. Yes, there can never be too many examples.
Inside the documentation sections, there are examples of strategies and technical indicators.
It's better to keep track of new strategy examples or updates on GitHub: https://github.com/quantiacs
-
RE: Cannot cast ufunc 'multiply' output from dtype('float64') to dtype('int64') with casting rule 'same_kind'
@machesterdragon said in Cannot cast ufunc 'multiply' output from dtype('float64') to dtype('int64') with casting rule 'same_kind':
weights *= asset_filter
Hello. Change
weights *= asset_filter
to
weights = weights * asset_filter
You can see usage examples here:
https://quantiacs.com/documentation/en/user_guide/dynamic_assets_selection.html
Everything is working correctly now, try cloning the strategy.
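
For context, this error comes from NumPy's casting rules: an in-place operation has to store its result in the existing array, so a float result cannot be written back into an integer array. Below is a minimal sketch with made-up arrays. It uses plain NumPy, on the assumption that the strategy's xarray objects behave the same way because they wrap NumPy arrays.

```python
import numpy as np

weights = np.array([1, 0, 1], dtype=np.int64)  # integer weights
asset_filter = np.array([0.5, 1.0, 0.0])       # float filter

# In-place multiplication tries to store float results in the int64 array and fails.
try:
    weights *= asset_filter
    in_place_failed = False
except TypeError as error:  # NumPy raises a TypeError subclass for this casting failure
    in_place_failed = True
    print(error)

# The out-of-place form allocates a new float64 array, so no cast is needed.
weights = weights * asset_filter
print(weights.dtype, weights)
```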
-
RE: How to filter ticker futures by sharpe
@newbiequant96 Hello. You have a problem with the variable weight. You are predicting portfolio weights for stocks with this variable, but you are trying to select the best assets among futures.
If you want to filter by the best futures, then the variable should contain portfolio weights for futures. If you want to filter stocks, then the variable should contain portfolio weights for stocks.
In your code, you are predicting stocks and trying to find the best futures among them.
In this section, I provided an example of code on how to load stocks, futures, and cryptocurrencies together.
https://quantiacs.com/community/topic/556/is-it-possible-to-combine-stocks-with-crypto/2
-
RE: How to filter ticker futures by sharpe
@newbiequant96 said in How to filter ticker futures by sharpe:

    import qnt.stats as qnstats

    data = qndata.stocks.load_ndx_data(tail = 17*365, dims = ("time", "field", "asset"))
    data = qndata.futures_load_data(min_date="2005-01-01")

    def get_best_instruments(data, weights, top_size):
        # compute statistics:
        stats_per_asset = qnstats.calc_stat(data, weights, per_asset=True)
        # calculate ranks of assets by "sharpe_ratio":
        ranks = (-stats_per_asset.sel(field="sharpe_ratio")).rank("asset")
        # select top assets by rank "top_period" days ago:
        top_period = 1
        rank = ranks.isel(time=-top_period)
        top = rank.where(rank <= top_size).dropna("asset").asset

        # select top stats:
        top_stats = stats_per_asset.sel(asset=top.values)

        # print results:
        print("SR tail of the top assets:")
        display(top_stats.sel(field="sharpe_ratio").to_pandas().tail())
        print("avg SR = ", top_stats[-top_period:].sel(field="sharpe_ratio").mean("asset")[-1].item())
        display(top_stats)
        return top_stats.coords["asset"].values

    get_best_instruments(data, weight, 15)

Hello. I suppose there is an issue with the variable weight in your code. Here is a working example with futures selection.

    import qnt.data as qndata
    import qnt.stats as qnstats
    import qnt.ta as qnta

    data = qndata.futures_load_data(min_date="2005-01-01")


    def strategy(data, params):
        s_ = qnta.trix(data.sel(field='high'), params[0])
        w_1 = s_.shift(time=params[1]) > s_.shift(time=params[2])
        w_2 = s_.shift(time=params[3]) > s_.shift(time=params[4])
        weights = (w_1 * w_2)
        return weights.fillna(0)


    weights = strategy(data, [196, 125, 76, 12, 192])


    def get_best_instruments(data, weights, top_size):
        # compute statistics:
        stats_per_asset = qnstats.calc_stat(data, weights, per_asset=True)
        # calculate ranks of assets by "sharpe_ratio":
        ranks = (-stats_per_asset.sel(field="sharpe_ratio")).rank("asset")
        # select top assets by rank "top_period" days ago:
        top_period = 1
        rank = ranks.isel(time=-top_period)
        top = rank.where(rank <= top_size).dropna("asset").asset

        # select top stats:
        top_stats = stats_per_asset.sel(asset=top.values)

        # print results:
        print("SR tail of the top assets:")
        display(top_stats.sel(field="sharpe_ratio").to_pandas().tail())
        print("avg SR = ", top_stats[-top_period:].sel(field="sharpe_ratio").mean("asset")[-1].item())
        display(top_stats)
        return top_stats.coords["asset"].values


    get_best_instruments(data, weights, 5)

-
RE: Fundamental data incomplete?
@buyers_are_back Hello. Yes, you can define your function as mentioned in the documentation.
https://quantiacs.com/documentation/en/data/fundamental.html
Please pay attention to the section "Potential Issues in Working with Fundamental Data".
The main problem with market capitalization is that it is necessary to adjust the price for splits, and these data are not always available on time. For example, the report with the number of shares is published after the split on the exchange.
The page contains links to where the data comes from (sec.gov).

    import numpy as np
    import qnt.data as qndata
    import qnt.data.secgov_fundamental as fundamental


    def build_market_capitalization(fundamental_facts):
        shares = fundamental.build_shares(fundamental_facts)
        prices_no_split = qndata.restore_origin_data(market_data, make_copy=True)
        close_price = prices_no_split.sel(field='close')
        market_capitalization = shares * close_price
        return market_capitalization


    custom_builder = {
        'market_capitalization': {
            'facts': fundamental.FACT_GROUPS['shares'],
            'build': build_market_capitalization,
        },
    }

    market_data = qndata.stocks.load_ndx_data(min_date="2005-01-01")

    indicators_data = fundamental.load_indicators_for(market_data,
                                                      indicator_names=['market_capitalization'],
                                                      indicators_builders=custom_builder)

    display(indicators_data.sel(field="market_capitalization").to_pandas().tail(2))
    display(indicators_data.sel(asset='NAS:AAPL').to_pandas().tail(2))
    display(indicators_data.sel(asset=['NAS:AAPL']).sel(field="market_capitalization").to_pandas().tail(2))
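
The split timing issue described in this post can be illustrated with simple arithmetic. All numbers are hypothetical: a company with 1,000 reported shares trading at 400 does a 4-for-1 split, and the report with the new share count arrives only later.

```python
split_ratio = 4                # hypothetical 4-for-1 split
shares_in_last_report = 1_000  # filed before the split, not yet updated
close_before_split = 400.0
close_after_split = close_before_split / split_ratio  # price on the exchange after the split

# Before the split, price and share count are consistent.
cap_before_split = shares_in_last_report * close_before_split

# Right after the split, the stale share count meets the new price.
cap_with_stale_shares = shares_in_last_report * close_after_split

# Once the new report arrives, the share count reflects the split.
cap_with_updated_shares = shares_in_last_report * split_ratio * close_after_split

print(cap_before_split)         # 400000.0
print(cap_with_stale_shares)    # 100000.0
print(cap_with_updated_shares)  # 400000.0
```

Until the updated report is available, the computed capitalization drops by the split factor even though nothing changed economically, so it is worth checking such jumps before using the indicator in a strategy.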