
Futures - BLS Macro Data

This template uses data from the Bureau of Labor Statistics for trading futures contracts.

You can clone and edit this example from the Examples tab.

The U.S. Bureau of Labor Statistics is the principal agency for the U.S. government in the field of labor economics and statistics. It provides macroeconomic data in several categories: prices, employment and unemployment, compensation and working conditions, and productivity.

Quantiacs has implemented these datasets on its cloud and also makes them available for local use on your machine.

In this template we show how to use the BLS data for creating a trading algorithm.

Need help? Check the Documentation and find solutions/report problems in the Forum section.

More help with Jupyter? Check the official Jupyter page.

Check the BLS documentation on the Quantiacs macroeconomics help page.

Once you are done, click on Submit to the contest and take part in our competitions.

API reference:

data: check how to work with data;
backtesting: read how to run the simulation and check the results.

Need to use the optimizer function to automate tedious tasks?

optimization: read more in our article.

In [ ]:

import pandas as pd
import numpy as np

import qnt.data as qndata

In [ ]:

%%javascript
window.IPython && (IPython.OutputArea.prototype._should_scroll = function(lines) { return false; })
// disable widget scrolling

First of all we list the 34 available datasets and inspect them:

In [ ]:

dbs = qndata.blsgov.load_db_list()

display(pd.DataFrame(dbs)) # convert to pandas for better formatting

For each dataset you can see the identifier, the name and the date of the last available update. Each dataset contains several time series which can be used as indicators.

In this example we use AP (Average Price Data). Average consumer prices are calculated for household fuel, motor fuel and food items from prices collected for the Consumer Price Index (CPI). The full description is available in the metadata.

Let us load and display the time series contained in the AP dataset:

In [ ]:

series_list = list(qndata.blsgov.load_series_list('AP'))

display(pd.DataFrame(series_list).set_index('id')) # convert to pandas for better formatting

As you can see, the AP Average Price Data dataset contains 1479 time series.

Let us see how we can learn the meaning of the 8 columns. Some of them are obvious, like series_title, begin_year or end_year, but others are not, like area_code, item_code, begin_period, end_period.

Inspect the metadata

The Quantiacs toolbox allows you to inspect the meaning of all fields:

In [ ]:

meta = qndata.blsgov.load_db_meta('AP')

for k in meta.keys():
    print('### ' + k + " ###")
    m = meta[k]

    if isinstance(m, str):
        # Show only the first line if this is a text entry.
        print(m.split('\n')[0])
        print('...')
        # Uncomment the next line to see the full text. It will give you more details about the database.
        # print(m)

    if isinstance(m, dict):
        # convert dictionaries to pandas DataFrame for better formatting:
        df = pd.DataFrame(m.values())
        df = df.set_index(np.array(list(m.keys())))
        display(df)

These tables allow you to quickly understand the meaning of the fields for each time series in the Average Price Data.
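For dictionary entries, the same kind of table can also be built with `pd.DataFrame.from_dict`. The following is a minimal sketch on a hand-made dictionary: the variable `area_meta`, the field name `area_name` and the listed values are illustrative assumptions, not the actual output of `load_db_meta`.

```python
import pandas as pd

# Hypothetical stand-in for one dictionary entry of the metadata;
# the real content comes from qndata.blsgov.load_db_meta('AP').
area_meta = {
    '0000': {'area_name': 'U.S. city average'},
    '0100': {'area_name': 'Northeast'},
}

# orient='index' uses the outer keys as the row index, so every
# code becomes one row and every inner key becomes one column.
area_df = pd.DataFrame.from_dict(area_meta, orient='index')
print(area_df)

# Look up the meaning of a single code.
print(area_df.loc['0000', 'area_name'])
```

With the codes in the index, a single lookup with `.loc` is enough to translate a value such as an `area_code` into its description.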

The area_code column reflects the U.S. area connected to the time series, for example 0000 for the entire U.S.

Let us select only time series related to the entire U.S.:

In [ ]:

us_series_list = [s for s in series_list if s['area_code'] == '0000']

display(pd.DataFrame(us_series_list).set_index('id')) # convert to pandas for better formatting

We have 160 time series out of the original 1479. These nationwide series are more relevant for forecasting global financial markets than regional ones. Let us select the time series that are currently being updated and have at least 20 years of history:

In [ ]:

actual_us_series_list = [s for s in us_series_list if s['begin_year'] <= '2000' and s['end_year'] == '2021' ]

display(pd.DataFrame(actual_us_series_list).set_index('id')) # convert to pandas for better formatting
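The selection above compares years as strings and hard-codes the most recent year. A possible variant, sketched here on made-up records, derives the latest year from the list itself and compares years as integers. The helper `select_active` and its parameter `min_history_years` are hypothetical names introduced for illustration; only the fields `begin_year` and `end_year` come from the cells above.

```python
# Made-up records that mimic the fields used above; real records come
# from qndata.blsgov.load_series_list('AP').
sample_series = [
    {'id': 'S1', 'begin_year': '1980', 'end_year': '2021'},
    {'id': 'S2', 'begin_year': '2010', 'end_year': '2021'},
    {'id': 'S3', 'begin_year': '1990', 'end_year': '2005'},
]


def select_active(series, min_history_years=20):
    """Keep series that are still updated and have a long enough history."""
    # Assumption: the most recent end_year found in the list marks
    # the series that are currently being updated.
    latest_year = max(int(s['end_year']) for s in series)
    return [
        s for s in series
        if int(s['end_year']) == latest_year
        and latest_year - int(s['begin_year']) >= min_history_years
    ]


selected = select_active(sample_series)
print([s['id'] for s in selected])
```

In this toy list only `S1` survives: `S2` is too short and `S3` is no longer updated. The benefit of this variant is that the filter does not need to be edited when a new year of data arrives.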

In [ ]:

len(actual_us_series_list)

We have 55 time series whose history is long enough for our purpose. Now we can load one of these series and use it for our strategy. Let us focus on energy markets. We consider fuel oil APU000072511 on a monthly basis:

In [ ]:

series_data = qndata.blsgov.load_series_data('APU000072511', tail = 30*365)

# convert to pandas.DataFrame
series_data = pd.DataFrame(series_data)
series_data = series_data.set_index('pub_date')

# remove yearly average data, see period dictionary
series_data = series_data[series_data['period'] != 'M13']

series_data
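The filter on `period` relies on the BLS convention that `M01` to `M12` denote months and `M13` denotes the annual average. As a rough sketch of how these codes can be decoded, the example below uses made-up rows. The column name `year` is an assumption about the record layout, and the derived `ref_date` is the reference month of the observation, not the `pub_date` used as index in this notebook.

```python
import pandas as pd

# Made-up rows; the 'year' column name is an assumption, while
# 'period' and 'value' are the fields used in this notebook.
rows = pd.DataFrame({
    'year': [2020, 2020, 2020],
    'period': ['M11', 'M12', 'M13'],
    'value': [2.1, 2.2, 2.0],
})

# Drop the annual average (M13) and keep the monthly observations.
monthly = rows[rows['period'] != 'M13'].copy()

# Turn 'M11' into the integer 11.
monthly['month'] = monthly['period'].str[1:].astype(int)

# Build the first day of the reference month from year and month.
monthly['ref_date'] = pd.to_datetime(monthly[['year', 'month']].assign(day=1))

print(monthly[['ref_date', 'value']])
```

Keeping `M13` would mix an annual average into a monthly series, which is why it is removed before the data is used as an indicator.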

Next, let us consider Futures contracts in the Energy sector:

In [ ]:

futures_list = qndata.futures_load_list()

energy_futures_list = [f for f in futures_list if f['sector'] == 'Energy']

pd.DataFrame(energy_futures_list)

We consider Brent Crude Oil, F_BC, and define a strategy using a multi-pass approach:

In [ ]:

import xarray as xr
import numpy as np
import pandas as pd

import qnt.ta as qnta
import qnt.backtester as qnbt
import qnt.data as qndata


def load_data(period):
    
    futures = qndata.futures_load_data(assets=['F_BC'], tail=period, dims=('time','field','asset'))
    
    ap = qndata.blsgov.load_series_data('APU000072511', tail=period)
    
    # convert to pandas.DataFrame
    ap = pd.DataFrame(ap) 
    ap = ap.set_index('pub_date') 

    # remove yearly average data, see period dictionary
    ap = ap[ap['period'] != 'M13']
    
    # convert to xarray
    ap = ap['value'].to_xarray().rename(pub_date='time').assign_coords(time=pd.to_datetime(ap.index.values))
    
    # return both time series
    return dict(ap=ap, futures=futures), futures.time.values


def window(data, max_date: np.datetime64, lookback_period: int):
    # the window function isolates data which are needed for one iteration
    # of the backtester call
    
    min_date = max_date - np.timedelta64(lookback_period, 'D')
    
    return dict(
        futures = data['futures'].sel(time=slice(min_date, max_date)),
        ap = data['ap'].sel(time=slice(min_date, max_date))
    )


def strategy(data, state):
    
    close = data['futures'].sel(field='close')
    ap = data['ap']
    
    # the strategy complements indicators based on the Futures price with macro data
    # and goes long/short or takes no exposure:
    
    if ap.isel(time=-1) > ap.isel(time=-2) \
            and close.isel(time=-1) > close.isel(time=-20):
        return xr.ones_like(close.isel(time=-1)), 1
    
    elif ap.isel(time=-1) < ap.isel(time=-2) \
            and ap.isel(time=-2) < ap.isel(time=-3) \
            and ap.isel(time=-3) < ap.isel(time=-4) \
            and close.isel(time=-1) < close.isel(time=-40):
        return -xr.ones_like(close.isel(time=-1)), 1 
    
    # When the state is None, we are in the beginning and no weights were generated.
    # We use buy'n'hold to fill these first days.
    elif state is None: 
        return xr.ones_like(close.isel(time=-1)), None
    
    else:
        return xr.zeros_like(close.isel(time=-1)), 1


weights, state = qnbt.backtest(
    competition_type='futures',
    load_data=load_data,
    window=window,
    lookback_period=365,
    start_date="2006-01-01",
    strategy=strategy,
    analyze=True,
    build_plots=True
)
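The decision rule inside `strategy` can be checked in isolation before running a full backtest. The sketch below restates the rule with plain pandas Series instead of xarray and feeds it synthetic data. The function `signal` and all series are hypothetical and only meant to illustrate the logic; the initial buy-and-hold branch that depends on `state` is left out.

```python
import numpy as np
import pandas as pd


def signal(close: pd.Series, ap: pd.Series) -> int:
    """Return 1 (long), -1 (short) or 0 (flat), following the rules of strategy()."""
    # Macro value rose over the last observation.
    ap_up = ap.iloc[-1] > ap.iloc[-2]
    # Macro value fell three observations in a row.
    ap_down_3 = ap.iloc[-1] < ap.iloc[-2] < ap.iloc[-3] < ap.iloc[-4]

    if ap_up and close.iloc[-1] > close.iloc[-20]:
        return 1
    if ap_down_3 and close.iloc[-1] < close.iloc[-40]:
        return -1
    return 0


# Synthetic data: 60 daily closes and 4 monthly macro values.
days = pd.date_range('2020-01-01', periods=60, freq='D')
months = pd.date_range('2019-09-01', periods=4, freq='MS')

rising_close = pd.Series(np.linspace(50.0, 60.0, 60), index=days)
falling_close = pd.Series(np.linspace(60.0, 50.0, 60), index=days)
rising_ap = pd.Series([2.0, 2.1, 2.2, 2.3], index=months)
falling_ap = pd.Series([2.3, 2.2, 2.1, 2.0], index=months)

print(signal(rising_close, rising_ap))    # both rising: long
print(signal(falling_close, falling_ap))  # both falling: short
print(signal(rising_close, falling_ap))   # disagreement: flat
```

The rule needs at least 40 daily closes and 4 macro observations in each window. With daily futures data and monthly macro data, the `lookback_period=365` passed to the backtester should provide enough of both.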