Tag Archives: backtrader

How to create long-short strategy using backtrader

In the last post, How to build machine learning model to generate trading signal, I shared the general processing flow for applying machine-learning-based signal prediction to a long-only strategy. In this post I share how to create a long-short strategy with backtrader to improve trading performance. I will cover four topics:

  • How to use personalized data format (CSV) in backtrader
  • How to add personalized signal line in plot
  • How to set a long-short strategy
  • Performance comparison between long-only and long-short strategies

How to use personalized CSV format

Backtrader provides an API for reading Yahoo-format OHLC CSV files. When the CSV has extra columns, you need to write your own reader. For example, in machine-learning-based trading, I predict a buy/sell/hold signal and write the OHLC data together with the predicted signal to a CSV file, like this:

Example of ML-based signal generation for trading (column predict is the signal: 0 = hold, 1 = buy, 2 = sell)
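The screenshot itself is not reproduced here. As a rough sketch of the layout described above, the snippet below parses a tiny CSV with Yahoo-style OHLCV columns plus a signal column and checks that every signal code is 0, 1 or 2. The dates and prices are made up, the column name `signal` is an assumption taken from the reader's column mapping, and `read_signals` is a hypothetical helper, not part of backtrader.

```python
import csv
import io

# Made-up rows for illustration only; not real market data.
SAMPLE_CSV = """Date,Open,High,Low,Close,Adj Close,Volume,signal
2022-01-03,100.0,102.5,99.5,102.0,102.0,1000000,1
2022-01-04,102.0,103.0,100.5,101.0,101.0,1200000,0
2022-01-05,101.0,101.5,97.0,97.5,97.5,1500000,2
"""

SIGNAL_NAMES = {0: "hold", 1: "buy", 2: "sell"}


def read_signals(text):
    """Return (date, close, signal name) tuples, rejecting unknown signal codes."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        code = int(row["signal"])
        if code not in SIGNAL_NAMES:
            raise ValueError(f"unexpected signal {code} on {row['Date']}")
        rows.append((row["Date"], float(row["Close"]), SIGNAL_NAMES[code]))
    return rows


parsed = read_signals(SAMPLE_CSV)
for date, close, name in parsed:
    print(date, close, name)
```

Validating the codes before the backtest catches a mislabeled file early, instead of letting an unexpected value silently turn into a missing signal.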

The Yahoo CSV reader in backtrader cannot read this file. To load it, write a new data feed class derived from the base class bt.feeds.PandasData and read the CSV with pandas.

class MLPredictCsv(bt.feeds.PandasData):
    '''
    Desc: for customized CSV format
    '''

    # What new data will be available in the strategy's lines object
    lines = ('predict',)

    # Which columns go to which variable
    params = (
        ('open', 'Open'),
        ('high', 'High'),
        ('low', 'Low'),
        ('close', 'Close'),
        ('Adj Close','Adj Close'),
        ('volume', 'Volume'),
        ('openinterest', 'openinterest'),
        ('predict', 'signal'),
    )

In the reader above, each element of params is a mapping: the first value is the backtrader internal line name used during the backtest, and the second is the column name in the CSV. You can then load your CSV as follows and use data as a normal data feed.

mlpredicted_signal = pd.read_csv(ml_predict_csv_file,
                                 parse_dates=True,
                                 index_col=0)
data = MLPredictCsv(dataname=mlpredicted_signal)

How to personalize signal line

Because the signal is predicted by a third-party tool, the buy/sell/hold signal is not displayed in the plot automatically. To display your own line, write another class derived from bt.Indicator.

class MLSignal(bt.Indicator):

    lines = ('predict',)

    def __init__(self):
        self.lines.predict = self.data0.predict

Then instantiate it in __init__() of the strategy class, like this:

class MLStrategy(bt.Strategy):
    params = dict(
        onlylong = False
    )
    
    def log(self, txt, dt=None):
        ''' Logging function for this strategy'''
        dt = dt or self.datas[0].datetime.date(0)
        print('%s, %s' % (dt.isoformat(), txt))

    def __init__(self):
        # Keep a reference to the "close" line in the data[0] dataseries
        self.dataclose = self.datas[0].close
        self.dataopen = self.datas[0].open
        self.ml_signal = MLSignal(self.data)
        # To keep track of pending orders
        self.order = None

The MLSignal predict line will then be shown in the plot. Without the indicator, it is not.

How to use long-short strategy

The backtrader source code includes an example file, LongShortStrategy.py, that shows how to use long-short. Below is my complete code for testing both the long-short and the long-only strategy.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

'''
long-short strategy evaluation
'''
from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

import backtrader as bt
import pandas as pd
import quantstats
from backtrader.analyzers import TradeAnalyzer

# onlylong = False

class MLSignal(bt.Indicator):

    lines = ('predict',)

    def __init__(self):
        self.lines.predict = self.data0.predict

class MLPredictCsv(bt.feeds.PandasData):
    '''
    Desc: for customized CSV format
    '''

    # What new data will be available in the strategy's lines object
    lines = ('predict',)

    # Which columns go to which variable
    params = (
        ('open', 'Open'),
        ('high', 'High'),
        ('low', 'Low'),
        ('close', 'Close'),
        ('Adj Close','Adj Close'),
        ('volume', 'volume'),
        ('openinterest', 'openinterest'),
        ('predict', 'signal'),
    )

# Create a Strategy
class MLStrategy(bt.Strategy):
    params = dict(
        onlylong = False
    )
    
    def log(self, txt, dt=None):
        ''' Logging function for this strategy'''
        dt = dt or self.datas[0].datetime.date(0)
        print('%s, %s' % (dt.isoformat(), txt))

    def __init__(self):
        # Keep a reference to the "close" line in the data[0] dataseries
        self.dataclose = self.datas[0].close
        self.dataopen = self.datas[0].open
        self.ml_signal = MLSignal(self.data)
        # To keep track of pending orders
        self.order = None

    def notify_order(self, order):
        if order.status in [order.Submitted, order.Accepted]:
            # Buy/Sell order submitted/accepted to/by broker - Nothing to do
            return

        # Check if an order has been completed
        # Attention: broker could reject order if not enough cash
        if order.status in [order.Completed]:
            if order.isbuy():
                self.log('BUY EXECUTED, %.2f' % order.executed.price)
            elif order.issell():
                self.log('SELL EXECUTED, %.2f' % order.executed.price)

            self.bar_executed = len(self)

        elif order.status in [order.Canceled, order.Margin, order.Rejected]:
            self.log('Order Canceled/Margin/Rejected')

        # Write down: no pending order
        self.order = None

    def next(self):
        # If an order is still pending, do not send a new one
        if self.order:
            return

        # Current value of the predicted signal: 1 = buy, -1 = sell, 0 = hold
        signal = self.ml_signal.lines.predict[0]
        if signal > 0:
            # Buy signal: close any open position, then go long
            if self.position:
                self.log('CLOSE POSITION, %.2f' % self.dataclose[0])
                self.close()
            self.log('BUY CREATE, %.2f' % self.dataclose[0])
            self.order = self.buy()
        elif signal < 0:
            # Sell signal: close any open position
            if self.position:
                self.log('CLOSE POSITION, %.2f' % self.dataclose[0])
                self.close()
            # Open a short only when running in long-short mode
            if not self.p.onlylong:
                self.log('SELL CREATE, %.2f' % self.dataclose[0])
                self.sell()

def strategyEvaluate(tick_symbol, ml_predict_csv, strategy_log_file, quant_output, quant_output_html, n_stake = 40, cash_capital = 1000, is_onlylong = False):
    '''
    Desc: ML predict signal based strategy evaluation

    '''
    mlpredicted_signal = pd.read_csv(ml_predict_csv,
                                parse_dates=True,
                                index_col=0,                             
                            )
    # Map classifier labels to signed signals: 0 = hold, 1 = buy (long), 2 -> -1 = sell (short)
    signal_map = {0: 0, 1: 1, 2: -1}
    mlpredicted_signal['signal'] = mlpredicted_signal['signal'].map(signal_map)
    
    mlpredicted_signal['openinterest'] = 0
    data = MLPredictCsv(dataname=mlpredicted_signal)

    # create Cerebro instance and attach data to it
    cerebro = bt.Cerebro()
    cerebro.adddata(data)
    # Add a strategy
    cerebro.addstrategy(MLStrategy, onlylong = is_onlylong)
    # Set our desired cash start
    cerebro.broker.setcash(cash_capital)

#     cerebro.broker.setcommission(commission=0)

    # Add analyzers to Cerebro
    cerebro.addanalyzer(bt.analyzers.SharpeRatio, _name='sharpe_ratio')
    cerebro.addanalyzer(bt.analyzers.PyFolio, _name='PyFolio')
    cerebro.addanalyzer(TradeAnalyzer)

    # better net liquidation value view
    cerebro.addobserver(bt.observers.Value)

    # Default position size
    cerebro.addsizer(bt.sizers.SizerFix, stake=n_stake)    
    
    #add output log file 
    cerebro.addwriter(bt.WriterFile, csv=True, out=strategy_log_file)    


    # Print out the starting conditions
    print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())

    # Run over everything
    results = cerebro.run()

    # Print out the final result
    print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())

    #Get strategy stats
    strat = results[0]
    portfolio_stats = strat.analyzers.getbyname('PyFolio')
    returns, positions, transactions, gross_lev = portfolio_stats.get_pf_items()
    returns.index = returns.index.tz_convert(None)
    
    # print(returns)
    # print(positions)
    # print(portfolio_stats)
    quantstats.reports.html(returns, output = quant_output, download_filename = quant_output_html, title = tick_symbol)
    
    import webbrowser
    webbrowser.open(quant_output_html)

    cerebro.plot(iplot=False)

if __name__ == '__main__':
    tick_symbol = 'TQQQ'
    signal_file = f'{tick_symbol}_predict_eval_test.csv'
    o_strategy_log_file = f'{tick_symbol}_predict_eval_strategy.log'
    o_quant_output_file = f'{tick_symbol}_predict_eval_strategy.stats'
    o_quant_output_html = f'{tick_symbol}_predict_eval_strategy.html'
    n_stake = 10
    cash_capital = 1000
    is_onlylong = False #True, False=long-short, True=only long without short

    strategyEvaluate(tick_symbol, signal_file, o_strategy_log_file, o_quant_output_file, o_quant_output_html, n_stake, cash_capital, is_onlylong)
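
To make the difference between the two modes concrete, here is a small library-free sketch of the position that the next() logic above aims for, assuming every order fills in full and ignoring the one-bar execution delay. The helper name `target_positions` and the sample signal sequence are hypothetical; signals use the mapped values (1 = buy, -1 = sell, 0 = hold).

```python
def target_positions(signals, stake=10, onlylong=False):
    """Position size held after each signal, assuming every order fills.

    signals: iterable of 1 (buy), -1 (sell) or 0 (hold).
    """
    position = 0
    history = []
    for sig in signals:
        if sig > 0:
            # Close any open position, then go long
            position = stake
        elif sig < 0:
            # Close any open position; open a short only in long-short mode
            position = 0 if onlylong else -stake
        # sig == 0: keep the current position
        history.append(position)
    return history


signals = [0, 1, 0, -1, 0, 1]  # made-up sequence
long_only = target_positions(signals, onlylong=True)
long_short = target_positions(signals, onlylong=False)
print(long_only)
print(long_short)
```

The only difference is what happens on a sell signal: the long-only run goes flat, while the long-short run flips to a short of the same size.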

Performance comparison between longshort & long only

Long only
Longshort

The most obvious difference between long only and longshort is in 2022, where longshort performs considerably better than long only.

Long only
Longshort

Summary

Issues in the ML-predicted signal:

  • Prediction is still not robust enough. Changes to some parameters significantly affect performance.
  • The signal is predicted daily, which leads to too many in/out operations. The signal needs further post-processing.
  • Shorting is still very dangerous. As I am a beginner, this is only for paper trading and testing. A complete strategy needs to include many factors such as signal pruning, risk control and capital management, which I currently have no knowledge of.
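
One simple way to cut down the number of in/out operations is to act on a new signal only after it has repeated for several consecutive days. The sketch below is a generic smoothing idea, not the method used in this post; the function name `confirm_signals`, the `min_run` threshold and the sample predictions are all assumptions for illustration.

```python
def confirm_signals(signals, min_run=2):
    """Keep a buy/sell signal only on the bar where it has appeared
    min_run times in a row; every other bar becomes hold (0)."""
    confirmed = []
    run_value, run_length = None, 0
    for sig in signals:
        if sig == run_value:
            run_length += 1
        else:
            run_value, run_length = sig, 1
        # Emit the signal once, on the bar where the run reaches min_run
        confirmed.append(sig if sig != 0 and run_length == min_run else 0)
    return confirmed


raw = [1, -1, 1, 1, 1, 0, -1, -1, 0, 1]  # made-up daily predictions
print(confirm_signals(raw, min_run=2))
```

The trade-off is a delay of `min_run - 1` bars on every entry and exit, so whether this helps has to be checked in the backtest.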

How to build machine learning model to generate trading signal

Recently I have been working on how to build a machine learning model to predict trading signals and build a winning strategy. Technical traders read the price trend curve and use many technical-analysis signals, such as Bollinger Bands and MACD, to guide their operations. But these are post-hoc analyses of history and do not have prediction power. I use the price history of the target stock together with that of other selected securities to extract statistics of price movement over different look-back windows. I then collect a set of feature samples and label each sample as buy, sell or hold, which makes this a 3-class classification problem.

In general, the overall processing flow is as follows:

  • Download the price history of the target stock and of the other selected stocks used to enrich it
  • Feature engineering, e.g. extract statistics over an n-day look-back window. I do not use the price directly as a feature because it depends on the actual price level, is sensitive to price scale, and is difficult to transfer to other tasks
  • Label each sample as buy, hold or sell using the selected criterion
  • Develop the machine learning model
    • Split the data into train, development and evaluation sets along the time axis
    • Train and optimize the model based on the development set
    • Predict buy, sell and hold signals on the evaluation set. Save the predictions to a file for the following analysis
  • Based on the predicted signal, use backtrader, https://www.backtrader.com/, to backtest the performance of the machine-learning-based strategy
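
The feature, labelling and split steps above can be sketched with the standard library alone. The post does not state its look-back windows, labelling criterion or split ratios, so the ones used here (a 3-day look-back, a ±2% next-day return threshold, a 60/20/20 split) are placeholders, the function names are hypothetical, and the prices are made up.

```python
from statistics import mean, pstdev


def lookback_features(closes, window=3):
    """Scale-free statistics of daily returns over the last `window` days."""
    # returns[j] is the return realized on day j + 1
    returns = [closes[i] / closes[i - 1] - 1 for i in range(1, len(closes))]
    features = []
    for t in range(window, len(closes)):
        recent = returns[t - window:t]  # returns of days t-window+1 .. t
        features.append((t, mean(recent), pstdev(recent)))
    return features


def label(closes, t, threshold=0.02):
    """0 = hold, 1 = buy, 2 = sell, from the next day's return (placeholder criterion)."""
    next_return = closes[t + 1] / closes[t] - 1
    if next_return > threshold:
        return 1
    if next_return < -threshold:
        return 2
    return 0


def chronological_split(samples, train=0.6, dev=0.2):
    """Split in time order so later data never leaks into training."""
    n_train = int(len(samples) * train)
    n_dev = int(len(samples) * dev)
    return (samples[:n_train],
            samples[n_train:n_train + n_dev],
            samples[n_train + n_dev:])


closes = [100, 101, 99, 102, 105, 104, 100, 101, 103, 108, 107, 106]  # made up
samples = [
    (feat, label(closes, t))
    for t, *feat in lookback_features(closes)
    if t + 1 < len(closes)  # the last day has no next-day label
]
train_set, dev_set, eval_set = chronological_split(samples)
print(len(train_set), len(dev_set), len(eval_set))
```

Using returns rather than raw prices keeps the features independent of the price level, and slicing by position rather than shuffling keeps the evaluation set strictly later in time than the training set.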

Some thoughts:

  • Using the prices of extra securities improves prediction power
  • Even a small increase in prediction accuracy, e.g. 1%, leads to higher gains and a better Sharpe ratio
  • Feature engineering is very important
  • Next steps:
    • The strategy needs further improvement
    • Post-process the predicted signal to increase stability
    • Add capital management
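
For reference, the Sharpe ratio mentioned above can be computed from a series of daily returns as in the sketch below. It uses the textbook definition with a zero risk-free rate and 252 trading days per year, both of which are assumptions, so the number may differ from what a backtesting analyzer reports under its own settings. The two return series are made up and only show how avoiding part of one loss moves the ratio.

```python
from math import sqrt
from statistics import mean, stdev


def annualized_sharpe(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Mean excess return over its sample standard deviation, scaled to a year."""
    excess = [r - risk_free_daily for r in daily_returns]
    spread = stdev(excess)
    if spread == 0:
        raise ValueError("returns have zero variance; Sharpe ratio is undefined")
    return mean(excess) / spread * sqrt(periods_per_year)


baseline = [0.01, -0.02, 0.015, -0.005, 0.0]  # made-up daily returns
improved = [0.01, -0.01, 0.015, -0.005, 0.0]  # same, with one loss halved
print(annualized_sharpe(baseline), annualized_sharpe(improved))
```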

I have no finance or trading experience. This is just a personal project to investigate whether my ML and systems experience can work in this domain. Discussion is welcome, and please leave a comment if you are interested.

Backtest performance on APPLE.

backtest on TQQQ

Signal to suggest operation next day