Neural networks for algorithmic trading. Simple time series forecasting (2024)

Alex Honchar

Jun 18, 2016

I want to implement a trading system from scratch based only on deep learning approaches, so for every problem we have here (price prediction, trading strategy, risk management) we are going to use different variations of artificial neural networks (ANNs) and check how well they handle it.

I plan to work through the following sections:

Simple time series forecasting (and mistakes made)
Correct 1D time series forecasting + backtesting
Multivariate time series forecasting
Volatility forecasting and custom losses
Multitask and multimodal learning
Hyperparameters optimization
Enhancing classical strategies with neural nets
Probabilistic programming and Pyro forecasts

I highly recommend checking out the code and the IPython Notebook in this repository.

In this first part, I want to show how MLPs, CNNs and RNNs can be used for financial time series prediction. We are not going to use any feature engineering here. Let’s just consider a historical dataset of S&P 500 index price movements. We have daily open, close, high and low prices and trading volume from 1950 to 2016. First, we will try to predict the close price at the end of the next day; second, we will try to predict the return (close price minus open price). Download the dataset from Yahoo Finance or from this repository.

We will treat our problem as 1) a regression problem (trying to forecast the exact close price or return on the next day) and 2) a binary classification problem (the price will go up [1; 0] or down [0; 1]).

For training the NNs we are going to use the Keras framework.

First let’s prepare our data for training. We want to predict the value at t+1 based on information from the N previous days. For example, having close prices from the past 30 days on the market, we want to predict what the price will be tomorrow, on the 31st day.

We use the first 90% of the time series as the training set (consider it historical data) and the last 10% as the test set for model evaluation.

Here is an example of loading the raw input data, splitting it into training samples and preprocessing it:
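
As a rough illustration of the windowing and the 90/10 split, here is a minimal sketch in plain NumPy. The function name `make_windows`, its parameters and the synthetic price series are assumptions made for this example; it is not the repository's `split_into_chunks` code, and in practice `prices` would come from the close column of the downloaded dataset.

```python
import numpy as np

def make_windows(series, window=30, train_frac=0.9):
    """Cut a 1D series into (past window, next value) pairs and split them chronologically."""
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for start in range(len(series) - window):
        X.append(series[start:start + window])  # the N previous days
        y.append(series[start + window])        # the value on day t+1
    X, y = np.array(X), np.array(y)
    split = int(len(X) * train_frac)            # first 90% for training, last 10% for testing
    return X[:split], y[:split], X[split:], y[split:]

# Synthetic stand-in for close prices (assumption: real data would be loaded from the CSV file).
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0, 1, size=1000))

X_train, y_train, X_test, y_test = make_windows(prices, window=30)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
```

The split is done by position rather than by random shuffling, so every test window lies later in time than every training window.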

Our first model is just a perceptron with two hidden layers. The number of hidden neurons is chosen empirically; we will work on hyperparameter optimization in later sections. Between the two hidden layers we add one Dropout layer to prevent overfitting.

The important parts are Dense(1), Activation(‘linear’) and ‘mse’ in the compile section. We want one output that can be in any range (we predict a real value), and our loss function is defined as mean squared error.
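
A sketch of such a network, written against the Keras API bundled with TensorFlow, could look like this. The layer sizes (500 and 250), the dropout rate and the optimizer are placeholder assumptions, not the values used in the repository.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_mlp(window=20, hidden1=500, hidden2=250, dropout=0.25):
    """Two hidden layers with Dropout in between and a single linear output."""
    model = keras.Sequential([
        keras.Input(shape=(window,)),          # one window of past close prices
        layers.Dense(hidden1, activation="relu"),
        layers.Dropout(dropout),               # regularization between the hidden layers
        layers.Dense(hidden2, activation="relu"),
        layers.Dense(1),                       # one real-valued output
        layers.Activation("linear"),
    ])
    model.compile(optimizer="adam", loss="mse")  # mean squared error for regression
    return model

mlp = build_mlp()
mlp.summary()
```

Training would then be a call such as `mlp.fit(X_train, y_train, validation_split=0.1)`, with `X_train` shaped `(samples, window)`.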

Let’s see what happens if we just pass chunks of 20 days of close prices and predict the price on the 21st day. The final MSE = 46.3635263557, but on its own this number is not very informative. Below is a plot of predictions for the first 150 points of the test dataset. The black line is the actual data, the blue one is the prediction. We can clearly see that our algorithm is not even close in value, but it can learn the trend.

[Figure: MLP predictions (blue) vs. actual close prices (black), first 150 test points]

Let’s scale our data using sklearn’s preprocessing.scale() method so that our time series has zero mean and unit variance, and train the same MLP. Now we have MSE = 0.0040424330518 (but it is on scaled data). On the plot below you can see the actual scaled time series (black) and our forecast (blue) for it:

[Figure: actual scaled time series (black) and MLP forecast (blue)]

To use this model in the real world we should return to the unscaled time series. We can do this by multiplying our prediction by the standard deviation of the time series window we used to make the prediction (20 unscaled time steps) and adding its mean value:
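
The restoring step can be sketched as follows. This assumes each window is standardized on its own and that a prediction is mapped back with that same window's mean and standard deviation. The helper names and the sample prices are hypothetical, and plain NumPy is used in place of `preprocessing.scale()` to keep the example dependency-free.

```python
import numpy as np

def scale_window(window):
    """Standardize one window to zero mean and unit variance; keep the stats for later."""
    window = np.asarray(window, dtype=float)
    mean, std = window.mean(), window.std()
    return (window - mean) / std, mean, std

def restore(scaled_value, mean, std):
    """Map a value from scaled units back to price units."""
    return scaled_value * std + mean

window = np.array([2050.0, 2061.5, 2058.2, 2070.1, 2066.4])  # made-up close prices
scaled, mean, std = scale_window(window)

scaled_prediction = 0.8  # pretend this came from the network
print(restore(scaled_prediction, mean, std))
```

Since every error in scaled units is multiplied by the window's standard deviation on the way back, a small MSE on scaled data can become a large MSE in price units.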

MSE in this case equals 937.963649937. Here is the plot of restored predictions (red) and real data (green):

[Figure: restored predictions (red) and real data (green)]

Not bad, is it? But let’s try more sophisticated algorithms for this problem!

I am not going to dive into the theory of convolutional neural networks; you can check out these amazing resources:

cs231n.github.io - Stanford CNNs for Computer Vision course
http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/ - CNNs for text classification, can be useful for understanding how they work on 1D data

Let’s define a 2-layer convolutional neural network (a combination of convolution and max-pooling layers) with one fully-connected layer and the same output as earlier:
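
One way such a network might be written with the Keras API bundled with TensorFlow is sketched here. The number of filters, kernel size, padding and dense layer size are assumptions chosen for illustration, not the settings from the repository.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_cnn(window=20, filters=64, kernel_size=3):
    """Two Conv1D + MaxPooling1D stages, one fully-connected layer, one linear output."""
    model = keras.Sequential([
        keras.Input(shape=(window, 1)),  # (time steps, features): one price per day
        layers.Conv1D(filters, kernel_size, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(filters, kernel_size, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Flatten(),
        layers.Dense(250, activation="relu"),
        layers.Dense(1, activation="linear"),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

cnn = build_cnn()
cnn.summary()
```

Unlike the MLP, this model expects 3D input, so a `(samples, window)` array needs an extra trailing axis, for example `X_train[..., None]`.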

Let’s check out results. MSEs for scaled and restored data are: 0.227074542433; 935.520550172. Plots are below:

[Figures: CNN forecasts on scaled and restored data]

Even judging by MSE on the scaled data, this network learned much worse. Most probably the deeper architecture needs more data for training, or it simply overfitted because of too many filters or layers. We will consider this issue later.

As a recurrent architecture I want to use two stacked LSTM layers (read more about LSTMs here).
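
A minimal sketch of two stacked LSTM layers in the Keras API bundled with TensorFlow is given here. The number of units and the optimizer are assumptions for illustration only.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_lstm(window=20, units=32):
    """Two stacked LSTM layers followed by a single linear output."""
    model = keras.Sequential([
        keras.Input(shape=(window, 1)),             # (time steps, features)
        layers.LSTM(units, return_sequences=True),  # emit the whole sequence for the next LSTM
        layers.LSTM(units),                         # emit only the output at the last time step
        layers.Dense(1, activation="linear"),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

rnn = build_lstm()
rnn.summary()
```

The first layer needs `return_sequences=True` because the second LSTM consumes a sequence rather than a single vector.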

Plots of forecasts are below, MSEs = 0.0246238639582; 939.948636707.

[Figures: LSTM forecasts on scaled and restored data]

The RNN forecast looks more like a moving average model; it can’t learn and predict all the fluctuations.

So, it’s a somewhat unexpected result, but we can see that MLPs work better for this time series forecasting task. Let’s check what happens if we switch from a regression to a classification problem. Now we will use not close prices but daily returns (close price minus open price), and we want to predict whether the close price is higher or lower than the open price based on the last 20 days of returns.
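
Building these targets can be sketched in NumPy as follows. The function name and the sample prices are hypothetical, and counting a flat day (zero return) as "down" is an assumption of this example.

```python
import numpy as np

def direction_labels(open_prices, close_prices):
    """Daily returns plus one-hot direction labels: up -> [1, 0], down -> [0, 1]."""
    returns = np.asarray(close_prices, dtype=float) - np.asarray(open_prices, dtype=float)
    up = returns > 0  # assumption: a flat day counts as "down"
    labels = np.column_stack([up, ~up]).astype(int)
    return returns, labels

opens = np.array([100.0, 101.0, 99.5, 100.2])   # made-up open prices
closes = np.array([101.0, 100.4, 99.9, 100.2])  # made-up close prices
returns, labels = direction_labels(opens, closes)
print(labels)
```

The returns would then be cut into 20-day windows in the same way as the close prices, with the label of the following day as the target.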

The code changes just a bit: we change our last Dense layer to have output [0; 1] or [1; 0] and add a softmax output to get probabilistic predictions.

To load binary outputs, change the following line in the code:

split_into_chunks(timeseries, TRAIN_SIZE, TARGET_TIME, LAG_SIZE, binary=False, scale=True)

to

split_into_chunks(timeseries, TRAIN_SIZE, TARGET_TIME, LAG_SIZE, binary=True, scale=True)

We also change the loss function to binary cross-entropy and add an accuracy metric.

Train on 13513 samples, validate on 1502 samples
Epoch 1/5
13513/13513 [==============================] - 2s - loss: 0.1960 - acc: 0.6461 - val_loss: 0.2042 - val_acc: 0.5992
Epoch 2/5
13513/13513 [==============================] - 2s - loss: 0.1944 - acc: 0.6547 - val_loss: 0.2049 - val_acc: 0.5965
Epoch 3/5
13513/13513 [==============================] - 1s - loss: 0.1924 - acc: 0.6656 - val_loss: 0.2064 - val_acc: 0.6019
Epoch 4/5
13513/13513 [==============================] - 1s - loss: 0.1897 - acc: 0.6738 - val_loss: 0.2051 - val_acc: 0.6039
Epoch 5/5
13513/13513 [==============================] - 1s - loss: 0.1881 - acc: 0.6808 - val_loss: 0.2072 - val_acc: 0.6052
1669/1669 [==============================] - 0s 
Test loss and accuracy: [0.25924376667510113, 0.50209706411917387]

Oh, it’s no better than random guessing (50% accuracy), so let’s try something better. Check out the results below.

Train on 13513 samples, validate on 1502 samples
Epoch 1/5
13513/13513 [==============================] - 3s - loss: 0.2102 - acc: 0.6042 - val_loss: 0.2002 - val_acc: 0.5979
Epoch 2/5
13513/13513 [==============================] - 3s - loss: 0.2006 - acc: 0.6089 - val_loss: 0.2022 - val_acc: 0.5965
Epoch 3/5
13513/13513 [==============================] - 4s - loss: 0.1999 - acc: 0.6186 - val_loss: 0.2006 - val_acc: 0.5979
Epoch 4/5
13513/13513 [==============================] - 3s - loss: 0.1999 - acc: 0.6176 - val_loss: 0.1999 - val_acc: 0.5932
Epoch 5/5
13513/13513 [==============================] - 4s - loss: 0.1998 - acc: 0.6173 - val_loss: 0.2015 - val_acc: 0.5999
1669/1669 [==============================] - 0s 
Test loss and accuracy: [0.24841217570779137, 0.54463750750737105]

Train on 13513 samples, validate on 1502 samples
Epoch 1/5
13513/13513 [==============================] - 18s - loss: 0.2130 - acc: 0.5988 - val_loss: 0.2021 - val_acc: 0.5992
Epoch 2/5
13513/13513 [==============================] - 18s - loss: 0.2004 - acc: 0.6142 - val_loss: 0.2010 - val_acc: 0.5959
Epoch 3/5
13513/13513 [==============================] - 21s - loss: 0.1998 - acc: 0.6183 - val_loss: 0.2013 - val_acc: 0.5959
Epoch 4/5
13513/13513 [==============================] - 17s - loss: 0.1995 - acc: 0.6221 - val_loss: 0.2012 - val_acc: 0.5965
Epoch 5/5
13513/13513 [==============================] - 18s - loss: 0.1996 - acc: 0.6160 - val_loss: 0.2017 - val_acc: 0.5965
1669/1669 [==============================] - 0s 
Test loss and accuracy: [0.24823409688551315, 0.54523666868172693]

We can see that treating financial time series prediction as a regression problem is the better approach: the model can learn the trend and predict prices close to the actual ones.

What surprised me is that MLPs handle sequence data better than CNNs or RNNs, which are supposed to work better with time series. I attribute this to the pretty small dataset (~16k time stamps) and the naive choice of hyperparameters.

You can reproduce these results and improve on them using the code from the repository.

I think we can get better results in both regression and classification by using different features (not only the scaled time series), like technical indicators or trading volume. We can also try more frequent data, say minute-by-minute ticks, to have more training data. I’m going to do all of this later, so stay tuned :)

P.S.
Also follow me on Facebook for AI articles that are too short for Medium, on Instagram for personal stuff, and on Linkedin!
