Every machine learning process that uses neural networks consists of at least two parts. The first part covers loading the data and preparing it for training. This stage is known as ETL (extract, transform, load). The second part is the actual training of the network. The overall process can be divided into the following parts and steps.

In the following paragraphs we take a closer look at each of these steps and apply them to our problem: predicting the close price of a cryptocurrency one day ahead.

An appropriate set of input variables must be chosen for the network to predict the closing price of the cryptocurrency one day ahead accurately. Forecasting market prices, and cryptocurrency trends in particular, is not a trivial problem. A neural network won’t be able to predict future prices if it is given only the previous trend of the cryptocurrency, because no cyclicality can be observed in cryptocurrency trends. Without additional parameters that hint at similar upcoming events, the network cannot predict jumps or dips that it has not seen before. The graph below shows the Bitcoin trend since the cryptocurrency was created. The X axis shows time, and the Y axis shows Bitcoin’s price in thousands of dollars.

Three numerical parameters are chosen as additional inputs to help train the network. Each of them is a sentiment score based on news classification and expresses the expected development of the cryptocurrency. The scores are on a scale of 1 to 10: the higher the score, the better the expectations for the cryptocurrency. The news reflects global events such as wars, financial crises and disasters, which have a strong impact on crypto trading and therefore on crypto prices. These scores provide additional information without which the neural network cannot make predictions that accurately reflect reality. The network training parameters are as follows:

- close_price
- news_sentiment
- twitter_sentiment
- reddit_sentiment

The input data is stored in a four-column text file, where each column contains the values of the corresponding parameter. The text file has the following structure.

close_price | news_sentiment | twitter_sentiment | reddit_sentiment |
---|---|---|---|
7725.43; | 7.167166666666668; | 5.851458333333333; | 5.006708333333333; |
7603.99; | 6.361833333333332; | 4.2411666666666665; | 4.29825; |
7533.92; | 6.479833333333335; | 5.355999999999999; | 3.1264583333333333; |
7414.08; | 6.669958333333335; | 4.936625; | 3.6385000000000014; |
7009.99; | 6.775500000000003; | 3.5362916666666657; | 4.339458333333334; |
… | … | … | … |

Each row of the text file holds the values for one day. An unbroken sequence of days is important for solving the regression problem: any gap breaks the continuity of the series and leads to inaccuracies in both training and forecasting. The semicolon is used as the separator in the text file. The extra spacing in the example data is only for clarity.

Data is almost always loaded with the help of libraries that simplify the process. Most tools also offer options for visualizing the input data, which helps in understanding it better. Missing or incorrect input values can be discovered while reviewing the data.
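As a rough illustration of the loading step, the sketch below reads semicolon-separated rows with Python's standard `csv` module and rejects rows with missing or non-numeric values. The function name `load_rows`, the absence of a header row and the tolerance for a trailing semicolon are assumptions made for this example, not details taken from the original project.

```python
import csv
import io

COLUMNS = ["close_price", "news_sentiment", "twitter_sentiment", "reddit_sentiment"]


def load_rows(stream):
    """Parse semicolon-separated lines into lists of four floats."""
    rows = []
    for line_number, fields in enumerate(csv.reader(stream, delimiter=";"), start=1):
        fields = [field.strip() for field in fields]
        # Assumption: a trailing semicolon may leave one empty field at the end.
        if fields and fields[-1] == "":
            fields = fields[:-1]
        if not fields:
            continue  # skip blank lines
        # A wrong column count or an empty field means a value is missing.
        if len(fields) != len(COLUMNS) or "" in fields:
            raise ValueError(f"line {line_number}: expected {len(COLUMNS)} values, got {fields}")
        # float() raises ValueError for non-numeric text.
        rows.append([float(field) for field in fields])
    return rows


# Sample values copied from the example table; a real run would pass an open file.
sample = io.StringIO(
    "7725.43; 7.167166666666668; 5.851458333333333; 5.006708333333333\n"
    "7603.99; 6.361833333333332; 4.2411666666666665; 4.29825\n"
    "7533.92; 6.479833333333335; 5.355999999999999; 3.1264583333333333\n"
)
rows = load_rows(sample)
print(len(rows), rows[0])
```

Failing loudly on an incomplete row is one possible design choice here; since gaps in the daily sequence harm training, it seems safer than silently skipping the row.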

To train the neural network, the data needs to be transformed appropriately. The problem falls into the supervised learning class, where each example is described by a set of parameters together with its output value. For each example, the neural network compares its prediction with the true output value and minimizes its error function with a technique such as gradient descent, adjusting the coefficients of its weight matrices.
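To make the idea of error minimization concrete, here is a minimal sketch of gradient descent on a model with a single weight and bias, using the mean squared error as the error function. It is a toy stand-in for a real network; the data, the learning rate and the function names are illustrative assumptions.

```python
def mean_squared_error(weight, bias, xs, ys):
    """Average squared difference between predictions and true outputs."""
    return sum((weight * x + bias - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent_step(weight, bias, xs, ys, learning_rate):
    """Move weight and bias one step against the gradient of the error."""
    n = len(xs)
    grad_w = sum(2.0 * (weight * x + bias - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2.0 * (weight * x + bias - y) for x, y in zip(xs, ys)) / n
    return weight - learning_rate * grad_w, bias - learning_rate * grad_b


# Toy data generated from y = 2x; the model should recover a weight close to 2.
xs = [-1.0, 0.0, 1.0]
ys = [-2.0, 0.0, 2.0]

weight, bias = 0.0, 0.0
initial_error = mean_squared_error(weight, bias, xs, ys)
for _ in range(200):
    weight, bias = gradient_descent_step(weight, bias, xs, ys, learning_rate=0.1)
final_error = mean_squared_error(weight, bias, xs, ys)

print(weight, bias, initial_error, final_error)
```

A real network repeats the same kind of update for every coefficient in every layer, with the gradients computed by backpropagation.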

The loaded data contains the entire sequence of days for which the close price and the news-based scores are available, but it is not yet in a form suitable for machine learning. Each row must be paired with a value that states what the correct output of the prediction should be. In this case the correct output is the closing price for the next day. After the transformation, the data has the form shown below. For instance, the example representing the first day has the values 7725.43, 7.16717, 5.85146 and 5.00671, and its correct output is 7603.99, the close price for the next day.

close_price | news_sentiment | twitter_sentiment | reddit_sentiment |
---|---|---|---|
7725.43; | 7.16717; | 5.85146; | 5.00671; |
7603.99; | 6.36183; | 4.24117; | 4.29825; |
7533.92; | 6.47983; | 5.356; | 3.12646; |
7414.08; | 6.66996; | 4.93663; | 3.6385; |
7009.99; | 6.7755; | 3.53629; | 4.33946; |
… | … | … | … |

Input features for the machine learning algorithm

close_price |
---|
7603.99; |
7533.92; |
7414.08; |
7009.99; |
… |

Output features for the machine learning algorithm
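The pairing of each day with the next day's close price can be sketched as a simple shift by one row. The function name `make_supervised` is hypothetical, and the sketch assumes that the close price is the first value in every row.

```python
def make_supervised(rows):
    """Pair each day's four input values with the next day's close price."""
    if len(rows) < 2:
        raise ValueError("at least two days are needed to build one example")
    features = rows[:-1]  # the last day has no next-day price to learn from
    targets = [next_day[0] for next_day in rows[1:]]  # close price is column 0
    return features, targets


# Values taken from the input features table.
rows = [
    [7725.43, 7.16717, 5.85146, 5.00671],
    [7603.99, 6.36183, 4.24117, 4.29825],
    [7533.92, 6.47983, 5.356, 3.12646],
    [7414.08, 6.66996, 4.93663, 3.6385],
]
features, targets = make_supervised(rows)
print(features[0], targets[0])
```

Note that the shift leaves one example fewer than there are days, because the final day has no known next-day price.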

Machine learning data must be divided into two sets. The first is used in the training process, during which the neural network adjusts the weight coefficients of its layers. The second contains test data that has not taken part in training. Test data is used to evaluate the accuracy of the predictions and shows how the network responds to new, unseen data. The ratio between the training and test sets varies. In the algorithm for predicting cryptocurrency prices it is 8:2, so 80% of the data is used for training the network and the remaining 20% for evaluating the accuracy of the predictions.
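An 8:2 split might look like the sketch below. It assumes the split is chronological, with the oldest 80% of the examples used for training and the newest 20% for testing; the article does not say whether the data was shuffled, so treat this as one reasonable option for time-ordered data. The function name and the placeholder values are made up for the example.

```python
def split_in_order(features, targets, train_ratio=0.8):
    """Split examples into training and test sets without shuffling."""
    if len(features) != len(targets):
        raise ValueError("features and targets must have the same length")
    split_index = int(len(features) * train_ratio)
    train = (features[:split_index], targets[:split_index])
    test = (features[split_index:], targets[split_index:])
    return train, test


# Ten synthetic placeholder examples, only to show the resulting sizes.
features = [[float(day)] for day in range(10)]
targets = [float(day + 1) for day in range(10)]

(train_x, train_y), (test_x, test_y) = split_in_order(features, targets)
print(len(train_x), len(test_x))
```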

Data normalization is a technique often used when preparing data for machine learning. Its purpose is to convert the values of the input variables so that they all belong to the same numerical range. If the variables belong to different ranges, those with larger values will have a greater impact on the output. In our case, the news-based scores lie in the range [1; 10], while the market closing price varies within the range [3000; 10000], so without normalization the close price would dominate the other variables. The four input variables are equally important for the problem we are solving, which is why data normalization is needed. Another reason is that the neural network is trained with the gradient descent optimization algorithm and its activation functions have an active range between -1 and 1. After normalization, all features have values in the range [-1; 1].
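The article does not state which normalization method was used. A common choice that produces the range [-1; 1] is min-max scaling, x' = 2 (x - min) / (max - min) - 1, applied to each column separately. The sketch below assumes that method; the function names are hypothetical, and the sample rows are made-up values chosen to sit at the ends and the middle of the ranges mentioned above.

```python
def fit_min_max(rows):
    """Return the per-column minimum and maximum of the given rows."""
    columns = list(zip(*rows))
    return [min(column) for column in columns], [max(column) for column in columns]


def scale_to_unit_range(rows, mins, maxs):
    """Map every column linearly so that min -> -1 and max -> 1."""
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column has no spread, so it is mapped to 0.
            0.0 if high == low else 2.0 * (value - low) / (high - low) - 1.0
            for value, low, high in zip(row, mins, maxs)
        ])
    return scaled


# Made-up rows: [close_price, sentiment score].
rows = [[3000.0, 1.0], [10000.0, 10.0], [6500.0, 5.5]]
mins, maxs = fit_min_max(rows)
scaled = scale_to_unit_range(rows, mins, maxs)
print(scaled)
```

In practice the minimum and maximum are usually computed on the training set only and then reused for the test set, so that no information from the test data leaks into training. Test values outside the training range will then fall slightly outside [-1; 1].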

In the next article we will go through the process of building a neural network model for predicting crypto prices, tuning the network’s hyperparameters and evaluating its prediction accuracy.
