Machine Learning Model Approaches for Price Prediction in Coffee Market using Linear Regression, XGB, and LSTM Techniques

Authors

  • Tesyon Korjo Hwase  Department of Software Engineering, Nankai University, MTT Consulting Architects and Engineers Plc., Addis Ababa, Ethiopia
  • Abdul Joseph Fofanah  Department of Mathematics and Computer Science, Milton Margai Technical University, Freetown, Western Area Rural, Sierra Leone

DOI:

https://doi.org/10.32628/IJSRST218583

Keywords:

Linear Regression (LR), Extreme Gradient Boosting (XGB), Long Short-Term Memory (LSTM), MySQL database, ECPP, ECX

Abstract

Investors and other business people want to know future market prices, because knowing the future price of a commodity or stock helps them make sound business decisions and profit from their investments. Much previous research has been done on stock market prediction, but no related research has been done on the Ethiopia Commodity Exchange (ECX). The key objectives of this research are to predict future prices with better accuracy and to carry out a comparative analysis of the algorithms for two ECX items, coffee and sesame. Three prediction algorithms were used to predict the future price: Linear Regression (LR), Extreme Gradient Boosting (XGB), and Long Short-Term Memory (LSTM). Because research on price prediction for ECX items specifically is limited, we drew on price prediction work for stock markets such as the New York Stock Exchange and for other commodity market items to develop our research on ECX. Apart from predicting the future price, the study compares the prediction algorithms on the basis of their performance. Two datasets from ECX were used: coffee and sesame. These datasets were chosen because the two commodities are the largest export items in Ethiopia, which makes them very important to the Ethiopian economy, and because datasets of different sizes let us evaluate the algorithms on different amounts of data: the sesame dataset has 7205 instances and the coffee dataset has 1540 instances, and both have 11 attributes. We built an Android application to see whether the prediction algorithms can be implemented on mobile platforms and made easy and accessible to users. We call this mobile application Ethiopia Coffee Prices Predictor (ECPP). The application displays short-term price predictions for Ethiopian coffee and is designed to be user friendly. The prediction algorithms were implemented in Python, the Android application was written in Java, the API was implemented in PHP, and a MySQL database stores the information online so that it is accessible everywhere.
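
The comparison described in the abstract (fit a model on past prices, then score it on later prices) can be illustrated with a minimal sketch. This is not the authors' code: it covers only a one-lag linear regression and a naive "last price" baseline, it uses a synthetic random-walk series rather than the ECX data, and the function names (fit_simple_lr, evaluate) and the 80/20 chronological split are assumptions made for illustration. XGB and LSTM models would be scored the same way, by RMSE on the held-out tail of the series.

```python
import math
import random


def fit_simple_lr(x, y):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def evaluate(prices, train_frac=0.8):
    """Predict the next price from the current one; compare LR with a naive baseline.

    The split is chronological (no shuffling), so the model is scored only on
    prices that come after everything it was trained on.
    """
    x = prices[:-1]   # price at step t
    y = prices[1:]    # price at step t + 1
    split = int(len(x) * train_frac)
    slope, intercept = fit_simple_lr(x[:split], y[:split])
    lr_pred = [intercept + slope * xi for xi in x[split:]]
    naive_pred = x[split:]  # baseline: tomorrow's price equals today's
    return {
        "linear_regression": rmse(y[split:], lr_pred),
        "naive_last_price": rmse(y[split:], naive_pred),
    }


# Synthetic random walk standing in for a price series (NOT the ECX data).
# The length 1540 mirrors the coffee dataset size quoted in the abstract.
random.seed(0)
synthetic = [100.0]
for _ in range(1539):
    synthetic.append(synthetic[-1] + random.gauss(0, 1))

scores = evaluate(synthetic)
for name, value in scores.items():
    print(f"{name}: RMSE = {value:.3f}")
```

A naive baseline is included because, on price series that behave like a random walk, a model is only useful if it beats "tomorrow equals today"; reporting both numbers side by side makes that comparison explicit.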

Published

2021-11-30

Section

Research Articles

How to Cite

[1]
Tesyon Korjo Hwase, Abdul Joseph Fofanah, "Machine Learning Model Approaches for Price Prediction in Coffee Market using Linear Regression, XGB, and LSTM Techniques," International Journal of Scientific Research in Science and Technology (IJSRST), Online ISSN: 2395-602X, Print ISSN: 2395-6011, Volume 8, Issue 6, pp. 10-48, November-December 2021. Available at doi: https://doi.org/10.32628/IJSRST218583