Use of Data Science during Worldwide Pandemic : An Efficient and Reliable System to Support Decision Making in Different Sectors

Authors

  • Tithi Choudhary  Information Technology, B.E, AISSMS’s Institute of Information Technology, Pune, Maharashtra, India
  • Muskan Shaikh  Information Technology, B.E, AISSMS’s Institute of Information Technology, Pune, Maharashtra, India
  • Shreya Khedekar  Information Technology, B.E, AISSMS’s Institute of Information Technology, Pune, Maharashtra, India
  • Richa Sirwani  Information Technology, B.E, AISSMS’s Institute of Information Technology, Pune, Maharashtra, India
  • Dr. Meenakshi A. Thalor  Information Technology, HOD, AISSMS’s Institute of Information Technology, Pune, Maharashtra, India

DOI:

https://doi.org/10.32628/IJSRST21825

Keywords:

Data Mining, Data Analytics, Data Visualization and Decision Making, Automated Machine Learning, Prediction, UI, UX.

Abstract

Calamities and emergencies have had drastic and lasting impacts on humankind throughout history, and they need to be managed efficiently and effectively. In some countries, evolving technology is still not part of the management system. In this paper, we combine several technologies and tools to offer a more reliable alternative to intuition-based decisions. The paper demonstrates key applications of data mining, data analytics, data visualization, and decision making. Based on the user's selections, the system examines the number of deaths and recoveries recorded each day since the start of the spread of COVID-19. From this information, it formulates a death and recovery analysis, which in turn helps in understanding the effect of the coronavirus on different work sectors, in our case the business sector of the country. The system also forecasts GDP from the country's past datasets. Different visualization methods are shown according to an automated machine learning system that generates statistical depictions from the inflow of dynamic data. These infographic visuals give users a better understanding of how the situation will affect their own sector or domain and of the current state of GDP, and they enhance the overall user experience (UX).
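As a rough illustration of the two computations the abstract names (a death and recovery analysis from daily counts, and a GDP forecast from past values), the following is a minimal Python sketch. It is not the authors' implementation: the function names, the sample numbers, the choice of measuring rates over closed cases, and the use of a simple least-squares trend line for the forecast are all assumptions made for illustration only.

```python
from statistics import mean


def daily_increments(cumulative):
    """Convert a cumulative series into per-day counts."""
    return [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]


def outcome_rates(deaths, recoveries):
    """Share of closed cases (deaths + recoveries) ending in each outcome.

    Assumption: rates are measured over closed cases; other definitions
    (for example, over all confirmed cases) are equally possible.
    """
    closed = deaths + recoveries
    if closed == 0:
        return 0.0, 0.0
    return deaths / closed, recoveries / closed


def linear_forecast(years, values, target_year):
    """Fit an ordinary least-squares line and evaluate it at target_year.

    Assumption: a straight-line trend stands in for whatever forecasting
    model the system actually uses.
    """
    x_bar, y_bar = mean(years), mean(values)
    sxx = sum((x - x_bar) ** 2 for x in years)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, values))
    slope = sxy / sxx
    return y_bar + slope * (target_year - x_bar)


# Hypothetical cumulative counts for five consecutive days (not real data).
cum_deaths = [2, 5, 9, 14, 20]
cum_recovered = [10, 25, 45, 70, 100]

print("daily deaths:", daily_increments(cum_deaths))
death_rate, recovery_rate = outcome_rates(cum_deaths[-1], cum_recovered[-1])
print(f"death rate: {death_rate:.3f}, recovery rate: {recovery_rate:.3f}")

# Hypothetical GDP figures in arbitrary units (not real data).
gdp_years = [2016, 2017, 2018, 2019]
gdp_values = [100.0, 110.0, 120.0, 130.0]
print("forecast:", linear_forecast(gdp_years, gdp_values, 2020))
```

The sketch uses only the standard library so that it runs without extra dependencies. A real system of the kind described would presumably read the daily counts and GDP history from external datasets and use a dedicated forecasting model.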

Section

Research Articles

How to Cite

[1]
Tithi Choudhary, Muskan Shaikh, Shreya Khedekar, Richa Sirwani, Dr. Meenakshi A. Thalor, "Use of Data Science during Worldwide Pandemic : An Efficient and Reliable System to Support Decision Making in Different Sectors", International Journal of Scientific Research in Science and Technology (IJSRST), Online ISSN : 2395-602X, Print ISSN : 2395-6011, Volume 8, Issue 2, pp.17-27, March-April-2021. Available at doi : https://doi.org/10.32628/IJSRST21825