Autonomous Tagging of Stack Overflow Questions

B. Ravinder Reddy; K. Bhadri Prabhav; P. Hemanth Sagar; P. Rahul

doi:10.32628/IJSRST52310240

Authors

B. Ravinder Reddy, Assistant Professor, Department of CSE, Anurag University, Hyderabad, T.S., India
K. Bhadri Prabhav, Department of CSE, Anurag University, Hyderabad, T.S., India
P. Hemanth Sagar, Department of CSE, Anurag University, Hyderabad, T.S., India
P. Rahul, Department of CSE, Anurag University, Hyderabad, T.S., India

DOI:

https://doi.org/10.32628/IJSRST52310240

Keywords:

Autonomous tagging, Stack Overflow

Abstract

Question-and-answer websites such as Stack Exchange and Quora are an increasingly popular educational resource online. Many of these forums depend on tagging, in which a user labels a post with a suitable set of topics that describe it, making the post easier to find and organize. We present a multi-label classification system that automatically tags users' questions to improve the user experience. A linear SVM and a carefully selected subset of the explored feature set are used to build a one-versus-rest classifier for a Stack Overflow dataset. On a subsample of the original data restricted to 100 tags, each with at least 500 instances, our classification system achieves a best F1 score of 62.35 percent.
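
The data restriction and the evaluation metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `restrict_tags` and `micro_f1`, the input layout (a list of `(text, tags)` pairs), and the choice of micro-averaging for F1 are all assumptions, since the abstract does not say how the subsample was built or how F1 was averaged.

```python
from collections import Counter


def restrict_tags(questions, max_tags=100, min_count=500):
    """Keep only frequent tags (hypothetical helper, not from the paper).

    `questions` is a list of (text, tags) pairs. A tag is kept if it is
    among the `max_tags` most frequent tags and occurs in at least
    `min_count` questions. Questions left with no kept tag are dropped.
    """
    # Count each tag once per question.
    counts = Counter(tag for _, tags in questions for tag in set(tags))
    kept = {tag for tag, n in counts.most_common(max_tags) if n >= min_count}

    filtered = []
    for text, tags in questions:
        remaining = sorted(set(tags) & kept)
        if remaining:
            filtered.append((text, remaining))
    return filtered, kept


def micro_f1(true_tags, predicted_tags):
    """Micro-averaged F1 over parallel lists of tag collections.

    Assumption: micro-averaging, i.e. true positives, false positives and
    false negatives are pooled over all questions before computing F1.
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_tags, predicted_tags):
        truth, pred = set(truth), set(pred)
        tp += len(truth & pred)   # tags predicted and correct
        fp += len(pred - truth)   # tags predicted but wrong
        fn += len(truth - pred)   # correct tags that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With the defaults, `restrict_tags(questions)` applies the limits stated in the abstract (100 tags, at least 500 instances each). The predictions passed to `micro_f1` would come from the one-versus-rest classifier, which trains one binary linear SVM per kept tag.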

