Autonomous Tagging of Stack Overflow Questions
Abstract
With the massive volume of text now available online, text categorization has become a very useful technique for managing text data. It can be used to classify documents, extract useful information, and identify connections in text data. In this project, we use question-and-answer data from Stack Overflow to predict tags from the question text. Once the tags for a question are identified from its content, they can be recommended to the user. Tags can also be added to existing untagged questions, so that questions are categorized better. Currently, Stack Overflow suggests tags only after the user starts typing the first letter(s) of a tag. Apart from predicting tags, we also examine correlations among the tags that appear on questions. Finally, we examine which tags' questions are answered most often and which tags are likely to receive more upvotes and downvotes.