*WINNER* Sentiment Analysis Using Google's Word2Vec Machine Learning Method
Abstract
Natural language processing is an important research area of artificial intelligence. By competing in the “Bag of Words Meets Bags of Popcorn” Kaggle challenge, we plan to train a machine learning model that analyzes movie reviews and uses the meaning of words and the semantic relationships among them to determine the sentiment behind each review. Our model should be able to distinguish between positive and negative reviews using Google’s Word2Vec, a deep-learning-inspired method that focuses on the meaning of words. We will train the model on two data sets (one containing reviews labeled with a positive or negative sentiment, the other containing reviews without sentiment labels) and then test it to determine its accuracy. After analyzing the model, we will produce graphical representations of our analysis and collected data. For analysis and goal evaluation, we will compute a classification accuracy score and plot an ROC curve. For the model to be considered successful, the classification accuracy score should be above 50%, and on visual inspection the ROC curve should lie above the line of no-discrimination. We will also write a report of outcomes in which we evaluate the model’s performance and accuracy and suggest changes that could improve its performance in the future.
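The two success criteria above can be illustrated with a minimal sketch. It assumes the trained model outputs a score per review, where a higher score means "more likely positive"; the labels and scores below are made-up illustrative values, not results from the project, and the helper names (`accuracy`, `roc_points`, `auc`) are hypothetical. The area under the ROC curve is used here as a numeric stand-in for the visual check that the curve lies above the line of no-discrimination.

```python
def accuracy(y_true, y_pred):
    """Fraction of reviews whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) points,
    obtained by sweeping the decision threshold from high to low."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every review that shares this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up example: 1 = positive review, 0 = negative review.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]       # hypothetical model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

acc = accuracy(y_true, y_pred)
area = auc(roc_points(y_true, scores))

# Success as defined in the abstract: accuracy above 50% and an ROC curve
# above the diagonal, approximated here by an area greater than 0.5.
is_successful = acc > 0.5 and area > 0.5
print(f"accuracy={acc:.3f}  auc={area:.3f}  successful={is_successful}")
```

The diagonal line of no-discrimination has an area of exactly 0.5, so an area above 0.5 indicates that the curve lies above the diagonal on balance; the plotted curve should still be inspected, since a curve can dip below the diagonal in places while keeping an area above 0.5.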