Sentiment Analysis using NLP Libraries

Jan 18, 2021

Problem Statement:

The main objective of this internship project is to predict the sentiment of movie reviews from the Internet Movie Database (IMDb). The dataset contains 50,000 movie reviews that have been pre-labeled with “positive” and “negative” sentiment class labels based on the review content, plus additional unlabeled reviews. It can be obtained from http://ai.stanford.edu/~amaas/data/sentiment/, courtesy of Stanford University and Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. The data is available both as raw text and in an already processed bag-of-words format. We will use only the raw labeled movie reviews for our analysis. Our task is therefore to train supervised models on 35,000 of the labeled reviews and predict the sentiment of the remaining 15,000.
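As a rough illustration of the 35,000 / 15,000 split described above, here is a minimal sketch in plain Python. The helper name `split_reviews`, the placeholder data, and the choice to take the first 35,000 rows for training (with an optional shuffle) are assumptions made for this example; the problem statement does not say how the rows are assigned to each set.

```python
import random

TRAIN_SIZE = 35_000  # reviews used for training (from the problem statement)
TEST_SIZE = 15_000   # reviews held out for prediction (from the problem statement)


def split_reviews(reviews, labels, train_size=TRAIN_SIZE, test_size=TEST_SIZE, seed=None):
    """Split parallel lists of reviews and labels into train and test sets.

    Hypothetical helper: if `seed` is None the original order is kept,
    otherwise the rows are shuffled reproducibly before splitting.
    """
    if len(reviews) != len(labels):
        raise ValueError("reviews and labels must have the same length")
    if train_size + test_size > len(reviews):
        raise ValueError("not enough reviews for the requested split")

    indices = list(range(len(reviews)))
    if seed is not None:
        random.Random(seed).shuffle(indices)

    train_idx = indices[:train_size]
    test_idx = indices[train_size:train_size + test_size]

    train = ([reviews[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([reviews[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test


# Placeholder data standing in for the 50,000 labeled IMDb reviews.
reviews = [f"placeholder review {i}" for i in range(50_000)]
labels = ["positive" if i % 2 == 0 else "negative" for i in range(50_000)]

(train_x, train_y), (test_x, test_y) = split_reviews(reviews, labels)
print(len(train_x), len(test_x))
```

Passing a `seed` shuffles the rows before splitting, which may be preferable if the source file happens to be sorted by label.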

Dataset: Kaggle Dataset