Sentiment Analysis using NLP Libraries

Bhargav Joshi
2 min readJan 18, 2021

--

Problem Statement :

The main objective in this Internship Project is to predict the sentiment for a number of movie reviews obtained from the Internet Movie Database (IMDb). This dataset contains 50,000 movie reviews that have been pre-labeled with “positive” and “negative” sentiment class labels based on the review content. Besides this, there are additional movie reviews that are unlabeled. The dataset can be obtained from http://ai.stanford.edu/~amaas/data/sentiment/, courtesy of Stanford University and Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. They have datasets in the form of raw text as well as an already processed bag of word formats. We will only be using the raw labeled movie reviews for our analyses. Hence our task will be to predict the sentiment of 15,000 labeled movie reviews and use the remaining 35,000 reviews for training our supervised models

Dataset — Kaggle Dataset

Model Evaluation — Supervised

Model performance Unsupervised

Text Normalization

After Normalizing the data & Checking the normalize_corpus[document] we saw that we have good accuracy in normalization also.

Sign up to discover human stories that deepen your understanding of the world.

Free

Distraction-free reading. No ads.

Organize your knowledge with lists and highlights.

Tell your story. Find your audience.

Membership

Read member-only stories

Support writers you read most

Earn money for your writing

Listen to audio narrations

Read offline with the Medium app

--

--

No responses yet

Write a response