Table of Contents

Project


This project explores a dataset of the 1,000 most popular movies in the IMDB database from 2006 to 2016. The project is divided into 6 main components:

  1. Problem Statement
  2. Data Wrangling
  3. Questions and EDA
  4. Conclusion
  5. Actionable Insights
  6. Communicate

Components 1 through 5 are captured in a Jupyter Notebook. Component 6 is delivered as a presentation with a voice-over (you will have to download these from the links).

Components


1. Problem Statement

After collecting some initial questions, I came up with a hypothetical problem: in 2017, a production company, ABC, decides to produce movies that will perform best in terms of revenue, popularity, and acclaim. The company approaches an agency, XYZ, and asks it to identify the characteristics of movies that will help achieve this goal.

2. Data Wrangling

I gathered the data, examined it, and cleaned it to make it ready for EDA.
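As a rough illustration of what a cleaning step like this can look like, here is a minimal pandas sketch. It is not the code from the notebook: the function name `clean_movies`, the column names, and the sample rows are all hypothetical, and the actual cleaning rules used in the project may differ.

```python
import pandas as pd


def clean_movies(df: pd.DataFrame) -> pd.DataFrame:
    """Return a tidied copy: normalized column names, no duplicate rows,
    and no rows with missing numeric values. (Hypothetical helper.)"""
    out = df.copy()
    # Normalize headers such as "Revenue (Millions)" to "revenue_millions".
    out.columns = (
        out.columns.str.strip()
        .str.lower()
        .str.replace(r"[^a-z0-9]+", "_", regex=True)
        .str.strip("_")
    )
    # Drop exact duplicate rows.
    out = out.drop_duplicates()
    # Drop rows that are missing a value in any numeric column.
    numeric_cols = out.select_dtypes("number").columns
    out = out.dropna(subset=numeric_cols)
    return out.reset_index(drop=True)


# Tiny made-up sample, only to show the function running.
sample = pd.DataFrame(
    {
        "Title": ["A", "A", "B", "C"],
        "Year": [2006, 2006, 2010, 2016],
        "Revenue (Millions)": [10.0, 10.0, None, 55.5],
    }
)
cleaned = clean_movies(sample)
print(cleaned)
```

Dropping rows with missing values is only one option; filling them (for example with a median) may be preferable when many rows are affected.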

3. Questions and EDA

Then I added more questions that align with the Problem Statement, used them to explore the data through descriptive statistics and visualization, and noted down my findings.
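To give a sense of the kind of descriptive statistics involved, the sketch below summarizes revenue per genre with pandas. The column names `genre` and `revenue_millions` and the numbers are invented for illustration and are not taken from the project's dataset or notebook.

```python
import pandas as pd

# Made-up rows; the real dataset and its column names may differ.
movies = pd.DataFrame(
    {
        "genre": ["Action", "Action", "Drama", "Drama", "Drama"],
        "revenue_millions": [100.0, 300.0, 20.0, 40.0, 60.0],
    }
)

# Count, mean, and median revenue per genre, highest mean first.
summary = (
    movies.groupby("genre")["revenue_millions"]
    .agg(["count", "mean", "median"])
    .sort_values("mean", ascending=False)
)
print(summary)
```

A table like this can be a starting point for a question such as "which genres tend to earn the most?", and the same grouped data could then be plotted for the visualization step.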

4. Conclusion

I drew conclusions from my data exploration in this section.

5. Actionable Insights

In this section, I derived actionable insights from the exploration and conclusions to address the Problem Statement.

6. Communicate

Finally, I communicated my results through a presentation with a voice-over.

Project Repository:

Everything is hard before it is easy.