Predictive analysis of YouTube trending videos using machine learning

Authors

Niture, Aakash Ashok

Issue Date

2021

Degree

MSc in Data Analytics

Publisher

Dublin Business School

Rights

Items in eSource are protected by copyright. Previously published items are made available in accordance with the copyright policy of the publisher/copyright holder.

Abstract

YouTube is a world-famous interactive video-sharing platform that allows its users to rate, share, save, comment on, and upload content. Unlike popular videos, which have already accumulated a large number of likes and views by the time they are labelled popular, YouTube trending videos represent content that is gaining viewership over a certain time period and has the potential to become popular. Despite their importance, YouTube trending videos have not yet been well researched. This research analyses interactive features to determine the correlation and importance of variables for the trendiness of a video, focusing on how interactive video features help a video trend on YouTube. The research is based on the viewership statistics of more than 40000 YouTube trending videos over a certain time period. Since trending video statistics consist of the numbers of views, likes, dislikes and comments, the research applies a linear regression model to predict the number of views for YouTube trending videos. In addition, the study performs a comparative analysis of several classification models, namely Random Forest, SVM, Decision Tree, Logistic Regression and Gaussian Naïve Bayes, to determine which model is best suited to predicting the number of days a video takes to start trending after upload and the number of days it remains on the trending list. The research achieved a maximum accuracy of 62.53% for predicting the lifecycle of YouTube trending videos. Cross-validation was used for statistical significance testing, and a performance evaluation matrix was used to compare the classifiers and determine the most useful ones. Furthermore, this research follows the CRISP-DM methodology with a correlational quantitative research method. The study aims to bring objectivity to the popularity constraints of YouTube trending videos.
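As a rough illustration of the workflow the abstract describes, the sketch below fits a linear regression for view counts and compares the five named classifiers with 5-fold cross-validation. It is not the author's code. It assumes scikit-learn and NumPy are available, and it uses synthetic data in place of the thesis dataset: the feature names (`likes`, `dislikes`, `comments`), the coefficients used to generate `views`, and the three-class `days_bucket` target are all hypothetical stand-ins, so the scores it produces say nothing about the reported 62.53% accuracy.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the trending-video statistics (assumption:
# the real dataset is not reproduced here).
rng = np.random.default_rng(0)
n = 300
likes = rng.integers(0, 50_000, n)
dislikes = rng.integers(0, 5_000, n)
comments = rng.integers(0, 10_000, n)
X = np.column_stack([likes, dislikes, comments]).astype(float)

# Regression task: predict views from the interaction counts.
# The generating coefficients are arbitrary illustrative values.
views = 20 * likes + 5 * dislikes + 10 * comments + rng.normal(0, 10_000, n)
reg = LinearRegression().fit(X, views)
r2 = reg.score(X, views)  # in-sample R^2 on the synthetic data

# Classification task: a hypothetical three-class label standing in for
# "number of days on the trending list", bucketed into short/medium/long.
days_bucket = np.digitize(likes + rng.normal(0, 10_000, n), [15_000, 35_000])

classifiers = {
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Gaussian Naive Bayes": GaussianNB(),
}

# Mean accuracy over 5 cross-validation folds for each classifier.
scores = {
    name: cross_val_score(clf, X, days_bucket, cv=5).mean()
    for name, clf in classifiers.items()
}

print(f"Linear regression R^2: {r2:.3f}")
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

The SVM and logistic regression models are wrapped in a scaling pipeline because both are sensitive to feature magnitudes; whether the thesis applied scaling is not stated in the abstract, so that choice is an assumption of this sketch.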