Exploring the relationship between Irish Times headlines and Irish Times stock exchange

Jethwani, Bharat
MSc in Data Analytics
Dublin Business School
This research studies the effect of headline risk on the Irish stock market using Irish Times headlines. To do so, it scrapes Irish Times headlines for "Bank of Ireland," one of the oldest stocks on the Irish Stock Exchange (ISE), and extracts the corresponding stock prices from Yahoo Finance. Beyond studying this relationship, it aims to classify headlines into three polarity groups: Positive, Negative, and Neutral. It follows the CRISP-DM methodology, which divides the project into six stages. To establish a relationship between the ISE and Irish Times (IT) headlines, the study uses the Natural Language Toolkit (NLTK) and one of its most advanced libraries, vader_lexicon, followed by parametric statistical methods such as the t-test, to reject the hypothesis that there is no relationship between Price difference (the difference between the adjusted closing price for the current day and that of the previous day) and Polarity score (the company's headline sentiment score for each day). For headline classification, the research applies text pre-processing techniques such as sentence tokenization, regular expressions, word tokenization, and conversion of words into features. The pre-processed data is passed to several probabilistic and non-probabilistic supervised machine learning classifiers for a comparative analysis. The comparative analysis yields the surprising result that non-probabilistic algorithms perform much better than probabilistic ones. This made sense, however, as Tableau insights established a strong parallel relationship between the price distribution and the number of headlines published on a given date. The study also finds that RapidMiner is much more user friendly than Python but very slow in terms of computation time. This research successfully establishes a relationship between Irish Times headlines and the Irish Stock Exchange.
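The two quantities compared in the abstract, Price difference and Polarity score, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the prices and polarity scores are made-up values rather than data from the study, the daily polarity scores are hard-coded where the study would obtain them from VADER, the 0.05 cut-off for the Positive/Negative/Neutral split is a commonly used convention that the abstract does not confirm, and the t statistic shown is the one for a correlation coefficient, since the abstract does not say which form of t-test was applied. The function names are hypothetical.

```python
import math


def price_differences(adj_close):
    """Difference between each day's adjusted close and the previous day's."""
    return [today - prev for prev, today in zip(adj_close, adj_close[1:])]


def polarity_label(score, threshold=0.05):
    """Map a polarity score in [-1, 1] to a class (threshold is an assumption)."""
    if score >= threshold:
        return "Positive"
    if score <= -threshold:
        return "Negative"
    return "Neutral"


def correlation_t_statistic(xs, ys):
    """Pearson r and the t statistic for H0: no linear relationship."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    # t follows a Student t distribution with n - 2 degrees of freedom under H0.
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return r, t


# Made-up illustrative values, not data from the study.
adj_close = [4.10, 4.25, 4.05, 4.20, 4.00, 4.30]
daily_polarity = [0.40, -0.60, 0.30, -0.50, 0.02]  # one score per price difference

diffs = price_differences(adj_close)
labels = [polarity_label(s) for s in daily_polarity]
r, t = correlation_t_statistic(daily_polarity, diffs)
print(labels, round(r, 3), round(t, 3))
```

A large absolute t value relative to the Student t distribution with n - 2 degrees of freedom would lead to rejecting the hypothesis of no relationship. With real data, the polarity list would hold one aggregated headline score per trading day, aligned by date with the price differences.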