Abstract
This research studies the effect of headline risk on the Irish stock market using Irish Times
headlines. To do so, it scrapes Irish Times headlines for "Bank of Ireland," one
of the oldest stocks on the Irish Stock Exchange (ISE), and extracts the stock prices from Yahoo Finance. Apart from
studying this relationship, it aims to classify headlines into three polarity
groups: Positive, Negative, and Neutral. It follows the CRISP-DM methodology, which
divides the project into six stages. To establish a relationship between the ISE and Irish Times
headlines, this study uses the Natural Language Toolkit (NLTK) and one of its most advanced
sentiment resources, vader_lexicon, followed by parametric statistical methods such as the t-test to reject
the hypothesis that there is no relationship between Price difference (the difference between
the adjusted closing price for the current day and the previous day) and Polarity score (the company's
sentiment score from the headlines for each day). For headline classification, the research
uses several text preprocessing techniques: sentence tokenization, regular expressions,
word tokenization, and conversion of words into features. The preprocessed data is passed to
several probabilistic and nonprobabilistic supervised machine learning classifiers for a
comparative analysis. The comparative analysis finds the surprising
result that nonprobabilistic algorithms perform much better than probabilistic
algorithms. This result became plausible once Tableau visualizations showed a strong parallel relationship
between the price distribution and the number of headlines published on a given date. The study
also finds that RapidMiner is more user friendly than Python but much slower in
computation time. Overall, this research establishes a relationship between Irish
Times headlines and the Irish Stock Exchange.
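
The hypothesis test described in the abstract can be illustrated with a minimal sketch. It assumes the test is a t-test on the Pearson correlation between the daily price difference and the daily polarity score; the abstract does not state which t-test variant was used, so this is one plausible reading rather than the study's actual procedure. The prices and polarity scores below are made-up illustrative values, and the helper names are hypothetical. In practice the polarity scores would come from a sentiment scorer such as NLTK's VADER analyzer.

```python
import math
from statistics import mean


def price_differences(adj_close):
    """Adjusted close of each day minus the adjusted close of the previous day."""
    return [today - prev for prev, today in zip(adj_close, adj_close[1:])]


def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(x, y):
    """t statistic and degrees of freedom for H0: no linear relationship (r = 0)."""
    n = len(x)
    r = pearson_r(x, y)
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return r, t, n - 2


# Illustrative values only, NOT data from the study.
adj_close = [4.10, 4.18, 4.12, 4.25, 4.20, 4.31]   # six trading days
polarity = [0.42, -0.30, 0.55, -0.10, 0.38]        # one score per price difference

diffs = price_differences(adj_close)
r, t, df = correlation_t_statistic(diffs, polarity)
print(f"r = {r:.3f}, t = {t:.3f}, df = {df}")
```

The t statistic is then compared against the critical value of the t distribution with n - 2 degrees of freedom; a value beyond it rejects the hypothesis of no relationship at the chosen significance level.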