Enhancing Abstractive and Extractive Reviews Text Summarization using NLP and Neural Networks
Authors
Rawat, Ankit
Issue Date
2024
Degree
MSc in Data Analytics
Publisher
Dublin Business School
Abstract
In today's digital age, the exponential growth of online reviews presents businesses with the challenge of efficiently processing huge volumes of textual data. The ability to distill valuable insights from these reviews is crucial for understanding consumer sentiment and facilitating informed decision-making. This thesis investigates the application of a Seq2Seq Long Short-Term Memory (LSTM) model for text summarization, aiming to develop an automated system capable of generating concise and informative summaries from input texts. The model underwent comprehensive training over three epochs, during which it exhibited considerable progress in reducing loss and improving cosine-similarity scores, signifying its ability to learn from the dataset.
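The cosine-similarity metric used above to track training progress can be illustrated with a minimal sketch. This is plain Python over bag-of-words count vectors; the thesis's actual preprocessing and vector representation are not specified here, so the tokenization is an assumption for illustration only:

```python
import math
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between two texts, using simple
    whitespace tokenization and bag-of-words counts
    (a hypothetical stand-in for the thesis's pipeline)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Dot product over the shared vocabulary.
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

A score of 1.0 indicates identical word distributions between a generated summary and its reference, while 0.0 indicates no word overlap; rising scores during training suggest the generated summaries are converging toward the references.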
Upon evaluation, the model demonstrated its potential by generating summaries. However, critical analysis revealed certain limitations: the generated summaries were often brief without substantive depth and occasionally lacked coherence. This highlights the current challenges in achieving nuanced and contextually appropriate summarization.
A comparative analysis against established models emphasized the need for further advancements. The model's deficiency in comprehensiveness and contextual relevance, particularly when faced with complex sentence structures or nuanced semantic relations, was evident.
The identified limitations included the limited depth and complexity of the summaries, reliance on a small dataset, and difficulty processing intricate linguistic features. Potential avenues for addressing these challenges and enhancing model performance include augmenting the dataset with more diverse text samples, exploring advanced architectures such as transformer-based models, and careful hyperparameter tuning.
In conclusion, while the Seq2Seq LSTM model exhibits promise for text summarization tasks, its current limitations impede the generation of comprehensive and contextually accurate summaries. The thesis underscores the necessity for future endeavors to mitigate these limitations, thereby advancing the model's capability to produce more informative and contextually apt summaries.