Implementing real time recommendation systems using graph algorithms & exploring graph analytics in a graph database platform (Neo4j)
Authors
Kumar, Amit
Issue Date
2019
Degree
MSc in Data Analytics
Publisher
Dublin Business School
Rights holder
Rights
Items in eSource are protected by copyright. Previously published items are made available in accordance with the copyright policy of the publisher/copyright holder.
Abstract
Recommendation Systems play a very important role in our lives in the modern era. With the advent of Big Data in recent years, an enormous amount of information (in both structured and unstructured formats) is being generated every second from various data sources. Recommendation Systems are very helpful for generating meaningful insights from massive amounts of data. Storing data in a graph database platform enables real-time recommendations, which improve on slower batch approaches. This dissertation aims to build and implement a real-time recommendation system using different graph algorithms in Neo4j, the current leader in graph operational database management systems. The requirements or use cases for this research were proposed by the AI and Analytics Team of Neo4j, for their ongoing research and development activities. Various research papers were studied to get an overview of the graph algorithms currently used in recommendation systems in graph databases. A customized graph data model was implemented to provide solutions for the research questions. Cypher, the Neo4j query language, was used to implement a selection of recommendation graph algorithms on this data model. The graph algorithms used were Overlap Similarity, Cosine Similarity and PageRank. To provide a comparison between traditional and graph databases, FP-Growth, a traditional Association Rule Algorithm, was implemented using RapidMiner, a leading Data Mining Tool, showcasing a particular use case using a traditional approach. A Python Script was developed to prepare the data for loading into the customized graph data model. In addition, data profiling (statistical analysis) was performed on the loaded data, providing a thorough analysis of its structure, contents and metadata. Graph analytics were only introduced for an operational graph database management system in the fourth quarter of 2017.
The results obtained from this research highlight the enormous potential for real-time recommendations using the algorithms of a graph database platform like Neo4j.
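To illustrate the three graph algorithms named in the abstract (Overlap Similarity, Cosine Similarity and PageRank), the following is a minimal Python sketch of their textbook definitions. It is not the dissertation's Cypher implementation or Neo4j's library code; the function names, the damping factor of 0.85, the iteration count and the handling of nodes without outgoing links are assumptions made for illustration only.

```python
import math


def overlap_similarity(a, b):
    """Overlap coefficient of two item sets: |A ∩ B| / min(|A|, |B|)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    `graph` maps each node to a list of nodes it links to.
    Nodes without outgoing links spread their rank evenly
    over all nodes (an assumption of this sketch).
    """
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

In a recommendation setting, the two similarity functions would typically compare the items or ratings associated with two users, while PageRank would rank nodes by how they are linked; the data passed to them here is left to the reader.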