Competitive Analysis of Embedding Models in Retrieval-Augmented Generation for Indian Motor Vehicle Law Chat Bots

Authors
Mohanan, Monisha
Issue Date
2024
Degree
MSc in Artificial Intelligence
Publisher
Dublin Business School
Abstract
This study evaluates eight embedding models within a Retrieval-Augmented Generation (RAG) system for a chatbot tailored to Indian Motor Vehicle Law. The models examined are OpenAIEmbeddings, UAE-Large-V1, all-MiniLM-L6-v2, all-distilroberta-v1, all-mpnet-base-v2, bge-large-en-v1.5, ember-v1, and gte-large. The models are compared using Cosine Similarity and ROUGE metrics. On these measures, OpenAIEmbeddings and gte-large stand out for their semantic understanding: their outputs align closely with expert-generated answers, which indicates their suitability for AI-driven legal assistance. The results underscore that the choice of embedding model, and in particular its capacity for semantic comprehension, is an important decision in legal chatbot development. The findings offer insights into integrating embedding models effectively into legal technology applications.
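The two metric families named in the abstract can be illustrated with a minimal sketch. It is not the study's implementation, which this record does not describe. The function names `cosine_similarity` and `rouge1_f1`, the whitespace tokenisation, the choice of the ROUGE-1 F1 variant, and the sample sentences and vectors are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rouge1_f1(candidate, reference):
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    # Assumption: lowercase and split on whitespace; real evaluations
    # often apply stemming or a dedicated tokeniser.
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical generated answer and expert answer.
    generated = "a licence is required to drive"
    expert = "a valid licence is required to drive a vehicle"
    print("ROUGE-1 F1:", rouge1_f1(generated, expert))

    # Hypothetical embedding vectors standing in for model output.
    print("cosine:", cosine_similarity([0.2, 0.7, 0.1], [0.25, 0.6, 0.2]))
```

In an evaluation of this kind, the cosine score would typically be computed on embeddings of the generated and expert answers, while ROUGE works directly on their text, so the two measures capture semantic and lexical overlap respectively.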