Competitive Analysis of Embedding Models in Retrieval-Augmented Generation for Indian Motor Vehicle Law Chat Bots

Authors

Mohanan, Monisha

Issue Date

2024

Degree

MSc in Artificial Intelligence

Publisher

Dublin Business School

Abstract

This study evaluates eight embedding models in Retrieval-Augmented Generation (RAG) systems for a chatbot tailored to Indian Motor Vehicle Law. The models examined are OpenAIEmbeddings, UAE-Large-V1, all-MiniLM-L6-v2, all-distilroberta-v1, all-mpnet-base-v2, bge-large-en-v1.5, ember-v1, and gte-large. Measured by cosine similarity and ROUGE, OpenAIEmbeddings and gte-large stand out for their superior semantic understanding: their outputs aligned closely with expert-generated answers, indicating their suitability for AI-driven legal assistance. The results underscore the importance of embedding model selection in legal chatbot development, and of semantic comprehension capability in particular. The research offers insights into integrating embedding models effectively in legal technology applications.
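For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how cosine similarity and a ROUGE-1 score can be computed. It is not the thesis's evaluation code: the function names `cosine_similarity` and `rouge_1` are hypothetical, the vectors stand in for embeddings that a real model would produce, and the whitespace tokenization is a simplifying assumption (standard ROUGE tooling typically applies more careful tokenization and optional stemming).

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (illustrative helper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def rouge_1(candidate, reference):
    """ROUGE-1 precision, recall and F1 from clipped unigram overlap.

    Assumption: lowercased whitespace tokens, no stemming.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # each token counted at most min(cand, ref) times
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    if overlap == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy vectors standing in for the embeddings of a generated and an expert answer.
generated_vec = [0.2, 0.7, 0.1]
expert_vec = [0.25, 0.65, 0.05]
print(cosine_similarity(generated_vec, expert_vec))

# Made-up answer strings, used only to exercise the function.
print(rouge_1("the fine is 500 rupees", "the fine is 1000 rupees"))
```

Cosine similarity compares answers in embedding space, so it rewards paraphrases that carry the same meaning, while ROUGE counts literal word overlap with the reference. Reporting both, as the study does, gives complementary views of how close a generated answer is to an expert one.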