Academic Paper

Chatbots in Academia: A Retrieval-Augmented Generation Approach for Improved Efficient Information Access

Document Type

Conference

Author

Maryamah, Maryamah; Irfani, Muhammad Maula; Tri Raharjo, Edric Boby; Rahmi, Netri Alia; Ghani, Mohammad; Raharjana, Indra Kharisma

Source

2024 16th International Conference on Knowledge and Smart Technology (KST), pp. 259-264, Feb. 2024

Subject

Computing and Processing
Analytical models
Databases
Virtual assistants
Search methods
Natural languages
Oral communication
Chatbots
Academic Chatbots
Retrieval-Augmented Generation
Large Language Models
Technology

Language

ISSN

2473-764X

Abstract

In today's digital age, higher education institutions use chatbots as virtual assistants to help users, especially prospective students, access information easily. A chatbot is an application that simulates intelligent interaction through natural language conversation. Intelligent chatbots are needed to understand user needs and give relevant answers. We propose a chatbot based on a Retrieval-Augmented Generation (RAG) approach, in which a retriever uses cosine similarity search over OpenAI Ada embeddings to obtain relevant documents. The LLM, OpenAI GPT-3.5-Turbo, then generates the final answer. The chatbot mechanism begins with the retrieval module identifying documents in the vector database that contain information relevant to the user's query. The selected documents and the query are passed to the LLM as part of the prompt, so that the response is grounded in the knowledge contained in those documents. The retrieval method is evaluated on two criteria: the search method and the embedding model. The evaluation compares similarity search against Maximum Marginal Relevance (MMR) search, and the proposed embedding model against alternatives such as Google Embedding-001 and MPNet-Multilingual. Retrieval is assessed on an evaluation dataset using Recall and Precision, while answer generation is measured with BLEU and ROUGE scores. The observed difference between similarity search and MMR is not notable. Nonetheless, our chatbot has an advantage in referencing past conversations because it stores the conversation history. Furthermore, augmenting the knowledge provided to the LLM is identified as a potential enhancement for future iterations.
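The two retrieval strategies compared in the abstract, plain cosine similarity search and MMR search, can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional vectors stand in for real embeddings (the paper uses OpenAI Ada), and the function names, the `lam` trade-off weight, and the "relevant" label set are all hypothetical choices made for illustration. The toy numbers say nothing about the paper's results.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def similarity_search(query, docs, k):
    """Return indices of the k documents most similar to the query."""
    ranked = sorted(range(len(docs)),
                    key=lambda i: cosine(query, docs[i]),
                    reverse=True)
    return ranked[:k]

def mmr_search(query, docs, k, lam=0.5):
    """Greedy MMR: balance relevance to the query against
    redundancy with documents that were already selected.
    lam=1.0 reduces to plain similarity ranking (assumed default 0.5)."""
    selected = []
    candidates = list(range(len(docs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query, docs[i])
            redundancy = max((cosine(docs[i], docs[j]) for j in selected),
                             default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved list against gold labels."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Toy stand-ins for document embeddings (hypothetical values).
docs = [
    [1.0, 0.0],   # doc 0
    [1.0, 0.05],  # doc 1: near-duplicate of doc 0
    [0.3, 1.0],   # doc 2: different topic
]
query = [1.0, 0.3]
relevant = [1]    # hypothetical gold label for this query

top_sim = similarity_search(query, docs, k=2)
top_mmr = mmr_search(query, docs, k=2)
print("similarity:", top_sim, precision_recall(top_sim, relevant))
print("mmr:       ", top_mmr, precision_recall(top_mmr, relevant))
```

On this toy data, similarity search returns the two near-duplicate documents, while MMR replaces the second one with the more distinct document. In a full RAG pipeline the retrieved texts would then be placed in the LLM prompt together with the user's query.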

