BabyJay - A RAG Based Chatbot for the University of Kansas


Student Name: Pavan Sai Reddy Pendry
Defense Date:
Location: Eaton Hall, Room 2001B
Chair: David Johnson

Rachel Jarvis

Prasad Kulkarni

Abstract:

The University of Kansas maintains hundreds of departmental and unit websites, leaving students without a unified way to find information. General-purpose chatbots hallucinate KU-specific facts, and static FAQ pages cannot hold a conversation. This work presents BabyJay, a Retrieval-Augmented Generation chatbot that answers student questions using content scraped from official KU sources, with inline citations on every response. The pipeline combines query preprocessing and decomposition, an intent classifier that routes most queries to fast JSON lookups, hybrid retrieval (BM25 and ChromaDB vector search merged via Reciprocal Rank Fusion), a cross-encoder re-ranker, and generation by Claude Sonnet 4.6 under a context-only system prompt. Evaluation on 46 question-answer pairs across five difficulty tiers and eight domains produced a composite score of 0.72, entity precision of 93%, and zero runtime errors. Retrieval, rather than generation, emerged as the primary bottleneck, motivating future work on multi-domain query handling.
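
The hybrid retrieval step mentioned above merges a BM25 ranking and a vector-search ranking with Reciprocal Rank Fusion. The following is a minimal sketch of that fusion idea only, not the project's actual code: the function name, the document IDs, and the smoothing constant k = 60 (a commonly used default) are assumptions made for illustration and do not come from the abstract.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a value taken
    from the BabyJay project.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically by ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings from the two retrievers (IDs are illustrative).
bm25_ranking = ["admissions", "housing", "parking"]
vector_ranking = ["housing", "dining", "admissions"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)
```

Because the fusion uses only rank positions, the BM25 and vector-similarity scores never need to be put on a common scale; a document ranked well by both retrievers rises above one that appears in a single list. In a pipeline like the one described, a fused list of this kind would then be passed to the cross-encoder re-ranker.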

Degree: MS Project Defense (CS)
Degree Type: MS Project Defense
Degree Field: Computer Science