A framework for embedding a hybrid term proximity score with standard TF-IDF to improve the performance of a recipe retrieval system


Student Name: Paul Gomes
Defense Date:
Location: Eaton Hall, Room 2001B
Chair: Prasad Kulkarni

David Johnson

Hongyang Sun

Abstract:

Information retrieval systems play an important role in retrieving relevant information from large collections of data, such as documents, webpages, and other multimedia content. Having an information retrieval system in a given domain allows users to find the information relevant to their needs. Unfortunately, modern recipe websites present their audience with numerous recipes in a colorful user interface but offer very little capability to search and narrow down the content based on specific interests. The goal of this project is to develop a search engine for recipes using standard TF-IDF weighting and to improve on the performance of this standard IR approach by incorporating term proximity. Term proximity is calculated with a hybrid approach that combines span-based and pair-based methods. The project architecture includes a crawler, a database, an API, a service responsible for TF-IDF weighting and term proximity calculation, and a web application that presents the search results.
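
To make the hybrid scoring idea concrete, the following is a minimal sketch and not the project's actual implementation. The abstract does not give the formulas, so everything here is an assumption made for illustration: the function names, the whitespace tokenizer, the log(N/df) form of IDF, the span score (matched terms divided by the shortest covering window), the pair score (sum of inverse squared minimum distances), and the weight `alpha` used to combine them.

```python
import math
from itertools import combinations


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def tf_idf_score(query, doc, docs):
    """Sum of tf * idf over distinct query terms, with idf = log(N / df)."""
    n = len(docs)
    score = 0.0
    for term in set(query):
        tf = doc.count(term)
        df = sum(1 for d in docs if term in d)
        if tf and df:
            score += tf * math.log(n / df)
    return score


def term_positions(query, doc):
    """Map each query term found in the document to its token positions."""
    return {t: [i for i, w in enumerate(doc) if w == t]
            for t in set(query) if t in doc}


def span_score(pos):
    """Span-based (assumed form): matched terms / shortest window covering them."""
    if len(pos) < 2:
        return 0.0
    events = sorted((p, t) for t, ps in pos.items() for p in ps)
    counts, left, best = {}, 0, None
    for p, t in events:
        counts[t] = counts.get(t, 0) + 1
        # Shrink the window from the left while it still covers every term.
        while len(counts) == len(pos):
            width = p - events[left][0] + 1
            best = width if best is None else min(best, width)
            left_term = events[left][1]
            counts[left_term] -= 1
            if counts[left_term] == 0:
                del counts[left_term]
            left += 1
    return len(pos) / best


def pair_score(pos):
    """Pair-based (assumed form): sum of 1 / d^2 over pairs of matched terms."""
    total = 0.0
    for a, b in combinations(sorted(pos), 2):
        d = min(abs(i - j) for i in pos[a] for j in pos[b])
        total += 1.0 / (d * d)
    return total


def hybrid_score(query, doc, docs, alpha=1.0):
    """TF-IDF plus a weighted hybrid (span + pair) proximity term."""
    pos = term_positions(query, doc)
    return tf_idf_score(query, doc, docs) + alpha * (span_score(pos) + pair_score(pos))


def rank(query_text, texts, alpha=1.0):
    """Return the input texts ordered by descending hybrid score."""
    docs = [tokenize(t) for t in texts]
    query = tokenize(query_text)
    scored = [(hybrid_score(query, d, docs, alpha), t) for d, t in zip(docs, texts)]
    return [t for _, t in sorted(scored, key=lambda s: s[0], reverse=True)]
```

For the query "garlic chicken", a recipe titled "creamy garlic chicken pasta with parmesan" and one titled "garlic bread served alongside roasted vegetables and grilled chicken" receive the same TF-IDF score under this sketch, since each term occurs once in both. The proximity term breaks the tie in favor of the first recipe, where the two query terms are adjacent.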

Degree: MS Project Defense (CS)
Degree Type: MS Project Defense
Degree Field: Computer Science