Defense Notices


All students and faculty are welcome to attend the final defenses of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date, so that there is time to complete the degree requirements check and post the presentation announcement online.

Upcoming Defense Notices

Soumya Baddham

Battling Toxicity: A Comparative Analysis of Machine Learning Models for Content Moderation

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Prasad Kulkarni
Hongyang Sun


Abstract

With the exponential growth of user-generated content, online platforms face unprecedented challenges in moderating toxic and harmful comments. As a result, automated content moderation has emerged as a critical application of machine learning, enabling platforms to ensure user safety and maintain community standards. Despite its importance, challenges such as severe class imbalance, contextual ambiguity, and the diverse nature of toxic language often compromise moderation accuracy, leading to biased classification performance.

This project presents a comparative analysis of machine learning approaches for a Multi-Label Toxic Comment Classification System using the Toxic Comment Classification dataset from Kaggle. The study examines the performance of traditional algorithms, such as Logistic Regression, Random Forest, and XGBoost, alongside deep architectures, including Bi-LSTM, CNN-Bi-LSTM, and DistilBERT. The proposed approach utilizes word-level embeddings across all models and examines the effects of architectural enhancements, hyperparameter optimization, and advanced training strategies on model robustness and predictive accuracy.

The study emphasizes the significance of loss function optimization and threshold adjustment strategies in improving the detection of minority classes. The comparative results reveal distinct performance trade-offs across model architectures: transformer models achieve superior contextual understanding at the cost of computational complexity, while the LSTM-based deep learning models offer efficiency advantages. These findings establish evidence-based guidelines for model selection in real-world content moderation systems, striking a balance between accuracy requirements and operational constraints.
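
As a concrete illustration of the threshold adjustment strategy mentioned above, the following minimal sketch (illustrative only, not code from the project) tunes a separate decision threshold per label on held-out predictions; the array names y_true and y_prob are assumptions:

```python
import numpy as np
from sklearn.metrics import f1_score

def tune_thresholds(y_true, y_prob, grid=np.linspace(0.05, 0.95, 19)):
    """Pick, per label, the decision threshold that maximizes validation F1."""
    n_labels = y_true.shape[1]
    thresholds = np.full(n_labels, 0.5)
    for j in range(n_labels):
        scores = [f1_score(y_true[:, j], y_prob[:, j] >= t, zero_division=0)
                  for t in grid]
        thresholds[j] = grid[int(np.argmax(scores))]
    return thresholds
```

Applying such tuned thresholds at inference time, instead of a uniform 0.5 cutoff, is one simple way minority labels can gain recall.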


Manu Chaudhary

Utilizing Quantum Computing for Solving Multidimensional Partial Differential Equations

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Esam El-Araby, Chair
Perry Alexander
Tamzidul Hoque
Prasad Kulkarni
Tyrone Duncan

Abstract

Quantum computing has the potential to revolutionize computational problem-solving by leveraging the quantum mechanical phenomena of superposition and entanglement, which allow a large amount of information to be processed simultaneously. This capability is significant for the numerical solution of complex and/or multidimensional partial differential equations (PDEs), which are fundamental to modeling various physical phenomena. Many quantum techniques are currently available for solving PDEs, mainly based on variational quantum circuits. However, the existing quantum PDE solvers, particularly those based on variational quantum eigensolver (VQE) techniques, suffer from several limitations: low accuracy, high execution times, and low scalability on quantum simulators as well as on noisy intermediate-scale quantum (NISQ) devices, especially for multidimensional PDEs.

In this work, we propose an efficient and scalable algorithm for solving multidimensional PDEs. We present two variants of our algorithm: the first leverages the finite-difference method (FDM), classical-to-quantum (C2Q) encoding, and numerical instantiation, while the second employs FDM, C2Q, and column-by-column decomposition (CCD). Both variants are designed to enhance accuracy and scalability while reducing execution times. We have validated and evaluated the proposed concepts using a number of case studies, including the multidimensional Poisson equation, the multidimensional heat equation, the Black-Scholes equation, and the Navier-Stokes equations for computational fluid dynamics (CFD), achieving promising results. Our results demonstrate higher accuracy, higher scalability, and faster execution times compared to VQE-based solvers on noise-free and noisy quantum simulators from IBM. Additionally, we validated our approach on hardware emulators and actual quantum hardware, employing noise mitigation techniques. This work establishes a practical and effective approach for solving PDEs using quantum computing for engineering and scientific applications.
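
For readers unfamiliar with the classical starting point of this pipeline, the sketch below shows the FDM step for the 1D Poisson equation, turning -u''(x) = f(x) into a linear system A u = b; it illustrates only the classical setup that precedes the C2Q encoding and CCD stages, which are not reproduced here:

```python
import numpy as np

def poisson_1d_fdm(f, n):
    """Discretize -u'' = f on (0, 1) with zero boundary values, n interior points."""
    h = 1.0 / (n + 1)                                  # grid spacing
    x = np.linspace(h, 1 - h, n)
    A = (np.diag(2 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2         # tridiagonal Laplacian
    return x, np.linalg.solve(A, f(x))

# Example with known exact solution u(x) = sin(pi x); error shrinks as O(h^2).
x, u = poisson_1d_fdm(lambda x: np.pi**2 * np.sin(np.pi * x), 63)
```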


Alex Manley

Taming Complexity in Computer Architecture through Modern AI-Assisted Design and Education

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Heechul Yun, Chair
Tamzidul Hoque
Prasad Kulkarni
Mohammad Alian

Abstract

The escalating complexity inherent in modern computer architecture presents significant challenges for both professional hardware designers and students striving to gain foundational understanding. Historically, the steady improvement of computer systems was driven by transistor scaling, predictable performance increases, and relatively straightforward architectural paradigms. However, with the end of traditional scaling laws and the rise of heterogeneous and parallel architectures, designers now face unprecedented intricacies involving power management, thermal constraints, security considerations, and sophisticated software interactions. Prior tools and methodologies, often reliant on complex, command-line driven simulations, exacerbate these challenges by introducing steep learning curves, creating a critical need for more intuitive, accessible, and efficient solutions. To address these challenges, this thesis introduces two innovative, modern tools.

The first tool, SimScholar, provides an intuitive graphical user interface (GUI) built upon the widely used gem5 simulator. SimScholar significantly simplifies the simulation process, enabling students and educators to more effectively engage with architectural concepts through a visually guided environment, both reducing complexity and enhancing conceptual understanding. Supporting SimScholar, the gem5 Extended Modules API (gEMA) offers streamlined backend integration with gem5, ensuring efficient communication, modularity, and maintainability.

The second contribution, gem5 Co-Pilot, delivers an advanced framework for architectural design space exploration (DSE). Co-Pilot integrates cycle-accurate simulation via gem5, detailed power and area modeling through McPAT, and intelligent optimization assisted by a large language model (LLM). Central to Co-Pilot is the Design Space Declarative Language (DSDL), a Python-based domain-specific language that facilitates structured, clear specification of design parameters and constraints.
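
Since the DSDL specification itself is not included in this notice, the following is a purely hypothetical sketch of what a Python-embedded design-space declaration might look like; the names DesignSpace, param, constraint, and points are invented for illustration:

```python
from itertools import product

class DesignSpace:
    """Hypothetical illustration of a design-space DSL; not the actual DSDL API."""
    def __init__(self):
        self.params, self.constraints = {}, []
    def param(self, name, values):
        self.params[name] = list(values)
    def constraint(self, pred):
        self.constraints.append(pred)
    def points(self):
        names = list(self.params)
        for combo in product(*self.params.values()):
            cfg = dict(zip(names, combo))
            if all(pred(cfg) for pred in self.constraints):
                yield cfg

space = DesignSpace()
space.param("l2_size_kb", [256, 512, 1024])
space.param("cores", [2, 4, 8])
space.constraint(lambda c: c["l2_size_kb"] // c["cores"] >= 64)  # per-core budget
configs = list(space.points())  # candidate configurations to simulate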

Collectively, these tools constitute a comprehensive approach to taming complexity in computer architecture, offering powerful, user-friendly solutions tailored to both educational and professional settings.


Past Defense Notices

SATHYA MAHADEVAN

Implementation of ID3 for Data Stored in Multiple SQL Databases

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Man Kong
Prasad Kulkarni


Abstract

Data classification is a data mining methodology used to retrieve meaningful information from data. A model is built from an input training set and is later used to classify new observations. One of the most widely used models is the decision tree, which uses a tree-like structure to list all possible outcomes. Decision trees are preferred for their simple structure, requiring little effort for data preparation, and for their easy interpretation. This project implements ID3, an algorithm that builds a decision tree using information gain. The decision tree is converted to a set of rules, and the error rate is calculated on the test dataset. The dataset is usually stored in a relational database in the form of tables. In practice, it may be desirable to store data across multiple databases. In such scenarios, retrieving and coordinating data from the databases can be a challenging task. This project implements the ID3 algorithm with the convenience of reading data stored across multiple data sources.
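
For context, a minimal sketch of ID3's core attribute-selection step, choosing the split with the highest information gain, is shown below (illustrative, not the project's implementation):

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    """Entropy reduction from splitting the cases on one attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder
```

ID3 recursively splits on the attribute maximizing this gain until each partition is pure; the finished tree is then flattened into rules for error-rate evaluation on the test set.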


CHAO LAN

Inequity Coefficient and Fair Transfer Learning

When & Where:


250 Nichols Hall

Committee Members:

Luke Huan, Chair
Lingjia Liu
Bo Luo
Xintao Wu
Jin Feng

Abstract

Fair machine learning is an emerging and urgent research topic that aims to avoid discriminatory predictions against protected groups of people in real-world decision making. This project aims to advance the field in two dimensions. First, we propose a more practical measurement of individual fairness called the inequity coefficient, which unifies the current individual fairness framework (principled but short on practice) with the current situation testing practice (practical but short on principle). We develop foundations for the measurement and present its practical use. Second, we propose a first study of fairness in the context of transfer learning, focusing on the hypothesis transfer and multi-task settings over two tasks. We illustrate a new challenge called discriminatory transfer, where discrimination is enforced by traditional task relatedness constraints that aim only to find accurate hypotheses. We propose a set of new algorithms that avoid discriminatory transfer across tasks or promote fairness within each task.
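
For context, the situation testing practice the abstract refers to can be sketched as follows (an illustrative simplification, not the proposed inequity coefficient):

```python
import numpy as np

def situation_test(x, X, y, group, k=5):
    """For a test case x, compare favorable-outcome rates among its k nearest
    neighbors from each protected group (group is a 0/1 array)."""
    rates = []
    for g in (0, 1):
        idx = np.where(group == g)[0]
        dists = np.linalg.norm(X[idx] - x, axis=1)
        nearest = idx[np.argsort(dists)[:k]]
        rates.append(y[nearest].mean())
    return rates[1] - rates[0]   # signed per-individual disparity
```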


ROHIT BANERJEE

Extraction and Analysis of Amazon Reviews

When & Where:


246 Nichols Hall

Committee Members:

Fengjun Li, Chair
Man Kong
Bo Luo


Abstract

Amazon.com is one of the largest online retail stores in the world. Besides selling millions of products on its website, Amazon provides a variety of Web services, including the Amazon Review and Recommendation System. Users are encouraged to write product reviews to help others understand products’ features and make purchase decisions. However, product reviews, as a type of user-generated content (UGC), suffer from quality and trust problems. To help evaluate the quality of reviews, Amazon also provides users with a helpfulness vote feature, so that a user can support a review that he or she considers helpful. In this project, we study the relation between helpfulness votes and the ranks of reviews. In particular, we look for answers to questions such as “how do helpfulness votes affect review ranks?” and “how do a review’s rank and its presentation mechanism affect people’s voting behavior?” To investigate these questions, we built a crawler to collect reviews and their votes from Amazon on a daily basis. We then analyzed a dataset of over 50,000 Amazon reviews to identify voting patterns and their impact on review ranks. Our results show a positive correlation between review ranks and helpfulness votes.
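
The reported rank-votes correlation can be illustrated with a short sketch; the dataset file and column names below are assumptions, not taken from the project:

```python
import pandas as pd
from scipy.stats import spearmanr

# "amazon_reviews.csv" and the column names are hypothetical stand-ins
# for the crawler's output described in the abstract.
reviews = pd.read_csv("amazon_reviews.csv")
rho, p_value = spearmanr(reviews["review_rank"], reviews["helpful_votes"])
print(f"Spearman rho = {rho:.3f} (p = {p_value:.3g})")
```

A rank correlation such as Spearman's is a natural fit here because review ranks are ordinal and the relationship need not be linear.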


BIJAL PARIKH

A Comparison of Tolerance Relation and Valued Tolerance Relation for Incomplete Datasets

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Bo Luo


Abstract

Rough set theory is a popular approach to decision rule induction. However, it requires the objects in the information system to be completely described. Many real-life data sets are incomplete, so rough set theory cannot be applied directly for rule induction. This project implements and compares two generalizations of rough set theory used for rule induction from incomplete data: the Tolerance Relation and the Valued Tolerance Relation. A comparative analysis is conducted of the lower and upper approximations and of the decision rules induced by the two methods. Our experiments show that the Valued Tolerance Relation provides better approximations than the simple Tolerance Relation when the percentage of missing attribute values in a dataset is high.
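
For context, the simple tolerance relation can be sketched as follows (illustrative only; attribute values equal to "?" are treated as missing):

```python
MISSING = "?"

def tolerant(case_a, case_b):
    """Two cases are tolerant if every attribute agrees or is missing in either."""
    return all(a == b or MISSING in (a, b) for a, b in zip(case_a, case_b))

def tolerance_class(case, universe):
    """All cases indistinguishable from `case`; the building block for
    computing lower and upper approximations."""
    return [other for other in universe if tolerant(case, other)]
```

The valued tolerance relation generalizes this by assigning each pair a graded degree in [0, 1] rather than a yes/no judgment, reflecting how likely the missing values are to match.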


ALHANOOF ALTHNIAN

Evolutionary Learning of Goal-Driven Multi-Agent Communication

When & Where:


2001B Eaton Hall

Committee Members:

Arvin Agah, Chair
Prasad Kulkarni
Fengjun Li
Bo Luo
Elaina Sutley

Abstract

Multi-agent systems are a common paradigm for building distributed systems in domains such as networking, health care, swarm sensing, robotics, and transportation. Systems are usually designed or adjusted to reflect the performance trade-offs made according to the characteristics of the mission requirements.
Research has acknowledged the crucial role that communication plays in solving many performance problems. However, research efforts that address communication decisions are usually designed and evaluated with respect to a single predetermined performance goal. This work introduces Goal-Driven Communication, where communication in a multi-agent system is determined according to flexible performance goals.
This work proposes an evolutionary approach that, given a performance goal, produces a communication strategy that can improve a multi-agent system’s performance with respect to the desired goal. The evolved strategy determines what, when, and to whom the agents communicate. The proposed approach further enables tuning the trade-off between the performance goal and communication cost, producing a strategy that achieves a good balance between the two objectives, according to the system designer’s needs.
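
A minimal sketch of such an evolutionary loop is given below; the simulate function is a hypothetical stand-in for running the multi-agent system and scoring the chosen performance goal, and lam tunes the goal-versus-cost trade-off:

```python
import random

def evolve(simulate, genome_len, pop=50, gens=100, lam=0.1):
    """Evolve a communication-strategy genome (what/when/to whom, encoded as
    real values); `simulate` returns (goal_score, comm_cost)."""
    def fitness(g):
        goal_score, comm_cost = simulate(g)
        return goal_score - lam * comm_cost

    population = [[random.random() for _ in range(genome_len)]
                  for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop // 2]                 # truncation selection
        children = []
        while len(parents) + len(children) < pop:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, genome_len)        # one-point crossover
            child = a[:cut] + b[cut:]
            child[random.randrange(genome_len)] = random.random()  # mutation
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```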


JYOTI GANGARAJU

A Laboratory Manual for an Introduction to Communication Systems Course

When & Where:


2001B Eaton Hall

Committee Members:

Victor Frost, Chair
Dave Petr
Glenn Prescott


Abstract

A communication systems laboratory is a hands-on way to effectively visualize real-life applications of communication systems in their simplest form. Recently, hardware equipment such as the spectrum analyzer, oscilloscope, and function generator was replaced by Pico Scope 6, a software-based data analyzer. Pico Scope 6 is user-friendly software that enables its users to capture and analyze analog and digital signals with comparatively higher accuracy. Additionally, it is an economically viable solution from both the procurement and maintenance standpoints. The current effort focuses on developing a laboratory user manual, based on Pico Scope 6, for undergraduates of the Department of Electrical Engineering and Computer Science (EECS). The series of laboratory exercises developed follows the course outline of Introduction to Communication Systems – EECS 562. The expected outcome of this laboratory manual is an improved understanding of analog modulation, digital modulation, and noise analysis in communication systems.
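
As an illustration of the kind of exercise such a manual covers, the following sketch generates a conventional AM waveform with additive noise for SNR analysis; all parameter values are arbitrary examples, not taken from the manual:

```python
import numpy as np

fs, fc, fm = 100_000, 10_000, 500            # sampling, carrier, message freqs (Hz)
t = np.arange(0, 0.02, 1 / fs)               # 20 ms observation window
m = np.cos(2 * np.pi * fm * t)               # message signal
s = (1 + 0.5 * m) * np.cos(2 * np.pi * fc * t)   # conventional AM, index 0.5
noise = np.random.normal(0.0, 0.1, t.shape)  # additive channel noise
received = s + noise                         # waveform as captured for analysis
snr_db = 10 * np.log10(np.mean(s**2) / np.mean(noise**2))
print(f"SNR = {snr_db:.1f} dB")
```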