Defense Notices


All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check and post the presentation announcement online.

Upcoming Defense Notices

Soumya Baddham

Battling Toxicity: A Comparative Analysis of Machine Learning Models for Content Moderation

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Prasad Kulkarni
Hongyang Sun


Abstract

With the exponential growth of user-generated content, online platforms face unprecedented challenges in moderating toxic and harmful comments. As a result, automated content moderation has emerged as a critical application of machine learning, enabling platforms to ensure user safety and maintain community standards. Despite its importance, challenges such as severe class imbalance, contextual ambiguity, and the diverse nature of toxic language often compromise moderation accuracy, leading to biased classification performance.

This project presents a comparative analysis of machine learning approaches for a Multi-Label Toxic Comment Classification System using the Toxic Comment Classification dataset from Kaggle. The study examines the performance of traditional algorithms, such as Logistic Regression, Random Forest, and XGBoost, alongside deep architectures, including Bi-LSTM, CNN-Bi-LSTM, and DistilBERT. The proposed approach utilizes word-level embeddings across all models and examines the effects of architectural enhancements, hyperparameter optimization, and advanced training strategies on model robustness and predictive accuracy.

The study emphasizes the significance of loss function optimization and threshold adjustment strategies in improving the detection of minority classes. The comparative results reveal distinct performance trade-offs across model architectures: transformer models achieve superior contextual understanding at the cost of computational complexity, while the deep learning approaches (LSTM models) offer efficiency advantages. These findings establish evidence-based guidelines for model selection in real-world content moderation systems, striking a balance between accuracy requirements and operational constraints.
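
As a minimal illustration of the per-label threshold adjustment discussed above, the sketch below trains one binary classifier per label and tunes each label's decision threshold on validation F1. It assumes a scikit-learn-style pipeline; the label names follow the Kaggle dataset, but the model choice, features, and threshold grid are illustrative rather than the configurations studied in this work.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def fit_and_tune(train_texts, Y_train, val_texts, Y_val):
    """Train one binary model per label and tune a per-label decision threshold.

    Y_train / Y_val are binary matrices of shape (n_samples, len(LABELS)).
    """
    vec = TfidfVectorizer(max_features=50_000)
    X_train = vec.fit_transform(train_texts)
    X_val = vec.transform(val_texts)
    models, thresholds = [], []
    for j in range(len(LABELS)):
        clf = LogisticRegression(max_iter=1000, class_weight="balanced")
        clf.fit(X_train, Y_train[:, j])
        prob = clf.predict_proba(X_val)[:, 1]
        # Minority labels (e.g. "threat") are often best served by a cutoff
        # far from the default 0.5, so search a per-label grid on validation F1.
        grid = np.linspace(0.05, 0.95, 19)
        best_t = grid[int(np.argmax([f1_score(Y_val[:, j], prob >= t,
                                              zero_division=0) for t in grid]))]
        models.append(clf)
        thresholds.append(best_t)
    return vec, models, np.array(thresholds)
```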


Manu Chaudhary

Utilizing Quantum Computing for Solving Multidimensional Partial Differential Equations

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Esam El-Araby, Chair
Perry Alexander
Tamzidul Hoque
Prasad Kulkarni
Tyrone Duncan

Abstract

Quantum computing has the potential to revolutionize computational problem-solving by leveraging the quantum mechanical phenomena of superposition and entanglement, which allow a large amount of information to be processed simultaneously. This capability is significant for the numerical solution of complex and/or multidimensional partial differential equations (PDEs), which are fundamental to modeling various physical phenomena. Many quantum techniques for solving PDEs are currently available, mainly based on variational quantum circuits. However, the existing quantum PDE solvers, particularly those based on variational quantum eigensolver (VQE) techniques, suffer from several limitations: low accuracy, high execution times, and low scalability on quantum simulators as well as on noisy intermediate-scale quantum (NISQ) devices, especially for multidimensional PDEs.

In this work, we propose an efficient and scalable algorithm for solving multidimensional PDEs. We present two variants of our algorithm: the first leverages the finite-difference method (FDM), classical-to-quantum (C2Q) encoding, and numerical instantiation, while the second employs FDM, C2Q, and column-by-column decomposition (CCD). Both variants are designed to enhance accuracy and scalability while reducing execution times. We have validated and evaluated our proposed concepts using a number of case studies, including the multidimensional Poisson equation, the multidimensional heat equation, the Black-Scholes equation, and the Navier-Stokes equation for computational fluid dynamics (CFD), achieving promising results. Our results demonstrate higher accuracy, higher scalability, and faster execution times compared to VQE-based solvers on noise-free and noisy quantum simulators from IBM. Additionally, we validated our approach on hardware emulators and actual quantum hardware, employing noise-mitigation techniques. This work establishes a practical and effective approach for solving PDEs using quantum computing for engineering and scientific applications.
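
As background for the finite-difference component that both variants build on, a brief classical sketch follows: it discretizes the 1-D Poisson equation and solves the resulting linear system directly. The C2Q encoding and CCD steps that make the solver quantum are specific to this work and are not reproduced here; the grid size and right-hand side below are illustrative.

```python
import numpy as np

# Classical FDM baseline for -u''(x) = f(x) on (0, 1) with u(0) = u(1) = 0.
n = 127                        # number of interior grid points (illustrative)
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)
f = np.sin(np.pi * x)          # example right-hand side

# Tridiagonal FDM matrix (1/h^2) * tridiag(-1, 2, -1) approximates -u''.
A = (np.diag(2.0 * np.ones(n))
     - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2
u = np.linalg.solve(A, f)

# Exact solution of -u'' = sin(pi x) is sin(pi x) / pi^2; check the error.
print(np.max(np.abs(u - np.sin(np.pi * x) / np.pi**2)))
```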


Alex Manley

Taming Complexity in Computer Architecture through Modern AI-Assisted Design and Education

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Heechul Yun, Chair
Tamzidul Hoque
Prasad Kulkarni
Mohammad Alian

Abstract

The escalating complexity inherent in modern computer architecture presents significant challenges for both professional hardware designers and students striving to gain foundational understanding. Historically, the steady improvement of computer systems was driven by transistor scaling, predictable performance increases, and relatively straightforward architectural paradigms. However, with the end of traditional scaling laws and the rise of heterogeneous and parallel architectures, designers now face unprecedented intricacies involving power management, thermal constraints, security considerations, and sophisticated software interactions. Prior tools and methodologies, often reliant on complex, command-line-driven simulations, exacerbate these challenges by introducing steep learning curves, creating a critical need for more intuitive, accessible, and efficient solutions. To address these challenges, this thesis introduces two innovative, modern tools.

The first tool, SimScholar, provides an intuitive graphical user interface (GUI) built upon the widely used gem5 simulator. SimScholar significantly simplifies the simulation process, enabling students and educators to engage more effectively with architectural concepts through a visually guided environment, both reducing complexity and enhancing conceptual understanding. Supporting SimScholar, the gem5 Extended Modules API (gEMA) offers streamlined backend integration with gem5, ensuring efficient communication, modularity, and maintainability.

The second contribution, gem5 Co-Pilot, delivers an advanced framework for architectural design space exploration (DSE). Co-Pilot integrates cycle-accurate simulation via gem5, detailed power and area modeling through McPAT, and intelligent optimization assisted by a large language model (LLM). Central to Co-Pilot is the Design Space Declarative Language (DSDL), a Python-based domain-specific language that facilitates structured, clear specification of design parameters and constraints.
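
Since the abstract does not reproduce DSDL's syntax, the sketch below shows a generic, hypothetical way a Python-based front-end might declare a parameter space with constraints and enumerate the candidate design points that would be handed to gem5 and McPAT runs; all names and values are illustrative, not DSDL itself.

```python
import itertools

# Hypothetical design-space declaration in the spirit of a Python-based DSL;
# parameters, values, and the constraint are illustrative only.
design_space = {
    "l2_size_kb":     [256, 512, 1024],
    "l2_assoc":       [4, 8, 16],
    "core_clock_ghz": [1.0, 2.0, 3.0],
}
# Illustrative constraint: require at least 32 KB of cache per way.
constraints = [lambda p: p["l2_size_kb"] // p["l2_assoc"] >= 32]

def enumerate_points(space, constraints):
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        point = dict(zip(keys, values))
        if all(c(point) for c in constraints):
            yield point   # each surviving point would drive a gem5/McPAT run

for p in enumerate_points(design_space, constraints):
    print(p)
```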

Collectively, these tools constitute a comprehensive approach to taming complexity in computer architecture, offering powerful, user-friendly solutions tailored to both educational and professional settings.


Past Defense Notices


ANIRUDH NARASIMMAN

Arcana: Private Tweets on a Public Microblog Platform

When & Where:


250 Nichols Hall

Committee Members:

Bo Luo, Chair
Luke Huan
Prasad Kulkarni


Abstract

As one of the world's most famous online social networks (OSNs), Twitter now has 320 million monthly active users. With such a large user base and abundant personal information, users increasingly recognize the vulnerability of tweets and have reservations about showing certain tweets to different follower groups, such as colleagues, friends, and other followers. However, Twitter does not offer sufficient privacy protection or access control functions. Users can only set an account as protected, so that only the user's followers see its tweets; protected tweets do not appear in the public domain, and third-party sites and search engines cannot access them. However, a protected account cannot distinguish between different follower groups or users who use multiple accounts. To let users restrict access to each tweet to certain follower groups, we propose a browser plug-in system that utilizes CP-ABE (ciphertext-policy attribute-based encryption), allowing the user to select followers based on predefined attributes. Through simple installation and pre-setting, the user can encrypt and decrypt tweets conveniently and avoid the fear of information leakage.
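
To make the access model concrete, the sketch below shows the policy-satisfaction logic that CP-ABE enforces: a tweet is encrypted under a policy over follower attributes, and only a follower whose attribute set satisfies the policy can decrypt. Only the access-control logic is modeled here; the actual scheme enforces this cryptographically, and the attribute names are hypothetical.

```python
from typing import Callable, FrozenSet

Policy = Callable[[FrozenSet[str]], bool]

def policy_and(*parts: Policy) -> Policy:
    return lambda attrs: all(p(attrs) for p in parts)

def policy_or(*parts: Policy) -> Policy:
    return lambda attrs: any(p(attrs) for p in parts)

def has(attr: str) -> Policy:
    return lambda attrs: attr in attrs

# Hypothetical policy: visible to close friends, or to colleagues at KU.
tweet_policy = policy_or(has("friend:close"),
                         policy_and(has("colleague"), has("org:KU")))

print(tweet_policy(frozenset({"colleague", "org:KU"})))   # True  -> can decrypt
print(tweet_policy(frozenset({"colleague"})))             # False -> cannot
```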


PRATHAP KUMAR VALSAN

Towards Achieving Predictable Memory Performance on Multi-core Based Mixed Criticality Embedded Systems

When & Where:


250 Nichols Hall

Committee Members:

Heechul Yun, Chair
Esam El-Araby
Prasad Kulkarni


Abstract

The shared resources in multi-core systems, mainly the memory subsystem (caches and DRAM), if not managed properly, affect the predictability of real-time tasks in the presence of co-runners. In this work, we first studied the design of COTS DRAM controllers and their impact on predictability, and proposed a DRAM controller design, called MEDUSA, to provide predictable memory performance in multi-core based real-time systems. In our approach, the OS partially partitions DRAM banks into reserved banks and shared banks. The reserved banks are exclusive to each core to provide predictable timing, while the shared banks are shared by all cores to efficiently utilize the resources. MEDUSA has two separate queues for read and write requests, and it prioritizes reads over writes. In processing read requests, MEDUSA employs a two-level scheduling algorithm that prioritizes the memory requests to the reserved banks in a round-robin fashion to provide strong timing predictability. In processing write requests, MEDUSA largely relies on FR-FCFS for high throughput. We implemented MEDUSA in a cycle-accurate full-system simulator. The results show that MEDUSA achieves up to 91% better worst-case performance for real-time tasks while achieving up to 29% throughput improvement for non-real-time tasks.
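
As a rough illustration of the two-level read scheduling described above, the following simplified model serves reserved-bank reads round-robin across cores and falls back to an FR-FCFS-like order for shared-bank reads. It is a behavioral sketch only; the queue structures, request fields, and row-buffer model are illustrative, not MEDUSA's implementation.

```python
from collections import deque

class ReadScheduler:
    """Simplified two-level read scheduler: round-robin over per-core
    reserved-bank queues, then FR-FCFS-like service of shared-bank reads."""

    def __init__(self, num_cores):
        self.reserved = [deque() for _ in range(num_cores)]  # per-core queues
        self.shared = deque()        # (arrival_time, row, bank) tuples
        self.rr = 0                  # round-robin pointer
        self.open_row = {}           # bank -> currently open row

    def add_reserved(self, core, req):
        self.reserved[core].append(req)

    def add_shared(self, req):
        self.shared.append(req)

    def next_request(self):
        # Level 1: round-robin across cores with pending reserved-bank reads.
        for i in range(len(self.reserved)):
            core = (self.rr + i) % len(self.reserved)
            if self.reserved[core]:
                self.rr = core + 1
                return self.reserved[core].popleft()
        # Level 2: FR-FCFS-like order for shared-bank reads
        # (oldest row-buffer hit first, else the oldest request).
        if self.shared:
            hit = min((r for r in self.shared
                       if self.open_row.get(r[2]) == r[1]),
                      default=None, key=lambda r: r[0])
            req = hit if hit is not None else self.shared[0]
            self.shared.remove(req)
            self.open_row[req[2]] = req[1]
            return req
        return None
```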

Second, we studied contention at shared caches and its impact on predictability. We demonstrate that the prevailing cache partitioning techniques do not necessarily ensure predictable cache performance in modern COTS multi-core platforms that use non-blocking caches to exploit memory-level parallelism (MLP). Through carefully designed experiments using three real COTS multi-core platforms (four distinct CPU architectures) and a cycle-accurate full-system simulator, we show that special hardware registers in non-blocking caches, known as Miss Status Holding Registers (MSHRs), which track the status of outstanding cache misses, can be a significant source of contention. We propose a hardware and system software (OS) collaborative approach to efficiently eliminate MSHR contention for multi-core real-time systems. We implement the hardware extension in a cycle-accurate full-system simulator and the scheduler modification in the Linux 3.14 kernel. In a case study, we achieve up to 19% WCET reduction (average: 13%) for a set of EEMBC benchmarks compared to a baseline cache partitioning setup.


LEI SHI

Multichannel Sense-and-Avoid Radar for Small UAVs

When & Where:


2001B Eaton Hall

Committee Members:

Chris Allen, Chair
Glenn Prescott
Jim Stiles
Heechul Yun
Lisa Friis

Abstract

This dissertation investigates the feasibility of creating a multichannel sense-and-avoid radar system for small fixed-wing unmanned aerial vehicles (UAVs, also known as sUAS or drones). These aircraft are projected to have a significant impact on the U.S. economy in both the commercial and government sectors; however, their lack of situational awareness has caused the FAA to strictly limit their use. Through this dissertation, a miniature, multichannel FMCW radar system was created with a small enough size, weight, and power (SWaP) to be mounted onboard a sUAS and provide in-flight target detection. The primary hazards to avoid are general aviation (GA) aircraft, such as a Cessna 172, which was estimated to have a radar cross section (RCS) of approximately 1 square meter. The radar system is capable of locating potential hazards in range, Doppler, and 3-dimensional space using a patent-pending 2-D FFT process and interferometry. The initial prototype system has a detection range of approximately 800 m with 360-degree azimuth coverage and +/- 15-degree elevation coverage, and it draws less than 20 W. From the radar data, target detection, tracking, and extrapolation of target behavior in six degrees of freedom were demonstrated.
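
The core range-Doppler step can be sketched generically: an FFT across fast time (within a chirp) resolves range, and an FFT across slow time (chirp to chirp) resolves Doppler. The patent-pending 2-D FFT process and interferometric angle estimation in this work go beyond the basic form below, which uses illustrative synthetic data.

```python
import numpy as np

def range_doppler_map(iq, window=True):
    """iq: complex samples of shape (n_chirps, n_samples_per_chirp)."""
    n_chirps, n_samples = iq.shape
    if window:
        # Taper both dimensions to suppress FFT sidelobes.
        iq = iq * np.hanning(n_samples)[None, :] * np.hanning(n_chirps)[:, None]
    rd = np.fft.fft(iq, axis=1)                            # fast time -> range
    rd = np.fft.fftshift(np.fft.fft(rd, axis=0), axes=0)   # slow time -> Doppler
    return 20 * np.log10(np.abs(rd) + 1e-12)               # magnitude in dB

# Synthetic single target: constant beat frequency plus chirp-to-chirp phase drift.
m = np.arange(64)[:, None]    # chirp index (slow time)
n = np.arange(256)[None, :]   # sample index (fast time)
iq = np.exp(2j * np.pi * (0.2 * m + 0.1 * n))
print(range_doppler_map(iq).shape)   # (64, 256): Doppler bins x range bins
```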


RANJITH SOMPALLI

Implementation of Invertebrate Paleontology Knowledge base using Integration of Textual Ontology & Visual Features

When & Where:


2001B Eaton Hall

Committee Members:

Bo Luo, Chair
Jerzy Grzymala-Busse
Richard Wang


Abstract

The Treatise on Invertebrate Paleontology is the most authoritative compilation of the invertebrate fossil record. The quality of studies in paleontology depends in particular on the accessibility of fossil data. Unfortunately, the PDF version of the Treatise currently available is just a scanned copy of the paper publications, and the content is in no way organized to facilitate search and knowledge discovery. This project builds an information-retrieval-based system to extract the fossil descriptions, images, and other available information from the Treatise. The project is divided into two parts. The first part deals with extracting the text and images from the Treatise, organizing the information in a structured format, storing it in a relational database, and building a search engine to browse the fossil data. Extracting text requires identifying common textual patterns, so a text-parsing algorithm was developed to recognize the patterns and organize the information in a structured format. Images are extracted using image-processing techniques such as segmentation and morphological operations, and are then associated with the corresponding textual descriptions. A search engine was built to efficiently browse the extracted information, and the web interface provides options to perform many useful tasks with ease. The second part of this research focuses on the implementation of a content-based information retrieval system. All images from the Treatise are grayscale fossil images, and identifying matching images based on visual features alone is a very difficult task. Hence, we employed an approach that integrates textual and visual features to identify matching images. Textual features are extracted from the fossil descriptions; using statistical and part-of-speech tagging approaches, an ontology is generated that forms attribute-property pairs describing how each region of a shell looks. Popular image features, such as SIFT, GIST, and HOG, are extracted from the fossil images. The textual and image features are then integrated to retrieve the information related to the fossil image matching the query image.
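
As an illustration of the text-visual integration idea, the sketch below represents each fossil entry by a TF-IDF vector of its description concatenated with an HOG descriptor of its plate image, then ranks candidates by cosine similarity. The feature choices, weighting, and equal-size-image assumption are illustrative; the project's ontology-based textual features and its SIFT/GIST components are not reproduced here.

```python
import numpy as np
from skimage.feature import hog
from sklearn.feature_extraction.text import TfidfVectorizer

def build_index(descriptions, images, text_weight=0.5):
    """descriptions: list of strings; images: grayscale arrays of equal shape
    (equal shapes keep HOG vectors the same length)."""
    vec = TfidfVectorizer()
    T = vec.fit_transform(descriptions).toarray()
    H = np.array([hog(img, orientations=9, pixels_per_cell=(16, 16))
                  for img in images])
    # Unit-normalize each modality, then weight and concatenate.
    T /= np.linalg.norm(T, axis=1, keepdims=True) + 1e-12
    H /= np.linalg.norm(H, axis=1, keepdims=True) + 1e-12
    return vec, np.hstack([text_weight * T, (1 - text_weight) * H])

def rank(query_vec, index):
    """Return entry indices sorted by cosine similarity to the query vector."""
    sims = (index @ query_vec) / (np.linalg.norm(index, axis=1)
                                  * np.linalg.norm(query_vec) + 1e-12)
    return np.argsort(-sims)
```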


NAGABHUSHANA GARGESHWARI MAHADEVASWAMY

How Duplicates Affect the Error Rate of Data Sets During Validation

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Bo Luo


Abstract

In data mining, duplicate data can play a large role in determining the induced rule set. In this project, we analyze the impact of duplicates in the input data set on the rule set. The effect of duplicates is measured by the error rate, which is calculated by comparing the obtained rule set against the testing part of the input data. The experiments show that the error rate decreases as the percentage of duplicates in the input data set increases, which demonstrates that duplicate data plays a crucial role in the validation process of machine learning. The LEM2 algorithm and a rule-checker application were implemented as part of the project: the LEM2 algorithm induces the rule set for a given input data set, and the rule-checker application calculates the error rate.
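
The rule-checker logic lends itself to a compact sketch: apply the induced rule set to each testing case and report the fraction misclassified. The rule and case representations below are illustrative; LEM2 itself is the rule-induction step and is not reproduced here.

```python
def classify(rules, case):
    """rules: list of ({attribute: required_value, ...}, decision) pairs,
    tried in order; case: {attribute: value, ...}. Returns a decision or None."""
    for conditions, decision in rules:
        if all(case.get(a) == v for a, v in conditions.items()):
            return decision
    return None

def error_rate(rules, test_cases):
    """test_cases: list of (attributes, actual_decision) pairs."""
    wrong = sum(1 for case, actual in test_cases
                if classify(rules, case) != actual)
    return wrong / len(test_cases)

# Toy example with hypothetical attributes and decisions.
rules = [({"temp": "high"}, "flu"), ({"temp": "normal"}, "healthy")]
cases = [({"temp": "high"}, "flu"), ({"temp": "normal"}, "flu")]
print(error_rate(rules, cases))   # 0.5
```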


GOWTHAM GOLLA

Developing Novel Machine Learning Algorithms to Improve Sedentary Assessment for Youth Health Enhancement

When & Where:


2001B Eaton Hall

Committee Members:

Luke Huan, Chair
Jerzy Grzymala-Busse
Jordan Carlson


Abstract

Sedentary behavior in youth is an important determinant of health. However, better measures are needed to improve understanding of this relationship and the mechanisms at play, as well as to evaluate health-promotion interventions. Even though wearable devices such as accelerometers (e.g., activPAL) are considered the standard for assessing physical activity in research, the machine learning algorithms that we propose will allow us to re-examine existing accelerometer data to better understand the association between sedentary time and health in various populations. To achieve this, we collected two datasets: a laboratory-controlled dataset and a free-living dataset. We trained machine learning classifiers on both datasets and compared their behaviors. The classifiers predict five postures: sit, stand, sit-stand, stand-sit, and stand/walk. We also compared a manually constructed hidden Markov model (HMM) with an automatically generated HMM from existing software on both datasets, to better understand the algorithm and the existing software. When tested on the laboratory-controlled and free-living datasets, the manually constructed HMM achieved a higher macro-F1 score.
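
For readers unfamiliar with the manually constructed HMM approach, the sketch below shows Viterbi decoding over the five posture states, the core of such a model: it smooths a sequence of per-window classifier scores into the most likely posture sequence. The transition, start, and emission inputs would come from training; none of this project's fitted values are reproduced here.

```python
import numpy as np

STATES = ["sit", "stand", "sit-stand", "stand-sit", "stand/walk"]

def viterbi(obs_loglik, log_trans, log_start):
    """Most likely state path given per-window emission log-likelihoods.

    obs_loglik: (T, S) array; log_trans: (S, S) with log_trans[i, j] the
    log-probability of moving from state i to state j; log_start: (S,).
    """
    T, S = obs_loglik.shape
    dp = np.empty((T, S))
    back = np.zeros((T, S), dtype=int)
    dp[0] = log_start + obs_loglik[0]
    for t in range(1, T):
        scores = dp[t - 1][:, None] + log_trans       # previous state -> next
        back[t] = np.argmax(scores, axis=0)
        dp[t] = scores[back[t], np.arange(S)] + obs_loglik[t]
    path = [int(np.argmax(dp[-1]))]                   # backtrack best path
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [STATES[s] for s in reversed(path)]
```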


RITANKAR GANGULY

Graph Search Algorithms and Their Applications

When & Where:


2001B Eaton Hall

Committee Members:

Man Kong, Chair
Nancy Kinnersley
Jim Miller


Abstract

Depth-First Search (DFS) and Breadth-First Search (BFS) are two of the most extensively used graph traversal algorithms for compiling information about a graph in linear time. These two traversal mechanisms underpin applications that are widely used in network engineering, web analytics, social networking, postal services, and hardware implementations. DFS and BFS differ in the order in which they explore vertices and in the data structure used to store discovered but unprocessed vertices. A BFS implementation usually needs less time but consumes more memory than a DFS implementation. DFS is based on a LIFO mechanism and is implemented using a stack; BFS is based on a FIFO technique and is realized using a queue. The order in which the vertices are visited by DFS or BFS can be represented as a tree, and the type of graph (directed or undirected) together with the edges of these trees forms the basis of all the applications of BFS and DFS. Determining the shortest path between vertices of an unweighted graph can be used in network engineering to transfer data packets. Checking for the presence of a cycle can be critical in minimizing redundancy in telecommunications and is extensively used by social networking websites to analyze how people are connected. Finding bridges or articulation vertices in a graph helps minimize vulnerability in network design. Finding the strongly connected components of a graph is used by model checkers in computer science. Determining an Euler circuit in a graph is useful in the postal service industry, and the algorithm can be implemented in linear running time using enhanced data structures. This survey project briefly defines and explains the basics of DFS and BFS traversal and explores some of the applications that are based on these algorithms.
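
The stack/queue contrast above can be made concrete with a standard textbook sketch in which the two traversals share a skeleton and differ only in the container that holds discovered but unprocessed vertices:

```python
from collections import deque

def dfs(graph, start):
    seen, order, stack = {start}, [], [start]
    while stack:
        v = stack.pop()                 # LIFO: go deep first
        order.append(v)
        for w in reversed(graph[v]):    # reversed keeps left-to-right order
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return order

def bfs(graph, start):
    seen, order, queue = {start}, [], deque([start])
    while queue:
        v = queue.popleft()             # FIFO: visit level by level
        order.append(v)
        for w in graph[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return order

g = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(dfs(g, "A"))   # ['A', 'B', 'D', 'C']
print(bfs(g, "A"))   # ['A', 'B', 'C', 'D']
```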


MICHAEL BLECHA

Implementation of a 2.45GHz Power Amplifier for use in Collision Avoidance Radar

When & Where:


2001B Eaton Hall

Committee Members:

Chris Allen, Chair
Glenn Prescott
Jim Stiles


Abstract

The integration of an RF power amplifier into a collision-avoidance radar will increase the maximum detection distance of the radar. Increasing the maximum detection distance will allow a radar system mounted on an unmanned aerial vehicle (UAV) to observe obstacles earlier and give the UAV more time to react. The UAVradars project has been miniaturized to support operation on an unmanned aircraft and could benefit from an increase in maximum detection distance.
The goal of this project is to create a one-watt power amplifier for the 2.4-2.5 GHz band that can be integrated into the UAVradars project. The amplifier will be powered from existing power supplies in the radar system and must be small and lightweight to support operation on board the UAV in flight. The project will consist of the schematic and layout design, simulation, fabrication, and characterization of the power amplifier. The power amplifier will be designed to fit into the current system with minimal system modifications required.
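
For context on why amplifier output power matters, the standard (textbook) radar range equation relates maximum detection range to transmitted power; no project-specific values are assumed here:

```latex
% Standard radar range equation:
%   P_t: transmit power, G: antenna gain, \lambda: wavelength,
%   \sigma: target RCS, S_{\min}: minimum detectable signal power.
\[
  R_{\max} = \left( \frac{P_t\, G^2\, \lambda^2\, \sigma}{(4\pi)^3\, S_{\min}} \right)^{1/4},
  \qquad
  \frac{R_{\max,2}}{R_{\max,1}} = \left( \frac{P_{t,2}}{P_{t,1}} \right)^{1/4}.
\]
```

Because range grows only with the fourth root of transmit power, even a tenfold power increase extends detection range by roughly a factor of 1.78.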


HARSHUL ROUTHU

A Comparison of Two Decision Tree Generating Algorithms C4.5 and CART Based on Testing Datasets with Missing Attribute Values

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Bo Luo


Abstract

In data mining, missing data are a common occurrence and can have a significant effect on the conclusions that can be drawn from the data. Classification of data with missing values is a challenging task, and one of the most popular techniques for it is decision tree induction.
In this project, we compare two decision-tree-generating algorithms, C4.5 and CART, using their original implementations on different datasets with missing attribute values taken from the University of California Irvine (UCI) repository. The comparative analysis of the two implementations is carried out in terms of accuracy on training and testing data, and in terms of decision tree complexity, based on tree depth and size. The experiments show a statistically insignificant difference between C4.5 and CART in terms of accuracy on testing data and complexity of the decision tree. On the other hand, accuracy on training data is significantly better for CART than for C4.5.
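
The accuracy and complexity measures used in the comparison can be illustrated with scikit-learn, whose DecisionTreeClassifier implements an optimized CART-style algorithm (C4.5 has no scikit-learn counterpart, and the project itself used the original implementations). The dataset below is a stand-in with no missing values:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X_tr, y_tr)
print("train accuracy:", tree.score(X_tr, y_tr))   # typically 1.0 (overfitting)
print("test accuracy: ", tree.score(X_te, y_te))
# Complexity measures analogous to the depth/size comparison above:
print("depth:", tree.get_depth(), "nodes:", tree.tree_.node_count)
```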


HADEEL ALABANDI

A Survey of Metrics Employed to Assess Software Security

When & Where:


246 Nichols Hall

Committee Members:

Prasad Kulkarni, Chair
Andy Gill
Heechul Yun


Abstract

Measuring and assessing software security is a critical concern, as it is undesirable to develop risky and insecure software. Various measurement approaches and metrics have been defined to assess software security. For researchers and software developers, it is valuable to have the different metrics and measurement models in one place, whether to evaluate existing measurement approaches, to compare two or more metrics, or to find the proper metric for measuring software security at a specific software development phase. No existing survey of software security metrics covers the metrics available at all software development phases. In this paper, we present a survey of metrics used to assess and measure software security, categorized by software development phase. Our findings reveal a critical lack of automated tools, and the necessity of detailed knowledge or experience of the measured software, as the major hindrances to the use of existing software security metrics.