Defense Notices


All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check and to post the presentation announcement online.

Upcoming Defense Notices

Soumya Baddham

Battling Toxicity: A Comparative Analysis of Machine Learning Models for Content Moderation

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Prasad Kulkarni
Hongyang Sun


Abstract

With the exponential growth of user-generated content, online platforms face unprecedented challenges in moderating toxic and harmful comments. As a result, automated content moderation has emerged as a critical application of machine learning, enabling platforms to ensure user safety and maintain community standards. Despite its importance, challenges such as severe class imbalance, contextual ambiguity, and the diverse nature of toxic language often compromise moderation accuracy, leading to biased classification performance.

This project presents a comparative analysis of machine learning approaches for a Multi-Label Toxic Comment Classification System using the Toxic Comment Classification dataset from Kaggle. The study examines the performance of traditional algorithms, such as Logistic Regression, Random Forest, and XGBoost, alongside deep architectures, including Bi-LSTM, CNN-Bi-LSTM, and DistilBERT. The proposed approach utilizes word-level embeddings across all models and examines the effects of architectural enhancements, hyperparameter optimization, and advanced training strategies on model robustness and predictive accuracy.

The study emphasizes the significance of loss function optimization and threshold adjustment strategies in improving the detection of minority classes. The comparative results reveal distinct performance trade-offs across model architectures: transformer models achieve superior contextual understanding at the cost of computational complexity, while the recurrent deep learning approaches (LSTM-based models) offer efficiency advantages. These findings establish evidence-based guidelines for model selection in real-world content moderation systems, striking a balance between accuracy requirements and operational constraints.
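
As a concrete reference point for the comparison above, the following is a minimal sketch (not the project's actual code) of one traditional baseline with per-label threshold adjustment: a TF-IDF one-vs-rest logistic regression in scikit-learn. The file path and column names are assumptions based on the public Kaggle dataset layout.

    # Minimal sketch: TF-IDF + one-vs-rest logistic regression with per-label
    # threshold tuning, one simple baseline from the families compared above.
    # "train.csv" and the column names assume the Kaggle dataset layout.
    import numpy as np
    import pandas as pd
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import f1_score
    from sklearn.model_selection import train_test_split

    LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

    df = pd.read_csv("train.csv")                      # hypothetical local path
    X_train, X_val, y_train, y_val = train_test_split(
        df["comment_text"], df[LABELS], test_size=0.2, random_state=42)

    vec = TfidfVectorizer(max_features=50_000, ngram_range=(1, 2))
    Xtr, Xva = vec.fit_transform(X_train), vec.transform(X_val)

    for label in LABELS:
        clf = LogisticRegression(max_iter=1000, class_weight="balanced")
        clf.fit(Xtr, y_train[label])
        probs = clf.predict_proba(Xva)[:, 1]
        # Threshold adjustment: pick the cut-off that maximizes F1 for this
        # label instead of the default 0.5, which helps rare (minority) classes.
        thresholds = np.linspace(0.05, 0.95, 19)
        best = max(thresholds,
                   key=lambda t: f1_score(y_val[label], (probs >= t).astype(int)))
        print(f"{label}: threshold={best:.2f}, "
              f"F1={f1_score(y_val[label], (probs >= best).astype(int)):.3f}")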


Manu Chaudhary

Utilizing Quantum Computing for Solving Multidimensional Partial Differential Equations

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Esam El-Araby, Chair
Perry Alexander
Tamzidul Hoque
Prasad Kulkarni
Tyrone Duncan

Abstract

Quantum computing has the potential to revolutionize computational problem-solving by leveraging the quantum mechanical phenomena of superposition and entanglement, which allow a large amount of information to be processed simultaneously. This capability is significant for the numerical solution of complex and/or multidimensional partial differential equations (PDEs), which are fundamental to modeling various physical phenomena. Many quantum techniques are currently available for solving PDEs, mainly based on variational quantum circuits. However, the existing quantum PDE solvers, particularly those based on variational quantum eigensolver (VQE) techniques, suffer from several limitations. These include low accuracy, high execution times, and low scalability on quantum simulators as well as on noisy intermediate-scale quantum (NISQ) devices, especially for multidimensional PDEs.

In this work, we propose an efficient and scalable algorithm for solving multidimensional PDEs. We present two variants of our algorithm: the first leverages the finite-difference method (FDM), classical-to-quantum (C2Q) encoding, and numerical instantiation, while the second employs FDM, C2Q, and column-by-column decomposition (CCD). Both variants are designed to enhance accuracy and scalability while reducing execution times. We have validated and evaluated our proposed concepts using a number of case studies, including the multidimensional Poisson equation, the multidimensional heat equation, the Black-Scholes equation, and the Navier-Stokes equation for computational fluid dynamics (CFD), achieving promising results. Our results demonstrate higher accuracy, higher scalability, and faster execution times compared to VQE-based solvers on noise-free and noisy quantum simulators from IBM. Additionally, we validated our approach on hardware emulators and actual quantum hardware, employing noise mitigation techniques. This work establishes a practical and effective approach for solving PDEs using quantum computing for engineering and scientific applications.
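
The quantum encodings themselves are not reproduced here, but the classical finite-difference ingredient that both variants build on can be illustrated. The sketch below (an illustration only, not the authors' implementation) discretizes the 1D Poisson equation into the tridiagonal linear system that a C2Q-style encoding would load onto a quantum state.

    # Minimal classical sketch of the FDM ingredient: discretize the 1D
    # Poisson equation -u''(x) = f(x), u(0) = u(1) = 0, into a tridiagonal
    # system A u = h^2 f. A quantum PDE solver would encode this system
    # rather than solving it classically; this snippet only shows the
    # discretization and a classical reference solution.
    import numpy as np

    n = 8                                   # interior grid points
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1 - h, n)
    f = np.sin(np.pi * x)                   # example source term

    # Second-order central-difference Laplacian (tridiagonal matrix).
    A = (np.diag(2.0 * np.ones(n))
         - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1))

    u = np.linalg.solve(A, h**2 * f)        # classical reference solution
    exact = np.sin(np.pi * x) / np.pi**2    # analytic solution of -u'' = sin(pi x)
    print("max abs error:", np.max(np.abs(u - exact)))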


Alex Manley

Taming Complexity in Computer Architecture through Modern AI-Assisted Design and Education

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Heechul Yun, Chair
Tamzidul Hoque
Prasad Kulkarni
Mohammad Alian

Abstract

The escalating complexity inherent in modern computer architecture presents significant challenges for both professional hardware designers and students striving to gain foundational understanding. Historically, the steady improvement of computer systems was driven by transistor scaling, predictable performance increases, and relatively straightforward architectural paradigms. However, with the end of traditional scaling laws and the rise of heterogeneous and parallel architectures, designers now face unprecedented intricacies involving power management, thermal constraints, security considerations, and sophisticated software interactions. Prior tools and methodologies, often reliant on complex, command-line driven simulations, exacerbate these challenges by introducing steep learning curves, creating a critical need for more intuitive, accessible, and efficient solutions. To address these challenges, this thesis introduces two innovative, modern tools.

The first tool, SimScholar, provides an intuitive graphical user interface (GUI) built upon the widely used gem5 simulator. SimScholar significantly simplifies the simulation process, enabling students and educators to engage more effectively with architectural concepts through a visually guided environment, both reducing complexity and enhancing conceptual understanding. Supporting SimScholar, the gem5 Extended Modules API (gEMA) offers streamlined backend integration with gem5, ensuring efficient communication, modularity, and maintainability.

The second contribution, gem5 Co-Pilot, delivers an advanced framework for architectural design space exploration (DSE). Co-Pilot integrates cycle-accurate simulation via gem5, detailed power and area modeling through McPAT, and intelligent optimization assisted by a large language model (LLM). Central to Co-Pilot is the Design Space Declarative Language (DSDL), a Python-based domain-specific language that facilitates structured, clear specification of design parameters and constraints.
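
The abstract does not reproduce DSDL's syntax, so the following is a purely hypothetical sketch of what a Python-based design-space declaration with parameters and constraints might look like; every class and field name here is invented for illustration and is not the actual DSDL.

    # Purely illustrative sketch (NOT the actual DSDL syntax): how a
    # Python-based DSL might let a user declare design parameters and
    # constraints for gem5-based design space exploration.
    from dataclasses import dataclass, field
    from itertools import product

    @dataclass
    class DesignSpace:                                   # hypothetical name
        parameters: dict = field(default_factory=dict)   # name -> candidate values
        constraints: list = field(default_factory=list)  # predicates over a point

        def points(self):
            """Enumerate parameter combinations that satisfy every constraint."""
            names = list(self.parameters)
            for values in product(*(self.parameters[n] for n in names)):
                point = dict(zip(names, values))
                if all(c(point) for c in self.constraints):
                    yield point

    space = DesignSpace(
        parameters={
            "l1d_size_kb": [16, 32, 64],
            "l2_size_kb": [256, 512, 1024],
            "cores": [2, 4],
        },
        constraints=[lambda p: p["l2_size_kb"] >= 8 * p["l1d_size_kb"]],
    )

    for p in space.points():
        print(p)   # each point would be handed to gem5/McPAT for evaluation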

Collectively, these tools constitute a comprehensive approach to taming complexity in computer architecture, offering powerful, user-friendly solutions tailored to both educational and professional settings.


Past Defense Notices


HARISH SAMPANGI

Delay Feedback Reservoir (DFR) Design in Neuromorphic Computing Systems and its Application in Wireless Communications

When & Where:


2001B Eaton Hall

Committee Members:

Yang Yi, Chair
Glenn Prescott
Jim Rowland


Abstract

As semiconductor technologies continue to scale further into the nanometer regime, it is important to study how non-traditional computer architectures may be uniquely suited to take advantage of the novel behavior observed in many emerging technologies. A neuromorphic computing system is one such non-traditional architecture. Reservoir computing, a computational paradigm inspired by neural systems, has become increasingly popular for solving a variety of complex recognition and classification problems. The traditional reservoir computing method employs three layers: the input layer, the reservoir, and the output layer. The input layer feeds the input signals to the reservoir via fixed random weighted connections. These weights scale the input presented to the nodes, creating a different input scaling for each input node. The second layer, called the reservoir, usually consists of a large number of randomly connected nonlinear nodes, constituting a recurrent network. Finally, the output weights are extracted from the output layer. In contrast to this traditional approach, the delayed feedback reservoir replaces the entire network of connected nonlinear nodes with a single nonlinear node subjected to delayed feedback. This approach not only drastically simplifies the experimental implementation of artificial neural networks for computing purposes, it also demonstrates the huge computational processing power hidden in even the simplest delay-dynamical system. Previous implementations of reservoir computing using the echo state network have proven effective for channel estimation in wireless Orthogonal Frequency-Division Multiplexing (OFDM) systems. This project aims to verify the performance of the DFR in channel estimation by calculating its bit error rate (BER) and comparing it with other standard techniques such as least squares (LS) and minimum mean square error (MMSE) estimation.
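
To make the single-node idea concrete, here is a minimal numpy sketch (illustrative parameter values, not the project's design) of a delayed feedback reservoir: one nonlinear node time-multiplexed over N virtual nodes by a random input mask, with only a linear readout being trained.

    # Minimal sketch of a delay feedback reservoir (DFR): a single nonlinear
    # node with delayed feedback, time-multiplexed over N "virtual nodes".
    import numpy as np

    rng = np.random.default_rng(0)
    N = 50                      # virtual nodes per input sample (delay length)
    eta, gamma = 0.5, 0.05      # feedback strength and input scaling
    mask = rng.uniform(-1, 1, N)

    def dfr_states(u):
        """Map a 1-D input sequence u to reservoir states (len(u) x N)."""
        x = np.zeros(N)                      # virtual-node states (delay line)
        states = np.empty((len(u), N))
        for k, u_k in enumerate(u):
            for i in range(N):
                prev = x[i - 1] if i > 0 else x[N - 1]   # delayed feedback term
                x[i] = np.tanh(gamma * mask[i] * u_k + eta * prev)
            states[k] = x.copy()
        return states

    # Readout: only the linear output weights are trained, as in standard
    # reservoir computing.
    u = rng.standard_normal(200)
    target = np.roll(u, 1)                   # toy task: recall the previous input
    S = dfr_states(u)
    w, *_ = np.linalg.lstsq(S, target, rcond=None)
    print("training MSE:", np.mean((S @ w - target) ** 2))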


AUDREY SEYBERT

Analysis of Artifacts Inherent to Real-Time Radar Target Emulation

When & Where:


246 Nichols Hall

Committee Members:

Chris Allen, Chair
Shannon Blunt
Jim Stiles


Abstract

Executing high-fidelity tests of radar hardware requires real-time, fixed-latency target emulation. Because fundamental radar measurements occur in the time domain, real-time, fixed-latency target emulation is essential to producing an accurate representation of a radar environment. Radar test equipment is further constrained by the application-specific minimum delay to a target of interest, a parameter that limits the maximum latency through the target emulator algorithm. These time constraints on radar target emulation result in imperfect DSP algorithms that generate spectral artifacts. Knowledge of the behavior and predictability of these spectral artifacts is the key to identifying whether a particular suite of hardware is sufficient to execute tests for a particular radar design. This work presents an analysis of the design considerations required for development of a digital radar target emulator. Further considerations include how the spectral artifacts inherent to the algorithms change with respect to the radar environment, and an analysis of how effectively various DSP algorithms can produce an accurate representation of simple target scenarios. This work presents a model representative of natural target motion, a model representative of the side effects of digital target emulation, and finally a true HDL simulation of a target.
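
For readers unfamiliar with the core operation being analyzed, the sketch below shows the basic DSP of a digital target emulator, i.e., delaying and Doppler-shifting a sampled waveform. All parameter values are illustrative, and the fractional-delay filtering that produces many of the artifacts discussed above is deliberately omitted; this is not the thesis's HDL design.

    # Illustrative sketch: emulate a point target by applying a round-trip
    # delay (integer samples only), a Doppler shift, and amplitude scaling
    # to a complex baseband transmit waveform.
    import numpy as np

    fs = 100e6                      # sample rate (Hz), illustrative
    c = 3e8
    target_range = 1500.0           # meters
    target_velocity = 50.0          # m/s (closing)
    fc = 10e9                       # carrier frequency (Hz)
    rcs_scale = 0.1                 # amplitude scaling for target reflectivity

    def emulate_target(tx):
        """Return the emulated echo for a complex baseband transmit waveform."""
        delay_samples = int(round(2 * target_range / c * fs))
        fd = 2 * target_velocity / c * fc               # Doppler frequency (Hz)
        n = np.arange(len(tx) + delay_samples)
        echo = np.zeros(len(n), dtype=complex)
        echo[delay_samples:] = rcs_scale * tx           # round-trip delay
        return echo * np.exp(2j * np.pi * fd * n / fs)  # Doppler shift

    tx = np.exp(1j * np.pi * 1e12 * (np.arange(1000) / fs) ** 2)  # simple LFM chirp
    rx = emulate_target(tx)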


CHRISTOPHER SEASHOLTZ

Security and Privacy Vulnerabilities in Unmanned Aerial Vehicles

When & Where:


246 Nichols Hall

Committee Members:

Bo Luo, Chair
Joe Evans
Fengjun Li


Abstract

In the past few years, UAVs have become very popular among average citizens. Much like their military counterparts, these UAVs can be controlled by computers instead of a remote controller. While this may not appear to be a major security issue, the information gained from compromising a UAV can be used for other malicious activities. To understand the potential attack surfaces of various UAVs, this paper presents the theory behind multiple possible attacks, as well as implementations of a selected subset of them. The main objective of this project was to obtain complete control of a UAV while in flight. Only a few of the attacks demonstrated or mentioned provide this ability on their own; the remaining attacks provide information that can be used in conjunction with others to gain full control of, or complete knowledge about, a system. Once the attacks have been shown to be possible, measures for proper defense must be taken. For each attack described in this paper, possible countermeasures are given and explained.


ARIJIT BASU

Analyzing Bag of Visual Words for Efficient Content Based Image Retrieval and Classification

When & Where:


250 Nichols Hall

Committee Members:

Richard Wang, Chair
Prasad Kulkarni
Bo Luo


Abstract

Content-Based Image Retrieval (CBIR), also known as QBIC (Query by Image Content), is a retrieval technique in which a detailed analysis of the features of an image is performed to retrieve similar images from the image base. Content refers to any kind of information that can be derived from the image itself, such as texture, color, and shape, which are primarily global features, as well as local features such as SIFT, SURF, and HOG. Content-based image retrieval, as opposed to traditional text-based image retrieval, has been in the limelight for quite a while owing to its ability to take much of the annotation burden away from the end user and to help bridge the semantic gap between low-level features and high-level human perception.
Image categorization is the process of classifying distinct image categories based on image features extracted from a subset of images, or from the entire database, for each category, and then feeding them to a machine learning classifier that predicts the category labels. The Bag of Words model is a well-known, flexible model that represents an image as a histogram of visual patches. The idea originally comes from the application of the Bag of Words model in document retrieval and texture classification. Clustering is a very important aspect of the BoW model: it groups similar features from the entire dataset into a visual vocabulary whose histograms are then fed to a Support Vector Machine (SVM) classifier. The SVM classifier takes every image that has been represented as a bag of visual features after clustering and then performs label predictions. In this work we first apply the Bag of Words model to well-known datasets and then obtain evaluation measures such as the confusion matrix, the Matthews Correlation Coefficient (MCC), and other statistics. For feature selection we considered SURF features owing to their rotation- and scale-invariant characteristics. The model has been trained and applied on two well-known datasets, Caltech-101 and Flickr-25K, followed by detailed performance analysis in different scenarios.
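
A minimal sketch of the pipeline described above (not the project's code): local descriptors are clustered into a visual vocabulary with k-means, each image becomes a normalized word histogram, and an SVM predicts the label. SIFT is used here as a freely available stand-in for SURF, which requires the non-free opencv-contrib build; dataset loading is assumed to yield grayscale images and labels.

    # Bag of Visual Words sketch: descriptors -> k-means vocabulary ->
    # per-image histogram -> SVM, evaluated with confusion matrix and MCC.
    import cv2
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.svm import SVC
    from sklearn.metrics import confusion_matrix, matthews_corrcoef

    sift = cv2.SIFT_create()
    K = 200                                   # vocabulary size (illustrative)

    def descriptors(img):
        _, des = sift.detectAndCompute(img, None)
        return des if des is not None else np.empty((0, 128), np.float32)

    def bow_histogram(img, kmeans):
        des = descriptors(img)
        if len(des) == 0:
            return np.zeros(K)
        words = kmeans.predict(des.astype(np.float32))
        hist = np.bincount(words, minlength=K).astype(float)
        return hist / hist.sum()              # L1-normalized word histogram

    def train_and_eval(train_imgs, train_labels, test_imgs, test_labels):
        all_des = np.vstack([descriptors(im) for im in train_imgs])
        kmeans = KMeans(n_clusters=K, n_init=4, random_state=0).fit(all_des)
        Xtr = np.array([bow_histogram(im, kmeans) for im in train_imgs])
        Xte = np.array([bow_histogram(im, kmeans) for im in test_imgs])
        clf = SVC(kernel="rbf", C=10).fit(Xtr, train_labels)
        pred = clf.predict(Xte)
        return confusion_matrix(test_labels, pred), matthews_corrcoef(test_labels, pred)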


SOUMYAJIT SARKAR

Biometric Analysis of Human Ear Recognition Using Traditional Approach

When & Where:


246 Nichols Hall

Committee Members:

Richard Wang, Chair
Jerzy Grzymala-Busse
Bo Luo


Abstract

Biometric ear authentication has gained enormous popularity in recent years due to the uniqueness of the ear for each and every individual, even for identical twins. In this paper, two scale- and rotation-invariant feature detectors, SIFT and SURF, are adopted for the recognition and authentication of ear images. An extensive analysis has been made of how these two descriptors work under certain real-life conditions, and a performance measure is given. The proposed technique is evaluated and compared with other approaches on two data sets. An extensive experimental study demonstrates the effectiveness of the proposed strategy. A robust estimation algorithm has been implemented to remove several false matches, yielding improved results. Deep learning has become a new way to detect features in objects and is also used extensively for recognition purposes. Sophisticated deep learning techniques such as Convolutional Neural Networks (CNNs) have also been implemented and analyzed. Deep learning models need a large amount of data to give good results; unfortunately, publicly available ear datasets are not very large, so CNN simulations are carried out on other state-of-the-art datasets related to this research to evaluate the model.
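
The matching-plus-robust-estimation step can be sketched as follows (illustrative only, with placeholder file names): SIFT keypoints are matched with Lowe's ratio test, and false matches are then discarded by RANSAC while fitting a homography.

    # Sketch: keypoint matching between two ear images with ratio-test
    # filtering and RANSAC-based removal of false matches.
    import cv2
    import numpy as np

    img1 = cv2.imread("ear_probe.png", cv2.IMREAD_GRAYSCALE)    # placeholder paths
    img2 = cv2.imread("ear_gallery.png", cv2.IMREAD_GRAYSCALE)

    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    raw = matcher.knnMatch(des1, des2, k=2)
    good = []
    for pair in raw:
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
            good.append(pair[0])                      # Lowe's ratio test

    if len(good) >= 4:
        src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        # RANSAC keeps only matches consistent with one geometric transform.
        H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        inliers = int(inlier_mask.sum()) if inlier_mask is not None else 0
        print(f"{inliers} inlier matches of {len(good)} after RANSAC")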


RUXIN XIE

Single-Fiber-Laser-Based Multimodal Coherent Raman System

When & Where:


250 Nichols Hall

Committee Members:

Ron Hui, Chair
Chris Allen
Shannon Blunt
Victor Frost
Carey Johnson

Abstract

Coherent Raman scattering (CRS) is an appealing technique for spectroscopy and microscopy due to its selectivity and sensitivity. We designed and built a single-fiber-laser-based coherent Raman scattering spectroscopy and microscopy system that can automatically maintain frequency synchronization between the pump and Stokes beams. The Stokes frequency shift is generated by soliton self-frequency shift (SSFS) through a photonic crystal fiber. The impact of pulse chirping on the signal power reduction of coherent anti-Stokes Raman scattering (CARS) and stimulated Raman scattering (SRS) has been investigated through theoretical analysis and experiment.

Our multimodal system provides measurement diversity among the CARS, SRS, and photothermal modalities, which can be used for comparison and to offer complementary information. The distributions of hemoglobin in human red blood cells and of lipids in sliced mouse brain samples have been imaged. The frequency and power dependence of the photothermal signal is characterized.
Based on the polarization dependence of the third-order susceptibility of the material, the polarization-switched SRS method is able to eliminate the nonresonant photothermal signal from the resonant SRS signal. Red blood cells and sliced mouse brain samples were imaged to demonstrate the capability of the proposed technique. The results show that polarization-switched SRS removes most of the photothermal signal.


MAHITHA DODDALA

Properties of Probabilistic Approximations Applied to Incomplete Data

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Man Kong
Bo Luo


Abstract

The main focus of the project is to discuss the mining of incomplete data, which we find frequently in real-life records. For this, I considered probabilistic approximations, as they have a direct application to mining incomplete data. I examined the results obtained from experiments conducted on eight real-life data sets taken from the University of California at Irvine Machine Learning Repository. I also investigated the properties of singleton, subset, and concept approximations and the corresponding consistencies. The main objective was to compare the global and local approximations and to generalize the consistency definition for incomplete data with two interpretations of missing attribute values: lost values and "do not care" conditions. In addition to this comparison, the most useful approach among singleton, subset, and concept approximations is also tested; the conclusion is that the best approach should be selected with the help of tenfold cross-validation after applying all three approaches. It is also shown that although six types of consistency exist, there are only four distinct consistencies of incomplete data, as two pairs of such consistencies are equivalent.
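
For reference, the probabilistic approximation commonly used in this line of rough-set work (stated here as an assumption, since the abstract does not restate the definition) is parameterized by a threshold α on the conditional probability of a concept X given the characteristic set K(x) of a case x:

    % Probabilistic approximation of a concept X, with threshold alpha in (0, 1];
    % U is the set of all cases and K(x) the characteristic set of case x.
    \operatorname{appr}_{\alpha}(X) \;=\;
      \bigcup \bigl\{\, K(x) : x \in U,\ \Pr(X \mid K(x)) \ge \alpha \,\bigr\},
    \qquad
    \Pr(X \mid K(x)) = \frac{|X \cap K(x)|}{|K(x)|}.

With α = 1 this reduces to the lower approximation, and for sufficiently small positive α it approaches the upper approximation; the singleton, subset, and concept variants discussed above differ in how the characteristic sets are formed and combined for incomplete data.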


ROHIT YADAV

Automatic Text Summarization of Email Corpus Using Importance of Sentences

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Bo Luo


Abstract

With the advent of the Internet, the amount of data added online has been increasing at an enormous rate. Though search engines use information retrieval (IR) techniques to facilitate users' search requests, the results may not always be effective, and the relevance of the results to a given query may not be high. The user may have to go through several web pages before reaching the page he or she needs. This problem of information overload can be addressed using automatic text summarization. Summarization is the process of obtaining an abridged version of documents so that a user can gain a quick understanding of the document. A new technique to produce a summary of an original text is investigated in this project.
Email threads from the World Wide Web Consortium's sites (the W3C corpus) are used in this system. Our system is based on the identification and extraction of important sentences from the input document. Apart from common IR features such as term frequency and inverse document frequency, additional features such as TF-IDF weighting, subject words, sentence position, and thematic words have also been implemented. The model consists of four stages. The pre-processing stage converts unstructured text (content that cannot readily be classified) into a structured form (data that resides in fixed fields within a record or file). In the first stage, each sentence is partitioned into a list of tokens and stop words are removed. The second stage extracts the important key phrases in the text by implementing a new algorithm that ranks the candidate words. The system uses the extracted keywords/key phrases to select the important sentences. Each sentence is ranked according to many features, such as the presence of the keywords/key phrases, the similarity between the sentence and the title, and many others. The third stage of the proposed system extracts the sentences with the highest rank. The fourth stage is the filtering stage, where sentences from email threads are ranked according to these features and summaries are generated. This system can be considered a framework for unsupervised learning in the field of text summarization.
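
A minimal sketch of this extractive core (not the project's implementation; the sentence splitter and feature weights are illustrative) scores sentences by TF-IDF mass plus simple subject-word and position features, then keeps the top-ranked sentences:

    # Extractive summarization sketch: rank sentences by TF-IDF weight,
    # subject-word overlap, and position, then return the top-n in order.
    import re
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer

    def summarize(email_body, subject="", n_sentences=3):
        sentences = [s.strip()
                     for s in re.split(r"(?<=[.!?])\s+", email_body) if s.strip()]
        if len(sentences) <= n_sentences:
            return sentences
        tfidf = TfidfVectorizer(stop_words="english")
        X = tfidf.fit_transform(sentences)
        scores = np.asarray(X.sum(axis=1)).ravel()          # base TF-IDF score
        subject_words = set(subject.lower().split())
        for i, sent in enumerate(sentences):
            overlap = len(subject_words & set(sent.lower().split()))
            scores[i] += 0.5 * overlap                       # subject-word feature
            scores[i] += 0.2 / (i + 1)                       # position feature
        top = sorted(np.argsort(scores)[-n_sentences:])      # keep original order
        return [sentences[i] for i in top]

    print(summarize("The meeting is moved to Friday. Please update the agenda. "
                    "Lunch will be provided. Reply if you cannot attend.",
                    subject="meeting agenda", n_sentences=2))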


ARJUN MUTHALAGU

Flight Search Application

When & Where:


250 Nichols Hall

Committee Members:

Prasad Kulkarni, Chair
Andy Gill
Jerzy Grzymala-Busse


Abstract

The "Flight-Search" application is an AngularJS application implemented with a client-side architecture. The application displays flight results from different airline companies based on the input parameters. The application also offers custom filtering conditions and custom pagination, which a user can interact with to filter the results and to limit the number of results displayed in the browser. The application uses the QPX Express API to pull data for the flight searches.


SATYA KUNDETI

A Comparison of Two Decision Tree Generating Algorithms: C4.5 and CART Based on Numerical Data

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Luke Huan
Bo Luo


Abstract

In data mining, the classification of data is a challenging task. One of the most popular techniques for classifying data is decision tree induction. In this project, two decision tree generating algorithms, CART and C4.5, using their original implementations, are compared on different numerical data sets taken from the University of California, Irvine (UCI). The comparative analysis of these two implementations is carried out in terms of accuracy and decision tree complexity. Results from experiments show that there is a statistically insignificant difference (5% level of significance, two-tailed test) between C4.5 and CART in terms of accuracy. On the other hand, decision trees generated by C4.5 and CART show a statistically significant difference in terms of their complexity.
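
Since C4.5 is not available in scikit-learn, the sketch below illustrates only the evaluation methodology, cross-validated accuracy compared with a two-tailed paired test at the 5% level, using two CART-style tree configurations as stand-ins on one UCI numerical data set; it is not the project's setup, which used the algorithms' original implementations.

    # Methodology sketch: estimate accuracy of two decision tree variants by
    # 10-fold cross-validation and compare them with a two-tailed paired t-test.
    from scipy import stats
    from sklearn.datasets import load_breast_cancer       # a UCI numerical data set
    from sklearn.model_selection import cross_val_score, KFold
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    cv = KFold(n_splits=10, shuffle=True, random_state=1)

    tree_gini = DecisionTreeClassifier(criterion="gini", random_state=1)
    tree_entropy = DecisionTreeClassifier(criterion="entropy", random_state=1)

    acc_gini = cross_val_score(tree_gini, X, y, cv=cv)
    acc_entropy = cross_val_score(tree_entropy, X, y, cv=cv)

    t, p = stats.ttest_rel(acc_gini, acc_entropy)          # two-tailed paired test
    print(f"gini:    {acc_gini.mean():.3f}")
    print(f"entropy: {acc_entropy.mean():.3f}")
    print(f"p-value: {p:.3f} ->",
          "no significant difference" if p >= 0.05 else "significant difference")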