Defense Notices


All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check and post the presentation announcement online.

Upcoming Defense Notices

Soumya Baddham

Battling Toxicity: A Comparative Analysis of Machine Learning Models for Content Moderation

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Prasad Kulkarni
Hongyang Sun


Abstract

With the exponential growth of user-generated content, online platforms face unprecedented challenges in moderating toxic and harmful comments. As a result, automated content moderation has emerged as a critical application of machine learning, enabling platforms to ensure user safety and maintain community standards. Despite its importance, challenges such as severe class imbalance, contextual ambiguity, and the diverse nature of toxic language often compromise moderation accuracy, leading to biased classification performance.

This project presents a comparative analysis of machine learning approaches for a Multi-Label Toxic Comment Classification System using the Toxic Comment Classification dataset from Kaggle.  The study examines the performance of traditional algorithms, such as Logistic Regression, Random Forest, and XGBoost, alongside deep architectures, including Bi-LSTM, CNN-Bi-LSTM, and DistilBERT. The proposed approach utilizes word-level embeddings across all models and examines the effects of architectural enhancements, hyperparameter optimization, and advanced training strategies on model robustness and predictive accuracy.

The study emphasizes the significance of loss function optimization and threshold adjustment strategies in improving the detection of minority classes. The comparative results reveal distinct performance trade-offs across model architectures: transformer models achieve superior contextual understanding at the cost of computational complexity, while the LSTM-based deep learning models offer efficiency advantages. These findings establish evidence-based guidelines for model selection in real-world content moderation systems, striking a balance between accuracy requirements and operational constraints.
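For illustration, a minimal sketch of one such threshold adjustment strategy, assuming a held-out validation set of per-label probabilities; the function and variable names are illustrative and not taken from the project:

```python
import numpy as np
from sklearn.metrics import f1_score

def tune_thresholds(y_true, y_prob, grid=np.linspace(0.05, 0.95, 19)):
    """For each label, pick the decision threshold that maximizes validation
    F1 -- one simple way to counter severe class imbalance."""
    thresholds = np.full(y_true.shape[1], 0.5)
    for j in range(y_true.shape[1]):
        scores = [f1_score(y_true[:, j], (y_prob[:, j] >= t).astype(int),
                           zero_division=0)
                  for t in grid]
        thresholds[j] = grid[int(np.argmax(scores))]
    return thresholds

# Hypothetical usage: thr = tune_thresholds(y_val, val_probs)
#                     y_pred = (test_probs >= thr).astype(int)
```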


Manu Chaudhary

Utilizing Quantum Computing for Solving Multidimensional Partial Differential Equations

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Esam El-Araby, Chair
Perry Alexander
Tamzidul Hoque
Prasad Kulkarni
Tyrone Duncan

Abstract

Quantum computing has the potential to revolutionize computational problem-solving by leveraging the quantum mechanical phenomena of superposition and entanglement, which allow a large amount of information to be processed simultaneously. This capability is significant for the numerical solution of complex and/or multidimensional partial differential equations (PDEs), which are fundamental to modeling various physical phenomena. Many quantum techniques are currently available for solving PDEs, most of them based on variational quantum circuits. However, the existing quantum PDE solvers, particularly those based on variational quantum eigensolver (VQE) techniques, suffer from several limitations, including low accuracy, high execution times, and low scalability on quantum simulators as well as on noisy intermediate-scale quantum (NISQ) devices, especially for multidimensional PDEs.

In this work, we propose an efficient and scalable algorithm for solving multidimensional PDEs. We present two variants of our algorithm: the first leverages the finite-difference method (FDM), classical-to-quantum (C2Q) encoding, and numerical instantiation, while the second employs FDM, C2Q, and column-by-column decomposition (CCD). Both variants are designed to enhance accuracy and scalability while reducing execution times. We have validated and evaluated the proposed concepts on a number of case studies, including the multidimensional Poisson equation, the multidimensional heat equation, the Black-Scholes equation, and the Navier-Stokes equation for computational fluid dynamics (CFD), achieving promising results. Our results demonstrate higher accuracy, higher scalability, and faster execution times compared to VQE-based solvers on noise-free and noisy quantum simulators from IBM. Additionally, we validated our approach on hardware emulators and actual quantum hardware, employing noise mitigation techniques. This work establishes a practical and effective approach to solving PDEs with quantum computing for engineering and scientific applications.
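For context, a minimal classical sketch of the finite-difference step for a 1D Poisson problem, showing the linear system that a classical-to-quantum (C2Q) encoding would subsequently load onto a quantum register; the quantum portion of the proposed algorithm is not shown, and the grid size and right-hand side are illustrative:

```python
import numpy as np

# Classical FDM discretization of the 1D Poisson equation -u''(x) = f(x)
# on [0, 1] with u(0) = u(1) = 0.  The result is a linear system A u = b.
n = 7                                  # interior grid points (illustrative)
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)
f = np.sin(np.pi * x)                  # example right-hand side

A = (np.diag(2.0 * np.ones(n)) -
     np.diag(np.ones(n - 1), 1) -
     np.diag(np.ones(n - 1), -1)) / h**2
b = f

u = np.linalg.solve(A, b)              # reference classical solution
# Exact solution of this example is sin(pi x) / pi^2, so the printed value
# reflects the O(h^2) discretization error.
print(np.max(np.abs(u - np.sin(np.pi * x) / np.pi**2)))
```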


Alex Manley

Taming Complexity in Computer Architecture through Modern AI-Assisted Design and Education

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Heechul Yun, Chair
Tamzidul Hoque
Prasad Kulkarni
Mohammad Alian

Abstract

The escalating complexity inherent in modern computer architecture presents significant challenges for both professional hardware designers and students striving to gain foundational understanding. Historically, the steady improvement of computer systems was driven by transistor scaling, predictable performance increases, and relatively straightforward architectural paradigms. However, with the end of traditional scaling laws and the rise of heterogeneous and parallel architectures, designers now face unprecedented intricacies involving power management, thermal constraints, security considerations, and sophisticated software interactions. Prior tools and methodologies, often reliant on complex, command-line driven simulations, exacerbate these challenges by introducing steep learning curves, creating a critical need for more intuitive, accessible, and efficient solutions. To address these challenges, this thesis introduces two innovative, modern tools.

The first tool, SimScholar, provides an intuitive graphical user interface (GUI) built upon the widely-used gem5 simulator. SimScholar significantly simplifies the simulation process, enabling students and educators to more effectively engage with architectural concepts through a visually guided environment, both reducing complexity and enhancing conceptual understanding. Supporting SimScholar, the gem5 Extended Modules API (gEMA) offers streamlined backend integration with gem5, ensuring efficient communication, modularity, and maintainability.

The second contribution, gem5 Co-Pilot, delivers an advanced framework for architectural design space exploration (DSE). Co-Pilot integrates cycle-accurate simulation via gem5, detailed power and area modeling through McPAT, and intelligent optimization assisted by a large language model (LLM). Central to Co-Pilot is the Design Space Declarative Language (DSDL), a Python-based domain-specific language that facilitates structured, clear specification of design parameters and constraints.
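The abstract does not show DSDL syntax; the following plain-Python sketch is purely hypothetical and only illustrates the kind of parameter ranges and constraints such a design-space specification might capture before candidate configurations are handed to gem5 and McPAT for evaluation:

```python
from itertools import product

# Hypothetical illustration only -- not actual DSDL syntax.
design_space = {
    "core_count":  [1, 2, 4, 8],
    "l2_size_kib": [256, 512, 1024],
    "issue_width": [2, 4],
    "dram":        ["DDR4_2400", "DDR4_3200"],
}

constraints = [
    lambda cfg: cfg["core_count"] * cfg["issue_width"] <= 16,  # e.g. an area budget
    lambda cfg: not (cfg["core_count"] == 8 and cfg["l2_size_kib"] == 1024),
]

def enumerate_valid(space, constraints):
    """Yield every configuration in the cross-product that satisfies all constraints."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        cfg = dict(zip(keys, values))
        if all(c(cfg) for c in constraints):
            yield cfg  # each cfg would be simulated (gem5) and modeled (McPAT)

print(sum(1 for _ in enumerate_valid(design_space, constraints)))
```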

Collectively, these tools constitute a comprehensive approach to taming complexity in computer architecture, offering powerful, user-friendly solutions tailored to both educational and professional settings.


Past Defense Notices


SRUTHI POTLURI

A Web Application for Recommending Movies to Users

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Man Kong
Bo Luo


Abstract

Recommendation systems are becoming increasingly important with the growing popularity of e-commerce platforms. An ideal recommendation system recommends items the user will prefer. In this project, an item-item collaborative filtering algorithm is implemented as the premise: recommendations are made smarter by examining movies similar to those the user has already rated, calculating predicted ratings, and recommending the movies with the highest predictions. The primary goals of the proposed algorithm are to capture the user's preferences and to include lesser-known items in the recommendations. The proposed recommendation system was evaluated using Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) against 1 million movie ratings involving 6,040 users and 3,900 movies. The implementation is a web application that simulates the real-time experience for the user.
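For illustration, a minimal item-item collaborative filtering sketch on a toy rating matrix, using cosine similarity and a similarity-weighted prediction; this is a generic textbook formulation, not the project's code:

```python
import numpy as np

# Toy user-movie rating matrix (rows = users, columns = movies, 0 = unrated).
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 0, 5, 4]], dtype=float)

def item_similarity(R):
    """Cosine similarity between item (column) rating vectors."""
    norms = np.linalg.norm(R, axis=0, keepdims=True)
    norms[norms == 0] = 1.0
    Rn = R / norms
    return Rn.T @ Rn

def predict(R, sim, user, item, k=2):
    """Predict a rating as a similarity-weighted average of the user's
    ratings on the k most similar items they have already rated."""
    rated = np.where(R[user] > 0)[0]
    neighbors = rated[np.argsort(sim[item, rated])[::-1][:k]]
    w = sim[item, neighbors]
    if w.sum() == 0:
        return R[R > 0].mean()          # fall back to the global mean rating
    return float(w @ R[user, neighbors] / w.sum())

sim = item_similarity(R)
print(round(predict(R, sim, user=1, item=1), 2))  # prediction for an unrated movie
```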


DEBABRATA MAJHI

IRIM: Interesting Rule Induction Module with Handling Missing Attribute Values

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Bo Luo


Abstract

In the current era of big data, huge amounts of data can easily be collected, but unprocessed data is not useful on its own. It becomes useful only when we are able to find interesting patterns or hidden knowledge in it. Algorithms that find such patterns are known as rule induction algorithms. Rule induction is a special area of data mining and machine learning in which formal rules are extracted from a dataset. The extracted rules may represent general or local (isolated) patterns in the data.
In this report, we focus on IRIM (Interesting Rule Induction Module), which induces strong, interesting rules that cover most of the concept. The rules induced by IRIM often provide interesting and surprising insight to experts in the domain area.
The IRIM algorithm was implemented using Python and the PySpark library. The algorithm was then extended to handle different types of missing attribute values, and its performance with and without the missing-data feature was analyzed. As an example, interesting rules induced from the Iris dataset are shown.

 


SUSHIL BHARATI

Vision Based Adaptive Obstacle Detection, Robust Tracking and 3D Reconstruction for Autonomous Unmanned Aerial Vehicles

When & Where:


246 Nichols Hall

Committee Members:

Richard Wang, Chair
Bo Luo
Suzanne Shontz


Abstract

Vision-based autonomous navigation of UAVs in real time is a very challenging problem that requires obstacle detection, tracking, and depth estimation. Although obstacle detection and tracking, along with 3D reconstruction, have been extensively studied in the computer vision field, they remain a major challenge for real applications such as UAV navigation. The thesis addresses these issues in terms of robustness and efficiency. First, a fast and robust vision-based obstacle detection and tracking approach is proposed by integrating a salient object detection strategy within a kernelized correlation filter (KCF) framework. To improve performance, an adaptive obstacle detection technique refines the location and boundary of the object when the confidence value of the tracker drops below a predefined threshold. In addition, a reliable post-processing technique is implemented for accurate obstacle localization. Second, we propose an efficient approach to detect the outliers present in noisy image pairs for robust fundamental matrix estimation, a fundamental step for depth estimation in obstacle avoidance. Given a noisy stereo image pair obtained from the mounted stereo cameras and initial point correspondences between them, we utilize the reprojection residual error and the 3-sigma principle together with the robust-statistics-based Qn estimator (RES-Q) to efficiently detect outliers and accurately estimate the fundamental matrix. The proposed approaches have been extensively evaluated through quantitative and qualitative experiments on a number of challenging datasets. The experiments demonstrate that the proposed detection and tracking technique significantly outperforms state-of-the-art methods in terms of tracking speed and accuracy, and that the proposed RES-Q algorithm is more robust than other classical outlier detection algorithms under both symmetric and asymmetric random noise assumptions.
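For illustration, a minimal sketch of residual-based outlier flagging for fundamental matrix estimation, using Sampson residuals and a 3-sigma style rule with a robust scale estimate; the median absolute deviation is used here only as a simple stand-in for the Qn estimator of the RES-Q formulation:

```python
import numpy as np

def sampson_residuals(F, x1, x2):
    """Sampson residuals of homogeneous correspondences x1 <-> x2 (Nx3 each)
    under a candidate fundamental matrix F."""
    Fx1 = (F @ x1.T).T           # epipolar lines in image 2
    Ftx2 = (F.T @ x2.T).T        # epipolar lines in image 1
    num = np.sum(x2 * Fx1, axis=1) ** 2
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    return num / den

def flag_outliers(F, x1, x2, k=3.0):
    """3-sigma rule on the residuals with a robust scale (MAD here, as a
    simple stand-in for the Qn estimator).  True = likely outlier."""
    r = sampson_residuals(F, x1, x2)
    med = np.median(r)
    scale = 1.4826 * np.median(np.abs(r - med))
    return np.abs(r - med) > k * scale
```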


MOHSEN ALEENEJAD

New Modulation Methods and Control Strategies for Power Electronics Inverters

When & Where:


1 Eaton Hall

Committee Members:

Reza Ahmadi, Chair
Glenn Prescott
Alessandro Salandrino
Jim Stiles
Huazhen Fang*

Abstract

DC-to-AC power converters (so-called inverters) are widely used in industrial applications. Multilevel inverters are becoming increasingly popular in industrial apparatus aimed at medium- to high-power conversion applications. In comparison to conventional inverters, they feature superior characteristics such as lower total harmonic distortion (THD), higher efficiency, and lower switching voltage stress. Nevertheless, these superior characteristics come at the price of a more complex topology with an increased number of power electronic switches, which results in more complicated control strategies for the inverter. Moreover, as the number of power electronic switches increases, the chance of a switch fault increases, and thus the inverter's reliability decreases. Due to the extreme monetary ramifications of interruptions of operation in commercial and industrial applications, high reliability for the power inverters used in these sectors is critical. As a result, developing simple control strategies for normal and fault-tolerant operation of multilevel inverters has long been an interesting topic for researchers in related areas. The purpose of this dissertation is to develop new control and fault-tolerant strategies for multilevel power inverters. For normal operation of the inverter, a new high-switching-frequency technique is developed. The proposed method extends the utilization of the dc-link voltage while minimizing the dv/dt of the switches. In the event of a fault, the line voltages of the faulty inverter are unbalanced and cannot be applied to three-phase loads. For the faulty condition of the inverter, three novel fault-tolerant techniques are developed. The proposed fault-tolerant strategies generate balanced line voltages without bypassing any healthy and operative inverter element, make better use of the inverter capacity, and generate higher output voltage. These strategies exploit the advantages of the Selective Harmonic Elimination (SHE) and Space Vector Modulation (SVM) methods in conjunction with a slightly modified Fundamental Phase Shift Compensation (FPSC) technique to generate balanced voltages and manipulate voltage harmonics at the same time. The proposed strategies are applicable to several classes of multilevel inverters with three or more voltage levels.
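For context, the standard three-angle selective harmonic elimination (SHE) system sets the fundamental component and nulls the 5th and 7th harmonics, and can be solved numerically as sketched below. This is the textbook formulation only, not the dissertation's modified SHE/SVM/FPSC strategies; the normalization, modulation index, and initial guess are illustrative, and any solution should be checked for 0 < a1 < a2 < a3 < pi/2:

```python
import numpy as np
from scipy.optimize import fsolve

def she_equations(alpha, m=0.8):
    """Three-angle SHE system (one common normalization): set the fundamental
    to the modulation index m and eliminate the 5th and 7th harmonics."""
    a1, a2, a3 = alpha
    return [np.cos(a1) + np.cos(a2) + np.cos(a3) - 3 * m,
            np.cos(5 * a1) + np.cos(5 * a2) + np.cos(5 * a3),
            np.cos(7 * a1) + np.cos(7 * a2) + np.cos(7 * a3)]

angles = fsolve(she_equations, x0=[0.2, 0.6, 1.0])   # initial guess in radians
print(np.degrees(angles))                             # switching angles in degrees
```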


XIAOLI LI

Constructivism Learning

When & Where:


246 Nichols Hall

Committee Members:

Luke Huan, Chair
Victor Frost
Bo Luo
Richard Wang
Alfred Ho*

Abstract

Aiming to achieve the learning capabilities possessed by intelligent beings, especially humans, researchers in the machine learning field have a long-standing tradition of borrowing ideas from human learning, as in reinforcement learning, active learning, and curriculum learning. Motivated by a philosophical theory called "constructivism", in this work we propose a new machine learning paradigm: constructivism learning. The constructivism theory has had wide-ranging impact on theories of how humans acquire knowledge. To adapt this human learning theory to the context of machine learning, we first studied how to improve learning performance by exploring inductive bias or prior knowledge from multiple learning tasks with multiple data sources, that is, multi-task multi-view learning, in both offline and lifelong settings. We then formalized a Bayesian nonparametric approach using sequential Dirichlet process mixture models to support constructivism learning. To further exploit constructivism learning, we also developed a constructivism deep learning method utilizing uniform process mixture models.


MOHANAD AL-IBADI

Array Processing Techniques for Ice-Sheet Bottom Tracking

When & Where:


317 Nichols Hall

Committee Members:

Shannon Blunt, Chair
John Paden
Eric Perrins
Jim Stiles
Huazhen Fang*

Abstract

In airborne multichannel radar sounder signal processing, the collected data are most naturally represented in a cylindrical coordinate system: along-track, range, and elevation angle. The data are generally processed in each of these dimensions sequentially, to focus or resolve the data in the corresponding dimension, so that a 3D image of the scene can be formed. Pulse compression is used to process the data along the range dimension, synthetic aperture radar (SAR) processing is used in the along-track dimension, and array processing techniques are used for the elevation-angle dimension. After the first two steps, the 3D scene is resolved into constant-along-track, constant-range toroids centered on the flight path. The targets lying in a particular toroid must then be resolved by estimating their respective elevation angles.
In the proposed work, we focus on the array processing step, where several direction-of-arrival (DoA) estimation methods, such as MUltiple SIgnal Classification (MUSIC) and maximum-likelihood estimation (MLE), are used to resolve the targets in the elevation-angle dimension. A tracker is then run on the output of the DoA estimation to track the ice-bottom interface. We propose to use the tree-reweighted message passing algorithm or Kalman filtering, depending on the array processing technique, to track the ice bottom; the outcome is a digital elevation model (DEM) of the ice bottom. While most published work assumes a narrowband model for the array, we use a wideband model and focus on issues related to wideband arrays. Along these lines, we propose a theoretical study to evaluate the performance of the radar products based on the array characteristics using different array processing techniques, such as wideband MLE and focusing-matrices methods. In addition, we will investigate tracking targets using a sparse array composed of three sub-arrays, each separated by a large multi-wavelength baseline. Specifically, we propose to develop and evaluate a Kalman tracking solution to this wideband sparse-array problem when applied to data collected by the CReSIS radar sounder.
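For context, a minimal narrowband MUSIC sketch for a uniform linear array; the proposed work targets wideband and sparse arrays, so this only illustrates the basic subspace idea behind the DoA estimation step, with illustrative parameter names:

```python
import numpy as np

def music_spectrum(X, n_sources, d=0.5, angles=np.linspace(-90, 90, 361)):
    """Narrowband MUSIC pseudospectrum for a uniform linear array.
    X: snapshot matrix (n_sensors x n_snapshots); d: element spacing in wavelengths."""
    n_sensors = X.shape[0]
    R = X @ X.conj().T / X.shape[1]              # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(R)         # eigenvalues in ascending order
    En = eigvecs[:, :n_sensors - n_sources]      # noise subspace
    k = np.arange(n_sensors)
    spectrum = []
    for theta in np.radians(angles):
        a = np.exp(-2j * np.pi * d * k * np.sin(theta))   # steering vector
        spectrum.append(1.0 / np.linalg.norm(En.conj().T @ a) ** 2)
    return angles, np.array(spectrum)

# Peaks of the pseudospectrum give the estimated elevation angles; a tracker
# (e.g. a Kalman filter) would then smooth these estimates along-track.
```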

 


QIAOZHI WANG

Towards the Understanding of Private Content -- Content-based Privacy Assessment and Protection in Social Networks

When & Where:


2001B Eaton Hall

Committee Members:

Bo Luo, Chair
Fengjun Li
Richard Wang
Heechul Yun
Prajna Dhar*

Abstract

In the 2016 presidential election, social networks showed their great power as a “modern form of communication”. With the increasing popularity of social networks, privacy concerns arise. For example, it has been shown that microblogs are revealed to audiences that are significantly larger than users' perceptions. Moreover, when users are emotional, they may post messages with sensitive content and later regret doing so.  As a result, users become very vulnerable – private or sensitive information may be accidentally disclosed, even in tweets about trivial daily activities.
Unfortunately, existing research projects on data privacy, such as the k-anonymity and differential privacy mechanisms, mostly focus on protecting an individual's identity from being discovered in large data sets. We argue that the key component of privacy protection in social networks is protecting sensitive content, i.e., privacy as the ability to control the dissemination of information. The overall objectives of the proposed research are to understand the sensitive content of social network posts, to facilitate content-based protection of private information, and to identify different types of sensitive information. In particular, we propose a user-centered, quantitative measure of privacy based on textual content, and a customized privacy protection mechanism for social networks.
We consider private tweet identification and classification as dual problems. We propose to develop an algorithm that identifies all types of private messages and, more importantly, automatically scores the sensitiveness of each private message. We first collect opinions from a diverse group of users, with respect to the sensitiveness of private information, through Amazon Mechanical Turk, and analyze the discrepancies between users' privacy expectations and their actual information disclosure. We then develop a computational method to generate a context-free privacy score, the "consensus" privacy score for average users. Meanwhile, classification of private tweets is necessary for customized privacy protection. We have made a first attempt to understand different types of private information and to automatically classify sensitive tweets into 13 pre-defined topic categories. In the proposed research, we will further include personal attitudes, topic preferences, and social context in the scoring mechanism to generate a personalized, context-aware privacy score, which will be utilized in a comprehensive privacy protection mechanism.
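For illustration, a toy sketch of content-based sensitivity scoring: a classifier trained on labeled tweets whose predicted probability can be read as a crude, context-free privacy score. The data, labels, and model choice below are made up for illustration and are not the proposed scoring method:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny made-up training set: 1 = private/sensitive, 0 = not sensitive.
tweets = ["heading to the gym later",
          "my ssn is on that form I lost at the clinic",
          "great game last night",
          "got my test results back from the doctor today"]
labels = [0, 1, 0, 1]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(tweets, labels)

# Predicted probability of the "sensitive" class as a rough privacy score.
print(model.predict_proba(["posting my new home address so friends can visit"])[:, 1])
```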

 


STEVE HAENCHEN

A Model to Identify Insider Threats Using Growing Hierarchical Self-Organizing Map of Electronic Media Indicators

When & Where:


1 Eaton Hall

Committee Members:

Hossein Saiedian, Chair
Arvin Agah
Prasad Kulkarni
Bo Luo
Reza Barati

Abstract

Fraud from insiders costs an estimated $3.7 trillion annually. Current fraud prevention and detection methods that include analyzing network logs, computer events, emails, and behavioral characteristics have not been successful in reducing the losses. The proposed Occupational Fraud Prevention and Detection Model uses existing data from the field of digital forensics along with text clustering algorithms, machine learning, and a growing hierarchical self-organizing map model to predict insider threats based on computer usage behavioral characteristics.

The proposed research leverages results from information security, software engineering, data science, information retrieval, context searching, search patterns, and machine learning to build and employ a database server and workstations supporting 50+ terabytes of data representing entire hard drives from work computers. The forensic software packages FTK and EnCase are used to generate disk images and test extraction results. The primary research tools are built using modern programming languages. The research data are derived from disk images obtained from actual investigations in which fraud was asserted, together with other disk images in which fraud was not asserted.

The research methodology includes building a data extraction tool that is a disk level reader to store the disk, partition, and operating system data in a relational database. An analysis tool is also created to convert the data into information representing usage patterns including summarization, normalization, and redundancy removal. We build a normalizing tool that uses machine learning to adjust the baselines for company, department, and job deviations.  A prediction component is developed to derive insider threat scores reflecting the anomalies from the adjusted baseline. The resulting product will allow identification of the computer users most likely to commit fraud so investigators can focus their limited resources on the suspects.
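For illustration, a minimal sketch of anomaly scoring with a small flat self-organizing map, where the quantization error of a usage-pattern vector serves as a threat score relative to the learned baseline. The proposed model uses a growing hierarchical SOM and company/department/job baseline adjustments, which are more involved than this toy version; all names and parameters are illustrative:

```python
import numpy as np

def train_som(X, grid=(6, 6), iters=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small flat self-organizing map on usage-pattern feature vectors X."""
    rng = np.random.default_rng(seed)
    W = rng.random((grid[0], grid[1], X.shape[1]))
    gy, gx = np.mgrid[0:grid[0], 0:grid[1]]
    for t in range(iters):
        x = X[rng.integers(len(X))]
        dist = np.linalg.norm(W - x, axis=2)
        by, bx = np.unravel_index(np.argmin(dist), dist.shape)   # best matching unit
        lr = lr0 * np.exp(-t / iters)
        sigma = sigma0 * np.exp(-t / iters)
        h = np.exp(-((gy - by) ** 2 + (gx - bx) ** 2) / (2 * sigma ** 2))
        W += lr * h[..., None] * (x - W)                         # neighborhood update
    return W

def threat_score(W, x):
    """Quantization error: distance from a usage-pattern vector to the nearest
    SOM unit, i.e. how far the behavior falls from the learned baseline."""
    return float(np.min(np.linalg.norm(W - x, axis=2)))
```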

Our primary plan to evaluate and validate the research results is via empirical study, statistical evaluation, and benchmarking, with tests of precision and recall on a second set of disk images.


JAMIE ROBINSON

Code Cache Management in Managed Language VMs to Reduce Memory Consumption for Embedded Systems

When & Where:


129 Nichols Hall

Committee Members:

Prasad Kulkarni, Chair
Bo Luo
Heechul Yun


Abstract

The compiled native code generated by a just-in-time (JIT) compiler in managed language virtual machines (VMs) is placed in a region of memory called the code cache. Code cache management (CCM) in a VM is responsible for finding and evicting methods from the code cache to maintain execution correctness and manage program performance for a given code cache size or memory budget. Effective CCM can also boost program speed by enabling more aggressive JIT compilation, powerful optimizations, and improved hardware instruction cache and I-TLB performance.

Though important, CCM is an overlooked component in VMs. We find that the default CCM policies in Oracle's production-grade HotSpot VM perform poorly even at modest memory pressure. We develop a detailed simulation-based framework to model and evaluate the potential efficiency of many different CCM policies in a controlled and realistic, but VM-independent, environment. We make the encouraging discovery that effective CCM policies can sustain high program performance even for very small cache sizes.
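For context, a toy sketch of what one code cache eviction policy looks like in simulation: compiled methods occupy space, and the least-recently-executed method is evicted once a memory budget is exceeded. This is an illustrative policy only, not HotSpot's default CCM algorithm or the specific policies studied in the thesis:

```python
from collections import OrderedDict

class CodeCacheSim:
    """Toy code-cache simulator with a least-recently-executed eviction policy."""
    def __init__(self, budget_bytes):
        self.budget = budget_bytes
        self.cache = OrderedDict()          # method name -> compiled code size
        self.evictions = 0

    def execute(self, method, compiled_size):
        if method in self.cache:            # hit: refresh recency
            self.cache.move_to_end(method)
            return
        self.cache[method] = compiled_size  # miss: (re)compile into the cache
        while sum(self.cache.values()) > self.budget:
            self.cache.popitem(last=False)  # evict least recently executed
            self.evictions += 1

sim = CodeCacheSim(budget_bytes=64 * 1024)
for m, size in [("foo", 20_000), ("bar", 30_000), ("baz", 25_000), ("foo", 20_000)]:
    sim.execute(m, size)
print(len(sim.cache), sim.evictions)        # cached methods and eviction count
```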

Our simulation study provides the rationale and motivation to improve CCM strategies in existing VMs. We implement and study the properties of several CCM policies in HotSpot. We find that in spite of working within the bounds of the HotSpot VM’s current CCM sub-system, our best CCM policy implementation in HotSpot improves program performance over the default CCM algorithm by 39%, 41%, 55%, and 50% with code cache sizes that are 90%, 75%, 50%, and 25% of the desired cache size, on average.


AIME DE BERNER

Application of Machine Learning Techniques to the Diagnosis of Vision Disorders

When & Where:


2001B Eaton Hall

Committee Members:

Arvin Agah, Chair
Nicole Beckage
Jerzy Grzymala-Busse


Abstract

In the age of data collection and the search for knowledge, numerous techniques have been developed over time to capture, manipulate, and process data in order to uncover hidden correlations, relations, patterns, and mappings that one may not otherwise be able to see. Computers, with the help of improved algorithms, have proven able to provide Artificial Intelligence (AI) by applying models that predict outcomes within an acceptable margin of error. By applying performance metrics to data mining and machine learning models that predict human vision disorders, we are able to identify promising models. The AI techniques used in this work include an improved version of C4.5 called C4.8, neural networks, k-nearest neighbors, random forests, support vector machines, and AdaBoost, among others. The best predictive models were determined for application to the diagnosis of vision disorders, focusing on strabismus and the need for patient referral to a specialist.
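For illustration, a generic cross-validated comparison of the listed classifier families using scikit-learn on a placeholder dataset; the actual study uses clinical vision-disorder data, which is not reproduced here, and the decision tree below stands in for the C4.5/C4.8 family:

```python
from sklearn.datasets import load_iris                     # placeholder data only
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)   # stand-in for the vision-disorder dataset

models = {
    "decision tree (C4.5-style)": DecisionTreeClassifier(),
    "neural network":             MLPClassifier(max_iter=2000),
    "k-nearest neighbors":        KNeighborsClassifier(),
    "random forest":              RandomForestClassifier(),
    "SVM":                        SVC(),
    "AdaBoost":                   AdaBoostClassifier(),
}

for name, clf in models.items():
    scores = cross_val_score(clf, X, y, cv=5)               # 5-fold accuracy
    print(f"{name:28s} mean accuracy = {scores.mean():.3f}")
```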