Defense Notices


All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check and post the presentation announcement online.

Upcoming Defense Notices

David Felton

Optimization and Evaluation of Physical Complementary Radar Waveforms

When & Where:


Nichols Hall, Room 129 (Apollo Auditorium)

Committee Members:

Shannon Blunt, Chair
Rachel Jarvis
Patrick McCormick
James Stiles
Zsolt Talata

Abstract

The RF spectrum is a precious, finite resource with ever-increasing demand. Consequently, the mandate to be a "good spectral neighbor" is in direct conflict with the requirements for high-performance sensing where correlation error is fundamentally limited. As such, matched-filter radar performance is often sidelobe-limited with estimation error being constrained by the time-bandwidth (TB) of the collective emission. The methods developed here seek to bridge this gap between idealized radar performance and practical utility via waveform design.    

Estimation error becomes more complex when employing pulse-agility. In doing so, range-sidelobe modulation (RSM) spreads energy across Doppler, rendering traditional methods ineffective. To address this, the gradient-based complementary-FM framework was developed to produce complementary sidelobe cancellation (CSC) after coherently combining subsets within a pulse-agile emission. In contrast to the majority of complementary signals, explored via phase-coding, these Comp-FM waveform subsets achieve CSC while preserving hardware-compatibility since they are FM (though design distortion is never completely avoided). Although Comp-FM addressed practicality via hardware amenability, CSC was localized to zero-Doppler. This work expands the Comp-FM notion to a Doppler-generalized (DG) framework, extending the cancellation condition to an arbitrary span. The same framework can likewise be employed to jointly optimize an entire coherent processing interval (CPI) to minimize RSM within the radar point-spread-function (PSF), thereby generalizing the notion of complementarity and introducing the potential for cognitive operation if sufficient scattering knowledge is available a priori.

Sensing with a single emitter is limited by self-inflicted error alone (e.g., clutter, sidelobes), while MIMO systems must additionally contend with the cross-responses from emitters operating concurrently (e.g., simultaneously, spatially proximate, in a shared spectrum), further degrading radar sensitivity. Now, total correlation error is dictated by the overlapping TB (i.e., how coincident the signals are) and the number of operating emitters, compounding the estimation difficulty if left unaddressed. As such, the determination of "orthogonal waveforms" comprises a large portion of MIMO literature, though it remains a phenomenological misnomer for pulsed emissions. Here, the notion of complementary-FM is applied to a multi-emitter context in which transmitter-amenable quasi-orthogonal subsets, occupying the same spectral band, are produced via a similar gradient-based approach. To further practicalize these MIMO-Comp-FM waveform subsets, the same "DG" approach described above, addressing the otherwise-default Doppler-induced degradation of complementary signals, is applied. In doing so, Doppler-independent separability and complementarity greatly improve estimation sensitivity for multi-emitter systems.

This MIMO-Comp-FM framework is developed for standard matched filter processing. Coupling this framework with a "DG" form of the previously explored MIMO-MiCRFt is also investigated, illustrating the added benefit of pairing optimized subsets with similarly calibrated processing. 

Each of these methods is developed to address unique and increasingly complex sources of estimation error. All approaches are initially developed and evaluated via simulated analysis where ground-truth is known. Then, despite hardware-induced distortion being unavoidable, the MIMO-Comp-FM framework is confirmed via loopback measurements to preserve the majority of CSC that was observed in simulation. Finally, open-air demonstration of each approach validates practical utility on a radar system.
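
As a minimal illustration of the complementary sidelobe cancellation (CSC) idea mentioned in the abstract above, the sketch below uses a binary Golay complementary pair. This is a textbook phase-coded example at zero Doppler, not the Comp-FM or Doppler-generalized designs described in the abstract; the function names golay_pair and autocorr are chosen here for illustration only.

```python
def autocorr(seq):
    """Aperiodic autocorrelation of a real sequence for lags 0..N-1."""
    n = len(seq)
    return [sum(seq[i] * seq[i + lag] for i in range(n - lag)) for lag in range(n)]


def golay_pair(order):
    """Build a binary Golay complementary pair of length 2**order
    using the standard concatenation recursion."""
    a, b = [1], [1]
    for _ in range(order):
        a, b = a + b, a + [-x for x in b]
    return a, b


a, b = golay_pair(3)                          # two length-8 codes
ra, rb = autocorr(a), autocorr(b)             # each has nonzero sidelobes
combined = [x + y for x, y in zip(ra, rb)]    # coherent combination

print("autocorrelation of a:", ra)
print("autocorrelation of b:", rb)
print("combined response:   ", combined)
```

Each code on its own has nonzero range sidelobes, while the sum of the two autocorrelations is nonzero only at lag zero. That cancellation holds here only for the ideal, zero-Doppler case.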


Hao Xuan

Toward an Integrated Computational Framework for Metagenomics: From Sequence Alignment to Automated Knowledge Discovery

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Cuncong Zhong, Chair
Fengjun Li
Suzanne Shontz
Hongyang Sun
Liang Xu

Abstract

Metagenomic sequencing has become a central paradigm for studying complex microbial communities and their interactions with the host, with emerging applications in clinical prediction and disease modeling. In this work, we first investigate two representative application scenarios: predicting immune checkpoint inhibitor response in non-small cell lung cancer using gut microbial signatures, and characterizing host–microbiome interactions in neonatal systems. The proposed reference-free neural network captures both compositional and functional signals without reliance on reference genomes, while the neonatal study demonstrates how environmental and genetic factors reshape microbial communities and how probiotic intervention can mitigate pathogen-induced immune activation.

These studies highlight both the promise and the inherent difficulty of metagenomic analysis: transforming raw sequencing data into clinically actionable insights remains an algorithmically fragmented and computationally intensive process. This challenge arises from two key limitations: the lack of a unified algorithmic foundation for sequence alignment and the absence of systematic approaches for selecting and organizing analytical tools. Motivated by these challenges, we present a unified computational framework for metagenomic analysis that integrates complementary algorithmic and systems-level solutions.

First, to resolve fragmentation at the alignment level, we develop the Versatile Alignment Toolkit (VAT), a unified algorithmic system for biological sequence alignment across diverse applications. VAT introduces an asymmetric multi-view k-mer indexing scheme that integrates multiple seeding strategies within a single architecture and enables dynamic seed-length adjustment via longest common prefix (LCP)–based inference without re-indexing. A flexible seed-chaining mechanism further supports diverse alignment scenarios, including collinear, rearranged, and split alignments. Combined with a hardware-efficient in-register bitonic sorting algorithm and dynamic index-loading strategy, VAT achieves high efficiency and broad applicability across read mapping, homology search, and whole-genome alignment. Second, to address the challenge of tool selection and pipeline construction, we develop SNAIL, a natural language processing system for automated recognition of bioinformatics tools from large-scale and rapidly growing scientific literature. By integrating XGBoost and Transformer-based models such as SciBERT, SNAIL enables structured extraction of analytical tools and supports automated, reproducible pipeline construction.

Together, this work establishes a unified framework that is grounded in real-world applications and addresses key bottlenecks in metagenomic analysis, enabling more efficient, scalable, and clinically actionable workflows.
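
For readers unfamiliar with bitonic sorting, which the abstract above names as one component of VAT, the following is a plain scalar sketch of the compare-exchange network. It is not VAT's in-register (SIMD) implementation; it only shows the fixed, data-independent comparison pattern that makes the algorithm suitable for such hardware. The function name bitonic_sort is an assumption made for this example.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two with a bitonic network.

    The sequence of compare-exchange positions depends only on the length,
    never on the data, which is why the pattern maps well to vector registers.
    """
    n = len(values)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    data = list(values)
    k = 2
    while k <= n:                  # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:              # distance between compared elements
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    # Swap when the pair is out of order for this direction.
                    if (data[i] > data[partner]) == ascending:
                        data[i], data[partner] = data[partner], data[i]
            j //= 2
        k *= 2
    return data


print(bitonic_sort([5, 1, 7, 3, 8, 2, 6, 4]))
```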


Pramil Paudel

Learning Without Seeing: Privacy-Preserving and Adversarial Perspectives in Lensless Imaging

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Fengjun Li, Chair
Alex Bardas
Bo Luo
Cuncong Zhong
Haiyang Chao

Abstract

Conventional computer vision relies on spatially resolved, human-interpretable images, which inherently expose sensitive information and raise privacy concerns. In this study, we explore an alternative paradigm based on lensless imaging, where scenes are captured as diffraction patterns governed by the point spread function (PSF). Although unintelligible to humans, these measurements encode structured, distributed information that remains useful for computational inference. 

We propose a unified framework for privacy-preserving vision that operates directly on lensless sensor measurements by leveraging their frequency-domain and phase-encoded properties. The framework is developed along two complementary directions. First, we enable reconstruction-free inference by exploiting the intrinsic obfuscation of lensless data. We show that semantic tasks such as classification can be performed directly on diffraction patterns using models tailored to non-local, phase-scrambled representations. We further design lensless-aware architectures and integrate them into practical pipelines, including a Swin Transformer-based steganographic framework (DiffHide) for secure and imperceptible information embedding. To assess robustness, we formalize adversarial threat models and develop defenses against learning-based reconstruction attacks, particularly GAN-driven inversion. Second, we investigate the limits of privacy by studying the reconstructability of lensless measurements without explicit knowledge of the forward model. We develop learning-based reconstruction methods that approximate the inverse mapping and analyze conditions under which sensitive information can be recovered. Our results demonstrate that lensless measurements enable effective vision tasks without reconstruction, while providing a principled framework to evaluate and mitigate privacy risks. 


Past Defense Notices

LAKSHMI KOUTHA

Advanced Encoding Schemes and their Hardware Implementations for Brain Inspired Computing

When & Where:


2001B Eaton Hall

Committee Members:

Yang Yi, Chair
Chris Allen
Glenn Prescott


Abstract

According to Moore’s law, the number of transistors per square inch doubles every two years. Scaling down technology reduces size and cost; however, it also increases the number of problems. Our current computers, which use von Neumann architectures, are facing growing difficulties, not only because of technology scaling but also because of the gridlock inherent in the architecture. As a solution, scientists came up with architectures whose function resembles that of the brain. They called these brain-inspired architectures neuromorphic computers. The building block of the brain is the neuron, which encodes, decodes, and processes data. The neuron is known to accept sensory information and convert it into a spike train. The neuron encodes this spike train in different ways depending on the situation. Rate encoding, temporal encoding, population encoding, sparse encoding, and rate-order encoding are a few of the encoding schemes said to be used by the neuron. These neural encoding schemes are the primary focus of the thesis. A comparison between the schemes is also provided for better understanding, thus helping in the design of an efficient neuromorphic computer. This thesis also focuses on the hardware implementation of a neuron. A Leaky Integrate-and-Fire neuron model, which uses spike-time-dependent encoding, has been used in this work. Different neuron models are discussed, with a comparison of which model is effective under which circumstances. The electronic neuron model was implemented in 180nm CMOS technology using Global Foundries PDK libraries. Simulation results for the neuron are presented for different inputs and different excitation currents. These results show the successful encoding of sensory information into a spike train.
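
As a rough software illustration of how a leaky integrate-and-fire neuron turns a constant input into a spike train, the sketch below integrates the standard membrane equation with a forward-Euler step. It is not the CMOS circuit from the thesis; every parameter value (time constant, resistance, threshold, reset level) is an arbitrary assumption chosen for the example.

```python
def lif_spike_times(current, t_end=0.1, dt=1e-4, tau=0.02,
                    resistance=1.0, v_th=1.0, v_reset=0.0):
    """Simulate tau * dV/dt = -V + R * I and return the spike times in seconds.

    All default values are illustrative assumptions, not thesis parameters.
    """
    v = v_reset
    spikes = []
    steps = int(round(t_end / dt))
    for k in range(steps):
        # Leaky integration of the input current (forward Euler).
        v += dt * (-v + resistance * current) / tau
        if v >= v_th:
            spikes.append((k + 1) * dt)   # fire
            v = v_reset                   # and reset the membrane potential
    return spikes


for i_in in (0.5, 1.5, 3.0):
    times = lif_spike_times(i_in)
    print(f"input {i_in}: {len(times)} spikes")
```

In this sketch an input below the threshold produces no spikes, and a stronger input produces more spikes with an earlier first spike, so both the spike count and the spike timing carry information about the input amplitude.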


PENG SENG TAN

Addressing Spectrum Congestion by Spectrally-Cooperative Radar Design

When & Where:


250 Nichols Hall

Committee Members:

Jim Stiles, Chair
Shannon Blunt
Chris Allen
Lingjia Liu
Tyrone Duncan

Abstract

The increasing need for Radio Frequency (RF) spectrum by mobile apps like Facebook and Instagram, high data-rate communication protocols like 5G, and the Internet of Things has led to spectrum congestion, as radar systems have traditionally maintained the largest share of the RF spectrum. To resolve the spectrum congestion problem, it has become necessary for users of both types of systems to coexist within a finite spectrum allocation. However, this leads to other problems, such as the increased likelihood of mutual interference experienced by all users coexisting within the finite spectrum.

In this dissertation, we propose to address the problem of spectrum congestion via a two-step approach. The first step involves designing an optimal sparse spectrum allocation scheme for radar systems such that the radar range resolution performance can be maintained with a smaller resulting bandwidth, at the cost of degraded sidelobe performance. The second step involves designing radar waveforms that possess good spectral containment properties by expanding the framework of Polyphase-coded Frequency Modulated (PCFM) waveforms to higher-order representations, so that these waveforms mitigate the interference experienced by other systems when both systems coexist within the same band.


CHENYUAN ZHAO

Energy Efficient Spike-Time-Dependent Encoder Design for Neuromorphic Computing System

When & Where:


250 Nichols Hall

Committee Members:

Yang Yi, Chair
Lingjia Liu
Luke Huan
Suzanne Shontz
Yong Zeng

Abstract

The von Neumann bottleneck, which refers to the limited throughput between the CPU and memory, has already become the major factor hindering the technical advances of computing systems. In recent years, neuromorphic systems have started to gain increasing attention as compact and energy-efficient computing platforms. As one of the most crucial components in neuromorphic computing systems, the neural encoder transforms the stimulus (input signals) into spike trains. In this report, I will present my research work on spike-time-dependent encoding schemes and the design of the corresponding energy-efficient encoders. A performance comparison among rate encoding, latency encoding, and temporal encoding will be discussed in this report. The proposed neural temporal encoder allows efficient mapping of signal amplitude information into a spike time sequence that represents the input data and offers perfect recovery for band-limited stimuli. The simulation and measurement results show that the proposed temporal encoder is robust and error-tolerant.


XIAOLI LI

Constructivism Learning: A Learning Paradigm for Transparent and Reliable Predictive Analytics

When & Where:


246 Nichols Hall

Committee Members:

Luke Huan, Chair
Victor Frost
Jerzy Grzymala-Busse
Bo Luo
Alfred Tat-Kei Ho

Abstract

With an increasing trend of adoption of machine learning in various real-world problems, the need for transparent and reliable models has become apparent. Especially in some socially consequential applications, such as medical diagnosis, credit scoring, and decision making in educational systems, it may be problematic if humans cannot understand and trust those models. To this end, in this work, we propose a novel machine learning algorithm, constructivism learning. To achieve transparency, we formalize a Bayesian nonparametric approach using a sequential Dirichlet Process Mixture of prediction models to support constructivism learning. To achieve reliability, we exploit two strategies: reducing model uncertainty and increasing task construction stability by leveraging techniques in active learning and self-paced learning.


JOSEPH ST. AMAND

Local Metric Learning

When & Where:


250 Nichols Hall

Committee Members:

Luke Huan, Chair
Prasad Kulkarni
Jim Miller
Richard Wang
Bozenna Pasik-Duncan

Abstract

Distance metrics are concerned with learning how objects are similar, and are a critical component of many machine learning algorithms such as k-nearest neighbors and kernel machines. Traditional metrics are unable to adapt to data with heterogeneous interactions in the feature space. State-of-the-art methods consider learning multiple metrics, each in some way local to a portion of the data. Selecting how the distance metrics are local to the data is done a priori, with no known best approach.

In this proposal, we address the local metric learning scenario from three complementary perspectives. In the first direction, we consider a spatial approach, and develop an efficient Frank-Wolfe-based technique to learn local distance metrics directly in a high-dimensional input space. We then consider a view-local perspective, where we associate each metric with a separate view of the data, and show how the approach naturally evolves into a multiple kernel learning problem. Finally, we propose a new function for learning a metric that is based on a newly discovered operator called the t-product. Here we show that our metric is composed of multiple parts, with each portion local to different interactions in the input space.


MARK GREBE

Domain Specific Languages for Small Embedded Systems

When & Where:


246 Nichols Hall

Committee Members:

Andy Gill, Chair
Perry Alexander
Prasad Kulkarni
Suzanne Shontz
Kyle Camarda

Abstract

Resource-limited embedded systems pose a great challenge to programming with functional languages. Although we cannot program these embedded systems directly with Haskell, we show that an embedded domain specific language can be used to program them, providing a user-friendly environment for both prototyping and full development. The Arduino line of microcontroller boards provides a versatile, low-cost, and popular platform for development of these resource-limited systems, and we use this as the platform for our DSL research.

First, we provide a shallowly embedded domain specific language and a firmware interpreter, allowing the user to program the Arduino while tethered to a host computer. Second, we add a deeply embedded version, allowing the interpreter to run standalone from the host computer, as well as allowing us to compile the code to C and then machine code for efficient operation. Finally, we develop a method of transforming the shallowly embedded DSL syntax into the deeply embedded DSL syntax automatically.


RUBAYET SHAFIN

Performance Analysis of Parametric Channel Estimation for 3D Massive MIMO/FD-MIMO OFDM Systems

When & Where:


250 Nichols Hall

Committee Members:

Lingjia Liu, Chair
Erik Perrins
Yang Yi


Abstract

With the promise of meeting future capacity demands for mobile broadband communications, 3D massive-MIMO/Full Dimension MIMO (FD-MIMO) systems have gained much interest among researchers in recent years. Apart from the huge spectral efficiency gain offered by the system, the reason for this great interest can also be attributed to significant reduction of latency, a simplified multiple access layer, and robustness to interference. However, in order to completely extract the benefits of massive-MIMO systems, accurate channel state information is critical. In this thesis, a channel estimation method based on direction of arrival (DoA) estimation is presented for massive-MIMO OFDM systems. To be specific, the DoA is estimated using the Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT) method, and the root mean square error (RMSE) of the DoA estimation is analytically characterized for the corresponding MIMO-OFDM system.


DANIEL HEIN

A New Approach for Predicting Security Vulnerability Severity in Attack Prone Software Using Architecture and Repository Mined Change Metrics

When & Where:


1 Eaton Hall

Committee Members:

Hossein Saiedian, Chair
Arvin Agah
Perry Alexander
Prasad Kulkarni
Nancy Mead

Abstract

Billions of dollars are lost every year to successful cyber attacks that are fundamentally enabled by software vulnerabilities. Modern cyber attacks increasingly threaten individuals, organizations, and governments, causing service disruption, inconvenience, and costly incident response. Given that such attacks are primarily enabled by software vulnerabilities, this work examines the efficacy of using change metrics, along with architectural burst and maintainability metrics, to predict modules and files that should be analyzed or tested further to excise vulnerabilities prior to release. 

The problem addressed by this research is the residual vulnerability problem, or vulnerabilities that evade detection and persist in released software. Many modern software projects are over a million lines of code, and composed of reused components of varying maturity. The sheer size of modern software, along with the reuse of existing open source modules, complicates the questions of where to look, and in what order to look, for residual vulnerabilities. 

Traditional code complexity metrics, along with newer frequency based churn metrics (mined from software repository change history), are selected specifically for their relevance to the residual vulnerability problem. We compare the performance of these complexity and churn metrics to architectural level change burst metrics, automatically mined from the git repositories of the Mozilla Firefox Web Browser, Apache HTTP Web Server, and the MySQL Database Server, for the purpose of predicting attack prone files and modules. 

We offer new empirical data quantifying the relationship between our selected metrics and the severity of vulnerable files and modules, assessed using severity data compiled from the NIST National Vulnerability Database, and cross-referenced to our study subjects using unique identifiers defined by the Common Vulnerabilities and Exposures (CVE) vulnerability catalog. Our results show that architectural level change burst metrics can perform well in situations where more traditional complexity metrics fail as reliable estimators of vulnerability severity. In particular, results from our experiments on Apache HTTP Web Server indicate that architectural level change burst metrics show high correlation with the severity of known vulnerable modules, and do so with information directly available from the version control repository change-set (i.e., commit) history.


CHENG GAO

Mining Incomplete Numerical Data Sets

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Arvin Agah
Bo Luo
Tyrone Duncan
Xuemin Tu

Abstract

Incomplete and numerical data are common for many application domains. There have been many approaches to handle missing data in statistical analysis and data mining. To deal with numerical data, discretization is crucial for many machine learning algorithms. However, most of the discretization algorithms cannot be applied to incomplete data sets. 

Multiple Scanning is an entropy-based discretization method. Previous research has shown that it outperforms the commonly used Equal Width and Equal Frequency discretization methods. In this work, Multiple Scanning is tested with C4.5 and MLEM2 on incomplete data sets. Results show that for some data sets the setup utilizing Multiple Scanning as preprocessing performs better, while for the other data sets C4.5 or MLEM2 should be used by themselves. Our conclusion is that there is no universally optimal solution for all data sets. The setup should be custom-made.
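
To make the two baseline methods named above concrete, here is a small sketch of Equal Width and Equal Frequency discretization for a single complete numeric attribute. It does not implement Multiple Scanning and does not handle missing values; the function names and the sample values are assumptions made for this example.

```python
def equal_width_cuts(values, k):
    """Cut points that split the range [min, max] into k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_cuts(values, k):
    """Cut points that put roughly the same number of cases in each of k intervals.

    Each cut is the midpoint between the two neighboring sorted values.
    Assumes len(values) >= k.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [(ordered[i * n // k - 1] + ordered[i * n // k]) / 2 for i in range(1, k)]


def discretize(value, cuts):
    """Index of the interval that a value falls into."""
    return sum(value > c for c in cuts)


values = [1, 2, 3, 4, 5, 6, 7, 100]           # hypothetical attribute with one outlier
width_cuts = equal_width_cuts(values, 2)
freq_cuts = equal_frequency_cuts(values, 2)
print("equal width:    ", width_cuts, [discretize(v, width_cuts) for v in values])
print("equal frequency:", freq_cuts, [discretize(v, freq_cuts) for v in values])
```

With the outlier present, Equal Width places seven of the eight cases in one interval, while Equal Frequency splits them four and four. This sensitivity to the value distribution is one reason such simple methods serve mainly as baselines.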


SUMIAH ALALWANI

Experiments on Incomplete Data Sets Using Modifications to Characteristic Relation

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Bo Luo


Abstract

Rough set theory is a useful approach for decision rule induction, which is applied to large real-life data sets. Lower and upper approximations of concepts are used to induce rules for incomplete data sets. In our research, we study the validity of modifications suggested to the characteristic relation. We discuss the implementation of modifications to the characteristic relation, and the local definability of each modified set. We show that none of the suggested modification sets are locally definable, except for maximal consistent blocks that are restricted to data sets with “do not care” conditions. A comparative analysis was conducted for characteristic sets and modifications in terms of the cardinality of lower and upper approximations of each concept and the decision rules induced by each modification. In this thesis, experiments were conducted on four incomplete data sets with lost and “do not care” conditions. The LEM2 algorithm was implemented to induce certain and possible rules from the incomplete data sets. To measure the average classification error rate for induced rules, ten-fold cross validation was implemented. Our results show that there is no significant difference between the quality of the rules induced from each modification.
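
The sketch below illustrates how lower and upper approximations of a concept can be computed once characteristic sets are available, using the concept-approximation definitions commonly given for incomplete data. The characteristic sets and the concept are hypothetical values supplied directly, not derived from a data set, and none of the modifications studied in the thesis are implemented.

```python
def concept_lower(char_sets, concept):
    """Union of the characteristic sets K(x), x in the concept, that lie inside the concept."""
    result = set()
    for x in concept:
        if char_sets[x] <= concept:
            result |= char_sets[x]
    return result


def concept_upper(char_sets, concept):
    """Union of the characteristic sets K(x), x in the concept, that intersect the concept."""
    result = set()
    for x in concept:
        if char_sets[x] & concept:
            result |= char_sets[x]
    return result


# Hypothetical characteristic sets K(x) for cases 1..5.
char_sets = {1: {1, 2}, 2: {1, 2}, 3: {3, 4}, 4: {4}, 5: {4, 5}}
concept = {1, 2, 3}                            # cases sharing one decision value

lower = concept_lower(char_sets, concept)
upper = concept_upper(char_sets, concept)
print("lower approximation:", sorted(lower))
print("upper approximation:", sorted(upper))
```

Certain rules are induced from the lower approximation and possible rules from the upper approximation, which is why the cardinalities of the two sets are a natural basis for comparison.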