Defense Notices


All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check, and post the presentation announcement online.

Upcoming Defense Notices

David Felton

Optimization and Evaluation of Physical Complementary Radar Waveforms

When & Where:


Nichols Hall, Room 129 (Apollo Auditorium)

Committee Members:

Shannon Blunt, Chair
Rachel Jarvis
Patrick McCormick
James Stiles
Zsolt Talata

Abstract

The RF spectrum is a precious, finite resource with ever-increasing demand. Consequently, the mandate to be a "good spectral neighbor" is in direct conflict with the requirements for high-performance sensing where correlation error is fundamentally limited. As such, matched-filter radar performance is often sidelobe-limited with estimation error being constrained by the time-bandwidth (TB) of the collective emission. The methods developed here seek to bridge this gap between idealized radar performance and practical utility via waveform design.    

Estimation error becomes more complex when employing pulse-agility. In doing so, range-sidelobe modulation (RSM) spreads energy across Doppler, rendering traditional methods ineffective. To address this, the gradient-based complementary-FM framework was developed to produce complementary sidelobe cancellation (CSC) after coherently combining subsets within a pulse-agile emission. In contrast to the majority of complementary signals, explored via phase-coding, these Comp-FM waveform subsets achieve CSC while preserving hardware-compatibility since they are FM (though design distortion is never completely avoided). Although Comp-FM addressed practicality via hardware amenability, CSC was localized to zero-Doppler. This work expands the Comp-FM notion to a Doppler-generalized (DG) framework, extending the cancellation condition to an arbitrary span. The same framework can likewise be employed to jointly optimize an entire coherent processing interval (CPI) to minimize RSM within the radar point-spread-function (PSF), thereby generalizing the notion of complementarity and introducing the potential for cognitive operation if sufficient scattering knowledge is available a priori.

Sensing with a single emitter is limited by self-inflicted error alone (e.g., clutter, sidelobes), while MIMO systems must additionally contend with the cross-responses from emitters operating concurrently (e.g., simultaneously, spatially proximate, in a shared spectrum), further degrading radar sensitivity. Now, total correlation error is dictated by the overlapping TB (i.e., how coincident are the signals) and number of operating emitters, compounding the estimation difficulty if left unaddressed. As such, the determination of "orthogonal waveforms" comprises a large portion of MIMO literature, though this remains a phenomenological misnomer for pulsed emissions. Here, the notion of complementary-FM is applied to a multi-emitter context in which transmitter-amenable quasi-orthogonal subsets, occupying the same spectral band, are produced via a similar gradient-based approach. To further practicalize these MIMO-Comp-FM waveform subsets, the same "DG" approach described above, addressing the otherwise-default Doppler-induced degradation of complementary signals, is applied. In doing so, Doppler-independent separability and complementarity greatly improve estimation sensitivity for multi-emitter systems.

This MIMO-Comp-FM framework is developed for standard matched filter processing. Coupling this framework with a "DG" form of the previously explored MIMO-MiCRFt is also investigated, illustrating the added benefit of pairing optimized subsets with similarly calibrated processing. 

Each of these methods is developed to address unique and increasingly complex sources of estimation error. All approaches are initially developed and evaluated via simulated analysis where ground-truth is known. Then, despite hardware-induced distortion being unavoidable, the MIMO-Comp-FM framework is confirmed via loopback measurements to preserve the majority of CSC that was observed in simulation. Finally, open-air demonstration of each approach validates practical utility on a radar system.


Hao Xuan

Toward an Integrated Computational Framework for Metagenomics: From Sequence Alignment to Automated Knowledge Discovery

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Cuncong Zhong, Chair
Fengjun Li
Suzanne Shontz
Hongyang Sun
Liang Xu

Abstract

Metagenomic sequencing has become a central paradigm for studying complex microbial communities and their interactions with the host, with emerging applications in clinical prediction and disease modeling. In this work, we first investigate two representative application scenarios: predicting immune checkpoint inhibitor response in non-small cell lung cancer using gut microbial signatures, and characterizing host–microbiome interactions in neonatal systems. The proposed reference-free neural network captures both compositional and functional signals without reliance on reference genomes, while the neonatal study demonstrates how environmental and genetic factors reshape microbial communities and how probiotic intervention can mitigate pathogen-induced immune activation.

These studies highlight both the promise and the inherent difficulty of metagenomic analysis: transforming raw sequencing data into clinically actionable insights remains an algorithmically fragmented and computationally intensive process. This challenge arises from two key limitations: the lack of a unified algorithmic foundation for sequence alignment and the absence of systematic approaches for selecting and organizing analytical tools. Motivated by these challenges, we present a unified computational framework for metagenomic analysis that integrates complementary algorithmic and systems-level solutions.

First, to resolve fragmentation at the alignment level, we develop the Versatile Alignment Toolkit (VAT), a unified algorithmic system for biological sequence alignment across diverse applications. VAT introduces an asymmetric multi-view k-mer indexing scheme that integrates multiple seeding strategies within a single architecture and enables dynamic seed-length adjustment via longest common prefix (LCP)–based inference without re-indexing. A flexible seed-chaining mechanism further supports diverse alignment scenarios, including collinear, rearranged, and split alignments. Combined with a hardware-efficient in-register bitonic sorting algorithm and dynamic index-loading strategy, VAT achieves high efficiency and broad applicability across read mapping, homology search, and whole-genome alignment. Second, to address the challenge of tool selection and pipeline construction, we develop SNAIL, a natural language processing system for automated recognition of bioinformatics tools from large-scale and rapidly growing scientific literature. By integrating XGBoost and Transformer-based models such as SciBERT, SNAIL enables structured extraction of analytical tools and supports automated, reproducible pipeline construction.

Together, this work establishes a unified framework that is grounded in real-world applications and addresses key bottlenecks in metagenomic analysis, enabling more efficient, scalable, and clinically actionable workflows.


Pramil Paudel

Learning Without Seeing: Privacy-Preserving and Adversarial Perspectives in Lensless Imaging

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Fengjun Li, Chair
Alex Bardas
Bo Luo
Cuncong Zhong
Haiyang Chao

Abstract

Conventional computer vision relies on spatially resolved, human-interpretable images, which inherently expose sensitive information and raise privacy concerns. In this study, we explore an alternative paradigm based on lensless imaging, where scenes are captured as diffraction patterns governed by the point spread function (PSF). Although unintelligible to humans, these measurements encode structured, distributed information that remains useful for computational inference. 

We propose a unified framework for privacy-preserving vision that operates directly on lensless sensor measurements by leveraging their frequency-domain and phase-encoded properties. The framework is developed along two complementary directions. First, we enable reconstruction-free inference by exploiting the intrinsic obfuscation of lensless data. We show that semantic tasks such as classification can be performed directly on diffraction patterns using models tailored to non-local, phase-scrambled representations. We further design lensless-aware architectures and integrate them into practical pipelines, including a Swin Transformer-based steganographic framework (DiffHide) for secure and imperceptible information embedding. To assess robustness, we formalize adversarial threat models and develop defenses against learning-based reconstruction attacks, particularly GAN-driven inversion. Second, we investigate the limits of privacy by studying the reconstructability of lensless measurements without explicit knowledge of the forward model. We develop learning-based reconstruction methods that approximate the inverse mapping and analyze conditions under which sensitive information can be recovered. Our results demonstrate that lensless measurements enable effective vision tasks without reconstruction, while providing a principled framework to evaluate and mitigate privacy risks. 


Past Defense Notices

VENKAT ANIRUDH YERRAPRAGADA

Comparison of Minimum Cost Perfect Matching Algorithms in solving the Chinese Postman Problem

When & Where:


2001B Eaton Hall

Committee Members:

Man Kong, Chair
Perry Alexander
Jerzy Grzymala-Busse


Abstract

The Chinese Postman Problem, also known as the Route Inspection Problem, is a famous arc routing problem in graph theory. In this problem, a postman has to deliver mail along the streets such that every street is visited at least once, and then return to his starting point. The problem is to find a route, called the optimal postman tour, that minimizes the total distance travelled while visiting every street at least once. In graph theory, we represent the street system as a weighted graph whose edges represent the streets and whose vertices represent the street intersections. A graph can be directed, undirected, or mixed. Directed and undirected edges represent one-way and two-way streets, respectively. A mixed graph has both directed and undirected edges.

The Chinese Postman Problem can be divided into several subproblems, of which finding the minimum cost perfect matching is the critical part. For a directed graph, the minimum cost perfect matching of a bipartite graph has to be computed. For an undirected graph, the minimum cost perfect matching of a general graph has to be computed. There are different matching algorithms to compute the minimum cost perfect matching efficiently. In this project, I have studied and implemented four matching algorithms used in computing an optimal postman tour: the Hungarian Algorithm and a Branch and Bound Algorithm for the directed graph, and Edmonds' Blossom Algorithm and a Branch and Bound Algorithm for the undirected graph. The objective of this project is to compare the performance of these matching algorithms on graphs of different sizes and densities.
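To make the role of the matching step concrete, here is a minimal Python sketch for the undirected case only. It is not the project's implementation: the function name `chinese_postman_cost` is hypothetical, and an exhaustive bitmask search stands in for the Blossom and Branch and Bound algorithms named in the abstract, so it is only practical when the graph has few odd-degree vertices. It assumes a connected graph with non-negative edge weights.

```python
from functools import lru_cache


def chinese_postman_cost(n, edges):
    """Length of an optimal postman tour on a connected undirected graph.

    n is the number of vertices (labelled 0..n-1); edges is a list of
    (u, v, weight) triples with non-negative weights.
    """
    inf = float("inf")
    dist = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    degree = [0] * n
    total = 0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
        degree[u] += 1
        degree[v] += 1
        total += w

    # All-pairs shortest paths (Floyd-Warshall).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]

    # Vertices of odd degree must be paired up; each pair costs the
    # shortest-path distance between its two endpoints.
    odd = [v for v in range(n) if degree[v] % 2 == 1]

    @lru_cache(maxsize=None)
    def best(mask):
        """Minimum cost of perfectly matching the odd vertices in mask."""
        if mask == 0:
            return 0
        i = (mask & -mask).bit_length() - 1  # lowest unmatched index
        rest = mask & ~(1 << i)
        result = inf
        for j in range(i + 1, len(odd)):
            if (rest >> j) & 1:
                cost = dist[odd[i]][odd[j]] + best(rest & ~(1 << j))
                result = min(result, cost)
        return result

    # Every edge is walked once, plus the duplicated matching paths.
    return total + best((1 << len(odd)) - 1)


# A square with one diagonal: vertices 0 and 2 have odd degree.
square = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1), (0, 2, 2)]
print(chinese_postman_cost(4, square))
```

The tour length is the sum of all edge weights plus the cost of the cheapest pairing of odd-degree vertices, which is why the quality of the matching algorithm dominates the running time on larger graphs.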


SRI MOUNICA MOTIPALLI

Analysis of Privacy Protection Mechanisms in Social Networks using the Social Circle Model

When & Where:


2001B Eaton Hall

Committee Members:

Bo Luo, Chair
Perry Alexander
Jerzy Grzymala-Busse


Abstract

Many online social networks are increasingly being used as information sharing platforms. With a massive increase in the number of users participating in information sharing, an enormous amount of information becomes available on such sites. It is vital to preserve users' privacy without preventing them from socializing. Unfortunately, many existing models overlook a very important fact: a user may prefer different information boundaries for different information. To address this shortcoming, in this paper, I introduce a ‘social circle’ model, which follows the concepts of ‘private information boundaries’ and ‘restricted access and limited control’. While facilitating socialization, the social circle model also provides some privacy protection capabilities. I then use this model to analyze the most popular social networks (such as Facebook, Google+, VKontakte, Flickr, and Instagram) and demonstrate the potential privacy vulnerabilities in some of these networking sites. Lastly, I discuss the implications of the analysis and possible future directions.


PEGAH NOKHIZ

Understanding User Behavior in Social Networks Using Quantified Moral Foundations

When & Where:


246 Nichols Hall

Committee Members:

Fengjun Li, Chair
Bo Luo
Cuncong Zhong


Abstract

Moral inclinations expressed in user-generated content such as online reviews or tweets can provide useful insights into users' behavior and activities in social networks, for example, to predict users' rating behavior, perform customer feedback mining, and study users' tendency to spread abusive content on these social platforms. In this work, we want to answer two important research questions. The first is whether the moral attributes of social network data can provide additional useful information about users' behavior, and how to utilize this information to enhance our understanding. To answer this question, we used Moral Foundations Theory and Doc2Vec, a Natural Language Processing technique, to compute the quantified moral loadings of user-generated textual content in social networks. We used conditional relative frequency and the correlations between the moral foundations as two measures to study the moral breakdown of the social network data, utilizing a dataset of Yelp reviews and a dataset of tweets on abusive user-generated content. Our findings indicated that these moral features are tightly bound with users' behavior in social networks. The second question is whether we can use the quantified moral loadings as new boosting features to improve the differentiation, classification, and prediction of social network activities. To test our hypothesis, we adopted our new moral features in a multi-class classification approach to distinguish hateful and offensive tweets in a labeled dataset, and compared the results with a baseline approach that uses only conventional text mining features such as tf-idf features and Part of Speech (PoS) tags. Our findings demonstrated that the moral features improved the performance of the baseline approach in terms of precision, recall, and F-measure.


MUSTAFA AL-QADI

Laser Phase Noise and Performance of High-Speed Optical Communication Systems

When & Where:


2001B Eaton Hall

Committee Members:

Ron Hui, Chair
Chris Allen
Victor Frost
Erik Perrins
Jie Han*

Abstract

The unending growth of data traffic resulting from the continuing emergence of high-data-rate-demanding applications sets huge capacity requirements on optical interconnects and transport networks. This requires optical communication schemes in these networks to make the best possible use of the available optical spectrum per optical channel to enable transmission of tens of terabits per second per fiber core in high-capacity transport networks. Therefore, advanced modulation formats are required to be used in conjunction with energy-efficient and robust transceiver schemes. Important challenges facing these goals are the stringent requirements on the characteristics of the optical components comprising these systems, especially the laser sources. Laser phase noise is one of the most important performance-limiting factors in systems with high spectral efficiency. In this research work, we study the effects of different laser phase noise characteristics on the performance of different optical communication schemes. A novel, simple and accurate phase noise characterization technique is proposed. Experimental results show that the proposed technique is very accurate in estimating the performance of lasers in coherent systems employing digital phase recovery techniques. A novel multi-heterodyne scheme for characterizing the phase noise of laser frequency comb sources is also proposed and validated by experimental results. This proposed scheme is the first of its type capable of measuring the differential phase noise between multiple spectral lines instantaneously in a single measurement. Moreover, extended relations between system performance and detailed characteristics of laser phase noise are also analyzed and modeled.
The results of this study show that the commonly-used metric to estimate the performance of lasers with a specific phase recovery scheme, linewidth-symbol-period product, is not necessarily accurate for all types of lasers, and description of FM-noise power spectral profile is required for accurate performance estimation. We also propose an energy- and cost-efficient transmission scheme suitable for metro and long-reach data-center-interconnect links based on direct detection of field-modulated optical signals with advanced modulation formats, allowing for higher spectral efficiency. The proposed system combines the Kramers-Kronig coherent receiver technique, with the use of quantum-dot multi-mode laser sources, to generate and transmit multi-channel optical signals using a single diode laser source. Experimental results of the proposed system show that high modulation formats can be employed, with high robustness against laser phase noise and frequency drifting.


MARK GREBE

Domain Specific Languages for Small Embedded Systems

When & Where:


250 Nichols Hall

Committee Members:

Andy Gill, Chair
Perry Alexander
Prasad Kulkarni
Suzanne Shontz
Kyle Camarda

Abstract

Resource-limited embedded systems pose a great challenge to programming with functional languages. Although these embedded systems cannot be programmed directly with Haskell, I show that an embedded domain specific language can be used to program them, and provides a user-friendly environment for both prototyping and full development. The Arduino line of microcontroller boards provides a versatile, low-cost and popular platform for development of these resource-limited systems, and I use these boards as the platform for my DSL research.

First, I provide a shallowly embedded domain specific language, and a firmware interpreter, allowing the user to program the Arduino while tethered to a host computer.  Shallow EDSLs allow a programmer to program using many of the features of a host language and its syntax, but sacrifice performance.  Next, I add a deeply embedded version, allowing the interpreter to run standalone from the host computer, as well as allowing the code to be compiled to C and then machine code for efficient operation.   Deep EDSLs provide better performance and flexibility, through the ability to manipulate the abstract syntax tree of the DSL program, but sacrifice syntactical similarity to the host language.   Using Haskino, my EDSL designed for Arduino microcontrollers, and a compiler plugin for the Haskell GHC compiler, I show a method for combining the best aspects of shallow and deep EDSLs. The programmer is able to write in the shallow EDSL, and have it automatically transformed into the deep EDSL.  This allows the EDSL user to benefit from powerful aspects of the host language, Haskell, while meeting the demanding resource constraints of the small embedded processing environment.

 


ALI ABUSHAIBA

Extremum Seeking Maximum Power Point Tracking for a Stand-Alone and Grid-Connected Photovoltaic Systems

When & Where:


Room 1 Eaton Hall

Committee Members:

Reza Ahmadi, Chair
Ken Demarest
Glenn Prescott
Alessandro Salandrino
Prajna Dhar*

Abstract

Efforts to harvest solar energy more efficiently have sparked interest in many communities in developing more energy harvesting applications for renewable energy. Advanced technical methods are required to ensure the maximum available power is harnessed from the photovoltaic (PV) system. This dissertation proposes a new discrete-in-time extremum-seeking (ES) based technique for tracking the maximum power point of a photovoltaic array. The proposed method is a true maximum power point tracker that can be implemented with reasonable processing effort on an inexpensive digital controller. The dissertation presents a stability analysis of the proposed method to guarantee the convergence of the algorithm.

Two types of PV systems were designed, and a comprehensive control design framework was considered for both a stand-alone system and a three-phase grid-connected system.

Grid-tied systems commonly have a two-stage power electronics interface, which is necessitated by the inherent limitation of the DC-AC (inverter) power conversion stage. However, a one-stage converter topology, denoted the Quasi-Z-source inverter (q-ZSI), was selected to interface the PV panel; it overcomes the inverter limitations to harvest the maximum available power.

A powerful control scheme called Model Predictive Control with Finite Set (MPC-FS) was designed to control the grid-connected system. Predictive control was selected to achieve a robust controller with superior dynamic response, used in conjunction with the extremum-seeking algorithm to enhance the system behavior.

The proposed method exhibited better performance than conventional Maximum Power Point Tracking (MPPT) methods and requires less computational effort than the complex mathematical methods.


JUSTIN DAWSON

The Remote Monad

When & Where:


246 Nichols Hall

Committee Members:

Andy Gill, Chair
Perry Alexander
Prasad Kulkarni
Bo Luo
Kyle Camarda

Abstract

Remote Procedure Calls are an integral part of the internet of things and cloud computing. However, remote procedures, by their very nature, have an expensive overhead cost of a network round trip. There have been many optimizations to amortize the network overhead cost, including asynchronous remote calls and batching requests together.

In this dissertation, we present a principled way to batch procedure calls together, called the Remote Monad. The support for monadic structures in languages such as Haskell can be utilized to build a staging mechanism for chains of remote procedures. Our specific formulation of remote monads uses natural transformations to make modular and composable network stacks which can automatically bundle requests into packets by breaking up monadic actions into ideal packets. By observing the properties of these primitive operations, we can leverage a number of tactics to maximize the size of the packets.

We have created a framework that has been successfully used to implement the industry-standard JSON-RPC protocol, a graphical browser-based library, an efficient byte string implementation, a library to communicate with an Arduino board, and database queries, all of which have automatic bundling enabled. We demonstrate that, given a user-supplied packet transportation mechanism, bundling for remote monads can be implemented at almost no additional cost.
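The bundling idea can be illustrated with a small Python sketch. This is only a loose analogy, not the dissertation's Haskell formulation: it has no monads or natural transformations, and every name in it (`BatchingClient`, `command`, `procedure`, `make_fake_server`) is hypothetical. It assumes that calls split into result-free commands, which can be queued, and procedures, whose result is needed immediately and which therefore force the queue to be sent.

```python
class BatchingClient:
    """Queues result-free commands and ships them with the next procedure."""

    def __init__(self, send_packet):
        # send_packet is a hypothetical user-supplied transport: it takes a
        # list of (name, args) calls and returns a list of their results.
        self._send_packet = send_packet
        self._pending = []

    def command(self, name, *args):
        """Call whose result is not needed: queue it, send nothing yet."""
        self._pending.append((name, args))

    def procedure(self, name, *args):
        """Call whose result is needed: send the queue and this call together."""
        packet = self._pending + [(name, args)]
        self._pending = []
        return self._send_packet(packet)[-1]


def make_fake_server():
    """In-process stand-in for a remote service that counts round trips."""
    state = {"total": 0, "packets": 0}

    def send_packet(calls):
        state["packets"] += 1  # one network round trip per packet
        results = []
        for name, args in calls:
            if name == "add":
                state["total"] += args[0]
                results.append(None)
            elif name == "get":
                results.append(state["total"])
            else:
                raise ValueError(f"unknown call: {name}")
        return results

    return state, send_packet


state, transport = make_fake_server()
client = BatchingClient(transport)
client.command("add", 1)
client.command("add", 2)
print(client.procedure("get"), state["packets"])
```

Three calls are made but only one packet is sent, which is the saving that bundling aims for; the transport itself is left entirely to the caller, mirroring the user-supplied packet transportation mechanism mentioned in the abstract.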


JOSEPH St AMAND

Learning to Measure: Distance Metric Learning with Structured Sparsity

When & Where:


246 Nichols Hall

Committee Members:

Arvin Agah, Chair
Prasad Kulkarni
Jim Miller
Richard Wang
Bozenna Pasik-Duncan*

Abstract

Many important machine learning and data mining algorithms rely on a measure to provide a notion of distance or dissimilarity. Naive metrics such as the Euclidean distance are incapable of leveraging task-specific information, and consider all features as equal. A learned distance metric can become much more effective by honing in on structure specific to a task. Additionally, it is often extremely desirable for a metric to be sparse, as this vastly increases the ability to interpret the distance metric. In this dissertation, we explore several current problems in distance metric learning and put forth solutions which make use of structured sparsity.

The first contribution of this dissertation begins with a classic approach in distance metric learning and addresses a scenario where distance metric learning is typically inapplicable, i.e., the case of learning on heterogeneous data in a high-dimensional input space. We construct a projection-free distance metric learning algorithm which utilizes structured sparse updates and successfully demonstrate its application to learn a metric with over a billion parameters.

The second contribution of this dissertation focuses on an intriguing regression-based approach to distance metric learning. Under this regression approach there are two sets of parameters to learn: those which parameterize the metric, and those defining the so-called "virtual points". We begin with an exploration of the metric parameterization and develop a structured sparse approach to robustify the metric to noisy, corrupted, or irrelevant data. We then focus on the virtual points and develop a new method for learning the metric and constraints simultaneously. We demonstrate empirically that our approach results in a distance metric which is more effective than the current state of the art.

Machine learning algorithms have recently become ingrained in an incredibly diverse range of technologies. The focus of this dissertation is to develop more effective techniques to learn a distance metric. We believe that this work has the potential for broad-reaching impacts, as learning a more effective metric typically results in more accurate metric-based machine learning algorithms.

 


SHIVA RAMA VELMA

An Implementation of the LEM2 Algorithm Handling Numerical Attributes

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Perry Alexander
Prasad Kulkarni


Abstract

Data mining is the computing process of finding meaningful patterns in large sets of data. These patterns are then analyzed and used to make predictions. One form of data mining is to extract rules from data sets. There are various rule induction algorithms, such as LEM1 (Learning from Examples Module Version 1), LEM2 (Learning from Examples Module Version 2), and MLEM2 (Modified Learning from Examples Module Version 2). Most rule induction algorithms require input data with only discretized attributes. If the input data contains numerical attributes, we need to convert them into discrete values (intervals) before performing rule induction; this process is called discretization. In this project, we discuss an implementation of LEM2 that generates rules from data with numerical and symbolic attributes. The accuracy of the rules generated by LEM2 is measured by computing the error rate with a program called rule checker, using ten-fold cross-validation and holdout methods.
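As a small illustration of discretization, the Python sketch below turns a numerical attribute into intervals. The abstract does not say which scheme the project uses, so this is an assumption: it places a cut point midway between each pair of consecutive distinct values, one common simple choice. The function names `cutpoints` and `interval_index` are hypothetical.

```python
from bisect import bisect_right


def cutpoints(values):
    """Midpoints between consecutive distinct values, in ascending order."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def interval_index(value, cuts):
    """Index of the interval that value falls into, given sorted cut points.

    Interval 0 is everything below the first cut point, interval 1 lies
    between the first and second cut points, and so on.
    """
    return bisect_right(cuts, value)


temperatures = [4.0, 6.0, 4.0, 8.0]
cuts = cutpoints(temperatures)
print(cuts)
print([interval_index(t, cuts) for t in temperatures])
```

After this step each numerical value is replaced by an interval label, so a rule induction algorithm that expects symbolic attributes can treat the attribute like any other.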


SURYA NIMMAKAYALA

Heuristics to Predict and Eagerly Translate Code in DBTs

When & Where:


250 Nichols Hall

Committee Members:

Prasad Kulkarni, Chair
Perry Alexander
Fengjun Li
Bo Luo
Shawn Keshmiri*

Abstract

Dynamic Binary Translators (DBTs) have a variety of uses, such as instrumentation, profiling, security, and portability. For an application to run with these additional features (not originally part of its design), it must be run under the control of a Dynamic Binary Translator. The application can be thought of as the guest application, run within the controlled environment of the translator, which is the host application. That way, the intended execution flow can be enforced by the translator, thereby inducing the desired behavior in the application on the host platform (the combination of operating system and hardware). Depending on the implementation of the translator, the guest application can have code compiled either for the host platform or for a different platform. It is the responsibility of the translator to translate the guest application's code appropriately so that it runs on the host platform.

However, the translator incurs a run-time overhead when performing the additional tasks needed to run the guest application in a controlled fashion. This run-time overhead has limited the use of DBTs at large scale, where response times can be critical. There is often a trade-off between the benefits of using a DBT and the overall application response time. So, there is a need to explore ways of executing applications faster through DBTs (given their large code bases).

With the evolution of multi-core and GPU hardware architectures, software can be parallelized through multiple threads, which run parts of the code concurrently and can potentially do more work in the same time. Properly designing parallel applications, or parallelizing parts of existing code, can lead to faster application run times by taking advantage of hardware support for parallel programs.

We explore the possibility of improving the performance of a DBT named DynamoRIO. The basic idea is to speed up guest code translation by having multiple threads translate multiple pieces of code concurrently. In the ideal case, all the code blocks required for application execution would be available ahead of time (eager translation), without any wait or overhead at run time, while still providing the enhanced features of the DBT. Efficient run-time eager translation also needs heuristics to better predict the next code block likely to be executed, which could reduce the number of unproductive code translations at run time. The goal is to speed up applications through eager translation coupled with block prediction heuristics, leading to an execution time close to that of a native run.