Defense Notices
All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.
Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check, and post the presentation announcement online.
Upcoming Defense Notices
Luke Staudacher
Enabling Versal-Based Signal Processing Through a Development Framework and User Guide
When & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Jonathan Owen, Chair
Shannon Blunt
Carl Leuschen
Erik Perrins
Abstract
AMD’s latest generation of adaptive system-on-chip (SoC) devices, the Versal product family, offers enhanced processing capabilities that are attractive to researchers and system designers. However, these capabilities introduce a significant knowledge barrier, limiting the practical benefits of Versal devices compared to more mature platforms from AMD, Intel, and other industry vendors. This project addresses this challenge through two primary deliverables: a software framework and a comprehensive user manual targeting Versal development. The software framework, named RSL Versal Core, provides a starting point for users unfamiliar with Versal devices by selectively abstracting away the more complex design components. Using a small set of commands, users can synthesize a programmable logic (PL) design, compile a Linux operating system for the onboard Arm processor with PL communication support, and program supported development boards. Following initial setup, the framework also supports extended software and firmware development for specific project needs. The accompanying user manual documents both RSL Versal Core and broader Versal development concepts. It guides users through reproducing and customizing the framework outputs manually and introduces key architectural and design principles useful for effective Versal-based system development. Together, these deliverables enable new developers to rapidly gain proficiency with Versal platforms and to implement digital signal processing (DSP) concepts.
William Powers
Implementation and Analysis of Robust System-Informed Waveform Design
When & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Jonathan Owen, Chair
Shannon Blunt
Carl Leuschen
Abstract
Due to rapid advances in high-speed analog-to-digital conversion and software-defined architectures, modern radar systems increasingly shift signal generation and conditioning into the digital domain. These architectures enable high-fidelity signal capture and provide substantial flexibility in waveform synthesis and signal processing that was previously impractical in analog implementations. Despite these advances, however, achievable radar performance remains fundamentally constrained by the physical transmit hardware through which the signal is ultimately realized. Nonlinear amplification, finite bandwidth, and memory effects introduce distortion that creates a significant gap between idealized waveform design and the waveform that is physically radiated.
To address this limitation, this work proposes a system-aware radar waveform design framework that couples data-driven system identification with deterministic optimization to generate waveforms tailored to the underlying transmit hardware. A complex baseband memory polynomial model is developed to characterize nonlinear transmit-chain behavior using loopback measurements, where $\ell_1$-regularized LASSO estimation is employed to improve robustness against ill-conditioning and feature redundancy. Under this architecture, a generalized integrated sidelobe level (GISL) objective is reformulated using logarithmic scalarization to produce a numerically stable and Pareto-tunable optimization criterion capable of balancing output energy and sidelobe suppression. Additionally, efficient vectorized gradient expressions are derived using Wirtinger calculus and implemented using gradient-based descent and the limited-memory BFGS algorithm for practical high-dimensional waveform synthesis.
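As a rough illustration of the memory polynomial model mentioned above, the sketch below builds the basis terms x[n-m]·|x[n-m]|^(k-1) and fits their coefficients from input/output samples. It is not the author's implementation: the function names, the choice of odd orders, and the memory depth are assumptions, and ordinary least squares is substituted for the $\ell_1$-regularized LASSO estimation described in the abstract.

```python
import numpy as np

def memory_polynomial_basis(x, orders=(1, 3, 5), memory=3):
    """Build the regression matrix of a complex baseband memory polynomial.

    Each column is x[n-m] * |x[n-m]|**(k-1) for one delay m and one order k.
    The orders and memory depth here are illustrative assumptions.
    """
    x = np.asarray(x, dtype=complex)
    columns = []
    for m in range(memory):
        # Delay the input by m samples, zero-padding the start.
        delayed = np.concatenate([np.zeros(m, dtype=complex), x[:len(x) - m]])
        for k in orders:
            columns.append(delayed * np.abs(delayed) ** (k - 1))
    return np.stack(columns, axis=1)

def fit_memory_polynomial(x, y, orders=(1, 3, 5), memory=3):
    """Estimate coefficients by ordinary least squares (a stand-in for LASSO)."""
    basis = memory_polynomial_basis(x, orders, memory)
    coeffs, *_ = np.linalg.lstsq(basis, np.asarray(y, dtype=complex), rcond=None)
    return coeffs
```

In a loopback setting, `x` would be the digital waveform sent to the transmit chain and `y` the captured output; a sparse estimator such as LASSO would replace the least-squares step when the basis columns are highly redundant.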
To validate the framework, a comprehensive hardware-in-the-loop testbench was developed supporting direct model identification and experimental evaluation of optimized waveform performance. Simulation and experimental results demonstrate that continuous-phase FM waveforms exhibit strong inherent robustness to nonlinear distortion, while phase-coded waveforms with large instantaneous phase discontinuities show significantly greater sensitivity to transmit-chain impairments. Across both waveform classes, the proposed framework achieves substantial improvements in output power efficiency and pulse compression performance relative to system-agnostic waveform design. These results demonstrate that transmitter constraints must be treated as fundamental design variables rather than secondary effects and establish system-aware optimization as a practical framework for next-generation radar waveform synthesis.
Cody Gish
Real-time GPU Based Arbitrary Waveform Generation Utilizing a Software-Defined Radar Platform
When & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Jonathan Owen, Chair
Shannon Blunt
Patrick McCormick
Abstract
Due to the ever-growing demand for access to the finite resources of the electromagnetic spectrum, significant effort has been directed toward improving spectrum utilization. This has become a particular challenge in radar transmission design, where waveform diversity techniques have emerged as a promising solution despite the accompanying implementation complexity. Diverse signals are inherently non-repeating and pose unique challenges in comparison to traditional radar waveforms. Software-defined radios (SDRs) allow traditional RF components and signal processing to be implemented and controlled in software rather than hardware, providing a platform for testing experimental radar algorithms. This thesis presents a real-time parallel implementation of five distinct, previously developed waveform-diverse radar signals for use in a coherent SDR system. The implemented waveforms include stochastic waveform generation (StoWGe), multi-user radar communication (MURC), phase-attached radar communication (PARC), pseudo-random optimized frequency modulation (PRO-FM), and waveform recycling. To enable real-time generation at maximum SDR data rates, these waveforms are implemented using digital synthesis techniques via GPU parallel processing. This approach alleviates CPU resource limitations by offloading computationally intensive waveform generation tasks to the GPU, enabling continuous high-throughput operation. A custom asynchronous transmit and receive architecture is developed to integrate these GPU-accelerated waveforms with UHD-based SDR hardware. The system leverages a multithreaded framework that can sustain coherent and synchronized radar operation. To validate the system, a series of loopback tests across all waveforms and a variety of parameters is completed to confirm the execution of the generate-transmit-receive chain.
David Felton
Optimization and Evaluation of Physical Complementary Radar Waveforms
When & Where:
Nichols Hall, Room 129 (Apollo Auditorium)
Committee Members:
Shannon Blunt, Chair
Rachel Jarvis
Patrick McCormick
James Stiles
Zsolt Talata
Abstract
The RF spectrum is a precious, finite resource with ever-increasing demand. Consequently, the mandate to be a "good spectral neighbor" is in direct conflict with the requirements for high-performance sensing where correlation error is fundamentally limited. As such, matched-filter radar performance is often sidelobe-limited with estimation error being constrained by the time-bandwidth (TB) of the collective emission. The methods developed here seek to bridge this gap between idealized radar performance and practical utility via waveform design.
Estimation error becomes more complex when employing pulse-agility. In doing so, range-sidelobe modulation (RSM) spreads energy across Doppler, rendering traditional methods ineffective. To address this, the gradient-based complementary-FM framework was developed to produce complementary sidelobe cancellation (CSC) after coherently combining subsets within a pulse-agile emission. In contrast to the majority of complementary signals, explored via phase-coding, these Comp-FM waveform subsets achieve CSC while preserving hardware-compatibility since they are FM (though design distortion is never completely avoided). Although Comp-FM addressed practicality via hardware amenability, CSC was localized to zero-Doppler. This work expands the Comp-FM notion to a Doppler-generalized (DG) framework, extending the cancellation condition to an arbitrary span. The same framework can likewise be employed to jointly optimize an entire coherent processing interval (CPI) to minimize RSM within the radar point-spread-function (PSF), thereby generalizing the notion of complementarity and introducing the potential for cognitive operation if sufficient scattering knowledge is available a priori.
Sensing with a single emitter is limited by self-inflicted error alone (e.g., clutter, sidelobes), while MIMO systems must additionally contend with the cross-responses from emitters operating concurrently (e.g., simultaneously, spatially proximate, in a shared spectrum), further degrading radar sensitivity. Now, total correlation error is dictated by the overlapping TB (i.e., how coincident the signals are) and the number of operating emitters, compounding the difficulty of estimation if left unaddressed. As such, the determination of "orthogonal waveforms" comprises a large portion of MIMO literature, though the term remains a phenomenological misnomer for pulsed emissions. Here, the notion of complementary-FM is applied to a multi-emitter context in which transmitter-amenable quasi-orthogonal subsets, occupying the same spectral band, are produced via a similar gradient-based approach. To further practicalize these MIMO-Comp-FM waveform subsets, the same "DG" approach described above, addressing the otherwise-default Doppler-induced degradation of complementary signals, is applied. In doing so, Doppler-independent separability and complementarity greatly improve estimation sensitivity for multi-emitter systems.
This MIMO-Comp-FM framework is developed for standard matched filter processing. Coupling this framework with a "DG" form of the previously explored MIMO-MiCRFt is also investigated, illustrating the added benefit of pairing optimized subsets with similarly calibrated processing.
Each of these methods is developed to address unique and increasingly complex sources of estimation error. All approaches are initially developed and evaluated via simulated analysis where ground-truth is known. Then, despite hardware-induced distortion being unavoidable, the MIMO-Comp-FM framework is confirmed via loopback measurements to preserve the majority of CSC that was observed in simulation. Finally, open-air demonstration of each approach validates practical utility on a radar system.
Hao Xuan
Toward an Integrated Computational Framework for Metagenomics: From Sequence Alignment to Automated Knowledge Discovery
When & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Cuncong Zhong, Chair
Fengjun Li
Suzanne Shontz
Hongyang Sun
Liang Xu
Abstract
Metagenomic sequencing has become a central paradigm for studying complex microbial communities and their interactions with the host, with emerging applications in clinical prediction and disease modeling. In this work, we first investigate two representative application scenarios: predicting immune checkpoint inhibitor response in non-small cell lung cancer using gut microbial signatures, and characterizing host–microbiome interactions in neonatal systems. The proposed reference-free neural network captures both compositional and functional signals without reliance on reference genomes, while the neonatal study demonstrates how environmental and genetic factors reshape microbial communities and how probiotic intervention can mitigate pathogen-induced immune activation.
These studies highlight both the promise and the inherent difficulty of metagenomic analysis: transforming raw sequencing data into clinically actionable insights remains an algorithmically fragmented and computationally intensive process. This challenge arises from two key limitations: the lack of a unified algorithmic foundation for sequence alignment and the absence of systematic approaches for selecting and organizing analytical tools. Motivated by these challenges, we present a unified computational framework for metagenomic analysis that integrates complementary algorithmic and systems-level solutions.
First, to resolve fragmentation at the alignment level, we develop the Versatile Alignment Toolkit (VAT), a unified algorithmic system for biological sequence alignment across diverse applications. VAT introduces an asymmetric multi-view k-mer indexing scheme that integrates multiple seeding strategies within a single architecture and enables dynamic seed-length adjustment via longest common prefix (LCP)–based inference without re-indexing. A flexible seed-chaining mechanism further supports diverse alignment scenarios, including collinear, rearranged, and split alignments. Combined with a hardware-efficient in-register bitonic sorting algorithm and dynamic index-loading strategy, VAT achieves high efficiency and broad applicability across read mapping, homology search, and whole-genome alignment. Second, to address the challenge of tool selection and pipeline construction, we develop SNAIL, a natural language processing system for automated recognition of bioinformatics tools from large-scale and rapidly growing scientific literature. By integrating XGBoost and Transformer-based models such as SciBERT, SNAIL enables structured extraction of analytical tools and supports automated, reproducible pipeline construction.
Together, this work establishes a unified framework that is grounded in real-world applications and addresses key bottlenecks in metagenomic analysis, enabling more efficient, scalable, and clinically actionable workflows.
Past Defense Notices
RAKSHA GANESH
Structured-Irregular Repeat Accumulate Codes
When & Where:
250 Nichols Hall
Committee Members:
Erik Perrins, Chair
Shannon Blunt
Ron Hui
Abstract
There is a strong need for efficient and reliable communication systems in the present-day context. To design an efficient transmission system, the errors that occur during transmission should be minimized. This can be achieved by channel encoding. Irregular repeat-accumulate (IRA) codes are a class of serially concatenated codes that have a linear encoding algorithm, flexibility in code rate and code length, and good performance.
Here we implement a design technique for Structured Irregular Repeat-Accumulate (S-IRA) codes. S-IRA codes can be decoded reliably using the iterative log-likelihood decoding (sum-product) algorithm at low error rates. We perform encoding, decoding, and performance analysis of S-IRA codes of different code rates and codeword lengths and compare their performance on the AWGN channel. In this project we also design codes with different column weights for the parity-check matrices and compare their performance on the AWGN channel with that of the already designed codes.
MADHURI MORLA
Effect of SOA Nonlinearities on CO-OFDM System
When & Where:
2001B Eaton Hall
Committee Members:
Ron Hui, Chair
Victor Frost
Erik Perrins
Abstract
The use of a Semiconductor Optical Amplifier (SOA) for amplification in Coherent Optical-Orthogonal Frequency Division Multiplexing (CO-OFDM) systems has been of interest in recent studies. The gain saturation of the SOA induces inter-channel crosstalk. This effect is analyzed by simulation and compared with some recent experimental results. Performance of the optical transmission system is measured using Error Vector Magnitude (EVM), which is a measure of the deviation of received symbols from their ideal positions in the constellation diagram. EVM as a function of input power to the SOA is investigated. In the linear region, EVM improves as input power increases because of the increase in Optical Signal to Noise Ratio (OSNR). In the nonlinear region, increasing the input optical power to the SOA degrades EVM because of the nonlinear saturation of the SOA. The effect of gain saturation on EVM as a function of the number of subcarriers is also investigated.
The relation between different evaluation metrics such as Bit Error Rate (BER), SNR, and EVM is also presented. EVM is analytically estimated from OSNR by considering the ideal case of additive white Gaussian noise (AWGN) without nonlinearities. BER is estimated from the analytical and simulated EVM. The role of Peak to Average Power Ratio (PAPR) in the degradation of EVM in the nonlinear region is also studied through numerical simulation.
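For readers unfamiliar with the metric, a minimal sketch of how RMS EVM can be computed from received and ideal constellation points is shown below, together with the common AWGN approximation EVM ≈ 1/√SNR. The function names are hypothetical, and the use of a per-symbol SNR (rather than the OSNR used in the abstract, which also depends on the bandwidth definitions) is an assumption.

```python
import numpy as np

def evm_rms(received, ideal):
    """RMS error vector magnitude, normalized to average ideal symbol power."""
    received = np.asarray(received, dtype=complex)
    ideal = np.asarray(ideal, dtype=complex)
    error_power = np.mean(np.abs(received - ideal) ** 2)
    reference_power = np.mean(np.abs(ideal) ** 2)
    return np.sqrt(error_power / reference_power)

def evm_from_snr_db(snr_db):
    """Approximate EVM for an AWGN-only link: EVM = 1 / sqrt(SNR)."""
    snr_linear = 10.0 ** (snr_db / 10.0)
    return 1.0 / np.sqrt(snr_linear)
```

Under this approximation a 20 dB SNR corresponds to an EVM of about 10%; nonlinear effects such as SOA gain saturation would push the measured EVM above this noise-only value.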
SAMBHAV SETHIA
Sentiment Analysis on Wikipedia People Pages Using Enhanced Naive Bayes Model
When & Where:
246 Nichols Hall
Committee Members:
Fengjun Li, Chair
Bo Luo
Jerzy Grzymala-Busse
Prasad Kulkarni
Abstract
Sentiment analysis involves capturing the viewpoint or opinion expressed by people on various objects. These objects are a diverse set of things such as a movie, an article, a person of interest, or a product: basically anything about which we can hold an opinion. The opinions that are expressed can take different forms, such as a review of a movie, feedback on a product, an article in a newspaper expressing the sentiment of the author on a given topic, or even a Wikipedia page on a person. The key challenge of sentiment analysis is to classify the underlying text into the correct class, i.e., positive, negative, or neutral. Sentiment analysis also deals with the computational treatment of opinion, sentiment, and subjectivity in a text.
Wikipedia provides a large repository of pages about people around the world. This project conducts a large-scale experiment using one of the popular sentiment analysis tools, which is modeled on an enhanced version of Naïve Bayes. A sentence-by-sentence sentiment analysis is done for each biographical page retrieved from Wikipedia. The overall sentiment for a person is then calculated by averaging the sentiment values of all the sentences related to that person. This type of analysis has several advantages. First, the results are expressed on a decimal scale, which provides a clearer distinction in the sentiment value associated with an individual than the standard three-class result provided by the tool (positive, negative, or neutral). Second, it allows us to understand statistically the viewpoint of Wikipedia on those people. Finally, this project enables us to perform large-scale temporal and geographical analysis, e.g., examining the overall sentiment associated with the people of each state, and thus helps us analyze opinion trends.
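The averaging step described above can be sketched as follows. This is only an illustration under stated assumptions: the per-sentence scores are taken as already produced by some sentiment tool on a numeric scale (here -1 to 1), and the function names and the state grouping structure are hypothetical.

```python
from statistics import mean

def person_sentiment(sentence_scores):
    """Average the per-sentence sentiment values for one biographical page."""
    if not sentence_scores:
        raise ValueError("at least one sentence score is required")
    return mean(sentence_scores)

def average_by_group(person_scores_by_group):
    """Average per-person scores within each group (for example, each state)."""
    return {group: mean(scores)
            for group, scores in person_scores_by_group.items()}
```

For example, `person_sentiment([1.0, 0.0, -0.5, 0.5])` gives 0.25, a finer-grained value than a single positive, negative, or neutral label for the whole page.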
XIAOMENG SU
A comparison of the Quality of Rule Induction from Inconsistent Data sets and Incomplete Data Sets
When & Where:
246 Nichols Hall
Committee Members:
Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Zongbo Wang
Abstract
In data mining, decision rules induced from known examples are used to classify unseen cases. There are various rule induction algorithms, such as LEM1 (Learning from Examples Module version 1), LEM2 (Learning from Examples Module version 2) and MLEM2 (Modified Learning from Examples Module version 2). In the real world, many data sets are imperfect, either inconsistent or incomplete. The idea of lower and upper approximations, or more generally, the probabilistic approximation, provides an effective way to induce rules from inconsistent data sets and incomplete data sets. However, the accuracy of rule sets induced from imperfect data sets is expected to be lower. The objective of this project is to investigate which kind of imperfect data set (inconsistent or incomplete) is worse in terms of the quality of the induced rule set. In this project, experiments were conducted on eight inconsistent data sets and eight incomplete data sets with lost values. We implemented the MLEM2 algorithm to induce certain and possible rules from inconsistent data sets, and implemented the local probabilistic version of the MLEM2 algorithm to induce certain and possible rules from incomplete data sets. A program called Rule Checker was also developed to classify unseen cases with induced rules and measure the classification error rate. Ten-fold cross validation was carried out and the average error rate was used as the criterion for comparison. The Mann-Whitney nonparametric test was performed to compare, separately for certain and possible rules, incompleteness with inconsistency. The results show that there is no significant difference between inconsistent and incomplete data sets in terms of the quality of rule induction.
SIVA PRAMOD BOBBILI
Static Disassembly of Binary using Symbol Table Information
When & Where:
250 Nichols Hall
Committee Members:
Prasad Kulkarni, Chair
Andy Gill
Jerzy Grzymala-Busse
Abstract
Static binary analysis is an important challenge with many applications in security and performance optimization. One of the main challenges with analyzing an executable file statically is to discover all the instructions in the binary executable. It is often difficult to discover all program instructions due to a well-known limitation in static binary analysis, called the code discovery problem. Some of the main contributors to the code discovery problem are variable length CISC instructions, data interspersed with code, padding bytes for branch target alignment and indirect jumps. All these problems manifest themselves in x86 binary files, which is unfortunate since x86 is the most popular architecture format in desktop and server domains.
Although much recent research has stated that the symbol table might help overcome the difficulties of code discovery, the extent to which it can actually help is still in question. This work focuses on assessing the benefit of using symbol table information to overcome the limitations of the code discovery problem and identify more or all instructions in x86 binary executable files. We will discuss the details, extent, limitations, and impact of instruction discovery with and without symbol table information in this work.
JONATHAN LUTES
SafeExit: Exit Node Protection for TOR
When & Where:
2001B Eaton Hall
Committee Members:
Bo Luo, Chair
Arvin Agah
Prasad Kulkarni
Abstract
TOR is one of the most important networks for providing anonymity over the internet. However, in some cases its exit node operators open themselves up to various legal challenges, a fact which discourages participation in the network. In this paper, we propose a mechanism for allowing some users to be voluntarily verified by trusted third parties, providing a means by which an exit node can verify that they are not the true source of traffic. This is done by extending TOR’s anonymity model to include another class of user, and using a web of trust mechanism to create chains of trust.
KAVYASHREE PILAR
Digital Down Conversion and Compression of Radar Data
When & Where:
317 Nichols Hall
Committee Members:
Carl Leuschen, Chair
Shannon Blunt
Glenn Prescott
Abstract
Storing and handling the large volume of received data samples is one of the major challenges in radar system design. Radar data samples potentially have high temporal and spatial correlation, depending on the target characteristics and radar settings. This correlation can be exploited to compress them without any loss of sensitivity in post-processed products. This project focuses on reducing the storage requirement of a radar used for remote sensing of ice sheets. At the front end of the radar receiver, the data sample rate can be reduced in real time by performing frequency down-conversion and decimation of the incoming data. The decimated signal can be further compressed by applying a suitable data compression algorithm. The project implements a digital down-converter, a decimator, and a data compression module on an FPGA. A literature survey suggests that several research efforts are developing customized radar data compression algorithms. This project analyzes the possibility of using general-purpose algorithms such as GZIP and JPEG 2000 (lossless) to compress radar data. It also considers a simple floating-point compression technique that converts 16-bit data to 8-bit data, guaranteeing a 50% reduction in data size. The project implements the 16-to-8-bit conversion, JPEG 2000 lossless, and GZIP algorithms in MATLAB and compares their SNR performance on radar data. Simulations suggest that all of them have similar SNR performance, but the JPEG 2000 and GZIP algorithms offer a compression ratio of over 90%. However, 16-to-8-bit compression is implemented in this project because of its simplicity.
A hardware test bed is implemented to integrate the digital radar electronics with the MATLAB Simulink simulation tools in a hardware-in-the-loop (HIL) configuration. The digital down-converter, decimator, and data compression module are prototyped in Simulink. The design is later implemented on the FPGA using Verilog code. The functionality is tested at various stages of development using ModelSim simulations, Altera DSPBuilder’s HDL import, HIL co-simulation, and SignalTap. This test bed can also be used for future development efforts.
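The down-conversion and decimation steps named in this abstract can be sketched in software as follows. This is a generic illustration, not the project's FPGA design: the function name, the Hamming-windowed sinc filter, and the tap count are assumptions chosen for brevity.

```python
import numpy as np

def digital_down_convert(samples, fs, f_center, decimation, num_taps=63):
    """Mix a real signal to baseband, low-pass filter it, and decimate.

    fs and f_center are in Hz; decimation is an integer rate-reduction factor.
    """
    samples = np.asarray(samples, dtype=float)
    n = np.arange(len(samples))
    # Frequency down-conversion: shift f_center to 0 Hz.
    baseband = samples * np.exp(-2j * np.pi * f_center * n / fs)
    # Windowed-sinc low-pass filter with cutoff at the post-decimation Nyquist
    # frequency (0.5 / decimation cycles per input sample).
    cutoff = 0.5 / decimation
    k = np.arange(num_taps) - (num_taps - 1) / 2
    taps = 2 * cutoff * np.sinc(2 * cutoff * k) * np.hamming(num_taps)
    taps /= taps.sum()  # unity gain at DC
    filtered = np.convolve(baseband, taps, mode="same")
    # Decimation: keep every decimation-th sample.
    return filtered[::decimation]
```

A unit-amplitude cosine at `f_center` comes out as a nearly constant complex value of about 0.5, since mixing leaves half the amplitude at 0 Hz and the filter removes the image at twice the carrier frequency.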
SURYA TEJ NIMMAKAYALA
Exploring Causes of Performance Overhead during Dynamic Binary Translation
When & Where:
250 Nichols Hall
Committee Members:
Prasad Kulkarni, Chair
Fengjun Li
Bo Luo
Abstract
Dynamic Binary Translators (DBTs) have applications including program portability, instrumentation, optimization, and improved software security. To achieve these goals and maintain control over the application's execution, DBTs translate and run the original source/guest programs in a sandboxed environment. DBT systems apply several optimization techniques, such as code caching and trace creation, to reduce the translation overhead and enhance program performance at run-time. However, even with these optimizations, DBTs typically impose a significant performance overhead, especially for short-running applications. This performance penalty has restricted the more widespread adoption of DBT technology, in spite of its obvious need.
The goal of this work is to determine the different factors that contribute to the performance penalty imposed by dynamic binary translators. In this thesis, we describe the experiments that we designed to achieve our goal and report our results and observations. We use a popular and sophisticated DBT, DynamoRIO, as our test platform, and employ the industry-standard SPEC CPU2006 benchmarks to capture run-time statistics. Our experiments find that DynamoRIO executes a large number of additional instructions when compared to the native application execution. We further measure that this increase in the number of executed instructions is caused by the DBT frequently exiting the code cache to perform various management tasks at run-time, including code translation, indirect branch resolution, and trace formation. We also find that the performance loss experienced by the DBT is directly proportional to the number of code cache exits. We will discuss the details of the experiments, results, observations, and analysis in this work.
XUN WU
A Global Discretization Approach to Handle Numerical Attributes as Preprocessing Presenter
When & Where:
2001B Eaton Hall
Committee Members:
Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Heechul Yun
Abstract
Discretization is a common technique for handling numerical attributes in data mining; it divides continuous values into several intervals by defining multiple thresholds. Decision tree learning algorithms, such as C4.5 and random forests, are able to deal with numerical attributes by applying a discretization technique and transforming them into nominal attributes based on an impurity-based criterion, such as information gain or Gini gain. However, a considerable number of distinct values end up in the same interval after discretization, so the information carried by the original continuous values is lost.
In this thesis, we proposed a global discretization method that is able to keep the information within the original numerical attributes by expanding them into multiple nominal ones based on each of the candidate cut-point values. The discretized data set, which includes only nominal attributes, evolves from the original data set. We analyzed the problem by applying two decision tree learning algorithms, namely C4.5 and random forests, to each of the twelve pairs of data sets (original and discretized) and evaluating the performance (prediction accuracy rate) of the obtained classification models in Weka Experimenter. This is followed by two separate Wilcoxon tests (one test for each learning algorithm) to decide whether there is a statistically significant difference between these paired data sets. Results of both tests indicate that there is no clear difference in performance when using the discretized data sets compared to the original ones.
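One way to picture the expansion described above is the sketch below, which turns a single numerical attribute into one two-valued nominal attribute per candidate cut point. It is an illustration only: the choice of midpoints between consecutive distinct values as candidate cut points, the attribute naming, and the "yes"/"no" labels are assumptions, not details taken from the thesis.

```python
def candidate_cut_points(values):
    """Midpoints between consecutive distinct values (an assumed convention)."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

def expand_numeric_attribute(name, values):
    """Expand one numerical attribute into a nominal attribute per cut point.

    Returns a dict mapping each new attribute name to its list of values,
    where "yes" means the original value is at or below the cut point.
    """
    expanded = {}
    for cut in candidate_cut_points(values):
        expanded[f"{name}<={cut}"] = [
            "yes" if v <= cut else "no" for v in values
        ]
    return expanded
```

For the values 1.0, 3.0, 3.0, 5.0 this yields two nominal attributes, one for the cut at 2.0 and one for the cut at 4.0, so every distinction between the original distinct values remains recoverable from the nominal columns.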
YUFEI CHENG
Future Internet Routing Design for Massive Failures and Attacks
When & Where:
246 Nichols Hall
Committee Members:
James Sterbenz, Chair
Victor Frost
Fengjun Li
Gary Minden
Michael Vitevitch
Abstract
With the increasing frequency of natural disasters and intentional attacks that challenge optical networks, vulnerability to cascading and regionally correlated challenges is escalating. Given the high complexity and large traffic load of optical networks, correlated challenges pose a serious threat to reliable network communication. We start our research by proposing a critical-region identification mechanism and study different vulnerability scales using real-world physical network topologies. We further propose geographical diversity and incorporate it into a new graph resilience metric, cTGGD (compensated Total Geographical Graph Diversity), which is capable of characterizing and differentiating the resiliency level of different physical networks. We propose the path geodiverse problem (PGD) and two heuristics that solve it with less complexity than the optimal algorithm. The geodiverse paths are optimized with a delay-skew optimization formulation for optimal traffic allocation. We implement GeoDivRP in ns-3 to employ the optimized paths and demonstrate their effectiveness compared to OSPF Equal-Cost Multi-Path routing (ECMP) in terms of both throughput and overall link utilization. From the attacker's perspective, we analyze the mechanism by which attackers could maximize attack impact with a limited budget, and we demonstrate the effectiveness of different network restoration plans.