Defense Notices


All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check and post the presentation announcement online.

Upcoming Defense Notices

Andrew Riachi

An Investigation Into The Memory Consumption of Web Browsers and A Memory Profiling Tool Using Linux Smaps

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Prasad Kulkarni, Chair
Perry Alexander
Drew Davidson
Heechul Yun

Abstract

Web browsers are notorious for consuming large amounts of memory. Yet they have become the dominant framework for writing GUIs because web languages are ergonomic for programmers and offer cross-platform reach. These benefits are so enticing that even a large portion of mobile apps, which have to run on resource-constrained devices, run a web browser under the hood. Therefore, it is important to keep the memory consumption of web browsers as low as practicable.

In this thesis, we investigate the memory consumption of web browsers, in particular compared to applications written in native GUI frameworks. We introduce smaps-profiler, a tool to profile the overall memory consumption of Linux applications that can report memory usage other profilers simply do not measure. Using this tool, we conduct experiments which suggest that most of the extra memory usage compared to native applications could be due to the size of the web browser program itself. We discuss our experiments and findings, and conclude that even more rigorous studies are needed to profile GUI applications.
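
To make the data source concrete: on Linux, per-mapping memory accounting is exposed in /proc/<pid>/smaps. The sketch below is a minimal illustration of profiling from smaps, not the smaps-profiler tool itself; it totals the Pss (proportional set size) fields, which charge shared pages fractionally and therefore capture usage that RSS-only profilers misreport.

import sys

def pss_total_kb(pid: int) -> int:
    """Sum the Pss fields across all mappings of a process (values in kB)."""
    total = 0
    with open(f"/proc/{pid}/smaps") as f:
        for line in f:
            if line.startswith("Pss:"):
                # Field lines look like: "Pss:            1234 kB"
                total += int(line.split()[1])
    return total

if __name__ == "__main__":
    print(f"{pss_total_kb(int(sys.argv[1]))} kB total PSS")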


Elizabeth Wyss

A New Frontier for Software Security: Diving Deep into npm

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Drew Davidson, Chair
Alex Bardas
Fengjun Li
Bo Luo
J. Walker

Abstract

Open-source package managers (e.g., npm for Node.js) have become an established component of modern software development. Rather than creating applications from scratch, developers may employ modular software dependencies and frameworks--called packages--to serve as building blocks for writing larger applications. Package managers make this process easy. With a simple command line directive, developers are able to quickly fetch and install packages across vast open-source repositories. npm--the largest of such repositories--alone hosts millions of unique packages and serves billions of package downloads each week. 

However, the widespread code sharing resulting from open-source package managers also presents novel security implications. Vulnerable or malicious code hiding deep within package dependency trees can be leveraged downstream to attack both software developers and the end-users of their applications. This downstream flow of software dependencies--dubbed the software supply chain--is critical to secure.

This research provides a deep dive into the npm-centric software supply chain, exploring distinctive phenomena that impact its overall security and usability. Such factors include (i) hidden code clones--which may stealthily propagate known vulnerabilities, (ii) install-time attacks enabled by unmediated installation scripts, (iii) hard-coded URLs residing in package code, (iv) the impacts of open-source development practices, (v) package compromise via malicious updates, (vi) spammers disseminating phishing links within package metadata, and (vii) abuse of cryptocurrency protocols designed to reward the creators of high-impact packages. For each facet, tooling is presented to identify and/or mitigate potential security impacts. Ultimately, it is our hope that this research fosters greater awareness, deeper understanding, and further efforts to forge a new frontier for the security of modern software supply chains. 
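
To make item (ii) concrete: npm packages may declare lifecycle scripts (preinstall, install, postinstall) in package.json that run automatically during installation. The sketch below is our own illustration, not the tooling presented in this research; it simply flags such scripts in a package manifest.

import json

# npm lifecycle hooks that execute automatically at install time.
LIFECYCLE_HOOKS = {"preinstall", "install", "postinstall"}

def install_time_scripts(manifest_path: str) -> dict:
    """Return any install-time scripts declared in a package.json file."""
    with open(manifest_path) as f:
        scripts = json.load(f).get("scripts", {})
    return {name: cmd for name, cmd in scripts.items()
            if name in LIFECYCLE_HOOKS}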


Alfred Fontes

Optimization and Trade-Space Analysis of Pulsed Radar-Communication Waveforms using Constant Envelope Modulations

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Patrick McCormick, Chair
Shannon Blunt
Jonathan Owen


Abstract

Dual function radar communications (DFRC) is a method of co-designing a single radio frequency system to perform radar and communications services simultaneously. DFRC is ultimately a compromise between radar sensing performance and communications data throughput, due to the conflicting requirements of the sensing and information-bearing signals.

A novel waveform-based DFRC approach is phase attached radar communications (PARC), in which a communications signal is embedded onto a radar pulse via phase modulation between the two signals. The PARC framework is used here in a new waveform design technique that shapes the radar component of a PARC signal so that the expected power spectral density (PSD) of the combined DFRC waveform matches a desired spectral template. This provides better control over the PARC signal spectrum, mitigating the degradation of PARC radar performance caused by spectral growth due to the communications signal.

The characteristics of optimized PARC waveforms are then analyzed to establish a trade-space between radar and communications performance within a PARC DFRC scenario. This is done by sampling the DFRC trade-space continuum with waveforms that contain a varying degree of communications bandwidth, from a pure radar waveform (no embedded communications) to a pure communications waveform (no radar component). Radar performance, which is degraded by range sidelobe modulation (RSM) from the communications signal randomness, is measured from the PARC signal variance across pulses; data throughput is established as the communications performance metric. Comparing the values of these two measures as a function of communications symbol rate explores the trade-offs in performance between radar and communications with optimized PARC waveforms.
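
In our own notation (consistent with the description above, not taken from the thesis), a constant-envelope phase-attached pulse can be written as

\[ s(t) = \exp\!\big(j\,[\phi_r(t) + \phi_c(t)]\big), \qquad 0 \le t \le T_p, \]

where \(\phi_r(t)\) is the phase of the underlying radar waveform and \(\phi_c(t)\) is the attached communications phase. Since only the phase is modulated, \(|s(t)| = 1\) for all \(t\), so the envelope stays constant while the randomness of \(\phi_c(t)\) broadens the spectrum; the template-matching design described above is aimed at controlling exactly that spectral growth.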


Qua Nguyen

Hybrid Array and Privacy-Preserving Signaling Optimization for NextG Wireless Communications

When & Where:


Zoom defense; please email jgrisafe@ku.edu for the link.

Committee Members:

Erik Perrins, Chair
Morteza Hashemi
Zijun Yao
Taejoon Kim
KC Kong

Abstract

This PhD research tackles two critical challenges in NextG wireless networks: hybrid precoder design for wideband sub-Terahertz (sub-THz) massive multiple-input multiple-output (MIMO) communications and privacy-preserving federated learning (FL) over wireless networks.

In the first part, we propose a novel hybrid precoding framework that integrates true-time delay (TTD) devices and phase shifters (PS) to counteract the beam squint effect, a significant challenge in wideband sub-THz massive MIMO systems that leads to considerable loss in array gain. Unlike previous methods that designed only the TTD values while fixing the PS values and assuming unbounded time delays, our approach jointly optimizes the TTD and PS values under realistic time-delay constraints. We determine the minimum number of TTD devices required to achieve a target array gain using our proposed approach. Then, we extend the framework to multi-user wideband systems and formulate a hybrid array optimization problem aiming to maximize the minimum data rate across users. This problem is decomposed into two sub-problems: fair subarray allocation, solved via continuous-domain relaxation, and subarray gain maximization, addressed via a phase-domain transformation.
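
The beam squint effect and the TTD remedy can be seen in a toy numerical sketch, our own illustration with assumed parameter values rather than the dissertation's model: a phase shifter applies a frequency-flat phase matched to the carrier, while a true-time delay applies a phase 2*pi*f*tau that scales with frequency, so only the latter keeps the beam pointed across a wide band.

import numpy as np

N = 64                    # ULA elements
fc = 140e9                # sub-THz carrier frequency (assumed value)
c = 3e8
d = c / fc / 2            # half-wavelength spacing at the carrier
theta = np.deg2rad(30)    # steering direction
tau = np.arange(N) * d * np.sin(theta) / c   # ideal per-element delays

def array_gain(f: float, use_ttd: bool) -> float:
    # Normalized gain toward theta at frequency f.
    resp = np.exp(-2j * np.pi * f * np.arange(N) * d * np.sin(theta) / c)
    phase = (f if use_ttd else fc) * tau     # TTD tracks f; PS is fixed at fc
    return abs(np.exp(2j * np.pi * phase) @ resp) / N

for f in (fc - 5e9, fc, fc + 5e9):
    print(f"{f / 1e9:6.1f} GHz: PS {array_gain(f, False):.2f}, "
          f"TTD {array_gain(f, True):.2f}")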

The second part focuses on preserving privacy in FL over wireless networks. First, we design a differentially private FL algorithm that applies time-varying noise-variance perturbation. Taking advantage of existing wireless channel noise, we jointly design the differential privacy (DP) noise variances and the users' transmit power to resolve the tradeoff between privacy and learning utility. Next, we tackle two critical challenges within FL networks: (i) privacy risks arising from model updates and (ii) reduced learning utility due to quantization heterogeneity. Prior work typically addresses only one of these challenges, because maintaining learning utility under both privacy risks and quantization heterogeneity is a non-trivial task. We improve the learning utility of a privacy-preserving FL scheme that allows clusters of devices with different quantization resolutions to participate in each FL round. Specifically, we introduce a novel stochastic quantizer (SQ) that ensures a DP guarantee and minimal quantization distortion. To address quantization heterogeneity, we introduce a cluster size optimization technique combined with a linear fusion approach to enhance model aggregation accuracy. Lastly, inspired by the information-theoretic rate-distortion framework, a privacy-distortion tradeoff problem is formulated to minimize privacy loss under a given maximum allowable quantization distortion. The optimal solution to this problem is identified, revealing that the privacy loss decreases as the maximum allowable quantization distortion increases, and vice versa.
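
For context, the standard differential privacy building block underlying such designs, clipping a model update and adding Gaussian noise, is sketched below. This is a generic textbook mechanism with illustrative constants; the time-varying variances, channel-noise reuse, and DP stochastic quantizer developed in this work go well beyond it.

import numpy as np

def dp_perturb(update: np.ndarray, clip: float, sigma: float,
               rng: np.random.Generator) -> np.ndarray:
    """Clip an update to bound its L2 sensitivity, then add Gaussian noise."""
    scale = min(1.0, clip / (np.linalg.norm(update) + 1e-12))
    return update * scale + rng.normal(0.0, sigma * clip, size=update.shape)

# Example: perturb a toy 4-parameter model update before uploading it.
noisy = dp_perturb(np.array([0.4, -1.2, 0.7, 0.1]), clip=1.0, sigma=1.5,
                   rng=np.random.default_rng(0))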

This research advances hybrid array optimization for wideband sub-THz massive MIMO and introduces novel algorithms for privacy-preserving quantized FL with diverse precision. These contributions enable high-throughput wideband MIMO communication systems and privacy-preserving AI-native designs, aligning with the performance and privacy protection demands of NextG networks.


Arin Dutta

Performance Analysis of Distributed Raman Amplification with Different Pumping Configurations

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Rongqing Hui, Chair
Morteza Hashemi
Rachel Jarvis
Alessandro Salandrino
Hui Zhao

Abstract

As internet services like high-definition video, cloud computing, and artificial intelligence keep growing, optical networks need to keep up with the demand for more capacity. Optical amplifiers play a crucial role in offsetting fiber loss and enabling long-distance wavelength division multiplexing (WDM) transmission in high-capacity systems. Various methods have been proposed to enhance the capacity and reach of fiber communication systems, including advanced modulation formats, dense wavelength division multiplexing (DWDM) over ultra-wide bands, space-division multiplexing, and high-performance digital signal processing (DSP) technologies. Maintaining higher data rates while maximizing the spectral efficiency of multi-level modulated signals requires a higher optical signal-to-noise ratio (OSNR). Despite advancements in coherent optical communication systems, the spectral efficiency of multi-level modulated signals is ultimately constrained by fiber nonlinearity. Raman amplification is an attractive solution for wide-band amplification with low noise figures in multi-band systems.

Distributed Raman Amplification (DRA) has been deployed in recent high-capacity transmission experiments to achieve a relatively flat signal power distribution along the optical path, and it offers the unique advantage of using conventional low-loss silica fibers as the gain medium, effectively transforming passive optical fibers into active or amplifying waveguides. DRA also provides gain at any wavelength by selecting the appropriate pump wavelength, enabling operation in signal bands outside the Erbium-doped fiber amplifier (EDFA) bands. A forward (FW) Raman pumping configuration can be adopted to further improve DRA performance, as it is more efficient in OSNR improvement: the optical noise is generated near the beginning of the fiber span and attenuated along the fiber. A dual-order FW pumping scheme helps to reduce the nonlinear effects on the optical signal and improves OSNR by distributing the Raman gain more uniformly along the transmission span.

The major concern with Forward Distributed Raman Amplification (FW DRA) is the fluctuation in pump power, known as relative intensity noise (RIN), which transfers from the pump laser to both the intensity and the phase of the transmitted optical signal as they propagate in the same direction. Another concern with FW DRA is the rise in signal optical power near the start of the fiber span, which increases the nonlinear phase shift of the signal. These factors, RIN-transfer-induced noise and nonlinear noise, degrade the performance of FW DRA systems at the receiver.

As the performance of DRA with backward pumping is well understood, with a relatively low impact of RIN transfer, our research focuses on the FW pumping configuration and is intended to provide a comprehensive analysis of the system performance impact of dual-order FW Raman pumping, including the signal intensity and phase noise induced by the RINs of both the 1st- and 2nd-order pump lasers, as well as the impacts of linear and nonlinear noise. The efficiencies of pump RIN to signal intensity and phase noise transfer are theoretically analyzed and experimentally verified by applying a shallow intensity modulation to the pump laser to mimic the RIN. The results indicate that the efficiency of 2nd-order pump RIN to signal phase noise transfer can be more than two orders of magnitude higher than that from the 1st-order pump. The performance of dual-order FW Raman configurations is then compared with that of single-order Raman pumping to understand the trade-offs among system parameters. The nonlinear interference (NLI) noise is analyzed to study the overall OSNR improvement when employing a 2nd-order Raman pump. Finally, a DWDM system with 16-QAM modulation is used as an example to investigate the benefit of DRA with dual-order Raman pumping and with different pump RIN levels. We also consider a DRA system using a 1st-order incoherent pump together with a 2nd-order coherent pump. Although dual-order FW pumping corresponds to a slight increase in linear amplified spontaneous emission (ASE) compared to using only a 1st-order pump, its major advantage comes from the reduction of nonlinear interference noise in a DWDM system. Because the RIN of the 2nd-order pump has a much higher impact than that of the 1st-order pump, a more stringent requirement should be placed on the RIN of the 2nd-order pump laser when a dual-order FW pumping scheme is used for DRA in fiber-optic communication. The system performance analysis also reveals that higher-baud-rate systems, such as those operating at 100 Gbaud, are less affected by pump laser RIN due to the low-pass characteristics of the transfer of pump RIN to signal phase noise.
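
In generic terms (our notation, not necessarily the thesis's definitions), the transfer efficiencies studied here can be expressed as a frequency-dependent ratio of power spectral densities,

\[ H_{\mathrm{RIN}\to\varphi}(f) = \frac{S_{\varphi}(f)}{S_{\mathrm{RIN}}(f)}, \]

where \(S_{\mathrm{RIN}}(f)\) is the PSD of the pump intensity fluctuation and \(S_{\varphi}(f)\) is the PSD of the induced signal phase noise. The findings above correspond to this ratio being more than two orders of magnitude larger for the 2nd-order pump than for the 1st-order pump, and to its low-pass roll-off at high frequencies, which is why higher-baud-rate signals are less affected.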


Past Defense Notices


RUBAYET SHAFIN

Performance Analysis of Parametric Channel Estimation for 3D Massive MIMO/FD-MIMO OFDM Systems

When & Where:


250 Nichols Hall

Committee Members:

Lingjia Liu, Chair
Erik Perrins
Yang Yi


Abstract

With the promise of meeting future capacity demands for mobile broadband communications, 3D massive MIMO/Full-Dimension MIMO (FD-MIMO) systems have gained much interest among researchers in recent years. Apart from the huge spectral efficiency gain offered by such systems, this interest can also be attributed to their significant reduction of latency, simplified multiple access layer, and robustness to interference. However, in order to fully extract the benefits of massive MIMO systems, accurate channel state information is critical. In this thesis, a channel estimation method based on direction-of-arrival (DoA) estimation is presented for massive MIMO OFDM systems. Specifically, the DoA is estimated using the Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT) method, and the root mean square error (RMSE) of the DoA estimation is analytically characterized for the corresponding MIMO-OFDM system.
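
For illustration, a textbook ESPRIT estimator for a uniform linear array is sketched below; this is our own minimal version, not the thesis implementation. The rotational invariance between two shifted subarrays of the signal subspace yields the arrival angles.

import numpy as np

def esprit_doa(X: np.ndarray, num_sources: int, d: float = 0.5) -> np.ndarray:
    """DoA estimates (degrees) from N x T array snapshots X;
    d is the element spacing in wavelengths."""
    N, T = X.shape
    R = X @ X.conj().T / T                  # sample covariance
    _, vecs = np.linalg.eigh(R)             # eigenvalues in ascending order
    Es = vecs[:, -num_sources:]             # signal subspace
    # Rotational invariance between the two shifted subarrays
    Psi = np.linalg.pinv(Es[:-1]) @ Es[1:]
    phases = np.angle(np.linalg.eigvals(Psi))
    return np.degrees(np.arcsin(phases / (2 * np.pi * d)))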


DANIEL HEIN

A New Approach for Predicting Security Vulnerability Severity in Attack Prone Software Using Architecture and Repository Mined Change Metrics

When & Where:


1 Eaton Hall

Committee Members:

Hossein Saiedian, Chair
Arvin Agah
Perry Alexander
Prasad Kulkarni
Nancy Mead

Abstract

Billions of dollars are lost every year to successful cyber attacks that are fundamentally enabled by software vulnerabilities. Modern cyber attacks increasingly threaten individuals, organizations, and governments, causing service disruption, inconvenience, and costly incident response. Given that such attacks are primarily enabled by software vulnerabilities, this work examines the efficacy of using change metrics, along with architectural burst and maintainability metrics, to predict modules and files that should be analyzed or tested further to excise vulnerabilities prior to release. 

The problem addressed by this research is the residual vulnerability problem, or vulnerabilities that evade detection and persist in released software. Many modern software projects are over a million lines of code, and composed of reused components of varying maturity. The sheer size of modern software, along with the reuse of existing open source modules, complicates the questions of where to look, and in what order to look, for residual vulnerabilities. 

Traditional code complexity metrics, along with newer frequency based churn metrics (mined from software repository change history), are selected specifically for their relevance to the residual vulnerability problem. We compare the performance of these complexity and churn metrics to architectural level change burst metrics, automatically mined from the git repositories of the Mozilla Firefox Web Browser, Apache HTTP Web Server, and the MySQL Database Server, for the purpose of predicting attack prone files and modules. 

We offer new empirical data quantifying the relationship between our selected metrics and the severity of vulnerable files and modules, assessed using severity data compiled from the NIST National Vulnerability Database, and cross-referenced to our study subjects using unique identifiers defined by the Common Vulnerabilities and Exposures (CVE) vulnerability catalog. Our results show that architectural level change burst metrics can perform well in situations where more traditional complexity metrics fail as reliable estimators of vulnerability severity. In particular, results from our experiments on Apache HTTP Web Server indicate that architectural level change burst metrics show high correlation with the severity of known vulnerable modules, and do so with information directly available from the version control repository change-set (i.e., commit) history.
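
As a small illustration of repository-mined change metrics (our sketch, not the dissertation's tooling), the simplest churn measure, per-file commit count, can be pulled straight from a git history:

import subprocess
from collections import Counter

def commit_churn(repo: str) -> Counter:
    """Count how many commits touched each file in a git repository."""
    # --name-only prints changed paths, one per line, per commit.
    out = subprocess.run(
        ["git", "-C", repo, "log", "--name-only", "--pretty=format:"],
        capture_output=True, text=True, check=True).stdout
    return Counter(line for line in out.splitlines() if line)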


CHENG GAO

Mining Incomplete Numerical Data Sets

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Arvin Agah
Bo Luo
Tyrone Duncan
Xuemin Tu

Abstract

Incomplete and numerical data are common for many application domains. There have been many approaches to handle missing data in statistical analysis and data mining. To deal with numerical data, discretization is crucial for many machine learning algorithms. However, most of the discretization algorithms cannot be applied to incomplete data sets. 

Multiple Scanning is an entropy-based discretization method. Previous research has shown that it outperforms the commonly used Equal Width and Equal Frequency discretization methods. In this work, Multiple Scanning is tested with C4.5 and MLEM2 on incomplete data sets. Results show that for some data sets the setup using Multiple Scanning as preprocessing performs better, while for other data sets C4.5 or MLEM2 should be used by themselves. Our conclusion is that there is no universally optimal solution for all data sets; the setup should be custom-made.
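
For reference, the two baseline discretizers named above can be sketched in a few lines. Multiple Scanning itself selects cut points through repeated entropy-based scans over all attributes and is not reproduced here.

import numpy as np

def equal_width_cuts(values: np.ndarray, k: int) -> np.ndarray:
    """Cut points splitting the attribute range into k equal-width intervals."""
    lo, hi = values.min(), values.max()
    return lo + (hi - lo) * np.arange(1, k) / k

def equal_frequency_cuts(values: np.ndarray, k: int) -> np.ndarray:
    """Cut points placing roughly the same number of cases in each interval."""
    return np.quantile(values, np.arange(1, k) / k)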


SUMIAH ALALWANI

Experiments on Incomplete Data Sets Using Modifications to Characteristic Relation

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Bo Luo


Abstract

Rough set theory is a useful approach for decision rule induction that is applied to large real-life data sets. Lower and upper approximations of concepts are used to induce rules from incomplete data sets. In our research we study the validity of modifications suggested to the characteristic relation. We discuss the implementation of these modifications and the local definability of each modified set. We show that none of the suggested modified sets are locally definable except for maximal consistent blocks, which are restricted to data sets with "do not care" conditions. A comparative analysis was conducted for characteristic sets and the modifications in terms of the cardinality of the lower and upper approximations of each concept and the decision rules induced by each modification. In this thesis, experiments were conducted on four incomplete data sets with lost and "do not care" conditions. The LEM2 algorithm was implemented to induce certain and possible rules from the incomplete data sets. To measure the average classification error rate of the induced rules, ten-fold cross-validation was used. Our results show that there is no significant difference between the quality of the rules induced by each modification.
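
For context, the standard characteristic set K(x), the baseline that the studied modifications alter, can be sketched as follows for incomplete decision tables with lost values "?" and "do not care" values "*" (our illustration, following the usual rough-set definitions).

def characteristic_set(table: dict, x, attributes) -> set:
    """table maps each case to {attribute: value}; returns K(x)."""
    U = set(table)
    K = set(U)
    for a in attributes:
        v = table[x][a]
        if v in ("?", "*"):   # lost / do-not-care values impose no restriction
            continue
        K &= {y for y in U if table[y][a] in (v, "*")}
    return K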


DANIEL GOMEZ GARCIA ALVESTEGUI

Ultra-Wideband Radar for High-Throughput-Phenotyping of Wheat Canopies

When & Where:


250 Nichols Hall

Committee Members:

Carl Leuschen, Chair
Chris Allen
Ron Hui
Fernando Rodriguez-Morales
David Braaten

Abstract

Increasing the rate of crop yield improvement is important to meet projected future crop production demands. Breeding efforts are being made to rapidly improve crop yields and make crops more stress-resistant. Accelerated molecular breeding techniques, in which desirable plant physical traits are selected based on genetic markers, rely on accurate and rapid methods to link plant genotypes and phenotypes. Advances in next-generation DNA sequencing have made genotyping a fast and efficient process. In contrast, methods for characterizing physical traits remain inefficient. 
The height of the wheat crop is an important trait, as it may be related to yield and biomass, and it is also an indicator of plant growth stage. Recent high-throughput phenotyping experiments have used sensing techniques based on ultrasonic sonar and cameras to measure canopy height. The main drawback of these methods is that the ground topography is not directly measured. 
In contrast to current sensors, ultra-wideband radars have the potential to take distance measurements to the top of the canopy and to the ground simultaneously. We propose the study of ultra-wideband radar for measuring wheat crop heights. Specifically, we propose to study the effects of canopy constituents on the ranging radar return (impulse response) as well as on the frequency response. First, a numerical simulator will be developed to accurately calculate the radar response under different canopy conditions. Second, a parametric study will be performed with this simulator. Lastly, an estimation algorithm for crop canopy heights will be developed based on the parametric study.
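
The basic ranging relation behind the proposed simultaneous measurement (in our notation) is

\[ h \approx \frac{c\,(\tau_{\mathrm{ground}} - \tau_{\mathrm{canopy}})}{2}, \]

where \(\tau_{\mathrm{canopy}}\) and \(\tau_{\mathrm{ground}}\) are the round-trip delays of the canopy-top and ground returns and \(c\) is the propagation speed. The proposed parametric study is, in effect, about how canopy constituents perturb these two delay estimates.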


ALI ABUSHAIBA

Maximum Power Point Tracking for Photovoltaic Systems Using a Discrete-in-Time Extremum Seeking Algorithm

When & Where:


2001B Eaton Hall

Committee Members:

Reza Ahmadi, Chair
Ken Demarest
Glenn Prescott
Alessandro Salandrino
Huazhen Fang

Abstract

Energy harvesting from solar sources has sparked interest in many communities seeking to develop more energy harvesting applications for renewable energy. Advanced technical methods are required to ensure that the maximum available power is harnessed from the photovoltaic (PV) system. This work proposes a new discrete-in-time extremum-seeking based technique for tracking the maximum power point of a photovoltaic array. The proposed method is a true maximum power point tracker that can be implemented with reasonable processing effort on an inexpensive digital controller. The stability of the proposed method is analyzed to guarantee convergence of the algorithm. The proposed method should exhibit better performance than conventional Maximum Power Point Tracking (MPPT) methods and require less computational effort than complex mathematical methods.
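
A toy discrete-in-time extremum-seeking step is sketched below purely for illustration; the dissertation's algorithm and its stability analysis are not reproduced here. The idea is to probe the PV power at perturbed duty cycles, estimate the power gradient, and move the operating point uphill.

def es_mppt_step(duty: float, measure_power, a: float = 0.01,
                 gain: float = 0.05) -> float:
    """One extremum-seeking update of a converter duty cycle (toy sketch)."""
    p_plus = measure_power(min(duty + a, 1.0))    # probe above
    p_minus = measure_power(max(duty - a, 0.0))   # probe below
    grad = (p_plus - p_minus) / (2 * a)           # central-difference slope
    return min(max(duty + gain * a * grad, 0.0), 1.0)

# Example with a made-up PV power curve peaking at duty = 0.6.
power = lambda dc: 100 - 400 * (dc - 0.6) ** 2
duty = 0.3
for _ in range(50):
    duty = es_mppt_step(duty, power)   # converges toward 0.6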


JAISNEET BHANDAL

Classification of Private Tweets Using Tweet Content

When & Where:


2001B Eaton Hall

Committee Members:

Bo Luo, Chair
Jerzy Grzymala-Busse
Prasad Kulkarni


Abstract

Online social networks (OSNs) like Twitter provide an open platform for users to easily convey their thoughts and ideas from personal experiences to breaking news. With the increasing popularity of Twitter and the explosion of tweets, we have observed large amounts of potentially sensitive/private messages being published to OSNs inadvertently or voluntarily. The owners of these messages may become vulnerable to online stalkers or adversaries, and they often regret posting such messages. Therefore, identifying tweets that reveal private/sensitive information is critical for both the users and the service providers. However, the definition of sensitive information is subjective and different from person to person. To develop a privacy protection mechanism that is customizable to fit the needs of diverse audiences, it is essential to accurately and automatically identify and classify potentially sensitive tweets. 
In this project, we adopted a two-step approach: private tweet identification, in which tweets are classified into two main categories, sensitive and nonsensitive, followed by private tweet classification, in which the sensitive tweets are categorized into 13 pre-defined topics. We consider private tweet identification and private tweet classification to be dual problems: progress on one eventually benefits the other. We used a two-layer classification approach, exploring different combinations of classifiers and analyzing the performance of each combination. 
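
A minimal sketch of such a two-layer setup with scikit-learn, using placeholder training data rather than the project's actual features or classifier combinations:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Layer 1: sensitive vs. nonsensitive; layer 2: topic among sensitive tweets.
detector = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
topic_clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))

# Placeholder labels; the real system trains on labeled tweets across
# 13 pre-defined topics.
detector.fit(["leaving for vacation tomorrow", "lovely weather today"],
             ["sensitive", "nonsensitive"])
topic_clf.fit(["leaving for vacation tomorrow", "I was at the clinic"],
              ["location", "health"])

def classify(tweet: str) -> str:
    if detector.predict([tweet])[0] == "nonsensitive":
        return "nonsensitive"
    return topic_clf.predict([tweet])[0]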


JONATHAN LYLE

A Digital Approach to Bistatic Radar Synchronization via GPS PPS

When & Where:


246 Nichols Hall

Committee Members:

Carl Leuschen, Chair
Chris Allen
Jilu Li


Abstract

Bistatic radar systems utilize physically separate transmit and receive systems to collect information that monostatic systems cannot. One issue in developing bistatic systems is guaranteeing synchronization between the transmitters and receivers. This project presents a purely digital method for improving the synchronization of a bistatic system based on the GPS PPS signal, using step-time for both transmitter and receiver timing. The bistatic synchronization problem is first simulated in Matlab and then modified to use the proposed step-time adjustment to show that the method works in theory. The method is then implemented in hardware on the digital system of CReSIS's 'HF Sounder' radar system and tested to verify that it can be implemented in hardware and that it improves performance.


TYLER WADE

AOT vs. JIT: Impact of Profile Data on Code Quality

When & Where:


246 Nichols Hall

Committee Members:

Prasad Kulkarni, Chair
Perry Alexander
Heechul Yun


Abstract

Just-in-time (JIT) compilation during program execution and ahead-of-time (AOT) compilation during software installation are alternate techniques used by managed language virtual machines (VM) to generate optimized native code while simultaneously achieving binary code portability and high execution performance. JIT compilers typically collect profile information at run-time to enable profile-guided optimizations (PGO) to customize the generated native code to different program inputs/behaviors. AOT compilation removes the speed and energy overhead of online profile collection and dynamic compilation, but may not be able to achieve the quality and performance of customized native code. The goal of this work is to investigate and quantify the implications of the AOT compilation model on the quality of the generated native code for current VMs.

First, we quantify the quality of native code generated by the two compilation models for a state-of-the-art (HotSpot) Java VM. Second, we determine how the amount of profile data collected affects the quality of generated code. Third, we develop a mechanism to determine the accuracy or similarity of different profile data for a given program run, and investigate how the accuracy of profile data affects its ability to effectively guide PGOs. Finally, we categorize the profile data types in our VM and explore the contribution of each such category to performance.
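
One generic way to picture the profile-similarity step, offered purely as our illustration and not as the mechanism developed in this work: treat each profile as a map from program points to observed outcome frequencies and score how far the normalized frequency vectors diverge.

import numpy as np

def profile_similarity(p1: dict, p2: dict) -> float:
    """Similarity in [0, 1] between two profiles; each maps a program
    point (e.g., a branch id) to a list of outcome counts."""
    points = set(p1) | set(p2)
    dists = []
    for pt in points:
        a = np.asarray(p1.get(pt, [0.0, 0.0]), dtype=float)
        b = np.asarray(p2.get(pt, [0.0, 0.0]), dtype=float)
        a = a / a.sum() if a.sum() else a
        b = b / b.sum() if b.sum() else b
        dists.append(0.5 * np.abs(a - b).sum())  # total variation distance
    return 1.0 - float(np.mean(dists))

# Example: two runs with mostly agreeing branch behavior.
print(profile_similarity({"b1": [90, 10]}, {"b1": [80, 20]}))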


LOHITH NANUVALA

An Implementation of the MLEM2 Algorithm

When & Where:


1 Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Richard Wang


Abstract

Data mining is the process of finding meaningful information in data. It can be used in several areas such as business, medicine, and education, and it allows us to find patterns in the data and make predictions for the future. One form of data mining is to extract rules from data sets. In this project we discuss an implementation of the MLEM2 (Modified Learning from Examples Module, version 2) data mining algorithm. This algorithm uses the concept of blocks of attribute-value pairs. It is robust and generates rules for both complete and incomplete data sets with numeric and symbolic attributes. A rule checker has been developed to evaluate the rule sets produced by MLEM2. The accuracy of the rules is measured by computing the error rate, the ratio of the number of incorrectly classified cases to the total number of cases. Experiments are conducted on different kinds of data sets (complete, incomplete, numeric, and symbolic) using the 10-fold cross-validation method.
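
The evaluation loop described above can be sketched as follows; rule_learner and rule_classify are hypothetical stand-ins for MLEM2 and the rule checker.

from sklearn.model_selection import KFold

def cv_error_rate(cases, labels, rule_learner, rule_classify,
                  folds: int = 10) -> float:
    """Error rate = incorrectly classified cases / all cases,
    estimated with k-fold cross-validation."""
    errors, total = 0, 0
    for train, test in KFold(n_splits=folds, shuffle=True).split(cases):
        rules = rule_learner([cases[i] for i in train],
                             [labels[i] for i in train])
        for i in test:
            errors += rule_classify(rules, cases[i]) != labels[i]
            total += 1
    return errors / total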