Defense Notices

All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check, and post the presentation announcement online.

Upcoming Defense Notices

Andrew Riachi

An Investigation Into The Memory Consumption of Web Browsers and A Memory Profiling Tool Using Linux Smaps

When & Where:

Friday, July 25, 2025 - 1:00 PM
Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Prasad Kulkarni, Chair
Perry Alexander
Drew Davidson
Heechul Yun

Abstract

Web browsers are notorious for consuming large amounts of memory. Yet, they have become the dominant framework for writing GUIs because the web languages are ergonomic for programmers and have a cross-platform reach. These benefits are so enticing that even a large portion of mobile apps, which have to run on resource-constrained devices, are running a web browser under the hood. Therefore, it is important to keep the memory consumption of web browsers as low as practicable.

In this thesis, we investigate the memory consumption of web browsers, in particular, compared to applications written in native GUI frameworks. We introduce smaps-profiler, a tool to profile the overall memory consumption of Linux applications that can report memory usage other profilers simply do not measure. Using this tool, we conduct experiments which suggest that most of the extra memory usage compared to native applications could be due the size of the web browser program itself. We discuss our experiments and findings, and conclude that even more rigorous studies are needed to profile GUI applications.

Elizabeth Wyss

A New Frontier for Software Security: Diving Deep into npm

When & Where:

Wednesday, July 23, 2025 - 9:30 AM
Eaton Hall, Room 2001B

Committee Members:

Drew Davidson, Chair
Alex Bardas
Fengjun Li
Bo Luo
J. Walker

Abstract

Open-source package managers (e.g., npm for Node.js) have become an established component of modern software development. Rather than creating applications from scratch, developers may employ modular software dependencies and frameworks--called packages--to serve as building blocks for writing larger applications. Package managers make this process easy. With a simple command line directive, developers are able to quickly fetch and install packages across vast open-source repositories. npm--the largest of such repositories--alone hosts millions of unique packages and serves billions of package downloads each week.

However, the widespread code sharing resulting from open-source package managers also presents novel security implications. Vulnerable or malicious code hiding deep within package dependency trees can be leveraged downstream to attack both software developers and the end-users of their applications. This downstream flow of software dependencies--dubbed the software supply chain--is critical to secure.

This research provides a deep dive into the npm-centric software supply chain, exploring distinctive phenomena that impact its overall security and usability. Such factors include (i) hidden code clones--which may stealthily propagate known vulnerabilities, (ii) install-time attacks enabled by unmediated installation scripts, (iii) hard-coded URLs residing in package code, (iv) the impacts of open-source development practices, (v) package compromise via malicious updates, (vi) spammers disseminating phishing links within package metadata, and (vii) abuse of cryptocurrency protocols designed to reward the creators of high-impact packages. For each facet, tooling is presented to identify and/or mitigate potential security impacts. Ultimately, it is our hope that this research fosters greater awareness, deeper understanding, and further efforts to forge a new frontier for the security of modern software supply chains.

Alfred Fontes

Optimization and Trade-Space Analysis of Pulsed Radar-Communication Waveforms using Constant Envelope Modulations

When & Where:

Tuesday, July 22, 2025 - 2:00 PM
Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Patrick McCormick, Chair
Shannon Blunt
Jonathan Owen

Abstract

Dual function radar communications (DFRC) is a method of co-designing a single radio frequency system to perform simultaneous radar and communications service. DFRC is ultimately a compromise between radar sensing performance and communications data throughput due to the conflicting requirements between the sensing and information-bearing signals.

A novel waveform-based DFRC approach is phase attached radar communications (PARC), where a communications signal is embedded onto a radar pulse via the phase modulation between the two signals. The PARC framework is used here in a new waveform design technique that designs the radar component of a PARC signal to match the PARC DFRC waveform expected power spectral density (PSD) to a desired spectral template. This provides better control over the PARC signal spectrum, which mitigates the issue of PARC radar performance degradation from spectral growth due to the communications signal.

The characteristics of optimized PARC waveforms are then analyzed to establish a trade-space between radar and communications performance within a PARC DFRC scenario. This is done by sampling the DFRC trade-space continuum with waveforms that contain a varying degree of communications bandwidth, from a pure radar waveform (no embedded communications) to a pure communications waveform (no radar component). Radar performance, which is degraded by range sidelobe modulation (RSM) from the communications signal randomness, is measured from the PARC signal variance across pulses; data throughput is established as the communications performance metric. Comparing the values of these two measures as a function of communications symbol rate explores the trade-offs in performance between radar and communications with optimized PARC waveforms.

Qua Nguyen

Hybrid Array and Privacy-Preserving Signaling Optimization for NextG Wireless Communications

When & Where:

Tuesday, July 22, 2025 - 11:00 AM
Zoom Defense, please email jgrisafe@ku.edu for link.

Committee Members:

Erik Perrins, Chair
Morteza Hashemi
Zijun Yao
Taejoon Kim
KC Kong

Abstract

This PhD research tackles two critical challenges in NextG wireless networks: hybrid precoder design for wideband sub-Terahertz (sub-THz) massive multiple-input multiple-output (MIMO) communications and privacy-preserving federated learning (FL) over wireless networks.

In the first part, we propose a novel hybrid precoding framework that integrates true-time delay (TTD) devices and phase shifters (PS) to counteract the beam squint effect - a significant challenge in the wideband sub-THz massive MIMO systems that leads to considerable loss in array gain. Unlike previous methods that only designed TTD values while fixed PS values and assuming unbounded time delay values, our approach jointly optimizes TTD and PS values under realistic time delays constraint. We determine the minimum number of TTD devices required to achieve a target array gain using our proposed approach. Then, we extend the framework to multi-user wideband systems and formulate a hybrid array optimization problem aiming to maximize the minimum data rate across users. This problem is decomposed into two sub-problems: fair subarray allocation, solved via continuous domain relaxation, and subarray gain maximization, addressed via a phase-domain transformation.

The second part focuses on preserving privacy in FL over wireless networks. First, we design a differentially-private FL algorithm that applies time-varying noise variance perturbation. Taking advantage of existing wireless channel noise, we jointly design differential privacy (DP) noise variances and users transmit power to resolve the tradeoffs between privacy and learning utility. Next, we tackle two critical challenges within FL networks: (i) privacy risks arising from model updates and (ii) reduced learning utility due to quantization heterogeneity. Prior work typically addresses only one of these challenges because maintaining learning utility under both privacy risks and quantization heterogeneity is a non-trivial task. We approach to improve the learning utility of a privacy-preserving FL that allows clusters of devices with different quantization resolutions to participate in each FL round. Specifically, we introduce a novel stochastic quantizer (SQ) that ensures a DP guarantee and minimal quantization distortion. To address quantization heterogeneity, we introduce a cluster size optimization technique combined with a linear fusion approach to enhance model aggregation accuracy. Lastly, inspired by the information-theoretic rate-distortion framework, a privacy-distortion tradeoff problem is formulated to minimize privacy loss under a given maximum allowable quantization distortion. The optimal solution to this problem is identified, revealing that the privacy loss decreases as the maximum allowable quantization distortion increases, and vice versa.

This research advances hybrid array optimization for wideband sub-THz massive MIMO and introduces novel algorithms for privacy-preserving quantized FL with diverse precision. These contributions enable high-throughput wideband MIMO communication systems and privacy-preserving AI-native designs, aligning with the performance and privacy protection demands of NextG networks.

Arin Dutta

Performance Analysis of Distributed Raman Amplification with Different Pumping Configurations

When & Where:

Monday, July 21, 2025 - 10:00 AM
Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Rongqing Hui, Chair
Morteza Hashemi
Rachel Jarvis
Alessandro Salandrino
Hui Zhao

Abstract

As internet services like high-definition videos, cloud computing, and artificial intelligence keep growing, optical networks need to keep up with the demand for more capacity. Optical amplifiers play a crucial role in offsetting fiber loss and enabling long-distance wavelength division multiplexing (WDM) transmission in high-capacity systems. Various methods have been proposed to enhance the capacity and reach of fiber communication systems, including advanced modulation formats, dense wavelength division multiplexing (DWDM) over ultra-wide bands, space-division multiplexing, and high-performance digital signal processing (DSP) technologies. To maintain higher data rates along with maximizing the spectral efficiency of multi-level modulated signals, a higher Optical Signal-to-Noise Ratio (OSNR) is necessary. Despite advancements in coherent optical communication systems, the spectral efficiency of multi-level modulated signals is ultimately constrained by fiber nonlinearity. Raman amplification is an attractive solution for wide-band amplification with low noise figures in multi-band systems.

Distributed Raman Amplification (DRA) have been deployed in recent high-capacity transmission experiments to achieve a relatively flat signal power distribution along the optical path and offers the unique advantage of using conventional low-loss silica fibers as the gain medium, effectively transforming passive optical fibers into active or amplifying waveguides. Also, DRA provides gain at any wavelength by selecting the appropriate pump wavelength, enabling operation in signal bands outside the Erbium doped fiber amplifier (EDFA) bands. Forward (FW) Raman pumping configuration in DRA can be adopted to further improve the DRA performance as it is more efficient in OSNR improvement because the optical noise is generated near the beginning of the fiber span and attenuated along the fiber. Dual-order FW pumping scheme helps to reduce the non-linear effect of the optical signal and improves OSNR by more uniformly distributing the Raman gain along the transmission span.

The major concern with Forward Distributed Raman Amplification (FW DRA) is the fluctuation in pump power, known as relative intensity noise (RIN), which transfers from the pump laser to both the intensity and phase of the transmitted optical signal as they propagate in the same direction. Additionally, another concern of FW DRA is the rise in signal optical power near the start of the fiber span, leading to an increase in the non-linear phase shift of the signal. These factors, including RIN transfer-induced noise and non-linear noise, contribute to the degradation of system performance in FW DRA systems at the receiver.

As the performance of DRA with backward pumping is well understood with relatively low impact of RIN transfer, our research is focused on the FW pumping configuration, and is intended to provide a comprehensive analysis on the system performance impact of dual order FW Raman pumping, including signal intensity and phase noise induced by the RINs of both 1st and the 2nd order pump lasers, as well as the impacts of linear and nonlinear noise. The efficiencies of pump RIN to signal intensity and phase noise transfer are theoretically analyzed and experimentally verified by applying a shallow intensity modulation to the pump laser to mimic the RIN. The results indicate that the efficiency of the 2nd order pump RIN to signal phase noise transfer can be more than 2 orders of magnitude higher than that from the 1st order pump. Then the performance of the dual order FW Raman configurations is compared with that of single order Raman pumping to understand trade-offs of system parameters. The nonlinear interference (NLI) noise is analyzed to study the overall OSNR improvement when employing a 2nd order Raman pump. Finally, a DWDM system with 16-QAM modulation is used as an example to investigate the benefit of DRA with dual order Raman pumping and with different pump RIN levels. We also consider a DRA system using a 1st order incoherent pump together with a 2nd order coherent pump. Although dual order FW pumping corresponds to a slight increase of linear amplified spontaneous emission (ASE) compared to using only a 1st order pump, its major advantage comes from the reduction of nonlinear interference noise in a DWDM system. Because the RIN of the 2nd order pump has much higher impact than that of the 1st order pump, there should be more stringent requirement on the RIN of the 2nd order pump laser when dual order FW pumping scheme is used for DRA for efficient fiber-optic communication. Also, the result of system performance analysis reveals that higher baud rate systems, like those operating at 100Gbaud, are less affected by pump laser RIN due to the low-pass characteristics of the transfer of pump RIN to signal phase noise.

Audrey Mockenhaupt

Using Dual Function Radar Communication Waveforms for Synthetic Aperture Radar Automatic Target Recognition

When & Where:

Friday, July 18, 2025 - 2:30 PM
Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Patrick McCormick, Chair
Shannon Blunt
Jon Owen

Abstract

As machine learning (ML), artificial intelligence (AI), and deep learning continue to advance, their applications become more diverse – one such application is synthetic aperture radar (SAR) automatic target recognition (ATR). These SAR ATR networks use different forms of deep learning such as convolutional neural networks (CNN) to classify targets in SAR imagery. An emerging research area of SAR is dual function radar communication (DFRC) which performs both radar and communications functions using a single co-designed modulation. The utilization of DFRC emissions for SAR imaging impacts image quality, thereby influencing SAR ATR network training. Here, using the Civilian Vehicle Data Dome dataset from the AFRL, SAR ATR networks are trained and evaluated with simulated data generated using Gaussian Minimum Shift Keying (GMSK) and Linear Frequency Modulation (LFM) waveforms. The networks are used to compare how the target classification accuracy of the ATR network differ between DFRC (i.e., GMSK) and baseline (i.e., LFM) emissions. Furthermore, as is common in pulse-agile transmission structures, an effect known as ’range sidelobe modulation’ is examined, along with its impact on SAR ATR. Finally, it is shown that SAR ATR network can be trained for GMSK emissions using existing LFM datasets via two types of data augmentation.

Past Defense Notices

Hayder Almosa

Downlink Achievable Rate Analysis for FDD Massive MIMO Systems

When & Where:

Monday, May 13, 2019 - 1:00 PM
129 Nichols Hall

Committee Members:

Erik Perrins , Chair
Lingjia Liu
Shannon Blunt
Rongqing Hui
Hongyi Cai

Abstract

Multiple-Input Multiple-Output (MIMO) systems with large-scale transmit antenna arrays, often called massive MIMO, are a very promising direction for 5G due to their ability to increase capacity and enhance both spectrum and energy efficiency. To get the benefit of massive MIMO systems, accurate downlink channel state information at the transmitter (CSIT) is essential for downlink beamforming and resource allocation. Conventional approaches to obtain CSIT for FDD massive MIMO systems require downlink training and CSI feedback. However, such training will cause a large overhead for massive MIMO systems because of the large dimensionality of the channel matrix. In this dissertation, we improve the performance of FDD massive MIMO networks in terms of downlink training overhead reduction, by designing an efficient downlink beamforming method and developing a new algorithm to estimate the channel state information based on compressive sensing techniques. First, we design an efficient downlink beamforming method based on partial CSI. By exploiting the relationship between uplink direction of arrivals (DoAs) and downlink direction of departures (DoDs), we derive an expression for estimated downlink DoDs, which will be used for downlink beamforming. Second, By exploiting the sparsity structure of downlink channel matrix, we develop an algorithm that selects the best features from the measurement matrix to obtain efficient CSIT acquisition that can reduce the downlink training overhead compared with conventional LS/MMSE estimators. In both cases, we compare the performance of our proposed beamforming method with traditional methods in terms of downlink achievable rate and simulation results show that our proposed method outperform the traditional beamforming methods.

Naresh Kumar Sampath Kumar

Complexity of Rules Sets in Mining Incomplete Data Using Characteristic Sets and Generalized Maximal Consistent Blocks

When & Where:

Monday, May 13, 2019 - 10:00 AM
2001 B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Richard Wang

Abstract

The process of going through data to discover hidden connections and predict future trends has a long history. In this data-driven world, data mining is an important process to extract knowledge or insights from data in various forms. It explores the unknown credible patterns which are significant in solving many problems. There are quite a few techniques in data mining including classification, clustering, and prediction. We will discuss the classification, by using a technique called rule induction using four different approaches.

We compare the complexity of rule sets induced using characteristic sets and maximal consistent blocks. The complexity of rule sets is determined by the total number of rules induced for a given data set and the total number of conditions present in each rule. We used Incomplete Data sets to induce rules. These data sets have missing attribute values. Both methods were implemented and analyzed to check how it influences the complexity. Preliminary results suggest that the choice between characteristic sets and generalized maximal consistent blocks is inconsequential. But the cardinality of the rule sets is always smaller for incomplete data sets with “do not care” conditions. Thus, the choice between interpretations of the missing attribute value is more important than the choice between characteristic sets and generalized maximal consistent blocks.

Usman Sajid

ZiZoNet: A Zoom-In and Zoom-Out Mechanism for Crowd Counting in Static Images

When & Where:

Monday, May 13, 2019 - 1:30 AM
246 Nichols Hall

Committee Members:

Guanghui Wang, Chair
Bo Luo
Heechul Yun

Abstract

As people gather during different social, political or musical events, automated crowd analysis can lead to effective and better management of such events to prevent any unwanted scene as well as avoid political manipulation of crowd numbers. Crowd counting remains an integral part of crowd analysis and also an active research area in the field of computer vision. Existing methods fail to perform where crowd density is either too high or too low in an image, thus resulting in either overestimation or underestimation. These methods also mix crowd-like cluttered background regions (e.g. tree leaves or small and continuous patterns) in images with actual crowd, resulting in further crowd overestimation. In this work, we present a novel deep convolutional neural network (CNN) based framework ZiZoNet for automated crowd counting in static images in very low to very high crowd density scenarios to address above issues. ZiZoNet consists of three modules namely Crowd Density Classifier (CDC), Decision Module (DM) and Count Regressor Module (CRM). The test image, divided into 224x224 patches, passes through crowd density classifier (CDC) that classifies each patch to a class label (no-crowd (NC), low-crowd (LC), medium-crowd (MC), high-crowd (HC)). Based on the CDC information and using either heuristic Rule-set Engine (RSE) or machine learning based Random Forest based Decision Block (RFDB), DM decides which mode (zoom-in, normal or zoom-out) this image should use for crowd counting. CRM then performs patch-wise crowd estimate for this image accordingly as decided or instructed by the DM module. Extensive experiments on three diverse and challenging crowd counting benchmarks (UCF-QNRF, ShanghaiTech, AHU-Crowd) show that our method outperforms current state-of-the-art models under most of the evaluation criteria.

Ernesto Alexander Ramos

Tunable Surface Plasmon Dynamics

When & Where:

Friday, May 10, 2019 - 3:00 PM
2001 B Eaton Hall

Committee Members:

Alessandro Salandrino, Chair
Christopher Allen
Rongqing Hui

Abstract

Due to their extreme spatial confinement, surface plasmon resonances show great potential in the design of future devices that would blur the boundaries between electronics and optics. Traditionally, plasmonic interactions are induced with geometries involving noble metals and dielectrics. However, accessing these plasmonic modes requires delicate election of material parameters with little margin for error, controllability, or room for signal bandwidth. To rectify this, two novel plasmonic mechanisms with a high degree of control are explored: For the near infrared region, transparent conductive oxides (TCOs) exhibit tunability not only in "static" plasmon generation (through material doping) but could also allow modulation on a plasmon carrier through external bias induced switching. These effects rely on the electron accumulation layer that is created at the interface between an insulator and a doped oxide. Here a rigorous study of the electromagnetic characteristics of these electron accumulation layers is presented. As a consequence of the spatially graded permittivity profiles of these systems it will be shown that these systems display unique properties. The concept of Accumulation-layer Surface Plasmons (ASP) is introduced and the conditions for the existence or for the suppression of surface-wave eigenmodes are analyzed. A second method could allow access to modes of arbitrarily high order. Sub-wavelength plasmonic nanoparticles can support an infinite discrete set of orthogonal localized surface plasmon modes, however only the lowest order resonances can be effectively excited by incident light alone. By allowing the background medium to vary in time, novel localized surface plasmon dynamics emerge. In particular, we show that these temporal permittivity variations lift the orthogonality of the localized surface plasmon modes and introduce coupling among different angular momentum states. Exploiting these dynamics, surface plasmon amplification of high order resonances can be achieved under the action of a spatially uniform optical pump of appropriate frequency.

Nishil Parmar

A Comparison of Quality of Rules Induced using Single Local Probabilistic Approximations vs Concept Probabilistic Approximations

When & Where:

Friday, May 10, 2019 - 2:00 PM
1415A LEEP2

Committee Members:

Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Bo Luo

Abstract

This project report presents results of experiments on rule induction from incomplete data using probabilistic approximations. Mining incomplete data using probabilistic approximations is a well-established technique. Main goal of this report is to present research on a comparison carried out on two different approaches to mining incomplete data using probabilistic approximations: single local probabilistic approximations approach and concept probabilistic approximations. These approaches were implemented in python programming language and experiments were carried out on incomplete data sets with two interpretations of missing attribute values: lost values and do not care conditions. Our main objective was to compare concept and single local approximations in terms of the error rate computed using double hold-out method for validation. For our experiments we used seven incomplete data sets with many missing attribute values. The best results were accomplished by concept probabilistic approximations for five data sets and by single local probabilistic approximations for remaining two data sets.

Victor Berger da Silva

Probabilistic graphical techniques for automated ice-bottom tracking and comparison between state-of-the-art solutions

When & Where:

Friday, May 10, 2019 - 2:00 PM
317 Nichols Hall

Committee Members:

Carl Leuschen, Chair
John Paden
Guanghui Wang

Abstract

Multichannel radar depth sounding systems are able to produce two-dimensional and three-dimensional imagery of the internal structure of polar ice sheets. One of the relevant features typically present in this imagery is the ice-bedrock interface, which is the boundary between the bottom of the ice-sheet and the bedrock underneath. Crucial information regarding the current state of the ice sheets, such as the thickness of the ice, can be derived if the location of the ice-bedrock interface is extracted from the imagery. Due to the large amount of data collected by the radar systems employed, we seek to automate the extraction of the ice-bedrock interface and allow for efficient manual corrections when errors occur in the automated method. We present improvements made to previously proposed solutions which pose feature extraction in polar radar imagery as an inference problem on a probabilistic graphical model. The improvements proposed here are in the form of novel image pre-processing steps and empirically-derived cost functions that allow for the integration of further domain-specific knowledge into the models employed. Along with an explanation of our modifications, we demonstrate the results obtained by our proposed models and algorithms, including significantly decreased mean error measurements such as a 47% reduction in average tracking error in the case of three-dimensional imagery. We also present the results obtained by several state-of-the-art ice-interface tracking solutions, and compare all automated results with manually-corrected ground-truth data. Furthermore, we perform a self-assessment of tracking results by analyzing the differences found between the automatically extracted ice-layers in cases where two separate radar measurements have been made at the same location.

Dain Vermaak

Visualizing, and Analyzing Student Progress on Learning Maps

When & Where:

Friday, May 10, 2019 - 11:00 AM
1 Eaton Hall, Dean's Conference Room

Committee Members:

James Miller, Chair
Man Kong
Suzanne Shontz
Guanghui Wang
Bruce Frey

Abstract

A learning map is an unweighted directed graph containing relationships between discrete skills and concepts with edges defining the prerequisite hierarchy. They arose as a means of connecting student instruction directly to standards and curriculum and are designed to assist teachers in lesson planning and evaluating student response. As learning maps gain popularity there is an increasing need for teachers to quickly evaluate which nodes have been mastered by their students. Psychometrics is a field focused on measuring student performance and includes the development of processes used to link a student's response to multiple choice questions directly to their understanding of concepts. This dissertation focuses on developing modeling and visualization capabilities to enable efficient analysis of data pertaining to student understanding generated by psychometric techniques.

Such analysis naturally includes that done by classroom teachers. Visual solutions to this problem clearly indicate the current understanding of a student or classroom in such a way as to make suggestions that can guide future learning. In response to these requirements we present various experimental approaches which augment the original learning map design with targeted visual variables.

As well as looking forward, we also consider ways in which data visualization can be used to evaluate and improve existing teaching methods. We present several graphics based on modelling student progression as information flow. These methods rely on conservation of data to increase edge information, reducing the load carried by the nodes and encouraging path comparison.

In addition to visualization schemes and methods, we present contributions made to the field of Computer Science in the form of algorithms developed over the course of the research project in response to gaps in prior art. These include novel approaches to simulation of student response patterns, ranked layout of weighted directed graphs with variable edge widths, and enclosing certain groups of graph nodes in envelopes.

Finally, we present a final design which combines the features of key experimental approaches into a single visualization tool capable of meeting both predictive and validation requirements along with the methods used to measure the effectiveness and correctness of the final design.

Priyanka Saha

Complexity of Rule Sets Induced from Incomplete Data with Lost Values and Attribute-Concept Values

When & Where:

Friday, May 10, 2019 - 10:00 AM
2001 B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Taejoon Kim
Cuncong Zhong

Abstract

Data is a very rich source of knowledge and information. However, special techniques need to be implemented in order to extract interesting facts and discover patterns in large data sets. This is achieved using the technique called Data Mining. Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal to extract information from a data set and transform the information into a comprehensible structure for further use. Rule induction is a Data Mining technique in which formal rules are extracted from a set of observations. The rules induced may represent a full scientific model of the data, or merely represent local patterns in the data.

The data sets, however, is not always complete and might contain missing values. Data mining also provides techniques to handle the missing values in a data set. In this project, we’ve implemented lost value and attribute-concept value interpretations of incomplete data. Experiments were conducted on 176 datasets using three types of approximations (lower, middle and upper) of the concept and Modified Learning from Examples Module, version 2 (MLEM2) rule induction algorithm was used to induce rule sets.

The goal of the project was to prove that the complexity of rule sets derived from datasets having missing attributes is better for attribute-concept value interpretation compared to the lost value interpretation. The size of the rule set was always smaller for the attribute-concept value interpretation. Also, as a secondary objective, we tried to explore what type of approximation provides the smallest size of the rule sets.

Mohanad Al-Ibadi

Array Processing Techniques for Estimating and Tracking of an Ice-Sheet Bottom

When & Where:

Friday, May 10, 2019 - 9:00 AM
317 Nichols Hall

Committee Members:

Shannon Blunt, Chair
John Paden
Christopher Allen
Erik Perrins
James Stiles

Abstract

Ice bottom topography layers are an important boundary condition required to model the flow dynamics of an ice sheet. In this work, using low frequency multichannel radar data, we locate the ice bottom using two types of automatic trackers.

First, we use the multiple signal classification (MUSIC) beamformer to determine the pseudo-spectrum of the targets at each range-bin. The result is passed into a sequential tree-reweighted message passing belief-propagation algorithm to track the bottom of the ice in the 3D image. This technique is successfully applied to process data collected over the Canadian Arctic Archipelago ice caps, and produce digital elevation models (DEMs) for 102 data frames. We perform crossover analysis to self-assess the generated DEMs, where flight paths cross over each other and two measurements are made at the same location. Also, the tracked results are compared before and after manual corrections. We found that there is a good match between the overlapping DEMs, where the mean error of the crossover DEMs is 38+7 m, which is small relative to the average ice-thickness, while the average absolute mean error of the automatically tracked ice-bottom, relative to the manually corrected ice-bottom, is 10 range-bins.

Second, a direction of arrival (DOA)-based tracker is used to estimate the DOA of the backscatter signals sequentially from range bin to range bin using two methods: a sequential maximum a posterior probability (S-MAP) estimator and one based on the particle filter (PF). A dynamic flat earth transition model is used to model the flow of information between range bins. A simulation study is performed to evaluate the performance of these two DOA trackers. The results show that the PF-based tracker can handle low-quality data better than S-MAP, but, unlike S-MAP, it saturates quickly with increasing numbers of snapshots. Also, S-MAP is successfully applied to track the ice-bottom of several data frames collected over Russell glacier, and the results are compared against those generated by the beamformer-based tracker. The results of the DOA-based techniques are the final tracked surfaces, so there is no need for an additional tracking stage as there is with the beamformer technique.

Jason Gevargizian

MSRR: Leveraging dynamic measurement for establishing trust in remote attestation

When & Where:

Thursday, April 25, 2019 - 11:00 AM
246 Nichols Hall

Committee Members:

Prasad Kulkarni, Chair
Arvin Agah
Perry Alexander
Bo Luo
Kevin Leonard

Abstract

Measurers are critical to a remote attestation (RA) system to verify the integrity of a remote untrusted host. Runtime measurers in a dynamic RA system sample the dynamic program state of the host to form evidence in order to establish trust by a remote system (appraisal system). However, existing runtime measurers are tightly integrated with specific software. Such measurers need to be generated anew for each software, which is a manual process that is both challenging and tedious.

In this paper we present a novel approach to decouple application-specific measurement policies from the measurers tasked with performing the actual runtime measurement. We describe the MSRR (MeaSeReR) Measurement Suite, a system of tools designed with the primary goal of reducing the high degree of manual effort required to produce measurement solutions at a per application basis.

The MSRR suite prototypes a novel general-purpose measurement system, the MSRR Measurement System, that is agnostic of the target application. Furthermore, we describe a robust high-level measurement policy language, MSRR-PL, that can be used to write per application policies for the MSRR Measurer. Finally, we provide a tool to automatically generate MSRR-PL policies for target applications by leveraging state of the art static analysis tools.

In this work, we show how the MSRR suite can be used to significantly reduce the time and effort spent on designing measurers anew for each application. We describe MSRR's robust querying language, which allows the appraisal system to accurately specify the what, when, and how to measure. We describe the capabilities and the limitations of our measurement policy generation tool. We evaluate MSRR's overhead and demonstrate its functionality by employing real-world case studies. We show that MSRR has an acceptable overhead on a host of applications with various measurement workloads.