Defense Notices


All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check, and post the presentation announcement online.

Upcoming Defense Notices

Andrew Riachi

An Investigation Into The Memory Consumption of Web Browsers and A Memory Profiling Tool Using Linux Smaps

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Prasad Kulkarni, Chair
Perry Alexander
Drew Davidson
Heechul Yun

Abstract

Web browsers are notorious for consuming large amounts of memory. Yet, they have become the dominant framework for writing GUIs because the web languages are ergonomic for programmers and have a cross-platform reach. These benefits are so enticing that even a large portion of mobile apps, which have to run on resource-constrained devices, are running a web browser under the hood. Therefore, it is important to keep the memory consumption of web browsers as low as practicable.

In this thesis, we investigate the memory consumption of web browsers, in particular, compared to applications written in native GUI frameworks. We introduce smaps-profiler, a tool to profile the overall memory consumption of Linux applications that can report memory usage other profilers simply do not measure. Using this tool, we conduct experiments which suggest that most of the extra memory usage compared to native applications could be due the size of the web browser program itself. We discuss our experiments and findings, and conclude that even more rigorous studies are needed to profile GUI applications.


Elizabeth Wyss

A New Frontier for Software Security: Diving Deep into npm

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Drew Davidson, Chair
Alex Bardas
Fengjun Li
Bo Luo
J. Walker

Abstract

Open-source package managers (e.g., npm for Node.js) have become an established component of modern software development. Rather than creating applications from scratch, developers may employ modular software dependencies and frameworks--called packages--to serve as building blocks for writing larger applications. Package managers make this process easy. With a simple command line directive, developers are able to quickly fetch and install packages across vast open-source repositories. npm--the largest of such repositories--alone hosts millions of unique packages and serves billions of package downloads each week. 

However, the widespread code sharing resulting from open-source package managers also presents novel security implications. Vulnerable or malicious code hiding deep within package dependency trees can be leveraged downstream to attack both software developers and the end-users of their applications. This downstream flow of software dependencies--dubbed the software supply chain--is critical to secure.

This research provides a deep dive into the npm-centric software supply chain, exploring distinctive phenomena that impact its overall security and usability. Such factors include (i) hidden code clones--which may stealthily propagate known vulnerabilities, (ii) install-time attacks enabled by unmediated installation scripts, (iii) hard-coded URLs residing in package code, (iv) the impacts of open-source development practices, (v) package compromise via malicious updates, (vi) spammers disseminating phishing links within package metadata, and (vii) abuse of cryptocurrency protocols designed to reward the creators of high-impact packages. For each facet, tooling is presented to identify and/or mitigate potential security impacts. Ultimately, it is our hope that this research fosters greater awareness, deeper understanding, and further efforts to forge a new frontier for the security of modern software supply chains. 


Alfred Fontes

Optimization and Trade-Space Analysis of Pulsed Radar-Communication Waveforms using Constant Envelope Modulations

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Patrick McCormick, Chair
Shannon Blunt
Jonathan Owen


Abstract

Dual function radar communications (DFRC) is a method of co-designing a single radio frequency system to perform simultaneous radar and communications service. DFRC is ultimately a compromise between radar sensing performance and communications data throughput due to the conflicting requirements between the sensing and information-bearing signals.

A novel waveform-based DFRC approach is phase attached radar communications (PARC), where a communications signal is embedded onto a radar pulse via the phase modulation between the two signals. The PARC framework is used here in a new waveform design technique that designs the radar component of a PARC signal to match the PARC DFRC waveform expected power spectral density (PSD) to a desired spectral template. This provides better control over the PARC signal spectrum, which mitigates the issue of PARC radar performance degradation from spectral growth due to the communications signal. 

The characteristics of optimized PARC waveforms are then analyzed to establish a trade-space between radar and communications performance within a PARC DFRC scenario. This is done by sampling the DFRC trade-space continuum with waveforms that contain a varying degree of communications bandwidth, from a pure radar waveform (no embedded communications) to a pure communications waveform (no radar component). Radar performance, which is degraded by range sidelobe modulation (RSM) from the communications signal randomness, is measured from the PARC signal variance across pulses; data throughput is established as the communications performance metric. Comparing the values of these two measures as a function of communications symbol rate explores the trade-offs in performance between radar and communications with optimized PARC waveforms.


Qua Nguyen

Hybrid Array and Privacy-Preserving Signaling Optimization for NextG Wireless Communications

When & Where:


Zoom Defense, please email jgrisafe@ku.edu for link.

Committee Members:

Erik Perrins, Chair
Morteza Hashemi
Zijun Yao
Taejoon Kim
KC Kong

Abstract

This PhD research tackles two critical challenges in NextG wireless networks: hybrid precoder design for wideband sub-Terahertz (sub-THz) massive multiple-input multiple-output (MIMO) communications and privacy-preserving federated learning (FL) over wireless networks.

In the first part, we propose a novel hybrid precoding framework that integrates true-time delay (TTD) devices and phase shifters (PS) to counteract the beam squint effect - a significant challenge in the wideband sub-THz massive MIMO systems that leads to considerable loss in array gain. Unlike previous methods that only designed TTD values while fixed PS values and assuming unbounded time delay values, our approach jointly optimizes TTD and PS values under realistic time delays constraint. We determine the minimum number of TTD devices required to achieve a target array gain using our proposed approach. Then, we extend the framework to multi-user wideband systems and formulate a hybrid array optimization problem aiming to maximize the minimum data rate across users. This problem is decomposed into two sub-problems: fair subarray allocation, solved via continuous domain relaxation, and subarray gain maximization, addressed via a phase-domain transformation.

The second part focuses on preserving privacy in FL over wireless networks. First, we design a differentially-private FL algorithm that applies time-varying noise variance perturbation. Taking advantage of existing wireless channel noise, we jointly design differential privacy (DP) noise variances and users transmit power to resolve the tradeoffs between privacy and learning utility. Next, we tackle two critical challenges within FL networks: (i) privacy risks arising from model updates and (ii) reduced learning utility due to quantization heterogeneity. Prior work typically addresses only one of these challenges because maintaining learning utility under both privacy risks and quantization heterogeneity is a non-trivial task. We approach to improve the learning utility of a privacy-preserving FL that allows clusters of devices with different quantization resolutions to participate in each FL round. Specifically, we introduce a novel stochastic quantizer (SQ) that ensures a DP guarantee and minimal quantization distortion. To address quantization heterogeneity, we introduce a cluster size optimization technique combined with a linear fusion approach to enhance model aggregation accuracy. Lastly, inspired by the information-theoretic rate-distortion framework, a privacy-distortion tradeoff problem is formulated to minimize privacy loss under a given maximum allowable quantization distortion. The optimal solution to this problem is identified, revealing that the privacy loss decreases as the maximum allowable quantization distortion increases, and vice versa.

This research advances hybrid array optimization for wideband sub-THz massive MIMO and introduces novel algorithms for privacy-preserving quantized FL with diverse precision. These contributions enable high-throughput wideband MIMO communication systems and privacy-preserving AI-native designs, aligning with the performance and privacy protection demands of NextG networks.


Arin Dutta

Performance Analysis of Distributed Raman Amplification with Different Pumping Configurations

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Rongqing Hui, Chair
Morteza Hashemi
Rachel Jarvis
Alessandro Salandrino
Hui Zhao

Abstract

As internet services like high-definition videos, cloud computing, and artificial intelligence keep growing, optical networks need to keep up with the demand for more capacity. Optical amplifiers play a crucial role in offsetting fiber loss and enabling long-distance wavelength division multiplexing (WDM) transmission in high-capacity systems. Various methods have been proposed to enhance the capacity and reach of fiber communication systems, including advanced modulation formats, dense wavelength division multiplexing (DWDM) over ultra-wide bands, space-division multiplexing, and high-performance digital signal processing (DSP) technologies. To maintain higher data rates along with maximizing the spectral efficiency of multi-level modulated signals, a higher Optical Signal-to-Noise Ratio (OSNR) is necessary. Despite advancements in coherent optical communication systems, the spectral efficiency of multi-level modulated signals is ultimately constrained by fiber nonlinearity. Raman amplification is an attractive solution for wide-band amplification with low noise figures in multi-band systems.

Distributed Raman Amplification (DRA) have been deployed in recent high-capacity transmission experiments to achieve a relatively flat signal power distribution along the optical path and offers the unique advantage of using conventional low-loss silica fibers as the gain medium, effectively transforming passive optical fibers into active or amplifying waveguides. Also, DRA provides gain at any wavelength by selecting the appropriate pump wavelength, enabling operation in signal bands outside the Erbium doped fiber amplifier (EDFA) bands. Forward (FW) Raman pumping configuration in DRA can be adopted to further improve the DRA performance as it is more efficient in OSNR improvement because the optical noise is generated near the beginning of the fiber span and attenuated along the fiber. Dual-order FW pumping scheme helps to reduce the non-linear effect of the optical signal and improves OSNR by more uniformly distributing the Raman gain along the transmission span.

The major concern with Forward Distributed Raman Amplification (FW DRA) is the fluctuation in pump power, known as relative intensity noise (RIN), which transfers from the pump laser to both the intensity and phase of the transmitted optical signal as they propagate in the same direction. Additionally, another concern of FW DRA is the rise in signal optical power near the start of the fiber span, leading to an increase in the non-linear phase shift of the signal. These factors, including RIN transfer-induced noise and non-linear noise, contribute to the degradation of system performance in FW DRA systems at the receiver.

As the performance of DRA with backward pumping is well understood with relatively low impact of RIN transfer, our research  is focused on the FW pumping configuration, and is intended to provide a comprehensive analysis on the system performance impact of dual order FW Raman pumping, including signal intensity and phase noise induced by the RINs of both 1st and the 2nd order pump lasers, as well as the impacts of linear and nonlinear noise. The efficiencies of pump RIN to signal intensity and phase noise transfer are theoretically analyzed and experimentally verified by applying a shallow intensity modulation to the pump laser to mimic the RIN. The results indicate that the efficiency of the 2nd order pump RIN to signal phase noise transfer can be more than 2 orders of magnitude higher than that from the 1st order pump. Then the performance of the dual order FW Raman configurations is compared with that of single order Raman pumping to understand trade-offs of system parameters. The nonlinear interference (NLI) noise is analyzed to study the overall OSNR improvement when employing a 2nd order Raman pump. Finally, a DWDM system with 16-QAM modulation is used as an example to investigate the benefit of DRA with dual order Raman pumping and with different pump RIN levels. We also consider a DRA system using a 1st order incoherent pump together with a 2nd order coherent pump. Although dual order FW pumping corresponds to a slight increase of linear amplified spontaneous emission (ASE) compared to using only a 1st order pump, its major advantage comes from the reduction of nonlinear interference noise in a DWDM system. Because the RIN of the 2nd order pump has much higher impact than that of the 1st order pump, there should be more stringent requirement on the RIN of the 2nd order pump laser when dual order FW pumping scheme is used for DRA for efficient fiber-optic communication. Also, the result of system performance analysis reveals that higher baud rate systems, like those operating at 100Gbaud, are less affected by pump laser RIN due to the low-pass characteristics of the transfer of pump RIN to signal phase noise.


Audrey Mockenhaupt

Using Dual Function Radar Communication Waveforms for Synthetic Aperture Radar Automatic Target Recognition

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Patrick McCormick, Chair
Shannon Blunt
Jon Owen


Abstract

Pending.


Rich Simeon

Delay-Doppler Channel Estimation for High-Speed Aeronautical Mobile Telemetry Applications

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Erik Perrins, Chair
Shannon Blunt
Morteza Hashemi
Jim Stiles
Craig McLaughlin

Abstract

The next generation of digital communications systems aims to operate in high-Doppler environments such as high-speed trains and non-terrestrial networks that utilize satellites in low-Earth orbit. Current generation systems use Orthogonal Frequency Division Multiplexing modulation which is known to suffer from inter-carrier interference (ICI) when different channel paths have dissimilar Doppler shifts.

A new Orthogonal Time Frequency Space (OTFS) modulation (also known as Delay-Doppler modulation) is proposed as a candidate modulation for 6G networks that is resilient to ICI. To date, OTFS demodulation designs have focused on the use cases of popular urban terrestrial channel models where path delay spread is a fraction of the OTFS symbol duration. However, wireless wide-area networks that operate in the aeronautical mobile telemetry (AMT) space can have large path delay spreads due to reflections from distant geographic features. This presents problems for existing channel estimation techniques which assume a small maximum expected channel delay, since data transmission is paused to sound the channel by an amount equal to twice the maximum channel delay. The dropout in data contributes to a reduction in spectral efficiency.

Our research addresses OTFS limitations in the AMT use case. We start with an exemplary OTFS framework with parameters optimized for AMT. Following system design, we focus on two distinct areas to improve OTFS performance in the AMT environment. First we propose a new channel estimation technique using a pilot signal superimposed over data that can measure large delay spread channels with no penalty in spectral efficiency. A successive interference cancellation algorithm is used to iteratively improve channel estimates and jointly decode data. A second aspect of our research aims to equalize in delay-Doppler space. In the delay-Doppler paradigm, the rapid channel variations seen in the time-frequency domain is transformed into a sparse quasi-stationary channel in the delay-Doppler domain. We propose to use machine learning using Gaussian Process Regression to take advantage of the sparse and stationary channel and learn the channel parameters to compensate for the effects of fractional Doppler in which simpler channel estimation techniques cannot mitigate. Both areas of research can advance the robustness of OTFS across all communications systems.


Mohammad Ful Hossain Seikh

AAFIYA: Antenna Analysis in Frequency-domain for Impedance and Yield Assessment

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Jim Stiles, Chair
Rachel Jarvis
Alessandro Salandrino


Abstract

This project presents AAFIYA (Antenna Analysis in Frequency-domain for Impedance and Yield Assessment), a modular Python toolkit developed to automate and streamline the characterization and analysis of radiofrequency (RF) antennas using both measurement and simulation data. Motivated by the need for reproducible, flexible, and publication-ready workflows in modern antenna research, AAFIYA provides comprehensive support for all major antenna metrics, including S-parameters, impedance, gain and beam patterns, polarization purity, and calibration-based yield estimation. The toolkit features robust data ingestion from standard formats (such as Touchstone files and beam pattern text files), vectorized computation of RF metrics, and high-quality plotting utilities suitable for scientific publication.

Validation was carried out using measurements from industry-standard electromagnetic anechoic chamber setups involving both Log Periodic Dipole Array (LPDA) reference antennas and Askaryan Radio Array (ARA) Bottom Vertically Polarized (BVPol) antennas, covering a frequency range of 50–1500 MHz. Key performance metrics, such as broadband impedance matching, S11 and S21 related calculations, 3D realized gain patterns, vector effective lengths,  and cross-polarization ratio, were extracted and compared against full-wave electromagnetic simulations (using HFSS and WIPL-D). The results demonstrate close agreement between measurement and simulation, confirming the reliability of the workflow and calibration methodology.

AAFIYA’s open-source, extensible design enables rapid adaptation to new experiments and provides a foundation for future integration with machine learning and evolutionary optimization algorithms. This work not only delivers a validated toolkit for antenna research and pedagogy but also sets the stage for next-generation approaches in automated antenna design, optimization, and performance analysis.


Past Defense Notices

Dates

SANTOSH GONDI

Design, Implementation, and Performance Analysis of In-Home Video based Monitoring System for Patients with Dementia

When & Where:


246 Nichols Hall

Committee Members:

James Sterbenz, Chair
Victor Frost
Bo Luo
Russ Waitman

Abstract

Dementia is a major public health problem affecting 35 million people in USA. The caregivers of dementia patients experience many types of physical and psychological stress while dealing with disruptive behaviors of dementia patients. This will also result in frequent hospitalizations and re-admissions. In this project we design, implement, and measure the performance of an advanced video based monitoring system to aide the caregivers in managing the behavioral symptoms of dementia patients. The caregivers will be able to easily capture and share the antecedents, consequences, and the function of behavior, through a video clip, and get the real-time feedback from clinical experts. Overall the system will help in reducing the hospital admission/readmission, improve the quality of life for caregivers, and in general result in reduced cost of health care systems. System is developed using python scripts, open source web frameworks, FFmpeg tool chain, and commercial off-the-shelf IP camera and mini-PC. WebRTC is used for video based coaching of caregivers. A framework has been developed to evaluate the storage and retrieval latency of video clips to public and On-premise clouds, video streaming performance in LAN and WLAN environments, and WebRTC performance in different types of access networks. InstaGENIrack, a GENI rack in KU is used as on-premise cloud infrastructure for the evaluation. OpenSSL utilities are employed for secured transport and storage of captured video clips. We conducted the trials in Google fiber ISP in Kansas city, and compared the performance with other traditional ISPs..


ANSU JOYS

Identifying Software Phase Markers in Java Byte Code

When & Where:


250 Nichols Hall

Committee Members:

Prasad Kulkarni, Chair
Andy Gill
Bo Luo


Abstract

Program execution can be classified into phases. These phases can be repeated during a single execution of the application. Ability to identify and classify the phases statically will help prepare the system early for the next phase, which can benefit overall program performance at run time. While static program phase detection algorithms have been explored for binary executable with promising results, to the best of our knowledge, such algorithms have not been targeted and evaluated for managed language, specifically Java, programs. Accurate detection of future program phases can allow the Java virtual machine to perform phase-specific optimizations to improve performance. 

In this project, we build a framework to detect program phases and insert software phase markers in Java byte code. We employ an existing algorithm to detect program phases and adapt it to detect phases for Java binaries. We modify the control flow graph generated by the byte code analysis tool WALA (Watson library for Analysis) to integrate program loops. We analyze each method in the control flow graph produced by WALA to detect loops, and convert the call graph into a "call loop graph". We then rely on program profiling to provide data on the number of times each basic block and edge is reached at run-time. We use this profiling information to determine the average number of instructions executed along each graph edge, the average number of times an edge is executed and the standard deviation for instructions on these edges in our algorithm to identify the software phase markers. 


ADITYA BALASUBRAMANIAN

Study and Performance Analysis of OFDM using GNURadio and USRP

When & Where:


250 Nichols Hall

Committee Members:

Lingjia Liu, Chair
Joe Evans
James Sterbenz


Abstract

Software defined radios (SDR) are a rapidly evolving technology which are used widely in industry and academia today. They offer a very low cost and flexible alternative for implementing and testing wireless technologies since most of the physical layer functionalities are implemented in 
software instead of hardware. Universal Software Defined Radio Peripheral (USRP) is one of the most popular products belong to the family of SDR. GNURadio, a software development kit comprising of C++ and Python libraries is widely used with USRP as a hardware platform to create SDR applications. 
In this project a tested is implemented for performance analysis of an OFDM communication system using GNURadio and USRP. The performance is analyzed and studied in a practical laboratory environment using GNURadio and USRP. The packet error rate versus SNR is calculated in different 
environmental settings .The effect of Interference and obstruction is also taken into account in studying the performance.


LOGAN SMITH

Validation of CReSIS Synthetic Aperture Radar Processor and Optimal Processing Parameters

When & Where:


317 Nichols Hall

Committee Members:

John Paden, Chair
Chris Allen
Carl Leuschen


Abstract

Sounding the ice sheets of Greenland and Antarctica is a vital component in determining the affect of global warming on sea level rise. Of particular importance to measure are the outlet glaciers that transport ice from the interior to the edge of the ice sheet. These outlet glaciers are difficult to sound due to crevassing caused by the relatively fast movement of the ice in the glacial channel and higher signal attenuation caused by warmer ice. The Center for Remote Sensing of Ice Sheets (CReSIS) uses multi-channel airborne radars with methods for achieving better resolution and signal-to-noise ratio (SNR) in the three major dimension to sound outlet glaciers. Synthetic aperture radar (SAR) techniques are used in the along-track dimension, pulse compression in the range dimension, and an antenna array in the cross-track dimension. 

CReSIS has developed a SAR processor to effectively and efficiently process the data collected by these radars in each dimension. To validate the performance of this processor a SAR simulator was developed with the functionality to test multiple aspects of the SAR processor. In addition to the implementation of this simulator for validation of processing the data in the along-track, cross-track and range dimensions, there are a number of data-dependent processing steps that can affect the quality of the final data product. These include creating matched filters for each dimension of the data, removing phase and amplitude differences between receive channels, and determining the optimal along-track beamwidth to use for processing the data. All of these factors can improve the ability to obtain the maximum amount of information from the collected data. The validation and optimal processing parameters and their theory are discussed here. 


H. SHANKER RAO

Dominant Attribute and Multiple Scanning Approaches for Discretization of Numerical Attributes

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Perry Alexander
Doina Caragea


Abstract

Rapid development of high throughput technologies and database management systems has made it possible to produce and store large amount of data. However, making sense of big data and discovering knowledge from it is a compounding challenge. Generally, data mining techniques search for information in datasets and express gained knowledge in the form of trends, regularities, patterns or rules. Rules are frequently identified automatically by a technique called rule induction, which is the most important technique in data mining and machine learning and it was developed primarily to handle symbolic data. However, real life data often contain numerical attributes and therefore, in order to fully utilize the power of rule induction techniques, an essential preprocessing step of converting numeric data into symbolic data called discretization is employed in data mining. 
Here we present two entropy based discretization techniques known as dominant attribute approach and multiple scanning approach, respectively. These approaches were implemented as two explicit algorithms in a JAVA programming language and experiments were conducted by applying each algorithm separately on seventeen well known numerical data sets. The resulting discretized data sets were used for rule induction by LEM2 or Learning from Examples Module 2 algorithm. For each dataset in multiple scanning approach, experiments were repeated with incremental scans until interval counts were stabilized. Preliminary results from this study indicated that multiple scanning approach performed better than dominant attribute approach in terms of producing comparatively smaller and simpler rule sets. 


YI ZHU

Matrix and Tensor-based ESPRIT Algorithm for Joint Angle and Delay Estimation in 2D Active Massive MIMO Systems and Analysis of Direction of Arrival Estimation Algorithms for Basal Ice Sheet Tomography

When & Where:


246 Nichols Hall

Committee Members:

Lingjia Liu, Chair
Shannon Blunt
John Paden
Erik Perrins

Abstract

In this thesis, we apply and analyze three direction of arrival (DoA) algorithms to tackle two distinct problems: one belongs to wireless communication, the other to radar signal processing. Though the essence of these two problems is DoA estimation, their formulation, underlying assumptions, application scenario, etc. are totally different. Hence, we write them separately, with ESPRIT algorithm the focus of Part I and MUSIC and MLE detailed in Part II. 

For wireless communication scenario, mobile data traffic is expected to have an exponential growth in the future. In “massive MIMO” systems, a base station will rely on the uplink sounding signals from mobile stations to figure out the spatial information to perform MIMO beamforming. Accordingly, multi-dimensional parameter estimation of a ray-based multipath wireless channel becomes crucial for such systems to realize the predicted capacity gains. We study joint angle and delay estimation for such system and results suggest that the dimension of the antenna array at the base station plays an important role in determining the estimation performance. These insights will be useful for designing practical “massive MIMO” systems in future mobile wireless communications. 

For the problem of radar sensing of ice sheet topography, one of the key requirements for deriving more realistic ice sheet models is to obtain a good set of basal measurements that enables accurate estimation of bed roughness and conditions. For this purpose, 3D tomography of the ice bed has been successfully implemented with the help of DoA. The SAR focused datasets provide a good case study. For the antenna array geometry and sample support used in our tomographic application, MUSIC performs better originally using a cross-over analysis where the estimated topography from crossing flight lines are compared for consistency. However, after several improvements applied to MLE, MLE outperforms MUSIC. We observe that, the spatial bottom smoothing, aiming to remove the artifacts made by MLE algorithm, is the most essential step in the post-processing procedure. The 3D tomography we obtained lays a good foundation for further analysis and modeling of ice sheets. 


YUHAO YANG

Protecting Attributes and Contents in Online Social Networks

When & Where:


250 Nichols Hall

Committee Members:

Bo Luo, Chair
Arvin Agah
Luke Huan
Prasad Kulkarni
Alfred Ho

Abstract

With the fast development of computer and information technologies, online social networks grow dramatically. While huge amount of information is distributed expeditiously in online social networking sites, privacy concerns arise. 
In this dissertation, we first study the vulnerabilities of user attributes and contents, in particular, the identifiability of the users when the adversary learns a small piece of information about the target. We further employ an information theory based approach to quantitatively evaluate the threats of attribute-based re-identification. We have shown that large portions of users with online presence are highly identifiable. 
The notion of privacy as control and information boundary has been introduced by the user-oriented privacy research community, and partly adopted in commercial social networking platforms. However, such functions are not widely accepted by the users, mainly because it is tedious and labor-intensive to manually assign friends into such circles. To tackle this problem, we introduce a social circle discovery approach using multi-view clustering. We present our observations on the key features of social circles, including friendship links, content similarity and social interactions. We treat each feature as one view, and propose a one-side co-trained spectral clustering technique, which is tailored for the sparse nature of our data. We evaluate our approach on real-world online social network data, and show that the proposed approach significantly outperforms structure-based clustering. Finally, we build a proof-of-concept demonstration of the automatic circle detection and recommendation approaches.


JAMUNA GOPAL

I Know Your Family: An Hybrid Information Retrieval Approach to Extract Family Information

When & Where:


250 Nichols Hall

Committee Members:

Bo Luo, Chair
Jerzy Grzymala-Busse
Prasad Kulkarni


Abstract

The aim of this project is to identify the family related information of a person from their Twitter Data. We use their personal details, tweets and their friends’ details in order to achieve this. Since, we deal with the modern world short text data; we have used a hybrid information retrieval methodology taking into account the Parts of Speech of the data, Phrase Similarity and the Semantic Similarity of the data along with the openly available twitter data. The future use of this research is to develop a Client Side protection tool that will help users validate the data to be posted for privacy breech.


KAIGE YAN

Power and Performance Co-optimization for Emerging Mobile Platforms

When & Where:


250 Nichols Hall

Committee Members:

Xin Fu, Chair
Prasad Kulkarni
Heechul Yun


Abstract

The mobile devices emerge as the most popular computing platform since 2011. Different from the traditional PC, the mobile devices are more power-constraint and performance-sensitive due to its size. In order to reduce the power consumption and improve the performance, we focus on the Last Level Cache (LLC), which is the power-hungry structure and critical to the performance in mobile platforms. In this project, we first integrate the McPAT power model into the Gem5 simulator. We also introduce the emerging memory technologies, such as Sprin-Transfer Torque RAM (STT-RAM) and embedded DRAM (eDRAM), into the cache design and compare their power and performance effectiveness with the conventional SRAM-based cache. Additionally, we identify that the frequent execution switch between the kernel and user code is the major reason for the high LLC miss in mobile applications. This is because blocks belonging to kernel and user space have severe interferences. We further propose static and dynamic way partition schemes to separate the cache blocks from kernel and user space. The experiment results show promising power reduction and performance improvement with our proposed techniques.


MICHAEL JANTZ

Exploring Dynamic Compilation and Cross-Layer Object Management Policies for Managed Language Applications

When & Where:


246 Nichols Hall

Committee Members:

Prasad Kulkarni, Chair
Xin Fu
Andy Gill
Bo Luo
Karen Nordheden

Abstract

Recent years have witnessed the widespread adoption of managed programming languages that are designed to execute on virtual machines. Virtual machine architectures provide several powerful software engineering advantages over statically compiled binaries, such as portable program represenations, additional safety guarantees, and automatic memory and thread management, which have largely driven their success. To support and facilitate the use of these features, virtual machines implement a number of services that adaptively manage and optimize application behavior during execution. Such runtime services often require tradeoffs between efficiency and effectiveness, and different policies can have major implications on the system's performance and energy requirements. 

In this work, we extensively explore policies for the two runtime services that are most important for achieving performance and energy efficiency: dynamic (or Just-In-Time (JIT)) compilation and memory management. First, we examine the properties of single-tier and multi-tier JIT compilation policies in order to find strategies that realize the best program performance for existing and future machines. We perform hundreds of experiments with different compiler aggressiveness and optimization levels to evaluate the performance impact of varying if and when methods are compiled. Next, we investigate the issue of how to optimize program regions to maximize performance in JIT compilation environments. For this study, we conduct a thorough analysis of the behavior of optimization phases in our dynamic compiler, and construct a custom experimental framework to determine the performance limits of phase selection during dynamic compilation. Lastly, we explore innovative memory management strategies to improve energy efficiency in the memory subsystem. We propose and develop a novel cross-layer approach to memory management that integrates information and analysis in the VM with fine-grained management of memory resources in the operating system. Using custom as well as standard benchmark workloads, we perform detailed evaluation that demonstrates the energy-saving potential of our approach.