Defense Notices


All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check and post the presentation announcement online.

Upcoming Defense Notices

Xiangyu Chen

Toward Data Efficient Learning in Computer Vision

When & Where:


Nichols Hall, Room 246

Committee Members:

Cuncong Zhong, Chair
Prasad Kulkarni
Fengjun Li
Bo Luo
Guanghui Wang

Abstract

Deep learning leads the performance in many areas of computer vision. However, deep neural networks, with their growing number of parameters, usually require a large amount of data to train a good model. Collecting and labeling a large dataset is not always realistic, e.g., for recognizing rare diseases in the medical field, and both collecting and labeling data are labor-intensive and time-consuming. In contrast, studies show that humans can recognize new categories from even a single example, which is the opposite of how current machine learning algorithms behave. Thus, data-efficient learning, where the scale of labeled data is relatively small, has attracted increasing attention recently. Following the key components of machine learning algorithms, data-efficient learning methods can be divided into three categories: data-based, model-based, and optimization-based. In this study, we investigate two data-based models and one model-based approach.

First, the most direct way to achieve data-efficient learning is to generate more data to mimic data-rich scenarios. To this end, we propose to integrate both spatial and Discrete Cosine Transform (DCT) based frequency representations to fine-tune the classifier. In addition to quantity, another property of data is its quality to the model, which differs from its quality to human eyes. Since language carries denser information than natural images, we propose to mimic language by explicitly increasing the input information density in the frequency domain. The goal of model-based methods in data-efficient learning is mainly to make models converge faster. After carefully examining the self-attention modules in Vision Transformers, we discover that trivial attention, due to its large amount, covers useful non-trivial attention. To solve this issue, we propose to divide attention weights into trivial and non-trivial ones by a threshold and to suppress the accumulated trivial attention weights. Extensive experiments demonstrate the effectiveness of the proposed models.
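The trivial-attention suppression described in the abstract can be illustrated with a toy sketch; the function name, threshold value, and renormalization step here are assumptions chosen for the example, not the thesis implementation:

```python
import numpy as np

def suppress_trivial_attention(attn, threshold=0.1):
    """Illustrative sketch: zero out attention weights below a threshold
    (the "trivial" ones) and renormalize each row so it still sums to 1."""
    pruned = np.where(attn < threshold, 0.0, attn)
    row_sums = pruned.sum(axis=-1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # guard against all-trivial rows
    return pruned / row_sums

# Toy 1x4 attention row: two tiny "trivial" weights accumulate to 0.1,
# diluting the two informative weights.
attn = np.array([[0.05, 0.05, 0.45, 0.45]])
out = suppress_trivial_attention(attn, threshold=0.1)
```

After suppression, the probability mass formerly spread over the trivial entries is redistributed to the non-trivial ones.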


Yousif Dafalla

Web-Armour: Mitigating Reconnaissance and Vulnerability Scanning with Injecting Scan-Impeding Delays in Web Deployments

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Alex Bardas, Chair
Drew Davidson
Fengjun Li
Bo Luo
ZJ Wang

Abstract

Scanning hosts on the internet for vulnerable devices and services is a key step in numerous cyberattacks. Previous work has shown that scanning is a widespread phenomenon on the internet and commonly targets web application/server deployments. Given that automated scanning is a crucial step in many cyberattacks, it would be beneficial to make it more difficult for adversaries to perform such activity.

In this work, we propose Web-Armour, a mitigation approach against adversarial reconnaissance and vulnerability scanning of web deployments. The proposed approach relies on injecting scan-impeding delays into infrequently or rarely used portions of a web deployment. Web-Armour has two goals: first, increase the cost for attackers to perform automated reconnaissance and vulnerability scanning; second, introduce minimal to negligible performance overhead for benign users of the deployment. We evaluate Web-Armour on live environments operated by real users and in different controlled (offline) scenarios. We show that Web-Armour can effectively thwart reconnaissance and internet-wide scanning.
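The core delay-injection idea can be sketched in a few lines; the path list, delay range, and function below are hypothetical illustrations, not Web-Armour's actual mechanism:

```python
import random
import time

# Illustrative sketch: scanners probe many rarely visited paths, so
# delaying only those paths slows automated scanning while leaving
# benign traffic to popular pages mostly unaffected.
POPULAR_PATHS = {"/", "/index.html", "/login"}  # hypothetical hot paths

def handle_request(path, max_delay=2.0):
    """Return the delay (seconds) injected before serving this path."""
    delay = 0.0
    if path not in POPULAR_PATHS:
        delay = random.uniform(0.5, max_delay)
        time.sleep(delay)  # scan-impeding delay on a rarely used path
    return delay
```

A scanner sweeping thousands of uncommon paths accumulates these delays, while a user browsing the popular pages sees none.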


Sandhya Kandaswamy

An Empirical Evaluation of Multi-Resource Scheduling for Moldable Workflows

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Hongyang Sun, Chair
Suzanne Shontz
Heechul Yun


Abstract

Resource scheduling plays a vital role in High-Performance Computing (HPC) systems. However, most scheduling research in HPC has focused on only a single type of resource (e.g., computing cores or I/O resources). With the advancement in hardware architectures and the increase in data-intensive HPC applications, there is a need to simultaneously embrace a diverse set of resources (e.g., computing cores, cache, memory, I/O, and network resources) in the design of runtime schedulers for improving the overall application performance. This thesis performs an empirical evaluation of a recently proposed multi-resource scheduling algorithm for minimizing the overall completion time (or makespan) of computational workflows comprised of moldable parallel jobs. Moldable parallel jobs allow the scheduler to select the resource allocations at launch time and thus can adapt to the available system resources (as compared to rigid jobs) while staying easy to design and implement (as compared to malleable jobs). The algorithm was proven to have a worst-case approximation ratio that grows linearly with the number of resource types for moldable workflows. In this thesis, a comprehensive set of simulations is conducted to empirically evaluate the performance of the algorithm using synthetic workflows generated by DAGGEN and moldable jobs that exhibit different speedup profiles. The results show that the algorithm fares better than the theoretical bound predicts, and it consistently outperforms two baseline heuristics under a variety of parameter settings, illustrating its robust practical performance.


Bernaldo Luc

FPGA Implementation of an FFT-Based Carrier Frequency Estimation Algorithm

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Erik Perrins, Chair
Morteza Hashemi
Rongqing Hui


Abstract

Carrier synchronization is an essential part of digital communication systems. In essence, carrier synchronization is the process of estimating and correcting any carrier phase and frequency differences between the transmitted and received signals. Typically, carrier synchronization is achieved using a phase-locked loop (PLL) system; however, this method is unreliable when experiencing frequency offsets larger than 30 kHz. This thesis evaluates the FPGA implementation of a combined FFT- and PLL-based carrier phase synchronization system. The algorithm includes a non-data-aided, FFT-based frequency estimator that is used to initialize a data-aided, PLL-based phase estimator. The frequency estimator employs a resource-efficient strategy of averaging several small FFTs instead of using one large FFT, which yields a rough estimate of the frequency offset. Since it is initialized with a rough frequency estimate, this hybrid design allows the PLL to start in a state close to frequency lock and focus mainly on phase synchronization. The results show that the algorithm delivers performance comparable to alternative frequency estimation strategies and simulation models, based on metrics such as bit-error rate (BER) and estimator error variance. Moreover, the FFT-initialized PLL approach improves the frequency acquisition range of the PLL while achieving BER performance similar to the PLL-only system.
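The averaged-small-FFTs strategy can be illustrated with a toy numeric sketch; the sample rate, FFT size, and block count are assumptions chosen for this example, not the thesis parameters:

```python
import numpy as np

# Illustrative sketch: coarse frequency-offset estimation by averaging
# the power spectra of several small FFTs instead of one large FFT.
fs = 1.0e6          # sample rate (assumed for this toy example)
f_off = 50.0e3      # true frequency offset to recover
n_fft = 64          # small FFT size
n_blocks = 8        # number of blocks to average

n = np.arange(n_fft * n_blocks)
x = np.exp(2j * np.pi * f_off / fs * n)  # noiseless complex tone

# Average |FFT|^2 over consecutive blocks, then pick the peak bin.
blocks = x.reshape(n_blocks, n_fft)
avg_spectrum = np.mean(np.abs(np.fft.fft(blocks, axis=1)) ** 2, axis=0)
peak_bin = int(np.argmax(avg_spectrum))
f_est = peak_bin * fs / n_fft  # rough estimate, resolution fs / n_fft
```

The estimate is only accurate to one FFT bin (fs/n_fft here), which is the "rough" estimate that the PLL then refines.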


Rakshitha Vidhyashankar

An empirical study of temporal knowledge graph and link prediction using longitudinal editorial data

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Zijun Yao, Chair
Prasad Kulkarni
Hongyang Sun


Abstract

Natural Language Processing (NLP) is an application of Machine Learning (ML) that focuses on deriving useful underlying facts from the semantics of articles, automatically extracting insights about how information can be pictured, presented, and interpreted. Knowledge graphs, as a promising medium for carrying structured linguistic information, are a desirable target for learning and visualization through artificial neural networks, in order to identify absent information and understand the hidden transitive relationships within it. In this study, we aim to construct temporal knowledge graphs of semantic information to facilitate better visualization of editorial data. Further, a neural network-based approach for link prediction is carried out on the constructed knowledge graphs. This study uses English-language news articles from the New York Times (NYT), collected over a period of time, for the experiments. The sentences in these articles are decomposed using Part-Of-Speech (POS) tags to give a triple t = {sub, pred, obj}. A directed graph G(V, E) is constructed from these triples, such that the set of vertices comprises the grammatical constructs that appear in the sentences and the set of edges comprises the directed relations between the constructs. The main challenge that arises with knowledge graphs is the storage constraint associated with the graph information; the study proposes ways to handle this. Once the graphs are constructed, a neural architecture is trained to learn graph embeddings, which are utilized to predict potentially missing links that are transitive in nature. The results are evaluated using learning-to-rank metrics such as Mean Reciprocal Rank (MRR).
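The triple-to-graph construction can be sketched as follows; the example triples are hypothetical, and the study's actual extraction pipeline operates on NYT articles:

```python
# Illustrative sketch: build a directed graph from
# (subject, predicate, object) triples, with predicates as edge labels.
triples = [
    ("earthquake", "struck", "city"),        # hypothetical example triples
    ("city", "declared", "emergency"),
    ("government", "declared", "emergency"),
]

vertices = set()
edges = {}  # (sub, obj) -> predicate label on the directed edge sub -> obj
for sub, pred, obj in triples:
    vertices.update([sub, obj])
    edges[(sub, obj)] = pred
```

A transitive link that is absent from the extracted edges, such as a direct relation between "earthquake" and "emergency", is exactly the kind of missing link that the embedding model would be trained to predict.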


Jace Kline

A Framework for Assessing Decompiler Inference Accuracy of Source-Level Program Constructs

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Prasad Kulkarni, Chair
Perry Alexander
Bo Luo


Abstract

Decompilation is the process of reverse engineering a binary program into an equivalent source code representation, with the objective of recovering high-level program constructs such as functions, variables, data types, and control flow mechanisms. Decompilation is applicable in many contexts, particularly for security analysts attempting to decipher the construction and behavior of malware samples. However, due to the loss of information during compilation, this process is inherently speculative and thus prone to inaccuracy. This inherent speculation motivates the idea of an evaluation framework for decompilers.

In this work, we present a novel framework to quantitatively evaluate the inference accuracy of decompilers, regarding functions, variables, and data types. Within our framework, we develop a domain-specific language (DSL) for representing such program information from any "ground truth" or decompiler source. Using our DSL, we implement a strategy for comparing ground truth and decompiler representations of the same program. Subsequently, we extract and present insightful metrics illustrating the accuracy of decompiler inference regarding functions, variables, and data types, over a given set of benchmark programs. We leverage our framework to assess the correctness of the Ghidra decompiler when compared to ground truth information scraped from DWARF debugging information. We perform this assessment over a subset of the GNU Core Utilities (Coreutils) programs and discuss our findings.


Jaypal Singh

EvalIt: Skill Evaluation Using Blockchain

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Drew Davidson, Chair
David Johnson
Hongyang Sun


Abstract

Skill validation is a key issue when hiring workers. Companies and universities often face difficulties in determining an applicant's skills, because certification of the skills claimed by an applicant is usually not readily verifiable, and verification is costly. From the applicant's perspective, a skill evaluation by an industry expert is also more valuable than completing a generalized course with certification; most certification programs are easy and have proven not very fruitful for learning the required work skills. Blockchain has been proposed in the literature for functional verification and tamper-proof information storage in a decentralized way. "EvalIt" is a blockchain-based Dapp that addresses the above issues and guarantees some desirable properties. The Dapp facilitates skill-evaluation efforts through token payments collected from users of the platform.


Soma Pal

Properties of Profile-guided Compiler Optimization with GCC and LLVM

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Prasad Kulkarni, Chair
Mohammad Alian
Tamzidul Hoque


Abstract

Profile-guided optimizations (PGO) are a class of sophisticated compiler transformations that employ information regarding the profile or execution time behavior of a program to improve program performance, typically speed. PGOs for popular language platforms, like C, C++, and Java, are generally regarded as a mature and mainstream technology and are supported by most standard compilers. Consequently, properties and characteristics of PGOs are assumed to be established and known but have rarely been systematically studied with multiple mainstream compilers.

The goal of this work is to explore and report some important properties of PGOs in mainstream compilers, specifically GCC and LLVM. We study the performance delivered by PGOs at the program and function levels, the impact of different execution profiles on PGO performance, and the relative PGO benefit delivered by different mainstream compilers. We also built the experimental framework used to conduct this research. We expect that our work will help focus future research and assist in building frameworks to field PGOs in actual systems.


Past Defense Notices


Samyak Jain

Monkeypox Detection Using Computer Vision

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Prasad Kulkarni, Chair
David Johnson (Co-Chair)
Hongyang Sun


Abstract

As the world recovers from the damage caused by the spread of COVID-19, the monkeypox virus poses a new threat of becoming a global pandemic. The monkeypox virus itself is not as deadly or contagious as COVID-19, but many countries report new patient cases every day, so it would not be surprising if the world faced another pandemic due to a lack of proper precautions. Recently, deep learning has shown great potential in image-based diagnostics, such as cancer detection, tumor cell identification, and COVID-19 patient detection. Since monkeypox manifests on human skin, images of affected skin can be captured and used in a similar way for disease diagnosis. This project presents a deep learning approach for detecting monkeypox disease from skin lesion images. Several pre-trained deep learning models, such as ResNet50 and MobileNet, are deployed on the dataset to classify monkeypox and other diseases.


Grace Young

Quantum Algorithms & the Hidden Subgroup Problem

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Matthew Moore, Chair
Perry Alexander
Esam El-Araby
Cuncong Zhong
KC Kong

Abstract

In the last century, we have seen incredible growth in the field of quantum computing. Quantum computing offers us the opportunity to find efficient solutions to certain computational problems that are intractable on classical computers. One class of problems that seems to benefit from quantum computing is the Hidden Subgroup Problem (HSP). In the following proposal, we will examine the basics of quantum computing as well as current research surrounding the HSP. We will also discuss the importance of the HSP and its relation to other popular problems such as the Integer Factoring, Discrete Logarithm, Shortest Vector, and Subset Sum problems.

The proposed research aims to develop a quantum algorithmic polynomial-time reduction for special cases of the HSP in which the parameterizing group is the Dihedral group; this case is known as the Dihedral HSP (DHSP). The usual approach to the HSP relies on harmonic analysis in the domain of the problem, and the best-known algorithm using this approach is sub-exponential, but still super-polynomial. The algorithm we have designed instead focuses on the structure encoded in the codomain, using this structure to direct a "walk" down the subgroup lattice that terminates at the hidden subgroup.


Victor Alberto Lopez Nikolskiy

Maximum Power Point Tracking For Solar Harvesting Using Industry Implementation Of Perturb And Observe with Integrated Circuits

When & Where:


Eaton Hall, Room 2001B

Committee Members:

James Stiles, Chair
Christopher Allen
Patrick McCormick


Abstract

This project is not a new idea or an innovative method; it consists of the implementation of techniques already used in the consumer industry.

The purpose of this project is to implement a compact and low-weight Maximum Power Point Tracking (MPPT) solar harvesting device intended for a small fixed-wing unmanned aircraft. For the selected aircraft, up to 25% of the load could be supplied by the MPPT device and the installed solar cells.

The MPPT device was designed around the Texas Instruments SM72445 Integrated Circuit and its technical documentation. The prototype was evaluated using a Photovoltaic Profile Emulator Power Supply and a LiPo battery.

The device performed MPPT on one of the tested current-voltage (IV) profiles, reaching the Maximum Power Point (MPP), but it did not maintain the MPP. Under an additional external DC load or different IV profiles, the emulator operated in prohibited operating conditions; the probable cause of this behavior is instability in the emulator's output. The inputs to the controller and the response behaviors of the H-bridge circuit were as expected and designed.


Koyel Pramanick

Detection of measures devised by the compiler to improve security of the generated code

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Prasad Kulkarni, Chair
Drew Davidson
Fengjun Li
Bo Luo
John Symons

Abstract

The main aim of this thesis is to identify provisions employed by the compiler to ensure the security of an arbitrary binary. These provisions are security techniques applied automatically by the compiler during the system build process. Compilers provide a number of security checks that can be applied statically, at compile time, to protect software from attacks that target code vulnerabilities. Most compilers use warnings to indicate potential code bugs, and run-time security checks that add instrumentation code to the binary to detect problems during execution. Our first work develops a language-agnostic and compiler-agnostic experimental framework that determines the presence of targeted compiler-based run-time security checks in any binary. Our next work explores whether unresolved compiler-generated warnings can be detected in the binary when the source code is not available.


Ben Liu

Computational Microbiome Analysis: Method Development, Integration and Clinical Applications

When & Where:


Eaton Hall, Room 1

Committee Members:

Cuncong Zhong, Chair
Esam El-Araby
Bo Luo
Zijun Yao
Mizuki Azuma

Abstract

Metagenomics is the study of microbial genomes from a common environment. In most cases, metagenomic data refer to whole-genome shotgun sequencing data of the microbiota, which are fragmented DNA sequences from all regions of the microbial genomes. Because the data are generated without laboratory culture, they provide a more unbiased insight into, and uniquely enriched information about, the microbial community. Many researchers are currently interested in metagenomic data, and a wealth of software exists for various purposes at different stages of analysis. Most researchers build their own analysis pipelines based on their expertise, and pipelines built for the same purpose by two researchers might be disparate, thus affecting the conclusions of experiments.

My research involves the following: (1) We developed an assembly graph-based ncRNA search tool, named DRAGoM, to improve search quality in metagenomic data. (2) We proposed an automatic metagenomic data analysis pipeline generation system to extract, organize, and exploit the enormous amount of knowledge available in the literature. The system consists of two procedures: construction and generation. In the construction procedure, the system takes a corpus of raw textual data and updates the constructed pipeline network, whereas in the generation stage, the system recommends an analysis pipeline based on the user input. (3) We performed a meta-analysis of the taxonomic and functional features of the gut microbiome of non-small cell lung cancer patients treated with immunotherapy, to establish a model that predicts whether a patient will benefit from immunotherapy. We systematically studied the taxonomic characteristics of the dataset and used both random forest and multilayer perceptron neural network models to distinguish patients with progression-free survival above 6 months from those below 3 months.


Matthew Showers

Software-based Runtime Protection of Secret Assets in Untrusted Hardware under Zero Trust

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Tamzidul Hoque, Chair
Alex Bardas
Drew Davidson


Abstract

The complexity of the design and fabrication process of electronic devices is advancing with their ability to provide wide-ranging functionalities including data processing, sensing, communication, artificial intelligence, and security. Due to these complexities in the design and manufacturing process and the associated time and cost, system developers often prefer to procure off-the-shelf components directly from the market instead of developing custom Integrated Circuits (ICs) from scratch. Procurement of Commercial-Off-The-Shelf (COTS) components significantly reduces system development time and cost, enables easy integration of new technologies, and facilitates smaller production runs. Moreover, since various companies use the same COTS ICs, they are generally available in the market for a long period and are easy to replace.

Although utilizing COTS parts can provide many benefits, it also introduces serious security concerns. None of the entities in the COTS IC supply chain are trusted from a consumer's perspective, leading to a "Zero Trust" supply chain threat model. Any of these entities could introduce hidden malicious circuits or hardware Trojans within the component that could help an attacker in the field extract secret information (e.g., cryptographic keys) or cause a functional failure. Existing solutions to counter hardware Trojans are inapplicable in a zero-trust scenario, as they assume either the design house or the foundry to be trusted. Moreover, many solutions require access to the design for analysis or modification to enable the countermeasure.

In this work, we have proposed a software-oriented countermeasure to ensure the confidentiality of secret assets against hardware Trojan attacks in untrusted COTS microprocessors. The proposed solution does not require any supply chain entity to be trusted and does not require analysis or modification of the IC design.  

To protect secret assets in an untrusted microprocessor, the proposed method leverages the concept of residue number coding to transform the software functions operating on the asset to be homomorphic. We have presented a detailed security analysis to evaluate the confidentiality of a secret asset under Trojan attacks using the secret key of the Advanced Encryption Standard (AES) program as a case study. Finally, to help streamline the application of this protection scheme, we have developed a plugin for the LLVM compiler toolchain that integrates the solution without requiring extensive source code alterations.
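The residue number coding idea can be illustrated with a minimal sketch; the moduli and helper names are assumptions for this example, not the thesis design, which transforms machine-level code via an LLVM plugin:

```python
from math import prod

# Illustrative sketch of residue number system (RNS) coding: a value is
# held only as residues modulo pairwise-coprime moduli, and addition and
# multiplication are done component-wise on the residues, so intermediate
# computations never expose the plain integer.
MODULI = (7, 11, 13)  # hypothetical pairwise-coprime moduli; range 7*11*13

def encode(x):
    return tuple(x % m for m in MODULI)

def add(a, b):
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, MODULI))

def mul(a, b):
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, MODULI))

def decode(residues):
    """Chinese Remainder Theorem reconstruction (done only at the end)."""
    M = prod(MODULI)
    x = 0
    for r, m in zip(residues, MODULI):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)  # modular inverse of Mi mod m
    return x % M

secret = 42
masked = encode(secret)
result = decode(add(mul(masked, encode(3)), encode(5)))  # computes 42*3 + 5
```

Because the arithmetic is homomorphic over the residues, a Trojan observing any single residue channel sees only a value modulo one small modulus, not the secret itself.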


Madhuvanthi Mohan Vijayamala

Camouflaged Object Detection in Images using a Search-Identification based framework

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Prasad Kulkarni, Chair
David Johnson (Co-Chair)
Zijun Yao


Abstract

While identifying an object in an image is almost an instantaneous task for the human visual perception system, it takes more effort and time to process and identify a camouflaged object - an entity that flawlessly blends with the background in the image. This explains why it is much more challenging to enable a machine learning model to do the same, in comparison to generic object detection or salient object detection.

This project implements a framework called Search Identification Network, that simulates the search and identification pattern adopted by predators in hunting their prey and applies it to detect camouflaged objects. The efficiency of this framework in detecting polyps in medical image datasets is also measured.


Lumumba Harnett

Mismatched Processing for Radar Interference Cancellation

When & Where:


Nichols Hall, Room 129

Committee Members:

Shannon Blunt, Chair
Christopher Allen
Erik Perrins
James Stiles
Richard Hale

Abstract

Matched processing is a fundamental filtering operation within radar signal processing used to estimate scattering in the radar scene based on the transmit signal. Although matched processing maximizes the signal-to-noise ratio (SNR), the filtering operation is ineffective when interference is captured in the receive measurement. Adaptive interference mitigation combined with matched processing has proven able to mitigate interference and estimate the radar scene. However, a known caveat of matched processing is the resulting sidelobes, which may mask other scatterers. The sidelobes can be efficiently addressed by windowing, but this approach comes with limited suppression capability, a loss in resolution, and a loss in SNR. Recently emerged mismatched processing has been shown to optimally reduce sidelobes while maintaining nominal resolution and signal estimation performance. Throughout this work, re-iterative minimum mean-square error (RMMSE) adaptive and least-squares (LS) optimal mismatched processing are proposed for enhanced signal estimation, in unison with adaptive interference mitigation, for various radar applications including random pulse repetition interval (PRI) staggered pulse-Doppler radar, airborne ground moving target indication, and radar & communication spectrum sharing. Mismatched processing and adaptive interference cancellation can each be too computationally complex for practical implementation, so sub-optimal RMMSE and LS approaches are also introduced to address computational limitations. The efficacy of these algorithms is demonstrated using high-fidelity Monte Carlo simulations and open-air experimental datasets.


Naveed Mahmud

Towards Complete Emulation of Quantum Algorithms using High-Performance Reconfigurable Computing

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Esam El-Araby, Chair
Perry Alexander
Prasad Kulkarni
Heechul Yun
Tyrone Duncan

Abstract

Quantum computing is a promising technology that can potentially demonstrate supremacy over classical computing in solving specific problems. At present, two critical challenges for quantum computing are quantum state decoherence and the low scalability of current quantum devices. Decoherence places constraints on the realistic applicability of quantum algorithms, as real-life applications usually require complex equivalent quantum circuits to be realized. For example, encoding classical data on quantum computers for solving I/O- and data-intensive applications generally requires quantum circuits that violate decoherence constraints. In addition, current quantum devices are small-scale, having low quantum bit (qubit) counts and often producing inaccurate or noisy measurements, which also impacts the realistic applicability of real-world quantum algorithms. Consequently, benchmarking of existing quantum algorithms and investigation of new applications are heavily dependent on classical simulations that use costly, resource-intensive computing platforms. Hardware-based emulation has been proposed as a more cost-effective and power-efficient alternative. This work proposes a hardware-based emulation methodology for quantum algorithms using cost-effective Field-Programmable Gate Array (FPGA) technology. The proposed methodology consists of the three components required for complete emulation of quantum algorithms: the first models classical-to-quantum (C2Q) data encoding, the second emulates the behavior of quantum algorithms, and the third models the process of measuring the quantum state and extracting classical information, i.e., quantum-to-classical (Q2C) data decoding. The proposed emulation methodology is used to investigate and optimize methods for C2Q/Q2C data encoding/decoding, as well as several important quantum algorithms such as the Quantum Fourier Transform (QFT), the Quantum Haar Transform (QHT), and Quantum Grover's Search (QGS).
This work delivers contributions in terms of reducing the complexity of quantum circuits, extending and optimizing quantum algorithms, and developing new quantum applications. For higher emulation performance and scalability of the framework, hardware design techniques and architectural optimizations are investigated and proposed. The emulation architectures are designed and implemented on a high-performance reconfigurable computer (HPRC), and the proposed quantum circuits are implemented on a state-of-the-art quantum processor. Experimental results show that the proposed hardware architectures enable emulation of quantum algorithms with higher scalability, higher accuracy, and higher throughput than existing hardware-based emulators. As a case study, quantum image processing using multi-spectral images is considered for the experimental evaluations.


Eric Seals

Memory Bandwidth Dynamic Regulation and Throttling

When & Where:


Learned Hall, Room 3150

Committee Members:

Heechul Yun, Chair
Alex Bardas
Drew Davidson


Abstract

Multi-core, integrated CPU-GPU embedded systems provide new capabilities for sophisticated real-time systems with size, weight, and power limitations; however, interference between shared resources remains a challenge in providing necessary performance guarantees. The shared main memory is a notable system bottleneck, causing throughput slowdowns and timing unpredictability.
In this work, we propose a full-system mechanism that can provide memory bandwidth regulation across both the CPU and GPU complexes. The system monitors memory controller accesses directly through hardware statistics counters, performs memory regulation at the software level for real-time CPU tasks, and incorporates a feedback-based throttling mechanism for non-critical GPU kernels using hardware within the NVIDIA Tegra X1 memory controller subsystem. The system is built as a loadable Linux kernel module that extends the MemGuard tool. We show that this system makes CPU task execution more predictable against co-running, memory-intensive interference on either the CPU or the GPU.
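The budget-and-throttle loop behind this kind of bandwidth regulation can be sketched in user-space pseudocode; the class, numbers, and method names are illustrative only, since the actual system is an in-kernel MemGuard extension reading hardware counters:

```python
# Illustrative sketch of per-period memory-bandwidth regulation: each
# regulation period a core receives a memory-access budget; once the
# counter delta shows the budget is spent, the core is throttled until
# the next period replenishes it.
class BandwidthRegulator:
    def __init__(self, budget_per_period):
        self.budget = budget_per_period  # allowed accesses per period
        self.used = 0
        self.throttled = False

    def record_accesses(self, n):
        """Called with the delta read from a hardware statistics counter."""
        self.used += n
        if self.used >= self.budget:
            self.throttled = True  # in-kernel: suspend the core's tasks

    def new_period(self):
        """Periodic timer tick: replenish the budget and lift throttling."""
        self.used = 0
        self.throttled = False

reg = BandwidthRegulator(budget_per_period=1000)
reg.record_accesses(600)
reg.record_accesses(500)   # budget exceeded -> core gets throttled
```

Choosing the per-period budget trades off guaranteed bandwidth for the regulated core against the slack left for best-effort co-runners.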