Defense Notices


All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check and post the presentation announcement online.

Upcoming Defense Notices

Sudha Chandrika Yadlapalli

BERT-Driven Sentiment Analysis: Automated Course Feedback Classification and Ratings

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Prasad Kulkarni
Hongyang Sun


Abstract

Automating the analysis of unstructured textual data, such as student course feedback, is crucial for gaining actionable insights. This project focuses on developing a sentiment analysis system leveraging the DeBERTa-v3-base model, a variant of BERT (Bidirectional Encoder Representations from Transformers), to classify feedback sentiments and generate corresponding ratings on a 1-to-5 scale.

A dataset of 100,000+ student reviews was preprocessed and used to fine-tune the model so that it handles class imbalances and captures contextual nuances. Training was conducted on high-performance A100 GPUs, which enhanced computational efficiency and reduced training times significantly. The trained BERT sentiment model demonstrated superior performance compared to traditional machine learning models, achieving ~82% accuracy in sentiment classification.

The model was integrated into a functional web application, providing a streamlined approach to evaluate and visualize course reviews dynamically. Key features include a course ratings dashboard, allowing students to view aggregated ratings for each course, and a review submission feature where new feedback is analyzed for sentiment in real time. For the department, an admin page provides secure access to detailed analytics, such as the distribution of positive and negative reviews, visualized trends, and individual course reviews with their corresponding sentiment scores.

This project includes a comprehensive pipeline, starting from data preprocessing and model training to deploying an end-to-end application. Traditional machine learning models, such as Logistic Regression and Decision Tree, were initially tested but yielded suboptimal results. The adoption of BERT, trained on a large dataset of 100k reviews, significantly improved performance, showcasing the benefits of advanced transformer-based models for sentiment analysis tasks.


Shriraj K. Vaidya

Exploring DL Compiler Optimizations with TVM

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Prasad Kulkarni, Chair
Dongjie Wang
Zijun Yao


Abstract

Deep Learning (DL) compilers, also called Machine Learning (ML) compilers, take a computational graph representation of an ML model as input and apply graph-level and operator-level optimizations to generate optimized machine code for different supported hardware architectures. DL compilers can apply several graph-level optimizations, including operator fusion, constant folding, and data layout transformations, to convert the input computation graph into a functionally equivalent and optimized variant. DL compilers also perform kernel scheduling, which is the task of finding the most efficient implementation for the operators in the computational graph. While many research efforts have focused on exploring different kernel scheduling techniques and algorithms, the benefits of individual computation graph-level optimizations are not as well studied. In this work, we employ the TVM compiler to perform a comprehensive study of the impact of different graph-level optimizations on the performance of DL models on CPUs and GPUs. We find that TVM's graph optimizations can improve model performance by up to 41.73% on CPUs and 41.6% on GPUs, and by 16.75% and 21.89%, on average, on CPUs and GPUs, respectively, on our custom benchmark suite.
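
As a toy illustration of one graph-level optimization named above, the sketch below applies constant folding to a tiny expression graph. The tuple-based graph encoding and the `fold_constants` helper are hypothetical and chosen for illustration; they are not TVM's representation or API.

```python
import operator

# Hypothetical toy graph: a node is a number (constant), a string (runtime
# input), or a tuple (op, left, right). This is not TVM's representation.
OPS = {"add": operator.add, "mul": operator.mul}


def fold_constants(node):
    """Replace any subgraph whose inputs are all constants with its value."""
    if not isinstance(node, tuple):
        return node  # constant or named input: nothing to fold
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return OPS[op](left, right)  # evaluate once, ahead of run time
    return (op, left, right)


# (2 * 3) + (x * (1 + 1)): two of the three inner operations are constant.
graph = ("add", ("mul", 2, 3), ("mul", "x", ("add", 1, 1)))
folded = fold_constants(graph)
print(folded)
```

The folded graph is functionally equivalent to the input but contains fewer operators to execute at run time, which is the property the graph-level optimizations in the abstract rely on.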


Rizwan Khan

Fatigue crack segmentation of steel bridges using deep learning models - a comparative study.

When & Where:


Learned Hall, Room 3131

Committee Members:

David Johnson, Chair
Hongyang Sun



Abstract

Structural health monitoring (SHM) is crucial for maintaining the safety and durability of infrastructure. To address the limitations of traditional inspection methods, this study leverages deep learning-based segmentation models for autonomous crack identification. Specifically, we use the recently released YOLOv11 model alongside the established DeepLabv3+ model for crack segmentation. Mask R-CNN, a widely recognized model in crack segmentation studies, is used as the baseline for comparison. Our approach integrates the CREC cropping strategy to optimize dataset preparation and employs post-processing techniques, such as dilation and erosion, to refine segmentation results. Experimental results demonstrate that our method, which combines state-of-the-art models, innovative data preparation strategies, and targeted post-processing, achieves superior mean Intersection-over-Union (mIoU) performance compared to the baseline, showcasing its potential for precise and efficient crack detection in SHM systems.
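
For readers unfamiliar with the metric, the following is a minimal sketch of how mIoU can be computed for a binary crack mask. It assumes a two-class average (crack and background); the abstract does not state how the study averages classes, and the masks and function names here are made up for illustration.

```python
def iou(pred, truth):
    """Intersection-over-Union of two binary masks given as nested lists."""
    inter = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0


def mean_iou(pred, truth):
    """Average IoU over the crack class (1) and the background class (0)."""
    def invert(mask):
        return [[1 - v for v in row] for row in mask]

    return (iou(pred, truth) + iou(invert(pred), invert(truth))) / 2


# Illustrative 3x3 masks: 1 marks a crack pixel.
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
truth = [[0, 1, 0],
         [0, 1, 0],
         [0, 1, 0]]

print(iou(pred, truth), mean_iou(pred, truth))
```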


Zhaohui Wang

Enhancing Security and Privacy of IoT Systems: Uncovering and Resolving Cross-App Threats

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Fengjun Li, Chair
Alex Bardas
Drew Davidson
Bo Luo
Haiyang Chao

Abstract

The rapid growth of Internet of Things (IoT) technology has brought unprecedented convenience to our daily lives, enabling users to customize automation rules and develop IoT apps to meet their specific needs. However, as IoT devices interact with multiple apps across various platforms, users are exposed to complex security and privacy risks. Even interactions among seemingly harmless apps can introduce unforeseen security and privacy threats.

In this work, we introduce two innovative approaches to uncover and address these concealed threats in IoT environments. The first approach investigates hidden cross-app privacy leakage risks in IoT apps. These risks arise from cross-app chains that are formed among multiple seemingly benign IoT apps. Our analysis reveals that interactions between apps can expose sensitive information such as user identity, location, tracking data, and activity patterns. We quantify these privacy leaks by assigning probability scores to evaluate the risks based on inferences. Additionally, we provide a fine-grained categorization of privacy threats to generate detailed alerts, enabling users to better understand and address specific privacy risks. To systematically detect cross-app interference threats, we propose to apply principles of logical fallacies to formalize conflicts in rule interactions. We identify and categorize cross-app interference by examining relations between events in IoT apps. We define new risk metrics for evaluating the severity of these interferences and use optimization techniques to resolve interference threats efficiently. This approach ensures comprehensive coverage of cross-app interference, offering a systematic solution compared to the ad hoc methods used in previous research.

To enhance forensic capabilities within IoT, we integrate blockchain technology to create a secure, immutable framework for digital forensics. This framework enables the identification, tracing, storage, and analysis of forensic information to detect anomalous behavior. Furthermore, we developed a large-scale, manually verified, comprehensive dataset of real-world IoT apps. This clean and diverse benchmark dataset supports the development and validation of IoT security and privacy solutions. Each of these approaches has been evaluated using our dataset of real-world apps, collectively offering valuable insights and tools for enhancing IoT security and privacy against cross-app threats.


Hao Xuan

A Unified Algorithmic Framework for Biological Sequence Alignment

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Cuncong Zhong, Chair
Fengjun Li
Suzanne Shontz
Hongyang Sun
Liang Xu

Abstract

Sequence alignment is pivotal in both homology searches and the mapping of reads from next-generation sequencing (NGS) and third-generation sequencing (TGS) technologies. Currently, the majority of sequence alignment algorithms utilize the “seed-and-extend” paradigm, designed to filter out unrelated or nonhomologous sequences when no highly similar subregions are detected. A well-known implementation of this paradigm is BLAST, one of the most widely used multipurpose aligners. Over time, this paradigm has been optimized in various ways to suit different alignment tasks. However, while these specialized aligners often deliver high performance and efficiency, they are typically restricted to one or a few alignment applications. To the best of our knowledge, no existing aligner can perform all alignment tasks while maintaining superior performance and efficiency.

In this work, we introduce a unified sequence alignment framework to address this limitation. Our alignment framework is built on the seed-and-extend paradigm but incorporates novel designs in its seeding and indexing components to maximize both flexibility and efficiency. The resulting software, the Versatile Alignment Toolkit (VAT), allows users to switch seamlessly between nearly all major alignment tasks through command-line parameter configuration. VAT was rigorously benchmarked against leading aligners for DNA and protein homolog searches, NGS and TGS read mapping, and whole-genome alignment. The results demonstrated VAT’s top-tier performance across all benchmarks, underscoring the feasibility of using a unified algorithmic framework to handle diverse alignment tasks. VAT can simplify and standardize bioinformatic analysis workflows that involve multiple alignment tasks.


Manu Chaudhary

Utilizing Quantum Computing for Solving Multidimensional Partial Differential Equations

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Esam El-Araby, Chair
Perry Alexander
Tamzidul Hoque
Prasad Kulkarni
Tyrone Duncan

Abstract

Quantum computing has the potential to revolutionize computational problem-solving by leveraging the quantum mechanical phenomena of superposition and entanglement, which allow a large amount of information to be processed simultaneously. This capability is significant in the numerical solution of complex and/or multidimensional partial differential equations (PDEs), which are fundamental to modeling various physical phenomena. Many quantum techniques are currently available for solving PDEs, most of them based on variational quantum circuits. However, the existing quantum PDE solvers, particularly those based on variational quantum eigensolver (VQE) techniques, suffer from several limitations. These include low accuracy, high execution times, and low scalability on quantum simulators as well as on noisy intermediate-scale quantum (NISQ) devices, especially for multidimensional PDEs.

In this work, we propose an efficient and scalable algorithm for solving multidimensional PDEs. We present two variants of our algorithm: the first leverages the finite-difference method (FDM), classical-to-quantum (C2Q) encoding, and numerical instantiation, while the second employs FDM, C2Q, and column-by-column decomposition (CCD). Both variants are designed to enhance accuracy and scalability while reducing execution times. We have validated and evaluated our algorithm using the multidimensional Poisson equation as a case study. Our results demonstrate higher accuracy, higher scalability, and faster execution times compared to VQE-based solvers on noise-free and noisy quantum simulators from IBM. Additionally, we validated our approach on hardware emulators and actual quantum hardware, employing noise mitigation techniques. We will also focus on extending these techniques to PDEs relevant to computational fluid dynamics and financial modeling, further bridging the gap between theoretical quantum algorithms and practical applications.
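
To make the FDM step concrete, here is a purely classical sketch that discretizes the one-dimensional Poisson equation -u''(x) = f(x) with zero boundary values and solves the resulting tridiagonal system. It does not reproduce the quantum parts of the proposed algorithm (C2Q encoding, numerical instantiation, or CCD); the function name, grid size, and test problem are assumptions chosen for illustration.

```python
import math


def solve_poisson_1d(f, n):
    """Solve -u''(x) = f(x) on (0, 1) with u(0) = u(1) = 0 at n interior points."""
    h = 1.0 / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    # Central differences give: -u[i-1] + 2*u[i] - u[i+1] = h^2 * f(x[i])
    diag = [2.0] * n
    rhs = [h * h * f(x) for x in xs]
    # Thomas algorithm, forward elimination (both off-diagonals are -1).
    for i in range(1, n):
        m = -1.0 / diag[i - 1]
        diag[i] += m              # diag[i] - m * (-1)
        rhs[i] -= m * rhs[i - 1]
    # Back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return xs, u


# Test problem with known solution u(x) = sin(pi * x).
xs, u = solve_poisson_1d(lambda x: math.pi ** 2 * math.sin(math.pi * x), n=99)
max_error = max(abs(ui - math.sin(math.pi * x)) for x, ui in zip(xs, u))
print(max_error)
```

The same discretization extends to more dimensions, where the linear system grows quickly with grid resolution; that growth is the scalability concern the abstract addresses.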


Venkata Sai Krishna Chaitanya Addepalli

A Comprehensive Approach to Facial Emotion Recognition: Integrating Established Techniques with a Tailored Model

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Prasad Kulkarni
Hongyang Sun


Abstract

Facial emotion recognition has become a pivotal application of machine learning, enabling advancements in human-computer interaction, behavioral analysis, and mental health monitoring. Despite its potential, challenges such as data imbalance, variation in expressions, and noisy datasets often hinder accurate prediction.

This project presents a novel approach to facial emotion recognition by integrating established techniques like data augmentation and regularization with a tailored convolutional neural network (CNN) architecture. Using the FER2013 dataset, the study explores the impact of incremental architectural improvements, optimized hyperparameters, and dropout layers to enhance model performance.

The proposed model effectively addresses issues related to data imbalance and overfitting while achieving enhanced accuracy and precision in emotion classification. The study underscores the importance of feature extraction through convolutional layers and optimized fully connected networks for efficient emotion recognition. The results demonstrate improvements in generalization, setting a foundation for future real-time applications in diverse fields.


Tejarsha Arigila

Benchmarking Aggregation Free Federated Learning using Data Condensation: Comparison with Federated Averaging

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Fengjun Li, Chair
Bo Luo
Sumaiya Shomaji


Abstract

This project investigates the performance of Federated Learning Aggregation-Free (FedAF) compared to traditional federated learning methods under non-independent and identically distributed (non-IID) data conditions, characterized by Dirichlet distribution parameters (alpha = 0.02, 0.05, 0.1). Utilizing the MNIST and CIFAR-10 datasets, the study benchmarks FedAF against Federated Averaging (FedAVG) in terms of accuracy, convergence speed, communication efficiency, and robustness to label and feature skews.  

Traditional federated learning approaches like FedAVG aggregate locally trained models at a central server to form a global model. However, these methods often encounter challenges such as client drift in heterogeneous data environments, which can adversely affect model accuracy and convergence rates. FedAF introduces an innovative aggregation-free strategy wherein clients collaboratively generate a compact set of condensed synthetic data. This data, augmented by soft labels from the clients, is transmitted to the server, which then uses it to train the global model. This approach effectively reduces client drift and enhances resilience to data heterogeneity. Additionally, by compressing the representation of real data into condensed synthetic data, FedAF improves privacy by minimizing the transfer of raw data.  
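
The server-side aggregation step that FedAVG performs, and that FedAF avoids, can be sketched as a weighted average of client parameters. The flat-list parameter format, the client sizes, and the `fed_avg` name below are illustrative assumptions, not details taken from the project.

```python
def fed_avg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Three hypothetical clients with two parameters each and unequal data sizes.
client_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
client_sizes = [10, 30, 60]

global_weights = fed_avg(client_weights, client_sizes)
print(global_weights)
```

Under non-IID data, the client vectors being averaged can point in quite different directions, which is one way to picture the client drift described above.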

The experimental results indicate that while FedAF converges faster, it struggles to stabilize under highly heterogeneous environments due to the limited capacity of condensed synthetic data to represent real data.


Mohammed Misbah Zarrar

Efficient End-to-End Deep Learning for Autonomous Racing: TinyLidarNet and Low-Power Computing Platforms

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Heechul Yun, Chair
Prasad Kulkarni
Bo Luo


Abstract

End-to-end deep learning has proven effective for robotic navigation by deriving control signals directly from raw sensory data. However, the majority of existing end-to-end navigation solutions are predominantly camera-based. 

We propose TinyLidarNet, a lightweight 2D LiDAR-based end-to-end deep learning model for autonomous racing. We systematically analyze its performance on untrained tracks and computing requirements for real-time processing. We find that TinyLidarNet's 1D Convolutional Neural Network (CNN) based architecture significantly outperforms the widely used Multi-Layer Perceptron (MLP) based architecture. In addition, we show that it can be processed in real time on low-end micro-controller units (MCUs).

We deployed TinyLidarNet on an MCU-based F1TENTH platform, which comprises an ESP32-S3 MCU and an RPLiDAR sensor, and demonstrated the feasibility of using MCUs in F1TENTH autonomous racing.

Finally, we compare TinyLidarNet with ForzaETH, a state-of-the-art Model Predictive Controller (MPC) based F1TENTH racing stack. Our results show that TinyLidarNet is able to closely match the performance of ForzaETH by training the model using the data generated by ForzaETH.


Ye Wang

Deceptive Signals: Unveiling and Countering Sensor Spoofing Attacks on Cyber Systems

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Fengjun Li, Chair
Drew Davidson
Rongqing Hui
Bo Luo
Haiyang Chao

Abstract

In modern computer systems, sensors play a critical role in enabling a wide range of functionalities, from navigation in autonomous vehicles to environmental monitoring in smart homes. Acting as an interface between physical and digital worlds, sensors collect data to drive automated functionalities and decision-making. However, this reliance on sensor data introduces significant potential vulnerabilities, leading to various physical, sensor-enabled attacks such as spoofing, tampering, and signal injection. Sensor spoofing attacks, where adversaries manipulate sensor input or inject false data into target systems, pose serious risks to system security and privacy.

In this work, we have developed two novel sensor spoofing attack methods that significantly enhance both efficacy and practicality. The first method employs physical signals that are imperceptible to humans but detectable by sensors. Specifically, we target deep learning based facial recognition systems using infrared lasers. By leveraging advanced laser modeling, simulation-guided targeting, and real-time physical adjustments, our infrared laser-based physical adversarial attack achieves high success rates with practical real-time guarantees, surpassing the limitations of prior physical perturbation attacks. The second method embeds physical signals, which are inherently present in the system, into legitimate patterns. In particular, we integrate trigger signals into standard operational patterns of actuators on mobile devices to construct remote logic bombs, which are shown to be able to evade all existing detection mechanisms. Achieving a zero false-trigger rate with high success rates, this novel sensor bomb is highly effective and stealthy.

Our study on emerging sensor-based threats highlights the urgent need for comprehensive defenses against sensor spoofing. Along this direction, we design and investigate two defense strategies to mitigate these threats. The first strategy involves filtering out physical signals identified as potential attack vectors. The second strategy is to leverage beneficial physical signals to obfuscate malicious patterns and reinforce data integrity. For example, side channels targeting the same sensor can be used to introduce cover signals that prevent information leakage, while environment-based physical signals serve as signatures to authenticate data. Together, these strategies form a comprehensive defense framework that filters harmful sensor signals and utilizes beneficial ones, significantly enhancing the overall security of cyber systems.


SM Ishraq-Ul Islam

Quantum Circuit Synthesis using Genetic Algorithms Combined with Fuzzy Logic

When & Where:


LEEP2, Room 1420

Committee Members:

Esam El-Araby, Chair
Tamzidul Hoque
Prasad Kulkarni


Abstract

Quantum computing emerges as a promising direction for high-performance computing in the post-Moore era. Leveraging quantum mechanical properties, quantum devices can theoretically provide significant speedup over classical computers in certain problem domains. Quantum algorithms are typically expressed as quantum circuits composed of quantum gates, or as unitary matrices. Execution of quantum algorithms on physical devices requires translation to machine-compatible circuits -- a process referred to as quantum compilation or synthesis.

Quantum synthesis is a challenging problem. Physical quantum devices support a limited number of native basis gates, requiring synthesized circuits to be composed of only these gates. Moreover, quantum devices typically have specific qubit topologies, which constrain how and where gates can be applied. Consequently, logical qubits in input circuits and unitaries may need to be mapped to and routed between physical qubits on the device.

Current Noisy Intermediate-Scale Quantum (NISQ) devices present additional constraints through their gate errors and high susceptibility to noise. NISQ devices are vulnerable to errors during gate application, and their short decoherence times lead to qubits rapidly succumbing to accumulated noise and possibly corrupting computations. Therefore, circuits synthesized for NISQ devices need to have a low number of gates to reduce gate errors, and short execution times to avoid qubit decoherence.

The problem of synthesizing device-compatible quantum circuits, while optimizing for low gate count and short execution times, can be shown to be computationally intractable using analytical methods. Therefore, interest has grown in heuristics-based compilation techniques, which are able to produce approximations of the desired algorithm to a required degree of precision. In this work, we investigate using Genetic Algorithms (GAs) -- a proven gradient-free optimization technique based on natural selection -- for circuit synthesis. In particular, we formulate the quantum synthesis problem as a multi-objective optimization (MOO) problem, with the objectives of minimizing the approximation error, number of multi-qubit gates, and circuit depth. We also employ fuzzy logic for runtime parameter adaptation of the GA to enhance the search efficiency and solution quality of our proposed quantum synthesis method.


Sravan Reddy Chintareddy

Combating Spectrum Crunch with Efficient Machine-Learning Based Spectrum Access and Harnessing High-frequency Bands for Next-G Wireless Networks

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Morteza Hashemi, Chair
Victor Frost
Erik Perrins
Dongjie Wang
Shawn Keshmiri

Abstract

The number of wireless devices already exceeds 14 billion and is expected to grow to 40 billion by 2030. In addition, we are witnessing an unprecedented proliferation of applications and technologies with wireless connectivity requirements such as unmanned aerial vehicles, connected health, and radars for autonomous vehicles. The advent of new wireless technologies and devices will only worsen the current spectrum crunch that service providers and wireless operators are already experiencing. In this PhD study, we address these challenges through the following research thrusts, in which we consider two emerging applications aimed at advancing spectrum efficiency and high-frequency connectivity solutions.

 

First, we focus on effectively utilizing the existing spectrum resources for emerging applications such as networked UAVs operating within the Unmanned Traffic Management (UTM) system. In this thrust, we develop a coexistence framework for UAVs to share spectrum with traditional cellular networks by using machine learning (ML) techniques so that networked UAVs act as secondary users without interfering with primary users. We propose federated learning (FL) and reinforcement learning (RL) solutions to establish a collaborative spectrum sensing and dynamic spectrum allocation framework for networked UAVs. In the second part, we explore the potential of millimeter-wave (mmWave) and terahertz (THz) frequency bands for high-speed data transmission in urban settings. Specifically, we investigate THz-based midhaul links for 5G networks, where a network's central units (CUs) connect to distributed units (DUs). Through numerical analysis, we assess the feasibility of using 140 GHz links and demonstrate the merits of high-frequency bands to support high data rates in midhaul networks for future urban communications infrastructure. Overall, this research is aimed at establishing frameworks and methodologies that contribute toward the sustainable growth and evolution of wireless connectivity.


Arnab Mukherjee

Attention-Based Solutions for Occlusion Challenges in Person Tracking

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Prasad Kulkarni, Chair
Sumaiya Shomaji
Hongyang Sun
Jian Li

Abstract

Person tracking and association is a complex task in computer vision applications. Even with a powerful detector, a highly accurate association algorithm is necessary to match and track the correct person across all frames. This method has numerous applications in surveillance, and its complexity increases with the number of detected objects and their movements across frames. A significant challenge in person tracking is occlusion, which occurs when an individual being tracked is partially or fully blocked by another object or person. This can make it difficult for the tracking system to maintain the identity of the individual and track them effectively.

In this research, we propose a solution to the occlusion problem by utilizing an occlusion-aware spatial attention transformer. We have divided the entire tracking association process into two scenarios: occlusion and no-occlusion. When a detected person with a specific ID suddenly disappears from a frame for a certain period, we employ advanced methods such as Detector Integration and Pose Estimation to ensure the correct association. Additionally, we implement a spatial attention transformer to differentiate these occluded detections, transform them, and then match them with the correct individual using the Cosine Similarity Metric.
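
The final matching step can be pictured with a small sketch: a feature vector from a reappearing detection is compared against stored per-ID features using cosine similarity. The feature values, the threshold, and the `match_track` helper are hypothetical; the research derives its features from the attention transformer, which is not reproduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_track(feature, gallery, threshold=0.5):
    """Return the ID of the most similar stored track, or None if no
    similarity exceeds the (assumed) threshold."""
    best_id, best_score = None, threshold
    for track_id, stored in gallery.items():
        score = cosine_similarity(feature, stored)
        if score > best_score:
            best_id, best_score = track_id, score
    return best_id


# Hypothetical stored features for two tracked people, keyed by track ID.
gallery = {7: [0.9, 0.1, 0.0], 12: [0.0, 0.2, 0.98]}
reappeared = [0.8, 0.2, 0.1]  # detection seen again after an occlusion

print(match_track(reappeared, gallery))
```

Returning `None` when nothing is similar enough lets a tracker start a new ID instead of forcing a wrong association, which is one way to limit ID switches.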

The features extracted from the attention transformer provide a robust baseline for detecting people, enhancing the algorithm's adaptability and addressing key challenges associated with existing approaches. This improved method reduces the number of misidentifications and instances of ID switching while also enhancing tracking accuracy and precision.


Agraj Magotra

Data-Driven Insights into Sustainability: An Artificial Intelligence (AI) Powered Analysis of ESG Practices in the Textile and Apparel Industry

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Sumaiya Shomaji, Chair
Prasad Kulkarni
Zijun Yao


Abstract

The global textile and apparel (T&A) industry is under growing scrutiny for its substantial environmental and social impact, producing 92 million tons of waste annually and contributing to 20% of global water pollution. In Bangladesh, one of the world's largest apparel exporters, the integration of Environmental, Social, and Governance (ESG) practices is critical to meet international sustainability standards and maintain global competitiveness. This master's study leverages Artificial Intelligence (AI) and Machine Learning (ML) methodologies to comprehensively analyze unstructured corporate data related to ESG practices among LEED-certified Bangladeshi T&A factories. 

Our study employs advanced techniques, including Web Scraping, Natural Language Processing (NLP), and Topic Modeling, to extract and analyze sustainability-related information from factory websites. We develop a robust ML framework that utilizes Non-Negative Matrix Factorization (NMF) for topic extraction and a Random Forest classifier for ESG category prediction, achieving an 86% classification accuracy. The study uncovers four key ESG themes: Environmental Sustainability, Social: Workplace Safety and Compliance, Social: Education and Community Programs, and Governance. The analysis reveals that 46% of factories prioritize environmental initiatives, such as energy conservation and waste management, while 44% emphasize social aspects, including workplace safety and education. Governance practices are significantly underrepresented, with only 10% of companies addressing ethical governance, healthcare provisions, and employee welfare.

To deepen our understanding of the ESG themes, we conducted a Centrality Analysis to identify the most influential keywords within each category, using measures such as degree, closeness, and eigenvector centrality. Furthermore, our analysis reveals that higher certification levels, like Platinum, are associated with a more balanced emphasis on environmental, social, and governance practices, while lower levels focus primarily on environmental efforts. These insights highlight key areas where the industry can improve and inform targeted strategies for enhancing ESG practices. Overall, this ML framework provides a data-driven, scalable approach for analyzing unstructured corporate data and promoting sustainability in Bangladesh’s T&A sector, offering actionable recommendations for industry stakeholders, policymakers, and global brands committed to responsible sourcing.


Samyoga Bhattarai

‘Pro-ID: A Secure Face Recognition System using Locality Sensitive Hashing to Protect Human ID’

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Sumaiya Shomaji, Chair
Tamzidul Hoque
Hongyang Sun


Abstract

Face recognition systems are widely used in various applications, from mobile banking apps to personal smartphones. However, these systems often store biometric templates in raw form, posing significant security and privacy risks. Pro-ID addresses this vulnerability by incorporating SimHash, an algorithm of Locality Sensitive Hashing (LSH), to create secure and irreversible hash codes of facial feature vectors. Unlike traditional methods that leave raw data exposed to potential breaches, SimHash transforms the feature space into high-dimensional hash codes, safeguarding user identity while preserving system functionality. 
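
A minimal sketch of the random-hyperplane form of SimHash follows, showing why nearby feature vectors receive nearby hash codes. The vector dimension, code length, seed, and helper names are assumptions for illustration; Pro-ID's actual parameters and security measures are not described here and are not reproduced.

```python
import random


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw one random Gaussian hyperplane normal per output bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def simhash(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return [
        1 if sum(v * h for v, h in zip(vector, plane)) >= 0 else 0
        for plane in hyperplanes
    ]


def hamming(a, b):
    """Number of bit positions where two codes differ."""
    return sum(x != y for x, y in zip(a, b))


planes = make_hyperplanes(n_bits=64, dim=8)

# Hypothetical 8-dimensional facial feature vectors.
face = [0.3, -1.2, 0.8, 0.1, -0.4, 0.9, -0.7, 0.2]
same_face = [x + 0.01 for x in face]   # small capture-to-capture variation
other_face = [-x for x in face]        # points in the opposite direction

code = simhash(face, planes)
print(hamming(code, simhash(same_face, planes)),
      hamming(code, simhash(other_face, planes)))
```

Only the sign of each projection is kept, so the code preserves angular similarity for matching while discarding the magnitudes of the original features.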

The proposed system creates a balance between two aspects: security and the system’s performance. Additionally, the system is designed to resist common attacks, including brute force and template inversion, ensuring that even if the hashed templates are exposed, the original biometric data cannot be reconstructed.  

A key challenge addressed in this project is minimizing the trade-off between security and performance. Extensive evaluations demonstrate that the proposed method maintains competitive accuracy rates comparable to traditional face recognition systems while significantly enhancing security metrics such as irreversibility, unlinkability, and revocability. This innovative approach contributes to advancing the reliability and trustworthiness of biometric systems, providing a secure framework for applications in face recognition systems. 


Past Defense Notices

Shalmoli Ghosh

High-Power Fabry-Perot Quantum-Well Laser Diodes for Application in Multi-Channel Coherent Optical Communication Systems

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Rongqing Hui, Chair
Shannon Blunt
Jim Stiles


Abstract

Wavelength Division Multiplexing (WDM) is essential for managing rapid network traffic growth in fiber optic systems. Each WDM channel demands a narrow-linewidth, frequency-stabilized laser diode, leading to complexity and increased energy consumption. Multi-wavelength laser sources, generating optical frequency combs (OFC), offer an attractive solution, enabling a single laser diode to provide numerous equally spaced spectral lines for enhanced bandwidth efficiency.

Quantum-dot and quantum-dash OFCs provide phase-synchronized lines with low relative intensity noise (RIN), while Quantum Well (QW) OFCs offer higher power efficiency but have higher RIN in the low-frequency region up to 2 GHz. However, in both quantum-dot/dash and QW-based OFCs, individual spectral lines exhibit high phase noise, limiting coherent detection. Output power levels of these OFCs range from 1 to 20 mW, and the power of each spectral line is typically less than -5 dBm. Because of this low per-line power, these OFCs require substantial optical amplification, and each spectral line has a relatively broad linewidth, owing to the inverse relationship between optical power and linewidth described by the Schawlow-Townes formula. These constraints hamper their applicability in coherent detection systems, posing a challenge for achieving high-performance optical communication.

In this work, coherent system application of a single-section Quantum-Well Fabry-Perot (FP) laser diode is demonstrated. This laser delivers over 120 mW optical power at the fiber pigtail with a mode spacing of 36.14 GHz. In an experimental setup, 20 spectral lines from a single laser transmitter carry 30 GBaud 16-QAM signals over 78.3 km single-mode fiber, achieving significant data transmission rates. With the potential to support a transmission capacity of 2.15 Tb/s (4.3 Tb/s for dual polarization) per transmitter, including Forward Error Correction (FEC) and maintenance overhead, it offers a promising solution for meeting the escalating demands of modern network traffic efficiently.
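
The capacity figure can be sanity-checked with simple arithmetic. The sketch below computes the raw line rate from the numbers quoted in the abstract (20 spectral lines, 30 GBaud, 16-QAM at 4 bits per symbol). Reading 2.15 Tb/s as the net rate after FEC and maintenance overhead is an assumption, and the function name is illustrative.

```python
def raw_bit_rate(lines, baud, bits_per_symbol, polarizations=1):
    """Aggregate raw bit rate in bit/s summed over all spectral lines."""
    return lines * baud * bits_per_symbol * polarizations


BITS_PER_16QAM_SYMBOL = 4  # log2(16)

single_pol = raw_bit_rate(20, 30e9, BITS_PER_16QAM_SYMBOL)
dual_pol = raw_bit_rate(20, 30e9, BITS_PER_16QAM_SYMBOL, polarizations=2)

print(f"raw, single polarization: {single_pol / 1e12:.2f} Tb/s")
print(f"raw, dual polarization:   {dual_pol / 1e12:.2f} Tb/s")

# Assumption: the quoted 2.15 Tb/s is the net rate once overhead is removed.
print(f"implied net/raw ratio:    {2.15e12 / single_pol:.3f}")
```

Under that reading, the raw rate is 2.4 Tb/s per polarization, so the quoted 2.15 Tb/s corresponds to roughly 90% of the raw rate.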


TJ Barclay

Proof-Producing Translation from Gallina to CakeML

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Perry Alexander, Chair
Alex Bardas
Drew Davidson
Sankha Guria
Eileen Nutting

Abstract

Users of theorem provers often desire to extract their verified code to a more efficient, compiled language. Coq's current extraction mechanism provides this facility but does not provide a formal guarantee that the extracted code has the same semantics as the logic it is extracted from. Providing such a guarantee requires a formal semantics for the target code. The CakeML project, implemented in HOL4, provides a formally defined syntax and semantics for a subset of SML and includes a proof-producing translator from higher-order logic to CakeML. We use the CakeML definition to develop a certifying extractor to CakeML from Gallina using the translation and proof techniques of the HOL4 CakeML translator. We also address how differences between HOL4 (higher-order logic) and Coq (calculus of constructions) affect the implementation details of the Coq translator.


Anissa Khan

Privacy Preserving Biometric Matching

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Perry Alexander, Chair
Prasad Kulkarni
Fengjun Li


Abstract

Biometric matching is a process by which distinct features are used to identify an individual. Doing so privately is important because biometric data, such as fingerprints or facial features, is not something that can be easily changed or updated if put at risk. In this study, we perform a piece of the biometric matching process in a privacy-preserving manner by using secure multiparty computation (SMPC). Using SMPC allows the identifying biological data, called a template, to remain stored by the data owner during the matching process. This provides security guarantees to the biological data while it is in use and therefore reduces the chance that the data is stolen. In this study, we find that performing biometric matching using SMPC is just as accurate as performing the same match in plaintext.


Bryan Richlinski

Prioritize Program Diversity: Enumerative Synthesis with Entropy Ordering

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Sankha Guria, Chair
Perry Alexander
Drew Davidson
Jennifer Lohoefener

Abstract

Program synthesis is a popular way to create a correct-by-construction program from a user-provided specification. Term enumeration is a leading technique to systematically explore the space of programs by generating terms from a formal grammar. These terms are treated as candidate programs, which are tested or verified against the specification for correctness. To prioritize candidates more likely to satisfy the specification, enumeration is often ordered by program size or other domain-specific heuristics. However, domain-specific heuristics require expert knowledge, and enumeration by size often leads to terms composed of frequently repeating symbols that are less likely to satisfy a specification. In this thesis, we build a heuristic that prioritizes term enumeration based on the variability of individual symbols in the program, i.e., the information entropy of the program. We use this heuristic to order programs in both top-down and bottom-up enumeration. We evaluated our work on a subset of the PBE-String track of the 2017 SyGuS competition benchmarks and compared it against size-based enumeration. Top-down enumeration guided by entropy expands fewer partial expressions than naive enumeration in 77% of benchmarks and tests fewer complete expressions in 54%, resulting in improved synthesis time in 40% of benchmarks. In bottom-up enumeration, entropy ordering tests fewer expressions than naive enumeration in 71% of benchmarks, but without any improvement in running time. We conclude that entropy is a promising direction for prioritizing candidates during program search in enumerative synthesis, and we propose future directions for improving the performance of our proposed techniques.
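
The entropy heuristic can be illustrated with a small sketch. It assumes a candidate term is flattened into a sequence of grammar symbols and scored by the Shannon entropy of its symbol frequencies, with more varied terms tried first and ties broken by size. The thesis's exact scoring and tie-breaking are not given in this abstract, so the names and the ordering rule below are hypothetical.

```python
import math
from collections import Counter


def symbol_entropy(symbols):
    """Shannon entropy (in bits) of the symbol frequencies in one term."""
    if not symbols:
        return 0.0
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def order_by_entropy(candidates):
    """Sort terms so higher-entropy terms come first; smaller terms win ties."""
    return sorted(candidates, key=lambda term: (-symbol_entropy(term), len(term)))


# Toy candidates written as flattened prefix terms over a made-up string grammar.
candidates = [
    ("concat", "x", "x"),
    ("x",),
    ("concat", "x", "y"),
]
for term in order_by_entropy(candidates):
    print(f"{symbol_entropy(term):.3f}", term)
```

In this toy ordering, the term that mixes three distinct symbols is tried before the one that repeats `x`, which reflects the intuition that terms built from frequently repeating symbols are less likely to satisfy a specification.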


Elizabeth Wyss

A New Frontier for Software Security: Diving Deep into npm

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Drew Davidson, Chair
Alex Bardas
Fengjun Li
Bo Luo
J. Walker

Abstract

Open-source package managers (e.g., npm for Node.js) have become an established component of modern software development. Rather than creating applications from scratch, developers may employ modular software dependencies and frameworks--called packages--to serve as building blocks for writing larger applications. Package managers make this process easy. With a simple command line directive, developers are able to quickly fetch and install packages across vast open-source repositories. npm--the largest of such repositories--alone hosts millions of unique packages and serves billions of package downloads each week.

However, the widespread code sharing resulting from open-source package managers also presents novel security implications. Vulnerable or malicious code hiding deep within package dependency trees can be leveraged downstream to attack both software developers and the users of their applications. This downstream flow of software dependencies--dubbed the software supply chain--is critical to secure.

This research provides a deep dive into the npm-centric software supply chain, exploring various facets and phenomena that impact the security of this software supply chain. Such factors include (i) hidden code clones--which obscure provenance and can stealthily propagate known vulnerabilities, (ii) install-time attacks enabled by unmediated installation scripts, (iii) hard-coded URLs residing in package code, (iv) the impacts of open-source development practices, and (v) package compromise via malicious updates. For each facet, tooling is presented to identify and/or mitigate potential security impacts. Ultimately, it is our hope that this research fosters greater awareness, deeper understanding, and further efforts to forge a new frontier for the security of modern software supply chains.


Jagadeesh Sai Dokku

Intelligent Chat Bot for KU Website: Automated Query Response and Resource Navigation

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Prasad Kulkarni
Hongyang Sun


Abstract

This project introduces an intelligent chatbot designed to improve user experience on our university website by providing instant, automated responses to common inquiries. Navigating a university website can be challenging for students, applicants, and visitors who seek quick information about admissions, campus services, events, and more. To address this challenge, we developed a chatbot that simulates human conversation using Natural Language Processing (NLP), allowing users to find information more efficiently.

The chatbot is powered by a Bidirectional Long Short-Term Memory (BiLSTM) model, an architecture well-suited for understanding complex sentence structures. This model captures contextual information from both directions in a sentence, enabling it to identify user intent with high accuracy. We trained the chatbot on a dataset of intent-labeled queries, enabling it to recognize specific intentions such as asking about campus facilities, academic programs, or event schedules.

The NLP pipeline includes steps like tokenization, lemmatization, and vectorization. Tokenization and lemmatization prepare the text by breaking it into manageable units and standardizing word forms, making it easier for the model to recognize similar word patterns. The vectorization process then translates this processed text into numerical data that the model can interpret. Flask is used to manage the backend, allowing seamless communication between the user interface and the BiLSTM model. When a user submits a query, Flask routes the input to the model, processes the prediction, and delivers the appropriate response back to the user interface.

This chatbot demonstrates a successful application of NLP in creating interactive, efficient, and user-friendly solutions. By automating responses, it reduces reliance on manual support and ensures users can access relevant information at any time. This project highlights how intelligent chatbots can transform the way users interact with university websites, offering a faster and more engaging experience.


Anahita Memar

Optimizing Protein Particle Classification: A Study on Smoothing Techniques and Model Performance

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Prasad Kulkarni, Chair
Hossein Saiedian
Prajna Dhar


Abstract

This thesis investigates the impact of smoothing techniques on enhancing classification accuracy in protein particle datasets, focusing on both binary and multi-class configurations across three datasets. By applying methods including Averaging-Based Smoothing, Moving Average, Exponential Smoothing, Savitzky-Golay, and Kalman Smoothing, we sought to improve performance in Random Forest, Decision Tree, and Neural Network models. Initial baseline accuracies revealed the complexity of multi-class separability, while clustering analyses provided valuable insights into class similarities and distinctions, guiding our interpretation of classification challenges.

Our results indicate that Averaging-Based Smoothing and Moving Average techniques are particularly effective in enhancing classification accuracy, especially in configurations with marked differences in surfactant conditions. Feature importance analysis identified critical metrics, such as IntMean and IntMax, which played a significant role in distinguishing classes. Cross-validation confirmed the robustness of our models, with Random Forest and Neural Network consistently outperforming others in binary tasks and showing promising adaptability in multi-class classification. This study not only highlights the efficacy of smoothing techniques for improving classification in protein particle analysis but also offers a foundational approach for future research in biopharmaceutical data processing and analysis.
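
Two of the smoothing methods named in this abstract can be sketched in a few lines. The example assumes a trailing window for the moving average and a single smoothing factor `alpha` for exponential smoothing. How the thesis applies smoothing to its feature columns is not stated here, so the function names, parameters, and sample values are hypothetical.

```python
def moving_average(values, window):
    """Trailing moving average; early points average over the samples seen so far."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def exponential_smoothing(values, alpha):
    """Simple exponential smoothing: s[t] = alpha * x[t] + (1 - alpha) * s[t-1]."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for value in values:
        if smoothed:
            smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
        else:
            smoothed.append(value)  # the first sample seeds the recursion
    return smoothed


# Made-up feature column, e.g. one intensity metric across consecutive particles.
feature = [1.0, 2.0, 3.0, 4.0]
print(moving_average(feature, window=2))
print(exponential_smoothing(feature, alpha=0.5))
```

A larger window or a smaller `alpha` suppresses more noise but also blurs genuine differences between classes, so either parameter would normally be tuned against validation accuracy.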


Yousif Dafalla

Web-Armour: Mitigating Reconnaissance and Vulnerability Scanning with Injecting Scan-Impeding Delays in Web Deployments

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Alex Bardas, Chair
Drew Davidson
Fengjun Li
Bo Luo
ZJ Wang

Abstract

Scanning hosts on the internet for vulnerable devices and services is a key step in numerous cyberattacks. Previous work has shown that scanning is a widespread phenomenon on the internet and commonly targets web application/server deployments. Given that automated scanning is a crucial step in many cyberattacks, it would be beneficial to make it more difficult for adversaries to perform such activity.

In this work, we propose Web-Armour, a mitigation approach to adversarial reconnaissance and vulnerability scanning of web deployments. The proposed approach relies on injecting scan-impeding delays into infrequently or rarely used portions of a web deployment. Web-Armour has two goals: first, to increase the cost for attackers to perform automated reconnaissance and vulnerability scanning; second, to introduce minimal to negligible performance overhead for benign users of the deployment. We evaluate Web-Armour in live environments, operated by real users, and in different controlled (offline) scenarios. We show that Web-Armour can effectively thwart reconnaissance and internet-wide scanning.


Kabir Panahi

A Security Analysis of the Integration of Biometric Technology in the 2019 Afghan Presidential Election

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Alex Bardas, Chair
Drew Davidson
Fengjun Li
Bo Luo

Abstract

Afghanistan deployed Biometric Voter Verification (BVV) technology nationally for the first time in the 2019 presidential election to address the systematic fraud in prior elections. Through semi-structured interviews with 18 key national and international stakeholders who had an active role in this election, this study investigates the gap between the intended outcomes of the BVV technology (voter enfranchisement, fraud prevention, and public trust) and the reality on election day and beyond within the unique socio-political and technical landscape of Afghanistan.

Our findings reveal that while BVV technology initially promised a secure and transparent election, various technical and implementation challenges emerged, including threats to voters, staff, and officials. We found that the BVVs both supported and violated electoral goals: while they helped reduce fraud, they inadvertently disenfranchised some voters and caused delays that affected public trust. Technical limitations, usability issues, and administrative misalignments contributed to these outcomes. This study offers critical lessons for future implementations of electoral technologies, emphasizing the importance of context-aware technological solutions and the need for robust administrative and technical frameworks to fully realize the potential benefits of election technology in fragile democracies.


Hara Madhav Talasila

Radiometric Calibration of Radar Depth Sounder Data Products

When & Where:


Nichols Hall, Room 317 (Richard K. Moore Conference Room)

Committee Members:

Carl Leuschen, Chair
Christopher Allen
James Stiles
Jilu Li
Leigh Stearns

Abstract

Although the Center for Remote Sensing of Ice Sheets (CReSIS) performs several radar calibration steps to produce Operation IceBridge (OIB) radar depth sounder data products, these datasets are not radiometrically calibrated, and the swath array processing uses ideal (rather than measured [calibrated]) steering vectors. Any errors in the steering vectors, which describe the response of the radar as a function of arrival angle, will lead to errors in positioning and backscatter that subsequently affect estimates of basal conditions, ice thickness, and radar attenuation. Scientific applications that estimate physical characteristics of surface and subsurface targets from the backscatter are limited by the current data because it is not absolutely calibrated. Moreover, changes in instrument hardware and processing methods for OIB over the last decade affect the quality of inter-seasonal comparisons. Recent methods that interpret basal conditions and calculate radar attenuation using CReSIS OIB 2D radar depth sounder echograms are forced to use relative, rather than absolute, scattering power.

As an active target calibration is not possible for past field seasons, a method that uses natural targets will be developed. Unsaturated natural target returns from smooth sea-ice leads or lakes are imaged in many datasets and have known scattering responses. The proposed method forms a system of linear equations with the recorded scattering signatures from these known targets, scattering signatures from crossing flight paths, and the radiometric correction terms. A least squares solution to optimize the radiometric correction terms is calculated, which minimizes the error function representing the mismatch between expected and measured scattering. The new correction terms will be used to correct the remaining mission data. The radar depth sounder data from all OIB campaigns can be reprocessed to produce absolutely calibrated echograms for the Arctic and Antarctic. A software simulator will be developed to study calibration errors and verify the calibration software. The software for processing natural targets and crossovers will be made available in CReSIS's open-source polar radar software toolbox. The OIB data will be reprocessed with new calibration terms, providing the data user community with a complete set of radiometrically calibrated radar echograms for the CReSIS OIB radar depth sounder for the first time.
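
The least squares step can be illustrated with a toy system. The sketch assumes one additive correction term in dB per flight segment, one equation per known-target observation (correction = expected minus measured), and one equation per crossover (corrected powers at the crossing point must agree). These modeling choices, the solver, and all numbers are hypothetical and are not the CReSIS implementation.

```python
def solve_least_squares(rows, rhs):
    """Minimize ||A x - b||^2 by solving the normal equations (A^T A) x = A^T b."""
    n = len(rows[0])
    # Augmented normal-equation matrix [A^T A | A^T b].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * b for r, b in zip(rows, rhs))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correction terms are not fully constrained")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Unknowns: additive corrections c0, c1 (dB) for two hypothetical flight segments.
rows = [
    [1.0, 0.0],   # known target on segment 0: c0 = expected - measured
    [1.0, -1.0],  # crossover: m0 + c0 = m1 + c1, so c0 - c1 = m1 - m0
    [0.0, 1.0],   # known target on segment 1: c1 = expected - measured
]
rhs = [
    -10.0 - (-13.0),  # segment 0 reads 3 dB low against the known target
    -24.0 - (-20.0),  # segment 1 reads 4 dB lower than segment 0 at the crossover
    -10.0 - (-17.0),  # segment 1 reads 7 dB low against the known target
]
print(solve_least_squares(rows, rhs))
```

With real data the equations would not agree exactly, and the solution would spread the residual mismatch across all correction terms. Crossovers also let segments without any known target inherit a calibration from neighboring segments that have one.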