Defense Notices


All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check, and post the presentation announcement online.

Upcoming Defense Notices

Devin Setiawan

Concept-Driven Interpretability in Graph Neural Networks: Applications in Neuroscientific Connectomics and Clinical Motor Analysis

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Sumaiya Shomaji, Chair
Sankha Guria
Han Wang


Abstract

Graph Neural Networks (GNNs) achieve state-of-the-art performance in modeling complex biological and behavioral systems, yet their "black-box" nature limits their utility for scientific discovery and clinical translation. Standard post-hoc explainability methods typically attribute importance to low-level features, such as individual nodes or edges, which often fail to map onto the high-level, domain-specific concepts utilized by experts. To address this gap, this thesis explores diverse methodological strategies for achieving Concept-Level Interpretability in GNNs, demonstrating how deep learning models can be structurally and analytically aligned with expert domain knowledge. This theme is explored through two distinct methodological paradigms applied to critical challenges in neuroscience and clinical psychology. First, we introduce an interpretable-by-design approach for modeling brain structure-function coupling. By employing an ensemble of GNNs conceptually biased via input graph filtering, the model enforces verifiably disentangled node embeddings. This allows for the quantitative testing of specific structural hypotheses, revealing that a minority of strong anatomical connections disproportionately drives functional connectivity predictions. Second, we present a post-hoc conceptual alignment paradigm for quantifying atypical motor signatures in Autism Spectrum Disorder (ASD). Utilizing a Spatio-Temporal Graph Autoencoder (STGCN-AE) trained on normative skeletal data, we establish an unsupervised anomaly detection system. To provide clinical interpretability, the model's reconstruction error is systematically aligned with a library of human-interpretable kinematic features, such as postural sway and limb jerk. Explanatory meta-modeling via XGBoost and SHAP analysis further translates this abstract loss into a multidimensional clinical signature. Together, these applications demonstrate that integrating concept-level interpretability through either architectural design or systematic post-hoc alignment enables GNNs to serve as robust tools for hypothesis testing and clinical assessment.


Smriti Pranjal

NoBIAS: Non-coding RNA Base Interaction Annotation using Visual Snapshot

When & Where:


Slawson Hall, Room 198

Committee Members:

Cuncong Zhong, Chair
Sumaiya Shomaji
Hongyang Sun
Zijun Yao
Xiaoqing Wu

Abstract

Non-coding RNAs fold into complex 3D structures that govern their biological functions, with RNA structural motifs (RSMs) serving as conserved building blocks of this architecture.
These motifs are defined by characteristic base-interaction patterns, making accurate identification and classification of RNA interactions essential for understanding RNA structure and function.

Despite their biological importance, accurately identifying and classifying these interactions remains challenging because the available data are highly variable in quality and scarce in quantity. This compromises annotation reliability, hinders the construction of trustworthy ground truth for systematic assessment, and restricts the supply of reliable training examples needed for supervised learning.

To address this, we introduce NoBIAS, the first resolution-aware, integrated machine learning-based suite for annotating base interactions from 3D RNA structures, inspired by human pattern recognition, augmented with structure prediction for data enrichment, and evaluated on a carefully curated, stratified benchmark.

NoBIAS is a hierarchical framework for RNA base-interaction annotation that integrates interaction-specific inductive biases with multimodal representation learning. By combining a convolution-augmented, rule-guided module for stacking interactions with complementary graph and image encoders for pairing interactions, NoBIAS captures both structural priors and local visual cues of RNA base doublets. A performance-calibrated logit fusion scheme then adaptively integrates modality-specific predictions based on local-structural resolution, enabling robust inference across heterogeneous 3D RNA structures.

Evaluation across multiple benchmark tiers: spanning consensus, homolog-supported, and manually verified cases, shows that NoBIAS consistently outperforms existing methods under increasingly challenging conditions. Together, the NoBIAS design and its evaluation framework provide a systematic foundation for robust RNA base-interaction annotation, enabling more reliable analysis of RNA structure under realistic uncertainty.


Md Mashfiq Rizvee

Hierarchical Probabilistic Architectures for Scalable Biometric and Electronic Authentication in Secure Surveillance Ecosystems

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Sumaiya Shomaji, Chair
Tamzidul Hoque
David Johnson
Hongyang Sun
Alexandra Kondyli

Abstract

Secure and scalable authentication has become a primary requirement in modern digital ecosystems, where both human biometrics and electronic identities must be verified under noise, large population growth and resource constraints. Existing approaches often struggle to simultaneously provide storage efficiency, dynamic updates and strong authentication reliability. The proposed work advances a unified probabilistic framework based on Hierarchical Bloom Filter (HBF) architectures to address these limitations across biometric and hardware domains. The first contribution establishes the Dynamic Hierarchical Bloom Filter (DHBF) as a noise-tolerant and dynamically updatable authentication structure for large-scale biometrics. Unlike static Bloom-based systems that require reconstruction upon updates, DHBF supports enrollment, querying, insertion and deletion without structural rebuild. Experimental evaluation on 30,000 facial biometric templates demonstrates 100% enrollment and query accuracy, including robust acceptance of noisy biometric inputs while maintaining correct rejection of non-enrolled identities. These results validate that hierarchical probabilistic encoding can preserve both scalability and authentication reliability in practical deployments. Building on this foundation, Bio-BloomChain integrates DHBF into a blockchain-based smart contract framework to provide tamper-evident, privacy-preserving biometric lifecycle management. The system stores only hashed and non-invertible commitments on-chain while maintaining probabilistic verification logic within the contract layer. Large-scale evaluation again reports 100% enrollment, insertion, query and deletion accuracy across 30,000 templates, therefore, solving the existing problem of blockchains being able to authenticate noisy data. Moreover, the deployment analysis shows that execution on Polygon zkEVM reduces operational costs by several orders of magnitude compared to Ethereum, therefore, bringing enrollment and deletion costs below $0.001 per operation which demonstrate the feasibility of scalable blockchain biometric authentication in practice. Finally, the hierarchical probabilistic paradigm is extended to electronic hardware authentication through the Persistent Hierarchical Bloom Filter (PHBF). Applied to electronic fingerprints derived from physical unclonable functions (PUFs), PHBF demonstrates robust authentication under environmental variations such as temperature-induced noise. Experimental results show zero-error operation at the selected decision threshold and substantial system-level improvements as well as over 10^5 faster query processing and significantly reduced storage requirements compared to large scale tracking.


Past Defense Notices

Dates

Agraj Magotra

Data-Driven Insights into Sustainability: An Artificial Intelligence (AI) Powered Analysis of ESG Practices in the Textile and Apparel Industry

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Sumaiya Shomaji, Chair
Prasad Kulkarni
Zijun Yao


Abstract

The global textile and apparel (T&A) industry is under growing scrutiny for its substantial environmental and social impact, producing 92 million tons of waste annually and contributing to 20% of global water pollution. In Bangladesh, one of the world's largest apparel exporters, the integration of Environmental, Social, and Governance (ESG) practices is critical to meet international sustainability standards and maintain global competitiveness. This master's study leverages Artificial Intelligence (AI) and Machine Learning (ML) methodologies to comprehensively analyze unstructured corporate data related to ESG practices among LEED-certified Bangladeshi T&A factories. 

Our study employs advanced techniques, including Web Scraping, Natural Language Processing (NLP), and Topic Modeling, to extract and analyze sustainability-related information from factory websites. We develop a robust ML framework that utilizes Non-Negative Matrix Factorization (NMF) for topic extraction and a Random Forest classifier for ESG category prediction, achieving an 86% classification accuracy. The study uncovers four key ESG themes: Environmental Sustainability, Social : Workplace Safety and Compliance, Social: Education and Community Programs, and Governance. The analysis reveals that 46% of factories prioritize environmental initiatives, such as energy conservation and waste management, while 44% emphasize social aspects, including workplace safety and education. Governance practices are significantly underrepresented, with only 10% of companies addressing ethical governance, healthcare provisions and employee welfare.

To deepen our understanding of the ESG themes, we conducted a Centrality Analysis to identify the most influential keywords within each category, using measures such as degree, closeness, and eigenvector centrality. Furthermore, our analysis reveals that higher certification levels, like Platinum, are associated with a more balanced emphasis on environmental, social, and governance practices, while lower levels focus primarily on environmental efforts. These insights highlight key areas where the industry can improve and inform targeted strategies for enhancing ESG practices. Overall, this ML framework provides a data-driven, scalable approach for analyzing unstructured corporate data and promoting sustainability in Bangladesh’s T&A sector, offering actionable recommendations for industry stakeholders, policymakers, and global brands committed to responsible sourcing.


Samyoga Bhattarai

‘Pro-ID: A Secure Face Recognition System using Locality Sensitive Hashing to Protect Human ID’

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Sumaiya Shomaji, Chair
Tamzidul Hoque
Hongyang Sun


Abstract

Face recognition systems are widely used in various applications, from mobile banking apps to personal smartphones. However, these systems often store biometric templates in raw form, posing significant security and privacy risks. Pro-ID addresses this vulnerability by incorporating SimHash, an algorithm of Locality Sensitive Hashing (LSH), to create secure and irreversible hash codes of facial feature vectors. Unlike traditional methods that leave raw data exposed to potential breaches, SimHash transforms the feature space into high-dimensional hash codes, safeguarding user identity while preserving system functionality. 

The proposed system creates a balance between two aspects: security and the system’s performance. Additionally, the system is designed to resist common attacks, including brute force and template inversion, ensuring that even if the hashed templates are exposed, the original biometric data cannot be reconstructed.  

A key challenge addressed in this project is minimizing the trade-off between security and performance. Extensive evaluations demonstrate that the proposed method maintains competitive accuracy rates comparable to traditional face recognition systems while significantly enhancing security metrics such as irreversibility, unlinkability, and revocability. This innovative approach contributes to advancing the reliability and trustworthiness of biometric systems, providing a secure framework for applications in face recognition systems. 


Shalmoli Ghosh

High-Power Fabry-Perot Quantum-Well Laser Diodes for Application in Multi-Channel Coherent Optical Communication Systems

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Rongqing Hui , Chair
Shannon Blunt
Jim Stiles


Abstract

Wavelength Division Multiplexing (WDM) is essential for managing rapid network traffic growth in fiber optic systems. Each WDM channel demands a narrow-linewidth, frequency-stabilized laser diode, leading to complexity and increased energy consumption. Multi-wavelength laser sources, generating optical frequency combs (OFC), offer an attractive solution, enabling a single laser diode to provide numerous equally spaced spectral lines for enhanced bandwidth efficiency.

Quantum-dot and quantum-dash OFCs provide phase-synchronized lines with low relative intensity noise (RIN), while Quantum Well (QW) OFCs offer higher power efficiency, but they have higher RIN in the low frequency region of up to 2 GHz. However, both quantum-dot/dash and QW based OFCs, individual spectral lines exhibit high phase noise, limiting coherent detection. Output power levels of these OFCs range between 1-20 mW where the power of each spectral line is typically less than -5 dBm. Due to this requirement, these OFCs require excessive optical amplification, also they possess relatively broad spectral linewidths of each spectral line, due to the inverse relationship between optical power and linewidth as per the Schawlow-Townes formula. This constraint hampers their applicability in coherent detection systems, highlighting a challenge for achieving high-performance optical communication.

In this work, coherent system application of a single-section Quantum-Well Fabry-Perot (FP) laser diode is demonstrated. This laser delivers over 120 mW optical power at the fiber pigtail with a mode spacing of 36.14 GHz. In an experimental setup, 20 spectral lines from a single laser transmitter carry 30 GBaud 16-QAM signals over 78.3 km single-mode fiber, achieving significant data transmission rates. With the potential to support a transmission capacity of 2.15 Tb/s (4.3 Tb/s for dual polarization) per transmitter, including Forward Error Correction (FEC) and maintenance overhead, it offers a promising solution for meeting the escalating demands of modern network traffic efficiently.


TJ Barclay

Proof-Producing Translation from Gallina to CakeML

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Perry Alexander, Chair
Alex Bardas
Drew Davidson
Sankha Guria
Eileen Nutting

Abstract

Users of theorem provers often desire to to extract their verified code to a

  more efficient, compiled language. Coq's current extraction mechanism provides

  this facility but does not provide a formal guarantee that the extracted code

  has the same semantics as the logic it is extracted from. Providing such a

  guarantee requires a formal semantics for the target code. The CakeML

  project, implemented in HOL4, provides a formally defined syntax and semantics

  for a subset of SML and includes a proof-producing translator from

  higher-order logic to CakeML. We use the CakeML definition to develop a

  certifying extractor to CakeML from Gallina using the translation and proof techniques

  of the HOL4 CakeML translator. We also address how differences

  between HOL4 (higher-order logic) and Coq (calculus of constructions) effect

  the implementation details of the Coq translator.


Anissa Khan

Privacy Preserving Biometric Matching

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Perry Alexander, Chair
Prasad Kulkarni
Fengjun Li


Abstract

Biometric matching is a process by which distinct features are used to identify an individual. Doing so privately is important because biometric data, such as fingerprints or facial features, is not something that can be easily changed or updated if put at risk. In this study, we perform a piece of the biometric matching process in a privacy preserving manner by using secure multiparty computation (SMPC). Using SMPC allows the identifying biological data, called a template, to remain stored by the data owner during the matching process. This provides security guarantees to the biological data while it is in use and therefore reduces the chances the data is stolen. In this study, we find that performing biometric matching using SMPC is just as accurate as performing the same match in plaintext.

 


Bryan Richlinski

Prioritize Program Diversity: Enumerative Synthesis with Entropy Ordering

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Sankha Guria, Chair
Perry Alexander
Drew Davidson
Jennifer Lohoefener

Abstract

Program synthesis is a popular way to create a correct-by-construction program from a user-provided specification. 

Term enumeration is a leading technique to systematically explore the space of programs by generating terms from a formal grammar.

These terms are treated as candidate programs which are tested/verified against the specification for correctness. 

In order to prioritize candidates more likely to satisfy the specification, enumeration is often ordered by program size or other domain-specific heuristics.

However, domain-specific heuristics require expert knowledge, and enumeration by size often leads to terms comprised of frequently 

repeating symbols that are less likely to satisfy a specification. 

In this thesis, we build a heuristic that prioritizes term enumeration based on variability of individual symbols in the program, i.e., 

information entropy of the program. We use this heuristic to order programs in both top-down and bottom-up enumeration. 

We evaluated our work on a subset of the PBE-String track of the 2017 SyGuS competition benchmarks and compared against size-based enumeration. 

Top-down enumeration guided by entropy expands upon fewer partial expressions than naive in 77\% of benchmarks, 

and tests fewer complete expressions in 54\%, resulting in improved synthesis time in 40\% of benchmarks. 

However, 71\% of benchmarks in bottom-up enumeration using entropy tests fewer expressions than naive enumeration, without any improvements to the running time. 

We conclude entropy is a promising direction to prioritize candidates during program search in enumerative synthesis, 

and propose a future directions for improving performance of our proposed techniques.


Elizabeth Wyss

A New Frontier for Software Security: Diving Deep into npm

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Drew Davidson, Chair
Alex Bardas
Fengjun Li
Bo Luo
J. Walker

Abstract

Open-source package managers (e.g., npm for Node.js) have become an established component of modern software development. Rather than creating applications from scratch, developers may employ modular software dependencies and frameworks--called packages--to serve as building blocks for writing larger applications. Package managers make this process easy. With a simple command line directive, developers are able to quickly fetch and install packages across vast open-source repositories. npm--the largest of such repositories--alone hosts millions of unique packages and serves billions of package downloads each week. 

 

However, the widespread code sharing resulting from open-source package managers also presents novel security implications. Vulnerable or malicious code hiding deep within package dependency trees can be leveraged downstream to attack both software developers and the users of their applications. This downstream flow of software dependencies--dubbed the software supply chain--is critical to secure.

 

This research provides a deep dive into the npm-centric software supply chain, exploring various facets and phenomena that impact the security of this software supply chain. Such factors include (i) hidden code clones--which obscure provenance and can stealthily propagate known vulnerabilities, (ii) install-time attacks enabled by unmediated installation scripts, (iii) hard-coded URLs residing in package code, (iv) the impacts open-source development practices, and (v) package compromise via malicious updates. For each facet, tooling is presented to identify and/or mitigate potential security impacts. Ultimately, it is our hope that this research fosters greater awareness, deeper understanding, and further efforts to forge a new frontier for the security of modern software supply chains. 


Jagadeesh Sai Dokku

Intelligent Chat Bot for KU Website: Automated Query Response and Resource Navigation

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Prasad Kulkarni
Hongyang Sun


Abstract

This project introduces an intelligent chatbot designed to improve user experience on our university website by providing instant, automated responses to common inquiries. Navigating a university website can be challenging for students, applicants, and visitors who seek quick information about admissions, campus services, events, and more. To address this challenge, we developed a chatbot that simulates human conversation using Natural Language Processing (NLP), allowing users to find information more efficiently. The chatbot is powered by a Bidirectional Long Short-Term Memory (BiLSTM) model, an architecture well-suited for understanding complex sentence structures. This model captures contextual information from both directions in a sentence, enabling it to identify user intent with high accuracy. We trained the chatbot on a dataset of intent-labeled queries, enabling it to recognize specific intentions such as asking about campus facilities, academic programs, or event schedules. The NLP pipeline includes steps like tokenization, lemmatization, and vectorization. Tokenization and lemmatization prepare the text by breaking it into manageable units and standardizing word forms, making it easier for the model to recognize similar word patterns. The vectorization process then translates this processed text into numerical data that the model can interpret. Flask is used to manage the backend, allowing seamless communication between the user interface and the BiLSTM model. When a user submits a query, Flask routes the input to the model, processes the prediction, and delivers the appropriate response back to the user interface. This chatbot demonstrates a successful application of NLP in creating interactive, efficient, and user-friendly solutions. By automating responses, it reduces reliance on manual support and ensures users can access relevant information at any time. This project highlights how intelligent chatbots can transform the way users interact with university websites, offering a faster and more engaging experience.

 


Anahita Memar

Optimizing Protein Particle Classification: A Study on Smoothing Techniques and Model Performance

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Prasad Kulkarni, Chair
Hossein Saiedian
Prajna Dhar


Abstract

This thesis investigates the impact of smoothing techniques on enhancing classification accuracy in protein particle datasets, focusing on both binary and multi-class configurations across three datasets. By applying methods including Averaging-Based Smoothing, Moving Average, Exponential Smoothing, Savitzky-Golay, and Kalman Smoothing, we sought to improve performance in Random Forest, Decision Tree, and Neural Network models. Initial baseline accuracies revealed the complexity of multi-class separability, while clustering analyses provided valuable insights into class similarities and distinctions, guiding our interpretation of classification challenges.

These results indicate that Averaging-Based Smoothing and Moving Average techniques are particularly effective in enhancing classification accuracy, especially in configurations with marked differences in surfactant conditions. Feature importance analysis identified critical metrics, such as IntMean and IntMax, which played a significant role in distinguishing classes. Cross-validation validated the robustness of our models, with Random Forest and Neural Network consistently outperforming others in binary tasks and showing promising adaptability in multi-class classification. This study not only highlights the efficacy of smoothing techniques for improving classification in protein particle analysis but also offers a foundational approach for future research in biopharmaceutical data processing and analysis.


Yousif Dafalla

Web-Armour: Mitigating Reconnaissance and Vulnerability Scanning with Injecting Scan-Impeding Delays in Web Deployments

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Alex Bardas, Chair
Drew Davidson
Fengjun Li
Bo Luo
ZJ Wang

Abstract

Scanning hosts on the internet for vulnerable devices and services is a key step in numerous cyberattacks. Previous work has shown that scanning is a widespread phenomenon on the internet and commonly targets web application/server deployments. Given that automated scanning is a crucial step in many cyberattacks, it would be beneficial to make it more difficult for adversaries to perform such activity.

In this work, we propose Web-Armour, a mitigation approach to adversarial reconnaissance and vulnerability scanning of web deployments. The proposed approach relies on injecting scanning impeding delays to infrequently or rarely used portions of a web deployment. Web-Armour has two goals: First, increase the cost for attackers to perform automated reconnaissance and vulnerability scanning; Second, introduce minimal to negligible performance overhead to benign users of the deployment. We evaluate Web-Armour on live environments, operated by real users, and on different controlled (offline) scenarios. We show that Web-Armour can effectively lead to thwarting reconnaissance and internet-wide scanning.