Defense Notices


All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check, and post the presentation announcement online.

Upcoming Defense Notices

Arnab Mukherjee

Attention-Based Solutions for Occlusion Challenges in Person Tracking

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Prasad Kulkarni, Chair
Sumaiya Shomaji
Hongyang Sun
Jian Li

Abstract

Person re-identification (Re-ID) and multi-object tracking in unconstrained surveillance environments pose significant challenges within the field of computer vision. These complexities stem mainly from occlusion, variability in appearance, and identity switching across various camera views. This research outlines a comprehensive and innovative agenda aimed at tackling these issues, employing a series of increasingly advanced deep learning architectures, culminating in a groundbreaking occlusion-aware Vision Transformer framework.

At the heart of this work is the introduction of Deep SORT with Multiple Inputs (Deep SORT-MI), a cutting-edge real-time Re-ID system featuring a dual-metric association strategy. This strategy adeptly combines Mahalanobis distance for motion-based tracking with cosine similarity for appearance-based re-identification. As a result, this method significantly decreases identity switching compared to the baseline SORT algorithm on the MOT-16 benchmark, thereby establishing a robust foundation for metric learning in subsequent research.

Expanding on this foundation, a novel pose-estimation framework integrates 2D skeletal keypoint features extracted via OpenPose directly into the association pipeline. By capturing the spatial relationships among body joints along with appearance features, this system enhances robustness against posture variations and partial occlusion. Consequently, it achieves substantial reductions in false positives and identity switches compared to earlier methods, showcasing its practical viability.

Furthermore, a Diverse Detector Integration (DDI) study meticulously assessed the influence of detector choices—including YOLO v4, Faster R-CNN, MobileNet SSD v2, and Deep SORT—on the efficacy of metric learning-based tracking. The results reveal that YOLO v4 consistently delivers exceptional tracking accuracy on both the MOT-16 and MOT-17 datasets, establishing its superiority in this competitive landscape.

In conclusion, this body of research notably advances occlusion-aware person Re-ID by illustrating a clear progression from metric learning to pose-guided feature extraction and ultimately to transformer-based global attention modeling. The findings underscore that lightweight, meticulously parameterized Vision Transformers can achieve impressive generalization for occlusion detection, even under constrained data scenarios. This opens up exciting prospects for integrated detection, localization, and re-identification in real-world surveillance systems, promising to enhance their effectiveness and reliability.


Sai Rithvik Gundla

Beyond Regression Accuracy: Evaluating Runtime Prediction for Scheduling Input Sensitive Workloads

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Hongyang Sun, Chair
Arvin Agah
David Johnson


Abstract

Runtime estimation plays a structural role in reservation-based scheduling for High Performance Computing (HPC) systems, where predicted walltimes directly influence reservation timing, backfilling feasibility, and overall queue dynamics. This raises a fundamental question of whether improved runtime prediction accuracy necessarily translates into improved scheduling performance. In this work, we conduct an empirical study of runtime estimation under EASY Backfilling using an application-driven workload consisting of MRI-based brain segmentation jobs. Despite identical configurations and uniform metadata, runtimes exhibit substantial variability driven by intrinsic input structure. To capture this variability, we develop a feature-driven machine learning (ML) framework that extracts region-wise features from MRI volumes to predict job runtimes without relying on historical execution traces or scheduling metadata. We integrate these ML-derived predictions into an EASY Backfilling scheduler implemented in the Batsim simulation framework. Our results show that regression accuracy alone does not determine scheduling performance. Instead, scheduling performance depends strongly on estimation bias and its effect on reservation timing and runtime exceedances. In particular, mild multiplicative calibration of ML-based runtime estimates stabilizes scheduler behavior and yields consistently competitive performance across workload and system configurations. Comparable performance can also be observed with certain levels of uniform overestimation; however, calibrated ML predictions provide a systematic mechanism to control estimation bias without relying on arbitrary static inflation. In contrast, underestimation consistently leads to severe performance degradation and cascading job terminations. These findings highlight runtime estimation as a structural control input in backfilling-based HPC scheduling and demonstrate the importance of evaluating prediction models jointly with scheduling dynamics rather than through regression metrics alone.


Ye Wang

Toward Practical and Stealthy Sensor Exploitation: Physical, Contextual, and Control-Plane Attack Paradigms

When & Where:


Nichols Hall, Room 250 (Gemini Conference Room)

Committee Members:

Fengjun Li, Chair
Drew Davidson
Rongqing Hui
Bo Luo
Haiyang Chao

Abstract

Modern intelligent systems increasingly rely on continuous sensor data streams for perception, decision-making, and control, making sensors a critical yet underexplored attack surface. While prior research has demonstrated the feasibility of sensor-based attacks, recent advances in mobile operating systems and machine learning-based defenses have significantly reduced their practicality, rendering them more detectable, resource-intensive, and constrained by evolving permission and context-aware security models.

This dissertation revisits sensor exploitation under these modern constraints and develops a unified, cross-layer perspective that improves both practicality and stealth of sensor-enabled attacks. We identify three fundamental challenges: (i) the difficulty of reliably manipulating physical sensor signals in noisy, real-world environments; (ii) the effectiveness of context-aware defenses in detecting anomalous sensor behavior on mobile devices, and (iii) the lack of lightweight coordination for practical sensor-based side- and covert-channels.

To address the first challenge, we propose a physical-domain attack framework that integrates signal modeling, simulation-guided attack synthesis, and real-time adaptive targeting, enabling robust adversarial perturbations with high attack success rates even under environmental uncertainty. As a case study, we demonstrate an infrared laser-based adversarial example attack against face recognition systems, which achieves consistently high success rates across diverse conditions with practical execution overhead.

To improve attack stealth against context-aware defenses, we introduce an auto-contextualization mechanism that synchronizes malicious sensor actuation with legitimate application activity. By aligning injected signals with both statistical patterns and semantic context of benign behavior, the approach renders attacks indistinguishable from normal system operations and benign sensor usage. We validate this design using three Android logic bombs, showing that auto-contextualized triggers can evade both rule-based and learning-based detection mechanisms.

Finally, we extend sensor exploitation beyond the traditional attack-channel plane by introducing a lightweight control-plane protocol embedded within sensor data streams. This protocol encodes control signals directly into sensor observations and leverages simple signal-processing primitives to coordinate multi-stage attacks without relying on privileged APls or explicit inter-process communication. The resulting design enables low-overhead, stealthy coordination of cross-device side- and covert-channels.

Together, these contributions establish a new paradigm for sensor exploitation that spans physical, contextual, and control-plane dimensions. By bridging these layers, this dissertation demonstrates that sensor-based attacks remain not only feasible but also practical and stealthy in modern computer systems.


Hao Xuan

Toward an Integrated Computational Framework for Metagenomics: From Sequence Alignment to Automated Knowledge Discovery

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Cuncong Zhong, Chair
Fengjun Li
Suzanne Shontz
Hongyang Sun
Liang Xu

Abstract

Metagenomic sequencing has become a central paradigm for studying complex microbial communities and their interactions with the host, with emerging applications in clinical prediction and disease modeling. In this work, we first investigate two representative application scenarios: predicting immune checkpoint inhibitor response in non-small cell lung cancer using gut microbial signatures, and characterizing host–microbiome interactions in neonatal systems. The proposed reference-free neural network captures both compositional and functional signals without reliance on reference genomes, while the neonatal study demonstrates how environmental and genetic factors reshape microbial communities and how probiotic intervention can mitigate pathogen-induced immune activation.

These studies highlight both the promise and the inherent difficulty of metagenomic analysis: transforming raw sequencing data into clinically actionable insights remains an algorithmically fragmented and computationally intensive process. This challenge arises from two key limitations: the lack of a unified algorithmic foundation for sequence alignment and the absence of systematic approaches for selecting and organizing analytical tools. Motivated by these challenges, we present a unified computational framework for metagenomic analysis that integrates complementary algorithmic and systems-level solutions.

First, to resolve fragmentation at the alignment level, we develop the Versatile Alignment Toolkit (VAT), a unified algorithmic system for biological sequence alignment across diverse applications. VAT introduces an asymmetric multi-view k-mer indexing scheme that integrates multiple seeding strategies within a single architecture and enables dynamic seed-length adjustment via longest common prefix (LCP)–based inference without re-indexing. A flexible seed-chaining mechanism further supports diverse alignment scenarios, including collinear, rearranged, and split alignments. Combined with a hardware-efficient in-register bitonic sorting algorithm and dynamic index-loading strategy, VAT achieves high efficiency and broad applicability across read mapping, homology search, and whole-genome alignment. Second, to address the challenge of tool selection and pipeline construction, we develop SNAIL, a natural language processing system for automated recognition of bioinformatics tools from large-scale and rapidly growing scientific literature. By integrating XGBoost and Transformer-based models such as SciBERT, SNAIL enables structured extraction of analytical tools and supports automated, reproducible pipeline construction.

Together, this work establishes a unified framework that is grounded in real-world applications and addresses key bottlenecks in metagenomic analysis, enabling more efficient, scalable, and clinically actionable workflows.


Devin Setiawan

Concept-Driven Interpretability in Graph Neural Networks: Applications in Neuroscientific Connectomics and Clinical Motor Analysis

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Sumaiya Shomaji, Chair
Sankha Guria
Han Wang


Abstract

Graph Neural Networks (GNNs) achieve state-of-the-art performance in modeling complex biological and behavioral systems, yet their "black-box" nature limits their utility for scientific discovery and clinical translation. Standard post-hoc explainability methods typically attribute importance to low-level features, such as individual nodes or edges, which often fail to map onto the high-level, domain-specific concepts utilized by experts. To address this gap, this thesis explores diverse methodological strategies for achieving Concept-Level Interpretability in GNNs, demonstrating how deep learning models can be structurally and analytically aligned with expert domain knowledge. This theme is explored through two distinct methodological paradigms applied to critical challenges in neuroscience and clinical psychology. First, we introduce an interpretable-by-design approach for modeling brain structure-function coupling. By employing an ensemble of GNNs conceptually biased via input graph filtering, the model enforces verifiably disentangled node embeddings. This allows for the quantitative testing of specific structural hypotheses, revealing that a minority of strong anatomical connections disproportionately drives functional connectivity predictions. Second, we present a post-hoc conceptual alignment paradigm for quantifying atypical motor signatures in Autism Spectrum Disorder (ASD). Utilizing a Spatio-Temporal Graph Autoencoder (STGCN-AE) trained on normative skeletal data, we establish an unsupervised anomaly detection system. To provide clinical interpretability, the model's reconstruction error is systematically aligned with a library of human-interpretable kinematic features, such as postural sway and limb jerk. Explanatory meta-modeling via XGBoost and SHAP analysis further translates this abstract loss into a multidimensional clinical signature. Together, these applications demonstrate that integrating concept-level interpretability through either architectural design or systematic post-hoc alignment enables GNNs to serve as robust tools for hypothesis testing and clinical assessment.


Moh Absar Rahman

Permissions vs Promises: Assessing Over-privileged Android Apps via Local LLM-based Description Validation

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Drew Davidson, Chair
Sankha Guria
David Johnson


Abstract

Android is the most widely adopted mobile operating system, supporting billions of devices and driven by a robust app ecosystem.  Its permission-based security model aims to enforce the Principle of Least Privilege (PoLP), restricting apps to only the permissions it needs.  However, many apps still request excessive permissions, increasing the risk of data leakage and malicious exploitation. Previous research on overprivileged permission has become ineffective due to outdated methods and increasing technical complexity.  The introduction of runtime permissions and scoped storage has made some of the traditional analysis techniques obsolete.  Additionally, developers often are not transparent in explaining the usage of app permissions on the Play Store, misleading users unknowingly and unwillingly granting unnecessary permissions. This combination of overprivilege and poor transparency poses significant security threats to Android users.  Recently, the rise of local large language models (LLMs) has shown promise in various security fields. The main focus of this study is to analyze whether an app is overpriviledged based on app description provided on the Play Store using Local LLM. Finally, we conduct a manual evaluation to validate the LLM’s findings, comparing its results against human-verified response.


Mohsen Nayebi Kerdabadi

Representation Augmentation for Electronic Health Records via Knowledge Graphs, Large Language Models, and Contrastive Learning

When & Where:


Learned Hall, Room 3150

Committee Members:

Zijun Yao, Chair
Sumaiya Shomaji
Hongyang Sun
Dongjie Wang
Shawn Keshmiri

Abstract

Electronic Health Records (EHRs) provide rich longitudinal patient information, but their high dimensionality, sparsity, heterogeneity, and temporal complexity make robust representation learning difficult. This dissertation studies how to improve patient and medical concept representation learning in EHRs and consequently enhance healthcare predictive tasks by integrating domain knowledge, knowledge graphs, large language models (LLMs), and contrastive learning. First, it introduces an ontology-aware temporal contrastive framework for survival analysis that learns discriminative patient representations from censored and observed trajectories by modeling temporal distinctiveness in longitudinal EHR data. Second, it proposes a multi-ontology representation learning framework that jointly propagates knowledge within and across diagnosis, medication, and procedure ontologies, enabling richer medical concept embeddings, especially under limited data and for rare conditions. Third, it develops an LLM-enriched, text-attributed medical knowledge graph framework that combines EHR-derived statistical evidence with type-constrained LLM reasoning to infer semantic relations, generate contextual node and edge descriptions, and co-learn concept embeddings through joint language-model and graph-neural-network training. Together, these studies advance a unified view of EHR representation learning in which structured medical knowledge, textual semantics, and temporal patient trajectories are jointly leveraged to build more accurate, interpretable, and robust healthcare prediction models.


Brinley Hull

Mist – An Interactive Virtual Pet for Autism Spectrum Disorder Stress Onset Detection & Mitigation

When & Where:


Nichols Hall, Room 317 (Moore Conference Room)

Committee Members:

Arvin Agah, Chair
Perry Alexander
David Johnson
Sumaiya Shomaji

Abstract

Individuals with Autism Spectrum Disorder (ASD) frequently experience elevated stress and are at higher risk for mood disorders such as anxiety and depression. Sensory over-responsivity, social challenges, and difficulties with emotional recognition and regulation contribute to such heightened stress. This study presents a proof-of-concept system that detects and mitigates stress through interactions with a virtual pet. Designed for young adults with high-functioning autism, and potentially useful for people beyond that group, the system monitors simulated heart rate, skin resistance, body temperature, and environmental sound and light levels. Upon detection of stress or potential triggers, the system alerts the user and offers stress-reduction activities via a virtual pet, including guided deep-breathing exercises and interactive engagement with the virtual companion. Through combining real-time stress detection with interactive interventions on a single platform, the system aims to help autistic individuals recognize and manage stress more effectively.


Harun Khan

Identifying Weight Surgery Attacks in Siamese Networks

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Prasad Kulkarni, Chair
Alex Bardas
Bo Luo


Abstract

Facial recognition systems increasingly rely on machine learning services, yet they remain vulnerable to cyber-attacks. While traditional adversarial attacks target input data, an underexplored threat comes from weight manipulation attacks, which directly modify model parameters and can compromise deployed systems in cyber-physical settings. This paper investigates defenses against Weight Surgery, a weight manipulation attack that modifies the final linear layer of neural networks to merge or shatter classes without requiring access to training data. We propose a computationally lightweight defense capable of detecting sample pairs affected by Weight Surgery at low false-positive rates. The defense is designed to operate in realistic deployment scenarios, selecting its sensitivity parameter 𝛾 using only benign samples to meet a target false-positive rate. Evaluation on 1000 independently attacked models demonstrates that our method achieves over 95% recall at a target false-positive rate of 0.001. Performance remains strong even under stricter conditions: at FPR = 0.0001, recall is 92.5%, and at 𝛾=0.98, FPR drops to 0.00001 while maintaining 88.9% recall. These results highlight the robustness and practicality of the defense, offering an effective safeguard for neural networks against model-targeted attacks.


Tanvir Hossain

Security Solutions for Zero-Trust Microelectronics Supply Chains

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Tamzidul Hoque, Chair
Drew Davidson
Prasad Kulkarni
Heechul Yun
Huijeong Kim

Abstract

Microelectronics supply chains increasingly rely on globally distributed design, fabrication, integration, and deployment processes, making traditional assumptions of trusted hardware inadequate. Security in this setting can be understood through a zero-trust microelectronics supply-chain model, in which neither manufacturing partners nor procured hardware platforms are assumed trustworthy by default. Two complementary threat scenarios are considered in the proposed research. In the first scenario, custom Integrated Circuits (ICs) fabricated through potentially untrusted foundries are examined, where design-for-security protections intended to prevent piracy, overproduction, and intellectual-property theft can themselves become vulnerable to attacks. In this scenario, hardware Trojan-assisted meta-attacks are used to show that such protections can be systematically identified and subverted by fabrication-stage adversaries. In the second scenario, commercial off-the-shelf ICs are considered from the perspective of end users and procurers, where internal design visibility is unavailable and hardware trustworthiness cannot be directly verified. For this setting, runtime-oriented protection mechanisms are developed to safeguard sensitive computation against malicious hardware behavior and side-channel leakage. Building on these two scenarios, a future research direction is outlined for side-channel-driven vulnerability discovery in off-the-shelf devices, motivated by the need to evaluate and test such platforms prior to deployment when no design information is available. The proposed direction explores gray-box security evaluation using power and electromagnetic side-channel analysis to identify anomalous behaviors and potential vulnerabilities in opaque hardware platforms. Together, these directions establish a foundation for analyzing and mitigating security risks across zero-trust microelectronics supply chains.


Krishna Chaitanya Reddy Chitta

A Dynamic Resource Management Framework and Reconfiguration Strategies for Cloud-native Bulk Synchronous Parallel Applications

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Hongyang Sun, Chair
David Johnson
Sumaiya Shomaji


Abstract

Many High Performance Computing (HPC) applications following the Bulk Synchronous Parallel

(BSP) model are increasingly deployed in cloud-native, multi-tenant container environments such

as Kubernetes. Unlike dedicated HPC clusters, these shared platforms introduce resource virtualization

and variability, making BSP applications more susceptible to performance fluctuations.

Workload imbalance across supersteps can trigger the straggler effect, where faster tasks wait

at synchronization barriers for slower ones, increasing overall execution time. Existing BSP resource

management approaches typically assume static workloads and reuse a single configuration

throughout execution. However, real-world workloads vary due to dynamic data and system conditions,

making static configurations suboptimal. This limitation underscores the need for adaptive

resource management strategies that respond to workload changes while considering reconfiguration

costs.

 

To address these limitations, we evaluate a dynamic, data-driven resource management framework

tailored for cloud-native BSP applications. This approach integrates workload profiling,

time-series forecasting, and predictive performance modeling to estimate task execution behavior

under varying workload and resource conditions. The framework explicitly models the trade-off

between performance gains achieved through reconfiguration and the associated checkpointing

and migration costs incurred during container reallocation. Multiple reconfiguration strategies

are evaluated, spanning simple window-based heuristics, dynamic programming methods, and

reinforcement learning approaches. Through extensive experimental evaluation, this framework

demonstrates up to 24.5% improvement in total execution time compared to a baseline static configuration.

Furthermore, we systematically analyze the performance of each strategy under varying

workload characteristics, simulation lengths, and checkpoint penalties, and provide guidance on

selecting the most appropriate strategy for a given workload environment.


Past Defense Notices

Dates

Sravan Reddy Chintareddy

Combating Spectrum Crunch with Efficient Machine-Learning Based Spectrum Access and Harnessing High-frequency Bands for Next-G Wireless Networks

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Morteza Hashemi, Chair
Victor Frost
Erik Perrins
Dongjie Wang
Shawn Keshmiri

Abstract

There is an increasing trend in the number of wireless devices that is now already over 14 billion and is expected to grow to 40 billion devices by 2030. In addition, we are witnessing an unprecedented proliferation of applications and technologies with wireless connectivity requirements such as unmanned aerial vehicles, connected health, and radars for autonomous vehicles. The advent of new wireless technologies and devices will only worsen the current spectrum crunch that service providers and wireless operators are already experiencing. In this PhD study, we address these challenges through the following research thrusts, in which we consider two emerging applications aimed at advancing spectrum efficiency and high-frequency connectivity solutions.

 

First, we focus on effectively utilizing the existing spectrum resources for emerging applications such as networked UAVs operating within the Unmanned Traffic Management (UTM) system. In this thrust, we develop a coexistence framework for UAVs to share spectrum with traditional cellular networks by using machine learning (ML) techniques so that networked UAVs act as secondary users without interfering with primary users. We propose federated learning (FL) and reinforcement learning (RL) solutions to establish a collaborative spectrum sensing and dynamic spectrum allocation framework for networked UAVs. In the second part, we explore the potential of millimeter-wave (mmWave) and terahertz (THz) frequency bands for high-speed data transmission in urban settings. Specifically, we investigate THz-based midhaul links for 5G networks, where a network's central units (CUs) connect to distributed units (DUs). Through numerical analysis, we assess the feasibility of using 140 GHz links and demonstrate the merits of high-frequency bands to support high data rates in midhaul networks for future urban communications infrastructure. Overall, this research is aimed at establishing frameworks and methodologies that contribute toward the sustainable growth and evolution of wireless connectivity.


Arnab Mukherjee

Attention-Based Solutions for Occlusion Challenges in Person Tracking

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Prasad Kulkarni, Chair
Sumaiya Shomaji
Hongyang Sun
Jian Li

Abstract

Person tracking and association is a complex task in computer vision applications. Even with a powerful detector, a highly accurate association algorithm is necessary to match and track the correct person across all frames. This method has numerous applications in surveillance, and its complexity increases with the number of detected objects and their movements across frames. A significant challenge in person tracking is occlusion, which occurs when an individual being tracked is partially or fully blocked by another object or person. This can make it difficult for the tracking system to maintain the identity of the individual and track them effectively.

In this research, we propose a solution to the occlusion problem by utilizing an occlusion-aware spatial attention transformer. We have divided the entire tracking association process into two scenarios: occlusion and no-occlusion. When a detected person with a specific ID suddenly disappears from a frame for a certain period, we employ advanced methods such as Detector Integration and Pose Estimation to ensure the correct association. Additionally, we implement a spatial attention transformer to differentiate these occluded detections, transform them, and then match them with the correct individual using the Cosine Similarity Metric.

The features extracted from the attention transformer provide a robust baseline for detecting people, enhancing the algorithms adaptability and addressing key challenges associated with existing approaches. This improved method reduces the number of misidentifications and instances of ID switching while also enhancing tracking accuracy and precision.


Agraj Magotra

Data-Driven Insights into Sustainability: An Artificial Intelligence (AI) Powered Analysis of ESG Practices in the Textile and Apparel Industry

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Sumaiya Shomaji, Chair
Prasad Kulkarni
Zijun Yao


Abstract

The global textile and apparel (T&A) industry is under growing scrutiny for its substantial environmental and social impact, producing 92 million tons of waste annually and contributing to 20% of global water pollution. In Bangladesh, one of the world's largest apparel exporters, the integration of Environmental, Social, and Governance (ESG) practices is critical to meet international sustainability standards and maintain global competitiveness. This master's study leverages Artificial Intelligence (AI) and Machine Learning (ML) methodologies to comprehensively analyze unstructured corporate data related to ESG practices among LEED-certified Bangladeshi T&A factories. 

Our study employs advanced techniques, including Web Scraping, Natural Language Processing (NLP), and Topic Modeling, to extract and analyze sustainability-related information from factory websites. We develop a robust ML framework that utilizes Non-Negative Matrix Factorization (NMF) for topic extraction and a Random Forest classifier for ESG category prediction, achieving an 86% classification accuracy. The study uncovers four key ESG themes: Environmental Sustainability, Social : Workplace Safety and Compliance, Social: Education and Community Programs, and Governance. The analysis reveals that 46% of factories prioritize environmental initiatives, such as energy conservation and waste management, while 44% emphasize social aspects, including workplace safety and education. Governance practices are significantly underrepresented, with only 10% of companies addressing ethical governance, healthcare provisions and employee welfare.

To deepen our understanding of the ESG themes, we conducted a Centrality Analysis to identify the most influential keywords within each category, using measures such as degree, closeness, and eigenvector centrality. Furthermore, our analysis reveals that higher certification levels, like Platinum, are associated with a more balanced emphasis on environmental, social, and governance practices, while lower levels focus primarily on environmental efforts. These insights highlight key areas where the industry can improve and inform targeted strategies for enhancing ESG practices. Overall, this ML framework provides a data-driven, scalable approach for analyzing unstructured corporate data and promoting sustainability in Bangladesh’s T&A sector, offering actionable recommendations for industry stakeholders, policymakers, and global brands committed to responsible sourcing.


Samyoga Bhattarai

‘Pro-ID: A Secure Face Recognition System using Locality Sensitive Hashing to Protect Human ID’

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Sumaiya Shomaji, Chair
Tamzidul Hoque
Hongyang Sun


Abstract

Face recognition systems are widely used in various applications, from mobile banking apps to personal smartphones. However, these systems often store biometric templates in raw form, posing significant security and privacy risks. Pro-ID addresses this vulnerability by incorporating SimHash, an algorithm of Locality Sensitive Hashing (LSH), to create secure and irreversible hash codes of facial feature vectors. Unlike traditional methods that leave raw data exposed to potential breaches, SimHash transforms the feature space into high-dimensional hash codes, safeguarding user identity while preserving system functionality. 

The proposed system creates a balance between two aspects: security and the system’s performance. Additionally, the system is designed to resist common attacks, including brute force and template inversion, ensuring that even if the hashed templates are exposed, the original biometric data cannot be reconstructed.  

A key challenge addressed in this project is minimizing the trade-off between security and performance. Extensive evaluations demonstrate that the proposed method maintains competitive accuracy rates comparable to traditional face recognition systems while significantly enhancing security metrics such as irreversibility, unlinkability, and revocability. This innovative approach contributes to advancing the reliability and trustworthiness of biometric systems, providing a secure framework for applications in face recognition systems. 


Shalmoli Ghosh

High-Power Fabry-Perot Quantum-Well Laser Diodes for Application in Multi-Channel Coherent Optical Communication Systems

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Rongqing Hui , Chair
Shannon Blunt
Jim Stiles


Abstract

Wavelength Division Multiplexing (WDM) is essential for managing rapid network traffic growth in fiber optic systems. Each WDM channel demands a narrow-linewidth, frequency-stabilized laser diode, leading to complexity and increased energy consumption. Multi-wavelength laser sources, generating optical frequency combs (OFC), offer an attractive solution, enabling a single laser diode to provide numerous equally spaced spectral lines for enhanced bandwidth efficiency.

Quantum-dot and quantum-dash OFCs provide phase-synchronized lines with low relative intensity noise (RIN), while Quantum Well (QW) OFCs offer higher power efficiency, but they have higher RIN in the low frequency region of up to 2 GHz. However, both quantum-dot/dash and QW based OFCs, individual spectral lines exhibit high phase noise, limiting coherent detection. Output power levels of these OFCs range between 1-20 mW where the power of each spectral line is typically less than -5 dBm. Due to this requirement, these OFCs require excessive optical amplification, also they possess relatively broad spectral linewidths of each spectral line, due to the inverse relationship between optical power and linewidth as per the Schawlow-Townes formula. This constraint hampers their applicability in coherent detection systems, highlighting a challenge for achieving high-performance optical communication.

In this work, coherent system application of a single-section Quantum-Well Fabry-Perot (FP) laser diode is demonstrated. This laser delivers over 120 mW optical power at the fiber pigtail with a mode spacing of 36.14 GHz. In an experimental setup, 20 spectral lines from a single laser transmitter carry 30 GBaud 16-QAM signals over 78.3 km single-mode fiber, achieving significant data transmission rates. With the potential to support a transmission capacity of 2.15 Tb/s (4.3 Tb/s for dual polarization) per transmitter, including Forward Error Correction (FEC) and maintenance overhead, it offers a promising solution for meeting the escalating demands of modern network traffic efficiently.


TJ Barclay

Proof-Producing Translation from Gallina to CakeML

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Perry Alexander, Chair
Alex Bardas
Drew Davidson
Sankha Guria
Eileen Nutting

Abstract

Users of theorem provers often desire to to extract their verified code to a

  more efficient, compiled language. Coq's current extraction mechanism provides

  this facility but does not provide a formal guarantee that the extracted code

  has the same semantics as the logic it is extracted from. Providing such a

  guarantee requires a formal semantics for the target code. The CakeML

  project, implemented in HOL4, provides a formally defined syntax and semantics

  for a subset of SML and includes a proof-producing translator from

  higher-order logic to CakeML. We use the CakeML definition to develop a

  certifying extractor to CakeML from Gallina using the translation and proof techniques

  of the HOL4 CakeML translator. We also address how differences

  between HOL4 (higher-order logic) and Coq (calculus of constructions) effect

  the implementation details of the Coq translator.


Anissa Khan

Privacy Preserving Biometric Matching

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Perry Alexander, Chair
Prasad Kulkarni
Fengjun Li


Abstract

Biometric matching is a process by which distinct features are used to identify an individual. Doing so privately is important because biometric data, such as fingerprints or facial features, is not something that can be easily changed or updated if put at risk. In this study, we perform a piece of the biometric matching process in a privacy preserving manner by using secure multiparty computation (SMPC). Using SMPC allows the identifying biological data, called a template, to remain stored by the data owner during the matching process. This provides security guarantees to the biological data while it is in use and therefore reduces the chances the data is stolen. In this study, we find that performing biometric matching using SMPC is just as accurate as performing the same match in plaintext.

 


Bryan Richlinski

Prioritize Program Diversity: Enumerative Synthesis with Entropy Ordering

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Sankha Guria, Chair
Perry Alexander
Drew Davidson
Jennifer Lohoefener

Abstract

Program synthesis is a popular way to create a correct-by-construction program from a user-provided specification. 

Term enumeration is a leading technique to systematically explore the space of programs by generating terms from a formal grammar.

These terms are treated as candidate programs which are tested/verified against the specification for correctness. 

In order to prioritize candidates more likely to satisfy the specification, enumeration is often ordered by program size or other domain-specific heuristics.

However, domain-specific heuristics require expert knowledge, and enumeration by size often leads to terms comprised of frequently 

repeating symbols that are less likely to satisfy a specification. 

In this thesis, we build a heuristic that prioritizes term enumeration based on variability of individual symbols in the program, i.e., 

information entropy of the program. We use this heuristic to order programs in both top-down and bottom-up enumeration. 

We evaluated our work on a subset of the PBE-String track of the 2017 SyGuS competition benchmarks and compared against size-based enumeration. 

Top-down enumeration guided by entropy expands upon fewer partial expressions than naive in 77\% of benchmarks, 

and tests fewer complete expressions in 54\%, resulting in improved synthesis time in 40\% of benchmarks. 

However, 71\% of benchmarks in bottom-up enumeration using entropy tests fewer expressions than naive enumeration, without any improvements to the running time. 

We conclude entropy is a promising direction to prioritize candidates during program search in enumerative synthesis, 

and propose a future directions for improving performance of our proposed techniques.


Elizabeth Wyss

A New Frontier for Software Security: Diving Deep into npm

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Drew Davidson, Chair
Alex Bardas
Fengjun Li
Bo Luo
J. Walker

Abstract

Open-source package managers (e.g., npm for Node.js) have become an established component of modern software development. Rather than creating applications from scratch, developers may employ modular software dependencies and frameworks--called packages--to serve as building blocks for writing larger applications. Package managers make this process easy. With a simple command line directive, developers are able to quickly fetch and install packages across vast open-source repositories. npm--the largest of such repositories--alone hosts millions of unique packages and serves billions of package downloads each week. 

 

However, the widespread code sharing resulting from open-source package managers also presents novel security implications. Vulnerable or malicious code hiding deep within package dependency trees can be leveraged downstream to attack both software developers and the users of their applications. This downstream flow of software dependencies--dubbed the software supply chain--is critical to secure.

 

This research provides a deep dive into the npm-centric software supply chain, exploring various facets and phenomena that impact the security of this software supply chain. Such factors include (i) hidden code clones--which obscure provenance and can stealthily propagate known vulnerabilities, (ii) install-time attacks enabled by unmediated installation scripts, (iii) hard-coded URLs residing in package code, (iv) the impacts open-source development practices, and (v) package compromise via malicious updates. For each facet, tooling is presented to identify and/or mitigate potential security impacts. Ultimately, it is our hope that this research fosters greater awareness, deeper understanding, and further efforts to forge a new frontier for the security of modern software supply chains. 


Jagadeesh Sai Dokku

Intelligent Chat Bot for KU Website: Automated Query Response and Resource Navigation

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Prasad Kulkarni
Hongyang Sun


Abstract

This project introduces an intelligent chatbot designed to improve user experience on our university website by providing instant, automated responses to common inquiries. Navigating a university website can be challenging for students, applicants, and visitors who seek quick information about admissions, campus services, events, and more. To address this challenge, we developed a chatbot that simulates human conversation using Natural Language Processing (NLP), allowing users to find information more efficiently. The chatbot is powered by a Bidirectional Long Short-Term Memory (BiLSTM) model, an architecture well-suited for understanding complex sentence structures. This model captures contextual information from both directions in a sentence, enabling it to identify user intent with high accuracy. We trained the chatbot on a dataset of intent-labeled queries, enabling it to recognize specific intentions such as asking about campus facilities, academic programs, or event schedules. The NLP pipeline includes steps like tokenization, lemmatization, and vectorization. Tokenization and lemmatization prepare the text by breaking it into manageable units and standardizing word forms, making it easier for the model to recognize similar word patterns. The vectorization process then translates this processed text into numerical data that the model can interpret. Flask is used to manage the backend, allowing seamless communication between the user interface and the BiLSTM model. When a user submits a query, Flask routes the input to the model, processes the prediction, and delivers the appropriate response back to the user interface. This chatbot demonstrates a successful application of NLP in creating interactive, efficient, and user-friendly solutions. By automating responses, it reduces reliance on manual support and ensures users can access relevant information at any time. This project highlights how intelligent chatbots can transform the way users interact with university websites, offering a faster and more engaging experience.