Defense Notices
All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.
Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check, and post the presentation announcement online.
Upcoming Defense Notices
Pardaz Banu Mohammad
Towards Early Detection of Alzheimer’s Disease based on Speech using Reinforcement Learning Feature SelectionWhen & Where:
Eaton Hall, Room 2001B
Committee Members:
Arvin Agah, ChairDavid Johnson
Sumaiya Shomaji
Dongjie Wang
Sara Wilson
Abstract
Alzheimer’s Disease (AD) is a progressive, irreversible neurodegenerative disorder and the leading cause of dementia worldwide, affecting an estimated 55 million people globally. The window of opportunity for intervention is demonstrably narrow, making reliable early-stage detection a clinical and scientific imperative. While current diagnostic techniques such as neuroimaging and cerebrospinal fluid (CSF) biomarkers carry well-defined limitations in scalability, cost, and access equity, speech has emerged as a compelling non-invasive proxy for cognitive function evaluation.
This work presents a novel approach for using acoustic feature selection as a decision-making technique and implements it using deep reinforcement learning. Specifically, we use a Deep-Q-Network (DQN) agent to navigate a high dimensional feature space of over 6,000 acoustic features extracted using the openSMILE toolkit, dynamically constructing maximally discriminative and non-redundant features subsets. In order to capture the latent structural dependencies among
acoustic features which classifier and wrapper methods have difficulty to model, we introduce the Graph Convolutional Network (GCN) based correlation awareness feature representation layer that operates as an auxiliary input to the DQN state encoder. Post selection interpretability is reinforced through TF-IDF weighting and K-means clustering which together yield both feature level and cluster level explanations that are clinically actionable. The framework is evaluated across five classifiers, namely, support vector machines (SVM), logistic regression, XGBoost, random forest, and feedforward neural network. We use 10-fold stratified cross-validation on established benchmarks of datasets, including DementiaBank Pitt Corpus, Ivanova, and ADReSS challenge data. The proposed approach is benchmarked against state-of-the-art feature selection methods such as LASSO, Recursive feature selection, and mutual information selectors. This research contributes to three primary intellectual advances: (1) a graph augmented state representation that encodes inter-feature relational structure within a reinforcement learning agent, (2) a clinically interpretable pipeline that bridges the gap between algorithmic performance and translational utility, and (3) multilingual data approach for the reinforcement learning agent framework. This study has direct implications for equitable, low-cost and scalable AD screening in both clinical and community settings.
Zhou Ni
Bridging Federated Learning and Wireless Networks: From Adaptive Learning to FLdriven System OptimizationWhen & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Morteza Hashemi, ChairFengjun Li
Van Ly Nguyen
Han Wang
Shawn Keshmiri
Abstract
Federated learning (FL) has emerged as a promising distributed machine learning
framework that enables multiple devices to collaboratively train models without sharing raw
data, thereby preserving privacy and reducing the need for centralized data collection. However,
deploying FL in practical wireless environments introduces two major challenges. First, the data
generated across distributed devices are often heterogeneous and non-IID, which makes a single
global model insufficient for many users. Second, learning performance in wireless systems is
strongly affected by communication constraints such as interference, unreliable channels, and
dynamic resource availability. This PhD research aims to address these challenges by bridging
FL methods and wireless networks.
In the first thrust, we develop personalized and adaptive FL methods given the underlying
wireless link conditions. To this end, we propose channel-aware neighbor selection and
similarity-aware aggregation in wireless device-to-device (D2D) learning environments. We
further investigate the impacts of partial model update reception on FL performance. The
overarching goal of the first thrust is to enhance FL performance under wireless constraints.
Next, we investigate the opposite direction and raise the question: How can FL-based distributed
optimization be used for the design of next-generation wireless systems? To this end, we
investigate communication-aware participation optimization in vehicular networks, where
wireless resource allocation affects the number of clients that can successfully contribute to FL.
We further extend this direction to integrated sensing and communication (ISAC) systems,
where personalized FL (PFL) is used to support distributed beamforming optimization with joint
sensing and communication objectives.
Overall, this research establishes a unified framework for bridging FL and wireless networks. As
a future direction, this work will be extended to more realistic ISAC settings with dynamic
spectrum access, where communication, sensing, scheduling, and learning performance must be
considered jointly.
Arnab Mukherjee
Attention-Based Solutions for Occlusion Challenges in Person TrackingWhen & Where:
Eaton Hall, Room 2001B
Committee Members:
Prasad Kulkarni, ChairSumaiya Shomaji
Hongyang Sun
Jian Li
Abstract
Person re-identification (Re-ID) and multi-object tracking in unconstrained surveillance environments pose significant challenges within the field of computer vision. These complexities stem mainly from occlusion, variability in appearance, and identity switching across various camera views. This research outlines a comprehensive and innovative agenda aimed at tackling these issues, employing a series of increasingly advanced deep learning architectures, culminating in a groundbreaking occlusion-aware Vision Transformer framework.
At the heart of this work is the introduction of Deep SORT with Multiple Inputs (Deep SORT-MI), a cutting-edge real-time Re-ID system featuring a dual-metric association strategy. This strategy adeptly combines Mahalanobis distance for motion-based tracking with cosine similarity for appearance-based re-identification. As a result, this method significantly decreases identity switching compared to the baseline SORT algorithm on the MOT-16 benchmark, thereby establishing a robust foundation for metric learning in subsequent research.
Expanding on this foundation, a novel pose-estimation framework integrates 2D skeletal keypoint features extracted via OpenPose directly into the association pipeline. By capturing the spatial relationships among body joints along with appearance features, this system enhances robustness against posture variations and partial occlusion. Consequently, it achieves substantial reductions in false positives and identity switches compared to earlier methods, showcasing its practical viability.
Furthermore, a Diverse Detector Integration (DDI) study meticulously assessed the influence of detector choices—including YOLO v4, Faster R-CNN, MobileNet SSD v2, and Deep SORT—on the efficacy of metric learning-based tracking. The results reveal that YOLO v4 consistently delivers exceptional tracking accuracy on both the MOT-16 and MOT-17 datasets, establishing its superiority in this competitive landscape.
In conclusion, this body of research notably advances occlusion-aware person Re-ID by illustrating a clear progression from metric learning to pose-guided feature extraction and ultimately to transformer-based global attention modeling. The findings underscore that lightweight, meticulously parameterized Vision Transformers can achieve impressive generalization for occlusion detection, even under constrained data scenarios. This opens up exciting prospects for integrated detection, localization, and re-identification in real-world surveillance systems, promising to enhance their effectiveness and reliability.
Sai Katari
Android Malware Detection SystemWhen & Where:
Eaton Hall, Room 2001B
Committee Members:
David Johnson, ChairArvin Agah
Prasad Kulkarni
Abstract
Android malware remains a significant threat to mobile security, requiring efficient and scalable detection methods. This project presents an Android Malware Detection System that uses machine learning to classify applications as benign or malicious based on static permission-based analysis. The system is trained on the TUANDROMD dataset of 4,464 applications using four models-Logistic Regression, XGBoost, Random Forest, and Naive Bayes-with a 75/25 train/test split and 5-fold cross-validation on the training set for evaluation. To improve reliability, the system incorporates a hybrid decision approach that combines machine learning confidence scores with a rule-based static analysis engine, using a three-zone confidence routing mechanism to capture threats that ML alone may miss. The solution is deployed as a Flask web application with both a manual detection interface and an APK file scanner, providing predictions, confidence scores, and risk insights, ultimately supporting more informed and secure decision-making.
Ertewaa Saud Alsahayan
Toward Reliable LLM-Assisted Design Space Exploration under Performance, Cost, and Dependability ConstraintsWhen & Where:
Eaton Hall, Room 2001B
Committee Members:
Tamzidul Hoque, ChairPrasad Kulkarni
Sumaiya Shomaji
Hongyang Sun
Huijeong Kim
Abstract
Architectural design space exploration (DSE) requires navigating large configuration spaces while satisfying multiple conflicting objectives, including performance, cost, and system dependability. Large language models (LLMs) have shown promise in assisting DSE by proposing candidate designs and interpreting simulation feedback. However, extending LLM-based DSE to realistic multi-objective settings introduces structural challenges. A naive multi-objective extension of prior LLM-based DSE approaches, which we term Co-Pilot2, exhibits reasoning instability, candidate degeneration, feasibility violations, and lack of progressive improvement. These limitations arise not from insufficient model capacity, but from the absence of structured control, verification, and decision integrity within the exploration process.
To address these challenges, this research introduces REMODEL, a structured LLM-controlled DSE framework that transforms free-form reasoning into a constrained, verifiable, and iterative optimization process. REMODEL incorporates candidate pooling across parallel reasoning instances, strict state isolation via history snapshotting, deterministic feasibility verification, canonical design representation and deduplication, explicit decision stages, and structured reasoning to enforce complete parameter coverage and consistent trend analysis. These mechanisms enable reliable and stable exploration under complex multi-objective constraints.
To support dependability-aware evaluation, the framework is integrated with cycle-accurate simulation using gem5 and its reliability-focused extension GemV, enabling detailed analysis of performance, power, and fault tolerance through vulnerability metrics. This integration allows the system to reason not only about performance–cost trade-offs, but also about reliability-aware design decisions under realistic execution conditions.
Experimental evaluation demonstrates that REMODEL identifies near-optimal designs within a small number of simulations, achieving significantly higher solution quality per simulation compared to baseline methods such as random search and genetic algorithms, while maintaining low computational overhead.
This work establishes a foundation for dependable LLM-assisted DSE by incorporating reliability constraints into the exploration loop. As a future direction, this framework will be extended to incorporate security-aware design considerations, enabling unified reasoning over performance, cost, reliability, and system security.
Bretton Scarbrough
Structured Light for Particle Manipulation: Hologram Generation and Optical Binding SimulationWhen & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Shima Fardad, ChairRongqing Hui
Alessandro Salandrino
Abstract
This thesis addresses two related problems in the optical manipulation of microscopic particles: the efficient generation of holograms for holographic optical tweezers and the simulation of multi-particle optical binding. Holographic optical tweezers use phase-only spatial light modulators to create programmable optical trapping fields, enabling dynamic control over the number, position, and relative strength of optical traps. Because the quality of the trapping field depends strongly on the computed hologram, the first part of this work focuses on improving hologram-generation methods used in these systems.
A new phase-induced compressive sensing algorithm is presented for holographic optical tweezers, along with weighted and unweighted variants. These methods are developed from the Gerchberg-Saxton framework and are designed to improve computational efficiency while preserving favorable trapping characteristics such as uniformity and optical efficiency. By combining compressive sensing with phase induction, the proposed algorithms reduce the computational burden associated with iterative hologram generation while maintaining strong performance across a variety of trapping arrangements. Comparative simulations are used to evaluate these methods against several established hologram-generation algorithms, and the results show that the proposed approaches offer meaningful improvements in convergence behavior and overall performance.
The second part of this thesis examines optical binding, a phenomenon in which multiple particles interact through both the incident optical field and the fields scattered by neighboring particles. To study this process, a numerical simulation is developed that incorporates gradient forces, radiation pressure, and light-mediated particle-particle interactions in both two- and three-dimensional configurations. The simulation is used to investigate how particles evolve under different initial conditions and illumination states, and how collective effects influence the formation of stable or semi-stable arrangements. These results provide insight into the role of scattering-mediated forces in many-particle optical systems and highlight differences between two-dimensional and three-dimensional behavior.
Although hologram generation and optical binding are treated as separate problems in this work, they are connected by a common goal: understanding how structured optical fields can be designed and applied to control microscopic matter. Together, the results of this thesis contribute to the broader study of computational beam shaping and many-body optical interactions, with relevance to advanced optical trapping, particle organization, and dynamically reconfigurable light-driven systems.
Sai Rithvik Gundla
Beyond Regression Accuracy: Evaluating Runtime Prediction for Scheduling Input Sensitive WorkloadsWhen & Where:
Eaton Hall, Room 2001B
Committee Members:
Hongyang Sun, ChairArvin Agah
David Johnson
Abstract
Runtime estimation plays a structural role in reservation-based scheduling for High Performance Computing (HPC) systems, where predicted walltimes directly influence reservation timing, backfilling feasibility, and overall queue dynamics. This raises a fundamental question of whether improved runtime prediction accuracy necessarily translates into improved scheduling performance. In this work, we conduct an empirical study of runtime estimation under EASY Backfilling using an application-driven workload consisting of MRI-based brain segmentation jobs. Despite identical configurations and uniform metadata, runtimes exhibit substantial variability driven by intrinsic input structure. To capture this variability, we develop a feature-driven machine learning (ML) framework that extracts region-wise features from MRI volumes to predict job runtimes without relying on historical execution traces or scheduling metadata. We integrate these ML-derived predictions into an EASY Backfilling scheduler implemented in the Batsim simulation framework. Our results show that regression accuracy alone does not determine scheduling performance. Instead, scheduling performance depends strongly on estimation bias and its effect on reservation timing and runtime exceedances. In particular, mild multiplicative calibration of ML-based runtime estimates stabilizes scheduler behavior and yields consistently competitive performance across workload and system configurations. Comparable performance can also be observed with certain levels of uniform overestimation; however, calibrated ML predictions provide a systematic mechanism to control estimation bias without relying on arbitrary static inflation. In contrast, underestimation consistently leads to severe performance degradation and cascading job terminations. These findings highlight runtime estimation as a structural control input in backfilling-based HPC scheduling and demonstrate the importance of evaluating prediction models jointly with scheduling dynamics rather than through regression metrics alone.
Pavan Sai Reddy Pendry
BabyJay - A RAG Based Chatbot for the University of KansasWhen & Where:
Eaton Hall, Room 2001B
Committee Members:
David Johnson, ChairRachel Jarvis
Prasad Kulkarni
Abstract
The University of Kansas maintains hundreds of departmental and unit websites, leaving students without a unified way to find information. General-purpose chatbots hallucinate KU-specific facts, and static FAQ pages cannot hold a conversation. This work presents BabyJay, a Retrieval-Augmented Generation chatbot that answers student questions using content scraped from official KU sources, with inline citations on every response. The pipeline combines query preprocessing and decomposition, an intent classifier that routes most queries to fast JSON lookups, hybrid retrieval (BM25 and ChromaDB vector search merged via Reciprocal Rank Fusion), a cross-encoder re-ranker, and generation by Claude Sonnet 4.6 under a context-only system prompt. Evaluation on 46 question-answer pairs across five difficulty tiers and eight domains produced a composite score of 0.72, entity precision of 93%, and zero runtime errors. Retrieval, rather than generation, emerged as the primary bottleneck, motivating future work on multi-domain query handling.
Ye Wang
Toward Practical and Stealthy Sensor Exploitation: Physical, Contextual, and Control-Plane Attack ParadigmsWhen & Where:
Nichols Hall, Room 250 (Gemini Conference Room)
Committee Members:
Fengjun Li, ChairDrew Davidson
Rongqing Hui
Bo Luo
Haiyang Chao
Abstract
Modern intelligent systems increasingly rely on continuous sensor data streams for perception, decision-making, and control, making sensors a critical yet underexplored attack surface. While prior research has demonstrated the feasibility of sensor-based attacks, recent advances in mobile operating systems and machine learning-based defenses have significantly reduced their practicality, rendering them more detectable, resource-intensive, and constrained by evolving permission and context-aware security models.
This dissertation revisits sensor exploitation under these modern constraints and develops a unified, cross-layer perspective that improves both practicality and stealth of sensor-enabled attacks. We identify three fundamental challenges: (i) the difficulty of reliably manipulating physical sensor signals in noisy, real-world environments; (ii) the effectiveness of context-aware defenses in detecting anomalous sensor behavior on mobile devices, and (iii) the lack of lightweight coordination for practical sensor-based side- and covert-channels.
To address the first challenge, we propose a physical-domain attack framework that integrates signal modeling, simulation-guided attack synthesis, and real-time adaptive targeting, enabling robust adversarial perturbations with high attack success rates even under environmental uncertainty. As a case study, we demonstrate an infrared laser-based adversarial example attack against face recognition systems, which achieves consistently high success rates across diverse conditions with practical execution overhead.
To improve attack stealth against context-aware defenses, we introduce an auto-contextualization mechanism that synchronizes malicious sensor actuation with legitimate application activity. By aligning injected signals with both statistical patterns and semantic context of benign behavior, the approach renders attacks indistinguishable from normal system operations and benign sensor usage. We validate this design using three Android logic bombs, showing that auto-contextualized triggers can evade both rule-based and learning-based detection mechanisms.
Finally, we extend sensor exploitation beyond the traditional attack-channel plane by introducing a lightweight control-plane protocol embedded within sensor data streams. This protocol encodes control signals directly into sensor observations and leverages simple signal-processing primitives to coordinate multi-stage attacks without relying on privileged APls or explicit inter-process communication. The resulting design enables low-overhead, stealthy coordination of cross-device side- and covert-channels.
Together, these contributions establish a new paradigm for sensor exploitation that spans physical, contextual, and control-plane dimensions. By bridging these layers, this dissertation demonstrates that sensor-based attacks remain not only feasible but also practical and stealthy in modern computer systems.
Jamison Bond
Mutual Coupling Array Calibration Utilizing Decomposition of Modeled Scattering MatrixWhen & Where:
Nichols Hall, Room 250 (Gemini Conference Room)
Committee Members:
Patrick McCormick, ChairShannon Blunt
Carl Leuschen
Abstract
***Currently being reviewed, unavailable***
Kevin Likcani
Use of Machine Learning to Predict Drug Court SuccessWhen & Where:
Eaton Hall, Room 2001B
Committee Members:
David Johnson, ChairPrasad Kulkarni
Heechul Yun
Abstract
Substance use remains a major public health issue in the United States that significantly impacts individuals, families, and society. Many individuals who suffer from substance use disorder (SUD) face incarceration due to drug-related offenses. Drug courts have emerged as an alternative to imprisonment and offer the opportunity for individuals to participate in a drug rehabilitation program instead. Drug courts mainly focus on those with non-violent drug-related offenses. One of the challenges of decision making in drug courts is assessing the likelihood of participants graduating from the drug court and avoiding recidivism after graduation. This study investigates the use of machine learning models to predict success in drug courts using data from a substance use drug court in Missouri. Success is measured in terms of graduation from the program, and the model includes a wide range of potential predictors including demographic characteristics, family and social factors, substance use history, legal involvement, physical and mental health history, employment history as well as drug court participation data. The results will be beneficial to drug court teams and presiding judges in predicting client success, evaluating risk factors during treatment for participants, informing person-centered treatment planning, and the development of after-care plans for high-risk participants to reduce the likelihood of recidivism.
Peter Tso
Implementation of Free-Space Optical Networks based on Resonant Semiconductor Saturable Absorber and Phase Light ModulatorWhen & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Rongqing Hui, ChairShannon Blunt
Shima Fardad
Abstract
Optical Neural Networks (ONNs) have gained traction as an alternative to the conventional computing architectures used in modern CPUs and GPUs, largely because light enables massive parallelism, ultrafast inference, and minimal power consumption.
As with conventional deep neural networks (DNNs), free-space ONNs require two main layers: (1) a nonlinear activation function which exists to separate adjacent linear layers, and (2) weighting layers that applies a linear transformation given an input.
Firstly, a Resonant Semiconductor Saturable Absorption Mirror (RSAM) was investigated as a viable nonlinear activation function. Several mechanisms have been used to create nonlinear activation functions, such as cold atoms, vapor absorption cells, and polaritons, but these implementations are bulky and must operate under tightly controlled environments while RSAMs is a passive device. Compared to typical SESAMs, the resonance structure of RSAM also reduces the saturation fluence compared to non-resonant SAMs, allowing low power laser sources to be used. A fiber-based optical testbed demonstrated notable improvement of 8.1% in classification accuracy compared to a linear only network trained with the MNIST dataset.
Secondly, Micro-electromechanical-system-based phase light modulators (PLMs) were evaluated as an alternative to LC-SLMs for in-situ reinforcement learning. PLMs can operate at kilohertz-scale frame rates at a substantially lower cost compared to LC-SLMs but have lower phase resolution and non-uniform quantization which impacts fidelity. Despite these disadvantages, the high-speed nature of PLMs allows for significant decrease in optimization time, which not only allows for reduction in training time, but also allows for larger datasets and more complex models with more learnable parameters. A single layer optical network was implemented using policy-based learning with discrete action-space to minimize impact of quantization. The testbed achieves 90.1%, 79.7%, and 76.9% training, validation, and test accuracy, respectively, on 3,000 images from the MNIST dataset. Additionally, we achieved 79.9%, 72.1%, and 71.7% accuracy on 3,000 images from the Fashion MNIST dataset. At 14 minutes per epoch during training, it is at least a magnitude lower in training time compared to LC-SLMs based models.
Joseph Vinduska
Fault-Frequency Agnostic Checkpointing StrategiesWhen & Where:
Eaton Hall, Room 2001B
Committee Members:
Hongyang Sun, ChairArvin Agah
Drew Davidson
Abstract
Checkpointing strategies in high-performance computing traditionally employ the Young-Daly for-
mula to determine the (first-order) optimal duration between checkpoints, which assumes a known
mean time between faults (MTBF). However, in practice, the MTBF may not be known accurately
or may vary, causing Young-Daly checkpointing to perform sub-optimally. In 2021, Sigdel et al.
introduced the CHORE (CHeckpointing Overhead and Rework Equated) checkpointing strategy,
which is MTBF-agnostic yet demonstrates a bounded increase in overhead compared to the op-
timal strategy. This thesis analyzed and extends the CHORE framework in several ways. First,
it verifies Sigdel et al.’s claims about the relative overhead of the CHORE strategy through both
event-driven simulations and expected runtimes derived from the underlying probablistic model.
Second, it extends the CHORE strategy to silent errors, which must be deliberately checked for to
be detected. In this scenario, the overhead compared to optimal checkpointing is once more ana-
lyzed through simulations and expected runtimes. Third, a heuristic is proposed to offer improved
performance of the CHORE algorithm under typical runtime scenarios by interpreting CHORE as
an additive-increase multiplicative-decrease model and tuning the parameters.
Lee Taylor
Ultrawideband Single-Pass Interferometric SAR Integrated with Multi-Rotor UAVWhen & Where:
Nichols Hall, Room 317 (Moore Conference Room)
Committee Members:
Carl Leuschen, ChairShannon Blunt
Patrick McCormick
John Paden
Fernando Rodriguez-Morales
Abstract
Ultrawideband (UWB) Interferometric Synthetic Aperture Radar (InSAR) integrated with multi-rotor Uncrewed Aerial Vehicle (UAV), or UIMU in this work for brevity, provides ultrafine-resolution, all-weather, 3D surface imagery at any time of day. UIMU can be rapidly deployable and low-cost, and therefore a critical new tool for low-altitude remote sensing applications, such as disaster response, environmental monitoring, and intelligence surveillance and reconnaissance (ISR). Traditional repeat-pass data collection methods reduce the phase coherence required for InSAR processing of ultrafine-resolution datasets due to the unstable flight behavior of multi-rotor UAVs. Collecting Synthetic Aperture Radar (SAR) datasets using two receive channels during a single-pass will improve phase coherence and the ability to produce ultrafine-resolution 3D InSAR imagery.
This work proposes to quantify and characterize 3D target-position accuracy for a dual-channel 6 GHz bandwidth (2 cm range resolution) frequency modulated continuous wave (FMCW) radar integrated with the Aurela X6 hexacopter to establish novel single-pass UWB InSAR data collection methods and processing algorithms for multi-rotor UAV. The feasibility of the proposed investigation is demonstrated by the preliminary qualitative analysis of single-pass InSAR imagery presented in this proposal. Fieldwork will be conducted to measure the positions of GPS located corner reflectors using the UIMU system. Algorithms for motion tolerant Time-Domain Backprojection (TDBP), InSAR coregistration, and digital elevation mapping novel to multi-rotor UAV at UWB will be developed and presented. An analysis of vehicle motion induced phase decoherence, and InSAR imagery signal to noise ratio (SNR) will be presented. The TDBP SNR performance will be compared to the Open Polar Radar Omega-K algorithm to attempt to quantify motion tolerance between the different SAR processing algorithms.
This work will establish a foundation for future investigations of real-time image processing, separated transmission and receive platforms (bistatic), or swarm configurations for UIMU systems.
Hao Xuan
Toward an Integrated Computational Framework for Metagenomics: From Sequence Alignment to Automated Knowledge DiscoveryWhen & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Cuncong Zhong, ChairFengjun Li
Suzanne Shontz
Hongyang Sun
Liang Xu
Abstract
Metagenomic sequencing has become a central paradigm for studying complex microbial communities and their interactions with the host, with emerging applications in clinical prediction and disease modeling. In this work, we first investigate two representative application scenarios: predicting immune checkpoint inhibitor response in non-small cell lung cancer using gut microbial signatures, and characterizing host–microbiome interactions in neonatal systems. The proposed reference-free neural network captures both compositional and functional signals without reliance on reference genomes, while the neonatal study demonstrates how environmental and genetic factors reshape microbial communities and how probiotic intervention can mitigate pathogen-induced immune activation.
These studies highlight both the promise and the inherent difficulty of metagenomic analysis: transforming raw sequencing data into clinically actionable insights remains an algorithmically fragmented and computationally intensive process. This challenge arises from two key limitations: the lack of a unified algorithmic foundation for sequence alignment and the absence of systematic approaches for selecting and organizing analytical tools. Motivated by these challenges, we present a unified computational framework for metagenomic analysis that integrates complementary algorithmic and systems-level solutions.
First, to resolve fragmentation at the alignment level, we develop the Versatile Alignment Toolkit (VAT), a unified algorithmic system for biological sequence alignment across diverse applications. VAT introduces an asymmetric multi-view k-mer indexing scheme that integrates multiple seeding strategies within a single architecture and enables dynamic seed-length adjustment via longest common prefix (LCP)–based inference without re-indexing. A flexible seed-chaining mechanism further supports diverse alignment scenarios, including collinear, rearranged, and split alignments. Combined with a hardware-efficient in-register bitonic sorting algorithm and dynamic index-loading strategy, VAT achieves high efficiency and broad applicability across read mapping, homology search, and whole-genome alignment. Second, to address the challenge of tool selection and pipeline construction, we develop SNAIL, a natural language processing system for automated recognition of bioinformatics tools from large-scale and rapidly growing scientific literature. By integrating XGBoost and Transformer-based models such as SciBERT, SNAIL enables structured extraction of analytical tools and supports automated, reproducible pipeline construction.
Together, this work establishes a unified framework that is grounded in real-world applications and addresses key bottlenecks in metagenomic analysis, enabling more efficient, scalable, and clinically actionable workflows.
Devin Setiawan
Concept-Driven Interpretability in Graph Neural Networks: Applications in Neuroscientific Connectomics and Clinical Motor AnalysisWhen & Where:
Eaton Hall, Room 2001B
Committee Members:
Sumaiya Shomaji, ChairSankha Guria
Han Wang
Abstract
Graph Neural Networks (GNNs) achieve state-of-the-art performance in modeling complex biological and behavioral systems, yet their "black-box" nature limits their utility for scientific discovery and clinical translation. Standard post-hoc explainability methods typically attribute importance to low-level features, such as individual nodes or edges, which often fail to map onto the high-level, domain-specific concepts utilized by experts. To address this gap, this thesis explores diverse methodological strategies for achieving Concept-Level Interpretability in GNNs, demonstrating how deep learning models can be structurally and analytically aligned with expert domain knowledge. This theme is explored through two distinct methodological paradigms applied to critical challenges in neuroscience and clinical psychology. First, we introduce an interpretable-by-design approach for modeling brain structure-function coupling. By employing an ensemble of GNNs conceptually biased via input graph filtering, the model enforces verifiably disentangled node embeddings. This allows for the quantitative testing of specific structural hypotheses, revealing that a minority of strong anatomical connections disproportionately drives functional connectivity predictions. Second, we present a post-hoc conceptual alignment paradigm for quantifying atypical motor signatures in Autism Spectrum Disorder (ASD). Utilizing a Spatio-Temporal Graph Autoencoder (STGCN-AE) trained on normative skeletal data, we establish an unsupervised anomaly detection system. To provide clinical interpretability, the model's reconstruction error is systematically aligned with a library of human-interpretable kinematic features, such as postural sway and limb jerk. Explanatory meta-modeling via XGBoost and SHAP analysis further translates this abstract loss into a multidimensional clinical signature. Together, these applications demonstrate that integrating concept-level interpretability through either architectural design or systematic post-hoc alignment enables GNNs to serve as robust tools for hypothesis testing and clinical assessment.
Mahmudul Hasan
Trust Assurance of Commercial Off-The-Shelf (COTS) Hardware Through Verification and Runtime ResilienceWhen & Where:
Eaton Hall, Room 2001B
Committee Members:
Tamzidul Hoque, ChairEsam El-Araby
Prasad Kulkarni
Hongyang Sun
Huijeong Kim
Abstract
The adoption of Commercial off-the-shelf (COTS) components has become a dominant paradigm in modern system design due to their reduced development cost, faster time-to-market, and widespread availability. However, the reliance on globally distributed and untrusted supply chains introduces significant security risks, particularly the possibility of malicious hardware modifications such as Trojans, embedded during design or fabrication. In such settings, traditional methods that depend on golden models, full design visibility, or trusted fabrication are no longer sufficient, creating the need for new security assurance approaches under a zero-trust model. This proposed research addresses security challenges in COTS microprocessors through two complementary solutions: runtime resilience and pre-deployment trust verification. First, a multi-variant-execution-based framework is developed that leverages functionally equivalent program variants to induce diverse microarchitectural execution patterns. By comparing intermediate outputs across variants, the framework enables runtime detection and tolerance of Trojan induced payload effects without requiring hardware redundancy or architectural modifications. To enhance the effectiveness of variant generation, a reinforcement learning assisted framework is introduced, in which the reward function is defined by security objectives rather than traditional performance optimization, enabling the generation of variants that are more robust against repeated Trojan activation. Second, to enable black-box trust verification prior to deployment, this work presents a framework that can efficiently test the presence of hardware Trojans by identifying microarchitectural rare events and transferring activation knowledge from existing processor designs to trigger highly susceptible internal nodes. By leveraging ISA-level knowledge, open-source RTL references, and LLM-guided test generation, the framework achieves high trigger coverage without requiring access to proprietary designs or golden references. Building on these two scenarios, a future research direction is outlined for evolving trust in COTS hardware through continuous runtime observation, where multi-variant execution is extended with lightweight monitoring mechanisms that capture key microarchitectural events and execution traces. These observations are accumulated as hardware trust counters, enabling the system to progressively establish confidence in the underlying hardware by verifying consistent behavior across diverse execution patterns over time. Together, these directions establish a foundation for analyzing and mitigating security risks across zero-trust COTS supply chains.
Mohsen Nayebi Kerdabadi
Representation Augmentation for Electronic Health Records via Knowledge Graphs, Large Language Models, and Contrastive LearningWhen & Where:
Learned Hall, Room 3150
Committee Members:
Zijun Yao, ChairSumaiya Shomaji
Hongyang Sun
Dongjie Wang
Shawn Keshmiri
Abstract
Electronic Health Records (EHRs) provide rich longitudinal patient information, but their high dimensionality, sparsity, heterogeneity, and temporal complexity make robust representation learning difficult. This dissertation studies how to improve patient and medical concept representation learning in EHRs and consequently enhance healthcare predictive tasks by integrating domain knowledge, knowledge graphs, large language models (LLMs), and contrastive learning. First, it introduces an ontology-aware temporal contrastive framework for survival analysis that learns discriminative patient representations from censored and observed trajectories by modeling temporal distinctiveness in longitudinal EHR data. Second, it proposes a multi-ontology representation learning framework that jointly propagates knowledge within and across diagnosis, medication, and procedure ontologies, enabling richer medical concept embeddings, especially under limited data and for rare conditions. Third, it develops an LLM-enriched, text-attributed medical knowledge graph framework that combines EHR-derived statistical evidence with type-constrained LLM reasoning to infer semantic relations, generate contextual node and edge descriptions, and co-learn concept embeddings through joint language-model and graph-neural-network training. Together, these studies advance a unified view of EHR representation learning in which structured medical knowledge, textual semantics, and temporal patient trajectories are jointly leveraged to build more accurate, interpretable, and robust healthcare prediction models.
Moh Absar Rahman
Permissions vs Promises: Assessing Over-privileged Android Apps via Local LLM-based Description ValidationWhen & Where:
Eaton Hall, Room 2001B
Committee Members:
Drew Davidson, ChairSankha Guria
David Johnson
Abstract
Android is the most widely adopted mobile operating system, supporting billions of devices and driven by a robust app ecosystem. Its permission-based security model aims to enforce the Principle of Least Privilege (PoLP), restricting apps to only the permissions it needs. However, many apps still request excessive permissions, increasing the risk of data leakage and malicious exploitation. Previous research on overprivileged permission has become ineffective due to outdated methods and increasing technical complexity. The introduction of runtime permissions and scoped storage has made some of the traditional analysis techniques obsolete. Additionally, developers often are not transparent in explaining the usage of app permissions on the Play Store, misleading users unknowingly and unwillingly granting unnecessary permissions. This combination of overprivilege and poor transparency poses significant security threats to Android users. Recently, the rise of local large language models (LLMs) has shown promise in various security fields. The main focus of this study is to analyze whether an app is overpriviledged based on app description provided on the Play Store using Local LLM. Finally, we conduct a manual evaluation to validate the LLM’s findings, comparing its results against human-verified response.
Past Defense Notices
Zara Safaeipour
Task-Aware Communication Computation Co-Design for Wireless Edge AI SystemsWhen & Where:
Nichols Hall, Room 246
Committee Members:
Morteza Hashemi, ChairVan Ly Nguyen
Dongjie Wang
Abstract
Wireless edge systems typically need to complete timely computation and inference tasks under strict power, bandwidth, latency, and processing constraints. As AI models and datasets grow in size and complexity, the traditional model of sending all data to a remote cloud or running full inference on edge device becomes impractical. This creates a need for communication-computation co-design to enable efficient AI task processing at the wireless edge. To address this problem, we investigate task-aware communication-computation optimization for two specific problem settings.
First, we explore semantic communication that transmits only the information essential for the receiver’s computation tasks. We propose a semantic-aware and goal-oriented communication method for object detection. Our proposed approach is built upon the auto-encoders, with the encoder and the decoder are respectively implemented at the transmitter and receiver to extract semantic information for the specific computation goal (e.g., object detection). Numerical results show that transmitting only the necessary semantic features significantly improves the overall system efficiency.
Second, we study collaborative inference in wireless edge networks, where energy-constrained devices aim to complete delay-sensitive inference tasks. The inference computation is split between the device and an edge server, thereby achieving collaborative inference. We formulate a utility maximization problem under energy and delay constraints and propose Bayes-Split-Edge, which uses Bayesian optimization to determine the optimal transmission power and neural network split point. The proposed framework introduces a hybrid acquisition function that balances inference task utility, sample efficiency, and constraint violation penalties. We evaluate our approach using the VGG19 model, the ImageNet-Mini dataset, and real-world mMobile wireless channel datasets.
Overall, this research is aimed at developing efficient edge AI systems by incorporating the underlying wireless communications limitations and challenges into AI tasks processing.
Andrew Riachi
An Investigation Into The Memory Consumption of Web Browsers and A Memory Profiling Tool Using Linux SmapsWhen & Where:
Nichols Hall, Room 250 (Gemini Conference Room)
Committee Members:
Prasad Kulkarni, ChairPerry Alexander
Drew Davidson
Heechul Yun
Abstract
Web browsers are notorious for consuming large amounts of memory. Yet, they have become the dominant framework for writing GUIs because the web languages are ergonomic for programmers and have a cross-platform reach. These benefits are so enticing that even a large portion of mobile apps, which have to run on resource-constrained devices, are running a web browser under the hood. Therefore, it is important to keep the memory consumption of web browsers as low as practicable.
In this thesis, we investigate the memory consumption of web browsers, in particular, compared to applications written in native GUI frameworks. We introduce smaps-profiler, a tool to profile the overall memory consumption of Linux applications that can report memory usage other profilers simply do not measure. Using this tool, we conduct experiments which suggest that most of the extra memory usage compared to native applications could be due the size of the web browser program itself. We discuss our experiments and findings, and conclude that even more rigorous studies are needed to profile GUI applications.
Elizabeth Wyss
A New Frontier for Software Security: Diving Deep into npmWhen & Where:
Eaton Hall, Room 2001B
Committee Members:
Drew Davidson, ChairAlex Bardas
Fengjun Li
Bo Luo
J. Walker
Abstract
Open-source package managers (e.g., npm for Node.js) have become an established component of modern software development. Rather than creating applications from scratch, developers may employ modular software dependencies and frameworks--called packages--to serve as building blocks for writing larger applications. Package managers make this process easy. With a simple command line directive, developers are able to quickly fetch and install packages across vast open-source repositories. npm--the largest of such repositories--alone hosts millions of unique packages and serves billions of package downloads each week.
However, the widespread code sharing resulting from open-source package managers also presents novel security implications. Vulnerable or malicious code hiding deep within package dependency trees can be leveraged downstream to attack both software developers and the end-users of their applications. This downstream flow of software dependencies--dubbed the software supply chain--is critical to secure.
This research provides a deep dive into the npm-centric software supply chain, exploring distinctive phenomena that impact its overall security and usability. Such factors include (i) hidden code clones--which may stealthily propagate known vulnerabilities, (ii) install-time attacks enabled by unmediated installation scripts, (iii) hard-coded URLs residing in package code, (iv) the impacts of open-source development practices, (v) package compromise via malicious updates, (vi) spammers disseminating phishing links within package metadata, and (vii) abuse of cryptocurrency protocols designed to reward the creators of high-impact packages. For each facet, tooling is presented to identify and/or mitigate potential security impacts. Ultimately, it is our hope that this research fosters greater awareness, deeper understanding, and further efforts to forge a new frontier for the security of modern software supply chains.
Alfred Fontes
Optimization and Trade-Space Analysis of Pulsed Radar-Communication Waveforms using Constant Envelope ModulationsWhen & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Patrick McCormick, ChairShannon Blunt
Jonathan Owen
Abstract
Dual function radar communications (DFRC) is a method of co-designing a single radio frequency system to perform simultaneous radar and communications service. DFRC is ultimately a compromise between radar sensing performance and communications data throughput due to the conflicting requirements between the sensing and information-bearing signals.
A novel waveform-based DFRC approach is phase attached radar communications (PARC), where a communications signal is embedded onto a radar pulse via the phase modulation between the two signals. The PARC framework is used here in a new waveform design technique that designs the radar component of a PARC signal to match the PARC DFRC waveform expected power spectral density (PSD) to a desired spectral template. This provides better control over the PARC signal spectrum, which mitigates the issue of PARC radar performance degradation from spectral growth due to the communications signal.
The characteristics of optimized PARC waveforms are then analyzed to establish a trade-space between radar and communications performance within a PARC DFRC scenario. This is done by sampling the DFRC trade-space continuum with waveforms that contain a varying degree of communications bandwidth, from a pure radar waveform (no embedded communications) to a pure communications waveform (no radar component). Radar performance, which is degraded by range sidelobe modulation (RSM) from the communications signal randomness, is measured from the PARC signal variance across pulses; data throughput is established as the communications performance metric. Comparing the values of these two measures as a function of communications symbol rate explores the trade-offs in performance between radar and communications with optimized PARC waveforms.
Qua Nguyen
Hybrid Array and Privacy-Preserving Signaling Optimization for NextG Wireless CommunicationsWhen & Where:
Zoom Defense, please email jgrisafe@ku.edu for link.
Committee Members:
Erik Perrins, ChairMorteza Hashemi
Zijun Yao
Taejoon Kim
KC Kong
Abstract
This PhD research tackles two critical challenges in NextG wireless networks: hybrid precoder design for wideband sub-Terahertz (sub-THz) massive multiple-input multiple-output (MIMO) communications and privacy-preserving federated learning (FL) over wireless networks.
In the first part, we propose a novel hybrid precoding framework that integrates true-time delay (TTD) devices and phase shifters (PS) to counteract the beam squint effect - a significant challenge in the wideband sub-THz massive MIMO systems that leads to considerable loss in array gain. Unlike previous methods that only designed TTD values while fixed PS values and assuming unbounded time delay values, our approach jointly optimizes TTD and PS values under realistic time delays constraint. We determine the minimum number of TTD devices required to achieve a target array gain using our proposed approach. Then, we extend the framework to multi-user wideband systems and formulate a hybrid array optimization problem aiming to maximize the minimum data rate across users. This problem is decomposed into two sub-problems: fair subarray allocation, solved via continuous domain relaxation, and subarray gain maximization, addressed via a phase-domain transformation.
The second part focuses on preserving privacy in FL over wireless networks. First, we design a differentially-private FL algorithm that applies time-varying noise variance perturbation. Taking advantage of existing wireless channel noise, we jointly design differential privacy (DP) noise variances and users transmit power to resolve the tradeoffs between privacy and learning utility. Next, we tackle two critical challenges within FL networks: (i) privacy risks arising from model updates and (ii) reduced learning utility due to quantization heterogeneity. Prior work typically addresses only one of these challenges because maintaining learning utility under both privacy risks and quantization heterogeneity is a non-trivial task. We approach to improve the learning utility of a privacy-preserving FL that allows clusters of devices with different quantization resolutions to participate in each FL round. Specifically, we introduce a novel stochastic quantizer (SQ) that ensures a DP guarantee and minimal quantization distortion. To address quantization heterogeneity, we introduce a cluster size optimization technique combined with a linear fusion approach to enhance model aggregation accuracy. Lastly, inspired by the information-theoretic rate-distortion framework, a privacy-distortion tradeoff problem is formulated to minimize privacy loss under a given maximum allowable quantization distortion. The optimal solution to this problem is identified, revealing that the privacy loss decreases as the maximum allowable quantization distortion increases, and vice versa.
This research advances hybrid array optimization for wideband sub-THz massive MIMO and introduces novel algorithms for privacy-preserving quantized FL with diverse precision. These contributions enable high-throughput wideband MIMO communication systems and privacy-preserving AI-native designs, aligning with the performance and privacy protection demands of NextG networks.
Arin Dutta
Performance Analysis of Distributed Raman Amplification with Different Pumping ConfigurationsWhen & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Rongqing Hui, ChairMorteza Hashemi
Rachel Jarvis
Alessandro Salandrino
Hui Zhao
Abstract
As internet services like high-definition videos, cloud computing, and artificial intelligence keep growing, optical networks need to keep up with the demand for more capacity. Optical amplifiers play a crucial role in offsetting fiber loss and enabling long-distance wavelength division multiplexing (WDM) transmission in high-capacity systems. Various methods have been proposed to enhance the capacity and reach of fiber communication systems, including advanced modulation formats, dense wavelength division multiplexing (DWDM) over ultra-wide bands, space-division multiplexing, and high-performance digital signal processing (DSP) technologies. To maintain higher data rates along with maximizing the spectral efficiency of multi-level modulated signals, a higher Optical Signal-to-Noise Ratio (OSNR) is necessary. Despite advancements in coherent optical communication systems, the spectral efficiency of multi-level modulated signals is ultimately constrained by fiber nonlinearity. Raman amplification is an attractive solution for wide-band amplification with low noise figures in multi-band systems.
Distributed Raman Amplification (DRA) have been deployed in recent high-capacity transmission experiments to achieve a relatively flat signal power distribution along the optical path and offers the unique advantage of using conventional low-loss silica fibers as the gain medium, effectively transforming passive optical fibers into active or amplifying waveguides. Also, DRA provides gain at any wavelength by selecting the appropriate pump wavelength, enabling operation in signal bands outside the Erbium doped fiber amplifier (EDFA) bands. Forward (FW) Raman pumping configuration in DRA can be adopted to further improve the DRA performance as it is more efficient in OSNR improvement because the optical noise is generated near the beginning of the fiber span and attenuated along the fiber. Dual-order FW pumping scheme helps to reduce the non-linear effect of the optical signal and improves OSNR by more uniformly distributing the Raman gain along the transmission span.
The major concern with Forward Distributed Raman Amplification (FW DRA) is the fluctuation in pump power, known as relative intensity noise (RIN), which transfers from the pump laser to both the intensity and phase of the transmitted optical signal as they propagate in the same direction. Additionally, another concern of FW DRA is the rise in signal optical power near the start of the fiber span, leading to an increase in the non-linear phase shift of the signal. These factors, including RIN transfer-induced noise and non-linear noise, contribute to the degradation of system performance in FW DRA systems at the receiver.
As the performance of DRA with backward pumping is well understood with relatively low impact of RIN transfer, our research is focused on the FW pumping configuration, and is intended to provide a comprehensive analysis on the system performance impact of dual order FW Raman pumping, including signal intensity and phase noise induced by the RINs of both 1st and the 2nd order pump lasers, as well as the impacts of linear and nonlinear noise. The efficiencies of pump RIN to signal intensity and phase noise transfer are theoretically analyzed and experimentally verified by applying a shallow intensity modulation to the pump laser to mimic the RIN. The results indicate that the efficiency of the 2nd order pump RIN to signal phase noise transfer can be more than 2 orders of magnitude higher than that from the 1st order pump. Then the performance of the dual order FW Raman configurations is compared with that of single order Raman pumping to understand trade-offs of system parameters. The nonlinear interference (NLI) noise is analyzed to study the overall OSNR improvement when employing a 2nd order Raman pump. Finally, a DWDM system with 16-QAM modulation is used as an example to investigate the benefit of DRA with dual order Raman pumping and with different pump RIN levels. We also consider a DRA system using a 1st order incoherent pump together with a 2nd order coherent pump. Although dual order FW pumping corresponds to a slight increase of linear amplified spontaneous emission (ASE) compared to using only a 1st order pump, its major advantage comes from the reduction of nonlinear interference noise in a DWDM system. Because the RIN of the 2nd order pump has much higher impact than that of the 1st order pump, there should be more stringent requirement on the RIN of the 2nd order pump laser when dual order FW pumping scheme is used for DRA for efficient fiber-optic communication. Also, the result of system performance analysis reveals that higher baud rate systems, like those operating at 100Gbaud, are less affected by pump laser RIN due to the low-pass characteristics of the transfer of pump RIN to signal phase noise.
Audrey Mockenhaupt
Using Dual Function Radar Communication Waveforms for Synthetic Aperture Radar Automatic Target RecognitionWhen & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Patrick McCormick, ChairShannon Blunt
Jon Owen
Abstract
As machine learning (ML), artificial intelligence (AI), and deep learning continue to advance, their applications become more diverse – one such application is synthetic aperture radar (SAR) automatic target recognition (ATR). These SAR ATR networks use different forms of deep learning such as convolutional neural networks (CNN) to classify targets in SAR imagery. An emerging research area of SAR is dual function radar communication (DFRC) which performs both radar and communications functions using a single co-designed modulation. The utilization of DFRC emissions for SAR imaging impacts image quality, thereby influencing SAR ATR network training. Here, using the Civilian Vehicle Data Dome dataset from the AFRL, SAR ATR networks are trained and evaluated with simulated data generated using Gaussian Minimum Shift Keying (GMSK) and Linear Frequency Modulation (LFM) waveforms. The networks are used to compare how the target classification accuracy of the ATR network differ between DFRC (i.e., GMSK) and baseline (i.e., LFM) emissions. Furthermore, as is common in pulse-agile transmission structures, an effect known as ’range sidelobe modulation’ is examined, along with its impact on SAR ATR. Finally, it is shown that SAR ATR network can be trained for GMSK emissions using existing LFM datasets via two types of data augmentation.
Rich Simeon
Delay-Doppler Channel Estimation for High-Speed Aeronautical Mobile Telemetry ApplicationsWhen & Where:
Eaton Hall, Room 2001B
Committee Members:
Erik Perrins, ChairShannon Blunt
Morteza Hashemi
Jim Stiles
Craig McLaughlin
Abstract
The next generation of digital communications systems aims to operate in high-Doppler environments such as high-speed trains and non-terrestrial networks that utilize satellites in low-Earth orbit. Current generation systems use Orthogonal Frequency Division Multiplexing modulation which is known to suffer from inter-carrier interference (ICI) when different channel paths have dissimilar Doppler shifts.
A new Orthogonal Time Frequency Space (OTFS) modulation (also known as Delay-Doppler modulation) is proposed as a candidate modulation for 6G networks that is resilient to ICI. To date, OTFS demodulation designs have focused on the use cases of popular urban terrestrial channel models where path delay spread is a fraction of the OTFS symbol duration. However, wireless wide-area networks that operate in the aeronautical mobile telemetry (AMT) space can have large path delay spreads due to reflections from distant geographic features. This presents problems for existing channel estimation techniques which assume a small maximum expected channel delay, since data transmission is paused to sound the channel by an amount equal to twice the maximum channel delay. The dropout in data contributes to a reduction in spectral efficiency.
Our research addresses OTFS limitations in the AMT use case. We start with an exemplary OTFS framework with parameters optimized for AMT. Following system design, we focus on two distinct areas to improve OTFS performance in the AMT environment. First we propose a new channel estimation technique using a pilot signal superimposed over data that can measure large delay spread channels with no penalty in spectral efficiency. A successive interference cancellation algorithm is used to iteratively improve channel estimates and jointly decode data. A second aspect of our research aims to equalize in delay-Doppler space. In the delay-Doppler paradigm, the rapid channel variations seen in the time-frequency domain is transformed into a sparse quasi-stationary channel in the delay-Doppler domain. We propose to use machine learning using Gaussian Process Regression to take advantage of the sparse and stationary channel and learn the channel parameters to compensate for the effects of fractional Doppler in which simpler channel estimation techniques cannot mitigate. Both areas of research can advance the robustness of OTFS across all communications systems.
Mohammad Ful Hossain Seikh
AAFIYA: Antenna Analysis in Frequency-domain for Impedance and Yield AssessmentWhen & Where:
Eaton Hall, Room 2001B
Committee Members:
Jim Stiles, ChairRachel Jarvis
Alessandro Salandrino
Abstract
This project presents AAFIYA (Antenna Analysis in Frequency-domain for Impedance and Yield Assessment), a modular Python toolkit developed to automate and streamline the characterization and analysis of radiofrequency (RF) antennas using both measurement and simulation data. Motivated by the need for reproducible, flexible, and publication-ready workflows in modern antenna research, AAFIYA provides comprehensive support for all major antenna metrics, including S-parameters, impedance, gain and beam patterns, polarization purity, and calibration-based yield estimation. The toolkit features robust data ingestion from standard formats (such as Touchstone files and beam pattern text files), vectorized computation of RF metrics, and high-quality plotting utilities suitable for scientific publication.
Validation was carried out using measurements from industry-standard electromagnetic anechoic chamber setups involving both Log Periodic Dipole Array (LPDA) reference antennas and Askaryan Radio Array (ARA) Bottom Vertically Polarized (BVPol) antennas, covering a frequency range of 50–1500 MHz. Key performance metrics, such as broadband impedance matching, S11 and S21 related calculations, 3D realized gain patterns, vector effective lengths, and cross-polarization ratio, were extracted and compared against full-wave electromagnetic simulations (using HFSS and WIPL-D). The results demonstrate close agreement between measurement and simulation, confirming the reliability of the workflow and calibration methodology.
AAFIYA’s open-source, extensible design enables rapid adaptation to new experiments and provides a foundation for future integration with machine learning and evolutionary optimization algorithms. This work not only delivers a validated toolkit for antenna research and pedagogy but also sets the stage for next-generation approaches in automated antenna design, optimization, and performance analysis.
Soumya Baddham
Battling Toxicity: A Comparative Analysis of Machine Learning Models for Content ModerationWhen & Where:
Eaton Hall, Room 2001B
Committee Members:
David Johnson, ChairPrasad Kulkarni
Hongyang Sun
Abstract
With the exponential growth of user-generated content, online platforms face unprecedented challenges in moderating toxic and harmful comments. Due to this, Automated content moderation has emerged as a critical application of machine learning, enabling platforms to ensure user safety and maintain community standards. Despite its importance, challenges such as severe class imbalance, contextual ambiguity, and the diverse nature of toxic language often compromise moderation accuracy, leading to biased classification performance.
This project presents a comparative analysis of machine learning approaches for a Multi-Label Toxic Comment Classification System using the Toxic Comment Classification dataset from Kaggle. The study examines the performance of traditional algorithms, such as Logistic Regression, Random Forest, and XGBoost, alongside deep architectures, including Bi-LSTM, CNN-Bi-LSTM, and DistilBERT. The proposed approach utilizes word-level embeddings across all models and examines the effects of architectural enhancements, hyperparameter optimization, and advanced training strategies on model robustness and predictive accuracy.
The study emphasizes the significance of loss function optimization and threshold adjustment strategies in improving the detection of minority classes. The comparative results reveal distinct performance trade-offs across model architectures, with transformer models achieving superior contextual understanding at the cost of computational complexity. At the same time, deep learning approaches(LSTM models) offer efficiency advantages. These findings establish evidence-based guidelines for model selection in real-world content moderation systems, striking a balance between accuracy requirements and operational constraints.