Defense Notices
All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.
Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check and post the presentation announcement online.
Upcoming Defense Notices
Sudha Chandrika Yadlapalli
BERT-Driven Sentiment Analysis: Automated Course Feedback Classification and Ratings
When & Where:
Eaton Hall, Room 2001B
Committee Members:
David Johnson, Chair
Prasad Kulkarni
Hongyang Sun
Abstract
Automating the analysis of unstructured textual data, such as student course feedback, is crucial for gaining actionable insights. This project focuses on developing a sentiment analysis system leveraging the DeBERTa-v3-base model, a variant of BERT (Bidirectional Encoder Representations from Transformers), to classify feedback sentiments and generate corresponding ratings on a 1-to-5 scale.
A dataset of 100,000+ student reviews was preprocessed, and the model was fine-tuned on it to handle class imbalances and capture contextual nuances. Training was conducted on high-performance A100 GPUs, which enhanced computational efficiency and significantly reduced training times. The trained BERT sentiment model demonstrated superior performance compared to traditional machine learning models, achieving ~82% accuracy in sentiment classification.
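For illustration, a minimal fine-tuning sketch in the spirit of the setup described above, assuming the HuggingFace transformers library, the public microsoft/deberta-v3-base checkpoint, and a hypothetical course_reviews.csv with text and rating columns; the hyperparameters are placeholders, not the project's actual configuration.

```python
import pandas as pd
import torch
from torch.utils.data import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v3-base", num_labels=5)   # ratings 1-5 -> labels 0-4

class ReviewDataset(Dataset):
    def __init__(self, texts, ratings):
        self.enc = tokenizer(texts, truncation=True, padding=True,
                             max_length=256, return_tensors="pt")
        self.labels = torch.tensor([r - 1 for r in ratings])
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        return {k: v[i] for k, v in self.enc.items()} | {"labels": self.labels[i]}

df = pd.read_csv("course_reviews.csv")            # hypothetical data file
train_ds = ReviewDataset(df["text"].tolist(), df["rating"].tolist())

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=32, fp16=True),
    train_dataset=train_ds,
)
trainer.train()
```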
The model was integrated into a functional web application, providing a streamlined approach to evaluating and visualizing course reviews dynamically. Key features include a course ratings dashboard, allowing students to view aggregated ratings for each course, and a review submission feature where new feedback is analyzed for sentiment in real time. For the department, an admin page provides secure access to detailed analytics, such as the distribution of positive and negative reviews, visualized trends, and individual course reviews with their corresponding sentiment scores.
This project includes a comprehensive pipeline, starting from data preprocessing and model training to deploying an end-to-end application. Traditional machine learning models, such as Logistic Regression and Decision Tree, were initially tested but yielded suboptimal results. The adoption of BERT, trained on a large dataset of 100k reviews, significantly improved performance, showcasing the benefits of advanced transformer-based models for sentiment analysis tasks.
Shriraj K. Vaidya
Exploring DL Compiler Optimizations with TVM
When & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Prasad Kulkarni, Chair
Dongjie Wang
Zijun Yao
Abstract
Deep Learning (DL) compilers, also called Machine Learning (ML) compilers, take a computational graph representation of an ML model as input and apply graph-level and operator-level optimizations to generate optimized machine code for different supported hardware architectures. DL compilers can apply several graph-level optimizations, including operator fusion, constant folding, and data layout transformations, to convert the input computation graph into a functionally equivalent and optimized variant. DL compilers also perform kernel scheduling, which is the task of finding the most efficient implementation for the operators in the computational graph. While many research efforts have focused on exploring different kernel scheduling techniques and algorithms, the benefits of individual computation graph-level optimizations are not as well studied. In this work, we employ the TVM compiler to perform a comprehensive study of the impact of different graph-level optimizations on the performance of DL models on CPUs and GPUs. We find that TVM's graph optimizations can improve model performance by up to 41.73% on CPUs and 41.6% on GPUs, and by 16.75% and 21.89%, on average, on CPUs and GPUs, respectively, on our custom benchmark suite.
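As a sketch of how such a comparison can be set up, the following assumes the TVM Python API and a placeholder ONNX model file; the opt_level of the PassContext controls which graph-level passes (e.g., fusion, constant folding) are applied.

```python
import onnx
import tvm
from tvm import relay
from tvm.contrib import graph_executor

onnx_model = onnx.load("model.onnx")               # placeholder model file
mod, params = relay.frontend.from_onnx(onnx_model)

for opt_level in (0, 3):   # 0 = minimal passes; 3 adds fusion, folding, layout
    with tvm.transform.PassContext(opt_level=opt_level):
        lib = relay.build(mod, target="llvm", params=params)
    rt = graph_executor.GraphModule(lib["default"](tvm.cpu()))
    print(opt_level, rt.benchmark(tvm.cpu(), number=10))  # compare latencies
```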
Zhaohui Wang
Enhancing Security and Privacy of IoT Systems: Uncovering and Resolving Cross-App Threats
When & Where:
Nichols Hall, Room 250 (Gemini Room)
Committee Members:
Fengjun Li, Chair
Alex Bardas
Drew Davidson
Bo Luo
Haiyang Chao
Abstract
The rapid growth of Internet of Things (IoT) technology has brought unprecedented convenience to our daily lives, enabling users to customize automation rules and develop IoT apps to meet their specific needs. However, as IoT devices interact with multiple apps across various platforms, users are exposed to complex security and privacy risks. Even interactions among seemingly harmless apps can introduce unforeseen security and privacy threats.
In this work, we introduce two innovative approaches to uncover and address these concealed threats in IoT environments. The first approach investigates hidden cross-app privacy leakage risks in IoT apps. These risks arise from cross-app chains that are formed among multiple seemingly benign IoT apps. Our analysis reveals that interactions between apps can expose sensitive information such as user identity, location, tracking data, and activity patterns. We quantify these privacy leaks by assigning probability scores to evaluate the risks based on inferences. Additionally, we provide a fine-grained categorization of privacy threats to generate detailed alerts, enabling users to better understand and address specific privacy risks. To systematically detect cross-app interference threats, we propose to apply principles of logical fallacies to formalize conflicts in rule interactions. We identify and categorize cross-app interference by examining relations between events in IoT apps. We define new risk metrics for evaluating the severity of these interferences and use optimization techniques to resolve interference threats efficiently. This approach ensures comprehensive coverage of cross-app interference, offering a systematic solution compared to the ad hoc methods used in previous research.
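As a toy illustration of the kind of conflict this formalization targets (not the actual detection algorithm), two automation rules interfere when they drive the same device attribute to contradictory states; all rule names below are hypothetical.

```python
# Each rule: (trigger_event, device, attribute, target_value)
rules = [
    ("smoke_detected", "window", "state", "open"),     # safety app
    ("temperature_low", "window", "state", "closed"),  # climate app
]

def conflicting(r1, r2):
    # Same device attribute, different target values: the rules can race.
    return r1[1:3] == r2[1:3] and r1[3] != r2[3]

for i, r1 in enumerate(rules):
    for r2 in rules[i + 1:]:
        if conflicting(r1, r2):
            print("potential cross-app interference:", r1, "vs", r2)
```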
To enhance forensic capabilities within IoT, we integrate blockchain technology to create a secure, immutable framework for digital forensics. This framework enables the identification, tracing, storage, and analysis of forensic information to detect anomalous behavior. Furthermore, we developed a large-scale, manually verified, comprehensive dataset of real-world IoT apps. This clean and diverse benchmark dataset supports the development and validation of IoT security and privacy solutions. Each of these approaches has been evaluated using our dataset of real-world apps, collectively offering valuable insights and tools for enhancing IoT security and privacy against cross-app threats.
Hao Xuan
A Unified Algorithmic Framework for Biological Sequence Alignment
When & Where:
Nichols Hall, Room 250 (Gemini Room)
Committee Members:
Cuncong Zhong, Chair
Fengjun Li
Suzanne Shontz
Hongyang Sun
Liang Xu
Abstract
Sequence alignment is pivotal in both homology searches and the mapping of reads from next-generation sequencing (NGS) and third-generation sequencing (TGS) technologies. Currently, the majority of sequence alignment algorithms utilize the “seed-and-extend” paradigm, designed to filter out unrelated or nonhomologous sequences when no highly similar subregions are detected. A well-known implementation of this paradigm is BLAST, one of the most widely used multipurpose aligners. Over time, this paradigm has been optimized in various ways to suit different alignment tasks. However, while these specialized aligners often deliver high performance and efficiency, they are typically restricted to one or a few alignment applications. To the best of our knowledge, no existing aligner can perform all alignment tasks while maintaining superior performance and efficiency.
In this work, we introduce a unified sequence alignment framework to address this limitation. Our alignment framework is built on the seed-and-extend paradigm but incorporates novel designs in its seeding and indexing components to maximize both flexibility and efficiency. The resulting software, the Versatile Alignment Toolkit (VAT), allows users to switch seamlessly between nearly all major alignment tasks through command-line parameter configuration. VAT was rigorously benchmarked against leading aligners for DNA and protein homolog searches, NGS and TGS read mapping, and whole-genome alignment. The results demonstrated VAT’s top-tier performance across all benchmarks, underscoring the feasibility of using a unified algorithmic framework to handle diverse alignment tasks. VAT can simplify and standardize bioinformatic analysis workflows that involve multiple alignment tasks.
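For readers unfamiliar with the paradigm, a toy seed-and-extend sketch is given below; production aligners such as BLAST or VAT use far more sophisticated indexing, seeding, and gapped extension, so this only illustrates the filter-then-extend idea.

```python
def align(query, target, k=6, min_hits=1):
    index = {}
    for i in range(len(target) - k + 1):            # index all k-mers of target
        index.setdefault(target[i:i + k], []).append(i)
    hits = [(qi, ti)
            for qi in range(len(query) - k + 1)
            for ti in index.get(query[qi:qi + k], [])]
    if len(hits) < min_hits:                        # filter: no seed, no homology
        return None
    qi, ti = hits[0]                                # extend first seed, ungapped
    left = 0
    while qi - left > 0 and ti - left > 0 and query[qi-left-1] == target[ti-left-1]:
        left += 1
    right = k
    while qi + right < len(query) and ti + right < len(target) \
            and query[qi + right] == target[ti + right]:
        right += 1
    return query[qi - left: qi + right]

print(align("GGACGTACGTTT", "AAACGTACGTCC"))        # -> "ACGTACGT"
```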
Manu Chaudhary
Utilizing Quantum Computing for Solving Multidimensional Partial Differential Equations
When & Where:
Eaton Hall, Room 2001B
Committee Members:
Esam El-Araby, Chair
Perry Alexander
Tamzidul Hoque
Prasad Kulkarni
Tyrone Duncan
Abstract
Quantum computing has the potential to revolutionize computational problem-solving by leveraging the quantum mechanical phenomena of superposition and entanglement, which allow a large amount of information to be processed simultaneously. This capability is significant for the numerical solution of complex and/or multidimensional partial differential equations (PDEs), which are fundamental to modeling various physical phenomena. Many quantum techniques are currently available for solving PDEs, mainly based on variational quantum circuits. However, existing quantum PDE solvers, particularly those based on variational quantum eigensolver (VQE) techniques, suffer from several limitations: low accuracy, high execution times, and poor scalability on quantum simulators as well as on noisy intermediate-scale quantum (NISQ) devices, especially for multidimensional PDEs.
In this work, we propose an efficient and scalable algorithm for solving multidimensional PDEs. We present two variants of our algorithm: the first leverages finite-difference method (FDM), classical-to-quantum (C2Q) encoding, and numerical instantiation, while the second employs FDM, C2Q, and column-by-column decomposition (CCD). Both variants are designed to enhance accuracy and scalability while reducing execution times. We have validated and evaluated our algorithm using the multidimensional Poisson equation as a case study. Our results demonstrate higher accuracy, higher scalability, and faster execution times compared to VQE-based solvers on noise-free and noisy quantum simulators from IBM. Additionally, we validated our approach on hardware emulators and actual quantum hardware, employing noise mitigation techniques. We will also focus on extending these techniques to PDEs relevant to computational fluid dynamics and financial modeling, further bridging the gap between theoretical quantum algorithms and practical applications.
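As background for the FDM component, the standard discretization of the one-dimensional Poisson equation u''(x) = f(x) on a uniform grid (a textbook baseline, not the dissertation's quantum formulation) yields the linear system that the C2Q step would then encode:

```latex
% Second-order centered finite-difference discretization of u''(x) = f(x)
% on a uniform grid x_i = i*h; multidimensional problems use the analogous
% 5-point (2D) and 7-point (3D) stencils.
\[
\frac{u_{i-1} - 2u_i + u_{i+1}}{h^2} = f_i , \qquad i = 1, \dots, N-1 ,
\]
% which assembles into the tridiagonal linear system A u = h^2 f:
\[
A =
\begin{pmatrix}
-2 &  1 &        &    \\
 1 & -2 & \ddots &    \\
   & \ddots & \ddots & 1 \\
   &        &  1     & -2
\end{pmatrix},
\qquad
A \mathbf{u} = h^{2} \mathbf{f} .
\]
```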
Venkata Sai Krishna Chaitanya Addepalli
A Comprehensive Approach to Facial Emotion Recognition: Integrating Established Techniques with a Tailored Model
When & Where:
Eaton Hall, Room 2001B
Committee Members:
David Johnson, Chair
Prasad Kulkarni
Hongyang Sun
Abstract
Facial emotion recognition has become a pivotal application of machine learning, enabling advancements in human-computer interaction, behavioral analysis, and mental health monitoring. Despite its potential, challenges such as data imbalance, variation in expressions, and noisy datasets often hinder accurate prediction.
This project presents a novel approach to facial emotion recognition by integrating established techniques like data augmentation and regularization with a tailored convolutional neural network (CNN) architecture. Using the FER2013 dataset, the study explores the impact of incremental architectural improvements, optimized hyperparameters, and dropout layers to enhance model performance.
The proposed model effectively addresses issues related to data imbalance and overfitting while achieving enhanced accuracy and precision in emotion classification. The study underscores the importance of feature extraction through convolutional layers and optimized fully connected networks for efficient emotion recognition. The results demonstrate improvements in generalization, setting a foundation for future real-time applications in diverse fields.
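A minimal sketch of a tailored CNN in this spirit is shown below, assuming PyTorch and 48x48 grayscale FER2013 inputs with seven emotion classes; the layer sizes and dropout rates are illustrative, not the thesis architecture.

```python
import torch
import torch.nn as nn

class EmotionCNN(nn.Module):
    def __init__(self, num_classes=7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(0.5),                   # regularization against overfitting
            nn.Linear(128 * 6 * 6, 256), nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes),
        )
    def forward(self, x):                      # x: (batch, 1, 48, 48) grayscale
        return self.classifier(self.features(x))

logits = EmotionCNN()(torch.randn(8, 1, 48, 48))
print(logits.shape)                            # torch.Size([8, 7])
```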
Ye Wang
Deceptive Signals: Unveiling and Countering Sensor Spoofing Attacks on Cyber Systems
When & Where:
Nichols Hall, Room 250 (Gemini Room)
Committee Members:
Fengjun Li, Chair
Drew Davidson
Rongqing Hui
Bo Luo
Haiyang Chao
Abstract
In modern computer systems, sensors play a critical role in enabling a wide range of functionalities, from navigation in autonomous vehicles to environmental monitoring in smart homes. Acting as an interface between physical and digital worlds, sensors collect data to drive automated functionalities and decision-making. However, this reliance on sensor data introduces significant potential vulnerabilities, leading to various physical, sensor-enabled attacks such as spoofing, tampering, and signal injection. Sensor spoofing attacks, where adversaries manipulate sensor input or inject false data into target systems, pose serious risks to system security and privacy.
In this work, we have developed two novel sensor spoofing attack methods that significantly enhance both efficacy and practicality. The first method employs physical signals that are imperceptible to humans but detectable by sensors. Specifically, we target deep learning-based facial recognition systems using infrared lasers. By leveraging advanced laser modeling, simulation-guided targeting, and real-time physical adjustments, our infrared laser-based physical adversarial attack achieves high success rates with practical real-time guarantees, surpassing the limitations of prior physical perturbation attacks. The second method embeds physical signals, which are inherently present in the system, into legitimate patterns. In particular, we integrate trigger signals into standard operational patterns of actuators on mobile devices to construct remote logic bombs, which are shown to be able to evade all existing detection mechanisms. Achieving a zero false-trigger rate with high success rates, this novel sensor bomb is highly effective and stealthy.
Our study on emerging sensor-based threats highlights the urgent need for comprehensive defenses against sensor spoofing. Along this direction, we design and investigate two defense strategies to mitigate these threats. The first strategy involves filtering out physical signals identified as potential attack vectors. The second strategy is to leverage beneficial physical signals to obfuscate malicious patterns and reinforce data integrity. For example, side channels targeting the same sensor can be used to introduce cover signals that prevent information leakage, while environment-based physical signals serve as signatures to authenticate data. Together, these strategies form a comprehensive defense framework that filters harmful sensor signals and utilizes beneficial ones, significantly enhancing the overall security of cyber systems.
SM Ishraq-Ul Islam
Quantum Circuit Synthesis using Genetic Algorithms Combined with Fuzzy Logic
When & Where:
LEEP2, Room 1420
Committee Members:
Esam El-Araby, Chair
Tamzidul Hoque
Prasad Kulkarni
Abstract
Quantum computing emerges as a promising direction for high-performance computing in the post-Moore era. Leveraging quantum mechanical properties, quantum devices can theoretically provide significant speedup over classical computers in certain problem domains. Quantum algorithms are typically expressed as quantum circuits composed of quantum gates, or as unitary matrices. Execution of quantum algorithms on physical devices requires translation to machine-compatible circuits -- a process referred to as quantum compilation or synthesis.
Quantum synthesis is a challenging problem. Physical quantum devices support a limited number of native basis gates, requiring synthesized circuits to be composed of only these gates. Moreover, quantum devices typically have specific qubit topologies, which constrain how and where gates can be applied. Consequently, logical qubits in input circuits and unitaries may need to be mapped to and routed between physical qubits on the device.
Current Noisy Intermediate-Scale Quantum (NISQ) devices present additional constraints through their gate errors and high susceptibility to noise. NISQ devices are vulnerable to errors during gate application, and their short decoherence times lead to qubits rapidly succumbing to accumulated noise, possibly corrupting computations. Therefore, circuits synthesized for NISQ devices need to have a low number of gates to reduce gate errors, and short execution times to avoid qubit decoherence.
The problem of synthesizing device-compatible quantum circuits, while optimizing for low gate count and short execution times, can be shown to be computationally intractable using analytical methods. Therefore, interest has grown towards heuristics-based compilation techniques, which are able to produce approximations of the desired algorithm to a required degree of precision. In this work, we investigate using Genetic Algorithms (GAs) -- a proven gradient-free optimization technique based on natural selection -- for circuit synthesis. In particular, we formulate the quantum synthesis problem as a multi-objective optimization (MOO) problem, with the objectives of minimizing the approximation error, number of multi-qubit gates, and circuit depth. We also employ fuzzy logic for runtime parameter adaptation of the GA to enhance the search efficiency and solution quality of our proposed quantum synthesis method.
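A schematic of such a multi-objective fitness and a fuzzy-style parameter rule is sketched below; the error metric is a standard phase-invariant trace distance, while the weights and thresholds are illustrative placeholders rather than the method's actual scheme.

```python
import numpy as np

def fitness(candidate_unitary, target_unitary, n_2q_gates, depth,
            w=(1.0, 0.05, 0.02)):
    d = target_unitary.shape[0]
    # Phase-invariant approximation error: 1 - |Tr(U_t^dagger U_c)| / d
    err = 1 - abs(np.trace(target_unitary.conj().T @ candidate_unitary)) / d
    return w[0] * err + w[1] * n_2q_gates + w[2] * depth   # minimized by the GA

def fuzzy_mutation_rate(diversity, base=0.05):
    # Toy fuzzy-style rule: low population diversity -> raise mutation rate.
    if diversity < 0.1:
        return base * 4      # "low" diversity
    if diversity < 0.3:
        return base * 2      # "medium" diversity
    return base              # "high" diversity
```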
Sravan Reddy Chintareddy
Combating Spectrum Crunch with Efficient Machine-Learning Based Spectrum Access and Harnessing High-frequency Bands for Next-G Wireless Networks
When & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Morteza Hashemi, Chair
Victor Frost
Erik Perrins
Dongjie Wang
Shawn Keshmiri
Abstract
The number of wireless devices, already over 14 billion, is expected to grow to 40 billion by 2030. In addition, we are witnessing an unprecedented proliferation of applications and technologies with wireless connectivity requirements, such as unmanned aerial vehicles, connected health, and radars for autonomous vehicles. The advent of new wireless technologies and devices will only worsen the spectrum crunch that service providers and wireless operators are already experiencing. In this PhD study, we address these challenges through the following research thrusts, in which we consider two emerging applications aimed at advancing spectrum efficiency and high-frequency connectivity solutions.
First, we focus on effectively utilizing the existing spectrum resources for emerging applications such as networked UAVs operating within the Unmanned Traffic Management (UTM) system. In this thrust, we develop a coexistence framework for UAVs to share spectrum with traditional cellular networks by using machine learning (ML) techniques so that networked UAVs act as secondary users without interfering with primary users. We propose federated learning (FL) and reinforcement learning (RL) solutions to establish a collaborative spectrum sensing and dynamic spectrum allocation framework for networked UAVs. In the second part, we explore the potential of millimeter-wave (mmWave) and terahertz (THz) frequency bands for high-speed data transmission in urban settings. Specifically, we investigate THz-based midhaul links for 5G networks, where a network's central units (CUs) connect to distributed units (DUs). Through numerical analysis, we assess the feasibility of using 140 GHz links and demonstrate the merits of high-frequency bands to support high data rates in midhaul networks for future urban communications infrastructure. Overall, this research is aimed at establishing frameworks and methodologies that contribute toward the sustainable growth and evolution of wireless connectivity.
Arnab Mukherjee
Attention-Based Solutions for Occlusion Challenges in Person Tracking
When & Where:
Eaton Hall, Room 2001B
Committee Members:
Prasad Kulkarni, Chair
Sumaiya Shomaji
Hongyang Sun
Jian Li
Abstract
Person tracking and association is a complex task in computer vision applications. Even with a powerful detector, a highly accurate association algorithm is necessary to match and track the correct person across all frames. This method has numerous applications in surveillance, and its complexity increases with the number of detected objects and their movements across frames. A significant challenge in person tracking is occlusion, which occurs when an individual being tracked is partially or fully blocked by another object or person. This can make it difficult for the tracking system to maintain the identity of the individual and track them effectively.
In this research, we propose a solution to the occlusion problem by utilizing an occlusion-aware spatial attention transformer. We have divided the entire tracking association process into two scenarios: occlusion and no-occlusion. When a detected person with a specific ID suddenly disappears from a frame for a certain period, we employ advanced methods such as Detector Integration and Pose Estimation to ensure the correct association. Additionally, we implement a spatial attention transformer to differentiate these occluded detections, transform them, and then match them with the correct individual using the Cosine Similarity Metric.
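The re-association step can be sketched as follows, assuming embeddings produced by the spatial attention transformer and an illustrative similarity threshold:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def reassociate(query_emb, track_embs, threshold=0.6):
    # track_embs: {track_id: stored embedding for that identity}
    best_id, best_sim = None, threshold
    for tid, emb in track_embs.items():
        sim = cosine(query_emb, emb)
        if sim > best_sim:
            best_id, best_sim = tid, sim
    return best_id   # None -> treat the detection as a new identity
```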
The features extracted from the attention transformer provide a robust baseline for detecting people, enhancing the algorithm's adaptability and addressing key challenges associated with existing approaches. This improved method reduces the number of misidentifications and instances of ID switching while also enhancing tracking accuracy and precision.
Agraj Magotra
Data-Driven Insights into Sustainability: An Artificial Intelligence (AI) Powered Analysis of ESG Practices in the Textile and Apparel Industry
When & Where:
Eaton Hall, Room 2001B
Committee Members:
Sumaiya Shomaji, Chair
Prasad Kulkarni
Zijun Yao
Abstract
The global textile and apparel (T&A) industry is under growing scrutiny for its substantial environmental and social impact, producing 92 million tons of waste annually and contributing to 20% of global water pollution. In Bangladesh, one of the world's largest apparel exporters, the integration of Environmental, Social, and Governance (ESG) practices is critical to meet international sustainability standards and maintain global competitiveness. This master's study leverages Artificial Intelligence (AI) and Machine Learning (ML) methodologies to comprehensively analyze unstructured corporate data related to ESG practices among LEED-certified Bangladeshi T&A factories.
Our study employs advanced techniques, including Web Scraping, Natural Language Processing (NLP), and Topic Modeling, to extract and analyze sustainability-related information from factory websites. We develop a robust ML framework that utilizes Non-Negative Matrix Factorization (NMF) for topic extraction and a Random Forest classifier for ESG category prediction, achieving an 86% classification accuracy. The study uncovers four key ESG themes: Environmental Sustainability, Social: Workplace Safety and Compliance, Social: Education and Community Programs, and Governance. The analysis reveals that 46% of factories prioritize environmental initiatives, such as energy conservation and waste management, while 44% emphasize social aspects, including workplace safety and education. Governance practices are significantly underrepresented, with only 10% of companies addressing ethical governance, healthcare provisions, and employee welfare.
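A condensed sketch of this pipeline using scikit-learn is shown below; load_factory_texts() is a hypothetical loader standing in for the scraped corpus and ESG labels, and the hyperparameters are illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

docs, labels = load_factory_texts()           # hypothetical loader
X = TfidfVectorizer(stop_words="english", max_features=5000).fit_transform(docs)
topics = NMF(n_components=4, random_state=0).fit_transform(X)  # 4 ESG themes

X_tr, X_te, y_tr, y_te = train_test_split(topics, labels, test_size=0.2,
                                          random_state=0)
clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print(accuracy_score(y_te, clf.predict(X_te)))  # study reports ~86%
```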
To deepen our understanding of the ESG themes, we conducted a Centrality Analysis to identify the most influential keywords within each category, using measures such as degree, closeness, and eigenvector centrality. Furthermore, our analysis reveals that higher certification levels, like Platinum, are associated with a more balanced emphasis on environmental, social, and governance practices, while lower levels focus primarily on environmental efforts. These insights highlight key areas where the industry can improve and inform targeted strategies for enhancing ESG practices. Overall, this ML framework provides a data-driven, scalable approach for analyzing unstructured corporate data and promoting sustainability in Bangladesh’s T&A sector, offering actionable recommendations for industry stakeholders, policymakers, and global brands committed to responsible sourcing.
Samyoga Bhattarai
Pro-ID: A Secure Face Recognition System using Locality Sensitive Hashing to Protect Human ID
When & Where:
Eaton Hall, Room 2001B
Committee Members:
Sumaiya Shomaji, Chair
Tamzidul Hoque
Hongyang Sun
Abstract
Face recognition systems are widely used in various applications, from mobile banking apps to personal smartphones. However, these systems often store biometric templates in raw form, posing significant security and privacy risks. Pro-ID addresses this vulnerability by incorporating SimHash, a Locality Sensitive Hashing (LSH) algorithm, to create secure and irreversible hash codes of facial feature vectors. Unlike traditional methods that leave raw data exposed to potential breaches, SimHash transforms the feature space into high-dimensional hash codes, safeguarding user identity while preserving system functionality.
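A minimal SimHash sketch over a real-valued face embedding is given below: fixed random hyperplanes turn the feature vector into sign bits, so similar faces agree on most bits while the original features cannot be recovered from the hash. The dimensions and bit lengths are illustrative, not the system's parameters.

```python
import numpy as np

DIM, N_BITS = 512, 256
rng = np.random.default_rng(42)
PLANES = rng.standard_normal((N_BITS, DIM))    # fixed random hyperplanes

def simhash(feature_vec):
    # Keep only the sign of each projection: an irreversible bit vector.
    return (PLANES @ feature_vec >= 0).astype(np.uint8)

def hamming_similarity(h1, h2):
    return 1.0 - np.count_nonzero(h1 != h2) / h1.size

emb = np.random.randn(DIM)                     # e.g., a 512-d face embedding
noisy = emb + 0.05 * np.random.randn(DIM)      # same face, slight variation
print(hamming_similarity(simhash(emb), simhash(noisy)))  # close to 1.0
```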
The proposed system strikes a balance between security and performance. Additionally, it is designed to resist common attacks, including brute force and template inversion, ensuring that even if the hashed templates are exposed, the original biometric data cannot be reconstructed.
A key challenge addressed in this project is minimizing the trade-off between security and performance. Extensive evaluations demonstrate that the proposed method maintains competitive accuracy rates comparable to traditional face recognition systems while significantly enhancing security metrics such as irreversibility, unlinkability, and revocability. This innovative approach contributes to advancing the reliability and trustworthiness of biometric systems, providing a secure framework for applications in face recognition systems.
Shalmoli Ghosh
High-Power Fabry-Perot Quantum-Well Laser Diodes for Application in Multi-Channel Coherent Optical Communication Systems
When & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Rongqing Hui, Chair
Shannon Blunt
Jim Stiles
Abstract
Wavelength Division Multiplexing (WDM) is essential for managing rapid network traffic growth in fiber optic systems. Each WDM channel demands a narrow-linewidth, frequency-stabilized laser diode, leading to complexity and increased energy consumption. Multi-wavelength laser sources, generating optical frequency combs (OFC), offer an attractive solution, enabling a single laser diode to provide numerous equally spaced spectral lines for enhanced bandwidth efficiency.
Quantum-dot and quantum-dash OFCs provide phase-synchronized lines with low relative intensity noise (RIN), while Quantum Well (QW) OFCs offer higher power efficiency but higher RIN in the low-frequency region up to 2 GHz. In both quantum-dot/dash and QW-based OFCs, however, the individual spectral lines exhibit high phase noise, limiting coherent detection. Output power levels of these OFCs range between 1-20 mW, with the power of each spectral line typically below -5 dBm, so these OFCs require excessive optical amplification. Each spectral line also has a relatively broad linewidth, due to the inverse relationship between optical power and linewidth given by the Schawlow-Townes formula. These constraints hamper their applicability in coherent detection systems, highlighting a challenge for achieving high-performance optical communication.
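For reference, the Schawlow-Townes relation behind this power-linewidth trade-off, in its standard textbook form, is:

```latex
% Schawlow-Townes linewidth: the fundamental linewidth of a laser line is
% inversely proportional to its output power (h\nu: photon energy,
% \Delta\nu_c: cold-cavity linewidth, P_out: output power).
\[
\Delta\nu_{\mathrm{ST}} \approx \frac{\pi \, h\nu \, (\Delta\nu_c)^2}{P_{\mathrm{out}}}
\;\propto\; \frac{1}{P_{\mathrm{out}}} .
\]
```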
In this work, coherent system application of a single-section Quantum-Well Fabry-Perot (FP) laser diode is demonstrated. This laser delivers over 120 mW optical power at the fiber pigtail with a mode spacing of 36.14 GHz. In an experimental setup, 20 spectral lines from a single laser transmitter carry 30 GBaud 16-QAM signals over 78.3 km single-mode fiber, achieving significant data transmission rates. With the potential to support a transmission capacity of 2.15 Tb/s (4.3 Tb/s for dual polarization) per transmitter, including Forward Error Correction (FEC) and maintenance overhead, it offers a promising solution for meeting the escalating demands of modern network traffic efficiently.
Anissa Khan
Privacy Preserving Biometric Matching
When & Where:
Eaton Hall, Room 2001B
Committee Members:
Perry Alexander, Chair
Prasad Kulkarni
Fengjun Li
Abstract
Biometric matching is a process by which distinct features are used to identify an individual. Doing so privately is important because biometric data, such as fingerprints or facial features, is not something that can be easily changed or updated if put at risk. In this study, we perform a piece of the biometric matching process in a privacy preserving manner by using secure multiparty computation (SMPC). Using SMPC allows the identifying biological data, called a template, to remain stored by the data owner during the matching process. This provides security guarantees to the biological data while it is in use and therefore reduces the chances the data is stolen. In this study, we find that performing biometric matching using SMPC is just as accurate as performing the same match in plaintext.
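As a toy illustration of the secret-sharing idea underlying SMPC (not the study's actual protocol, which also requires secure multiplication, e.g., via Beaver triples), additive shares let a value be held jointly without either party seeing it:

```python
import random

P = 2**61 - 1   # prime modulus for the secret-sharing field

def share(x):
    # Split x into two additive shares; each share alone reveals nothing.
    r = random.randrange(P)
    return r, (x - r) % P

template_feature = 1234        # stand-in for one template value
s1, s2 = share(template_feature)

# Neither s1 nor s2 leaks the value; only their sum reconstructs it.
assert (s1 + s2) % P == template_feature
```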
Bryan Richlinski
Prioritize Program Diversity: Enumerative Synthesis with Entropy Ordering
When & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Sankha Guria, Chair
Perry Alexander
Drew Davidson
Jennifer Lohoefener
Abstract
Program synthesis is a popular way to create a correct-by-construction program from a user-provided specification. Term enumeration is a leading technique to systematically explore the space of programs by generating terms from a formal grammar. These terms are treated as candidate programs which are tested/verified against the specification for correctness. In order to prioritize candidates more likely to satisfy the specification, enumeration is often ordered by program size or other domain-specific heuristics. However, domain-specific heuristics require expert knowledge, and enumeration by size often leads to terms comprised of frequently repeating symbols that are less likely to satisfy a specification. In this thesis, we build a heuristic that prioritizes term enumeration based on variability of individual symbols in the program, i.e., information entropy of the program. We use this heuristic to order programs in both top-down and bottom-up enumeration. We evaluated our work on a subset of the PBE-String track of the 2017 SyGuS competition benchmarks and compared against size-based enumeration. In top-down enumeration, our entropy heuristic shortens runtime in ~56% of cases and tests fewer programs in ~80% before finding a valid solution. For bottom-up enumeration, our entropy heuristic improves the number of enumerated programs in ~30% of cases before finding a valid solution, without improving the runtime. Our findings suggest that using entropy to prioritize program enumeration is a promising step forward for faster program synthesis.
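The entropy heuristic can be sketched as follows: score a candidate term by the Shannon entropy of its symbol distribution, so that, among terms of equal size, those with more varied symbols are enumerated first (the symbol names below are illustrative).

```python
from collections import Counter
from math import log2

def term_entropy(term_symbols):
    # Shannon entropy of the symbol frequency distribution of a term.
    counts = Counter(term_symbols)
    n = len(term_symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# Same size, different variability: the varied term is tried first.
print(term_entropy(["concat", "x", "x"]))        # ~0.918 (repetitive)
print(term_entropy(["concat", "x", "substr"]))   # ~1.585 (varied)
```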
Elizabeth Wyss
A New Frontier for Software Security: Diving Deep into npm
When & Where:
Eaton Hall, Room 2001B
Committee Members:
Drew Davidson, Chair
Alex Bardas
Fengjun Li
Bo Luo
J. Walker
Abstract
Open-source package managers (e.g., npm for Node.js) have become an established component of modern software development. Rather than creating applications from scratch, developers may employ modular software dependencies and frameworks--called packages--to serve as building blocks for writing larger applications. Package managers make this process easy. With a simple command line directive, developers are able to quickly fetch and install packages across vast open-source repositories. npm--the largest of such repositories--alone hosts millions of unique packages and serves billions of package downloads each week.
However, the widespread code sharing resulting from open-source package managers also presents novel security implications. Vulnerable or malicious code hiding deep within package dependency trees can be leveraged downstream to attack both software developers and the users of their applications. This downstream flow of software dependencies--dubbed the software supply chain--is critical to secure.
This research provides a deep dive into the npm-centric software supply chain, exploring various facets and phenomena that impact the security of this software supply chain. Such factors include (i) hidden code clones--which obscure provenance and can stealthily propagate known vulnerabilities, (ii) install-time attacks enabled by unmediated installation scripts, (iii) hard-coded URLs residing in package code, (iv) the impacts of open-source development practices, and (v) package compromise via malicious updates. For each facet, tooling is presented to identify and/or mitigate potential security impacts. Ultimately, it is our hope that this research fosters greater awareness, deeper understanding, and further efforts to forge a new frontier for the security of modern software supply chains.
Jagadeesh Sai Dokku
Intelligent Chat Bot for KU Website: Automated Query Response and Resource Navigation
When & Where:
Eaton Hall, Room 2001B
Committee Members:
David Johnson, Chair
Prasad Kulkarni
Hongyang Sun
Abstract
This project introduces an intelligent chatbot designed to improve user experience on our university website by providing instant, automated responses to common inquiries. Navigating a university website can be challenging for students, applicants, and visitors who seek quick information about admissions, campus services, events, and more. To address this challenge, we developed a chatbot that simulates human conversation using Natural Language Processing (NLP), allowing users to find information more efficiently. The chatbot is powered by a Bidirectional Long Short-Term Memory (BiLSTM) model, an architecture well-suited for understanding complex sentence structures. This model captures contextual information from both directions in a sentence, enabling it to identify user intent with high accuracy. We trained the chatbot on a dataset of intent-labeled queries, enabling it to recognize specific intentions such as asking about campus facilities, academic programs, or event schedules. The NLP pipeline includes steps like tokenization, lemmatization, and vectorization. Tokenization and lemmatization prepare the text by breaking it into manageable units and standardizing word forms, making it easier for the model to recognize similar word patterns. The vectorization process then translates this processed text into numerical data that the model can interpret. Flask is used to manage the backend, allowing seamless communication between the user interface and the BiLSTM model. When a user submits a query, Flask routes the input to the model, processes the prediction, and delivers the appropriate response back to the user interface. This chatbot demonstrates a successful application of NLP in creating interactive, efficient, and user-friendly solutions. By automating responses, it reduces reliance on manual support and ensures users can access relevant information at any time. This project highlights how intelligent chatbots can transform the way users interact with university websites, offering a faster and more engaging experience.
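A condensed sketch of this architecture is shown below, assuming TensorFlow/Keras and Flask; the vocabulary size, intent count, preprocessing function vectorize, and response table RESPONSES are hypothetical stand-ins for the project's trained components.

```python
import numpy as np
from flask import Flask, request, jsonify
from tensorflow import keras
from tensorflow.keras import layers

VOCAB, MAXLEN, N_INTENTS = 5000, 20, 12

model = keras.Sequential([
    layers.Embedding(VOCAB, 64),
    layers.Bidirectional(layers.LSTM(64)),   # context from both directions
    layers.Dense(64, activation="relu"),
    layers.Dense(N_INTENTS, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(...) on tokenized, lemmatized, vectorized intent-labeled queries

app = Flask(__name__)

@app.route("/chat", methods=["POST"])
def chat():
    seq = vectorize(request.json["query"])   # hypothetical preprocessing fn
    intent = int(np.argmax(model.predict(seq[None, :], verbose=0)))
    return jsonify({"intent": intent, "response": RESPONSES[intent]})
```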
Past Defense Notices
Laurynas Lialys
Engineering laser beams for particle trapping, lattice formation and microscopy
When & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Shima Fardad, Chair
Morteza Hashemi
Rongqing Hui
Alessandro Salandrino
Xinmai Yang
Abstract
Having control over the position of nano- and micro-sized objects inside a suspension is crucial in many applications, such as sorting and delivery of particles, studying cells and microorganisms, spectroscopy imaging techniques, and building microscopic-size lattices and artificial structures. This control can be achieved by judiciously engineering optical forces and light-matter interactions inside colloidal suspensions that result in optical trapping. However, current techniques require high-NA (Numerical Aperture) optics to confine and transport particles in 3D, which in turn leads to several disadvantages, such as alignment complications, lower trap stability, and undesirable thermal effects. Hence, here we study novel optical trapping methods, such as asymmetric counter-propagating beams, in which we have engineered the optical forces to overcome the aforementioned limitations. This system is significantly easier to align, as it uses much lower-NA optics, which makes for a very flexible manipulation system. The new approach allows the trapping and transportation of differently shaped objects, ranging in size from hundreds of nanometers to hundreds of micrometers, by exploiting asymmetrical optical fields with higher stability. In addition, this technique allows for significantly longer particle trapping lengths of up to a few millimeters. As a result, we can apply this method to trapping much larger particles and microorganisms that have never been trapped optically before. The larger trapping lengths also enable the creation of 3D lattices of microscopic-size particles and other artificial structures, an important application of optical trapping.
This system can be used to create a fully reconfigurable medium by optically controlling the position of selected nano- and micro-sized dielectric and metallic particles to mimic a certain medium. This “table-top” emulation can significantly simplify our studies of wave-propagation phenomena on transmitted signals in the real world.
Furthermore, an important application of an optical tweezer system is that it can be combined with a variety of spectroscopy and microscopy techniques to extract valuable, time-sensitive information from trapped entities. In this research, I plan to integrate several spectroscopy techniques into the proposed trapping method in order to achieve higher-resolution images, especially for biomaterials such as microorganisms.
Michael Cooley
Machine Learning for Naval Discharge Review
When & Where:
Eaton Hall, Room 1
Committee Members:
Prasad Kulkarni, Chair
David Johnson (Co-Chair)
Jerzy Grzymala-Busse
Abstract
This research project aims to predict the outcome of the Naval Discharge Review Board decision for an applicant based on factors in the application, using Machine Learning techniques. The study explores three popular machine learning algorithms: MLP, Adaboost, and KNN, with KNN providing the best results. The training is verified through hyperparameter optimization and cross-validation.
Additionally, the study investigates the ability of ChatGPT's API to classify the data that couldn't be classified manually. A total of over 8,000 samples were classified by ChatGPT's API, and an MLP model was trained using the same hyperparameters that were found to be optimal for the 3,000-sample manually labeled set. The model was then tested on the manual sample. The results show that the model trained on data labeled by ChatGPT performed equivalently, suggesting that ChatGPT's API is a promising tool for labeling in this domain.
Vasudha Yenuganti
RNA Structure Annotation Based on Base Pairs Using ML Based Classifiers
When & Where:
Eaton Hall, Room 2001B
Committee Members:
Cuncong Zhong, Chair
David Johnson
Prasad Kulkarni
Abstract
RNA molecules play a crucial role in the regulation of gene expression and other cellular processes. Understanding the three-dimensional structure of RNA is essential for predicting its function and interactions with other molecules. One key feature of RNA structure is the presence of base pairs, where nucleotides, i.e., adenine (A), guanine (G), cytosine (C), and uracil (U), form hydrogen bonds with each other. The limited availability of high-quality RNA structural data, combined with atomic coordinate errors in low-resolution structures, presents significant challenges for extracting important geometrical characteristics from RNA's complex three-dimensional structure, particularly in terms of base interactions.
In this study, we propose an approach for annotating base-pairing interactions in low-resolution RNA structures using machine learning (ML) based classifiers, leveraging the more precise structural information available in high-resolution homologs. We first use the DSSR tool to extract annotations of high-resolution RNA structures and the distances between atoms of interacting base pairs. The distances serve as features, and 12 standard annotations are used as labels for our ML model. We then apply different ML classifiers, including support vector machines, neural networks, and random forests, to predict RNA annotations. We evaluate the performance of these classifiers using a benchmark dataset and report their precision, recall, and F1-score. Low-resolution RNA structures are then annotated based on their sequence similarity with high-resolution structures and the corresponding predicted annotations.
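A condensed scikit-learn sketch of this classification setup is given below; load_basepair_distances() is a hypothetical loader standing in for the DSSR-derived feature extraction, and a random forest is shown as one of the evaluated classifier families.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical loader: atom-distance features -> 12 DSSR annotation labels
X, y = load_basepair_distances()
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y,
                                          random_state=0)
clf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))  # precision/recall/F1
```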
For future aspects, the presented approach can also help to explore the plausible base pair interactions to identify conserved motifs in low-resolution structures. The detected interactions along with annotations can aid in the study of RNA tertiary structures, which can lead to a better understanding of their functions in the cell.
Venkata Nadha Reddy Karasani
Implementing Web Presence For The History Of Black Writing
When & Where:
LEEP2, Room 1415
Committee Members:
Drew Davidson, Chair
Perry Alexander
Hossein Saiedian
Abstract
The Black Literature Network Project is a comprehensive initiative to disseminate literature knowledge to students, academics, and the general public. It encompasses four distinct portals, each featuring content created and curated by scholars in the field. These portals include the Novel Generator Machine, Literary Data Gallery, Multithreaded Literary Briefs, and Remarkable Receptions Podcast Series. My significant contribution to this project was creating a standalone website for the Current Archives and Collections Index that offers an easily searchable index of black-themed collections. Additionally, I was exclusively responsible for the complete development of the novel generator tool. This application provides customized book recommendations based on user preferences. As a part of the History of Black Writing (HBW) Program, I had the opportunity to customize an open-source annotation tool called Hypothesis. This customization allowed for its use on all websites related to the Black Literature Network Project by the end users. The Black Book Interactive Project (BBIP) collaborates with institutions and groups nationwide to promote access to Black-authored texts and digital publishing. Through BBIP, we plan to increase black literature’s visibility in digital humanities research.
Michael Bechtel
Shared Resource Denial-of-Service Attacks on Multicore Platforms
When & Where:
Eaton Hall, Room 2001B
Committee Members:
Heechul Yun, Chair
Mohammad Alian
Drew Davidson
Prasad Kulkarni
Shawn Keshmiri
Abstract
With the increased adoption of complex machine learning algorithms across many different fields, powerful computing platforms have become necessary to meet their computational needs. Multicore platforms are a popular choice as they provide greater computing capabilities and can still meet different size, weight, and power (SWaP) constraints. However, contention for shared hardware resources between multiple cores remains a significant challenge that can lead to interference and unpredictable timing behaviors. Furthermore, this contention can be intentionally induced by malicious actors with the specific goals of delaying safety-critical tasks and jeopardizing system safety. This is done by performing Denial-of-Service (DoS) attacks that target shared resources such that the other cores in a system are unable to access them. When done properly, these shared resource DoS attacks can significantly impact performance and threaten system stability. For example, DoS attacks can cause >300X slowdown on the popular Raspberry Pi 3 embedded platform.
Motivated by the inherent risks posed by these DoS attacks, this dissertation presents investigations and evaluations of shared resource contention on multicore platforms, and the impacts it can have on the performance of real-time tasks. We propose various DoS attacks that each target different shared resources in the memory hierarchy with the goal of causing as much slowdown as possible. We show that each attack can inflict significant temporal slowdowns to victim tasks on target platforms by exploiting different hardware and software mechanisms. We then develop and analyze techniques for providing shared resource isolation and temporal performance guarantees for safety-critical tasks running on multicore platforms. In particular, we find that bandwidth throttling mechanisms are effective solutions against most DoS attacks and can protect the performance of real-time victim tasks.
Sarah Johnson
Formal Analysis of TPM Key Certification Protocols
When & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Perry Alexander, Chair
Michael Branicky
Emily Witt
Abstract
Development and deployment of trusted systems often require definitive identification of devices. A remote entity should have confidence that a device is what it claims to be. An ideal method for fulfilling this need is through the use of secure device identifiers. A secure device identifier (DevID) is defined as an identifier that is cryptographically bound to a device. A DevID must not be transferable from one device to another, as that would allow distinct devices to be identified as the same. Since the Trusted Platform Module (TPM) is a secure Root of Trust for Storage, it provides the necessary protections for storing these identifiers. Consequently, the Trusted Computing Group (TCG) recommends the use of TPM keys for DevIDs. The TCG's specification TPM 2.0 Keys for Device Identity and Attestation describes several methods for remotely proving a key to be resident in a specific device's TPM. These methods are carefully constructed protocols intended to be performed by a trusted Certificate Authority (CA) in communication with a certificate-requesting device. DevID certificates produced by an OEM's CA at device manufacturing time may be used to provide definitive evidence to a remote entity that a key belongs to a specific device, whereas DevID certificates produced by an Owner/Administrator's CA require a chain of certificates in order to verify a chain of trust to an OEM-provided root certificate. This distinction is due to the differences in the respective protocols prescribed by the TCG's specification. We aim to abstractly model these protocols and formally verify that their resulting assurances on TPM-residency do in fact hold. We choose this goal since the TCG themselves do not provide any proofs or clear justifications for how the protocols might provide these assurances. The resulting TPM-command library and execution relation modeled in Coq may easily be expanded upon to become useful in verifying a wide range of properties regarding DevIDs and TPMs.
Andrew Cousino
Recording Remote Attestations on the Blockchain
When & Where:
Nichols Hall, Gemini Room
Committee Members:
Perry Alexander, Chair
Alex Bardas
Drew Davidson
Abstract
Remote attestation is a process of establishing trust between various systems on a network. Until now, attestations had to be performed on the fly, as the problem of caching attestations had not yet been solved. With the blockchain providing a monotonic record, this work attempts to enable attestations to be cached, paving the way for more complex attestation protocols that fit the wide variety of user needs. We also developed specifications for caching these records on the blockchain.
Ragib Shakil Rafi
Nonlinearity Assisted Mie Scattering from Nanoparticles
When & Where:
Eaton Hall, Room 2001B
Committee Members:
Alessandro Salandrino, Chair
Shima Fardad
Morteza Hashemi
Rongqing Hui
Judy Wu
Abstract
Scattering by nanoparticles is an exciting branch of physics for controlling and manipulating light. More specifically, there have been fascinating developments regarding light scattering by sub-wavelength particles, including high-index dielectric and metal particles, for applications in optical resonance phenomena, detecting the fluorescence of molecules, enhancing Raman scattering, transferring energy to higher-order modes, sensing, and photodetector technologies. The field has recently gained more attention due to near-field effects at the nanoscale and the new insights and applications achievable through space- and time-varying parametric modulation and the inclusion of nonlinear effects. When the particle size is comparable to or slightly bigger than the incident wavelength, Mie solutions to Maxwell's equations describe these electromagnetic scattering problems. The addition and excitation of nonlinear effects in these high-index sub-wavelength dielectric and plasmonic particles might improve the existing performance of the system or provide additional features directed toward unique applications. In this thesis, we study Mie scattering from dielectric and plasmonic particles in the presence of nonlinear effects. For dielectrics, we present a numerical study of the linear and nonlinear diffraction and focusing properties of dielectric metasurfaces consisting of silicon microcylinder arrays resting on a silicon substrate. Upon diffraction, such structures lead to the formation of near-field intensity profiles reminiscent of photonic nanojets, which propagate similarly. Our results indicate that the Kerr nonlinear effect enhances light concentration throughout the generated photonic jet, with an increase in intensity of about 20% compared to the linear regime for the power levels considered in this work. The transverse beamwidth remains subwavelength in all cases, and the nonlinear effect reduces the full width. In the future, we want to optimize the performance through parametric modification of the system and continue our study with plasmonic structures in time-varying scenarios. We hope that with appropriate parametric modulation, intermodal energy transfer is possible in such structures. We want to explore nonlinear excitation to transfer energy into higher-order modes by exploiting different wave-mixing interactions in time-modulated scatterers.
Anna Fritz
Negotiating Remote Attestation Protocols
When & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Perry Alexander, Chair
Alex Bardas
Drew Davidson
Fengjun Li
Emily Witt
Abstract
During remote attestation, a relying party prompts a target to perform some stateful measurement which can be appraised to determine trust in the target's system. In this current framework, requested measurement operations must be provisioned by a knowledgeable system user who may fail to consider situational demands which potentially impact the desired measurement. To solve this problem, we introduce negotiation: a framework that allows the target and relying party to mutually determine an attestation protocol that satisfies both the target's need to protect sensitive information and the relying party's desire for a comprehensive measurement. We designed and verified this negotiation procedure such that for all negotiations, we can provably produce an executable protocol that satisfies the target's privacy standards. With the remainder of this work, we aim to realize and instantiate protocol orderings ensuring negotiation produces a protocol sufficient for the relying party. All progress is towards our ultimate goal of producing a working, fully verified negotiation scheme which will be integrated into our current attestation framework for flexible, end-to-end attestations.
Paul Gomes
A framework for embedding hybrid term proximity score with standard TF-IDF to improve the performance of recipe retrieval system
When & Where:
Eaton Hall, Room 2001B
Committee Members:
Prasad Kulkarni, Chair
David Johnson
Hongyang Sun
Abstract
Information retrieval systems play an important role in the modern era, retrieving relevant information from large collections of data such as documents, webpages, and other multimedia content. Having an information retrieval system in any domain allows users to collect relevant information. Unfortunately, a modern-day recipe website presents the audience with numerous recipes in a colorful user interface but offers very little capability to search and narrow down content based on specific interests. The goal of this project is to develop a search engine for recipes using standard TF-IDF weighting and to improve the performance of the standard IR system by implementing term proximity. The term proximity calculation in this project is a hybrid approach, combining span-based and pair-based methods. The project architecture includes a crawler, a database, an API, a service responsible for TF-IDF weighting and term proximity calculation, and a web application to present the search results.
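A toy sketch of blending a pair-based proximity bonus into a TF-IDF score is shown below; it illustrates only the pair-based component of the hybrid approach, and the weighting constant alpha is an illustrative placeholder.

```python
def proximity_bonus(positions_a, positions_b):
    # Pair-based: inverse of the smallest gap between any two occurrences.
    gap = min(abs(a - b) for a in positions_a for b in positions_b)
    return 1.0 / gap if gap > 0 else 1.0

def hybrid_score(tfidf_score, term_positions, alpha=0.3):
    # term_positions: {query_term: list of word positions in the document}
    terms = list(term_positions)
    bonus = sum(proximity_bonus(term_positions[t1], term_positions[t2])
                for i, t1 in enumerate(terms) for t2 in terms[i + 1:])
    return tfidf_score + alpha * bonus

# Adjacent query terms outrank scattered occurrences at equal TF-IDF:
print(hybrid_score(2.1, {"chicken": [4], "curry": [5]}))    # 2.4
print(hybrid_score(2.1, {"chicken": [4], "curry": [40]}))   # ~2.108
```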