Defense Notices


All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check, and post the presentation announcement online.

Upcoming Defense Notices

Ye Wang

Deceptive Signals: Unveiling and Countering Sensor Spoofing Attacks on Cyber Systems

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Fengjun Li, Chair
Drew Davidson
Rongqing Hui
Bo Luo
Haiyang Chao

Abstract

In modern computer systems, sensors play a critical role in enabling a wide range of functionalities, from navigation in autonomous vehicles to environmental monitoring in smart homes. Acting as an interface between physical and digital worlds, sensors collect data to drive automated functionalities and decision-making. However, this reliance on sensor data introduces significant potential vulnerabilities, leading to various physical, sensor-enabled attacks such as spoofing, tampering, and signal injection. Sensor spoofing attacks, where adversaries manipulate sensor input or inject false data into target systems, pose serious risks to system security and privacy.

In this work, we have developed two novel sensor spoofing attack methods that significantly enhance both efficacy and practicality. The first method employs physical signals that are imperceptible to humans but detectable by sensors. Specifically, we target deep learning based facial recognition systems using infrared lasers. By leveraging advanced laser modeling, simulation-guided targeting, and real-time physical adjustments, our infrared laser-based physical adversarial attack achieves high success rates with practical real-time guarantees, surpassing the limitations of prior physical perturbation attacks. The second method embeds physical signals, which are inherently present in the system, into legitimate patterns. In particular, we integrate trigger signals into standard operational patterns of actuators on mobile devices to construct remote logic bombs, which are shown to be able to evade all existing detection mechanisms. Achieving a zero false-trigger rate with high success rates, this novel sensor bomb is highly effective and stealthy.

Our study on emerging sensor-based threats highlights the urgent need for comprehensive defenses against sensor spoofing. Along this direction, we design and investigate two defense strategies to mitigate these threats. The first strategy involves filtering out physical signals identified as potential attack vectors. The second strategy is to leverage beneficial physical signals to obfuscate malicious patterns and reinforce data integrity. For example, side channels targeting the same sensor can be used to introduce cover signals that prevent information leakage, while environment-based physical signals serve as signatures to authenticate data. Together, these strategies form a comprehensive defense framework that filters harmful sensor signals and utilizes beneficial ones, significantly enhancing the overall security of cyber systems.


Sravan Reddy Chintareddy

Combating Spectrum Crunch with Efficient Machine-Learning Based Spectrum Access and Harnessing High-frequency Bands for Next-G Wireless Networks

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Morteza Hashemi, Chair
Victor Frost
Erik Perrins
Dongjie Wang
Shawn Keshmiri

Abstract

There is an increasing trend in the number of wireless devices that is now already over 14 billion and is expected to grow to 40 billion devices by 2030. In addition, we are witnessing an unprecedented proliferation of applications and technologies with wireless connectivity requirements such as unmanned aerial vehicles, connected health, and radars for autonomous vehicles. The advent of new wireless technologies and devices will only worsen the current spectrum crunch that service providers and wireless operators are already experiencing. In this PhD study, we address these challenges through the following research thrusts, in which we consider two emerging applications aimed at advancing spectrum efficiency and high-frequency connectivity solutions.

 

First, we focus on effectively utilizing the existing spectrum resources for emerging applications such as networked UAVs operating within the Unmanned Traffic Management (UTM) system. In this thrust, we develop a coexistence framework for UAVs to share spectrum with traditional cellular networks by using machine learning (ML) techniques so that networked UAVs act as secondary users without interfering with primary users. We propose federated learning (FL) and reinforcement learning (RL) solutions to establish a collaborative spectrum sensing and dynamic spectrum allocation framework for networked UAVs. In the second part, we explore the potential of millimeter-wave (mmWave) and terahertz (THz) frequency bands for high-speed data transmission in urban settings. Specifically, we investigate THz-based midhaul links for 5G networks, where a network's central units (CUs) connect to distributed units (DUs). Through numerical analysis, we assess the feasibility of using 140 GHz links and demonstrate the merits of high-frequency bands to support high data rates in midhaul networks for future urban communications infrastructure. Overall, this research is aimed at establishing frameworks and methodologies that contribute toward the sustainable growth and evolution of wireless connectivity.


Agraj Magotra

Data-Driven Insights into Sustainability: An Artificial Intelligence (AI) Powered Analysis of ESG Practices in the Textile and Apparel Industry

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Sumaiya Shomaji, Chair
Prasad Kulkarni
Zijun Yao


Abstract

The global textile and apparel (T&A) industry is under growing scrutiny for its substantial environmental and social impact, producing 92 million tons of waste annually and contributing to 20% of global water pollution. In Bangladesh, one of the world's largest apparel exporters, the integration of Environmental, Social, and Governance (ESG) practices is critical to meet international sustainability standards and maintain global competitiveness. This master's study leverages Artificial Intelligence (AI) and Machine Learning (ML) methodologies to comprehensively analyze unstructured corporate data related to ESG practices among LEED-certified Bangladeshi T&A factories. 

Our study employs advanced techniques, including Web Scraping, Natural Language Processing (NLP), and Topic Modeling, to extract and analyze sustainability-related information from factory websites. We develop a robust ML framework that utilizes Non-Negative Matrix Factorization (NMF) for topic extraction and a Random Forest classifier for ESG category prediction, achieving an 86% classification accuracy. The study uncovers four key ESG themes: Environmental Sustainability, Social : Workplace Safety and Compliance, Social: Education and Community Programs, and Governance. The analysis reveals that 46% of factories prioritize environmental initiatives, such as energy conservation and waste management, while 44% emphasize social aspects, including workplace safety and education. Governance practices are significantly underrepresented, with only 10% of companies addressing ethical governance, healthcare provisions and employee welfare.

To deepen our understanding of the ESG themes, we conducted a Centrality Analysis to identify the most influential keywords within each category, using measures such as degree, closeness, and eigenvector centrality. Furthermore, our analysis reveals that higher certification levels, like Platinum, are associated with a more balanced emphasis on environmental, social, and governance practices, while lower levels focus primarily on environmental efforts. These insights highlight key areas where the industry can improve and inform targeted strategies for enhancing ESG practices. Overall, this ML framework provides a data-driven, scalable approach for analyzing unstructured corporate data and promoting sustainability in Bangladesh’s T&A sector, offering actionable recommendations for industry stakeholders, policymakers, and global brands committed to responsible sourcing.


Shalmoli Ghosh

High-Power Fabry-Perot Quantum-Well Laser Diodes for Application in Multi-Channel Coherent Optical Communication Systems

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Rongqing Hui , Chair
Shannon Blunt
Jim Stiles


Abstract

Wavelength Division Multiplexing (WDM) is essential for managing rapid network traffic growth in fiber optic systems. Each WDM channel demands a narrow-linewidth, frequency-stabilized laser diode, leading to complexity and increased energy consumption. Multi-wavelength laser sources, generating optical frequency combs (OFC), offer an attractive solution, enabling a single laser diode to provide numerous equally spaced spectral lines for enhanced bandwidth efficiency.

Quantum-dot and quantum-dash OFCs provide phase-synchronized lines with low relative intensity noise (RIN), while Quantum Well (QW) OFCs offer higher power efficiency, but they have higher RIN in the low frequency region of up to 2 GHz. However, both quantum-dot/dash and QW based OFCs, individual spectral lines exhibit high phase noise, limiting coherent detection. Output power levels of these OFCs range between 1-20 mW where the power of each spectral line is typically less than -5 dBm. Due to this requirement, these OFCs require excessive optical amplification, also they possess relatively broad spectral linewidths of each spectral line, due to the inverse relationship between optical power and linewidth as per the Schawlow-Townes formula. This constraint hampers their applicability in coherent detection systems, highlighting a challenge for achieving high-performance optical communication.

In this work, coherent system application of a single-section Quantum-Well Fabry-Perot (FP) laser diode is demonstrated. This laser delivers over 120 mW optical power at the fiber pigtail with a mode spacing of 36.14 GHz. In an experimental setup, 20 spectral lines from a single laser transmitter carry 30 GBaud 16-QAM signals over 78.3 km single-mode fiber, achieving significant data transmission rates. With the potential to support a transmission capacity of 2.15 Tb/s (4.3 Tb/s for dual polarization) per transmitter, including Forward Error Correction (FEC) and maintenance overhead, it offers a promising solution for meeting the escalating demands of modern network traffic efficiently.


Anissa Khan

Privacy Preserving Biometric Matching

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Perry Alexander, Chair
Prasad Kulkarni
Fengjun Li


Abstract

Biometric matching is a process by which distinct features are used to identify an individual. Doing so privately is important because biometric data, such as fingerprints or facial features, is not something that can be easily changed or updated if put at risk. In this study, we perform a piece of the biometric matching process in a privacy preserving manner by using secure multiparty computation (SMPC). Using SMPC allows the identifying biological data, called a template, to remain stored by the data owner during the matching process. This provides security guarantees to the biological data while it is in use and therefore reduces the chances the data is stolen. In this study, we find that performing biometric matching using SMPC is just as accurate as performing the same match in plaintext.

 


Bryan Richlinski

Prioritize Program Diversity: Enumerative Synthesis with Entropy Ordering

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Sankha Guria, Chair
Perry Alexander
Drew Davidson
Jennifer Lohoefener

Abstract

Program synthesis is a popular way to create a correct-by-construction program from a user-provided specification. Term enumeration is a leading technique to systematically explore the space of programs by generating terms from a formal grammar. These terms are treated as candidate programs which are tested/verified against the specification for correctness. In order to prioritize candidates more likely to satisfy the specification, enumeration is often ordered by program size or other domain-specific heuristics. However, domain-specific heuristics require expert knowledge, and enumeration by size often leads to terms comprised of frequently repeating symbols that are less likely to satisfy a specification. In this thesis, we build a heuristic that prioritizes term enumeration based on variability of individual symbols in the program, i.e., information entropy of the program. We use this heuristic to order programs in both top-down and bottom-up enumeration. We evaluated our work on a subset of the PBE-String track of the 2017 SyGuS competition benchmarks and compared against size-based enumeration. In top-down enumeration, our entropy heuristic shortens runtime in ~56% of cases and tests fewer programs in ~80% before finding a valid solution. For bottom-up enumeration, our entropy heuristic improves the number of enumerated programs in ~30% of cases before finding a valid solution, without improving the runtime. Our findings suggest that using entropy to prioritize program enumeration is a promising step forward for faster program synthesis.


Elizabeth Wyss

A New Frontier for Software Security: Diving Deep into npm

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Drew Davidson, Chair
Alex Bardas
Fengjun Li
Bo Luo
J. Walker

Abstract

Open-source package managers (e.g., npm for Node.js) have become an established component of modern software development. Rather than creating applications from scratch, developers may employ modular software dependencies and frameworks--called packages--to serve as building blocks for writing larger applications. Package managers make this process easy. With a simple command line directive, developers are able to quickly fetch and install packages across vast open-source repositories. npm--the largest of such repositories--alone hosts millions of unique packages and serves billions of package downloads each week. 

 

However, the widespread code sharing resulting from open-source package managers also presents novel security implications. Vulnerable or malicious code hiding deep within package dependency trees can be leveraged downstream to attack both software developers and the users of their applications. This downstream flow of software dependencies--dubbed the software supply chain--is critical to secure.

 

This research provides a deep dive into the npm-centric software supply chain, exploring various facets and phenomena that impact the security of this software supply chain. Such factors include (i) hidden code clones--which obscure provenance and can stealthily propagate known vulnerabilities, (ii) install-time attacks enabled by unmediated installation scripts, (iii) hard-coded URLs residing in package code, (iv) the impacts open-source development practices, and (v) package compromise via malicious updates. For each facet, tooling is presented to identify and/or mitigate potential security impacts. Ultimately, it is our hope that this research fosters greater awareness, deeper understanding, and further efforts to forge a new frontier for the security of modern software supply chains. 


Jagadeesh Sai Dokku

Intelligent Chat Bot for KU Website: Automated Query Response and Resource Navigation

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Prasad Kulkarni
Hongyang Sun


Abstract

This project introduces an intelligent chatbot designed to improve user experience on our university website by providing instant, automated responses to common inquiries. Navigating a university website can be challenging for students, applicants, and visitors who seek quick information about admissions, campus services, events, and more. To address this challenge, we developed a chatbot that simulates human conversation using Natural Language Processing (NLP), allowing users to find information more efficiently. The chatbot is powered by a Bidirectional Long Short-Term Memory (BiLSTM) model, an architecture well-suited for understanding complex sentence structures. This model captures contextual information from both directions in a sentence, enabling it to identify user intent with high accuracy. We trained the chatbot on a dataset of intent-labeled queries, enabling it to recognize specific intentions such as asking about campus facilities, academic programs, or event schedules. The NLP pipeline includes steps like tokenization, lemmatization, and vectorization. Tokenization and lemmatization prepare the text by breaking it into manageable units and standardizing word forms, making it easier for the model to recognize similar word patterns. The vectorization process then translates this processed text into numerical data that the model can interpret. Flask is used to manage the backend, allowing seamless communication between the user interface and the BiLSTM model. When a user submits a query, Flask routes the input to the model, processes the prediction, and delivers the appropriate response back to the user interface. This chatbot demonstrates a successful application of NLP in creating interactive, efficient, and user-friendly solutions. By automating responses, it reduces reliance on manual support and ensures users can access relevant information at any time. This project highlights how intelligent chatbots can transform the way users interact with university websites, offering a faster and more engaging experience.

 


Anahita Memar

Optimizing Protein Particle Classification: A Study on Smoothing Techniques and Model Performance

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Prasad Kulkarni, Chair
Hossein Saiedian
Prajna Dhar


Abstract

This thesis investigates the impact of smoothing techniques on enhancing classification accuracy in protein particle datasets, focusing on both binary and multi-class configurations across three datasets. By applying methods including Averaging-Based Smoothing, Moving Average, Exponential Smoothing, Savitzky-Golay, and Kalman Smoothing, we sought to improve performance in Random Forest, Decision Tree, and Neural Network models. Initial baseline accuracies revealed the complexity of multi-class separability, while clustering analyses provided valuable insights into class similarities and distinctions, guiding our interpretation of classification challenges.

These results indicate that Averaging-Based Smoothing and Moving Average techniques are particularly effective in enhancing classification accuracy, especially in configurations with marked differences in surfactant conditions. Feature importance analysis identified critical metrics, such as IntMean and IntMax, which played a significant role in distinguishing classes. Cross-validation validated the robustness of our models, with Random Forest and Neural Network consistently outperforming others in binary tasks and showing promising adaptability in multi-class classification. This study not only highlights the efficacy of smoothing techniques for improving classification in protein particle analysis but also offers a foundational approach for future research in biopharmaceutical data processing and analysis.


Yousif Dafalla

Web-Armour: Mitigating Reconnaissance and Vulnerability Scanning with Injecting Scan-Impeding Delays in Web Deployments

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Alex Bardas, Chair
Drew Davidson
Fengjun Li
Bo Luo
ZJ Wang

Abstract

Scanning hosts on the internet for vulnerable devices and services is a key step in numerous cyberattacks. Previous work has shown that scanning is a widespread phenomenon on the internet and commonly targets web application/server deployments. Given that automated scanning is a crucial step in many cyberattacks, it would be beneficial to make it more difficult for adversaries to perform such activity.

In this work, we propose Web-Armour, a mitigation approach to adversarial reconnaissance and vulnerability scanning of web deployments. The proposed approach relies on injecting scanning impeding delays to infrequently or rarely used portions of a web deployment. Web-Armour has two goals: First, increase the cost for attackers to perform automated reconnaissance and vulnerability scanning; Second, introduce minimal to negligible performance overhead to benign users of the deployment. We evaluate Web-Armour on live environments, operated by real users, and on different controlled (offline) scenarios. We show that Web-Armour can effectively lead to thwarting reconnaissance and internet-wide scanning.


Past Defense Notices

Dates

Durga Venkata Suraj Tedla

Block chain based inter organization file sharing system

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Drew Davidson
Sankha Guria


Abstract

A coalition of companies collaborates collectively and shares information to improve their operations together. Distributed trust and transparency cannot be obtained with centralized file-sharing platforms. File sharing may be done transparently and securely with blockchain technology. This project suggests an inter-organizational secure file-sharing system based on blockchain technology. The group can use it to securely share files in a distributed manner. The creation of smart contracts and the configuration of blockchain networks are carried out by Hyperledger Fabric, an enterprise blockchain platform. Distributed file storage is accomplished through the usage of the Inter Planetary File System (IPFS). The workflow for file-sharing and identity management procedures is provided in the paper. Using blockchain technology, the recommended approach enables a group of businesses to share files with availability, integrity, and confidentiality. The suggested method uses blockchain to enable safe file exchange amongst a group of enterprises. It offers shared file availability, confidentiality, and integrity. It guarantees complete file encryption. The blockchain provides tamper-resistant storage for the shared file's content ID. On the distributed storage and blockchain ledger, respectively, the encrypted file and file metadata are stored.


Sai Narendra Koganti

Real-time Object Detection for Safer Driving Experience in Urban Environment: Leveraging YOLO Algorithm

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Sumaiya Shomaji, Chair
David Johnson
Prasad Kulkarni


Abstract

This project offers a hands-on investigation of object identification utilizing the YOLO method, Python, and OpenCV. It begins by explaining the YOLO architecture, focusing on the single-stage detection process for bounding box prediction and class probability calculation. The setup phase includes library installation and model configuration, resulting in a smooth implementation procedure. Using OpenCV, the project includes preparatory processes required for object detection in images. The YOLO model is seamlessly integrated into the OpenCV framework, enabling object detection. Post-processing techniques, such as non-maximum suppression, are used to modify detection results and improve accuracy. Visualizations, such as bounding boxes and labels, are used to help interpret the discovered items. The project finishes by investigating potential expansions and optimizations, such as custom dataset training and deployment on edge devices, opening up new paths for further investigation and development. This project provides developers with the tools and knowledge they need to build effective object detection systems for a wide range of applications, from surveillance and security to autonomous vehicles and augmented reality.


Vijay Verma

Binary Segmentation of PCB Components Using U-Net Model

When & Where:


Zoom Defense, please email jgrisafe@ku.edu for defense link.

Committee Members:

Sumaiya Shomaji, Chair
Tamzidul Hoque
Zijun Yao


Abstract

This project explores the adaptation of the U-Net convolutional neural network, renowned for its medical image segmentation prowess, to the analysis of Printed Circuit Boards (PCBs). By utilizing the Fine-Printed Circuit Board Image Collection (FPIC) dataset, we address key challenges in PCB inspection, such as the precise segmentation of complex components, handling class imbalances, and capturing minute details. The U-Net model has been finely tuned with an encoding-decoding architecture, enhanced by convolutional layers, batch normalization, and dropout techniques to extract and reconstruct high-quality features from PCB images effectively. The Dice coefficient, used as the loss function, significantly improves boundary accuracy, and manages class diversity. Throughout extensive training and validation phases, the model has demonstrated superior performance metrics compared to traditional methods, making substantial advancements in automated PCB inspection. During the rigorous training and validation stages, the U-Net model demonstrated excellent performance metrics, eclipsing traditional inspection methods. For capacitors, the model achieved a training accuracy of 95.03% and a validation accuracy of 95.92%. For resistors, training using transfer learning techniques resulted in even more remarkable performance, with training accuracy reaching 98% and validation accuracy hitting 98.23%. These metrics highlight the model's robustness and accuracy, marking a significant advancement in automated PCB inspection and suggesting the model's potential for wider industrial applications in multiclass component segmentation within complex PCB.


Ruturaj Vaidya

Exploring binary analysis techniques for security

When & Where:


Zoom Defense, please email jgrisafe@ku.edu for defense link.

Committee Members:

Prasad Kulkarni, Chair
Alex Bardas
Drew Davidson
Esam El-Araby
Michael Vitevitch

Abstract

In this dissertation our goal is to evaluate how the loss of information at binary-level affects the performance of existing compiler-level techniques in terms of both efficiency and effectiveness. Binary analysis is difficult, as most of semantic and syntactic information available at source-level gets lost during the compilation process. If the binary is stripped and/ or optimized, then it negatively affects the efficacy of binary analysis frameworks. Moreover, handwritten assembly, obfuscation, excessive indirect calls or jumps, etc. further degrade the accuracy of binary analysis. Challenges to precise binary analysis have implications on the effectiveness, accuracy, and performance, of security and program hardening techniques implemented at the binary level. While these challenges are well-known, their respective impacts on the effectiveness and performance of program hardening techniques are less well-studied.

In this dissertation, we employ classes of defense mechanisms to protect software from the most common software attacks, like buffer overflows and control flow attacks, to determine how this loss of program information at the binary-level affects the effectiveness and performance of defense mechanisms. Additionally, we aim to tackle an important problem of type recovery from binary executables that in turn help bolster the software protection mechanisms.


Jianpeng Li

BlackLitNetwork: Advancing Black Literature Discovery Through Modern Web Technologies

When & Where:


LEEP2, Room 1420

Committee Members:

Drew Davidson, Chair
Sumaiya Shomaji
Han Wang


Abstract

Advancements in web technologies have significantly expanded access to diverse cultural narratives, yet black literature remains underrepresented in digital domains. The BlackLitNetwork addresses this oversight by harnessing Elasticsearch, MongoDB, React, Python, CSS, HTML, and Node.js, to enhance the discoverability and engagement with black novels. A major component of the platform is a novel generator built with Elasticsearch, which employs powerful full-text search capabilities, essential for users to navigate an extensive literary database effectively.

MongoDB supports the archives platform with a flexible data schema for managing varied literary content efficiently, while Python facilitates robust data cleaning and preprocessing to ensure data integrity and usability. The user interface, created using React, transforms Figma designs from our design team into a dynamic web presence, integrating HTML and CSS to ensure both aesthetic appeal and accessibility.

To further enhance security and manageability, we've implemented a Node.js backend. This layer acts as a middleware, managing and processing requests between our frontend and Elasticsearch. This not only secures our data interactions but also allows for request handling before querying Elasticsearch. This architecture ensures that BlackLitNetwork remains scalable and maintainable.

BlackLitNetwork also features specialized pages for podcasts, briefs, and interactive data visualizations, each designed to highlight historical, and contextual elements of black literature. These components aid in fostering a deeper understanding, establishing BlackLitNetwork as a tool for scholars. This project not only enriches the field of humanities but also promotes a broader understanding of the black literary heritage, making it a resource for researchers, educators, and readers keen on exploring the richness of black literature.


Aiden Liang

Enhancing Healthcare Resource Demand Forecasting Using Machine Learning: An Integrated Approach to Addressing Temporal Dynamics and External Influences

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Prasad Kulkarni, Chair
Fengjun Li
Zijun Yao


Abstract

This project aims to enhance predictive models for forecasting healthcare resource demand, particularly focusing on hospital bed occupancy and emergency room visits while considering external factors such as disease outbreaks and weather conditions. Utilizing a range of machine learning techniques, the research seeks to improve the accuracy and reliability of these forecasts, essential for optimizing healthcare resource management. The project involves multiple phases, starting with the collection and preparation of historical data from public health databases and hospital records, enriched with external variables such as weather patterns and epidemiological data. Advanced feature engineering is key, transforming raw data into a machine learning-friendly format, including temporal and lag features to identify patterns and trends. The study explores various machine learning methods, from traditional models like ARIMA to advanced techniques such as LSTM networks and GRU models, incorporating rigorous training and validation protocols to ensure robust performance. Model effectiveness is evaluated using metrics like MAE, RMSE, and MAPE, with a strong focus on model interpretability and explainability through techniques like SHAP and LIME. The project also addresses practical implementation challenges and ethical considerations, aiming to bridge academic research with practical healthcare applications. Findings are intended for dissemination through academic papers and conferences, ensuring that the models developed meet both the ethical standards and practical needs of the healthcare industry.


Thomas Atkins

Secure and Auditable Academic Collections Storage via Hyperledger Fabric-Based Smart Contracts

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Drew Davidson, Chair
Fengjun Li
Bo Luo


Abstract

This paper introduces a novel approach to manage collections of artifacts through smart contract access control, rooted in on-chain role-based property-level access control. This smart contract facilitates the lifecycle of these artifacts including allowing for the creation, modification, removal, and historical auditing of the artifacts through both direct and suggested actions. This method introduces a collection object designed to store role privileges concerning state object properties. User roles are defined within an on-chain entity that maps users' signed identities to roles across different collections, enabling a single user to assume varying roles in distinct collections. Unlike existing key-level endorsement mechanisms, this approach offers finer-grained privileges by defining them on a per-property basis, not at the key level. The outcome is a more flexible and fine-grained access control system seamlessly integrated into the smart contract itself, empowering administrators to manage access with precision and adaptability across diverse organizational contexts. This has the added benefit of allowing for the auditing of not only the history of the artifacts, but also for the permissions granted to the users.  


Theodore Harbison

Posting Passwords: How social media information can be leveraged in password guessing attacks

When & Where:


Zoom Defense, please email jgrisafe@ku.edu for defense link.

Committee Members:

Hossein Saiedian, Chair
Fengjun Li
Heechul Yun


Abstract

The explosion of social media, while fostering connection, inadvertently exposes personal details that heighten password vulnerability. This thesis tackles this critical link, aiming to raise public awareness of the dangers of weak passwords and excessive online sharing. We introduce a novel password guessing algorithm, SocGuess, which capitalizes on the rich trove of information on social media profiles. SocGuess leverages Named Entity Recognition (NER) to identify key data points within this information, such as dates, locations, and names. To further enhance its accuracy, SocGuess is trained on the rockyou dataset, a large collection of leaked passwords. By identifying different kinds of entities within these passwords, SocGuess can calculate the probability of these entities appearing in passwords. Armed with this knowledge, SocGuess dynamically generates password guesses in order of probability by filling these entity placeholders with the corresponding data points harvested from the target’s social media profiles. This targeted approach shows SocGuess to crack 33% more passwords than existing algorithms during experimentation, demonstrably surpassing traditional methods.


Ethan Grantz

Swarm: A Backend-Agnostic Language for Simple Distributed Programming

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Drew Davidson, Chair
Perry Alexander
Prasad Kulkarni


Abstract

Writing algorithms for a parallel or distributed environment has always been plagued with a variety of challenges, from supervising synchronous reads and writes, to managing job queues and avoiding deadlock. While many languages have libraries or language constructs to mitigate these obstacles, very few attempt to remove those challenges entirely, and even fewer do so while divorcing the means of handling those problems from the means of parallelization or distribution. This project introduces a language called Swarm, which attempts to do just that.

Swarm is a first-class parallel/distributed programming language with modular, swappable parallel drivers. It is intended for everything from multi-threaded local computation on a single machine to large scientific computations split across many nodes in a cluster.

Swarm contains next to no explicit syntax for typical parallel logic, only containing keywords for declaring which variables should reside in shared memory, and describing what code should be parallelized. The remainder of the logic (such as waiting for the results from distributed jobs or locking shared accesses) are added in when compiling to a custom bytecode called Swarm Virtual Instructions (SVI). SVI is then executed by a virtual machine whose parallelization logic is abstracted out, such that the same SVI bytecode can be executed in any parallel/distributed environment.


Johnson Umeike

Optimizing gem5 Simulator Performance: Profiling Insights and Userspace Networking Enhancements

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Mohammad Alian, Chair
Prasad Kulkarni
Heechul Yun


Abstract

Full-system simulation of computer systems is critical for capturing the complex interplay between various hardware and software components in future systems. Modeling the network subsystem is indispensable for the fidelity of full-system simulations due to the increasing importance of scale-out systems. Over the last decade, the network software stack has undergone major changes, with userspace networking stacks and data-plane networks rapidly replacing the conventional kernel network stack. Nevertheless, the current state-of-the-art architectural simulator, gem5, still employs kernel networking, which precludes realistic network application scenarios.

First, we perform a comprehensive profiling study to identify and propose architectural optimizations to accelerate a state-of-the-art architectural simulator. We choose gem5 as the representative architectural simulator, run several simulations with various configurations, perform a detailed architectural analysis of the gem5 source code on different server platforms, tune both system and architectural settings for running simulations, and discuss the future opportunities in accelerating gem5 as an important application. Our detailed profiling of gem5 reveals that its performance is extremely sensitive to the size of the L1 cache. Our experimental results show that a RISC-V core with 32KB data and instruction cache improves gem5’s simulation speed by 31%∼61% compared with a baseline core with 8KB L1 caches. Second, this work extends gem5’s networking capabilities by integrating kernel-bypass/user-space networking based on the DPDK framework, significantly enhancing network throughput and reducing latency. By enabling user-space networking, the simulator achieves a substantial 6.3× improvement in network bandwidth compared to traditional Linux software stacks. Our hardware packet generator model (EtherLoadGen) provides up to a 2.1× speedup in simulation time. Additionally, we develop a suite of networking micro-benchmarks for stress testing the host network stack, allowing for efficient evaluation of gem5’s performance. Through detailed experimental analysis, we characterize the performance differences when running the DPDK network stack on both real systems and gem5, highlighting the sensitivity of DPDK performance to various system and microarchitecture parameters.