Defense Notices


All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check and post the presentation announcement online.

Upcoming Defense Notices

Vinay Kumar Reddy Budideti

NutriBot: An AI-Powered Personalized Nutrition Recommendation Chatbot Using Rasa

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Victor Frost
Prasad Kulkarni


Abstract

In recent years, the intersection of Artificial Intelligence and healthcare has paved the way for intelligent dietary assistance. NutriBot is an AI-powered chatbot developed using the Rasa framework to deliver personalized nutrition recommendations based on user preferences, diet types, and nutritional goals. This full-stack system integrates Rasa NLU, a Flask backend, the Nutritionix API for real-time food data, and a React.js + Tailwind CSS frontend for seamless interaction. The system is containerized using Docker and deployable on cloud platforms like GCP.

The chatbot supports multi-turn conversations, slot-filling, and remembers user preferences such as dietary restrictions or nutrient focus (e.g., high protein). Evaluation of the system showed perfect intent and entity recognition accuracy, fast API response times, and user-friendly fallback handling. While NutriBot currently lacks persistent user profiles and multilingual support, it offers a highly accurate, scalable framework for future extensions such as fitness tracker integration, multilingual capabilities, and smart assistant deployment.


Arun Kumar Punjala

Deep Learning-Based MRI Brain Tumor Classification: Evaluating Sequential Architectures for Diagnostic Accuracy

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Prasad Kulkarni
Dongjie Wang


Abstract

Accurate classification of brain tumors from MRI scans plays a vital role in assisting clinical diagnosis and treatment planning. This project investigates and compares three deep learning-based classification approaches designed to evaluate the effectiveness of integrating recurrent layers into conventional convolutional architectures. Specifically, a CNN-LSTM model, a CNN-RNN model with GRU units, and a baseline CNN classifier using EfficientNetB0 are developed and assessed on a curated MRI dataset.

The CNN-LSTM model uses ResNet50 as a feature extractor, with spatial features reshaped and passed through stacked LSTM layers to explore sequential learning on static medical images. The CNN-RNN model implements TimeDistributed convolutional layers followed by GRUs, examining the potential benefits of GRU-based modeling. The EfficientNetB0-based CNN model, trained end-to-end without recurrent components, serves as the performance baseline.

All three models are evaluated using training accuracy, validation loss, confusion matrices, and class-wise performance metrics. Results show that the CNN-LSTM architecture provides the most balanced performance across tumor types, while the CNN-RNN model suffers from mild overfitting. The EfficientNetB0 baseline offers stable and efficient classification for general benchmarking.


Mahmudul Hasan

Assertion-Based Security Assessment of Hardware IP Protection Methods

When & Where:


Eaton Hall, Room 2001B

Committee Members:

Tamzidul Hoque, Chair
Esam El-Araby
Sumaiya Shomaji


Abstract

Combinational and sequential locking methods are promising solutions for protecting hardware intellectual property (IP) from piracy, reverse engineering, and malicious modifications by locking the functionality of the IP based on a secret key. To improve their security, researchers are developing attack methods to extract the secret key.  

While the attacks on combinational locking are mostly inapplicable for sequential designs without access to the scan chain, the limited applicable attacks are generally evaluated against the basic random insertion of key gates. On the other hand, attacks on sequential locking techniques suffer from scalability issues and evaluation of improperly locked designs. Finally, while most attacks provide an approximately correct key, they do not indicate which specific key bits are undetermined. This thesis proposes an oracle-guided attack that applies to both combinational and sequential locking without scan chain access. The attack applies light-weight design modifications that represent the oracle using a finite state machine and applies an assertion-based query of the unlocking key. We have analyzed the effectiveness of our attack against 46 sequential designs locked with various classes of combinational locking including random, strong, logic cone-based, and anti-SAT based. We further evaluated against a sequential locking technique using 46 designs with various key sequence lengths and widths. Finally, we expand our framework to identify undetermined key bits, enabling complementary attacks on the smaller remaining key space.


Masoud Ghazikor

Distributed Optimization and Control Algorithms for UAV Networks in Unlicensed Spectrum Bands

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Morteza Hashemi, Chair
Victor Frost
Prasad Kulkarni


Abstract

UAVs have emerged as a transformative technology for various applications, including emergency services, delivery, and video streaming. Among these, video streaming services in areas with limited physical infrastructure, such as disaster-affected areas, play a crucial role in public safety. UAVs can be rapidly deployed in search and rescue operations to efficiently cover large areas and provide live video feeds, enabling quick decision-making and resource allocation strategies. However, ensuring reliable and robust UAV communication in such scenarios is challenging, particularly in unlicensed spectrum bands, where interference from other nodes is a significant concern. To address this issue, developing distributed transmission control and video streaming mechanisms is essential to maintaining a high quality of service, especially for UAV networks that rely on delay-sensitive data.

In this MSc thesis, we study the problem of distributed transmission control and video streaming optimization for UAVs operating in unlicensed spectrum bands. We develop a cross-layer framework that jointly considers three inter-dependent factors: (i) in-band interference introduced by ground-aerial nodes at the physical layer, (ii) limited-size queues with delay-constrained packet arrival at the MAC layer, and (iii) video encoding rate at the application layer. This framework is designed to optimize the average throughput and PSNR by adjusting fading thresholds and video encoding rates for an integrated aerial-ground network in unlicensed spectrum bands. Using a consensus-based distributed algorithm and coordinate descent optimization, we develop two algorithms: (i) Distributed Transmission Control (DTC), which dynamically adjusts fading thresholds to maximize the average throughput by balancing the trade-off between low-SINR transmission errors and queue packet losses, and (ii) Joint Distributed Video Transmission and Encoder Control (JDVT-EC), which balances packet loss probabilities and video distortions by jointly adjusting fading thresholds and video encoding rates. Through extensive numerical analysis, we demonstrate the efficacy of the proposed algorithms under various scenarios.
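
As a rough illustration of the consensus idea named above, the following sketch runs a generic average-consensus iteration, in which each node repeatedly nudges its local value toward those of its neighbors. It is not the DTC or JDVT-EC algorithm from the thesis; the function names, the ring topology, the step size, and the initial values are all assumptions made for illustration.

```python
def consensus_step(values, neighbors, step_size):
    """One synchronous update: each node moves toward its neighbors' values."""
    return [
        v + step_size * sum(values[j] - v for j in neighbors[i])
        for i, v in enumerate(values)
    ]


def run_consensus(values, neighbors, step_size, iterations):
    """Repeat the local update; on a connected graph with a small enough
    step size, every node approaches the average of the initial values."""
    for _ in range(iterations):
        values = consensus_step(values, neighbors, step_size)
    return values


# Hypothetical 4-node ring; each node only talks to its two neighbors.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
initial = [1.0, 2.0, 3.0, 6.0]  # e.g. locally chosen thresholds (made-up values)
final = run_consensus(initial, ring, step_size=0.25, iterations=100)
```

Because each update uses only neighbor values, no central coordinator is needed, which is the property that makes consensus schemes attractive for distributed UAV networks.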


Ganesh Nurukurti

Customer Behavior Analytics and Recommendation System for E-Commerce

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Prasad Kulkarni
Han Wang


Abstract

In the era of digital commerce, personalized recommendations are pivotal for enhancing user experience and boosting engagement. This project presents a comprehensive recommendation system integrated into an e-commerce web application, designed using Flask and powered by collaborative filtering via Singular Value Decomposition (SVD). The system intelligently predicts and personalizes product suggestions for users based on implicit feedback such as purchases, cart additions, and search behavior.

 

The foundation of the recommendation engine is built on user-item interaction data, derived from the Brazilian e-commerce Olist dataset. Ratings are simulated using weighted scores for purchases and cart additions, reflecting varying degrees of user intent. These interactions are transformed into a user-product matrix and decomposed using SVD, yielding latent user and product features. The model leverages these latent factors to predict user interest in unseen products, enabling precise and scalable recommendation generation.
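
A minimal sketch of the pipeline described above, assuming NumPy and made-up interaction weights (the project's actual weights, rank, and data layout are not given in this abstract). All function names here are hypothetical.

```python
import numpy as np

# Hypothetical weights: a purchase signals stronger intent than a cart addition.
PURCHASE_WEIGHT = 5.0
CART_WEIGHT = 3.0


def build_matrix(events, n_users, n_items):
    """Build a user-product matrix of simulated ratings from (user, item, kind) events."""
    weights = {"purchase": PURCHASE_WEIGHT, "cart": CART_WEIGHT}
    ratings = np.zeros((n_users, n_items))
    for user, item, kind in events:
        # Keep the strongest signal seen for each user-item pair.
        ratings[user, item] = max(ratings[user, item], weights[kind])
    return ratings


def predict_scores(ratings, k):
    """Rank-k truncated SVD reconstruction of the rating matrix."""
    u, s, vt = np.linalg.svd(ratings, full_matrices=False)
    return u[:, :k] @ np.diag(s[:k]) @ vt[:k, :]


def recommend(ratings, scores, user, top_n):
    """Return up to top_n unseen items for a user, highest predicted score first."""
    unseen = np.flatnonzero(ratings[user] == 0)
    ranked = unseen[np.argsort(-scores[user, unseen])]
    return ranked[:top_n].tolist()


events = [(0, 0, "purchase"), (0, 1, "cart"), (1, 0, "purchase"),
          (1, 2, "purchase"), (2, 1, "cart"), (2, 2, "purchase")]
ratings = build_matrix(events, n_users=3, n_items=4)
scores = predict_scores(ratings, k=2)
suggestions = recommend(ratings, scores, user=0, top_n=2)
```

Truncating to `k` singular values is what produces the latent user and product features; a smaller `k` generalizes more aggressively, while a larger `k` reproduces the observed interactions more closely.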

 

To further enhance personalization, the system incorporates real-time user activity. Recent search history is stored in an SQLite database and used to prioritize recommendations that align with the user’s current interests. A diversity constraint is also applied to avoid redundancy, limiting the number of recommended products per category.
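
The diversity constraint can be sketched as a single pass over an already ranked list. This is an illustrative version under assumed names (`diversify`, `category_of`, `max_per_category`); the project's actual cap and data structures are not stated in this abstract.

```python
from collections import defaultdict


def diversify(ranked_products, category_of, max_per_category, top_n):
    """Walk a ranked list and keep at most max_per_category items per category."""
    counts = defaultdict(int)
    selected = []
    for product in ranked_products:
        category = category_of[product]
        if counts[category] < max_per_category:
            selected.append(product)
            counts[category] += 1
        if len(selected) == top_n:
            break
    return selected


# Hypothetical ranked output and category lookup.
ranked = ["a1", "a2", "a3", "b1", "c1"]
categories = {"a1": "audio", "a2": "audio", "a3": "audio", "b1": "books", "c1": "toys"}
picked = diversify(ranked, categories, max_per_category=2, top_n=4)
```

Here the third audio product is skipped so that lower-ranked items from other categories can surface.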

 

The web application supports robust user authentication, product exploration by category, cart management, and checkout simulations. It features a visually driven interface with dynamic visualizations for product insights and user interactions. The home page adapts to individual preferences, showing tailored product recommendations and enabling users to explore categories and details.

 

In summary, this project demonstrates the practical implementation of a hybrid recommendation strategy combining matrix factorization with contextual user behavior. It showcases the importance of latent factor modeling, data preprocessing, and user-centric design in delivering an intelligent retail experience.


Srijanya Chetikaneni

Plant Disease Prediction Using Transfer Learning

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Prasad Kulkarni
Han Wang


Abstract

Timely detection of plant diseases is critical to safeguarding crop yields and ensuring global food security. This project presents a deep learning-based image classification system to identify plant diseases using the publicly available PlantVillage dataset. The core objective was to evaluate and compare the performance of a custom-built Convolutional Neural Network (CNN) with two widely used transfer learning models—EfficientNetB0 and MobileNetV3Small. 

All models were trained on augmented image data resized to 224×224 pixels, with preprocessing tailored to each architecture. The custom CNN used simple normalization, whereas EfficientNetB0 and MobileNetV3Small utilized their respective pre-processing methods to standardize the pretrained ImageNet domain inputs. To improve robustness, the training pipeline included data augmentation, class weighting, and early stopping.

Training was conducted using the Adam optimizer and categorical cross-entropy loss over 30 epochs, with performance assessed using accuracy, loss, and training time metrics. The results revealed that transfer learning models significantly outperformed the custom CNN. EfficientNetB0 achieved the highest accuracy, making it ideal for high-precision applications, while MobileNetV3Small offered a favorable balance between speed and accuracy, making it suitable for lightweight, real-time inference on edge devices.

This study validates the effectiveness of transfer learning for plant disease detection tasks and emphasizes the importance of model-specific preprocessing and training strategies. It provides a foundation for deploying intelligent plant health monitoring systems in practical agricultural environments.


Rahul Purswani

Finetuning Llama on custom data for QA tasks

When & Where:


Eaton Hall, Room 2001B

Committee Members:

David Johnson, Chair
Drew Davidson
Prasad Kulkarni


Abstract

Fine-tuning large language models (LLMs) for domain-specific use cases, such as question answering, offers valuable insights into how their performance can be tailored to specialized information needs. In this project, we focused on the University of Kansas (KU) as our target domain. We began by scraping structured and unstructured content from official KU webpages, covering a wide array of student-facing topics including campus resources, academic policies, and support services. From this content, we generated a diverse set of question-answer pairs to form a high-quality training dataset. LLaMA 3.2 was then fine-tuned on this dataset to improve its ability to answer KU-specific queries with greater relevance and accuracy. Our evaluation revealed mixed results—while the fine-tuned model outperformed the base model on most domain-specific questions, the original model still had an edge in handling ambiguous or out-of-scope prompts. These findings highlight the strengths and limitations of domain-specific fine-tuning, and provide practical takeaways for customizing LLMs for real-world QA applications.


Ahmet Soyyigit

Anytime Computing Techniques for LiDAR-based Perception In Cyber-Physical Systems

When & Where:


Nichols Hall, Room 250 (Gemini Room)

Committee Members:

Heechul Yun, Chair
Michael Branicky
Prasad Kulkarni
Hongyang Sun
Shawn Keshmiri

Abstract

The pursuit of autonomy in cyber-physical systems (CPS) presents a challenging task of real-time interaction with the physical world, prompting extensive research in this domain. Recent advances in artificial intelligence (AI), particularly the introduction of deep neural networks (DNN), have significantly improved the autonomy of CPS, notably by boosting perception capabilities.

CPS perception aims to discern, classify, and track objects of interest in the operational environment, a task that is considerably challenging for computers in a three-dimensional (3D) space. For this task, the use of LiDAR sensors and processing their readings with DNNs has become popular because of their excellent performance. However, in CPS such as self-driving cars and drones, object detection must be not only accurate but also timely, posing a challenge due to the high computational demand of LiDAR object detection DNNs. Satisfying this demand is particularly challenging for on-board computational platforms due to size, weight, and power constraints. Therefore, a trade-off between accuracy and latency must be made to ensure that both requirements are satisfied. Importantly, the required trade-off is operational environment dependent and should be weighted more on accuracy or latency dynamically at runtime. However, LiDAR object detection DNNs cannot dynamically reduce their execution time by compromising accuracy (i.e., anytime computing). Prior research aimed at anytime computing for object detection DNNs using camera images is not applicable to LiDAR-based detection due to architectural differences. This thesis addresses these challenges by proposing three novel techniques: Anytime-LiDAR, which enables early termination with reasonable accuracy; VALO (Versatile Anytime LiDAR Object Detection), which implements deadline-aware input data scheduling; and MURAL (Multi-Resolution Anytime Framework for LiDAR Object Detection), which introduces dynamic resolution scaling. Together, these innovations enable LiDAR-based object detection DNNs to make effective trade-offs between latency and accuracy under varying operational conditions, advancing the practical deployment of LiDAR object detection DNNs.


Rithvij Pasupuleti

A Machine Learning Framework for Identifying Bioinformatics Tools and Database Names in Scientific Literature

When & Where:


LEEP2, Room 2133

Committee Members:

Cuncong Zhong, Chair
Dongjie Wang
Han Wang
Zijun Yao

Abstract

The absence of a single, comprehensive database or repository cataloging all bioinformatics databases and software creates a significant barrier for researchers aiming to construct computational workflows. These workflows, which often integrate 10–15 specialized tools for tasks such as sequence alignment, variant calling, functional annotation, and data visualization, require researchers to explore diverse scientific literature to identify relevant resources. This process demands substantial expertise to evaluate the suitability of each tool for specific biological analyses, alongside considerable time to understand their applicability, compatibility, and implementation within a cohesive pipeline. The lack of a central, updated source leads to inefficiencies and the risk of using outdated tools, which can affect research quality and reproducibility. Consequently, there is a critical need for an automated, accurate tool to identify bioinformatics databases and software mentions directly from scientific texts, streamlining workflow development and enhancing research productivity. 

 

The bioNerDS system, a prior effort to address this challenge, uses a rule-based named entity recognition (NER) approach, achieving an F1 score of 63% on an evaluation set of 25 articles from BMC Bioinformatics and PLoS Computational Biology. By integrating the same set of features such as context patterns, word characteristics and dictionary matches into a machine learning model, we developed an approach using an XGBoost classifier. This model, carefully tuned to address the extreme class imbalance inherent in NER tasks through synthetic oversampling and refined via systematic hyperparameter optimization to balance precision and recall, excels at capturing complex linguistic patterns and non-linear relationships, ensuring robust generalization. It achieves an F1 score of 82% on the same evaluation set, significantly surpassing the baseline. By combining rule-based precision with machine learning adaptability, this approach enhances accuracy, reduces ambiguities, and provides a robust tool for large-scale bioinformatics resource identification, facilitating efficient workflow construction. Furthermore, this methodology holds potential for extension to other technological domains, enabling similar resource identification in fields like data science, artificial intelligence, or computational engineering.
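
For readers unfamiliar with the metric, the F1 scores quoted above are the harmonic mean of precision and recall. The sketch below computes it from entity-level counts; the counts are invented for illustration and are not results from bioNerDS or from this work.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall from entity-level counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Made-up counts: 8 correct mentions, 2 spurious mentions, 4 missed mentions.
example_f1 = f1_score(true_positives=8, false_positives=2, false_negatives=4)
```

Because the harmonic mean is pulled toward the smaller of the two values, a system cannot reach a high F1 by maximizing precision alone or recall alone, which is why tuning to balance the two matters under heavy class imbalance.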


Past Defense Notices


ISHA KHADKA

Multi-Controller SDN for Fault-Tolerant Resilient Network

When & Where:


246 Nichols Hall

Committee Members:

James Sterbenz, Chair
Fengjun Li
Gary Minden


Abstract

Software Defined Networking (SDN) decouples the control or logical plane of a network from its physical/data plane, thus enabling features such as centralized control, network programmability, virtualization, network application development, automation, and more. However, SDN is still vulnerable to attacks and failures, just like any non-SDN network. A failure in SDN can be either a link failure or a device failure. The controller is the central device, acting like the brain of the network, and its failure can propagate rapidly, rendering the underlying data plane dysfunctional. Multi-Controller SDN uses redundancy as an effective method to ensure resilience and fault tolerance in a Software-Defined Network. Multiple controllers are connected in a cluster to form a physically distributed but logically centralized network. The backup controllers ensure resilience against failure, attack, disaster, and other network disruptions. In this project, we implement multi-controller SDN and measure performance metrics such as high availability, reliability, latency, datastore persistence, and failure recovery time in a clustered environment.


MD AMIMUL EHSAN

Enabling Technologies for Three-dimensional (3D) Integrated Circuits (ICs): Through Silicon Via (TSV) Modeling and Analysis

When & Where:


246 Nichols Hall

Committee Members:

Yang Yi, Chair
Chris Allen
Ron Hui
Lingjia Liu
Judy Wu

Abstract

Three-dimensional (3D) integrated circuits (ICs) offer a promising near-term solution for pushing beyond Moore’s Law because of their compatibility with current technology. Through silicon vias (TSVs) provide electrical connections that pass vertically through wafers or dies to generate high-performance interconnects, which allows for higher design densities through shortened connection lengths. In recent years, we have seen tremendous technological and economic progress in the adoption of 3D ICs with TSVs for mainstream commercial use.

Along with the need for low-cost and high-yield process technology, the successful application of TSV technology requires further optimization of TSV electrical modeling and design. In the millimeter wave (mmW) frequency range, the root mean square (rms) height of the TSV sidewall roughness is comparable to the skin depth and hence becomes a critical factor for TSV modeling and analysis. The impact of TSV sidewall roughness on electrical performance, such as the loss and impedance alteration in the mmW frequency range, is examined and analyzed. The second-order small perturbation method is applied to obtain a simple closed-form expression for the power absorption enhancement factor of the TSV. In this study, we propose an accurate and efficient electrical model for TSVs that considers the TSV sidewall roughness effect, the skin effect, and the metal oxide semiconductor (MOS) effect. The accuracy of the model is validated by comparing circuit model behavior with full-wave electromagnetic field simulations up to 100 GHz.

In addition, an advanced neuro-inspired computing system that incorporates 3D integration could provide massive parallelism with fast and energy-efficient links. While the 3D neuro-inspired system offers a very high level of integration, it is extremely difficult for the designer to model because of the large number of interconnected elements. When a TSV array is used in a 3D neuromorphic system, crosstalk degrades the system’s signal-to-noise ratio, and the result is an overall deterioration of system performance. To counteract the crosstalk, we propose a novel optimized TSV array pattern obtained by applying the force-directed optimization algorithm.
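
To give a sense of scale for the skin depth mentioned in this abstract, the sketch below evaluates the standard good-conductor formula, delta = sqrt(rho / (pi * f * mu)). The choice of copper and its resistivity value are assumptions for illustration; the abstract does not state the TSV fill material.

```python
import math

MU_0 = 4.0e-7 * math.pi          # permeability of free space, H/m
COPPER_RESISTIVITY = 1.68e-8     # ohm*m at room temperature (assumed material)


def skin_depth(frequency_hz, resistivity=COPPER_RESISTIVITY, permeability=MU_0):
    """Skin depth of a good, non-magnetic conductor, in meters."""
    return math.sqrt(resistivity / (math.pi * frequency_hz * permeability))


depth_100ghz = skin_depth(100e9)   # roughly 0.2 micrometers for copper
depth_1ghz = skin_depth(1e9)       # ten times deeper, since depth scales as 1/sqrt(f)
```

At 100 GHz the current is confined to a layer about a fifth of a micrometer thick under these assumptions, so sidewall roughness of similar height can no longer be neglected.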


ADAM PETZ

A Semantics for Attestation Protocols using Session Types in Coq

When & Where:


246 Nichols Hall

Committee Members:

Perry Alexander, Chair
Andy Gill
Prasad Kulkarni


Abstract

As our world becomes more connected, the average person must place more trust in cloud systems for everyday transactions. We rely on banks and credit card services to protect our money, hospitals to conceal and selectively disclose sensitive health information, and government agencies to protect our identity and uphold national security interests. However, establishing trust in remote systems is not a trivial task, especially in the diverse, distributed ecosystem of today's networked computers. Remote Attestation is a mechanism for establishing trust in a remotely running system where an appraiser requests information from a target that can be used to evaluate its operational state. The target responds with evidence providing configuration information, run-time measurements, and authenticity meta-evidence used by the appraiser to determine if it trusts the target system. For Remote Attestation to be applied broadly, we must have attestation protocols that perform operations on a collection of applications, each of which must be measured differently. Verifying that these protocols behave as expected and accomplish their diverse attestation goals is a unique challenge. An important first step is to understand the structural properties and execution patterns they share. In this thesis I present a semantic framework for attestation protocol execution within the Coq verification environment including a protocol representation based on Session Types, a dependently typed model of perfect cryptography, and an operational execution semantics. The expressive power of dependent types constrains the structure of protocols and supports precise claims about their behavior. If we view attestation protocols as programming language expressions, we can borrow from standard language semantics techniques to model their execution. The proof framework ensures desirable properties of protocol execution, such as progress and termination, that hold for all protocols. It also ensures properties of authenticity and secrecy for individual protocols.


RACHAD ATAT

Communicating over Internet Things: Security, Energy-Efficiency, Reliability and Low-Latency

When & Where:


250 Nichols Hall

Committee Members:

Lingjia Liu, Chair
Yang Yi
Shannon Blunt
Jim Rowland
David Nualart

Abstract

The Internet of Things (IoT) is expected to revolutionize the world through its myriad applications in health-care, public safety, environmental management, vehicular networks, industrial automation, etc. Some of the concepts related to IoT include Machine Type Communications (MTC), Low power Wireless Personal Area Networks (LoWPAN), wireless sensor networks (WSN) and Radio-Frequency Identification (RFID). Characterized by a large amount of traffic and smart decision making with little or no human interaction, these different networks pose a set of challenges, among which security, energy, reliability and latency are the most important ones. First, the open wireless medium and the distributed nature of the system introduce eavesdropping, data fabrication and privacy violation threats. Second, the large number of IoT devices is expected to operate in a self-sustainable and self-sufficient manner without degrading system performance, which makes energy efficiency critical to prolonging device lifetime. Third, many IoT applications, such as emergency response and health-care scenarios, require information to be transmitted in a reliable and timely manner. To address these challenges, we propose low-complexity approaches that exploit the physical layer and use stochastic geometry as a powerful tool to accurately model the spatial locations of "things". This yields a tractable analytical framework for addressing the IoT challenges mentioned above.


OMAR BARI

Ensembles of Text and Time-Series Models for Automatic Generation of Financial Trading Signals

When & Where:


2001B Eaton Hall

Committee Members:

Arvin Agah, Chair
Joseph Evans
Andy Gill
Jerzy Grzymala-Busse
Sara Wilson

Abstract

Event Studies in finance have focused on traditional news headlines to assess the impact an event has on a traded company. The increased proliferation of news and information produced by social media content has disrupted this trend. Although researchers have begun to identify trading opportunities from social media platforms, such as Twitter, almost all techniques use a general sentiment from large collections of tweets. Though useful, general sentiment does not provide an opportunity to indicate specific events worthy of affecting stock prices.


AQSA PATEL

Interpretation of Radar Altimeter Waveforms using Ku-band Ultra-Wideband Altimeter Data

When & Where:


317 Nichols Hall

Committee Members:

Carl Leuschen, Chair
Prasad Kulkarni
Ron Hui
John Paden
David Braaten

Abstract

The surface-elevation of ice sheets and sea ice is currently measured using both satellite and airborne radar altimeters. These measurements are used for generating mass balance estimates of ice sheets and thickness estimates of sea ice. However, due to the penetration of the altimeter signal into the snow there is ambiguity between the surface tracking point and the actual surface location which produces errors in the surface elevation measurement. In order to address how the penetration of the signal affects the shape of the return waveform, it is important to study the effect sub-surface scattering and seasonal variations in properties of snow have on the return waveform to correctly interpret the satellite radar altimeter data. To address this problem, an ultra-wide bandwidth Ku-band radar altimeter was developed at the Center for Remote Sensing of Ice Sheets (CReSIS). The Ku-band altimeter operates over the frequency range of 12 to 18 GHz providing very fine resolution to measure ice surface and resolve the sub-surface features of the snow. It is designed to encompass the frequency band of satellite radar altimeters. The data from Ku-band altimeter can be used to simulate satellite radar altimeter data, and these simulated waveforms can help us understand the effect of signal penetration and sub-surface scattering on low bandwidth satellite altimeter returns. The extensive dataset collected as a part of the Operation Ice Bridge (OIB) campaign can be used to interpret satellite radar altimeter data over surfaces with varying snow conditions. The goal of this research is to use waveform modeling and data inter-comparisons of full and reduced bandwidth data products from Ku-band radar altimeter to investigate the effect of signal penetration and snow conditions on surface tracking using threshold and waveform fitting retracking algorithms to improve the retrieval of surface elevation from satellite radar altimeters.
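
The "very fine resolution" of the 12 to 18 GHz altimeter follows from the usual radar relation between bandwidth and range resolution, delta_R = c / (2B). The sketch below evaluates it in free space; the 500 MHz comparison bandwidth is a hypothetical value chosen only to illustrate a lower-bandwidth instrument, and propagation inside snow (where the wave speed is lower) is not modeled.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s, free space


def range_resolution(f_low_hz, f_high_hz):
    """Free-space range resolution in meters for a radar sweeping f_low..f_high."""
    bandwidth = f_high_hz - f_low_hz
    return SPEED_OF_LIGHT / (2.0 * bandwidth)


ku_uwb = range_resolution(12e9, 18e9)           # 6 GHz bandwidth, about 2.5 cm
narrowband = range_resolution(13.25e9, 13.75e9)  # hypothetical 500 MHz band, about 30 cm
```

The order-of-magnitude gap between the two results is why full-bandwidth data can resolve sub-surface snow layers that a lower-bandwidth return blends together.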


VAISHNAVI YADALAM

Real Time Video Streaming over a Multihop Ad Hoc Network

When & Where:


1 Eaton Hall

Committee Members:

Aveek Dutta, Chair
Victor Frost
Richard Wang


Abstract

High-rate data transmission is very common in cellular and wireless local area networks. It is achievable because of the wired backbone, where only the first or the last hop is wireless, commonly known as the wireless “last-mile” link. With this type of infrastructure network, it is not surprising that wirelessly transmitted video achieves the desired performance. However, the current challenge is to transmit clear, high-quality real-time video over multiple wireless hops in an ad hoc network. The performance of multiple wireless hops in transmitting high-quality video is limited by the data rate, the bandwidth of the wireless channel, and interference from adjacent channels. These factors constrain the applications for a wireless multihop network but are fundamental to military tactical network solutions. The project studies the effect of packet sensitivity, latency, bitrate, and bandwidth on the quality of video for line-of-sight and non-line-of-sight test scenarios. It aims to achieve the best visual user experience at the receiver end for transmission over multiple wireless hops. Further, the project provides an algorithm for the placement of drones in a sub-terrain environment to stream real-time video for border surveillance, in order to monitor and detect unauthorized activity.


YANG TIAN

Integrating Textual Ontology and Visual Features for Content Based Search in an Invertebrate Paleontology Knowledgebase

When & Where:


246 Nichols Hall

Committee Members:

Bo Luo, Chair
Fengjun Li
Richard Wang


Abstract

The Treatise on Invertebrate Paleontology (TIP) is a definitive work completed by more than 300 authors in the field of Paleontology, covering all categories of invertebrate animals. The digital version for TIP is consisted of multiple PDF files, however, these files are just a clone of paper version and are not well formatted, which makes it hard to extract structured data using only straightforward methods. In order to make fossil and extant records in TIP organized and searchable from a web interface, a digital library which is called Invertebrate Paleontology Knowledgebase (IPKB) was built for information sharing and querying in TIP. It is consisted of a database which stores records of all fossils and extant invertebrate animals, and a web interface which provides an online access. 
The existing IPKB system provides a general framework for TIP information showing and searching, however, it has very limited search functions, only allowing users querying by pure text. Details of structural properties in the fossil descriptions are not carefully taken into consideration. Moreover, sometimes users cannot provide correct and rich enough query terms. Although authors of TIP are all paleontologists, the expected users of IPKB may not be that professional. 
To overcome these limitations and bring more powerful search features to the IPKB system, this thesis presents a content-based search function that allows users to search using textual ontology descriptions and images of fossils. First, this thesis describes the previous work on the IPKB system. In addition to the original text and image processing approaches, we present our new efforts to improve these original methods. Second, this thesis presents the algorithm and approach adopted in constructing the content-based search system for IPKB. The search functions in the old IPKB system did not consider the differences among morphological details of certain regions of fossils. Three major parts are discussed in detail: (1) textual ontology-based search, (2) image-based search, and (3) text-image-based search.


ANIL PEDIREDLA

Information Revelation and Privacy in Online Social Networks

When & Where:


250 Nichols Hall

Committee Members:

Bo Luo, Chair
Fengjun Li
Richard Wang


Abstract

Participation in social networking sites has increased dramatically in recent years. Services such as LinkedIn, Facebook, and Twitter allow millions of individuals to create online profiles and share personal information with vast networks of friends and, often, unknown numbers of strangers. The relation between privacy and a person’s social network is multifaceted. On certain occasions we want information about ourselves to be known only to a limited set of people, and not to strangers. The privacy implications of online social networking depend on the level of identifiability of the information provided, its possible recipients, and its possible uses. Even social networking websites that do not openly expose their users’ identities may provide enough information to identify a profile’s owner.


SERGIO LEON CUEN

Visualization and Performance Analysis of N-Body Dynamics Comparing GPGPU Approaches

When & Where:


2001B Eaton Hall

Committee Members:

Jim Miller, Chair
Man Kong
Suzanne Shontz


Abstract

With the advent of general-purpose programming tools and newer GPUs, programmers now have a more flexible, general-purpose way to use GPUs for tasks other than graphics. In single instruction stream, multiple data streams (SIMD) execution, the same instruction is executed by multiple processors on different data streams. GPUs are SIMD computers that exploit data-level parallelism by applying the same operations to multiple data items in parallel. There are many areas where GPUs can be used for general-purpose computing. We focus on a problem from the astrophysics area of scientific computing called N-body simulation, which computes the evolution of a system of bodies that interact with each other. Each body represents an object such as a planet or a star, and each exerts a gravitational force on all the others. The simulation uses a numerical integration method to compute the interactions among the bodies and begins with the initial conditions of the system: the mass, starting position, and starting velocity of every body. These data are used repeatedly to compute the gravitational force between all bodies of the system and to update the display on screen. We investigate alternative implementation approaches to the problem to determine the factors that maximize its performance, including speed and accuracy. Specifically, we compare an OpenCL approach with one based on OpenGL Compute Shaders. We selected these two for comparison because both can be used to generate real-time interactive displays with OpenGL. Ultimately, we anticipate that our results will generalize to other APIs (e.g., CUDA) as well as to applications other than the N-body problem. Our analysis also compares various numerical integration and memory optimization techniques to understand how they work in the SIMD GPGPU environment and how they contribute to our performance metrics.
We conclude that, for our particular implementation of the problem, using local memory efficiently increases performance considerably.
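To make the computation described in the abstract concrete, the following is a minimal CPU-side sketch in Python of one N-body time step. It is not the thesis implementation: the abstract does not name the integration method or show any kernel code, so the semi-implicit Euler update, the softening term, and all function and parameter names here (`accelerations`, `step`, `g`, `SOFTENING`) are illustrative assumptions. On a GPU, the outer loop over bodies would typically be distributed across work-items, one body per work-item.

```python
import math

G = 6.674e-11      # gravitational constant in SI units
SOFTENING = 1e-9   # assumed small term that avoids division by zero


def accelerations(masses, positions, g=G):
    """Direct-sum O(n^2) gravitational acceleration on every body."""
    n = len(masses)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Vector from body i to body j.
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            dist_sq = sum(c * c for c in d) + SOFTENING
            inv_dist_cubed = 1.0 / (dist_sq * math.sqrt(dist_sq))
            for k in range(3):
                acc[i][k] += g * masses[j] * d[k] * inv_dist_cubed
    return acc


def step(masses, positions, velocities, dt, g=G):
    """Advance all bodies in place by one step (semi-implicit Euler, an assumption)."""
    acc = accelerations(masses, positions, g)
    for i in range(len(masses)):
        for k in range(3):
            velocities[i][k] += acc[i][k] * dt          # update velocity first
            positions[i][k] += velocities[i][k] * dt    # then position
```

Calling `step` repeatedly with the current masses, positions, and velocities yields the successive states that a renderer would draw. The all-pairs loop is the part that dominates the cost, which is why memory access patterns matter so much in a GPU version.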