Defense Notices


All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defense through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check and post the presentation announcement online.

Upcoming Defense Notices

Jennifer Quirk

Aspects of Doppler-Tolerant Radar Waveforms

When & Where:


Nichols Hall, Room 246 (Executive Conference Room)

Committee Members:

Shannon Blunt, Chair
Patrick McCormick
Charles Mohr
James Stiles
Zsolt Talata

Abstract

The Doppler tolerance of a waveform refers to its behavior when subjected to a fast-time Doppler shift imposed by scattering that involves nonnegligible radial velocity. While previous efforts have established decision-based criteria that lead to a binary judgment of Doppler tolerant or intolerant, it is also useful to establish a measure of the degree of Doppler tolerance. The purpose in doing so is to establish a consistent standard, thereby permitting assessment across different parameterizations, as well as introducing a Doppler “quasi-tolerant” trade-space that can ultimately inform automated/cognitive waveform design in increasingly complex and dynamic radio frequency (RF) environments. 
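As a rough illustration of the underlying effect (not the dissertation's proposed measure), the short sketch below applies a fast-time Doppler shift to a baseband LFM pulse and reports the matched-filter peak loss; all parameter values are assumed for the example.

```python
# Illustrative sketch: fast-time Doppler shift applied to an LFM pulse,
# measured as matched-filter peak loss. Parameters are assumed.
import numpy as np

fs = 20e6            # sample rate (assumed)
T = 50e-6            # pulse width (assumed)
B = 5e6              # swept bandwidth (assumed)
t = np.arange(0, T, 1 / fs)
lfm = np.exp(1j * np.pi * (B / T) * t**2)      # baseband LFM chirp

def peak_loss_db(waveform, doppler_hz):
    """Matched-filter peak loss (dB) under a fast-time Doppler shift."""
    shifted = waveform * np.exp(2j * np.pi * doppler_hz * t)
    mf = np.correlate(shifted, waveform, mode="full")    # mismatched output
    ref = np.correlate(waveform, waveform, mode="full")  # matched reference
    return 20 * np.log10(np.abs(mf).max() / np.abs(ref).max())

for fd in (0.0, 10e3, 50e3, 200e3):
    print(f"fd = {fd/1e3:6.1f} kHz -> peak loss = {peak_loss_db(lfm, fd):6.2f} dB")
```

For the LFM case the loss stays small even for Doppler shifts that are a nontrivial fraction of the bandwidth, which is the behavior loosely called Doppler tolerance.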

Separately, the application of slow-time coding (STC) to the Doppler-tolerant linear FM (LFM) waveform has been examined for disambiguation of multiple range ambiguities. However, using STC with non-adaptive Doppler processing often results in high Doppler “cross-ambiguity” side lobes that can hinder range disambiguation despite the degree of separability imparted by STC. To enhance this separability, a gradient-based optimization of STC sequences is developed, and a “multi-range” (MR) modification to the reiterative super-resolution (RISR) approach that accounts for the distinct range interval structures from STC is examined. The efficacy of these approaches is demonstrated using open-air measurements. 
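A minimal sketch of the cross-ambiguity issue described above, under simplified assumptions (a random slow-time phase code, a single scatterer, and non-adaptive Doppler processing via an FFT); the code sequence and parameters are illustrative, not the optimized sequences developed in this work.

```python
# Hedged sketch of slow-time coding (STC) across a CPI and its Doppler
# "cross-ambiguity" response for a range-folded return. Values are illustrative.
import numpy as np

M = 64                                        # pulses in the CPI (assumed)
rng = np.random.default_rng(0)
stc = np.exp(2j * np.pi * rng.random(M))      # example slow-time phase code
m = np.arange(M)
fd = 0.2                                      # normalized Doppler (assumed)

# Return from the intended range interval: the code aligns with the decoder,
# so non-adaptive Doppler processing (an FFT across slow time) gives a clean peak.
aligned = stc * np.exp(2j * np.pi * fd * m)
matched = np.fft.fft(aligned * stc.conj(), 512)

# A range-folded return carries the code delayed by one pulse; decoding with the
# unshifted code leaves a noise-like residual phase that spreads energy across
# Doppler, producing the cross-ambiguity sidelobes noted above.
folded = np.roll(stc, 1) * np.exp(2j * np.pi * fd * m)
cross = np.fft.fft(folded * stc.conj(), 512)

print("cross-ambiguity peak relative to matched peak: "
      f"{20*np.log10(np.abs(cross).max() / np.abs(matched).max()):.1f} dB")
```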

The proposed work to appear in the final dissertation focuses on the connection between Doppler tolerance and STC. The first proposal includes the development of a gradient-based optimization procedure to generate Doppler quasi-tolerant random FM (RFM) waveforms. Other proposals consider limitations of STC, particularly when processed with MR-RISR. The final proposal introduces an “intrapulse” modification of the STC/LFM structure to achieve enhanced suppression of range-folded scattering in certain delay/Doppler regions while retaining a degree of Doppler tolerance.


Past Defense Notices


RAKSHA GANESH

Structured-Irregular Repeat Accumulate Codes

When & Where:


250 Nichols Hall

Committee Members:

Erik Perrins, Chair
Shannon Blunt
Ron Hui


Abstract

There is a strong need for efficient and reliable communication systems today. To design an efficient transmission system, the errors that occur during transmission should be minimized, which can be achieved by channel coding. Irregular repeat accumulate (IRA) codes are a class of serially concatenated codes that offer a linear-time encoding algorithm, flexibility in code rate and code length, and good performance.

Here we implement a design technique for structured irregular repeat accumulate (S-IRA) codes. S-IRA codes can be decoded reliably at low error rates using the iterative log-likelihood (sum-product) decoding algorithm. We perform encoding, decoding, and performance analysis of S-IRA codes of different code rates and codeword lengths and compare their performance on the AWGN channel. In this project we also design codes with different column weights for the parity-check matrices and compare their performance on the AWGN channel with that of the previously designed codes.
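For readers unfamiliar with the code family, the following is a minimal sketch of the plain repeat-accumulate structure that underlies IRA codes; the S-IRA construction in this project adds structure and irregular repetition, and the parameters below are illustrative only.

```python
# Minimal sketch of a plain repeat-accumulate (RA) encoder: repeat, interleave,
# accumulate. Not the S-IRA design of this work; parameters are illustrative.
import numpy as np

def ra_encode(info_bits, repeat=3, seed=0):
    """Repeat each bit, interleave, then accumulate (mod-2 running sum)."""
    rng = np.random.default_rng(seed)
    repeated = np.repeat(info_bits, repeat)              # repetition code
    interleaved = repeated[rng.permutation(repeated.size)]
    parity = np.cumsum(interleaved) % 2                  # 1/(1+D) accumulator
    return np.concatenate([info_bits, parity])           # systematic codeword

msg = np.random.default_rng(1).integers(0, 2, 8)
print(ra_encode(msg))
```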


MADHURI MORLA

Effect of SOA Nonlinearities on CO-OFDM System

When & Where:


2001B Eaton Hall

Committee Members:

Ron Hui, Chair
Victor Frost
Erik Perrins


Abstract

The use of a Semiconductor Optical Amplifier (SOA) for amplification in Coherent Optical Orthogonal Frequency Division Multiplexing (CO-OFDM) systems has been of interest in recent studies. The gain saturation of the SOA induces inter-channel crosstalk. This effect is analyzed by simulation and compared with recent experimental results. Performance of the optical transmission system is measured using the Error Vector Magnitude (EVM), which measures the deviation of received symbols from their ideal positions in the constellation diagram. EVM as a function of input power to the SOA is investigated. An improvement in EVM is observed in the linear region as the input power increases, due to the increase in Optical Signal-to-Noise Ratio (OSNR). In the nonlinear region, increasing the input optical power to the SOA degrades the EVM due to the nonlinear saturation of the SOA. The effect of gain saturation on EVM as a function of the number of subcarriers is also investigated.
The relation among different evaluation metrics, namely Bit Error Rate (BER), SNR, and EVM, is also presented. EVM is analytically estimated from OSNR by considering the ideal case of additive white Gaussian noise (AWGN) without nonlinearities. BER is then estimated from both the analytical and the simulated EVM. The role of Peak-to-Average Power Ratio (PAPR) in the degradation of EVM in the nonlinear region is also studied through numerical simulation.
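The sketch below illustrates the EVM-to-BER chain described above for a Gray-coded QPSK constellation under the AWGN-only assumption; the constellation, noise level, and the SNR ≈ 1/EVM² relation are standard textbook assumptions, not results from this work.

```python
# Hedged sketch: RMS EVM of a noisy QPSK constellation and the BER estimated
# from it under the AWGN-only assumption. Numbers are illustrative.
import numpy as np
from scipy.special import erfc

def evm_rms(received, reference):
    """RMS EVM normalized to the reference constellation power."""
    err = received - reference
    return np.sqrt(np.mean(np.abs(err) ** 2) / np.mean(np.abs(reference) ** 2))

def qpsk_ber_from_evm(evm):
    """BER estimate assuming AWGN only: SNR ~ 1/EVM^2, BER = Q(sqrt(SNR))."""
    snr = 1.0 / evm ** 2
    return 0.5 * erfc(np.sqrt(snr / 2.0))

rng = np.random.default_rng(0)
ref = (rng.choice([-1, 1], 4096) + 1j * rng.choice([-1, 1], 4096)) / np.sqrt(2)
rx = ref + rng.normal(0, 0.1, 4096) + 1j * rng.normal(0, 0.1, 4096)
e = evm_rms(rx, ref)
print(f"EVM = {100 * e:.1f} %, estimated BER = {qpsk_ber_from_evm(e):.2e}")
```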


SAMBHAV SETHIA

Sentiment Analysis on Wikipedia People Pages Using Enhanced Naive Bayes Model

When & Where:


246 Nichols Hall

Committee Members:

Fengjun Li, Chair
Bo Luo
Jerzy Grzymala-Busse
Prasad Kulkarni

Abstract

Sentiment analysis involves capturing the viewpoint or opinion expressed by people about various objects. These objects span a diverse set of things, such as a movie, an article, a person of interest, or a product: essentially anything about which one can form an opinion. The opinions expressed can take different forms, such as a review of a movie, feedback on a product, a newspaper article expressing the author's sentiment on a given topic, or even a Wikipedia page about a person. The key challenge of sentiment analysis is to classify the underlying text into the correct class, i.e., positive, negative, or neutral. Sentiment analysis also deals with the computational treatment of opinion, sentiment, and subjectivity in text.
Wikipedia provides a large repository of pages about people around the world. This project conducts a large-scale experiment using a popular sentiment analysis tool modeled on an enhanced version of Naïve Bayes. Sentence-by-sentence sentiment analysis is performed for each biographical page retrieved from Wikipedia, and the overall sentiment of a person is then calculated by averaging the sentiment values of all sentences related to that person. This type of analysis has several advantages. First, the results are calibrated on a decimal scale, which provides a clearer distinction of the sentiment associated with an individual than the tool's standard tri-scale output (positive, negative, or neutral). Second, it allows us to understand, statistically, Wikipedia's viewpoint on those people. Finally, it enables large-scale temporal and geographical analysis, e.g., examining the overall sentiment associated with the people of each state, and thus helps us analyze opinion trends.
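A minimal sketch of the per-person aggregation step described above; the sentence scorer here is a hypothetical placeholder standing in for the enhanced Naïve Bayes tool used in the project.

```python
# Hedged sketch: average sentence-level sentiment scores for one page.
# The scorer is a hypothetical stand-in, not the project's actual tool.
from statistics import mean

def page_sentiment(sentences, score_sentence):
    """Average sentence-level scores (e.g., on a decimal [-1, 1] scale)."""
    scores = [score_sentence(s) for s in sentences]
    return mean(scores) if scores else 0.0

# Hypothetical usage with a placeholder scorer:
toy_scorer = lambda s: 0.5 if "award" in s.lower() else 0.0
print(page_sentiment(["She won an award.", "She was born in 1970."], toy_scorer))
```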


XIAOMENG SU

A Comparison of the Quality of Rule Induction from Inconsistent Data Sets and Incomplete Data Sets

When & Where:


246 Nichols Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Zongbo Wang


Abstract

In data mining, decision rules induced from known examples are used to classify unseen cases. There are various rule induction algorithms, such as LEM1 (Learning from Examples Module version 1), LEM2 (Learning from Examples Module version 2), and MLEM2 (Modified Learning from Examples Module version 2). In the real world, many data sets are imperfect, being either inconsistent or incomplete. The idea of lower and upper approximations, or more generally the probabilistic approximation, provides an effective way to induce rules from inconsistent and incomplete data sets, but the accuracy of rule sets induced from imperfect data sets is expected to be lower. The objective of this project is to investigate which kind of imperfect data set (inconsistent or incomplete) is worse in terms of the quality of the induced rule set. In this project, experiments were conducted on eight inconsistent data sets and eight incomplete data sets with lost values. We implemented the MLEM2 algorithm to induce certain and possible rules from inconsistent data sets, and implemented the local probabilistic version of MLEM2 to induce certain and possible rules from incomplete data sets. A program called Rule Checker was also developed to classify unseen cases with the induced rules and measure the classification error rate. Ten-fold cross validation was carried out, and the average error rate was used as the criterion for comparison. The Mann-Whitney nonparametric test was performed to compare, separately for certain and possible rules, incompleteness with inconsistency. The results show that there is no significant difference between inconsistent and incomplete data sets in terms of the quality of rule induction.
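The comparison step might look like the following sketch, where the error-rate arrays are made-up placeholders (one average ten-fold cross-validation error rate per data set), not results from the project.

```python
# Hedged sketch: Mann-Whitney U test on per-data-set average error rates.
# The arrays below are placeholder values, not the project's measurements.
import numpy as np
from scipy.stats import mannwhitneyu

err_inconsistent = np.array([0.21, 0.18, 0.25, 0.30, 0.22, 0.19, 0.27, 0.24])
err_incomplete   = np.array([0.23, 0.20, 0.26, 0.28, 0.21, 0.22, 0.25, 0.27])

stat, p = mannwhitneyu(err_inconsistent, err_incomplete, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p:.3f} "
      f"({'no significant' if p >= 0.05 else 'significant'} difference at 5%)")
```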


SIVA PRAMOD BOBBILI

Static Disassembly of Binary using Symbol Table Information

When & Where:


250 Nichols Hall

Committee Members:

Prasad Kulkarni, Chair
Andy Gill
Jerzy Grzymala-Busse


Abstract

Static binary analysis is an important challenge with many applications in security and performance optimization. One of the main difficulties in analyzing an executable file statically is discovering all the instructions in the binary, a well-known limitation called the code discovery problem. Some of its main contributors are variable-length CISC instructions, data interspersed with code, padding bytes for branch-target alignment, and indirect jumps. All of these issues manifest themselves in x86 binary files, which is unfortunate since x86 is the most popular architecture in the desktop and server domains.
Although much recent research has suggested that the symbol table might help overcome the difficulties of code discovery, the extent to which it can actually help is still in question. This work focuses on assessing the benefit of using symbol table information to overcome the limitations of the code discovery problem and to identify more, or all, of the instructions in x86 binary executable files. We discuss the details, extent, limitations, and impact of instruction discovery with and without symbol table information.
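As a hedged illustration of the general idea (assumed tooling: pyelftools and Capstone, not necessarily the tools used in this work), the sketch below uses STT_FUNC symbols from the ELF symbol table as disassembly anchors, sidestepping part of the code discovery problem when symbols are present.

```python
# Hedged sketch: disassemble only the byte ranges named by function symbols.
# Assumed tooling: pyelftools + Capstone; the input path is hypothetical.
from elftools.elf.elffile import ELFFile
from capstone import Cs, CS_ARCH_X86, CS_MODE_64

def disassemble_functions(path):
    with open(path, "rb") as f:
        elf = ELFFile(f)
        symtab = elf.get_section_by_name(".symtab")
        text = elf.get_section_by_name(".text")
        if symtab is None or text is None:
            raise ValueError("binary is stripped or has no .text section")
        base, data = text["sh_addr"], text.data()
        md = Cs(CS_ARCH_X86, CS_MODE_64)
        for sym in symtab.iter_symbols():
            if sym["st_info"]["type"] != "STT_FUNC" or sym["st_size"] == 0:
                continue
            start, size = sym["st_value"], sym["st_size"]
            if not (base <= start < base + len(data)):
                continue                      # symbol points outside .text
            print(f"{sym.name} @ {hex(start)}")
            for insn in md.disasm(data[start - base:start - base + size], start):
                print(f"  {hex(insn.address)}: {insn.mnemonic} {insn.op_str}")

disassemble_functions("a.out")               # hypothetical input binary
```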


JONATHAN LUTES

SafeExit: Exit Node Protection for TOR

When & Where:


2001B Eaton Hall

Committee Members:

Bo Luo, Chair
Arvin Agah
Prasad Kulkarni


Abstract

TOR is one of the most important networks for providing anonymity over the internet. However, in some cases its exit node operators open themselves up to various legal challenges, a fact which discourages participation in the network. In this paper, we propose a mechanism for allowing some users to be voluntarily verified by trusted third parties, providing a means by which an exit node can verify that it is not the true source of traffic. This is done by extending TOR's anonymity model to include another class of user, and by using a web of trust mechanism to create chains of trust.
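A hedged illustration of one possible chain-of-trust check, not the paper's actual protocol: each party signs the next party's public key, back to a trusted third-party root, using Ed25519 signatures from the Python cryptography library.

```python
# Hedged sketch: verify a simple chain of trust of Ed25519-signed public keys.
# This is an illustration of the web-of-trust idea, not SafeExit's protocol.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey, Ed25519PublicKey)

def raw(private_key):
    """Raw public-key bytes for a private key."""
    return private_key.public_key().public_bytes(
        serialization.Encoding.Raw, serialization.PublicFormat.Raw)

def verify_chain(root_public_key, links):
    """links: (public_key_bytes, signature) pairs, each signed by the key
    introduced in the previous link (the trusted root signs the first)."""
    signer = root_public_key
    for pub_bytes, sig in links:
        try:
            signer.verify(sig, pub_bytes)
        except InvalidSignature:
            return False
        signer = Ed25519PublicKey.from_public_bytes(pub_bytes)
    return True

# Example chain: trusted third party -> verifier -> user
root, verifier, user = (Ed25519PrivateKey.generate() for _ in range(3))
chain = [(raw(verifier), root.sign(raw(verifier))),
         (raw(user), verifier.sign(raw(user)))]
print(verify_chain(root.public_key(), chain))   # True
```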


KAVYASHREE PILAR

Digital Down Conversion and Compression of Radar Data

When & Where:


317 Nichols Hall

Committee Members:

Carl Leuschen, Chair
Shannon Blunt
Glenn Prescott


Abstract

Storage and handling of the huge volume of received data samples is one of the major challenges in radar system design. Radar data samples potentially have high temporal and spatial correlation depending on the target characteristics and radar settings. This correlation can be exploited to compress them without any loss of sensitivity in post-processed products. This project focuses on reducing the storage requirement of a radar used for remote sensing of ice sheets. At the front end of the radar receiver, the data sample rate can be reduced in real time by performing frequency down-conversion and decimation of the incoming data. The decimated signal can be further compressed by applying a suitable data compression algorithm. The project implements a digital down-converter, decimator, and data compression module on an FPGA. A literature survey suggests that a fair amount of research has gone into developing customized radar data compression algorithms. This project analyzes the possibility of using general-purpose algorithms such as GZIP and JPEG 2000 (lossless) to compress radar data. It also considers a simple floating-point compression technique to convert 16-bit data to 8-bit data, guaranteeing a 50% reduction in data size. The project implements the 16-to-8-bit conversion, lossless JPEG 2000, and GZIP algorithms in Matlab and compares their SNR performance on radar data. Simulations suggest that all of them have similar SNR performance, but the JPEG 2000 and GZIP algorithms offer a compression ratio of over 90%. However, the 16-to-8-bit compression is implemented in this project because of its simplicity.
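The sketch below illustrates the receive chain described above under assumed parameters: digital down-conversion by mixing to baseband, decimation, and one plausible shared-exponent interpretation of the "simple floating point" 16-to-8-bit compression (the actual scheme used in the project may differ).

```python
# Hedged sketch: DDC (mix + decimate) followed by a 16->8 bit shared-exponent
# compression. Sample rate, IF, and signal are assumed for illustration.
import numpy as np
from scipy.signal import decimate

fs, f_if = 120e6, 30e6                        # sample rate and IF (assumed)
t = np.arange(4096) / fs
rx = np.int16(2000 * np.cos(2 * np.pi * (f_if + 1e5) * t))   # toy IF samples

# Digital down-conversion: mix the IF to baseband, then decimate by 8
# (scipy's decimate applies the anti-alias filter).
baseband = rx * np.exp(-2j * np.pi * f_if * t)
bb_dec = decimate(baseband, 8)

def compress_16_to_8(x16):
    """Keep a per-block shift and the shifted samples as int8 (one plausible
    'simple floating point' scheme giving a fixed 50% size reduction)."""
    shift = max(0, int(np.ceil(np.log2(np.abs(x16).max() + 1))) - 7)
    return shift, (x16 >> shift).astype(np.int8)

samples16 = np.int16(np.round(np.real(bb_dec)))
shift, packed = compress_16_to_8(samples16)
restored = packed.astype(np.int16) << shift   # lossy inverse for comparison
print(f"shift = {shift}, max abs error = {np.abs(samples16 - restored).max()}")
```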
A hardware test bed is implemented to integrate the digital radar electronics with the Matlab Simulink simulation tools in a hardware-in-the-loop (HIL) configuration. The digital down-converter, decimator, and data compression module are prototyped in Simulink. The design is later implemented on an FPGA using Verilog. The functionality is tested at various stages of development using ModelSim simulations, Altera DSP Builder's HDL import, HIL co-simulation, and SignalTap. This test bed can also be used for future development efforts.


SURYA TEJ NIMMAKAYALA

Exploring Causes of Performance Overhead during Dynamic Binary Translation

When & Where:


250 Nichols Hall

Committee Members:

Prasad Kulkarni, Chair
Fengjun Li
Bo Luo


Abstract

Dynamic binary translators (DBTs) have applications that include program portability, instrumentation, optimization, and software security. To achieve these goals and maintain control over the application's execution, DBTs translate and run the original source/guest program in a sandboxed environment. DBT systems apply several optimization techniques, such as code caching and trace creation, to reduce translation overhead and enhance program performance at run time. However, even with these optimizations, DBTs typically impose a significant performance overhead, especially for short-running applications. This performance penalty has restricted the wider adoption of DBT technology, in spite of its obvious benefits.

The goal of this work is to determine the different factors that contribute to the performance penalty imposed by dynamic binary translators. In this thesis, we describe the experiments we designed to achieve this goal and report our results and observations. We use a popular and sophisticated DBT, DynamoRIO, as our test platform, and employ the industry-standard SPEC CPU2006 benchmarks to capture run-time statistics. Our experiments find that DynamoRIO executes a large number of additional instructions compared to native application execution. We further measure that this increase in executed instructions is caused by the DBT frequently exiting the code cache to perform various management tasks at run time, including code translation, indirect branch resolution, and trace formation. We also find that the performance loss experienced by the DBT is directly proportional to the number of code cache exits. We discuss the details of the experiments, results, observations, and analysis in this work.
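One way to reproduce the flavor of this measurement (an assumption about methodology, not necessarily how the thesis gathered its statistics) is to compare retired-instruction counts natively and under DynamoRIO's drrun launcher using Linux perf, as sketched below with a hypothetical benchmark binary.

```python
# Hedged sketch: compare instruction counts native vs. under DynamoRIO.
# Assumes Linux `perf` and DynamoRIO's `drrun`; the benchmark path is hypothetical.
import subprocess

def instruction_count(cmd):
    """Run a command under `perf stat -e instructions` and parse the CSV count."""
    res = subprocess.run(["perf", "stat", "-x", ",", "-e", "instructions"] + cmd,
                         capture_output=True, text=True)
    for line in res.stderr.splitlines():
        if "instructions" in line:
            return int(line.split(",")[0])
    raise RuntimeError("could not parse perf output")

app = ["./benchmark"]                          # hypothetical SPEC-style binary
native = instruction_count(app)
under_dbt = instruction_count(["drrun", "--"] + app)
print(f"native: {native:,}  under DBT: {under_dbt:,}  "
      f"expansion: {under_dbt / native:.2f}x")
```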


XUN WU

A Global Discretization Approach to Handle Numerical Attributes as Preprocessing

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Heechul Yun


Abstract

Discretization is a common technique for handling numerical attributes in data mining; it divides continuous values into several intervals by defining multiple thresholds. Decision tree learning algorithms, such as C4.5 and random forests, are able to deal with numerical attributes by applying a discretization technique and transforming them into nominal attributes based on an impurity-based criterion, such as information gain or Gini gain. However, a considerable number of distinct values end up in the same interval after discretization, so some of the information carried by the original continuous values is lost.
In this thesis, we proposed a global discretization method that is able to keep the information within the original numerical attributes by expanding them into multiple nominal ones based on each of the candidate cut-point values. The discretized data set, which includes only nominal attributes, evolves from the original data set. We analyzed the problem by applying two decision tree learning algorithms, C4.5 and random forests, to each of the twelve pairs of data sets (original and discretized) and evaluating the performance (prediction accuracy) of the obtained classification models in the Weka Experimenter. This was followed by two separate Wilcoxon tests (one per learning algorithm) to decide whether there is a statistically significant difference between these paired data sets. Results of both tests indicate that there is no clear difference in performance between the discretized data sets and the original ones.
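A minimal sketch of the expansion described above: each candidate cut point (the midpoints between consecutive distinct values) yields one binary nominal attribute. The data and column names are hypothetical.

```python
# Hedged sketch: expand a numerical attribute into one nominal attribute per
# candidate cut point. Data and column names are hypothetical.
import numpy as np
import pandas as pd

def expand_numeric(df, column):
    values = np.sort(df[column].unique())
    cuts = (values[:-1] + values[1:]) / 2.0          # candidate cut points
    expanded = pd.DataFrame(index=df.index)
    for c in cuts:
        expanded[f"{column}<={c:g}"] = np.where(df[column] <= c, "low", "high")
    return expanded

df = pd.DataFrame({"temperature": [97.8, 98.6, 99.1, 100.2]})
print(expand_numeric(df, "temperature"))
```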


YUFEI CHENG

Future Internet Routing Design for Massive Failures and Attacks

When & Where:


246 Nichols Hall

Committee Members:

James Sterbenz, Chair
Victor Frost
Fengjun Li
Gary Minden
Michael Vitevitch

Abstract

With the increasing frequency of natural disasters and intentional attacks that challenge optical networks, vulnerability to cascading and regionally correlated challenges is escalating. Given the high complexity and large traffic load of optical networks, such correlated challenges can cause great damage to reliable network communication. We begin by proposing a critical-region identification mechanism and studying different vulnerability scales using real-world physical network topologies. We further propose geographical diversity and incorporate it into a new graph resilience metric, cTGGD (compensated Total Geographical Graph Diversity), which is capable of characterizing and differentiating the resilience levels of different physical networks. We propose the path geodiverse problem (PGD) and two heuristics that solve it with lower complexity than the optimal algorithm. The geodiverse paths are optimized with a delay-skew optimization formulation for optimal traffic allocation. We implement GeoDivRP in ns-3 to employ the optimized paths and demonstrate their effectiveness compared to OSPF Equal-Cost Multi-Path (ECMP) routing in terms of both throughput and overall link utilization. From the attacker's perspective, we analyze the mechanisms attackers could use to maximize attack impact with a limited budget and demonstrate the effectiveness of different network restoration plans.
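As a simplified, hedged illustration of geographic path diversity (a stand-in for, not a definition of, the cTGGD metric), the sketch below computes the minimum great-circle distance between the interior nodes of two candidate paths; the coordinates are hypothetical.

```python
# Hedged sketch: a simple geographic diversity measure between two paths,
# taken as the minimum great-circle distance between their interior nodes.
import math

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2 +
         math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def path_geodiversity_km(path_a, path_b):
    """Minimum node-to-node distance between the interiors of two paths."""
    interior_a, interior_b = path_a[1:-1], path_b[1:-1]
    return min(haversine_km(u, v) for u in interior_a for v in interior_b)

# Hypothetical node coordinates (lat, lon) for two paths sharing endpoints
p1 = [(38.9, -95.2), (39.1, -94.6), (41.9, -87.6), (40.7, -74.0)]
p2 = [(38.9, -95.2), (35.5, -97.5), (33.7, -84.4), (40.7, -74.0)]
print(f"geodiversity ~ {path_geodiversity_km(p1, p2):.1f} km")
```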