Defense Notices


All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.

Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check and post the presentation announcement online.

Upcoming Defense Notices

Andrew Riachi

An Investigation Into The Memory Consumption of Web Browsers and A Memory Profiling Tool Using Linux Smaps

When & Where:


Nichols Hall, Room 250 (Gemini Conference Room)

Committee Members:

Prasad Kulkarni, Chair
Perry Alexander
Drew Davidson
Heechul Yun

Abstract

Web browsers are notorious for consuming large amounts of memory. Yet, they have become the dominant framework for writing GUIs because the web languages are ergonomic for programmers and have cross-platform reach. These benefits are so enticing that even a large portion of mobile apps, which have to run on resource-constrained devices, run a web browser under the hood. Therefore, it is important to keep the memory consumption of web browsers as low as practicable.

In this thesis, we investigate the memory consumption of web browsers, in particular compared to applications written in native GUI frameworks. We introduce smaps-profiler, a tool to profile the overall memory consumption of Linux applications that can report memory usage other profilers simply do not measure. Using this tool, we conduct experiments which suggest that most of the extra memory usage compared to native applications could be due to the size of the web browser program itself. We discuss our experiments and findings, and conclude that even more rigorous studies are needed to profile GUI applications.
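The smaps-profiler tool itself is not reproduced here, but the underlying Linux interface is /proc/<pid>/smaps, which breaks a process's memory mappings into per-region records with fields such as Rss and Pss. A minimal, illustrative Python sketch of summing those fields (an assumption-laden stand-in, not the thesis tool) might look like:

# Minimal sketch: sum Rss and Pss (reported in kB) across all mappings of a
# process by reading /proc/<pid>/smaps. Illustration only; not smaps-profiler.
import sys

def total_smaps(pid, fields=("Rss", "Pss")):
    totals = {f: 0 for f in fields}
    with open(f"/proc/{pid}/smaps") as smaps:
        for line in smaps:
            key, sep, rest = line.partition(":")
            if sep and key in totals:
                totals[key] += int(rest.split()[0])  # field values are in kB
    return totals

if __name__ == "__main__":
    print(total_smaps(int(sys.argv[1])))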


Past Defense Notices


SRUTHI POTLURI

A Web Application for Recommending Movies to Users

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Man Kong
Bo Luo


Abstract

Recommendation systems are becoming more and more important with the increasing popularity of e-commerce platforms. An ideal recommendation system recommends preferred items to the user. In this project, an item-item collaborative filtering algorithm is implemented as the basis of the system. Recommendations are generated by finding movies similar to those the user has already rated, calculating rating predictions for them, and recommending the movies with the highest predictions. The primary goals of the proposed recommendation algorithm are to incorporate the user's preferences and to include lesser-known items in the recommendations. The proposed recommendation system was evaluated on the basis of Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) against 1 million movie ratings involving 6,040 users and 3,900 movies. The implementation is a web application that simulates a real-time experience for the user.
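As a rough illustration of the item-item collaborative filtering idea described above (not the project's actual implementation), a prediction for an unrated movie can be formed as a similarity-weighted average of the user's existing ratings; the toy rating matrix below is assumed for illustration:

# Illustrative item-item collaborative filtering sketch (not the project code).
# Rows = users, columns = movies, 0 = unrated.
import numpy as np

R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)

def item_similarity(R):
    # Cosine similarity between item (column) rating vectors.
    norms = np.linalg.norm(R, axis=0) + 1e-9
    return (R.T @ R) / np.outer(norms, norms)

def predict(R, user, item, sim):
    rated = np.where(R[user] > 0)[0]
    weights = sim[item, rated]
    if weights.sum() == 0:
        return 0.0
    return float(weights @ R[user, rated] / weights.sum())

sim = item_similarity(R)
print(predict(R, user=1, item=2, sim=sim))  # predicted rating for an unseen movie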


DEBABRATA MAJHI

IRIM: Interesting Rule Induction Module with Handling Missing Attribute Values

When & Where:


2001B Eaton Hall

Committee Members:

Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Bo Luo


Abstract

In the current era of big data, huge amounts of data can be easily collected, but unprocessed data are not useful on their own. They become useful only when we are able to find interesting patterns or hidden knowledge. An algorithm that finds interesting patterns is known as a rule induction algorithm. Rule induction is a special area of data mining and machine learning in which formal rules are extracted from a dataset. The extracted rules may represent general or local (isolated) patterns in the data.
In this report, we focus on IRIM (Interesting Rule Induction Module), which induces strong interesting rules that cover most of a concept. Usually, the rules induced by IRIM provide interesting and surprising insights to experts in the domain area.
The IRIM algorithm was implemented using Python and the PySpark library, which is specially customized for data mining. The algorithm was then extended to handle different types of missing data. Finally, the performance of the IRIM algorithm with and without the missing-data feature was analyzed. As an example, interesting rules induced from the Iris dataset are shown.
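For readers unfamiliar with rule induction terminology, the toy sketch below shows how a single rule's coverage of a concept can be evaluated; it is an illustration under assumed data, not the IRIM algorithm, whose contribution lies in how such rules are searched for and ranked:

# Toy sketch of rule evaluation in rule induction (not IRIM itself).
# A rule is a set of (attribute, value) conditions implying a concept.
cases = [
    {"temperature": "high",   "headache": "yes", "flu": "yes"},
    {"temperature": "high",   "headache": "no",  "flu": "yes"},
    {"temperature": "normal", "headache": "no",  "flu": "no"},
    {"temperature": "normal", "headache": "yes", "flu": "no"},
]

def covers(rule, case):
    return all(case.get(attr) == value for attr, value in rule)

def strength(rule, concept, cases):
    # (cases matched by the rule, matched cases that belong to the concept)
    attr, value = concept
    matched = [c for c in cases if covers(rule, c)]
    correct = [c for c in matched if c.get(attr) == value]
    return len(matched), len(correct)

rule = [("temperature", "high")]
print(strength(rule, ("flu", "yes"), cases))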

 


SUSHIL BHARATI

Vision Based Adaptive Obstacle Detection, Robust Tracking and 3D Reconstruction for Autonomous Unmanned Aerial Vehicles

When & Where:


246 Nichols Hall

Committee Members:

Richard Wang, Chair
Bo Luo
Suzanne Shontz


Abstract

Vision-based autonomous navigation of UAVs in real time is a very challenging problem, which requires obstacle detection, tracking, and depth estimation. Although the problems of obstacle detection and tracking along with 3D reconstruction have been extensively studied in the computer vision field, they remain a big challenge for real applications like UAV navigation. This thesis addresses these issues in terms of robustness and efficiency. First, a vision-based fast and robust obstacle detection and tracking approach is proposed by integrating a salient object detection strategy within a kernelized correlation filter (KCF) framework. To increase its performance, an adaptive obstacle detection technique is proposed to refine the location and boundary of the object when the confidence value of the tracker drops below a predefined threshold. In addition, a reliable post-processing technique is implemented for accurate obstacle localization. Second, we propose an efficient approach to detect the outliers present in noisy image pairs for robust fundamental matrix estimation, which is an essential step for depth estimation in obstacle avoidance. Given a noisy stereo image pair obtained from the mounted stereo cameras and initial point correspondences between them, we propose to utilize the reprojection residual error and the 3-sigma principle together with the robust-statistics-based Qn estimator (RES-Q) to efficiently detect the outliers and accurately estimate the fundamental matrix. The proposed approaches have been extensively evaluated through quantitative and qualitative evaluations on a number of challenging datasets. The experiments demonstrate that the proposed detection and tracking technique significantly outperforms state-of-the-art methods in terms of tracking speed and accuracy, and that the proposed RES-Q algorithm is more robust than other classical outlier detection algorithms under both symmetric and asymmetric random noise assumptions.
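The RES-Q method is described above only at a high level; the hedged numpy sketch below flags correspondences whose epipolar residuals fall outside a robust 3-sigma band. The Qn scale estimator is replaced here by the more common MAD purely for illustration, and the function and variable names are hypothetical:

# Hedged sketch: flag outlier correspondences from epipolar residuals using a
# robust scale estimate and the 3-sigma rule. MAD stands in for the Qn estimator.
import numpy as np

def epipolar_residuals(F, x1, x2):
    # x1, x2: Nx2 arrays of matched points; F: 3x3 fundamental matrix.
    h1 = np.hstack([x1, np.ones((len(x1), 1))])
    h2 = np.hstack([x2, np.ones((len(x2), 1))])
    l2 = h1 @ F.T                                   # epipolar lines in image 2
    l1 = h2 @ F                                     # epipolar lines in image 1
    num = np.abs(np.sum(h2 * l2, axis=1))           # |x2' F x1|
    d2 = num / np.linalg.norm(l2[:, :2], axis=1)    # point-to-line distances
    d1 = num / np.linalg.norm(l1[:, :2], axis=1)
    return d1 + d2                                  # symmetric epipolar distance

def flag_outliers(residuals, k=3.0):
    med = np.median(residuals)
    mad = 1.4826 * np.median(np.abs(residuals - med))  # robust sigma estimate
    return np.abs(residuals - med) > k * mad

# usage: inliers = ~flag_outliers(epipolar_residuals(F, pts1, pts2))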


MOHSEN ALEENEJAD

New Modulation Methods and Control Strategies for Power Electronics Inverters

When & Where:


1 Eaton Hall

Committee Members:

Reza Ahmadi, Chair
Glenn Prescott
Alessandro Salandrino
Jim Stiles
Huazhen Fang*

Abstract

DC-to-AC power converters (so-called inverters) are widely used in industrial applications. Multilevel inverters are becoming increasingly popular in industrial apparatus aimed at medium- to high-power conversion applications. In comparison to conventional inverters, they feature superior characteristics such as lower total harmonic distortion (THD), higher efficiency, and lower switching voltage stress. Nevertheless, these superior characteristics come at the price of a more complex topology with an increased number of power electronic switches. The increased number of power electronic switches results in more complicated control strategies for the inverter. Moreover, as the number of power electronic switches increases, the chance of a switch fault increases, and thus the inverter's reliability decreases. Due to the extreme monetary ramifications of interruptions of operation in commercial and industrial applications, high reliability for power inverters utilized in these sectors is critical. As a result, developing simple control strategies for normal and fault-tolerant operation of multilevel inverters has always been an interesting topic for researchers in related areas. The purpose of this dissertation is to develop new control and fault-tolerant strategies for multilevel power inverters. For normal operation of the inverter, a new high-switching-frequency technique is developed. The proposed method extends the utilization of the dc-link voltage while minimizing the dv/dt of the switches. In the event of a fault, the line voltages of the faulty inverter are unbalanced and cannot be applied to three-phase loads. For the faulty condition of the inverter, three novel fault-tolerant techniques are developed. The proposed fault-tolerant strategies generate balanced line voltages without bypassing any healthy and operative inverter element, make better use of the inverter capacity, and generate higher output voltages. These strategies exploit the advantages of the Selective Harmonic Elimination (SHE) and Space Vector Modulation (SVM) methods in conjunction with a slightly modified Fundamental Phase Shift Compensation (FPSC) technique to generate balanced voltages and manipulate voltage harmonics at the same time. The proposed strategies are applicable to several classes of multilevel inverters with three or more voltage levels.
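As context for the THD figure of merit mentioned above, total harmonic distortion of an output voltage is the ratio of the RMS value of the harmonic content to the fundamental component. A small illustrative computation follows; the amplitude values are assumed and do not come from the dissertation:

# Illustrative THD computation: sqrt(sum of squared harmonic amplitudes) / fundamental.
# The amplitudes below are made-up numbers, used only to show the formula.
import math

def thd(fundamental, harmonics):
    return math.sqrt(sum(v * v for v in harmonics)) / fundamental

print(thd(fundamental=100.0, harmonics=[12.0, 5.0, 3.0]))  # ~0.133, i.e. 13.3%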


XIAOLI LI

Constructivism Learning

When & Where:


246 Nichols Hall

Committee Members:

Luke Huan, Chair
Victor Frost
Bo Luo
Richard Wang
Alfred Ho*

Abstract

Aiming to achieve the learning capabilities possessed by intelligent beings, especially humans, researchers in the machine learning field have a long-standing tradition of borrowing ideas from human learning, such as reinforcement learning, active learning, and curriculum learning. Motivated by a philosophical theory called "constructivism", in this work we propose a new machine learning paradigm, constructivism learning. The constructivism theory has had a wide-ranging impact on various theories of how humans acquire knowledge. To adapt this human learning theory to the context of machine learning, we first studied how to improve learning performance by exploring inductive bias or prior knowledge from multiple learning tasks with multiple data sources, that is, multi-task multi-view learning, in both offline and lifelong settings. We then formalized a Bayesian nonparametric approach using sequential Dirichlet Process Mixture Models to support constructivism learning. To further exploit constructivism learning, we also developed a constructivism deep learning method utilizing Uniform Process Mixture Models.
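The sequential Dirichlet Process Mixture Model itself is not reproduced here, but its sequential cluster-assignment behavior can be illustrated with the Chinese Restaurant Process, in which each new observation joins an existing cluster with probability proportional to that cluster's size, or starts a new cluster with probability proportional to a concentration parameter alpha. A toy sketch (illustration only, not the dissertation's model):

# Toy Chinese Restaurant Process sketch showing sequential cluster growth, the
# sampling scheme underlying Dirichlet Process mixtures. Illustration only.
import random

def crp(n_points, alpha=1.0, seed=0):
    random.seed(seed)
    cluster_sizes = []                 # size of each existing cluster
    assignments = []
    for i in range(n_points):
        weights = cluster_sizes + [alpha]      # existing clusters + a new cluster
        r = random.uniform(0, i + alpha)       # weights sum to i + alpha
        cum, choice = 0.0, len(weights) - 1
        for k, w in enumerate(weights):
            cum += w
            if r <= cum:
                choice = k
                break
        if choice == len(cluster_sizes):
            cluster_sizes.append(1)            # open a new cluster
        else:
            cluster_sizes[choice] += 1
        assignments.append(choice)
    return assignments

print(crp(20))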


MOHANAD AL-IBADI

Array Processing Techniques for Ice-Sheet Bottom Tracking

When & Where:


317 Nichols Hall

Committee Members:

Shannon Blunt, Chair
John Paden
Eric Perrins
Jim Stiles
Huazhen Fang*

Abstract

In airborne multichannel radar sounder signal processing, the collected data are most naturally represented in a cylindrical coordinate system: along-track, range, and elevation angle. The data are generally processed in each of these dimensions sequentially to focus or resolve the data in the corresponding dimension so that a 3D image of the scene can be formed. Pulse compression is used to process the data along the range dimension, synthetic aperture radar (SAR) processing is used in the along-track dimension, and array-processing techniques are used for the elevation-angle dimension. After the first two steps, the 3D scene is resolved into toroids of constant along-track position and constant range that are centered on the flight path. The targets lying in a particular toroid then need to be resolved by estimating their respective elevation angles.
In the proposed work, we focus on the array-processing step, where several direction-of-arrival (DoA) estimation methods, such as MUltiple SIgnal Classification (MUSIC) and maximum-likelihood estimation (MLE), will be used to resolve the targets in the elevation-angle dimension. A tracker is then applied to the output of the DoA estimation to track the ice-bottom interface. We propose to use the tree-reweighted message passing algorithm or Kalman filtering, depending on the array-processing technique, to track the ice bottom. The outcome is a digital elevation model (DEM) of the ice bottom. While most published work assumes a narrowband model for the array, we will use a wideband model and focus on issues related to wideband arrays. Along these lines, we propose a theoretical study to evaluate the performance of the radar products based on the array characteristics using different array-processing techniques, such as wideband MLE and focusing-matrices methods. In addition, we will investigate tracking targets using a sparse array composed of three sub-arrays, each separated by a large multi-wavelength baseline. Specifically, we propose to develop and investigate the performance of a Kalman tracking solution to this wideband sparse-array problem when applied to data collected by the CReSIS radar sounder.
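As background for the DoA step, a hedged narrowband MUSIC sketch for a uniform linear array is given below. The proposal itself targets wideband arrays, so this shows only the basic building block; the array geometry, angle grid, and variable names are assumptions for illustration:

# Narrowband MUSIC pseudospectrum sketch for a uniform linear array.
# Illustrates the DoA building block only; the proposed work targets wideband arrays.
import numpy as np

def music_spectrum(snapshots, n_sources, d_over_lambda=0.5,
                   angles=np.linspace(-90, 90, 361)):
    # snapshots: (n_sensors, n_snapshots) complex array of received data.
    n_sensors = snapshots.shape[0]
    R = snapshots @ snapshots.conj().T / snapshots.shape[1]   # sample covariance
    eigvals, eigvecs = np.linalg.eigh(R)                      # ascending eigenvalues
    En = eigvecs[:, : n_sensors - n_sources]                  # noise subspace
    spectrum = []
    for theta in np.deg2rad(angles):
        a = np.exp(-2j * np.pi * d_over_lambda * np.arange(n_sensors) * np.sin(theta))
        spectrum.append(1.0 / np.abs(a.conj() @ En @ En.conj().T @ a))
    return angles, np.array(spectrum)

# usage: angles, P = music_spectrum(X, n_sources=2); peaks of P give DoA estimates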

 


QIAOZHI WANG

Towards the Understanding of Private Content -- Content-based Privacy Assessment and Protection in Social Networks

When & Where:


2001B Eaton Hall

Committee Members:

Bo Luo, Chair
Fengjun Li
Richard Wang
Heechul Yun
Prajna Dhar*

Abstract

In the 2016 presidential election, social networks showed their great power as a “modern form of communication”. With the increasing popularity of social networks, privacy concerns arise. For example, it has been shown that microblogs are revealed to audiences that are significantly larger than users' perceptions. Moreover, when users are emotional, they may post messages with sensitive content and later regret doing so.  As a result, users become very vulnerable – private or sensitive information may be accidentally disclosed, even in tweets about trivial daily activities.
Unfortunately, existing research projects on data privacy, such as the k-anonymity and differential privacy mechanisms, mostly focus on protecting individuals' identities from being discovered in large data sets. We argue that the key component of privacy protection in social networks is protecting sensitive content, i.e., privacy as the ability to control the dissemination of information. The overall objectives of the proposed research are: to understand the sensitive content of social network posts, to facilitate content-based protection of private information, and to identify different types of sensitive information. In particular, we propose a user-centered, quantitative measure of privacy based on textual content, and a customized privacy protection mechanism for social networks.
We consider private tweet identification and classification as dual problems. We propose to develop an algorithm to identify all types of private messages and, more importantly, to automatically score the sensitiveness of private messages. We first collect the opinions of a diverse group of users w.r.t. the sensitiveness of private information through Amazon Mechanical Turk, and analyze the discrepancies between users' privacy expectations and actual information disclosure. We then develop a computational method to generate a context-free privacy score, which is the "consensus" privacy score for average users. Meanwhile, classification of private tweets is necessary for customized privacy protection. We have made the first attempt to understand different types of private information and to automatically classify sensitive tweets into 13 pre-defined topic categories. In the proposed research, we will further include personal attitudes, topic preferences, and social context in the scoring mechanism to generate a personalized, context-aware privacy score, which will be utilized in a comprehensive privacy protection mechanism.
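The scoring and classification pipeline is part of the proposed research; as a hedged illustration of the classification sub-problem only, a baseline text classifier over labeled posts could look like the sketch below. The example posts, labels, and the choice of TF-IDF plus logistic regression are assumptions, not the dissertation's method:

# Hedged baseline sketch for classifying posts into topic categories.
# Data, labels, and model choice are assumptions, not the proposed method.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

posts = ["leaving for vacation tomorrow, house will be empty",
         "got my blood test results back today",
         "great weather for a run this morning"]
labels = ["location", "health", "not_private"]     # example topic categories

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=1)),
    ("model", LogisticRegression(max_iter=1000)),
])
clf.fit(posts, labels)
print(clf.predict(["doctor says I need surgery next week"]))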

 


STEVE HAENCHEN

A Model to Identify Insider Threats Using Growing Hierarchical Self-Organizing Map of Electronic Media Indicators

When & Where:


1 Eaton Hall

Committee Members:

Hossein Saiedian, Chair
Arvin Agah
Prasad Kulkarni
Bo Luo
Reza Barati

Abstract

Fraud from insiders costs an estimated $3.7 trillion annually. Current fraud prevention and detection methods, which include analyzing network logs, computer events, emails, and behavioral characteristics, have not been successful in reducing the losses. The proposed Occupational Fraud Prevention and Detection Model uses existing data from the field of digital forensics along with text clustering algorithms, machine learning, and a growing hierarchical self-organizing map model to predict insider threats based on computer usage behavioral characteristics.

The proposed research leverages results from information security, software engineering, data science and information retrieval, context searching, search patterns, and machine learning to build and employ a database server and workstations supporting 50+ terabytes of data representing entire hard drives from work computers. The forensic software packages FTK and EnCase are used to generate disk images and test extraction results. The primary research tools are built using modern programming languages. The research data are derived from disk images obtained from actual investigations in which fraud was asserted, and from other disk images in which fraud was not asserted.

The research methodology includes building a data extraction tool that is a disk level reader to store the disk, partition, and operating system data in a relational database. An analysis tool is also created to convert the data into information representing usage patterns including summarization, normalization, and redundancy removal. We build a normalizing tool that uses machine learning to adjust the baselines for company, department, and job deviations.  A prediction component is developed to derive insider threat scores reflecting the anomalies from the adjusted baseline. The resulting product will allow identification of the computer users most likely to commit fraud so investigators can focus their limited resources on the suspects.

Our primary plan to evaluate and validate our research results is via empirical study, statistical evaluation, and benchmarking with tests of precision and recall on a second set of disk images.
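The growing hierarchical self-organizing map used in the model is not reproduced here; the hedged sketch below illustrates only the underlying self-organizing map update rule, its basic building block. The random data and parameter values are assumptions:

# Minimal self-organizing map training sketch (the building block of a growing
# hierarchical SOM, not the full GHSOM model). Illustration only.
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((200, 4))       # stand-in for usage-pattern feature vectors
grid = rng.random((5, 5, 4))      # 5x5 map of weight vectors

def train(grid, data, epochs=20, lr=0.5, radius=2.0):
    rows, cols, _ = grid.shape
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    for _ in range(epochs):
        for x in data:
            dists = np.linalg.norm(grid - x, axis=2)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)  # best matching unit
            grid_dist = np.linalg.norm(coords - np.array(bmu), axis=2)
            h = np.exp(-(grid_dist ** 2) / (2 * radius ** 2))      # neighborhood function
            grid += lr * h[..., None] * (x - grid)                 # pull units toward x
        lr *= 0.9
        radius *= 0.9
    return grid

grid = train(grid, data)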


JAMIE ROBINSON

Code Cache Management in Managed Language VMs to Reduce Memory Consumption for Embedded Systems

When & Where:


129 Nichols Hall

Committee Members:

Prasad Kulkarni, Chair
Bo Luo
Heechul Yun


Abstract

The compiled native code generated by a just-in-time (JIT) compiler in managed language virtual machines (VMs) is placed in a region of memory called the code cache. Code cache management (CCM) in a VM is responsible for finding and evicting methods from the code cache to maintain execution correctness and manage program performance for a given code cache size or memory budget. Effective CCM can also boost program speed by enabling more aggressive JIT compilation, powerful optimizations, and improved hardware instruction cache and I-TLB performance.

Though important, CCM is an overlooked component in VMs. We find that the default CCM policies in Oracle's production-grade HotSpot VM perform poorly even at modest memory pressure. We develop a detailed simulation-based framework to model and evaluate the potential efficiency of many different CCM policies in a controlled and realistic, but VM-independent, environment. We make the encouraging discovery that effective CCM policies can sustain high program performance even for very small cache sizes.

Our simulation study provides the rationale and motivation to improve CCM strategies in existing VMs. We implement and study the properties of several CCM policies in HotSpot. We find that in spite of working within the bounds of the HotSpot VM’s current CCM sub-system, our best CCM policy implementation in HotSpot improves program performance over the default CCM algorithm by 39%, 41%, 55%, and 50% with code cache sizes that are 90%, 75%, 50%, and 25% of the desired cache size, on average.
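The CCM policies studied are internal to the modified HotSpot build, but the memory-pressure setup can be approximated with standard HotSpot flags. A hedged sketch of a harness that runs a benchmark at several code cache budgets follows; the jar name and the size values are placeholders, not the thesis configuration:

# Hedged harness sketch: run a Java benchmark under several code cache budgets
# using standard HotSpot flags. The jar name and sizes are placeholders.
import subprocess

for size in ["240m", "32m", "16m", "8m"]:
    cmd = [
        "java",
        f"-XX:ReservedCodeCacheSize={size}",   # cap the code cache
        "-XX:+UseCodeCacheFlushing",           # allow eviction under pressure
        "-XX:+PrintCodeCache",                 # report usage at VM exit
        "-jar", "benchmark.jar",
    ]
    print(f"--- code cache = {size} ---")
    subprocess.run(cmd, check=False)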


AIME DE BERNER

Application of Machine Learning Techniques to the Diagnosis of Vision Disorders

When & Where:


2001B Eaton Hall

Committee Members:

Arvin Agah, Chair
Nicole Beckage
Jerzy Grzymala-Busse


Abstract

In the age of data collection, numerous techniques have been developed over time to capture, manipulate, and process data in order to uncover the hidden correlations, relations, patterns, and mappings that one may not otherwise be able to see. With the help of improved algorithms, computers have proven able to provide Artificial Intelligence (AI) by applying models that predict outcomes within an acceptable margin of error. By applying performance metrics to data mining and machine learning models that predict human vision disorders, we are able to identify promising models. AI techniques used in this work include an improved version of C4.5 called C4.8, neural networks, k-nearest neighbors, random forests, support vector machines, and AdaBoost, among many others. The best predictive models were determined that could be applied to the diagnosis of vision disorders, focusing on strabismus and the need for patient referral to a specialist.
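The abstract compares several families of classifiers; the hedged sketch below shows how such a comparison might be set up with cross-validation. Synthetic data stands in for the clinical vision-disorder dataset, and the parameter choices are assumptions, not the project's settings:

# Hedged sketch of comparing several classifiers with cross-validation.
# Synthetic data stands in for the clinical vision-disorder dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
models = {
    "decision tree (C4.5-style)": DecisionTreeClassifier(),
    "k-nearest neighbors": KNeighborsClassifier(),
    "random forest": RandomForestClassifier(),
    "support vector machine": SVC(),
    "AdaBoost": AdaBoostClassifier(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")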