Defense Notices
All students and faculty are welcome to attend the final defense of EECS graduate students completing their M.S. or Ph.D. degrees. Defense notices for M.S./Ph.D. presentations for this year and several previous years are listed below in reverse chronological order.
Students who are nearing the completion of their M.S./Ph.D. research should schedule their final defenses through the EECS graduate office at least THREE WEEKS PRIOR to their presentation date so that there is time to complete the degree requirements check, and post the presentation announcement online.
Upcoming Defense Notices
David Felton
Optimization and Evaluation of Physical Complementary Radar Waveforms
When & Where:
Nichols Hall, Room 129 (Apollo Auditorium)
Committee Members:
Shannon Blunt, Chair
Rachel Jarvis
Patrick McCormick
James Stiles
Zsolt Talata
Abstract
The RF spectrum is a precious, finite resource with ever-increasing demand. Consequently, the mandate to be a "good spectral neighbor" is in direct conflict with the requirements for high-performance sensing where correlation error is fundamentally limited. As such, matched-filter radar performance is often sidelobe-limited with estimation error being constrained by the time-bandwidth (TB) of the collective emission. The methods developed here seek to bridge this gap between idealized radar performance and practical utility via waveform design.
Estimation error becomes more complex when pulse agility is employed, since range-sidelobe modulation (RSM) then spreads energy across Doppler, rendering traditional mitigation methods ineffective. To address this, the gradient-based complementary-FM framework was developed to produce complementary sidelobe cancellation (CSC) after coherently combining subsets within a pulse-agile emission. In contrast to the majority of complementary signals, which have been explored via phase coding, these Comp-FM waveform subsets achieve CSC while preserving hardware compatibility since they are FM (though design distortion is never completely avoided). Although Comp-FM addressed practicality via hardware amenability, CSC was localized to zero Doppler. This work expands the Comp-FM notion to a Doppler-generalized (DG) framework, extending the cancellation condition to an arbitrary Doppler span. The same framework can likewise be employed to jointly optimize an entire coherent processing interval (CPI) to minimize RSM within the radar point-spread function (PSF), thereby generalizing the notion of complementarity and introducing the potential for cognitive operation if sufficient scattering knowledge is available a priori.
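The cancellation principle behind CSC can be illustrated with a classical phase-coded Golay complementary pair; this is a textbook sketch only, since the Comp-FM subsets discussed here realize the analogous condition with FM waveforms:

```python
def autocorr(x):
    """Aperiodic autocorrelation of a real sequence at lags -(n-1)..(n-1)."""
    n = len(x)
    return [sum(x[i] * x[i + abs(k)] for i in range(n - abs(k)))
            for k in range(-(n - 1), n)]

# Length-8 Golay complementary pair (standard construction).
a = [1, 1, 1, -1, 1, 1, -1, 1]
b = [1, 1, 1, -1, -1, -1, 1, -1]

# Each sequence alone has nonzero range sidelobes, but coherently
# combining the pair cancels every sidelobe, leaving only the mainlobe.
combined = [ra + rb for ra, rb in zip(autocorr(a), autocorr(b))]
print(combined)  # zeros everywhere except a mainlobe of 16 at zero lag
```

The Doppler-generalized framework extends exactly this zero-lag, zero-Doppler condition to an arbitrary Doppler span.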
Sensing with a single emitter is limited by self-inflicted error alone (e.g., clutter, sidelobes), while MIMO systems must additionally contend with the cross-responses of emitters operating concurrently (e.g., simultaneously, spatially proximate, in a shared spectrum), further degrading radar sensitivity. Total correlation error is then dictated by the overlapping TB (i.e., how coincident the signals are) and the number of operating emitters, compounding estimation difficulty if left unaddressed. As such, the determination of "orthogonal waveforms" comprises a large portion of the MIMO literature, though the term remains a phenomenological misnomer for pulsed emissions. Here, the notion of complementary-FM is applied to a multi-emitter context in which transmitter-amenable quasi-orthogonal subsets, occupying the same spectral band, are produced via a similar gradient-based approach. To further improve the practicality of these MIMO-Comp-FM waveform subsets, the same "DG" approach described above, which addresses the otherwise-default Doppler-induced degradation of complementary signals, is applied. The resulting Doppler-independent separability and complementarity greatly improve estimation sensitivity for multi-emitter systems.
This MIMO-Comp-FM framework is developed for standard matched filter processing. Coupling this framework with a "DG" form of the previously explored MIMO-MiCRFt is also investigated, illustrating the added benefit of pairing optimized subsets with similarly calibrated processing.
Each of these methods is developed to address unique and increasingly complex sources of estimation error. All approaches are initially developed and evaluated via simulated analysis where ground-truth is known. Then, despite hardware-induced distortion being unavoidable, the MIMO-Comp-FM framework is confirmed via loopback measurements to preserve the majority of CSC that was observed in simulation. Finally, open-air demonstration of each approach validates practical utility on a radar system.
Hao Xuan
Toward an Integrated Computational Framework for Metagenomics: From Sequence Alignment to Automated Knowledge Discovery
When & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Cuncong Zhong, Chair
Fengjun Li
Suzanne Shontz
Hongyang Sun
Liang Xu
Abstract
Metagenomic sequencing has become a central paradigm for studying complex microbial communities and their interactions with the host, with emerging applications in clinical prediction and disease modeling. In this work, we first investigate two representative application scenarios: predicting immune checkpoint inhibitor response in non-small cell lung cancer using gut microbial signatures, and characterizing host–microbiome interactions in neonatal systems. For the cancer-response task, the proposed reference-free neural network captures both compositional and functional signals without reliance on reference genomes, while the neonatal study demonstrates how environmental and genetic factors reshape microbial communities and how probiotic intervention can mitigate pathogen-induced immune activation.
These studies highlight both the promise and the inherent difficulty of metagenomic analysis: transforming raw sequencing data into clinically actionable insights remains an algorithmically fragmented and computationally intensive process. This challenge arises from two key limitations: the lack of a unified algorithmic foundation for sequence alignment and the absence of systematic approaches for selecting and organizing analytical tools. Motivated by these challenges, we present a unified computational framework for metagenomic analysis that integrates complementary algorithmic and systems-level solutions.
First, to resolve fragmentation at the alignment level, we develop the Versatile Alignment Toolkit (VAT), a unified algorithmic system for biological sequence alignment across diverse applications. VAT introduces an asymmetric multi-view k-mer indexing scheme that integrates multiple seeding strategies within a single architecture and enables dynamic seed-length adjustment via longest common prefix (LCP)–based inference without re-indexing. A flexible seed-chaining mechanism further supports diverse alignment scenarios, including collinear, rearranged, and split alignments. Combined with a hardware-efficient in-register bitonic sorting algorithm and dynamic index-loading strategy, VAT achieves high efficiency and broad applicability across read mapping, homology search, and whole-genome alignment. Second, to address the challenge of tool selection and pipeline construction, we develop SNAIL, a natural language processing system for automated recognition of bioinformatics tools from large-scale and rapidly growing scientific literature. By integrating XGBoost and Transformer-based models such as SciBERT, SNAIL enables structured extraction of analytical tools and supports automated, reproducible pipeline construction.
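The seeding stage that VAT generalizes can be sketched minimally in Python; this is a single fixed-k view over toy sequences, whereas VAT's asymmetric multi-view index, LCP-based seed-length adjustment, and chaining are substantially more involved:

```python
from collections import defaultdict

def build_kmer_index(reference, k):
    """Map every k-mer in the reference to its list of start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index

def seed_hits(index, query, k):
    """Collect (query_pos, ref_pos) exact seed matches for later chaining."""
    return [(j, i)
            for j in range(len(query) - k + 1)
            for i in index.get(query[j:j + k], [])]

ref = "ACGTACGTGACCT"
idx = build_kmer_index(ref, 4)
hits = seed_hits(idx, "TACGTG", 4)
print(hits)  # includes the collinear chain (0, 3), (1, 4), (2, 5)
```

A chaining stage would then retain the collinear run of hits and discard the spurious match, which is where VAT's flexible seed-chaining mechanism takes over.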
Together, this work establishes a unified framework that is grounded in real-world applications and addresses key bottlenecks in metagenomic analysis, enabling more efficient, scalable, and clinically actionable workflows.
Pramil Paudel
Learning Without Seeing: Privacy-Preserving and Adversarial Perspectives in Lensless Imaging
When & Where:
Eaton Hall, Room 2001B
Committee Members:
Fengjun Li, Chair
Alex Bardas
Bo Luo
Cuncong Zhong
Haiyang Chao
Abstract
Conventional computer vision relies on spatially resolved, human-interpretable images, which inherently expose sensitive information and raise privacy concerns. In this study, we explore an alternative paradigm based on lensless imaging, where scenes are captured as diffraction patterns governed by the point spread function (PSF). Although unintelligible to humans, these measurements encode structured, distributed information that remains useful for computational inference.
We propose a unified framework for privacy-preserving vision that operates directly on lensless sensor measurements by leveraging their frequency-domain and phase-encoded properties. The framework is developed along two complementary directions. First, we enable reconstruction-free inference by exploiting the intrinsic obfuscation of lensless data. We show that semantic tasks such as classification can be performed directly on diffraction patterns using models tailored to non-local, phase-scrambled representations. We further design lensless-aware architectures and integrate them into practical pipelines, including a Swin Transformer-based steganographic framework (DiffHide) for secure and imperceptible information embedding. To assess robustness, we formalize adversarial threat models and develop defenses against learning-based reconstruction attacks, particularly GAN-driven inversion. Second, we investigate the limits of privacy by studying the reconstructability of lensless measurements without explicit knowledge of the forward model. We develop learning-based reconstruction methods that approximate the inverse mapping and analyze conditions under which sensitive information can be recovered. Our results demonstrate that lensless measurements enable effective vision tasks without reconstruction, while providing a principled framework to evaluate and mitigate privacy risks.
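To first order, a lensless sensor measurement is the scene convolved with the PSF; a minimal 1-D sketch with toy values (real systems are 2-D and include sensor cropping and noise):

```python
def convolve(scene, psf):
    """Full linear convolution: each scene point contributes a shifted,
    scaled copy of the PSF, and the copies sum on the sensor."""
    out = [0.0] * (len(scene) + len(psf) - 1)
    for i, s in enumerate(scene):
        for j, h in enumerate(psf):
            out[i + j] += s * h
    return out

scene = [0, 1, 0, 0, 2, 0]          # two point sources
psf = [0.2, 0.5, 0.2]               # multiplexing point spread function
measurement = convolve(scene, psf)  # spatially mixed, not human-readable
print(measurement)
```

Reconstruction-free inference operates directly on such measurements, while the reconstruction attacks studied here attempt to invert this mapping without explicit knowledge of the PSF.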
Sharmila Raisa
Digital Coherent Optical System: Investigation and Monitoring
When & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Rongqing Hui, Chair
Morteza Hashemi
Erik Perrins
Alessandro Salandrino
Jie Han
Abstract
Coherent wavelength-division multiplexed (WDM) optical fiber systems have become the primary transmission technology for high-capacity data networks, driven by the explosive bandwidth demand of cloud computing, streaming services, and large-scale artificial intelligence training infrastructure. This dissertation investigates two fundamental aspects of digital coherent fiber optic systems under the unifying theme of source and monitoring: the design of multi-wavelength optical sources compatible with high-order coherent detection, and the leveraging of fiber Kerr-effect nonlinearity at the coherent receiver to perform physical-layer link health monitoring and to assess inherent security vulnerabilities — both achieved through digital signal processing of the received complex optical field without dedicated hardware.
We begin by addressing the multi-wavelength transmitter challenge in WDM coherent systems. Existing quantum-dot, quantum-dash, and quantum-well based optical frequency comb (OFC) sources share a common limitation: individual comb line linewidths in the tens of MHz range caused by low output power levels of 1–20 mW, making them incompatible with high-order coherent detection. We demonstrate coherent system application of a single-section InGaAsP QW Fabry-Perot laser diode with greater than 120 mW optical power at the fiber pigtail and 36.14 GHz mode spacing. The high optical power per mode produces Lorentzian equivalent linewidths below 100 kHz — compatible with 16-QAM carrier phase recovery without optical phase locking. Experimental results obtained using a commercial Ciena WaveLogic-Ai coherent transceiver demonstrate 20-channel WDM transmission over 78.3 km of standard single-mode fiber with all channels below the HD-FEC threshold of 3.8 × 10⁻³ at 30 GBaud differential-coded 16-QAM, corresponding to an aggregate capacity of 2.15 Tb/s from a single laser device.
After investigating the QW Fabry-Perot laser as a multi-wavelength source for coherent WDM transmission, we leverage the coherent receiver DSP to exploit fiber Kerr-effect nonlinearity for longitudinal power profile estimation, enabling reconstruction of the signal power distribution P(z) along the full multi-span link without dedicated hardware or traffic interruption. We propose a modified enhanced regular perturbation (ERP) method that corrects two independent physical error sources of the standard RP1 least-squares baseline: the accumulated nonlinear phase rotation, and the dispersion-mediated phase-to-intensity conversion — a second bias source not addressed by prior methods. The RP1 method produces mean absolute error (MAE) that scales quadratically with span count, growing to 1.656 dB at 10 spans and 3 dBm. The modified ERP reduces this to 0.608 dB — an improvement that grows consistently with link length, confirming increasing advantage in the long-haul regime. Extension to WDM through an XPM-aware per-channel formulation achieves MAE of 0.113–0.419 dB across 150–500 km link lengths.
In addition to its role in enabling DSP-based longitudinal power profile estimation, the fiber Kerr-effect nonlinearity is shown to give rise to an inherent physical-layer security vulnerability in coherent WDM systems. We show that an eavesdropper co-tenanting a shared fiber — transmitting a continuous-wave probe at a wavelength adjacent to the legitimate signal — can capture the XPM-induced waveform at the fiber output and apply a bidirectional gated recurrent unit neural network, trained on split-step Fourier method simulation data, to reconstruct the transmitted symbol sequence without physical fiber access and without perturbing the legitimate signal. This eavesdropping mechanism is experimentally validated using a commercial Ciena WaveLogic-Ai coherent transceiver for ASK, BPSK, QPSK, and 16-QAM modulation formats at 4.26 GBaud and 8.56 GBaud over one- and two-span 75 km fiber systems, achieving zero symbol errors under high-OSNR conditions. Noise-aware training over OSNR from 20 to 60 dB maintains symbol error rate below 10⁻² for OSNR above 25–30 dB.
Together, these three contributions demonstrate that the coherent fiber optic system is a versatile physical instrument extending well beyond its role as a data transmission medium. The coherent receiver infrastructure — deployed for high-order modulation and data recovery — simultaneously enables the high-power OFC laser to serve as a practical multi-wavelength transmitter source, and provides the complex field measurement capability through which fiber Kerr-effect nonlinearity can be exploited constructively for distributed link monitoring and, as a direct consequence, reveals an inherent physical-layer security exposure in shared fiber infrastructure. This unified perspective on the coherent system as both a transmission platform and a general-purpose measurement instrument has direct relevance to the design of spectrally efficient, self-monitoring, and physically secure optical interconnects for next-generation AI computing networks.
Arman Ghasemi
Task-Oriented Data Communication and Compression for Timely Forecasting and Control in Smart Grids
When & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Morteza Hashemi, Chair
Alexandru Bardas
Prasad Kulkarni
Taejoon Kim
Zsolt Talata
Abstract
Advances in sensing, communication, and intelligent control have transformed power systems into data-driven smart grids, where forecasting and intelligent decision-making are essential components. Modern smart grids include distributed energy resources (DERs), renewable generation, battery energy storage systems, and large numbers of grid-edge devices that continuously generate time-series data. At the same time, increasing renewable penetration introduces substantial uncertainty in generation, net load, and market operations, while communication networks impose bandwidth, latency, and reliability constraints on timely data delivery. This dissertation addresses how time-series forecasting, data compression, and task-oriented wireless communication can be jointly designed for smart grid applications.
First, we study weather-aware distributed energy management in prosumer-centric microgrids and show that incorporating day-ahead weather information into decision-making improves battery dispatch and reduces the impact of renewable uncertainty. Second, we introduce forecasting-aware energy management in both wholesale and retail electricity markets, highlighting how renewable generation forecasting affects pricing, scheduling, and uncertainty mitigation. Third, we develop and evaluate deep learning methods for renewable generation forecasting, showing that Transformer-based models outperform recurrent baselines such as RNN and LSTM for wind and solar prediction tasks.
Building on this forecasting foundation, we develop a communication-efficient forecasting framework in which high-dimensional smart grid measurements are compressed into low-dimensional latent representations before transmission. This framework is extended into a task-oriented communication system that jointly optimizes data relevance and information timeliness, so that the receiver obtains compressed updates that remain useful for downstream forecasting tasks. Finally, we extend this framework to a distributed multi-node uplink setting, where multiple grid sensors share a bandwidth-limited channel, and develop a scheduling policy that improves both the timeliness and task-relevance of received updates.
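The timeliness side of such a scheduling policy can be sketched with a simple weighted age-of-information rule; this is an illustrative stand-in with hypothetical weights, since the actual policy also accounts for the task relevance of each update:

```python
def schedule(num_sensors, slots, weight):
    """Each slot, serve the sensor whose weighted age-of-information (AoI)
    is largest; the served sensor's age resets to 1, the rest grow by 1."""
    age = [1] * num_sensors
    served = []
    for _ in range(slots):
        pick = max(range(num_sensors), key=lambda i: weight[i] * age[i])
        served.append(pick)
        age = [1 if i == pick else age[i] + 1 for i in range(num_sensors)]
    return served

# Hypothetical weights: sensor 0's updates matter most to the forecast.
served = schedule(3, 6, weight=[2.0, 1.0, 1.0])
print(served)
```

The higher-weighted sensor is served more often, trading some staleness at the other sensors for fresher data where the downstream forecast is most sensitive.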
Pardaz Banu Mohammad
Towards Early Detection of Alzheimer’s Disease based on Speech using Reinforcement Learning Feature Selection
When & Where:
Eaton Hall, Room 2001B
Committee Members:
Arvin Agah, Chair
David Johnson
Sumaiya Shomaji
Dongjie Wang
Sara Wilson
Abstract
Alzheimer’s Disease (AD) is a progressive, irreversible neurodegenerative disorder and the leading cause of dementia worldwide, affecting an estimated 55 million people globally. The window of opportunity for intervention is demonstrably narrow, making reliable early-stage detection a clinical and scientific imperative. While current diagnostic techniques such as neuroimaging and cerebrospinal fluid (CSF) biomarkers carry well-defined limitations in scalability, cost, and access equity, speech has emerged as a compelling non-invasive proxy for cognitive function evaluation.
This work presents a novel approach that casts acoustic feature selection as a sequential decision-making problem and implements it using deep reinforcement learning. Specifically, we use a Deep Q-Network (DQN) agent to navigate a high-dimensional space of over 6,000 acoustic features extracted using the openSMILE toolkit, dynamically constructing maximally discriminative and non-redundant feature subsets. To capture the latent structural dependencies among acoustic features, which filter and wrapper methods have difficulty modeling, we introduce a Graph Convolutional Network (GCN) based, correlation-aware feature representation layer that operates as an auxiliary input to the DQN state encoder. Post-selection interpretability is reinforced through TF-IDF weighting and K-means clustering, which together yield both feature-level and cluster-level explanations that are clinically actionable. The framework is evaluated across five classifiers, namely support vector machines (SVM), logistic regression, XGBoost, random forest, and a feedforward neural network. We use 10-fold stratified cross-validation on established benchmark datasets, including the DementiaBank Pitt Corpus, Ivanova, and ADReSS challenge data. The proposed approach is benchmarked against state-of-the-art feature selection methods such as LASSO, recursive feature selection, and mutual information selectors. This research contributes three primary intellectual advances: (1) a graph-augmented state representation that encodes inter-feature relational structure within a reinforcement learning agent, (2) a clinically interpretable pipeline that bridges the gap between algorithmic performance and translational utility, and (3) a multilingual data approach for the reinforcement learning agent framework. This study has direct implications for equitable, low-cost, and scalable AD screening in both clinical and community settings.
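The sequential-selection idea can be sketched with a simplified epsilon-greedy loop over hypothetical relevance and redundancy scores; this is a hand-crafted stand-in for the DQN agent, which instead learns its value function from classifier feedback:

```python
import random

def select_features(scores, redundancy, budget, seed=0):
    """Sequentially add features, rewarding relevance minus the worst
    redundancy with features already chosen (epsilon-greedy policy)."""
    rng = random.Random(seed)
    chosen, remaining = [], list(scores)
    for _ in range(budget):
        def reward(f):
            return scores[f] - max((redundancy[f][c] for c in chosen),
                                   default=0.0)
        if rng.random() < 0.1:          # explore occasionally
            pick = rng.choice(remaining)
        else:                           # otherwise exploit
            pick = max(remaining, key=reward)
        chosen.append(pick)
        remaining.remove(pick)
    return chosen

# Hypothetical relevance scores and pairwise redundancy values.
scores = {"f0": 0.9, "f1": 0.85, "f2": 0.3}
redundancy = {"f0": {"f1": 0.8, "f2": 0.0},
              "f1": {"f0": 0.8, "f2": 0.0},
              "f2": {"f0": 0.0, "f1": 0.0}}
chosen = select_features(scores, redundancy, budget=2)
print(chosen)  # f1 is skipped: strong on its own, but redundant with f0
```

The redundancy penalty plays the role that the GCN-based correlation-aware representation plays for the DQN: it lets the selector see inter-feature structure, not just marginal relevance.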
Zhou Ni
Bridging Federated Learning and Wireless Networks: From Adaptive Learning to FL-driven System Optimization
When & Where:
Nichols Hall, Room 246 (Executive Conference Room)
Committee Members:
Morteza Hashemi, Chair
Fengjun Li
Van Ly Nguyen
Han Wang
Shawn Keshmiri
Abstract
Federated learning (FL) has emerged as a promising distributed machine learning framework that enables multiple devices to collaboratively train models without sharing raw data, thereby preserving privacy and reducing the need for centralized data collection. However, deploying FL in practical wireless environments introduces two major challenges. First, the data generated across distributed devices are often heterogeneous and non-IID, which makes a single global model insufficient for many users. Second, learning performance in wireless systems is strongly affected by communication constraints such as interference, unreliable channels, and dynamic resource availability. This PhD research aims to address these challenges by bridging FL methods and wireless networks.
In the first thrust, we develop personalized and adaptive FL methods given the underlying wireless link conditions. To this end, we propose channel-aware neighbor selection and similarity-aware aggregation in wireless device-to-device (D2D) learning environments. We further investigate the impacts of partial model update reception on FL performance. The overarching goal of the first thrust is to enhance FL performance under wireless constraints.
Next, we investigate the opposite direction and raise the question: how can FL-based distributed optimization be used for the design of next-generation wireless systems? To this end, we investigate communication-aware participation optimization in vehicular networks, where wireless resource allocation affects the number of clients that can successfully contribute to FL. We further extend this direction to integrated sensing and communication (ISAC) systems, where personalized FL (PFL) is used to support distributed beamforming optimization with joint sensing and communication objectives.
Overall, this research establishes a unified framework for bridging FL and wireless networks. As a future direction, this work will be extended to more realistic ISAC settings with dynamic spectrum access, where communication, sensing, scheduling, and learning performance must be considered jointly.
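The baseline aggregation step that channel-aware and similarity-aware variants refine is standard federated averaging (FedAvg), sketched here with a toy two-parameter model:

```python
def fedavg(client_weights, client_sizes):
    """FedAvg aggregation: average client parameters, weighted by each
    client's local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
            for j in range(dim)]

# Three clients with a toy 2-parameter model and unequal data volumes.
clients = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sizes = [100, 100, 200]
global_model = fedavg(clients, sizes)
print(global_model)  # [0.75, 0.75]
```

Channel-aware and similarity-aware schemes replace the fixed size-based weights with weights that reflect link quality or model similarity, and personalized FL keeps part of each model local instead of averaging everything.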
Past Defense Notices
Lumumba Harnett
Reduced Dimension Optimal and Adaptive Mismatch Processing for Interference Cancellation
When & Where:
246 Nichols Hall
Committee Members:
Shannon Blunt, Chair
Christopher Allen
Erik Perrins
James Stiles
Richard Hale
Abstract
Interference has long been of interest in radar because of its ability to degrade performance. A commercial radar can experience radio frequency (RF) interference from a different RF service (such as radio broadcasting, television broadcasting, communications, satellites, etc.) operating simultaneously in the same spectrum. The RF spectrum is a finite asset that is regulated to mitigate interference and maximize resources. Recently, spectrum sharing has been proposed to accommodate the growing commercial demand of communication systems. Airborne radars performing ground moving target indication (GMTI) encounter interference from clutter scattering that may mask slow-moving, low-power targets. Recent advancements in least-squares (LS) optimal and re-iterative minimum mean-square error (RMMSE) adaptive mismatch processing are proposed for GMTI and shared-spectrum operation. Each estimation technique reduces sidelobes while incurring less signal-to-noise loss and less resolution degradation than windowing. For GMTI, LS and RMMSE filters are considered alongside angle-Doppler filters and pre-existing interference cancellation techniques for better detection performance. Application-specific reduced-rank versions of the algorithms are also introduced for real-time operation. RMMSE is further considered to separate radar and mobile communication systems operating in the same RF band to mitigate interference and information loss.
April Wade
Exploring Properties, Impact, and Deployment Mechanisms of Profile-Guided Optimizations in Static and Dynamic Compilers
When & Where:
2001 B Eaton Hall
Committee Members:
Prasad Kulkarni, Chair
Perry Alexander
Garrett Morris
Heechul Yun
Kyle Camarda
Abstract
Managed language virtual machines (VMs) rely on dynamic or just-in-time (JIT) compilation to generate optimized native code at run-time to deliver high execution performance. Many VMs and JIT compilers collect profile data at run-time to enable profile-guided optimizations (PGOs) that customize the generated native code to different program inputs. PGOs are generally considered integral for VMs to produce high-quality and performant native code. Likewise, many static, ahead-of-time (AOT) compilers employ PGOs to achieve peak performance, though they are less commonly employed in practice.
We propose a study that analyzes and quantifies the performance benefits of PGOs in both AOT and JIT environments, seeks to understand the importance of profiling data quantity and quality/accuracy in effectively guiding PGOs, and assesses the impact of individual PGOs on performance. Additionally, we propose an extension of the PGOs found in AOT compilers based on specialization, and seek to perform a feasibility study to determine its viability.
Luyao Shang
Memory Based LT Encoders for Delay Sensitive Communications
When & Where:
246 Nichols Hall
Committee Members:
Erik Perrins, Chair
Shannon Blunt
Taejoon Kim
David Petr
Tyrone Duncan
Abstract
As the upcoming fifth-generation (5G) and future wireless networks are envisioned for areas such as augmented and virtual reality, industrial control, automated driving or flying, and robotics, the requirement of supporting ultra-reliable low-latency communications (URLLC) is more urgent than ever. From the channel coding perspective, URLLC requires that codewords be transmitted with finite block lengths. In this regard, we propose novel encoding algorithms and analyze their performance behavior for finite-length Luby transform (LT) codes.
Luby transform (LT) codes, the first practical realization and the fundamental core of fountain codes, play a key role in the fountain codes family. Recently, researchers have shown that the performance of LT codes for finite block lengths can be improved by adding memory into the encoder. However, that work utilizes only one memory, leaving open whether, and how, more memories can be exploited. To explore this open question, we propose an entire family of memory-based LT encoders and analyze their performance thoroughly over binary erasure channels and AWGN channels.
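The memoryless LT encoding step that memory-based encoders extend can be sketched as follows; the degree distribution here is a toy example, whereas practical systems draw from the robust soliton distribution:

```python
import random

def lt_encode_symbol(source, degree_dist, rng):
    """Emit one LT-coded symbol: draw a degree d from the distribution,
    pick d distinct source symbols uniformly, and XOR them together."""
    degrees, probs = zip(*degree_dist)
    d = rng.choices(degrees, weights=probs)[0]
    neighbors = rng.sample(range(len(source)), d)
    value = 0
    for i in neighbors:
        value ^= source[i]
    return value, neighbors

rng = random.Random(42)
source = [0b1010, 0b0111, 0b1100, 0b0001]   # k = 4 source symbols
dist = [(1, 0.25), (2, 0.5), (3, 0.25)]     # toy degree distribution
value, neighbors = lt_encode_symbol(source, dist, rng)
```

A memory-based encoder conditions the degree and neighbor choices on previously emitted symbols rather than drawing them independently, which is the degree of freedom this work exploits.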
Pushkar Singh Negi
A comparison of global and saturated probabilistic approximations using characteristic sets in mining incomplete data
When & Where:
2001 B Eaton Hall
Committee Members:
Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Cuncong Zhong
Abstract
Data mining is an important part of the knowledge discovery process. It helps in finding patterns across large data sets and establishing relationships through data analysis to solve problems.
Input data sets are often incomplete, i.e., some attribute values are missing. The rough set theory offers mathematical tools to discover patterns hidden in inconsistent and incomplete data. Rough set theory handles inconsistent data by introducing probabilistic approximations. These approximations are combined with an additional parameter (or threshold) called alpha.
The main objective of this project is to compare global and saturated probabilistic approximations using characteristic sets in mining incomplete data. Eight different data sets with 35% missing values were used for experiments. Two different interpretations of missing values were used, namely, lost values and "do not care" conditions. For rule induction, we implemented the single local probabilistic approximation version of MLEM2. We implemented a rule checker system to verify the accuracy of the generated rule sets by computing the error rate. Along with the rule checker system, k-fold cross-validation with k = 10 was implemented to validate the generated rule sets. Finally, a statistical analysis of all approaches was performed using the Friedman test.
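The validation procedure rests on a plain k-fold index split with k = 10, which can be sketched as follows (a minimal illustration of the index partitioning only):

```python
def kfold_indices(n, k):
    """Partition indices 0..n-1 into k folds; each fold serves once as the
    test set while the remaining folds form the training set."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for t in range(k):
        test = folds[t]
        train = [i for f in range(k) if f != t for i in folds[f]]
        yield train, test

splits = list(kfold_indices(20, 10))  # 10 train/test splits of 20 cases
```

Rules are induced on each training partition and checked on the held-out fold, so every case contributes exactly once to the overall error rate.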
Shashank Sambamoorthy
Security Analysis of Android Applications with OWASP Top 10
When & Where:
1A Eaton, Dean's conference room
Committee Members:
Jerzy Grzymala-Busse, Chair
Drew Davidson
Cuncong Zhong
Abstract
Mobile application security concerns safeguarding mobile apps from threats such as malware, password cracking, social engineering, and other attacks. Application security is crucial for every enterprise, as a business can grow only with the guarantee that its apps are secure from potential threats. The Open Web Application Security Project (OWASP) has compiled a list of the top 10 mobile risks and has formulated a set of guidelines for app development and testing. The objective of my project is to analyze the security risks of Android applications using the guidelines from the OWASP top 10. With the help of suitable tools, analysis is done to identify the vulnerabilities and threats in Android applications, on API 4.4.1. Numerous tools have been used as a part of this endeavor, all of them open source and freely available. As a part of this project, I have also attempted to demonstrate each of the top 10 risks using individual Android applications. A detailed analysis was performed on each of the top 10 mobile risks, and suitable countermeasures for mitigation were provided. A detailed survey of 100 popular applications from the Google Play store was also performed, and the risks were categorized into low, medium, and high impact, depending on the level of threat.
Shadi Pir Hosseinloo
Using deep learning methods for supervised speech enhancement in noisy and reverberant environments
When & Where:
246 Nichols Hall
Committee Members:
Shannon Blunt, Chair
Jonathan Brumberg
Erik Perrins
Sara Wilson
John Hansen
Abstract
In real-world environments, the speech signals received by our ears are usually a combination of different sounds that include not only the target speech, but also acoustic interference like music, background noise, and competing speakers. This interference has a negative effect on speech perception and degrades the performance of speech processing applications such as automatic speech recognition (ASR), speaker identification, and hearing aid devices. One way to solve this problem is using source separation algorithms to separate the desired speech from the interfering sounds. Many source separation algorithms have been proposed to improve the performance of ASR systems and hearing aid devices, but it is still challenging for these systems to work efficiently in noisy and reverberant environments. On the other hand, humans have a remarkable ability to separate desired sounds and listen to a specific talker among noise and other talkers. Inspired by the capabilities of the human auditory system, a popular method known as auditory scene analysis (ASA) was proposed to separate different sources in a two-stage process of segmentation and grouping. The main goal of source separation in ASA is to estimate time-frequency masks that optimally match and separate noise signals from a mixture of speech and noise. In this work, multiple algorithms are proposed to improve upon source separation in noisy and reverberant acoustic environments. First, a simple and novel algorithm is proposed to increase the discriminability between two sound sources by scaling (magnifying) the head-related transfer function of the interfering source. Experimental results from applications of this algorithm show a significant increase in the quality of the recovered target speech. Second, a time-frequency masking-based source separation algorithm is proposed that can separate a male speaker from a female speaker in reverberant conditions by using the spatial cues of the source signals.
Furthermore, the proposed algorithm has the ability to preserve the location of the sources after separation. Three major aims are proposed for supervised speech separation based on deep neural networks to estimate either the time-frequency masks or the clean speech spectrum. First, a novel monaural acoustic feature set based on a gammatone filterbank is presented to be used as the input of the deep neural network (DNN) based speech separation model, which shows significant improvement in objective speech intelligibility and speech quality under different testing conditions. Second, a complementary binaural feature set is proposed to increase the ability of source separation in adverse environments with non-stationary background noise and high reverberation using two-channel recordings. Experimental results show that combining spatial features with this complementary feature set significantly improves speech intelligibility and speech quality in noisy and reverberant conditions. Third, a novel dilated convolutional neural network is proposed to improve the generalization of the monaural supervised speech enhancement model to untrained speakers, unseen noises, and simulated rooms. This model significantly increases the intelligibility and quality of the recovered speech while being computationally more efficient and requiring less memory than other models. In addition, the proposed model is modified with recurrent layers and dilated causal convolution layers for real-time processing. This model is causal, which makes it suitable for implementation in hearing aid devices and ASR systems, while having fewer trainable parameters and using only information from previous time frames in output prediction.
The main goal of the proposed algorithms is to increase the intelligibility and quality of the speech recovered from noisy and reverberant environments, which has the potential to improve both speech processing applications and signal processing strategies for hearing aid and cochlear implant technology.
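The time-frequency masking idea underlying much of the work above can be illustrated with a minimal sketch. This toy example (not the dissertation's actual models; shapes and signals are made up) computes an ideal ratio mask from known speech and noise magnitude spectrograms and applies it to a mixture:

```python
import numpy as np

def ideal_ratio_mask(speech_mag, noise_mag, eps=1e-8):
    """Fraction of energy attributed to speech in each time-frequency unit."""
    return speech_mag**2 / (speech_mag**2 + noise_mag**2 + eps)

rng = np.random.default_rng(0)
speech = np.abs(rng.normal(size=(257, 100)))  # |STFT| of clean speech (freq x frames)
noise = np.abs(rng.normal(size=(257, 100)))   # |STFT| of interference
mix = speech + noise                          # crude magnitude-domain mixture

mask = ideal_ratio_mask(speech, noise)        # values in [0, 1]
estimate = mask * mix                         # masked mixture approximates speech
```

In supervised separation, a DNN is trained to predict such a mask from noisy features alone; at test time the predicted mask is applied to the mixture spectrogram before resynthesis.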
Mustafa AL-QADI
Spectral Properties of Phase Noises and the Impact on the Performance of Optical Interconnects
When & Where:
246 Nichols Hall
Committee Members:
Ron Hui, Chair
Christopher Allen
Victor Frost
Erik Perrins
Jie Han
Abstract
The unending growth of data traffic resulting from the continuing emergence of Internet applications with high data-rate demands sets huge capacity requirements on optical interconnects and transport networks. This requires the adoption of optical communication technologies that can make the best possible use of the available bandwidths of electronic and electro-optic components to enable data transmission with high spectral efficiency (SE). Therefore, advanced modulation formats must be used in conjunction with energy-efficient and cost-effective transceiver schemes, especially for medium- and short-reach applications. Important challenges facing these goals are the stringent requirements on the characteristics of the optical components comprising these systems, especially laser sources. Laser phase noise is one of the most important performance-limiting factors in systems with high spectral efficiency. In this research work, we study the effects of the spectral characteristics of laser phase noise on the characterization of lasers and their impact on the performance of digital coherent and self-coherent optical communication schemes. The results of this study show that the commonly used metric for estimating the impact of laser phase noise on performance, the laser linewidth, is not reliable for all types of lasers. Instead, we propose a Lorentzian-equivalent linewidth as a general characterization parameter for laser phase noise to assess phase noise-related system performance. Practical aspects of determining the proposed parameter are also studied, and its accuracy is validated by both numerical and experimental demonstrations. Furthermore, we study the phase noise of quantum-dot mode-locked lasers (QD-MLLs) and assess the feasibility of employing these devices in coherent applications at relatively low symbol rates with high SE.
A novel multi-heterodyne scheme for characterizing the phase noise of laser frequency comb sources is also proposed and validated by experimental results with the QD-MLL. This scheme is capable of measuring the differential phase noise between multiple spectral lines simultaneously in a single measurement. Moreover, we propose an energy-efficient and cost-effective transmission scheme based on direct detection of field-modulated optical signals with advanced modulation formats, allowing for higher SE than current pulse-amplitude modulation schemes. The proposed system combines the Kramers-Kronig self-coherent receiver technique with the use of QD-MLLs to transmit multi-channel optical signals using a single diode laser source, without the additional RF or optical components required by traditional techniques. Semi-numerical simulations based on experimentally captured waveforms from practical lasers show that the proposed system can be used even for metro-scale applications. Finally, we study the properties of phase and intensity noise changes in unmodulated optical signals passing through saturated semiconductor optical amplifiers for intensity noise reduction. We report, for the first time, an effect of phase noise enhancement that cannot be assessed or observed by traditional linewidth measurements. We demonstrate the impact of this phase noise enhancement on coherent transmission performance through both semi-numerical simulations and experimental validation.
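The linewidth-versus-phase-noise distinction above rests on a standard fact: only for white frequency noise does a laser have a strictly Lorentzian line shape, in which case the phase-error variance over a delay tau is exactly 2*pi*linewidth*tau. A minimal simulation of that white-FM reference case (all parameters are illustrative, not the dissertation's measurements) is:

```python
import numpy as np

fs = 1e9            # sample rate [Hz] (illustrative)
dt = 1.0 / fs
linewidth = 1e5     # target Lorentzian FWHM [Hz] (illustrative)
n = 500_000

# White frequency noise makes the optical phase a random walk whose
# increments have variance 2*pi*linewidth*dt.
rng = np.random.default_rng(1)
dphi = rng.normal(scale=np.sqrt(2 * np.pi * linewidth * dt), size=n)
phi = np.cumsum(dphi)                       # phase trajectory [rad]

# Phase-error variance over delay tau should match 2*pi*linewidth*tau.
lag = 1000
tau = lag * dt
var_meas = np.var(phi[lag:] - phi[:-lag])
var_theory = 2 * np.pi * linewidth * tau
```

For lasers whose frequency noise is not white (e.g. with strong 1/f or carrier-envelope contributions), this proportionality breaks down, which is why a single measured linewidth can misstate the system penalty and motivates a Lorentzian-equivalent characterization.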
David Menager
A Hybrid Event Memory Theory for Integrated Agents
When & Where:
2001 B Eaton Hall
Committee Members:
Arvin Agah, Chair
Michael Branicky
Prasad Kulkarni
Andrew Williams
Dongkyu Choi
Abstract
The memory for events is a central component of human cognition, but we have yet to see artificial agents that can demonstrate the same range of event memory capabilities as humans. Some machine learning systems are capable of behaving as if they remember and reason about events, but oftentimes their behavior is produced by an ad hoc assemblage of opaque statistical algorithms that yields little new insight into the nature of event memory. We propose a novel, psychologically plausible theory of event memory with an accompanying implementation that affords integrated agents the ability to remember events, present details about their past experiences, and reason about future events. We propose to demonstrate these event memory reasoning capabilities in three different experiments. First, we evaluate the fundamental capabilities of our theory to explain different event memory phenomena, such as remembering. Second, we aim to show that our event memory theory provides a unified framework for building intelligent agents that generate explanations of their own behavior and make inferences about the goals and intentions of other actors. Third, we evaluate whether our event memory theory facilitates cooperative behavior of computational agents in human-robot teams. The proposed work will be completed in December 2020. If our efforts are successful, we believe this work will change the way humans interact with autonomous agents. People will better understand why robots, self-driving vehicles, and other agents behave the way they do, and as a result will know when to trust them. This in turn will speed the adoption of autonomous systems not only in military settings but also in everyday life.
Yuanwei Wu
Optimization for Training Deep Models and Deep Learning for Point Cloud Analysis and Image Classification
When & Where:
246 Nichols Hall
Committee Members:
Guanghui Wang, Chair
Taejoon Kim
Bo Luo
Heechul Yun
Haiyang Chao
Abstract
Deep learning (DL) has dramatically improved the state-of-the-art performance in broad applications of computer vision, such as image recognition, object detection, semantic/instance segmentation, and point cloud analysis. However, the reasons for this huge empirical success of DL remain theoretically elusive. In this dissertation, to understand DL and improve its efficiency, robustness, and interpretability, we theoretically investigate optimization algorithms for training deep models and empirically explore deep learning for unsupervised learning tasks in point cloud analysis and image classification.
1). Optimization for Training Deep Models: Neural network training is one of the most difficult optimization problems in DL. Recently, understanding global optimality in DL has attracted considerable attention. However, we observe that conventional DL solvers were not intentionally developed to seek such global optimality. In this dissertation, we propose a novel approximation algorithm, BPGrad, for optimizing deep models globally via branch and pruning. An efficient BPGrad-based solver for DL is proposed as well, and it empirically outperforms conventional DL solvers such as Adagrad, Adadelta, RMSProp, and Adam in the tasks of object recognition, detection, and segmentation.
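The branch-and-pruning flavor of global optimization mentioned above can be sketched in one dimension. This toy (not the dissertation's BPGrad algorithm; the objective, Lipschitz constant, and tolerance are made up) splits the domain into intervals and prunes any interval whose Lipschitz lower bound cannot beat the best value found so far:

```python
def branch_and_prune(f, lo, hi, L, tol=1e-3):
    """Globally minimize f on [lo, hi], assuming |f(x)-f(y)| <= L|x-y|."""
    best_x, best_f = lo, f(lo)
    intervals = [(lo, hi)]
    while intervals:
        a, b = intervals.pop()
        fa, fb = f(a), f(b)
        for x, fx in ((a, fa), (b, fb)):      # update incumbent from endpoints
            if fx < best_f:
                best_x, best_f = x, fx
        # Lipschitz lower bound on f over [a, b]; prune if it cannot improve.
        lower = (fa + fb - L * (b - a)) / 2
        if lower >= best_f - tol or (b - a) < tol:
            continue
        m = (a + b) / 2                       # branch: split and recurse
        intervals += [(a, m), (m, b)]
    return best_x, best_f

x, fx = branch_and_prune(lambda x: (x - 0.3) ** 2, -1.0, 1.0, L=3.0)
```

The pruning rule is what distinguishes this from brute-force search: whole regions of the domain are discarded once the bound certifies they cannot contain a better minimum, which is the general idea of bounding-based global optimization that such solvers scale up to deep models.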
2). Deep Learning for Unsupervised Learning Tasks: The architecture of neural networks is of central importance for many visual recognition tasks. In this dissertation, we focus on the emerging field of unsupervised learning for point cloud analysis and image classification.
2.1) For point cloud analysis, we propose a novel unsupervised approach to jointly learn the 3D object model and estimate the 6D poses of multiple instances of the same object in a single end-to-end deep neural network framework, with applications to depth-based instance segmentation. Extensive experiments evaluate our technique on several object models and a varying number of instances in 3D point clouds. Compared with popular baselines for instance segmentation, our model not only demonstrates competitive performance, but also learns a 3D object model that is represented as a 3D point cloud.
2.2) For low-quality image classification, we propose a simple yet effective unsupervised deep feature transfer network to address the performance degradation of state-of-the-art recognition algorithms on low-quality images. No fine-tuning is required in our method. Our network can be embedded into state-of-the-art deep neural networks as a plug-in feature enhancement module. It preserves the data structure in feature space for high-resolution images and transfers the distinguishing features to the low-resolution feature space. Extensive experiments show that the proposed transfer network achieves significant improvements over the baseline method.
Dhwani Pandya
A Comparison of Mining Incomplete and Inconsistent Data
When & Where:
2001 B Eaton Hall
Committee Members:
Jerzy Grzymala-Busse, Chair
Prasad Kulkarni
Suzanne Shontz
Abstract
In today's world of digital data, the field of data mining has come into the limelight. In data mining, patterns are found in data and can then be analyzed further. Processing data as deeply as possible is relevant for pattern recognition in huge data sets. Throughout this process, we try to understand the data well in order to obtain useful results from it. For the data to be analyzed correctly, it is better if the data is complete and consistent.
In this project, we compare the effects of incomplete and inconsistent data. The algorithm used for generating rules is the Modified Learning from Examples Module, version 2 (MLEM2). We used a single local probabilistic approach for all the data sets. We took 141 data sets into consideration for the error rate comparison of incomplete and inconsistent data. We used ten-fold cross-validation and computed the average error rate for each of the data sets. From our experiments, we observed that the error rate for incomplete data is greater than the error rate for inconsistent data.
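The ten-fold cross-validation procedure used for the error rate comparison can be sketched as follows. The fold logic is the standard scheme described above; the "learner" here is a hypothetical majority-class stand-in for illustration only, since MLEM2 rule induction itself is beyond a short sketch:

```python
import random

def ten_fold_error_rate(data, train_fn, error_fn, folds=10, seed=0):
    """Shuffle once, then average the test error over the k held-out folds."""
    data = data[:]
    random.Random(seed).shuffle(data)
    errors = []
    for k in range(folds):
        test = data[k::folds]                                   # held-out fold
        train = [r for i, r in enumerate(data) if i % folds != k]
        model = train_fn(train)
        errors.append(error_fn(model, test))
    return sum(errors) / folds

# Hypothetical stand-in learner: predict the majority class of the training fold.
def train_majority(train):
    labels = [label for _, label in train]
    return max(set(labels), key=labels.count)

def error_rate(model, test):
    return sum(1 for _, label in test if label != model) / len(test)

# Toy data set: 90 records, two classes ("yes" twice as frequent as "no").
data = [(i, "yes" if i % 3 else "no") for i in range(90)]
avg = ten_fold_error_rate(data, train_majority, error_rate)
```

Because every record is held out exactly once, the average over folds equals the overall misclassification rate, which is what makes the per-data-set error rates comparable across the 141 data sets.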