Image Classification & Segmentation Based on Enhanced CNN and Transformer Networks
Prasad Kulkarni
Bo Luo
Cuncong Zhong
Guanghui Wang
Xinmai Yang
Convolutional Neural Networks (CNNs) have significantly improved performance on various computer vision tasks, such as image recognition and segmentation, owing to their rich representational power. To further enhance CNN performance, self-attention modules have been embedded after individual layers of the network. Recently proposed Transformer-based models achieve outstanding performance by employing multi-head self-attention as their main building block. However, several challenges remain to be addressed: (1) CNN attention focuses on only a limited set of class-specific channels; (2) local transformers have a limited receptive field; and (3) U-Net-style segmentation architectures add redundant features and lack multi-scale features.
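For context, the multi-head self-attention building block referenced above can be sketched in a few lines of PyTorch. This is a generic illustration of the standard formulation; the embedding size, head count, and variable names are assumptions for the example, not the configuration used in the proposed networks.

```python
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    """Minimal multi-head self-attention over a token sequence.

    Illustrative sketch only; sizes and names are assumptions,
    not the exact modules used in this work.
    """
    def __init__(self, embed_dim: int = 96, num_heads: int = 4):
        super().__init__()
        assert embed_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.qkv = nn.Linear(embed_dim, 3 * embed_dim)  # joint Q/K/V projection
        self.proj = nn.Linear(embed_dim, embed_dim)     # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, embed_dim)
        b, n, c = x.shape
        qkv = self.qkv(x).reshape(b, n, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)            # each: (b, heads, n, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.scale   # (b, heads, n, n) affinities
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, c)
        return self.proj(out)

x = torch.randn(2, 196, 96)          # e.g., 14x14 patch tokens of dim 96
y = MultiHeadSelfAttention()(x)      # same shape: (2, 196, 96)
```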
In this work, we propose new strategies to address these issues. First, we propose a novel channel-based self-attention module that diversifies attention across the most discriminative and significant channels; the module can be embedded at the end of any backbone network for image classification. Second, to limit the noise introduced by the shallow encoder layers of a U-Net-style architecture, we replace the skip connections with an Adaptive Global Context Module (AGCM). In addition, we introduce a Semantic Feature Enhancement Module (SFEM) for multi-scale feature enhancement in polyp segmentation. Third, we propose a Multi-scaled Overlapped Attention (MOA) mechanism for local transformer-based image classification networks, which establishes long-range dependencies and enables communication between neighboring windows. Illustrative sketches of the channel-based attention and the overlapped-window attention are given below.
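The first sketch illustrates the idea of channel-based self-attention: attention weights are computed between channels (a C x C affinity map) so that discriminative channels can reinforce one another across the feature map. This is a hedged approximation; the learnable scale `gamma` and the exact affinity formulation are assumptions for the example and may differ from the proposed module.

```python
import torch
import torch.nn as nn

class ChannelSelfAttention(nn.Module):
    """Sketch of channel-wise self-attention over a CNN feature map.

    Computes a (C x C) channel affinity map and re-weights channels with it;
    the exact formulation of the proposed module may differ.
    """
    def __init__(self, gamma_init: float = 0.0):
        super().__init__()
        # learnable residual scale, initialized so training starts near identity
        self.gamma = nn.Parameter(torch.tensor(gamma_init))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        b, c, h, w = x.shape
        flat = x.view(b, c, h * w)                    # (b, c, hw)
        attn = torch.bmm(flat, flat.transpose(1, 2))  # (b, c, c) channel affinities
        attn = attn.softmax(dim=-1)
        out = torch.bmm(attn, flat).view(b, c, h, w)  # re-weighted channels
        return x + self.gamma * out                   # residual connection

feat = torch.randn(2, 512, 7, 7)     # e.g., backbone output before pooling
refined = ChannelSelfAttention()(feat)
```

In practice, such a module would be appended after the final convolutional stage of the backbone, before global pooling and the classifier head, which matches the "end of any backbone network" placement described above.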
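The second sketch approximates the window-communication idea behind MOA: each non-overlapping query window attends to a slightly larger, overlapping key/value window, so information flows between neighboring windows. The window size, overlap, and single-head formulation here are illustrative assumptions; the published MOA module differs in its exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OverlappedWindowAttention(nn.Module):
    """Sketch of attention with overlapping key/value windows.

    Queries come from non-overlapping k x k windows; keys/values come from
    larger (k + 2p) x (k + 2p) windows at the same stride, so neighboring
    windows exchange information. An illustrative approximation of MOA only.
    """
    def __init__(self, dim: int = 96, window: int = 7, overlap: int = 2):
        super().__init__()
        self.k, self.p = window, overlap
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * dim)
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (b, c, h, w) with h, w divisible by the window size
        b, c, h, w = x.shape
        k, p = self.k, self.p
        nh, nw = h // k, w // k                       # windows per axis
        # queries: non-overlapping k x k windows -> (b*windows, k*k, c)
        q = x.view(b, c, nh, k, nw, k).permute(0, 2, 4, 3, 5, 1)
        q = q.reshape(b * nh * nw, k * k, c)
        # keys/values: overlapping (k+2p) x (k+2p) windows at stride k
        kv = F.unfold(x, kernel_size=k + 2 * p, stride=k, padding=p)
        kv = kv.view(b, c, (k + 2 * p) ** 2, nh * nw).permute(0, 3, 2, 1)
        kv = kv.reshape(b * nh * nw, (k + 2 * p) ** 2, c)
        kp, vp = self.kv(kv).chunk(2, dim=-1)
        attn = (self.q(q) @ kp.transpose(-2, -1)) * self.scale
        out = attn.softmax(dim=-1) @ vp               # (b*windows, k*k, c)
        out = out.view(b, nh, nw, k, k, c).permute(0, 5, 1, 3, 2, 4)
        return out.reshape(b, c, h, w)

x = torch.randn(1, 96, 28, 28)
y = OverlappedWindowAttention()(x)   # (1, 96, 28, 28)
```

Because the key/value windows overlap their neighbors by `p` pixels on each side, every window can aggregate features from adjacent windows, which is one way to obtain the long-range dependencies that purely local window attention lacks.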