Convolutional Neural Network in Pattern Recognition
Taejoon Kim
Fengjun Li
Bo Luo
Hauzhen Fang
Since convolutional neural network (CNN) was first implemented by Yann LeCun et al. in 1989, CNN and its variants have been widely implemented to numerous topics of pattern recognition, and have been considered as the most crucial techniques in the field of artificial intelligence and computer vision. This dissertation not only demonstrates the implementation aspect of CNN, but also lays emphasis on the methodology of neural network (NN) based classifier.
As known to many, one general pipeline of NN-based classifier can be recognized as three stages: pre-processing, inference by models, and post-processing. To demonstrate the importance of pre-processing techniques, this dissertation presents how to model actual problems in medical pattern recognition and image processing by introducing conceptual abstraction and fuzzification. In particular, a transformer on the basis of self-attention mechanism, namely beat-rhythm transformer, greatly benefits from correct R-peak detection results and conceptual fuzzification.
Recently proposed self-attention mechanism has been proven to be the top performer in the fields of computer vision and natural language processing. In spite of the pleasant accuracy and precision it has gained, it usually consumes huge computational resources to perform self-attention. Therefore, realtime global attention network is proposed to make a better trade-off between efficiency and performance for the task of image segmentation. To illustrate more on the stage of inference, we also propose models to detect polyps via Faster R-CNN - one of the most popular CNN-based 2D detectors, as well as a 3D object detection pipeline for regressing 3D bounding boxes from LiDAR points and stereo image pairs powered by CNN.
The goal for post-processing stage is to refine artifacts inferred by models. For the semantic segmentation task, the dilated continuous random field is proposed to be better fitted to CNN-based models than the widely implemented fully-connected continuous random field. Proposed approaches can be further integrated into a reinforcement learning architecture for robotics.