Welcome to P K Kelkar Library, Online Public Access Catalogue (OPAC)

A guide to convolutional neural networks for computer vision /

By: Khan, Salman [author.].
Contributor(s): Rahmani, Hossein [author.] | Shah, Syed Afaq Ali [author.] | Bennamoun, M [author.].
Material type: Book
Series: Synthesis digital library of engineering and computer science; Synthesis lectures on computer vision, # 15.
Publisher: [San Rafael, California] : Morgan & Claypool, 2018.
Description: 1 PDF (xix, 187 pages) : illustrations.
Content type: text
Media type: electronic
Carrier type: online resource
ISBN: 9781681730226.
Subject(s): Computer vision -- Mathematical models | Neural networks (Computer science) | Convolutions (Mathematics) | deep learning | computer vision | convolutional neural networks | perception | back-propagation | feed-forward networks | image classification | action recognition | object detection | object tracking | video processing | semantic segmentation | scene understanding | 3D processing
Genre/Form: Electronic books.
DDC classification: 006.37
Online resources: Abstract with links to resource
Also available in print.
Contents:
1. Introduction -- 1.1 What is computer vision? -- 1.1.1 Applications -- 1.1.2 Image processing vs. computer vision -- 1.2 What is machine learning? -- 1.2.1 Why deep learning? -- 1.3 Book overview --
2. Features and classifiers -- 2.1 Importance of features and classifiers -- 2.1.1 Features -- 2.1.2 Classifiers -- 2.2 Traditional feature descriptors -- 2.2.1 Histogram of oriented gradients (HOG) -- 2.2.2 Scale-invariant feature transform (SIFT) -- 2.2.3 Speeded-up robust features (SURF) -- 2.2.4 Limitations of traditional hand-engineered features -- 2.3 Machine learning classifiers -- 2.3.1 Support vector machine (SVM) -- 2.3.2 Random decision forest -- 2.4 Conclusion --
3. Neural networks basics -- 3.1 Introduction -- 3.2 Multi-layer perceptron -- 3.2.1 Architecture basics -- 3.2.2 Parameter learning -- 3.3 Recurrent neural networks -- 3.3.1 Architecture basics -- 3.3.2 Parameter learning -- 3.4 Link with biological vision -- 3.4.1 Biological neuron -- 3.4.2 Computational model of a neuron -- 3.4.3 Artificial vs. biological neuron --
4. Convolutional neural network -- 4.1 Introduction -- 4.2 Network layers -- 4.2.1 Pre-processing -- 4.2.2 Convolutional layers -- 4.2.3 Pooling layers -- 4.2.4 Nonlinearity -- 4.2.5 Fully connected layers -- 4.2.6 Transposed convolution layer -- 4.2.7 Region of interest pooling -- 4.2.8 Spatial pyramid pooling layer -- 4.2.9 Vector of locally aggregated descriptors layer -- 4.2.10 Spatial transformer layer -- 4.3 CNN loss functions -- 4.3.1 Cross-entropy loss -- 4.3.2 SVM hinge loss -- 4.3.3 Squared hinge loss -- 4.3.4 Euclidean loss -- 4.3.5 The l1 error -- 4.3.6 Contrastive loss -- 4.3.7 Expectation loss -- 4.3.8 Structural similarity measure --
5. CNN learning -- 5.1 Weight initialization -- 5.1.1 Gaussian random initialization -- 5.1.2 Uniform random initialization -- 5.1.3 Orthogonal random initialization -- 5.1.4 Unsupervised pre-training -- 5.1.5 Xavier initialization -- 5.1.6 ReLU aware scaled initialization -- 5.1.7 Layer-sequential unit variance -- 5.1.8 Supervised pre-training -- 5.2 Regularization of CNN -- 5.2.1 Data augmentation -- 5.2.2 Dropout -- 5.2.3 Drop-connect -- 5.2.4 Batch normalization -- 5.2.5 Ensemble model averaging -- 5.2.6 The l2 regularization -- 5.2.7 The l1 regularization -- 5.2.8 Elastic net regularization -- 5.2.9 Max-norm constraints -- 5.2.10 Early stopping -- 5.3 Gradient-based CNN learning -- 5.3.1 Batch gradient descent -- 5.3.2 Stochastic gradient descent -- 5.3.3 Mini-batch gradient descent -- 5.4 Neural network optimizers -- 5.4.1 Momentum -- 5.4.2 Nesterov momentum -- 5.4.3 Adaptive gradient -- 5.4.4 Adaptive delta -- 5.4.5 RMSprop -- 5.4.6 Adaptive moment estimation -- 5.5 Gradient computation in CNNs -- 5.5.1 Analytical differentiation -- 5.5.2 Numerical differentiation -- 5.5.3 Symbolic differentiation -- 5.5.4 Automatic differentiation -- 5.6 Understanding CNN through visualization -- 5.6.1 Visualizing learned weights -- 5.6.2 Visualizing activations -- 5.6.3 Visualizations based on gradients --
6. Examples of CNN architectures -- 6.1 LeNet -- 6.2 AlexNet -- 6.3 Network in network -- 6.4 VGGnet -- 6.5 GoogleNet -- 6.6 ResNet -- 6.7 ResNeXt -- 6.8 FractalNet -- 6.9 DenseNet --
7. Applications of CNNs in computer vision -- 7.1 Image classification -- 7.1.1 PointNet -- 7.2 Object detection and localization -- 7.2.1 Region-based CNN -- 7.2.2 Fast R-CNN -- 7.2.3 Regional proposal network (RPN) -- 7.3 Semantic segmentation -- 7.3.1 Fully convolutional network (FCN) -- 7.3.2 Deep deconvolution network (DDN) -- 7.3.3 DeepLab -- 7.4 Scene understanding -- 7.4.1 DeepContext -- 7.4.2 Learning rich features from RGB-D images -- 7.4.3 Pointnet for scene understanding -- 7.5 Image generation -- 7.5.1 Generative adversarial networks (GANs) -- 7.5.2 Deep convolutional generative adversarial networks (DCGANs) -- 7.5.3 Super resolution generative adversarial network (SRGAN) -- 7.6 Video-based action recognition -- 7.6.1 Action recognition from still video frames -- 7.6.2 Two-stream CNNs -- 7.6.3 Long-term recurrent convolutional network (LRCN) --
8. Deep learning tools and libraries -- 8.1 Caffe -- 8.2 TensorFlow -- 8.3 MatConvNet -- 8.4 Torch7 -- 8.5 Theano -- 8.6 Keras -- 8.7 Lasagne -- 8.8 Marvin -- 8.9 Chainer -- 8.10 PyTorch --
9. Conclusion -- Bibliography -- Authors' biographies.
Abstract: Computer vision has become increasingly important and effective in recent years due to its wide-ranging applications in areas as diverse as smart surveillance and monitoring, health and medicine, sports and recreation, robotics, drones, and self-driving cars. Visual recognition tasks, such as image classification, localization, and detection, are the core building blocks of many of these applications, and recent developments in Convolutional Neural Networks (CNNs) have led to outstanding performance in these state-of-the-art visual recognition tasks and systems. As a result, CNNs now form the crux of deep learning algorithms in computer vision. This self-contained guide will benefit those who seek both to understand the theory behind CNNs and to gain hands-on experience applying CNNs in computer vision. It provides a comprehensive introduction to CNNs, starting with the essential concepts behind neural networks and the training, regularization, and optimization of CNNs. The book also discusses a wide range of loss functions, network layers, and popular CNN architectures, reviews the different techniques for the evaluation of CNNs, and presents some popular CNN tools and libraries that are commonly used in computer vision. Further, this text describes and discusses case studies related to the application of CNNs in computer vision, including image classification, object detection, semantic segmentation, scene understanding, and image generation. This book is ideal for undergraduate and graduate students, as no prior background knowledge in the field is required to follow the material, as well as for new researchers, developers, engineers, and practitioners who are interested in gaining a quick understanding of CNN models.
Item type: E books
Current location: PK Kelkar Library, IIT Kanpur
Status: Available
Barcode: EBKE868
Total holds: 0

Mode of access: World Wide Web.

Part of: Synthesis digital library of engineering and computer science.

Includes bibliographical references (pages 173-184).

Abstract freely available; full-text restricted to subscribers or individual document purchasers.

Compendex | INSPEC | Google Scholar | Google Book Search

Title from PDF title page (viewed on February 24, 2018).
