Deep Learning For Computer Vision: Single Shot Detectors for Object Detection


Department of Information and Communications Technology
Enrollment in this course is by invitation only

Course Overview

Most introductory machine learning materials present artificial neural networks through image classification. Those courses cover convolutional networks and AlexNet, the network architecture that popularized artificial neural networks and deep learning in computer vision. Image classification is a core task in computer vision, and it is combined with other machine learning tasks to solve problems such as detecting objects in an image. Often we want to identify multiple target objects in an image, which means locating each object in the image and classifying it. This task is known as object detection. Consider a social media app that automatically locates and identifies you and your friends in your photos, or a self-driving car that continuously locates and classifies the vehicles around it to make decisions. These are just some of the real-life use cases of object detection.

In this course, we will look into the task of object detection in computer vision. We will discuss two fundamental neural network architectures for this task: You Only Look Once (YOLO) and the Single-Shot Detector (SSD). We will learn the fundamental concepts that drive the effectiveness of these architectures, such as anchor boxes and bounding boxes, the measurement of similarity between the areas that these boxes cover, and the blocks of neural network layers that make up each architecture. We will walk through Python source code that demonstrates these theoretical concepts and reinforces your learning.
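
To give a sense of the similarity measurement mentioned above, here is a minimal sketch of intersection over union (IoU), a common way to compare the areas covered by two boxes. The function name `iou` and the `(x_min, y_min, x_max, y_max)` box format are assumptions made for this illustration and are not taken from the course materials.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is assumed to be (x_min, y_min, x_max, y_max).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate boxes with zero area.
    return inter / union if union > 0 else 0.0
```

A value of 1 means the two boxes coincide and a value of 0 means they do not overlap at all.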

What You Will Learn

At the end of this course, you will be able to:

  • define the key features of several single-shot detection architectures
  • train object detection models using the YOLO and SSD approaches

Course Content

Week 1: Object Detection

10 Videos | 1 Activity

10 Videos

  • Welcome to the course!
  • Software Tools and Libraries
  • Object Detection Task
  • Dataset Preparation
  • Bounding Boxes
  • Anchor Boxes
  • Measuring Similarity Between Anchor Boxes
  • Labeling Anchor Boxes for Training Data
  • Predicting Object Category Using Bounding and Anchor Boxes
  • Summary

1 Activity

  • Exit Assessment

Week 2: You Only Look Once (YOLO) Approach to Object Detection

8 Videos | 1 Activity

8 Videos

  • Architecture
  • Image Grid and Bounding Boxes
  • Image Detection Block
  • Loss Function (IOU)
  • Limitations
  • Training a YOLO-based Model
  • Testing the Model
  • Summary

1 Activity

  • Exit Assessment

Week 3: Single Shot Multibox Detection (SSD) - Part 1

10 Videos | 1 Activity

10 Videos

  • Too Many Anchor Boxes
  • Multiscale Bounding and Anchor Boxes
  • The Design of Single Shot Detection Model
  • Category Prediction Layer
  • Bounding Box Prediction Layer
  • Concatenating Predictions for Multiple Scales
  • Height and Width Downsample Block
  • Base Network Block
  • The Complete Model
  • Summary

1 Activity

  • Exit Assessment

Week 4: Single Shot Multibox Detection (SSD) - Part 2

5 Videos | 1 Activity

5 Videos

  • Localization Loss
  • Classification Loss
  • Training an SSD-based Model
  • Testing the Model
  • Key Takeaways

1 Activity

  • Exit Assessment

Course Number: DICT-ICT005
Classes Start: TBA
Estimated Effort: 2 hrs./week (8 hours)
Price: Free