2018 Workshop

Workshop: The Great Escape

Organizers: Feras Dayoub, Dimity Miller, Robert Lee, Lachlan Nicholson, Troy Cordie, Steven Martine

Summary: The workshop focuses on using vision to build an internal representation of the world so that a robot can plan and navigate its way out of a maze. The robot carries a camera tilted towards the ground plane. Using an estimated homography of that plane, the robot must build an occupancy grid of a network of tracks marked on the floor with colored tape. As an added level of complexity, the maze may also contain vertical structures. The goal is for the robot to escape the maze, and the process involves (a code sketch of the full pipeline follows the list):

  • image segmentation (image processing)
  • using a homography to project image information onto the floor plane (geometry)
  • building an occupancy grid of the floor plane (data structures)
  • path planning (search)
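
To make these steps concrete, below is a minimal Python sketch of the whole pipeline, assuming OpenCV and NumPy. The HSV tape color bounds, the homography H (which in practice would come from calibrating the tilted camera against known points on the ground plane), the grid size, and the start/goal cells are illustrative placeholders rather than workshop-supplied values.

    import cv2
    import numpy as np
    from collections import deque

    # --- 1. Image segmentation: threshold the colored tape in HSV space. ---
    def segment_tape(bgr_image):
        hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
        lower, upper = np.array([20, 80, 80]), np.array([35, 255, 255])  # placeholder: yellow-ish tape
        return cv2.inRange(hsv, lower, upper)  # binary mask, 255 on tape pixels

    # --- 2. Homography: project the mask from the image onto the floor plane. ---
    # H maps image pixels to cells of a metric top-down grid; it would be
    # estimated once from the known tilt and height of the camera.
    def project_to_floor(mask, H, grid_shape):
        return cv2.warpPerspective(mask, H, (grid_shape[1], grid_shape[0]))

    # --- 3. Occupancy grid: mark projected tape cells as occupied. ---
    def update_grid(grid, floor_mask):
        grid[floor_mask > 0] = 1  # 1 = tape/wall, 0 = free space
        return grid

    # --- 4. Path planning: breadth-first search over free cells. ---
    def plan_path(grid, start, goal):
        rows, cols = grid.shape
        parent = {start: None}
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            if cell == goal:  # walk the parent links back to recover the path
                path = []
                while cell is not None:
                    path.append(cell)
                    cell = parent[cell]
                return path[::-1]
            r, c = cell
            for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                        and grid[nxt] == 0 and nxt not in parent):
                    parent[nxt] = cell
                    queue.append(nxt)
        return None  # no escape route found

In a real run the grid would be accumulated over many frames as the robot drives, with each warped mask transformed through the robot's current pose before being fused into the grid.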

The aim is to make the students think about the role of a camera as a sensor on a mobile robot, and how the robot uses visual information to build an internal representation of its workspace. The students will use familiar concepts such as image segmentation, as well as less familiar ones such as using the camera to populate a 2D occupancy grid of the floor.

We have created a Slack workspace for all the students in the workshop to join. This workspace allows us, the workshop organisers, to offer technical help and answer students' questions before the school starts, making sure all the required setup is done before Sunday 4th. It also lets the students communicate and discuss ideas while solving the workshop exercise. You can access the workspace here.


Tutorial A: Region based CNN for Computer Vision and Robotics

Presenters: Ravi Garg, Chao Ma, Thanh-Toan Do

Summary: Deep neural networks have shown state-of-the-art performance on many computer vision problems, including image classification, object detection, semantic segmentation, visual tracking and scene classification. This tutorial will cover the basics of deep neural networks and their application to computer vision and robotics problems. The first part will cover the basic building blocks of deep networks, training procedures and popular network architectures. The second part will focus on a particular type of deep network, the “Region based CNN”, and how it can be applied to various computer vision and robotics problems such as object detection, semantic/instance-level segmentation, visual object tracking, object affordances and pose estimation.
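
To give a sense of the “Region based CNN” family covered in the second part, the sketch below runs a pretrained Faster R-CNN object detector through torchvision (one of several public implementations, not necessarily the one used in the tutorial). The image file name is a placeholder and the 0.7 score cut-off is an arbitrary choice.

    import torch
    import torchvision
    from torchvision.transforms.functional import to_tensor
    from PIL import Image

    # Faster R-CNN with a ResNet-50 FPN backbone, pretrained on COCO.
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
    model.eval()

    image = Image.open("scene.jpg").convert("RGB")  # placeholder input image
    with torch.no_grad():
        prediction = model([to_tensor(image)])[0]  # dict of boxes/labels/scores

    # Keep only confident detections; labels index into the COCO categories.
    for box, label, score in zip(prediction["boxes"], prediction["labels"], prediction["scores"]):
        if score > 0.7:
            print(f"label {label.item()}  score {score:.2f}  box {box.tolist()}")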


Tutorial B: Semantic SLAM – Making robots map and understand the world

Presenters: Yasir Latif, Vincent Lui, Viorela Ila, Trung Pham

Summary: The goal of Simultaneous Localization and Mapping (SLAM) is to construct a representation of an environment while localizing the robot with respect to it. On its own, however, this provides no understanding of the physical world the robot is moving in. Such understanding is important for meaningful interaction with the world and also improves performance on the mapping and localization tasks themselves. This tutorial will introduce the SLAM problem at its current stage of development and then survey various developments towards a semantic understanding of the world and their effects on the original SLAM formulation.
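
As a pointer to the underlying formulation, the sketch below solves a deliberately tiny pose-graph problem, the least-squares back-end at the heart of modern SLAM, simplified to translation-only 2D poses and solved with SciPy. The odometry and loop-closure measurements are made-up numbers; a real system would also estimate orientations and landmarks.

    import numpy as np
    from scipy.optimize import least_squares

    # Pose graph over four 2D positions p0..p3, with p0 fixed at the origin.
    # Each edge (i, j, z) is a relative measurement z ~ p_j - p_i: three
    # odometry steps (slightly drifted) plus one loop closure from p0 to p3.
    edges = [
        (0, 1, np.array([1.0, 0.0])),
        (1, 2, np.array([1.1, 0.1])),
        (2, 3, np.array([0.9, -0.1])),
        (0, 3, np.array([3.0, 0.0])),  # loop closure contradicting the drift
    ]

    def residuals(x):
        poses = np.vstack([[0.0, 0.0], x.reshape(-1, 2)])  # anchor p0 as gauge
        return np.concatenate([poses[j] - poses[i] - z for i, j, z in edges])

    x0 = np.zeros(3 * 2)  # initial guess for p1..p3
    result = least_squares(residuals, x0)
    print(result.x.reshape(-1, 2))  # poses that best reconcile all measurements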


Tutorial C: Vision and Action

Presenters: Suman Bista, Valerio Ortenzi, Juxi Leitner

Summary: For effective deployment in real-world applications, a robot must be able to perceive, detect and locate objects in its surroundings in order to inform future motion control and decision-making. Interesting examples of acting effectively on the environment include robotic manipulation of objects, visual navigation in complex dynamic environments, and active understanding of the environment [1]. In this tutorial, we will explore most of these aspects. We begin with some fundamentals of visual servoing and grasping, and explore their applications in manipulation and visual navigation. We then present the case of Cartman, our robot that won the Amazon Robotics Challenge 2017, as an exercise in thinking about and discussing the use of deep learning methods for robotic vision and action. Finally, we will close with a discussion of additional relevant topics.
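
As a taste of the visual servoing fundamentals, here is a minimal NumPy sketch of the classical image-based control law v = -λ L⁺ e, built from the standard point-feature interaction matrix. The feature coordinates, depths and gain below are illustrative numbers only.

    import numpy as np

    def interaction_matrix(x, y, Z):
        """Interaction matrix of one point feature at normalized image
        coordinates (x, y) and estimated depth Z (Chaumette & Hutchinson)."""
        return np.array([
            [-1.0 / Z, 0.0, x / Z, x * y, -(1 + x * x), y],
            [0.0, -1.0 / Z, y / Z, 1 + y * y, -x * y, -x],
        ])

    def ibvs_velocity(features, desired, depths, gain=0.5):
        error = (features - desired).reshape(-1)      # stacked feature error e
        L = np.vstack([interaction_matrix(x, y, Z)    # stacked 2x6 blocks
                       for (x, y), Z in zip(features, depths)])
        return -gain * np.linalg.pinv(L) @ error      # 6-DOF camera twist

    # Four points slightly away from their desired positions, all at 1 m depth.
    features = np.array([[0.1, 0.1], [-0.1, 0.1], [-0.1, -0.1], [0.1, -0.1]])
    desired = 0.8 * features
    depths = np.full(4, 1.0)
    print(ibvs_velocity(features, desired, depths))   # (vx, vy, vz, wx, wy, wz)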