Margarita Chli (ETH Zurich)
Vision-based perception for aerial robots
Abstract: This talk will describe the evolution of visual perception for aerial robots, presenting our latest results at the Vision for Robotics Lab on the building blocks enabling autonomous navigation of a small aircraft and the transition from single to multi-robot collaborative estimation, and touching on some of the most challenging problems we currently face.
Bio: Margarita Chli is an Assistant Professor leading the Vision for Robotics Lab (V4RL) of ETH Zurich, Switzerland and the vice-director of the Institute of Robotics and Intelligent Systems at ETH Zurich. Originally from Greece and Cyprus, she received both her Bachelor’s and Master’s degrees in Information and Computing Engineering from Trinity College of the University of Cambridge, UK, and her PhD from Imperial College London. After a postdoctoral position at the ASL of ETH Zurich, she moved to the University of Edinburgh, UK, to accept the prestigious Chancellor’s Fellowship and initiate V4RL. In 2015, she relocated with her group to ETH Zurich to accept a Swiss National Science Foundation Assistant Professorship (success rate of 15.7%). Prof Chli received numerous academic scholarships from both Cyprus and Cambridge and she continues to hold an Honorary Fellowship from the University of Edinburgh. She is a World Economic Forum (WEF) Expert in Artificial Intelligence and Robotics and was a speaker at WEF 2017 in Davos, while she also featured in Robohub’s 2016 list of 25 women in Robotics you need to know about. Her research interests lie in developing vision-based perception for robots, such as small aircraft, leading teams in multiple national and international projects, such as the European Commission projects sfly, myCopter, SHERPA, and AEROWORKS. Prof Chli participated in the first vision-based autonomous flight of a small helicopter, and her work has received international acclaim from the community, recently featuring in Reuters.
Vincent Lepetit (University of Bordeaux)
3D Registration with Deep Learning
Abstract: The first part of my talk will describe a novel method for 3D object detection and pose estimation from color images only. We introduce a “holistic” approach that relies on a representation of a 3D pose suitable to Deep Networks and on a feedback loop. This approach, like many previous ones, is however not sufficient for handling objects with an axis of rotational symmetry, as the pose of these objects is in fact ambiguous. We show how to relax this ambiguity with a combination of classification and regression. The second part will describe an approach bridging the gap between learning-based approaches and geometric approaches, for accurate and robust camera pose estimation in urban environments from single images and simple 2D maps.
Bio: Dr. Vincent Lepetit is a Full Professor at the LaBRI, University of Bordeaux, and an associate member of the Inria Manao team. He also supervises a research group in Computer Vision for Augmented Reality at the Institute for Computer Graphics and Vision, TU Graz.
He received the PhD degree in Computer Vision in 2001 from the University of Nancy, France, after working in the ISA INRIA team. He then joined the Virtual Reality Lab at EPFL as a post-doctoral fellow and became a founding member of the Computer Vision Laboratory. He became a Professor at TU Graz in February 2014, and at University of Bordeaux in January 2017. His research interests include computer vision and machine learning, and their application to 3D hand pose estimation, feature point detection and description, and 3D registration from images. In particular, he introduced with his colleagues methods such as Ferns, BRIEF, LINE-MOD, and DeepPrior for feature point matching and 3D object recognition.
He often serves as program committee member and area chair of major vision conferences (CVPR, ICCV, ECCV, ACCV, BMVC). He is an editor for the International Journal of Computer Vision (IJCV) and the Computer Vision and Image Understanding (CVIU) journal.
Yarin Gal (University of Oxford)
Bayesian Deep Learning
Abstract: Bayesian models are rooted in Bayesian statistics and easily benefit from the vast literature in the field. In contrast, deep learning lacks a solid mathematical grounding. Instead, empirical developments in deep learning are often justified by metaphors, evading the unexplained principles at play. These two fields are perceived as fairly antipodal to each other in their respective communities. It is perhaps astonishing then that most modern deep learning models can be cast as performing approximate inference in a Bayesian setting. The implications of this are profound: we can use the rich Bayesian statistics literature with deep learning models, explain away many of the curiosities with this technique, combine results from deep learning into Bayesian modeling, and much more. In this talk I will review a new theory linking Bayesian modeling and deep learning and demonstrate the practical impact of the framework with a range of real-world applications. I will also explore open problems for future research—problems that stand at the forefront of this new and exciting field.
Bio: Yarin Gal obtained his PhD from the Machine Learning group at the University of Cambridge, and was a Research Fellow at St Catharine’s College, Cambridge. He is currently an Associate Professor of Machine Learning at the University of Oxford Computer Science department, and also holds positions as a Tutorial Fellow in Computer Science at Christ Church, Oxford, a Visiting Researcher at the University of Cambridge, and a Turing Fellow at the Alan Turing Institute, the UK’s national institute for data science.
Andrea Cherubini (Université de Montpellier)
Perception to Inter-Action
Abstract: Traditionally, heterogeneous sensor data was fed to fusion algorithms (e.g., Kalman or Bayesian-based), so as to provide state estimation for modeling the environment. However, since robot sensors generally measure different physical phenomena, it is preferable to use them directly in the low-level servo controller rather than to apply them to multi-sensory fusion or to design complex state machines. This idea, originally proposed in the hybrid position-force control paradigm, when extended to multiple sensors brings new challenges to the control design; challenges related to the task representation and to the sensor characteristics (synchronization, hybrid control, task compatibility, etc.). The rationale behind our work has precisely been to use sensor-based control as a means to facilitate the physical interaction between robots and humans.
In particular, we have used vision, proprioceptive force, touch and distance to address case studies, targeting four main research axes: teach-and-repeat navigation of wheeled mobile robots, collaborative industrial manipulation with safe physical interaction, force and visual control for interacting with humanoid robots, and shared robot control. Each of these axes will be presented here, before concluding with a general view of the issues at stake, and on the research projects that we plan to carry out in the upcoming years.
Bio: Andrea Cherubini has been an Associate Professor at Université de Montpellier and a Researcher at LIRMM IDH (Interactive Digital Humans Group) since 2011. He received an MSc in 2001 from the University of Rome « La Sapienza » and a second one in 2003 from the University of Sheffield, U.K. From 2004 to 2008, he was a PhD student, and then a postdoctoral fellow, at the Dipartimento di Informatica e Sistemistica (now DIAG), University of Rome « La Sapienza ». Then, from 2008 to 2011, he worked as a postdoc at INRIA Rennes. With IDH, he was involved in the European projects VERE and RoboHow.Cog, and in the French project ANR ICARO.
His main research interests include sensor-based control, humanoid robotics, and physical human-robot interaction. This research is targeted by the French projects CoBot@LR and ANR SISCOB, and by the European project H2020 VERSATILE, all of which he manages as Principal Investigator at LIRMM.
Elizabeth Croft (Monash University)
Hey robot – do you see what I see? Creating common task frameworks through visual cues
Abstract: To be confirmed
Bio: Professor Elizabeth A. Croft (B.A.Sc UBC ’88, M.A.Sc Waterloo ’92, Ph.D. Toronto ’95) is the Dean of Engineering at Monash University, commencing January 2018. She was formerly a Professor of Mechanical Engineering and Senior Associate Dean for the Faculty of Applied Science at the University of British Columbia (UBC) and Director of the Collaborative Advanced Robotics and Intelligent Systems (CARIS) Laboratory. Her research investigates how robotic systems can behave, and be perceived to behave, in a safe, predictable, and helpful manner, and how people interact with, and understand, robotic systems, with applications ranging from manufacturing to healthcare and assistive technology. She held the NSERC Chair for Women in Science and Engineering (BC/Yukon) from 2010 to 2015 and the Marshall Bauder Professorship in Engineering Economics, Business and Management Training from 2015 to 2017. Her recognitions include a Peter Wall Early Career Scholar award, an NSERC Accelerator award, and WXN’s top 100 most powerful women in Canada. She is a Fellow of the Canadian Academy of Engineering, Engineers Canada, and the American Society of Mechanical Engineers.
Chunhua Shen (University of Adelaide)
Title: Deep Learning for Dense Per-Pixel Prediction and Vision-to-Language Problems
Abstract: Dense per-pixel prediction provides an estimate for each pixel given an image, offering much richer information than conventional sparse prediction models. Thus the Computer Vision community has been increasingly shifting its research focus to per-pixel prediction. In the first part of my talk, I will introduce my team’s recent work on deep structured methods for per-pixel prediction that combine deep learning and graphical models such as conditional random fields. I will show how to improve depth estimation from single images and semantic segmentation by using contextual information within deep structured learning.
Recent advances in computer vision and natural language processing (NLP) have led to interesting new applications. Two popular ones are automatically generating natural captions for images/video and answering questions relevant to a given image (i.e., visual question answering or VQA). In the second part of my talk, I will describe several recent works from my group that take advantage of state-of-the-art computer vision and NLP techniques to produce promising results on both image captioning and VQA.
Bio: Chunhua is a Professor of Computer Science at the University of Adelaide, leading the Machine Learning Group. He held an ARC Future Fellowship from 2012 to 2016. His research and teaching have focused on Statistical Machine Learning and Computer Vision. These days his team focuses its effort on Deep Learning. In particular, with tools from deep learning, his research contributes to understanding the visual world around us by exploiting the large amounts of imaging data.
Chunhua received a PhD degree at the University of Adelaide, then worked at the NICTA (National ICT Australia) computer vision program for about six years. From 2006 to 2011, he held an adjunct position at the College of Engineering & Computer Science, Australian National University. He moved back to the University of Adelaide in 2011.
Tom Drummond (Monash University)
Algorithms and Architecture: Past, Present and Future
Abstract: To be confirmed.
Bio: Professor Drummond is a Chief Investigator based at Monash. He studied a BA in mathematics at the University of Cambridge. In 1989 he emigrated to Australia and worked for CSIRO in Melbourne for four years before moving to Perth for his PhD in Computer Science at Curtin University. In 1998 he returned to Cambridge as a post-doctoral Research Associate and in 1991 was appointed as a University Lecturer. In 2010 he returned to Melbourne and took up a Professorship at Monash University. His research is principally in the field of real-time computer vision (i.e. processing of information from a video camera in a computer in real-time typically at frame rate), machine learning and robust methods. These have applications in augmented reality, robotics, assistive technologies for visually impaired users as well as medical imaging.
Matthew Dunbabin (Queensland University of Technology)
Title: To be confirmed
Abstract: To be confirmed
Bio: Dr Matthew Dunbabin joined QUT as a Principal Research Fellow (Autonomous Systems) in 2013. He is known internationally for his research into field robotics, particularly environmental robots, and their application to large-scale marine habitat monitoring, marine pest (Crown-of-Thorns Starfish) control, and aquatic greenhouse gas mapping. He has wide research interests including adaptive sampling and path planning, vision-based navigation, cooperative robotics, as well as robot and sensor network interactions. Dr Dunbabin received his Bachelor of Engineering in Aerospace Engineering from the Royal Melbourne Institute of Technology and his PhD from the Queensland University of Technology. He started his professional career in 1995 as a project engineer at Roaduser Research International, and following his PhD joined the Commonwealth Scientific and Industrial Research Organisation (CSIRO) in the Autonomous Systems Laboratory. At CSIRO he held various roles including Principal Research Scientist, project leader and the Robotics Systems and Marine Robotics team leader before moving to QUT in 2013. A strong advocate of robotic systems in civilian applications, Dr Dunbabin is involved in a number of initiatives aimed at promoting, educating and demonstrating autonomous systems to a range of interest groups nationally and internationally.
Nick Barnes (Australian National University & Data61)
Title: Low level computer vision techniques for 3D scene parsing in bionic eyes and endoscopy
Abstract: Implantable visual prosthetic devices have low dynamic range and so users may have difficulty with poorly contrasted objects. We have shown that computer vision techniques can help by ensuring the visibility of key objects in the scene. In Computer Vision this is semantic segmentation. Underlying this are techniques in visual saliency and edge detection. I’ll present some of our recent work in this area as well as results in human implanted vision and our ongoing studies with Bionic Vision Technologies.
Bio: Nick Barnes received the B.Sc. degree with honours in 1992, and a Ph.D. in computer vision for robot navigation in 1999 from the University of Melbourne. From 1992-1994 he worked as a consultant in the IT industry. In 1999 he was a visiting research fellow at the LIRA-Lab at the University of Genoa, Italy, supported by an Achiever Award from the Queens’ Trust for Young Australians. From 2000 to 2003, he was a lecturer with the Department of Computer Science and Software Engineering, The University of Melbourne. Since 2003 he has been with NICTA’s Canberra Research Laboratory, which merged to become Data61@CSIRO. He has been conducting research in the areas of computer vision, vision for driver and low vision assistance, and vision for vehicle guidance for more than 15 years. His team developed vision processing for bionic vision that was tested with three individuals implanted with a retinal prosthesis in 2012-2014. Their results showed that by using improved vision processing, implanted individuals could achieve better results on standard low vision tests, and functional vision tests. Further trials will commence during 2016. He has more than 100 peer reviewed scientific publications and is co-inventor of eight patent applications. He is currently a senior principal researcher and research group leader in computer vision for Data61 and an Adjunct Associate Professor with the Australian National University. His research interests include visual dynamic scene analysis, wearable sensing, vision for low vision assistance, computational models of biological vision, feature detection, vision for vehicle guidance and medical image analysis.