Learning Table-top Object Reaching with a 7 DoF Robotic Arm from Simulation
Contributions:
Conducted a feasibility analysis of learning vision-based robotic planar reaching using DQNs in simulation (Zhang et al., 2015).
Proposed a modular deep Q network architecture for fast and low-cost transfer of visuo-motor policies from simulation to the real world (Zhang et al., 2017).
Proposed an end-to-end fine-tuning method using weighted losses to improve hand-eye coordination (Zhang et al., 2017).
Proposed a kinematics-based guided policy search method (K-GPS) to speed up Q-learning for robotic applications where kinematic models are known (Zhang et al., 2017); a hypothetical sketch of the guided-exploration idea follows this list.
Demonstrated the approaches on robotic reaching tasks with a real Baxter robot in velocity and position control modes, e.g., table-top object reaching in clutter (Zhang et al., 2019) and planar reaching (Zhang et al., 2017).
Further investigations into semi-supervised and unsupervised transfer from simulation to the real world using adversarial discriminative approaches are underway (Zhang et al., 2019).
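As referenced in the K-GPS item above, here is a minimal, hypothetical sketch of how a known kinematic model could guide exploration during Q-learning: instead of a uniformly random exploration action, the guide picks the discrete joint command that moves the arm closest to a target joint configuration (e.g. obtained via inverse kinematics). The exact K-GPS formulation is given in Zhang et al. (2017); the function names and interfaces below are illustrative assumptions.

```python
import random
import numpy as np

def kinematic_guide_action(q_current, q_target, action_deltas):
    """Pick the discrete action whose joint step best reduces the distance to
    the target joint configuration q_target (e.g. from inverse kinematics)."""
    errors = [np.linalg.norm(q_current + d - q_target) for d in action_deltas]
    return int(np.argmin(errors))

def guided_exploration_action(policy_action, q_current, q_target,
                              action_deltas, guide_prob=0.5):
    """With probability guide_prob take the kinematic guide's action instead of
    the current policy's action (replacing uniform epsilon-greedy exploration)."""
    if random.random() < guide_prob:
        return kinematic_guide_action(q_current, q_target, action_deltas)
    return policy_action
```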
References
2019
Adversarial discriminative sim-to-real transfer of visuo-motor policies
Fangyi Zhang, Jürgen Leitner, Zongyuan Ge, Michael Milford, and Peter Corke
The International Journal of Robotics Research (IJRR), 2019
Various approaches have been proposed to learn visuo-motor policies for real-world robotic applications. One solution is to first learn in simulation and then transfer to the real world. For the transfer, most existing approaches require real-world images with labels. However, the labelling process is often expensive or even impractical in many robotic applications. In this paper, we propose an adversarial discriminative sim-to-real transfer approach to reduce the cost of labelling real data. The effectiveness of the approach is demonstrated with modular networks in a table-top object reaching task where a 7 DoF arm is controlled in velocity mode to reach a blue cuboid in clutter through visual observations. The adversarial transfer approach reduced the labelled real data requirement by 50%: policies can be transferred to real environments with only 93 labelled and 186 unlabelled real images. The transferred visuo-motor policies are robust to novel (not seen in training) objects in clutter and even a moving target, achieving a 97.8% success rate and 1.8 cm control accuracy.
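The core of the approach is adapting a perception module trained in simulation so that features from unlabelled real images align with the simulation feature space, with only a small set of labelled real images providing direct supervision. Below is a minimal PyTorch sketch of such adversarial discriminative feature alignment; the network sizes, the 3-value bottleneck, the loss weighting, and the training-step structure are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Perception(nn.Module):
    """Maps an RGB image to a low-dimensional scene representation."""
    def __init__(self, bottleneck_dim=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.head = nn.Linear(32 * 4 * 4, bottleneck_dim)

    def forward(self, x):
        return self.head(self.features(x))

class Discriminator(nn.Module):
    """Predicts whether a bottleneck vector came from the sim or real encoder."""
    def __init__(self, feat_dim=3):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 1))

    def forward(self, f):
        return self.net(f)

def adaptation_step(real_enc, sim_enc, disc, sim_imgs, real_imgs,
                    labelled_imgs, labels, opt_enc, opt_disc, sup_weight=1.0):
    """One adversarial adaptation step: train the discriminator, then train the
    real-image encoder to fool it while fitting the few labelled real images."""
    bce = nn.BCEWithLogitsLoss()
    # 1) Discriminator: separate sim features (label 1) from real features (0).
    with torch.no_grad():
        f_sim = sim_enc(sim_imgs)
    f_real = real_enc(real_imgs).detach()
    d_loss = (bce(disc(f_sim), torch.ones(len(f_sim), 1)) +
              bce(disc(f_real), torch.zeros(len(f_real), 1)))
    opt_disc.zero_grad()
    d_loss.backward()
    opt_disc.step()
    # 2) Real encoder: fool the discriminator on unlabelled real images and
    #    regress the labelled real images to their ground-truth bottlenecks.
    adv_loss = bce(disc(real_enc(real_imgs)), torch.ones(len(real_imgs), 1))
    sup_loss = F.mse_loss(real_enc(labelled_imgs), labels)
    enc_loss = adv_loss + sup_weight * sup_loss
    opt_enc.zero_grad()
    enc_loss.backward()
    opt_enc.step()
    return d_loss.item(), enc_loss.item()
```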
2017
Modular Deep Q Networks for Sim-to-real Transfer of Visuo-motor Policies
Fangyi Zhang, Jürgen Leitner, Michael Milford, and Peter Corke
In Australasian Conference on Robotics and Automation (ACRA), Dec 2017
While deep learning has had significant successes in computer vision thanks to the abundance of visual data, collecting sufficiently large real-world datasets for robot learning can be costly. To increase the practicality of these techniques on real robots, we propose a modular deep reinforcement learning method capable of transferring models trained in simulation to a real-world robotic task. We introduce a bottleneck between perception and control, enabling the networks to be trained independently, but then merged and fine-tuned in an end-to-end manner to further improve hand-eye coordination. On a canonical planar visually guided robot reaching task, a fine-tuned accuracy of 1.6 pixels is achieved, a significant improvement over naive transfer (17.5 pixels), showing the potential for more complicated and broader applications. Our method provides a technique for more efficient learning and transfer of visuo-motor policies for real robotic systems without relying entirely on large real-world robot datasets.
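To make the perception/control split concrete, here is a minimal PyTorch sketch of a modular network with an explicit bottleneck: a perception module regresses a low-dimensional scene configuration from the image, a control module maps that configuration to Q-values, and the two are merged into one network for end-to-end fine-tuning. The layer sizes, 84x84 grayscale input, bottleneck contents, and action count are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class PerceptionModule(nn.Module):
    """84x84 grayscale image -> low-dimensional scene configuration
    (the bottleneck), e.g. target position and joint angles."""
    def __init__(self, bottleneck_dim=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 8, stride=4), nn.ReLU(),   # 84 -> 20
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),  # 20 -> 9
            nn.Flatten(),
            nn.Linear(32 * 9 * 9, 256), nn.ReLU(),
            nn.Linear(256, bottleneck_dim),
        )

    def forward(self, image):
        return self.net(image)

class ControlModule(nn.Module):
    """Scene configuration -> Q-values over discrete joint-velocity actions."""
    def __init__(self, bottleneck_dim=5, num_actions=9):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(bottleneck_dim, 128), nn.ReLU(),
            nn.Linear(128, num_actions),
        )

    def forward(self, scene_config):
        return self.net(scene_config)

class ModularDQN(nn.Module):
    """Merged network: modules are trained independently, then fine-tuned
    end-to-end through the bottleneck."""
    def __init__(self, perception, control):
        super().__init__()
        self.perception = perception
        self.control = control

    def forward(self, image):
        return self.control(self.perception(image))

# Merge independently trained modules and pick a greedy action:
dqn = ModularDQN(PerceptionModule(), ControlModule())
action = dqn(torch.rand(1, 1, 84, 84)).argmax(dim=-1)
```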
Tuning Modular Networks With Weighted Losses for Hand-Eye Coordination
Fangyi Zhang, Jürgen Leitner, Michael Milford, and Peter Corke
In IEEE Conference on Computer Vision and Pattern Recognition Workshops, Jul 2017
This paper introduces an end-to-end fine-tuning method to improve hand-eye coordination in modular deep visuo-motor policies (modular networks) where each module is trained independently. Benefiting from weighted losses, the fine-tuning method significantly improves the performance of the policies for a robotic planar reaching task.
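A minimal sketch of the weighted end-to-end fine-tuning idea, reusing the ModularDQN sketch above: the combined objective weights a perception loss on the bottleneck prediction against a Q-value (task) loss so that neither module dominates the gradient. The specific loss forms, weights, and batch contents here are illustrative assumptions; the paper defines its own weighted losses.

```python
import torch.nn.functional as F

def weighted_finetune_step(dqn, batch, optimizer, w_perception=1.0, w_q=1.0):
    """One end-to-end fine-tuning step on a merged perception+control network.
    `batch` is assumed to provide images, ground-truth scene configurations
    (available in simulation), and Q-value targets (e.g. from a Bellman backup
    or a frozen teacher network)."""
    images, scene_labels, q_targets = batch
    scene_pred = dqn.perception(images)        # bottleneck prediction
    q_pred = dqn.control(scene_pred)           # Q-values through the bottleneck
    loss_perception = F.mse_loss(scene_pred, scene_labels)
    loss_q = F.mse_loss(q_pred, q_targets)
    loss = w_perception * loss_perception + w_q * loss_q
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), loss_perception.item(), loss_q.item()
```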
2015
Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control
Fangyi Zhang, Jürgen Leitner, Michael Milford, Ben Upcroft, and Peter Corke
In Australasian Conference on Robotics and Automation (ACRA), Dec 2015
This paper introduces a machine learning based system for controlling a robotic manipulator with visual perception only. The capability to autonomously learn robot controllers solely from raw-pixel images and without any prior knowledge of configuration is shown for the first time. We build upon the success of recent deep reinforcement learning and develop a system for learning target reaching with a three-joint robot manipulator using external visual observation. A Deep Q Network (DQN) was demonstrated to perform target reaching after training in simulation. Transferring the network to real hardware and real observations with a naive approach failed, but experiments show that the network works when replacing camera images with synthetic images.
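To make the learning setup concrete, here is a minimal PyTorch sketch of a DQN update for this kind of target-reaching task: a convolutional Q-network over raw pixels, discrete joint-velocity actions, and a Bellman backup against a target network. The image size, action count, network sizes, and hyperparameters are illustrative assumptions, not the paper's settings.

```python
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReachingDQN(nn.Module):
    """84x84 grayscale image -> Q-values for discrete joint-velocity actions."""
    def __init__(self, num_actions=9):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 8, stride=4), nn.ReLU(),   # 84 -> 20
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),  # 20 -> 9
            nn.Flatten(),
            nn.Linear(32 * 9 * 9, 256), nn.ReLU(),
            nn.Linear(256, num_actions),
        )

    def forward(self, x):
        return self.net(x)

def epsilon_greedy(q_net, state, epsilon, num_actions=9):
    """Exploration policy used while filling the replay memory."""
    if random.random() < epsilon:
        return random.randrange(num_actions)
    with torch.no_grad():
        return int(q_net(state.unsqueeze(0)).argmax(dim=1))

def dqn_update(q_net, target_net, optimizer, batch, gamma=0.99):
    """One Q-learning step on a replay batch of (s, a, r, s', done) tensors."""
    states, actions, rewards, next_states, dones = batch
    q_sa = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = target_net(next_states).max(dim=1).values
        targets = rewards + gamma * (1.0 - dones) * q_next
    loss = F.smooth_l1_loss(q_sa, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```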