End-to-End Deep Reinforcement Learning Without Reward Engineering

End-to-End Deep Reinforcement Learning Without Reward Engineering

We have developed an end-to-end method that allows robots to learn from a modest number of images that depict successful completion of a task, without any manual reward engineering. While VICE is capable of learning end-to-end policies for solving real world robotic tasks without any engineering for obtaining rewards, it does have a limitation: it needs thousands of positive examples provided upfront in order to learn, and this could be a burden on the human user. The combined method, which we call VICE-RAQ, is able to solve real world robotics tasks with about 80 goal example images provided up front, followed by 25-75 active queries.

Source: bair.berkeley.edu