Video results: FPV of quadrotor flight

Shut up and show me the code

Autonomous Quadrotor Flight in Simulation using RL
Ratnesh Madaan, Dhruv Mauria Saxena, Rogério Bonatti, Shohin Mukherjee

Motivation

Objectives

➢ Despite advancements in sensing technologies, it
is difficult to develop robust systems by separating
perception and control.
➢ Learning to fly in the real world is impractical since
it is time consuming and expensive.
➢ Learning to fly in simulation opens the possibility
of transferring learned policies to the real world.

➢ Develop an open-source Gazebo environment
integrated with Gym which can be used by the
community for reinforcement learning research.
➢ Train a deep Q-network capable of flying a drone
autonomously in the Gazebo environment.

Environment
➢ Environment consists of randomly placed cylindrical obstacles, simulated and rendered in Gazebo.
➢ Positions of the cylinders change for each episode.
➢ Quadrotor is equipped with a planar laser
rangefinder and a front-facing RGB-D camera.
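
The Gym integration described in the Objectives can be sketched as a minimal Gym-style environment class. Everything below the reset/step contract is a hypothetical placeholder: the Gazebo/ROS service calls, the collision threshold, and the reward shaping are illustrative assumptions, not the poster's actual implementation.

```python
# Minimal sketch of a Gym-style wrapper around the Gazebo quadrotor sim.
# The reset/step interface and the discrete 9-action yaw space come from
# the poster; all simulator calls and reward values are hypothetical.
import random

N_YAW_ACTIONS = 9    # 9 discrete yaw-angle commands (poster output layer)
LASER_BEAMS = 70     # planar rangefinder returns 70 ranges per scan


class GazeboQuadEnv:
    def reset(self):
        self._respawn_cylinders()        # cylinder positions change every episode
        return self._get_laser_scan()

    def step(self, action):
        assert 0 <= action < N_YAW_ACTIONS
        # ...command the yaw angle and advance the simulation (omitted)...
        obs = self._get_laser_scan()
        collided = min(obs) < 0.3        # hypothetical collision threshold [m]
        reward = -10.0 if collided else 1.0   # hypothetical reward shaping
        return obs, reward, collided, {}

    def _respawn_cylinders(self):
        pass  # would call Gazebo's model-spawn/teleport services via ROS

    def _get_laser_scan(self):
        # stand-in for a real laser scan; ranges in metres
        return [random.uniform(0.3, 10.0) for _ in range(LASER_BEAMS)]
```

A DQN training loop would then interact with this class exactly as with any other Gym environment: `obs = env.reset()`, then repeated `env.step(action)` calls until the episode terminates.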

Partial Results
[Plots: train reward, train episode length, and test reward curves for grayscale, monocular camera images]

Learning On Images
Network architecture:
➢ Input: 84 × 84 × 4 stacked grayscale images
➢ Conv layer 1: 32 filters, 8 × 8, stride 4
➢ Conv layer 2: 64 filters, 4 × 4, stride 2
➢ Conv layer 3: 64 filters, 3 × 3, stride 1
➢ Fully-connected layer: 512 units
➢ Output: 9 units (one per discrete yaw-angle action)
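
The layer sizes above imply the following feature-map shapes, which can be checked with standard valid-convolution arithmetic (a quick sanity sketch, assuming no padding, as in the original Atari DQN architecture [1]):

```python
def conv_out(size, kernel, stride):
    """Output side length of a valid (no-padding) convolution."""
    return (size - kernel) // stride + 1

s1 = conv_out(84, 8, 4)   # conv1: 32 filters, 8x8, stride 4 -> 20x20x32
s2 = conv_out(s1, 4, 2)   # conv2: 64 filters, 4x4, stride 2 -> 9x9x64
s3 = conv_out(s2, 3, 1)   # conv3: 64 filters, 3x3, stride 1 -> 7x7x64
flat = s3 * s3 * 64       # features flattened into the 512-unit FC layer
```

So the 84 × 84 × 4 input shrinks to a 7 × 7 × 64 map (3136 features) before the 512-unit fully-connected layer and the 9-unit output.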


Conclusion And Future Work
Learning On Laser Data
Network architecture:
➢ Input: 70 × 4 array
➢ Fully-connected layer: 512 units
➢ Fully-connected layer: 512 units
➢ Output: 9 units (number of actions)
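
For scale, the parameter count of this MLP is easy to tally, assuming the 70 × 4 input is flattened to 280 features before the first dense layer (a plausible reading, by analogy with the 4-frame image stack):

```python
def dense_params(n_in, n_out):
    """Parameters of a fully-connected layer: weights plus biases."""
    return n_in * n_out + n_out

total = (dense_params(70 * 4, 512)   # 280 -> 512
         + dense_params(512, 512)    # 512 -> 512
         + dense_params(512, 9))     # 512 -> 9
```

At roughly 411k parameters, the laser network is far smaller than the convolutional image network, which is one practical appeal of laser input.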

Learning Algorithm - DQN [1]
➢ TD target: y = r + γ maxₐ′ Q_w⁻(s′, a′)
➢ Loss: L(w) = E[(y − Q_w(s, a))²]
➢ Q_w⁻: target network (periodically synced copy); Q_w: online network
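
The target/online split can be sketched as the standard DQN update from [1]; the function names and NumPy formulation below are illustrative, not the poster's code:

```python
import numpy as np

def dqn_targets(rewards, dones, q_next_target, gamma=0.99):
    """Bellman targets y = r + gamma * max_a' Q_{w-}(s', a'),
    zeroed past terminal states. q_next_target comes from the
    frozen target network, so the targets stay stable between syncs."""
    return rewards + gamma * (1.0 - dones) * q_next_target.max(axis=1)

def dqn_loss(q_online, actions, targets):
    """Mean squared TD error between Q_w(s, a) for the taken actions
    and the (fixed) targets; minimized w.r.t. the online weights w."""
    q_sa = q_online[np.arange(len(actions)), actions]
    return np.mean((targets - q_sa) ** 2)
```

In training, `q_next_target` is evaluated with the target network Q_w⁻, the loss is minimized only through `q_online` (Q_w), and the target weights are copied from the online weights every fixed number of steps.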

➢ After fewer than 1M training iterations, we have not yet observed significant learning in the environments with laser scans or depth images as input.
➢ Policies learned on depth images and laser data should transfer more easily to real-world applications.
➢ Over the summer we plan to fly real quadcopters with the policies learned in simulation.

References
➢ [1] Mnih, Volodymyr, et al. "Playing Atari with deep reinforcement learning." arXiv preprint arXiv:1312.5602 (2013).
➢ [2] Sadeghi, Fereshteh, and Sergey Levine. "(CAD)²RL: Real single-image flight without a single real image." arXiv preprint arXiv:1611.04201 (2016).





