Video results: FPV of quadrotor flight

Shut up and show me the code

Autonomous Quadrotor Flight in Simulation using RL
Ratnesh Madaan, Dhruv Mauria Saxena, Rogério Bonatti, Shohin Mukherjee

Motivation
➢ Despite advancements in sensing technologies, it
is difficult to develop robust systems by separating
perception and control.
➢ Learning to fly in the real world is impractical since
it is time-consuming and expensive.
➢ Learning to fly in simulation opens the possibility
of transferring learned policies to the real world.

Objectives
➢ Develop an open-source Gazebo environment
integrated with Gym which can be used by the
community for reinforcement learning research.
➢ Train a deep Q-network capable of flying a drone
autonomously in the Gazebo environment.

Environment
➢ The environment consists of randomly placed
cylindrical obstacles, simulated and rendered in
Gazebo.
➢ The positions of the cylinders change every episode.
➢ The quadrotor is equipped with a planar laser
rangefinder and a front-facing RGB-D camera.
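
The per-episode re-randomization described above might look roughly like the following sketch. It is not the authors' Gazebo code: the class name `CylinderWorld`, the parameters `n_cylinders`, `arena_half_width` and `min_start_clearance`, their default values, and the assumption that the quadrotor starts at the origin are all hypothetical and chosen only for illustration.

```python
import math
import random


class CylinderWorld:
    """Hypothetical stand-in for the obstacle layout; not the authors' Gazebo code."""

    def __init__(self, n_cylinders=10, arena_half_width=10.0,
                 min_start_clearance=2.0, seed=None):
        # All defaults are illustrative assumptions, not values from the poster.
        self.n_cylinders = n_cylinders
        self.arena_half_width = arena_half_width
        self.min_start_clearance = min_start_clearance
        self.rng = random.Random(seed)
        self.cylinders = []

    def reset(self):
        """Sample new (x, y) cylinder centres at the start of an episode."""
        self.cylinders = []
        while len(self.cylinders) < self.n_cylinders:
            x = self.rng.uniform(-self.arena_half_width, self.arena_half_width)
            y = self.rng.uniform(-self.arena_half_width, self.arena_half_width)
            # Reject samples too close to the origin, where the quadrotor
            # is assumed to start.
            if math.hypot(x, y) >= self.min_start_clearance:
                self.cylinders.append((x, y))
        return list(self.cylinders)


env = CylinderWorld(seed=0)
first_layout = env.reset()   # obstacle positions for episode 1
second_layout = env.reset()  # a fresh layout for episode 2
```

In a real setup, each sampled position would be passed to the simulator to move or respawn the corresponding cylinder model before the episode begins.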

Partial Results
Graphs for grayscale, monocular camera images
[Figure panel: Train Reward]

Learning On Images
Network architecture:
➢ Input: 84 × 84 × 4 images
➢ Conv layer 1: 32 filters of size 8 × 8, stride 4
➢ Conv layer 2: 64 filters of size 4 × 4, stride 2
➢ Conv layer 3: 64 filters of size 3 × 3, stride 1
➢ Fully-connected layer: 512 units
➢ Output: 9 units (corresponding to yaw angles)
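
As a sanity check on the image network listed above, the sketch below traces the feature-map size through the three convolutional layers. It assumes no padding ("valid" convolutions), which the poster does not state; the names `conv_out`, `CONV_LAYERS` and `image_net_shapes` are illustrative only.

```python
def conv_out(size, kernel, stride):
    """Spatial size after a convolution with no padding."""
    return (size - kernel) // stride + 1


# (number of filters, kernel size, stride) for the three conv layers.
CONV_LAYERS = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]


def image_net_shapes(height=84, width=84, channels=4):
    """Return the (H, W, C) shape after each layer and the flattened size."""
    shapes = [(height, width, channels)]
    h, w = height, width
    for n_filters, kernel, stride in CONV_LAYERS:
        h = conv_out(h, kernel, stride)
        w = conv_out(w, kernel, stride)
        shapes.append((h, w, n_filters))
    flat_size = h * w * shapes[-1][2]  # input width of the 512-unit layer
    return shapes, flat_size


shapes, flat_size = image_net_shapes()
for shape in shapes:
    print(shape)
print("flattened:", flat_size)
```

Under the no-padding assumption the spatial size shrinks from 84 to 20, then 9, then 7, so the 512-unit fully-connected layer would receive 7 × 7 × 64 = 3136 inputs.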

[Figure panels for Partial Results: Train Episode Length, Test Reward]

Learning On Laser Data
Network architecture:
➢ Input: 70 × 4 array
➢ Fully-connected layer: 512 units
➢ Fully-connected layer: 512 units
➢ Output: 9 units (number of actions)

Learning Algorithm - DQN
➢ Q_{w^-}: Target Network
➢ Q_w: Online Network

Conclusion And Future Work
➢ With fewer than 1M iterations, we could not yet
observe significant learning in the environments
with laser and depth images as inputs.
➢ Policies learned on depth images and laser data
can be transferred more easily to real-life
applications.
➢ During the summer we plan to test real
quadcopters flying with the policies learned in
simulation.
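
For readers unfamiliar with the online and target networks named under Learning Algorithm - DQN, the following is a minimal sketch of how the two are commonly used together in DQN. It is not taken from the authors' code: the discount factor, the helper names `td_target`, `td_loss` and `sync_target`, and the toy Q-values are all assumptions made for illustration.

```python
import copy

GAMMA = 0.99  # assumed discount factor; the poster does not state one


def td_target(reward, next_q_target, done, gamma=GAMMA):
    """Return r if the episode ended, else r + gamma * max_a' Q_{w^-}(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def td_loss(q_online, action, target):
    """Squared error between Q_w(s, a) for the action taken and the target."""
    return (q_online[action] - target) ** 2


def sync_target(online_weights):
    """Copy the online weights w into the target weights w^-."""
    return copy.deepcopy(online_weights)


# Toy transition with 9 action values, matching the 9 output units.
q_online = [0.0, 0.5, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]       # Q_w(s, .)
next_q_target = [0.2, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # Q_{w^-}(s', .)

y = td_target(reward=1.0, next_q_target=next_q_target, done=False, gamma=0.5)
loss = td_loss(q_online, action=1, target=y)
print(y, loss)
```

The online network is the one updated by gradient steps on this loss, while the target network is held fixed between periodic calls to something like `sync_target`, which keeps the regression target from shifting at every update.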

References
➢ [1] V. Mnih et al., "Playing Atari with deep reinforcement
learning," arXiv preprint arXiv:1312.5602, 2013.
➢ [2] F. Sadeghi and S. Levine, "CAD2RL: Real single-image flight
without a single real image," arXiv preprint arXiv:1611.04201,
2016.
