Introduction

This project was inspired by Radu Mariescu-Istodor's tutorial series. Autonomous agents use neural networks to map sensory input to control decisions, a mapping typically learned through reinforcement learning. For autonomous vehicles this requires extensive training, and simulation is invaluable for accelerating that learning. This project focuses on simulating vehicle dynamics, building such a simulator as a preliminary step.

Within the developed simulator, a basic neural network is trained to navigate complex traffic scenarios. This methodology streamlines the training process, helping autonomous cars grasp the intricacies of real-world environments more efficiently.

Goals

The goal of this project is to develop a simulator for an autonomous reinforcement learning agent (a car) and to train it with a simple neural network, so that the car can navigate simulated traffic along a straight highway using its sensors.

1. Develop a vehicle simulator
2. Simulate environment and traffic
3. Train the agent to be autonomous 

Work Done

The background was set up using basic HTML and a canvas element. On this, the road, its lanes, etc. were drawn with the Canvas 2D drawing API, with each element declared as its own JavaScript object. The vehicle was modelled under a no-slip condition satisfying Ackermann steering.
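
As a rough sketch (the class name, lane count, and drawing details here are illustrative assumptions, not the actual project code), such a road object might look like this:

    // Sketch of a road drawn on an HTML canvas (illustrative, not the
    // actual project code).
    class Road {
        constructor(x, width, laneCount = 3) {
            this.x = x;                     // horizontal center of the road
            this.width = width;
            this.laneCount = laneCount;
            this.left = x - width / 2;
            this.right = x + width / 2;
        }

        // Center x-coordinate of a given lane, used to place cars.
        getLaneCenter(laneIndex) {
            const laneWidth = this.width / this.laneCount;
            return this.left + laneWidth / 2 + laneIndex * laneWidth;
        }

        draw(ctx) {
            ctx.lineWidth = 5;
            ctx.strokeStyle = "white";
            // Dashed lane separators, solid outer borders.
            for (let i = 0; i <= this.laneCount; i++) {
                const x = this.left + (this.width / this.laneCount) * i;
                ctx.setLineDash(i > 0 && i < this.laneCount ? [20, 20] : []);
                ctx.beginPath();
                ctx.moveTo(x, -100000);     // long enough to look infinite
                ctx.lineTo(x, 100000);
                ctx.stroke();
            }
        }
    }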

The car has 3 input parameters:
1. Accelerate
2. Left Steer
3. Right Steer

This is not perfectly accurate, but it works for this simple model. These three controls will be the outputs of the neural network.
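
A minimal sketch of an update step driven by these three controls, assuming a simple no-slip model (all constants and names here are illustrative, not the actual project code):

    // Sketch of a no-slip vehicle update driven by the three controls.
    class Car {
        constructor(x, y) {
            this.x = x; this.y = y;
            this.angle = 0;                 // heading; 0 = straight up the road
            this.speed = 0;
            this.maxSpeed = 3;
            this.controls = { accelerate: false, left: false, right: false };
        }

        update() {
            if (this.controls.accelerate) this.speed += 0.2;  // throttle
            this.speed -= 0.05;                               // rolling friction
            this.speed = Math.max(0, Math.min(this.speed, this.maxSpeed));

            // With no slip, steering simply rotates the heading; the car
            // always moves along the direction it is facing.
            if (this.speed > 0) {
                if (this.controls.left)  this.angle += 0.03;
                if (this.controls.right) this.angle -= 0.03;
            }
            this.x -= Math.sin(this.angle) * this.speed;
            this.y -= Math.cos(this.angle) * this.speed;
        }
    }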

Sensor lines (rays) were then added to the vehicle to help it detect what lies in front of it. The sensor distances are fed directly into the neural network as inputs, which becomes useful once traffic is simulated and the vehicle needs to dodge other cars. The network is a two-layer fully connected network. Because the code is not vectorized, it runs very slowly, so only a few nodes were chosen. A visualizer was also made for the nodes.
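
A minimal sketch of such a non-vectorized two-layer network (the layer sizes, names, and binary activation are assumptions for illustration, not the project's actual code). Each sensor reading is normalized to a value in [0, 1] and fed in; the outputs drive the three controls.

    // Sketch of a non-vectorized two-layer fully connected network.
    class Level {
        constructor(inputCount, outputCount) {
            this.weights = [];   // weights[i][j]: input i -> output j
            this.biases = [];
            for (let j = 0; j < outputCount; j++) {
                this.biases.push(Math.random() * 2 - 1);
            }
            for (let i = 0; i < inputCount; i++) {
                this.weights.push([]);
                for (let j = 0; j < outputCount; j++) {
                    this.weights[i].push(Math.random() * 2 - 1);
                }
            }
        }

        static feedForward(inputs, level) {
            const outputs = [];
            for (let j = 0; j < level.biases.length; j++) {
                let sum = 0;
                for (let i = 0; i < inputs.length; i++) {
                    sum += inputs[i] * level.weights[i][j];
                }
                // Binary activation: each control is either on or off.
                outputs.push(sum > level.biases[j] ? 1 : 0);
            }
            return outputs;
        }
    }

    // Two layers, e.g. 5 sensor rays -> 6 hidden units -> 3 controls.
    class NeuralNetwork {
        constructor(neuronCounts) {     // e.g. [5, 6, 3]
            this.levels = [];
            for (let k = 0; k < neuronCounts.length - 1; k++) {
                this.levels.push(new Level(neuronCounts[k], neuronCounts[k + 1]));
            }
        }
        static feedForward(inputs, network) {
            let outputs = Level.feedForward(inputs, network.levels[0]);
            for (let k = 1; k < network.levels.length; k++) {
                outputs = Level.feedForward(outputs, network.levels[k]);
            }
            return outputs;
        }
    }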


Results

To train the network, many episodes are generated, each with randomly initialized weights and biases. The idea behind this is that some set of weights will happen to give a proper mapping from sensor activations to actuator activations.
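
A minimal sketch of this random-search setup, building on the sketches above (the population size and the fitness measure, distance travelled up the road, are assumptions):

    // Random search over network weights: spawn many cars, each with a
    // freshly randomized network, run them all, and keep the best one.
    const N = 100;
    const cars = [];
    for (let i = 0; i < N; i++) {
        const car = new Car(road.getLaneCenter(1), 100);
        car.brain = new NeuralNetwork([5, 6, 3]);  // random weights/biases
        cars.push(car);
    }

    // ... run the simulation until every car crashes or time runs out ...

    // Smallest y = travelled farthest up the road = best mapping found.
    const bestCar = cars.reduce((best, c) => (c.y < best.y ? c : best));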

Such an approach gives inconsistent results and depends heavily on the initial seed. In some cases the car navigates the initial traffic but then struggles to get past later cars. In other cases it gets stuck in a loop: it approaches a car, slows down, and accelerates again. This happens because the reward policy favors the agent with the maximum run time, so the stalling agent survives while the other agents reach an end state by colliding.

A genetic algorithm is used to engineer the best agent. Many networks are run in parallel over multiple iterations, and the best agents are selected after each iteration. New runs are seeded by adding slight variations (mutations) to these best agents. This eventually optimizes the network weights.
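
A sketch of what such a mutation step can look like (the lerp-based scheme, the mutation amount, and the use of structuredClone are illustrative assumptions, not the project's actual code):

    // Nudge every weight and bias a small random amount toward a fresh
    // random value; amount = 0 keeps the network, amount = 1 randomizes it.
    function lerp(a, b, t) { return a + (b - a) * t; }

    function mutate(network, amount = 0.1) {
        for (const level of network.levels) {
            for (let j = 0; j < level.biases.length; j++) {
                level.biases[j] = lerp(level.biases[j], Math.random() * 2 - 1, amount);
            }
            for (let i = 0; i < level.weights.length; i++) {
                for (let j = 0; j < level.weights[i].length; j++) {
                    level.weights[i][j] =
                        lerp(level.weights[i][j], Math.random() * 2 - 1, amount);
                }
            }
        }
    }

    // Next generation: clones of the best brain, each slightly mutated.
    for (const car of cars) {
        car.brain = structuredClone(bestCar.brain);
        mutate(car.brain, 0.1);
    }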

Although the current network has more than 50 parameters, its architecture is too basic for the given problem - it cannot encode complex information in its mapping. This makes the car struggle as traffic increases. For a preliminary demo, however, it performs satisfactorily.
