feat: state of art
manuandru committed Jan 26, 2024
1 parent 55eae15 commit c2e9e3c
Showing 3 changed files with 132 additions and 22 deletions.
11 changes: 10 additions & 1 deletion report/bibliography.bib
@@ -58,7 +58,7 @@ @misc{OpenAIGym
eprint = {arXiv:1606.01540}
}

@article{ROS2,
@article{Ros2,
author = {Steven Macenski and Tully Foote and Brian Gerkey and Chris Lalancette and William Woodall},
title = {Robot Operating System 2: Design, architecture, and uses in the wild},
journal = {Science Robotics},
@@ -69,3 +69,12 @@ @article{ROS2
doi = {10.1126/scirobotics.abm6074},
url = {https://www.science.org/doi/abs/10.1126/scirobotics.abm6074}
}

@inproceedings{F1tenthGym,
title = {F1TENTH: An Open-source Evaluation Environment for Continuous Control and Reinforcement Learning},
author = {O’Kelly, Matthew and Zheng, Hongrui and Karthik, Dhruv and Mangharam, Rahul},
booktitle = {NeurIPS 2019 Competition and Demonstration Track},
pages = {77--89},
year = {2020},
organization = {PMLR}
}
Binary file added report/img/ppo.jpg
143 changes: 122 additions & 21 deletions report/index.tex
@@ -41,7 +41,7 @@ \section{Introduction}

% \end{itemize}

The autonomous driving represents a vital area of research in the automotive technology advancement with applications stretching from city roads to extreme motor sport environments.
Autonomous driving represents a vital area of research in the advancement of automotive technology with applications that range from city roads to extreme motor sport environments.
%
In the context of racing cars, the demand for excellent performance and timely decisions poses a unique challenge that prompts the adoption of innovative approaches.
%
@@ -57,7 +57,7 @@ \section{Introduction}
%
This approach seeks to improve the vehicle’s ability to follow optimal paths while considering unique features of circuits used in car racing competitions.
%
By analyzing and optimizing waypoint-based trajectories, we aim to show how our autonomous driving system can dynamically adjust its driving path to fit the lane changes with better lap timing and dealing with adverse conditions.
By analyzing and optimizing waypoint-based trajectories, our goal is to show how our autonomous driving system can dynamically adjust its driving path to fit lane changes with better lap timing and to deal with adverse conditions.

After training the model using PPO in a simulated environment, it will subsequently be used to predict the trajectory and speed of a vehicle inside a ROS-enabled simulator.

@@ -74,41 +74,122 @@ \section{State of the art}

Driverless car research has made great strides, with applications ranging from road cars to race cars.
%
In motor racing, the incorporation of autonomous driving systems has become a significant challenge necessitating sophisticated solutions to tackle competitive environment peculiarities.
In motor racing, the incorporation of autonomous driving systems has become a significant challenge necessitating sophisticated solutions to tackle the peculiarities of the competitive environment.
%
Different approaches and relevant study findings upon literature review provide a full picture of the present landscape and major techniques of autonomous driving approaches.
A review of the literature, covering the different approaches and their findings, gives a picture of the current landscape and of the main techniques for autonomous driving.

Many studies have focused on a few traditional control techniques such as model predictive control (MPC) by Schwenzer et al. \cite{MPC}.
% TODO: describe MPC
%
%
%
\subsection{PID}

One of the most significant approaches to autonomous driving is the PID control algorithm.
%
PID stands for proportional-integral-derivative, a type of controller used in automated control systems.

\begin{itemize}
\item \textbf{Proportional (P)}: The proportional component responds proportionally to the current error, determining the response speed of the system.

\item \textbf{Integral (I)}: The integral component takes into account past errors and acts to eliminate any accumulated discrepancy, ensuring that the system reaches and maintains the set point in the long run.

\item \textbf{Derivative (D)}: The derivative component reacts to the rate of change of the error, anticipating its future trend and thereby helping to prevent undue oscillations and enhance stability.

\end{itemize}
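
As a reference, the three terms above combine in the textbook PID control law sketched below.
%
The symbols are introduced here only for illustration and are not taken from the cited works: $e(t)$ is the error between the set point and the measured value, $u(t)$ is the control command, and $K_p$, $K_i$, $K_d$ are the proportional, integral, and derivative gains.

\begin{equation}
u(t) = K_p\, e(t) + K_i \int_{0}^{t} e(\tau)\, d\tau + K_d\, \frac{d e(t)}{d t}
\end{equation}

In a discrete-time implementation, the integral is typically approximated by a running sum of past errors and the derivative by a finite difference between consecutive errors.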

A significant variant is the Adaptive-PID \cite{ADAPTIVE_PID}.
%
It introduces adaptability into traditional PID, enabling the controller to automatically adjust its proportional, integral, and derivative parameters in response to changes in system dynamics.
%
This adaptation is important when the system is subjected to changes in operational conditions, such as variations in speeds, vehicle masses, or road surface conditions.
%
The key stages of Adaptive-PID are:

\subsubsection{System Identification}
An important aspect of Adaptive-PID is the ability to dynamically identify system parameters in real time.
%
This could be done through parameter identification techniques such as linear regression or adaptive estimation algorithms.

\subsubsection{Parameter Adaptation}
Based on the identified information, the controller dynamically adapts PID parameters for optimal performance.
%
For instance, if a vehicle experiences a change in mass due to a variation in its load, the Adaptive-PID can automatically adapt its parameters to ensure stable and responsive control.

\subsubsection{Tolerance to Changes}
The adaptive approach ensures that control remains robust and effective even when significant changes are made to operating conditions, thus improving dynamic handling capabilities.

%
%
%
\subsection{MPC}

Many studies have focused on traditional control techniques such as model predictive control (MPC), as discussed by Schwenzer et al. \cite{MPC}.

MPC is an advanced control technique that relies on iterative prediction of the evolution of the system over time, enabling the generation of optimal control commands.
%
In more detail, Model Predictive Control can be divided into several key phases:

\subsubsection{Dynamic model of the system}
MPC requires an accurate dynamic model of the system to be controlled.
%
In the context of self-driving, this model includes elements such as vehicle dynamics, road geometry, and other factors influencing the behavior of the vehicle.

Another significant approach is the PID \cite{ADAPTIVE_PID}
% TODO: describe PID
\subsubsection{Future prediction}
Using the dynamic model over a prediction horizon, MPC iteratively predicts the future behavior of the system.
%
At each step, the controller predicts how the system will evolve under different candidate control inputs.

\subsubsection{Control optimization}
A cost function is defined to measure the quality of possible trajectories.
%
MPC solves an optimization problem to identify the sequence of control commands that minimizes this cost function, subject to the dynamic and kinematic constraints of the system.

\subsubsection{Control implementation}
The first command of the optimal sequence is applied to the system.
%
The prediction and optimization process is then repeated cyclically, adapting to the system conditions in real time.
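
The four phases above correspond to the generic finite-horizon problem sketched below, which is the standard textbook formulation rather than the specific controller of any cited work.
%
All symbols are assumptions introduced for illustration: $x_k$ is the predicted state, $u_k$ the control input, $f$ the dynamic model, $\ell$ the stage cost, $N$ the prediction horizon, $\mathcal{X}$ and $\mathcal{U}$ the admissible state and input sets, and $x(t)$ the state measured at the current time.

\begin{equation}
\begin{array}{ll}
\displaystyle \min_{u_0, \ldots, u_{N-1}} & \displaystyle \sum_{k=0}^{N-1} \ell(x_k, u_k) \\[2ex]
\mathrm{subject\ to} & x_{k+1} = f(x_k, u_k), \quad k = 0, \ldots, N-1, \\
 & x_k \in \mathcal{X}, \quad u_k \in \mathcal{U}, \\
 & x_0 = x(t).
\end{array}
\end{equation}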

On the downside, these approaches are often restricted in how well they handle dynamic complexities of race circuits and machine learning overfitting could be a resource in this particular use case.
\medskip

One of the most important milestones is the increasing adoption of machine learning algorithms focusing on Reinforcement Learning in order to achieve a driving style as like as possible as a human driver, but free from distraction and emotions that can have a negative impact on the performance \cite{andru}.
What distinguishes the MPC algorithm is its ability to handle complex constraints and non-linear dynamics of the system.
%
Thus, it provides an adaptive, optimal control solution.
%
However, implementing it may involve significant computational effort, and the accuracy of its predictions depends heavily on the accuracy of the dynamic model.

\medskip

Unlike MPC, which relies on a complete dynamic model of the system, the Adaptive-PID is often less demanding from a computational point of view.
%
However, the Adaptive-PID may struggle with more complicated dynamics or in scenarios where variations are extreme and not readily captured by a PID-based approach.

On the downside, these approaches are often limited in how well they handle the dynamic complexities of race circuits, whereas the tendency of machine learning models to overfit could be an asset in this particular use case.

One of the most important milestones is the increasing adoption of machine learning algorithms, Reinforcement Learning in particular, in order to achieve a driving style as similar as possible to that of a human driver, but free from the distractions and emotions that can have a negative impact on performance \cite{andru}.
%
The use of reward- and penalty-based techniques, along with dynamic interaction between agent and environment, has been shown to be effective in enhancing performance in autonomous driving.
%
Researches such as Silver et al. (2016) \cite{GO_DNN} have made notable successes in training deep neural networks through Reinforcement Learning for autonomous driving in contexts akin to car racing.
Research such as that of Silver et al. (2016) \cite{GO_DNN} has achieved notable success in training deep neural networks through Reinforcement Learning in the context of games played by humans.

In the specific framework of car racing, the optimal handling of vehicles calls for a combination of accuracy, speed, and adaptability to changes in the track.

Proximal Policy Optimization (PPO) is a Reinforcement Learning algorithm that has become popular due to its ability to handle continuous action spaces and its stability during training (Schulman et al., 2017) \cite{PPOOpenAI}.
%
This makes PPO particularly useful in applications where precision as well as dynamic management of the car is important such as automobile racing.
This makes PPO particularly useful in applications where precision and dynamic management of the car are important, such as automobile racing.
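
For reference, the training stability of PPO is usually attributed to its clipped surrogate objective, reproduced below from Schulman et al. \cite{PPOOpenAI}.
%
Here $\pi_\theta$ is the policy with parameters $\theta$, $\theta_{\mathrm{old}}$ are the parameters before the update, $s_t$ and $a_t$ are the state and action at time step $t$, $\hat{A}_t$ is an estimate of the advantage, and $\epsilon$ is the clipping range, a hyperparameter whose value for this project is not stated here.

\begin{equation}
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}
\end{equation}

\begin{equation}
L^{\mathrm{CLIP}}(\theta) = \hat{\mathrm{E}}_t \left[ \min \left( r_t(\theta)\, \hat{A}_t,\; \mathrm{clip}\left( r_t(\theta),\, 1-\epsilon,\, 1+\epsilon \right) \hat{A}_t \right) \right]
\end{equation}

Clipping the probability ratio $r_t(\theta)$ removes the incentive to move the policy far from the one that collected the data within a single update.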

Our approach is different from existing literature in introducing a specific use of race track waypoints into training maps.
Our approach is different from the existing literature in introducing a specific use of race track waypoints in training maps.
%
This decision aims at improving the models ability to follow optimal trajectories over particular circuits taking into account unique characteristics of each track.
This decision aims to improve the model's ability to follow optimal trajectories on particular circuits, taking into account the unique characteristics of each track.
%
After the training step, the model will be tested in a different environment, supported by ROS, in order to approximate a more realistic use case.
%
In summary, our work lies at the intersection between Reinforcement Learning research for autonomous driving and specific needs of auto racing by training an innovative approach based on PPO and accurate use of waypoints on tracks given the fact that circuits will not change over the time and could be optimized.

In summary, our work lies at the intersection between Reinforcement Learning research for autonomous driving and the specific needs of auto racing: we train a PPO model using waypoints on the tracks, exploiting the fact that circuits do not change over time and can therefore be optimized for.
%
The next section provides the detailed methodology, illustrating how we implemented and trained our model to achieve the best results on the selected circuits.

\section{Methodology}
%
%
%
\section{The proposed system}

% \begin{itemize}
% \item Description of the architecture of the model used, including details on how you implemented the PPO algorithm.
@@ -119,15 +200,31 @@ \section{Metodologia}

% \end{itemize}

Our methodology aims to provide an in-depth view of the model architecture, the training process, and the integration of waypoints into the selected circuits. The goal is to present a clear and reproducible picture of our approach to autonomous driving based on Reinforcement Learning, with a particular focus on the use of the PPO algorithm.
The project provides an integrated architecture in two parts. The first part uses the F1TENTH Gym simulator (F1tenthGym) \cite{F1tenthGym} to train a reinforcement learning model based on PPO \cite{PPOOpenAI}, using waypoints on the circuits.

The second part uses the previously trained model to predict the actions to be taken by a car inside the ROS-based simulator, using sensor feedback.

Through a containerized environment, we aim to give insight into our approach to Reinforcement Learning-based autonomous driving, with a particular focus on the PPO algorithm.

%
%
%
\subsection{Model Training}

\begin{figure}
\centering
\includegraphics[width=0.485\textwidth]{img/ppo.jpg}
\caption{PPO Algorithm. https://medium.com/@oleglatypov/a-comprehensive-guide-to-proximal-policy-optimization-ppo-in-ai-82edab5db200}
\label{fig:ppo}
\end{figure}

\subsection{Model architecture}
\subsubsection{Model architecture}
The core of our system is a deep neural network trained with the PPO algorithm. The neural network takes as input the current state of the vehicle, such as position, speed, steering angle, and sensor data from cameras and ultrasonic sensors. The model produces a control action, represented as a probability distribution over possible commands, allowing dynamic and continuous handling of the vehicle.

\subsection{Model training}
\subsubsection{Model training}
We used a large collection of data from driving simulations on several circuits. Each training episode involved the model interacting with the simulated environment, receiving rewards based on performance metrics such as lap times, trajectories followed, and reactions to unexpected conditions such as tight corners or changes in road surface. Training was run for numerous cycles, ensuring the convergence of the model toward optimal driving strategies.

\subsection{Waypoint integration}
\subsubsection{Waypoint integration}

A distinctive aspect of our methodology is the integration of circuit waypoints into the training maps.
%
@@ -145,6 +242,10 @@ \subsection{Parametri e configurazioni}

This integrated methodology enabled the training of a highly adaptive autonomous driving model, capable of dynamically handling race circuits and of optimizing performance in response to environmental variations and track specifics. In the next section, we present the results of our experiments, highlighting the capabilities and limitations of our approach.

%
%
%
\subsection{Evaluation}

\section{Experiments}
