An Unsupervised Random Forest Clustering Technique for Automatic Traffic Scenario Categorization
This work addresses the need for faster validation in traffic scenario development; its contribution is incremental, since it adapts an existing algorithm.
The paper tackles the automatic categorization of traffic scenarios by introducing an unsupervised modification of the Random Forest algorithm that clusters simulation data, and it argues that the method is well suited for generating scenario templates that accelerate validation.
This paper introduces a modification of the Random Forest algorithm for the categorization of traffic situations. The modification yields an unsupervised machine learning method. The algorithm generates a proximity matrix that contains a similarity measure between data points. This matrix is then reordered with hierarchical clustering to obtain a graphically interpretable representation. It is shown how the resulting proximity matrix can be interpreted visually and how varying the method's metaparameter reveals different insights into the data. The proposed method can cluster data from any data source. To demonstrate the method's potential, multiple features derived from a traffic simulation are used in this paper. Knowledge of traffic scenario clusters is crucial for accelerating the validation process. The key point of the method is that scenario templates can be generated automatically from actual traffic situations. These templates can be employed in all stages of the development process. The results show that the procedure is well suited for the automatic categorization of traffic scenarios. A variety of other applications can benefit from this work.
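The pipeline described in the abstract (tree-based proximity matrix, then reordering by hierarchical clustering) can be illustrated with a minimal sketch. The abstract does not specify the paper's modification, so this sketch swaps in a generic stand-in: trees built from purely random axis-aligned splits, with proximity defined as the fraction of trees in which two samples share a leaf, followed by a simple average-linkage ordering. The function names, the toy features (speed, gap to the lead vehicle), and the use of `max_depth` as the tunable parameter are all assumptions made for illustration and are not taken from the paper.

```python
import random

def random_tree_leaves(rows, idx, depth, max_depth, rng):
    """Split sample indices with random axis-aligned cuts; return the leaves."""
    if depth >= max_depth or len(idx) < 2:
        return [idx]
    f = rng.randrange(len(rows[0]))            # pick a random feature
    vals = [rows[i][f] for i in idx]
    lo, hi = min(vals), max(vals)
    if lo == hi:                               # nothing to split on
        return [idx]
    t = rng.uniform(lo, hi)                    # random threshold in the value range
    left = [i for i in idx if rows[i][f] < t]
    right = [i for i in idx if rows[i][f] >= t]
    if not left or not right:
        return [idx]
    return (random_tree_leaves(rows, left, depth + 1, max_depth, rng)
            + random_tree_leaves(rows, right, depth + 1, max_depth, rng))

def proximity_matrix(rows, n_trees=200, max_depth=2, seed=0):
    """Proximity = fraction of trees in which two samples end in the same leaf."""
    rng = random.Random(seed)
    n = len(rows)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_trees):
        for leaf in random_tree_leaves(rows, list(range(n)), 0, max_depth, rng):
            for i in leaf:
                for j in leaf:
                    counts[i][j] += 1
    return [[c / n_trees for c in row] for row in counts]

def hierarchical_order(prox):
    """Average-linkage agglomeration; the final merged list is the sample order."""
    clusters = [[i] for i in range(len(prox))]

    def sim(a, b):
        return sum(prox[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # merge the two clusters with the highest average proximity
        _, a, b = max((sim(clusters[a], clusters[b]), a, b)
                      for a in range(len(clusters))
                      for b in range(a + 1, len(clusters)))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0]

# Hypothetical toy data: (speed in km/h, gap to lead vehicle in m).
# Even indices resemble slow traffic with large gaps, odd indices fast traffic with small gaps.
rows = [(28, 50), (120, 10), (30, 52), (118, 12),
        (32, 49), (122, 11), (31, 48), (119, 9)]

prox = proximity_matrix(rows)
order = hierarchical_order(prox)

# Print the reordered proximity matrix; similar samples form blocks on the diagonal.
print("order:", order)
for i in order:
    print(" ".join(f"{prox[i][j]:.2f}" for j in order))
```

In the reordered matrix, samples that frequently share a leaf sit next to each other, so clusters appear as blocks of high values along the diagonal. Changing `max_depth` in this sketch changes how fine-grained those blocks are, which is one way a single tunable parameter can expose different views of the same data.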