Synthetic data generation using game engines for deep learning in robotics = Generación de...
Contents of the work
Bibliographic record
- Title: Synthetic data generation using game engines for deep learning in robotics = Generación de conjuntos de datos sintéticos usando motores de juego para aplicaciones de deep learning en la robótica
- Author: Hernández León, Michael Johan
- Original publication: 2019
- Physical description: PDF
General note:
- Colombia
- Original reproduction note: Digitization carried out by the Biblioteca Virtual del Banco de la República (Colombia)
Notes:
- Abstract: An accurate understanding of the environment is key for a robot to execute tasks safely and efficiently. In the field of perception, the introduction of deep learning let computer vision tasks make big leaps, surpassing even human inference capability. As a trade-off, large amounts of annotated data are required to train such algorithms. The collection of annotated datasets is, however, a highly time-consuming activity prone to human error, setting a limit on the maximum achievable performance. In this sense, annotations (quality) and samples (quantity) bound the optimization of perception algorithms. An additional challenge when training object detectors for robotics applications is that the sensor setup can be multi-modal and vary significantly between robots. This work explores how to generate and use synthetic RGB-D training data from a near photo-realistic game engine to train modality-specific person detectors, and performs ablation studies on a challenging real-world dataset recorded with a reference RGB-D sensor in different intralogistics environments. A virtual RGB-D camera was implemented, leveraging the engine's deferred rendering architecture. Multiple environments were tailored, exploring various data augmentation techniques and enabling comparisons between different types of synthetic data. The detection layers of a pre-trained object detector network were trained from scratch for the RGB and depth modalities, with the latter transformed by applying a Jet colormap. Compared to a pre-trained network, a domain gap of 5 mAP points was still present for RGB images, while with synthetic (15k) and real (1.5k) depth images it was already possible to train robust person detectors. Comparing simulation features against data preparation, filtering annotations had a greater impact on performance than adopting an explicit time-of-flight sensor model.
- © All rights reserved by the author
- Colfuturo
- Form/genre: thesis
- Language: Spanish
- Source institution: Biblioteca Virtual del Banco de la República
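The abstract notes that depth frames were transformed with a Jet colormap before training the detection layers of an RGB-pretrained detector. A minimal sketch of such a transform is shown below, using NumPy and Matplotlib's `jet` colormap; the `depth_to_jet` helper and the `d_min`/`d_max` clipping range are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np
from matplotlib import colormaps

def depth_to_jet(depth, d_min=0.5, d_max=10.0):
    """Map a metric depth image (meters) to an 8-bit Jet-colormapped RGB image.

    Hypothetical helper: clips depth to an assumed [d_min, d_max] range,
    normalizes to [0, 1], and applies Matplotlib's 'jet' colormap so the
    single-channel depth map becomes a 3-channel image an RGB-pretrained
    detector can consume.
    """
    d = np.clip(depth, d_min, d_max)
    norm = (d - d_min) / (d_max - d_min)      # normalize to [0, 1]
    rgb = colormaps["jet"](norm)[..., :3]     # RGBA floats -> drop alpha
    return (rgb * 255).astype(np.uint8)

# Example: a synthetic 4x4 depth ramp from near to far
depth = np.linspace(0.5, 10.0, 16).reshape(4, 4)
jet_img = depth_to_jet(depth)
print(jet_img.shape)  # (4, 4, 3)
```

In the Jet colormap, near pixels come out blue and far pixels red, which gives the depth image edge and gradient statistics closer to those of natural RGB images than a raw grayscale depth map.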
Subject headings:
- People detection; Perception; Logistics robot; Mobile robot; Sensors; RGB-D; Deep learning; Game engines; Simulation; Transfer learning
- Technology; Technology / Engineering and allied operations