AI & Computer Vision Engineer
About me:
I am an experienced Computer Vision and AI specialist, holding a PhD in the field along with over five years of industry experience. During this time, I developed multiple applications that use neural networks for real-time human body pose estimation. My role spanned the entire development pipeline, from algorithm design and data annotation to runtime optimization and integration testing. As the first employee at a startup, I played a pivotal role in structuring the company, recruiting team members, and reporting directly to the CEO. Following the cessation of funding in August 2024, I am actively seeking new opportunities where I can continue to apply my technical expertise and leadership skills. I earned my PhD under the guidance of Prof. Dr. Jürgen Gall in the Computer Vision Group at the University of Bonn, contributing significantly to projects and publications, including "SemanticKITTI," a cornerstone in the field with over 1,900 citations.
Publications
SemanticKITTI: A Dataset for Semantic Segmentation of Point Cloud Sequences
Authors: Jens Behley*, Martin Garbade*, Andres Milioto, Jan Quenzel, Sven Behnke, Cyrill Stachniss, Jürgen Gall (* denotes equal contribution)
Conference: IEEE International Conference on Computer Vision (ICCV), 2019
Developed the largest freely accessible dataset at the time for 3D semantic scene completion and laser point cloud classification, tailored for autonomous driving applications. The dataset has been cited over 1,900 times on Google Scholar and was presented at ICCV 2019. [PDF] [Video-Dataset] [Talk]
Semantic Scene Completion from a Single Depth Image using Adversarial Training
Authors: Yueh-Tung Chen, Martin Garbade, Jürgen Gall
Conference: IEEE International Conference on Image Processing (ICIP), 2019
Explored the use of conditional GANs for 3D semantic scene completion from a single depth image, demonstrating that GANs can outperform traditional 3D CNNs when annotations align well. [PDF]
Two Stream 3D Semantic Scene Completion
Authors: Martin Garbade, Yueh-Tung Chen, Johann Sawatzky, Jürgen Gall
Conference: CVPR Workshop on Multimodal Learning and Applications (MULA), 2019
Introduced a novel two-stream approach combining depth and RGB-derived semantic information for 3D scene completion, significantly improving upon state-of-the-art performance. [PDF]
Ex Paucis Plura: Learning Affordance Segmentation from Very Few Examples
Authors: Johann Sawatzky, Martin Garbade, Jürgen Gall
Conference: German Conference on Pattern Recognition (GCPR), 2018
Developed a method to segment affordances and object parts from few examples using a semantic alignment network, showing superior performance over other weakly supervised approaches. [PDF]
Thinking Outside the Box: Spatial Anticipation of Semantic Categories
Authors: Martin Garbade, Jürgen Gall
Conference: British Machine Vision Conference (BMVC), 2017
Proposed a new approach for anticipating semantic categories outside the direct field of view, enhancing semantic segmentation capabilities for autonomous systems. [PDF] [Images/Data] [Talk] [Code]
Pose for Action - Action for Pose
Authors: Umar Iqbal, Martin Garbade, Jürgen Gall
Conference: IEEE International Conference on Automatic Face and Gesture Recognition (FG), 2017
Utilized action information to enhance human pose estimation in videos, presenting an action-prior model that improves pose estimation without additional action recognition frameworks. [PDF] [Code]
Real-time Semantic Segmentation with Label Propagation
Authors: Rasha Sheikh, Martin Garbade, Jürgen Gall
Conference: ECCV Workshop on Computer Vision for Road Scene Understanding and Autonomous Driving (CVRSUAD'16), Springer
Introduced a superpixel and label propagation method that significantly speeds up semantic segmentation, increasing accuracy while reducing computational demands. [PDF]
Handcrafting vs Deep Learning: An Evaluation of NTraj+ Features for Pose Based Action Recognition
Authors: Martin Garbade, Jürgen Gall
Conference: GCPR Workshop on New Challenges in Neural Computation and Machine Learning (NC2), 2016
Evaluated NTraj+ features against deep learning models for action recognition, illustrating the transition from handcrafted features to neural network approaches in pose-based action recognition. [PDF]
Projects
Goose Detection
A neural-network-based falcon and goose detection system that protects falcon nests from goose intrusion, 2016 - present (German Media Article)
Semantic Scene Completion on SemanticKITTI
A neural-network-based road scene completion method enabling anticipatory driving (German Media Article).
Comparative Visualization of LIDAR Input and Algorithm Output
This video presents a side-by-side comparison of input and output data for a 3D semantic scene completion algorithm. On the left, the visualization displays the input data captured from a LIDAR device, showing the raw laser scans. On the right, the output of the algorithm is illustrated, showcasing a completed 3D scene where gaps in the input data are filled and enriched with semantic labels.
Edge AI Applications
All of the following games and apps were developed during my five years at AISC GmbH, powered by our AI models. My primary role was to develop and refine real-time 2D and 3D human body pose estimation algorithms, which are crucial for interactive gaming and health apps.
Key Contributions:
- 2D Multi-Person Body Pose Estimation: I trained models specifically for detecting and tracking multiple individuals in real-time.
- 3D Single-Person Pose Estimation: Leveraging Google’s MediaPipe, I optimized and integrated its 3D pose estimation model.
- Custom Model Extension: I collected data and managed its annotation to develop models with extra keypoints for detailed spine pose estimation.
- MediaPipeUnityPlugin Enhancement: I optimized this vital bridge between MediaPipe and the Unity game engine, enabling seamless integration of AI-driven pose estimation into Unity.
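In health and fitness apps like those above, the estimated 3D keypoints are typically post-processed into interpretable quantities such as joint angles. The following is a minimal, self-contained sketch of that step; the keypoint names and coordinates are illustrative assumptions, not the actual output format of the production pipeline:

```python
import math

def joint_angle(a, b, c):
    """Return the angle at joint b (in degrees) formed by 3D keypoints a-b-c.

    Each keypoint is an (x, y, z) tuple, e.g. as produced by a 3D pose
    estimation model such as MediaPipe Pose (illustrative assumption).
    """
    v1 = [a[i] - b[i] for i in range(3)]  # vector b -> a
    v2 = [c[i] - b[i] for i in range(3)]  # vector b -> c
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to guard against floating-point drift outside [-1, 1]
    cos = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos))

# Example: elbow angle from shoulder, elbow, and wrist keypoints
shoulder, elbow, wrist = (0.0, 0.4, 0.0), (0.0, 0.0, 0.0), (0.3, 0.0, 0.0)
print(joint_angle(shoulder, elbow, wrist))  # 90.0
```

The same computation applies to any joint triplet (hip-knee-ankle, shoulder-hip-knee, and so on), which is why it is a common building block in pose-based exercise analysis.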
MoveBook: Farm Adventures
An interactive game for children aged 3 to 6. As players listen to an audiobook, they are presented with vibrant illustrations that bring the stories to life. Throughout each episode, players engage physically by performing fun activities in front of their device camera - like jumping like a frog or climbing a ladder.
[Website] [Youtube-Channel] [Google Play]
Happy Move
A suite of AI-driven mini-games that harness body movement for control, featuring a soccer goalkeeper simulation, a space racing game, a rhythm-based dance challenge, and a ninja combat game. Each game demonstrates my expertise in integrating real-time motion detection with interactive game mechanics.
Physio Plus
This prototype application for physical therapy demonstrates the potential of AI and computer vision in rehabilitative exercise monitoring. Although it did not reach production due to funding constraints, it effectively showcases five common therapeutic exercises used for various injury treatments. The app utilizes a state machine to count and assess the quality of each exercise performed. When an exercise is executed incorrectly, the user receives immediate audio feedback to correct their form.
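A state machine of the kind described above can be sketched in a few lines. This is a minimal illustration, not the production app's logic: the phase names, the use of a single joint angle, and the threshold values are all assumptions chosen for clarity.

```python
import enum

class Phase(enum.Enum):
    UP = "up"      # e.g. standing position of a squat
    DOWN = "down"  # e.g. bottom position of a squat

class RepCounter:
    """Minimal exercise state machine: one repetition is counted when the
    tracked joint angle drops below `down_deg` and then rises back above
    `up_deg`. Hysteresis between the two thresholds suppresses jitter."""

    def __init__(self, down_deg=100.0, up_deg=160.0):
        self.down_deg = down_deg
        self.up_deg = up_deg
        self.phase = Phase.UP
        self.reps = 0

    def update(self, angle_deg):
        """Feed one per-frame joint angle; return the rep count so far."""
        if self.phase is Phase.UP and angle_deg < self.down_deg:
            self.phase = Phase.DOWN
        elif self.phase is Phase.DOWN and angle_deg > self.up_deg:
            self.phase = Phase.UP
            self.reps += 1
        return self.reps

# Example: a stream of knee angles covering one full squat
counter = RepCounter()
for angle in [170, 150, 95, 90, 120, 165, 170]:
    counter.update(angle)
print(counter.reps)  # 1
```

In a real app, additional states and checks (range of motion, tempo, symmetry) would feed the quality assessment and trigger the audio feedback described above.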