We develop computer vision methods, including 3D object detection, hand pose estimation, geo-localization, and indoor 3D reconstruction, with applications to augmented reality and robotics.
We propose an efficient transformer-based architecture for 3D pose estimation of two hands and an object during complex interactions, from a single RGB image.
We propose a novel method for reconstructing floor plans from noisy 3D point clouds, based on Monte Carlo Tree Search (MCTS) with an integrated refinement step.
We explore how a general AI algorithm can be used for 3D scene understanding in order to reduce the need for training data. More precisely, we propose a modification of the Monte Carlo Tree Search (MCTS) algorithm to retrieve objects and room layouts from noisy RGB-D scans.
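To give a flavor of how MCTS can drive such retrieval, here is a minimal, generic sketch of the algorithm: it searches over sequences of proposals (e.g. candidate objects or layout components) using the standard selection/expansion/simulation/backpropagation loop. The `children` and `score` functions, the UCB constant, and all names are illustrative placeholders, not the papers' actual interface.

```python
import math, random

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node, c=1.4):
    # Upper Confidence Bound: trade off average reward vs. exploration.
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def mcts(root_state, children, score, iterations=200):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend the tree greedily by UCB until a leaf.
        while node.children:
            node = max(node.children, key=ucb)
        # 2. Expansion: add the leaf's children, if it has any.
        for s in children(node.state):
            node.children.append(Node(s, parent=node))
        if node.children:
            node = random.choice(node.children)
        # 3. Simulation: here, simply score the reached state.
        reward = score(node.state)
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Return the most-visited first move.
    return max(root.children, key=lambda n: n.visits).state
```

In the scene-understanding setting, a "state" would be a partial set of selected object/layout proposals and `score` would measure how well they fit the scan; the refinement step of the floor-plan work is omitted here.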
We develop a method for automatic hand-object 3D pose annotation of sequences captured with one or more RGB-D cameras. We use this method to create a large-scale hand-object dataset and make it public, along with baseline results for hand pose estimation from a single RGB image.
We introduce a novel method for estimating the 3D room layout from a single image.
Because acquiring annotations for color images is a difficult task, we introduce a novel learning method for 3D pose estimation from color images.
We introduce a novel approach for object 3D pose estimation, which is inherently robust to partial occlusions of the object.
We present a scalable approach to retrieve 3D models for objects in the wild. Our method builds on the fact that knowing the object pose significantly reduces the complexity of the task.
We propose a simple and efficient method for exploiting synthetic images when training a Deep Network to predict a 3D pose from an image.
We propose a simple and efficient method for physics-based hand object interaction in VR.
Given simple 2.5D city maps, we show how to exploit recent results in semantic segmentation to efficiently track a camera in urban environments.
BB8 is a novel method for 3D object detection and pose estimation from color images only. It predicts the 3D poses of the objects in the form of 2D projections of the 8 corners of their 3D bounding boxes.
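The corner-based parameterization can be sketched as follows: the 2D training targets are obtained by projecting the 8 corners of the object's 3D bounding box through a pinhole camera (at test time, the predicted 2D-3D correspondences would be fed to a PnP solver, not shown here). The intrinsics, pose, and box size below are made-up values for illustration only.

```python
import numpy as np

def box_corners(size):
    # The 8 corners of an axis-aligned box centered at the origin.
    sx, sy, sz = np.asarray(size) / 2.0
    return np.array([[x, y, z] for x in (-sx, sx)
                               for y in (-sy, sy)
                               for z in (-sz, sz)])

def project(points_3d, R, t, K):
    # Pinhole projection: x = K (R X + t), then divide by depth.
    cam = points_3d @ R.T + t
    uv = cam @ K.T
    return uv[:, :2] / uv[:, 2:3]

K = np.array([[600., 0., 320.],     # toy intrinsics
              [0., 600., 240.],
              [0., 0., 1.]])
R = np.eye(3)                       # identity rotation for the sketch
t = np.array([0., 0., 1.0])         # object 1 m in front of the camera
corners_2d = project(box_corners((0.1, 0.1, 0.1)), R, t, K)  # shape (8, 2)
```

Regressing these 8 projected points instead of a rotation avoids parameterizing the rotation space directly, and the pose is then recovered from the 2D-3D corner correspondences.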
We propose a novel illumination normalization method that lets us learn to detect objects and estimate their 3D poses under challenging illumination conditions from very few training samples.
We introduce novel methods for predicting the 3D joint locations of a hand from a depth map using Convolutional Neural Networks (CNNs).
We propose methods for accurate camera pose estimation in urban environments from single images and 2.5D maps made of the surrounding buildings’ outlines and their heights.
We present a method for large-scale geo-localization and global tracking of mobile devices in urban outdoor environments.
We introduce a simple but powerful approach to computing descriptors for object views that efficiently capture both the object identity and 3D pose.
We present a method that estimates in real-time and under challenging conditions the 3D pose of a known object.
State-of-the-art keypoint detectors are surprisingly sensitive to drastic imaging changes caused by weather and lighting conditions; we introduce a learning-based approach to detect keypoints that remain repeatable under such changes.
We propose an approach to detect flying objects such as UAVs and aircraft even when they occupy a small portion of the field of view, possibly move against complex backgrounds, and are filmed by a camera that itself moves.
We propose a novel approach to synthesizing images that are effective for training object detectors.