Dominik Hirner received the B.Sc. and M.Sc. degrees in Computer Science from Graz University of Technology, Austria, in 2016 and 2018, respectively. During his Master's studies he spent two years at the AIT Austrian Institute of Technology, where he worked on many customer projects but mainly focused on his master's thesis. The thesis was part of a project on an industrial light-field camera used for high-speed in-line computation. In particular, he worked on a neural network that predicts surface normals as well as object classes from the epipolar geometry given by the light-field data.
Example result of the light-field NN for a glossy coin object.
The color encodes the surface-normal gradient in x-direction.
Since November 2018, he has been a Ph.D. student at the Institute of Computer Graphics and Vision (ICG), Graz University of Technology.
Projects
In 2019 he worked on the Core3D project together with General Electric, Cornell and CMU. In this project he mainly worked on improving the 3D reconstruction of urban scenes from satellite stereo images.
Example of a point-cloud reconstruction of an urban scene (Jacksonville FL, USA).
In 2020 he worked on the SV4I (street-view for the visually impaired) project. This FFG-funded project focused on using 3D data generated with structure-from-motion techniques to create a geo-referenced map of walkable and non-walkable areas and to enable purely vision-based localization.
Research
His current research interests include computer vision, image analysis and supervised/unsupervised deep learning. In particular, he works on lightweight CNNs for stereo vision and 3D reconstruction using deep learning techniques.
In 2020 he published a paper [1] at the International Conference on Pattern Recognition called: FC-DCNN: A densely connected neural network for stereo estimation. FC-DCNN stands for fully-convolutional densely connected neural network. In this work, he showed that learning rich and deep features for the feature matching step of the stereo pipeline is already enough to outperform traditional methods such as SGM (Semi-Global Matching) or MGM (More Global Matching).
FC-DCNN network structure. A lightweight disparity estimation network.
Source-code: FC-DCNN
Disparity map results of the FC-DCNN network output.
Left: Middlebury test image, Right: ETH3D test image
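The feature matching step mentioned above for FC-DCNN can be illustrated with a minimal sketch. It assumes that each pixel of a rectified scanline already has a learned feature vector, compares left and right features with cosine similarity, and picks the best disparity per pixel (winner-takes-all). The function names, the similarity measure and the toy setup are illustrative assumptions, not the published implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def match_scanline(left_feats, right_feats, max_disp):
    """Winner-takes-all disparity for one rectified scanline.

    left_feats / right_feats hold one feature vector per pixel column.
    For a left pixel at column x, the candidate match at disparity d
    is the right pixel at column x - d (hypothetical convention).
    """
    disparities = []
    for x, f_left in enumerate(left_feats):
        best_d, best_score = 0, -math.inf
        # Only disparities that stay inside the image are tested.
        for d in range(min(max_disp, x) + 1):
            score = cosine_similarity(f_left, right_feats[x - d])
            if score > best_score:
                best_d, best_score = d, score
        disparities.append(best_d)
    return disparities
```

In a learned pipeline the feature vectors would come from the CNN; here any list of per-pixel vectors works, which makes the matching logic easy to check in isolation.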
He continued his work on machine-learning-based stereo estimation, and in 2022 he published a paper [2] at the International Conference on Pattern Recognition called: FCDSN-DC: An Accurate and Lightweight Convolutional Neural Network for Stereo Estimation with Depth Completion. In this work, he further improved the feature extraction part and added a machine-learning-based similarity function and a novel depth completion part.
Source-code: FCDSN-DC
FCDSN-DC network structure
Disparity map results from FCDSN-DC.
Left: sparse disparity from the Middlebury dataset
Right: filled disparity from the Middlebury dataset
In 2024 he published his latest paper at the International Conference on Pattern Recognition (ICPR 2024) called: SAda-Net: A Self-Supervised Adaptive Stereo Estimation CNN For Remote Sensing Image Data. This network introduces a novel training scheme that does not rely on annotated ground-truth data. This is especially important for use cases such as remote sensing, where image data is plentiful but annotated ground-truth data is missing. The scheme introduces a pseudo ground truth that is updated and refined after each training step, leading to a dense and accurate reconstruction.
Self-Supervised network overview
Disparity creation training loop
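The idea of a pseudo ground truth that is refined after each training step can be sketched roughly as follows. This is only an illustration under stated assumptions: the update criterion used here is a left-right consistency check, a common stereo heuristic chosen for the example, and the paper's actual update rule may differ. The callables `predict` and `train_step` are hypothetical placeholders for the network.

```python
def left_right_consistent(disp_left, disp_right, tol=1):
    """Keep left disparities that the right disparity map confirms.

    A left pixel x with disparity d maps to right pixel x - d.
    Rejected or out-of-image pixels become None.
    """
    checked = []
    for x, d in enumerate(disp_left):
        xr = x - d
        ok = 0 <= xr < len(disp_right) and abs(disp_right[xr] - d) <= tol
        checked.append(d if ok else None)
    return checked


def refine_pseudo_gt(predict, train_step, pseudo_gt, steps):
    """Alternate training and pseudo-label updates for one scanline.

    predict() returns (disp_left, disp_right); train_step(pseudo_gt)
    updates the model on the current labels. Both are hypothetical.
    """
    for _ in range(steps):
        train_step(pseudo_gt)
        disp_left, disp_right = predict()
        checked = left_right_consistent(disp_left, disp_right)
        # Assumption: earlier labels are kept where the new prediction
        # is rejected, so the label map never becomes sparser.
        pseudo_gt = [new if new is not None else old
                     for new, old in zip(checked, pseudo_gt)]
    return pseudo_gt
```

Starting from an empty or sparse label map, each iteration adds the pixels that pass the check, which mirrors the sparse-to-dense behaviour described above.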
For the tiling, rectification, point cloud creation and geo-referenced digital surface model (DSM) creation, the off-the-shelf software package s2p (satellite stereo pipeline) was used and modified to fit our specifications. The modified fork is available on the official GitHub page: SAda-Net.
The paper can be found here.