About me

Hi, I’m Ganlin Zhang (张甘霖), a PhD student at the Technical University of Munich, supervised by Prof. Daniel Cremers. I currently focus on Visual SLAM, Structure from Motion, and 3D reconstruction.

Previously, I received my Master’s degree in Computer Science from ETH Zurich, where I worked on 3D Vision research projects with Prof. Luc Van Gool and Prof. Marc Pollefeys. Before that, I obtained my Bachelor’s degree in Computer Science from ShanghaiTech University, supervised by Prof. Laurent Kneip. During my undergraduate studies, I also spent a year at UC Berkeley as a visiting student.

Experience

News

Publications

Flow4R is a feed-forward framework for dynamic 4D reconstruction and tracking from unposed image pairs. By modeling camera-space scene flow as a unified representation of geometry, object motion, and camera motion, it predicts 3D position and bidirectional motion in a single forward pass without explicit pose regression or bundle adjustment, achieving state-of-the-art accuracy and temporal consistency.

NOVA3R is a feed-forward method for non-pixel-aligned 3D reconstruction from unposed images. It learns a global, view-agnostic scene representation via scene tokens and a diffusion-based 3D decoder, enabling complete and physically plausible geometry and outperforming the state of the art in accuracy and completeness.

ViSTA-SLAM is a real-time monocular dense SLAM pipeline that combines a Symmetric Two-view Association (STA) frontend with Sim(3) pose graph optimization and loop closure, enabling accurate camera trajectories and high-quality 3D scene reconstruction from RGB inputs.

SNI-SLAM++ is a tightly coupled semantic SLAM system that achieves robust tracking and dense semantic mapping through hierarchical semantic encoding, cross-attention feature fusion, and a semantics-coupled tracking framework.

A method for consistent dynamic scene reconstruction via motion decoupling, bundle adjustment, and global refinement.

We use a keyframe-based frame-to-frame tracker built on dense optical flow and connected to a pose graph for global consistency. For dense mapping, we use a 3D Gaussian Splatting (3DGS) representation, which supports both dense geometry extraction and rendering.

1. A monocular SLAM pipeline with a deformable neural point cloud scene representation.
2. A novel DSPO layer for bundle adjustment (BA), which can jointly optimize the depth map, depth scale, and camera pose.

1. Better model the underlying noise distributions by directly propagating the uncertainty from the point correspondences into the rotation averaging.
2. Integrate a variant of the MAGSAC++ loss into the rotation averaging, instead of using the classical robust losses.

Selected Projects

Course project of Mixed Reality 2022 at ETH Zurich

In this project, we design, implement and deploy a mixed-reality-based method with HoloLens 2 that enables users to control the Boston Dynamics Spot robot.

NICE-SLAM with Adaptive Feature Grids
Ganlin Zhang, Deheng Zhang, Feichi Lu, Anqi Li
Course project of 3D Vision 2022 at ETH Zurich

In this project, we present a sparse version of NICE-SLAM that incorporates the idea of Voxel Hashing into the NICE-SLAM framework. Instead of initializing feature grids over the whole space, voxel features near the surface are adaptively added and optimized.

Optimization by Particle Swarm Using Surrogates via Bunch-Kaufman Pivoting and Standard Optimization
Course project of Advanced System Lab 2022 at ETH Zurich

We focus on speeding up the black-box optimization algorithm OPUS from the paper “Particle Swarm with Radial Basis Function Surrogates for Expensive Black-box Optimization” by Rommel G. Regis.
We also implement a sped-up C++ version of Bunch-Kaufman pivoting.

Improved PSMNet for Deep Stereo Disparity Estimation
Course project of Deep Learning 2021 at ETH Zurich

We combine PSMNet, group-wise correlation, a dilated ResNet, and semantic segmentation information to efficiently estimate accurate disparity for stereo image pairs.

Course project of Introduction to Robotics 2019 at UC Berkeley

We design a path-finding algorithm that generates a path for drawing a portrait or character in one stroke, then use our self-designed control system to draw this path. The project can be used with any robot arm with at least 4 joints.

Teaching

Practical Course: Deep Learning for Spatial AI
Master Seminar — Modern Methods for 3D Representation and Reconstruction
Master Seminar — 3D Vision Foundation Models