I am interested in generative models for vision. More specifically, I work on leveraging diffusion models and other generative approaches for tasks such as depth estimation, image restoration, video synthesis, and 3D content creation. I am also interested in the intersection of generative AI with graphics, computational photography, and novel imaging systems.
If you are interested in my profile, please feel free to reach out to me via email.
A camera-controlled video diffusion model that synthesizes videos along novel camera trajectories by conditioning on 4D scene latents from a large 4D reconstruction model.
A novel diffusion-based approach for dense prediction, providing foundation models and pipelines for a range of computer vision and image analysis tasks, including monocular depth estimation, surface normal prediction, and intrinsic image decomposition.
Fine-tuning a pre-trained video diffusion model on event camera data turns it into a state-of-the-art event-based video interpolator and extrapolator with high temporal resolution.
We present an accurate and fast gaze tracking method using single-shot phase-measuring deflectometry (PMD), taking advantage of dense 3D measurements of the eye surface.
Using a differentiable renderer to simulate reflection light transport allows us to take advantage of dense structured screen illumination to accurately optimize the eye gaze direction.
We specialize a pre-trained latent diffusion model into an efficient data generation pipeline with style control and multi-resolution semantic adherence, improving downstream domain generalization performance.
Our method lifts the power of generative 2D image models, such as Stable Diffusion, into 3D. Using them to synthesize unseen content for novel views, and NeRF to reconcile all generated views, our method provides vivid painting of input 3D assets across a variety of shape categories.
We present a first study of the trade-offs between projection- and reflection-based 3D imaging systems under mixtures of diffuse and specular reflection. Experiments are conducted in our projection/reflection simulation framework built on the Mitsuba2 renderer.
Aside from research, I delight in the symphony of flavors that dance on the tongue (good food) and the harmonious melodies that serenade the ears (music). A few of my musical heroes include 竇唯 (Dou Wei), Radiohead, Brian Eno, and Fleet Foxes.