I am currently a third-year Ph.D. student majoring in Computer Science at Texas A&M University, advised by Prof. Wenping Wang and Prof. Xin Li.
I received my Bachelor of Engineering degree from Fudan University in 2023. I also worked as a research assistant at HKUST CSE with Prof. Dan Xu. Previously, I worked with Prof. Tao Chen and Prof. Jiayuan Fan.
My research interests lie in Computer Vision and Graphics.
Department of Computer Science and Engineering
Ph.D. Student
August 2023 - Present
Abstract: We propose a foundational panoramic scene understanding model capable of performing multiple dense prediction tasks. We achieve this by i) curating a large dataset with an auto-labeling pipeline built on existing perspective prediction models, and ii) proposing PD-BridgeNet to tackle the multi-task interaction challenges under ERP distortions.
Abstract: We propose a foundational image soft effect removal (SER) model featuring: i) a large, curated pairwise dataset with diverse soft effects (e.g., lens flare, haze, shadows, and reflections), ii) fine-grained user control via spatial masks and strength control, iii) zero-shot generalization to unseen effects, and iv) the ability to add or enhance effects.
Abstract: SPGen leverages Spherical Projection (SP) to generate high-quality 3D shapes with i) Consistency: SP maps ensure view-consistent and unambiguous 3D reconstruction, ii) Flexibility: Supports arbitrary topologies, iii) Efficiency: Inherits powerful 2D diffusion priors and enables efficient finetuning.
Abstract: We present SolidGS, which reconstructs a consolidated Gaussian field from sparse inputs. Given only three input views, our approach enables high-precision and detailed mesh extraction, and high-quality novel view synthesis, achieved within just three minutes.
Abstract: This research proposes a new approach to multi-task dense predictions with partially labeled data. We introduce hierarchical task tokens (HiTTs) to capture multi-level representations. The global task tokens conduct cross-task interactions and transfer knowledge from labeled to unlabeled tasks.
Abstract: This work introduces a novel BridgeNet for multi-task learning on dense predictions. It uses a Bridge Feature Extractor (BFE) to create strong bridge features and Task Pattern Propagation (TPP) to resolve the task-pattern entanglement issue, yielding higher-quality task-specific features.
Abstract: We introduce a new approach for cross-domain one-stage pedestrian detectors. The paper identifies a foreground-background misalignment issue in image-level feature alignment and proposes a novel framework, Background-Focused Distribution Alignment (BFDA), to address it.
I love photography and road trips. Intermediate skier.