- Tracking
- Human Pose Estimation
- Radar Object Detection
- Domain Adaptation
- Generation for Perception
- LLM
- Continual Learning
- Fish Pose Estimation and Length Measurement
AICity Challenge
- IEEE/CVF CVPR AICity Challenge 2023
- Winner of Track 1: Multi-Camera People Tracking
- IEEE/CVF CVPR AICity Challenge 2019
- Winner of Track 1: City-Scale Multi-Camera Vehicle Tracking
- Runner-up of Track 2: City-Scale Multi-Camera Vehicle Reidentification
- Runner-up of Track 3: Traffic Anomaly Detection
- IEEE/CVF CVPR AICity Challenge 2018
- Winner of Track 1: Traffic Flow Analysis
- Winner of Track 3: Multi-sensor Vehicle Detection and Reidentification
BMTT (Benchmarking Multi-Target Tracking) Challenge
- IEEE/CVF ICCV BMTT Challenge 2021
- Winner of Video Track: KITTI-STEP
- Winner of Video Track: MOTChallenge-STEP
- IEEE/CVF CVPR BMTT Challenge 2020
- Winner of Track 3: MOTChallenge+KITTI (Tracking)
- Runner-up of Track 2: KITTI-MOTS (Tracking and Segmentation)
IEEE/CVF WACV MaCVi Challenge 2024
- Winner of Track: UAV-based Multi-Object Tracking with Reidentification
- Winner of Track: USV-based Multi-Object Tracking
Global Adaptation meets Local Generalization: Unsupervised Domain Adaptation for 3D Human Pose Estimation
Wenhao Chai, Zhongyu Jiang, Jenq-Neng Hwang, Gaoang Wang
International Conference on Computer Vision (ICCV), 2023
[Paper]
[Code]
A simple yet effective framework of unsupervised domain adaptation for 3D human pose estimation.
MPM: A Unified 2D-3D Human Pose Representation via Masked Pose Modeling
Zhenyu Zhang*, Wenhao Chai*, Zhongyu Jiang, Tian Ye, Mingli Song, Jenq-Neng Hwang, Gaoang Wang
arXiv Preprint.
[Paper]
[Code]
We treat 2D and 3D poses as two different modalities and apply three masked-modeling pretext tasks for human pose pre-training to learn spatial and temporal relations.
StableVideo: Text-driven Consistency-aware Diffusion Video Editing
Wenhao Chai, Xun Guo, Gaoang Wang, Yan Lu
International Conference on Computer Vision (ICCV), 2023
[Website]
[Paper]
[Demo]
[Code]
We introduce temporal dependency to existing text-driven diffusion models, which allows them to generate a consistent appearance for new objects.
Human Pose Estimation (HPE) is the task of estimating 2D or 3D human keypoints or shapes from images or videos. We aim to develop robust and effective HPE pipelines.
- ZeDO: Back to Optimization: Diffusion-based Zero-Shot 3D Human Pose Estimation (WACV 2024)
We use large generative models (e.g., Large Language Models (LLMs) and diffusion models) to aid perception tasks, including classification, detection, and image captioning. These methods are applied in several areas, including:
- Human Object Interaction (HOI) Recognition
- SOTA in HOI classification on HICO: DEFR (Detection-Free HOI Recognition), since 2021
- SOTA in Zero-shot HOI classification: HTS (Heterogeneous Teacher-Student Framework), since 2022
- SOTA in Zero-shot HOI detection: HTS (Heterogeneous Teacher-Student Framework), since 2022
- Medical Image Understanding
- SOTA in Chest X-ray (CXR) Radiology Report Generation, MIMIC-CXR: C2C (Concept-to-Content Method for Radiology Report Generation), 2023