VOCE: Variational Optimization with Conservative Estimation for Offline Safe Reinforcement Learning
Jiayi Guan,
Guang Chen*, Jiaming Ji, Long Yang, Ao Zhou, Zhijun Li, Changjun Jiang
Advances in Neural Information Processing Systems 36 (NeurIPS 2023)
PDF /
code /
project /
bibtex
@inproceedings{NEURIPS2023_6a7c2a32,
author = {Guan, Jiayi and Chen, Guang and Ji, Jiaming and Yang, Long and Zhou, Ao and Li, Zhijun and Jiang, Changjun},
booktitle = {Advances in Neural Information Processing Systems},
editor = {A. Oh and T. Naumann and A. Globerson and K. Saenko and M. Hardt and S. Levine},
pages = {33758--33780},
publisher = {Curran Associates, Inc.},
title = {VOCE: Variational Optimization with Conservative Estimation for Offline Safe Reinforcement Learning},
url = {https://proceedings.neurips.cc/paper_files/paper/2023/file/6a7c2a320f5f36bb98f8eb878c6f1180-Paper-Conference.pdf},
volume = {36},
year = {2023}
}
We propose a Variational Optimization with Conservative Estimation algorithm (VOCE) to solve the problem of optimizing safe policies from offline datasets. Concretely, we reframe offline safe RL as a problem of probabilistic inference, introducing variational distributions that make policy optimization more flexible. Subsequently, we employ pessimistic estimation methods to estimate the Q-values of cost and reward, which mitigates the extrapolation errors induced by out-of-distribution (OOD) actions. Finally, extensive experiments demonstrate that VOCE achieves competitive performance across multiple tasks, particularly outperforming state-of-the-art algorithms in terms of safety.
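To illustrate the conservative-estimation idea described in the abstract, the minimal Python sketch below applies a CQL-style penalty that under-estimates the reward Q-value and over-estimates the cost Q-value on actions drawn from the current policy, which may be out-of-distribution. This is a sketch under stated assumptions, not the paper's implementation: the QNet architecture, the exact penalty form, and the alpha coefficient are all illustrative choices.

# Illustrative sketch of conservative (pessimistic) Q-estimation for offline
# safe RL. Names (QNet, conservative_q_losses, alpha) and the penalty form are
# assumptions for this example, not the VOCE implementation.
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def conservative_q_losses(q_r, q_c, batch, policy, gamma=0.99, alpha=1.0):
    """Bellman losses plus conservative penalties: push the reward Q down and
    the cost Q up on actions sampled from the current policy (potentially
    out-of-distribution), relative to actions observed in the dataset."""
    obs, act, rew, cost, next_obs, done = batch
    with torch.no_grad():
        next_act = policy(next_obs)  # actions proposed by the current policy
        target_r = rew + gamma * (1 - done) * q_r(next_obs, next_act)
        target_c = cost + gamma * (1 - done) * q_c(next_obs, next_act)

    # Standard temporal-difference terms on dataset transitions.
    bellman_r = ((q_r(obs, act) - target_r) ** 2).mean()
    bellman_c = ((q_c(obs, act) - target_c) ** 2).mean()

    pi_act = policy(obs)  # possibly OOD actions
    # Pessimism: under-estimate reward and over-estimate cost on OOD actions.
    penalty_r = (q_r(obs, pi_act) - q_r(obs, act)).mean()
    penalty_c = (q_c(obs, act) - q_c(obs, pi_act)).mean()

    return bellman_r + alpha * penalty_r, bellman_c + alpha * penalty_c

In this sketch the two penalties are symmetric but act in opposite directions, reflecting that safety requires being cautious about both over-valuing rewards and under-valuing costs for actions unsupported by the data.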