publications

A list of my previous publications.

2024

  1. RAG-Driver: Generalisable driving explanations with retrieval-augmented in-context learning in multi-modal large language model
    Jianhao Yuan, Shuyang Sun, Daniel Omeiza, Bo Zhao, Paul Newman, Lars Kunze, and Matthew Gadd
    RSS, 2024
  2. SynArtifact: Classifying and Alleviating Artifacts in Synthetic Images via Vision-Language Model
    Bin Cao, Jianhao Yuan, Yexin Liu, Jian Li, Shuyang Sun, Jing Liu, and Bo Zhao
    arXiv preprint arXiv:2402.18068, 2024
  3. kNN-CLIP: Retrieval Enables Training-Free Segmentation on Continually Expanding Large Vocabularies
    Zhongrui Gui, Shuyang Sun, Runjia Li, Jianhao Yuan, Zhaochong An, Karsten Roth, Ameya Prabhu, and Philip Torr
    arXiv preprint arXiv:2404.09447, 2024
  4. CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor
    Shuyang Sun*, Runjia Li*, Philip Torr, Xiuye Gu, and Siyang Li
    CVPR, 2024
  5. Real-Fake: Effective Training Data Synthesis Through Distribution Matching
    Jianhao Yuan, Jie Zhang, Shuyang Sun, Philip Torr, and Bo Zhao
    ICLR, 2024
  6. LUMix: Improving mixup by better modelling label uncertainty
    Shuyang Sun*, Jie-Neng Chen*, Ruifei He, Alan Yuille, Philip Torr, and Song Bai
    ICASSP, 2024

2023

  1. ReMaX: Relaxing for better training on efficient panoptic segmentation
    Shuyang Sun, Weijun Wang, Qihang Yu, Andrew Howard, Philip Torr, and Liang-Chieh Chen
    NeurIPS, 2023
  2. OxfordTVG-HIC: Can Machine Make Humorous Captions from Images?
    Runjia Li*, Shuyang Sun*, Mohamed Elhoseiny, and Philip Torr
    ICCV, 2023
  3. Is synthetic data from generative models ready for image recognition?
    Ruifei He, Shuyang Sun, Xin Yu, Chuhui Xue, Wenqing Zhang, Philip Torr, Song Bai, and Xiaojuan Qi
    ICLR, spotlight, 2023

2022

  1. Slot-VPS: Object-centric representation learning for video panoptic segmentation
    Yi Zhou, Hui Zhang, Hana Lee, Shuyang Sun, Pingjun Li, Yangguang Zhu, ByungIn Yoo, Xiaojuan Qi, and Jae-Joon Han
    CVPR, 2022
  2. Knowledge distillation as efficient pre-training: Faster convergence, higher data-efficiency, and better transferability
    Ruifei He, Shuyang Sun, Jihan Yang, Song Bai, and Xiaojuan Qi
    CVPR, 2022
  3. Patch-based separable transformer for visual recognition
    Shuyang Sun, Xiaoyu Yue, Hengshuang Zhao, Philip Torr, and Song Bai
    T-PAMI, 2022

2021

  1. TransMix: Attend to Mix for Vision Transformers
    Shuyang Sun*, Jie-Neng Chen*, Ju He, Philip Torr, Alan Yuille, and Song Bai
    CVPR, 2021
  2. Visual Parser: Representing Part-whole Hierarchies with Transformers
    Shuyang Sun, Xiaoyu Yue, Song Bai, and Philip Torr
    arXiv preprint arXiv:2107.05790, 2021
  3. Vision transformer with progressive sampling
    Xiaoyu Yue*, Shuyang Sun*, Zhanghui Kuang, Meng Wei, Philip Torr, Wayne Zhang, and Dahua Lin
    ICCV, 2021
  4. Aggregation with Feature Detection
    Shuyang Sun, Xiaoyu Yue, Xiaojuan Qi, Wanli Ouyang, Victor Adrian Prisacariu, and Philip Torr
    ICCV, 2021

2020

  1. Exploring the hierarchy in relation labels for scene graph generation
    Yi Zhou, Shuyang Sun, Chao Zhang, Yikang Li, and Wanli Ouyang
    arXiv preprint arXiv:2009.05834, 2020
  2. Learning to sample the most useful training patches from images
    Shuyang Sun, Liang Chen, Gregory Slabaugh, and Philip Torr
    arXiv preprint arXiv:2011.12097, 2020

2019

  1. Hybrid task cascade for instance segmentation
    Kai Chen, Jiangmiao Pang, Jiaqi Wang, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jianping Shi, Wanli Ouyang, and others
    CVPR, 2019
  2. MMDetection: Open MMLab detection toolbox and benchmark
    Kai Chen, Jiaqi Wang, Jiangmiao Pang, Yuhang Cao, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jiarui Xu, and others
    arXiv preprint arXiv:1906.07155, 2019
  3. Robust multi-modality multi-object tracking
    Wenwei Zhang, Hui Zhou, Shuyang Sun, Zhe Wang, Jianping Shi, and Chen Change Loy
    ICCV, 2019

2018

  1. FishNet: A versatile backbone for image, region, and pixel level prediction
    Shuyang Sun, Jiangmiao Pang, Jianping Shi, Shuai Yi, and Wanli Ouyang
    NeurIPS, 2018
  2. Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition
    Shuyang Sun, Zhanghui Kuang, Lu Sheng, Wanli Ouyang, and Wei Zhang
    CVPR, 2018

2017

  1. Spindle Net: Person re-identification with human body region guided feature decomposition and fusion
    Haiyu Zhao, Maoqing Tian, Shuyang Sun, Jing Shao, Junjie Yan, Shuai Yi, Xiaogang Wang, and Xiaoou Tang
    CVPR, 2017