Published: 2023-04-19 Category: Computer Basics Contributor: 樱花

CVPR 2023 Accepted Papers

CVPR 2023 statistics:

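Submitted: 9155 papers
Accepted: 2360 papers (25.8% acceptance rate)
Highlights: 235 papers (10% of accepted papers, 2.6% of submitted papers)
Award candidates: 12 papers (0.51% of accepted papers, 0.13% of submitted papers)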
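
The percentages above follow directly from the four counts. As a quick sanity check, here is a minimal Python sketch that recomputes them; the variable names and the `pct` helper are illustrative assumptions, not part of any official CVPR tooling.

```python
# Counts copied from the statistics listed above.
submitted = 9155
accepted = 2360
highlights = 235
award_candidates = 12


def pct(part: int, whole: int) -> float:
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


# Each rate is computed against both the accepted and the submitted totals,
# mirroring how the statistics are reported.
rates = {
    "acceptance rate": pct(accepted, submitted),
    "highlights / accepted": pct(highlights, accepted),
    "highlights / submitted": pct(highlights, submitted),
    "award candidates / accepted": pct(award_candidates, accepted),
    "award candidates / submitted": pct(award_candidates, submitted),
}

for name, value in rates.items():
    print(f"{name}: {value:.2f}%")
```

Rounded to the precision used in the list, the recomputed values agree with the reported figures.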

List of accepted papers (pending plagiarism and dual-submission checks):

Generating Human Motion from Textual Descriptions with High Quality Discrete Representation
Jianrong Zhang · Yangsong Zhang · Xiaodong Cun · Yong Zhang · Hongwei Zhao · Hongtao Lu · Xi SHEN · Ying Shan
SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Wenxuan Zhang · Xiaodong Cun · Xuan Wang · Yong Zhang · Xi SHEN · Yu Guo · Ying Shan · Fei Wang
Explicit Visual Prompting for Low-Level Structure Segmentations
Weihuang Liu · Xi SHEN · Chi-Man Pun · Xiaodong Cun
Privacy-preserving Adversarial Facial Features
Zhibo Wang · He Wang · Shuaifan Jin · Wenwen Zhang · Jiahui Hu · Yan Wang · Peng Sun · Wei Yuan whu · Kaixin Liu · Kui Ren
NeRF-RPN: A general framework for object detection in NeRFs
Benran Hu · Junkai Huang · Yichen Liu · Yu-Wing Tai · Chi-Keung Tang
Category Query Learning for Human-Object Interaction Classification
Chi Xie · Fangao Zeng · Yue Hu · Shuang Liang · Yichen Wei
A Unified Pyramid Recurrent Network for Video Frame Interpolation
Xin Jin · LONG WU · Jie Chen · Chen Youxin · Jay Koo · Cheul-hee Hahm
SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field
Chong Bao · Yinda Zhang · Bangbang Yang · Tianxing Fan · Zesong Yang · Hujun Bao · Guofeng Zhang · Zhaopeng Cui
PATS: Patch Area Transportation with Subdivision for Local Feature Matching
Junjie Ni · Yijin Li · Zhaoyang Huang · Hongsheng Li · Zhaopeng Cui · Hujun Bao · Guofeng Zhang
DualVector: Unsupervised Vector Font Synthesis with Dual-Part Representation
Ying-Tian Liu · Zhifei Zhang · Yuan-Chen Guo · Matthew Fisher · Zhaowen Wang · Song-Hai Zhang
Towards Robust Tampered Text Detection in Document Image: New dataset and New Solution
Chenfan Qu · Chongyu Liu · Yuliang Liu · Xinhong Chen · Dezhi Peng · Fengjun Guo · Lianwen Jin
PanoSwin: a Pano-style Swin Transformer for Panorama Understanding
Zhixin Ling · Zhen Xing · Xiangdong Zhou · Man Cao · Guichun Zhou
SVFormer: Semi-supervised Video Transformer for Action Recognition
Zhen Xing · Qi Dai · Han Hu · Jingjing Chen · Zuxuan Wu · Yu-Gang Jiang
Multi-Object Manipulation via Object-Centric Neural Scattering Functions
Stephen Tian · Yancheng Cai · Hong-Xing Yu · Sergey Zakharov · Katherine Liu · Adrien Gaidon · Yunzhu Li · Jiajun Wu
RealImpact: A Dataset of Impact Sound Fields for Real Objects
Samuel Clarke · Ruohan Gao · Mason L Wang · Mark Rau · Julia Xu · Jui-Hsien Wang · Doug James · Jiajun Wu
3D Neural Field Generation using Triplane Diffusion
Jesse Shue · Eric Chan · Ryan Po · Zachary Ankner · Jiajun Wu · Gordon Wetzstein
Putting People in Their Place: Affordance-Aware Human Insertion into Scenes
Sumith Kulal · Tim Brooks · Alex Aiken · Jiajun Wu · Jimei Yang · Jingwan Lu · Alexei A. Efros · Krishna Kumar Singh
Towards Effective Visual Representations for Partial-Label Learning
Shiyu Xia · Jiaqi Lyu · Ning Xu · Gang Niu · Xin Geng
AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation
Zhen Li · Zuo-Liang Zhu · Ling-Hao Han · Qibin Hou · Chunle Guo · Ming-Ming Cheng
DNF: Decouple and Feedback Network for Seeing in the Dark
Xin Jin · Ling-Hao Han · Zhen Li · Chunle Guo · Zhi Chai · Chongyi Li
Spectral Enhanced Rectangle Transformer for Hyperspectral Image Denoising
Miaoyu Li · Ji Liu · Ying Fu · Yulun Zhang · Dejing Dou
Dynamic Aggregated Network for Gait Recognition
Kang Ma · Ying Fu · Dezhi Zheng · Chunshui Cao · Xuecai Hu · Yongzhen Huang
LG-BPN: Local and Global Blind-Patch Network for Self-Supervised Real-World Denoising
ZiChun Wang · Ying Fu · Ji Liu · Yulun Zhang
Real-Time Neural Light Field on Mobile Devices
Junli Cao · Huan Wang · Pavlo Chemerys · Vladislav Shakhrai · Ju Hu · Yun Fu · Denys Makoviichuk · Sergey Tulyakov · Jian Ren
ScaleDet: A Scalable Multi-Dataset Object Detector
Yanbei Chen · Manchen Wang · Abhay Mittal · Zhenlin Xu · Paolo Favaro · Joseph Tighe · Davide Modolo
All in One: Exploring Unified Video-Language Pre-training
Jinpeng Wang · Yixiao Ge · Rui Yan · Yuying Ge · Kevin Qinghong Lin · Satoshi Tsutsui · Xudong Lin · Guanyu Cai · Jianping WU · Ying Shan · Xiaohu Qie · Mike Zheng Shou
Learning Transferable Spatiotemporal Representations from Natural Script Knowledge
Ziyun Zeng · Yuying Ge · Xihui Liu · Bin Chen · Ping Luo · Shu-Tao Xia · Yixiao Ge
KD-GAN: Data Limited Image Generation via Knowledge Distillation
Kaiwen Cui · Yingchen Yu · Fangneng Zhan · Shengcai Liao · Shijian Lu · Eric Xing
Mapping Degeneration Meets Label Evolution: Learning Infrared Small Target Detection with Single Point Supervision
Xinyi Ying · Li Liu · Yingqian Wang · Ruojing Li · Nuo Chen · Zaiping Lin · Weidong Sheng · Shilin Zhou
Logical Consistency and Greater Descriptive Power for Facial Hair Attribute Learning
Haiyu Wu · Grace Bezold · Aman Bhatta · Kevin Bowyer
Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding
Gyeongman Kim · Hajin Shim · Hyunsu Kim · Yunjey Choi · Junho Kim · Eunho Yang
3D Video Object Detection with Learnable Object-Centric Global Optimization
Jiawei He · Yuntao Chen · Naiyan Wang · Zhaoxiang Zhang
BEVFormer v2: Adapting Modern Image Backbones to Bird’s-Eye-View Recognition via Perspective Supervision
Chenyu Yang · Yuntao Chen · Hao Tian · Chenxin Tao · Xizhou Zhu · Zhaoxiang Zhang · Gao Huang · Hongyang Li · Yu Qiao · Lewei Lu · Jie Zhou · Jifeng Dai
MarS3D: A Plug-and-Play Motion-Aware Model for Semantic Segmentation on Multi-Scan 3D Point Clouds
Jiahui Liu · Chirui CHANG · Jianhui Liu · Xiaoyang Wu · Lan Ma · XIAOJUAN QI
Understanding Imbalanced Semantic Segmentation Through Neural Collapse
Zhisheng Zhong · Jiequan Cui · Yibo Yang · Xiaoyang Wu · XIAOJUAN QI · Xiangyu Zhang · Jiaya Jia
Hierarchical Dense Correlation Distillation for Few-Shot Segmentation
Bohao PENG · Zhuotao Tian · Xiaoyang Wu · Chengyao Wang · Shu Liu · Jingyong Su · Jiaya Jia
Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning
Xiaoyang Wu · Xin Wen · Xihui Liu · Hengshuang Zhao
Self-Correctable and Adaptable Inference for Generalizable Human Pose Estimation
Zhehan Kan · Shuoshuo Chen · Ce Zhang · Yushun Tang · Zhihai He
Neuro-Modulated Hebbian Learning for Fully Test-Time Adaptation
Yushun Tang · Ce Zhang · Heng Xu · Shuoshuo Chen · Jie Cheng · Luziwei Leng · Qinghai Guo · Zhihai He
Noisy Correspondence Learning with Meta Similarity Correction
Haochen Han · Kaiyao Miao · Qinghua Zheng · Minnan Luo
Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency
Xiaogeng Liu · Minghui Li · Haoyu Wang · Shengshan Hu · Dengpan Ye · Hai Jin · Libing Wu · Chaowei Xiao
PolyFormer: Referring Image Segmentation as Sequential Polygon Generation
Jiang Liu · Hui Ding · Zhaowei Cai · Yuting Zhang · Ravi Satzoda · Vijay Mahadevan · R. Manmatha
Glocal Energy-based Learning for Few-Shot Open-Set Recognition
Haoyu Wang · Guansong Pang · Peng Wang · Lei Zhang · Wei Wei · Yanning Zhang
PointDistiller: Structured Knowledge Distillation Towards Efficient and Compact 3D Detection
Linfeng Zhang · Runpei Dong · Hung-Shuo Tai · Kaisheng Ma
LipFormer: High-fidelity and Generalizable Talking Face Generation with A Pre-learned Facial Codebook
Jiayu Wang · Kang Zhao · Shiwei Zhang · Yingya Zhang · Yujun Shen · Deli Zhao · Jingren Zhou
High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning
Chao Xu · Junwei Zhu · Jiangning Zhang · Yue Han · Wenqing Chu · Ying Tai · Chengjie Wang · Zhifeng Xie · Yong Liu
EC^2: Emergent Communication for Embodied Control
Yao Mu · Shunyu Yao · Mingyu Ding · Ping Luo · Chuang Gan
Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss
Anas Mahmoud · Jordan Sir Kwang Hu · Tianshu Kuai · Ali Harakeh · Liam Paull · Steven Waslander
Instance Relation Graph Guided Source-Free Domain Adaptive Object Detection
Vibashan Vishnukumar Sharmini · Poojan Oza · Vishal Patel
Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations
Vibashan Vishnukumar Sharmini · Ning Yu · Chen Xing · Can Qin · Mingfei Gao · Juan Carlos Niebles · Vishal Patel · Ran Xu
STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition
Xiaoyu Zhu · Po-Yao Huang · Junwei Liang · Celso de Melo · Alexander Hauptmann
DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks
Qiangqiang Wu · Tianyu Yang · Ziquan Liu · Baoyuan Wu · Ying Shan · Antoni Chan
TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization
Ziquan Liu · Yi Xu · Xiangyang Ji · Antoni Chan
Optimal Transport Minimization: Crowd Localization on Density Maps for Semi-Supervised Counting
Wei Lin · Antoni Chan
Music-Driven Group Choreography
Nhat Le · Trong Thang Pham · Tuong Do · Erman Tjiputra · Quang Tran · Anh Nguyen
Where is my Wallet? Modeling Object Proposal Sets for Egocentric Visual Query Localization
Mengmeng Xu · Yanghao Li · Cheng-Yang Fu · Bernard Ghanem · Tao Xiang · Juan-Manuel Perez-Rua
Rotation-Invariant Transformer for Point Cloud Matching
Hao Yu · Zheng Qin · Ji Hou · Mahdi Saleh · Dongsheng Li · Benjamin Busam · Slobodan Ilic
Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors
Ji Hou · Xiaoliang Dai · Zijian He · Angela Dai · Matthias Niessner
Boosting Semi-Supervised Learning by Exploiting All Unlabeled Data
Yuhao Chen · Xin Tan · Borui Zhao · ZhaoWei CHEN · Renjie Song · jiajun liang · Xuequan Lu
Implicit Identity Leakage: The Stumbling Block to Improving Deepfake Detection Generalization
Shichao Dong · Jin Wang · Renhe Ji · jiajun liang · Haoqiang Fan · Zheng Ge
EFEM: Equivariant Neural Field Expectation Maximization for 3D Object Segmentation Without Scene Supervision
Jiahui Lei · Congyue Deng · Karl Schmeckpeper · Leonidas Guibas · Kostas Daniilidis
SemiCVT: Semi-Supervised Convolutional Vision Transformer for Semantic Segmentation
Huimin Huang · Shiao Xie · Lanfen Lin · Tong Ruofeng · Yen-wei Chen · Yuexiang Li · Hong Wang · Yawen Huang · Yefeng Zheng
CNVid-3.5M: Build, Filter, and Pre-train the Large-scale Public Chinese Video-text Dataset
Tian Gan · Qing Wang · Xingning Dong · Xiangyuan Ren · Liqiang Nie · Qingpei Guo
Disentangling Writer and Character Styles for Handwriting Generation
Gang Dai · Yifan Zhang · Qingfeng Wang · Qing Du · Zhuliang Yu · Zhuoman Liu · Shuangping Huang
A2J-Transformer: Anchor-to-Joint Transformer Network for 3D Interacting Hand Pose Estimation from a Single RGB Image
Changlong Jiang · Yang Xiao · Cunlin Wu · Mingyang Zhang · Jinghong Zheng · Zhiguo Cao · Joey Zhou
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks
Hao Li · Jinguo Zhu · Xiaohu Jiang · Xizhou Zhu · Hongsheng Li · Chun Yuan · Xiaohua Wang · Yu Qiao · Xiaogang Wang · Wenhai Wang · Jifeng Dai
ShapeTalk: A Language Dataset and Framework for 3D Shape Edits and Deformations
Panos Achlioptas · Ian Huang · Minhyuk Sung · Sergey Tulyakov · Leonidas Guibas
Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR
Feng Li · Ailing Zeng · Shilong Liu · Hao Zhang · Hongyang Li · Lionel Ni · Lei Zhang
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation
Feng Li · Hao Zhang · Huaizhe Xu · Shilong Liu · Lei Zhang · Lionel Ni · Heung-Yeung Shum
MP-Former: Mask-Piloted Transformer for Image Segmentation
Hao Zhang · Feng Li · Huaizhe Xu · Shijia Huang · Shilong Liu · Lionel Ni · Lei Zhang
Enlarging Instance-specific and Class-specific Information for Open-set Action Recognition
Jun Cen · Shiwei Zhang · Xiang Wang · Yixuan Pei · Zhiwu Qing · Yingya Zhang · Qifeng Chen
MoLo: Motion-augmented Long-short Contrastive Learning for Few-shot Action Recognition
Xiang Wang · Shiwei Zhang · Zhiwu Qing · Changxin Gao · Yingya Zhang · Deli Zhao · Nong Sang
PCR: Proxy-based Contrastive Replay for Online Class-Incremental Continual Learning
Huiwei Lin · Baoquan Zhang · Shanshan Feng · Xutao Li · Yunming Ye
Building Rearticulable Models for Arbitrary 3D Objects from 4D Point Clouds
Shaowei Liu · Saurabh Gupta · Shenlong Wang
Slide-Transformer: Hierarchical Vision Transformer with Local Self-Attention
Xuran Pan · Tianzhu Ye · Zhuofan Xia · Shiji Song · Gao Huang
Compressing Volumetric Radiance Fields to 1 MB
Lingzhi Li · Zhen Shen · Zhongshu Wang · Li Shen · Liefeng Bo
REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory
Ziniu Hu · Ahmet Iscen · Chen Sun · Zirui Wang · Kai-Wei Chang · Yizhou Sun · Cordelia Schmid · David Ross · Alireza Fathi
Improving Image Recognition by Retrieving from Web-Scale Image-Text Data
Ahmet Iscen · Alireza Fathi · Cordelia Schmid
Learning to Name Classes for Vision and Language Models
Sarah Parisot · Yongxin Yang · Steven McDonagh
SteerNeRF: Accelerating NeRF Rendering via Smooth Viewpoint Trajectory
Sicheng Li · Hao Li · Yue Wang · Yiyi Liao · Lu Yu
Semi-Supervised Video Inpainting with Cycle Consistency Constraints
Zhiliang Wu · Han Xuan · Changchang Sun · Weili Guan · Kang Zhang · Yan Yan
Deep Stereo Video Inpainting
Zhiliang Wu · Changchang Sun · Han Xuan · Yan Yan
VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval
Siteng Huang · Biao Gong · Yulin Pan · Jianwen Jiang · Yiliang Lv · Yuyuan Li · Donglin Wang
NeRF-Supervised Deep Stereo
Fabio Tosi · Alessio Tonioni · Daniele Gregorio · Matteo Poggi
Collaborative Static and Dynamic Vision-Language Streams for Spatio-Temporal Video Grounding
Zihang Lin · Chaolei Tan · Jian-Fang Hu · Zhi Jin · Tiancai Ye · Wei-Shi Zheng
Hierarchical Semantic Correspondence Networks for Video Paragraph Grounding
Chaolei Tan · Zihang Lin · Jian-Fang Hu · Wei-Shi Zheng · Jianhuang Lai
Combining Implicit-Explicit View Correlation for Light Field Semantic Segmentation
Ruixuan Cong · Da Yang · Rongshan Chen · Sizhe Wang · Zhenglong Cui · Hao Sheng
Improving Robustness of Vision Transformers by Reducing Sensitivity to Patch Corruptions
Yong Guo · David Stutz · Bernt Schiele
DF-Platter: Multi-Face Heterogeneous Deepfake Dataset
Kartik Narayan · Harsh Agarwal · Kartik Thakral · Surbhi Mittal · Mayank Vatsa · Richa Singh
Metadata-Based RAW Reconstruction via Implicit Neural Functions
Leyi Li · Huijie Qiao · Qi Ye · Qinmin Yang
I^2-SDF: Intrinsic Indoor Scene Reconstruction and Editing via Raytracing in Neural SDFs
Jingsen Zhu · Yuchi Huo · Qi Ye · Fujun Luan · Jifan Li · Dianbing Xi · Lisha Wang · Rui Tang · Wei Hua · Hujun Bao · Rui Wang
Polarized Color Image Denoising
Zhuoxiao Li · Haiyang Jiang · Mingdeng Cao · Yinqiang Zheng
NeFII: Inverse Rendering for Reflectance Decomposition with Near-Field Indirect Illumination
Haoqian Wu · Zhipeng Hu · Lincheng Li · Yongqiang Zhang · Changjie Fan · Xin Yu
Balanced Energy Regularization Loss for Out-of-distribution Detection
Hyunjun Choi · Hawook Jeong · Jin Choi
DeCo: Decomposition and Reconstruction for Compositional Temporal Grounding via Coarse-to-Fine Contrastive Ranking
Lijin Yang · Quan Kong · Hsuan-Kung Yang · Wadim Kehl · Yoichi Sato · Norimasa Kobori
CREPE: Can Vision-Language Foundation Models Reason Compositionally?
Zixian Ma · Jerry Hong · Mustafa Omer Gul · Mona Gandhi · Irena Gao · Ranjay Krishna
Painting 3D Nature in 2D: View Synthesis of Natural Scenes from a Single Semantic Mask
Shangzhan Zhang · Sida Peng · Tianrun Chen · Linzhan Mou · Haotong Lin · Kaicheng Yu · Yiyi Liao · Xiaowei Zhou
Learning 3D-aware Image Synthesis with Unknown Pose Distribution
Zifan Shi · Yujun Shen · Yinghao Xu · Sida Peng · Yiyi Liao · Sheng Guo · Qifeng Chen · Dit-Yan Yeung
StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator
Jiazhi Guan · Zhanwang Zhang · Hang Zhou · Tianshu Hu · Kaisiyuan Wang · Dongliang He · Haocheng Feng · Jingtuo Liu · Errui Ding · Ziwei Liu · Jingdong Wang
A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others
Zhiheng Li · Ivan Evtimov · Albert Gordo · Caner Hazirbas · Tal Hassner · Cristian Canton · Chenliang Xu · Mark Ibrahim
Cooperation or Competition: Avoiding Player Domination for Multi-target Robustness by Adaptive Budgets
Yimu Wang · Dinghuai Zhang · Yihan Wu · Heng Huang · Hongyang Zhang
Gated Stereo: Joint Depth Estimation from Gated and Wide-Baseline Active Stereo Cues
Stefanie Walz · Mario Bijelic · Andrea Ramazzina · Amanpreet Walia · Fahim Mannan · Felix Heide
SliceMatch: Geometry-guided Aggregation for Cross-View Pose Estimation
Zimin Xia · Holger Caesar · Julian Kooij · Ted Lentsch
Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations
Lei Hsiung · Yun-Yun Tsai · Pin-Yu Chen · Tsung-Yi Ho
StyleGAN Salon: Multi-View Latent Optimization for Pose-Invariant Hairstyle Transfer
Sasikarn Khwanmuang · Pakkapon Phongthawee · Patsorn Sangkloy · Supasorn Suwajanakorn
Learning Geometric-aware Properties in 2D Representation Using Lightweight CAD Models, or Zero Real 3D Pairs
Pattaramanee Arsomngern · Sarana Nutanong · Supasorn Suwajanakorn
Visibility Constrained Wide-band Illumination Spectrum Design for Seeing-in-the-Dark
Muyao Niu · Zhuoxiao Li · Zhihang Zhong · Yinqiang Zheng
ToThePoint: Efficient Contrastive Learning of 3D Point Clouds via Recycling
Xinglin Li · Jiajing Chen · Jinhui Ouyang · Hanhui Deng · Senem Velipasalar · Di Wu
AUNet: Learning Relations Between Action Units for Face Forgery Detection
Weiming Bai · Yufan Liu · Zhipeng Zhang · Bing Li · Weiming Hu
Physical-World Optical Adversarial Attacks on 3D Face Recognition
Yanjie Li · Yiquan Li · Xuelong Dai · Songtao Guo · Bin Xiao
Robust Single Image Reflection Removal Against Adversarial Attacks
Zhenbo Song · Zhenyuan Zhang · Kaihao Zhang · Wenhan Luo · Zhaoxin Fan · Wenqi Ren · Jianfeng Lu
The Enemy of My Enemy is My Friend: Exploring Inverse Adversaries for Improving Adversarial Training
Junhao Dong · Seyed-Mohsen Moosavi-Dezfooli · Jianhuang Lai · Xiaohua Xie
Boosting Accuracy and Robustness of Student Models via Adaptive Adversarial Distillation
Bo Huang · Mingyang Chen · Yi Wang · JUNDA LU · Minhao Cheng · Wei Wang
Introducing Competition to Boost the Transferability of Targeted Adversarial Examples through Clean Feature Mixup
Junyoung Byun · Myung-Joon Kwon · Seungju Cho · Yoonji Kim · Changick Kim
Angelic Patches for Improving Third-Party Object Detector Performance
Wenwen Si · Shuo Li · Sangdon Park · Insup Lee · Osbert Bastani
Sibling-Attack: Rethinking Transferable Adversarial Attacks against Face Recognition
Zexin Li · Bangjie Yin · Taiping Yao · Junfeng Guo · Shouhong Ding · Simin Chen · Cong Liu
A Practical Upper Bound for the Worst-Case Attribution Deviations
Fan Wang · Adams Kong
You Are Catching My Attention: Are Vision Transformers Bad Learners under Backdoor Attacks?
Zenghui Yuan · Pan Zhou · Kai Zou · Yu Cheng
Architectural Backdoors in Neural Networks
Mikel Bober-Irizar · Ilia Shumailov · Yiren Zhao · Robert Mullins · Nicolas Papernot
The Dark Side of Dynamic Routing Neural Networks: Towards Efficiency Backdoor Injection
Simin Chen · Hanlin Chen · Mirazul Haque · Cong Liu · Wei Yang
StyleAdv: Meta Style Adversarial Training for Cross-Domain Few-Shot Learning
Yuqian Fu · YU XIE · Yanwei Fu · Yu-Gang Jiang
Rethinking Domain Generalization for Face Anti-spoofing: Separability and Alignment
Yiyou Sun · Yaojie Liu · Xiaoming Liu · Yixuan Li · Vincent Chu
Make Landscape Flatter in Differentially Private Federated Learning
Yifan Shi · Yingqi Liu · Kang Wei · Li Shen · Xueqian Wang · Dacheng Tao
Confidence-aware Personalized Federated Learning via Variational Expectation Maximization
Junyi Zhu · Xingchen Ma · Matthew Blaschko
ScaleFL: Resource-Adaptive Federated Learning with Heterogeneous Clients
Fatih Ilhan · Gong Su · Ling Liu
MetaMix: Towards Corruption-Robust Continual Learning with Temporally Self-Adaptive Data Transformation
Zhenyi Wang · Li Shen · Donglin Zhan · Qiuling Suo · Yanjun Zhu · Tiehang Duan · Mingchen Gao
Revisiting Reverse Distillation for Anomaly Detection
Tran Dinh Tien · Anh Tuan Nguyen · Nguyen Tran · Huy Ta · Soan Duong · Chanh Nguyen · Steven Truong
Generating Anomalies for Video Anomaly Detection with Prompt-based Feature Mapping
Zuhao Liu · Xiao-Ming Wu · Dian Zheng · Kun-Yu Lin · Wei-Shi Zheng
Explicit Boundary Guided Semi-Push-Pull Contrastive Learning for Supervised Anomaly Detection
Xincheng Yao · Ruoqi Li · Jing Zhang · Jun Sun · Chongyang Zhang
Towards Universal Fake Image Detectors that Generalize Across Generative Models
Utkarsh Ojha · Yuheng Li · Yong Jae Lee
Edges to Shapes to Concepts: Adversarial Augmentation for Robust Vision
Aditay Tripathi · Rishubh Singh · Anirban Chakraborty · Pradeep Shenoy
Sequential training of GANs against GAN-classifiers reveals correlated “knowledge gaps” present among independently trained GAN instances
Arkanath Pathak · Nicholas Dufour
Masked Auto-Encoders Meet Generative Adversarial Networks and Beyond
Zhengcong Fei · Mingyuan Fan · Li Zhu · Junshi Huang · Xiaoming Wei · Xiaolin Wei
Vector Quantization with Self-attention for Quality-independent Representation Learning
zhou yang · Weisheng Dong · Xin Li · Mengluan Huang · Yulin Sun · Guangming Shi
PD-Quant: Post-Training Quantization Based on Prediction Difference Metric
Jiawei Liu · Lin Niu · Zhihang Yuan · Dawei Yang · Xinggang Wang · Wenyu Liu
Hard Sample Matters a Lot in Zero-Shot Quantization
Huantong Li · Xiangmiao Wu · fanbing Lv · Daihai Liao · Thomas Li · Yonggang Zhang · Bo Han · Mingkui Tan
Fair Scratch Tickets: Finding Fair Sparse Networks without Weight Training
Pengwei Tang · Wei Yao · Zhicong Li · Yong Liu
Understanding Deep Generative Models with Generalized Empirical Likelihoods
Suman Ravuri · Mélanie Rey · Shakir Mohamed · Marc Deisenroth
Deep Deterministic Uncertainty: A New Simple Baseline
Jishnu Mukhoti · Andreas Kirsch · Joost van Amersfoort · Philip Torr · Yarin Gal
Compacting Binary Neural Networks by Sparse Kernel Selection
Yikai Wang · Wenbing Huang · Yinpeng Dong · Fuchun Sun · Anbang Yao
Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures
Eugenia Iofinova · Alexandra Peste · Dan Alistarh
X-Pruner: eXplainable Pruning for Vision Transformers
Lu Yu · Wei Xiang
Deep Graph Reprogramming
Yongcheng Jing · Chongbin Yuan · Li Ju · Yiding Yang · Xinchao Wang · Dacheng Tao
FlowGrad: Controlling the Output of Generative ODEs with Gradients
Xingchao Liu · Lemeng Wu · Shujian Zhang · Chengyue Gong · Wei Ping · qiang liu
Exploring Data Geometry for Continual Learning
Zhi Gao · Chen Xu · Feng Li · Yunde Jia · Mehrtash Harandi · Yuwei Wu
Improving Generalization with Domain Convex Game
Fangrui Lv · Jian Liang · Shuang Li · Jinming Zhang · Di Liu
SLACK: Stable Learning of Augmentations with Cold-start and KL regularization
Juliette Marrie · Michael Arbel · Diane Larlus · Julien Mairal
Critical Learning Periods for Multisensory Integration in Deep Networks
Michael Kleinman · Alessandro Achille · Stefano Soatto
Preserving Linear Separability in Continual Learning by Backward Feature Projection
Qiao Gu · Dongsub Shim · Florian Shkurti
Multi-level Logit Distillation
Ying Jin · Jiaqi Wang · Dahua Lin
Data-Free Knowledge Distillation via Feature Exchange and Activation Region Constraint
Shikang Yu · Jiachen Chen · Hu Han · Shuqiang Jiang
Masked Autoencoders Enable Efficient Knowledge Distillers
Yutong Bai · Zeyu Wang · Junfei Xiao · Chen Wei · Huiyu Wang · Alan Yuille · Yuyin Zhou · Cihang Xie
DKT: Diverse Knowledge Transfer Transformer for Class Incremental Learning
Xinyuan Gao · Yuhang He · SongLin Dong · Jie Cheng · Xing Wei · Yihong Gong
BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning
Changdae Oh · Hyeji Hwang · Hee-young Lee · YongTaek Lim · Geunyoung Jung · Jiyoung Jung · Hosik Choi · Kyungwoo Song
PIVOT: Prompting for Video Continual Learning
Andres Villa · Juan Leon Alcazar · Motasem Alfarra · Kumail Alhamoud · Julio Hurtado · Fabian Caba · Alvaro Soto · Bernard Ghanem
MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering
Jingjing Jiang · Nanning Zheng
NIFF: Alleviating Forgetting in Generalized Few-Shot Object Detection via Neural Instance Feature Forging
Karim Guirguis · Johannes Meier · George Eskandar · Matthias Kayser · Bin Yang · Jürgen Beyerer
Learning with Fantasy: Semantic-Aware Virtual Contrastive Constraint for Few-Shot Class-Incremental Learning
Zeyin Song · Yifan Zhao · Yujun Shi · Peixi Peng · Li Yuan · Yonghong Tian
Improved Test-Time Adaptation for Domain Generalization
Liang Chen · Yong Zhang · Yibing Song · Ying Shan · Lingqiao Liu
TIPI: Test Time Adaptation with Transformation Invariance
Anh Tuan Nguyen · Thanh Nguyen-Tang · Ser-Nam Lim · Philip Torr
ActMAD: Activation Matching to Align Distributions for Test-Time-Training
Muhammad Mirza Mirza · Pol Jane Soneira · Wei Lin · Mateusz Kozinski · Horst Possegger · Horst Bischof
Modality-Agnostic Debiasing for Single Domain Generalization
Sanqing Qu · Yingwei Pan · Guang Chen · Ting Yao · changjun jiang · Tao Mei
ALOFT: A Lightweight MLP-like Architecture with Dynamic Low-frequency Transform for Domain Generalization
Jintao Guo · Na Wang · Lei Qi · Yinghuan Shi
C-SFDA: A Curriculum Learning Aided Self-Training Framework for Efficient Source Free Domain Adaptation
Nazmul Karim · Niluthpol Chowdhury Mithun · Abhinav Rajvanshi · Han-pang Chiu · Supun Samarasekera · Nazanin Rahnavard
Adjustment and Alignment for Unbiased Open Set Domain Adaptation
Wuyang Li · Jie Liu · Bo Han · Yixuan Yuan
Semi-Supervised Domain Adaptation with Source Label Adaptation
Yu-Chu Yu · Hsuan-Tien Lin
Dynamically Instance-Guided Adaptation: A Backward-free Approach for Test-Time Domain Adaptive Semantic Segmentation
Wei Wang · Zhun Zhong · Weijie Wang · Xi Chen · Charles Ling · Boyu Wang · Nicu Sebe
FCC: Feature Clusters Compression for Long-Tailed Visual Recognition
Jian Li · Ziyao Meng · daqian Shi · Rui Song · Xiaolei Diao · Jingwen Wang · Hao Xu
DISC: Learning from Noisy Labels via Dynamic Instance-Specific Selection and Correction
Yifan Li · Hu Han · Shiguang Shan · Xilin CHEN
Superclass Learning with Representation Enhancement
Zeyu Gan · Suyun Zhao · Jinlong Kang · Liyuan Shang · Hong Chen · Cuiping Li
Improving Selective Visual Question Answering by Learning from Your Peers
Corentin Dancette · Spencer Whitehead · Rishabh Maheshwary · Shanmukha Ramakrishna Vedantam · Stefan Scherer · Xinlei Chen · Matthieu CORD · Marcus Rohrbach
Difficulty-based Sampling for Debiased Contrastive Representation Learning
Taeuk Jang · Xiaoqian Wang
Token Boosting for Robust Self-Supervised Visual Transformer Pre-training
Tianjiao Li · Lin Geng Foo · Ping Hu · Xindi Shang · Hossein Rahmani · Zehuan Yuan · Jun Liu
HyperMatch: Noise-Tolerant Semi-Supervised Learning via Relaxed Contrastive Constraint
Beitong Zhou · Jing Lu · Kerui Liu · Yunlu Xu · Zhanzhan Cheng · Yi Niu
Open-Set Likelihood Maximization for Few-Shot Learning
Malik Boudiaf · Etienne Bennequin · Myriam Tami · Antoine Toubhans · Pablo Piantanida · CELINE HUDELOT · Ismail Ayed
Transductive Few-Shot Learning with Prototypes Label-Propagation by Iterative Graph Refinement
Hao Zhu · Piotr Koniusz
Deep Fair Clustering via Maximizing and Minimizing Mutual Information: Theory, Algorithm and Metric
Pengxin Zeng · Yunfan Li · Peng Hu · Dezhong Peng · Jiancheng Lv · Xi Peng
On the Effects of Self-supervision and Contrastive Alignment in Deep Multi-view Clustering
Daniel J. Trosten · Sigurd Løkse · Robert Jenssen · Michael Kampffmeyer
Sample-level Multi-view Graph Clustering
Yuze Tan · Yixi Liu · Shudong Huang · Wentao Feng · Jiancheng Lv
Discriminating Known from Unknown Objects via Structure-Enhanced Recurrent Variational AutoEncoder
Aming WU · Cheng Deng
GEN: Pushing the Limits of Softmax-Based Out-of-Distribution Detection
Xixi Liu · Yaroslava Lochman · Christopher Zach
RankMix: Data Augmentation for Weakly Supervised Learning of Classifying Whole Slide Images with Diverse Sizes and Imbalanced Categories
Yuan-Chih Chen · Chun-Shien Lu
Best of Both Worlds: Multimodal Contrastive Learning with Tabular and Imaging Data
Paul Hager · Martin J. Menten · Daniel Rueckert
DeGPR: Deep Guided Posterior Regularisation For Multi-Class Cell Detection And Counting
Aayush Tyagi · Chirag Mohapatra · Prasenjit Das · Govind Makharia · Lalita Mehra · Prathosh AP · Mausam
OCELOT: Overlapped Cell on Tissue Dataset for Histopathology
Jeongun Ryu · Aaron Valero Puche · JaeWoong Shin · Seonwook Park · Biagio Brattoli · Jinhee Lee · Wonkyung Jung · Soo Ick Cho · Kyunghyun Paeng · Chan-Young Ock · Donggeun Yoo · Sérgio Pereira
SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection
Tiange Xiang · Yixiao Zhang · Yongyi Lu · Alan Yuille · Chaoyi Zhang · Weidong Cai · Zongwei Zhou
Devil is in the Queries: Advancing Mask Transformers for Real-world Medical Image Segmentation and Out-of-Distribution Localization
Mingze Yuan · Yingda Xia · Hexin Dong · Zifan Chen · Jiawen Yao · Mingyan Qiu · Ke Yan · Xiaoli Yin · Yu Shi · Xin Chen · Zaiyi Liu · Bin Dong · Jingren Zhou · Le Lu · Ling Zhang · Li Zhang
MagicNet: Semi-Supervised Multi-Organ Segmentation via Magic-Cube Partition and Recovery
Duowen Chen · Yunhao Bai · Wei Shen · Qingli Li · Lequan Yu · Yan Wang
(ML)^2P-Encoder: On Exploration of Channel-class Correlation for Multi-label Zero-shot Learning
Ziming Liu · Song Guo · Xiaocheng Lu · Jingcai Guo · Jiewei Zhang · Yue Zeng · Fushuo Huo
Out-of-Distributed Semantic Pruning for Robust Semi-Supervised Learning
Yu Wang · Pengchong Qiao · Chang Liu · Guoli Song · Xiawu Zheng · Jie Chen
Contrastive Mean Teacher for Domain Adaptive Object Detectors
Shengcao Cao · Dhiraj Joshi · Liangyan Gui · Yu-Xiong Wang
Harmonious Teacher for Cross-domain Object Detection
Jinhong Deng · Dongli Xu · Wen Li · Lixin Duan
Hierarchical Supervision and Shuffle Data Augmentation for 3D Semi-Supervised Object Detection
Chuandong Liu · CHENQIANG GAO · Fangcen Liu · Pengcheng Li · Deyu Meng · Xinbo Gao
Semi-DETR: Semi-Supervised Object Detection with Detection Transformers
Jiacheng Zhang · Xiangru Lin · Wei Zhang · Kuo Wang · Xiao Tan · Junyu Han · Errui Ding · Jingdong Wang · Guanbin Li
Continual Detection Transformer for Incremental Object Detection
Yaoyao Liu · Bernt Schiele · Andrea Vedaldi · Christian Rupprecht
DA-DETR: Domain Adaptive Detection Transformer with Information Fusion
Jingyi Zhang · Jiaxing Huang · Zhipeng Luo · Gongjie Zhang · Xiaoqin Zhang · Shijian Lu
CIGAR: Cross-Modality Graph Reasoning for Domain Adaptive Object Detection
Yabo Liu · Jinghua Wang · Chao Huang · Yaowei Wang · Yong Xu
Box-Level Active Detection
Mengyao Lyu · Jundong Zhou · Hui Chen · Yi-Jie Huang · Dongdong Yu · Yaqian Li · Yandong Guo · Yuchen Guo · Liuyu Xiang · Guiguang Ding
Enhanced Training of Query-Based Object Detection via Selective Query Recollection
Fangyi Chen · Han Zhang · Kai Hu · Yu-Kai Huang · Chenchen Zhu · Marios Savvides
Vision Transformers are Good Mask Auto-Labelers
Shiyi Lan · Xitong Yang · Zhiding Yu · Zuxuan Wu · Jose Alvarez · Anima Anandkumar
Weakly Supervised Posture Mining for Fine-grained Classification
Zhenchao Tang · Hualin Yang · Calvin Yu-Chian Chen
IDGI: A Framework to Eliminate Explanation Noise from Integrated Gradients
Ruo Yang · Binghui Wang · Mustafa Bilgic
Active Finetuning: Exploiting Annotation Budget in the Pretraining-Finetuning Paradigm
Yichen Xie · Han Lu · Junchi Yan · Xiaokang Yang · Masayoshi Tomizuka · Wei Zhan
Instance-specific and Model-adaptive Supervision for Semi-supervised Semantic Segmentation
Zhen Zhao · Sifan Long · Jimin Pi · Jingdong Wang · Luping Zhou
Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation
Yan Jin · Mengke LI · Yang Lu · Yiu-ming Cheung · Hanzi Wang
Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation
Chaohui Yu · Qiang Zhou · Jingliang Li · Jianlong Yuan · Zhibin Wang · Fan Wang
Out-of-Candidate Rectification for Weakly Supervised Semantic Segmentation
Zesen Cheng · Pengchong Qiao · Kehan Li · Siheng Li · Pengxu Wei · Xiangyang Ji · Li Yuan · Chang Liu · Jie Chen
FastInst: A Simple Query-Based Model for Real-Time Instance Segmentation
Junjie He · Pengyu Li · Yifeng Geng · Xuansong Xie
On Calibrating Semantic Segmentation Models: Analyses and An Algorithm
Dongdong Wang · Boqing Gong · Liqiang Wang
Content-aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers
Chenyang Lu · Daan de Geus · Gijs Dubbelman
Ultra-High Resolution Segmentation with Ultra-Rich Context: A Novel Benchmark
Deyi Ji · Feng Zhao · Hongtao Lu · Mingyuan Tao · Jieping Ye
Few-shot Semantic Image Synthesis with Class Affinity Transfer
Marlene Careil · Jakob Verbeek · Stéphane Lathuilière
Network-free, unsupervised semantic segmentation with synthetic images
Qianli Feng · Raghudeep Gadde · Wentong Liao · Eduard Ramon · Aleix Martinez
MISC210K: A Large-Scale Dataset for Multi-Instance Semantic Correspondence
Yixuan Sun · Yiwen Huang · HaiJing Guo · Yuzhou Zhao · Runmin Wu · Yizhou Yu · Weifeng Ge · Wenqiang Zhang
GRES: Generalized Referring Expression Segmentation
Chang Liu · Henghui Ding · Xudong Jiang
Semantic Prompt for Few-Shot Image Recognition
Wentao Chen · Chenyang Si · Zhang Zhang · Liang Wang · Zilei Wang · Tieniu Tan
Contrastive Grouping with Transformer for Referring Image Segmentation
Jiajin Tang · Ge Zheng · Cheng Shi · Sibei YANG
Decomposed Soft Prompt Guided Fusion Enhancing for Compositional Zero-Shot Learning
Xiaocheng Lu · Song Guo · Ziming Liu · Jingcai Guo
GP-VTON: Towards General Purpose Virtual Try-on via Collaborative Local-Flow Global-Parsing Learning
Zhenyu Xie · Zaiyu Huang · Xin Dong · Fuwei Zhao · Haoye Dong · Xijin Zhang · Feida Zhu · Xiaodan Liang
OvarNet: Towards Open-vocabulary Object Attribute Recognition
Keyan Chen · Xiaolong Jiang · Yao Hu · Xu Tang · Yan Gao · Jianqi Chen · Weidi Xie
HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models
Shan Ning · Longtian Qiu · Yongfei Liu · Xuming He
DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment
Lewei Yao · Jianhua Han · Xiaodan Liang · Dan Xu · Wei Zhang · Zhenguo Li · Hang Xu
Data-efficient Large Scale Place Recognition with Graded Similarity Supervision
Maria Leyva-Vallina · Nicola Strisciuglio · Nicolai Petkov
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Zequn Zeng · Hao Zhang · Zhengjue Wang · Ruiying Lu · Dongsheng Wang · Bo Chen
Deep Hashing with Minimal-Distance-Separated Hash Centers
Liangdao Wang · Yan Pan · Cong Liu · Hanjiang Lai · Jian Yin · Ye Liu
Few-Shot Learning with Visual Distribution Calibration and Cross-Modal Distribution Alignment
Runqi Wang · Hao ZHENG · Xiaoyue Duan · Jianzhuang Liu · Yuning Lu · Tian Wang · Songcen Xu · Baochang Zhang
Masked Autoencoding Does Not Help Natural Language Supervision at Scale
Floris Weers · Vaishaal Shankar · Angelos Katharopoulos · Yinfei Yang · Tom Gunter
Improving Cross-Modal Retrieval with Set of Diverse Embeddings
Dongwon Kim · Namyup Kim · Suha Kwak
Revisiting Self-Similarity: Structural Embedding for Image Retrieval
Seongwon Lee · Suhyeon Lee · Hongje Seong · Euntai Kim
LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data
Jihye Park · Sunwoo Kim · Soohyun Kim · Seokju Cho · Jaejun Yoo · Youngjung Uh · Seungryong Kim
Scaling Language-Image Pre-training via Masking
Yanghao Li · Haoqi Fan · Ronghang Hu · Christoph Feichtenhofer · Kaiming He
Variational Distribution Learning for Unsupervised Text-to-Image Generation
MINSOO KANG · Doyup Lee · Jiseob Kim · Saehoon Kim · Bohyung Han
Semantic-Conditional Diffusion Networks for Image Captioning
Jianjie Luo · Yehao Li · Yingwei Pan · Ting Yao · Jianlin Feng · Hongyang Chao · Tao Mei
Zero-Shot Everything Sketch-Based Image Retrieval, and in Explainable Style
Fengyin Lin · Mingkang Li · Da Li · Timothy Hospedales · Yi-Zhe Song · Yonggang Qi
MAGVLT: Masked Generative Vision-and-Language Transformer
Sungwoong Kim · Daejin Jo · Donghoon Lee · Jongmin Kim
SketchXAI: A First Look at Explainability for Human Sketches
Zhiyu Qu · Yulia Gryaditskaya · Ke Li · Kaiyue Pang · Tao Xiang · Yi-Zhe Song
Learning Geometry-aware Representations by Sketching
Hyundo Lee · Inwoo Hwang · Hyunsung Go · Won-Seok Choi · Kibeom Kim · Byoung-Tak Zhang
Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
Dezhao Luo · Jiabo Huang · Shaogang Gong · Hailin Jin · Yang Liu
Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
Syed Talal Wasim · Muhammad Muzammal Naseer · Salman Khan · Fahad Khan · Mubarak Shah
Query-Dependent Video Representation for Moment Retrieval and Highlight Detection
WonJun Moon · Sangeek Hyun · SangUk Park · Dongchan Park · Jae-Pil Heo
Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-based Active Learning
Wei Ji · Renjie Liang · Zhedong Zheng · Wenqiao Zhang · Shengyu Zhang · Juncheng Li · Mengze Li · Tat-Seng Chua
Improving Weakly Supervised Temporal Action Localization by Bridging Train-Test Gap in Pseudo Labels
Jingqiu Zhou · Linjiang Huang · Liang Wang · Si Liu · Hongsheng Li
PivoTAL: Prior-Driven Supervision for Weakly-Supervised Temporal Action Localization
Mamshad Nayeem Rizve · Gaurav Mittal · Ye Yu · Matthew Hall · Sandra Sajeev · Mubarak Shah · Mei Chen
Open Set Action Recognition via Multi-Label Evidential Learning
Chen Zhao · Dawei Du · Anthony Hoogs · Christopher Funk
Object Discovery from Motion-Guided Tokens
Zhipeng Bao · Pavel Tokmakov · Yu-Xiong Wang · Adrien Gaidon · Martial Hebert
Unified Keypoint-based Action Recognition Framework via Structured Keypoint Pooling
Ryo Hachiuma · Fumiaki Sato · Taiki Sekii
Video Test-Time Adaptation for Action Recognition
Wei Lin · Muhammad Jehanzeb Mirza · Mateusz Kozinski · Horst Possegger · Hilde Kuehne · Horst Bischof
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline
Tiantian Geng · Teng WANG · Jinming Duan · Runmin Cong · Feng Zheng
A Light Weight Model for Active Speaker Detection
Junhua Liao · Haihan Duan · Kanghui Feng · WanBing Zhao · Yanbing Yang · Liangyin Chen
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR
Paul Hongsuck Seo · Arsha Nagrani · Cordelia Schmid
Egocentric Audio-Visual Object Localization
Chao Huang · Yapeng Tian · Anurag Kumar · Chenliang Xu
An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling
Tsu-Jui Fu · Linjie Li · Zhe Gan · Kevin Lin · William Yang Wang · Lijuan Wang · Zicheng Liu
Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers
Jaehoon Yoo · Semin Kim · Doyup Lee · Chiheon Kim · Seunghoon Hong
Unifying Short and Long-Term Tracking with Graph Hierarchies
Orcun Cetintas · Guillem Braso · Laura Leal-Taixé
Hierarchical Neural Memory Network for Low Latency Event Processing
Ryuhei Hamaguchi · Yasutaka Furukawa · Masaki Onishi · Ken Sakurada
Mask-Free Video Instance Segmentation
Lei Ke · Martin Danelljan · Henghui Ding · Yu-Wing Tai · Chi-Keung Tang · Fisher Yu
Hierarchical Semantic Contrast for Scene-aware Video Anomaly Detection
Shengyang Sun · Xiaojin Gong
Breaking the “Object” in Video Object Segmentation
Pavel Tokmakov · Jie Li · Adrien Gaidon
VideoTrack: Learning to Track Objects via Video Transformer
Fei Xie · Lei Chu · Jiahao Li · Yan Lu · Chao Ma
Recurrence without Recurrence: Stable Video Landmark Detection with Deep Equilibrium Models
Paul Micaelli · Arash Vahdat · Hongxu Yin · Jan Kautz · Pavlo Molchanov
Unbiased Scene Graph Generation in Videos
Sayak Nag · Kyle Min · Subarna Tripathi · Amit Roy-Chowdhury
Graph Representation for Order-aware Visual Transformation
Yue Qiu · Yanjun Sun · Fumiya Matsuzawa · Kenji Iwata · Hirokatsu Kataoka
Prototype-based Embedding Network for Scene Graph Generation
Chaofan Zheng · Xinyu Lyu · Lianli Gao · Bo Dai · Jingkuan Song
Efficient Mask Correction for Click-Based Interactive Image Segmentation
Fei Du · Jianlong Yuan · Zhibin Wang · Fan Wang
G-MSM: Unsupervised Multi-Shape Matching with Graph-based Affinity Priors
Marvin Eisenberger · Aysim Toker · Laura Leal-Taixé · Daniel Cremers
Shape-Erased Feature Learning for Visible-Infrared Person Re-Identification
Jiawei Feng · Ancong Wu · Wei-Shi Zheng
Mixed Autoencoder for Self-supervised Visual Representation Learning
Kai Chen · Zhili LIU · Lanqing HONG · Hang Xu · Zhenguo Li · Dit-Yan Yeung
Stare at What You See: Masked Image Modeling without Reconstruction
Hongwei Xue · Peng Gao · Hongyang Li · Yu Qiao · Hao Sun · Houqiang Li · Jiebo Luo
ResFormer: Scaling ViTs with Multi-Resolution Training
Rui Tian · Zuxuan Wu · Qi Dai · Han Hu · Yu Qiao · Yu-Gang Jiang
Seeing Beyond the Brain: Conditional Diffusion Model with Sparse Masked Modeling for Vision Decoding
Zijiao Chen · Jiaxin Qing · Tiange Xiang · Wan Lin Yue · Juan Helen Zhou
DropKey for Vision Transformer
Bonan Li · Yinhan Hu · Xuecheng Nie · Congying Han · Xiangjian Jiang · Tiande Guo · Luoqi Liu
Vision Transformer with Super Token Sampling
Huaibo Huang · Xiaoqiang Zhou · Jie Cao · Ran He · Tieniu Tan
Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers
Cong Wei · Brendan Duke · Ruowei Jiang · Parham Aarabi · Graham Taylor · Florian Shkurti
All are Worth Words: A ViT Backbone for Diffusion Models
Fan Bao · Shen Nie · Kaiwen Xue · Yue Cao · Chongxuan Li · Hang Su · Jun Zhu
Boost Vision Transformer with GPU-Friendly Sparsity and Quantization
Chong Yu · Tao Chen · Zhongxue Gan · Jiayuan Fan
DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training
Yihao Chen · Xianbiao Qi · Jianan Wang · Lei Zhang
Structured Sparsity Learning for Efficient Video Super-Resolution
Bin Xia · Jingwen He · Yulun Zhang · Yitong Wang · Yapeng Tian · Wenming Yang · Luc Van Gool
Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos
Yubin Hu · Yuze He · Yanghao Li · Jisheng Li · Yuxing Han · jiangtao wen · Yong-jin Liu
Neural Video Compression with Diverse Contexts
Jiahao Li · Bin Li · Yan Lu
Large-capacity and Flexible Video Steganography via Invertible Neural Network
Chong Mou · Youmin Xu · Jiechong Song · Chen Zhao · Bernard Ghanem · Jian Zhang
Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization
Mengqi Huang · Zhendong Mao · Zhuowei Chen · Yongdong Zhang
Binary Latent Diffusion
Ze Wang · Jiang Wang · Zicheng Liu · Qiang Qiu
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
Andreas Blattmann · Robin Rombach · Huan Ling · Tim Dockhorn · Seung Wook Kim · Sanja Fidler · Karsten Kreis
Diffusion Probabilistic Model Made Slim
Xingyi Yang · Daquan Zhou · Jiashi Feng · Xinchao Wang
Solving 3D Inverse Problems from Pre-trained 2D Diffusion Models
Hyungjin Chung · Dohoon Ryu · Michael McCann · Marc Klasky · Jong Ye
EDICT: Exact Diffusion Inversion via Coupled Transformations
Bram Wallace · Akash Gokul · Nikhil Naik
Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models
Patrick Schramowski · Manuel Brack · Björn Deiseroth · Kristian Kersting
GLIGEN: Open-Set Grounded Text-to-Image Generation
Yuheng Li · Haotian Liu · Qingyang Wu · Fangzhou Mu · Jianwei Yang · Jianfeng Gao · Chunyuan Li · Yong Jae Lee
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Nataniel Ruiz · Yuanzhen Li · Varun Jampani · Yael Pritch · Michael Rubinstein · Kfir Aberman
LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation
Guangcong Zheng · Xianpan Zhou · Xuewei Li · Zhongang Qi · Ying Shan · Xi Li
Affordance Diffusion: Synthesizing Hand-Object Interactions
Yufei Ye · Xueting Li · Abhinav Gupta · Shalini De Mello · Stan Birchfield · Jiaming Song · Shubham Tulsiani · Sifei Liu
SceneComposer: Any-Level Semantic Image Synthesis
Yu Zeng · Zhe Lin · Jianming Zhang · Qing Liu · John Collomosse · Jason Kuen · Vishal Patel
Handwritten Text Generation from Visual Archetypes
Vittorio Pippi · Silvia Cascianelli · Rita Cucchiara
Referring Image Matting
Jizhizi Li · Jing Zhang · Dacheng Tao
Neural Transformation Fields for Arbitrary-Styled Font Generation
Bin Fu · Junjun He · Jianjun Wang · Yu Qiao
SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model
Shaoan Xie · Zhifei Zhang · Zhe Lin · Tobias Hinz · Kun Zhang
Masked and Adaptive Transformer for Exemplar Based Image Translation
chang jiang · Fei Gao · Biao Ma · Lin Yuhao · Nannan Wang · Gang Xu
Efficient Scale-Invariant Generator with Column-Row Entangled Pixel Synthesis
Thuan Nguyen · Thanh Le · Anh Tran
RWSC-Fusion: Region-Wise Style-Controlled Fusion Network for the Prohibited X-ray Security Image Synthesis
luwen duan · Min Wu · Lijian Mao · Jun Yin · Xiong Jianping · Xi Li
Towards Artistic Image Aesthetics Assessment: a Large-scale Dataset and a New Method
Ran Yi · Haoyuan Tian · Zhihao Gu · Yu-Kun Lai · Paul Rosin
Omni Aggregation Networks for Lightweight Image Super-Resolution
Hang Wang · Xuanhong Chen · Bingbing Ni · Yutian Liu · Jinfan Liu
Activating More Pixels in Image Super-Resolution Transformer
Xiangyu Chen · Xintao Wang · Jiantao Zhou · Yu Qiao · Chao Dong
Spatial-Frequency Mutual Learning for Face Super-Resolution
Chenyang Wang · Junjun Jiang · Zhiwei Zhong · Xianming Liu
Kernel Aware Resampler
Michael Bernasconi · Abdelaziz Djelouah · Farnood Salehi · Markus Gross · Christopher Schroers
RGB no more: Minimally-decoded JPEG Vision Transformers
Jeongsoo Park · Justin Johnson
Multi-Realism Image Compression with a Conditional Generator
Eirikur Agustsson · David Minnen · George Toderici · Fabian Mentzer
Learning to Exploit the Sequence-Specific Prior Knowledge for Image Processing Pipelines Optimization
Haina Qin · Longfei Han · Weihua Xiong · Juan Wang · Wentao Ma · Bing Li · Weiming Hu
Quality-aware Pre-trained Models for Blind Image Quality Assessment
Kai Zhao · Kun Yuan · Ming Sun · Mading Li · Xing Wen
Robust Unsupervised StyleGAN Image Restoration
Yohan Poirier-Ginter · Jean-Francois Lalonde
RIDCP: Revitalizing Real Image Dehazing via High-Quality Codebook Priors
Rui-Qi Wu · Zheng-Peng Duan · Chunle Guo · Zhi Chai · Chongyi Li
Toward Stable, Interpretable, and Lightweight Hyperspectral Super-resolution
Wenjin Guo · Weiying Xie · Kai Jiang · Yunsong Li · Jie Lei · Leyuan Fang
Residual Degradation Learning Unfolding Framework with Mixing Priors across Spectral and Spatial for Compressive Spectral Imaging
Yubo Dong · Dahua Gao · Tian Qiu · Yuyan Li · Minxi Yang · Guangming Shi
Learning a Simple Low-light Image Enhancer from Paired Low-light Instances
Zhenqi Fu · Yan Yang · Xiaotong Tu · Yue Huang · Xinghao Ding · Kai-Kuang Ma
Learning a Deep Color Difference Metric for Photographic Images
Haoyu Chen · Zhihua Wang · Yang Yang · Qilin Sun · Kede Ma
Learning a Practical SDR-to-HDRTV Up-conversion using New Dataset and Degradation Models
Cheng Guo · Leidong Fan · Ziyu Xue · Xiuhua Jiang
BiasBed - Rigorous Texture Bias Evaluation
Nikolai Kalischek · Rodrigo Daudt · Torben Peters · Reinhard Furrer · Jan D. Wegner · Konrad Schindler
A Unified HDR Imaging Method with Pixel and Patch Level
Qingsen Yan · Weiye Chen · song zhang · Yu Zhu · Jinqiu Sun · Yanning Zhang
Gated Multi-Resolution Transfer Network for Burst Restoration and Enhancement
Nancy Mehta · Akshay Dudhane · Subrahmanyam Murala · Syed Waqas Zamir · Salman Khan · Fahad Khan
Deep Discriminative Spatial and Temporal Network for Efficient Video Deblurring
Jinshan Pan · Boming Xu · Jiangxin Dong · Jianjun Ge · Jinhui Tang
1000 FPS HDR Video with a Spike-RGB Hybrid Camera
Yakun Chang · Chu Zhou · Yuchen Hong · hu liwen · Chao Xu · Tiejun Huang · Boxin Shi
Exploring Motion Ambiguity and Alignment for High-Quality Video Frame Interpolation
Kun Zhou · Wenbo Li · Xiaoguang Han · Jiangbo Lu
Range-nullspace Video Frame Interpolation with Focalized Motion Estimation
Zhiyang Yu · Yu Zhang · Dongqing Zou · Xijun Chen · Jimmy Ren · Shunqing Ren
Deep Polarization Reconstruction with PDAVIS Events
Haiyang Mei · Zuowen Wang · Xin Yang · Xiaopeng Wei · Tobi Delbruck
Unsupervised space-time network for temporally-consistent segmentation of multiple motions
Etienne Meunier · Patrick Bouthemy
NeMo: Learning 3D Neural Motion Fields from Multiple Video Instances of the Same Action
Kuan-Chieh Wang · Zhenzhen Weng · Maria Xenochristou · Joao Araujo · Jeffrey Gu · Karen Liu · Serena Yeung
TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification
Haocong Rao · Chunyan Miao
FLAG3D: A 3D Fitness Activity Dataset with Language Instruction
Yansong Tang · Jinpeng Liu · Aoyang Liu · Bin Yang · Wenxun Dai · Yongming Rao · Jiwen Lu · Jie Zhou · Xiu Li
MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
Bowen Zhang · Chenyang Qi · Pan Zhang · Bo Zhang · HsiangTao Wu · Dong Chen · Qifeng Chen · Yong Wang · Fang Wen
Feature Representation Learning with Adaptive Displacement Generation and Transformer Fusion for Micro-Expression Recognition
Zhijun Zhai · Jianhui Zhao · Chengjiang Long · Wenju Xu · He Shuangjiang · huijuan zhao
Clothing-Change Feature Augmentation for Person Re-Identification
Ke Han · Shaogang Gong · Yan Huang · Liang Wang · Tieniu Tan
MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors
Yuang Zhang · Tiancai Wang · Xiangyu Zhang
Camouflaged Object Detection with Feature Decomposition and Edge Reconstruction
Chunming He · Kai Li · Yachao Zhang · Longxiang Tang · Yulun Zhang · Zhenhua Guo · Xiu Li
Source-free Adaptive Gaze Estimation with Uncertainty Reduction
Xin Cai · Jiabei Zeng · Shiguang Shan · Xilin CHEN
PyPose: A Library for Robot Learning with Physics-based Optimization
Chen Wang · Dasong Gao · Kuan Xu · Junyi Geng · Yaoyu Hu · Yuheng Qiu · Bowen Li · Fan Yang · Brady Moon · Abhinav Pandey · Aryan FNU · Jiahe Xu · Tianhao Wu · Haonan He · Daning Huang · Zhongqiang Ren · Shibo Zhao · Taimeng Fu · Pranay Reddy Anthireddy · Xiao Lin · Wenshan Wang · Jingnan Shi · Rajat Talak · Kun Cao · Yi Du · Han Wang · Huai Yu · Shanzhao Wang · Siyu Chen · Ananth Kashyap · Rohan Bandaru · Karthik Dantu · Jiajun Wu · Lihua Xie · Luca Carlone · Marco Hutter · Sebastian Scherer
Stimulus Verification is a Universal and Effective Sampler in Multi-modal Human Trajectory Prediction
Jianhua Sun · Yuxuan Li · Liang Chai · Cewu Lu
StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments
Sean Kulinski · Nicholas Waytowich · James Hare · David I. Inouye
ProphNet: Efficient Agent-Centric Motion Forecasting with Anchor-Informed Proposals
Xishun Wang · Tong Su · Fang Da · Xiaodong Yang
Think Twice before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving
Xiaosong Jia · Penghao Wu · Li Chen · Jiangwei Xie · Conghui He · Junchi Yan · Hongyang Li
HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining
SHIXIANG TANG · Cheng Chen · Meilin Chen · Qingsong Xie · Yizhou Wang · Yuanzheng Ci · LEI BAI · Feng Zhu · Haiyang Yang · Li Yi · Rui Zhao · Wanli Ouyang
BEV-Guided Multi-Modality Fusion for Driving Perception
Yunze Man · Liangyan Gui · Yu-Xiong Wang
Robust and Scalable Gaussian Process Regression and Its Applications
Yifan Lu · Jiayi Ma · Leyuan Fang · Xin Tian · Junjun Jiang
Tangentially Elongated Gaussian Belief Propagation for Event-based Incremental Optical Flow Estimation
Jun Nagata · Yusuke Sekikawa
Adaptive Annealing for Robust Geometric Estimation
Sidhartha Chitturi · Lalit Manam · Venu Madhav Govindu
Iterative Geometry Encoding Volume for Stereo Matching
Xu Gangwei · Xianqi Wang · Xiaohuan Ding · Xin Yang
PMatch: Paired Masked Image Modeling for Dense Geometric Matching
Shengjie Zhu · Xiaoming Liu
Adaptive Spot-Guided Transformer for Consistent Local Feature Matching
Jiahuan Yu · Jiahao Chang · Jianfeng He · Tianzhu Zhang · Jiyang Yu · Feng Wu
Learning Rotation-Equivariant Features for Visual Correspondence
Jongmin Lee · Byungjin Kim · Seungwook Kim · Minsu Cho
UTM: A Unified Multiple Object Tracking Model with Identity-Aware Feature Enhancement
Sisi You · Hantao Yao · Bing-Kun BAO · Changsheng Xu
Conjugate Product Graphs for Globally Optimal 2D-3D Shape Matching
Paul Rötzer · Zorah Laehner · Florian Bernard
LP-DIF: Learning Local Pattern-specific Deep Implicit Function for 3D Objects and Scenes
Meng Wang · Yushen Liu · Yue Gao · Kanle Shi · Yi Fang · Zhizhong Han
HGNet: Learning Hierarchical Geometry from Points, Edges, and Surfaces
Ting Yao · Yehao Li · Yingwei Pan · Tao Mei
Neural Intrinsic Embedding for Non-rigid Point Cloud Matching
puhua jiang · Mingze Sun · Ruqi Huang
PointClustering: Unsupervised Point Cloud Pre-training using Transformation Invariance in Clustering
Fuchen Long · Ting Yao · Zhaofan Qiu · Lusong Li · Tao Mei
Self-positioning Point-based Transformer for Point Cloud Understanding
Jinyoung Park · Sanghyeok Lee · Sihyeon Kim · Yunyang Xiong · Hyunwoo Kim
PointConvFormer: Revenge of the Point-Based Convolution
Wenxuan Wu · Li Fuxin · Qi Shan
Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders
Renrui Zhang · Liuhui Wang · Yu Qiao · Peng Gao · Hongsheng Li
Geometry and Uncertainty-Aware 3D Point Cloud Class-Incremental Semantic Segmentation
Yuwei Yang · Munawar Hayat · Zhao Jin · Chao Ren · Yinjie Lei
Learning Weather-General and Weather-Specific Features for Image Restoration Under Multiple Adverse Weather Conditions
Yurui Zhu · Tianyu Wang · Xueyang Fu · Xuanyu Yang · Xin Guo · Jifeng Dai · Yu Qiao · Xiaowei Hu
PartSLIP: Low-Shot Part Segmentation for 3D Point Clouds via Pretrained Image-Language Models
Minghua Liu · Yinhao Zhu · Hong Cai · Shizhong Han · Zhan Ling · Fatih Porikli · Hao Su
Semi-Weakly Supervised Object Kinematic Motion Prediction
Gengxin Liu · Qian Sun · Haibin Huang · Chongyang Ma · Yulan Guo · Li Yi · Hui Huang · Ruizhen Hu
Implicit Surface Contrastive Clustering for LiDAR Point Clouds
Zaiwei Zhang · Min Bai · Li Erran Li
LaserMix for Semi-Supervised LiDAR Semantic Segmentation
Lingdong Kong · Jiawei Ren · Liang Pan · Ziwei Liu
MSeg3D: Multi-modal 3D Semantic Segmentation for Autonomous Driving
Jiale Li · Hang Dai · Hao Han · Yong Ding
GraVoS: Voxel Selection for 3D Point-Cloud Detection
Oren Shrout · Yizhak Ben-Shabat · Ayellet Tal
VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking
Yukang Chen · Jianhui Liu · Xiangyu Zhang · XIAOJUAN QI · Jiaya Jia
Virtual Sparse Convolution for Multimodal 3D Object Detection
Hai Wu · Chenglu Wen · Shaoshuai Shi · Xin Li · Cheng Wang
MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection
Yang Jiao · ZEQUN JIE · Shaoxiang Chen · Jingjing Chen · Lin Ma · Yu-Gang Jiang
OrienterNet: Visual Localization in 2D Public Maps with Neural Matching
Paul-Edouard Sarlin · Daniel DeTone · Tsun-Yi Yang · Armen Avetisyan · Julian Straub · Tomasz Malisiewicz · Samuel Rota Bulò · Richard Newcombe · Peter Kontschieder · Vasileios Balntas
Uncertainty-aware Vision-based Metric Cross-view Geolocalization
Florian Fervers · Sebastian Bullinger · Christoph Bodensteiner · Michael Arens · Rainer Stiefelhagen
BEVHeight: A Robust Framework for Vision-based Roadside 3D Object Detection
Lei Yang · Kaicheng Yu · tao tang · Jun Li · Kun Yuan · Li Wang · Xinyu Zhang · Peng Chen
Understanding the Robustness of 3D Object Detection with Bird’s-Eye-View Representations in Autonomous Driving
Zijian Zhu · Yichi Zhang · Hai Chen · Yinpeng Dong · Shu Zhao · Wenbo Ding · Jiachen Zhong · Shibao Zheng
Object Detection with Self-Supervised Scene Adaptation
ZEKUN ZHANG · Minh Hoai
AeDet: Azimuth-invariant Multi-view 3D Object Detection
Chengjian Feng · ZEQUN JIE · Yujie Zhong · Xiangxiang Chu · Lin Ma
CAPE: Camera View Position Embedding for Multi-View 3D Object Detection
Kaixin Xiong · Shi Gong · Xiaoqing Ye · Xiao Tan · Ji Wan · Errui Ding · Jingdong Wang · Xiang Bai
VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud
Ziqin Wang · Bowen Cheng · Lichen Zhao · Dong Xu · Yang Tang · Lyu Sheng
Modality-invariant Visual Odometry for Embodied Vision
Marius Memmel · Roman Bachmann · Amir Zamir
Learning to Fuse Monocular and Multi-view Cues for Multi-frame Depth Estimation in Dynamic Scenes
Rui Li · Dong Gong · Wei Yin · Hao Chen · Yu Zhu · Kaixuan Wang · Xiaozhi Chen · Jinqiu Sun · Yanning Zhang
OmniVidar: Omnidirectional Depth Estimation from Multi-Fisheye Images
Sheng Xie · Daochuan Wang · Yun-Hui Liu
DINN360: Deformable Invertible Neural Networks for Latitude-aware 360° Image Rescaling
Yichen Guo · Mai Xu · Lai Jiang · Ning Li · Leon Sigal · Yunjin Chen
GeoMVSNet: Learning Multi-View Stereo with Geometry Perception
Zhe Zhang · Rui Peng · Yuxi Hu · Ronggang Wang
A Practical Stereo Depth System for Smart Glasses
Jialiang Wang · Daniel Scharstein · Akash Bapat · Kevin Blackburn-Matzen · Matthew Yu · Jonathan Lehman · Suhib Alsisan · Yanghan Wang · Sam Tsai · Jan-Michael Frahm · Zijian He · Peter Vajda · Michael Cohen · Matt Uyttendaele
DC²: Dual-Camera Defocus Control by Learning to Refocus
Hadi AlZayer · Abdullah Abuolaim · Leung Chun Chan · Yang Yang · Ying Lou · Jia-Bin Huang · Abhishek Kar
iDisc: Internal Discretization for Monocular Depth Estimation
Luigi Piccinelli · Christos Sakaridis · Fisher Yu
SfM-TTR: Using Structure from Motion for Test-Time Refinement of Single-View Depth Networks
Sergio Izquierdo · Javier Civera
Inverting the Imaging Process by Learning an Implicit Camera Model
Xin Huang · Qi Zhang · Ying Feng · Hongdong Li · Qing Wang
Learning to Measure the Point Cloud Reconstruction Loss in a Representation Space
Tianxin Huang · Zhonggan Ding · Jiangning Zhang · Ying Tai · Zhenyu Zhang · Mingang Chen · Chengjie Wang · Yong Liu
Better “CMOS” Produces Clearer Images: Learning Space-Variant Blur Estimation for Blind Image Super-Resolution
Xuhai Chen · Jiangning Zhang · Chao Xu · Yabiao Wang · Chengjie Wang · Yong Liu
Delivering Arbitrary-Modal Semantic Segmentation
Jiaming Zhang · Ruiping Liu · Hao Shi · Kailun Yang · Simon Reiß · Haodong Fu · Kunyu Peng · Kaiwei Wang · Rainer Stiefelhagen
Efficient Hierarchical Entropy Model for Learned Point Cloud Compression
Rui Song · Chunyang Fu · Shan Liu · Ge Li
Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring
Ruyang Liu · Jingjia Huang · Ge Li · Jiashi Feng · Xinglong Wu · Thomas Li
Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
Feng Liang · Bichen Wu · Xiaoliang Dai · Kunpeng Li · Yinan Zhao · Hang Zhang · Peizhao Zhang · Peter Vajda · Diana Marculescu
Imagic: Text-Based Real Image Editing with Diffusion Models
Bahjat Kawar · Shiran Zada · Oran Lang · Omer Tov · Huiwen Chang · Tali Dekel · Inbar Mosseri · michal Irani
Neumann Network with Recursive Kernels for Single Image Defocus Deblurring
Yuhui Quan · Zicong Wu · Hui Ji
Transfer4D: A framework for frugal motion capture and deformation transfer
Shubh Maheshwari · Rahul Narain · Ramya Hebbalaguppe
Iterative Proposal Refinement for Weakly-Supervised Video Grounding
Meng Cao · Fangyun Wei · Can Xu · Xiubo Geng · Long Chen · Can Zhang · Yuexian Zou · Tao Shen · Daxin Jiang
X³KD: Knowledge Distillation Across Modalities, Tasks and Stages for Multi-Camera 3D Object Detection
Marvin Klingner · Shubhankar Borse · Varun Ravi Kumar · Behnaz Rezaei · Venkatraman Narayanan · Senthil Yogamani · Fatih Porikli
AnyFlow: Arbitrary Scale Optical Flow with Implicit Neural Representation
Hyunyoung Jung · Zhuo Hui · Lei Luo · Haitao Yang · Feng Liu · Sungjoo Yoo · Rakesh Ranjan · Denis Demandolx
IterativePFN: True Iterative Point Cloud Filtering
Dasith de Silva Edirimuni · Xuequan Lu · Zhiwen Shao · Gang Li · Antonio Robles-Kelly · Ying He
Fake it till you make it: Learning transferable representations from synthetic ImageNet clones
Mert Bulent Sariyildiz · Karteek Alahari · Diane Larlus · Yannis Kalantidis
Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness
Zhijie Shen · Zishuo Zheng · Chunyu Lin · Lang Nie · Kang Liao · Shuai Zheng · Yao Zhao
Exploring Incompatible Knowledge Transfer in Few-shot Image Generation
Yunqing Zhao · Chao Du · Milad Abdollahzadeh · Tianyu Pang · Min Lin · Shuicheng YAN · Ngai-man Cheung
OmniObject3D: Large Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation
Tong Wu · Jiarui Zhang · Xiao Fu · Yuxin WANG · Jiawei Ren · Liang Pan · Wenyan Wu · Lei Yang · Jiaqi Wang · Chen Qian · Dahua Lin · Ziwei Liu
CelebV-Text: A Large-Scale Facial Text-Video Dataset
Jianhui Yu · Hao Zhu · Liming Jiang · CHEN CHANGE LOY · Weidong Cai · Wenyan Wu
TensoIR: Tensorial Inverse Rendering
Haian Jin · Isabella Liu · Peijia Xu · Xiaoshuai Zhang · Songfang Han · Sai Bi · Xiaowei Zhou · Zexiang Xu · Hao Su
Simultaneously Short- and Long-Term Temporal Modeling for Semi-Supervised Video Semantic Segmentation
Jiangwei Lao · Weixiang Hong · Xin Guo · Yingying Zhang · Wang Jian · Jingdong Chen · Wei Chu
Integral Neural Networks
Kirill Solodskikh · Azim Kurbanov · Ruslan Aydarkhanov · Irina Zhelavskaya · Yury Parfenov · Dehua Song · Stamatios Lefkimmiatis
FEND: A Future Enhanced Distribution-Aware Contrastive Learning Framework For Long-tail Trajectory Prediction
Yuning Wang · Pu Zhang · LEI BAI · Jianru Xue
NeuralEditor: Editing Neural Radiance Fields via Manipulating Point Clouds
Junkun Chen · Jipeng Lyu · Yu-Xiong Wang
3D Line Mapping Revisited
Shaohui Liu · Yifan Yu · Rémi Pautrat · Marc Pollefeys · Viktor Larsson
Single View Scene Scale Estimation using Scale Field
Byeong-Uk Lee · Jianming Zhang · Yannick Hold-Geoffroy · In So Kweon
PlaneDepth: Self-supervised Depth Estimation via Orthogonal Planes
Ruoyu Wang · Zehao Yu · Shenghua Gao
Self-supervised Super-plane for Neural 3D Reconstruction
Botao Ye · Sifei Liu · Xueting Li · Ming-Hsuan Yang
NeurOCS: Neural NOCS Supervision for Monocular 3D Object Localization
Zhixiang Min · Bingbing Zhuang · Samuel Schulter · Buyu Liu · Enrique Dunn · Manmohan Chandraker
Multi-sensor large-scale dataset for multi-view 3D reconstruction
Oleg Voynov · Gleb Bobrovskikh · Pavel Karpyshev · Saveliy Galochkin · Andrei-Timotei Ardelean · Arseniy Bozhenko · Ekaterina Karmanova · Pavel Kopanev · Yaroslav Labutin-Rymsho · Ruslan Rakhimov · Aleksandr Safin · Valerii Serpiva · Alexey Artemov · Evgeny Burnaev · Dzmitry Tsetserukou · Denis Zorin
AutoRecon: Automated 3D Object Discovery and Reconstruction
Yuang Wang · Xingyi He · Sida Peng · Haotong Lin · Hujun Bao · Xiaowei Zhou
A Large-Scale Homography Benchmark
Daniel Barath · Dmytro Mishkin · Michal Polic · Wolfgang Förstner · Jiri Matas
SparsePose: Sparse-View Camera Pose Regression and Refinement
Samarth Sinha · Jason Zhang · Andrea Tagliasacchi · Igor Gilitschenski · David Lindell
Few-shot Geometry-Aware Keypoint Localization
Xingzhe He · Gaurav Bharaj · David Ferman · Helge Rhodin · Pablo Garrido
Self-Supervised Representation Learning for CAD
Benjamin Jones · Michael Hu · Milin Kodnongbua · Vladimir Kim · Adriana Schulz
IMP: Iterative Matching and Pose Estimation with Adaptive Pooling
Fei XUE · Ignas Budvytis · Roberto Cipolla
SMOC-Net: Leveraging Camera Pose for Self-Supervised Monocular Object Pose Estimation
Tao Tan · Qiulei Dong
Markerless Camera-to-Robot Pose Estimation via Self-supervised Sim-to-Real Transfer
Jingpei Lu · Florian Richter · Michael Yip
TTA-COPE: Test-Time Adaptation for Category-Level Object Pose Estimation
Taeyeop Lee · Jonathan Tremblay · Valts Blukis · Bowen Wen · Byeong-Uk Lee · Inkyu Shin · Stan Birchfield · In So Kweon · Kuk-Jin YOON
3D-POP - An automated annotation approach to facilitate markerless 2D-3D tracking of freely moving birds with marker-based motion capture
Hemal Naik · Hoi Hang Chan · Junran Yang · Mathilde Delacoux · Iain Couzin · Fumihiro Kano · Máté Nagy
Delving into Discrete Normalizing Flows on SO(3) Manifold for Probabilistic Rotation Modeling
Yulin Liu · Haoran Liu · Yingda Yin · Yang Wang · Baoquan Chen · He Wang
PSVT: End-to-End Multi-person 3D Pose and Shape Estimation with Progressive Video Transformers
Zhongwei Qiu · Yang Qiansheng · Jian Wang · Haocheng Feng · Junyu Han · Errui Ding · Chang Xu · Dongmei Fu · Jingdong Wang
Hierarchical Temporal Transformer for 3D Hand Pose Estimation and Action Recognition from Egocentric RGB Videos
Yilin Wen · Hao Pan · Lei Yang · Jia Pan · Taku Komura · Wenping Wang
GarmentTracking: Category-Level Garment Pose Tracking
Han Xue · Wenqiang Xu · Jieyi Zhang · Tutian Tang · Yutong Li · Wenxin Du · Ruolin Ye · Cewu Lu
Towards Transferable Targeted Adversarial Examples
Zhibo Wang · Hongshan Yang · Yunhe Feng · Peng Sun · Hengchang Guo · Zhifei Zhang · Kui Ren
Proximal Splitting Adversarial Attack for Semantic Segmentation
Jérôme Rony · Jean-Christophe Pesquet · Ismail Ayed
T-SEA: Transfer-based Self-Ensemble Attack on Object Detection
Hao Huang · Ziyan Chen · Huanran Chen · Yongtao Wang · Kevin Zhang
Reinforcement Learning-Based Black-Box Model Inversion Attacks
Gyojin Han · Jaehyun Choi · Haeil Lee · Junmo Kim
Progressive Backdoor Erasing via connecting Backdoor and Adversarial Attacks
Bingxu Mu · Zhenxing Niu · Le Wang · xue wang · Qiguang Miao · Rong Jin · Gang Hua
MEDIC: Remove Model Backdoors via Importance Driven Cloning
Qiuling Xu · Guanhong Tao · Jean Honorio · Yingqi Liu · Shengwei An · Guangyu Shen · Siyuan Cheng · Xiangyu Zhang
Model Barrier: A Compact Un-Transferable Isolation Domain for Model Intellectual Property Protection
Lianyu Wang · Meng Wang · Daoqiang Zhang · Huazhu Fu
Adversarially Masking Synthetic to Mimic Real: Adaptive Noise Injection for Point Cloud Segmentation Adaptation
Guangrui Li · Guoliang Kang · Xiaohan Wang · Yunchao Wei · Yi Yang
Instance-Aware Domain Generalization for Face Anti-Spoofing
Qianyu Zhou · Ke-Yue Zhang · Taiping Yao · Xuequan Lu · Ran Yi · Shouhong Ding · Lizhuang Ma
Bias-Eliminating Augmentation Learning for Debiased Federated Learning
Yuan-Yi Xu · Ci-Siang Lin · Yu-Chiang Frank Wang
Adaptive Channel Sparsity for Federated Learning under System Heterogeneity
Dongping Liao · Xitong Gao · Yiren Zhao · Cheng-zhong Xu
Reliable and Interpretable Personalized Federated Learning
Zixuan Qin · Liu Yang · Qilong Wang · Yahong Han · Qinghua Hu
DaFKD: Domain-aware Federated Knowledge Distillation
Haozhao Wang · Yichen Li · Wenchao Xu · Ruixuan Li · Yufeng Zhan · Zhigang Zeng
SimpleNet: A Simple Network for Image Anomaly Detection and Localization
Zhikang Liu · Yiming Zhou · Yuansheng Xu · Zilei Wang
A New Comprehensive Benchmark for Semi-supervised Video Anomaly Detection and Anticipation
Congqi Cao · Yue Lu · PENG WANG · Yanning Zhang
Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers
Bin Ren · Yahui Liu · Yue Song · Wei Bi · Rita Cucchiara · Nicu Sebe · Wei Wang
ImageNet-E: Benchmarking Neural Network Robustness against Attribute Editing
Xiaodan Li · YUEFENG CHEN · Yao Zhu · Shuhui Wang · Rong Zhang · Hui Xue
Private Image Generation with Dual-Purpose Auxiliary Classifier
Chen Chen · Daochang Liu · Siqi Ma · Surya Nepal · Chang Xu
Discriminator-Cooperated Feature Map Distillation for GAN Compression
Tie Hu · Mingbao Lin · Lizhou You · Fei Chao · Rongrong Ji
TeSLA: Test-Time Self-Learning With Automatic Adversarial Augmentation
DEVAVRAT TOMAR · Guillaume Vray · Behzad Bozorgtabar · Jean-Philippe Thiran
Practical Network Acceleration with Tiny Sets
Guo-Hua Wang · Jianxin Wu
NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers
Yijiang Liu · Huanrui Yang · ZHEN DONG · Kurt Keutzer · Li Du · Shanghang Zhang
Bias Mimicking: A Simple Sampling Approach for Bias Mitigation
Maan Qraitem · Kate Saenko · Bryan Plummer
Masked Images Are Counterfactual Samples for Robust Fine-tuning
Yao Xiao · Ziyi Tang · Pengxu Wei · Cong Liu · Liang Lin
Samples with Low Loss Curvature Improve Data Efficiency
Isha Garg · Kaushik Roy
Defining and Quantifying the Emergence of Sparse Concepts in DNNs
Jie Ren · Mingjie Li · Qirui Chen · Huiqi Deng · Quanshi Zhang
Network Expansion For Practical Training Acceleration
Ning Ding · Yehui Tang · Kai Han · Chao Xu · Yunhe Wang
AstroNet: When Astrocyte Meets Artificial Neural Network
Mengqiao Han · Liyuan Pan · Xiabi Liu
Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization
Xingxuan Zhang · Renzhe Xu · Han Yu · Hao Zou · Peng Cui
Re-basin via implicit Sinkhorn differentiation
Fidel A Guerrero Pena · Heitor Medeiros · Thomas Dubail · Masih Aminbeidokhti · Eric Granger · Marco Pedersoli
Tunable Convolutions with Parametric Multi-Loss Optimization
Matteo Maggioni · Thomas Tanay · Francesca Babiloni · Steven McDonagh · Ales Leonardis
Frustratingly Easy Regularization on Representation Can Boost Deep Reinforcement Learning
Xinwen Hou · Huangyuan Su · Jieyu Zhang · Xinwen Hou
Simulated Annealing in Early Layers Leads to Better Generalization
Amirmohammad Sarfi · Zahra Karimpour · Muawiz Chaudhary · Nasir Khalid · Mirco Ravanelli · Sudhir Mudur · Eugene Belilovsky
On the Stability-Plasticity Dilemma of Class-Incremental Learning
Dongwan Kim · Bohyung Han
Decoupling Learning and Remembering: a Bilevel Memory Framework with Knowledge Projection for Task-Incremental Learning
Wenju Sun · Qingyong Li · Jing Zhang · Wen Wang · Yangliao Geng
Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation
Tianli Zhang · Mengqi Xue · Jiangtao Zhang · Haofei Zhang · Yu Wang · Lechao Cheng · Jie Song · Mingli Song
Regularizing Second-Order Influences for Continual Learning
Zhicheng Sun · Yadong MU · Gang Hua
Rethinking Feature-based Knowledge Distillation for Face Recognition
Jingzhi Li · Zidong Guo · Hui Li · Seungju Han · Ji-won Baek · Min Yang · Ran Yang · Sungjoo Suh
ERM-KTP: Knowledge-level Machine Unlearning via Knowledge Transfer
Shen Lin · Xiaoyu Zhang · Chenyang Chen · Xiaofeng Chen · Willy Susilo
Partial Network Cloning
Jingwen Ye · Songhua Liu · Xinchao Wang
Rebalancing Batch Normalization for Exemplar-based Class-Incremental Learning
Sungmin Cha · Sungjun Cho · Dasol Hwang · Sunwon Hong · Moontae Lee · Taesup Moon
1% VS 100%: Parameter-Efficient Low Rank Adapter for Dense Predictions
Dongshuo Yin · Yiran Yang · Zhechao Wang · Hongfeng Yu · kaiwen wei · Xian Sun
MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models
Dohwan Ko · Joonmyung Choi · Hyeong Kyu Choi · Kyoung-Woon On · Byungseok Roh · Hyunwoo Kim
MDL-NAS: A Joint Multi-domain Learning framework for Vision Transformer
Shiguang Wang · TAO XIE · Jian Cheng · Xingcheng ZHANG · Haijun Liu
Independent Component Alignment for Multi-Task Learning
Dmitry Senushkin · Nikolay Patakin · Arsenii Kuznetsov · Anton Konushin
Revisiting Prototypical Network for Cross Domain Few-Shot Learning
Fei Zhou · Peng Wang · Lei Zhang · Wei Wei · Yanning Zhang
Feature Alignment and Uniformity for Test Time Adaptation
Shuai Wang · Daoan Zhang · Zipei YAN · Jianguo Zhang · Rui Li
MMANet: Margin-aware Distillation and Modality-aware Regularization for Incomplete Multimodal Learning
shicai wei · Chunbo Luo · Yang Luo
PMR: Prototypical Modal Rebalance for Multimodal Learning
Yunfeng FAN · Wenchao Xu · Haozhao Wang · Junxiao Wang · Song Guo
Upcycling Models under Domain and Category Shift
Sanqing Qu · Tianpei Zou · Florian Röhrbein · Cewu Lu · Guang Chen · Dacheng Tao · changjun jiang
MHPL: Minimum Happy Points Learning for Active Source Free Domain Adaptation
Fan Wang · Zhongyi Han · Zhiyan Zhang · Rundong He · Yilong Yin
COT: Unsupervised Domain Adaptation with Clustering and Optimal Transport
Yang Liu · Zhipeng Zhou · Baigui Sun
FREDOM: Fairness Domain Adaptation Approach to Semantic Scene Understanding
Thanh-Dat Truong · Ngan Le · Bhiksha Raj · Jackson Cothren · Khoa Luu
Transfer Knowledge from Head to Tail: Uncertainty Calibration under Long-tailed Distribution
Jiahao Chen · Bing Su
Balanced Product of Calibrated Experts for Long-Tailed Recognition
Emanuel Sanchez Aimar · Arvi Jonnarth · Michael Felsberg · Marco Kuhlmann
Why is the winner the best?
Matthias Eisenmann · Annika Reinke · Vivienn Weru · Minu Tizabi · Fabian Isensee · Tim Adler · Sharib Ali · Vincent Andrearczyk · Marc Aubreville · Ujjwal Baid · Spyridon Bakas · Niranjan Balu · Sophia Bano · Jorge Bernal · Sebastian Bodenstedt · Alessandro Casella · Veronika Cheplygina · Marie Daum · Marleen de Bruijne · Adrien Depeursinge · Reuben Dorent · Jan Egger · David Ellis · Sandy Engelhardt · Melanie Ganz · Noha Ghatwary · Gabriel Girard · Patrick Godau · Anubha Gupta · Lasse Hansen · Kanako Harada · Mattias Heinrich · Nicholas Heller · Alessa Hering · Arnaud Huaulmé · Pierre Jannin · Ali Emre Kavur · Oldřich Kodym · Michal Kozubek · Jianning Li · Hongwei Li · Jun Ma · Carlos Isla · bjoern menze · Alison Noble · Valentin Oreiller · Nicolas Padoy · Sarthak Pati · Kelly Payette · Tim Rädsch · Jonathan Rafael-Patino · Vivek Bawa · Stefanie Speidel · Carole Sudre · Kimberlin van Wijnen · Martin Wagner · Donglai Wei · Amine Yamlahi · Moi Hoon Yap · Chun Yuan · Maximilian Zenk · Aneeq Zia · David Zimmerer · Dogu Baran Aydogan · Binod Bhattarai · Louise Bloch · Raphael Brüngel · Jihoon Cho · Chanyeol Choi · DOU QI · Ivan Ezhov · Christoph M. Friedrich · Clifton Fuller · Rebati Gaire · Adrian Galdran · Álvaro García Faura · Maria Grammatikopoulou · SeulGi Hong · Mostafa Jahanifar · Ikbeom Jang · Abdolrahim Kadkhodamohammadi · Inha Kang · Florian Kofler · Satoshi Kondo · Hugo Kuijf · Mingxing Li · Huan Luu · Tomaž Martinčič · Pedro Morais · Mohamed Naser · Bruno Oliveira · David Owen · Subeen Pang · Jinah Park · Sung-Hong Park · Szymon Plotka · Elodie Puybareau · Nasir Rajpoot · Kanghyun Ryu · Numan Saeed · Adam Shephard · Pengcheng Shi · Dejan Štepec · Ronast Subedi · Guillaume Tochon · Helena Torres · Helene Urien · João Vilaça · Kareem Wahid · haojie wang · jiacheng wang · Liansheng Wang · Xiyue Wang · Benedikt Wiestler · Marek Wodzinski · Fangfang Xia · Juanying Xie · Zhiwei Xiong · Sen Yang · Yanwu Yang · Zixuan Zhao · Klaus Maier-Hein · Paul Jaeger · Annette Kopp-Schneider · Lena Maier-hein
SuperDisco: Super-Class Discovery Improves Visual Recognition for the Long-Tail
Yingjun Du · Jiayi Shen · Xiantong Zhen · Cees Snoek
Learning from Noisy Labels with Decoupled Meta Label Purifier
Yuanpeng Tu · Boshen Zhang · Yuxi Li · Liang Liu · Jian Li · Yabiao Wang · Chengjie Wang · Cai Zhao
Class Prototypes based Contrastive Learning for Classifying Multi-Label and Fine-Grained Educational Videos
Rohit Gupta · Anirban Roy · Sujeong Kim · Claire Christensen · Todd Grindal · Sarah Gerard · Madeline Cincebeaux · Ajay Divakaran · Mubarak Shah
MaskCon: Masked Contrastive Learning for Coarse-Labelled Dataset
Chen Feng · Ioannis Patras
HIER: Metric Learning Beyond Class Labels via Hierarchical Regularization
Sungyeon Kim · Boseung Jeong · Suha Kwak
Bi-directional Distribution Alignment for Transductive Zero Shot Learning
Zhicai Wang · YANBIN HAO · Tingting Mu · Ouxiang Li · Shuo Wang · Xiangnan He
BiCro: Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency
Shuo Yang · xu Pan · Kai Wang · Yang You · Hongxun Yao · Tongliang Liu · Min Xu
Exploring and Exploiting Uncertainty for Incomplete Multi-View Classification
Mengyao Xie · Zongbo Han · Changqing Zhang · Yichen Bai · Qinghua Hu
GCFAgg: Global and Cross-view Feature Aggregation for Multi-view Clustering
Weiqing Yan · Yuanyang Zhang · Chenlei Lv · Chang Tang · Guanghui Yue · Liang Liao · Weisi Lin
LINe: Out-of-Distribution Detection by Leveraging Important Neurons
Yong Hyun Ahn · Gyeong-Moon Park · Seong Tae Kim
Visual prompt tuning for generative transfer learning
Kihyuk Sohn · Huiwen Chang · Jose Lezama · Luisa Polania Cabrera · Han Zhang · Yuan Hao · Irfan Essa · Lu Jiang
Interventional Bag Multi-Instance Learning On Whole-Slide Pathological Images
Tiancheng Lin · Yu Zhimiao · Hongyu Hu · Yi Xu · Chang-Wen Chen
Image Quality-aware Diagnosis via Meta-knowledge Co-embedding
Haoxuan Che · Siyu Chen · Hao Chen
KiUT: Knowledge-injected U-Transformer for Radiology Report Generation
Zhongzhen Huang · Xiaofan Zhang · Shaoting Zhang
Hierarchical discriminative learning improves visual representations of biomedical microscopy
Cheng Jiang · Xinhai Hou · Akhil Kondepudi · Asadur Chowdury · Christian Freudiger · Daniel Orringer · Honglak Lee · Todd Hollon
Pseudo-label Guided Contrastive Learning for Semi-supervised Medical Image Segmentation
Hritam Basak · Zhaozheng Yin
FFF: Fragment-Guided Flexible Fitting for Building Complete Protein Structures
Weijie Chen · Xinyan Wang · Yuhang Wang
Visual Language Pretrained Multiple Instance Zero-Shot Transfer for Histopathology Images
Ming Y. Lu · Bowen Chen · Andrew Zhang · Drew Williamson · Richard Chen · Tong Ding · Long Le · Yung-Sung Chuang · Faisal Mahmood
ProD: Prompting-to-disentangle Domain Knowledge for Cross-domain Few-shot Image Classification
Tianyi Ma · Yifan Sun · Zongxin Yang · Yi Yang
Open-Set Representation Learning through Combinatorial Embedding
Geeho Kim · Junoh Kang · Bohyung Han
Multiclass Confidence and Localization Calibration for Object Detection
Bimsara Pathiraja · Malitha Gunawardhana · Muhammad Khan Khan
Distilling Scale-Aware Knowledge in Small Object Detector
Yichen Zhu · Qiqi Zhou · Ning Liu · Zhiyuan Xu · Zhicai Ou · mou xiaofeng · Jian Tang
Generating Features with Increased Crop-related Diversity for Few-Shot Object Detection
Jingyi Xu · Hieu Le · Dimitris Samaras
DETRs with Hybrid Matching
Ding Jia · Yuhui Yuan · Haodi He · Xiaopei Wu · Haojun Yu · Weihong Lin · Lei Sun · Chao Zhang · Han Hu
Adaptive Sparse Pairwise Loss for Object Re-Identification
Xiao Zhou · Yujie Zhong · Zhen Cheng · Fan Liang · Lin Ma
CAT: LoCalization and IdentificAtion Cascade Detection Transformer for Open-World Object Detection
Shuailei Ma · Yuefeng Wang · Ying Wei · Jiaqi Fan · Thomas Li · Hongli Liu · fanbing Lv
Weak-shot Object Detection through Mutual Knowledge Transfer
Xuanyi Du · Weitao Wan · Chong Sun · Chen Li
Modeling the Distributional Uncertainty for Salient Object Detection Models
Jing Zhang · Mochu Xiang · Yuchao Dai · Xinyu Tian
Supervised Masked Knowledge Distillation for Few-Shot Transformers
Han Lin · Guangxing Han · Jiawei Ma · Shiyuan Huang · Xudong Lin · Shih-Fu Chang
Co-Salient Object Detection with Uncertainty-aware Group Exchange-Masking
Yang Wu · Huihui Song · Bo Liu · Kaihua Zhang · Dong Liu
Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification & Segmentation
Dahyun Kang · Piotr Koniusz · Minsu Cho · Naila Murray
DualRel: Semi-Supervised Mitochondria Segmentation from A Prototype Perspective
Huayu Mai · Rui Sun · Tianzhu Zhang · Zhiwei Xiong · Feng Wu
WinCLIP: Zero-/Few-Shot Anomaly Classification and Segmentation
Jongheon Jeong · Yang Zou · Taewan Kim · DongQing Zhang · Avinash Ravichandran · Onkar Dabeer
Learning Multi-Modal Class-Specific Tokens for Weakly Supervised Dense Object Localization
Lian Xu · Wanli Ouyang · Mohammed Bennamoun · Farid Boussaid · Dan Xu
Conflict-Based Cross-View Consistency for Semi-Supervised Semantic Segmentation
Zicheng Wang · Zhen Zhao · Xiaoxia Xing · Dong Xu · Xiangyu Kong · Luping Zhou
Boundary-enhanced Co-training for Weakly Supervised Semantic Segmentation
Shenghai Rong · Bohai Tu · Zilei Wang · Junjie Li
Balancing Logit Variation for Long-tailed Semantic Segmentation
Yuchao Wang · Jingjing Fei · Haochen Wang · Wei Li · Tianpeng Bao · Liwei Wu · Rui Zhao · Yujun Shen
Leveraging Hidden Positives for Unsupervised Semantic Segmentation
Hyun Seok Seong · WonJun Moon · Su Been Lee · Jae-Pil Heo
PIDNet: A Real-time Semantic Segmentation Network Inspired by PID Controllers
Jiacong Xu · Zixiang Xiong · Shankar P Bhattacharyya
AttentionShift: Iteratively Estimated Part-based Attention Map for Pointly Supervised Instance Segmentation
Mingxiang Liao · Zonghao Guo · Yuze Wang · Peng Yuan · bailan feng · Fang Wan
Principles of Forgetting in Domain-Incremental Semantic Segmentation in Adverse Weather Conditions
Tobias Kalb · Jürgen Beyerer
Semantic-Promoted Debiasing and Background Disambiguation for Zero-Shot Instance Segmentation
SHUTING HE · Henghui Ding · Wei Jiang
Interactive Segmentation as Gaussian Process Classification
Minghao Zhou · Hong Wang · Qian Zhao · Yuexiang Li · Yawen Huang · Deyu Meng · Yefeng Zheng
Meta Compositional Referring Expression Segmentation
Li Xu · Mark Huang · Xindi Shang · Zehuan Yuan · Ying Sun · Jun Liu
DejaVu: Conditional Regenerative Learning to Enhance Dense Prediction
Shubhankar Borse · Debasmit Das · Hyojin Park · Hong Cai · Risheek Garrepalli · Fatih Porikli
Zero-shot Referring Image Segmentation with Global-Local Context Features
seonghoon yu · Paul Hongsuck Seo · Jeany Son
FreeSeg: Unified, Universal and Open-Vocabulary Image Segmentation
Jie Qin · Jie Wu · Pengxiang Yan · Ming Li · Yuxi Ren · Xuefeng Xiao · Yitong Wang · Rui Wang · Shilei Wen · Xin Pan · Xingang Wang
Semantic Human Parsing via Scalable Semantic Transfer over Multiple Label Domains
Jie Yang · Chaoqun Wang · Zhen Li · Junle Wang · Ruimao Zhang
Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning
Jishnu Mukhoti · Tsung-Yu Lin · Omid Poursaeed · Rui Wang · Ashish Shah · Philip Torr · Ser-Nam Lim
Neural Congealing: Aligning Images to a Joint Semantic Atlas
Dolev Ofri-Amar · Michal Geyer · Yoni Kasten · Tali Dekel
Open-Category Human-Object Interaction Pre-training via Language Modeling Framework
Sipeng Zheng · Boshen Xu · Qin Jin
Open-set Fine-grained Retrieval via Prompting Vision-Language Evaluator
Shijie Wang · Jianlong Chang · Haojie Li · Zhihui Wang · Wanli Ouyang · Qi Tian
R2Former: Unified Retrieval and Reranking Transformer for Place Recognition
Sijie Zhu · Linjie Yang · Chen Chen · Mubarak Shah · Xiaohui Shen · Heng Wang
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Yuxin Fang · Wen Wang · Binhui Xie · Quan Sun · Ledell Wu · Xinggang Wang · Tiejun Huang · Xinlong Wang · Yue Cao
DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
Maoyuan Ye · Jing Zhang · Shanshan Zhao · Juhua Liu · Tongliang Liu · Bo Du · Dacheng Tao
Finetune like you pretrain: Improved finetuning of zero-shot vision models
Sachin Goyal · Ananya Kumar · Sankalp Garg · J Kolter · Aditi Raghunathan
Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with Multimodal Models
Zhiqiu Lin · Samuel Yu · Zhiyi Kuang · Deepak Pathak · Deva Ramanan
DATE: Domain Adaptive Product Seeker for E-commerce
Haoyuan Li · Hao Jiang · Tao Jin · Mengyan Li · Yan Chen · Zhijie Lin · Yang Zhao · Zhou Zhao
Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval
Kuniaki Saito · Kihyuk Sohn · Xiang Zhang · Chun-Liang Li · Chen-Yu Lee · Kate Saenko · Tomas Pfister
Text-guided Unsupervised Latent Transformations for Multi-attribute Image Manipulation
Xiwen Wei · Zhen Xu · Cheng Liu · Si Wu · Zhiwen Yu · Hau-San Wong
Fine-grained Image-text Matching by Cross-modal Hard Aligning Network
pan zhengxin · Fangyu Wu · Bailing Zhang
RA-CLIP: Retrieval Augmented Contrastive Language-Image Pre-training
Chen-Wei Xie · Siyang Sun · Xiong Xiong · Yun Zheng · Deli Zhao · Jingren Zhou
Unifying Vision, Language, Layout and Tasks for Universal Document Processing
Zineng Tang · Ziyi Yang · Guoxin Wang · Yuwei Fang · Yang Liu · Chenguang Zhu · Michael Zeng · Cha Zhang · Mohit Bansal
MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID
Jianyang Gu · Kai Wang · Hao Luo · Chen Chen · Wei Jiang · Yuqiang Fang · Shanghang Zhang · Yang You · Jian ZHAO
EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding
Yanmin Wu · Xinhua Cheng · Renrui Zhang · Zesen Cheng · Jian Zhang
L-CoIns: Language-based Colorization with Instance Awareness
Zheng Chang · Shuchen Weng · Peixuan Zhang · Yu Li · Si Li · Boxin Shi
Learning Visual Representations via Language-Guided Sampling
Mohamed Samir Mahmoud Hussein Elbanani · Karan Desai · Justin Johnson
Shepherding Slots to Objects: Towards Stable and Robust Object-Centric Learning
Jinwoo Kim · Janghyuk Choi · Ho-Jin Choi · Seon Joo Kim
Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification
Yue Yang · Artemis Panagopoulou · Shenghao Zhou · Daniel Jin · Chris Callison-Burch · Mark Yatskar
Image as a Foreign Language: BEiT Pretraining for Vision and Vision-Language Tasks
Wenhui Wang · Hangbo Bao · Li Dong · Johan Bjorck · Zhiliang Peng · Qiang Liu · Kriti Aggarwal · Owais Khan Mohammed · Saksham Singhal · Subhojit Som · Furu Wei
Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations
Ziyan Yang · Kushal Kafle · Franck Dernoncourt · Vicente Ordonez
Leveraging per Image-Token Consistency for Vision-Language Pre-training
Yunhao GOU · Tom Ko · Hansi Yang · James Kwok · Yu Zhang · Mingxuan Wang
RefTeacher: A Strong Baseline for Semi-Supervised Referring Expression Comprehension
Jiamu Sun · Gen Luo · Yiyi Zhou · Xiaoshuai Sun · GUANNAN JIANG · Zhiyu Wang · Rongrong Ji
Understanding and Improving Visual Prompting: A Label-Mapping Perspective
Aochuan Chen · Yuguang Yao · Pin-Yu Chen · Yihua Zhang · Sijia Liu
Meta-Personalizing Vision-Language Models to Find Named Instances in Video
Chun-Hsiao Yeh · Bryan Russell · Josef Sivic · Fabian Caba · Simon Jenni
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak · Hanoona Bangalath · Muhammad Maaz · Salman Khan · Fahad Khan
VQACL: A Novel Visual Question Answering Continual Learning Setting
Xi Zhang · Feifei Zhang · Changsheng Xu
Exploring the Effect of Primitives for Compositional Generalization in Vision-and-Language
Chuanhao Li · Zhen Li · Chenchen Jing · Yunde Jia · Yuwei Wu
Unicode Analogies: An Anti-Objectivist Visual Reasoning Challenge
Steven Spratley · Krista A. Ehinger · Tim Miller
Token Turing Machines
Michael Ryoo · Keerthana Gopalakrishnan · Kumara Kahatapitiya · Ted Xiao · Kanishka Rao · Austin Stone · Yao Lu · Julian Ibarz · Anurag Arnab
Policy Adaptation from Foundation Model Feedback
Yuying Ge · Annabella Macaluso · Li Erran Li · Ping Luo · Xiaolong Wang
LANA: A Language-Capable Navigator for Instruction Following and Generation
Xiaohan Wang · Wenguan Wang · Jiayi shao · Yi Yang
LEGO-Net: Learning Regular Rearrangements of Objects in Rooms
Qiuhong Anna Wei · Sijie Ding · Jeong Joon Park · Rahul Sajnani · Adrien Poulenard · Srinath Sridhar · Leonidas Guibas
Discovering the Real Association: Multimodal Causal Reasoning in Video Question Answering
Chuanqi Zang · Hanqing Wang · Mingtao Pei · Wei Liang
CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning
Yiting Cheng · Fangyun Wei · Jianmin Bao · Dong Chen · Wenqiang Zhang
Context De-confounded Emotion Recognition
Dingkang Yang · Zhaoyu Chen · Yuzheng Wang · Shunli Wang · Mingcheng Li · Liu Siao · Xiao Zhao · Shuai Huang · Zhiyan Dong · Peng Zhai · Lihua Zhang
Learning Emotion Representations from Verbal and Nonverbal Communication
Sitao Zhang · Yimu Pan · James Wang
CLIPPING: Distilling CLIP-Based Models with a Student Base for Video-Language Retrieval
Renjing Pei · Jianzhuang Liu · Weimian Li · Bin Shao · Songcen Xu · Peng Dai · Juwei Lu · Youliang Yan
Dual Alignment Unsupervised Domain Adaptation for Video-Text Retrieval
Xiaoshuai Hao · Wanqian Zhang · Dayan Wu · Fei Zhu · Bo Li
StepFormer: Self-supervised Step Discovery and Localization in Instructional Videos
Nikita Dvornik · Isma Hadji · Ran Zhang · Konstantinos Derpanis · Rick Wildes · Allan Jepson
Text with Knowledge Graph Augmented Transformer for Video Captioning
Xin Gu · Guang Chen · Yufei Wang · Libo Zhang · Tiejian Luo · Longyin Wen
RILS: Masked Visual Reconstruction in Language Semantic Space
Shusheng Yang · Yixiao Ge · Kun Yi · Dian Li · Ying Shan · Xiaohu Qie · Xinggang Wang
DegAE: A New Pretraining Paradigm for Low-level Vision
Yihao Liu · Jingwen He · Jinjin Gu · Xiangtao Kong · Yu Qiao · Chao Dong
Teacher-generated spatial-attention labels boost robustness and accuracy of contrastive models
Yushi Yao · Chang Ye · Gamaleldin Elsayed · Junfeng He
CLAMP: Prompt-based Contrastive Learning for Connecting Language and Animal Pose
Xu Zhang · Wen Wang · Zhe Chen · Yufei Xu · Jing Zhang · Dacheng Tao
MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model
Yatai Ji · Junjie Wang · Yuan Gong · Lin Zhang · yanru Zhu · WANG HongFa · Jiaxing Zhang · Tetsuya Sakai · Yujiu Yang
Intrinsic Physical Concepts Discovery with Object-Centric Predictive Models
qu tang · Xiangyu Zhu · Zhen Lei · Zhaoxiang Zhang
Position-guided Text Prompt for Vision-Language Pre-training
Jinpeng Wang · Pan Zhou · Mike Zheng Shou · Shuicheng YAN
LASP: Text-to-Text Optimization for Language-Aware Soft Prompting of Vision & Language Models
Adrian Bulat · Georgios Tzimiropoulos
Being Comes from Not-being: Open-vocabulary Text-to-Motion Generation with Wordless Training
Junfan Lin · Jianlong Chang · Lingbo Liu · Guanbin Li · Liang Lin · Qi Tian · Chang-Wen Chen
GeoVLN: Learning Geometry-Enhanced Visual Representation with Slot Attention for Vision-and-Language Navigation
Jingyang Huo · Qiang Sun · Boyan Jiang · Haitao Lin · Yanwei Fu
MetaCLUE: Towards Comprehensive Visual Metaphors Research
Arjun Akula · Brendan Driscoll · Pradyumna Narayana · Soravit Changpinyo · Zhiwei Jia · Suyash Damle · Garima Pruthi · S Basu · Leonidas Guibas · William Freeman · Yuanzhen Li · Varun Jampani
ANetQA: A Large-scale Benchmark for Fine-grained Compositional Reasoning over Untrimmed Videos
Zhou Yu · Lixiang Zheng · Zhou Zhao · Fei Wu · Jianping Fan · Kui Ren · Jun Yu
Where We Are and What We’re Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes
Brandon Clark · Alec Kerrigan · Parth Parag Kulkarni · Vicente Vivanco Cepeda · Mubarak Shah
CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation
Samir Yitzhak Gadre · Mitchell Wortsman · Gabriel Ilharco · Ludwig Schmidt · Shuran Song
Accelerating Vision-Language Pretraining with Free Language Modeling
Teng WANG · Yixiao Ge · Feng Zheng · Ran Cheng · Ying Shan · Xiaohu Qie · Ping Luo
Joint Visual Grounding and Tracking with Natural Language Specification
Li Zhou · Zikun Zhou · Kaige Mao · Zhenyu He
CVT-SLR: Contrastive Visual-Textual Transformation for Sign Language Recognition with Variational Alignment
Jiangbin Zheng · Yile Wang · Cheng Tan · Siyuan Li · Ge Wang · Jun Xia · Yidong Chen · Stan Li
LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling
Linjie Li · Zhe Gan · Kevin Lin · Chung-Ching Lin · Zicheng Liu · Ce Liu · Lijuan Wang
Learning Action Changes by Measuring Verb-Adverb Textual Relationships
Davide Moltisanti · Frank Keller · Hakan Bilen · Laura Sevilla-Lara
WINNER: Weakly-supervised hIerarchical decompositioN and aligNment for spatio-tEmporal video gRounding
Mengze Li · Han Wang · Wenqiao Zhang · Jiaxu Miao · Zhou Zhao · Shengyu Zhang · Wei Ji · Fei Wu
HierVL: Learning Hierarchical Video-Language Embeddings
Kumar Ashutosh · Rohit Girdhar · Lorenzo Torresani · Kristen Grauman
Hierarchical Video-Moment Retrieval and Step-Captioning
Abhay Zala · Jaemin Cho · Satwik Kottur · Xilun Chen · Barlas Oguz · Yashar Mehdad · Mohit Bansal
AutoAD: Movie Description in Context
Tengda Han · Max Bain · Arsha Nagrani · Gul Varol · Weidi Xie · Andrew Zisserman
SViTT: Temporal Learning of Sparse Video-Text Transformers
Yi Li · Kyle Min · Subarna Tripathi · Nuno Vasconcelos
Weakly Supervised Temporal Sentence Grounding with Uncertainty-Guided Self-training
Yifei Huang · Lijin Yang · Yoichi Sato
Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies
Bei Gan · Xiujun Shu · Ruizhi Qiao · Haoqian Wu · Keyu Chen · Hanjun Li · Bo Ren
Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network
Zhicheng Zhang · Lijuan Wang · Jufeng Yang
Two-Stream Networks for Weakly-Supervised Temporal Action Localization with Semantic-Aware Mechanisms
Yu Wang · Yadong Li · Hongbin Wang
Hybrid Active Learning via Deep Clustering for Video Action Detection
Aayush Jung B Rana · Yogesh Rawat
TriDet: Temporal Action Detection with Relative Boundary Modeling
Dingfeng Shi · Yujie Zhong · Qiong Cao · Lin Ma · Jia Li · Dacheng Tao
HaLP: Hallucinating Latent Positives for Skeleton-based Self-Supervised Learning of Actions
Anshul Shah · Aniket Roy · Ketul Shah · Shlok Mishra · David Jacobs · Anoop Cherian · Rama Chellappa
Post-Processing Temporal Action Detection
Sauradip Nag · Xiatian Zhu · Yi-Zhe Song · Tao Xiang
Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception
Junyu Gao · Mengyuan Chen · Changsheng Xu
SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision
Xubo Liu · Egor Lakomkin · Konstantinos Vougioukas · Pingchuan Ma · Honglie Chen · Ruiming Xie · Morrie Doulaty · Niko Moritz · Jachym Kolar · Stavros Petridis · Maja Pantic · Christian Fuegen
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Regeneration
Wei-Ning Hsu · Tal Remez · Bowen Shi · Jacob Donley · Yossi Adi
Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring
Joanna Hong · Minsu Kim · Jeongsoo Choi · Yong Man Ro
Temporal Attention Unit: Towards Efficient Spatiotemporal Predictive Learning
Cheng Tan · Zhangyang Gao · Lirong Wu · Yongjie Xu · Jun Xia · Siyuan Li · Stan Li
Latency Matters: Real-Time Action Forecasting Transformer
Harshayu Girase · Nakul Agarwal · Chiho Choi · Karttikeya Mangalam
Efficient Movie Scene Detection using State-Space Transformers
Md Mohaiminul Islam · Mahmudul Hasan · Kishan Shamsundar Athrey · Tony Braskich · Gediminas Bertasius
TarViS: A Unified Approach for Target-based Video Segmentation
Ali Athar · Alexander Hermans · Jonathon Luiten · Deva Ramanan · Bastian Leibe
HOOD: Hierarchical Graphs for Generalized Modelling of Clothing Dynamics
Artur Grigorev · Bernhard Thomaszewski · Michael Black · Otmar Hilliges
Structured 3D Features for Reconstructing Controllable Avatars
Enric Corona · Mihai Zanfir · Thiemo Alldieck · Eduard Bazavan · Andrei Zanfir · Cristian Sminchisescu
MonoHuman: Animatable Human Neural Field from Monocular Video
Zhengming Yu · Wei Cheng · Xian Liu · Wenyan Wu · Kwan-Yee Lin
JAWS: Just A Wild Shot for Cinematic Transfer in Neural Radiance Fields
Xi WANG · Robin Courant · Jinglei Shi · Eric Marchand · Marc Christie
InstantAvatar: Learning Avatars from Monocular Video in 60 Seconds
Tianjian Jiang · Xu Chen · Jie Song · Otmar Hilliges
X-Avatar: Expressive Human Avatars
Kaiyue Shen · Chen Guo · Manuel Kaufmann · Juan Zarate · Julien Valentin · Jie Song · Otmar Hilliges
OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering
Zhiyuan Ma · Xiangyu Zhu · Guo-Jun Qi · Zhen Lei · Lei Zhang
Learning Personalized High Quality Volumetric Head Avatars from Monocular RGB Videos
Ziqian Bai · Feitong Tan · Zeng Huang · Kripasindhu Sarkar · Danhang Tang · Di Qiu · Abhimitra Meka · Ruofei Du · Mingsong Dou · Sergio Orts-Escolano · Rohit Pandey · Ping Tan · Thabo Beeler · Sean Fanello · Yinda Zhang
AVFace: Towards Detailed Audio-Visual 4D Face Reconstruction
Aggelina Chatziagapi · Dimitris Samaras
NeuFace: Realistic 3D Neural Face Rendering from Multi-view Images
Mingwu Zheng · Haiyu Zhang · Hongyu Yang · Di Huang
Continuous Landmark Detection with 3D Queries
Prashanth Chandran · Gaspard Zoss · Paulo Gotardo · Derek Bradley
GlassesGAN: Eyewear Personalization using Synthetic Appearance Discovery and Targeted Subspace Modeling
Richard Plesh · Peter Peer · Vitomir Struc
High-Res Facial Appearance Capture from Polarized Smartphone Images
Dejan Azinovic · Olivier Maury · Christophe Hery · Matthias Niessner · Justus Thies
Interactive Cartoonization with Controllable Perceptual Factors
Namhyuk Ahn · Patrick Kwon · Jihye Back · Kibeom Hong · Mark Kim
SECAD-Net: Self-Supervised CAD Reconstruction by Learning Sketch-Extrude Operations
Pu Li · Jianwei Guo · Xiaopeng Zhang · Dong-ming Yan
TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision
Jiacheng Wei · Hao Wang · Jiashi Feng · Guosheng Lin · Kim-Hui Yap
High Fidelity 3D Hand Shape Reconstruction via Scalable Graph Frequency Decomposition
Tianyu Luan · Yuanhao Zhai · Jingjing Meng · Zhong Li · Zhang Chen · Yi Xu · Junsong Yuan
Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process
Yuhan Li · Yishun Dou · Xuanhong Chen · Bingbing Ni · Yilin Sun · Yutian Liu · Fuzhen Wang
Consistent View Synthesis with Pose-Guided Diffusion Models
Hung-Yu Tseng · Qinbo Li · Changil Kim · Suhib Alsisan · Jia-Bin Huang · Johannes Kopf
Patch-based 3D Natural Scene Generation from a Single Example
Weiyu Li · Xuelin Chen · Jue Wang · Baoquan Chen
Diffusion-based Generation, Optimization, and Planning in 3D Scenes
Siyuan Huang · Zan Wang · Puhao Li · Baoxiong Jia · Tengyu Liu · Yixin Zhu · Wei Liang · Song-Chun Zhu
DA Wand: Distortion-Aware Selection using Neural Mesh Parameterization
Richard Liu · Noam Aigerman · Vladimir Kim · Rana Hanocka
Neural Vector Fields: Implicit Representation by Explicit Learning
Xianghui Yang · Guosheng Lin · Zhenghao Chen · Luping Zhou
Octree Guided Unoriented Surface Reconstruction
Chamin Hewa Koneputugodage · Yizhak Ben-Shabat · Stephen Gould
Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction
Mingfang Zhang · Jinglu Wang · Xiao Li · Yifei Huang · Yoichi Sato · Yan Lu
Multi-View Reconstruction using Signed Ray Distance Functions (SRDF)
Pierre Zins · Yuanlu Xu · Edmond Boyer · Stefanie Wuhrer · Tony Tung
VolRecon: Volume Rendering of Signed Ray Distance Functions for Generalizable Multi-View Reconstruction
Yufan Ren · Fangjinhua Wang · Tong Zhang · Marc Pollefeys · Sabine Süsstrunk
TMO: Textured Mesh Acquisition of Objects with a Mobile Device by using Differentiable Rendering
Jaehoon Choi · Dongki Jung · Taejae Lee · SangWook Kim · YoungDong Jung · Dinesh Manocha · Donghwan Lee
RelightableHands: Efficient Neural Relighting of Articulated Hand Models
Shun Iwase · Shunsuke Saito · Tomas Simon · Stephen Lombardi · Timur Bagautdinov · Rohan Joshi · Fabian Prada · Takaaki Shiratori · Yaser Sheikh · Jason Saragih
Computational Flash Photography through Intrinsics
Sepideh Sarajian Maralan · Chris Careaga · Yagiz Aksoy
PixHt-Lab: Pixel Height Based Light Effect Generation for Image Compositing
Yichen Sheng · Jianming Zhang · Julien Philip · Yannick Hold-Geoffroy · Xin Sun · HE Zhang · Lu Ling · Bedrich Benes
Tensor4D: Efficient Neural 4D Decomposition for High-fidelity Dynamic Reconstruction and Rendering
Ruizhi Shao · Zerong Zheng · Hanzhang Tu · Boning Liu · Hongwen Zhang · Yebin Liu
UV Volumes for Real-time Rendering of Editable Free-view Human Performance
Yue Chen · Xuan Wang · Xingyu Chen · Qi Zhang · Xiaoyu Li · Yu Guo · Jue Wang · Fei Wang
HyperReel: High-Fidelity 6-DoF Video with Ray-Conditioned Sampling
Benjamin Attal · Jia-Bin Huang · Christian Richardt · Johannes Kopf · Michael Zollhöfer · Matthew O’Toole · Changil Kim
Complementary Intrinsics from Neural Radiance Fields and CNNs for Outdoor Scene Relighting
Siqi Yang · Xuanning Cui · Yongjie Zhu · Jiajun Tang · Si Li · Zhaofei Yu · Boxin Shi
Balanced Spherical Grid for Egocentric View Synthesis
Changwoon Choi · Sang Min Kim · Young Min Kim
pCON: Polarimetric Coordinate Networks for Neural Scene Representations
Henry Peters · Yunhao Ba · Achuta Kadambi
MobileNeRF: Exploiting the Polygon Rasterization Pipeline for Efficient Neural Field Rendering on Mobile Architectures
Zhiqin Chen · Thomas Funkhouser · Peter Hedman · Andrea Tagliasacchi
ABLE-NeRF: Attention-Based Rendering with Learnable Embeddings for Neural Radiance Field
Zhe Jun Tang · Tat-Jen Cham · Haiyu Zhao
NeRFVS: Neural Radiance Fields for Free View Synthesis via Geometry Scaffolds
chen yang · Peihao Li · Zanwei Zhou · Shanxin Yuan · Bingbing Liu · Xiaokang Yang · Weichao Qiu · Wei Shen
Progressively Optimized Local Radiance Fields for Robust View Synthesis
Andreas Meuleman · Yu-Lun Liu · Chen Gao · Jia-Bin Huang · Changil Kim · Min H. Kim · Johannes Kopf
Removing Objects From Neural Radiance Fields
Silvan Weder · Guillermo Garcia-Hernando · Aron Monszpart · Marc Pollefeys · Gabriel Brostow · Michael Firman · Sara Vicente
SCADE: Space Carving with Ambiguity-aware Depth Estimates
Mikaela Uy · Ricardo Martin Brualla · Leonidas Guibas · Ke Li
ContraNeRF: Generalizable Neural Radiance Fields for Synthetic-to-real Novel View Synthesis via Contrastive Learning
Hao Yang · Lanqing HONG · Aoxue Li · Tianyang Hu · Zhenguo Li · Gim Lee · Liwei Wang
JacobiNeRF: NeRF Shaping with Mutual Information Gradients
Xiaomeng Xu · Yanchao Yang · Kaichun Mo · Boxiao Pan · Li Yi · Leonidas Guibas
Fresnel Microfacet BRDF: Unification of Polari-Radiometric Surface-Body Reflection
Tomoki Ichikawa · Yoshiki Fukao · Shohei Nobuhara · Ko Nishino
DartBlur: Privacy Preservation with Detection Artifacts Suppression
Baowei Jiang · Bing Bai · Haozhe Lin · Yu Wang · Yuchen Guo · LU FANG
Evading Forensic Classifiers with Attribute-Conditioned Adversarial Faces
Fahad Shamshad · Koushik Srivatsan · Karthik Nandakumar
RIATIG: Reliable and Imperceptible Adversarial Text-to-Image Generation with Natural Prompts
Han Liu · Yuhao Wu · Shixuan Zhai · Bo Yuan · Ning Zhang
Improving Robust Generalization by Direct PAC-Bayesian Bound Minimization
Zifan Wang · Nan Ding · Tomer Levinboim · Xi Chen · Radu Soricut
Randomized Adversarial Training via Taylor Expansion
Gaojie Jin · Xinping Yi · Dengyu Wu · Ronghui Mu · Xiaowei Huang
Adversarial Counterfactual Visual Explanations
Guillaume Jeanneret · Loic Simon · Frederic Jurie
Transferable Adversarial Attacks on Vision Transformers with Token Gradient Regularization
Jianping Zhang · Yizhan Huang · Weibin Wu · Michael Lyu
Dynamic Generative Targeted Attacks with Pattern Injection
Weiwei Feng · Nanqing Xu · Tianzhu Zhang · Yongdong Zhang
Turning Strengths into Weaknesses: A Certified Robustness Inspired Attack Framework against Graph Neural Networks
Binghui Wang · Meng Pang · Yun Dong
Re-thinking Model Inversion Attacks Against Deep Neural Networks
Ngoc-Bao Nguyen · Keshigeyan Chandrasegaran · Milad Abdollahzadeh · Ngai-man Cheung
Can’t Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders
Zeyang Sha · Xinlei He · Ning Yu · Michael Backes · Yang Zhang
Detecting Backdoors in Pre-trained Encoders
Shiwei Feng · Guanhong Tao · Siyuan Cheng · Guangyu Shen · Xiangzhe Xu · Yingqi Liu · Kaiyuan Zhang · Shiqing Ma · Xiangyu Zhang
STDLens: Model Hijacking-resilient Federated Learning for Object Detection
Ka-Ho Chow · Ling Liu · Wenqi Wei · Fatih Ilhan · Yanzhao Wu
Alias-Free Convnets: Fractional Shift Invariance via Polynomial Activations
Hagay Michaeli · Tomer Michaeli · Daniel Soudry
FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning
Yuanhao Xiong · Ruochen Wang · Minhao Cheng · Felix Yu · Cho-Jui Hsieh
Rethinking Federated Learning with Domain Shift: A Prototype View
Wenke Huang · Mang Ye · Zekun Shi · He Li · Bo Du
Fair Federated Medical Image Segmentation via Client Contribution Estimation
Meirui Jiang · Holger Roth · Wenqi Li · Dong Yang · Can Zhao · Vishwesh Nath · Daguang Xu · DOU QI · Ziyue Xu
Class Balanced Adaptive Pseudo Labeling for Federated Semi-Supervised Learning
Ming Li · Qingli Li · Yan Wang
Prototypical Residual Networks for Anomaly Detection and Localization
Hui Zhang · Zuxuan Wu · Zheng Wang · Zhineng Chen · Yu-Gang Jiang
Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection
Chen Zhang · Guorong Li · Yuankai Qi · Shuhui Wang · Laiyun Qing · Qingming Huang · Ming-Hsuan Yang
A New Dataset Based on Images Taken by Blind People for Testing the Robustness of Image Classification Models Trained for ImageNet Categories
Reza Akbarian Bafghi · Danna Gurari
Boosting Verified Training for Robust Image Classifications via Abstraction
Zhaodi Zhang · Zhiyi Xue · Yang Chen · Si Liu · Yueling Zhang · Jing Liu · Min Zhang
Soft Augmentation for Image Classification
Yang Liu · Shen Yan · Laura Leal-Taixé · James Hays · Deva Ramanan
Re-GAN: Data-Efficient GANs Training via Architectural Reconfiguration
Divya Saxena · Jiannong Cao · Jiahao XU · Tarun Kulshrestha
AdaptiveMix: Improving GAN Training via Feature Space Shrinkage
Haozhe Liu · Wentian Zhang · Bing Li · Haoqian Wu · Nanjun He · Yawen Huang · Yuexiang Li · Bernard Ghanem · Yefeng Zheng
Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck
Jongheon Jeong · Sihyun Yu · Hankook Lee · Jinwoo Shin
Bit-shrinking: Limiting Instantaneous Sharpness for Improving Post-training Quantization
Lin Chen · Bo Peng · Zheyang Li · Wenming Tan · Ye Ren · Jun Xiao · Shiliang Pu
Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization
Zhuo Huang · Miaoxi Zhu · Xiaobo Xia · Li Shen · Jun Yu · Chen Gong · Bo Han · Bo Du · Tongliang Liu
OT-Filter: An Optimal Transport Filter for Learning with Noisy Labels
Chuanwen Feng · Yilong Ren · Xike Xie
Don’t Lie to Me! Robust and Efficient Explainability with Verified Perturbation Analysis
Thomas FEL · Melanie Ducoffe · David Vigouroux · Remi Cadene · Mikaël Capelle · Claire NICODEME · Thomas Serre
Shortcomings of Top-Down Randomization-Based Sanity Checks for Evaluations of Deep Neural Network Explanations
Alexander Binder · Leander Weber · Sebastian Lapuschkin · Grégoire Montavon · Klaus Muller · Wojciech Samek
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders
Sanghyun Woo · Shoubhik Debnath · Ronghang Hu · Xinlei Chen · Zhuang Liu · In So Kweon · Saining Xie
Regularization of polynomial networks for image recognition
Grigorios Chrysos · Bohan Wang · Jiankang Deng · Volkan Cevher
Stitchable Neural Networks
Zizheng Pan · Jianfei Cai · Bohan Zhuang
DepGraph: Towards Any Structural Pruning
Gongfan Fang · Xinyin Ma · Mingli Song · Michael Bi Mi · Xinchao Wang
Meta-Learning with a Geometry-Adaptive Preconditioner
Suhyun Kang · Duhun Hwang · Moonjung Eo · Taesup Kim · Wonjong Rhee
Class Adaptive Network Calibration
Bingyuan Liu · Jérôme Rony · Adrian Galdran · Jose Dolz · Ismail Ayed
Differentiable Architecture Search with Random Features
zhang xuanyang · Yonggang Li · Xiangyu Zhang · Yongtao Wang · Jian Sun
DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks
Samyak Jain · Sravanti Addepalli · Pawan Sahu · Priyam Dey · Venkatesh Babu Radhakrishnan
NICO++: Towards better benchmarks for Out-of-Distribution Generalization
Xingxuan Zhang · Yue He · Renzhe Xu · Han Yu · Zheyan Shen · Peng Cui
Bilateral Memory Consolidation for Continual Learning
Xing Nie · Shixiong Xu · Xiyan Liu · Gaofeng Meng · Chunlei Huo · Shiming Xiang
CafeBoost: Causal Feature Boost to Eliminate Task-Induced Bias for Class Incremental Learning
Benliu Qiu · Hongliang Li · Haitao Wen · Heqian Qiu · Lanxiao Wang · Fanman Meng · Qingbo Wu · Lili Pan
Towards a Smaller Student: Capacity Dynamic Distillation for Efficient Image Retrieval
Yi Xie · Huaidong Zhang · Xuemiao Xu · Jianqing Zhu · Shengfeng He
Generic-to-Specific Distillation of Masked Autoencoders
Wei Huang · Zhiliang Peng · Li Dong · Furu Wei · Jianbin Jiao · Qixiang Ye
Heterogeneous Continual Learning
Divyam Madaan · Hongxu Yin · Wonmin Byeon · Jan Kautz · Pavlo Molchanov
Manipulating Transfer Learning for Property Inference
Yulong Tian · Fnu Suya · Anshuman Suri · Fengyuan Xu · David Evans
Adapting Shortcut with Normalizing Flow: An Efficient Tuning Framework for Visual Recognition
Yaoming Wang · Bowen Shi · XIAOPENG ZHANG · Jin Li · Yuchen Liu · Wenrui Dai · Chenglin Li · Hongkai Xiong · Qi Tian
A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation
Hui Tang · Kui Jia
Switchable Representation Learning Framework with Self-compatibility
shengsen wu · Yan Bai · Yihang Lou · Xiongkun Linghu · Jianzhong He · LINGYU DUAN
Domain Expansion of Image Generators
Yotam Nitzan · MICHAEL GHARBI · Richard Zhang · Taesung Park · Jun-Yan Zhu · Daniel Cohen-Or · Eli Shechtman
Robust Test-Time Adaptation in Dynamic Scenarios
Longhui Yuan · Binhui Xie · Shuang Li
Train/Test-Time Adaptation with Retrieval
Luca Zancato · Alessandro Achille · Tian Yu Liu · Matthew Trager · Pramuditha Perera · Stefano Soatto
Bi-level Meta-learning for Few-shot Domain Generalization
Xiaorong Qin · Xinhang Song · Shuqiang Jiang
Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information
Weijie Su · Xizhou Zhu · Chenxin Tao · Lewei Lu · Bin Li · Gao Huang · Yu Qiao · Xiaogang Wang · Jie Zhou · Jifeng Dai
Multi-modal Learning with Missing Modality via Shared-Specific Feature Modeling
Hu Wang · Yuanhong Chen · Congbo Ma · Jodie Avery · M. Louise Hull · Gustavo Carneiro
DiGA: Distil to Generalize and then Adapt for Domain Adaptive Semantic Segmentation
Fengyi Shen · Akhil Gurram · Ziyuan Liu · He Wang · Alois Knoll
Progressive Open Space Expansion for Open Set Model Attribution
Tianyun Yang · Danding Wang · Fan Tang · Xinying Zhao · Juan Cao · Sheng Tang
DLBD: A Self-Supervised Direct-Learned Binary Descriptor
Bin Xiao · Yang Hu · Bo Liu · Xiuli Bi · Weisheng Li · Xinbo Gao
DAA: A Delta Age AdaIN operation for age estimation via binary code transformer
Ping Chen · Xingpeng Zhang · Ye Li · Ju Tao · Bin Xiao · Bing Wang · zongjie jiang
Curvature-Balanced Feature Manifold Learning for Long-Tailed Classification
Yanbiao Ma · Licheng Jiao · Fang Liu · Shuyuan Yang · Xu Liu · Lingling Li
Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions
Fei Du · peng yang · Qi Jia · Fengtao Nan · xiaoting chen · Yun Yang
No One Left Behind: Improving the Worst Categories in Long-Tailed Learning
Yingxiao Du · Jianxin Wu
Learning Imbalanced Data with Vision Transformers
Zhengzhuo Xu · Ruikang Liu · Shuo Yang · Zenghao Chai · Chun Yuan
Ranking Regularization for Critical Rare Classes: Minimizing False Positives at a High True Positive Rate
Kiarash Mohammadi · He Zhao · Mengyao Zhai · Frederick Tung
MarginMatch: Using Training Dynamics of Unlabeled Data for Semi-Supervised Learning
Tiberiu Sosea · Cornelia Caragea
CHMATCH: Contrastive Hierarchical Matching and Robust Adaptive Threshold Boosted Semi-Supervised Learning
Jianlong Wu · Haozhe Yang · Tian Gan · Ning Ding · Feijun Jiang · Liqiang Nie
Boosting Transductive Few-Shot Fine-tuning with Margin-based Uncertainty Weighting and Probability Regularization
Ran Tao · Hao Chen · Marios Savvides
Three Guidelines You Should Know for Universally Slimmable Self-Supervised Learning
Yun-Hao Cao · Peiqin Sun · Shuchang Zhou
Towards Bridging the Performance Gaps of Joint Energy-based Models
Xiulong Yang · Qing Su · Shihao Ji
Siamese DETR
Zeren Chen · Gengshi Huang · Wei Li · Jianing Teng · Kun Wang · Jing Shao · CHEN CHANGE LOY · Lyu Sheng
Highly Confident Local Structure Based Consensus Graph Learning for Incomplete Multi-view Clustering
Jie Wen · Chengliang Liu · Gehui Xu · Zhihao Wu · Chao Huang · Lunke Fei · Yong Xu
Block Selection Method for Using Feature Norm in Out-of-Distribution Detection
Yeonguk Yu · Sungho Shin · Seongju Lee · Changhyun Jun · Kyoobin Lee
Causally-Aware Intraoperative Imputation for Overall Survival Time Prediction
Xiang Li · Xuelin Qian · Litian Liang · Lingjie Kong · Qiaole Dong · Chen Jiejun · Dingxia Liu · Xiuzhong Yao · Yanwei Fu
PEFAT: Boosting Semi-supervised Medical Image Classification via Pseudo-loss Estimation and Feature Adversarial Training
Zeng Qingjie · Yutong Xie · Lu Zilin · Yong Xia
Histopathology Whole Slide Image Analysis with Heterogeneous Graph Representation Learning
Tsai Chan Chan · Fernando Julio Cendra · Lan Ma · Guosheng Yin · Lequan Yu
MCF: Mutual Correction Framework for Semi-Supervised Medical Image Segmentation
Yongchao Wang · Bin Xiao · Xiuli Bi · Weisheng Li · Xinbo Gao
DoNet: Deep De-overlapping Network for Cytology Instance Segmentation
Hao JIANG · Rushan Zhang · Yanning Zhou · Yumeng Wang · Hao Chen
Weakly supervised segmentation with point annotations for histopathology images via contrast-based variational model
hongrun zhang · Liam Burrows · Yanda Meng · Declan Sculthorpe · ABHIK MUKHERJEE · Sarah Coupland · Ke Chen · Yalin Zheng
Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture
Mido Assran · Quentin Duval · Pascal Vincent · Ishan Misra · Piotr Bojanowski · Michael Rabbat · Yann LeCun · Nicolas Ballas
Boosting Detection in Crowd Analysis via Underutilized Output Features
Shaokai Wu · Fengyu Yang
Bi3D: Bi-domain Active Learning for Cross-domain 3D Object Detection
Jiakang Yuan · Bo Zhang · Xiangchao Yan · Tao Chen · Botian Shi · Yikang LI · Yu Qiao
Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection
Chang Liu · Weiming Zhang · Xiangru Lin · Wei Zhang · Xiao Tan · Junyu Han · Xiaomao Li · Errui Ding · Jingdong Wang
Large-scale Training Data Search for Object Re-identification
Yue Yao · Tom Gedeon · Liang Zheng
SOOD: Towards Semi-Supervised Oriented Object Detection
Wei Hua · Dingkang Liang · jingyu li · Xiaolong Liu · Zhikang Zou · Xiaoqing Ye · Xiang Bai
Zero-Shot Object Counting
Jingyi Xu · Hieu Le · Vu Nguyen · Viresh Ranjan · Dimitris Samaras
SAP-DETR: Bridging the Gap between Salient Points and Queries-Based Transformer Detector for Fast Model Convergency
Yang Liu · Yao Zhang · Yixin Wang · Yang Zhang · Jiang Tian · zhongchao shi · Jianping Fan · Zhiqiang He
Knowledge Combination to Learn Rotated Detection Without Rotated Annotation
Tianyu Zhu · Bryce Ferenczi · Pulak Purkait · Tom Drummond · Hamid Rezatofighi · Anton Hengel
The Treasure Beneath Multiple Annotations: An Uncertainty-aware Edge Detector
Caixia Zhou · Yaping Huang · Mengyang Pu · Qingji Guan · Li Huang · Haibin Ling
Decoupled Semantic Prototypes enable learning from arbitrary annotation types for semi-weakly segmentation in expert-driven domains
Simon Reiß · Constantin Seibold · Alexander Freytag · Erik Rodner · Rainer Stiefelhagen
Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt
HAO LI · Dingwen Zhang · Nian Liu · Lechao Cheng · Yalun Dai · Chao Zhang · Xinggang Wang · Junwei Han
STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection
Zhenglin Zhou · Huaxia Li · Hong Liu · Nanyang Wang · Gang Yu · Rongrong Ji
Fuzzy Positive Learning for Semi-supervised Semantic Segmentation
Pengchong Qiao · Zhidan Wei · Yu Wang · Zhennan Wang · Guoli Song · Fan Xu · Xiangyang Ji · Chang Liu · Jie Chen
Sparsely Annotated Semantic Segmentation with Adaptive Gaussian Mixtures
Linshan Wu · Zhun Zhong · Leyuan Fang · Xingxin He · Qiang Liu · Jiayi Ma · Hao Chen
Spatial-temporal Concept based Explanation of 3D ConvNets
Ying Ji · Yu Wang · Jien Kato
Weakly-Supervised Domain Adaptive Semantic Segmentation with Prototypical Contrastive Learning
Anurag Das · Yongqin Xian · Dengxin Dai · Bernt Schiele
Exemplar-FreeSOLO: Enhancing Unsupervised Instance Segmentation with Exemplars
TAOSEEF ISHTIAK · Qing En · Yuhong Guo
Decoupling Human and Camera Motion from Videos in the Wild
Vickie Ye · Georgios Pavlakos · Jitendra Malik · Angjoo Kanazawa
CIRCLE: Capture In Rich Contextual Environments
Joao Araujo · Jiaman Li · Karthik Vetrivel · Rishi Agarwal · Deepak Gopinath · Jiajun Wu · Alexander Clegg · Karen Liu
CARTO: Category and Joint Agnostic Reconstruction of ARTiculated Objects
Nick Heppert · Muhammad Zubair Irshad · Sergey Zakharov · Katherine Liu · Rareș Ambruș · Jeannette Bohg · Abhinav Valada · Thomas Kollar
DexArt: Benchmarking Generalizable Dexterous Manipulation with Articulated Objects
Chen Bao · Helin Xu · Yuzhe Qin · Xiaolong Wang
FLEX: Full-Body Grasping Without Full-Body Grasps
Purva Tendulkar · Didac Suris Coll-Vinent · Carl Vondrick
Im2Hands: Learning Attentive Implicit Representation of Interacting Two-Hand Shapes
Jihyun Lee · Minhyuk Sung · Honggyu Choi · Tae-Kyun Kim
One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer
Jing Lin · Ailing Zeng · Haoqian Wang · Lei Zhang · Yu Li
Implicit 3D Human Mesh Recovery using Consistency with Pose and Shape from Unseen-view
Hanbyel Cho · Yooshin Cho · Jaesung Ahn · Junmo Kim
Flow supervision for Deformable NeRF
Chaoyang Wang · Lachlan MacDonald · Laszlo Jeni · Simon Lucey
FlexNeRF: Photorealistic Free-viewpoint Rendering of Moving Humans from Sparse Views
Vinoj Yasanga Jayasundara Magalle Hewa · Amit Agrawal · Nicolas Heron · Abhinav Shrivastava · Larry Davis
POEM: Reconstructing Hand in a Point Embedded Multi-view Stereo
Lixin Yang · Jian Xu · Licheng Zhong · Xinyu Zhan · Zhicheng Wang · Kejian Wu · Cewu Lu
Clothed Human Performance Capture with a Double-layer Neural Radiance Fields
Kangkan Wang · Guofeng Zhang · Suxu Cong · Jian Yang
VGFlow: Visibility guided Flow Network for Human Reposing
Rishabh Jain · Krishna Kumar Singh · Mayur Hemani · Jingwan Lu · Mausoom Sarkar · Duygu Ceylan · Balaji Krishnamurthy
HandNeRF: Neural Radiance Fields for Animatable Interacting Hands
Zhiyang Guo · Wengang Zhou · Min Wang · Li Li · Houqiang Li
PAniC-3D: Stylized Single-view 3D Reconstruction from Portraits of Anime Characters
Shuhong Chen · Kevin Zhang · Yichun Shi · Heng Wang · Yiheng Zhu · Guoxian Song · Sizhe An · Janus Kristjansson · Xiao Yang · Matthias Zwicker
PointAvatar: Deformable Point-based Head Avatars from Videos
Yufeng Zheng · Wang Yifan · Gordon Wetzstein · Michael Black · Otmar Hilliges
Ham2Pose: Animating Sign Language Notation into Pose Sequences
Rotem Shalev Arkushin · Amit Moryossef · Ohad Fried
Auto-CARD: Efficient and Robust Codec Avatar Driving for Real-time Mobile Telepresence
Yonggan Fu · Yuecheng Li · Chenghui Li · Jason Saragih · Peizhao Zhang · Xiaoliang Dai · Yingyan Lin
Learning Locally Editable Virtual Humans
Hsuan-I Ho · Lixin Xue · Jie Song · Otmar Hilliges
Zero-Shot Text-to-Parameter Translation for Game Character Auto-Creation
Rui Zhao · Wei Li · Zhipeng Hu · Lincheng Li · Zhengxia Zou · Zhenwei Shi · Changjie Fan
Learning Neural Parametric Head Models
Simon Giebenhain · Tobias Kirschstein · Markos Georgopoulos · Martin Rünz · Lourdes Agapito · Matthias Niessner
Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars
Jingxiang Sun · Xuan Wang · Lizhen Wang · Xiaoyu Li · Yong Zhang · Hongwen Zhang · Yebin Liu
Graphics Capsule: Learning Hierarchical 3D Face Representations from 2D Images
Chang Yu · Xiangyu Zhu · Xiaomei Zhang · Zhaoxiang Zhang · Zhen Lei
Parameter Efficient Local Implicit Image Function Network for Face Segmentation
Mausoom Sarkar · Nikitha S R · Mayur Hemani · Rishabh Jain · Balaji Krishnamurthy
StyleGene: Crossover and Mutation of Region-level Facial Genes for Kinship Face Synthesis
Hao Li · Xianxu Hou · Zepeng Huang · Linlin Shen
PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360°
Sizhe An · Hongyi Xu · Yichun Shi · Guoxian Song · Umit Ogras · Linjie Luo
Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion
Yushi LAN · Xuyi Meng · Shuai Yang · CHEN CHANGE LOY · Bo Dai
3D Highlighter: Localizing Regions on 3D Shapes via Text Descriptions
Dale Decatur · Itai Lang · Rana Hanocka
Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models
Jiale Xu · Xintao Wang · Weihao Cheng · Yan-Pei Cao · Ying Shan · Xiaohu Qie · Shenghua Gao
Efficient View Synthesis and 3D-based Multi-Frame Denoising with Multiplane Feature Representations
Thomas Tanay · Ales Leonardis · Matteo Maggioni
Diffusion-Based Signed Distance Fields for 3D Shape Generation
Jaehyeok Shim · Changwoo Kang · Kyungdon Joo
Persistent Nature: A Generative Model of Unbounded 3D Worlds
Lucy Chai · Richard Tucker · Zhengqi Li · Phillip Isola · Noah Snavely
OReX: Object Reconstruction from Planar Cross-sections Using Neural Fields
Haim Sawdayee · Amir Vaxman · Amit Bermano
Sphere-Guided Training of Neural Implicit Surfaces
Andreea Dogaru · Andrei-Timotei Ardelean · Savva Ignatyev · Egor Zakharov · Evgeny Burnaev
NeuralUDF: Learning Unsigned Distance Fields for Multi-view Reconstruction of Surfaces with Arbitrary Topologies
Xiaoxiao Long · Cheng Lin · Lingjie Liu · Yuan Liu · Peng Wang · Christian Theobalt · Taku Komura · Wenping Wang
Looking Through the Glass: Neural Surface Reconstruction Against High Specular Reflections
Jiaxiong Qiu · Peng-Tao Jiang · Yifan Zhu · Ze-Xin Yin · Ming-Ming Cheng · Bo Ren
Teleidoscopic Imaging System for Microscale 3D Shape Reconstruction
Ryo Kawahara · Meng-Yu Kuo · Shohei Nobuhara
The Differentiable Lens: Compound Lens Search over Glass Surfaces and Materials for Object Detection
Geoffroi Côté · Fahim Mannan · Simon Thibault · Jean-Francois Lalonde · Felix Heide
SunStage: Portrait Reconstruction and Relighting using the Sun as a Light Stage
Yifan Wang · Aleksander Holynski · Xiuming Zhang · Cecilia Zhang
Nighttime smartphone reflective flare removal using optical center symmetry prior
Yuekun Dai · Yihang Luo · Shangchen Zhou · Chongyi Li · CHEN CHANGE LOY
ORCA: Glossy Objects as Radiance Field Cameras
Kushagra Tiwary · Akshat Dave · Nikhil Behari · Tzofi Klinghoffer · Ashok Veeraraghavan · Ramesh Raskar
ReLight My NeRF: A Dataset for Novel View Synthesis and Relighting of Real World Objects
Marco Toschi · Riccardo De Matteo · Riccardo Spezialetti · Daniele Gregorio · Luigi Di Stefano · Samuele Salti
Neural Scene Chronology
Haotong Lin · Qianqian Wang · Ruojin Cai · Sida Peng · Hadar Averbuch-Elor · Xiaowei Zhou · Noah Snavely
DyNCA: Real-time Dynamic Texture Synthesis Using Neural Cellular Automata
Ehsan Pajouheshgar · Yitao Xu · Tong Zhang · Sabine Süsstrunk
TriVol: Point Cloud Rendering via Triple Volumes
Tao Hu · Xiaogang Xu · Ruihang Chu · Jiaya Jia
Occlusion-Free Scene Recovery via Neural Radiance Fields
Chengxuan Zhu · Renjie Wan · Yunkai Tang · Boxin Shi
Transforming Radiance Field with Lipschitz Network for Photorealistic 3D Scene Stylization
Zicheng Zhang · Yinglu Liu · Congying Han · Yingwei Pan · Tiande Guo · Ting Yao
PaletteNeRF: Palette-based Appearance Editing of Neural Radiance Fields
Zhengfei Kuang · Fujun Luan · Sai Bi · Zhixin Shu · Gordon Wetzstein · Kalyan Sunkavalli
Masked Wavelet Representation for Compact Neural Radiance Fields
Daniel Rho · Byeonghyeon Lee · Seungtae Nam · Joo Chan Lee · Jong Hwan Ko · Eunbyung Park
SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with Neural Radiance Fields
Ashkan Mirzaei · Tristan Aumentado-Armstrong · Konstantinos Derpanis · Jonathan Kelly · Marcus Brubaker · Igor Gilitschenski · Alex Levinshtein
MixNeRF: Modeling a Ray with Mixture Density for Novel View Synthesis from Sparse Inputs
Seunghyeon Seo · Donghoon Han · Yeonjin Chang · Nojun Kwak
GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images
Jianchuan Chen · Wentao Yi · Liqian Ma · Xu Jia · Huchuan Lu
NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors
Congyue Deng · Chiyu Jiang · Charles R. Qi · Xinchen Yan · Yin Zhou · Leonidas Guibas · Dragomir Anguelov
RobustNeRF: Ignoring Distractors with Robust Losses
Sara Sabour · Suhani Vora · Daniel Duckworth · Ivan Krasin · David Fleet · Andrea Tagliasacchi
High-fidelity Event-Radiance Recovery via Transient Event Frequency
Jin Han · Yuta Asano · Boxin Shi · Yinqiang Zheng · Zhihang Zhong
TruFor: Leveraging all-round clues for trustworthy image forgery detection and localization
Fabrizio Guillaro · Davide Cozzolino · Avneesh Sud · Nicholas Dufour · Luisa Verdoliva
CLIP2Protect: Protecting Facial Privacy using Text-Guided Makeup via Adversarial Latent Search
Fahad Shamshad · Muhammad Muzammal Naseer · Karthik Nandakumar
Discrete Point-wise Attack Is Not Enough: Generalized Manifold Adversarial Attack for Face Recognition
Qian Li · Yuxiao Hu · Ye Liu · Dongxiao Zhang · Xin Jin · Yuntian Chen
Generalist: Decoupling Natural and Robust Generalization
Hongjun Wang · Yisen Wang
AGAIN: Adversarial Training with Attribution Span Enlargement and Hybrid Feature Fusion
Shenglin Yin · kelu Yao · Sheng Shi · Yangzhou Du · Zhen Xiao
HGFormer: Hierarchical Grouping Transformer for Domain Generalized Semantic Segmentation
Jian Ding · Nan Xue · Gui-Song Xia · Bernt Schiele · Dengxin Dai
Pruning Parameterization with Bi-level Optimization for Efficient Semantic Segmentation on the Edge
Changdi Yang · Pu Zhao · Yanyu Li · Wei Niu · Jiexiong Guan · Hao Tang · Minghai Qin · Bin Ren · Xue Lin · Yanzhi Wang
Towards Open-World Segmentation of Parts
Tai-Yu Pan · Qing Liu · Wei-Lun Chao · Brian Price
SegLoc: Learning Segmentation-based Representations for Privacy-Preserving Visual Localization
Maxime Pietrantoni · Martin Humenberger · Torsten Sattler · Gabriela Csurka
GeoNet: Benchmarking Unsupervised Adaptation across Geographies
Tarun Kalluri · Wangdong Xu · Manmohan Chandraker
Modeling Entities as Semantic Points for Visual Information Extraction in the Wild
Zhibo Yang · Rujiao Long · Pengfei Wang · Sibo Song · Humen Zhong · Wenqing Cheng · Xiang Bai · Cong Yao
DPF: Learning Dense Prediction Fields with Weak Supervision
Xiaoxue Chen · Yuhang Zheng · Yupeng Zheng · Qiang Zhou · Hao Zhao · Guyue Zhou · Ya-Qin Zhang
Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning
Man Liu · Feng Li · Chunjie Zhang · Yunchao Wei · Huihui Bai · Yao Zhao
Universal Instance Perception as Object Discovery and Retrieval
Bin Yan · Yi Jiang · Jiannan Wu · Dong Wang · Ping Luo · Zehuan Yuan · Huchuan Lu
Learning Attention as Disentangler for Compositional Zero-shot Learning
Shaozhe Hao · Kai Han · Kwan-Yee K. Wong
CLIP is Also an Efficient Segmenter: A Text-Driven Approach for Weakly Supervised Semantic Segmentation
Yuqi Lin · Minghao Chen · Wenxiao Wang · Boxi Wu · Ke Li · Binbin Lin · Haifeng Liu · Xiaofei He
Self-supervised Implicit Glyph Attention for Text Recognition
Tongkun Guan · Chaochen Gu · Jingzheng Tu · Xue Yang · Qi Feng · yudi zhao · Wei Shen
Visual Recognition by Request
Chufeng Tang · Lingxi Xie · XIAOPENG ZHANG · Xiaolin Hu · Qi Tian
Aligning Bag of Regions for Open-Vocabulary Object Detection
Size Wu · Wenwei Zhang · Sheng Jin · Wentao Liu · CHEN CHANGE LOY
CLIP²: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data
Yihan Zeng · Chenhan Jiang · Jiageng Mao · Jianhua Han · Chaoqiang Ye · Qingqiu Huang · Dit-Yan Yeung · Zhen Yang · Xiaodan Liang · Hang Xu
CapDet: Unifying Dense Captioning and Open-World Detection Pretraining
Yanxin Long · Youpeng Wen · Jianhua Han · Hang Xu · Pengzhen Ren · Wei Zhang · Shen Zhao · Xiaodan Liang
Towards Unified Scene Text Spotting based on Sequence Generation
Taeho Kil · Seonghyeon Kim · Sukmin Seo · Yoonsik Kim · Daehee Kim
Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
Renrui Zhang · Xiangfei Hu · Bohao Li · Siyuan Huang · Hanqiu Deng · Yu Qiao · Peng Gao · Hongsheng Li
Boundary-aware Backward-Compatible Representation via Adversarial Learning in Image Retrieval
Tan Pan · Furong Xu · Xudong Yang · Sifeng He · Chen Jiang · Qingpei Guo · Feng Qian · Xiaobo Zhang · Yuan Cheng · Lei Yang · Wei Chu
Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!
Zaid Khan · Vijay Kumar B G · Samuel Schulter · Xiang Yu · Yun Fu · Manmohan Chandraker
ConStruct-VL: Data-Free Continual Structured VL Concepts Learning
James Smith · Paola Cascante-Bonilla · Assaf Arbelle · Donghyun Kim · Rameswar Panda · David Cox · Diyi Yang · Zsolt Kira · Rogerio Feris · Leonid Karlinsky
À-la-carte Prompt Tuning (APT): Combining Distinct Data Via Composable Prompting
Benjamin Bowman · Alessandro Achille · Luca Zancato · Matthew Trager · Pramuditha Perera · Giovanni Paolini · Stefano Soatto
Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering
Zhenwei Shao · Zhou Yu · Meng Wang · Jun Yu
Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning
Zhuowan Li · Xingrui Wang · Elias Stengel-Eskin · Adam Kortylewski · Wufei Ma · Benjamin Van Durme · Alan Yuille
Visual Programming: Compositional visual reasoning without training
Tanmay Gupta · Aniruddha Kembhavi
Multimodal Prompting with Missing Modalities for Visual Recognition
Yi-Lun Lee · Yi-Hsuan Tsai · Wei-Chen Chiu · Chen-Yu Lee
EXCALIBUR: Encouraging and Evaluating Embodied Exploration
Hao Zhu · Raghav Kapoor · So Yeon Min · Winson Han · Jiatai Li · Kaiwen Geng · Graham Neubig · Yonatan Bisk · Aniruddha Kembhavi · Luca Weihs
Iterative Vision-and-Language Navigation
Jacob Krantz · Shurjo Banerjee · Wang Zhu · Jason Corso · Peter Anderson · Stefan Lee · Jesse Thomason
Adaptive Zone-aware Hierarchical Planner for Vision-Language Navigation
Chen Gao · Xingyu Peng · Mi Yan · He Wang · Lirong Yang · Haibing Ren · Hongsheng Li · Si Liu
SkyEye: Self-Supervised Bird’s-Eye-View Semantic Mapping Using Monocular Frontal View Images
Nikhil Gosala · Kürsat Petek · Paulo Drews-Jr · Wolfram Burgard · Abhinav Valada
Natural Language-Assisted Sign Language Recognition
Ronglai Zuo · Fangyun Wei · Brian Mak
Learning to Predict Situation Hyper-Graphs for Video Question Answering
Aisha Urooj · Hilde Kuehne · Bo Wu · Kim Chheu · Walid Bousselham · Chuang Gan · Niels Lobo · Mubarak Shah
Align and Attend: Multimodal Summarization with Dual Contrastive Losses
Bo He · Jun Wang · Jielin Qiu · Trung Bui · Abhinav Shrivastava · Zhaowen Wang
Clover: Towards A Unified Video-Language Alignment and Fusion Model
Jingjia Huang · Yinan Li · Jiashi Feng · Xinglong Wu · Xiaoshuai Sun · Rongrong Ji
Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval
Xudong Lin · Simran Tiwari · Shiyuan Huang · Manling Li · Mike Zheng Shou · Heng Ji · Shih-Fu Chang
PDPP: Projected Diffusion for Procedure Planning in Instructional Videos
Hanlin Wang · Yilu Wu · Sheng Guo · Limin Wang
Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations
Yiwu Zhong · Licheng Yu · Yang Bai · Shangwen Li · Xueting Yan · Yin Li
Text-Visual Prompting for Efficient 2D Temporal Video Grounding
Yimeng Zhang · Xin Chen · Jinghan Jia · Sijia Liu · Ke Ding
Language-Guided Music Recommendation for Video via Prompt Analogies
Daniel McKee · Justin Salamon · Josef Sivic · Bryan Russell
MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering
Difei Gao · Luowei Zhou · Lei Ji · Linchao Zhu · Yi Yang · Mike Zheng Shou
Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization
Chen Ju · Kunhao Zheng · Jinxiang Liu · Peisen Zhao · Ya Zhang · Jianlong Chang · Qi Tian · Yanfeng Wang
Cascade Evidential Learning for Open-world Weakly-supervised Temporal Action Localization
Mengyuan Chen · Junyu Gao · Changsheng Xu
STMixer: A One-Stage Sparse Action Detector
Tao Wu · Mengqi Cao · Ziteng Gao · Gangshan Wu · Limin Wang
The Wisdom of Crowds: Temporal Progressive Attention for Early Action Prediction
Alexandros Stergiou · Dima Damen
A Large-scale Robustness Analysis of Video Action Recognition Models
Madeline Chantry · Naman Biyani · Prudvi Kamtam · Shruti Vyas · Hamid Palangi · Vibhav Vineet · Yogesh Rawat
Learning to Dub Movies via Hierarchical Prosody Models
Gaoxiang Cong · Liang Li · Yuankai Qi · Zheng-Jun Zha · Qi Wu · Wenyu Wang · Bin Jiang · Ming-Hsuan Yang · Qingming Huang
iQuery: Instruments as Queries for Audio-Visual Sound Separation
Jiaben Chen · Renrui Zhang · Dongze Lian · Jiaqi Yang · Ziyao Zeng · Jianbo Shi
Egocentric Auditory Attention Localization in Conversations
Fiona Ryan · Hao Jiang · Abhinav Shukla · James Rehg · Vamsi Krishna Ithapu
Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert
Jiadong Wang · Xinyuan Qian · Malu Zhang · Robby Tan · Haizhou Li
Source-Free Video Domain Adaptation with Spatial-Temporal-Historical Consistency Learning
Kai Li · Deep A Patel · Erik Kruus · Martin Min
Referring Multi-Object Tracking
Dongming Wu · Wencheng Han · Tiancai Wang · Xingping Dong · Xiangyu Zhang · Jianbing Shen
A Generalized Framework for Video Instance Segmentation
Miran Heo · Sukjun Hwang · Jeongseok Hyun · Hanjung Kim · Seoung Wug Oh · Joon-Young Lee · Seon Joo Kim
LSTFE-Net: Long Short-Term Feature Enhancement Network for Video Small Object Detection
Jinsheng Xiao · Yuanxu Wu · Yunhua Chen · Shurui Wang · Zhongyuan Wang · Jiayi Ma
Streaming Video Model
Yucheng Zhao · Chong Luo · Chuanxin Tang · Dongdong Chen · Noel Codella · Zheng-Jun Zha
Video Event Restoration Based on Keyframes for Video Anomaly Detection
Zhiwei Yang · Jing Liu · Zhaoyang Wu · Peng Wu · Xiaotao Liu
Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping
Long Lian · Zhirong Wu · Stella Yu
SeqTrack: Sequence to Sequence Learning for Visual Object Tracking
Xin Chen · Houwen Peng · Dong Wang · Huchuan Lu · Han Hu
VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Limin Wang · Bingkun Huang · Zhiyu Zhao · Zhan Tong · Yinan He · Yi Wang · Yali Wang · Yu Qiao
Iterative Next Boundary Detection for Instance Segmentation of Tree Rings in Microscopy Images of Shrub Cross Sections
Alexander Gillert · Giulia Resente · Alba Anadon-Rosell · Martin Wilmking · Uwe Freiherr von Lukas
Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention
Mingyu Ding · Yikang Shen · Lijie Fan · Zhenfang Chen · Zitian Chen · Ping Luo · Joshua Tenenbaum · Chuang Gan
SimpSON: Simplifying Photo Cleanup with Single-Click Distracting Object Segmentation Network
Chuong Huynh · Yuqian Zhou · Zhe Lin · Connelly Barnes · Eli Shechtman · Sohrab Amirghodsi · Abhinav Shrivastava
AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders
Wele Bandara Bandara · Naman Patel · Ali Gholami · Mehdi Nikkhah · Motilal Agrawal · Vishal Patel
FlexiViT: One Model for All Patch Sizes
Lucas Beyer · Pavel Izmailov · Alexander Kolesnikov · Mathilde Caron · Simon Kornblith · Xiaohua Zhai · Matthias Minderer · Michael Tschannen · Ibrahim Alabdulmohsin · Filip Pavetic
Improving Visual Representation Learning through Perceptual Understanding
Samyakh Tukra · Fred Hoffman · Ken Chatfield
Revealing the Dark Secrets of Masked Image Modeling
Zhenda Xie · Zigang Geng · Jingcheng Hu · Zheng Zhang · Han Hu · Yue Cao
Non-Contrastive Unsupervised Learning of Physiological Signals from Video
Jeremy Speth · Nathan Vance · Patrick Flynn · Adam Czajka
High-resolution image reconstruction with latent diffusion models from human brain activity
Yu Takagi · Shinji Nishimoto
RIFormer: Keep Your Vision Backbone Effective But Removing Token Mixer
Jiahao Wang · Songyang Zhang · Yong Liu · Taiqiang Wu · Yujiu Yang · Xihui Liu · Kai Chen · Ping Luo · Dahua Lin
Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference
Haoran You · Yunyang Xiong · Xiaoliang Dai · Peizhao Zhang · Bichen Wu · Haoqi Fan · Peter Vajda · Yingyan Lin
EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention
Xinyu Liu · Houwen Peng · Ningxin Zheng · Yuqing Yang · Han Hu · Yixuan Yuan
InternImage: Exploring Large-Scale Vision Fundamental Models with Deformable Convolutions
Wenhai Wang · Jifeng Dai · Zhe Chen · Zhenhang Huang · Zhiqi Li · Xizhou Zhu · Xiaowei Hu · Tong Lu · Lewei Lu · Hongsheng Li · Xiaogang Wang · Yu Qiao
Memory-friendly Scalable Super-resolution via Rewinding Lottery Ticket Hypothesis
Jin Lin · Xiaotong Luo · ming Hong · Yanyun Qu · Yuan Xie · Zongze Wu
Learned Image Compression with Mixed Transformer-CNN Architectures
Jinming Liu · Heming Sun · Jiro Katto
NIRVANA: Neural Implicit Representations of Videos with Adaptive Networks and Autoregressive Patch-wise Modeling
Shishira Maiya · Sharath Girish · Max Ehrlich · Hanyu Wang · Kwot Sin Lee · Patrick Poirson · Pengxiang Wu · Chen Wang · Abhinav Shrivastava
Complexity-guided Slimmable Decoder for Efficient Deep Video Compression
Zhihao Hu · Dong Xu
Context-Based Trit-Plane Coding for Progressive Image Compression
Seungmin Jeon · KWANG PYO CHOI · YOUNGO PARK · Chang-Su Kim
End-to-end Video Matting with Trimap Propagation
Wei-Lun Huang · Ming-Sui Lee
Rethinking Image Super Resolution from Long-Tailed Distribution Learning Perspective
Yuanbiao Gou · Peng Hu · Jiancheng Lv · Hongyuan Zhu · Xi Peng
Shape-aware Text-driven Layered Video Editing
Yao-Chih Lee · Ji-Ze Jang · Yi-Ting Chen · Elizabeth Qiu · Jia-Bin Huang
Dimensionality-Varying Diffusion Process
Han Zhang · Ruili Feng · Zhantao Yang · Lianghua Huang · Yu Liu · Yifei Zhang · Yujun Shen · Deli Zhao · Jingren Zhou · Fan Cheng
On Distillation of Guided Diffusion Models
Chenlin Meng · Robin Rombach · Ruiqi Gao · Diederik Kingma · Stefano Ermon · Jonathan Ho · Tim Salimans
Towards Flexible Multi-modal Document Models
Naoto Inoue · Kotaro Kikuchi · Edgar Simo-Serra · Mayu Otani · Kota Yamaguchi
Toward verifiable and reproducible human evaluation for text-to-image generation
Mayu Otani · Riku Togashi · Yu Sawai · Ryosuke Ishigami · Yuta Nakashima · Esa Rahtu · Janne Heikkila · Shin’ichi Satoh
Specialist Diffusion: Plug-and-Play Sample-Efficient Fine-Tuning of Text-to-Image Diffusion Models to Learn Any Unseen Style
Haoming Lu · Hazarapet Tunanyan · Kai Wang · Shant Navasardyan · Zhangyang Wang · Humphrey Shi
Freestyle Layout-to-Image Synthesis
Han Xue · Zhiwu Huang · Qianru Sun · Li Song · Wenjun Zhang
ReCo: Region-Controlled Text-to-Image Generation
Zhengyuan Yang · Jianfeng Wang · Zhe Gan · Linjie Li · Kevin Lin · Chenfei Wu · Nan Duan · Zicheng Liu · Ce Liu · Michael Zeng · Lijuan Wang
Conditional Text Image Generation with Diffusion Models
Yuanzhi Zhu · Zhaohai Li · Tianwei Wang · Mengchao He · Cong Yao
Fix the Noise: Disentangling Source Feature for Controllable Domain Translation
Dongyeun Lee · Jae Young Lee · Doyeon Kim · Jaehyun Choi · Jaejun Yoo · Junmo Kim
GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
Ming Tao · Bing-Kun BAO · Hao Tang · Changsheng Xu
DATID-3D: Diversity-Preserved Domain Adaptation Using Text-to-Image Diffusion for 3D Generative Model
Gwanghyun Kim · Se Young Chun
NÜWA-LIP: Language-guided Image Inpainting with Defect-free VQGAN
Minheng Ni · Xiaoming Li · Wangmeng Zuo
Neural Preset for Color Style Transfer
Zhanghan Ke · Yuhao LIU · Lei Zhu · Nanxuan Zhao · Rynson Lau
Restoration of Hand-Drawn Architectural Drawings using Latent Space Mapping with Degradation Generator
Nakkwan Choi · Seungjae Lee · Yongsik Lee · Seungjoon Yang
Neural Fourier Filter Bank
Zhijie Wu · Yuhe Jin · Kwang Moo Yi
PyramidFlow: High-Resolution Defect Contrastive Localization using Pyramid Normalizing Flow
Jiarui Lei · Xiaobo Hu · Yue Wang · Dong Liu
PHA: Patch-wise High-frequency Augmentation for Transformer-based Person Re-identification
Guiwei Zhang · Yongfei Zhang · Tianyu Zhang · Bo Li · Shiliang Pu
Comprehensive and Delicate: An Efficient Transformer for Image Restoration
Haiyu Zhao · Yuanbiao Gou · Boyun Li · Dezhong Peng · Jiancheng Lv · Xi Peng
Ultrahigh Resolution Image/Video Matting with Spatio-Temporal Sparsity
Yanan SUN · Chi-Keung Tang · Yu-Wing Tai
Equivalent Transformation and Dual Stream Network Construction for Mobile Image Super-Resolution
Jiahao Chao · Zhou Zhou · Hongfan Gao · Jiali Gong · Zhengfeng Yang · Zhenbing Zeng · Lydia Dehbi
Real-time 6K Image Rescaling with Rate-distortion Optimization
Chenyang Qi · XIN YANG · Ka Leong Cheng · Ying-Cong Chen · Qifeng Chen
Human Guided Ground-truth Generation for Realistic Image Super-resolution
Du Chen · Jie Liang · Xindong Zhang · Ming Liu · Hui Zeng · Lei Zhang
Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
Weixia Zhang · Guangtao Zhai · Ying Wei · Xiaokang Yang · Kede Ma
Visual Recognition-Driven Image Restoration for Multiple Degradation with Intrinsic Semantics Recovery
Zizheng Yang · Jie Huang · Jiahao Chang · man zhou · Hu Yu · Jinghao Zhang · Feng Zhao
ShadowDiffusion: When Degradation Prior Meets Diffusion Model for Shadow Removal
Lanqing Guo · Chong Wang · Wenhan Yang · Siyu Huang · Yufei Wang · Hanspeter Pfister · Bihan Wen
Probability-based Global Cross-modal Upsampling for Pan-sharpening
Zeyu Zhu · Xiangyong Cao · man zhou · Junhao Huang · Deyu Meng
Real-time Controllable Denoising for Image and Video
Zhaoyang Zhang · Yitong Jiang · Wenqi Shao · Xiaogang Wang · Ping Luo · Kaimo Lin · Jinwei Gu
Zero-Shot Noise2Noise: Efficient Image Denoising without any Data
Youssef Mansour · Reinhard Heckel
Rawgment: Noise-Accounted RAW Augmentation Enables Recognition in a Wide Variety of Environments
Masakazu Yoshimura · Junji Otsuka · Atsushi Irie · Takeshi Ohashi
Structure Aggregation for Cross-Spectral Stereo Image Guided Denoising
Zehua Sheng · Zhu Yu · Xiongwei Liu · Siyuan Cao · Yuqi Liu · Hui-liang Shen · Huaqi Zhang
Self-supervised Blind Motion Deblurring with Deep Expectation Maximization
Ji Li · Weixi Wang · YUESONG NAN · Hui Ji
Joint HDR Denoising and Fusion: A Real-World Mobile HDR Image Dataset
Shuaizheng Liu · Xindong Zhang · Lingchen Sun · Zhetong Liang · Hui Zeng · Lei Zhang
MetaFusion: Infrared and Visible Image Fusion via Meta-Feature Embedding from Object Detection
Wenda Zhao · Shigeng Xie · Fan Zhao · You He · Huchuan Lu
FeatER: An Efficient Network for Human Reconstruction via Feature Map-Based TransformER
Ce Zheng · Matias Mendieta · Taojiannan Yang · Guo-Jun Qi · Chen Chen
Joint Video Multi-Frame Interpolation and Deblurring under Unknown Exposure Time
Wei Shang · Dongwei Ren · yi yang · Hongzhi Zhang · Kede Ma · Wangmeng Zuo
Learning Event Guided High Dynamic Range Video Reconstruction
Yixin Yang · Jin Han · Jinxiu Liang · Zhihang Zhong · Boxin Shi
Multi Domain Learning for Motion Magnification
JASDEEP SINGH · Subrahmanyam Murala · G Sankara Kosuru
EvShutter: Transforming Events for Unconstrained Rolling Shutter Correction
Julius Erbach · Stepan Tulyakov · Patricia Vitoria · Alfredo Bochicchio · YUANYOU LI
Continuous Intermediate Token Learning with Implicit Motion Manifold for Keyframe Based Motion Interpolation
Clinton Mo · Kun Hu · Chengjiang Long · Zhiyong Wang
Recurrent Vision Transformers for Object Detection with Event Cameras
Mathias Gehrig · Davide Scaramuzza
MoDi: Unconditional Motion Synthesis from Diverse Data
Sigal Raab · Inbal Leibovitch · Peizhuo Li · Kfir Aberman · Olga Sorkine-Hornung · Daniel Cohen-Or
Skinned Motion Retargeting with Residual Perception of Motion Semantics & Geometry
Jiaxu Zhang · Junwu Weng · Di Kang · Fang Zhao · Shaoli Huang · Xuefei Zhe · Linchao Bao · Ying Shan · Jue Wang · Zhigang Tu
Real-time Multi-person Eyeblink Detection in the Wild for Untrimmed Video
Wenzheng Zeng · Yang Xiao · Sicheng Wei · Jinfang Gan · Xintao Zhang · Zhiguo Cao · Zhiwen Fang · Joey Zhou
SelfME: Self-Supervised Motion Learning for Micro-Expression Recognition
Xinqi Fan · Xueli CHEN · Mingjie Jiang · Ali Shahid · Hong Yan
An In-depth Exploration of Person Re-identification and Gait Recognition in Cloth-Changing Conditions
Weijia Li · Saihui Hou · Chunjie Zhang · Chunshui Cao · Xu Liu · Yongzhen Huang · Yao Zhao
Simple Cues Lead to a Strong Multi-Object Tracker
Jenny Seidenschwarz · Guillem Braso · Víctor Castro Serrano · Ismail Elezi · Laura Leal-Taixé
Tracking through Containers and Occluders in the Wild
Basile Van Hoorick · Pavel Tokmakov · Simon Stent · Jie Li · Carl Vondrick
Indiscernible Object Counting in Underwater Scenes
Guolei Sun · Zhaochong An · Yun Liu · Ce Liu · Christos Sakaridis · Deng-Ping Fan · Luc Van Gool
Affordances from Human Videos as a Versatile Representation for Robotics
Shikhar Bahl · Russell Mendonca · Lili Chen · Unnat Jain · Deepak Pathak
Galactic: Scaling End-to-End Reinforcement Learning for Rearrangement at 100k Steps-Per-Second
Vincent-Pierre Berges · Andrew Szot · Devendra Singh Chaplot · Aaron Gokaslan · Roozbeh Mottaghi · Dhruv Batra · Eric Undersander
Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion
Davis Rempe · Zhengyi Luo · Xue Bin Peng · Ye Yuan · Kris Kitani · Karsten Kreis · Sanja Fidler · Or Litany
FJMP: Factorized Joint Multi-Agent Motion Prediction over Learned Directed Acyclic Interaction Graphs
Luke Rowe · Martin Ethier · Eli-Henry Dykhne · Krzysztof Czarnecki
Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction
Shaofei Cai · Zihao Wang · Xiaojian Ma · Anji Liu · Yitao Liang
ReasonNet: End-to-End Driving with Temporal and Global Reasoning
Hao Shao · Letian Wang · Ruobing Chen · Steven Waslander · Hongsheng Li · Yu Liu
V2V4Real: A large-scale real-world dataset for Vehicle-to-Vehicle Cooperative Perception
Runsheng Xu · Xin Xia · JINLONG LI · Hanzhao Li · Shuo Zhang · Zhengzhong Tu · Zonglin Meng · Hao Xiang · Xiaoyu Dong · Rui Song · Hongkai Yu · Bolei Zhou · Jiaqi Ma
Bayesian posterior approximation with stochastic ensembles
Oleksandr Balabanov · Bernhard Mehlig · Hampus Linander
DistractFlow: Improving Optical Flow Estimation via Realistic Distractions and Pseudo-Labeling
Jisoo Jeong · Hong Cai · Risheek Garrepalli · Fatih Porikli
Sliced optimal partial transport
Yikun Bai · Bernhard Schmitzer · Matthew Thorpe · Soheil Kolouri
Unsupervised Deep Asymmetric Stereo Matching with Spatially-Adaptive Self-Similarity
Taeyong Song · Sunok Kim · Kwanghoon Sohn
Similarity Metric Learning For RGB-Infrared Group Re-Identification
Jianghao Xiong · Jianhuang Lai
Generalizable Local Feature Pre-training for Deformable Shape Analysis
SOUHAIB ATTAIKI · Lei Li · Maks Ovsjanikov
Quantum Multi-Model Fitting
Matteo Farina · Luca Magri · Willi Menapace · Elisa Ricci · Vladislav Golyanik · Federica Arrigoni
Bridging Search Region Interaction with Template for RGB-T Tracking
Tianrui Hui · Zizheng Xun · Fengguang Peng · Junshi Huang · Xiaoming Wei · Xiaolin Wei · Jiao Dai · Jizhong Han · Si Liu
Local Connectivity-Based Density Estimation for Face Clustering
Junho Shin · Hyo-Jun Lee · Hyunseop Kim · Jong-Hyeon Baek · Daehyun Kim · Yeong Jun Koh
Unsupervised Deep Probabilistic Approach for Partial Point Cloud Registration
Guofeng Mei · Hao Tang · Xiaoshui Huang · Weijie Wang · Juan Liu · Jian Zhang · Luc Van Gool · Qiang Wu
NerVE: Neural Volumetric Edges for Parametric Curve Extraction from Point Cloud
Xiangyu Zhu · Dong Du · Weikai Chen · Zhiyou Zhao · Yinyu Nie · Xiaoguang Han
SHS-Net: Learning Signed Hyper Surfaces for Oriented Normal Estimation of Point Clouds
Qing Li · Huifang Feng · Kanle Shi · Yue Gao · Yi Fang · Yushen Liu · Zhizhong Han
AnchorFormer: Point Cloud Completion from Discriminative Nodes
ZHIKAI CHEN · Fuchen Long · Zhaofan Qiu · Ting Yao · Wengang Zhou · Jiebo Luo · Tao Mei
GeoMAE: Masked Geometric Target Prediction for Self-supervised Point Cloud Pre-Training
Xiaoyu Tian · Haoxi Ran · Yue Wang · Hang Zhao
Symmetric Shape-Preserving Autoencoder for Unsupervised Real Scene Point Cloud Completion
Changfeng Ma · Yinuo Chen · Pengxiao Guo · Jie Guo · Chongjun Wang · Yanwen Guo
ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution
Tuan Ngo · Binh-Son Hua · Khoi Nguyen
itKD: Interchange Transfer-based Knowledge Distillation for 3D Object Detection
Hyeon Cho · Junyong Choi · Geonwoo Baek · Wonjun Hwang
DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets
Haiyang Wang · Chen Shi · Shaoshuai Shi · Meng Lei · Sen Wang · Di He · Bernt Schiele · Liwei Wang
WeatherStream: Light Transport Automation of Single Image Deweathering
Howard Zhang · Yunhao Ba · Ethan Yang · Varan Mehra · Blake Gella · Akira Suzuki · Arnold Pfahnl · Chethan Chinder Chandrappa · Alex Wong · Achuta Kadambi
LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs
Yukang Chen · Jianhui Liu · Xiangyu Zhang · XIAOJUAN QI · Jiaya Jia
PVT-SSD: Single-Stage 3D Object Detector with Point-Voxel Transformer
Honghui Yang · Wenxiao Wang · Minghao Chen · Binbin Lin · Tong He · Hua Chen · Xiaofei He · Wanli Ouyang
Unsupervised Intrinsic Image Decomposition with LiDAR Intensity
Shogo Sato · Yasuhiro Yao · Taiga Yoshida · Takuhiro Kaneko · Shingo Ando · Jun Shimamura
ALSO: Automotive Lidar Self-supervision by Occupancy estimation
Alexandre Boulch · Corentin Sautier · Björn Michele · Gilles Puy · Renaud Marlet
MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training
Runsen Xu · Tai Wang · Wenwei Zhang · Runjian Chen · Jinkun Cao · Jiangmiao Pang · Dahua Lin
Adaptive Sparse Convolutional Networks with Global Context Enhancement for Faster Object Detection on Drone Images
bowei du · Yecheng Huang · JX Chen · Di Huang
Center Focusing Network for Real-Time LiDAR Panoptic Segmentation
Xiaoyan Li · Gang Zhang · Boyue Wang · Yongli Hu · Baocai Yin
Learning and Aggregating Lane Graphs for Urban Automated Driving
Martin Büchner · Jannik Zürn · Ion-George Todoran · Abhinav Valada · Wolfram Burgard
LiDAR-in-the-loop Hyperparameter Optimization
Félix Antoine Goudreault · Dominik Scheuble · Mario Bijelic · Nicolas Robidoux · Felix Heide
Bi-directional LiDAR-Radar Fusion for 3D Dynamic Object Detection
Yingjie Wang · Jiajun Deng · Yao Li · Jinshui Hu · Cong Liu · Yu Zhang · Jianmin Ji · Wanli Ouyang · Yanyong Zhang
Toward RAW Object Detection: A New Benchmark and A New Model
Ruikang Xu · Chang Chen · Jingyang Peng · Cheng Li · Yibin Huang · Fenglong Song · Youliang Yan · Zhiwei Xiong
Resource-Efficient RGBD Aerial Tracking
Jinyu Yang · Shang Gao · Zhe Li · Feng Zheng · Ales Leonardis
Learned Two-Plane Perspective Prior based Image Resampling for Efficient Object Detection
Anurag Ghosh · Dinesh Reddy Narapureddy · Christoph Mertz · Srinivasa Narasimhan
Phase-Shifting Coder: Predicting Accurate Orientation in Oriented Object Detection
Yi Yu · Feipeng Da
PaCa-ViT: Learning Patch-to-Cluster Attention in Vision Transformers
Ryan Grainger · Thomas Paniagua · Xi Song · Naresh Cuntoor · MUN WAI LEE · Tianfu Wu
Global Vision Transformer Pruning with Hessian-Aware Saliency
Huanrui Yang · Hongxu Yin · Maying Shen · Pavlo Molchanov · Hai Li · Jan Kautz
Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation
Ning Zhang · Francesco Nex · George Vosselman · Norman Kerle
CompletionFormer: Depth Completion with Convolutions and Vision Transformers
Youmin Zhang · Xianda Guo · Matteo Poggi · Zheng Zhu · Guan Huang · Stefano Mattoccia
TINC: Tree-structured Implicit Neural Compression
Runzhao Yang
WIRE: Wavelet Implicit Neural Representations
Vishwanath Saragadam · Daniel LeJeune · Jasper Tan · Guha Balakrishnan · Ashok Veeraraghavan · Richard Baraniuk
Video Compression with Entropy-Constrained Neural Representations
Carlos Gomes · Roberto Azevedo · Christopher Schroers
MMVC: Learned Multi-Mode Video Compression with Block-based Prediction Mode Selection and Density-Adaptive Entropy Coding
Bowen Liu · Yu Chen · Rakesh Chowdary Machineni · Shiyu Liu · Hun-Seok Kim
EfficientSCI: Densely Connected Network with Space-time Factorization for Large-scale Video Snapshot Compressive Imaging
lishun wang · Miao Cao · Xin Yuan
Regularized Vector Quantization for Tokenized Image Synthesis
Jiahui Zhang · Fangneng Zhan · Christian Theobalt · Shijian Lu
Video Probabilistic Diffusion Models in Projected Latent Space
Sihyun Yu · Kihyuk Sohn · Subin Kim · Jinwoo Shin
Conditional Image-to-Video Generation with Latent Flow Diffusion Models
Haomiao Ni · Changhao Shi · Kai Li · Sharon Huang · Martin Min
Class-Balancing Diffusion Models
Yiming QIN · Huangjie Zheng · Jiangchao Yao · Mingyuan Zhou · Ya Zhang
HOLODIFFUSION: Training a 3D Diffusion Model using 2D Images
Animesh Karnewar · Andrea Vedaldi · David Novotny · Niloy Mitra
Self-Guided Diffusion Models
Tao Hu · David Zhang · Yuki Asano · Gertjan Burghouts · Cees Snoek
LayoutFormer++: Conditional Graphic Layout Generation via Constraint Serialization and Decoding Space Restriction
Zhaoyun Jiang · Jiaqi Guo · Shizhao Sun · Huayu Deng · Zhongkai Wu · Vuksan Mijovic · Zijiang Yang · Jian-Guang Lou · Dongmei Zhang
InstructPix2Pix: Learning to Follow Image Editing Instructions
Tim Brooks · Aleksander Holynski · Alexei A. Efros
Paint by Example: Exemplar-based Image Editing with Diffusion Models
Binxin Yang · Shuyang Gu · Bo Zhang · Ting Zhang · Xuejin Chen · Xiaoyan Sun · Dong Chen · Fang Wen
SpaText: Spatio-Textual Representation for Controllable Image Generation
Omri Avrahami · Thomas Hayes · Oran Gafni · Sonal Gupta · Yaniv Taigman · Devi Parikh · Dani Lischinski · Ohad Fried · Xi Yin
Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting
Su Wang · Chitwan Saharia · Ceslee Montgomery · Jordi Pont-Tuset · Shai Noy · Stefano Pellegrini · Yasumasa Onoe · Sarah Laszlo · David Fleet · Radu Soricut · Jason Baldridge · Mohammad Norouzi · Peter Anderson · William Chan
LayoutDM: Transformer-based Diffusion Model for Layout Generation
Shang Chai · Liansheng Zhuang · Fengying Yan
CLIP-Sculptor: Zero-Shot Generation of High-Fidelity and Diverse Shapes from Natural Language
Aditya Sanghi · Rao Fu · Vivian Liu · Karl Willis · Hooman Shayani · Amir Khasahmadi · Srinath Sridhar · Daniel Ritchie
Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot Artistic Style Transfer
Hao Tang · Songhua Liu · Tianwei Lin · Shaoli Huang · Fu Li · Dongliang He · Xinchao Wang
DeepVecFont-v2: Exploiting Transformers to Synthesize Vector Fonts with Higher Quality
Yuqing Wang · Yizhi Wang · Longhui Yu · Yuesheng Zhu · Zhouhui Lian
ObjectStitch: Object Compositing with Diffusion Model
Yizhi Song · Zhifei Zhang · Zhe Lin · Scott Cohen · Brian Price · Jianming Zhang · Soo Ye Kim · Daniel Aliaga
CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer
Linfeng Wen · Chengying Gao · Changqing Zou
LEMaRT: Label-Efficient Masked Region Transform for Image Harmonization
Sheng Liu · Cong Phuoc Huynh · Cong Chen · Maxim Arap · Raffay Hamid
Efficient and Explicit Modelling of Image Hierarchies for Image Restoration
Yawei Li · Yuchen Fan · Xiaoyu Xiang · Denis Demandolx · Rakesh Ranjan · Radu Timofte · Luc Van Gool
GamutMLP: A Lightweight MLP for Color Loss Recovery
Hoang Le · Brian Price · Scott Cohen · Michael Brown
Cascaded Local Implicit Transformer for Arbitrary-Scale Super-Resolution
Hao-Wei Chen · Yu-Syuan Xu · Min-Fong Hong · Yi-Min Tsai · Hsien-Kai Kuo · Chun-Yi Lee
Super-Resolution Neural Operator
Min Wei · Xuesong Zhang
Guided Depth Super-Resolution by Deep Anisotropic Diffusion
Nando Metzger · Rodrigo Daudt · Konrad Schindler
AutoFocusFormer: Image Segmentation off the Grid
Ziwen Chen · Kaushik Patnaik · Shuangfei Zhai · Alvin Wan · Zhile Ren · Alexander Schwing · R Colburn · Li Fuxin
AccelIR: Task-aware Image Compression for Accelerating Neural Restoration
Juncheol Ye · Hyunho Yeo · Jinwoo Park · Dongsu Han
Raw Image Reconstruction with Learned Compact Metadata
Yufei Wang · Yi Yu · Wenhan Yang · Lanqing Guo · Lap-Pui Chau · Alex Kot · Bihan Wen
Context-aware Pretraining for Efficient Blind Image Decomposition
Chao Wang · Zhedong Zheng · Ruijie Quan · Yifan Sun · Yi Yang
Deep Random Projector: Accelerated Deep Image Prior
Taihui Li · Hengkang Wang · Zhong Zhuang · Ju Sun
Spectral Bayesian Uncertainty for Image Super-resolution
Tao Liu · Jun Cheng · Shan Tan
Contrastive Semi-supervised Learning for Underwater Image Restoration via Reliable Bank
Shirui Huang · Keyan Wang · Huan Liu · Jun Chen · Yunsong Li
You Do Not Need Additional Priors or Regularizers in Retinex-based Low-light Image Enhancement
Huiyuan Fu · Wenkai Zheng · Xiangyu Meng · Xin Wang · Chuanming Wang · Huadong Ma
Decoupling-and-Aggregating for Image Exposure Correction
Yang Wang · Long Peng · Liang Li · Yang Cao · Zheng-Jun Zha
Self-supervised Non-uniform Kernel Estimation with Flow-based Motion Prior for Blind Image Deblurring
Zhenxuan Fang · Fangfang Wu · Weisheng Dong · Xin Li · Jinjian Wu · Guangming Shi
Neural Texture Synthesis with Guided Correspondence
Yang Zhou · Kaijian Chen · rongjun xiao · Hui Huang
GradICON: Approximate Diffeomorphisms via Gradient Inverse Consistency
Lin Tian · Thomas Greer · François-Xavier Vialard · Roland Kwitt · Raul San Jose Estepar · Richard Rushmore · Nikolaos Makris · Sylvain Bouix · Marc Niethammer
TransFlow: Transformer as Flow Learner
Yawen Lu · Qifan Wang · Siqi Ma · Tong Geng · Yingjie Victor Chen · Huaijin Chen · Dongfang Liu
Video Dehazing via a Multi-Range Temporal Alignment Network with Physical Prior
Jiaqi Xu · Xiaowei Hu · Lei Zhu · DOU QI · Jifeng Dai · Yu Qiao · Pheng-Ann Heng
Event-Based Frame Interpolation with Ad-hoc Deblurring
Lei Sun · Christos Sakaridis · Jingyun Liang · Peng Sun · Jiezhang Cao · Kai Zhang · Qi Jiang · Kaiwei Wang · Luc Van Gool
Event-based Video Frame Interpolation with Cross-Modal Asymmetric Bidirectional Motion Fields
Taewoo Kim · Yujeong Chae · Hyun-Kurl Jang · Kuk-Jin YOON
"Seeing" Electric Network Frequency from Events
Lexuan Xu · Guang Hua · Haijian Zhang · Lei Yu · Ning Qiao
Executing your Commands via Motion Diffusion in Latent Space
Xin Chen · Biao Jiang · Wen Liu · Zilong Huang · BIN FU · Tao Chen · Gang Yu
Event-guided Person Re-Identification via Sparse-Dense Complementary Learning
Chengzhi Cao · Xueyang Fu · Hongjian Liu · Yukun Huang · Kunyu Wang · Jiebo Luo · Zheng-Jun Zha
Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis
Duomin Wang · Yu Deng · Zixin Yin · Heung-Yeung Shum · Baoyuan Wang
One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field
Weichuang Li · Longhao Zhang · Dong Wang · Bin Zhao · Zhigang Wang · Mulin Chen · Bang Zhang · Zhongjian Wang · Liefeng Bo · Xuelong Li
Rethinking the Learning Paradigm for Dynamic Facial Expression Recognition
Hanyang Wang · Bo Li · Shuang Wu · Siyuan Shen · Feng Liu · Shouhong Ding · Aimin Zhou
Multi-modal Gait Recognition via Effective Spatial-Temporal Feature Fusion
Yufeng Cui · Yimei Kang
MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking
Zheng Qin · Sanping Zhou · Le Wang · Jinghai Duan · Gang Hua · Wei Tang
Standing Between Past and Future: Spatio-Temporal Modeling for Multi-Camera 3D Multi-Object Tracking
Ziqi Pang · Jie Li · Pavel Tokmakov · Dian Chen · Sergey Zagoruyko · Yu-Xiong Wang
Camouflaged Instance Segmentation via Explicit De-camouflaging
Naisong Luo · Yuwen Pan · Rui Sun · Tianzhu Zhang · Zhiwei Xiong · Feng Wu
NeRF in the Palm of Your Hand: Corrective Robot Augmentation via Novel-View Synthesis
Allan Zhou · Moo J Kim · Lirui Wang · Pete Florence · Chelsea Finn
PIRLNav: Pretraining with Imitation and RL Finetuning for ObjectNav
Ram Ramrakhya · Dhruv Batra · Erik Wijmans · Abhishek Das
AdamsFormer for Spatial Action Localization in the Future
Hyung-gun Chi · Kwonjoon Lee · Nakul Agarwal · Yi Xu · Karthik Ramani · Chiho Choi
Unsupervised Sampling Promoting for Stochastic Human Trajectory Prediction
Guangyi Chen · Zhenhao Chen · Shunxing Fan · Kun Zhang
Query-Centric Trajectory Prediction
Zikang Zhou · Jianping Wang · Yung-Hui Li · Yu-Kai Huang
Planning-oriented Autonomous Driving
yihan hu · Jiazhi Yang · Li Chen · Keyu Li · Chonghao Sima · Xizhou Zhu · Siqi Chai · Senyao Du · Tianwei Lin · Wenhai Wang · Lewei Lu · Xiaosong Jia · Qiang Liu · Jifeng Dai · Yu Qiao · Hongyang Li
UniHCP: A Unified Model for Human-Centric Perceptions
Yuanzheng Ci · Yizhou Wang · Meilin Chen · SHIXIANG TANG · LEI BAI · Feng Zhu · Rui Zhao · Fengwei Yu · Donglian Qi · Wanli Ouyang
You Only Segment Once: Towards Real-Time Panoptic Segmentation
Jie Hu · Linyan Huang · Tianhe Ren · shengchuan zhang · Rongrong Ji · Liujuan Cao
On the Convergence of IRLS and Its Variants in Outlier-Robust Estimation
Liangzu Peng · Christian Kümmerle · Rene Vidal
Learning Adaptive Dense Event Stereo from the Image Domain
Hoonhee Cho · Jegyeong Cho · Kuk-Jin YOON
Correspondence Transformers with Asymmetric Feature Learning and Matching Flow Super-Resolution
Yixuan Sun · Dongyang Zhao · Zhangyue Yin · Yiwen Huang · Tao Gui · Wenqiang Zhang · Weifeng Ge
DKM: Dense Kernelized Feature Matching for Geometry Estimation
Johan Edstedt · Ioannis Athanasiadis · Mårten Wadenbäck · Michael Felsberg
3D Registration with Maximal Cliques
Xiyu Zhang · Jiaqi Yang · Shikun Zhang · Yanning Zhang
Self-Supervised Learning for Multimodal Non-Rigid 3D Shape Matching
Dongliang Cao · Florian Bernard
Towards Better Gradient Consistency for Neural Signed Distance Functions via Level Set Alignment
Baorui Ma · Junsheng Zhou · Yushen Liu · Zhizhong Han
Unsupervised Inference of Signed Distance Functions from Single Sparse Point Clouds without Learning Priors
Chao Chen · Yushen Liu · Zhizhong Han
PEAL: Prior-embedded Explicit Attention Learning for low-overlap Point Cloud Registration
Junle Yu · Luwei Ren · Yu Zhang · Wenhui Zhou · Lili Lin · Guojun Dai
PointListNet: Deep Learning on 3D Point Lists
Hehe Fan · Linchao Zhu · Yi Yang · Mohan Kankanhalli
Meta Architecture for Point Cloud Analysis
Haojia Lin · Xiawu Zheng · lijiang Li · Fei Chao · Shanshan Wang · Yan Wang · Yonghong Tian · Rongrong Ji
Learnable Skeleton-Aware 3D Point Cloud Sampling
Cheng Wen · Baosheng Yu · Dacheng Tao
Complete-to-Partial 4D Distillation for Self-Supervised Point Cloud Sequence Representation Learning
Zhuoyang Zhang · Yuhao Dong · Yunze Liu · Li Yi
ViewNet: A Novel Projection-Based Backbone with View Pooling for Few-shot Point Cloud Classification
Jiajing Chen · Minmin Yang · Senem Velipasalar
SCPNet: Semantic Scene Completion on Point Cloud
Zhaoyang Xia · Youquan Liu · Xin Li · Xinge ZHU · Yuexin Ma · Yikang LI · Yuenan Hou · Yu Qiao
SCoDA: Domain Adaptive Shape Completion for Real Scans
Yushuang Wu · Zizheng Yan · Ce Chen · Lai Wei · Xiao Li · Guanbin Li · Yihao Li · Shuguang Cui · Xiaoguang Han
GrowSP: Unsupervised Semantic Segmentation of 3D Point Clouds
zihui zhang · Bo Yang · Bing WANG · Bo Li
MethaneMapper: Spectral Absorption aware Hyperspectral Transformer for Methane Detection
Satish Kumar · Ivan Arevalo · A S M Iftekhar · B.S. Manjunath
Weakly Supervised Class-agnostic Motion Prediction for Autonomous Driving
Ruibo Li · Hanyu Shi · Ziang Fu · Zhe Wang · Guosheng Lin
Single Domain Generalization for LiDAR Semantic Segmentation
Hyeonseong Kim · Yoonsu Kang · Changgyoon Oh · Kuk-Jin YOON
PeakConv: Learning Peak Receptive Field for Radar Semantic Segmentation
Liwen Zhang · Xinyan Zhang · Youcheng Zhang · Yufei Guo · Yuanpei Chen · Xuhui Huang · Zhe Ma
PillarNeXt: Rethinking Network Designs for 3D Object Detection in LiDAR Point Clouds
Jinyu Li · Chenxu Luo · Xiaodong Yang
Density-Insensitive Unsupervised Domain Adaption on 3D Object Detection
Qianjiang Hu · Daizong Liu · Wei Hu
Spherical Transformer for LiDAR-based 3D Recognition
Xin Lai · Yukang Chen · Fanbin Lu · Jianhui Liu · Jiaya Jia
Neural Map Prior for Autonomous Driving
Xuan Xiong · Yicheng Liu · Tianyuan Yuan · Yue Wang · Yilun Wang · Hang Zhao
LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion
Xin Li · Tao MA · Yuenan Hou · Botian Shi · Yuchen Yang · Youquan Liu · Xingjiao Wu · Qin Chen · Yikang LI · Yu Qiao · Liang He
Pix2map: Cross-modal Retrieval for Inferring Street Maps From Images
Xindi Wu · Kwun Fung Lau · Francesco Ferroni · Aljosa Osep · Deva Ramanan
Azimuth Super-Resolution for FMCW Radar in Autonomous Driving
Yu-Jhe Li · Shawn Hunt · Jinhyung Park · Matthew O’Toole · Kris Kitani
MonoATT: Online Monocular 3D Object Detection with Adaptive Token Transformer
Yunsong Zhou · Hongzi Zhu · Quan Liu · Shan Chang · Minyi Guo
Weakly Supervised Monocular 3D Object Detection using Multi-View Projection and Direction Consistency
Runzhou Tao · Wencheng Han · Zhongying Qiu · Cheng-zhong Xu · Jianbing Shen
Semi-Supervised Stereo-based 3D Object Detection via Cross-View Consensus
Wenhao Wu · Hau-San Wong · Si Wu
BEV-SAN: Accurate BEV 3D Object Detection via Slice Attention Networks
Xiaowei Chi · Jiaming Liu · Ming Lu · Rongyu Zhang · Zhaoqing Wang · Yandong Guo · Shanghang Zhang
Anchor3DLane: Learning to Regress 3D Anchors for Monocular 3D Lane Detection
Shaofei Huang · Zhenwei Shen · Zehao Huang · Zi-han Ding · Jiao Dai · Jizhong Han · Naiyan Wang · Si Liu
Learning Transformations To Reduce the Geometric Shift in Object Detection
Vidit Vidit · Martin Engilberge · Mathieu Salzmann
Look, Radiate, and Learn: Self-Supervised Localisation via Radio-Visual Correspondence
Mo Alloulah · Maximilian Arnold
Non-line-of-sight Imaging with Signal Superresolution Network
Jianyu Wang · Xintong Liu · Leping Xiao · Zuoqiang Shi · Lingyun Qiu · Xing Fu
ESLAM: Efficient Dense SLAM System Based on Hybrid Representation of Signed Distance Fields
Seyed Mohammad Mahdi Johari · Camilla Carta · François Fleuret
OmniCity: Omnipotent City Understanding with Multi-level and Multi-view Images
Weijia Li · Yawen Lai · Linning Xu · Yuanbo Xiangli · Yu Jinhua · Conghui He · Gui-Song Xia · Dahua Lin
Semantic Ray: Learning a Generalizable Semantic Field with Cross-Reprojection Attention
Fangfu Liu · Chubin Zhang · Yu Zheng · Yueqi Duan
Multi-View Stereo Representation Revisit: Region-Aware MVSNet
Yisu Zhang · Jianke Zhu · Lixiang Lin
All-in-focus Imaging from Event Focal Stack
Hanyue Lou · Minggui Teng · Yixin Yang · Boxin Shi
Wide-angle Rectification via Content-aware Conformal Mapping
Qi Zhang · Hongdong Li · Qing Wang
Single Image Depth Prediction Made Better: A Multivariate Gaussian Take
Ce Liu · Suryansh Kumar · Shuhang Gu · Radu Timofte · Luc Van Gool
DeepLSD: Line Segment Detection and Refinement with Deep Image Gradients
Rémi Pautrat · Daniel Barath · Viktor Larsson · Martin Oswald · Marc Pollefeys
VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos
Huiyu Gao · Wei Mao · miaomiao Liu
Perspective Fields for Single Image Camera Calibration
Linyi Jin · Jianming Zhang · Yannick Hold-Geoffroy · Oliver Wang · Kevin Blackburn-Matzen · Matthew Sticha · David Fouhey
RUST: Latent Neural Scene Representations from Unposed Imagery
Mehdi S. M. Sajjadi · Aravindh Mahendran · Thomas Kipf · Etienne Pot · Daniel Duckworth · Mario Lucic · Klaus Greff
Learning Accurate 3D Shape Based on Stereo Polarimetric Imaging
Tianyu Huang · Haoang Li · Kejing He · Congying SUI · Bin Li · Yun-Hui Liu
The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects
Ruohan Gao · Yiming Dou · Hao Li · Tanmay Agarwal · Jeannette Bohg · Yunzhu Li · Li Fei-Fei · Jiajun Wu
Paired-Point Lifting for Enhanced Privacy-Preserving Visual Localization
Chunghwan Lee · Jaihoon Kim · Chanhyuk Yun · Je Hyeong Hong
Learning to Predict Scene-Level Implicit 3D from Posed RGBD Data
Nilesh Kulkarni · Linyi Jin · Justin Johnson · David Fouhey
Long-term Visual Localization with Mobile Sensors
Shen Yan · Yu Liu · Long Wang · Zehong Shen · Zhen Peng · Haomin Liu · Maojun Zhang · Guofeng Zhang · Xiaowei Zhou
Learning the Distribution of Errors in Stereo Matching for Joint Disparity and Uncertainty Estimation
Liyan Chen · Weihan Wang · Philippos Mordohai
Revisiting Rotation Averaging: Uncertainties and Robust Losses
Ganlin Zhang · Viktor Larsson · Daniel Barath
Level-S²fM: Structure from Motion on Neural Level Set of Implicit Surfaces
Yuxi Xiao · Nan Xue · Tianfu Wu · Gui-Song Xia
Linking Garment with Person via Semantically Associated Landmarks for Virtual Try-On
Keyu Yan · Tingwei Gao · Hui Zhang · Chengjun Xie
Cross-domain 3D Hand Pose Estimation with Dual Modalities
Qiuxia Lin · Linlin Yang · Angela Yao
ScarceNet: Animal Pose Estimation with Scarce Annotations
Chen Li · Gim Lee
HS-Pose: Hybrid Scope Feature Extraction for Category-level Object Pose Estimation
Linfang Zheng · Chen Wang · Yinghan Sun · Esha Dasgupta · Hua Chen · Ales Leonardis · Wei Zhang · Hyung Jin Chang
ViPLO: Vision Transformer based Pose-Conditioned Self-Loop Graph for Human-Object Interaction Detection
Jeeseung Park · Jin-Woo Park · Jong-Seok Lee
Ego-Body Pose Estimation via Ego-Head Pose Estimation
Jiaman Li · Karen Liu · Jiajun Wu
Mutual Information-Based Temporal Difference Learning for Human Pose Estimation in Video
Runyang Feng · Yixing Gao · Xueqing Ma · Tze Ho Elden Tse · Hyung Jin Chang
Trajectory-Aware Body Interaction Transformer for Multi-Person Pose Forecasting
Xiaogang Peng · Siyuan Mao · Zizhao Wu
What Happened 3 Seconds Ago? Inferring the Past with Thermal Imaging
Zitian Tang · Wenjie Ye · Wei-Chiu Ma · Hang Zhao
Detecting Human-Object Contact in Images
Yixin Chen · Sai Kumar Dwivedi · Michael Black · Dimitrios Tzionas
In-Hand 3D Object Scanning from an RGB Sequence
Shreyas Hampali · Tomas Hodan · LUAN TRAN · Lingni Ma · Cem Keskin · Vincent Lepetit
Autonomous Manipulation Learning for Similar Deformable Objects via Only One Demonstration
Yu Ren · Ronghan Chen · Yang Cong
What You Can Reconstruct from a Shadow
Ruoshi Liu · Sachit Menon · Chengzhi Mao · Dennis Park · Simon Stent · Carl Vondrick
H2ONet: Hand-Occlusion-and-Orientation-aware Network for Real-time 3D Hand Mesh Reconstruction
Hao Xu · Tianyu Wang · Xiao Tang · Chi-Wing Fu
Learning Human Mesh Recovery in 3D Scenes
Zehong Shen · Zhi Cen · Sida Peng · Qing Shuai · Hujun Bao · Xiaowei Zhou
Bringing Inputs to Shared Domains for 3D Interacting Hands Recovery in the Wild
Gyeongsik Moon
Hi4D: 4D Instance Segmentation of Close Human Interaction
Yifei Yin · Chen Guo · Manuel Kaufmann · Juan Zarate · Jie Song · Otmar Hilliges
Deformable Mesh Transformer for 3D Human Mesh Recovery
Yusuke Yoshiyasu
Reconstructing Animatable 3D Categories from Videos
Gengshan Yang · Chaoyang Wang · Dinesh Reddy Narapureddy · Deva Ramanan
Learning Semantic-Aware Disentangled Representation for 3D Human Body Editing
Xiaokun Sun · Qiao Feng · Xiongzheng Li · Jinsong Zhang · Yu-Kun Lai · Jingyu Yang · Kun Li
Physically Realizable Natural-Looking Clothing Textures Evade Person Detectors via 3D Modeling
Zhanhao Hu · Wenda Chu · Xiaopei Zhu · Hui Zhang · Bo Zhang · Xiaolin Hu
Towards Domain Generalization for Multi-view 3D Object Detection in Bird-Eye-View
Shuo Wang · Xinhai Zhao · Haiming Xu · Zehui Chen · Dameng Yu · Jiahao Chang · Zhen Yang · Feng Zhao
Listening Human Behavior: 3D Human Pose Estimation with Acoustic Signals
Yuto Shibata · Yutaka Kawashima · Mariko Isogawa · Go Irie · Akisato Kimura · Yoshimitsu Aoki
NLOST: Non-Line-of-Sight Imaging with Transformer
Yue Li · Jiayong Peng · Juntian Ye · Yueyi Zhang · Feihu Xu · Zhiwei Xiong
Few-shot Non-line-of-sight Imaging with Signal-surface Collaborative Regularization
Xintong Liu · Jianyu Wang · Leping Xiao · Xing Fu · Lingyun Qiu · Zuoqiang Shi
Co-SLAM: Joint Coordinate and Sparse Parametric Encodings for Neural Real-Time SLAM
Hengyi Wang · Jingwen Wang · Lourdes Agapito
OSRT: Omnidirectional Image Super-Resolution with Distortion-aware Transformer
Fanghua Yu · Xintao Wang · Mingdeng Cao · Gen Li · Ying Shan · Chao Dong
HRDFuse: Monocular 360° Depth Estimation by Collaboratively Learning Holistic-with-Regional Depth Distributions
Hao Ai · Zidong Cao · Yan-Pei Cao · Ying Shan · Lin Wang
K3DN: Disparity-aware Kernel Estimation for Dual-Pixel Defocus Deblurring
Yan Yang · Liyuan Pan · Liu Liu · miaomiao Liu
Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography
Ilya Chugunov · Yuxuan Zhang · Felix Heide
DynamicStereo: Consistent Dynamic Depth from Stereo Videos
Nikita Karaev · Ignacio Rocco · Benjamin Graham · Natalia Neverova · Andrea Vedaldi · Christian Rupprecht
End-to-End Vectorized HD-map Construction with Piecewise Bezier Curve
Limeng Qiao · Wenjie Ding · Xi Qiu · Chi Zhang
Enhanced Stable View Synthesis
Nishant Jain · Suryansh Kumar · Luc Van Gool
Scalable, Detailed and Mask-Free Universal Photometric Stereo
Satoshi Ikehata
PRISE: Demystifying Deep Lucas-Kanade with Strongly Star-Convex Constraints for Multimodel Image Alignment
Yiqing Zhang · Xinming Huang · Ziming Zhang
Visual Localization using Imperfect 3D Models from the Internet
Vojtech Panek · Zuzana Kukelova · Torsten Sattler
HelixSurf: A Robust and Efficient Neural Implicit Surface Learning of Indoor Scenes with Iterative Intertwined Regularization
Zhihao Liang · Zhangjin Huang · Changxing Ding · Kui Jia
Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild
Garrick Brazil · Abhinav Kumar · Julian Straub · Nikhila Ravi · Justin Johnson · Georgia Gkioxari
Objaverse: A Universe of Annotated 3D Objects
Matt Deitke · Dustin Schwenk · Jordi Salvador Marcos · Luca Weihs · Oscar Michel · Eli VanderBilt · Ludwig Schmidt · Kiana Ehsani · Aniruddha Kembhavi · Ali Farhadi
Privacy-Preserving Representations are not Enough: Recovering Scene Content from Camera Poses
Kunal Chelani · Torsten Sattler · Fredrik Kahl · Zuzana Kukelova
Learning a Depth Covariance Function
Eric Dexheimer · Andrew Davison
Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning
Ajinkya Tejankar · Maziar Sanjabi · Qifan Wang · Sinong Wang · Hamed Firooz · Hamed Pirsiavash · Liang Tan
Backdoor Defense via Deconfounded Representation Learning
Zaixi Zhang · Qi Liu · Zhicai Wang · Zepu Lu · Qingyong Hu
Backdoor Cleansing with Unlabeled Data
Lu Pang · Tao Sun · Haibin Ling · Chao Chen
Breaching FedMD: Image Recovery via Paired-Logits Inversion Attack
Hideaki Takahashi · Jingjing Liu · Yang Liu
Elastic Aggregation for Federated Optimization
Chen Dengsheng · Jie Hu · Vince Tan · Xiaoming Wei · Enhua Wu
DynaFed: Tackling Client Data Heterogeneity with Global Dynamics
Renjie PI · WEIZHONG ZHANG · Yueqi Xie · Jiahui Gao · Xiaoyu Wang · Sunghun Kim · Qifeng Chen
How to Prevent the Poor Performance Clients for Personalized Federated Learning?
Zhe Qu · Xingyu Li · Xiao Han · Rui Duan · Chengchao Shen · Lixing Chen
Cloud-Device Collaborative Adaptation to Continual Changing Environments in the Real-world
Yulu Gan · Mingjie Pan · Rongyu Zhang · Zijian Ling · Lingran Zhao · Jiaming Liu · Shanghang Zhang
Diversity-Measurable Anomaly Detection
Wenrui Liu · Hong Chang · Bingpeng Ma · Shiguang Shan · Xilin CHEN
Look Around for Anomalies: Weakly-supervised Anomaly Detection via Context-Motion Relational Learning
MyeongAh Cho · Minjung Kim · Sangwon Hwang · Chaewon Park · Kyungjae Lee · Sangyoun Lee
Semi-supervised Hand Appearance Recovery via Structure Disentanglement and Dual Adversarial Discrimination
Zimeng Zhao · Binghui Zuo · Zhiyu Long · Yangang Wang
Adversarial Normalization: I Can visualize Everything (ICE)
Hoyoung Choi · Seungwan Jin · Kyungsik Han
Learning on Gradients: Generalized Artifacts Representation for GAN-Generated Images Detection
Chuangchuang Tan · Yao Zhao · Shikui Wei · Guanghua Gu · Yunchao Wei
GLeaD: Improving GANs with A Generator-Leading Task
Qingyan Bai · Ceyuan Yang · Yinghao Xu · Xihui Liu · Yujiu Yang · Yujun Shen
Data-Free Sketch-Based Image Retrieval
Abhra Chaudhuri · Ayan Kumar Bhunia · Yi-Zhe Song · Anjan Dutta
OpenMix: Exploring Outlier Samples for Misclassification Detection
Fei Zhu · Zhen Cheng · Xu-yao Zhang · Cheng-lin Liu
Genie: Show Me the Data for Quantization
Yongkweon Jeon · Chungman Lee · Ho-young Kim
How to Prevent the Continuous Damage of Noises to Model training?
Xiaotian Yu · Yang Jiang · Tianqi Shi · Zunlei Feng · Yuexuan Wang · Mingli Song · Li Sun
Gradient-based Uncertainty Attribution for Explainable Bayesian Deep Learning
Hanjing Wang · Dhiraj Joshi · Shiqiang Wang · Qiang Ji
FIANCEE: Faster Inference of Adversarial Networks via Conditional Early Exits
Polina Karpikova · Ekaterina Radionova · Anastasia Yaschenko · Andrei Spiridonov · Leonid Kostyushko · Riccardo Fabbricatore · Aleksei Ivakhnenko
Run, Don’t Walk: Chasing Higher FLOPS for Faster Neural Networks
Jierun Chen · Shiu-hong Kao · Hao He · Weipeng Zhuo · Song Wen · Chul-Ho Lee · S.-H. Chan
FFCV: Accelerating Training by Removing Data Bottlenecks
Guillaume Leclerc · Andrew Ilyas · Logan Engstrom · Sung Min Park · Hadi Salman · Aleksander Madry
Disentangled Representation Learning for Unsupervised Neural Quantization
Haechan Noh · Sangeek Hyun · Woojin Jeong · Hanshin Lim · Jae-Pil Heo
HOTNAS: Hierarchical Optimal Transport for Neural Architecture Search
Jiechao Yang · Yong Liu · Hongteng Xu
Solving relaxations of MAP-MRF problems: Combinatorial in-face Frank-Wolfe directions
Vladimir Kolmogorov
Transformer-Based Learned Optimization
Erik Gärtner · Luke Metz · Misha Andriluka · C. Freeman · Cristian Sminchisescu
Multi-Agent Automated Machine Learning
Zhaozhi Wang · Kefan Su · Jian Zhang · Huizhu Jia · Qixiang Ye · Xiaodong Xie · Zongqing Lu
Accelerating Dataset Distillation via Model Augmentation
Lei Zhang · Jie Zhang · Bowen Lei · Subhabrata Mukherjee · Xiang Pan · Bo Zhao · Caiwen Ding · Yao Li · Dongkuan Xu
PA&DA: Jointly Sampling Path and Data for Consistent NAS
Shun Lu · Yu Hu · Longxing Yang · Zihao Sun · Jilin Mei · Jianchao Tan · Chengru Song
Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning
Sanghwan Kim · Lorenzo Noci · Antonio Orvieto · Thomas Hofmann
EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization
Junha Song · Jungsoo Lee · In So Kweon · Sungha Choi
CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning
James Smith · Leonid Karlinsky · Vyshnavi Gutta · Paola Cascante-Bonilla · Donghyun Kim · Assaf Arbelle · Rameswar Panda · Rogerio Feris · Zsolt Kira
DisWOT: Student Architecture Search for Distillation WithOut Training
Peijie Dong · Lujun Li · Zimian Wei
Real-Time Evaluation in Online Continual Learning: A New Hope
Yasir Ghunaim · Adel Bibi · Kumail Alhamoud · Motasem Alfarra · Hasan Hammoud Hammoud · Ameya Prabhu · Philip Torr · Bernard Ghanem
Dealing with Cross-Task Class Discrimination in Online Continual Learning
Yiduo Guo · Bing Liu · Dongyan Zhao
Class Attention Transfer Based Knowledge Distillation
Ziyao Guo · Haonan Yan · HUI LI · Xiaodong Lin
Dense Network Expansion for Class Incremental Learning
Zhiyuan Hu · Yunsheng Li · Jiancheng Lyu · Dashan Gao · Nuno Vasconcelos
Multi-Mode Online Knowledge Distillation for Self-Supervised Visual Representation Learning
Kaiyou Song · Jin Xie · Shan Zhang · Zimeng Luo
Few-Shot Class-Incremental Learning via Class-Aware Bilateral Distillation
Linglan Zhao · Jing Lu · Yunlu Xu · Zhanzhan Cheng · Dashan Guo · Yi Niu · Xiangzhong Fang
Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners
Zitian Chen · Yikang Shen · Mingyu Ding · Zhenfang Chen · Hengshuang Zhao · Erik Learned-Miller · Chuang Gan
Train-Once-for-All Personalization
Hong-You Chen · YANDONG LI · Yin Cui · Mingda Zhang · Wei-Lun Chao · Li Zhang
Generalizable Implicit Neural Representations with Instance Pattern Composers
Chiheon Kim · Doyup Lee · Saehoon Kim · Minsu Cho · Wook-Shin Han
Deep Frequency Filtering for Domain Generalization
Shiqi Lin · Zhizheng Zhang · Zhipeng Huang · Yan Lu · Cuiling Lan · Peng Chu · Quanzeng You · Jiang Wang · Zicheng Liu · Viraj Navkal · Amey Parulkar · Zhibo Chen
Back to the Source: Diffusion-Driven Adaptation to Test-Time Corruption
Jin Gao · Jialing Zhang · Xihui Liu · Trevor Darrell · Evan Shelhamer · Dequan Wang
Decompose, Adjust, Compose: Effective Normalization by Playing with Frequency for Domain Generalization
Sangrok Lee · Jongseong Bae · Ha Kim Kim
Enhanced Multimodal Representation Learning with Cross-modal KD
mengxi Chen · Linyu XING · Yu Wang · Ya Zhang
Equiangular Basis Vectors
Yang Shen · Xu-Hao Sun · Xiu-Shen Wei
DARE-GRAM: Unsupervised Domain Adaptation Regression by Aligning Inverse Gram Matrices
Ismail Nejjar · Qin Wang · Olga Fink
Towards Better Stability and Adaptability: Improve Online Self-Training for Model Adaptation in Semantic Segmentation
Dong Zhao · Shuang Wang · Qi Zang · Dou Quan · XIUTIAO YE · Licheng Jiao
MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation
Lukas Hoyer · Dengxin Dai · Haoran Wang · Luc Van Gool
Neural Dependencies Emerging from Learning Massive Categories
Ruili Feng · Kecheng Zheng · Kai Zhu · Yujun Shen · Jian Zhao · Yukun Huang · Deli Zhao · Jingren Zhou · Michael Jordan · Zheng-Jun Zha
Co-training 2^L submodels for image recognition
Hugo Touvron · Matthieu CORD · Maxime Oquab · Piotr Bojanowski · Jakob Verbeek · Herve Jegou
On-the-fly Category Discovery
Ruoyi Du · Dongliang Chang · Kongming Liang · Timothy Hospedales · Yi-Zhe Song · Zhanyu Ma
Generative Bias for Robust Visual Question Answering
Jae Won Cho · Dong-Jin Kim · Hyeonggon Ryu · In So Kweon
RMLVQA: A Margin Loss Approach For Visual Question Answering with Language Biases
Abhipsa Basu · Sravanti Addepalli · Venkatesh Babu Radhakrishnan
Twin Contrastive Learning with Noisy Labels
Zhizhong Huang · Junping Zhang · Hongming Shan
Fine-Grained Classification with Noisy Labels
Qi Wei · Lei Feng · Haoliang Sun · Ren Wang · Chenhui Guo · Yilong Yin
ProtoCon: Pseudo-label Refinement via Online Clustering and Prototypical Consistency for Efficient Semi-supervised Learning
Islam Nassar · Munawar Hayat · Ehsan Abbasnejad · Hamid Rezatofighi · Gholamreza Haffari
Zero-shot Model Diagnosis
Jinqi Luo · Zhaoning Wang · Chen Henry Wu · Dong Huang · Fernando de la Torre
Mind the Label Shift of Augmentation-based Graph OOD Generalization
Junchi Yu · Jian Liang · Ran He
RONO: Robust Discriminative Learning with Noisy Labels for 2D-3D Cross-Modal Retrieval
Yanglin Feng · Hongyuan Zhu · Dezhong Peng · Xi Peng · Peng Hu
Deep Incomplete Multi-view Clustering with Cross-view Partial Sample and Prototype Alignment
Jiaqi Jin · Siwei Wang · Zhibin Dong · Xinwang Liu · En Zhu
MetaViewer: Towards A Unified Multi-View Representation
Ren Wang · Haoliang Sun · Yuling Ma · Xiaoming Xi · Yilong Yin
Rethinking Out-of-Distribution Detection: Masked Image Modeling is All You Need
Jingyao Li · Pengguang Chen · Zexin He · Shaozuo Yu · Shu Liu · Jiaya Jia
Towards Trustable Skin Cancer Diagnosis via Rewriting Model’s Decision
Siyuan Yan · zhen yu · Xuelin Zhang · Dwarikanath Mahapatra · Shekhar Chandra · Monika Janda · H. Peter Soyer · Zongyuan Ge
METransformer: Radiology Report Generation by Transformer with Multiple Learnable Expert Tokens
Zhanyu Wang · Lingqiao Liu · Lei Wang · Luping Zhou
Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images
Ramin Nakhli · Puria Azadi Moghadam · Haoyang Mi · Hossein Farahani · Alexander Baras · Blake Gilks · Ali Bashashati
Ambiguous Medical Image Segmentation using Diffusion Models
AIMON RAHMAN · Jeya Maria Jose Valanarasu · Ilker Hacihaliloglu · Vishal Patel
Directional Connectivity-based Segmentation of Medical Images
Ziyun Yang · Sina Farsiu
Bidirectional Copy-Paste for Semi-Supervised Medical Image Segmentation
Yunhao Bai · Duowen Chen · Qingli Li · Wei Shen · Yan Wang
AutoLabel: CLIP-based framework for Open-set Video Domain Adaptation
Giacomo Zara · Subhankar Roy · Paolo Rota · Elisa Ricci
Zero-shot Generative Model Adaptation via Image-specific Prompt Learning
Jiayi Guo · Chaofei Wang · You Wu · Eric Zhang · Kai Wang · Xingqian Xu · Shiji Song · Humphrey Shi · Gao Huang
2PCNet: Two-Phase Consistency Training for Day-to-Night Unsupervised Domain Adaptive Object Detection
Mikhail Kennerley · Jian-Gang Wang · Bharadwaj Veeravalli · Robby Tan
Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection
Muhammad Akhtar Munir · Muhammad Khan Khan · Salman Khan · Fahad Khan
Learning Transformation-Predictive Representations for Detection and Description of Local Features
Zihao Wang · Chunxu Wu · Yifei Yang · Zhen Li
Annealing-based Label-Transfer Learning for Open World Object Detection
Yuqing Ma · Hainan Li · Zhange Zhang · Jinyang Guo · Shanghang Zhang · Ruihao Gong · Xianglong Liu
PROB: Probabilistic Objectness for Open World Object Detection
Orr Zohar · Kuan-Chieh Wang · Serena Yeung
Detecting Everything in the Open World: Towards Universal Object Detection
Zhenyu Wang · Ya-Li Li · Xi Chen · Ser-Nam Lim · Antonio Torralba · Hengshuang Zhao · Shengjin Wang
DETR with Additional Global Aggregation for Cross-domain Weakly Supervised Object Detection
Zongheng Tang · Yifan Sun · Si Liu · Yi Yang
Self-supervised AutoFlow
Hsin-Ping Huang · Charles Herrmann · Junhwa Hur · Erika Lu · Kyle Sargent · Austin Stone · Ming-Hsuan Yang · Deqing Sun
Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding
Lingchen Meng · Xiyang Dai · Yinpeng Chen · Pengchuan Zhang · Dongdong Chen · Mengchen Liu · Jianfeng Wang · Zuxuan Wu · Lu Yuan · Yu-Gang Jiang
Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems
Yangyang Shu · Anton Hengel · Lingqiao Liu
Full or weak annotations? An adaptive strategy for budget-constrained annotation campaigns
Javier Gamazo Tejero · Martin Zinkernagel · Sebastian Wolf · Raphael Sznitman · Pablo Márquez Neila
Class-Incremental Exemplar Compression for Class-Incremental Learning
Zilin Luo · Yaoyao Liu · Bernt Schiele · Qianru Sun
The Devil is in the Points: Weakly Semi-Supervised Instance Segmentation via Point-Guided Mask Representation
Beomyoung Kim · Joonhyun Jeong · Dongyoon Han · Sung Ju Hwang
Augmentation Matters: A Simple-yet-Effective Approach to Semi-supervised Semantic Segmentation
Zhen Zhao · Lihe Yang · Sifan Long · Jimin Pi · Luping Zhou · Jingdong Wang
Weakly Supervised Semantic Segmentation via Adversarial Learning of Classifier and Reconstructor
Hyeokjun Kweon · Sung-Hoon Yoon · Kuk-Jin YOON
Learning Orthogonal Prototypes for Generalized Few-shot Semantic Segmentation
Sun-Ao Liu · Yiheng Zhang · Zhaofan Qiu · Hongtao Xie · Yongdong Zhang · Ting Yao
Beyond mAP: Towards better evaluation of instance segmentation
Rohit Kumar Jena · Lukas Zhornyak · Nehal Doiphode · Pratik Chaudhari · Vivek Buch · James Gee · Jianbo Shi
Dynamic Focus-aware Positional Queries for Semantic Segmentation
Haoyu He · Jianfei Cai · Zizheng Pan · Jing Liu · Jing Zhang · Dacheng Tao · Bohan Zhuang
Focus On Details: Online Multi-object Tracking with Diverse Fine-grained Representation
Hao Ren · Shoudong Han · Huilin Ding · Ziwen Zhang · Hongwei Wang · Faquan Wang
DynaMask: Dynamic Mask Selection for Instance Segmentation
Ruihuang Li · Chenhang HE · Shuai Li · Yabin Zhang · Lei Zhang
A Strong Baseline for Generalized Few-Shot Semantic Segmentation
Seyed Mohammadsina Hajimiri · Malik Boudiaf · Ismail Ayed · Jose Dolz
Compositor: Bottom-up Clustering and Compositing for Robust Part and Object Segmentation
Ju He · Jieneng Chen · Ming-Xian Lin · Qihang Yu · Alan Yuille
Inferring and Leveraging Parts from Object Shape for Improving Semantic Image Synthesis
Yuxiang Wei · Zhilong Ji · Xiaohe Wu · Jinfeng Bai · Lei Zhang · Wangmeng Zuo
Primitive Generation and Semantic-related Alignment for Universal Zero-Shot Segmentation
SHUTING HE · Henghui Ding · Wei Jiang
UniDAformer: Unified Domain Adaptive Panoptic Segmentation Transformer via Hierarchical Mask Calibration
Jingyi Zhang · Jiaxing Huang · Xiaoqin Zhang · Shijian Lu
StructVPR: Distill Structural Knowledge with Weighting Samples for Visual Place Recognition
Yanqing Shen · Sanping Zhou · Jingwen Fu · Ruotong Wang · Shitao Chen · Nanning Zheng
CLIP-S⁴: Language-Guided Self-Supervised Semantic Segmentation
Wenbin He · Suphanut Jamonnak · Liang Gou · Liu Ren
Learning Conditional Attributes for Compositional Zero-Shot Learning
Qingsheng Wang · Lingqiao Liu · Chenchen Jing · Hao Chen · Guoqiang Liang · PENG WANG · Chunhua Shen
Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection
Luting Wang · Yi Liu · Penghui Du · Zihan Ding · Yue Liao · Qiaosong Qi · Biaolong Chen · Si Liu
ZegCLIP: Towards Adapting CLIP for Zero-shot Semantic Segmentation
Ziqin Zhou · Yinjie Lei · Bowen Zhang · Lingqiao Liu · Yifan Liu
Learning to Generate Text-grounded Mask for Open-world Semantic Segmentation from Only Image-Text Pairs
Junbum Cha · Jonghwan Mun · Byungseok Roh
Mobile User Interface Element Detection Via Adaptively Prompt Tuning
Weiqiang Wang · Zhuoer Xu · Haoxing Chen · jun lan · Changhua Meng · Weiqiang Wang
Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers
Dahun Kim · Anelia Angelova · Weicheng Kuo
Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling
yongshuai huang · Ning Lu · Dapeng Chen · Yibo Li · Zecheng Xie · Shenggao Zhu · Liangcai Gao · Wei Peng
End-to-End 3D Dense Captioning with Vote2Cap-DETR
Sijin Chen · Hongyuan Zhu · Xin Chen · Yinjie Lei · Gang Yu · Tao Chen
Visual DNA: Representing and Comparing Images using Distributions of Neuron Activations
Benjamin Ramtoula · Matthew Gadd · Paul Newman · Daniele De Martini
Hint-Aug: Drawing Hints from Foundation Vision Transformers towards Boosted Few-shot Parameter-Efficient Tuning
Zhongzhi Yu · Shang Wu · Shunyao Zhang · Yonggan Fu · Yingyan Lin
Improving Zero-shot Generalization and Robustness of Multi-modal Models
Yunhao Ge · Jie Ren · Andrew Gallagher · Yuxiao Wang · Ming-Hsuan Yang · Hartwig Adam · Laurent Itti · Balaji Lakshminarayanan · Jiaping Zhao
Asymmetric Feature Fusion for Image Retrieval
Hui Wu · Min Wang · Wengang Zhou · Zhenbo Lu · Houqiang Li
Cross-Image-Attention for Conditional Embeddings in Deep Metric Learning
Dmytro Kotovenko · Pingchuan Ma · Timo Milbich · Björn Ommer
Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-commerce
Yang Jin · Yongzhi Li · Zehuan Yuan · Yadong MU
Learning Attribute and Class Specific Representation Duet for Fine-grained Fashion Analysis
Yang Jiao · Yan Gao · Jingjing Meng · Jin Shang · Yi Sun
HAAV: Hierarchical Aggregation of Augmented Views for Image Captioning
Chia-Wen Kuo · Zsolt Kira
Non-Contrastive Learning Meets Language-Image Pre-Training
Jinghao Zhou · Li Dong · Zhe Gan · Lijuan Wang · Furu Wei
ViLEM: Visual-Language Error Modeling for Image-Text Retrieval
Yuxin Chen · Zongyang Ma · ziqi zhang · Zhongang Qi · Chunfeng Yuan · Ying Shan · Bing Li · Weiming Hu · Xiaohu Qie · Jianping WU
CLIPPO: Image-and-Language Understanding from Pixels Only
Michael Tschannen · Basil Mustafa · Neil Houlsby
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
Xiaoyi Dong · Jianmin Bao · Yinglin Zheng · Ting Zhang · Dongdong Chen · Hao Yang · Ming Zeng · Weiming Zhang · Lu Yuan · Dong Chen · Fang Wen · Nenghai Yu
Context-aware Alignment and Mutual Masking for 3D-Language Pre-training
Zhao Jin · Munawar Hayat · Yuwei Yang · Yulan Guo · Yinjie Lei
SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text
Pinaki Nath Chowdhury · Ayan Kumar Bhunia · Aneeshan Sain · Subhadeep Koley · Tao Xiang · Yi-Zhe Song
Learning Bottleneck Concepts in Image Classification
Bowen Wang · Liangzhi Li · Yuta Nakashima · Hajime Nagahara
GIVL: Improving Geographical Inclusivity of Vision-and-Language Models with Pre-Training Methods
Da Yin · Feng Gao · Govind Thattai · Michael Johnston · Kai-Wei Chang
Grounding Counterfactual Explanation of Image Classifiers to Textual Concept Space
Siwon Kim · Jinoh Oh · SUNGJIN LEE · Seunghak Yu · Jaeyoung Do · Tara Taghavi
Overlooked factors in concept-based explanations: Dataset choice, concept learnability, and human capability
Vikram V. Ramaswamy · Sunnie S. Y. Kim · Ruth Fong · Olga Russakovsky
LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding
Gen Li · Varun Jampani · Deqing Sun · Laura Sevilla-Lara
Task Residual for Tuning Vision-Language Models
Tao Yu · Zhihe Lu · Xin Jin · Zhibo Chen · Xinchao Wang
Hierarchical Prompt Learning for Multi-Task Learning
Yajing Liu · Yuning Lu · Hao Liu · Yaozu An · Zhuoran Xu · Yao Zhuokun · Zhang Baofeng · Zhiwei Xiong · Chenguang Gui
Diversity-Aware Meta Visual Prompting
Qidong Huang · Xiaoyi Dong · Dongdong Chen · Weiming Zhang · Feifei Wang · Gang Hua · Nenghai Yu
From Images to Textual Prompts: Zero-shot Visual Question Answering with Frozen Large Language Models
Jiaxian Guo · Junnan Li · Dongxu Li · Anthony Tiong · Boyang Li · Dacheng Tao · Steven Hoi
Language Adaptive Weight Generation for Multi-task Visual Grounding
Wei Su · Peihan Miao · Huanzhang Dou · Gaoang Wang · Liang Qiao · Zheyang Li · Xi Li
Fusing Pre-trained Language Models with Multimodal Prompts through Reinforcement Learning
Youngjae Yu · Jiwan Chung · Heeseung Yun · Jack Hessel · Jae Sung Park · Ximing Lu · Rowan Zellers · Prithviraj Ammanabrolu · Ronan Le Bras · Gunhee Kim · Yejin Choi
Are Deep Neural Networks SMARTer than Second Graders?
Anoop Cherian · Kuan-Chuan Peng · Suhas Lohit · Kevin Smith · Joshua Tenenbaum
A-CAP: Anticipation Captioning with Commonsense Knowledge
MINH DUC VO · An Luong · Akihiro Sugimoto · Hideki Nakayama
A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning
Aishwarya Kamath · Peter Anderson · Su Wang · Jing Yu Koh · Alexander Ku · Austin Waters · Yinfei Yang · Jason Baldridge · Zarana Parekh
Improving Vision-and-Language Navigation by Generating Future-View Image Semantics
Jialu Li · Mohit Bansal
Layout-based Causal Inference for Object Navigation
Sixian Zhang · Xinhang Song · Weijie Li · Yubing Bai · Xinyao Yu · Shuqiang Jiang
You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model
Shengkun Tang · Yaqing Wang · Zhenglun Kong · Tianchi Zhang · Yao Li · Caiwen Ding · Yanzhi Wang · Yi Liang · Dongkuan Xu
Distilling Cross-Temporal Contexts for Continuous Sign Language Recognition
Leming Guo · Wanli Xue · Qing Guo · Bo Liu · Kaihua Zhang · Tiantian Yuan · Shengyong Chen
Multivariate, Multi-frequency and Multimodal: Rethinking Graph Neural Networks for Emotion Recognition in Conversation
Feiyu Chen · Jie Shao · Shuyuan Zhu · Heng Tao Shen
Modular Memorability: Tiered Representations for Video Memorability Prediction
Théo Dumont · Juan Hevia · Camilo Fosco
VindLU: A Recipe for Effective Video-and-Language Pretraining
Feng Cheng · Xizi Wang · Jie Lei · David Crandall · Mohit Bansal · Gediminas Bertasius
Procedure-Aware Pretraining for Instructional Video Understanding
Honglu Zhou · Roberto Martín-Martín · Mubbasir Kapadia · Silvio Savarese · Juan Carlos Niebles
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
Antoine Yang · Arsha Nagrani · Paul Hongsuck Seo · Antoine Miech · Jordi Pont-Tuset · Ivan Laptev · Josef Sivic · Cordelia Schmid
Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Wenhao Wu · Haipeng Luo · Bo Fang · Jingdong Wang · Wanli Ouyang
Leveraging Temporal Context in Low Representational Power Regimes
Camilo Fosco · SouYoung Jin · Emilie Josephs · Aude Oliva
Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation
Tsu-Jui Fu · Licheng Yu · Ning Zhang · Cheng-Yang Fu · Jong-Chyi Su · William Yang Wang · Sean Bell
NewsNet: A Novel Dataset for Hierarchical Temporal Segmentation
Haoqian Wu · Keyu Chen · Haozhe Liu · Mingchen Zhuge · Bing Li · Ruizhi Qiao · Xiujun Shu · Bei Gan · Liangsheng Xu · Bo Ren · Mengmeng Xu · Wentian Zhang · Raghavendra Ramachandra · Chia-Wen Lin · Bernard Ghanem
Perception and Semantic Aware Regularization for Sequential Confidence Calibration
Zhenghua Peng · Yu Luo · Tianshui Chen · Keke Xu · Shuangping Huang
Boosting Weakly-Supervised Temporal Action Localization with Text Information
Guozhang Li · De Cheng · Xinpeng Ding · Nannan Wang · Xiaoyu Wang · Xinbo Gao
Re2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization
Chen Zhao · Shuming Liu · Karttikeya Mangalam · Bernard Ghanem
Search-Map-Search: A Frame Selection Paradigm for Action Recognition
Mingjun Zhao · Yakun Yu · Xiaoli Wang · Lei Yang · Di Niu
Therbligs In Action: Video Understanding through Motion Primitives
Eadom Dessalene · Michael Maynord · Cornelia Fermuller · Yiannis Aloimonos
Learning Discriminative Representations for Skeleton Based Action Recognition
Huanyu Zhou · Qingjie Liu · Yunhong Wang
MOSO: Decomposing MOtion, Scene and Object for Video Prediction
Mingzhen Sun · Weining Wang · Xinxin Zhu · Jing Liu
EVAL: Explainable Video Anomaly Localization
Ashish Singh · Michael Jones · Erik Learned-Miller
Unified Mask Embedding and Correspondence Learning for Self-Supervised Video Segmentation
Liulei Li · Wenguan Wang · Tianfei Zhou · Jianwu Li · Yi Yang
Representation Learning for Visual Object Tracking by Masked Appearance Transfer
Haojie Zhao · Dong Wang · Huchuan Lu
Generalized Relation Modeling for Transformer Tracking
Shenyuan Gao · Chunluan Zhou · Jun Zhang
Panoptic Video Scene Graph Generation
Jingkang Yang · Wenxuan Peng · Xiangtai Li · ZUJIN GUO · Liangyu Chen · Bo Li · Zheng Ma · Wayne Zhang · Kaiyang Zhou · CHEN CHANGE LOY · Ziwei Liu
Devil’s on the Edges: Selective Quad Attention for Scene Graph Generation
Deunsol Jung · Sanghyun Kim · Won Hwa Kim · Minsu Cho
Focused and Collaborative Feedback Integration for Interactive Image Segmentation
Qiaoqiao Wei · Hui Zhang · Jun-Hai Yong
Knowledge Distillation for 6D Pose Estimation by Aligning Distributions of Local Predictions
Shuxuan Guo · Yinlin Hu · Jose Alvarez · Mathieu Salzmann
PartMix: Regularization Strategy to Learn Part Discovery for Visible-Infrared Person Re-identification
Minsu Kim · Seungryong Kim · Jungin Park · Seongheon Park · Kwanghoon Sohn
Integrally Pre-Trained Transformer Pyramid Networks
Yunjie Tian · Lingxi Xie · Zhaozhi Wang · Longhui Wei · XIAOPENG ZHANG · Jianbin Jiao · Yaowei Wang · Qi Tian · Qixiang Ye
Explaining Image Classifiers with Multiscale Directional Image Representation
Stefan Kolek · Robert Windesheim · Hector Andrade Loarca · Gitta Kutyniok · Ron Levie
Neuron Structure Modeling for Generalized Remote Physiological Measurement
Hao LU · Zitong Yu · Xuesong Niu · Ying-Cong Chen
Visual Atoms: Pre-training Vision Transformers with Sinusoidal Waves
Sora Takashima · Ryo Hayamizu · Nakamasa Inoue · Hirokatsu Kataoka · Rio Yokota
Model-Agnostic Gender Debiased Image Captioning
Yusuke Hirota · Yuta Nakashima · Noa Garcia
ImageBind: One Embedding Space To Bind Them All
Rohit Girdhar · Alaaeldin El-Nouby · Zhuang Liu · Mannat Singh · Kalyan Vasudev Alwala · Armand Joulin · Ishan Misra
I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification
Muhammad Naeem Naeem · Gul Zain Khan · Yongqin Xian · Muhammad Zeshan Afzal · Didier Stricker · Luc Van Gool · Federico Tombari
Learning Semantic Relationship among Instances for Image-Text Matching
Zheren Fu · Zhendong Mao · Yan Song · Yongdong Zhang
Learning Customized Visual Models with Retrieval-Augmented Knowledge
Haotian Liu · Kilho Son · Jianwei Yang · Ce Liu · Jianfeng Gao · Yong Jae Lee · Chunyuan Li
M6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis
Hiuyi Cheng · Peirong Zhang · Sihang Wu · Jiaxin Zhang · Qiyuan · Zecheng Xie · Jing Li · Kai Ding · Lianwen Jin
Towards Modality-Agnostic Person Re-identification with Descriptive Query
Cuiqun Chen · Mang Ye · Ding Jiang
Generalized Decoding for Pixel, Image, and Language
Xueyan Zou · Zi-Yi Dou · Jianwei Yang · Zhe Gan · Linjie Li · Chunyuan Li · Xiyang Dai · Harkirat Behl · Jianfeng Wang · Lu Yuan · Nanyun Peng · Lijuan Wang · Yong Jae Lee · Jianfeng Gao
Correlational Image Modeling for Self-Supervised Visual Pre-Training
Wei Li · Jiahao Xie · CHEN CHANGE LOY
Revisiting Multimodal Representation in Contrastive Learning: From Patch and Token embeddings to Finite Discrete Tokens
Yuxiao Chen · Jianbo Yuan · Yu Tian · Shijie Geng · Xinyu Li · Ding Zhou · Dimitris Metaxas · Hongxia Yang
What Can Human Sketches Do for Object Detection?
Pinaki Nath Chowdhury · Ayan Kumar Bhunia · Aneeshan Sain · Subhadeep Koley · Tao Xiang · Yi-Zhe Song
Local-guided Global: Paired Similarity Representation for Visual Reinforcement Learning
Hyesong Choi · Hunsang Lee · Wonil Song · Sangryul Jeon · Kwanghoon Sohn · Dongbo Min
OCTET: Object-Aware Counterfactual Explanations
Mehdi Zemni · Mickael Chen · Eloi Zablocki · Hedi Ben younes · Patrick Perez · Matthieu CORD
Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks
Weihua Chen · Xianzhe Xu · Jian Jia · Hao Luo · Yaohua Wang · Fan Wang · Rong Jin · Xiuyu Sun
Advancing Visual Grounding with Scene Knowledge: Benchmark and Method
Zhihong Chen · Ruifei Zhang · Yibing Song · Xiang Wan · Guanbin Li
FashionSAP: Symbols and Attributes Prompt for Fine-grained Fashion Vision-Language Pre-training
Yunpeng Han · Lisai Zhang · Qingcai Chen · chen zhijian · Zhonghua Li · Jianxin Yang · Zhao Cao
Learning to Exploit Temporal Structure for Biomedical Vision–Language Processing
Shruthi Bannur · Stephanie Hyland · Qianchu Liu · Fernando Pérez-García · Maximilian Ilse · Daniel Castro · Benedikt Boecking · Harshita Sharma · Kenza Bouzid · Anja Thieme · Anton Schwaighofer · Maria Teodora Wetscherek · Matthew Lungren · Aditya Nori · Javier Alvarez Valle · Ozan Oktay
Neural Koopman Pooling: Control-Inspired Temporal Dynamics Encoding for Skeleton-Based Action Recognition
Xinghan Wang · Xin Xu · Yadong MU
Fine-grained Audible Video Description
Xuyang Shen · Dong Li · Jinxing Zhou · Zhen Qin · Bowen He · Xiaodong Han · Aixuan Li · Yuchao Dai · Lingpeng Kong · Meng Wang · Yu Qiao · Yiran Zhong
Language-Guided Audio-Visual Source Separation via Trimodal Consistency
Reuben Tan · Arijit Ray · Andrea Burns · Bryan Plummer · Justin Salamon · Oriol Nieto · Bryan Russell · Kate Saenko
Audio-Visual Grouping Network for Sound Localization from Mixtures
Shentong Mo · Yapeng Tian
Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations
Sagnik Majumder · Hao Jiang · Pierre Moulon · Ethan Henderson · Paul Calamia · Kristen Grauman · Vamsi Krishna Ithapu
Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
Lingting Zhu · Xian Liu · Xuanyu Liu · Rui Qian · Ziwei Liu · Lequan Yu
Spatio-Temporal Pixel-Level Contrastive Learning-based Source-Free Domain Adaptation for Video Semantic Segmentation
Shao-Yuan Lo · Poojan Oza · Sumanth Chennupati · Patricio Galindo · Vishal Patel
MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos
Minghan Li · Shuai Li · Wangmeng Xiang · Lei Zhang
System-status-aware Adaptive Network for Online Streaming Video Understanding
Lin Geng Foo · GONG JIA · Zhipeng Fan · Jun Liu
Frame Flexible Network
Yitian Zhang · Yue Bai · Chang Liu · Huan Wang · Sheng Li · Yun Fu
Self-Supervised Video Forensics by Audio-Visual Anomaly Detection
Chao Feng · Ziyang Chen · Andrew Owens
MobileVOS: Real-Time Video Object Segmentation Contrastive Learning meets Knowledge Distillation
ROY MILES · Mehmet Kerim Yucel · Bruno Manganelli · Albert Saa-Garriga
Improving Robustness of Semantic Segmentation to Motion-Blur using Class-Centric Augmentation
Aakanksha Aakanksha · Rajagopalan Ambasamduram
MAGVIT: Masked Generative Video Transformer
Lijun Yu · Yong Cheng · Kihyuk Sohn · Jose Lezama · Han Zhang · Huiwen Chang · Alexander Hauptmann · Ming-Hsuan Yang · Yuan Hao · Irfan Essa · Lu Jiang
SCOTCH and SODA: A Transformer Video Shadow Detection Framework
Lihao Liu · Jean Prost · Lei Zhu · Nicolas Papadakis · Pietro Lio · Carola-Bibiane Schönlieb · Angelica Aviles-Rivero
Blind Video Deflickering by Neural Filtering with a Flawed Atlas
Chenyang Lei · Xuanchi Ren · Zhaoxiang Zhang · Qifeng Chen
Probabilistic Debiasing of Scene Graphs
Bashirul Biswas Biswas · Qiang Ji
ViTs for SITS: Vision Transformers for Satellite Image Time Series
Michail Tarasiou · Erik Chavez · Stefanos Zafeiriou
OmniMAE: Single Model Masked Pretraining on Images and Videos
Rohit Girdhar · Alaaeldin El-Nouby · Mannat Singh · Kalyan Vasudev Alwala · Armand Joulin · Ishan Misra
BASiS: Batch Aligned Spectral Embedding Space
Or Streicher · Ido Cohen · Guy Gilboa
Evolved Part Masking for Self-Supervised Learning
Zhanzhou FENG · Shiliang Shiliang
Hard Patches Mining for Masked Image Modeling
Haochen Wang · Kaiyou Song · Junsong Fan · Yuxi Wang · Jin Xie · Zhaoxiang Zhang
Pose-disentangled Contrastive Learning for Self-supervised Facial Representation
Yuanyuan Liu · Wenbin Wang · Yibing Zhan · Shaoze Feng · Kejun Liu · Zhe Chen
OpenGait: Revisiting Gait Recognition Towards Better Practicality
Chao Fan · Junhao Liang · Chuanfu Shen · Saihui Hou · Yongzhen Huang · Shiqi Yu
Autoregressive Visual Tracking
Xing Wei · Yifan Bai · Yongchao Zheng · Dahu Shi · Yihong Gong
Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking
Jinkun Cao · Jiangmiao Pang · Xinshuo Weng · Rawal Khirodkar · Kris Kitani
GazeNeRF: 3D-Aware Gaze Redirection with Neural Radiance Fields
Alessandro Ruzzi · Xiangwei Shi · Xi Wang · Gengyan Li · Shalini De Mello · Hyung Jin Chang · Xucong Zhang · Otmar Hilliges
Phone2Proc: Bringing Robust Robots Into Our Chaotic World
Matt Deitke · Rose Hendrix · Ali Farhadi · Kiana Ehsani · Aniruddha Kembhavi
Learning Human-to-Robot Handovers from Point Clouds
Sammy Christen · Wei Yang · Claudia Pérez-D’Arpino · Otmar Hilliges · Dieter Fox · Yu-Wei Chao
MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion
Chiyu Jiang · Andre Cornman · Cheolho Park · Benjamin Sapp · Yin Zhou · Dragomir Anguelov
Uncovering the Missing Pattern: Unified Framework Towards Trajectory Imputation and Prediction
Yi Xu · Armin Bazarjani · Hyung-gun Chi · Chiho Choi · Yun Fu
MixSim: A Hierarchical Framework for Mixed Reality Traffic Simulation
Simon Suo · Kelvin Wong · Justin Xu · James Tu · Alexander Cui · Sergio Casas · Raquel Urtasun
Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving
Xiwen Liang · Minzhe Niu · Jianhua Han · Hang Xu · Chunjing Xu · Xiaodan Liang
Are We Ready for Vision-Centric Driving Streaming Perception? The ASAP Benchmark
Xiaofeng Wang · Zheng Zhu · Yunpeng Zhang · Guan Huang · Yun Ye · Wenbo Xu · Ziwei Chen · Xingang Wang
BAEFormer: Bi-directional and Early Interaction Transformers for Bird’s Eye View Semantic Segmentation
Cong Pan · Yonghao He · Junran Peng · Qian Zhang · Wei Sui · Zhaoxiang Zhang
PVO: Panoptic Visual Odometry
Weicai Ye · Xinyue Lan · SHUO CHEN · Yuhang Ming · Xingyuan Yu · Hujun Bao · Zhaopeng Cui · Guofeng Zhang
Unsupervised Cumulative Domain Adaptation for Foggy Scene Optical Flow
Zhou Hanyu · Yi Chang · YAN WENDING · Luxin Yan
Domain Generalized Stereo Matching via Hierarchical Visual Transformation
Tianyu Chang · Xun Yang · Tianzhu Zhang · Meng Wang
Unsupervised Visible-Infrared Person Re-Identification via Progressive Graph Matching and Alternate Learning
Wu Zesen · Mang Ye
Geometric Visual Similarity Learning in 3D Medical Image Self-Supervised Pre-training
Yuting He · Guanyu Yang · Rongjun Ge · Yang Chen · Jean-louis Coatrieux · Boyu Wang · Shuo Li
Progressive Neighbor Consistency Mining for Correspondence Pruning
Xin Liu · Jufeng Yang
Visual Prompt Multi-Modal Tracking
Jiawen Zhu · Simiao Lai · Xin Chen · Dong Wang · Huchuan Lu
Robust Multiview Point Cloud Registration with Reliable Pose Graph Initialization and History Reweighting
Haiping Wang · Yuan Liu · Zhen Dong · Yulan Guo · Yushen Liu · Wenping Wang · Bisheng Yang
PointCert: Point Cloud Classification with Deterministic Certified Robustness Guarantees
Jinghuai Zhang · Jinyuan Jia · Hongbin Liu · Neil Gong
Rethinking the Approximation Error in 3D Surface Fitting for Point Cloud Normal Estimation
Hang Du · Xuejun Yan · Jingjing Wang · Di Xie · Shiliang Pu
FAC: 3D Representation Learning via Foreground Aware Feature Contrast
Kangcheng Liu · Aoran Xiao · Xiaoqin Zhang · Shijian Lu · Ling Shao
ProxyFormer: Proxy Alignment Assisted Point Cloud Completion with Missing Part Sensitive Transformer
Shanshan Li · Pan Gao · Xiaoyang Tan · Mingqiang Wei
PointVector: A Vector Representation In Point Cloud Analysis
Xin Deng · wenyu Zhang · Qing Ding · Xinming Zhang
Fast Point Cloud Generation with Straight Flows
Lemeng Wu · Dilin Wang · Chengyue Gong · Xingchao Liu · Yunyang Xiong · Rakesh Ranjan · Raghuraman Krishnamoorthi · Vikas Chandra · qiang liu
ACL-SPC: Adaptive Closed-Loop system for Self-Supervised Point Cloud Completion
Sangmin Hong · Mohsen Yavartanoo · Reyhaneh Neshatavar Haghighi Shiraz · Kyoung Mu Lee
Open-set Semantic Segmentation for Point Clouds via Adversarial Prototype Framework
Jianan Li · Qiulei Dong
GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds
Honghui Yang · Tong He · Jiaheng Liu · Hua Chen · Boxi Wu · Binbin Lin · Xiaofei He · Wanli Ouyang
Novel Class Discovery for 3D Point Cloud Semantic Segmentation
Luigi Riz · Cristiano Saltori · Elisa Ricci · Fabio Poiesi
3D Semantic Segmentation in the Wild: Learning Generalized Models for Adverse-Condition Point Clouds
Aoran Xiao · Jiaxing Huang · Weihao Xuan · Ruijie Ren · Kangcheng Liu · Dayan Guan · Abdulmotaleb El Saddik · Shijian Lu · Eric Xing
Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation
Li Li · Hubert P. H. Shum · Toby Breckon
Instant Domain Augmentation for LiDAR Semantic Segmentation
Kwonyoung Ryu · Soonmin Hwang · Jaesik Park
Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision
Fangqiang Ding · Andras Palffy · Dariu Gavrila · Xiaoxuan Lu
MoDAR: Using Motion Forecasting for 3D Object Detection in Point Cloud Sequences
Yingwei Li · Charles R. Qi · Yin Zhou · Chenxi Liu · Dragomir Anguelov
Towards Unsupervised Object Detection from LiDAR Point Clouds
Lunjun Zhang · Anqi Joyce Yang · Yuwen Xiong · Sergio Casas · Bin Yang · Mengye Ren · Raquel Urtasun
DeepMapping2: Self-supervised Large-scale LiDAR Map Optimization
Chao Chen · Xinhao Liu · Yiming Li · Li Ding · Chen Feng
ConQueR: Query Contrast Voxel-DETR for 3D Object Detection
Benjin ZHU · Zhe Wang · Shaoshuai Shi · Hang Xu · Lanqing HONG · Hongsheng Li
SGLoc: Scene Geometry Encoding for Outdoor LiDAR Localization
Wen Li · Shangshu Yu · Cheng Wang · Guosheng Hu · Siqi Shen · Chenglu Wen
Depth Estimation from Camera Image and mmWave Radar Point Cloud
Akash Deep Singh · Yunhao Ba · Ankur Sarker · Howard Zhang · Achuta Kadambi · Stefano Soatto · Mani Srivastava · Alex Wong
Towards Building Self-Aware Object Detectors via Reliable Uncertainty Quantification and Calibration
Kemal Oksuz · Tom Joy · Puneet Dokania
Uni3D: A Unified Baseline for Multi-dataset 3D Object Detection
Bo Zhang · Jiakang Yuan · Botian Shi · Tao Chen · Yikang LI · Yu Qiao
Collaboration Helps Camera Overtake LiDAR in 3D Detection
Yue Hu · Yifan Lu · Runsheng Xu · Weidi Xie · Siheng Chen · Yanfeng Wang
BEV@DC: Bird’s-Eye View Assisted Training for Depth Completion
Wending Zhou · Xu Yan · Yinghong Liao · Yuankai Lin · Jin Huang · Gangming Zhao · Shuguang Cui · Zhen Li
Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction
Yuanhui Huang · Wenzhao Zheng · Yunpeng Zhang · Jie Zhou · Jiwen Lu
Viewpoint Equivariance for Multi-View 3D Object Detection
Dian Chen · Jie Li · Vitor Guizilini · Rareș Ambruș · Adrien Gaidon
3D Concept Learning and Reasoning from Multi-View Images
Yining Hong · Chunru Lin · Yilun Du · Zhenfang Chen · Joshua Tenenbaum · Chuang Gan
Role of Transients in Two-Bounce Non-Line-of-Sight Imaging
Siddharth Somasundaram · Akshat Dave · Connor Henley · Ashok Veeraraghavan · Ramesh Raskar
3D Spatial Multimodal Knowledge Accumulation for Scene Graph Prediction in Point Cloud
Mingtao Feng · Haoran Hou · Liang Zhang · Zijie Wu · Yulan Guo · Ajmal Mian
Revisiting the Stack-Based Inverse Tone Mapping
Ning Zhang · Yuyao Ye · Yang Zhao · Ronggang Wang
MVImgNet: A Large-scale Dataset of Multi-view Images
Xianggang Yu · Mutian Xu · Yidan Zhang · Haolin Liu · Chongjie Ye · Yushuang Wu · Zizheng Yan · Chenming Zhu · Zhangyang Xiong · Tianyou Liang · Guanying Chen · Shuguang Cui · Xiaoguang Han
Fully Self-Supervised Depth Estimation from Defocus Clue
Haozhe Si · Bin Zhao · Dong Wang · Yunpeng Gao · Mulin Chen · Zhigang Wang · Xuelong Li
Zero-Shot Dual-Lens Super-Resolution
Ruikang Xu · Mingde Yao · Zhiwei Xiong
Temporally Consistent Online Depth Estimation Using Point-Based Fusion
Numair Khan · Eric Penner · Douglas Lanman · Lei Xiao
Learning to Detect Mirrors from Videos via Dual Correspondences
Jiaying Lin · Xin Tan · Rynson Lau
Renderable Neural Radiance Map for Visual Navigation
obin kwon · Jeongho Park · Songhwai Oh
VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion
Yiming Li · Zhiding Yu · Chris Choy · Chaowei Xiao · Jose Alvarez · Sanja Fidler · Chen Feng · Anima Anandkumar
Behind the Scenes: Density Fields for Single View Reconstruction
Felix Wimbauer · Nan Yang · Christian Rupprecht · Daniel Cremers
Multiview Compressive Coding for 3D Reconstruction
Chao-Yuan Wu · Justin Johnson · Jitendra Malik · Christoph Feichtenhofer · Georgia Gkioxari
Virtual Occlusions Through Implicit Depth
Jamie Watson · Mohamed Sayed · Zawar Imam Qureshi · Gabriel Brostow · Sara Vicente · Oisin Aodha · Michael Firman
Panoptic Lifting for 3D Scene Understanding with Neural Fields
Yawar Siddiqui · Lorenzo Porzi · Samuel Rota Bulò · Norman Müller · Matthias Niessner · Angela Dai · Peter Kontschieder
Neural Part Priors: Learning to Optimize Part-Based Object Completion in RGB-D Scans
Alexey Bokhovkin · Angela Dai
BAAM: Monocular 3D pose and shape reconstruction with bi-contextual attention module and attention-guided modeling
Hyo-Jun Lee · Hanul Kim · Su-Min Choi · Seong-Gyun Jeong · Yeong Jun Koh
BKinD-3D: Self-Supervised 3D Keypoint Discovery from Multi-View Videos
Jennifer J. Sun · Lili Karashchuk · Amil Dravid · Serim Ryou · Sonia Fereidooni · John Tuthill · Aggelos Katsaggelos · Bingni Brunton · Georgia Gkioxari · Ann Kennedy · Yisong Yue · Pietro Perona
Four-view geometry with unknown radial distortion
Petr Hrubý · Viktor Korotynskiy · Timothy Duff · Luke Oeding · Marc Pollefeys · Tomas Pajdla · Viktor Larsson
Two-view Geometry Scoring Without Correspondences
Axel Barroso-Laguna · Eric Brachmann · Victor Prisacariu · Gabriel Brostow · Daniyar Turmukhambetov
Neural Voting Field for Camera-Space 3D Hand Pose Estimation
Lin Huang · Chung-Ching Lin · Kevin Lin · Lin Liang · Lijuan Wang · Junsong Yuan · Zicheng Liu
expOSE: Accurate Initialization-Free Projective Factorization using Exponential Regularization
José Iglesias Iglesias · Amanda Nilsson · Carl Olsson
Object Pose Estimation with Statistical Guarantees: Conformal Keypoint Detection and Geometric Uncertainty Propagation
Heng Yang · Marco Pavone
Crowd3D: Towards Hundreds of People Reconstruction from a Single Image
Hao Wen · Jing Huang · Huili Cui · Haozhe Lin · Yu-Kun Lai · LU FANG · Kun Li
Rigidity-Aware Detection for 6D Object Pose Estimation
Hai Yang · Rui Song · Jiaojiao Li · Mathieu Salzmann · Yinlin Hu
Robot Structure Prior Guided Temporal Attention for Camera-to-Robot Pose Estimation from Image Sequence
Yang Tian · Jiyao Zhang · Zekai Yin · Hao Dong
GFIE: A Dataset and Baseline for Gaze-Following from 2D to 3D in Indoor Environments
Zhengxi Hu · Yuxue Yang · Xiaolin Zhai · Dingye Yang · Bohan Zhou · Jingtai Liu
TokenHPE: Learning Orientation Tokens for Efficient Head Pose Estimation via Transformers
Cheng Zhang · Hai Liu · Yongjian Deng · Bochen Xie · Youfu Li
Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation
Xiaolong Shen · Zongxin Yang · Xiaohan Wang · Jianxin Ma · Chang Zhou · Yi Yang
PoseFormerV2: Exploring Frequency Domain for Efficient and Robust 3D Human Pose Estimation
Qitao Zhao · Ce Zheng · Mengyuan Liu · Pichao WANG · Chen Chen
BITE: Beyond Priors for Improved Three-D Dog Pose Estimation
Nadine Rueegg · Shashank Tripathi · Konrad Schindler · Michael Black · Silvia Zuffi
TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments
Yu Sun · Qian Bao · Wu Liu · Tao Mei · Michael Black
NeuralDome: A Neural Modeling Pipeline on Multi-View Human-Object Interactions
Juze Zhang · Haimin Luo · Hongdi Yang · Xinru Xu · Qianyang Wu · Ye Shi · Jingyi Yu · Lan Xu · Jingya Wang
Target-referenced Reactive Grasping for Dynamic Objects
Jirong Liu · Ruo Zhang · Hao-Shu Fang · Minghao Gou · Hongjie Fang · Chenxi Wang · Sheng Xu · Hengxu Yan · Cewu Lu
Command-driven Articulated Object Understanding and Manipulation
Ruihang Chu · Zhengzhe Liu · Xiaoqing Ye · Xiao Tan · XIAOJUAN QI · Chi-Wing Fu · Jiaya Jia
Visual-Tactile Sensing for In-Hand Object Reconstruction
Wenqiang Xu · Zhenjun Yu · Han Xue · Ruolin Ye · Siqiong Yao · Cewu Lu
MagicPony: Learning Articulated 3D Animals in the Wild
Shangzhe Wu · Ruining Li · Tomas Jakab · Christian Rupprecht · Andrea Vedaldi
Learning Analytical Posterior Probability for Human Mesh Recovery
Qi Fang · Kang Chen · Yinghui Fan · Qing Shuai · Jiefeng Li · Weidong Zhang
Marching-Primitives: Shape Abstraction from Signed Distance Function
Weixiao Liu · Yuwei Wu · Sipu Ruan · Gregory Chirikjian
Learning Neural Volumetric Representations of Dynamic Humans in Minutes
Chen Geng · Sida Peng · Zhen Xu · Hujun Bao · Xiaowei Zhou
Complete 3D Human Reconstruction from a Single Incomplete Image
Junying Wang · Jae Shin Yoon · Tuanfeng Wang · Krishna Kumar Singh · Ulrich Neumann
DIFu: Depth-guided Implicit Function for Clothed Human Reconstruction
Dae-Young Song · HeeKyung Lee · Jeongil Seo · Donghyeon Cho
BEDLAM: A Synthetic Dataset of Bodies Exhibiting Detailed Lifelike Animated Motion
Michael Black · Priyanka Patel · Joachim Tesch · Jinlong Yang
Invertible Neural Skinning
Yash Kant · Aliaksandr Siarohin · Riza Alp Guler · Menglei Chai · Jian Ren · Sergey Tulyakov · Igor Gilitschenski
Zero-shot Pose Transfer for Unrigged Stylized 3D Characters
Jiashun Wang · Xueting Li · Sifei Liu · Shalini De Mello · Orazio Gallo · Xiaolong Wang · Jan Kautz
Biomechanics-guided Facial Action Unit Detection through Force Modeling
Zijun Cui · Chenyi Kuang · Tian Gao · Kartik Talamadupula · Qiang Ji
Hand Avatar: Free-Pose Hand Animation and Rendering from Monocular Video
Xingyu Chen · Baoyuan Wang · Heung-Yeung Shum
High-fidelity Clothed Avatar Reconstruction from a Single Image
Tingting Liao · Xiaomei Zhang · Yuliang Xiu · Hongwei Yi · Xudong Liu · Guo-Jun Qi · Yong Zhang · Xuan Wang · Xiangyu Zhu · Zhen Lei
NeuWigs: A Neural Dynamic model for Volumetric Hair Capture and Animation
Ziyan Wang · Giljoo Nam · Tuur Stuyck · Stephen Lombardi · Chen Cao · Jason Saragih · Michael Zollhöfer · Jessica Hodgins · Christoph Lassner
FitMe: Deep Photorealistic 3D Morphable Model Avatars
Alexandros Lattas · Stylianos Moschoglou · Stylianos Ploumpis · Baris Gecer · Jiankang Deng · Stefanos Zafeiriou
FaceLit: Neural 3D Relightable Faces
Anurag Ranjan · Kwang Moo Yi · Jen-Hao Chang · Oncel Tuzel
Learning a Morphable Face Reflectance Model from Low-cost Data
Yuxuan Han · Zhibo Wang · Feng Xu
Fine-Grained Face Swapping via Regional GAN Inversion
Zhian Liu · Maomao Li · Yong Zhang · Cairong Wang · Qi Zhang · Jue Wang · Yongwei Nie
DiffSwap: High-Fidelity and Controllable Face Swapping via 3D-Aware Masked Diffusion
Wenliang Zhao · Yongming Rao · Weikang Shi · Zuyan Liu · Jie Zhou · Jiwen Lu
Unsupervised 3D Shape Reconstruction by Part Retrieval and Assembly
Xianghao Xu · Paul Guerrero · Matthew Fisher · Siddhartha Chaudhuri · Daniel Ritchie
PREIM3D: 3D Consistent Precise Image Attribute Editing from a Single Image
Jianhui Li · Jianmin Li · Haoji Zhang · Shilong Liu · Zhengyi Wang · Zihao Xiao · Kaiwen Zheng · Jun Zhu
NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real Image Animation
Yu Yin · Kamran Ghasedi · HsiangTao Wu · Jiaolong Yang · Xin Tong · Yun Fu
Quantitative Manipulation of Custom Attributes on 3D-Aware Image Synthesis
Hoseok Do · EunKyung Yoo · Taehyeong Kim · Chul Lee · Jin Choi
SinGRAF: Learning a 3D Generative Radiance Field for a Single Scene
Minjung Son · Jeong Joon Park · Leonidas Guibas · Gordon Wetzstein
NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models
Seung Wook Kim · Bradley Brown · Kangxue Yin · Karsten Kreis · Katja Schwarz · Daiqing Li · Robin Rombach · Antonio Torralba · Sanja Fidler
NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-view Images
Yunfan Ye · Renjiao Yi · Zhirui Gao · Chenyang Zhu · Zhiping Cai · Kai Xu
NeuDA: Neural Deformable Anchor for High-Fidelity Implicit Surface Reconstruction
Bowen Cai · Jinchi Huang · Rongfei Jia · chengfei lv · Huan Fu
PermutoSDF: Fast Multi-View Reconstruction with Implicit Surfaces using Permutohedral Lattices
Radu Alexandru Rosu · Sven Behnke
Neuralangelo: High-Fidelity Neural Surface Reconstruction
Zhaoshuo Li · Thomas Müller · Alex Evans · Russ Taylor · Mathias Unberath · Ming-Yu Liu · Chen-Hsuan Lin
RealFusion: 360° Reconstruction of Any Object from a Single Image
Luke Melas-Kyriazi · Iro Laina · Christian Rupprecht · Andrea Vedaldi
Neural Lens Modeling
Wenqi Xian · Aljaz Bozic · Noah Snavely · Christoph Lassner
RGBD2: Generative Scene Synthesis via Incremental View Inpainting using RGBD Diffusion Models
Jiabao Lei · Jiapeng Tang · Kui Jia
Controllable Light Diffusion for Portraits
David Futschik · Kelvin Ritland · James Vecore · Sean Fanello · Sergio Orts-Escolano · Brian Curless · Daniel Sýkora · Rohit Pandey
Weakly-supervised Single-view Image Relighting
Renjiao Yi · Chenyang Zhu · Kai Xu
MAIR: Multi-view Attention Inverse Rendering with 3D Spatially-Varying Lighting Estimation
JunYong Choi · SeokYeong Lee · Haesol Park · Seung-Won Jung · Ig-Jae Kim · Junghyun Cho
DANI-Net: Uncalibrated Photometric Stereo by Differentiable Shadow Handling, Anisotropic Reflectance Modeling, and Neural Inverse Rendering
Zongrui Li · Qian Zheng · Boxin Shi · Gang Pan · Xudong Jiang
Neural Fields meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes
Zian Wang · Tianchang Shen · Jun Gao · SHENGYU HUANG · Jacob Munkberg · Jon Hasselgren · Zan Gojcic · Wenzheng Chen · Sanja Fidler
Pointersect: Neural Rendering with Cloud-Ray Intersection
Jen-Hao Chang · Wei-Yu Chen · Anurag Ranjan · Kwang Moo Yi · Oncel Tuzel
Point2Pix: Photo-Realistic Point Cloud Rendering via Neural Radiance Fields
Tao Hu · Xiaogang Xu · Shu Liu · Jiaya Jia
StyleRF: Zero-shot 3D Style Transfer of Neural Radiance Fields
Kunhao Liu · Fangneng Zhan · Yiwen Chen · Jiahui Zhang · Yingchen Yu · Abdulmotaleb El Saddik · Shijian Lu · Eric Xing
EditableNeRF: Editing Topologically Varying Neural Radiance Fields by Key Points
Chengwei Zheng · Wenbin Lin · Feng Xu
Learning Neural Duplex Radiance Fields for Real-Time View Synthesis
Ziyu Wan · Christian Richardt · Aljaz Bozic · Chao Li · Vijay Rengarajan · Seonghyeon Nam · Xiaoyu Xiang · Tuotuo Li · Bo Zhu · Rakesh Ranjan · Jing Liao
Grid-guided Neural Radiance Fields for Large Urban Scenes
Linning Xu · Yuanbo Xiangli · Sida Peng · Xingang Pan · Nanxuan Zhao · Christian Theobalt · Bo Dai · Dahua Lin
NeRF-DS: Neural Radiance Fields for Dynamic Specular Objects
Zhiwen Yan · Chen Li · Gim Lee
Nerflets: Local Radiance Fields for Efficient Structure-Aware 3D Scene Representation from 2D Supervision
Xiaoshuai Zhang · Abhijit Kundu · Thomas Funkhouser · Leonidas Guibas · Hao Su · Kyle Genova
Local-to-Global Registration for Bundle-Adjusting Neural Radiance Fields
Yue Chen · Xingyu Chen · Xuan Wang · Qi Zhang · Yu Guo · Ying Shan · Fei Wang
FreeNeRF: Improving Few-shot Neural Rendering with Free Frequency Regularization
Jiawei Yang · Marco Pavone · Yue Wang
RefSR-NeRF: Towards High Fidelity and Super Resolution View Synthesis
Xudong Huang · Wei Li · Jie Hu · Hanting Chen · Yunhe Wang
Swept-Angle Synthetic Wavelength Interferometry
Alankar Kotwal · Anat Levin · Ioannis Gkioulekas
Edge-aware Regional Message Passing Controller for Image Forgery Localization
Dong Li · Jiaying Zhu · Menglu Wang · Jiawei Liu · Xueyang Fu · Zheng-Jun Zha
Revisiting Residual Networks for Adversarial Robustness
Shihua Huang · Zhichao Lu · Kalyanmoy Deb · Vishnu Naresh Boddeti
CFA: Class-wise Calibrated Fair Adversarial Training
Zeming Wei · Yifei Wang · Yiwen Guo · Yisen Wang
Feature Separation and Recalibration for Adversarial Robustness
Woo Jae Kim · Yoonki Cho · Junsik Jung · Sung-eui Yoon
Improving the Transferability of Adversarial Samples by Path-Augmented Method
Jianping Zhang · Jen-tse Huang · Wenxuan Wang · Yichen LI · Weibin Wu · Xiaosen Wang · Yuxin Su · Michael Lyu
StyLess: Boosting the Transferability of Adversarial Examples
Kaisheng Liang · Bin Xiao
Minimizing Maximum Model Discrepancy for Transferable Black-box Targeted Attacks
Anqi Zhao · Tong Chu · Yahao Liu · Wen Li · Jingjing Li · Lixin Duan
Adversarially Robust Neural Architecture Search for Graph Neural Networks
Beini Xie · Heng Chang · Ziwei Zhang · Xin Wang · Daixin Wang · Zhiqiang Zhang · Rex Ying · Wenwu Zhu
Color Backdoor: A Robust Poisoning Attack in Color Space
Wenbo Jiang · Hongwei Li · Guowen Xu · Tianwei Zhang
Effective Ambiguity Attack Against Passport-based DNN Intellectual Property Protection Schemes through Fully Connected Layer Substitution
Yiming Chen · Jinyu Tian · Xiangyu Chen · Jiantao Zhou
Single Image Backdoor Inversion via Robust Smoothed Classifiers
Mingjie Sun · J Kolter
Multi-view Adversarial Discriminator: Mine the Non-causal Factors for Object Detection in Unseen Domains
Mingjun Xu · Lingyun Qin · Weijie Chen · Shiliang Pu · Lei Zhang
RiDDLE: Reversible and Diversified De-identification with Latent Encryptor
Dongze Li · Wei Wang · Kang Zhao · Jing Dong · Tieniu Tan
CaPriDe Learning: Confidential and Private Decentralized Learning based on Encryption-friendly Distillation Loss
Nurbek Tastan · Karthik Nandakumar
Federated Learning with Data-Agnostic Distribution Fusion
Jian-hui Duan · Wenzhong Li · Derun Zou · Ruichen Li · Sanglu Lu
Learning Federated Visual Prompt in Null Space for MRI Reconstruction
Chun-Mei Feng · Bangjun Li · Xinxing Xu · Yong Liu · Huazhu Fu · Wangmeng Zuo
Decentralized Learning with Multi-Headed Distillation
Andrey Zhmoginov · Mark Sandler · Nolan Miller · Gus Kristiansen · Max Vladymyrov
Efficient Second-Order Plane Adjustment
Lipu Zhou
Learning Correspondence Uncertainty via Differentiable Nonlinear Least Squares
Dominik Muhle · Lukas Koestler · Krishna Murthy Jatavallabhula · Daniel Cremers
Learning Articulated Shape with Keypoint Pseudo-labels from Web Images
Anastasis Stathopoulos · Georgios Pavlakos · Ligong Han · Dimitris Metaxas
ObjectMatch: Robust Registration using Canonical Object Correspondences
Can Gümeli · Angela Dai · Matthias Niessner
Pose Synchronization under Multiple Pair-wise Relative Poses
Yifan Sun · Qixing Huang
MammalNet: A Large-scale Video Benchmark for Mammal Recognition and Behavior Understanding
Jun Chen · Ming Hu · Darren Coker · Michael L. Berumen · Blair Costelloe · Sara Beery · Anna Rohrbach · Mohamed Elhoseiny
DiffPose: Toward More Reliable 3D Pose Estimation
GONG JIA · Lin Geng Foo · Zhipeng Fan · Qiuhong Ke · Hossein Rahmani · Jun Liu
Scene-aware Egocentric 3D Human Pose Estimation
Jian Wang · Diogo Luvizon · Weipeng Xu · Lingjie Liu · Kripasindhu Sarkar · Christian Theobalt
Unified Pose Sequence Modeling
Lin Geng Foo · Tianjiao Li · Hossein Rahmani · Qiuhong Ke · Jun Liu
A Characteristic Function-based Method for Bottom-up Human Pose Estimation
Haoxuan Qu · Yujun Cai · Lin Geng Foo · Ajay Kumar · Jun Liu
AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation
Takehiko Ohkawa · Kun He · Fadime Sener · Tomas Hodan · LUAN TRAN · Cem Keskin
Harmonious Feature Learning for Interactive Hand-Object Pose Estimation
Zhifeng Lin · Changxing Ding · Huan Yao · Zengsheng Kuang · Shaoli Huang
CIMI4D: A Large Multimodal Climbing Motion Dataset under Human-scene Interactions
Ming Yan · Xin Wang · Yudi Dai · Siqi Shen · Chenglu Wen · Lan Xu · Yuexin Ma · Cheng Wang
MIME: Human-Aware 3D Scene Generation
Hongwei Yi · Chun-Hao Huang · Shashank Tripathi · Lea Hering · Justus Thies · Michael Black
ACR: Attention Collaboration-based Regressor for Arbitrary Two-Hand Reconstruction
Zhengdi Yu · Shaoli Huang · Chen Fang · Toby Breckon · Jue Wang
ARCTIC: A Dataset for Dexterous Bimanual Hand-Object Manipulation
Zicong Fan · Omid Taheri · Dimitrios Tzionas · Muhammed Kocabas · Manuel Kaufmann · Michael Black · Otmar Hilliges
NIKI: Neural Inverse Kinematics with Invertible Neural Networks for 3D Human Pose and Shape Estimation
Jiefeng Li · Siyuan Bian · Qi Liu · Jiasheng Tang · Fan Wang · Cewu Lu
PC²: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction
Luke Melas-Kyriazi · Christian Rupprecht · Andrea Vedaldi
ShapeClipper: Scalable 3D Shape Learning from Single-View Images via Geometric and CLIP-based Consistency
Zixuan Huang · Varun Jampani · Ngoc Anh Thai · Yuanzhen Li · Stefan Stojanov · James Rehg
Human Body Shape Completion with Implicit Shape and Flow Learning
Boyao Zhou · Di Meng · Jean-Sébastien Franco · Edmond Boyer
gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction
Zerui Chen · Shizhe Chen · Cordelia Schmid · Ivan Laptev
Sampling is Matter: Point-guided 3D Human Mesh Reconstruction
Jeong Hwan Kim · Mi-Gyeong Gwon · Hyunwoo Park · Hyukmin Kwon · Gi-Mun Um · Wonjun Kim
High-fidelity 3D Human Digitization from Single 2K Resolution Images
Sang-Hun Han · Min-Gyu Park · Ju Yoon · Ju-Mi Kang · YOUNG-JAE PARK · Hae-Gon Jeon
Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition
Chen Guo · Tianjian Jiang · Xu Chen · Jie Song · Otmar Hilliges
CLOTH4D: A Dataset for Clothed Human Reconstruction
XINGXING ZOU · Xintong Han · Waikeung Wong
RaBit: Parametric Modeling of 3D Biped Cartoon Characters with a Topological-consistent Dataset
Zhongjin Luo · Shengcai Cai · Jinguo Dong · Ruibo Ming · Liangdong Qiu · Xiaohang Zhan · Xiaoguang Han
OmniAvatar: Geometry-Guided Controllable 3D Head Synthesis
Hongyi Xu · Guoxian Song · Zihang Jiang · Jianfeng Zhang · Yichun Shi · Jing Liu · Wanchun Ma · Jiashi Feng · Linjie Luo
HARP: Personalized Hand Reconstruction from Monocular RGB Videos
Korrawe Karunratanakul · Sergey Prokudin · Otmar Hilliges · Siyu Tang
Reconstructing Signing Avatars From Video Using Linguistic Priors
Maria-Paola Forte · Peter Kulits · Chun-Hao Huang · Vasileios Choutas · Dimitrios Tzionas · Katherine J. Kuchenbecker · Michael Black
CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior
Jinbo Xing · Menghan Xia · Yuechen ZHANG · Xiaodong Cun · Jue Wang · Tien-Tsin Wong
MEGANE: Morphable Eyeglass and Avatar Network
Junxuan Li · Shunsuke Saito · Tomas Simon · Stephen Lombardi · Hongdong Li · Jason Saragih
Parametric Implicit Face Representation for Audio-Driven Facial Reenactment
Ricong Huang · Peiwen Lai · Yipeng Qin · Guanbin Li
3D-aware Facial Landmark Detection via Multi-view Consistent Training on Synthetic Data
Libing Zeng · Lele Chen · Wentao Bao · Zhong Li · Yi Xu · Junsong Yuan · Nima Kalantari
DiffusionRig: Learning Personalized Priors for Facial Appearance Editing
Zheng Ding · Cecilia Zhang · Zhihao Xia · Lars Jebe · Zhuowen Tu · Xiuming Zhang
HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling
Yujian Zheng · Zi-Rong Jin · Moran Li · Haibin Huang · Chongyang Ma · Shuguang Cui · Xiaoguang Han
DCFace: Synthetic Face Generation with Dual Condition Diffusion Model
Minchul Kim · Feng Liu · Anil Jain · Xiaoming Liu
3D-Aware Face Swapping
Yixuan Li · Chao Ma · Yichao Yan · Wenhan Zhu · Xiaokang Yang
CoralStyleCLIP: Co-optimized Region and Layer Selection for Image Editing
Ambareesh Revanur · Debraj Basu · Shradha Agrawal · Dhwanit Agarwal · Deepak Pai
Local 3D Editing via 3D Distillation of CLIP Knowledge
Junha Hyung · Sungwon Hwang · Daejin Kim · Hyunji Lee · Jaegul Choo
Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
Gal Metzer · Elad Richardson · Or Patashnik · Raja Giryes · Daniel Cohen-Or
3D-aware multi-class image-to-image translation with NeRFs
Senmao Li · Joost van de Weijer · Yaxing Wang · Fahad Khan · Meiqin Liu · Jian Yang
Diffusion-SDF: Text-to-Shape via Voxelized Diffusion
Muheng Li · Yueqi Duan · Jie Zhou · Jiwen Lu
Infinite Photorealistic Worlds using Procedural Generation
Alexander Raistrick · Lahav Lipson · Zeyu Ma · Lingjie Mei · Mingzhe Wang · Yiming Zuo · Karhan Kayan · Hongyu Wen · Beining Han · Yihan Wang · Alejandro Newell · Hei Law · Ankit Goyal · Kaiyu Yang · Jia Deng
Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation
Haochen Wang · Xiaodan Du · Jiahao Li · Raymond A. Yeh · Greg Shakhnarovich
RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation
Titas Anciukevicius · Zexiang Xu · Matthew Fisher · Paul Henderson · Hakan Bilen · Niloy Mitra · Paul Guerrero
PET-NeuS: Positional Encoding Tri-planes for Neural Surfaces
Yiqun Wang · Ivan Skorokhodov · Peter Wonka
SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction
Zhizhuo Zhou · Shubham Tulsiani
Dionysus: Recovering Scene Structures by Dividing into Semantic Pieces
Likang Wang · Lei Chen
3D shape reconstruction of semi-transparent worms
Thomas Ilett · Omer Yuval · Thomas Ranner · Netta Cohen · David Hogg
Seeing Through the Glass: Neural 3D Reconstruction of Object Inside a Transparent Container
Jinguang Tong · Sundaram Muthu · Fahira Afzal Maken · Chuong Nguyen · Hongdong Li
HumanGen: Generating Human Radiance Fields with Explicit Priors
Suyi Jiang · Haoran Jiang · Ziyu Wang · Haimin Luo · Wenzheng Chen · Lan Xu
Humans as Light Bulbs: 3D Human Reconstruction from Thermal Reflection
Ruoshi Liu · Carl Vondrick
Accidental Light Probes
Hong-Xing Yu · Samir Agarwala · Charles Herrmann · Richard Szeliski · Noah Snavely · Jiajun Wu · Deqing Sun
Inverse Rendering of Translucent Objects using Physical and Neural Renderers
Chenhao Li · Trung Ngo · Hajime Nagahara
Multi-view Inverse Rendering for Large-scale Real-world Indoor Scenes
Zhen Li · Lingli Wang · Mofang Cheng · Cihui Pan · Jiaqi Yang
K-Planes: Explicit Radiance Fields in Space, Time, and Appearance
Sara Fridovich-Keil · Giacomo Meanti · Frederik Warburg · Benjamin Recht · Angjoo Kanazawa
Efficient Map Sparsification Based on 2D and 3D Discretized Grids
Xiaoyu Zhang · Yun-Hui Liu
Modernizing Old Photos Using Multiple References via Photorealistic Style Transfer
Agus Gunawan · Soo Ye Kim · Hyeonjun Sim · Jae-Ho Lee · Munchurl Kim
DINER: Depth-aware Image-based NEural Radiance fields
Malte Prinzler · Otmar Hilliges · Justus Thies
Cross-Guided Optimization of Radiance Fields with Multi-View Image Super-Resolution for High-Resolution Novel View Synthesis
Youngho Yoon · Kuk-Jin YOON
NeRFLight: Fast and Light Neural Radiance Fields using a Shared Feature Grid
Fernando Rivas-Manzaneque · Jorge Sierra-Acosta · Adrian Penate-Sanchez · Francesc Moreno-Noguer · Angela Ribeiro
Multi-Space Neural Radiance Fields
Ze-Xin Yin · Jiaxiong Qiu · Ming-Ming Cheng · Bo Ren
DyLiN: Making Light Field Networks Dynamic
Heng Yu · Joel Julin · Zoltan Milacski · Koichiro Niinuma · Laszlo Jeni
DP-NeRF: Deblurred Neural Radiance Field with Physical Scene Priors
Do-Gyoon Lee · Minhyeok Lee · Chajin Shin · Sangyoun Lee
SUDS: Scalable Urban Dynamic Scenes
Haithem Turki · Jason Zhang · Francesco Ferroni · Deva Ramanan
NeRFLix: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-viewpoint MiXer
Kun Zhou · Wenbo Li · Yi Wang · Tao Hu · Nianjuan Jiang · Xiaoguang Han · Jiangbo Lu
Polarimetric iToF: Measuring High-Fidelity Depth through Scattering Media
Daniel Jeon · Andreas Meuleman · Seung-Hwan Baek · Min H. Kim
MaLP: Manipulation Localization Using a Proactive Scheme
Vishal Asnani · Xi Yin · Tal Hassner · Xiaoming Liu
Physically Adversarial Infrared Patches with Learnable Shapes and Locations
Xingxing Wei · Jie Yu · Yao Huang
Towards Benchmarking and Assessing Visual Naturalness of Physical World Adversarial Attacks
Simin Li · Shuning Zhang · Gujun Chen · Dong Wang · Pu Feng · Jiakai Wang · Aishan Liu · Xin Yi · Xianglong Liu
Seasoning Model Soups for Robustness to Adversarial and Natural Distribution Shifts
Francesco Croce · Sylvestre-Alvise Rebuffi · Evan Shelhamer · Sven Gowal
Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression
Junho Kim · Byung-Kwan Lee · Yong Man Ro
Black-Box Sparse Adversarial Attack via Multi-Objective Optimisation
Phoenix Williams · Ke Li
Enhancing the Self-Universality for Transferable Targeted Attacks
Zhipeng Wei · Jingjing Chen · Zuxuan Wu · Yu-Gang Jiang
Evading DeepFake Detectors via Adversarial Statistical Consistency
Hou Yang · Qing Guo · Yihao Huang · Xiaofei Xie · Lei Ma · Jianjun Zhao
CAP: Robust Point Cloud Classification via Semantic and Structural Modeling
Daizong Ding · Erling Jiang · Yuanmin Huang · Mi Zhang · Wenxuan Li · Min Yang
Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger
Yi Yu · Yufei Wang · Wenhan Yang · Shijian Lu · Yap-peng Tan · Alex Kot
FedSeg: Class-Heterogeneous Federated Learning for Semantic Segmentation
Jiaxu Miao · Zongxin Yang · Leilei Fan · Yi Yang
Multimodal Industrial Anomaly Detection via Hybrid Fusion
Yue Wang · Jinlong Peng · Jiangning Zhang · Ran Yi · Yabiao Wang · Chengjie Wang
Unbiased Multiple Instance Learning for Weakly Supervised Video Anomaly Detection
HUI LYU · Zhongqi Yue · Qianru Sun · Bin Luo · Zhen Cui · Hanwang Zhang
Attribute-preserving Face Dataset Anonymization via Latent Code Optimization
Simone Barattin · Christos Tzelepis · Ioannis Patras · Nicu Sebe
HandsOff: Labeled Dataset Generation with No Additional Human Annotations
Austin Xu · Mariya Vasileva · Achal Dave · Arjun Seshadri
Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences between Pretrained Generative Models
Matthew Olson · Shusen Liu · Rushil Anirudh · Jayaraman J. Thiagarajan · Peer-timo Bremer · Weng-Keen Wong
Learning to Generate Image Embeddings with User-level Differential Privacy
Zheng Xu · Maxwell Collins · Yuxiao Wang · Liviu Panait · Sewoong Oh · Sean Augenstein · Ting Liu · Florian Schroff · Hugh McMahan
Adaptive Data-Free Quantization
Biao Qian · Yang Wang · Richang Hong · Meng Wang
Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective
Yuexiao Ma · Huixia Li · Xiawu Zheng · Xuefeng Xiao · Rui Wang · Shilei Wen · Xin Pan · Fei Chao · Rongrong Ji
One-Shot Model for Mixed-Precision Quantization
Ivan Koryakovskiy · Alexandra Yakovleva · Valentin Buchnev · Temur Isaev · Gleb Odinokikh
Training debiased subnetworks with contrastive weight pruning
Geon Yeong Park · Sangmin Lee · Sang Wan Lee · Jong Ye
Understanding Masked Autoencoders via Hierarchical Latent Variable Models
Lingjing Kong · Martin Q. Ma · Guangyi Chen · Eric Xing · Yuejie Chi · Louis-Philippe Morency · Kun Zhang
MobileOne: An Improved One Millisecond Mobile Backbone
Pavan Kumar Anasosalu Vasu · James Gabriel · Jeff Zhu · Oncel Tuzel · Anurag Ranjan
Rate Gradient Approximation Attack Threats Deep Spiking Neural Networks
Tong Bu · Jianhao Ding · Zecheng Hao · Zhaofei Yu
Constructing Deep Spiking Neural Networks from Artificial Neural Networks with Knowledge Distillation
Qi Xu · Yaxin Li · Jiangrong Shen · Jian Liu · Huajin Tang · Gang Pan
From Node Interaction to Hop Interaction: New Effective and Scalable Graph Learning Paradigm
Jie Chen · Zilong Li · Zhu Yin · Junping Zhang · Jian Pu
A General Regret Bound of Preconditioned Gradient Method for DNN Training
Hongwei Yong · Ying Sun · Lei Zhang
Improved Distribution Matching for Dataset Condensation
Ganlong Zhao · Guanbin Li · Yipeng Qin · Yizhou Yu
Imitation Learning as State Matching via Differentiable Physics
Siwei Chen · Xiao Ma · Zhongwen Xu
Trainable Projected Gradient Method for Robust Fine-tuning
Junjiao Tian · Xiaoliang Dai · Chih-Yao Ma · Zecheng He · Yen-Cheng Liu · Zsolt Kira
Improving Generalization of Meta Learning with Inverted Regularization at Inner-level
Lianzhe Wang · Shiji Zhou · Shanghang Zhang · Xu Chu · Heng Chang · Wenwu Zhu
SIM: Semantic-aware Instance Mask Generation for Box-Supervised Instance Segmentation
Ruihuang Li · Chenhang HE · Yabin Zhang · Shuai Li · Liyi Chen · Lei Zhang
Rethinking the Correlation in Few-Shot Segmentation: A Buoys View
Yuan Wang · Rui Sun · Tianzhu Zhang
Reliability in Semantic Segmentation: Are We on the Right Track?
Pau de Jorge Aranda · Riccardo Volpi · Philip Torr · Grégory Rogez
ACSeg: Adaptive Conceptualization for Unsupervised Semantic Segmentation
Kehan Li · Zhennan Wang · Zesen Cheng · Runyi Yu · Yian Zhao · Guoli Song · Chang Liu · Li Yuan · Jie Chen
PartDistillation: Learning Parts from Instance Segmentation
Jang Hyun Cho · Philipp Kraehenbuehl · Vignesh Ramanathan
PACO: Parts and Attributes of Common Objects
Vignesh Ramanathan · Anmol Kalia · Vladan Petrovic · Yi Wen · Baixue Zheng · Baishan Guo · Rui Wang · Aaron Marquez · Rama Kovvuri · Abhishek Kadian · Amir Mousavi · Yiwen Song · Abhimanyu Dubey · Dhruv Mahajan
MIANet: Aggregating Unbiased Instance and General Information for Few-Shot Semantic Segmentation
Yong Yang · Qiong Chen · Yuan Feng · Tianlin Huang
Generative Semantic Segmentation
Jiaqi Chen · Jiachen Lu · Xiatian Zhu · Li Zhang
GeoLayoutLM: Geometric Pre-training for Visual Information Extraction
Chuwei Luo · Changxu Cheng · Qi Zheng · Cong Yao
GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts
Haoran Geng · Helin Xu · Chengyang Zhao · Chao Xu · Li Yi · Siyuan Huang · He Wang
A Simple Framework for Text-Supervised Semantic Segmentation
Muyang Yi · Quan Cui · Hao Wu · Cheng Yang · Osamu Yoshie · Hongtao Lu
Learning to Detect and Segment for Open Vocabulary Object Detection
Tao Wang
Open-vocabulary Attribute Detection
Maria Bravo · Sudhanshu Mittal · Simon Ging · Thomas Brox
CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching
Xiaoshi Wu · Feng Zhu · Rui Zhao · Hongsheng Li
CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP
Runnan Chen · Youquan Liu · Lingdong Kong · Xinge ZHU · Yuexin Ma · Yikang LI · Yuenan Hou · Yu Qiao · Wenping Wang
PLA: Language-Driven Open-Vocabulary 3D Scene Understanding
Runyu Ding · Jihan Yang · Chuhui Xue · Wenqing Zhang · Song Bai · XIAOJUAN QI
CrOC: Cross-View Online Clustering for Dense Visual Representation Learning
Thomas Stegmüller · Tim Lebailly · Behzad Bozorgtabar · Tinne Tuytelaars · Jean-Philippe Thiran
ScanDMM: A Deep Markov Model of Scanpath Prediction for 360° Images
Xiangjie Sui · Yuming Fang · Hanwei Zhu · Shiqi Wang · Zhou Wang
Turning a CLIP Model into a Scene Text Detector
Wenwen Yu · Yuliang Liu · Wei Hua · Deqiang Jiang · Bo Ren · Xiang Bai
Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training
Filip Radenovic · Abhimanyu Dubey · Abhishek Kadian · Todor Mihaylov · Simon Vandenhende · Yash Patel · Yi Wen · Vignesh Ramanathan · Dhruv Mahajan
Uncurated Image-Text Datasets: Shedding Light on Demographic Bias
Noa Garcia · Yusuke Hirota · YANKUN WU · Yuta Nakashima
EXIF as Language: Learning Cross-Modal Associations Between Images and Camera Metadata
Chenhao Zheng · Ayush Shrivastava · Andrew Owens
Cross-Domain Image Captioning with Discriminative Finetuning
Roberto Dessi · Michele Bevilacqua · Eleonora Gualdoni · Nathanaël Rakotonirina · Francesca Franzon · Marco Baroni
Similarity Maps for Self-Training Weakly-Supervised Phrase Grounding
Tal Shaharabany · Lior Wolf
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
Sara Sarto · Manuele Barraco · Marcella Cornia · Lorenzo Baraldi · Rita Cucchiara
Detecting and Grounding Multi-Modal Media Manipulation
Rui Shao · Tianxing Wu · Ziwei Liu
DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation
Yueming Lyu · Tianwei Lin · Fu Li · Dongliang He · Jing Dong · Tieniu Tan
Parts2Words: Learning Joint Embedding of Point Clouds and Texts by Bidirectional Matching between Parts and Words
Chuan Tang · Xi Yang · Bojian Wu · Zhizhong Han · Yi Chang
Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR
Aneeshan Sain · Ayan Kumar Bhunia · Subhadeep Koley · Pinaki Nath Chowdhury · Soumitri Chattopadhyay · Tao Xiang · Yi-Zhe Song
GeneCIS: A Benchmark for General Conditional Image Similarity
Sagar Vaze · Nicolas Carion · Ishan Misra
Picture that Sketch: Photorealistic Image Generation from Abstract Sketches
Subhadeep Koley · Ayan Kumar Bhunia · Aneeshan Sain · Pinaki Nath Chowdhury · Tao Xiang · Yi-Zhe Song
Hyperbolic Contrastive Learning for Visual Representations beyond Objects
Songwei Ge · Shlok Mishra · Simon Kornblith · Chun-Liang Li · David Jacobs
Images Speak in Images: A Generalist Painter for In-Context Visual Learning
Xinlong Wang · Wen Wang · Yue Cao · Chunhua Shen · Tiejun Huang
DeAR: Debiasing Vision-Language Models with Additive Residuals
Ashish Seth · Mayur Hemani · Chirag Agarwal
Leverage Interactive Affinity for Affordance Learning
Hongchen Luo · Wei Zhai · Jing Zhang · Yang Cao · Dacheng Tao
Affordance Grounding from Demonstration Video to Target Image
Joya Chen · Difei Gao · Kevin Qinghong Lin · Mike Zheng Shou
Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning
Yatai Ji · Rong-Cheng Tu · Jie Jiang · Weijie Kong · Chengfei Cai · Wenzhe Zhao · WANG HongFa · Yujiu Yang · Wei Liu
Is BERT Blind? Exploring the Effect of Vision-and-Language Pretraining on Visual Language Understanding
Morris Alper · Michael Fiman · Hadar Averbuch-Elor
Probabilistic Prompt Learning for Dense Prediction
Hyeongjun Kwon · Taeyong Song · Somi Jeong · Jin Kim · Jinhyun Jang · Kwanghoon Sohn
Visual-Language Prompt Tuning with Knowledge-guided Context Optimization
Hantao Yao · Rui Zhang · Changsheng Xu
The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training
Gi-Cheon Kang · Sungdong Kim · Jinhwa Kim · Donghyun Kwak · Byoung-Tak Zhang
Divide and Conquer: Answering Questions with Object Factorization and Compositional Reasoning
Shi Chen · Qi Zhao
Logical Implications for Visual Question Answering Consistency
Sergio Tascon Morales · Pablo Márquez Neila · Raphael Sznitman
Abstract Visual Reasoning: An Algebraic Approach for Solving Raven’s Progressive Matrices
Jingyi Xu · Tushar Vaidya · Yufei Wu · Saket Chandra · Zhangsheng Lai · Kai Fong Ernest Chong
NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory
Santhosh Kumar Ramakrishnan · Ziad Al-Halah · Kristen Grauman
Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding
Minyoung Hwang · Jaeyeon Jeong · Minsoo Kim · Yoonseon Oh · Songhwai Oh
3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification
Jiazhao Zhang · Liu Dai · Fanpeng Meng · Qingnan Fan · Xuelin Chen · Kai Xu · He Wang
VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision
Mengyin Liu · Jie Jiang · Chao Zhu · Xu-Cheng Yin
An Actor-Centric Causality Graph for Asynchronous Temporal Inference in Group Activity
Zhao Xie · Tian Gao · Kewei Wu · Jiao Chang
Affection: Learning Affective Explanations for Real-World Visual Data
Panos Achlioptas · Maks Ovsjanikov · Leonidas Guibas · Sergey Tulyakov
Decoupled Multimodal Distilling for Emotion Recognition
Yong Li · Yuanzhi Wang · Zhen Cui
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
Wenhao Wu · Xiaohan Wang · Haipeng Luo · Jingdong Wang · Yi Yang · Wanli Ouyang
Learning Video Representations from Large Language Models
Yue Zhao · Ishan Misra · Philipp Kraehenbuehl · Rohit Girdhar
ProTéGé: Untrimmed Pretraining for Video Temporal Grounding by Video Temporal Grounding
Lan Wang · Gaurav Mittal · Sandra Sajeev · Ye Yu · Matthew Hall · Vishnu Naresh Boddeti · Mei Chen
Fine-tuned CLIP Models are Efficient Video Learners
Hanoona Bangalath · Muhammad Uzair Khattak · Muhammad Maaz · Salman Khan · Fahad Khan
Movies2Scenes: Using Movie Metadata to Learn Scene Representation
Shixing Chen · Chun-Hao Liu · Xiang Hao · Xiaohan Nie · Maxim Arap · Raffay Hamid
Soft-Landing Strategy for Alleviating the Task Discrepancy Problem in Temporal Action Localization Tasks
Hyolim Kang · Hanjung Kim · Joungbin An · Minsu Cho · Seon Joo Kim
Reducing the Label Bias for Timestamp Supervised Temporal Action Segmentation
Kaiyuan Liu · Yunheng Li · Shenglan Liu · Tan · Zihang Shao
Active Exploration of Multimodal Complementarity for Few-Shot Action Recognition
Yuyang Wanyan · Xiaoshan Yang · Chaofan Chen · Changsheng Xu
MMG-Ego4D: Multimodal Generalization in Egocentric Action Recognition
Xinyu Gong · Sreyas Mohan · Naina Dhingra · Jean-Charles Bazin · YILEI LI · Zhangyang Wang · Rakesh Ranjan
Prompt-Guided Zero-Shot Anomaly Action Recognition using Pretrained Deep Skeleton Features
Fumiaki Sato · Ryo Hachiuma · Taiki Sekii
TempSAL - Uncovering Temporal Information for Deep Saliency Prediction
Bahar Aydemir · Ludo Hoffstetter · Tong Zhang · Mathieu Salzmann · Sabine Süsstrunk
Decompose More and Aggregate Better: Two Closer Looks at Frequency Representation Learning for Human Motion Prediction
Xuehao Gao · Shaoyi Du · Yang Wu · Yang Yang
CASP-Net: Rethinking Video Saliency Prediction from an Audio-Visual Consistency Perceptual Perspective
Junwen Xiong · Ganglai Wang · Peng Zhang · Wei Huang · Yufei Zha · Guangtao Zhai
Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment
Sungbin Kim · Arda Senocak · Hyunwoo Ha · Andrew Owens · Tae-Hyun Oh
Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning
Weixuan Sun · Jiayi Zhang · Jianyuan Wang · Zheyuan Liu · Yiran Zhong · Tianpeng Feng · Yandong Guo · Yanhao Zhang · Nick Barnes
Novel-view Acoustic Synthesis
Changan Chen · Alexander Richard · Roman Shapovalov · Vamsi Krishna Ithapu · Natalia Neverova · Kristen Grauman · Andrea Vedaldi
Relational Space-Time Query in Long-Form Videos
Xitong Yang · FU-JEN CHU · Raghav Goyal · Matt Feiszli · Lorenzo Torresani · Du Tran
Selec