Skip to content

Latest commit

 

History

History
603 lines (345 loc) · 21.2 KB

File metadata and controls

603 lines (345 loc) · 21.2 KB

ECCV2020 - European Conference on Computer Vision

Contents


ECCV2020 6D姿态估计

ECCV2020 (ORAL) Self6D: Self-Supervised Monocular 6D Object Pose Estimation

  • 标题:Self6D:自监督单目6D物体姿态估计
  • 作者团队:清华大学 & 慕尼黑工业大学 & 谷歌

6D目标姿态估计是计算机视觉中的一个基本问题。卷积神经网络(CNN)最近被证明可以从单目获取的图像中预测物体可靠的6D姿态。尽管如此,CNN仍然被认为是高度依赖数据驱动的 (data-driven),但是,获取足够的标注通常非常耗时且劳动强度大。为克服此缺点,我们提出了通过自我监督学习进行单目6D姿态估计的想法,该想法消除了对带有标注的真实数据需求。在用合成RGB数据完全监督了我们提议的网络之后,我们利用神经渲染的最新进展对未标注的真实RGB-D数据进行进一步的自我监督,以寻求视觉和几何上的最佳对齐方式。广泛的评估表明,我们提出的自我监督能够显着提高模型的原始性能,胜过所有其他依赖于合成数据或采用领域自适应技术的方法。

[arXiv]


ECCV2020 Neural Object Learning for 6D Pose Estimation Using a Few Cluttered Images

  • 标题:使用少量杂波图像进行6D姿势估计的神经目标学习
  • 作者团队:维也纳工业大学 (TU Wien)

用于目标6D姿态估计的最新方法,比如假定纹理3D模型或覆盖目标姿态整个范围的真实图像。 但是,在真实场景中很难获得带纹理的3D模型和带有标注对象姿态的数据。 本文提出了一种方法,即神经目标学习(NOL),该方法可以通过仅合并来自混合图像的一些观察结果,以任意姿态生成目标的合成图像。 提出了一种新颖的细化步骤,以对齐源图像中对象的不准确姿态,从而获得质量更好的图像。 对两个公共数据集的评估表明,与使用真实图像数量10倍的方法相比,由NOL创建的渲染图像达到SOTA的性能。 对我们新数据集的评估表明,可以使用一系列固定场景同时训练和识别多个目标。

[arXiv] [YouTube]


ECCV2020 Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation

  • 标题:先验形状变形用于分类6D对象姿态和大小估计
  • 作者团队:新加坡国立大学

我们提出了一种新颖的学习方法,可以从RGB-D图像中恢复未见对象实例的6D姿态和大小。 为了处理类内形状变化,我们提出了一个深层网络,可通过对来自预先学习的分类形状的变形进行显式建模来重建3D对象模型。 此外,我们的网络可以推断对象实例的深度,用于观察与重建3D模型之间的紧密对应关系,以达到共同估算6D对象姿态和大小的目的。 我们设计了一个自动编码器,该编码器可以在一组对象模型上进行训练,并计算每个类别的平均潜在嵌入量,以学习分类先验形状。 在合成数据集和真实数据集上进行的大量实验表明,我们的方法大大优于现有技术。

[arXiv] [GitHub]


ECCV2020 目标检测

Dense RepPoints: Representing Visual Objects with Dense Point Sets

Corner Proposal Network for Anchor-free, Two-stage Object Detection

BorderDet: Border Feature for Dense Object Detection

Multi-Scale Positive Sample Refinement for Few-Shot Object Detection

PIoU Loss: Towards Accurate Oriented Object Detection in Complex Environments

Boosting Weakly Supervised Object Detection with Progressive Knowledge Transfer

Probabilistic Anchor Assignment with IoU Prediction for Object Detection

HoughNet: Integrating near and long-range evidence for bottom-up object detection

OS2D: One-Stage One-Shot Object Detection by Matching Anchor Features

End-to-End Object Detection with Transformers

Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training

遥感旋转目标检测

Arbitrary-Oriented Object Detection with Circular Smooth Label

3D目标检测

Pillar-based Object Detection for Autonomous Driving

EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection

视频目标检测

Learning Where to Focus for Efficient Video Object Detection


图像分割

Tensor Low-Rank Reconstruction for Semantic Segmentation

Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation

GMNet: Graph Matching Network for Large Scale Part Semantic Segmentation in the Wild

SegFix: Model-Agnostic Boundary Refinement for Segmentation

Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation(Oral)

Improving Semantic Segmentation via Decoupled Body and Edge Supervision

实例分割

SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation

Commonality-Parsing Network across Shape and Appearance for Partially Supervised Instance Segmentation

Boundary-preserving Mask R-CNN

Conditional Convolutions for Instance Segmentation (Oral)

SOLO: Segmenting Objects by Locations

视频目标分割

Collaborative Video Object Segmentation by Foreground-Background Integration

Video Object Segmentation with Episodic Graph Memory Networks


人脸(检测/识别/解析等)

3D人脸重建

Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency

人脸活体检测

CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations

人脸解析

Edge-aware Graph Representation Learning and Reasoning for Face Parsing


目标跟踪

单目标跟踪

Ocean: Object-aware Anchor-Free Tracking

多目标跟踪

Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking

Ocean: Object-aware Anchor-Free Tracking

TAO: A Large-Scale Benchmark for Tracking Any Object

Segment as Points for Efficient Online Multi-Object Tracking and Segmentation(Oral)


3D点云(分类/分割/配准/补全等)

AdvPC: Transferable Adversarial Perturbations on 3D Point Clouds

A Closer Look at Local Aggregation Operators in Point Cloud Analysis

3D点云补全

Multimodal Shape Completion via Conditional Generative Adversarial Networks

GRNet: Gridding Residual Network for Dense Point Cloud Completion

3D点云生成

Progressive Point Cloud Deconvolution Generation Network


图像分类

Learning to Learn Parameterized Classification Networks for Scalable Input Images

Learning To Classify Images Without Labels


视频理解/行为识别/行为检测

Context-Aware RCNN: A Baseline for Action Detection in Videos

Actions as Moving Points

SF-Net: Single-Frame Supervision for Temporal Action Localization

Asynchronous Interaction Aggregation for Action Detection


GAN

Rewriting a Deep Generative Model

Contrastive Learning for Unpaired Image-to-Image Translation

XingGAN for Person Image Generation


Re-ID

Temporal Complementary Learning for Video Person Re-Identification

论文:https://arxiv.org/abs/2007.09357 代码:https://github.com/blue-blue272/VideoReID-TCLNet

Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification

Robust Re-Identification by Multiple Views Knowledge Distillation

Multiple Expert Brainstorming for Domain Adaptive Person Re-identification


ECCV2020 Oral

Quaternion Equivariant Capsule Networks for 3D Point Clouds Oral

DeepFit: 3D Surface Fitting by Neural Network Weighted Least Squares Oral

MoSaNAS: Multi-Objective Surrogate-Assisted Neural Architecture Search Oral

Describing Textures using Natural Language Oral

Empowering Relational Network by Self-Attention Augmented Conditional Random Fields for Group Activity Recognition Oral

AiR: Attention with Reasoning Capability Oral

Self6D: Self-Supervised Monocular 6D Object Pose Estimation Oral

Invertible Image Rescaling Oral

Synthesize then Compare: Detecting Failures and Anomalies for Semantic Segmentation Oral

House-GAN: Relational Generative Adversarial Networks for Graph-constrained House Layout Generation Oral

Crowdsampling the Plenoptic Function Oral

End-to-End Estimation of Multi-Person 3D Poses from Multiple Cameras Oral

End-to-End Object Detection with Transformers Oral

DeepSFM: Structure From Motion Via Deep Bundle Adjustment Oral

Ladybird: Deep Implicit Field Based 3D Reconstruction with Sampling and Symmetry Oral

Segment as Points for Efficient Online Multi-Object Tracking and Segmentation Oral

Conditional Convolutions for Instance Segmentation Oral

MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution Oral

Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset Oral

Privacy Preserving Structure-from-Motion Oral

Rewriting a Deep Generative Model Oral

Compare and Reweight: Distinctive Image Captioning Using Similar Images Sets Oral

Long-term Human Motion Prediction with Scene Context Oral

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis Oral

ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes Oral

MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images Oral

Learning and aggregating deep local descriptors for instance-level recognition Oral

A Consistently Fast and Globally Optimal Solution to the Perspective-n-Point Problem Oral

Learn to Recover Visible Color for Video Surveillance in a Day Oral

Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction from Single-view Images Oral

Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation Oral

BorderDet: Border Feature for Dense Object Detection Oral

Regularization with Latent Space Virtual Adversarial Training Oral

Du$^2$Net: Learning Depth Estimation from Dual-Cameras and Dual-Pixels Oral

Model-Agnostic Boundary-Adversarial Sampling for Test-Time Generalization in Few-Shot learning Oral

Targeted Attack for Deep Hashing based Retrieval Oral

Gradient Centralization: A New Optimization Technique for Deep Neural Networks Oral

Content-Aware Unsupervised Deep Homography Estimation Oral

Multi-View Optimization of Local Feature Geometry Oral

Efficient Model Fitting by Combining Lifted Optimization with Phong Surface Models Oral

Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video Oral

Learning Stereo from Single Images Oral

Prototype Rectification for Few-Shot Learning Oral

Learning Feature Descriptors using Camera Pose Supervision Oral

Semantic Flow for Fast and Accurate Scene Parsing Oral

Appearance Consensus Driven Self-Supervised Human Mesh Recovery Oral

Diffraction Line Imaging Oral

Aligning and Projecting Images to Class-conditional Generative Networks Oral

Suppress and Balance: A Simple Gated Network for Salient Object Detection Oral

Visual Memorability for Robotic Interestingness Prediction via Unsupervised Online Learning Oral

Post-Training Piecewise Linear Quantization for Deep Neural Networks Oral

Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification Oral

In-Home Daily-Life Captioning Using Radio Signals Oral

Self-Challenging Improves Cross-Domain Generalization Oral

A Competence-aware Curriculum for Visual Concepts Learning via Question Answering Oral

Multi-task Learning Increases Adversarial Robustness Oral

S2DNAS: Transforming Static CNN Model for Dynamic Inference via Neural Architecture Search Oral

Improving Deep Video Compression by Resolution-adaptive Flow Coding Oral

Motion Capture from Internet Videos Oral

Appearance-Preserving 3D Convolution for Video-based Person Re-identification Oral

Solving the Blind Perspective-n-Point Problem End-To-End With Robust Differentiable Geometric Optimization Oral

Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation Oral

Deep Spatial-angular Regularization for Compressive Light Field Reconstruction over Coded Apertures Oral

Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling Oral

Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction Oral

Orientation-aware Vehicle Re-identification with Semantics-guided Part Attention Network Oral

Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation Oral

Coherent full scene 3D reconstruction from a single RGB image Oral

Layer-wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs Oral

RAFT: Recurrent All-Pairs Field Transforms for Optical Flow Oral

Domain-invariant Stereo Matching Networks Oral

DeepHandMesh: Weakly-supervised Deep Encoder-Decoder Framework for High-fidelity Hand Mesh Modeling from a Single RGB Image Oral

Content Adaptive and Error Propagation Aware Deep Video Compression Oral

Towards Streaming Image Understanding Oral

Towards Automated Testing and Robustification by Semantic Adversarial Data Generation Oral

Adversarial Generative Grammars for Human Activity Prediction Oral

Greedy Sampler and Dumb Learner: A Surprisingly Effective Approach for Continual Learning Oral

Learning Lane Graph Representations for Motion Forecasting Oral

What Matters in Unsupervised Optical Flow Oral

Synthesis and Completion of Facades from Satellite Imagery Oral

Mapillary Planet-Scale Depth Dataset Oral

V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction Oral

Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters Oral

EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning Oral

Intrinsic Point Cloud Interpolation via Dual Latent Space Navigation Oral

Cross-Domain Cascaded Deep Translation Oral

"Look Ma, no landmarks!" - Unsupervised, model-based dense face alignment Oral

Online Invariance Selection for Local Feature Descriptors Oral

Rethinking image inpainting via a mutual encoder-decoder with feature equalization Oral

TextCaps: a Dataset for Image Captioning with Reading Comprehension Oral

It is not the Journey but the Destination: Endpoint Conditioned Trajectory Prediction Oral

Learning What to Learn for Video Object Segmentation Oral

SIZER: A Dataset and Model for Parsing 3D Clothing and Learning Size Sensitive 3D Clothing Oral

LIMP: Learning Latent Shape Representations with Metric Preservation Priors Oral

Unsupervised Sketch-to-Photo Synthesis Oral

A simple way to make neural networks robust against diverse image corruptions Oral

SoftpoolNet: Shape Descriptor for Point Cloud Completion and Classification Oral

Hierarchical Face Aging through Disentangled Latent Characteristics Oral

Hybrid Models for Open Set Recognition Oral

TopoGAN: A Topology-Aware Generative Adversarial Network Oral

Learning to Localize Actions from Moments Oral

ForkGAN: Seeing into the Rainy Night Oral

TCGM: An Information-Theoretic Framework for Semi-Supervised Multi-Modality Learning Oral

ExchNet: A Unified Hashing Network for Large-Scale Fine-Grained Image Retrieval Oral