Semantic-aware scene recognition

López Cifuentes, Alejandro; Escudero Viñolo, Marcos; Bescos Cano, Jesús; García Martín, Álvaro


Author(s)

López Cifuentes, Alejandro; Escudero Viñolo, Marcos; Bescos Cano, Jesús; García Martín, Álvaro

Entity

UAM. Departamento de Tecnología Electrónica y de las Comunicaciones

Publisher

Elsevier

Publication date

2020-06-01

Citation

Semantic-aware scene recognition. Pattern Recognition 152 (2020): 107256

ISSN

0031-3203

DOI

10.1016/j.patcog.2020.107256

Funded by

This study has been partially supported by the Spanish Government through its TEC2017-88169-R MobiNetVideo project

Project

Gobierno de España. TEC2017-88169-R

Publisher's version

https://doi.org/10.1016/j.patcog.2020.107256

Subjects

Convolutional neural networks; Deep learning; Scene recognition; Semantic segmentation; Telecommunications

URI

http://hdl.handle.net/10486/706166

Rights

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license.

Abstract

Scene recognition is currently one of the most challenging research fields in computer vision. This may be due to the ambiguity between classes: images of several scene classes may share similar objects, which causes confusion among them. The problem is aggravated when images of a particular scene class are notably different from one another. Convolutional Neural Networks (CNNs) have significantly boosted performance in scene recognition, although it remains far below that of other recognition tasks (e.g., object or image recognition). In this paper, we describe a novel approach for scene recognition based on an end-to-end multi-modal CNN that combines image and context information by means of an attention module. Context information, in the form of a semantic segmentation, is used to gate features extracted from the RGB image by leveraging the information encoded in the semantic representation: the set of scene objects and stuff, and their relative locations. This gating process reinforces the learning of indicative scene content and enhances scene disambiguation by refocusing the receptive fields of the CNN towards that content. Experimental results on three publicly available datasets show that the proposed approach outperforms every other state-of-the-art method while significantly reducing the number of network parameters. All the code and data used in this paper are available at: https://github.com/vpulab/Semantic-Aware-Scene-Recognition
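
The gating idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (that is in the repository linked above). It assumes, purely for illustration, that the semantic branch yields one relevance score per spatial location, that the attention map is the sigmoid of those scores, and that gating is an element-wise product with every RGB feature channel. The names `semantic_gate` and `global_average_pool` and the toy values are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real score to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def semantic_gate(rgb_features, semantic_scores):
    """Gate RGB feature maps with an attention map built from semantic scores.

    rgb_features:    C channels, each an H x W nested list of floats.
    semantic_scores: H x W nested list of floats (assumed: one relevance
                     score per location, derived from the segmentation).
    Returns the gated features (C x H x W) and the attention map (H x W).
    """
    # Assumption: attention is a sigmoid of the semantic scores.
    attention = [[sigmoid(s) for s in row] for row in semantic_scores]
    # Element-wise product of each channel with the shared attention map.
    gated = [
        [[f * a for f, a in zip(f_row, a_row)]
         for f_row, a_row in zip(channel, attention)]
        for channel in rgb_features
    ]
    return gated, attention


def global_average_pool(features):
    """Average each channel over its spatial locations."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]


# Toy example: 2 channels over a 2 x 2 spatial grid.
rgb = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
# High score keeps a location, low score suppresses it, zero halves it.
sem = [[10.0, -10.0], [0.0, 0.0]]

gated, attention = semantic_gate(rgb, sem)
descriptor = global_average_pool(gated)
```

In this toy case the top-left location passes almost unchanged, the top-right location is almost fully suppressed, and the bottom row is halved, so the pooled descriptor is dominated by the locations the semantic scores mark as relevant.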


Files

Name

semantic_lopez-cifuentes_PR_2020_pre.pdf

Size

6.969Mb

Format

PDF

