Show simple item record

dc.contributor.author  López Cifuentes, Alejandro  [es_ES]
dc.contributor.author  Escudero Viñolo, Marcos  [es_ES]
dc.contributor.author  Bescos Cano, Jesús  [es_ES]
dc.contributor.author  García Martín, Álvaro  [es_ES]
dc.contributor.other  UAM. Departamento de Tecnología Electrónica y de las Comunicaciones  [en_US]
dc.date.accessioned  2023-02-02T17:44:08Z  [en_US]
dc.date.available  2023-02-02T17:44:08Z  [en_US]
dc.date.issued  2020-06-01  [en_US]
dc.identifier.citation  López-Cifuentes, A.; Escudero-Viñolo, M.; Bescós, J.; García-Martín, Á. (2020). Semantic-aware scene recognition. Pattern Recognition, 102, 107256.  [en_US]
dc.identifier.issn  0031-3203  [en_US]
dc.identifier.uri  http://hdl.handle.net/10486/706166  [en_US]
dc.description.abstract  Scene recognition is currently one of the most challenging research fields in computer vision. This may be due to the ambiguity between classes: images of several scene classes may share similar objects, which causes confusion among them. The problem is aggravated when images of a particular scene class are notably different from each other. Convolutional Neural Networks (CNNs) have significantly boosted performance in scene recognition, although performance still falls well below that of other recognition tasks (e.g., object or image recognition). In this paper, we describe a novel approach for scene recognition based on an end-to-end multi-modal CNN that combines image and context information by means of an attention module. Context information, in the shape of a semantic segmentation, is used to gate features extracted from the RGB image by leveraging the information encoded in the semantic representation: the set of scene objects and stuff, and their relative locations. This gating process reinforces the learning of indicative scene content and enhances scene disambiguation by refocusing the receptive fields of the CNN towards that content. Experimental results on three publicly available datasets show that the proposed approach outperforms every other state-of-the-art method while significantly reducing the number of network parameters. All the code and data used in this paper are available at: https://github.com/vpulab/Semantic-Aware-Scene-Recognition  [en_US]
dc.description.sponsorship  This study has been partially supported by the Spanish Government through its TEC2017-88169-R MobiNetVideo project.  [en_US]
dc.format.extent  30 pages  [en_US]
dc.format.mimetype  application/pdf  [en_US]
dc.language.iso  eng  [en_US]
dc.publisher  Elsevier  [en_US]
dc.relation.ispartof  Pattern Recognition  [en_US]
dc.rights  © 2020 Elsevier Ltd.  [en_US]
dc.subject.other  Convolutional neural networks  [es_ES]
dc.subject.other  Deep learning  [es_ES]
dc.subject.other  Scene recognition  [es_ES]
dc.subject.other  Semantic segmentation  [es_ES]
dc.title  Semantic-aware scene recognition  [es_ES]
dc.type  article  [es_ES]
dc.subject.eciencia  Telecommunications  [es_ES]
dc.relation.publisherversion  https://doi.org/10.1016/j.patcog.2020.107256  [en_US]
dc.identifier.doi  10.1016/j.patcog.2020.107256  [en_US]
dc.identifier.publicationfirstpage  107256.1  [en_US]
dc.identifier.publicationlastpage  107256.15  [en_US]
dc.identifier.publicationvolume  102  [en_US]
dc.relation.projectID  Gobierno de España. TEC2017-88169-R  [es_ES]
dc.type.version  info:eu-repo/semantics/submittedVersion  [es_ES]
dc.contributor.group  Video Processing and Understanding Lab (Grupo de Tratamiento e Interpretación de Vídeo)  [en_US]
dc.rights.cc  Attribution-NonCommercial-NoDerivatives  [en_US]
dc.rights.accessRights  openAccess  [en_US]
dc.facultadUAM  Escuela Politécnica Superior  [es_ES]
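
The abstract above describes gating RGB features with attention derived from a semantic segmentation. Below is a minimal, hypothetical PyTorch sketch of that gating idea; the module name, channel counts, and the 151-channel segmentation input are illustrative assumptions, not the authors' implementation (their code is available at the GitHub link in the abstract).

import torch
import torch.nn as nn

class SemanticGate(nn.Module):
    # Gates RGB feature maps with attention computed from semantic features.
    def __init__(self, rgb_channels, sem_channels):
        super().__init__()
        # A 1x1 convolution projects semantic features to one attention
        # weight per RGB channel and spatial location; the sigmoid keeps
        # the weights in [0, 1].
        self.attention = nn.Sequential(
            nn.Conv2d(sem_channels, rgb_channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb_feat, sem_feat):
        # Element-wise gating: semantic attention re-weights the RGB
        # features, emphasizing scene-indicative objects and stuff.
        return rgb_feat * self.attention(sem_feat)

# Toy usage: gate a 512-channel RGB feature map with segmentation logits.
# 151 semantic channels (e.g., ADE20K classes plus background) is an
# assumed input format, chosen only for illustration.
gate = SemanticGate(rgb_channels=512, sem_channels=151)
rgb = torch.randn(2, 512, 14, 14)   # features from an RGB backbone
sem = torch.randn(2, 151, 14, 14)   # segmentation scores at matching resolution
gated = gate(rgb, sem)              # same shape as rgb: (2, 512, 14, 14)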

