Molecular Identification from AFM Images Using the IUPAC Nomenclature and Attribute Multimodal Recurrent Neural Networks
Entity
UAM. Departamento de Física Teórica de la Materia Condensada
Publisher
American Chemical Society
Date
2023-05-01
Citation
ACS Applied Materials & Interfaces 15.18 (2023): 22692–22704
ISSN
1944-8244 (print); 1944-8252 (online)
Funded by
We would like to acknowledge support from the Comunidad de Madrid Industrial Doctorate programme 2017 under reference number IND2017/IND7793 and from Quasar Science Resources S.L. P.P. and R.P. acknowledge support from the Spanish Ministry of Science and Innovation, through project PID2020-115864RB-I00 and the “María de Maeztu” Programme for Units of Excellence in R&D (CEX2018-000805-M). C.R.-M. acknowledges financial support by the Ramón y Cajal program of the Spanish Ministry of Science and Innovation (ref. RYC2021-031176-I). Computer time provided by the Red Española de Supercomputación (RES) at the Finisterrae II Supercomputer is also acknowledged.
Editor's Version
https://doi.org/10.1021/acsami.3c01550
Subjects
Atomic force microscopy; Molecular identification; Deep learning; Neural network; Image captioning; Density functional theory; Physics
Rights
© 2023 The Authors
Abstract
Spectroscopic methods like nuclear magnetic
resonance, mass spectrometry, X-ray diffraction, and UV/visible
spectroscopies applied to molecular ensembles have so far been
the workhorse for molecular identification. Here, we propose a
radically different chemical characterization approach, based on the
ability of noncontact atomic force microscopy with metal tips
functionalized with a CO molecule at the tip apex (referred to as HR-AFM) to resolve the internal structure of individual molecules. Our
work demonstrates that a stack of constant-height HR-AFM
images carries enough chemical information for a complete
identification (structure and composition) of quasiplanar organic
molecules, and that this information can be retrieved using
machine learning techniques that are able to disentangle the contribution of chemical composition, bond topology, and internal
torsion of the molecule to the HR-AFM contrast. In particular, we exploit multimodal recurrent neural networks (M-RNN) that
combine convolutional neural networks for image analysis and recurrent neural networks to deal with language processing, to
formulate the molecular identification as an image captioning problem. The algorithm is trained using a data set that contains
almost 700,000 molecules and 165 million theoretical AFM images to produce as final output the IUPAC name of the imaged
molecule. Our extensive tests with theoretical images and a few experimental ones show the potential of deep learning algorithms in
the automatic identification of molecular compounds by AFM. This achievement supports the development of on-surface synthesis
and overcomes some limitations of spectroscopic methods in traditional solution-based synthesis.
Authors
Carracedo Cosme, Jaime; Romero Muñiz, Carlos; Pou Bell, Pablo; Pérez Pérez, Rubén