1 College of Electrical and Automation Engineering, Nanjing Normal University, Nanjing, China
American Journal of Electrical and Electronic Engineering. 2022, Vol. 10 No. 1, 6-23
DOI: 10.12691/ajeee-10-1-2
Copyright © 2022 Science and Education Publishing
Cite this paper: Kexin Ye, Lei Zhang, Tianran Li. Design of a Picture-seeing and Talking System Based on Attention Mechanism.
American Journal of Electrical and Electronic Engineering. 2022; 10(1):6-23. doi: 10.12691/ajeee-10-1-2.
Correspondence to: Kexin Ye, College of Electrical and Automation Engineering, Nanjing Normal University, Nanjing, China. Email: 947985637@qq.com
Abstract
To address the problem of automatically describing image content, this paper proposes an image caption generation model based on a deep recurrent architecture. The model combines recent advances in computer vision and machine translation and can generate natural sentences that accurately describe uncomplicated images of physical objects. It is trained to maximize the likelihood of the target description sentence given the training image. Training relied mainly on the MSCOCO data set; in the later adjustment stage, additional representative images collected from the Internet were included to improve the model. Experiments on several data sets verify that the model can produce basically accurate image descriptions. The model is usually more accurate on uncomplicated pictures of physical objects, which we verified both qualitatively and quantitatively. The final system takes a suitable image as input and outputs a natural-language sentence describing the main content of the image.
Keywords