Explorations into Deep Learning Text Architectures for Dense Image Captioning
Martina Toshevska, Frosina Stojanovska, Eftim Zdravevski, Petre Lameski, Sonja Gievska
- Year: 2020
- Citations: 6
- Access: Open access
Abstract
Image captioning is the process of generating a textual description that best fits the image scene. It is one of the most important tasks in computer vision and natural language processing and has the potential to improve many applications in robotics, assistive technologies, storytelling, medical imaging and more. This paper aims to analyse different encoder-decoder architectures for dense image caption generation while focusing on the text generation component.
Keywords
Closed captioning, Computer science, Artificial intelligence, Natural language generation, Word (group theory), Feature (linguistics), Natural language, Natural language processing, Deep learning, Sentence