About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Drew A. Hudson
- sl:arxiv_num : 1907.03950
- sl:arxiv_published : 2019-07-09T03:08:41Z
- sl:arxiv_summary : We introduce the Neural State Machine, seeking to bridge the gap between the
neural and symbolic views of AI and integrate their complementary strengths for
the task of visual reasoning. Given an image, we first predict a probabilistic
graph that represents its underlying semantics and serves as a structured world
model. Then, we perform sequential reasoning over the graph, iteratively
traversing its nodes to answer a given question or draw a new inference. In
contrast to most neural architectures that are designed to closely interact
with the raw sensory data, our model operates instead in an abstract latent
space, by transforming both the visual and linguistic modalities into semantic
concept-based representations, thereby achieving enhanced transparency and
modularity. We evaluate our model on VQA-CP and GQA, two recent VQA datasets
that involve compositionality, multi-step inference and diverse reasoning
skills, achieving state-of-the-art results in both cases. We provide further
experiments that illustrate the model's strong generalization capacity across
multiple dimensions, including novel compositions of concepts, changes in the
answer distribution, and unseen linguistic structures, demonstrating the
qualities and efficacy of our approach.@en
- sl:arxiv_title : Learning by Abstraction: The Neural State Machine@en
- sl:arxiv_updated : 2019-11-25T10:02:05Z
- sl:bookmarkOf : https://arxiv.org/abs/1907.03950
- sl:creationDate : 2019-07-10
- sl:creationTime : 2019-07-10T22:05:52Z
- sl:relatedDoc : https://arxiv.org/abs/1709.08568
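The abstract describes reasoning as iterative traversal over a probabilistic graph: attention over nodes is shifted along edges and reweighted by question-derived instructions at each step. The sketch below illustrates that idea in miniature; it is a simplified illustration under assumed inputs (hand-made node scores, edge weights, and per-step instruction scores), not the authors' implementation.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def traverse(node_scores, edge_weights, instructions):
    """Soft traversal of a probability distribution over graph nodes.

    node_scores: initial per-node relevance scores (hypothetical).
    edge_weights[i][j]: strength of the edge from node i to node j.
    instructions: one list of per-node relevance scores per reasoning
        step, standing in for the question-derived instructions.
    Returns the final attention distribution over nodes.
    """
    p = softmax(node_scores)  # initial attention over nodes
    n = len(p)
    for step_scores in instructions:
        # Shift probability mass along the graph's edges ...
        shifted = [sum(p[i] * edge_weights[i][j] for i in range(n))
                   for j in range(n)]
        # ... then reweight by each node's relevance to this step.
        relevance = softmax(step_scores)
        p = [s * r for s, r in zip(shifted, relevance)]
        z = sum(p) or 1.0
        p = [x / z for x in p]  # renormalize to a distribution
    return p
```

For example, with attention starting on node 0, a strong edge 0→1, and an instruction favoring node 1, the mass moves to node 1 after one step. The point of the abstraction is that `p` ranges over semantic concepts (graph nodes) rather than raw pixels, which is what gives the approach its transparency.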