We present the MAC network, a novel fully differentiable neural network architecture, designed to facilitate explicit and expressive reasoning. MAC moves away from monolithic black-box neural architectures towards a design that encourages both transparency and versatility. The model approaches problems by decomposing them into a series of attention-based reasoning steps, each performed by a novel recurrent Memory, Attention, and Composition (MAC) cell that maintains a separation between control and memory. By stringing the cells together and imposing structural constraints that regulate their interaction, MAC effectively learns to perform iterative reasoning processes that are directly inferred from the data in an end-to-end approach. We demonstrate the model's strength, robustness and interpretability on the challenging CLEVR dataset for visual reasoning, achieving a new state-of-the-art 98.9% accuracy, halving the error rate of the previous best model. More importantly, we show that the model is computationally-efficient and data-efficient, in particular requiring 5x less data than existing models to achieve strong results.