Feature-wise transformations. A simple and surprisingly effective family of conditioning mechanisms. (2018)(About) > Many real-world problems require integrating multiple sources of information...When approaching such problems, it often makes sense to process one source of information in the context of another. In machine learning, we often refer to this context-based processing as conditioning: the computation carried out by a model is **conditioned** or **modulated** by information extracted from an auxiliary input. E.g., **extract meaning from the image in the context of the question** (see the sketch after the related link below).
Related to this talk at Paris NLP meetup: ["Language and Perception in Deep Learning"](/doc/2019/10/language_and_perception_in_deep)
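A minimal sketch of feature-wise linear modulation (FiLM), the canonical instance of this family discussed in the article, assuming PyTorch; the class name and the dimensions are illustrative, not taken from the article:

```python
import torch
import torch.nn as nn

class FiLM(nn.Module):
    """Feature-wise linear modulation: scale and shift each feature
    channel of x using parameters predicted from the conditioning input z."""
    def __init__(self, z_dim, n_features):
        super().__init__()
        # One (gamma, beta) pair per feature channel.
        self.to_gamma_beta = nn.Linear(z_dim, 2 * n_features)

    def forward(self, x, z):
        # x: (batch, n_features, H, W) image features
        # z: (batch, z_dim) e.g. a question embedding
        gamma, beta = self.to_gamma_beta(z).chunk(2, dim=-1)
        gamma = gamma.unsqueeze(-1).unsqueeze(-1)  # broadcast over H, W
        beta = beta.unsqueeze(-1).unsqueeze(-1)
        return gamma * x + beta

# VQA-style usage: modulate CNN feature maps with a question embedding.
film = FiLM(z_dim=128, n_features=64)
x = torch.randn(8, 64, 14, 14)   # image feature maps
z = torch.randn(8, 128)          # question embedding
y = film(x, z)                   # same shape as x
```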
Graph Attention Networks (2018)(About) A novel approach to processing graph-structured data by neural networks, leveraging **masked self-attentional layers over a node's neighborhood** (i.e., different weights are implicitly assigned to different nodes in a neighborhood, without requiring any costly matrix operation or knowledge of the graph structure upfront).
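A hedged single-head sketch of that masked self-attention, assuming PyTorch; variable names are mine, and the paper's multi-head attention and dropout are omitted:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """Single-head graph attention layer: each node attends over its
    neighborhood, with attention masked by the adjacency matrix."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, h, adj):
        # h: (N, in_dim) node features
        # adj: (N, N) binary adjacency, assumed to include self-loops
        # (otherwise an isolated node's softmax row would be all -inf).
        Wh = self.W(h)                                    # (N, out_dim)
        N = Wh.size(0)
        # Pairwise attention logits e_ij = a([Wh_i || Wh_j]).
        pairs = torch.cat([Wh.unsqueeze(1).expand(N, N, -1),
                           Wh.unsqueeze(0).expand(N, N, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs).squeeze(-1), 0.2)  # (N, N)
        # Mask out non-neighbors, then normalize per node.
        e = e.masked_fill(adj == 0, float('-inf'))
        alpha = torch.softmax(e, dim=-1)
        return F.elu(alpha @ Wh)
```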
[1605.07427] Hierarchical Memory Networks - Chandar et al. (2016)(About) > hybrid between hard and soft attention memory networks. The memory is organized in a hierarchical structure such that reading from it is done with less computation than soft attention over a flat memory, while also being easier to train than hard attention over a flat memory
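A rough illustration of that hard-then-soft read, under my own simplifying assumptions (group centroids stand in for the paper's actual MIPS-based retrieval):

```python
import numpy as np

def hmn_read(query, memory, groups, k=2):
    """Sketch of a hierarchical memory read: first pick the k most
    relevant groups by comparing the query to each group's mean
    (hard selection), then do soft attention only over those cells."""
    # memory: (M, d) cells; groups: list of index arrays partitioning memory
    centroids = np.stack([memory[g].mean(axis=0) for g in groups])
    top = np.argsort(centroids @ query)[-k:]          # hard: choose k groups
    cand = np.concatenate([groups[g] for g in top])   # candidate cells
    scores = memory[cand] @ query
    attn = np.exp(scores - scores.max())
    attn /= attn.sum()                                # soft attention
    return attn @ memory[cand]

memory = np.random.randn(100, 16)
groups = np.array_split(np.arange(100), 10)   # 10 groups of 10 cells
out = hmn_read(np.random.randn(16), memory, groups)
```

Gradients only flow through the soft-attention step over the small candidate set, which is the source of both the computational saving and the easier training the annotation mentions.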
Towards bridging the gap between deep learning and brains(About) > Underlying Assumption: There are principles giving rise to intelligence (machine, human or animal) via learning, simple enough that they can be described compactly, similarly to the laws of physics, i.e., our intelligence is not just the result of a huge bag of tricks and pieces of knowledge, but of general mechanisms to acquire knowledge
How do RBMs work? - Quora(About) > You can think of it a little bit like you think about Principal Components Analysis, in that it is trained by unsupervised learning so as to capture the leading variations in the data, and it yields a new representation of the data
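For concreteness, a minimal contrastive-divergence (CD-1) update for a binary RBM, the standard way such a model is trained by unsupervised learning; this sketch is generic, not taken from the Quora answer:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1):
    """One contrastive-divergence (CD-1) update for a binary RBM.
    v0: (batch, n_visible) data; W: (n_visible, n_hidden) weights;
    b, c: visible and hidden biases (updated in place)."""
    # Up: hidden probabilities given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Down-up: one step of Gibbs sampling (the "reconstruction").
    pv1 = sigmoid(h0 @ W.T + b)
    ph1 = sigmoid(pv1 @ W + c)
    # Gradient: difference between data and reconstruction statistics.
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
    b += lr * (v0 - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)

# Toy usage on random binary data.
v = (rng.random((32, 20)) < 0.5).astype(float)
W = 0.01 * rng.normal(size=(20, 10))
b, c = np.zeros(20), np.zeros(10)
cd1_step(v, W, b, c)
```

The hidden activations `ph0` are the new representation of the data the quote refers to, loosely analogous to PCA scores.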
How does one apply deep learning to time series forecasting? - Quora(About) > I would use the state-of-the-art [recurrent nets](/tag/recurrent_neural_network.html) (using gated units and multiple layers) to make predictions at each time step for some future horizon of interest. The RNN is then updated with the next observation to be ready for making the next prediction
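A toy version of the setup the answer describes, assuming PyTorch: a multi-layer gated RNN emits a multi-step forecast at every time step, and its hidden state is then updated with each new observation; the class and sizes are illustrative:

```python
import torch
import torch.nn as nn

class Forecaster(nn.Module):
    """Gated recurrent net that predicts the next `horizon` values
    of a univariate series from the history seen so far."""
    def __init__(self, hidden=64, horizon=5):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden,
                          num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, horizon)

    def forward(self, series, state=None):
        # series: (batch, time, 1); returns per-step forecasts
        # of shape (batch, time, horizon) plus the recurrent state.
        out, state = self.rnn(series, state)
        return self.head(out), state

model = Forecaster()
x = torch.randn(4, 100, 1)        # observed series
preds, state = model(x)           # forecast at every step
# When the next observation arrives, feed it in to update the state
# and produce the next forecast:
next_obs = torch.randn(4, 1, 1)
next_preds, state = model(next_obs, state)
```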
The Consciousness Prior (2017)(About) "consciousness seen as the formation of a low-dimensional combination of a few concepts constituting a conscious thought, i.e., consciousness as awareness at a particular time instant": the projection of a big vector (all the things, conscious and unconscious, in the brain). Attention: an additional mechanism describing what the mind chooses to focus on.
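A toy numeric illustration of that projection idea (my own simplification, not the paper's formulation): hard top-k attention keeps only a few strongly active elements of a large state vector:

```python
import numpy as np

def conscious_state(h, k=3):
    """Toy sketch: attention selects the k most strongly activated
    elements of a large hidden state h, yielding a sparse,
    low-dimensional 'conscious' vector."""
    attn = np.zeros_like(h)
    top = np.argsort(np.abs(h))[-k:]   # what the mind "focuses on"
    attn[top] = 1.0
    return attn * h                    # projection onto a few concepts

h = np.random.default_rng(0).normal(size=512)  # full (mostly unconscious) state
c = conscious_state(h)                         # only a few active concepts
```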