Reasoning With Neural Tensor Networks for Knowledge Base Completion (2013)(About) **Predicting the likely truth of additional facts based on existing facts in the knowledge base.**
> we introduce an expressive neural tensor network suitable for reasoning over relationships between two entities.
Most similar work: [Bordes et al.](http://127.0.0.1:8080/semanlink/doc/2019/08/learning_structured_embeddings_) (2011)
1. a new neural tensor network (NTN) suitable for reasoning over relationships between two entities (a minimal sketch of its scoring function follows this list). It generalizes several previous neural network models and provides a more powerful way to model relational information than a standard neural network layer.
2. a new way to represent entities in knowledge bases, as the average of their constituent word vectors, allowing the sharing of statistical strength between the words describing each entity (e.g., Bank of China and China).
3. incorporation of word vectors which are trained on large unlabeled text corpora.
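A minimal NumPy sketch of the NTN scoring function described above (the dimensions, random initialization, and toy vocabulary are illustrative; in the paper all parameters and word vectors are learned):

```python
import numpy as np

d, k = 100, 4  # embedding size, number of tensor slices (illustrative values)
rng = np.random.default_rng(0)

# Per-relation parameters (randomly initialized here; trained in the paper)
W = rng.normal(size=(k, d, d))   # bilinear tensor: one d x d slice per output unit
V = rng.normal(size=(k, 2 * d))  # standard-layer weights
b = np.zeros(k)                  # bias
u = rng.normal(size=k)           # output weights

def entity_vector(word_vectors):
    """Entity = average of its constituent word vectors (point 2 above)."""
    return np.mean(word_vectors, axis=0)

def ntn_score(e1, e2):
    """g(e1, R, e2) = u_R^T tanh(e1^T W_R^[1:k] e2 + V_R [e1; e2] + b_R)."""
    bilinear = np.einsum('i,kij,j->k', e1, W, e2)  # one bilinear form per slice
    return u @ np.tanh(bilinear + V @ np.concatenate([e1, e2]) + b)

# e.g. "Bank of China" shares the word vector for "China" with the entity "China"
vocab = {w: rng.normal(size=d) for w in ["bank", "of", "china"]}
e1 = entity_vector([vocab["bank"], vocab["of"], vocab["china"]])
e2 = entity_vector([vocab["china"]])
print(ntn_score(e1, e2))  # higher score = the relation is more likely to hold
```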
> We learn to modify word representations via grounding in world knowledge. This essentially allows us to analyze word embeddings and query them for specific relations. Furthermore, the resulting vectors could be used in other tasks such as named entity recognition or relation classification in natural language.
[1907.03950] Learning by Abstraction: The Neural State Machine (2019)(About) > Given an image, we first predict a probabilistic graph that represents its underlying semantics and serves as a structured world model. Then, we perform sequential reasoning over the graph, iteratively traversing its nodes to answer a given question or draw a new inference. In contrast to most neural architectures that are designed to closely interact with the raw sensory data, our model operates instead in an abstract latent space, by transforming both the visual and linguistic modalities into semantic concept-based representations, thereby achieving enhanced transparency and modularity.
> Drawing inspiration from [Bengio’s consciousness prior](/doc/?uri=https%3A%2F%2Farxiv.org%2Fabs%2F1709.08568)...
Effect of Non-linear Deep Architecture in Sequence Labeling(About) > we show the close connection between CRF and “sequence model” neural nets, and present an empirical investigation to compare their performance on two sequence labeling tasks – Named Entity Recognition and Syntactic Chunking. Our results suggest that **non-linear models are highly effective in low-dimensional distributional spaces. Somewhat surprisingly, we find that a non-linear architecture offers no benefits in a high-dimensional discrete feature space**.
Representations for Language: From Word Embeddings to Sentence Meanings (2017) - YouTube(About) [Slides](/doc/?uri=https%3A%2F%2Fnlp.stanford.edu%2Fmanning%2Ftalks%2FSimons-Institute-Manning-2017.pdf)
**What's special about human language? It's the only hope for explainable intelligence**.
Symbols are not just an invention of logic / classical AI.
Meaning: a solution via distributional-similarity-based representations. One of the most successful ideas of modern NLP.
> You shall know a word by the company it keeps (JR Firth 1957)
The BiLSTM hegemony
Neural Bag of words
> "Surprisingly effective for many tasks :-(" [cf "DAN", Deep Averaging Network, Iyyver et al.](/doc/?uri=http%3A%2F%2Fwww.cs.cornell.edu%2Fcourses%2Fcs5740%2F2016sp%2Fresources%2Fdans.pdf)
Christopher Manning - "Building Neural Network Models That Can Reason" (TCSDLS 2017-2018) - YouTube(About) Goal: to enhance DL systems with reasoning capabilities from the ground up
- allowing them to perform transparent multi-step reasoning processes
- while retaining end-to-end differentiability and scalability to real-world problems
> I get the feeling that if we're going to make further progress in AI, we actually have to get back to some of these problems of knowledge representation reasoning
- From ML to machine reasoning
- the CLEVR task
- Memory-Attention-Composition Networks
What is reasoning? (Bottou 2011)
- manipulating previously acquired knowledge in order to answer a question
- not necessarily achieved by making logical inferences (e.g., it can be algebraic manipulation of matrices)
- composition rules -> combination of operations to address new tasks
Latent semantic indexing ("Introduction to Information Retrieval", Manning et al. 2008)(About) VSM: problems with synonymy and polysemy (e.g., synonyms are accorded separate dimensions).
Could we use the co-occurrences of terms to capture the latent semantic associations of terms and alleviate these problems?
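A minimal sketch of LSI via truncated SVD on a toy term-document matrix (the counts and k=2 are made up), including the query "folding in" discussed below:

```python
import numpy as np

# Toy term-document matrix C (terms x documents); counts are invented.
C = np.array([
    [1, 0, 1, 0],   # "ship"
    [0, 1, 0, 1],   # "boat"  -- a synonym of "ship", a separate VSM dimension
    [1, 1, 0, 0],   # "ocean"
    [0, 0, 1, 1],   # "wood"
])

U, s, Vt = np.linalg.svd(C, full_matrices=False)

k = 2  # keep only the k largest singular values
U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

# Columns of docs_k are the documents in the k-dimensional latent space
docs_k = np.diag(s_k) @ Vt_k

# "Folding in" a new document or query q: q_k = Sigma_k^{-1} U_k^T q
q = np.array([1, 0, 0, 0])  # query containing only "ship"
q_k = np.diag(1 / s_k) @ U_k.T @ q

# Cosine similarity in the latent space: documents that share no terms
# with the query (here, "boat" documents) can still score well.
sims = (docs_k.T @ q_k) / (np.linalg.norm(docs_k, axis=0) * np.linalg.norm(q_k))
print(sims)
```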
- the computational cost of the SVD is significant; it is the biggest obstacle to the widespread adoption of LSI.
- One approach to this obstacle: build the LSI representation on a randomly sampled subset of the documents, following which the remaining documents are "folded in" (cf. the Gensim tutorial "[Random Projection (used as an option to speed up LSI)](https://radimrehurek.com/gensim/models/rpmodel.html)")
- As we reduce k, recall tends to increase, as expected.
- **Most surprisingly**, a value of k in the low hundreds can actually increase precision. **This appears to suggest that for a suitable value of *k*, LSI addresses some of the challenges of synonymy**.
- LSI works best in applications where there is little overlap between queries and documents. (--??)
The experiments also documented some modes where LSI failed to match the effectiveness of more traditional indexes and score computations.
LSI shares two basic drawbacks of vector space retrieval:
- no good way of expressing negations
- no way of enforcing Boolean conditions.
LSI can be viewed as soft clustering by interpreting each dimension of the reduced space as a cluster and the value that a document has on that dimension as its fractional membership in that cluster.
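One way to make this soft-clustering reading concrete, continuing the SVD sketch above (the absolute-value normalization is my choice, not from the book):

```python
# Each latent dimension = a "cluster"; a document's weight on that dimension,
# normalized per document, = its fractional membership in that cluster.
membership = np.abs(docs_k) / np.abs(docs_k).sum(axis=0)
print(membership)  # each column sums to 1: one soft assignment per document
```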