About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Peng Xu
- sl:arxiv_num : 2110.10778
- sl:arxiv_published : 2021-10-20T21:05:02Z
- sl:arxiv_summary : Recent progress in pretrained Transformer-based language models has shown
great success in learning contextual representations of text. However, due to
the quadratic complexity of self-attention, most pretrained Transformer models
can only handle relatively short text, and modeling very long documents
remains a challenge. In this work, we propose to use a graph attention network
on top of an available pretrained Transformer model to learn document
embeddings. This graph attention network allows us to leverage the high-level
semantic structure of the document. In addition, based on our graph document
model, we design a simple contrastive learning strategy to pretrain our models
on a large unlabeled corpus. Empirically, we demonstrate the effectiveness of
our approach on document classification and document retrieval tasks.@en
- sl:arxiv_title : Contrastive Document Representation Learning with Graph Attention Networks@en
- sl:arxiv_updated : 2021-10-20T21:05:02Z
- sl:bookmarkOf : https://arxiv.org/abs/2110.10778
- sl:creationDate : 2022-03-10
- sl:creationTime : 2022-03-10T13:54:40Z
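
The abstract's core idea, a graph attention layer that aggregates node embeddings produced by a pretrained Transformer into a single document embedding, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact architecture: the `SimpleGAT` class, the single attention head, the dense adjacency input, and the mean-pooling readout are all choices made here for clarity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGAT(nn.Module):
    """One single-head graph-attention layer over a dense adjacency matrix."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)
        self.attn = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (n, dim) node embeddings, e.g. sentence vectors from a frozen
        # pretrained Transformer; adj: (n, n) 0/1 adjacency encoding the
        # document's semantic structure. Self-loops are added so every row
        # of the attention matrix has at least one valid neighbor.
        n = x.size(0)
        adj = adj.clone()
        adj.fill_diagonal_(1)
        h = self.proj(x)
        # Attention logits e_ij = LeakyReLU(a^T [h_i || h_j]).
        hi = h.unsqueeze(1).expand(n, n, -1)
        hj = h.unsqueeze(0).expand(n, n, -1)
        e = F.leaky_relu(self.attn(torch.cat([hi, hj], dim=-1)).squeeze(-1))
        e = e.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(e, dim=-1)   # each node attends to its neighbors
        return F.elu(alpha @ h)            # updated node embeddings

def doc_embed(gat: SimpleGAT, node_embs: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
    # Readout: mean over updated nodes -> one document vector (an assumed choice).
    return gat(node_embs, adj).mean(dim=0)

# Example: 5 sentence nodes with 16-dim embeddings on a chain graph.
x = torch.randn(5, 16)
adj = torch.diag(torch.ones(4), 1) + torch.diag(torch.ones(4), -1)
vec = doc_embed(SimpleGAT(16), x, adj)   # shape: (16,)
```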
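The "simple contrastive learning strategy" is not specified in this record; a common recipe for such pretraining is an InfoNCE-style loss over two views of each document, sketched below under that assumption. The temperature value and the use of in-batch negatives are illustrative choices, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    # z1, z2: (batch, dim) document embeddings from two views of the same
    # documents (e.g. different graph augmentations). Row i of z1 and row i
    # of z2 form a positive pair; all other rows act as in-batch negatives.
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature            # scaled cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)
```

Treating every other document in the batch as a negative avoids maintaining an explicit negative queue, which is why this formulation is a frequent default for contrastive pretraining on unlabeled corpora.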