About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Denghui Zhang
- sl:arxiv_num : 2009.02835
- sl:arxiv_published : 2020-09-07T00:15:36Z
- sl:arxiv_summary : Pre-trained language models such as BERT have achieved great success in a
broad range of natural language processing tasks. However, BERT cannot adequately support
E-commerce-related tasks because it lacks two levels of domain knowledge: phrase-level and
product-level. On one hand, many E-commerce tasks require an accurate understanding of domain
phrases, yet such fine-grained phrase-level knowledge is not explicitly modeled by BERT's
training objective. On the other hand, product-level knowledge such as product associations can
enhance language modeling for E-commerce, but it is not factual knowledge, so using it
indiscriminately may introduce noise. To tackle these problems, we propose a unified
pre-training framework, namely E-BERT. Specifically, to preserve phrase-level knowledge, we
introduce Adaptive Hybrid Masking, which allows the model to adaptively switch from learning
preliminary word knowledge to learning complex phrases, based on the fitting progress of the
two modes. To utilize product-level knowledge, we introduce Neighbor Product Reconstruction,
which trains E-BERT to predict a product's associated neighbors with a denoising cross-attention
layer. Our investigation reveals promising results on four downstream tasks: review-based
question answering, aspect extraction, aspect sentiment classification, and product
classification.@en
- sl:arxiv_title : E-BERT: A Phrase and Product Knowledge Enhanced Language Model for E-commerce@en
- sl:arxiv_updated : 2020-09-10T23:00:16Z
- sl:bookmarkOf : https://arxiv.org/abs/2009.02835
- sl:creationDate : 2020-12-14
- sl:creationTime : 2020-12-14T11:10:29Z
- sl:relatedDoc :
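The Adaptive Hybrid Masking scheme described in the summary above can be illustrated with a minimal sketch. The abstract only says that the model adaptively switches between word-level and phrase-level masking based on the fitting progress of the two modes; the EMA loss statistics, the relative-progress measure, and the sampling rule below are all assumptions for illustration, not the paper's actual schedule.

```python
import random

class AdaptiveHybridMasking:
    """Sketch of an adaptive scheduler over two masking modes.

    Keeps an exponential moving average (EMA) of the training loss for
    word-level and phrase-level masking, measures how fast each EMA is
    still dropping, and samples the next mode in proportion to its
    remaining progress. Early on, word masking improves quickly and is
    picked often; once it saturates, phrase masking dominates.
    """

    def __init__(self, ema_decay=0.99):
        self.ema_decay = ema_decay
        self.ema = {"word": None, "phrase": None}
        self.prev_ema = {"word": None, "phrase": None}

    def update(self, mode, loss):
        """Record the loss of the step just trained under `mode`."""
        prev = self.ema[mode]
        self.prev_ema[mode] = prev
        self.ema[mode] = loss if prev is None else (
            self.ema_decay * prev + (1.0 - self.ema_decay) * loss
        )

    def progress(self, mode):
        """Relative improvement of the mode's EMA loss (~0 when stalled)."""
        prev, cur = self.prev_ema[mode], self.ema[mode]
        if prev is None or cur is None or prev <= 0:
            return 1.0  # no statistics yet: assume the mode still helps
        return max(prev - cur, 0.0) / prev

    def pick_mode(self):
        """Sample the next masking mode from the two progress scores."""
        w, p = self.progress("word"), self.progress("phrase")
        p_phrase = 0.5 if (w + p) == 0 else p / (w + p)
        return "phrase" if random.random() < p_phrase else "word"
```

A pre-training loop would call `pick_mode()` before building each batch's masks and feed the resulting loss back through `update()`; the 50/50 fallback only fires before any statistics exist.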
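Similarly, the Neighbor Product Reconstruction objective can be sketched as a small cross-attention head. Per the summary, E-BERT is trained to predict a product's associated neighbors through a denoising cross-attention layer; everything else below (corrupting the neighbor inputs with dropout, a sigmoid gate to downweight noisy associations, the squared-error target) is an assumed instantiation, not the paper's exact layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeighborProductReconstruction(nn.Module):
    """Sketch of a denoising cross-attention reconstruction head.

    Corrupted neighbor embeddings query the contextual token states of
    the focal product's text; the attended output is trained to
    reconstruct the clean neighbor embeddings. A per-neighbor sigmoid
    gate downweights associations that look like noise.
    """

    def __init__(self, hidden_size, corrupt_p=0.3):
        super().__init__()
        self.q = nn.Linear(hidden_size, hidden_size)
        self.k = nn.Linear(hidden_size, hidden_size)
        self.v = nn.Linear(hidden_size, hidden_size)
        self.gate = nn.Linear(hidden_size, 1)
        self.corrupt_p = corrupt_p

    def forward(self, token_states, neighbor_embs):
        # token_states:  (batch, seq_len, hidden) encoder states of the
        #                product's text (e.g., from BERT)
        # neighbor_embs: (batch, n_neighbors, hidden) embeddings of
        #                associated products (e.g., co-purchased items)
        corrupted = F.dropout(neighbor_embs, p=self.corrupt_p,
                              training=self.training)
        q = self.q(corrupted)                                 # (b, n, h)
        k = self.k(token_states)                              # (b, s, h)
        v = self.v(token_states)                              # (b, s, h)
        scores = q @ k.transpose(-2, -1) / k.size(-1) ** 0.5  # (b, n, s)
        recon = scores.softmax(dim=-1) @ v                    # (b, n, h)
        # Per-neighbor confidence that the association is real signal;
        # a real implementation would regularize the gate so it cannot
        # collapse to zero and trivially minimize the loss.
        g = torch.sigmoid(self.gate(neighbor_embs))           # (b, n, 1)
        err = (recon - neighbor_embs).pow(2).mean(dim=-1, keepdim=True)
        return (g * err).mean()
```

The returned scalar would be added to the masked-language-modeling loss during pre-training, so that product-level associations shape the encoder without being treated as hard factual knowledge.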