About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Guangxuan Xiao
- sl:arxiv_num : 2302.04870
- sl:arxiv_published : 2023-02-09T18:59:55Z
- sl:arxiv_summary : Transfer learning is important for foundation models to adapt to downstream
tasks. However, many foundation models are proprietary, so users must share
their data with model owners to fine-tune the models, which is costly and raises
privacy concerns. Moreover, fine-tuning large foundation models is
computation-intensive and impractical for most downstream users. In this paper,
we propose Offsite-Tuning, a privacy-preserving and efficient transfer learning
framework that can adapt billion-parameter foundation models to downstream data
without access to the full model. In offsite-tuning, the model owner sends a
lightweight adapter and a lossy compressed emulator to the data owner, who
then fine-tunes the adapter on the downstream data with the emulator's
assistance. The fine-tuned adapter is then returned to the model owner, who
plugs it into the full model to create an adapted foundation model.
Offsite-tuning preserves both parties' privacy and is computationally more
efficient than the existing fine-tuning methods that require access to the full
model weights. We demonstrate the effectiveness of offsite-tuning on various
large language and vision foundation models. Offsite-tuning can achieve
accuracy comparable to full model fine-tuning while being privacy-preserving
and efficient, achieving 6.5x speedup and 5.6x memory reduction. Code is
available at https://github.com/mit-han-lab/offsite-tuning.@en
- sl:arxiv_title : Offsite-Tuning: Transfer Learning without Full Model@en
- sl:arxiv_updated : 2023-02-09T18:59:55Z
- sl:bookmarkOf : https://arxiv.org/abs/2302.04870
- sl:creationDate : 2023-02-11
- sl:creationTime : 2023-02-11T18:33:24Z
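
The abstract above describes a three-step protocol: the model owner splits off a small trainable adapter and a lossy, frozen emulator of the rest of the model; the data owner fine-tunes only the adapter against the emulator on private data; the adapter is then plugged back into the full model. The sketch below illustrates that flow in PyTorch on a toy block-stack model. The helper names (`split_and_compress`, `finetune_adapter`, `plug_in`) and the layer-drop compression used to build the emulator are illustrative assumptions, not the authors' exact implementation; see the paper and https://github.com/mit-han-lab/offsite-tuning for the real one.

```python
# Minimal sketch of the offsite-tuning workflow (assumed details marked below).
import copy
import torch
import torch.nn as nn

class ToyFoundationModel(nn.Module):
    """Stand-in for a large foundation model: a deep stack of blocks."""
    def __init__(self, depth=12, dim=64):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU()) for _ in range(depth)
        )

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return x

def split_and_compress(model, n_adapter=2, keep_every=2):
    """Model-owner side: keep shallow/deep blocks as the trainable adapter and
    build a lossy emulator of the frozen middle by dropping layers
    (layer-drop here is an assumed stand-in for the paper's compression)."""
    blocks = model.blocks
    head, middle, tail = blocks[:n_adapter], blocks[n_adapter:-n_adapter], blocks[-n_adapter:]
    emulator = copy.deepcopy(middle[::keep_every])
    for p in emulator.parameters():
        p.requires_grad_(False)  # the emulator stays frozen downstream
    adapter = nn.ModuleList(copy.deepcopy(list(head) + list(tail)))
    return adapter, emulator

def finetune_adapter(adapter, emulator, data, n_adapter=2, steps=100, lr=1e-3):
    """Data-owner side: fine-tune only the adapter, with the frozen emulator
    approximating the missing middle of the full model."""
    opt = torch.optim.Adam(adapter.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        x, y = data()
        h = x
        for block in adapter[:n_adapter]:   # shallow adapter blocks
            h = block(h)
        for block in emulator:              # frozen, compressed middle
            h = block(h)
        for block in adapter[n_adapter:]:   # deep adapter blocks
            h = block(h)
        loss = loss_fn(h, y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return adapter

def plug_in(model, adapter, n_adapter=2):
    """Model-owner side: splice the returned adapter back into the full model."""
    for i in range(n_adapter):
        model.blocks[i] = adapter[i]
        model.blocks[-(n_adapter - i)] = adapter[n_adapter + i]
    return model

if __name__ == "__main__":
    torch.manual_seed(0)
    full = ToyFoundationModel()
    adapter, emulator = split_and_compress(full)               # sent to the data owner
    data = lambda: (torch.randn(8, 64), torch.randn(8, 64))    # toy private downstream data
    adapter = finetune_adapter(adapter, emulator, data)        # returned to the model owner
    adapted = plug_in(full, adapter)                           # adapted foundation model
```

Neither party ever holds both the full model weights and the raw downstream data, which is the source of the privacy and efficiency gains the abstract claims; the efficiency comes from training only the adapter against the much smaller emulator.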