张浩彬
(2024-10-30 10:19):
#paper
AdapterFusion: Non-Destructive Task Composition for Transfer Learning
https://doi.org/10.48550/arXiv.2005.00247
An improved version of the adapter approach: AdapterFusion. In short, a separate adapter is trained for each task, and the adapters are then composed to fuse knowledge across tasks more effectively.
Abstract, in brief: Sequential fine-tuning and multi-task learning are methods that aim to incorporate knowledge from multiple tasks; however, they suffer from catastrophic forgetting and difficulties in dataset balancing. To address these shortcomings, we propose AdapterFusion, a new two-stage learning algorithm that leverages knowledge from multiple tasks. First, in the knowledge extraction stage, we learn task-specific parameters called adapters, which encapsulate the task-specific information. We then combine the adapters in a separate knowledge composition step. We show that by separating the two stages, i.e., knowledge extraction and knowledge composition, the classifier can effectively exploit the representations learned from multiple tasks in a non-destructive manner. We empirically evaluate AdapterFusion on 16 diverse NLU tasks and find that it effectively combines various types of knowledge at different layers of the model. We show that our approach outperforms traditional strategies such as full fine-tuning as well as multi-task learning. Our code and adapters are available at AdapterHub.ml.
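Since the note describes a two-stage architecture (per-task adapters, then attention-based composition), here is a minimal PyTorch sketch of how the composition step could look. It assumes a standard bottleneck adapter design; all names (BottleneckAdapter, AdapterFusionLayer, bottleneck_dim) are illustrative, not taken from the authors' code on AdapterHub.ml.

```python
# Minimal sketch of the AdapterFusion idea, not the authors' implementation.
# Stage 1 (knowledge extraction): train one bottleneck adapter per task.
# Stage 2 (knowledge composition): freeze the adapters and train an attention
# that mixes their outputs per token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BottleneckAdapter(nn.Module):
    """Task-specific adapter: down-project, nonlinearity, up-project, residual."""
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(F.relu(self.down(h)))

class AdapterFusionLayer(nn.Module):
    """Attention over the outputs of several (frozen) task adapters."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.query = nn.Linear(hidden_dim, hidden_dim)
        self.key = nn.Linear(hidden_dim, hidden_dim)
        self.value = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, h: torch.Tensor, adapter_out: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq, dim); adapter_out: (batch, seq, n_adapters, dim)
        q = self.query(h).unsqueeze(2)                  # (b, s, 1, d)
        k = self.key(adapter_out)                       # (b, s, n, d)
        v = self.value(adapter_out)                     # (b, s, n, d)
        scores = (q * k).sum(dim=-1)                    # (b, s, n): dot-product attention
        weights = scores.softmax(dim=-1).unsqueeze(-1)  # (b, s, n, 1)
        return (weights * v).sum(dim=2)                 # (b, s, d): weighted mix

# Usage: adapters come from stage 1 and stay frozen; only the fusion
# parameters (and the task head) are trained in stage 2.
dim = 768
adapters = [BottleneckAdapter(dim) for _ in range(3)]
for a in adapters:
    a.requires_grad_(False)
fusion = AdapterFusionLayer(dim)
h = torch.randn(2, 5, dim)                              # dummy hidden states
stacked = torch.stack([a(h) for a in adapters], dim=2)  # (2, 5, 3, 768)
fused = fusion(h, stacked)                              # (2, 5, 768)
```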
arXiv, 2020-05-01T07:03:42Z. DOI: 10.48550/arXiv.2005.00247