
Learning to Modulate pre-trained Models in RL

유로파물고기 2023. 6. 27. 10:47

https://arxiv.org/abs/2306.14884

 

1. Reinforcement Learning (RL) has been applied successfully in domains such as robotics, game playing, and simulation, yet RL agents that excel at a specific task adapt poorly to new tasks. This adaptation problem is addressed by large-scale pre-training followed by fine-tuning on downstream tasks.

2. Multi-task pre-training has recently gained attention in RL, but fine-tuning a pre-trained model often suffers from catastrophic forgetting: performance on the pre-training tasks degrades when the model is fine-tuned on new tasks. To investigate this phenomenon, we jointly pre-train a model on datasets from two benchmark suites, Meta-World and DMControl.

3. To avoid degradation of learned skills, we propose Learning-to-Modulate (L2M), a novel method that modulates the information flow of the frozen pre-trained model via a learnable modulation pool. Our method achieves state-of-the-art performance on the Continual-World benchmark while retaining performance on the pre-training tasks. Finally, to support future research in this area, we release a dataset comprising 50 Meta-World and 16 DMControl tasks.
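The modulation idea in point 3 can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's exact mechanism: the pool layout, the key-matching rule, and the scale-and-shift form are all assumptions made for clarity. The frozen pre-trained weights are never updated; only the small per-task modulation parameters, selected from a pool by task similarity, adjust the layer's output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen pre-trained layer weights (never updated after pre-training).
W_frozen = rng.normal(size=(8, 8))

# Hypothetical modulation pool: one learnable (scale, shift) pair per entry,
# selected by similarity between a task embedding and the pool keys.
POOL_SIZE, DIM = 4, 8
pool_keys = rng.normal(size=(POOL_SIZE, DIM))
pool_scales = np.ones((POOL_SIZE, DIM))   # trainable; identity at init
pool_shifts = np.zeros((POOL_SIZE, DIM))  # trainable; zero at init

def modulated_forward(x, task_embedding):
    # Pick the pool entry whose key best matches the task embedding.
    k = int(np.argmax(pool_keys @ task_embedding))
    h = W_frozen @ x                             # frozen computation
    return pool_scales[k] * h + pool_shifts[k]   # learned modulation

x = rng.normal(size=DIM)
task = rng.normal(size=DIM)
out = modulated_forward(x, task)
# With identity initialization, modulation leaves the frozen output unchanged.
assert np.allclose(out, W_frozen @ x)
```

Because the modulation parameters start at identity, training them on a new task cannot overwrite the frozen weights, which is how this family of methods sidesteps catastrophic forgetting on the pre-training tasks.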