
Improving Long-Horizon Imitation Through Instruction Prediction

유로파물고기 2023. 6. 23. 12:16

https://arxiv.org/abs/2306.12554

Improving Long-Horizon Imitation Through Instruction Prediction

https://github.com/jhejna/instruction-prediction

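1. Complex, long-horizon planning and its combinatorial nature pose steep challenges for learning-based agents. These difficulties are exacerbated in low-data regimes, where overfitting stifles generalization and compounding errors hurt accuracy.

2. This work explores an often unused source of auxiliary supervision: language. Inspired by recent advances in transformer-based models, the authors train agents with an instruction prediction loss that encourages them to learn temporally extended representations that operate at a high level of abstraction.

3. Specifically, the authors show that instruction modeling significantly improves performance in planning environments when training on a limited number of demonstrations from the BabyAI and Crafter benchmarks. Further analysis finds that instruction modeling matters most for tasks that require complex reasoning, while understandably offering smaller gains in environments that require only simple plans.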
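
To make the idea of an auxiliary instruction prediction loss more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation (see the linked repository for that). It assumes the agent emits per-step action logits and per-token instruction logits, and that the two cross-entropy terms are combined as a weighted sum. The function names and the `aux_weight` hyperparameter are hypothetical, chosen only for illustration.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of index `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def imitation_loss(action_logits, expert_actions,
                   instr_logits, instr_tokens, aux_weight=0.5):
    """Behavior-cloning loss plus a weighted instruction prediction loss.

    `aux_weight` is an illustrative value, not one taken from the paper.
    Returns (total, bc, aux) so each term can be inspected separately.
    """
    # Imitation term: match the expert's action at every step.
    bc = sum(cross_entropy(l, a)
             for l, a in zip(action_logits, expert_actions)) / len(expert_actions)
    # Auxiliary term: predict each token of the language instruction.
    aux = sum(cross_entropy(l, t)
              for l, t in zip(instr_logits, instr_tokens)) / len(instr_tokens)
    return bc + aux_weight * aux, bc, aux


# Toy data: two steps with 3 possible actions, one instruction token
# drawn from a 4-word vocabulary.
action_logits = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
expert_actions = [0, 1]
instr_logits = [[0.0, 0.0, 0.0, 0.0]]
instr_tokens = [2]

total, bc, aux = imitation_loss(action_logits, expert_actions,
                                instr_logits, instr_tokens)
```

In this sketch the instruction term acts only as extra supervision during training: setting `aux_weight=0` recovers plain behavior cloning, which is one way to think about the comparison the summary describes.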