
Thought Cloning: Learning to Think while Acting by Imitating Human Thinking

유로파물고기 2023. 6. 3. 00:12

https://arxiv.org/abs/2306.00323

 

Thought Cloning: Learning to Think while Acting by Imitating Human Thinking


https://twitter.com/jeffclune/status/1664618665160085505

 

Jeff Clune on Twitter

“Introducing Thought Cloning: AI agents learn to *think* & act like humans by imitating the thoughts & actions of humans thinking out loud while acting, enhancing performance, efficiency, generalization, AI Safety & Interpretability. Led by @shengranhu h


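1. Language, often regarded as a key aspect of human thinking, gives us exceptional abilities to generalize, explore, plan, replan, and adapt to new situations. Reinforcement Learning (RL) agents, however, fall far short of human-level performance in every one of these abilities. The authors hypothesize that one reason for these cognitive deficiencies is that the agents lack the benefits of thinking in language, and that AI agents can be improved by training them to think the way humans do.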
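2. The paper introduces Thought Cloning, a new imitation learning framework that clones not only the behaviors of human demonstrators but also the thoughts they have while performing those behaviors. The approach is expected to shine most at internet scale, on datasets of people thinking out loud while acting (for example, online videos with transcripts).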
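3. Thought Cloning learns much faster than Behavioral Cloning, and its performance advantage grows the further out of distribution the test tasks are, which highlights its ability to handle novel situations better. Thought Cloning also offers important benefits for AI safety and interpretability, and makes AI easier to debug and improve. Because the agent's thoughts can be observed, it becomes easier to diagnose why something is going wrong and therefore to fix it, to steer the agent by correcting its thinking, and to prevent it from carrying out unsafe actions it plans to take. Overall, by training agents how to think as well as how to act, Thought Cloning produces safer and more powerful agents.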
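
The difference between Behavioral Cloning and Thought Cloning described above can be sketched as a difference in training loss: one imitates only the demonstrator's action, the other also imitates the demonstrator's thought. The sketch below is a minimal illustration in plain Python, not the paper's implementation. The function names, the `alpha` weight on the thought term, and the substring-based `is_thought_unsafe` check are all assumptions made here for illustration.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[float], target: int) -> float:
    """Negative log-likelihood of the target index under a probability vector."""
    return -math.log(max(probs[target], 1e-12))


def behavioral_cloning_loss(action_probs: Sequence[float], human_action: int) -> float:
    """Imitate only the demonstrator's action."""
    return cross_entropy(action_probs, human_action)


def thought_cloning_loss(
    thought_probs: Sequence[Sequence[float]],  # one distribution per thought token
    human_thought: Sequence[int],              # token ids of the demonstrator's thought
    action_probs: Sequence[float],             # distribution over actions
    human_action: int,                         # index of the demonstrator's action
    alpha: float = 1.0,                        # hypothetical weight on the thought term
) -> float:
    """Imitate the demonstrator's thought tokens and the action together."""
    thought_loss = sum(
        cross_entropy(p, t) for p, t in zip(thought_probs, human_thought)
    )
    action_loss = cross_entropy(action_probs, human_action)
    return alpha * thought_loss + action_loss


def is_thought_unsafe(thought: str, banned_phrases: Sequence[str]) -> bool:
    """Hypothetical safety check: flag a thought that mentions a banned phrase.

    Because the agent states its thought before acting, a check like this
    could stop the agent before the planned action is executed.
    """
    lowered = thought.lower()
    return any(phrase.lower() in lowered for phrase in banned_phrases)


if __name__ == "__main__":
    # Toy vocabulary of 3 tokens and 2 possible actions.
    thought_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
    human_thought = [0, 1]
    action_probs = [0.25, 0.75]
    human_action = 1

    print("BC loss:", behavioral_cloning_loss(action_probs, human_action))
    print("TC loss:", thought_cloning_loss(
        thought_probs, human_thought, action_probs, human_action))
    print("Blocked:", is_thought_unsafe("open the red door", ["red door"]))
```

With `alpha = 0` the thought term vanishes and the loss reduces to plain Behavioral Cloning, which makes the two objectives easy to compare on the same demonstrations.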