
Sophia, a New Optimizer 2x Faster Than Adam

유로파물고기 2023. 5. 29. 00:44

https://arxiv.org/abs/2305.14342

 

Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training

Given the massive cost of language model pre-training, a non-trivial improvement of the optimization algorithm would lead to a material reduction on the time and cost of training. Adam and its variants have been state-of-the-art for years, and more sophisticated…


https://twitter.com/tengyuma/status/1661412995430219786?s=20

 

Tengyu Ma on Twitter:

“Adam, a 9-yr old optimizer, is the go-to for training LLMs (eg, GPT-3, OPT, LLAMA). Introducing Sophia, a new optimizer that is 2x faster than Adam on LLMs. Just a few more lines of code could cut your costs from $2M to $1M (if scaling laws hold). …”


Sophia, a new optimizer that is 2x faster than the 9-year-old Adam optimizer, has been announced.
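
The paper presents Sophia as a lightweight second-order method: it keeps an Adam-style exponential moving average of the gradient, maintains a cheap estimate of the diagonal of the Hessian that is refreshed only every few steps, and applies an element-wise clipped, pre-conditioned update. Below is a minimal NumPy sketch of that update step, written from the paper's description; the function name, default hyperparameters, and the way the Hessian estimate is passed in are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def sophia_step(theta, grad, hess_diag_est, state,
                lr=1e-4, betas=(0.965, 0.99), gamma=0.01, eps=1e-12):
    """One Sophia-style update on a flat parameter vector (illustrative sketch).

    theta         : current parameters (np.ndarray)
    grad          : gradient of the loss at theta
    hess_diag_est : estimate of the diagonal of the Hessian; the paper refreshes
                    this only every k steps to keep per-step cost close to Adam's
    state         : dict holding the running averages 'm' and 'h'
    """
    beta1, beta2 = betas

    # First moment: exponential moving average of gradients, as in Adam.
    state["m"] = beta1 * state.get("m", np.zeros_like(theta)) + (1 - beta1) * grad

    # Curvature: exponential moving average of the diagonal Hessian estimate.
    state["h"] = beta2 * state.get("h", np.zeros_like(theta)) + (1 - beta2) * hess_diag_est

    # Pre-conditioned step, clipped element-wise; the clip bounds the update
    # wherever the curvature estimate is tiny, noisy, or negative.
    update = np.clip(state["m"] / np.maximum(gamma * state["h"], eps), -1.0, 1.0)

    return theta - lr * update
```

The element-wise clipping is the design choice that makes the cheap curvature estimate safe to use: on a non-convex loss the diagonal Hessian estimate can be noisy or even negative, so bounding each coordinate's step keeps the optimizer stable, while the infrequent curvature refresh keeps the per-step overhead close to Adam's.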