AI/etc 48

Understanding the Social Reasoning of Language Models with Language Models

https://arxiv.org/abs/2306.15448 — Understanding Social Reasoning in Language Models with Language Models. As Large Language Models (LLMs) become increasingly integrated into our everyday lives, understanding their ability to comprehend human mental states becomes critical for ensuring effective interactions. However, despite the recent attempts to assess the…

AI/etc 2023.06.28

Length Generalization in Arithmetic Transformers

https://arxiv.org/abs/2306.15400 — Length Generalization in Arithmetic Transformers. We examine how transformers cope with two challenges: learning basic integer arithmetic, and generalizing to longer sequences than seen during training. We find that relative position embeddings enable length generalization for simple tasks, such as addition…

AI/etc 2023.06.28

Interview with Aidan Gomez

https://www.ft.com/content/732fc372-67ea-4684-9ab7-6b6f3cdfd736 — Aidan Gomez: AI threat to human existence is 'absurd' distraction from real risks. The co-founder of Cohere says we should be more worried about the use of artificial intelligence in social media and medicine. Aidan Gomez, co-founder and CEO of Cohere, argues that excessive fear of artificial intelligence is distracting us from the technology's real problems. He says that debate over human extinction at the hands of superintelligent AI (AGI) is a waste of our time and…

AI/etc 2023.06.23

ALP: Action-Aware Embodied Learning for Perception

https://arxiv.org/abs/2306.10190 — ALP: Action-Aware Embodied Learning for Perception. Current methods in training and benchmarking vision models exhibit an over-reliance on passive, curated datasets. Although models trained on these datasets have shown strong performance in a wide variety of tasks such as classification, detection, and segmentation…

AI/etc 2023.06.21

Inverse Scaling: When Bigger Isn't Better

https://arxiv.org/abs/2306.09479 — Inverse Scaling: When Bigger Isn't Better. Work on scaling laws has found that large language models (LMs) show predictable improvements to overall loss with increased scale (model size, training data, and compute). Here, we present evidence for the claim that LMs may show inverse scaling, or worse…

AI/etc 2023.06.19

Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Theory of Mind

https://arxiv.org/abs/2306.09299 — Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Theory of Mind. Large Language Models (LLMs) perform complex reasoning by generating explanations for their predictions. However, a complementary goal of explanations is to also communicate useful knowledge that improves weaker agents. Hence, we investigate whether LLMs…

AI/etc 2023.06.18