AI/Google&DeepMind

Three Towers: Flexible Contrastive Learning with Pretrained Image Models

https://arxiv.org/abs/2305.16999
Three Towers: Flexible Contrastive Learning with Pretrained Image Models (arxiv.org): "We introduce Three Towers (3T), a flexible method to improve the contrastive learning of vision-language models by incorporating pretrained image classifiers. While contrastive models are usually trained from scratch, LiT (Zhai et al., 2022) has recently s…"
1. In this paper, pretrained image classifi…
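The gist of the third tower can be sketched with a standard image-text contrastive (InfoNCE) loss plus extra contrastive terms that tie both trainable towers to a frozen, pretrained image tower. The loss weighting `w` and the exact combination below are illustrative assumptions, not the paper's precise objective:

```python
import numpy as np

def info_nce(a, b, temp=0.07):
    """Symmetric contrastive (InfoNCE) loss over a batch of paired embeddings."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temp
    labels = np.arange(len(a))

    def xent(l):
        # cross-entropy with the matching pair on the diagonal as the target
        l = l - l.max(axis=1, keepdims=True)
        p = np.exp(l) / np.exp(l).sum(axis=1, keepdims=True)
        return -np.log(p[labels, labels]).mean()

    return 0.5 * (xent(logits) + xent(logits.T))

def three_tower_loss(img_emb, txt_emb, frozen_emb, w=0.5):
    """Hypothetical 3T-style objective: the usual image-text term, plus terms
    aligning both trainable towers with a frozen pretrained image tower."""
    main = info_nce(img_emb, txt_emb)
    distill = info_nce(img_emb, frozen_emb) + info_nce(txt_emb, frozen_emb)
    return main + w * distill
```

Unlike LiT, which locks the image tower itself, this setup keeps both main towers trainable and uses the frozen tower only as an extra alignment signal.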

AI/Google&DeepMind 2023.05.30

Towards Expert-Level Medical Question Answering with Large Language Models

https://arxiv.org/abs/2305.09617
Towards Expert-Level Medical Question Answering with Large Language Models (arxiv.org): "Recent artificial intelligence (AI) systems have reached milestones in 'grand challenges' ranging from Go to protein-folding. The capability to retrieve medical knowledge, reason over it, and answer medical questions comparably to physicians has long been…"
Explanation: https://twitter.com/…

AI/Google&DeepMind 2023.05.29

Randomized Positional Encodings Boost Length Generalization of Transformers

https://huggingface.co/papers/2305.16843
Paper page - Randomized Positional Encodings Boost Length Generalization of Transformers (huggingface.co): Anian Ruoss, Grégoire Delétang, Tim Genewein, Jordi Grau-Moya, Róbert Csordás, Mehdi Bennani, Shane Legg, Joel Veness, published on May 26
1. Transformers show impressive generalization abilities on tasks with a fixed context length. However, even on relatively simple tasks such as copying a string, generalizing to sequences of arbitrary length…
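The core trick can be sketched in a few lines: instead of using positions 0..n-1 at training time, sample an ordered subset of positions from a much larger range, so the model sees large position values even on short sequences. The `max_len` value below is an illustrative assumption:

```python
import numpy as np

def randomized_positions(seq_len, max_len=2048, rng=None):
    """Sample `seq_len` distinct positions from [0, max_len) and sort them.

    Training with these random but strictly increasing positions exposes the
    model to the full position range, so longer test-time sequences still map
    onto positions it has already seen (as long as they fit in max_len).
    """
    rng = rng or np.random.default_rng()
    pos = rng.choice(max_len, size=seq_len, replace=False)
    return np.sort(pos)
```

At evaluation time one can fall back to the ordinary positions 0..n-1, which by construction lie inside the trained range.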

AI/Google&DeepMind 2023.05.29

Tree of Thoughts (ToT): Deliberate Problem Solving with Large Language Models

https://github.com/ysymyth/tree-of-thought-llm
https://huggingface.co/papers/2305.10601
Paper page - Tree of Thoughts: Deliberate Problem Solving with Large Language Models (huggingface.co)
Explanation: https://twitter.com/ShunyuYao12/status/1659357547474681857?s=20
Shunyu Yao on Twitter: "Still use ⛓️Chain-of-Thought (CoT) for all your prompting? May be underutilizing LLM capabilities🤠 Introducing 🌲Tree-of-T…"
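The search loop behind ToT can be sketched as breadth-first search with pruning: expand every partial solution into candidate next "thoughts", score them, and keep only the best few. In the paper, `expand` and `score` are LLM calls (proposal and value prompts); here they are pluggable placeholder functions:

```python
def tree_of_thoughts(root, expand, score, beam=3, depth=2):
    """Minimal ToT-style search sketch (BFS variant with beam pruning).

    root   -- initial partial solution
    expand -- maps a state to a list of candidate next states ("thoughts")
    score  -- evaluates a state; higher is better
    """
    frontier = [root]
    for _ in range(depth):
        # generate all candidate continuations of the current frontier
        candidates = [c for state in frontier for c in expand(state)]
        # prune: keep only the `beam` highest-scoring candidates
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam]
    return max(frontier, key=score)
```

A toy usage: states are integers, each step may add 1, 2, or 3, and the score rewards closeness to a target value.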

AI/Google&DeepMind 2023.05.29

Training Socially Aligned Language Models in Simulated Human Society

https://arxiv.org/abs/2305.16960
Training Socially Aligned Language Models in Simulated Human Society (arxiv.org): "Social alignment in AI systems aims to ensure that these models behave according to established societal values. However, unlike humans, who derive consensus on value judgments through social interaction, current language models (LMs) are trained to rigidl…"
1. Social alignment in AI systems aims to ensure that these model…

AI/Google&DeepMind 2023.05.29

Large Language Models as Tool Makers

Explanation: https://twitter.com/tianle_cai/status/1662988435114852352?s=20
Tianle Cai on Twitter: "LLMs can make their own tools just like humans🤖! Thrilled to share my intern work @Google. We introduced a closed-loop framework to let LLMs make and utilize reusable new tools🛠️ (implemented as programs). Paper: https://t.co/cOk3VZ47ka More det…"
https://arxiv.org/abs/2305.17126
Large Language Mode…

AI/Google&DeepMind 2023.05.29

Role-Play with Large Language Models

https://arxiv.org/abs/2305.16367
Role-Play with Large Language Models (arxiv.org): "As dialogue agents become increasingly human-like in their performance, it is imperative that we develop effective ways to describe their behaviour in high-level terms without falling into the trap of anthropomorphism. In this paper, we foreground the conc…"
As dialogue agents increasingly display human-like behavior, anthropomorphism, the over-attribution of human characteristics (anthr…

AI/Google&DeepMind 2023.05.29

Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design

https://arxiv.org/abs/2305.13035
Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design (arxiv.org): "Scaling laws have been recently employed to derive compute-optimal model size (number of parameters) for a given compute duration. We advance and refine such methods to infer compute-optimal model shapes, such as width and depth, and successfully implement…"
1. Inferring compute-optimal model shapes: recently, for a given…
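The selection step described above can be sketched abstractly: given a fitted scaling law that predicts loss from a shape, enumerate candidate (width, depth) shapes that fit the compute budget and keep the best one. Both the cost model (`width² × depth`) and the toy loss function in the test are assumptions for illustration, not the paper's fitted laws:

```python
import itertools

def best_shape(compute_budget, widths, depths, predicted_loss):
    """Pick the (width, depth) shape with the lowest predicted loss among
    all candidates whose (assumed) cost width^2 * depth fits the budget."""
    feasible = [(w, d) for w, d in itertools.product(widths, depths)
                if w * w * d <= compute_budget]
    return min(feasible, key=lambda s: predicted_loss(*s))
```

The interesting part in the paper is fitting `predicted_loss` itself from small-scale runs; the enumeration here just shows how such a law turns a budget into a concrete shape.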

AI/Google&DeepMind 2023.05.28

Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts

https://arxiv.org/abs/2305.14705
Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts (arxiv.org): "The explosive growth of language models and their applications have led to an increased demand for efficient and scalable methods. In this paper, we introduce Flan-MoE, a set of Instruction-Finetuned Sparse Mixture-of-Expert (MoE) models. We show that naiv…"
1. In this paper, …
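The sparse MoE building block these models scale up can be sketched as top-k gating: a gating network scores every expert per token, only the k best experts actually run, and their outputs are combined with renormalized gate weights. Shapes and `k=2` below are illustrative assumptions:

```python
import numpy as np

def moe_layer(x, experts, gate_w, k=2):
    """Sparse Mixture-of-Experts sketch with top-k routing.

    x       -- (tokens, dim) input activations
    experts -- list of callables, each mapping a (dim,) vector to (dim,)
    gate_w  -- (dim, n_experts) gating weights
    """
    scores = x @ gate_w                       # (tokens, n_experts)
    top = np.argsort(scores, axis=1)[:, -k:]  # indices of the k best experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = scores[t, top[t]]
        w = np.exp(sel - sel.max())
        w = w / w.sum()                       # softmax over selected experts only
        for weight, idx in zip(w, top[t]):
            out[t] += weight * experts[idx](x[t])
    return out
```

Because only k of the experts run per token, parameter count grows with the number of experts while per-token compute stays roughly constant, which is what makes instruction finetuning such models attractive.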

AI/Google&DeepMind 2023.05.28