bequiet-log
🏷️ Tags
💻 Profile
Yongjun Cho
Machine Learning Researcher
Let's make Synergy together
🔎 Search
📂 All Posts
🤖 Embodied AI
(3) [Paper Review] Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning: Embodied AI

May 21, 2024

Today, I will review a new paper that was released yesterday. It comes from the team of Sergey Levine, a prominent figure in AI and RL. The authors propose fine-tuning Vision-Language Models (VLMs) with Reinforcement Learning (RL) to improve decision-making in multi-step interactive environments. The paper presents a simple approach that outperforms both GPT-4 and Gemini. Because this research is close to my own ideas for solving challenges in embodied AI, I will review the paper and organize its key concepts.

ENG
Blog
Vision Language Model
📦 3D Generation
(3) Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models : 3D Generation

May 14, 2024

This time, I will review 4D generation (3D generation + motion). This work is a paper from Nvidia that aims to perform 4D generation by combining several models. There still seems to be a lot of room for improvement, but 4D is drawing attention as a new research direction, and I think it is an ideal topic to work on.

3D Generation
KOR
Diffusion
Blog
🤖 Embodied AI
(2) [Paper Review] RT-2, Vision-Language-Action Models Transfer Web Knowledge to Robotic Control: Embodied AI

May 10, 2024

Following Q-transformer, this time I will review RT-2, the Embodied AI model released by DeepMind. Whereas RT-1 and Q-transformer trained a Transformer on robot data alone to perform imitation learning, this work starts from a Vision Language model trained at Internet scale and adds robot data to build a model with stronger generalization. If you have used GPT, you know that its image-based reasoning is already remarkable, and you can imagine using it to build a real robot. This paper demonstrates and validates that idea through experiments. It convinced me that better-performing Embodied AI will be developed.

KOR
Robotics
Vision Language Model
Blog
📦 3D Generation
(2) [Paper Review] Zero123: 3D Generation

May 9, 2024

In this post, I will review Zero123, one of the 3D generation models. Zero123 starts from a very simple idea: fine-tune a diffusion model to generate images conditioned on camera angle, and use it for 3D generation.

3D Generation
Blog
KOR
Diffusion
📦 3D Generation
(1) [Paper Review] DreamFusion: Text-To-3D Using 2D Diffusion - 3D generation

Apr 29, 2024

In this post, I will review DreamFusion, which can be seen as the foundation of many current 3D generation models. The paper comes from Google Research and Berkeley, and it drew attention by showing that a 2D diffusion model can be used to train a NeRF and so build a 3D generation model. Since then, a range of follow-up work has appeared that uses video priors, priors combining 2D and 3D, and more. Because this paper is required background for the 3D generation reviews to come, I will go through it in detail.

Diffusion
3D Generation
Blog
KOR
🤖 Embodied AI
(1) [Paper Review] Q-transformer: Embodied AI

Mar 4, 2024

In this post, I will take a closer look at the Q-Transformer paper, which I briefly explained in the previous post. To follow this post, I recommend first reading the offline RL part of that earlier article.

Reinforcement Learning
Blog
KOR
Robotics
🛡️ Reinforcement Learning
(2) Text-to-Image Diffusion Model, Alignment in Deep Learning : Comprehensive summary

Feb 27, 2024

Following the previous post, this time I look at alignment in image generation models. The field is highly competitive right now, and many papers are being published. In this article I review in detail the work from the first attempt, the Aligning Text-to-Image paper, through DPOK and Diffusion DPO. The remaining studies are summarized briefly.

Blog
KOR
Reinforcement Learning
Diffusion
🛡️ Reinforcement Learning
(1) RLHF LLM, Alignment in Deep Learning: Comprehensive Summary

Feb 16, 2024

Interest in generative AI models has exploded thanks to LLMs and image generation models. AI models such as ChatGPT and Midjourney are already driving change across the economy and society. However, a generative model goes through several training stages before it can be used in a service, and this is a direct consequence of how generative models behave. I have summarized Alignment, which is an essential part of that training process.

Blog
KOR
Reinforcement Learning
Large Language Model
😎 Daily
After Reading Dale Carnegie's How to Win Friends and Influence People

Jan 19, 2024

I wrote up my thoughts after reading Dale Carnegie's How to Win Friends and Influence People.

Blog
KOR
Book
🛡️ Reinforcement Learning
(4) Commercializing Reinforcement Learning in Games - [모두의 연구소 (Modulabs)] "What Matters Is an Unbreakable RL" Seminar Review

Dec 19, 2023

This blog post already covers the last topic of the series. Only two talks remained, and both were about applying reinforcement learning to games and bringing it to production.

Seminar
Blog
KOR
Reinforcement Learning
😎 Daily
Striving for Visibility: My Journey to Get My Blog Indexed on Google

Dec 10, 2023

How to get a blog indexed efficiently through Google Search Console.

Blog
ENG
🛡️ Reinforcement Learning
(3) Pretraining for Intelligent Robots - [모두의 연구소 (Modulabs)] "What Matters Is an Unbreakable RL" Seminar Review

Nov 30, 2023

This talk was a lecture giving an overview of pretraining in reinforcement learning. It walked through the representative algorithms with accessible explanations, which I found very helpful.

Seminar
Blog
KOR
Reinforcement Learning
Robotics
🛡️ Reinforcement Learning
(2) Causal RL, Multi environment RL - [모두의 연구소 (Modulabs)] "What Matters Is an Unbreakable RL" Seminar Review

Nov 29, 2023

I summarized the talks on Causal RL and multi-environment RL.

Seminar
Blog
KOR
Reinforcement Learning
🛡️ Reinforcement Learning
(1) 2023 Reinforcement Learning Trend - [모두의 연구소 (Modulabs)] "What Matters Is an Unbreakable RL" Seminar Review

Nov 25, 2023

I attended the 모두의 연구소 (Modulabs) reinforcement learning seminar.

Seminar
Blog
KOR
Reinforcement Learning
💬 Contact
github
email
linkedin