![[Blog] It's Been a While – Here's What I've Been Up To](/_next/image?url=https%3A%2F%2Fwww.notion.so%2Fimage%2Fattachment%253A3da53626-5887-40f0-bef6-99f807cfdb82%253AScreenshot_2025-03-27_at_10.39.55_AM.png%3Ftable%3Dblock%26id%3D1c39b104-9db2-8025-afe4-d782541781dc%26cache%3Dv2&w=3840&q=75)
![[Paper Review] OpenVLA: An Open-Source Vision-Language-Action Model](/_next/image?url=https%3A%2F%2Fwww.notion.so%2Fimage%2Fhttps%253A%252F%252Fprod-files-secure.s3.us-west-2.amazonaws.com%252Fde7e8c2a-d8d4-488e-8f26-549d4037a363%252Fe03024b2-c957-4aed-a090-396a1d381282%252FUntitled_(5).png%3Ftable%3Dblock%26id%3D1289b104-9db2-80eb-8479-f53d78d356e6%26cache%3Dv2&w=3840&q=75)
[Paper Review] OpenVLA: An Open-Source Vision-Language-Action Model
1. Yet, widespread adoption of VLAs for robotics has been challenging as 1) existing VLAs are largely closed and inaccessible to the public, and 2) prior work fails to explore methods for efficiently fine-tuning VLAs for new tasks, a key component for adoption.
2. Addressing these challenges, we introduce OpenVLA, a 7B-parameter open-source VLA trained on a diverse collection of 970k real-world robot demonstrations. OpenVLA builds on a Llama 2 language model combined with a visual encoder that fuses pretrained features from DINOv2 and SigLIP (see the sketch after this list).
3. We further show that we can effectively fine-tune OpenVLA for new settings, with especially strong generalization results in multi-task environments involving multiple objects and strong language grounding abilities, and outperform expressive from-scratch imitation learning methods such as Diffusion Policy by 20.4%.
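The two-backbone encoder in point 2 is easy to picture in code. Below is a minimal sketch of that fusion, assuming channel-wise concatenation of per-patch features and a small MLP projector into the LLM's token space; the dimensions and module names are illustrative assumptions, not OpenVLA's actual implementation.

```python
import torch
import torch.nn as nn

class FusedVisualEncoder(nn.Module):
    """Sketch of an OpenVLA-style visual encoder: DINOv2 and SigLIP see the
    same image, their per-patch features are concatenated channel-wise, and
    an MLP projects them into the language model's embedding space.
    All dimensions below are illustrative assumptions."""

    def __init__(self, dino_dim=1024, siglip_dim=1152, llm_dim=4096):
        super().__init__()
        self.projector = nn.Sequential(
            nn.Linear(dino_dim + siglip_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, dino_patches, siglip_patches):
        # dino_patches: (B, N, dino_dim), siglip_patches: (B, N, siglip_dim)
        fused = torch.cat([dino_patches, siglip_patches], dim=-1)
        return self.projector(fused)  # (B, N, llm_dim) tokens for the Llama 2 backbone
```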
![[Paper Review] Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs](/_next/image?url=https%3A%2F%2Fwww.notion.so%2Fimage%2Fhttps%253A%252F%252Fprod-files-secure.s3.us-west-2.amazonaws.com%252Fde7e8c2a-d8d4-488e-8f26-549d4037a363%252Fd57228e6-db19-4563-9709-c2323fb1716f%252FUntitled_(3).png%3Ftable%3Dblock%26id%3D1289b104-9db2-8051-85a4-ea6e301e66d3%26cache%3Dv2&w=3840&q=75)
[Paper Review] Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs
• An elusive goal in navigation research is to build an intelligent agent that can understand multimodal instructions including natural language and image, and perform useful navigation.
• To achieve this, we study a widely useful category of navigation tasks we call Multimodal Instruction Navigation with demonstration Tours (MINT), in which the environment prior is provided through a previously recorded demonstration video (see the tour-graph sketch after this list).
• We evaluated Mobility VLA in an 836 $m^2$ real-world environment and show that Mobility VLA has high end-to-end success rates on previously unsolved multimodal instructions such as "Where should I return this?" while holding a plastic bin.
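The demonstration-tour prior in MINT can be made concrete with a small sketch. Assuming the tour frames come with rough 2D camera positions, a topological graph connects frames whose poses are close, and once the long-context VLM picks the goal frame for an instruction, waypoints fall out of a shortest-path query. The pose format, the distance threshold, and the use of `networkx` are all illustrative assumptions, not the paper's implementation.

```python
import math
import networkx as nx

def build_tour_graph(frame_poses, dist_thresh=2.0):
    """Build a topological graph over demonstration-tour frames.
    frame_poses: list of (x, y) camera positions, one per tour frame
    (both the pose format and the 2.0 m threshold are assumptions)."""
    g = nx.Graph()
    for i, pose in enumerate(frame_poses):
        g.add_node(i, pose=pose)
    for i, (xi, yi) in enumerate(frame_poses):
        for j in range(i + 1, len(frame_poses)):
            xj, yj = frame_poses[j]
            d = math.hypot(xi - xj, yi - yj)
            if d < dist_thresh:
                g.add_edge(i, j, weight=d)
    return g

# After the long-context VLM maps the multimodal instruction to a goal frame:
# path = nx.shortest_path(g, source=current_frame, target=goal_frame, weight="weight")
```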
(2) An Introduction to Vision-Language Modeling: A Guide to VLM Training
The document provides an overview of vision-language model (VLM) training strategies, discussing when to use contrastive models like CLIP, masking techniques, generative models, and pretrained backbones. It emphasizes the importance of grounding and alignment in VLMs, detailing methods such as instruction tuning and reinforcement learning from human feedback (RLHF). Additionally, it highlights advancements in models like LLaVA and its variants, which incorporate multimodal instruction tuning and improve performance on various benchmarks. Finally, it addresses parameter-efficient fine-tuning methods to adapt large-scale models for specific tasks while managing computational costs.
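To make the contrastive option concrete, here is a minimal sketch of the symmetric InfoNCE objective that CLIP-style models train with: matched image-text pairs sit on the diagonal of an in-batch similarity matrix and are classified against all other pairings. The temperature value is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired image/text embeddings,
    as in CLIP-style contrastive pretraining (temperature is illustrative)."""
    image_emb = F.normalize(image_emb, dim=-1)  # unit vectors -> dot product = cosine
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)  # diagonal = matches
    loss_i2t = F.cross_entropy(logits, targets)      # image -> text direction
    loss_t2i = F.cross_entropy(logits.t(), targets)  # text -> image direction
    return (loss_i2t + loss_t2i) / 2
```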

(1) An Introduction to Vision-Language Modeling: The Families of VLMs
The document discusses Vision-Language Models (VLMs), highlighting their role in solving rate-distortion problems by optimizing predictive information and constraining conditional densities. It covers various approaches, including generative-based VLMs that generate text and images, and examples like CoCa and CM3Leon which utilize multimodal generative techniques. The document also explores the use of pretrained backbones in VLMs, emphasizing models like MiniGPT and BLIP2 that efficiently integrate visual and textual data for various tasks, showcasing advancements in multimodal understanding and generation capabilities.
![[Paper Review] Driving Everywhere with Large Language Model Policy Adaptation](/_next/image?url=https%3A%2F%2Fwww.notion.so%2Fimage%2Fhttps%253A%252F%252Fprod-files-secure.s3.us-west-2.amazonaws.com%252Fde7e8c2a-d8d4-488e-8f26-549d4037a363%252Ff97413d5-0526-48ec-b314-cba401f2af4c%252FUntitled_(2).png%3Ftable%3Dblock%26id%3D1289b104-9db2-8012-81e0-ec7999fd858a%26cache%3Dv2&w=3840&q=75)
[Paper Review] Driving Everywhere with Large Language Model Policy Adaptation
• Adapting driving behavior to new environments, customs, and laws is a long-standing problem in autonomous driving, precluding the widespread deployment of autonomous vehicles (AVs).
• LLaDA achieves this by leveraging the impressive zero-shot generalizability of large language models (LLMs) in interpreting the traffic rules in the local driver handbook.
• We also demonstrate LLaDA's ability to adapt AV motion planning policies in real-world datasets; LLaDA outperforms baseline planning approaches on all our metrics.
![(3) [Paper Review] Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning: Embodied AI](/_next/image?url=https%3A%2F%2Fwww.notion.so%2Fimage%2Fhttps%253A%252F%252Fprod-files-secure.s3.us-west-2.amazonaws.com%252Fde7e8c2a-d8d4-488e-8f26-549d4037a363%252F8ceeba0d-8b6c-45f9-a296-4aeadeae0566%252FUntitled.png%3Ftable%3Dblock%26id%3D9b9b3122-31fc-405d-860d-5fea0f2c5dbd%26cache%3Dv2&w=3840&q=75)
(3) [Paper Review] Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning: Embodied AI
Today, I will review a new paper that was released yesterday. This research comes from Sergey Levine's team, a prominent figure in the AI and RL domains. They propose fine-tuning Vision-Language Models (VLM) with Reinforcement Learning (RL) to enhance performance in optimal decision-making tasks within multi-step interactive environments. The paper presents a simple approach that outperforms both GPT-4 and Gemini. This research is similar to my own ideas for solving challenges in embodied AI. Therefore, I will review this paper and organize its key concepts.
![[Paper Review] Scaling Instructable Agents Across Many Simulated Worlds](/_next/image?url=https%3A%2F%2Fwww.notion.so%2Fimage%2Fhttps%253A%252F%252Fprod-files-secure.s3.us-west-2.amazonaws.com%252Fde7e8c2a-d8d4-488e-8f26-549d4037a363%252Fdd673178-754b-477e-85c3-510c9e0304f4%252FUntitled_(4).png%3Ftable%3Dblock%26id%3D1289b104-9db2-805b-8450-f609dec0de7a%26cache%3Dv2&w=3840&q=75)
[Paper Review] Scaling Instructable Agents Across Many Simulated Worlds
1. The goal is an agent that can follow arbitrary language instructions, carrying out complex tasks grounded in concrete embodied actions.
2. They train a Scalable, Instructable, Multiworld agent that can do everything a human can do in simulated 3D environments: language + observation → keyboard-and-mouse actions.
3. The paper describes the motivation and goals, initial progress, and preliminary results across several research environments and commercial video games.

(3) Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models : 3D Generation
This time, I will review 4D generation (3D generation plus motion). This work, published by Nvidia, aims at 4D generation by using several models in combination. There still looks to be plenty of room for improvement, but 4D is drawing attention as a new research direction, and I think it is an ideal topic to work on.
![(2) [Paper Review] RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control: Embodied AI](/_next/image?url=https%3A%2F%2Fwww.notion.so%2Fimage%2Fhttps%253A%252F%252Fprod-files-secure.s3.us-west-2.amazonaws.com%252Fde7e8c2a-d8d4-488e-8f26-549d4037a363%252F02cff65f-d206-4eba-8d76-c314778405e8%252FUntitled.png%3Ftable%3Dblock%26id%3D5049813d-bd9f-486b-9774-c8634f779a2b%26cache%3Dv2&w=3840&q=75)
(2) [Paper Review] RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control: Embodied AI
Following Q-Transformer, this time I will review RT-2, an Embodied AI model released by DeepMind. Whereas the earlier RT-1 and Q-Transformer trained a Transformer on robot data alone via imitation learning, this work adds robot data on top of a Vision-Language model trained at internet scale, yielding a model with far stronger generalization. If you have used GPT, you know its image-based reasoning is already at an astonishing level, and you have probably imagined that it could be used to build a real robot. This paper proves and validates that idea through direct experiments. It left me convinced that even better-performing Embodied AI will follow.
![(2) [Paper Review] Zero123: 3D Generation](/_next/image?url=https%3A%2F%2Fwww.notion.so%2Fimage%2Fhttps%253A%252F%252Fprod-files-secure.s3.us-west-2.amazonaws.com%252Fde7e8c2a-d8d4-488e-8f26-549d4037a363%252F7868d637-7741-4f1e-b7c1-9860b6bf19a9%252FUntitled.png%3Ftable%3Dblock%26id%3Db14c1168-59a5-4fd8-b5fe-7be4785b502d%26cache%3Dv2&w=3840&q=75)
(2) [Paper Review] Zero123: 3D Generation
In this post, I will review Zero123 among the 3D generation models. Zero123 starts from a very simple idea: fine-tune a diffusion model to generate images of an object as seen from a given camera angle, and use that for 3D generation.
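A minimal sketch of what "conditioning on the camera angle" can look like: Zero123 describes feeding the relative viewpoint change (change in elevation, sin/cos of the azimuth change, change in radius) together with the reference-image embedding into the diffusion model. The module below assumes a single linear fusion layer and an illustrative embedding dimension, not the exact architecture.

```python
import torch
import torch.nn as nn

class ViewpointConditioning(nn.Module):
    """Fuse the reference-image embedding with the relative camera change,
    producing the conditioning vector for a view-conditioned diffusion model
    (a Zero123-style scheme; dims and the single linear layer are assumptions)."""

    def __init__(self, img_emb_dim=768):
        super().__init__()
        # 4 extra dims: d_elevation, sin(d_azimuth), cos(d_azimuth), d_radius.
        self.proj = nn.Linear(img_emb_dim + 4, img_emb_dim)

    def forward(self, img_emb, d_elev, d_azim, d_radius):
        # sin/cos encoding avoids the discontinuity where azimuth wraps at 2*pi.
        pose = torch.stack(
            [d_elev, torch.sin(d_azim), torch.cos(d_azim), d_radius], dim=-1)
        return self.proj(torch.cat([img_emb, pose], dim=-1))
```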
![(1) [Paper Review] DreamFusion: Text-To-3D Using 2D Diffusion - 3D generation](/_next/image?url=https%3A%2F%2Fwww.notion.so%2Fimage%2Fhttps%253A%252F%252Fprod-files-secure.s3.us-west-2.amazonaws.com%252Fde7e8c2a-d8d4-488e-8f26-549d4037a363%252Fa2bb5c49-3998-475b-ba1b-c1e035eb4a4e%252Fthumbnail.png%3Ftable%3Dblock%26id%3Dc1685489-7a32-45bf-863c-c9e4905b0e9d%26cache%3Dv2&w=3840&q=75)
(1) [Paper Review] DreamFusion: Text-To-3D Using 2D Diffusion - 3D generation
In this post, I will review DreamFusion, which can be considered the foundation of today's many 3D generation models. The work comes from Google Research and Berkeley, and it drew attention for showing that a 3D generation model can be built by using a 2D diffusion model to train a NeRF. A variety of follow-up research now builds on video priors and priors combining 2D and 3D. Since this paper is essential background for the 3D generation reviews to come, I will go through it in detail.
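The core trick, Score Distillation Sampling (SDS), fits in a few lines: render an image from the NeRF, diffuse it to a random timestep, ask the frozen text-conditioned diffusion model to predict the noise, and push the render toward agreement with that prior without backpropagating through the diffusion U-Net. This is a sketch under stated assumptions: the `diffusion` object's `alphas_cumprod` and `predict_noise` members are hypothetical placeholders, not a real library API.

```python
import torch

def sds_backward(diffusion, render, text_emb, t):
    """One SDS step: inject the score-distillation gradient into the NeRF
    parameters through the rendered image `render` (requires_grad=True).
    `diffusion.alphas_cumprod` and `diffusion.predict_noise` are assumed
    interfaces for whatever diffusion wrapper is in use."""
    noise = torch.randn_like(render)
    alpha_bar = diffusion.alphas_cumprod[t]
    # Forward-diffuse the render to timestep t.
    x_t = alpha_bar.sqrt() * render + (1.0 - alpha_bar).sqrt() * noise
    with torch.no_grad():  # the 2D diffusion prior stays frozen
        eps_pred = diffusion.predict_noise(x_t, t, text_emb)
    w = 1.0 - alpha_bar  # a common timestep weighting choice
    # d(loss)/d(render) = w * (eps_pred - noise); the U-Net Jacobian is skipped.
    render.backward(gradient=w * (eps_pred - noise))
```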
![(1) [Paper Review] Q-transformer : Embodied AI](/_next/image?url=https%3A%2F%2Fwww.notion.so%2Fimage%2Fhttps%253A%252F%252Fprod-files-secure.s3.us-west-2.amazonaws.com%252Fde7e8c2a-d8d4-488e-8f26-549d4037a363%252Fc1c7cd24-b59b-4673-9dab-d7555dc32598%252FUntitled.png%3Ftable%3Dblock%26id%3D5414683b-2ac8-4e64-9861-0a325b133567%26cache%3Dv2&w=3840&q=75)
(1) [Paper Review] Q-transformer : Embodied AI
In this post, I will take a closer look at Q-Transformer, a paper I briefly described last time. To follow this post, it is strongly recommended to first read the offline RL section of the previous article.

(2) Text-to-Image Diffusion Model, Alignment in Deep Learning : Comprehensive summary
Continuing from the previous post, this time I look at alignment for image generation models. Competition in this area is fierce, and many papers are being published. In this post, I review in detail from the first attempt, the Aligning Text-to-Image paper, through DPOK and Diffusion DPO; the remaining lines of work are summarized briefly.
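For orientation before the full review, this is the shape of the DPO-style objective these papers adapt: given a preferred and a rejected sample, raise the policy's likelihood ratio on the winner relative to a frozen reference model. The sketch below is the generic preference loss with assumed log-likelihood inputs, not the diffusion-specific derivation.

```python
import torch.nn.functional as F

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Generic DPO objective on one (winner, loser) preference pair.
    logp_* are log-likelihoods under the trained model, ref_logp_* under
    the frozen reference; beta=0.1 is an illustrative strength."""
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -F.logsigmoid(beta * margin).mean()
```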

(1) RLHF LLM, Alignment in Deep Learning: Comprehensive Summary
Interest in generative AI models has exploded thanks to LLMs and image generation models. The influence of models such as ChatGPT and Midjourney is already driving change across the economy and society. Before a generative model can be used in a service, however, it must go through several rounds of training, a consequence of how generative models behave. Here I summarize Alignment, an essential part of that training process.
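As a taste of what the alignment stage optimizes, here is a minimal sketch of the Bradley-Terry loss used to train an RLHF reward model from human preference pairs; the reward model is assumed to already map a response to a scalar. The policy is then fine-tuned against this learned reward (commonly with PPO plus a KL penalty toward the pretrained model).

```python
import torch.nn.functional as F

def reward_model_loss(r_chosen, r_rejected):
    """Bradley-Terry pairwise loss: the scalar reward of the human-preferred
    response should exceed that of the rejected one."""
    return -F.logsigmoid(r_chosen - r_rejected).mean()
```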

Reading Dale Carnegie's How to Win Friends and Influence People
I organized my impressions after reading Dale Carnegie's How to Win Friends and Influence People.
![(4) Commercializing RL in games - [ModuLabs] "What Matters Is an Unbreakable RL" seminar review](/_next/image?url=https%3A%2F%2Fwww.notion.so%2Fimage%2Fhttps%253A%252F%252Fprod-files-secure.s3.us-west-2.amazonaws.com%252Fde7e8c2a-d8d4-488e-8f26-549d4037a363%252F7d6a375d-7108-4fb3-87bf-1daa46cdbc81%252F%2525EC%25258A%2525A4%2525ED%252581%2525AC%2525EB%2525A6%2525B0%2525EC%252583%2525B7_2024-02-27_231050.png%3Ftable%3Dblock%26id%3D0e2d5da6-cec0-4c04-a33a-4a9d3402a0cd%26cache%3Dv2&w=3840&q=75)
(4) Commercializing RL in games - [ModuLabs] "What Matters Is an Unbreakable RL" seminar review
This blog post covers the final topic of the series. Only two talks remained, and both were about applying reinforcement learning to games and commercializing it.
Striving for Visibility: My Journey to Get My Blog Indexed on Google
How to get your blog indexed efficiently using Google Search Console.
![(3) Pretraining for intelligent robots - [ModuLabs] "What Matters Is an Unbreakable RL" seminar review](/_next/image?url=https%3A%2F%2Fwww.notion.so%2Fimage%2Fhttps%253A%252F%252Fprod-files-secure.s3.us-west-2.amazonaws.com%252Fde7e8c2a-d8d4-488e-8f26-549d4037a363%252F190508d1-3bfb-4770-a76b-bf01b3856f2e%252Funnamed.webp%3Ftable%3Dblock%26id%3Da2076157-d74f-4073-afc7-896382563e7e%26cache%3Dv2&w=3840&q=75)
(3) Pretraining for intelligent robots - [ModuLabs] "What Matters Is an Unbreakable RL" seminar review
This talk was a lecture covering the overall picture of pretraining in reinforcement learning. With accessible explanations and a walkthrough of representative algorithms, I found it very helpful.
![(2) Causal RL, Multi-environment RL - [ModuLabs] "What Matters Is an Unbreakable RL" seminar review](/_next/image?url=https%3A%2F%2Fwww.notion.so%2Fimage%2Fhttps%253A%252F%252Fprod-files-secure.s3.us-west-2.amazonaws.com%252Fde7e8c2a-d8d4-488e-8f26-549d4037a363%252Fa75a3cf5-d236-4e7f-99f7-f5cc3deb1abc%252Fff13-19.png%3Ftable%3Dblock%26id%3D246388c9-0d45-4f00-9f08-6e80b2ce0ceb%26cache%3Dv2&w=3840&q=75)
(2) Causal RL, Multi-environment RL - [ModuLabs] "What Matters Is an Unbreakable RL" seminar review
I summarized the talks on Causal RL and multi-environment RL.
![(1) 2023 Reinforcement Learning Trend - [ModuLabs] "What Matters Is an Unbreakable RL" seminar review](/_next/image?url=https%3A%2F%2Fwww.notion.so%2Fimage%2Fhttps%253A%252F%252Fprod-files-secure.s3.us-west-2.amazonaws.com%252Fde7e8c2a-d8d4-488e-8f26-549d4037a363%252F3420c0a5-6e65-443f-8961-5a6b2d8adf24%252F400331474_6956017711125153_5509337466998315324_n.jpg%3Ftable%3Dblock%26id%3D86e8a2c5-444c-4954-a161-2ba08bb1cc1b%26cache%3Dv2&w=3840&q=75)
(1) 2023 Reinforcement Learning Trend - [ModuLabs] "What Matters Is an Unbreakable RL" seminar review
I attended the ModuLabs reinforcement learning seminar.