Vision-language-action (VLA) models add action to the equation.
They are trained by fine-tuning vision-language models on both Internet-scale vision-language tasks and robotic trajectory data. By expressing robot actions as text tokens and incorporating them into the training set alongside natural language tokens, these models learn to output robot actions the same way LLMs output text.
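
To make the action-as-token idea concrete, here is a minimal sketch of how a continuous action vector can be discretized into bins and rendered as token strings that sit in the same training stream as text. The bin count, action range, and token names (e.g. `<act_137>`) are illustrative assumptions, not the scheme of any specific model; an actual system would reuse or extend its text tokenizer's vocabulary.

```python
import numpy as np

NUM_BINS = 256                       # assumed bin count; real models pick their own
ACTION_LOW, ACTION_HIGH = -1.0, 1.0  # assumed normalized action range

def action_to_tokens(action: np.ndarray) -> list[str]:
    """Discretize each action dimension into a bin and render it as a token."""
    clipped = np.clip(action, ACTION_LOW, ACTION_HIGH)
    # Map [ACTION_LOW, ACTION_HIGH] to integer bin indices in [0, NUM_BINS - 1].
    scale = (NUM_BINS - 1) / (ACTION_HIGH - ACTION_LOW)
    bins = np.round((clipped - ACTION_LOW) * scale).astype(int)
    return [f"<act_{b}>" for b in bins]

def tokens_to_action(tokens: list[str]) -> np.ndarray:
    """Invert the discretization: parse bin indices, map back to floats."""
    bins = np.array([int(t.removeprefix("<act_").removesuffix(">")) for t in tokens])
    return bins / (NUM_BINS - 1) * (ACTION_HIGH - ACTION_LOW) + ACTION_LOW

# Example: a 7-DoF action (e.g. end-effector deltas plus gripper) becomes a
# short token sequence the model can emit exactly like ordinary text tokens.
action = np.array([0.12, -0.40, 0.88, 0.0, 0.05, -0.97, 1.0])
tokens = action_to_tokens(action)
print(tokens)                    # ['<act_143>', '<act_77>', ...]
print(tokens_to_action(tokens))  # approximately recovers the original action
```

Because the actions live in the same vocabulary as text, no new output head is needed: the model decodes an action the same way it decodes a sentence, and the discretization is inverted at execution time to drive the robot.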