Tags¶
Find papers by tag. Each note is listed under every tag it covers, so a paper that is both a world model and egocentric can be found from either of the two tags.
action-expert¶
action-tokenization¶
actor-critic¶
attention¶
autonomous-driving¶
- Gigapixel (2606.19641)
- LA-Pose (2604.27448)
- Roach (2108.08265, ICCV 2021)
- Spiced Self-Play (2606.19370)
- TerraTransfer (2606.17386)
- World Engine (2606.19836)
autoregressive¶
camera-control¶
co-training¶
cross-view¶
debugging¶
deployment¶
depth-estimation¶
dexterous¶
diffusion¶
- ABC (abc.bot)
- ABot-M0 (2602.11236)
- ABot-PhysWorld (2603.23376)
- DeFI (2604.16391)
- FRS (2606.13675)
- Fast-WAM (2603.16666)
- Fast-WAM: Discussion
- GenPO (2505.18763)
- LagerNVS (2603.20176)
- Octo (2405.12213)
- Pelican-Unified (2605.15153)
- QGF (2606.11087)
- Qwen-VLA (2605.30280)
- UCPE (2512.07237)
- VideoWorld 2 (2602.10102)
- WALL-WM (X Square Robot)
- Wall-OSS-0.5 (X Square Robot)
- World Engine (2606.19836)
- ZPRL (2605.19919)
- μ₀ (2606.13769)
- π₀ (2410.24164)
discussion¶
driveos¶
egocentric¶
engineering-notes¶
evaluation-benchmark¶
- ABC (abc.bot)
- ABot-M0 (2602.11236)
- ABot-PhysWorld (2603.23376)
- ActionCodec (2602.15397)
- FASTer (2512.04952)
- Gigapixel (2606.19641)
- LeRobot (2602.22818)
- OASIS (2605.25829)
- OpenVLA-OFT (2502.19645)
- Roach (2108.08265, ICCV 2021)
- Spiced Self-Play (2606.19370)
- TerraTransfer (2606.17386)
- VibeThinker-3B (2606.16140)
- VideoWorld (2501.09781)
- VideoWorld 2 (2602.10102)
- Wall-OSS-0.5 (X Square Robot)
fine-tuning¶
- Agentic-VLA (2605.22896)
- LiteFrame (2605.17260)
- OpenVLA (2406.09246)
- OpenVLA-OFT (2502.19645)
- VibeThinker-3B (2606.16140)
- World Engine (2606.19836)
- ZPRL (2605.19919)
- π*₀.₆ · RECAP (2511.14759)
- π₀.₇ (2604.15483)
flatformer¶
flow-matching¶
gaussian-splatting¶
human-pose¶
imitation-learning¶
- ABC (abc.bot)
- ABot-M0 (2602.11236)
- Agentic-VLA (2605.22896)
- FAST · π₀-FAST (2501.09747)
- FRS (2606.13675)
- Fast-WAM (2603.16666)
- Gigapixel (2606.19641)
- LeRobot (2602.22818)
- MEM · π₀.₆-MEM (2603.03596)
- OASIS (2605.25829)
- Octo (2405.12213)
- OpenVLA-OFT (2502.19645)
- QGF (2606.11087)
- Roach (2108.08265, ICCV 2021)
- Spiced Self-Play (2606.19370)
- VideoWorld (2501.09781)
- ZPRL (2605.19919)
- π*₀.₆ · RECAP (2511.14759)
- π₀ (2410.24164)
- π₀.₅ (2504.16054)
- π₀.₆ (Model Card)
- π₀.₇ (2604.15483)
inference¶
knowledge-insulation¶
kv-cache¶
latent-action¶
- DeFI (2604.16391)
- FRS (2606.13675)
- LA-Pose (2604.27448)
- VideoWorld (2501.09781)
- VideoWorld 2 (2602.10102)
lidar¶
liteframe¶
locomotion¶
lora¶
manipulation¶
- ABC (abc.bot)
- ABot-M0 (2602.11236)
- ABot-PhysWorld (2603.23376)
- ActionCodec (2602.15397)
- Agentic-VLA (2605.22896)
- DeFI (2604.16391)
- FAST · π₀-FAST (2501.09747)
- FASTer (2512.04952)
- FRS (2606.13675)
- Fast-WAM (2603.16666)
- GenPO (2505.18763)
- IMPACT (2605.09127)
- KI · Knowledge Insulation (2505.23705)
- LeRobot (2602.22818)
- MEM · π₀.₆-MEM (2603.03596)
- OASIS (2605.25829)
- Octo (2405.12213)
- OpenVLA (2406.09246)
- OpenVLA-OFT (2502.19645)
- Pelican-Unified (2605.15153)
- QGF (2606.11087)
- Qwen-VLA (2605.30280)
- RLT · RL Token (2604.23073)
- SimDist (2603.15759)
- Utonia (2603.03283)
- VideoWorld (2501.09781)
- VideoWorld 2 (2602.10102)
- WALL-WM (X Square Robot)
- Wall-OSS-0.5 (X Square Robot)
- ZPRL (2605.19919)
- μ₀ (2606.13769)
- π*₀.₆ · RECAP (2511.14759)
- π₀ (2410.24164)
- π₀.₅ (2504.16054)
- π₀.₆ (Model Card)
- π₀.₇ (2604.15483)
mha¶
model-card¶
mpc-planning¶
muon¶
muonclip¶
navigation¶
novel-view-synthesis¶
object-pose¶
online-rl¶
open-source¶
paligemma¶
pi-zero¶
- FAST · π₀-FAST (2501.09747)
- KI · Knowledge Insulation (2505.23705)
- MEM · π₀.₆-MEM (2603.03596)
- RLT · RL Token (2604.23073)
- π*₀.₆ · RECAP (2511.14759)
- π₀ (2410.24164)
- π₀.₅ (2504.16054)
- π₀.₆ (Model Card)
- π₀.₇ (2604.15483)
planning¶
point-cloud¶
pose-estimation¶
pretraining¶
- ABot-M0 (2602.11236)
- DeFI (2604.16391)
- FAST · π₀-FAST (2501.09747)
- LA-Pose (2604.27448)
- LagerNVS (2603.20176)
- LiteFrame (2605.17260)
- MEM · π₀.₆-MEM (2603.03596)
- Octo (2405.12213)
- OpenVLA (2406.09246)
- Qwen-VLA (2605.30280)
- Utonia (2603.03283)
- VibeThinker-3B (2606.16140)
- VideoWorld 2 (2602.10102)
- WALL-WM (X Square Robot)
- Wall-OSS-0.5 (X Square Robot)
- μ₀ (2606.13769)
- π₀ (2410.24164)
- π₀.₅ (2504.16054)
- π₀.₇ (2604.15483)
qk-clip¶
qk-norm¶
qkv-bias¶
quantization¶
real-time-slam¶
representation-learning¶
rl¶
- Agentic-VLA (2605.22896)
- FRS (2606.13675)
- GenPO (2505.18763)
- Gigapixel (2606.19641)
- LeRobot (2602.22818)
- QGF (2606.11087)
- Qwen-VLA (2605.30280)
- RLT · RL Token (2604.23073)
- Roach (2108.08265, ICCV 2021)
- SimDist (2603.15759)
- Spiced Self-Play (2606.19370)
- TerraTransfer (2606.17386)
- VibeThinker-3B (2606.16140)
- World Engine (2606.19836)
- ZPRL (2605.19919)
- π*₀.₆ · RECAP (2511.14759)
sample-efficient¶
self-supervised¶
sim2real¶
tensorrt¶
training-stability¶
trajectory-optimization¶
transformer¶
- Fast-WAM (2603.16666)
- KV-Tracker (2512.22581)
- LagerNVS (2603.20176)
- LiteFrame (2605.17260)
- MEM · π₀.₆-MEM (2603.03596)
- Octo (2405.12213)
- UCPE (2512.07237)
- Utonia (2603.03283)
video-generation¶
- ABot-PhysWorld (2603.23376)
- EgoExo-WM (2605.15477)
- Pelican-Unified (2605.15153)
- UCPE (2512.07237)
- VideoWorld (2501.09781)
- VideoWorld 2 (2602.10102)
- WALL-WM (X Square Robot)
vla¶
- ABC (abc.bot)
- ABot-M0 (2602.11236)
- ABot-PhysWorld (2603.23376)
- ActionCodec (2602.15397)
- Agentic-VLA (2605.22896)
- DeFI (2604.16391)
- FAST · π₀-FAST (2501.09747)
- FASTer (2512.04952)
- FRS (2606.13675)
- Fast-WAM (2603.16666)
- KI · Knowledge Insulation (2505.23705)
- LeRobot (2602.22818)
- MEM · π₀.₆-MEM (2603.03596)
- OASIS (2605.25829)
- Octo (2405.12213)
- OpenVLA (2406.09246)
- OpenVLA-OFT (2502.19645)
- Pelican-Unified (2605.15153)
- Qwen-VLA (2605.30280)
- RLT · RL Token (2604.23073)
- WALL-WM (X Square Robot)
- Wall-OSS-0.5 (X Square Robot)
- μ₀ (2606.13769)
- π*₀.₆ · RECAP (2511.14759)
- π₀ (2410.24164)
- π₀.₅ (2504.16054)
- π₀.₆ (Model Card)
- π₀.₇ (2604.15483)