PhysMem: Scaling Test-time Physical Memory for Robot Manipulation, by Haoyang Li and 3 other authors
Abstract: Reliable object manipulation requires understanding physical properties that vary across objects and environments. Vision-language model (VLM) planners can reason about friction and stability in general terms; however, they often cannot predict how a specific ball will roll on a particular surface or which stone will provide a stable foundation without direct experience. We present PhysMem, a memory framework that enables VLM robot planners to learn physical principles from interaction at test time, without updating model parameters. The system records experiences, generates candidate hypotheses, and verifies them through targeted interaction before promoting validated knowledge to guide future decisions. A central design choice is verification before application: the system tests hypotheses against new observations rather than applying retrieved experience directly, reducing rigid reliance on prior experience when physical conditions change. We evaluate PhysMem on three real-world manipulation tasks and simulation benchmarks across four VLM backbones. On a controlled brick insertion task, principled abstraction achieves 76% success compared to 23% for direct experience retrieval, and real-world experiments show consistent improvement over 30-minute deployment sessions.
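The abstract describes a record-hypothesize-verify-promote loop in which only verified knowledge guides the planner. The following is a minimal sketch of that flow under stated assumptions; the class names (Experience, Hypothesis, PhysMemStore), the confidence threshold, and the minimum-trial count are illustrative choices of ours, not the paper's actual data structures or parameters.

```python
# Hypothetical sketch of a verify-before-apply test-time memory.
# All names and thresholds below are assumptions for illustration only.
from dataclasses import dataclass


@dataclass
class Experience:
    """One interaction record: what the robot did and what it observed."""
    action: str
    observation: dict


@dataclass
class Hypothesis:
    """A candidate physical principle abstracted from past experiences."""
    statement: str      # e.g. "this foam brick needs a downward push to seat"
    support: int = 0    # verification trials that agreed with the hypothesis
    trials: int = 0     # total verification trials

    @property
    def confidence(self) -> float:
        return self.support / self.trials if self.trials else 0.0


class PhysMemStore:
    """Test-time memory in which hypotheses are promoted only after
    verification through targeted interaction, rather than retrieved
    experience being applied directly."""

    def __init__(self, promote_threshold: float = 0.8, min_trials: int = 3):
        self.experiences: list[Experience] = []
        self.candidates: list[Hypothesis] = []
        self.validated: list[Hypothesis] = []
        self.promote_threshold = promote_threshold
        self.min_trials = min_trials

    def record(self, exp: Experience) -> None:
        """Store a raw interaction without drawing conclusions from it."""
        self.experiences.append(exp)

    def propose(self, hyp: Hypothesis) -> None:
        """Add a candidate principle generated from recorded experiences."""
        self.candidates.append(hyp)

    def verify(self, hyp: Hypothesis, agreed: bool) -> None:
        """Update a candidate with the outcome of one targeted interaction;
        promote it once it has enough consistent support."""
        hyp.trials += 1
        hyp.support += int(agreed)
        if hyp.trials >= self.min_trials and hyp.confidence >= self.promote_threshold:
            self.candidates.remove(hyp)
            self.validated.append(hyp)

    def guidance(self) -> list[str]:
        """Only validated principles are surfaced to the planner."""
        return [h.statement for h in self.validated]
```

In this sketch, retrieved experiences never feed the planner directly; a candidate principle must survive repeated verification before `guidance()` exposes it, mirroring the verification-before-application design choice the abstract emphasizes.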
Submission history
From: Haoyang Li
[v1] Mon, 23 Feb 2026 20:18:35 UTC (18,772 KB)
[v2] Wed, 4 Mar 2026 04:33:20 UTC (18,767 KB)
[v3] Mon, 23 Mar 2026 00:23:00 UTC (18,767 KB)
[v4] Sat, 28 Mar 2026 02:43:53 UTC (18,768 KB)


