Jiaqi Liao 廖佳琪

VLM & WAM Post-Training

I am a research intern at Ant Group working on post-training for vision-language models (VLM) and World Action Models (WAM).

I am especially interested in video generation for egocentric embodied scenarios and long-video understanding, along with WAM and RL post-training.

Research

🏋️
Gym-V: A Unified Vision Environment System for Agentic Vision Research

Fanqing Meng*, …, Lingxiao Du, Jiawei Gu, Jiaqi Liao* (Co-first & Core Contributor), …, Linjie Li, Jiawei Gu, Ziqi Zhao, Mengkang Hu, Yue Zhang, Zichen Liu, Michael Qizhe Shieh

Preprint

Gym-V provides a unified environment for agentic vision research, enabling systematic evaluation of vision-language models.

Projects

ClawMark Bench

Released

A benchmark for multi-day, multimodal coworker agents with living-world tasks and rule-based scoring.

Benchmark Multimodal Agents

Experience

Ant Group

Research Intern, VLM & WAM Post-Training
Present
Video generation, long-video understanding, and RL post-training

About

I am currently focused on post-training for VLM and WAM at Ant Group. My interests center on video generation for egocentric embodied scenarios, long-video understanding, and RL post-training. I have received the SenseTime Scholarship once and the National Scholarship twice.