Research · 5h ago

Rank-Then-Act: Reward-Free Control from Frame-Order Progress

Researchers have introduced Rank-Then-Act, a method for training AI control policies from expert video demonstrations alone, without environment rewards. A vision-language model scores chronological progress through these videos, and the system learns to perform tasks from observation alone. The approach could simplify the development of robotic agents for tasks where a precise numerical reward function is difficult or impossible to define.

Covered by 1 source

arXiv CS.AI · Yuriy Maksyuta, George Bredis, Ruslan Rakhimov, Daniil Gavrilov · 5h ago
