Research · 5h ago
Rank-Then-Act: Reward-Free Control from Frame-Order Progress
Researchers have introduced Rank-Then-Act, a new method for training AI control policies using only expert video demonstrations rather than environment rewards. By utilizing a vision-language model to score the chronological progress of these videos, the system learns to perform tasks through observation alone. This approach potentially simplifies the development of robotic agents in scenarios where defining precise numerical reward functions is difficult or impossible.
Covered by 1 source
- arXiv CS.AI: Yuriy Maksyuta, George Bredis, Ruslan Rakhimov, Daniil Gavrilov (5h ago)