arXiv:2606.20772v1 Announce Type: new Abstract: Camera-first autonomous-driving models predict future ego waypoints from images, ego-state features, and route commands, but waypoint supervision alone provides no explicit training signal for actor-level representations of nearby road users. We study this as su