Show HN: PhAIL – Real-robot benchmark for AI models
11 hours ago
- #AI evaluation
- #commercial applications
- #robotics
- A leaderboard evaluates physical AI models.
- Five leading models will be tested on a commercial task using production metrics.