I think the proper way to estimate cost is to measure the cost of an entire test run, as aider's leaderboard does.
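To make the point concrete, here is a rough sketch of how a whole-run cost could be totalled. Everything in it is an assumption for illustration: the `Call` record, the `run_cost` helper, and the prices and token counts are hypothetical and not taken from aider's leaderboard or any real provider.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """Token usage of one model call made during a test run (hypothetical record)."""
    input_tokens: int
    output_tokens: int


def run_cost(calls, usd_per_m_input, usd_per_m_output):
    """Total USD cost of a whole run: sum all tokens, then apply per-million prices."""
    total_in = sum(c.input_tokens for c in calls)
    total_out = sum(c.output_tokens for c in calls)
    return (total_in * usd_per_m_input + total_out * usd_per_m_output) / 1_000_000


# Made-up numbers: model A is cheaper per token but needs more calls and tokens
# to finish the same test; model B is pricier per token but more economical overall.
model_a_calls = [Call(40_000, 12_000)] * 5
model_b_calls = [Call(15_000, 3_000)] * 2

cost_a = run_cost(model_a_calls, usd_per_m_input=1.0, usd_per_m_output=2.0)
cost_b = run_cost(model_b_calls, usd_per_m_input=3.0, usd_per_m_output=6.0)
print(f"model A: ${cost_a:.3f}  model B: ${cost_b:.3f}")
```

With these invented numbers the model with the lower per-token price ends up costing more for the full run, which is why the per-run total is the more useful figure to compare.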