Why are the outputs measured in hours? Shouldn't it be tokens, or even words, since tokenizers can be more or less efficient?
Hours also seem hard to compare, since TPS (tokens per second) on 5.6 might be much faster.