Nevermark 5 hours ago
Here is one: an adjustment to weight updates that makes it more likely for weights to stay uniformly distributed. The first graph reports ~257.5 teraflops for normally distributed data versus ~268 teraflops for uniform. I would have liked to see a straight graph of performance vs. clock speed for each type of data. Pick your data statistics, then pick the clock speed that gives peak performance accordingly. And for actual runs, pick it from a curve sampled before the run.
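To make the "pre-run sampled curve" idea concrete, here is a rough Python sketch. It assumes you have already measured throughput at a handful of clock speeds on a sample of your real data. The names `peak_clock` and `relative_gap` and the (clock, teraflops) sample layout are my own, hypothetical choices, not anything from the article. The only real numbers used are the two quoted above.

```python
from typing import List, Tuple

# One sample per tested clock speed: (clock_mhz, measured_tflops).
# Hypothetical layout; the samples would come from a short pre-run benchmark.
Curve = List[Tuple[float, float]]


def peak_clock(curve: Curve) -> Tuple[float, float]:
    """Return the (clock, throughput) sample with the highest throughput."""
    if not curve:
        raise ValueError("curve must contain at least one sample")
    return max(curve, key=lambda sample: sample[1])


def relative_gap(baseline: float, other: float) -> float:
    """Fractional difference of `other` relative to `baseline`."""
    return (other - baseline) / baseline


# The two figures quoted from the first graph: normal vs. uniform data.
gap = relative_gap(257.5, 268.0)
print(f"uniform over normal: {gap:.1%}")  # prints "uniform over normal: 4.1%"
```

You would call `peak_clock` once per data distribution, each with its own sampled curve, and run at whichever clock it returns. This just takes the best measured point; it does not interpolate between samples, so the result is only as fine-grained as the clocks you sampled.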