| ▲ | gdiamos 6 hours ago | |
One of my lessons from using different accelerators, whether different NVIDIA versions, or GPU->TPU, etc., is that someone needs to do the work of indexing, partitioning, mapping, scheduling, and benchmarking. That work is labor intensive. In this case, Google has already done it, and that will be true for well-resourced accelerator companies like Google working with the most popular operations like attention. As long as you use those operations, you are okay. But if you do something different, you need to be prepared to do all of this yourself. | ||