topspin 4 days ago

How feasible is this for a cloud operation? I imagine this work requires close collaboration with the chip architects and proprietary knowledge of the design.

ants_everywhere 4 days ago | parent

It seems feasible; it's more a matter of how much of a priority it is.

I follow Google most closely. They design their own accelerators (the TPUs). AWS I know designs its own CPUs (Graviton), but I don't know if they're working on, or already have, an AI accelerator.

Several of the big players are working on OpenXLA, which is designed to abstract and commoditize the accelerator layer (a small JAX sketch of what that looks like is below): https://openxla.org/xla

OpenXLA mentions:

> Alibaba, Amazon Web Services, AMD, Apple, Arm, Google, Intel, Meta, and NVIDIA
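
To make that concrete: JAX compiles through XLA, so the same jit-compiled function runs on whichever backend the runtime finds, which is the commoditization angle. A minimal sketch (shapes and values are just placeholders):

    # jax.jit lowers this function through XLA; the compiled result runs on
    # whichever backend is available (CPU, GPU, or TPU) without code changes.
    import jax
    import jax.numpy as jnp

    @jax.jit
    def predict(w, x):
        return jnp.tanh(x @ w)

    w = jnp.ones((4, 4))
    x = jnp.ones((2, 4))
    print(jax.default_backend())  # e.g. "cpu", "gpu", or "tpu"
    print(predict(w, x))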

mdaniel 4 days ago | parent

> AWS I know designs its own CPUs (Graviton), but I don't know if they're working on, or already have, an AI accelerator

I believe those are the Inferentia chips: https://aws.amazon.com/ai/machine-learning/inferentia/

> AWS Inferentia chips are designed by AWS to deliver high performance at the lowest cost in Amazon EC2 for your deep learning (DL) and generative AI inference applications

but I don't know offhand if they're supported by the major frameworks, or what
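
Poking around, the supported path seems to be AWS's Neuron SDK, which plugs into PyTorch via a torch-neuronx package. A rough, untested sketch, assuming an Inferentia (inf2) instance with the Neuron tooling installed:

    # Compile a PyTorch model for a NeuronCore via AWS's Neuron SDK.
    # Assumes torch-neuronx is installed and we're on an Inferentia instance.
    import torch
    import torch_neuronx

    model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU()).eval()
    example = torch.rand(1, 8)

    # torch_neuronx.trace works like torch.jit.trace, but the traced graph
    # is compiled ahead of time for the Neuron hardware.
    neuron_model = torch_neuronx.trace(model, example)
    print(neuron_model(example))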

I also didn't remember https://aws.amazon.com/ai/machine-learning/trainium/ until I was looking up that page, so it seems they're trying to have a competitor to the TPUs, just with dumb names, because AWS

> AWS Trainium chips are a family of AI chips purpose built by AWS for AI training and inference to deliver high performance while reducing costs.

ants_everywhere 4 days ago | parent

Thanks, this is useful!

> have a competitor to the TPUs, just with dumb names, because AWS

I kind of like "Trainium", although "Inferentia" I could take or leave. At least it's nice that the names tell you the intended use case.