PeterisP 3 days ago

The general principle is that a "pipeline" bakes in the errors of its first step: once that step has committed to an output, the following step(s) can't effectively use their knowledge to fix its mistakes.

So if the first step isn't near-perfect (which ASR isn't), and if the later step(s) hold some information or "world knowledge" that would help choose between the first step's candidate outputs (which is true of knowledge about named entities and ASR), then you can get better accuracy with an end-to-end system where you don't attempt to pick just one best option at the system boundary. Joint training can also help, but IMHO that is less important.
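
A toy sketch of the point above, not a real ASR system. The transcripts, probabilities, contact list, and function names are all made up for illustration, and n-best rescoring stands in here for a true end-to-end model. The only thing it tries to show is what changes when the boundary passes several scored hypotheses instead of a single best one.

```python
import math

# Hypothetical ASR n-best list: (transcript, acoustic log-probability).
# The numbers are invented for illustration.
ASR_NBEST = [
    ("call john smyth", math.log(0.40)),
    ("call john smith", math.log(0.35)),
    ("call joan smith", math.log(0.25)),
]

# Hypothetical "world knowledge" held by the later step:
# names that actually exist in the user's contact list.
KNOWN_NAMES = {"john smith", "jane doe"}


def entity_log_prior(transcript):
    """Log-prior from the later step: known names are far more likely."""
    name = transcript.split(" ", 1)[1]  # drop the leading "call"
    return math.log(0.9) if name in KNOWN_NAMES else math.log(0.05)


def pipeline_decode(nbest):
    """Pipeline: commit to the ASR 1-best at the system boundary."""
    return max(nbest, key=lambda hyp: hyp[1])[0]


def joint_decode(nbest):
    """Keep all hypotheses and let the later step's knowledge rescore them."""
    return max(nbest, key=lambda hyp: hyp[1] + entity_log_prior(hyp[0]))[0]


if __name__ == "__main__":
    print(pipeline_decode(ASR_NBEST))  # call john smyth
    print(joint_decode(ASR_NBEST))     # call john smith
```

In the pipeline version the misspelled "smyth" is already baked in before the entity knowledge is ever consulted. In the rescoring version the slightly less likely acoustic hypothesis wins because the later step knows "john smith" is a real contact.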