zozbot234 3 days ago

> I know the point of attention is to select the correct data and process that

In a way, the point of attention mechanisms is to bias the model towards long-range dependencies (as seen e.g. in natural-language sentence structure) and away from the very short-term ones that a plain RNN tends to focus on. LSTM sits somewhere in the middle: it manages to persist information beyond the very short run (so "paying attention" to it in a very fuzzy sense), but not over ranges as long as attention handles.
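
A minimal sketch of why attention has no built-in distance penalty: in scaled dot-product self-attention, every position scores every other position directly, so a token 50 steps away is as reachable as the adjacent one, whereas an RNN has to carry that information through 50 recurrence steps. This is an illustrative toy (random vectors, no learned projections), not any specific model's implementation:

```python
import numpy as np

def attention(Q, K, V):
    # Scaled dot-product attention: each query scores EVERY key in one
    # matrix product, regardless of sequence distance between them.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Numerically stable softmax over each row of scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy "sequence" of 5 token vectors (hypothetical random data).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
out, w = attention(X, X, X)  # self-attention: queries, keys, values all from X

# w is a full 5x5 matrix: position 0 can weight position 4 just as
# directly as position 1 -- nothing decays with distance by construction.
print(w.shape)
```

An RNN's analogue of `w` would be implicit and shaped by repeated state updates, which is exactly where very long-range information tends to get washed out.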