gzer0 2 days ago
Congratulations to the DeepSeek team on the release. An interesting note on the use of CSA and HCA: CSA provides higher-resolution, query-selected memory over 4-token compressed blocks, while HCA provides very low-resolution dense global memory over 128-token blocks. That could be a plausible reason to interleave them: CSA alone risks missing information if the indexer fails, while HCA alone is too lossy for precise retrieval. I'm still reading through the release; as usual, I appreciate the attention to detail in the technical papers.
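To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. The 4-token and 128-token block sizes are the ones mentioned above; the context length and the top-k value are made-up assumptions for illustration, not figures from the release, and `compressed_entries` is a hypothetical helper, not anything from DeepSeek's code.

```python
import math


def compressed_entries(context_tokens: int, block_size: int) -> int:
    """Count memory entries when every `block_size` tokens collapse into one entry."""
    return math.ceil(context_tokens / block_size)


CONTEXT = 131_072  # hypothetical context length (assumption, not from the release)
TOP_K = 512        # hypothetical number of blocks the indexer selects per query

csa_total = compressed_entries(CONTEXT, 4)    # fine-grained blocks available to CSA
hca_total = compressed_entries(CONTEXT, 128)  # coarse blocks HCA attends to densely

# CSA only attends to the blocks the indexer picks, so it sees a small
# fraction of its memory at high resolution.
csa_attended = min(TOP_K, csa_total)
csa_coverage = csa_attended / csa_total

print(f"CSA: {csa_total} blocks, attends to {csa_attended} ({csa_coverage:.2%})")
print(f"HCA: {hca_total} blocks, attends to all of them")
```

Under these assumed numbers, CSA looks at only a sliver of a large, detailed memory (so a bad indexer pick means the information is simply not seen), while HCA sees everything but at 32x coarser granularity. That is consistent with the interleaving argument, though it says nothing about how the actual model is configured.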