▲ | fwip a day ago | |
There are a few options out there that compress noticeably better than gzip, with the downside of being less widely compatible with tools. zstd is one, and it also has the benefit of being very fast (depending on your settings, of course). CRAM compresses unmapped FASTQ pretty well, and can do even better with reference-based compression. If your institution is okay with it, you can see additional savings by quantizing quality scores (modern Illumina sequencers already do this for you). If you're aligning your data anyway, your best bet is probably to retain just the compressed CRAM file, with unmapped reads included. There are also other FASTA/FASTQ-specific tools like fqzcomp or MZPAQ. Last I checked, both of these could about halve the size of our fastq.gz files. |
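To make "quantizing quality scores" concrete, here's a rough sketch of what binning does to a FASTQ quality string. The bin edges and representative values below are made up for illustration (they are not claimed to match any sequencer's actual scheme), and `quantize_quality` is a hypothetical helper name. It assumes the usual Phred+33 encoding.

```python
import bisect

# Illustrative bins only (an assumption, not a vendor's real table).
# BIN_UPPER holds the inclusive upper Phred score of each bin except the last;
# BIN_VALUE holds the single score every member of that bin is replaced with.
BIN_UPPER = [9, 19, 24, 29, 34, 39]
BIN_VALUE = [6, 15, 22, 27, 33, 37, 40]


def quantize_quality(qual: str, offset: int = 33) -> str:
    """Map each quality character to its bin's representative score."""
    out = []
    for ch in qual:
        q = ord(ch) - offset                    # decode Phred+33 character to a score
        idx = bisect.bisect_left(BIN_UPPER, q)  # index of the first bin whose upper bound is >= q
        out.append(chr(BIN_VALUE[idx] + offset))
    return "".join(out)


if __name__ == "__main__":
    original = "IIIHGF5432+*)!"
    binned = quantize_quality(original)
    # Fewer distinct symbols means less entropy for the compressor to encode.
    print(len(set(original)), "distinct symbols before,", len(set(binned)), "after")
```

The read length and order are untouched; only the quality alphabet shrinks, which is where the extra compression comes from. It is lossy, hence the "if your institution is okay with it" caveat.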