optionalsquid 10 hours ago

The original FASTA/Pearson format and the fasta/tfasta tools have supported 'N' for ambiguous nucleotides since at least 1996 [1], and the FASTQ format has, to my knowledge, always supported 'N' bases (i.e. since around 2000). The IUPAC codes themselves date back to 1970 [2]. You can probably get away with not supporting the full range of IUPAC nucleotide codes, but without 'N' your tool cannot represent what is probably the majority of available FASTA/FASTQ data.

[1] See 'release.v16' in the fasta2 release at https://fasta.bioch.virginia.edu/wrpearson/fasta/fasta_versi...

[2] https://iupac.qmul.ac.uk/misc/naabb.html
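
To make the distinction concrete, here is a minimal Python sketch of the three levels of support mentioned above: plain A/C/G/T, A/C/G/T plus 'N', and the full IUPAC nucleotide alphabet. The function name and the returned labels are made up for illustration and are not taken from any tool; the table covers DNA only (no 'U') and ignores gap characters.

```python
# IUPAC nucleotide codes mapped to the bases each one can stand for.
# DNA only: 'U' and gap symbols are deliberately left out of this sketch.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def classify_sequence(seq):
    """Report the smallest alphabet that covers seq (hypothetical labels)."""
    symbols = set(seq.upper())
    if symbols <= set("ACGT"):
        return "strict"   # only unambiguous bases
    if symbols <= set("ACGTN"):
        return "with_n"   # unambiguous bases plus 'N'
    if symbols <= set(IUPAC_DNA):
        return "iupac"    # needs the full IUPAC code table
    return "invalid"      # contains something outside IUPAC


if __name__ == "__main__":
    for example in ("ACGT", "ACGTNNNN", "ACRYN", "ACGX"):
        print(example, classify_sequence(example))
```

A tool that only accepts the "strict" case would reject any read or contig containing an 'N', which is the situation described above.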