layer8 an hour ago
The page is declared as ISO 8859-1, but the actual bytes of the text appear to be UTF-8. In UTF-8, characters from U+0080 to U+00BF happen to be encoded as C2 <codepoint value>. For example, U+0092 is encoded as C2 92. C2 in ISO 8859-1 is “Â”. U+0092 is the control code Private Use 2 in Unicode, and byte 92 is the same control code in ISO 8859-1. However, the standard Western Windows code page 1252 extends ISO 8859-1 by assigning “’” (right single quotation mark) to 92. HTML5/WHATWG requires an ISO 8859-1 charset declaration to be interpreted as Windows-1252 (https://blog.whatwg.org/the-road-to-html-5-character-encodin...), hence the displayed result is “Â’”. The original Windows-1252 content must have previously been converted to UTF-8 under the assumption that the source was ISO 8859-1, i.e. mapping 92 to U+0092 (Private Use 2) instead of to U+2019 (Right Single Quotation Mark). The resulting UTF-8 encoding was then placed in the web page, which is nevertheless declared as ISO 8859-1.
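For anyone who wants to replay that chain, here is a minimal Python sketch. It assumes Python's built-in `latin-1` and `cp1252` codecs are acceptable stand-ins for ISO 8859-1 and Windows-1252, and the variable names are made up for illustration. It reproduces the two wrong steps described above and then undoes them in reverse order.

```python
# Assumption: Python's "latin-1" and "cp1252" codecs stand in for
# ISO 8859-1 and Windows-1252. Variable names are illustrative only.

original = "\u2019"                        # RIGHT SINGLE QUOTATION MARK
source_bytes = original.encode("cp1252")   # single byte 0x92

# Wrong step 1: the Windows-1252 byte is read as ISO 8859-1, so 0x92
# becomes the C1 control U+0092, which is then written out as UTF-8.
misread = source_bytes.decode("latin-1")   # U+0092
page_bytes = misread.encode("utf-8")       # bytes C2 92

# Wrong step 2: the page is declared ISO 8859-1, which browsers treat
# as Windows-1252, so each of the two bytes becomes its own character.
displayed = page_bytes.decode("cp1252")    # "Â" followed by "’"

# Undoing both steps in reverse order recovers the original character.
repaired = (
    displayed.encode("cp1252")   # back to bytes C2 92
    .decode("utf-8")             # U+0092
    .encode("latin-1")           # byte 0x92
    .decode("cp1252")            # U+2019
)

print(displayed, repaired)
```

The repair only works here because both wrong steps are lossless for this particular byte. Text containing bytes that a codec leaves undefined would need error handling this sketch does not include.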
wvbdmp an hour ago
Delicious, thank you!
root-parent 32 minutes ago
this one does g11n....