wvbdmp 2 hours ago
Yikes, mojibake in 2021. Edit: actually, how did that happen? The apostrophes show up correctly, they’re just all preceded by a “Â” that doesn’t seem to represent anything?
layer8 an hour ago
The page is declared as ISO 8859-1, but the actual bytes of the text appear to be UTF-8. In UTF-8, characters from U+0080 to U+00BF happen to be encoded as the two bytes C2 <code point value>. For example, U+0092 is encoded as C2 92. C2 in ISO 8859-1 is “Â”. U+0092 is the control code Private Use 2 in Unicode, and byte 92 is the same control code in ISO 8859-1. However, the standard Western Windows code page 1252 extends ISO 8859-1 by assigning “’” (right single quotation mark) to 92. HTML5/WHATWG requires an ISO 8859-1 charset declaration to be interpreted as Windows-1252 (https://blog.whatwg.org/the-road-to-html-5-character-encodin...), hence the displayed result is “Â’”.

The original Windows-1252 content must previously have been converted to UTF-8 under the assumption that the source was ISO 8859-1, i.e. mapping 92 to U+0092 (Private Use 2) instead of to U+2019 (Right Single Quotation Mark). The resulting UTF-8 bytes were then placed in the web page, which is nevertheless declared as ISO 8859-1.
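The chain of events described above can be reproduced with a rough Python sketch. The sample word and the variable names are purely illustrative, and the three steps are an assumption about how the page was produced, not a confirmed account:

```python
# Illustrative sample: "they're" with a Windows-1252 right single quote (byte 0x92).
original_bytes = b"they\x92re"

# Step 1 (assumed mistake): the bytes are read as ISO 8859-1, so 0x92
# becomes the control character U+0092 instead of U+2019.
misdecoded = original_bytes.decode("latin-1")

# Step 2: that text is encoded as UTF-8; U+0092 becomes the bytes C2 92.
utf8_bytes = misdecoded.encode("utf-8")

# Step 3: the browser treats the ISO 8859-1 declaration as Windows-1252,
# so C2 is shown as "Â" and 92 as the right single quotation mark.
displayed = utf8_bytes.decode("cp1252")
print(displayed)

# What a correct conversion would have produced: E2 80 99 for U+2019.
correct_utf8 = original_bytes.decode("cp1252").encode("utf-8")

# Undoing the damage by reversing each step in turn.
repaired = (
    displayed.encode("cp1252")   # back to the bytes served by the page
    .decode("utf-8")             # U+0092 again
    .encode("latin-1")           # back to the original byte 0x92
    .decode("cp1252")            # finally read with the right code page
)
print(repaired)
```

The repair only works here because every byte involved has a Windows-1252 mapping; text containing the few bytes that code page leaves undefined would need extra handling.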
netsharc an hour ago
They're probably Microsoft's "Smart Quotes", which are Unicode. They were presumably stored in UTF-8 but retrieved as ASCII (or ISO-8859-1).
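For comparison, here is a minimal Python sketch of what that scenario would look like, assuming the quote was stored as proper UTF-8 for U+2019 and then read back as Windows-1252 (which is how browsers treat an ISO-8859-1 declaration). The variable names are illustrative only:

```python
# A right single quotation mark (U+2019), stored correctly as UTF-8.
quote = "\u2019"
utf8_bytes = quote.encode("utf-8")      # three bytes: E2 80 99

# Reading those bytes back with the wrong code page gives one
# character per byte instead of a single quotation mark.
as_cp1252 = utf8_bytes.decode("cp1252")
print(as_cp1252)
print(len(as_cp1252))
```

In this case the quote itself is replaced by three characters, so the result differs from a correctly rendered apostrophe with one extra character in front of it.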