TazeTSchnitzel 2 hours ago

> The specification must contain a non-ambiguous formal grammar that can be parsed easily. A page can then be tested against the standard and reject or accept as compliant. Pages that don't conform with the specification won't be rendered. It is explicitly forbidden for clients to accept any page that doesn't conform with the specification.

This is what XHTML was, and it was a complete disaster. There's a reason almost nobody serves XHTML with the application/xhtml+xml MIME type, and that reason is that getting a “parser error” (this is what browsers still do! try it!) is always worse than getting a page that 99% works.[0] I strongly believe that rejecting the robustness principle is a fatal mistake for a web-replacement project. The fact that horribly broken old sites can stay online and stay readable is a huge part of the web's value. Without that, it's not really “the web”, spiritually or otherwise.

[0] It's particularly “cool” how they simply do not work in the Internet Archive's Wayback Machine. The page can be retrieved, but nobody can read it.
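
You can reproduce the draconian behavior outside a browser too. A minimal sketch in Python, with the stdlib's strict XML parser standing in for an XHTML UA (the snippet is made up):

    import xml.etree.ElementTree as ET

    # One unclosed tag is enough for a strict XML parser to reject the
    # whole document, which is what browsers do for application/xhtml+xml.
    broken = "<html xmlns='http://www.w3.org/1999/xhtml'><body><p>hi<br></body></html>"
    try:
        ET.fromstring(broken)
    except ET.ParseError as e:
        print("parser error:", e)  # mismatched tag: line 1, column ...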

fooqux 2 hours ago | parent | next [-]

Agreed. There may be some situations where I want to ensure 100% correctness: life-or-death scenarios, say (which should probably use a different protocol anyway). However, checking the sports score or looking at cat memes isn't that.

singpolyma3 2 hours ago | parent | prev | next [-]

To be fair, HTML5 also has a defined parsing algorithm. It just happens to accept any input and always produce a webpage.
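
For instance, with html5lib (a Python implementation of the WHATWG parsing algorithm; the input string is made up):

    import html5lib  # pip install html5lib

    # The same kind of malformed input that a strict XML parser rejects:
    tree = html5lib.parse("<p>hello <b>world")
    # No exception is possible here: the spec defines a recovery path for
    # every error state, so any byte stream deterministically yields one tree.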

jerf an hour ago | parent | next [-]

Yes, this is what you'd want. It doesn't have to be as complicated as the HTML5 algorithm, either. That's complicated because it was a harmonization of at least three browsers' multi-decade heuristics and untold terabytes of existing HTML practice. An algorithm unconcerned with backwards compatibility could be much simpler, yet still clearly define error behavior that's much easier to work with than "scream and die".

And it's still unambiguous. You can cringe at what some people do, but that would be strictly a taste issue rather than a technical one, since the parse itself remains unambiguous. And if you think you can fix taste issues with a technical specification, well, you've already lost anyhow.

stavros an hour ago | parent | prev [-]

I think the GP has an issue not with the specification part, but with the part where it's forbidden for clients to render a noncompliant page.

rodarima 37 minutes ago | parent | prev | next [-]

Author here. I agree that you cannot go from HTML to XHTML because users and UA devs will always go towards "it mostly works".

However, it's not so clear to me that this can't be done from the start, so that expectations are set correctly from the beginning. For example, I don't see the same problem in other formats like JPEG or PNG, where you expect the image either to work perfectly or to fail with a decoding error.

Other than implementing it and seeing how it goes, can you propose a feasible experiment to show how a new strict spec would measurably fail?

htmlenjoyye 18 minutes ago | parent [-]

browsers will display invalid/corrupt images (best effort)

tried it just now - took a PNG and a JPEG, opened them in a text editor, literally deleted the second half of each file, saved, and dragged them into both Firefox and Chrome - they were displayed instead of erroring out.
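
you can reproduce the same experiment with Pillow - strictness is literally one flag (the file name is hypothetical):

    from PIL import Image, ImageFile  # pip install Pillow

    # Default Pillow is strict: a truncated file raises OSError on load().
    # This one flag opts into browser-style best-effort decoding instead.
    ImageFile.LOAD_TRUNCATED_IMAGES = True
    img = Image.open("half_of_a.png")  # hypothetical truncated file
    img.load()  # decodes whatever rows survived instead of raising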

there is a classic article on why a minimal version of the web with features removed will fail - you removed the 80% of features that YOU think are not important. that's a classic fatal mistake

search the web for different proposals for a minimal web and you will understand - each will have removed some feature its author thinks is bloat but which you kept in your proposal because you consider it critical. that is why you created a new proposal - their minimal proposal is not the right one for you.

https://www.joelonsoftware.com/2001/03/23/strategy-letter-iv...

maxerickson 2 hours ago | parent | prev | next [-]

No scripting is a tell: it's about wanting other people to accommodate their concerns about running a complex browser, not about solving a real problem.

If it did somehow happen that a good deal of interesting content was published using the standard, the most popular client would probably be nonconforming, ignoring the rule to not render ambiguous content.

krapp an hour ago | parent [-]

Every modern alternative web protocol is about accommodating the author's concerns and pet peeves about the modern web (and usually gatekeeping it from capitalists and normies.)

Protocols used to be limited by technology, now they're defined by ideology.

TFNA 2 hours ago | parent | prev [-]

XHTML failed in an era when writers (even normies) were writing some HTML of their own, and they couldn't be trusted to close their tags properly. XHTML also assumed writers would be personally invested in semantic markup, like distinguishing e.g. the italics of book titles from the italics of emphasis.

Today, when writers use visual editors (or Markdown), few write their own HTML any more. A web standard requiring compliance would fare differently now.

PaulHoule 2 hours ago | parent | next [-]

Markdown sux and so do visual editors. I think visual editors were just invented to make it so cut-and-paste never quite works right. There's been some conceptual problem with the whole idea ever since MS Word, and the industry has never dealt with it.

intrasight 2 hours ago | parent | prev [-]

> XHTML failed in an era when writers (even normies) were writing some HTML of their own

I'd say it was a minority of writers that were handcrafting XHTML. And everyone, whether handcrafting or using tools, could validate their compliance using a browser, which made it very easy to adjust your tools or your handcrafted code. We are now in a situation where there is no schema for HTML.

I, for one, am very much in favor of forking the web with a document format with a schema. It really seems like a small and simple change to me.
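
To make it concrete, here's roughly what a schema buys you, sketched with lxml (the schema and page names are hypothetical):

    from lxml import etree  # pip install lxml

    # With a real schema, validation is mechanical, and failure is a hard,
    # machine-readable error rather than silent best-effort rendering.
    schema = etree.XMLSchema(etree.parse("docformat.xsd"))
    doc = etree.parse("page.xml")
    schema.assertValid(doc)  # raises DocumentInvalid, with line numbers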

TFNA 2 hours ago | parent | next [-]

Note that when I say "writing their own HTML", I don't mean handcrafting a whole webpage. I mean that people were writing i or b tags in their Wordpress editors or in online comment boxes, because back then such text fields did not have visual editors and would accept raw tags. Under XHTML, if the writer did not close tags properly, such input would have broken the whole page, so obviously back then such a standard was DOA.

singpolyma3 2 hours ago | parent [-]

Those cases were easy to fix by using e.g. htmltidy on the UGC.
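
A rough sketch with the pytidylib binding (the markup is made up):

    from tidylib import tidy_fragment  # pip install pytidylib (needs HTML Tidy)

    # Balance the user's tags before the fragment ever reaches the page.
    ugc = "this is <b>bold and <i>important"
    fixed, errors = tidy_fragment(ugc, options={"output-xhtml": 1})
    # 'fixed' now has properly closed tags; 'errors' lists what tidy repaired.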

Honestly, I don't think it was killed by any one thing. No platform really cared; it wasn't a win for anyone, and it was occasionally a loss.
