
I still maintain that the "living standard" is an oxymoron. It's a collaborative browser dev document. Don't get me wrong, that's great. However for everyone else an unversioned document, any part of which can change at any moment, is not what's usually thought of as a standard.


This obsession with "living" and constant change seems to be mostly confined to the web --- instead of settling on a spec and then leaving it alone and "doing what you can with what you have", those working on this stuff seem more inclined to keep making browsers change.

I suspect at least part of the reason is to build a high barrier to entry and preserve the monopoly, keeping out competitors, given who the people in these groups work for.

My personal opinion on this is to stop feeding the monopoly and refuse to use anything other than basic HTML for static content sites.


> My personal opinion on this is to stop feeding the monopoly and refuse to use anything other than basic HTML for static content sites.

That's not going to work for one simple reason: despite HN's obsession with plain HTML/CSS, these new standards _are_ actually useful. They're being created with the express purpose of solving practical problems that developers, site operators, and users are experiencing in the real world. Those stakeholders aren't going to just ignore a practical solution to their problem over some esoteric concerns about "feeding the monopoly".

I share your concerns about the proliferation of web standards making it difficult for browser vendors to compete, but if the only alternative you can offer is stagnation I think it's completely unsurprising that your concerns go unheeded.


This times a million.

And for anyone who needs some bona fide examples of new standards and updates that are useful from the WHATWG specifications, well how about:

New semantic elements like header, footer, section, article, nav, aside, main, etc. These are far better for making logically structured pages than a ton of divs with class names would be.
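As a quick illustrative sketch (the layout itself is made up):

```html
<!-- Structure a browser, screen reader, or reader mode can actually understand -->
<body>
  <header>
    <nav><a href="/">Home</a> <a href="/archive">Archive</a></nav>
  </header>
  <main>
    <article>
      <h1>Post title</h1>
      <section><p>Body of the post.</p></section>
      <aside><p>Related links.</p></aside>
    </article>
  </main>
  <footer><p>Site footer.</p></footer>
</body>
```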

The various new input types and attributes. Now you can have input fields which validate email addresses and phone numbers, present the right keyboard for the type of input you want, do various other validation checks or even provide things like a nifty date picker without JavaScript.
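For instance, all of this works without a line of JavaScript (the attribute choices are illustrative):

```html
<form>
  <!-- Rejects non-addresses on submit; mobile keyboards show the @ key -->
  <input type="email" required>
  <!-- Telephone-style keyboard, plus a custom pattern check -->
  <input type="tel" pattern="[0-9 +()-]*">
  <!-- Native date picker where the browser supports one -->
  <input type="date">
  <button>Submit</button>
</form>
```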

Picture and srcset too. No more having to load giant images on mobile devices or have blurry ones on devices with retina support, you can choose which image displays on which type of device.
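A sketch of what that looks like (file names invented):

```html
<!-- The browser picks the smallest adequate source for the viewport and pixel density -->
<picture>
  <source media="(max-width: 600px)" srcset="small.jpg 1x, small@2x.jpg 2x">
  <img src="large.jpg" srcset="large.jpg 1x, large@2x.jpg 2x" alt="A responsive photo">
</picture>
```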

The preload attribute and what not. Being able to load content the user will likely need in the background is helpful.
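(Strictly speaking it's the rel="preload" link relation rather than an attribute; the URLs below are invented:)

```html
<!-- Fetch a critical font early, warm up DNS for a CDN, and prefetch a likely next page -->
<link rel="preload" href="/fonts/body.woff2" as="font" type="font/woff2" crossorigin>
<link rel="dns-prefetch" href="https://cdn.example.com">
<link rel="prefetch" href="/next-page.html">
```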

The built in video and audio elements, obviously. Again, object was a terrible solution for this: unsemantic, awkward to use, and flaky in certain browsers. These elements aren't.

The canvas tag and things you can do with it.

And this will be a controversial one, but... most of the old elements that were made official by 'paving the cowpaths' were nice to see added too. The whole fiasco involving embed/object in the old days, and how supporting multiple browsers meant invalid HTML, was a bit of a farce really, especially as neither was particularly 'semantic' to begin with. Seeing them both made official and better alternatives provided just puts a long-standing issue to rest.


if only all of those things were fully fleshed out in the various browsers... for instance, it's disheartening to realize that it's been 15 years already and useful form elements like date(-time), phone, and email still don't have reliable and complete cross-browser behaviors, validation (I know it's hard, but still), and styling/event hooks.


True, support still isn't perfect. The calendar/date picker stuff is especially annoying here, since it really should be easier to customise and more similar cross browser than it currently is.

But most of the things mentioned do work in more browsers than not, and you can use them without worrying the site will break in almost all cases. The new semantic elements will never have issues (unless you're trying to target IE8 or below), the validation works fine in pretty much all modern browsers (as do simple styles for it), and support for the picture element and srcset is pretty good too.

The likelihood of anyone using the Blackberry browser or IE mobile is small enough to ignore.


> these new standards _are_ actually useful

I don't necessarily disagree, but it would help your case if you included some examples in your comment (if only to build a discussion upon).


The Web Authentication standard [0] seems super useful and something we really need on the web.

[0] https://developer.mozilla.org/en-US/docs/Web/API/Web_Authent...


Yeah, this is pretty great. It also comes from the W3C, not the WHATWG.

(It might be a bit lost in the hierarchy of the thread by now, but the original comment was about the WHATWG taking over and monopolising the normal, considered, and democratic standardisation process of the W3C with their HTML5 "living" spec.)


W3C was a good fit for WebAuthn because the W3C is a body for corporations and by its nature WebAuthn is built and primarily implemented by corporations.

Not a criticism, by the way, sometimes that's just the right fit.


Huh? The WHATWG steering committee is Apple, Mozilla, Google, and Microsoft. WHATWG's legitimacy (such as it is) doesn't come from being non-corporate, it comes from being entirely controlled by the top corporate browser vendors and thus reflecting the de facto state of affairs regarding how and when and why features are defined and implemented. If the WHATWG disappeared nothing would change because the top vendors are going to pick & choose W3C standards, anyhow.


Sorry, I was thinking out loud; I was contrasting the other obvious place to standardise this, the distinctly non-corporate IETF. WHATWG makes no sense for something like WebAuthn.


You have to take the mindset that "giant bundles of JavaScript" aren't how we want to build the web, and that instead we should ship more and more of the web baked into browsers. Even "static HTML" requires accessibility improvements and adaptations to new device types. For example, split-screen phones, where a mobile device starts at one size but can then open up to a larger size.

Then there are tiny watch apps, new camera APIs, responsive @media queries, better support for print media, less onscroll jank through the use of IntersectionObserver, new authentication APIs to support Windows Hello, advancements to link tags for prefetching resources and DNS, CORS security enhancements, discussion about proprietary browser implementations of features so they're not proprietary, using origin trials (Chrome) or the Develop menu (Safari) or preferences (Firefox) to encourage web developers to try new features on their sites and report back on how they work (origin trials make this trivial), non-vendors such as Intel and Facebook building new features for the web, and... well, it's hard to summarize how many advancements there have been in web standards since HTML5 became a thing.

I'd also point to a useful parallel in how ECMAScript versions its spec: it too is in a constantly evolving state as new proposals move between stages. To me, this reflects how standards bodies can now use git much more easily and effectively as a versioning mechanism, and how they've learned to use the best of open source instead of locking up versions behind paywalls and private interest groups. Literally anyone can contribute to these specifications; see https://www.youtube.com/watch?v=y1TEPQz02iU
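To make the IntersectionObserver point concrete, here's the classic lazy-image pattern it replaces onscroll handlers for (a browser-only sketch; the class names and file name are illustrative):

```html
<img data-src="photo.jpg" class="lazy" alt="Lazily loaded photo">
<script>
  // Fire a callback when elements approach the viewport, instead of
  // recomputing positions inside a janky onscroll handler
  const io = new IntersectionObserver((entries) => {
    for (const entry of entries) {
      if (entry.isIntersecting) {
        entry.target.src = entry.target.dataset.src; // start the real load
        io.unobserve(entry.target);                  // each image only fires once
      }
    }
  }, { rootMargin: "200px" }); // begin loading 200px before it scrolls into view
  document.querySelectorAll("img.lazy").forEach((img) => io.observe(img));
</script>
```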


> You have to take the mindset that "giant bundles of JavaScript" aren't how we want to build the web but to begin to ship more and more of the web baked into browsers instead.

"giant bundles of JavaScript" weren't how we shipped the web before the HTML5 process began. The solution to "we're loading too much overengineered bloated redundant logic over HTTP" isn't "lets bundle all of that logic", it's "stop overengineering simple webpages".

> even "static HTML" requires accessibility improvements and adaptations to new device types

Accessibility standards predate and are still external to HTML5, and have actual fixed specs. They're relatively simple to implement from a parser perspective, and individual tools implementing functionality on top of that are ancillary to mainstream browsers. The only exception here is support for things like MSAA, etc. and that's also completely separate to HTML5 et al.

You go on to list a bunch of CSS stuff (again, some pre-dating HTML5, none coming from WHATWG) and proprietary browser settings (not a part of any spec, living or static).

> better support for print media

Now this would be great to see. However this is something we definitely have not gotten since the advent of HTML5. What are you referring to here?

CORS pre-dates HTML5 (was later subsumed into HTML5), but the newer auth APIs you mention are nice; I'll give you that.

Things like IntersectionObserver however are one perfect example of extra feature-creep in HTML5, building hacks on top of an ecosystem overrun with overengineering. You should not need IntersectionObserver to get basic, usable performant scrolling. If you feel you do, you've overengineered your webapp: try actually addressing your performance bottlenecks.


I was speaking more generally about live standards, but if you want to focus specifically on the HTML spec and notable changes for newer devices and accessibility:

https://github.com/whatwg/html/commits

Let's see. Right off the top we've got the inputmode attribute, which helps support different keyboards on plain text input controls, and enterkeyhint, which lets you pick from a list of options for the enter key on virtual keyboards; both of these changes improve the usability and accessibility of the keyboard on newer touch-screen devices.

Form-associated custom elements let you create your own HTML elements that can participate in forms (the goal of custom elements is to help you build your own custom LEGO pieces instead of relying on the basic blocks the HTML spec includes; one could think of this as an eventual replacement of JSX components with HTML-based ones).

Then there's autocomplete=one-time-code (thanks, iOS!), more granular control over file downloads, updating the spec to match reality (what browsers actually implemented for compatibility vs. what was written in advance), srcset for retina (which reminds me: lazy loading images via a simple img tag attribute), WeakMap/WeakSet to help reduce memory leaks from stale DOM node references, requestAnimationFrame and other enhancements to the page lifecycle, CSP headers and related HTML attribute changes, and, well, the list doesn't end. ;-)

For accessibility there's the inert attribute, which prevents focus on child content; that's great if you need more flexibility than the dialog element provides by default but don't want to get into setting tabindex manually or controlling all focus with JS, etc.
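A few of those in markup form (the values are illustrative):

```html
<!-- Keyboard hints on a plain text control -->
<input type="text" inputmode="numeric" enterkeyhint="search">

<!-- One-time-code autofill from SMS -->
<input type="text" autocomplete="one-time-code">

<!-- While the dialog is open, everything in main is unfocusable and unclickable -->
<main inert>
  <p>Page content</p>
</main>
<dialog open>Confirm?</dialog>
```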


I really really don't mean to be dismissive here, but... you know the WHATWG largely emerged as a backlash against the W3C's efforts to create an extensible standard allowing authors to create custom elements (what's more, they would be namespaced with actual schemas, a la React prop-types or, later, TypeScript interfaces), with extensible forms (XForms). This is a process that started in the late 90s! Two decades later, and there are people on the internet claiming the WHATWG are doing something novel reinventing this concept now.

WeakMap is ECMA. CSP was W3C and implemented by Firefox 4 years before HTML5 was released as a spec.

There's good stuff in HTML5 (there would want to be in a spec that size!), but it's remarkable how many of the cited examples have nothing to do with it.


It's important to note that nobody's being forced to use custom components or new syntax by specification, though as deprecations occur it's possible, based on usage statistics, that services like Google or new browser security or performance improvements could affect your site. And speaking to XML for a sec: while XHTML's backwards-compatibility problems could have been avoided by specifying some kind of graceful fallback instead of Firefox's red text on yellow (yikes!), the real point is that when specs are created now, there's more of an attempt to "see what sticks" than there used to be. It's funny, because E4X was also a failure, but JSX is so popular today. Go figure.

Re. specific examples mentioned:

CSP is still evolving today and HTML has to keep up -- https://github.com/whatwg/html/search?o=desc&q=CSP&s=author-...

With WeakRef, things still need to stay up-to-date: https://github.com/whatwg/html/pull/4571

I would say it's remarkable how many standards on the web apparently don't have anything to do with HTML.


Tiny off-topic side-note on E4X: it was extremely popular in its time... for extension authors, GM scripters, XULers, anyone who had the freedom to use it without concern for cross-browser compat. I think its failure was either one of standardisation bureaucracy, or of odd cross-client resistance to implementation, rather than it not being tech people wanted.

> I would say it's remarkable how many standards on the web apparently don't have anything to do with HTML.

I took the thrust of the original comment I responded to above to be giving WHATWG and general living/rolling-spec. process credit for increasing the pace of useful/practical/needed web innovations. I was pointing out that many of the cited examples of useful innovations were created either before WHATWG existed, or at least outside of WHATWG process, and that the majority (admittedly not all) of what the WHATWG has actually contributed has been superfluous cruft. That's the intent of my separating "this is part of HTML5, this isn't". Obviously many things are/can be subsumed into HTML5 as that's where they belong taxonomically, but I'm focusing on inception and what benefit living-standard process brings.


More probably, it's just that there is a lot of money to be made in taking total control of the user experience. Up till now, we had to deal with this pesky OS, those package dependencies, long release cycles, and complicated installation processes. But with the Web, you can inject your code almost in real time onto the user's machine; they don't have to understand anything, they don't have to wait, and you don't have to care about what's under the hood. In fact, the user doesn't even know or accept that they're running software.

Now, the browser is pretty limited, sandboxed and all that, so the companies making bank with it want to widen what they can do with it, again and again, until the browser becomes an OS under their control, and the user's machine just a terminal to access it.


> I suspect at least part of the reason is to build a high barrier to entry and preserve the monopoly, keeping out competitors, given who the people in these groups work for.

A lot of what happens is: someone like Chrome implements something not yet standardized, then it gets standardized in a different way; or Mozilla implements something and Chrome implements it differently from the standard, perhaps to try and reach the same goal without a care for the yet-to-be-finalized standard.

Of course Embrace, Extend, Extinguish is something I think of mostly in regards to Google, but I'd rather not think they're completely and utterly sinister in their goals. Some parts of Google are bad, just like some parts of Microsoft (sadly) are bad.

Go is a good example of the good parts of Google. Sure the core team works for Google, but they make all the decisions, not Google.

VS Code is another good example in regards to Microsoft, and soon enough GitHub (which I suspect is implementing a lot of ideas they couldn't afford to work on prior to the acquisition).


> Sure the core team works for Google, but they make all the decisions, not Google.

If the core team works for Google then Google is making all the decisions.

A company is the people working there making the decisions. A company can't make decisions without the people there doing so.


Maybe they are saying that the senior leadership at Google is evil, but the developers in the trenches are more benign in their intentions.


AMP, WebUSB, WebBluetooth all indicate otherwise.


What are they doing to harm WebUSB and WebBluetooth?


They're not doing anything to "harm" them; they created both of them, and both specs basically include a caveat that says "this could be misused, but YOLO".


So you're against the standards entirely? What is the alternative that you favor?

If you are making a web app (like my company does) that needs to print to a label printer how should we do that besides switching to a native app that all of our clients need to install and maintain? WebUSB gives us an option of doing it in a seamless and maintainable way.
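For what it's worth, the happy path looks roughly like this (everything here is hypothetical: the vendor ID, configuration/interface/endpoint numbers, and the ZPL payload depend entirely on the printer, and WebUSB itself is a draft spec shipped only in Chromium-based browsers):

```html
<script>
  // Hypothetical sketch of printing a label over WebUSB; not production code
  async function printLabel(zpl) {
    const device = await navigator.usb.requestDevice({
      filters: [{ vendorId: 0x0a5f }], // match your printer's actual vendor ID
    });
    await device.open();
    await device.selectConfiguration(1); // configuration, interface, and endpoint
    await device.claimInterface(0);      // numbers vary per device
    await device.transferOut(1, new TextEncoder().encode(zpl));
    await device.close();
  }
  // e.g. printLabel("^XA^FO50,50^ADN,36,20^FDHello^FS^XZ"); // ZPL, if it's a Zebra
</script>
```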


> If you are making a web app (like my company does) that needs to print to a label printer how should we do that besides switching to a native app that all of our clients need to install and maintain?

So rather than just doing what works and shipping a native app, you instead rely on a feature that's currently in draft status and works in one browser... unless Google disables it again, like they did last year?

Sounds like a winning strategy my man.

Edit: to answer your original question > So you're against the standards entirely?

Yes. Literally the only reason either exists is that without them Google's theory of "everything can be in a browser" falls flat on its face, and when things are in a browser Google has a good chance of controlling the conversation (the same way Microsoft controlled the conversation in the 90s and early 2000s with Windows).

Both specs have glaring security issues that they themselves point out, but then provide no actual solution or workaround for. They may as well start each one with "this could lead to compromise of your device... life's a lottery, be lucky!"


We already have a native app but we are transitioning the majority of our core app to the web. Printing labels is one of the tricky parts that will be hard to transition. We are currently using an electron shell which makes maintenance easier but it still requires that end users install something which they might not have administrative rights to do. With WebUSB we expect that our entire app will be on the web and fully usable both on and offline without much maintenance fuss for the thousands of clients we have.


[flagged]


We are not a label printing business. I know I haven't really told you what we do but it's just a tiny, minor thing that goes along with the huge suite of software we make. Building on the web platform has huge advantages over native apps and we do know what we are doing.

WebUSB is just one piece of tech that helps wrap up some of the edge cases.


Yes, it's about the people! The point is that Google isn't a monolith, decision-making happens at all levels, and people working there have different, often conflicting opinions and are not necessarily thinking strategically like a CEO. I mean, sometimes they are, but without reading internal design docs that we don't have access to, you can't get good insight into their motivations.

This means that decision-making is less consistent or deterministic than many people pretend. Things happen because someone advocated for it and other people gave in. The outcomes of internal political decision-making are not necessarily predictable. Speaking of "Google" making a decision is usually misleading.


> A lot of what happens in someone like Chrome implements something not yet standardized, then it gets standardized in a different way

This. The Chrome team is hopelessly optimistic when making assumptions about how standards evolve. See Web Bluetooth, which was heavily advertised to developers before other browsers eventually chimed in and said, "yeah, we're not doing that."

Still turned on by default in modern Chrome without a flag, probably because the Chrome team is still assuming that eventually the API will be standardized to Chrome's specification.


All the browser engines and most browsers are open source, so making them harder to implement does not "feed the monopoly".


The fact that the source is open doesn't say anything about difficulty of implementation; in fact, being open-source probably increases the monopolising effect, since now people will be more inclined to think "that implementation is open-source, I'll just use/contribute to it instead of starting another independent one."


It might create a monopoly of browser engines, but not a monopoly of browser vendors, which is the harmful part (of course we might be heading to a browser vendor monopoly anyway, but that's due to Google abusing their market position, not the complexity of HTML).


That's a bit of an oversimplification.

The Electronic Frontier Foundation resigned from the W3C for a reason.


What reason?


Probably the introduction of DRM so Hollywood would start using video tags instead of flash/silverlight/other plugins. It's a pragmatic approach, but right now it's incredibly limiting for anyone who wants to make their own browser that can load Netflix.com, for example. The only folks DRM punishes are those trying to do the right thing; everybody else can find workarounds if they need to.


>I suspect at least part of the reason is to build a high barrier to entry and preserve the monopoly

That isn't very logical. New specs like Grid are introduced because they're insanely useful; not because browser vendors just want to make the spec more complex and hard to implement.


> I suspect at least part of the reason is to build a high barrier to entry and preserve the monopoly

That is far too grand a motive. The reality is that the average web developer has a shelf life of about 5 years or is primarily focused elsewhere, such as on Java or C#. That said, consider the people who do this work with 100% focus. These people are typically not the same developers who are performing graduate-level statistical analysis, building artificial intelligence, or working on quantum computing. The hiring expectations for front-end developers in the corporate world tend to skew much lower, and the historical average salaries for these positions reflect that. Taking this into account, I suspect the real reason for the constant churn is that many of these developers want things to be easier, everything else be damned.


You are already downvoted into oblivion and rightfully so, but I just wanted to add my perspective as a backend developer (and the one who did post-graduate level mathematical stuff at that) who now manages a team of back and front end developers and plays with front end development for toy projects. The sheer amount of complexity and required knowledge for front end development is simply baffling to me. These folks have to know such disparate technologies as CSS, HTML, Javascript as a bare minimum and it never stops there. Typescript, templating engines, CSS preprocessors, huge and continuously evolving javascript ecosystem, evented concurrency (please take me back to my threads and semaphores), asynchronous state management in complex UIs with non-trivial interdependencies... Compared to this stochastic calculus is a breeze.


Totally agree. Also: I'm strongly against calling people "Front-End Developer" or anything like that at all. Where does the front end start? Where does it end? What is full-stack; does it imply you can do embedded? I've seen many listings that count PHP as one of the front-end requirements. This is just a term that's supposed to make HR's job easier, and it does even that badly.


We web devs actually have pretty solid definitions for "front-end," "back-end," and "full-stack." The back end deals with work on the server done after an incoming HTTP request comes in and involves preparing the HTTP response, including preparing the outgoing data for the templating layer if applicable (eg, when doing web page responses rather than JSON, XML, or binary responses). The front end deals with building templates such that that data becomes a standard web page, using CSS to style that page, and using JavaScript to implement client-side interactivity. A full-stack developer will be constantly switching between work from both sides of this divide rather than primarily focusing on one or the other.

If a job listing for a front-end developer is expecting applicants to know PHP beyond a "writing templates" level, it's either a poorly-written job listing or its creator has unrealistic expectations of its applicants - sadly, neither case is very uncommon in this industry.

At any rate, someone calling themselves a full-stack developer isn't implying they can do embedded, as that sort of stuff is pretty far afield of web development.


>The sheer amount of complexity and required knowledge for front end development is simply baffling to me.

And none of that complexity is essential to the task, which by itself is a testament to the quality of the “web platform”.


>The sheer amount of complexity and required knowledge for front end development is simply baffling to me

99% of it is self-inflicted, though. You didn't need react/redux/sagas/uber-popular framework 7.5 to do your job, but you and your coworkers thought "This is the cool new toy that Facebook made!!!! Let's use it!!!" When 90% of web sites are reinventing a wheel made in 1994, there's no reason to actually do most of this junk.


Bullshit. More and more front-end developers are building applications. These projects aren't in any way equivalent to a 1994 website.


I agree that you don’t need any of that madness. It is self-imposed pain, by developers wanting easy over simple and workarounds over original code.

See the framework fallacy https://news.ycombinator.com/item?id=20014888


[flagged]


Having a low opinion of yourself doesn't make you right.


Why are people so hyper sensitive about this? What mortal wound does this subject expose?


This is the most arrogant and condescending comment I think I've ever read on HN. You spend so many words dancing around it; just call web developers stupid and lazy instead of wasting so much space.

It's not good enough for me to be an arrogant techno-mage looking down on those stupid commoners from my technical tower; I must now also shit on all the other lower developers I have judged to be not as smart as me, some other class of developer.

Fuck off.


Also, I will say, some of the worst engineering I have ever seen has come from these allegedly superior engineers who work in AI and statistical analysis.

Maybe a lack of self-reflection, or hubris, is the issue?


And yet, I can build world scale cloud systems but can’t build a web front end anymore to save my life.

I’m blown away by the sheer complexity that has evolved in frontend code. Developers there have such a broad toolset and arcane knowledge about how you really need to combine two CSS traits on surrounding containers to get the positioning you want.

Unfortunately, in our industry, most engineers think the layer above theirs is a trivial composition of parts


And Rube Goldberg machines are engineering marvels, but we shouldn't laud the engineer who builds one to transport a part from here to there when a "simple" conveyor belt does the job just fine.

How much of that complexity was necessary? How much of it was chosen to boost someone's CV?


I mean, a lot of the complexity in the front end has arisen from the business's need to provide more feature-full experiences, not because some developer thought he could pad his CV. Developers aren't developing complicated front ends to pad CVs; they are developing complicated front ends because the business needs them to solve certain problems.


I agree that the frontend landscape is too complex, but much of that is self-imposed. Most developers on the client side refuse to string two lines of code together without a framework or large abstraction library. Offers to reduce complexity, such as removing unnecessary decoration in the code or returning to a standards-driven approach, are often met with hostility. Much of this complexity is the result of striving for comfort and easiness. Simplicity isn’t easy.

As an example consider this recent comment that was upvoted 11 times and some of the responses to it: https://news.ycombinator.com/item?id=20021708


Jaded as it is, I don't think your reasoning is valid. If the question is, "Why suspect an organization of coordinating with another to establish a dominant position in manufacturing or standardization in order to create a barrier to entry/monopoly for potential competitors?", I'm pretty sure the answer has nothing to do with web developers, even if that's the subject at hand in the context of this question.


My reasoning is primarily a consideration of where pressure and redirection of web standards originates. For example consider how classes landed in ES6. The maintainers of the standard didn’t want classes, but hands down they were the single most requested feature of that specification.
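For context, the class syntax that landed is (mostly) syntactic sugar over the prototype model the language already had; a minimal sketch:

```javascript
// ES6 class syntax...
class Point {
  constructor(x, y) { this.x = x; this.y = y; }
  norm() { return Math.hypot(this.x, this.y); }
}

// ...and the same behaviour written the pre-ES6 way
function PointOld(x, y) { this.x = x; this.y = y; }
PointOld.prototype.norm = function () { return Math.hypot(this.x, this.y); };

console.log(new Point(3, 4).norm());    // 5
console.log(new PointOld(3, 4).norm()); // 5
console.log(typeof Point);              // "function": a class is still a function
```

The differences TC39 did add (class declarations aren't hoisted, class bodies run in strict mode, class constructors can't be called without new) are deliberate, but the underlying object model is unchanged.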


"Standard" = consensus definition of desired behaviour. "Living" = open to change.

I don't see what is so oxymoronic about this? Just because a consensus has been reached doesn't mean that consensus must be immutable.

Just because the process of standardisation in some technical domains has tended to use versioned documents doesn't mean that such an approach is fundamental to standardisation.


The point is that HTML is almost 30 years old, and based on SGML which is much older (even though ISO 8879 is officially "only" from 1986), where SGML is just a formalization of typesetting practices established in the 1960's and 1970's. Given the depth of usage of HTML in everyday life (laws, contracts in ecommerce, medical records, personal communication, education, etc., etc.), I think HTML deserves better than being tinkered with all the time for no good reasons other than job security and/or achieving Webkit dominance, or other nebulous reasons at this point. At the very least, a standard should serve the purpose of defining a wellformedness criterion for a deliverable that you outsource to some HTML author, and that doesn't change all the time.

These are the first sentences of Yuri Rubinski's foreword to The SGML Handbook outlining the purpose of markup languages (from 1990):

> The next five years will see a revolution in computing. Users will no longer have to work at every computer task as if they had no need to share data with all their other computer tasks, they will not need to act as if the computer is simply a replacement for paper, nor will they have to appease computers or software programs that seem to be at war with one another.

Held up to that goal, HTML has utterly failed. And there's just no justification for further complicating HTML, which only leads further away from that goal: by intent never coming to an end on a task that isn't terribly complicated, increasing the scope without any sense of mental discipline, and ignoring the scarce resources available, thereby creating a browser monoculture.


The markup parts of HTML (e.g. the parsing) are pretty frozen and have been for a while. This is the part that can sort of argue it used to be based on SGML (though now it's not, for various reasons).

The "HTML spec" includes a lot of APIs and processing model details that need tweaking as new constraints come up. A good example is that a lot of APIs that involve cross-window access need changes to their specifications in a process-per-origin world; the old spec text assumes everything can always be done synchronously, but that's not actually possible in that world as far as I can tell.

This, and fixing security bugs in the spec that get found from time to time, are far from "no good reasons"!

None of the recent HTML spec changes would have affected "wellformedness" of an existing document in the markup sense. They're mostly about fixing spec bugs (in that the spec doesn't match what web sites expect!) in complicated areas like security, navigation, etc.


> it used to be based on SGML (though now it's not, for various reasons)

HTML doesn't cease to be based on SGML by mere declaration, or even by making it so brittle that it can no longer be parsed by any known formal standard. That's more a political stance, as in claiming American isn't English or some such.

WHATWG's formulation of HTML has deliberately distanced itself from SGML out of ignorance and a desire not to be accountable or formally verifiable (aka move fast and break things) against the established, rich theoretical foundations of markup languages. And it shows: in a paper I published two years ago [1], I show flaws in HTML as described by WHATWG, some of which have since been fixed ([3]). Not only is the concept of "transparent markup" flawed and underspecified, it has also since been used in the definition of the content model for the dl element, and as an unintended consequence the div element as well (cf [2]), flaws that could easily have been avoided by just using SGML for the grammar WHATWG is attempting to describe, when SGML has been around for ages.

It's also not entirely true that WHATWG HTML can parse all legacy docs. For example, the keygen element has been removed, and while that's not a terrible loss as such ;), since keygen is a void element (an element with declared content EMPTY in SGML parlance), its presence in a legacy document (e.g. without an end tag) will make HTML5 parsers fail hard ([4]). It's also completely unclear which version of HTML is being validated by, e.g., W3C's nu or another validator. Heck, even the spec text for WHATWG HTML itself reads

> This file is NOT HTML, it's a proprietary language that is then post-processed into HTML

when a large portion of the spec text portrays HTML in the role of an authoring language.

So tell me why, as an author, I should follow WHATWG's vision for HTML? As you say yourself, WHATWG hasn't advanced HTML the markup language at all, and has rather prevented the evolution of declarative UI features, to the effect of making JavaScript essential for all but the most basic documents.

[1]: http://archive.xmlprague.cz/2017/files/xmlprague-2017-procee...

[2]: https://github.com/w3c/html/issues/1116

[3]: https://github.com/whatwg/html/commit/6e305c457e42276bf275b8...

[4]: Edit: this affects differences introduced in HTML 5.2 vs HTML 5.1, not some distant archaic HTML 4 version


> WHATWG's formulation of HTML has deliberately distanced itself from SGML out of ignorance

I don't think that's a fair characterization at all. Ian Hickson was pretty intimately familiar with the SGML formulation of HTML.

What led to that being dropped was that no browsers implemented it in practice and that trying to implement it as written in the HTML 4 standard actively broke web pages (which had been written against browsers, not against the spec). Firefox tried going down that route for a while; I should know, because I implemented some parts of that and reviewed the code for the implementations of other parts. We eventually had to remove them, because of compat problems with websites. Comment parsing was a perennial favorite there.

> For example, the keygen element has been removed

That is a good point, yes. Arguably it should have been left in as a void element with no behavior to avoid parsing issues of the sort you describe...

> So tell me why, as an author, I should follow WHATWG's vision for HTML?

Honestly, because that's the thing browsers will implement. That's the only reason.


The narrative that "HTML5 looked at what browsers do, rather than following ivory-tower SGML" is simply a myth and not backed up by facts. Ian Hickson introduced sectioning elements, the whole flawed outlining-algorithm idea, and the aside element (presumably to make it easier to tell ads from content for Google's crawler), with lots of controversy at the time. The HTML spec is also chock full of inconsistencies from that time and mindset (for example, the allowed characters in ID attributes, or general lexical rules for elements not matching CSS selectors' idea of an ID or element name) not actually supported by any browser.

Please don't tell me SGML comments were the problem - SGML commenting syntax is straightforward, eg. anything in double-hyphens within a markup declaration is treated as a comment, and there can be multiple comments in a markup declaration (unlike XML). The only problem I see is that there is an interaction with an ancient form of JavaScript comment taking the form of double-hyphens, presumably to make JavaScript commenting uniform with HTML syntax. Now the rules for terminating the content of a script element are dangerously bogus in ancient HTML, but HTML5 has done nothing to fix the situation.

In any case, WHATWG has driven almost all web browsers out of existence already; appealing to what "browser vendors" (Google) actually do is not the solution, but part of the problem, obviously.


I'm not saying HTML5 didn't have its share of architecture astronauting, attempts to add features that didn't pan out, etc. Trust me, I know it did.

I'm not saying it didn't spec various things that didn't match browsers (some just because, some because it tried to reverse-engineer browsers and failed). Well do I know it did; we're still sorting some of those things out. The only defense there is that unlike previous web specs it actually tried to specify this stuff (like navigation!) instead of just saying "yeah, do whatever".

The element name thing was actually needed for compat with how browsers parsed HTML in practice. The ID thing largely affects well-formedness, but was also informed by common practices, and the mismatch with CSS in large part was probably somewhat unavoidable due to differences in reserved chars. For example, there's really no reason, within the context of HTML, to not allow "foo.bar" as an ID, and people were definitely using IDs of this form all over, whereas in CSS you'd need to jump through the "#foo\.bar" hoop to use it in a selector.
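To make the mismatch concrete, here's a minimal sketch in Python of what it takes to turn such an ID into a valid selector. The helper name is hypothetical; in a browser you'd use the real CSS.escape() instead of hand-rolling this:

```python
# Sketch: escape an HTML id for use in a CSS id selector.
# Hypothetical helper for illustration; browsers provide CSS.escape().
import re

def css_id_selector(html_id: str) -> str:
    # Backslash-escape characters that are reserved in CSS identifiers,
    # such as '.', ':', '[' and ']'.
    escaped = re.sub(r'([^a-zA-Z0-9_-])', r'\\\1', html_id)
    return '#' + escaped

print(css_id_selector('foo.bar'))   # -> #foo\.bar
print(css_id_selector('plain-id'))  # -> #plain-id
```

So `id="foo.bar"` is perfectly fine HTML, but selecting it requires the `#foo\.bar` hoop described above.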

SGML comments were definitely _a_ big problem (I'm not sure why you decided to describe them as "the" problem). This sort of markup:

  <!-- This is comment -- This is just in a markup decl -->
    This is still comment, because the '>' is inside the comment the third double-dash started, yes?

was fairly common: people like to use "--" as a replacement for an em-dash, and it often ends up in the middle of comments. Browsers that attempted to implement SGML comment parsing would end up with the "This is still comment" text commented out; other browsers did not.
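For comparison, an HTML5-style tokenizer simply ends the comment at the first "-->", regardless of interior "--" pairs. Python's stdlib html.parser behaves this way, so it can serve as a quick stand-in sketch:

```python
# Sketch: an HTML5-style tokenizer ends a comment at the first "-->",
# ignoring interior "--" pairs that SGML would treat as comment delimiters.
from html.parser import HTMLParser

class CommentCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.comments = []
        self.text = []

    def handle_comment(self, data):
        self.comments.append(data)

    def handle_data(self, data):
        self.text.append(data)

p = CommentCollector()
p.feed('<!-- a comment -- with a stray double-dash --> visible text')
print(p.comments)        # [' a comment -- with a stray double-dash ']
print(''.join(p.text))   # ' visible text'
```

Under the SGML rules sketched above, the stray "--" would instead have toggled comment state and swallowed the following text.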

> presumably to make JavaScript commenting uniform with HTML syntax

No, that was there to enable hiding of <script> tag contents from browsers that didn't know about <script> at all. So you would write:

  <script>
  <!--
    // Your script here
   -->
  </script>
and in a non-script-aware browser you wouldn't have a blob of script text showing up... It's actually a pretty sane approach for the problem of initially introducing the <script> element in a world where it didn't use to exist.

> In any case, WHATWG has driven almost all web browsers out of existence already

I'm not sure the problem here is "WHATWG" per se. I'm pretty sure that if WHATWG had never existed the results would have been pretty similar...


The HTML parsing algorithm looked at what browsers do. WHATWG HTML also included other innovations, some of which didn’t entirely work out. Nowadays, new additions are not added so loosely and there is a better defined working mode and governance policy.


"Governance policy"... yeah, right. Chrome implements stuff, and Moz has to follow suit; then it gets prescribed in WHATWG's spec. OTOH, stuff introduced by Moz that Chrome doesn't implement gets removed from the spec, such as much-needed new elements for basic declarative UIs (menu, menuitem) to fight over-reliance on JavaScript and CSS hacks, introduced by FF but boycotted by Chrome. These were part of the WHATWG snapshot on which W3C HTML 5.1 was based, and removed in W3C HTML 5.2. There's no evidence Hixie analyzed "what browsers were doing". There is, however, evidence that Hixie just made up new elements as he saw fit [1].

[1]: https://www.webdesignerdepot.com/2013/the-harsh-truth-about-...


> That is a good point, yes. Arguably it should have been left in as a void element with no behavior to avoid parsing issues of the sort you describe...

That's what we did, actually :). Parsing behavior was unchanged. Ctrl+F "keygen" in https://html.spec.whatwg.org/multipage/parsing.html.


You do realize that spontaneously editing the spec to drop, then re-introduce, an element, buried in a git commit, is exactly why WHATWG is an unreliable source for a definitive HTML reference, don't you? Especially when it goes unnoticed even by experts such as the GP. At the same time, you want to claim authoritative control over HTML, yet show no sign of respecting other established standards and standards bodies such as ISO/IEC, IETF (e.g. avk's URL "standard"), and W3C?


I'm not sure what you mean by drop and then re-introduce, buried in a Git commit. We made a single commit to remove keygen, after displaying a public deprecation-will-be-removed notice in the spec for some years. Dropping keygen was part of a highly-public pull request, which gathered discussion from all browser vendors, as well as interested community members. The pull request was only merged once all four browser vendors supported removal (2 had already removed by that point).


I should have checked carefully, sorry!


I agree 100%; it's sad to see XHTML go from declarative and well-thought-out components to web UI markup plus a pile of JS.


People often forget two things:

- Virtually none of these changes are breaking, by design. If you prefer the web of 1998, then as a web developer by and large you can pretend that's still the world we live in.

- HTML itself has actually been a very small fraction of the "HTML5" (a silly marketing term) rapid iteration over the past decade. CSS has grown dramatically in power, and JS is hardly even the same language (which is good, because it was barely a real programming language at the beginning). But HTML itself is not dramatically different; most of what it's gotten are a handful of native replacements for things people had been implementing in JavaScript on a regular basis.


"not breaking" is an aspirational goal, not remotely a fact. When browser vendors decide to ship a breaking change (and make no mistake, they do this multiple times a year - probably dozens to hundreds) they have to run live experiments to gather data on how many commonly-visited sites use a feature they're going to change or a quirk they're going to remove. Typically if the value is like 1% or above the change is killed unless it can be made compatible, but they will go ahead even if it's over 0%. 1% may sound small to you, but web browsers are used by like... a billion people? More? And 1% of a billion is a pretty sizable number of people.

Pretend if you want, your stuff will probably break eventually. If it's simple enough it won't break and then you can go on with your life - that's certainly the goal of browser developers.

At the moment you start using the DOM or other JS APIs exposed by browsers the odds of your stuff breaking in the next decade go up, especially if you're using things that aren't like 5-year-old parts of the spec. The shelf life of most JS-heavy webapps is like 2 years in my experience.


I'm really not sure what kind of changes you're talking about. They certainly don't do what you describe at the API level. Maybe you're just talking about bugs? I have encountered a few browser regressions over the years but they're always for really exotic use-cases of relatively new APIs, and they always get fixed in the next release. I've been working on a suite of highly complex JS-heavy tools for the past 2.5 years, and the problems that have stemmed specifically from browser bugs in that time are a vanishingly small number. Probably less than five.


1% is _way_ higher than the acceptable breakage thresholds I've seen browsers use. Usually it's more like 0.003% or less.
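For scale, some back-of-the-envelope arithmetic (assuming on the order of a billion users, which is only a rough figure):

```python
# Back-of-the-envelope: users affected at different breakage thresholds,
# assuming roughly one billion browser users (an order-of-magnitude guess).
USERS = 1_000_000_000

def affected(threshold_pct: float) -> int:
    # Rounded count of users hit at a given percentage threshold.
    return round(USERS * threshold_pct / 100)

print(affected(1.0))    # 10,000,000 users at a 1% threshold
print(affected(0.003))  # 30,000 users even at a 0.003% threshold
```

Even the much stricter threshold still represents tens of thousands of people.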

This can still be a significant number, of course...


> When browser vendors decide to ship a breaking change (and make no mistake, they do this multiple times a year - probably dozens to hundreds)

can you share an example of a breaking change to HTML that a browser has intentionally shipped in the last year?


This is not remotely true anymore. I can remember times when browser updates broke site functionality, but that was a decade ago (longer?) when we were still coding for specific bugs in Internet Explorer 6.


> I think HTML deserves better than being tinkered with all the time for no good reasons other than job security and/or achieving Webkit dominance, or other nebulous reasons at this point.

That's good, because these aren't the reasons the HTML standard is changed, and to claim they are is absurd.

HTML may have had its origins in SGML, but it has long since grown beyond those origins to become the web platform. Like any other non-dead software platform, it undergoes a process of refinement and enhancement, not for reasons of "job security" or "Webkit dominance", but to provide additional functionality that allows more and better software to be built with it, and to remain competitive with other software platforms.

You, and a large constituency of Hacker News commenters along with you, may utterly loathe the web platform. You may wish the web had evolved along entirely different lines, remaining a simple system for serving hypertext documents. But the fact is, it didn't, and to act as if the global development ecosystem that relies on the web platform doesn't exist is ridiculous.


There is a gulf of nuance between these two extremes you describe.

The web has "evolved" from a medium for simple self-publishing into a medium of mass surveillance and manipulation, big media, privacy-invading ads, an uncalled-for browser monopoly, information oligopoly, and arbitrary crap code being sent to you in ridiculous quantities, not only draining your batteries and showing no respect for planet earth wrt energy efficiency, but also actively putting you in danger through phishing, XSS, and whatnot, and making your future ability to even read your legal, personal, study, business, or banking documents dependent on a needlessly over-complicated technology stack that no one has the ability to influence in meaningful ways except Google, an ad company.

What it has not evolved into is a medium for long-term preservation of digital information, information autonomy, for simple ecommerce transactions and payments for everyone (as a merchant), for letting content producers thrive with quality content, or one that fosters free speech and diversity.

It has "evolved" by being captured to serve the interests of very few players, and fails the criteria of not having to appease computers or software programs that seem to be at war with one another.


> "Standard" = consensus definition of desired behaviour. "Living" = open to change.

> I don't see what is so oxymoronic about this?

A standard is set in stone as a fixed target for implementers.

No fixed target? No standard.

And no, an unversioned document that changes arbitrarily is not a standard.

To put it differently, a standard is a goal. If the goalpost is arbitrarily kept on the move then there is no goal.


> A standard is set in stone as a fixed target for implementers.

Is that a descriptive or normative statement?


The thing that makes the web different and why the idea of a living standard makes sense for the web is because (by and large) web changes can't break backwards compatibility. Browser vendors are unwilling to make changes that will break existing websites because it could result in losing market share as users switch to browsers that still work for those websites. So anything currently in the standard today is expected to continue working tomorrow. That means even though the standard may change regularly, you can depend on anything that it currently says to keep working.

The idea of versioned standards is only really important if you have to worry about things changing out from under you that could break your existing work.


I don't mind a changing standard as long as two people can agree on exactly which version of the standard they're talking about and actually use that version.

I should be able to build my web app against an LTS version of the standard and expect it to behave identically in all future versions of all major browsers, until the EOL date of said version, no matter what changes they make in future versions. I want this for the same reason I want either CentOS or Ubuntu LTS, not Arch, on my production boxes.

Unfortunately, the whole HTML 5.1/5.2/5.3 business never quite caught on, and browsers don't support anything but the latest snapshot of whatever it is that they call a standard.


> Unfortunately, the whole HTML 5.1/5.2/5.3 business never quite caught on...

Because that was an artifact created by W3C by taking snapshots of the WHATWG HTML5 standard and making arbitrary changes to portions they didn't agree with. Since the WHATWG standard was already considered normative by most browser makers, the influence of the W3C's "standards" work here was essentially nil.


>I should be able to build my web app against an LTS version of the standard and expect it to behave identically in all future versions of all major browsers, until the EOL date of said version, no matter what changes they make in future versions.

That's already true of HTML5. Breaking changes in standardized features almost never happen. HTML5 actually goes further than what you suggest: they aim to never have breaking changes.


Attempts at versioning HTML failed miserably in the past. I prefer a living document to an outdated standard.


Hopefully those aren't in contrast to each other. You can have a living document that gets consistent updates AND still has meaningful versions, much like a lot of well developed software.

The goal is for the time between HTML versions to not be a decade, but instead for consistent, incremental improvements without browsers trailing behind for years. At the same time, these should (hopefully) not be breaking changes.


Sure. IMHO one solution might be to have a HTML standard specification that is stable, which can then be built on. Extensions can be proposed so long as they don't break the standard. This would allow new elements and attributes to be introduced before they are themselves fully stable, as they are now. Breaking changes would require bumping the spec to a new major version and should happen only rarely.


We have HTML standards that are stable, e.g. 4.01. The problem is that no browser ever implemented all of it, and even the implemented parts were done differently.


The difference is this time lessons have been learned. Browser vendors are now actively working together. The HTML5 project has done a great job in collecting and codifying real world behaviour.

No browser will ever implement the current spec because it's shifting sand. A smaller but stabilised spec that can be extended is much better for everybody who isn't a major browser vendor.


This new model announced in the OP will do that. Snapshots every 6 months that are taken through the W3C REC track.


It isn't great at all, because even that living standard is never implemented up to spec, often by the same people who wrote it, and the current browser-support status usually ends up being "what's written in bugzilla".


There are very comprehensive browser support matrices at https://kangax.github.io/compat-table/es6/ (JavaScript features) and https://caniuse.com/ (DOM features).

It's only for very recently implemented features that you should need to look in bugzilla.


A better resource is https://wpt.fyi/, which contains comprehensive test results for all browser engines. These days, at least for WHATWG specs, tests are required before any changes land in the specification, so wpt.fyi will necessarily reflect browser support for all features landed in the specs.


"should", sure, but vendors (I mostly run into this w/Chrome but other vendors do it too) happily just ignore the spec because it's inconvenient and sometimes explicitly have no plans to ever align with it. I recently ran into a case where Chrome was intentionally moving away from the spec without making any effort to update the spec, because being correct was... annoying. Not impossible or bad for users, just annoying.

At the time the behavior worked right in Firefox and Edge but now that Chrome is going to ignore the spec I suspect the other vendors might too. (For reference, it was an issue related to the lifetime of javascript objects for frames that have been unloaded or navigated)

This sort of thing will never appear in a compatibility matrix. You find out when it breaks your code.


JavaScript (EcmaScript) follows a more traditional standard process, though at a very high pace.

caniuse.com is amazing, but let's not kid ourselves about what it covers. It's about feature availability. A feature might have thousands of normative behavior specifics, but they're condensed down to there/not-there.


HTML has always been poorly defined and has mostly been a description of de facto standards since forever. It takes a lot of leverage to make the browser vendors follow any kind of specification.



