Learn WebGPU (eliemichel.github.io)
243 points by ibobev on April 27, 2023 | 79 comments


As someone who's very recently been looking into using WebGPU on a project, why must it be so complicated to execute some simple instructions on a GPU?

All of the code I was looking at seems to use (and require from the API) a lot of graphics terminology like "shaders", etc. Additionally it seemed to require a bizarre amount of boilerplate code to create compute "pipelines" and other things.

Is there some reason that running code on a GPU must be so difficult? Why is there no API I can call like,

`GPU.exec(() => 2 * 2);`

Sorry for the dumb question. I'll definitely be taking a look through this guide later.


Hi, I'm a graphics programmer. Sorry for our confusing and pretty awful terminology, it's mostly an accident of history. I don't think we're the only field to have that problem, though.

> why must it be so complicated to execute some simple instructions on a GPU

Because they're not simple instructions. Here's what your 2 + 2 compiles to on an AMD GPU: https://hlsl.godbolt.org/z/qfz1TjPGa

In order to fill a super-wide machine, we need to give it enough work. That means we need to schedule a large number of jobs, and we need those jobs to have a hierarchy. The hierarchy for compute work is:

Dispatch -> Workgroup (aka Thread Group) -> Subgroup (aka Wave, Wavefront, Warp, SIMD Group) -> Thread (aka Wave Lane, Lane, SIMD Lane)
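
To make that hierarchy concrete, here's a minimal sketch of how the levels show up in WebGPU/WGSL (names and sizes are mine, purely illustrative; the subgroup level is mostly implicit, since the hardware carves each workgroup into waves):

    // A compute shader where each workgroup is 256 threads wide.
    const code = `
      @group(0) @binding(0) var<storage, read_write> data: array<f32>;

      @compute @workgroup_size(256) // threads per workgroup
      fn main(@builtin(global_invocation_id) gid: vec3<u32>) {
        data[gid.x] = data[gid.x] * 2.0; // one lane handles one element
      }
    `;
    // ...and the dispatch that sizes the top level (inside a compute pass):
    // pass.dispatchWorkgroups(64); // 64 workgroups x 256 threads = 16384 threads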

Much like any weird device like a DSP, there's a lot of power here, but you need to program the hardware a certain way, or you'll get screwed. If you want to be immersed in what this looks like at a high level, go read this post for some of the flavor of the problems we deal with: https://tangentvector.wordpress.com/2013/04/12/a-digression-...

GPU parallelism isn't magic, and even a magic compiler can only get us close, not all the way.


Are you the guy that made the Link cel shading video? If so, I really like your stuff! Any advice on how to get into graphics programming?


I am, yes! Thank you!

I've actually been a spec contributor to WebGPU, and I can vouch for quite a bit of it. I think it's a pretty great API overall, and it strikes a nice balance between the apparent-simplicity-hidden-complexity of OpenGL and something daunting like Direct3D or Vulkan.

A lot of the things I mentioned above don't really come into play for quite some time. You can get quite far with just triangles, vertex shaders, and pixel shaders. So go give it a try, draw some stuff, and then work your way up and down the stack.


Oh wow lol. I know we've gotten into discussions (probably some debates) on HN before. Never connected your name to that channel. I'm apparently subscribed and have seen 5 of your videos so far. All really fantastic videos. I apparently missed the Mario Galaxy 2 video, so there goes the next hour of my evening.


Any thoughts on mesh shaders?

From what I've seen (as a non-graphics programmer), the paradigm seems closer to what I'm used to.

Though I'm guessing that limited hardware support will hamper adoption.


As a fellow graphics programmer, this is a perfect answer.


Aside from the other comments which provide great answers about the way GPUs work and their history... I want to give my gut instinct for why an API like this doesn't exist on top of all that history.

I suspect it's simply because the "complexity floor" for running code on a GPU is already high. Obviously your trivial example is only for the sake of example. But I suspect that nobody has had the incentive to create an API that looks like that because anything close to that is just better done on the CPU.

Once you have problems that need a GPU to solve, then working through the peculiarities of the complex API is worth it.

I don't know this for a fact, but it's me reasoning backwards from the fact that nobody has seen fit to create an API like this, when all sorts of zany APIs exist.

For example, there are already plenty of code generators for graphics-oriented shader programs, which will take e.g. a material description like `{color: red, opacity: 0.5, reflective: true}` and generate the GLSL code.

Why would there not be an equivalent which compiles an expression language to GPGPU instructions? I must assume it's because the programs you might want to express with it are probably best run on the CPU.
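
As a toy illustration of that kind of generator (entirely hypothetical, not any particular library's API):

    // Toy sketch: turn a material description into a GLSL fragment shader.
    // Assumes color is an [r, g, b] triple in 0..1.
    function makeFragmentShader({ color, opacity }) {
      return `
        precision mediump float;
        void main() {
          gl_FragColor = vec4(${color.join(', ')}, ${opacity});
        }
      `;
    }

    // makeFragmentShader({ color: [1.0, 0.0, 0.0], opacity: 0.5 })
    // -> source for a half-transparent red material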


My example wasn't great, and you're right that this would be better run on a CPU.

For context, my very naive working model of GPUs is that they're basically just massively parallel CPUs with slower clock speeds. Given this, it seems like it would make sense to have some kind of async math library which allows you to send batch calculations to the GPU to run in parallel.

Basically anything where you typically create a loop and do some calculation seems like it would be better batched and sent to the GPU. Assuming my working model of GPUs is correct, anyway.

So last week I was working on a JS library which required doing exactly this – some basic addition and multiplication in a loop – and I thought, hey, I wonder if I can just batch this and send it to the GPU for a slight performance boost? I was aware of projects like GPU.js and figured it should be a relatively simple thing to try, but turns out I was wrong.

What I thought would take me 30 mins last week has now turned into a quest to learn about shader code and the history of GPUs, and I'm still not exactly sure why.


"basically just massively parallel CPUs" is correct-ish, but it has the added wrinkle that you have to think of them as CPUs where large chunks of cores have to be executing the same instruction at the same time, which means that it's mostly loops where each iteration runs close to the same instructions that make sense.

Also, I think you might be drastically underestimating how utterly insane modern JavaScript engines like V8 are. JS code running as well as it does on CPUs is sort of a minor miracle in the first place. Even then, there's still a place/need for WebASM on top of it.

Transposing that onto a GPU in a transparent manner is asking a LOT. Not to mention that there's a lot of stuff you can do in JS that, if feasible at all, makes no sense whatsoever in GPU-land. Even basic stuff, e.g. string-based property lookups, just does not belong in GPU code for the most part.

If I see an API that looks like `GPU.exec(Callable)`, then I expect that any callable can be passed to that API. In JavaScript, that can mean a LOT of different things, many of which would be hell on earth to convert to GPU-compatible code.
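
For example (a contrived snippet of mine, using the hypothetical `GPU.exec` from upthread):

    // Perfectly legal JS, but a nightmare to translate to GPU code:
    // closure capture, a heap object, and a dynamic string-keyed lookup.
    const ops = { double: (x) => 2 * x };
    const key = 'dou' + 'ble';
    GPU.exec(() => ops[key](21)); // what should this even compile to?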


Very, very wide SIMD units are also a good model. Your performance is proportional to how much of the same thing you're doing at once.


TIL about GPU.js, which on the face of it seems like exactly what you were asking for in your original comment. Was the problem that the code you wanted to write didn't quite fit into the expected execution model, even if GPU.js allowed you to write things in an ergonomic way?
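
For anyone else discovering it here, a GPU.js kernel looks roughly like this (a sketch based on its documented usage; the kernel body must stay within the JS subset GPU.js can translate to shader code):

    // Rough GPU.js sketch: element-wise multiply of two 4096-element arrays.
    const { GPU } = require('gpu.js');
    const gpu = new GPU();

    const multiply = gpu
      .createKernel(function (a, b) {
        return a[this.thread.x] * b[this.thread.x];
      })
      .setOutput([4096]); // one GPU thread per output element

    const result = multiply(arrayA, arrayB); // assumes two 4096-length inputs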


Many reasons, some of them not technical (e.g. historical baggage and closely guarded IPs), but mainly because GPUs are by far not as standardized and "open" as CPUs. On CPUs, the instruction set is basically the abstraction layer to the underlying hardware, on GPUs the abstraction layer is a software driver instead, and that needs to be much more abstract because the differences between the various GPU architectures are much bigger than the differences between ISAs like x86 vs ARM.

Having said that: your idea absolutely makes sense, and I would love to see shader compilers merged into regular compiler toolchains, so that CPU and GPU code can be easily mixed in the same source file (technically it's definitely possible, but this would also be the death of "3D APIs as we know them" - those would basically be merged into language stdlibs, and it would require many parties working closely together - so there's a risk of the 'many cooks' problem).

But give it a decade or so, and hopefully we'll get exactly that :)


> I would love to see shader compilers merged into regular compiler toolchains

They are already using the same compiler infrastructure: pretty much every relevant shader or otherwise GPU-oriented compiler out there is based on LLVM. As for the parts of the toolchains that are different: that's simply because GPUs and CPUs are inherently very different.

> so that CPU and GPU code can be easily mixed in the same source file

What advantage would there be to mixing code for totally different targets, with totally different parallelism models, into the same source file? I find it cleaner and simpler (apart from maybe trivial one-liner shaders that you can embed in a string) to keep them separated. It also makes debugging much easier.


> what advantage would it have to mix together into the same source file

It would be easier to share data structure declarations between the CPU and GPU side (MSL can already do this by sharing header files, a complete integration would just be the next step). GPU buffers and images could be described with [[attribute]] annotations (also very similar to what MSL does to extend C++ with shader semantics).

E.g. maybe a GPU buffer with vertex data could be described like this:

    [[gpu_buffer]] vertex_t vertices[] = {...};
...instead of calling CreateBuffer(...)-style 3D-API functions, etc. (the code that's generated by the compiler might actually call a 'builtin' CreateBuffer function under the hood).

TL;DR: MSL is already halfway there by using standard C++ extended with custom attributes, it's just missing the last step to merge the Metal shader compiler with Apple's Clang toolchain.


PS: also if shader functions could be directly defined as 'regular functions' in CPU-side source code (and behind a special function pointer type), there would be no need for string literal shenanigans like this:

https://github.com/floooh/sokol-samples/blob/3f10c1a0620cec9...

(and shader compilation would happen during the regular build and would also generate regular compiler errors - with current toolchains that's only possible with a lot of build system magic)


Shaders are programs that run on the GPU. That term, like all the others, comes from the graphics-processing background and history of GPUs.

Like the CPU, a GPU only understands its own instruction set. If you want to run software on a GPU, you must write it in a language that can compile to that instruction set. Making interpreters that run arbitrary languages doesn't really work on GPUs. They aren't general-purpose processors, after all.

Edit: researching GPU history might be helpful. In the old days, the era of early OpenGL and Direct3D, what you could do with a GPU was extremely limited. You could enable/disable some things and tweak some parameters, but what the GPU was really doing was out of your control. Google fixed function pipeline.

In more recent times, with shaders and especially with the newer standards/APIs like Vulkan, Direct3D 12, Metal and now WebGPU, the GPU's internals have been made much more flexible and blown open. We now have much more low-level control. GPUs are quite complicated, so naturally working with these low-level APIs is also complicated.


I hope this isn't too simplistic but it might help to think about what GPUs were originally designed for - banging out arrays of coloured dots at ridiculous speeds. Crudely (caveat: from my layman's understanding), when rendering graphics, a shader is a function whose input is a set of parameters that help identify where the pixel lies within its source triangle, and whose output is the appropriate colour for that pixel. A simple example might be to set the three vertices of a triangle respectively to red, green, and blue, and then interpolate the colour between those three points. When rendering the triangle, the shader is executed once for each pixel within it, and since each pixel's colour is independent of its neighbours', the problem is 'embarrassingly parallel' and can therefore be accelerated by running the same shader across multiple execution units varying only the input parameters.
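
In WGSL terms, that red/green/blue triangle looks roughly like this (a sketch; the GPU interpolates the `@location(0)` value between the vertex outputs automatically):

    // Sketch: per-vertex colours, interpolated across the triangle by the GPU.
    const shader = `
      struct VSOut {
        @builtin(position) pos: vec4<f32>,
        @location(0) color: vec3<f32>, // interpolated between the 3 vertices
      };

      @vertex
      fn vs(@location(0) pos: vec2<f32>,
            @location(1) color: vec3<f32>) -> VSOut {
        return VSOut(vec4<f32>(pos, 0.0, 1.0), color);
      }

      @fragment
      fn fs(v: VSOut) -> @location(0) vec4<f32> {
        return vec4<f32>(v.color, 1.0); // runs once per covered pixel
      }
    `;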

Since computer graphics calculations are math-heavy, GPU cores are as you know excellent numerical processors, and therefore also great for other embarrassingly parallel maths problems; the challenge is that you have to think in broadly the same way as for graphical workloads about how to break up and execute the problem.


TL;DR: It has to work blazingly fast on all kinds of GPUs, which are not nearly as homogeneous as you might expect. The complex setup is the "lowest common denominator" of that ecosystem.

GPUs are stupendously good at ridiculously wide parallel processing, yet you still need to sequence these operations. Modern graphics APIs like Vulkan and WebGPU are the best currently known way to write such systems. Shaders express the parallel algorithms, and the pipelines bind and synchronize them.
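
As a sketch of what that pipeline glue looks like in WebGPU (assuming a `device`, a compiled `shaderModule`, and a `bindGroup` already exist):

    // WebGPU compute-pipeline boilerplate, trimmed to its skeleton.
    const pipeline = device.createComputePipeline({
      layout: 'auto', // derive the bind group layout from the shader
      compute: { module: shaderModule, entryPoint: 'main' },
    });

    const encoder = device.createCommandEncoder();
    const pass = encoder.beginComputePass();
    pass.setPipeline(pipeline);
    pass.setBindGroup(0, bindGroup); // binds buffers/textures to the shader
    pass.dispatchWorkgroups(64);
    pass.end();
    device.queue.submit([encoder.finish()]);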

It's just legitimately the best way, as far as we know, to get the most bang for your buck out of that hardware while remaining vendor-agnostic. Anyway, a higher-level abstraction can always be built on top that provides the UX you desire, and we will see a bunch of them for WebGPU soon enough.


WebGPU is a low-level interface close to the hardware. If you don't care about having control over those details, it's not the right API to use. You'd have better luck with either a high-level wrapper or something else like WebGL.


WebGL still requires shader code I believe, and also uses graphics-specific terminology. My understanding is that there is no high-level native API for compute-specific use cases. You are currently forced to use an API intended for graphics and then create abstractions for compute yourself.

I believe GPU.js exists to fill this void and facilitate simple GPU compute on the web without the need for all of the graphics-specific terminology and boilerplate code. But it makes little sense to me that a third-party library is needed for this, since a high-level native GPU (or hybrid) compute API seems like a great way to optimise many JS projects on the web today.

I must be missing something though because this seems like such a no brainer that if it was possible it would already be a thing.


Put briefly:

GPUs don't work like CPUs, at all, and need to be massaged carefully to run efficiently. You can't just give a GPU an `int main()` and go ham. The APIs we get to use (DX12, Vulkan, Metal, WGPU) are all designed to abstract several GPU vendors' hardware architectures into an API that's just low-level enough to be efficient without forcing the app to explicitly target each GPU itself. There's no one way to run fast on everything with more than a trivial program.

As for why no 'universal high-level native GPU compute API' exists: GPU vendors don't want them because it's too much effort to implement. They'd much rather everything go through a low-level interface like Vulkan and have the high-level stuff done by 'someone else' consuming Vulkan. Each new 'universal' API requires buy-in from 4+ different hardware vendors. Someone needs to painstakingly specify the API. Each vendor must implement the API. Those implementations will be broken (just look at Android drivers), and now your 'simple' universal API is actually 4 different APIs that look almost the same except for strange esoteric behavior that appears on some hardware and is completely undocumented.

This is what OpenGL was and is why Vulkan, DX12 and Metal exist. The universal 'simple' API doesn't exist because it's too hard to make and not useful enough for the people programming GPUs regularly.


Standard web APIs are a lot like kernel-level OS features. They have a ton of inertia to them, so they arguably should be limited to stuff that is infeasible in "userland". (There are additional reasons for that in the case of OSs, but that's the relevant one here).

If you had such a high-level API baked in, it would get leapfrogged by some library project within a few years, and then we'd be stuck with yet another API to support that no one actually uses.


It’s not a dumb question but I think the issue is the abstraction layers you’re looking at.

Computing is hard in general. Your operating system and compilers do a lot of work that you don't need to be aware of. In that way, your programming languages are high-level abstractions, and you trust the compiler to give you the best performance. However, people still write SIMD intrinsics and assembly to get even more perf when the compiler can't.

GPUs used to be much simpler, in that they abstracted a lot of the complexity away, letting the driver and hardware deal with things like a compiler would. But due to being so massively parallel, people wanted lower-level control. This is almost exactly like people using SIMD intrinsics on CPUs.

So the programming model for GPUs became lower and lower level to allow for maximum performance. The higher level interface moved into frameworks and game engines.

Really, if you find this complex (and it is), you should start by learning a game engine. Just like dumping a first-year student into SIMD would be hard, so is learning about GPUs from the graphics APIs.

Start high and go low, because a lot more will make sense as you understand the building blocks better from context.


It has to do with offloading computation to an external entity.

For example, let's take a simple case: CPU + DSP. I have an Atari Falcon here which sports something like that, but quite a few modern SoCs also have DSPs included (QCOM, TI, ...), and there are SBCs (single-board computers, like the Raspberry Pi) which sport an additional DSP chip.

Now what do you think the process will be to do the computations on the DSP?

You will need, at the very minimum, some way to hand data over to the DSP, execute some code, and retrieve whatever result back to the CPU. This can be as simple as just handing around memory pointers (in SoCs with "unified memory"), or as complicated as explicitly driving the communication line between the two chips (some fast serial connection, whatever). That also means you need buffers: the CPU has to somehow provide buffers to some API that deals with the handover; at the very minimum you need a buffer for the input data, one for the actual code, and one for the output (unless you reuse the input). Now, when you define some code to do whatever computation, who figures out the sizes of these buffers? For the code itself it can be easy, depending, but for the input and output?
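
Incidentally, exactly the same handover shape shows up in WebGPU. A sketch of the minimal buffer round-trip (assuming a `device` and some `inputData`, with the compute pass itself elided):

    // The minimum CPU <-> GPU handover in WebGPU.
    const storageBuf = device.createBuffer({
      size: 4096,
      usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC,
    });
    const readbackBuf = device.createBuffer({
      size: 4096,
      usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
    });

    device.queue.writeBuffer(storageBuf, 0, inputData); // CPU -> GPU
    // ...run the compute pass that fills storageBuf...
    const enc = device.createCommandEncoder();
    enc.copyBufferToBuffer(storageBuf, 0, readbackBuf, 0, 4096);
    device.queue.submit([enc.finish()]);

    await readbackBuf.mapAsync(GPUMapMode.READ); // GPU -> CPU
    const result = new Float32Array(readbackBuf.getMappedRange()).slice();
    readbackBuf.unmap(); // slice() above copies the data out before unmapping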

From there it easily gets more complicated, and that is the very reason why there is so much "boilerplate" in APIs dealing with GPUs. Some drivers actually support OpenCL, which aims directly at "compute", but I'm not sure whether that is even made available in browsers; it brings with it all the aspects you need for properly offloading work. There is simply no way around that.


Loads of people have stated why easy GPU interfaces are difficult to create, but we solve many difficult things all the time.

Ultimately I think CPUs are just satisfactory for the vast vast majority of workloads. Servers rarely come with any GPUs to speak of. The ecosystem around GPUs is unattractive. CPUs have SIMD instructions that can help. There are so many reasons not to use GPUs. By the time anyone seriously considers using GPUs they're, in my imagination, typically seriously starved for performance, and looking to control as much of the execution details as possible. GPU programmers don't want an automagic solution.

So I think the demand for easy GPU interfaces is just very weak, and therefore no effort has taken off. The amount of work needed to make it as easy to use as CPUs is massive, and the only reason anyone would even attempt to take this on is to lock you in to expensive hardware (see CUDA).

For a practical suggestion, have you taken a look at https://arrayfire.com/ ? It can run on both CUDA and OpenCL, and it has C++, Rust and Python bindings.


A large part of the complexity comes from the GPU (often) having a separate memory from the CPU connected via a relatively slow bus. Another large part is that the GPU runs asynchronously from the CPU. Another large part is that you must write your code in a way that allows thousands of concurrent threads of execution or you won't run faster than the CPU. Another large part is the fact that the instruction set and features and general performance level of the GPU are unknown at development time and wildly different on common hardware. Despite all that, reaching near peak hardware utilization and real-time interactive performance is expected by users on all hardware regardless of the differences. All these things together mean that the obvious and simple designs for APIs unfortunately don't work very well.


I recently became aware of a high-level language called Futhark [1] that fits well in that space. It is statically-typed and functional.

Maybe some other array languages work in the GPU as well, I don't know.

[1] https://futhark-lang.org/


The short version for me is: programming a GPU via a GPU API is like programming in assembly language. If you want easy, you use a higher-level language like Python. Python is "usually" slower than assembly language, but you can read a file, add up all the numbers in it, and print out the answer in one line of code, whereas in assembly there's lots of setup to read files, buffer their input, parse them into numbers, etc.

Maybe you want Use.GPU

https://usegpu.live/


I assume this type of API is not available natively because it's not how interacting with the GPU works, and it is best left to libraries like GPU.js.


I guess it is most likely that you are not going to write a program that multiplies two numbers; rather, you will need to multiply matrices passed in a certain way. Those matrices are big ones, so you need to allocate some space for them, then load them, then do the math, then use them in some other shader to paint some pixels on screen.


    GPU.exec(() => 2 * 2);
That's CUDA or SYCL (if you're willing to use C++).


It's not?

When writing with CUDA you also have a lot of boilerplate and things to think about like streams, blocks, grids. The list goes on and on.

To get to that level of abstraction you'd have to use something like PyTorch's CUDA backend.


It kind of is, because the latest versions are increasingly supportive of standard C++.

The CUDA machinery is still there, just not exposed to work with directly.


Sure, if you use the compiler formerly known as the Portland Group compiler: nvc++ vs nvcc.


If you want high-level abstractions, use tools that provide those. But because GPUs are complicated and come with lots of caveats (see other comments), library authors need the comparatively low-level access.


You might be interested in https://futhark-lang.org


It's really interesting how the natural ecosystem for wgpu and WASM seems to be Rust, when C++ is also a prime candidate. Could this be recency bias associated with Rust being the new kid on the block?


Zig fits pretty naturally here too. We've got ~19 WebGPU examples[1] which use Dawn natively (no browser support yet), and we build it using Zig's build system so it 'just works' out of the box with zero fuss as long as you grab a recent Zig version[2]. No messing with cmake/ninja/depot_tools/etc.

WASM support in Zig, Rust, and C++ is also not equal. C++ prefers Emscripten, which reimplements parts of popular libraries like SDL; for me personally that feels a bit weird, as I don't want my compiler implementing my libraries / changing how they behave. Rust I believe generally avoids Emscripten(?), but Zig for sure lets me target WASM natively and compile C/C++ code to it using the LLVM backend and, soon, the custom Zig compiler backend.

[1] https://github.com/hexops/mach-examples

[2] https://github.com/hexops/mach#supported-zig-version


> ...Emscripten which reimplements parts of popular libraries like SDL

AFAIK those library headers are just historical artifacts from a time when those libraries didn't have official Emscripten ports, and today they are only reluctantly maintained, if at all.

Emscripten's web API wrappers make a lot of sense though, and the other unique feature is the 'inline Javascript' support (so you don't need to maintain the Javascript glue code in separate .js source files).


Interesting; I didn't realize libraries were adding official Emscripten ports using the Emscripten APIs. That's definitely a level up in my eyes, then.


At least SDL2 has 'proper' Emscripten support:

https://wiki.libsdl.org/SDL2/README/emscripten

My own cross-platform headers too:

https://github.com/floooh/sokol

...there are also auto-generated Zig bindings btw ;)

https://github.com/floooh/sokol-zig


It's probably as simple as 'cargo vs cmake'. wgpu-native is a regular cargo package which is straightforward to integrate and build. Google's libdawn is a mix of cmake and Google depot tools, which makes the same process anything but trivial.

I don't agree with your take on WASM though. Emscripten does a pretty good job of making developer's life easier.

(there's also a real need for a medium-level cross-platform GPU API which abstracts over D3D12, Vulkan and Metal, and WebGPU seems to be the most robust option for this at the moment)


The natural ecosystem is C++

UE4 Wherever You Go: Bringing UE4 to Phones, Tablets & Web | GDC 2015 Event Coverage | Unreal Engine: https://www.youtube.com/watch?v=LqHjkX3etG0

Then you have emscripten and now WebGPU

Gamedev's ecosystem is C++, and WASM + C++ matured thanks to Emscripten; you don't need to port or rewrite anything. When you do need to port something, you can leverage the already existing ecosystem. Rust plays nice with WASM but is too young.

Another piece of evidence is Photoshop: https://web.dev/ps-on-the-web/

Also wgpu is a rust library, WebGPU is the proper name

But yeah, the language doesn't really matter with WASM; most of the popular languages target it nowadays, so it's a matter of what your favorite one is, and what ecosystem you can leverage to achieve the result you want.


Agree that Unreal Engine is a powerful force. But it's not FOSS. While the licensing terms are attractive to gaming companies, I'm not sure how well that goes over in other sectors. In particular FAANG seems to prefer Apache/BSD or writing their own.


I think it’s because when WebGPU was being designed, a lot of rusties participated. Want to change the world? 90% of success is showing up.


The natural ecosystem is the browser.

There is Dawn for C++, also shipped alongside node.


Yes I agree, the natural ecosystem is webasm in the browser, now that for the first time in history everyone has a good, fast, and general JIT VM they can use. I'm surprised something like webasm didn't happen sooner, but at least it is here now.


WebGPU doesn't have anything to do with WASM.

Anyone that wants to use it productively is also better off with BabylonJS, PlayCanvas, Threejs.


Yes, I'm agreeing that the browser is the natural ecosystem because of things like WebGPU and webasm, now that they have been invented.


From https://eliemichel.github.io/LearnWebGPU/getting-started/hel...:

> Since wgpu-native is written in rust, we cannot easily build it from scratch so the distribution includes pre-compiled libraries

Isn't this the wrong way round? Building from source seems to be the default for Rust programs, as they are typically statically linked.


Given that this article seems to be coming from a “you’re writing C++” perspective, I read this as “you probably don’t have a rust compiler installed so it’s not as easy for you to build from source as a C++ library would be.”


Ah, that makes sense! The author is also using CMake, which I imagine doesn't have much integration with Cargo.


Being statically linked works just as well with binary libraries in compiled languages.


With reference to the section on how difficult it is to build wgpu-native and how you basically have to use a precompiled binary, there appears to be an easier way to integrate a Cargo library into a CMake project: https://github.com/corrosion-rs/corrosion


This is the first WebGPU guide I've seen that shows how to make shaders do something useful, really appreciate that.


Will be interesting to see if WebGPU opens the door to viable WASM/WebGPU-driven Web UIs.

I guess the main downsides at the moment are limitations like not being able to issue HTTP requests from within the WASM sandbox, and the payload size of the WASM itself. Of course UI rendering code can and should be separated out from business logic... but then the complexity of maintaining JS/WASM together makes it all seem not very worthwhile... I'd rather just do it all within a single language.

Still, likely to get widespread adoption for DataViz style widgets/games.


From https://eliemichel.github.io/LearnWebGPU/_images/rhi.png, it looks like on FOSS systems Vulkan is the supported graphics library for interfacing with the GPU. This is in contrast with WebGL, which specifically targets OpenGL ES, without support for Vulkan or the various proprietary graphics libraries. Am I right in this evaluation?


BTW: for the last 15 years, all web browsers on Windows have done WebGL on top of DirectX (using the ANGLE library https://github.com/google/angle).


Syntax discussions are hardly ever productive, but the `vec4<f32>` syntax in shaders is truly awful. HLSL solved this ages ago with `float4` and `int4`. And even GLSL's `vec4`, which implies float components, makes sense given you are almost always dealing with floats in shaders.
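
For what it's worth, WGSL does let you recover the HLSL-style names yourself with alias declarations (current spec syntax; earlier drafts spelled it `type`):

    // A WGSL prelude you can prepend to your shader source:
    const prelude = `
      alias float4 = vec4<f32>;
      alias float3 = vec3<f32>;
      alias int4 = vec4<i32>;
    `;
    // after which e.g. `fn scale(p: float4) -> float4 { return p * 2.0; }` works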


Great guide that introduces people to an overcomplicated, redundant feature-creep.


GPUs are getting ridiculously powerful, and having a proper API to utilize them is important.

People want to have powerful apps in their browser, so there's clearly a need to have access to compute shaders from JS and WASM.


Some of us wouldn't consider design-by-committee APIs proper.


"People want to have powerful apps in their browser"

There is something rotten in that statement. No, you absolutely don't have to use such power in the browser. There is a reason we have a thing called an operating system.


The post is actually about using WebGPU as native library in native applications (outside the browser or Electron).

(but apart from that there seems to be a growing schism between people who see the browser as a document viewer, and others who see it as an application platform, I'm definitely in the platform camp ;)


Heaven forbid a doctor be able to share a link of a 3D visualization for a procedure so patients have a better understanding of the decisions they have to make. That'd be way too convenient!


With browsers being operating systems that do sandboxing by default. If I know I want to run an app, I prefer a native version, but browsers are superior for trying things out.


I'd be more sympathetic to this statement if any of the modern operating systems had halfway decent sandboxing.

Insert inevitable reference to "the Birth and Death of Javascript" but native platforms should be copying the web more in general. The DOM is just structured CLI output, with all of the advantages that CLIs have. Seriously, as a user nearly every native app on my machine ought to (in a better world) be rendering its interface to a structured text format that I can inspect and modify on the fly and pipe into other applications or dynamically restyle as I see fit. Because of that structure, most web-apps can be extended arbitrarily with user extensions. The sandboxing on the web is... okay it's not great, but it's better than what anybody else is doing. The delivery platform for apps is better specifically because the sandboxing is better, which allows me to try out apps quickly without giving them a lot of unnecessary access to my computer.

People are building apps in the browser because browsers offer capabilities that native operating systems don't offer. The few platforms where browser dominance doesn't hold true (iOS, Android) largely maintain native dominance by severely limiting what web browsers can do. That's the wrong way to move forward; offer a native experience that's as good as the web and people won't need to build apps on the web.

Even just good application sandboxing alone would be a good start, and yet whenever something like Flatpak tries to make progress in that direction, people are constantly complaining about it and explaining that sandboxing is the wrong way to approach app security. It's not really surprising then that people look at that discussion and take away from it that maybe the desktop isn't a good fit for their app.

----

My only critique of the webGPU direction is people bypassing the DOM, which... is usually the result of native developers and native frameworks trying to bring native development patterns to the web instead of rallying behind the idea of user-inspectable pure-text interfaces in the DOM. In that regard, and very specifically in that regard, I do kind of agree that the web is becoming too much like a native operating system. In the sense that it is repeating the same exact mistakes that native operating systems made.

But in regards to people just generally building apps in browsers... good. Honestly, progressive web apps are a safer, more user-respecting way to deliver functionality than native OSes are.


> if any of the modern operating systems had halfway decent sandboxing

What qualifies as "halfway decent"? For servers, Linux seems to have a good set of utilities (taken advantage of by docker). Windows doesn't seem to have as good system calls (like chroot, for example), but UWP apps seem to be decent enough for a sandboxed app. What would be a good example of sandboxing for you?

Personally I think that websites are the most popular way to distribute applications because it only requires a single link or URL to download the software and have it run immediately as required. No other cross-platform computing VM (JVM, CLR) can boast that kind of ease of distribution. Additionally, HTML + CSS + JS have proven to be the most popular UI platform. XAML just doesn't come close to the flexibility provided by these tools, and non-declarative UI is just out of the question for the majority of people designing websites in HTML, who are not software engineers.


> because it only requires a single link or URL to download the software and have it run immediately as required

Agreed, but I would argue that the only reason why it's possible to have a single URL to download and run the software is because there's decent sandboxing.

Linux does have some decent security utilities, but it's only recently that they're starting to show up in non-developer contexts for everyday apps (basically Wayland, Flatpak, Bubblewrap, all of which have experienced pushback from the community) -- and while I fully approve of that effort, it's not anywhere close to the level of security yet that the web offers.

I would say that if you would feel comfortable with a native app installation workflow that worked exactly the same as the modern web's app delivery mechanism (ie, you can click on a link and a native app that has never been pre-vetted or moderated is installed and launches on your computer without any confirmation and without taking you to a store page), then... that's halfway decent sandboxing. The first bar to clear is "do I feel comfortable running arbitrary untrusted code that has never been validated by anyone?"

The second couple of followup tests that the web itself struggles with are stuff like "is this spying on me, is this fingerprinting my computer, how easy is it for this thing to phish me?" To me, that would be when we start to move past halfway-decent sandboxing to "good" sandboxing. I'm not sure the web deserves that label.

But I don't know of any native platforms that pass even that first bar where I would feel comfortable running malicious code on the system. I mean, sure, you can set up VMs and Firejail and I could if I wanted to set up a Linux system that had good sandboxing -- but none of that stuff is accessible to your average user; it's security for the very few people who know how to set up that security. Meanwhile, the web is security for every single person with a web browser on every single one of their devices.

Arguably mobile phones have started getting closer, but Apple is still claiming that the reason they can't open up their app store is because it would be insecure, so if we take them at face value they don't believe that their sandboxing is good enough to allow for untrusted code execution. And I'm hoping that eventually the combination of Wayland and Flatpak turns into decent universal sandboxing on Linux for every app the user is running on every distro, but we've still got a long way to go.

> Additionally, HTML + CSS + JS have proven to be the most popular UI platforms

This is certainly part of it, but... plenty of people who don't like JS still write it. I don't think it's a good enough explanation on its own.

But that's just my opinion; I don't have strong opinions about that.


Telling people that the thing they want is stupid, won't stop them from wanting it.


Alas. Three times alas...


Yet they all thought it worth the engineering time to deliver this utility. While I agree with you, I know a lot of people who find interaction via a browser more natural than a desktop application.

A bit like the Emacs jokes that say the operating system is just a bootstrapper for Emacs.


If every system has a web browser that handles WASM, and program development paradigms shift to primarily utilize it, then programs are by default cross-platform and memory-sandboxed.

Sounds a lot better than today.


I mean - to be fair, you have this already today.

It's just limited to html/js/css as the tooling.

Which is actually a-ok for the vast majority of use-cases. Honestly, I really don't want companies to be making GUIs using WASM/WebGPU. Html/css/js solve that domain fairly well already, and the "canvas-only" wasm implementations I've seen so far are pretty damn monstrous. Outside of being a completely opaque box to me, the user, they're also usually not terribly performant, break all sorts of expected behavior, and have cross platform issues.

I'm interested in seeing if webGPU enables more ML possibilities on the web, especially ML that's running client side and not over the wire. But really - that's about it.

GUI tasks seem like a bad pick for this. They're literally several steps backwards from where we are right now, and they completely remove the ability to inspect the running application as the user.

And that last point is a really, really big deal. Personally, I like a web that supports things like 3rd party integrations, ad-blockers, user-agent css styling, etc. I don't want a fucking canvas riddled with ads that I cannot interact with.


This is my one fear with GPU stuff.

I'm using Canvas when I build games on the web, and that's about it. And even there I'm regularly trying to think "isn't there some way I could represent this application state as a tree?"

There are a few places where this doesn't make sense, but if we're honest most application state even for native apps really ought to be represented in the form of an interactive document.

I want people to be able to build more powerful stuff on the web, but I don't want them to start looking at the DOM as if it's just some inconvenient quirk of the platform rather than a really seriously integral part of the web's success as a platform.

Yes, it's often inconvenient to build fancy effects on top of the DOM, yes you have to worry about updates and performance -- because your entire application state is not supposed to be in the DOM 100% of the time; the DOM should be treated as a render target presenting the application state and controls that are relevant to the user right now. The DOM forces you to represent your current app's interface as a pure-text XML tree; that is the common denominator for all of your users no matter what device or features or settings they have enabled. Targeting that common denominator is good practice because in most (not all but most) cases if you can't sit down and write out your app's current interface as presented to the user on a sheet of paper by hand as an XML tree, probably something in your interface has gone wrong or gotten too complicated and you ought to be rethinking the UX anyway.


But bypassing DOM is the main point when you have to draw and animate a lot of stuff quickly.

I say let the browsers use these low-level APIs for speeding up the DOM and CSS. But also allow people to bypass this layer.

Back in the day I migrated (more like recreated) a web RTS from DOM to canvas and getting rid of all that HTML and CSS was a massive relief. Deleted so much markup, styles and js all while improving performance massively.


I'm using WebGL for game development right now (and am somewhat looking forward to WebGPU) -- I'm not saying everything needs to be in the DOM. Just most things. It shouldn't really be the default for application frameworks to target Canvas (and multiple frameworks do treat it like the default rendering target). Like I said:

> There are a few places where this doesn't make sense, but if we're honest most application state even for native apps really ought to be represented in the form of an interactive document.

An RTS may be an exception to that because for most RTS games you can't sit down and describe what's going on in the game in a clear way using a pure-text tree or interactive document. But "exception" is the important word there. Most web apps aren't games, most web apps shouldn't be laying out complicated 2D/3D graphical scenes.


Just makes the browser the level that has to be up to date, which determines what hardware is still usable.

If Chromebooks are anything to go by, this will just end up making more hardware unusable for new software.



