> I'm thinking why not train a LSTM to take in http requests and generate the http response?
Why? With responses generated according to what? Are you really just suggesting using neural networks in the compiler's optimiser?
> Then try using a smaller network until something like your registration flow, or a simple content management system was just a bunch of floating point numbers in some matrices saved off to disk.
Why? What's the advantage over just building software?
I'm suggesting that you take an existing system and build up a corpus of request/response pairs. Then you use the LSTM to build a prediction model so that, given a request, it will tell you that the current production system will produce the following SQL statement and this HTTP response. Once the LSTM's output is indistinguishable from your current production system for all use cases, you replace the production system with the LSTM and a thin layer that can listen on the port, encode/decode the data, and issue SQL queries.
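To make the "indistinguishable for all use cases" test concrete, here is a rough sketch of what the corpus and the acceptance check could look like. Everything in it is made up for illustration: `Example`, `exact_match_rate`, and `stale_predictor` are hypothetical names, and the predictor is a hand-written stand-in, not a trained LSTM.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Example:
    """One recorded observation of the production system (hypothetical schema)."""
    request: str   # raw HTTP request text
    sql: str       # SQL statement the production system issued
    response: str  # HTTP response the production system returned


# A predictor maps a request to the (sql, response) pair it expects.
Predictor = Callable[[str], Tuple[str, str]]


def exact_match_rate(predict: Predictor, corpus: List[Example]) -> float:
    """Fraction of recorded requests for which the prediction matches exactly."""
    if not corpus:
        return 0.0
    hits = sum(1 for ex in corpus if predict(ex.request) == (ex.sql, ex.response))
    return hits / len(corpus)


corpus = [
    Example("GET /user?id=1", "SELECT name FROM users WHERE id = 1", "200 OK alice"),
    Example("GET /user?id=2", "SELECT name FROM users WHERE id = 2", "200 OK bob"),
]


def stale_predictor(request: str) -> Tuple[str, str]:
    """Stand-in for the model: always answers as if the request were for user 1."""
    return ("SELECT name FROM users WHERE id = 1", "200 OK alice")


rate = exact_match_rate(stale_predictor, corpus)
```

Under this framing you would only swap out production once the rate hits 1.0 on a corpus you trust to cover every use case, which is of course the hard part.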
Why would I want to do this? I'm not 100% sure ... I think it would be super fast once you got it working. I think it would avoid many security bugs. You wouldn't have to read "oh, Drupal 3.x has 20 new security bugs, better go patch our code." I think when I had this idea I was thinking about it in terms of a parallel system that could catch hacking by noting when actual HTTP responses diverged too much from the predicted response. The main idea being that for a given input the output really is 100% predictable, assuming your app doesn't use random numbers like in a game or something.
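The parallel "catch hacking by divergence" variant could be sketched roughly like this. It assumes you already have a predicted response from somewhere; the similarity measure (`difflib.SequenceMatcher`) and the 0.8 threshold are arbitrary choices of mine for illustration, and `similarity` and `is_suspicious` are hypothetical names.

```python
from difflib import SequenceMatcher


def similarity(predicted: str, actual: str) -> float:
    """Return a 0.0 to 1.0 similarity score between two response bodies."""
    return SequenceMatcher(None, predicted, actual).ratio()


def is_suspicious(predicted: str, actual: str, threshold: float = 0.8) -> bool:
    """Flag a response that diverges too much from the prediction.

    The threshold is an assumption; a real deployment would have to tune it
    against normal traffic to keep false alarms down.
    """
    return similarity(predicted, actual) < threshold


# Identical output is never flagged; completely different output always is.
normal = is_suspicious("200 OK welcome back", "200 OK welcome back")
attack = is_suspicious("aaaa", "bbbb")
```

The nice property here is that the predictor never has to be perfect, only good enough that real divergence stands out from its usual error.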
To link this idea to the article, I think things like XML parsers could be written this way .... I can't prove it but I suspect that they would be very fast and not come with all the baggage that the article complains about.
Since you seem sincere, I’d like to mention that neural networks (as of now) are a complete clusterfuck for a problem with as much structure as you’re describing. There are no known ways to impose the relevant constraints/invariants on neural network behavior and stop them from producing junk, let alone doing something useful. That Karpathy article is pure hype with very little substance (like most commentary on neural networks). I like the vision, but it might take a minimum of twenty to fifty years to get there.
If you consider yourself a world-leading expert on neural networks and have some secret sauce in mind, by all means, good luck... otherwise it sounds like a fool's errand.
Thanks for the feedback. I don't consider myself a world-leading expert on neural networks by any means.
I do want to point out that I'm thinking of doing this on a very limited website, not a general purpose thing that replicates any possible website. When I imagine the complexity of a modest CMS or an online mortgage calculator I think that it is much less complex than translating human languages. The fact that web code has to be so much more precise than human language actually makes the task easier. But to be fair, I'm all talk at this point with no code to show for it. So I will keep these comments in mind; this thread has been useful for helping me think through some of this stuff.
Mutable state in the sense of database writes would be part of the network's output and just passed on to a regular db. Mutable state in the sense of variables that the application code uses while processing a request? Well LSTM networks can track state like that.
For session based variables? Not sure. Either it all becomes stateless and the code has to read everything from storage for each request ... or maybe the LSTM is able to model something like an entire user session and remember the stuff that the original app would have put in the session.
That Andrej Karpathy article that I linked to two comments above ... he pointed out, in a different blog post, that regular neural networks can approximate any pure function. Recurrent neural networks like the LSTM can approximate any computer program. It is because they can propagate state from step to step that allows them to do this.
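A toy way to see the "propagate state from step to step" point: the sketch below is a single linear recurrent unit with hand-picked weights, not an LSTM and not trained. The names `recurrent_step` and `run` are mine. With both weights set to 1.0 the carried state ends up counting how many 1s have been seen, which a function that only looks at the current input cannot do.

```python
from typing import Iterable


def recurrent_step(state: float, x: float,
                   w_state: float = 1.0, w_in: float = 1.0) -> float:
    """One recurrent update: the new state depends on the old state and the input."""
    return w_state * state + w_in * x


def run(inputs: Iterable[float], state: float = 0.0) -> float:
    """Feed a sequence through the unit, carrying state between steps."""
    for x in inputs:
        state = recurrent_step(state, x)
    return state


# The final state reflects the whole sequence, not just the last input.
total = run([1, 0, 1, 1])
```

An LSTM adds learned gates that decide what to keep and what to forget, but the basic loop of feeding the previous state back in is the same.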
As far as it being 100X slower, well at a certain point I will be willing to take your money :)
> The main idea being that for a given input the output really is 100% predictable [..] I think it would be super fast once you got it working.
I imagine it would be fast, then you realise you've made a static content caching layer out of a neural network and replace it with Varnish cache and it would be hyper fast.
I don't think a caching layer would work. One example would be an online mortgage estimator. You input the loan amount, interest rate, length of loan etc. all as http input parameters. I'm suggesting that the LSTM can eventually figure out that those variables are being used by the application code to go in to a formula. That application code and its formula would all be replaced by the LSTM.
I just don't know how you can achieve that with static cache ... only if somebody else requested that exact mortgage calculation before and it is still in the cache.
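Here is a small sketch of the difference, using the standard fixed-rate amortization formula as a stand-in for whatever the real application computes (I'm assuming monthly compounding; `monthly_payment` and `cached_payment` are names I made up). The cache can only answer inputs it has already seen, while the formula answers any input.

```python
from typing import Dict, Tuple


def monthly_payment(principal: float, annual_rate: float, years: int) -> float:
    """Standard fixed-rate payment: P * r * (1 + r)^n / ((1 + r)^n - 1)."""
    n = years * 12          # number of monthly payments
    r = annual_rate / 12    # monthly interest rate
    if r == 0:
        return principal / n  # zero-interest loan: just split the principal
    growth = (1 + r) ** n
    return principal * r * growth / (growth - 1)


# A static cache keyed on the exact request parameters.
cache: Dict[Tuple[float, float, int], float] = {}


def cached_payment(principal: float, annual_rate: float,
                   years: int) -> Tuple[float, bool]:
    """Return (payment, was_cache_hit). A miss falls through to the formula."""
    key = (principal, annual_rate, years)
    hit = key in cache
    if not hit:
        cache[key] = monthly_payment(principal, annual_rate, years)
    return cache[key], hit
```

Calling `cached_payment(200000, 0.06, 30)` twice hits the cache the second time, but `cached_payment(200001, 0.06, 30)` misses again, so something behind the cache still has to know the formula.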
Also, my idea of the "given input" from the earlier comment would have to include results of sql queries that would form the entire input to the LSTM.
But honestly I think overtrained autoencoders can be used as hash maps. That would be an application more in line with what I think you are saying.
Seems that either the NN memoizes all the inputs and outputs until the function is totally mapped - then functions as a memoized lookup table, or the NN has discerned what the mortgage calculation is, and is doing exactly the calculation your {Python} backend does, but migrated into an NN middleware layer instead, which sounds like it would be slower.
And then you're hoping that the NN would act like a JIT compiler/optimiser and run the same code faster. But if it was possible to process (compile? transpile? JIT compile?) the Python code to run faster, then writing a tool to do that sounds easier than writing an AI which contains such a tool within it.
So there's a handwave step where the AI develops its own innate Python-subset optimiser, without anyone having to know how to write such a thing, which would be awesome indeed .. is that possible?