>>I then pulled out the Google Translate app from my pocket, put it in airplane mode, and demonstrated how it translates the text under the camera directly on the image.
So I had a look at the Google Research blog post cited [1], and it turns out that Deep Learning is not used to do translation in the Google Translate app. It's only used to do optical character recognition (OCR). Note that's not handwritten digit recognition; I haven't seen any claims that the Google Translate app can do that, so it most probably can't, and can only recognise printed characters.
OCR is not something you absolutely need a deep network for. In fact, it's one of those cases where you really don't want to deploy such an expensive system, because there are far cheaper alternatives, like the logistic regression mentioned in the article.
The Google Research blog post, typically, doesn't say anything about how the actual translation is done, but it seems to me, from playing around with the app a bit, that it does word-for-word translation, possibly with some probabilistic heuristic to pick the most common/likely translation. That's very reasonable, given that the app has to run on possibly weak hardware, but it's also not as marketable as "Machine Translation with Deep Learning on the Google Translate App".
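For illustration, word-for-word translation with a most-likely-translation heuristic can be as simple as this sketch (the tiny phrase table and its counts are invented, purely for demonstration):

```python
# Illustrative sketch, not Google's actual implementation: word-for-word
# translation where each source word is replaced by its most frequently
# observed translation. The phrase table below is made up.
from collections import Counter

# Hypothetical counts of observed translations per source word
PHRASE_COUNTS = {
    "gato": Counter({"cat": 950, "kitty": 50}),
    "negro": Counter({"black": 900, "dark": 100}),
}

def translate_word(word):
    counts = PHRASE_COUNTS.get(word.lower())
    if counts is None:
        return word  # pass unknown words through unchanged
    return counts.most_common(1)[0][0]  # most common/likely translation

def translate(sentence):
    # No reordering, no grammar: strictly word for word
    return " ".join(translate_word(w) for w in sentence.split())

print(translate("gato negro"))  # -> cat black
```

Note the giveaway behaviour this predicts: no reordering across word boundaries, which is roughly what the app appears to do.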
So this is not a very good example of the superiority of Deep Learning. It's more a good example of the superiority of the Google hype machine.
I totally agree that it's a delightful engineering feat to get all that to run on a smartphone and I'm suitably impressed actually.
The hype I'm concerned about is the hype about Deep Learning. The article is careless with the technology details and makes it sound as if Google is running deep neural networks for machine translation on people's phones, which would be, at this time, bigger news than AlphaGo beating Lee Sedol.
Running a DNN on a phone isn't super hard. Training one is, but running is pretty efficient.
They definitely could run a DNN for translation on phones. For example the example seq2seq implementation in TensorFlow[1] would run on a phone fine, and that achieves close to state-of-the-art.
> For example the example seq2seq implementation in TensorFlow[1] would run on a phone fine, and that achieves close to state-of-the-art.
Lol, no it wouldn't, and it doesn't. Either you would have to strip it down so much to fit on a phone that it would lose performance (and it's far from state-of-the-art anyway, being an example), or it would not fit on your phone unless you removed everything else from it (and even then).
Just to put things into perspective: there are large swaths of the global population that do not even own smart phones. They need translation also, you know.
> no it wouldn't and it doesn't. Either you would have to strip it down too much to fit on a phone, so it would lose performance (and it's far from state-of-the-art anyway, being an example) or it would not fit on your phone, unless you removed everything else from it (and even then).
Yes, it is an example program. Nevertheless, to quote: "Even if you want to transform a sequence to a tree, for example to generate a parsing tree, the same model as above can give state-of-the-art results, as demonstrated in Vinyals & Kaiser et al., 2014"
Language modelling isn't the same as translation, but it is closely related. Seq2seq as an approach does achieve close to state-of-the-art for translation[1].
Why do you think it won't fit on your phone?
A related seq2seq model in TensorFlow comes in at less than 500kb[2]. The (much more complex) Image Recognition demo is designed to run on Android[3].
The new TensorFlow quantization approach[4] will reduce that size and increase performance even more.
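For the curious, the basic idea behind that kind of quantization is just storing float32 weights as uint8 plus a scale and offset, which cuts storage roughly 4x. A generic sketch of the technique (not TensorFlow's actual code):

```python
# Generic linear weight quantization: store float32 weights as uint8
# plus a scale and offset. This sketches the general idea, not
# TensorFlow's implementation.
import numpy as np

def quantize(w):
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    q = np.round((w - lo) / scale).astype(np.uint8)
    return q, lo, scale

def dequantize(q, lo, scale):
    return q.astype(np.float32) * scale + lo

w = np.random.randn(1000).astype(np.float32)
q, lo, scale = quantize(w)
w_restored = dequantize(q, lo, scale)
print(w.nbytes, q.nbytes)  # 4000 bytes down to 1000 bytes
# Reconstruction error is bounded by one quantization step
print(float(np.abs(w - w_restored).max()) <= scale)
```

The weights of a trained net tolerate this surprisingly well, which is why the accuracy hit is usually small.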
Indeed. I guess I thought you were saying that this tutorial example could achieve state of the art results. Apologies for the misunderstanding.
>> Why do you think it won't fit on your phone?
The question is really why do you think it will. You claim it does. You need to show that's true. Can you show me a trained machine translation system using deep neural networks running on a smartphone and having state of the art performance?
Until you do (or anyone does) I will remain skeptical. As I should and as should everyone. You can't expect to just claim state of the art performance and have everyone praise your achievement.
>> The (much more complex) Image Recognition demo is designed to run on Android[3].
That's a demo. I guess it might be state of the art in demos?
Regardless, here's a hint of how practical it is to deploy this demo on users' devices. This bit I quote from the demo's github:
The TensorFlow GraphDef that contains the model definition and weights is not packaged in the repo because of its size.
So maybe it will run on your phone but good luck sending that to your users. Unless you want them at your castle gates with torches and pitchforks.
>> The new TensorFlow quantization approach[4] will reduce that size and increase performance even more.
What do you mean by expensive? Smaller neural networks are computationally cheap to run forward propagation on, which is all you need to do after deployment. This model is running in real time on your phone, after all. I'm also confused by your suggestion to use logistic regression instead. The accuracy will certainly be worse - even if you use relatively state of the art non-convolutional features like a spatial pyramid with SIFT or HOG (the computation of these features would almost certainly be more expensive than running forward propagation for a reasonably sized neural network, in any case). Thoroughly confused over here.
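Rough arithmetic backs this up: a smallish fully connected net needs only a few hundred thousand multiply-adds per forward pass. The layer sizes below are hypothetical, picked only to show the count:

```python
# Back-of-the-envelope cost of one forward pass through a smallish
# fully connected network. Layer sizes are made up for illustration.
layers = [1024, 256, 64, 10]  # input -> two hidden layers -> classes

# Each layer is a matrix multiply: ~2*m*n multiply-adds, m*n+n parameters
flops = sum(2 * m * n for m, n in zip(layers, layers[1:]))
params = sum(m * n + n for m, n in zip(layers, layers[1:]))

print(flops)   # 558336 multiply-adds: trivial for a phone CPU
print(params)  # 279498 parameters: ~1.1 MB at float32
```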
Both bits you're confused about (how deep nets are expensive to run, and how logistic regression scales better, etc.) are mentioned in the article, and I use them in the same context.
EDIT: can't wait for the post timer to cool down, so please keep in mind I mean the submitted article, not the google research blog cited in that article.
I think you are overstating what the article says here. Google Translate lets you view an image through your phone's camera and it replaces - in real time - words in one language with the translated version in another. It's pretty amazing, and to get it to work the whole stack needs to be optimized. Part of that is making the network as efficient as possible, and one way to do that is to make it smaller.
> logistic regression scales better etc, are mentioned in the article
>> I'm a bit confused. Neither of the words "logistic" or "regression" are mentioned in the article you linked to
As noted above, logistic regression as a more scalable alternative to deep neural networks is mentioned in the parent article, not the link to the google research blog. The google research blog is linked to from the parent article as proof of the parent article's claim that deep nets do machine translation on your phone.
You do need deep learning to get state-of-the-art OCR when the image is not scanned from a book but is instead photographed text ("text in the wild") in unconstrained conditions.
It is a good example of the superiority of deep learning.
> Before I finalized my decision to leave, Patrick asked me to go talk to Sam Altman. He said Sam had a good outsider’s perspective, had seen lots of people in similar circumstances, and would probably have a good recommendation on what I should do.
> Within five minutes of talking to Sam, he told me I was definitely ready to leave. He said to let him know if he could be helpful in figuring out my next thing.
Actually, it was. (Structurally speaking.) Greg and I had had plenty of conversations about what he was looking for and I asked Sam to share his honest assessment with him. It's no good for Stripe to have a CTO who wants to be something else :-).
Best of luck. Do you have a hypothesis? More experienced ML folks don't seem to be on board [1] [2]. I haven't seen anyone posit how to build a machine that can set its own goals.
I'm sure you'll build and learn something cool nonetheless.
Another organization that tried to build AI was called Thinking Machines [3]. They were good at parallel programming. A few of them would later form "Ab Initio Software" [4]. Their motto is "from first principles," which Elon likes to say. Socrates did too.
My first job out of college trained us on that software. It's easy to use even if you're not a programmer, which was impressive for parallel software in 2004. I still get job emails for having it on my resume. The company is private and does not share its training resources. Boston mentality, I guess. Despite being so closed, they appeared to be quite successful in 2004 and have managed to stick around.
The question of how to create general AI is fascinating, but I don't think OpenAI is actually aiming for anything like a leap to full general intelligence at the moment. Rather, it seems like their goal is to follow the same incremental path mainstream AI is following, but without having to hide its methods.
"OpenAI is a non-profit artificial intelligence research company. Our goal is to advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return."
"In the short term, we're building on recent advances in AI research and working towards the next set of breakthroughs."
-- Which is to say, ordinary progress, not extraordinary progress.
That said, the quoted verbiage is vague and different from what Altman and Musk said in interviews [0].
> Altman: The organization is trying to develop a human positive AI
> Musk: There’s two schools of thought — do you want many AIs, or a small number of AIs? We think probably many is good. And to the degree that you can tie it to an extension of individual human will, that is also good.
OpenAI's about page is also different from what was written in the AI open letters signed by Musk and presumably other backers of OpenAI last year,
> Artificial Intelligence (AI) technology has reached a point where the deployment of such systems is — practically if not legally — feasible within years, not decades, and the stakes are high: autonomous weapons have been described as the third revolution in warfare, after gunpowder and nuclear arms. [1]
> The key question for humanity today is whether to start a global AI arms race or to prevent it from starting ... We therefore believe that a military AI arms race would not be beneficial for humanity [1]
Then, in another letter,
> We recommend expanded research aimed at ensuring that increasingly capable AI systems are robust and beneficial: our AI systems must do what we want them to do
It sounds like Musk either thinks we are approaching AI, or he is misusing the term AI. The AI letter signees and backers of OpenAI may benefit from more discussion with experienced AI researchers such as those I linked above. What Altman and Musk said about the founding goals of OpenAI is not grounded in reality.
> However, there was one problem that I could imagine happily working on for the rest of my life: moving humanity to safe human-level AI. It’s hard to imagine anything more amazing and positively impactful than successfully creating AI, so long as it’s done in a good way.
Would be interested to hear about his role at Open AI - I would have thought you'd need at least a PhD in Statistics/CS to be able to contribute to AI research, no?
In the earliest days, I was doing everything required to make sure everyone else could focus on research: recruiting, operations, spinning up the cluster, taking out the garbage; really whatever needed to be done.
There's a surprising amount of traditional engineering work needed to make AI systems happen. Over the past month or two, I've focused largely on launching OpenAI Gym: https://twitter.com/gdb/status/726099677716713472.
That being said, I'm planning on working on a research project next. As Ilya likes to say, deep learning is a shallow field — it's actually relatively easy to pick up and start making contributions.
Greg,
I don't know you. It is pretty incredible reading the blog post and what you have signed up to do. Kudos to Sama for the vision, and all the amazing people you were able to get together for the shared mission. Making the org non-profit is the right model for this. It is all the more surprising given that the whole VC model is to make profit. I am curious to see what you guys come up with. All the best.
I imagine a guy as talented as him (published papers on combinatorics in undergrad) would have no trouble learning all the things necessary to contribute to AI research.
It seems like significant credit is given to the backpropagation method. I see backpropagation as a core mechanism in how neural networks learn. Are there other ideas present (or that could be explored) for a different core mechanism for learning? The backpropagation algorithm makes sense, but I still feel like there are other core mechanisms through which natural intelligence has developed.
These natural mechanisms might not be applicable to artificial intelligence, but I think we could learn from them.
There is a theory by Hinton that the brain uses some variation of backprop for learning, and that it explains STDP.
However, backprop is difficult to use on recurrent signals, as opposed to feed-forward ones. I have been thinking about methods to train RNNs in real time, as opposed to saving every time step and iterating over it later.
It's a much more difficult problem because you lose the nice properties of backproping error signals over long time spans, and instead have to save any information that might be relevant in the future.
One idea to do that is autoencoders. They can be done recurrently, compressing the information from previous time steps into a single vector. But they don't work very well, or at least no one has figured out how to do it well. There's also HTM which is a totally different system inspired by models of the neocortex, but it also isn't as good as LSTM yet.
The core learning method in biological neurons is believed to be something like STDP (Spike-Timing-Dependent Plasticity). Basically, the arrival time of a spike at the post-synaptic end of a neuron is compared to the arrival time of a spike at the pre-synaptic end. The sign and scale of the difference of arrival times will cause a change in the respective synaptic strengths.
It has similarities to backprop, and depends on backpropagated signals but backprop is simpler, faster and can exploit knowledge about the loss function (as far as we know STDP cannot). The downside (?) to backprop is that it cannot exploit temporal information like STDP but several recurrent models have found ways around that.
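For concreteness, the classic exponential STDP window works something like this sketch (the constants are illustrative, not taken from any particular paper):

```python
# Sketch of the exponential STDP window: the sign and size of the weight
# change depend on the pre/post spike-time difference. Constants are
# illustrative only.
import math

A_PLUS, A_MINUS, TAU = 0.01, 0.012, 20.0  # TAU in milliseconds

def stdp_dw(t_pre, t_post):
    dt = t_post - t_pre
    if dt > 0:   # pre fires before post: potentiation (strengthen synapse)
        return A_PLUS * math.exp(-dt / TAU)
    if dt < 0:   # post fires before pre: depression (weaken synapse)
        return -A_MINUS * math.exp(dt / TAU)
    return 0.0

print(stdp_dw(10.0, 15.0) > 0)  # causal pairing strengthens the synapse
print(stdp_dw(15.0, 10.0) < 0)  # anti-causal pairing weakens it
```

Note there's no loss function anywhere in that rule, which is exactly the contrast with backprop drawn above.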
>Are there other ideas present (or that could be explored) for a different core mechanism for learning?
There is the predictive coding theory of neuron spiking (review paper here: http://www.ncbi.nlm.nih.gov/pubmed/23177956), which says that human neurons fire to signal surprise. The neural network is eventually, slowly, rewired so that inhibitory "predictive" signals successfully suppress excitatory "observation" signals, with a null action indicating that the brain's model of the world remains largely correct.
This is a very rough layperson's understanding, since I'm not a neuroscientist, so read the paper!
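As a toy rendering of that idea (my own illustration, not from the cited review): a unit only emits a signal when the observation deviates from the prediction, i.e. it fires to signal surprise:

```python
# Toy predictive-coding illustration: a unit fires only when the
# inhibitory prediction fails to suppress the excitatory observation.
# Tolerance and values are arbitrary.
def surprise_signal(prediction, observation, tolerance=0.1):
    error = observation - prediction
    if abs(error) > tolerance:
        return error  # prediction failed: propagate the surprise
    return 0.0        # prediction suppressed the observation: null action

print(surprise_signal(0.8, 0.8))  # model correct: null action
print(surprise_signal(0.2, 0.9))  # model wrong: error signal fires
```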
This is an illuminating definition of deep learning from the end of the article:
The goal of supervised deep learning is to solve almost any problem of the form “map X to Y”. X can include images, speech, or text, and Y can include categories or even sentences.
Interesting to think about the implications when such systems become commonplace.
One thing I've heard talked about is to use deep learning to re-learn arbitrary programs from their inputs and outputs... Then you can throw away the program and just run the neural network.
I'm sort of a skeptic, so I think that would just make the programs massively less efficient (usually), more incorrect, and harder to maintain, but I guess it's a research direction. Maybe programmers need to be put out of a job :)
It would be really fun to see a web app built on those principles! User clicks on /login/ link -> neural network {"hum ... I wonder what she's after ... let's send her a login form"} -> response.
Trained on millions of MechanicalTurkers ... optimizing the function "which user can you keep engaged the longest". :)
It's a bit more precise than that because of the examples given, but the interesting bit is that it does this with only general guidance from the programmer, on a set of X that is not previously known. Up to now, algorithms have encoded domain knowledge, but that's about to change if deep learning lives up to its promise.
Supervised learning is function approximation given examples of inputs and outputs, plain and simple. So yes, of course a function is anything that maps X to Y. Deep learning just does that with lots of computation steps.
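In its barest form that's nothing more than, say, a least-squares fit. This sketch (synthetic data) recovers a linear map purely from input/output examples; deep learning plays the same game with many nonlinear steps stacked:

```python
# Function approximation from examples in its simplest form: recover a
# linear map from (X, Y) pairs by least squares. Data is synthetic.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))        # example inputs
true_w = np.array([2.0, -1.0, 0.5])  # the mapping we pretend not to know
Y = X @ true_w                       # example outputs

w_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.allclose(w_hat, true_w))  # mapping recovered from examples alone
```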
>It’s hard to imagine anything more amazing and positively impactful than successfully creating AI, so long as it’s done in a good way.
Really, what? This is horseshit of the highest order. Why would you think this is true? The vast majority of automation tasks don't require advanced AI. The vast majority of human work can be removed without recourse to AI.
In addition, AI produces enormous problems, especially in the colossally power-imbalanced society we live in. It's overwhelmingly likely that an AI would be used first for surveillance and oppression; this is probably already happening with the most sophisticated intelligences we have available to us.
Finally, there is a superabundance of "intelligence" available to us already, in the form of human intelligence. These intelligences are already highly capable of understanding and solving problems that an AI would take generations to appreciate on human terms (if it ever could).
To me, the most positively impactful thing that we can do as human beings is to enable the full application of human intelligence to solving human problems, currently going to waste in most corners of the globe.
People in AI try to replicate the "power of the human brain" when there are billions of "human brains" that are surviving at a subsistence level or working in mind-numbing menial tasks.
I can't understand the cognitive dissonance that makes a person marvel at how "amazing the brain is" while simultaneously ignore the suffering of billions of those brains.
>I can't understand the cognitive dissonance that makes a person marvel at how "amazing the brain is" while simultaneously ignore the suffering of billions of those brains.
I can.
Humans are good at coming up with brilliant ideas (such as, say, the concept of translation). But they are absolutely poor at executing them effectively and "at scale" (translating arbitrary works from one language to another). So "AI advocates" can talk about how amazing the brain is in coming up with ideas (as if ideas were all that were needed), but what they really want are mechanical brutes that are able to execute those ideas quickly and effectively.
At least some people hope that the proliferation of AI labor could mean a reduction of human labor, potentially reducing the "suffering of billions of those brains". This, of course, hinges on whether the gains in productivity get redistributed fairly (or whether they just accumulate to those who already have capital). And then there's all the social turmoil that occurs during the transition phase: humans may not want to be obsolete, humans may actually like working, AI accidentally becoming an 'existential threat' due to human error, etc. The brains will suffer more in the short term, in the faint hope that they will suffer less in the future.
EDIT: "The vast majority of automation tasks don't require advanced AI. The vast majority of human work can be removed without recourse to AI."
I would argue that by the time we can automate those tasks and make that human work unnecessary, we already have AI. We may never reach the stage of Strong AI, because it may turn out that what we actually do is not intelligent enough to require Strong AI.
> We may never reach the stage of Strong AI because it turns out that what we actually do is not intelligent enough to require Strong AI.
That's interesting. If we find a rigorous enough definition of "Strong Intelligence", would humans necessarily qualify as one?
I'm not sure. That's why we need to come up with a good, rigorous definition that doesn't elevate humanity, but is instead an objective, reasonable definition of intelligence that humans can agree upon. I'm doubtful that we can ever find that definition, though.
Right now, humans consider intelligence to be "whatever machines haven't done yet" (Tesler's Theorem), but as machine capabilities increase, then there is a real possibility that humans may believe that intelligence doesn't exist at all (after all, if machines can do everything, and if machines are not intelligent, then nothing requires intelligence). [Source: https://plus.google.com/100656786406473859284/posts/Yp83aFwF...]
I do think that intelligence does actually exist and that current AI can already do intelligent things, but that the stuff that current AI can do won't match my vague understanding of the term "strong". If current trends continue indefinitely, then, of course, we won't ever have Strong AI, but we still have machines that do everything. At least, that's one possible way of thinking about intelligence.
But that's the thing, we don't have a good definition of intelligence at all (and I don't have one either) so we don't really know what's going on. We could invent Strong AI and never even recognize it, and maybe even dismiss it because it doesn't resemble what we think of as intelligence (much less "strong intelligence"). There's just so much that we don't know that talking about it is very difficult. AI is not just a field where you get to write pretty algorithms. It is also a philosophical field, and it is a shame that the philosophical and the practical aspects of AI are disconnected.
What I think is the crucial missing component is: how does your intelligent system define goals?
Right now goal-setting is something intelligences do not, and cannot, do for themselves. Humans must define the bounds of a problem carefully before a robot brain can perform useful work (some kind of numerical optimization).
The preliminary problem, then, is: how do humans define goals?
And the final problem: construct an intelligence that is able to efficiently set and achieve goals that are broadly in line with human goals.
I think this statement of the problems neatly sums up my difficulty with the notion of "strong AI", or "AGI", or Robot God or what-have-you and the possibility that it might be somehow useful in the world.
Because the way humans set goals, I think, is through vague heuristics that are represented as narratives carried by culture and society; we hold these narratives and pass them back and forth to each other, through various tongues and modes of fashion.
This means that human desire is the product of a constantly-shifting stream of socialization, which we are all drinking from and pissing into at once. The only meaningful way to accurately represent this, I think, is for engagement in it. You must participate in culture to "get it". Where this participation breaks down ("let them eat cake") we get strife.
Where does this leave the poor robot mind? It can only be "intelligent" in the way that we want when it can appreciate the horror of losing its daughter to a prison camp, when it can come to feel the memory of an inherited tragedy as both burr and serious weight. At this point we're just raising children again.
At any other point it's simply a dumb slave, doing exactly what we tell it - or a capricious, self-serving monster to be fought.
And since we don't know how humans decide their own goals (because knowing that would be a very revolutionary discovery that would immediately be used in a variety of other fields, including politics and advertising), we can never really establish a road map to building "strong AI"/"AGI"/"Robot Gods" (or even recognizing if we have built one by sheer accident). Clever. I like that.
There are probably ways to "cheat" your criteria though by having AI simulate the idea of discovering goals and acting on them, such as building a bot that searches Tweets on Twitter and then writing Tweets based on those Tweets it discovers. But these are "cheats" and won't be universally accepted. We could argue for instance that this bot really has a higher-level goal of finding new goals and carrying them out, and is only coming up with "lower-level" goals based on its initial "higher-level" goal. So, again, you're probably right. We don't know how to have AI create goals on its own...we can only give it to them.
I would say that "dumb slave[s]" or "capricious, self-serving monster[s]" are still threats to worry about, though. Just because robots do what we tell them to doesn't mean that they will do what we want them to. Bugs and edge cases can exist in any system, and the more complex the system, the more likely it is for those bugs and edge cases to slip by unnoticed. These bugs/edge cases could lead to pretty catastrophic results. Managing complexity when programming AI would be a good place for "AI advocates" to focus on.
"Very few people today would have the audacity to explicitly try building human-level AI."
Hmmm, there are a considerable number of people/groups who have this audacity, and have had for decades, and they have been explicitly trying, with much incremental success.
One thing such people speak about is ridding the space of outlandish buzz words/promotions that mask the true nature of how things function. This 'hype' creates barriers to contribution, learning, and progress.
Furthermore, the difficult efforts have been overshadowed by statistically mapped input/output flow models currently being called "A.I".
There is no mystery w.r.t how deep learning/etc work I.M.O.
You have inputs 'X'.
You take a known solution space 'Y' (supervised learning) or you create an arbitrary one (unsupervised learning) 'Z'.
You break apart the input space and map it to nodes in a graph.
You break apart the output space and map it to nodes in a graph.
Input flows are decomposed into minimal component parts and recomposed into higher orders of correlation. This is then compared (via flow restricted weighting) to increasing orders of the output space.
How does this magical 'piece apart and piece back together' process work during active flows? It works based on guided encoding of 'importance' weights on the partial information represented by individual nodes in the flow graph network. That is why under/over-fitting can occur if you have too many/too few nodes.
How are the weights codified? By encoding the partial derivative (partial contribution) a node has w.r.t. the accuracy of the solution: the error function (desired minus actual). Curve fitting.
It's essentially distributed brute-force statistical gradient descent, which is why you have to beat on it, tune it, anneal it, and cram hordes of data through it for it to yield anything of value. "Throw enough dirt and it will stick."
Frankly, there is nothing to understand ... NNs/etc are distributed optimizers guided by partial objective information. The resultant network/weights are a spaghetti jumble of 'whatever gets the right output out the other end'.
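The entire mechanism I'm describing fits in a few lines. Here is a minimal sketch (layer sizes, learning rate, and iteration count all arbitrary) of a one-hidden-layer network fitting XOR by backpropagated partial derivatives:

```python
# Minimal one-hidden-layer network trained by gradient descent on
# backpropagated error, fitting XOR. Everything here is illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(10000):
    # forward flow: inputs decomposed and recomposed toward the output
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    # backward flow: each weight receives its partial contribution
    # to the error (desired - actual), i.e. its partial derivative
    dy = (y - Y) * y * (1 - y)
    dh = (dy @ W2.T) * h * (1 - h)
    W2 -= h.T @ dy; b2 -= dy.sum(axis=0)
    W1 -= X.T @ dh; b1 -= dh.sum(axis=0)

print(np.mean((y - Y) ** 2))  # the error, driven down by brute iteration
```

Nothing in there knows what XOR means; the weights are whatever gets the right output out the other end.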
You're throwing a bunch of 'agents' at a solution space and having them gradually combine their results into a final solution. This was previously known as constraint optimization before it got the Silicon Valley treatment of buzzwords.
This is not A.I. I don't feel anyone who has a grain of integrity ever thought it was.
It's a very slimmed-down version of cortical algorithms, with lots of missing pieces, at best.
Strong A.I. is being developed far from such thinking and is a totally different animal. People who work in this space are necessarily guarded and not open, as there is a lack of appreciation, value, and funding for their 'audacious' efforts.
Of course, once more solid systems are developed, I'm sure you'll hear from them again in the form of 'blackbox' capability presentations.
Currently, the spotlight and money are being thrown at PhDs and big names, as no one has a clear understanding of what they're looking for, mainly because no one wants to spend the time/money on defining it. People are more interested in getting products/results out the door.
".... lets get the best minds, throw them in a room, throw money at them and hopefully a solution will come about"
Seems very similar to distributed brute forcing of a problem space with a made up objective function...."throw enough dirt and it will stick"
There is a lack of generalists being brought into these A.I. labs and efforts, as they are perceived to have little value. Yet it's the 'generality' and the 'fuzzy' stuff that underlie our very intelligence. From general to specific, or specific to general...
So, the industry wants to brute force this w/ money/PHDs/Buzzwords/industry names...
The more complex and disjoint a problem space is, the harder it becomes to brute force....
Time will tell. All roads eventually lead to Rome.
Though, some will take considerably longer.
>There is no mystery w.r.t how deep learning/etc work I.M.O.
Good, there shouldn't be. Being mysterious doesn't make something better, and simplicity is desirable.
>Frankly, there is nothing to understand ... NNs/etc are distributed optimizers guided by partial objective information. The resultant network/weights are a spaghetti jumble of 'whatever gets the right output out the other end'.
Basically yes. But that's not only incredibly effective, it's quite possibly how real brains work too. A lot of people do believe it is a path to AGI.
>You're throwing a bunch of 'agents' at a solution space and having them gradually combine their results to a final solution. This was previously known as constraint optimization before it got the silicon valley treatment of buzzwords :
That's not an accurate description at all, there are no "agents". In fact your whole description of NNs sounds off.
And backprop wasn't invented or named in silicon valley. In fact it's been around since the 80's. But whatever.
>Strong A.I is being developed far from such thinking and is a totally different animal. People who work in this space are necessarily guarded and un open as there is a lack of appreciation, value, and funding for their 'audacious' efforts.
Every "AGI" project is a bunch of pseudoscience. They have no idea how to build an intelligence. They have no idea how the brain works. They have no results to show with their algorithms, they aren't beating benchmarks. The theories are always vague and ad hoc and include a million special cases to make their systems do anything.
> Basically yes. But that's not only incredibly effective, it's quite possibly how real brains work too. A lot of people do believe it is a path to AGI.
It's only incredibly effective in a static world. The world is not static, nor is the human brain. Nor is the interplay between a human brain, the world, and A.I. It's a very disjoint, dynamic, and interdependent relationship, with far more complexities than could ever be represented in a statistical flow map, much less in the incomplete mathematics and statistics that underlie one. I have no doubt that people believe they can make a statistical map of the world. It won't be the first time people try to make an 'effective' one, nor will it be the last. The dynamics of the world will change, and the maps will become invalid, as they are in no way structured on a true understanding of what's going on. Nor is there any awareness of what's going on. Weren't the overly complex and flawed risk models that no one could explain the cause of the crash in 2007/2008? You think the dynamics of the world are less complex than that, or more? So I say to people subscribed to this provenly flawed thinking: good luck.
That's not an accurate description at all, there are no "agents". In fact your whole description of NNs sounds off.
And backprop wasn't invented or named in Silicon Valley. In fact, it's been around since the '80s.
> Agents/nodes... Tomato/tomato... They are partial-computation nodes receiving and instantiating fed-back partial derivatives based on computing an error between expected and actual. Where's the intelligence? Hindsight is 20/20... You're brute-forcing the partial elements that contribute to a desired answer by slamming a cheese grater (NN) in forward and reverse flow, hoping the important stuff sticks somewhere. Don't try to make it seem any more complex than that. Curve fitting at its finest. Constraint optimization. Gradient descent. Regression. Statistics, all packaged up with fancy buzzwords.
The backpropagation algorithm is used to find a local minimum of an error function. Flashback to grad school, where there was a laundry list of such methods in constraint-optimization courses. There's nothing special about it. What is special is the thinking behind it.
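To make "find a local minimum of an error function" concrete, here is gradient descent on a one-variable error function, in a few lines (a toy sketch; the function and learning rate are invented for illustration):

```python
# Minimise the error function E(w) = (w - 3)^2 by gradient descent.
# dE/dw = 2 * (w - 3), so each step moves w toward the minimum at w = 3.

def gradient_descent(w=0.0, lr=0.1, steps=100):
    for _ in range(steps):
        grad = 2 * (w - 3)   # derivative of the error function at w
        w -= lr * grad       # step downhill along the gradient
    return w

w = gradient_descent()
print(round(w, 4))  # converges close to 3.0
```

Backprop is the same idea applied to a network's weights, with the chain rule supplying each weight's partial derivative.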
You rightly stated that most of these methods were developed in the '70s and '80s. While people are off copying and instantiating the work of that time and relabeling it with buzzwords, little attention is being paid to the thinking that yielded those methods. That's what matters: the actual intelligence and thinking, not what pops out the other end.
With little focus/money being put into expanding on the thinking of that period, those focused on it are not going to get any further than their predecessors did. Even worse, you maintain no understanding as to why they did what they did. Which is why no one can tell you how NNs work. The intelligent people who defined them are dead.
Every "AGI" project is a bunch of pseudoscience. They have no idea how to build an intelligence. They have no idea how the brain works. They have no results to show with their algorithms, they aren't beating benchmarks. The theories are always vague and ad hoc and include a million special cases to make their systems do anything.
> People in the weak A.I space may have the lay person fooled by slapping buzzwords and A.I on everything. However, anyone who has spent any time doing grad work in this area before it took on fancy names knows better... It's distributed gradient descent. The objective function is broken down into partial forms and instantiated at the distributed computational points in the gradient-descent flow path defined by a NN. You slam it in forward and reverse, and eventually enough stuff gets jammed into the lines for future flows.
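Stripped of the buzzwords, that "forward and reverse flow" does fit in a screenful of numpy. A toy two-layer network trained on XOR, with the backward pass written out by hand (illustrative only; the layer size, learning rate, and iteration count are invented):

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR is not linearly separable, so one hidden layer is needed.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros((1, 8))  # input -> hidden
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros((1, 1))  # hidden -> output

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for _ in range(5000):
    # Forward flow: compute activations layer by layer.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Reverse flow: propagate the error's partial derivatives back.
    d_out = out - y                       # cross-entropy gradient at output
    d_h = (d_out @ W2.T) * h * (1 - h)    # chain rule through the hidden layer
    # Gradient-descent step on every weight and bias.
    W2 -= 0.5 * h.T @ d_out; b2 -= 0.5 * d_out.sum(0, keepdims=True)
    W1 -= 0.5 * X.T @ d_h;   b1 -= 0.5 * d_h.sum(0, keepdims=True)

print((out > 0.5).astype(int).ravel())  # should learn [0 1 1 0]
```

Whether you call that "distributed gradient descent" or "deep learning", the mechanics are exactly as the quote describes: partial derivatives flowing backward through the same graph the data flowed forward through.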
I recall genetic algorithms/evolutionary programming, which were supposed to be the keys to the future...
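For reference, the idea there was simple too: keep a population of candidate solutions, keep the fittest, mutate them, repeat. A minimal evolutionary-search sketch (toy fitness function, invented population size and mutation scale):

```python
import random

random.seed(0)

def fitness(x):
    # Toy objective: a single peak at x = 5.
    return -(x - 5) ** 2

# Start from a random population of candidate solutions.
population = [random.uniform(-10, 10) for _ in range(20)]

for _ in range(100):
    # Selection: keep the fittest half of the population.
    population.sort(key=fitness, reverse=True)
    survivors = population[:10]
    # Mutation: each survivor spawns a slightly perturbed child.
    children = [x + random.gauss(0, 0.5) for x in survivors]
    population = survivors + children

best = max(population, key=fitness)
print(round(best, 2))  # close to 5.0
```

Like backprop, it is an optimization loop at heart; the difference is that it needs no gradient, only a fitness score.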
So, Strong A.I ... AGI. I'm thinking those who have the best shot at it are people who know how NNs work on down to the mathematics and statistics, theory, philosophy and pseudoscience. Given this understanding, they have the ability to formulate new math/statistics/computational models and frankly whatever else it takes to represent a true form of intelligence.
I imagine they are working hard at doing that very thing in the shadows while others busy themselves trying to beat cooked benchmarks and fight over coin and the spotlight.
So, you go down your path and they will seemingly go down their path... But don't for a second think they don't understand exactly where your path is likely to lead you.. Many of them have gone down it and found nothing of value at the end. I guess the new crop of individuals who have no understanding of the thinking behind these algorithms they're copying/instantiating have to take this journey for themselves.
Call AGI a bunch of pseudoscience and foolishness. I have no doubt a good number of people will be praising and following it like religious zealots, much like the work of those 'crazies' from the '70s/'80s that everyone laughed at but now can't wait to copy, relabel, and call their own.
>It's only incredibly effective in a static world. The world is not static nor is the human brain
Neural nets aren't static. And yes, they aren't great at online learning yet, but they are better than anything else, and there is research into improving that.
>Where's the intelligence?
I'm not claiming a purely feed-forward NN is intelligent on its own. But I do believe it could be extended and built upon to create one.
And just because an algorithm is simple does not mean it's not intelligent. There is zero proof that intelligence requires complex algorithms. It's just that all the simple ones we've tried haven't worked, yet.
>You're brute-forcing the partial elements that contribute to a desired answer by slamming a cheese grater (NN) in forward and reverse flow, hoping the important stuff sticks somewhere. Don't try to make it seem any more complex than that. Curve fitting at its finest. Constraint optimization. Gradient descent. Regression. Statistics, all packaged up with fancy buzzwords.
Yes and it's super effective. What's your problem? Many, many intelligent people have tried to come up with more effective algorithms. Besides minor tweaks and variations, nothing has done better. But by all means, invent one yourself if you can.
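And "curve fitting" it is: linear regression fitted by gradient descent is the same machinery as a neural net, minus the hidden layers (a toy sketch on synthetic data, with invented parameters):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data drawn from y = 2x + 1 plus a little noise.
x = rng.uniform(-1, 1, 100)
y = 2 * x + 1 + rng.normal(0, 0.1, 100)

w, b = 0.0, 0.0
for _ in range(2000):
    pred = w * x + b
    err = pred - y
    # Gradients of the mean squared error with respect to w and b.
    w -= 0.1 * 2 * (err * x).mean()
    b -= 0.1 * 2 * err.mean()

print(round(w, 1), round(b, 1))  # recovers roughly 2.0 and 1.0
```

The dispute in this thread isn't really about the mechanics, which both sides agree on; it's about whether this machinery, scaled up, amounts to anything more than fitting curves.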
>Which is why no one can tell you how NN works. The intelligent people who defined them are dead.
Almost anyone can tell you how an NN works these days. And that's simply not true, many of the early researchers in NNs are now very respected and run their own labs. They are far from dead, they are publishing more research than ever.
>So, Strong A.I ... AGI. I'm thinking those who have the best shot at it are people who know how NNs work on down to the mathematics and statistics, theory, philosophy and pseudoscience. Given this understanding, they have the ability to formulate new math/statistics/computational models and frankly whatever else it takes to represent a true form of intelligence.
Oh I don't disagree. And I'm very familiar with how NNs work, I've even written code for them from scratch. And I don't believe AGI will be just a big regular NN, there need to be more insights into how intelligence works. But I believe NNs will be a big part of it.
It isn't mere intelligence that we seek but _human_ intelligence. NN research will more likely than not culminate in something akin to, say, dog or horse intelligence, or features of intelligence shared by all species, rather than the (desired) human intelligence.
I see nothing in NN research that is quintessentially human (although there may be circuitry unique to humans that has not yet been revealed; this will most likely be uncovered by brain science rather than NN research, IMO), and so I believe NNs are not the right level of approach to AI.
The author recommends Nielsen's book as an entry guide to the field. Do you have additional recommendations?
> NNs/etc are distributed optimizers guided by partial objective information.
Sounds like NNs are a special case of something more general. I am interested in the field that studies these concepts from first principles and in a more rigorous and general way. What's that field? Thanks.
It depends on what your aim is. If you want to jump into the field, follow what the field is doing and/or suggesting.
If you want to go down the rabbit hole, I'd suggest Jeff Hawkins's book On Intelligence.
The essential takeaway will be that the current focus is on cortical algorithms representing the neocortex. Even then, the current models are only partially inspired by the biology.
From there, something might stand out to you.
Caution : This is the harder and unpaved road.
Studies in :
Neuroscience.
Neurobiology.
Computational Neuroscience.
Information theory.
For the conversion work to computational land: a creative mind and a broad array of knowledge and experience in software engineering. It might take years, but you'll maintain a depth of understanding and a much greater capability of tackling AGI.
On AGI efforts, I personally can't imagine how one can claim to be on a path to AGI by instantiating an artificial form of it while having not even a basic understanding of how the biological form functions or how far off one's models are from the essential parts that make it tick.
I guess to some people, it's cortex all the way down.
Lots of theoretical models and philosophical and metaphysical thinking... The scientific method... Actual in-depth studies centered on the system you're trying to understand and duplicate in another form. You know, everything but the foolishness employers and VC firms look for in people.
How all the greats did it :
> Einstein
> Von Neumann
> Alan Turing
> etc
If you have a proven coding background, you're off and running in the production lab. The problem with the layers of hype, academic jargon, overly complex white papers, and hand-waving is that it makes people believe this is unapproachable outside the narrow scope of thinking that everyone is currently subscribed to. Scopes change with time, and the people who tend to widen them are often those who think outside the narrow box everyone else is set upon. This is what Jobs meant by 'think different'.
Or you can attempt to brute-force it with statistical models, PhDs, computing power, and truckloads of data, hoping something miraculously emerges.
So, what type of thinking is used to develop strong A.I?
Strong thinking... something that most people and the industry aren't set upon, which is why it almost always takes an outsider to usher in such new paradigms.
As is the history of the Googles of the world ...
Which is why I say you are more likely to hear about strong A.I once someone has developed it in the dark. It isn't going to be a 'thing' until it is a thing, for people don't know how to recognize, value, or back undefined things until someone goes out of their way to make them a reality.
Isn't deep learning the correct approach, though? I mean, we are trying to emulate biological neural networks, which are fundamental to intelligence. Do you think that the artificial neural networks we are using right now are not rigorous enough?
> No. It's a piece of a much larger puzzle, and only a partial piece at that. An overfit piece that people are over-applying. This is why things are overly complex and filled with statistics...
I mean we are trying to emulate biological neural networks which is fundamental to intelligence.
> There is far more functional complexity in the underlying biology. This is why studying neurobiology/neuroscience has value, as opposed to resorting to ever more complex statistical models that no one understands.
Do you think that the artificial neural networks we are using right now are not rigorous enough?
> Of course not. Out of all the amazing people centered on it, no one can say why or how it works. Is it magic? lol... That should tell you something and trigger a red flag. I can state why it works and already have. It's just not something that's convenient, and it would necessarily cause one to admit that it's not the broad, general answer we're looking for...
So, for some time, those centered on this paradigm are probably going to build out wildly elaborate NNs. They will require boatloads of data and computational power and achieve great outputs. Coincidentally, this fits nicely into the cloud-computing model that the big tech titans maintain.
Somewhere down the line, more solid and thought through computational models inspired by actual understanding will come out that will shake the very foundation of said approaches and so will go another page in tech history.
NNs are biologically inspired. With all the fanfare surrounding them, you may never have stopped to question how inspired.
Drop Paul King or Paul Bush a line on Quora, or dig through some of their posts.
It's better to talk w/ a Neuroscientist/Computational Neuroscientist about this stuff IMO.
I'm also interested in getting into AI, as it is a dream of mine to work on a problem of great importance. I'm not sure where to start, though. I have seen many neural network posts on HN, but they end up taking me to some GitHub repo full of code with no documentation at all. Is there some resource that starts from zero and codes up a learning algorithm? Currently I'm reading through some Judea Pearl and E. T. Jaynes books because I'm fascinated by the theory, but none of this shows me how to get hands-on coding up an algorithm. If anyone here could recommend something to help get this newbie's foot in the door, I'd be grateful.
> Mapping images to categories, speech to text, text to categories, go boards to good moves, and the like, is extremely useful, and cannot be done as well with other methods.
A Hinton-inspired speculation about a robot that plans & communicates.
I posit that general A.I. must be embodied to develop, it must learn from the real world by interacting with it and learn by trial and error.
Deep learning is a huge leap for AI. It learns best from raw data. Babies learn motor control & input interpretation first[1]. Moravec's paradox proposes that learning these basics is the hard foundation that higher levels of reason build on.
Abbeel and Levine's work[2], where robots learn robust functions to map from pixels to motor torques for domestic tasks, shows that solutions built on features learnt from raw data are more resilient than hand-engineered ones. These trajectories through motor-torque space toward a desired goal are intriguingly like planning.
Mikolov's[3] Word2vec & Radford's image vectors[4] produce semantic spaces where vector math is akin to reasoning. Higher-level control could be achieved through vector algebras deriving actions and things directly from the robot's internal vector space of motor torques and raw pixels.
Thus internal vectors are like thoughts[5], which can interpret, reason, plan, and control.
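The vector-arithmetic claim can be made concrete: in a Word2vec-style space, king - man + woman lands nearest to queen. A sketch with tiny hand-made vectors (real embeddings are learned from corpora and have hundreds of dimensions; these 3-d values are invented purely for illustration):

```python
import numpy as np

# Hypothetical 3-d embeddings, invented for the example.
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "apple": np.array([0.5, 0.5, 0.5]),
}

def nearest(vec, exclude):
    # Return the vocabulary word whose embedding has the highest
    # cosine similarity to vec, excluding the query words themselves.
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max((w for w in emb if w not in exclude),
               key=lambda w: cos(emb[w], vec))

analogy = emb["king"] - emb["man"] + emb["woman"]
print(nearest(analogy, exclude={"king", "man", "woman"}))  # queen
```

The speculation above amounts to doing this kind of algebra not over word vectors but over vectors of motor torques and pixels.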
Karpathy shows internal vectors can bridge modalities from words to pictures[6]. If the higher level control vectors of actions and things could translate to the modality of language as verbs and nouns, perhaps the robot could discuss its plans, and receive orders.
So I had a look at the Google research blog cited [1], and it turns out that Deep Learning is not used to do translation in the Google Translate app. It's only used for optical character recognition (OCR). Note that that's not handwritten digit recognition; I haven't seen any claims that the Google Translate app can do that, so it most probably can't, and can only recognise printed characters.
OCR is not something you absolutely need a deep network for; in fact, it's one of those cases where you really don't want to deploy such an expensive system, because there are far cheaper alternatives, like the logistic regression mentioned in the article.
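To make the "far cheaper alternative" concrete: a plain logistic regression already does respectably on printed-digit-style data. A sketch using scikit-learn's bundled 8x8 digits dataset (assumes scikit-learn is installed; the accuracy is indicative, not a benchmark claim about any production OCR system):

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# 8x8 grayscale digit images, flattened to 64 features each.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# A linear model: no deep network in sight.
clf = LogisticRegression(max_iter=5000)
clf.fit(X_train, y_train)

print(f"test accuracy: {clf.score(X_test, y_test):.2f}")
```

A deep net can beat this, but on constrained character sets the gap is small relative to the cost of running the deep model on a phone.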
The Google research blog doesn't say anything about how the actual translation is done, but it seems to me, from playing around with the app a bit, that it does word-for-word translation, possibly with some probabilistic heuristic to pick the most common/likely such translation. That's very reasonable, given the app has to run on possibly weak hardware, but it's also not as marketable as "Machine Translation with Deep Learning on Google Translate App".
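A word-for-word scheme of that sort is little more than a dictionary lookup that picks the most probable translation per word. A toy sketch (the lexicon and probabilities are entirely invented; what Google actually ships is undocumented, as noted above):

```python
# Each source word maps to candidate translations with invented
# probabilities; we greedily pick the most likely one per word.
lexicon = {
    "le":   {"the": 0.9, "it": 0.1},
    "chat": {"cat": 0.8, "chat": 0.2},
    "noir": {"black": 0.95, "dark": 0.05},
}

def translate(sentence):
    out = []
    for word in sentence.split():
        candidates = lexicon.get(word, {word: 1.0})  # unknown words pass through
        out.append(max(candidates, key=candidates.get))
    return " ".join(out)

print(translate("le chat noir"))  # "the cat black"
```

Note the giveaway: the output preserves source word order ("the cat black" rather than "the black cat"), which is exactly the kind of artifact you'd probe for when testing whether an app translates word for word.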
So this is not a very good example of the superiority of Deep Learning. It's more a good example of the superiority of the Google hype machine.
[1] http://googleresearch.blogspot.co.uk/2015/07/how-google-tran...