> I am an AI skeptic. I am baffled by anyone who isn’t. I don’t see any path from continuous improvements to the (admittedly impressive) ‘machine learning’ field that leads to a general AI

- I share the skepticism towards any progress towards 'general AI' - I don't think that we're remotely close or even on the right path in any way.

- That doesn't make me a skeptic towards the current state of machine learning though. ML doesn't need to lead to general AI. It's already useful in its current forms. That's good enough. It doesn't need to solve all of humanity's problems to be a great tool.

I think it's important to make this distinction and for some reason it's left implicit or it's purposefully omitted from the article.



Yeah I agree - during undergrad, I spent a few years studying neuroscience, and I was very let down by my first ML/AI course. Compared to what I had learned about the brain, what we called an "ANN" just seemed like such a silly toy.

The more you learn about neurobiology, the more apparent it is that there are so many levels of computation going on - everything from dendritic structure, to cellular metabolism, to epigenetics has an effect on information processing. The idea that we could reach some approximation of "general intelligence" by just scaling up some very large matrix operations just seemed like a complete joke.

However, as you say, that doesn't mean what we've done in ML is not worthwhile and interesting. We might have over-reached thinking ML is ready to drive a car without major forthcoming advancements, but use-cases like style transfer and DLSS 2 are downright magical. Even if we just made marginal improvements in current ML, I'm sure there is a ton of untapped potential in terms of applying this tech to novel use-cases.


I'm not sure I buy that - biology is often messier because of nature related constraints, it gets optimized for other things (energy, head size, etc.)

The way a plane flies is quite different than the way a bird flies in complexity - they share an underlying mechanism, but planes don't need to flap wings.

It's possible that scaling up does lead to generality and we've seen hints of that.

- https://deepmind.com/blog/article/generally-capable-agents-e...

Also check out GPT-3’s performance on arithmetic tasks in the original paper (https://arxiv.org/abs/2005.14165)

Pages: 21-23, 63

That shows some generality: the best way to accurately predict an arithmetic answer is to deduce how the mathematical rules work. The paper shows some evidence of that, and that’s just from a relatively dumb predict-what-comes-next model.

It’s hard to predict timelines for this kind of thing, and people are notoriously bad at it. Few would have predicted the results we’re seeing today in 2010. What would you expect to see in the years leading up to AGI? Does what we’re seeing look like failure?


> It’s hard to predict timelines for this kind of thing, and people are notoriously bad at it. Few would have predicted the results we’re seeing today in 2010. What would you expect to see in the years leading up to AGI? Does what we’re seeing look like failure?

Few have predicted a reasonably-capable text-writing engine or automatic video face replacement, but many have predicted self-driving cars would have been readily available to consumers by now and semi-intelligent helper-robots being around.

Just because unforeseen advancements have been made, does not mean that foreseen advancements come true.


People tend to predict simple technological substitutes for human tasks rather than novel things. I suspect we won't get artificial humans because we'll end up not actually wanting that and getting something better instead. Just like we got cars instead of artificial horses.


AGI isn't really about artificial humans.

It's about very good general problem solving software that's way beyond the capabilities of humans while not being aligned with human interests. Not because the software is evil, but because aligning values is an unsolved problem (humans aren't even totally aligned - and values also change).

If you have an intelligence that's very good at achieving its goal and you don't have a good way to align its goal with human goals, you can very quickly get into trouble if that intelligence thinks much faster than you do.

https://www.lesswrong.com/posts/mMBTPTjRbsrqbSkZE/sorting-pe...

https://www.lesswrong.com/posts/4ARaTpNX62uaL86j6/the-hidden...

https://www.lesswrong.com/posts/BEtzRE2M5m9YEAQpX/there-s-no...

https://www.lesswrong.com/posts/5wMcKNAwB6X4mp9og/that-alien...


We used to think that machines would be bad at arithmetic and pure logical reasoning, and good at the more primitive animalistic ones, but it turns out the latter is a much harder problem.

Also, self-driving cars were mostly hyped up by companies, FSD is quite obviously a hard problem, much closer to general intelligence than the average NN application.


When did we think that?


Automatic video face replacement always seemed an obvious application to me. I'm more surprised that the tools for it are still so rough. I guess we can thank social taboos for that.

When I was a kid, I remember wondering how Soviets were obsessed with faking photos. A few years later, I saw Terminator 2 and realized that faking videos was also a thing. The tools for it would clearly get better and better over time. When I studied ML in the early 2000s, it seemed obvious that pattern recognition tasks such as image manipulation would be "easy" for computers, once we found the right approach and made the ML systems big enough. In the end, I decided not to pursue ML, because jobs were still scarce and I found discrete problems more interesting. That was probably the worst career mistake I've ever made.


Self-driving cars are available to consumers now. Search for FSD on YouTube and see all the consumers using their self-driving cars.

Or, watch the latest Veritasium

https://www.youtube.com/watch?v=yjztvddhZmI


And yet all of them force you to keep your eyes on the road at all times or you can die. Can you honestly call that FSD?


#1 it's still beta. The point is to show the progress is real

#2 see linked video, he sits in the backseat, there is no driver and no one to take control.


If you can't go to sleep, the car is not fully self-driving


It’s as much FSD as my robot vacuum not hitting the wall…

These are just overhyped drive assist tools that market themselves immorally as something they aren’t.


The quirky thing to remember about GPT-3 is that it really is just a giant autocomplete based on the internet. It can do math insofar as it’s memorized some text which did that math with slightly different verbiage etc.

If you ask it to compute something that would never have been seen on the internet it’s likely to fail. E.g. add 2 extremely large/rare numbers together


As another poster indicates, that's specifically not the case. Most possible arithmetic problems of reasonable size aren't anywhere in the dataset, but it can solve them with pretty good accuracy.


You should read the excerpts in the paper I link to which suggest otherwise (that it’s not memorized and that it’s deducing rules).


A good example of those constraints is there are hard upper limits to heat and energy use by a brain that are simply outdone by, say, a massive supercomputer.

A rough calculation: humans can feasibly consume 4-5 GJ/annum of energy, of which a lot is going to go into motion or whatever. And even if it were all devoted to mental activity, a human has a shelf life of ~70 years before they die.

A distributed computer might theoretically burn TJ/hr and once the weights are known they may as well be in a permanent record. The upper limits of what computers can learn and get good at are much higher than what humans can. They won't need as much implementation trickery as biology to get results.


The converse is also true: a supercomputer built using current tech that could do everything a human can, could neither fit in the volume of a human skull nor use as little energy as a human brain.


> The way a plane flies is quite different than the way a bird flies in complexity

And a plane is a vastly simpler machine than a bird!


I've heard this airplane argument before, and while I do consider it plausible that AGI might be achievable with some system which is fundamentally much different than the human brain, I still don't think it can be achieved using simple scaling and optimization of the techniques in use today.

I think this for a couple reasons:

1. The current gap in complexity is so huge. Nodes in an ANN roughly correspond to neurons, and the brain has somewhere on the order of 100 billion of them.

Even if we built an ANN that big, we would only be scratching the surface of the complexity we have in the brain. Each synapse is basically an information processing unit, with behavioral characteristics much more complicated than a simple weight function.

2. The brain is highly specific. The structure and function of the auditory cortex is totally different to that of the motor cortices, to that of the hypothalamus and so on. Some brain regions depend heavily on things like spike timing and ordering to perform their functions. Different brain regions use different mechanisms of plasticity in order to learn.

Currently most ANN's we have are vaguely inspired by the visual cortex (which is probably why a lot of the most interesting things to come out of ML so far have been related to image processing) and use something roughly analogous to net firing frequency for signal processing. I would consider it highly likely that our current ANNs are just structurally incapable of performing some of the types of computation we would consider intrinsically linked to what we think of as general intelligence.

To make the airplane analogy, I believe we're probably closer to Leonardo da Vinci's early sketches of flying machines than we are to the Wright Brothers. We might have the basic idea, but I would wager we're still missing some of the key insights required to get AGI off the ground.

edit: it looks like you added some lines while I was typing, so to respond to your last points:

> it’s hard to predict timelines for this kind of thing, and people are notoriously bad at it. Few would have predicted the results we’re seeing today in 2010. What would you expect to see in the years leading up to AGI? Does what we’re seeing look like failure?

I totally agree that it's hard to predict, that technology usually advances faster than we expect, and that tremendous progress is being made. But the road to understanding human intelligence has been characterized by a series of periods of premature optimism followed by setbacks. For instance, in the 20th century, when dyes were getting better, and we were starting to understand how different brain regions had different functions, it may have seemed like we were close to just mapping all the different pieces of the brain, and that completing the resulting puzzle would give a clear insight into the workings of the human mind. Of course it turns out we were quite far from that.

As far as what we can expect in the years leading up to AGI, I suspect it's going to be something that comes on gradually - I think computers will take on more and more tasks that were once reserved for humans over time, and the way we think about interfacing with technology might change so much that the concept of AGI might not seem relevant at some point.

As to whether the current state of things is a failure - I would not characterize it that way. I think we're making real progress, I just also think there is a bit of hubris that we may have "cracked the code" of true machine intelligence. I think we're still a few major revelations away from that.


> Few would have predicted the results we’re seeing today in 2010.

That's hardly accurate - didn't Musk and Co. promise self-driving cars by 2012? We're in 2020, and the SDC's are great for making youtube videos, but not any good at piloting a vehicle without human intervention.

Since the 90s it has been clear that the only thing holding back what we have today is limited processing power. While there may be some new insights and directions in AI, they are not "general" and they require 3 orders of magnitude more processing power for a lot smaller improvement in performance.

What has been clear since 2010 is that this field has passed the point of diminishing returns already. We throw vastly more computational power at problems than we ever did before, and then call the result an improvement.

Deep Blue beat the best human at chess using 11.8 GFLOPS of computational power. AlphaGo beat the best human at Go using 720000 GFLOPS of power. The complexity difference between Chess and Go is within a single order of magnitude - 10x to 99x difference in complexity (https://en.wikipedia.org/wiki/Game_complexity). The difference in AI processing power to beat the best human between Chess and Go is between 4 and 5 orders of magnitude (10000x and 100000x).
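
As a quick sanity check on the compute gap, here is a minimal sketch that takes the two GFLOPS figures quoted above at face value (they are not independently verified here) and works out the ratio. It comes to roughly 61,000x, i.e. between 4 and 5 orders of magnitude.

```python
import math

# Figures as quoted in the comment; treated as given, not verified.
DEEP_BLUE_GFLOPS = 11.8
ALPHAGO_GFLOPS = 720_000

# How many times more compute AlphaGo had, per these figures.
ratio = ALPHAGO_GFLOPS / DEEP_BLUE_GFLOPS

# Base-10 logarithm of the ratio gives the gap in orders of magnitude.
orders_of_magnitude = math.log10(ratio)

print(f"compute ratio: {ratio:,.0f}x")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```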

This does not look like a success to me - it looks like a brute-force approach. If you spend 10000x more resources for a 10x more benefit, you're at the point of diminishing returns.

Here's a great paper that should be written (but won't be) - plot the improvements in AI and the usage of computational power for AI on the same chart.

From the 90s (https://en.wikipedia.org/wiki/History_of_self-driving_cars#1...): "The robot achieved speeds exceeding 109 miles per hour (175 km/h) on the German Autobahn, with a mean time between human interventions of 5.6 miles (9.0 km), or 95% autonomous driving."

Yup, 95% autonomous. Today we have 95.x% autonomous with roughly 10000x the resource power thrown at the problem.

So, yeah, your assertion that "Few would have predicted the results we’re seeing today in 2010." is wildly off mark, we predicted more than what we see today because we did not expect to hit a point of diminishing returns quite so quickly.

The people who did the 95% SDC in 1997 would have been disbelieving if anyone told them, in 1997, that even with 10000x more processing power thrown at the problem and new sensor hardware that was not available to them, it won't get much better than what they had.


- 2010 isn’t 2012 (and I don’t think that it’s true Elon even said that at the time - at best he may have said 2016?)

- FSD is one example, but the improvement of computer vision since 2015 has been massive and deep learning approaches to general problem solving too. This wasn’t something people were predicting in 2010.

- The Deep Blue and Stockfish style approaches vs. the AlphaGo or AlphaZero approaches are categorically different - the latter being a lot more interesting and closer to general learning vs. the older approach which is more like brute force.

- GOFAI was a bad approach and the optimism in the 60s was wrong. Today’s looks more promising. Being wrong in the 60s doesn’t necessarily mean people are wrong now. It’s hard to know: https://intelligence.org/2017/10/13/fire-alarm/

For the AGI bit I’d recommend reading some of Eliezer Yudkowsky’s writing or Bostrom’s book (though I find Bostrom’s writing style tedious). There’s a lot of good writing about take offs and AGI/goal alignment that’s worth reading to get a base level understanding of the concepts people have thought through.

AGI doesn’t need to be human like to be dangerous - it can be good at general problem solving with poorly aligned goals and just act much faster. Brains exist everywhere in nature, simpler than human brains. A lot of that computation in training could be an analog of the genetic “pre-training” of evolution for humans that gets our baseline which could be one reason humans don’t seem to require so much. There was a massive amount of “computation” over time via natural selection to get to our current state.


Waymo is doing better, I believe.

> plot the improvements in AI and the usage of computational power for AI on the same chart.

Would that be meaningful? I mean, I use an infinity times the computational power for writing a letter than people did 100 years ago, still producing more or less the same results.


> they share an underlying mechanism, but planes don't need to flap wings.

A lot of birds don't need to flap their wings either.


Not if we mount a jet engine to their backs.


I think most of the complexity of biology is accidental, not essential. E.g. why don’t we have a normal abstraction for sending signals? Instead, we have like 10s of slightly different ones with different failings each, but each with a lot of repetitive machinery leading to inefficient “spaghetti code”.

And while our brain is objectively very impressive, I don’t see how our complex abilities are anything but emerging features.


Yes and no - the brain is also incredibly parsimonious with respect to how little resources it uses to achieve the information processing power it has. If you could make a computer which could compete in terms of utility, energy usage, and size, you'd be a billionaire in no time.

It's probably true that you could imagine a "perfectly designed" brain which could perform better on some tasks with less complexity, but I think it's also true that there's been a lot of selection pressure towards increased intelligence, so this is probably fairly well optimized.

> why don’t we have a normal abstraction for sending signals? Instead, we have like 10s of slightly different ones with different failings each, but each with a lot of repetitive machinery leading to inefficient “spaghetti code”.

What do you mean exactly by this? Like different neurotransmitter systems? Because I think it's actually quite elegant how the properties of different neurohormones lead to different processing modalities. It's like we have purpose-built hardware on the scale of individual proteins specialized for different purposes. I'm not so sure a more homogenized process for neural signaling would be an improvement.


John von Neumann wrote an essay on the topic titled The computer and the brain, which is quite a good comparison between the two types of systems, even though knowledge of the two was pretty primitive at the time. The basic idea is that computers are multiple orders of magnitude faster at serial calculations, but brains offset this difference by the sheer number of “dumb” processing units, with insane number of interconnections. Also, I don’t think that comparing the training of a neural network to the brain is fair from an energy usage point of view - compare the usage of the final NN with it.

As for how optimized is the human brain, well good question. I think not even a single biological cell is close to efficient, at most it is at a local minimum. The reason is perhaps that “worse is better” in terms of novel functionality. But I don’t think there was a big evolutionary pressure on sufficient intelligence once it emerged - it is sort of a first past the post wins all.


> John von Neumann wrote an essay on the topic titled The computer and the brain

I have to be honest, I would take any such comparison from the 1950's with a huge pinch of salt. I think perceptions about how "dumb" an individual neuron is as a processing unit have shifted quite a bit since then.

> Also, I don’t think that comparing the training of a neural network to the brain is fair from an energy usage point of view - compare the usage of the final NN with it.

I'm not considering this in terms of the training efficiency, I'm looking at it in terms of the ratio between operational utility and energy used. There's no trained ANN with anything remotely close to the overall utility of the human brain at any scale, let alone one that weighs 3lbs, fits into a human skull and runs on 20 Watts of power.
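
To put the 20 W figure in annual terms, here is a minimal sketch. The only input taken from the comment is the 20 W draw; the rest is unit conversion.

```python
BRAIN_POWER_WATTS = 20               # figure quoted above
SECONDS_PER_YEAR = 365 * 24 * 3600   # ignoring leap years

# Energy = power * time, in joules.
brain_joules_per_year = BRAIN_POWER_WATTS * SECONDS_PER_YEAR

# 1 kWh = 3.6 MJ, for comparison with an electricity bill.
brain_kwh_per_year = brain_joules_per_year / 3.6e6

print(f"{brain_joules_per_year / 1e9:.2f} GJ per year")
print(f"{brain_kwh_per_year:.1f} kWh per year")
```

That works out to about 0.63 GJ, or roughly 175 kWh, per year.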


I only mentioned that essay because I think the fundamental vision of it is still correct - in serial computations silicon beats “meat” hands down. And that is in both power efficiency and performance.

The fundamental difference between our current approach and biological brains is just as much a hardware one as it is theoretical. CPUs and GPUs are simply not best fit for this sort of usage — a “core” of them is way too powerful for what a single neuron can do (even with the more correct belief that they are not as dumb as we first thought), even if they can calculate multiple ones simultaneously. I’m not sure of specifics but couldn’t we print a pre-trained NN to a circuit that could match/beat a simple biological neural network in both speed and power efficiency? Cells are inefficient.


> in serial computations silicon beats “meat” hands down. And that is in both power efficiency and performance.

I just don't think this is a meaningful comparison, and I'm not convinced it's evidence of the "limitations" of biological computation.

Silicon beats biology in doing binary computation because it's a single-purpose machine built for this task. But a brain is capable of serving as a control system to operate millions of muscle fibers in parallel to navigate the body smoothly in unpredictable 3D space, while at the same time modulating communication to find the right way to express thoughts and advance interests in complex and uncertain social hierarchies, while at the same time forming opinions about pop-culture, composing sonnets, falling in love and contemplating death.

For me to buy the argument that ANN's can be more efficient than biology, you'd have to show me a computer which can do all of that using less resources than the human brain. Currently we have an assembly line for math problems.

> a “core” of them is way too powerful for what a single neuron can do

I just think you're vastly under-counting the complexity of what happens inside a single neuron. At every synapse, there's a complex interplay of chemistry, physics and biology which constitutes the processing of the neurotransmitter signal from the presynaptic neuron. To simulate a single neuron accurately, we actually need all the resources of a very powerful computer.

So it may be the case that we can boil down intelligence to some kind of process which can be printed in silicon. But I think it's also entirely likely that the extreme parallelism (vast orders of magnitude greater than the widest GPU) of the brain is required for the kind of general intelligence that humans express, and the "slowness" of biological computation is a necessary trade-off for the flexibility we enjoy. If that's the case, it's going to be very hard for a serial computer to emulate intelligence.


I by no means say that our brain is not impressive - even a fly’s is marvelously complex and capable. But all of them are made up of cells that were created through evolution, not intelligent design. The same way the giraffe has a recurrent nerve going all the way down and up inside its neck for absolutely no reason other than evolution modifying only one factor (neck length) without restructuring, cells have many similar sorts of “hacks”. So I think it is naive to think that biological systems are efficient. They do tend to optimize for a local minimum, but there are inherent hard limits there.

Also, while indeed we can’t simulate the whole of a neuron, why would we want to do that? I think that is backwards. We only have to model the actually important function of a neuron. If we were to have a water computer, would it make sense to simulate fluid dynamics instead of just the logical gates? Due to the messiness of biology, indeed some hard to model factors will affect things (in the analogy, water will be spilt/evaporated) but we should rather overlook the ones that have a minimal influence on the results.


> while indeed we can’t simulate the whole of a neuron, why would we want to do that? I think that is backwards. We only have to model the actually important function of a neuron.

Yeah so I think this is where we fundamentally differ. It seems like your assumption is that neurobiology is fundamentally messy and inefficient, and we should be able to dispense with the squishy bits and abstract out the real core "information processing" part to make something more efficient than a brain.

So if that's your assertion, what would that look like? What would be the subset of a neuron that we could simulate which would represent that distillation of the information processing part?

Because my argument would be, the squishy, messy cellular anatomy is the core information processing part. So if we try to emulate neural processing with the assumption that a whole neuron is the base unit, we will miss a lot of that micro-level processing which may be essential to reaching the utility and efficiency achieved by the human brain.

I'm not against the idea that whatever brains we happened to evolve are not the most efficient structure possible. But my position would be, we're probably quite far in terms of current computing technology from being able to build something better. I would imagine we might have to be able to bioengineer better neurons if we really want to compete with the real thing, rather than trying to simulate it in software.


I can’t think of any field of research where the model used is completely accurate. At one point we will have to leave behind the messy real world. While a simple weighted node is insufficient for modeling a neuron, there are more complex models that are still orders of magnitude less complex than simulating every single interaction between the I don’t know how many moles of molecules (which we can’t even do as far as I know, not even on a few molecule basis, let alone at such a huge volume).

But I feel I may be misrepresenting your point now. To answer your question, maybe a sufficient model (sufficient to be able to reproduce some core functionality of the brain, eg. make memories) would be one that incorporates a weight for each sort of signal (neurotransmitter) it can process, complete with a fatigue model per signal type, as well as we can perhaps add the notable major interactions between pathways (eg. activation of one temporarily decreasing the weight of another, but in a way bias is sorta this in the very basic NNs). But to be honest, such a construction would be valuable even with arbitrary types of signals, no need to model it exactly based on existing neurotransmitters. I think most properties interesting from a GAI perspective are emerging ones, and whether dopamine does this and that is an implementation detail of human brains.


> What would be the subset of a neuron that we could simulate which would represent that distillation of the information processing part?

You only need to accurately simulate the input and the output.

Frankly, if that can’t be done with a Markov process I’d be very surprised, and we already know that Markov chains can be simulated with ANNs
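
In the narrowest sense this is easy to see: one step of a Markov chain over state probabilities is a bias-free linear layer whose weight matrix is the transition matrix. A minimal sketch, with a made-up two-state chain (the transition values are purely illustrative, and this says nothing about whether a neuron actually is a Markov process):

```python
def linear_layer(inputs, weights):
    """Bias-free linear layer: output[j] = sum_i inputs[i] * weights[i][j]."""
    return [
        sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
        for j in range(len(weights[0]))
    ]

# Hypothetical two-state chain; each row holds one state's outgoing probabilities.
TRANSITION = [
    [0.9, 0.1],
    [0.5, 0.5],
]

distribution = [1.0, 0.0]  # start in state 0 with certainty
for _ in range(100):
    # Applying the same layer repeatedly propagates the state distribution.
    distribution = linear_layer(distribution, TRANSITION)

print(distribution)  # settles near the chain's stationary distribution
```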


So just to unpack this a little - there's a lot of different mechanisms going on in neural computation.

For instance one of those is spike-timing dependent plasticity. Basically the idea is that the sensitivity of a synapse gets up-regulated or down-regulated depending on the relative timing of the firing of the two neurons involved. So in the classic example, if the up-stream neuron fires before the down-stream neuron, the synapse gets stronger. But if the down-stream neuron fires first, the synapse gets weaker.
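
A minimal sketch of that rule, using the common pair-based form with exponential windows. The amplitudes and time constant are illustrative assumptions, not measured values, and `stdp_weight_change` is a hypothetical helper name:

```python
import math

def stdp_weight_change(dt_ms, a_plus=0.01, a_minus=0.012, tau_ms=20.0):
    """Weight change for one pre/post spike pair.

    dt_ms is t_post - t_pre. All parameter values are illustrative.
    """
    if dt_ms > 0:
        # Up-stream neuron fired first: strengthen the synapse.
        return a_plus * math.exp(-dt_ms / tau_ms)
    if dt_ms < 0:
        # Down-stream neuron fired first: weaken the synapse.
        return -a_minus * math.exp(dt_ms / tau_ms)
    return 0.0

for dt in (-40, -5, 5, 40):
    print(dt, round(stdp_weight_change(dt), 5))
```

The closer together the two spikes are, the larger the change, and the sign depends only on which neuron fired first.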

Another one is synchronization. It appears that the firing frequency of groups of neurons which are - for instance representing the same feature - become temporally synchronized. I.e. you could have different neural circuits active at the same time in the brain, but oscillating at different frequencies.

Another interesting mechanism is how dopamine works in the Nucleus Accumbens. Here you have two different types of receptors at the same synapses: one of them is inhibitory, and is sensitive at low concentrations of dopamine. The other is excitatory, and is sensitive at high concentrations. What this means is, at a single synapse, the same up-stream neuron can either increase or decrease the activation of the down-stream neuron: if the up stream neuron is firing just a little, the inhibitory receptors dominate. But if it's firing a lot, the excitatory receptors take over, and the down-stream neuron starts to activate more. Which kind of connection weight in an ANN can model that kind of connection?
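
To make the shape of that response concrete, here is a toy sketch. It assumes simple saturating receptor curves, and the constants and units are invented for illustration rather than taken from any physiological data. The point is only that the sign of the effect depends on the input level, which a single fixed scalar weight does not capture on its own.

```python
def occupancy(concentration, half_saturation):
    """Fraction of receptors bound, using a simple saturating curve."""
    return concentration / (concentration + half_saturation)

def net_drive(dopamine, k_inhib=0.1, k_excit=10.0, gain_inhib=1.0, gain_excit=3.0):
    """Net effect on the down-stream neuron (negative = inhibition).

    The inhibitory receptor saturates at low concentrations (small k_inhib);
    the excitatory one only engages at high concentrations (large k_excit).
    All constants are hypothetical.
    """
    inhibitory = gain_inhib * occupancy(dopamine, k_inhib)
    excitatory = gain_excit * occupancy(dopamine, k_excit)
    return excitatory - inhibitory

for level in (0.0, 0.5, 5.0, 100.0):
    print(level, round(net_drive(level), 3))
```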

My overall question would be, do you think back-propagation and Markov chains are really sufficient to account for all that subtlety we have in neural computation, especially when it comes to specific timing and frequency-dependent effects?


If Markov processes won’t cut it, a Turing machine will. And an ANN can approximate a Turing machine.

To boil it down, if you really want to argue that the behaviour of a neuron can’t be simulated by an ANN, you’re arguing that a neuron is doing something non-computable. At which point you might as well argue it’s magical.


So I think this thread was about two claims:

1. Can ANN's (in their current iteration) achieve general intelligence

2. Can they do it more efficiently than a biological brain

It certainly has not been established that a Turing machine can achieve general intelligence.


I dunno man, when I got into learning about CNNs, it became very easy to see how neural networks work at an intuitive level and how that plays into their "intelligence". For example, early layers of the network respond very well to simple features like edges and simple patterns, while later layers respond really well to more and more abstract things, like the shape of a person or a wheel etc. The craziest part is that this is all emergent from a randomly initialized state, all these patterns and abstractions develop with no manual intervention, they just happen to be the result of backpropagation consistently lowering a cost function.
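
For a feel of what "responding to an edge" means, here is a minimal sketch that slides a hand-written Sobel kernel over a tiny image. This is an assumption-laden stand-in: the kernel is hand-designed rather than learned, and is used only because the first-layer filters described above are said to end up behaving like edge detectors of this kind.

```python
def correlate2d(image, kernel):
    """Valid-mode 2D cross-correlation, the operation a conv layer applies."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Hand-designed vertical-edge detector (Sobel), standing in for a learned filter.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

# 4x6 image: dark on the left half, bright on the right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

response = correlate2d(image, SOBEL_X)
for row in response:
    print(row)
```

The response is zero over the flat regions and non-zero only where the window straddles the dark-to-bright boundary.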

The biggest thing is that computing is just now finally able to have enough data and large enough networks to really start to create more generalized models. With a computer 20 years ago you might be able to squeeze out simple pattern recognition, but for every layer and every neural node you add to a neural network, the more complex the model becomes and the more edge cases it can fold into itself.

Take a look at the universal approximation theorem and how, with enough nodes in a neural network, you can solve pretty much any problem given the right weights.
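
A small sketch of the "given the right weights" part: a one-hidden-layer ReLU network whose weights are set by hand (not trained) so that it piecewise-linearly interpolates a target function. The helper names are hypothetical, and the theorem itself is an existence result about approximating continuous functions on a bounded range; it does not say training will find those weights.

```python
import math

def relu(z):
    return max(0.0, z)

def build_relu_net(f, lo, hi, hidden_units):
    """Hand-set weights so the net interpolates f at evenly spaced knots."""
    step = (hi - lo) / hidden_units
    knots = [lo + k * step for k in range(hidden_units + 1)]
    slopes = [(f(knots[k + 1]) - f(knots[k])) / step for k in range(hidden_units)]
    # Each hidden unit adds the change in slope at its knot.
    out_weights = [slopes[0]] + [
        slopes[k] - slopes[k - 1] for k in range(1, hidden_units)
    ]
    bias = f(lo)

    def net(x):
        return bias + sum(w * relu(x - knots[k]) for k, w in enumerate(out_weights))

    return net

def max_error(f, net, lo, hi, samples=1000):
    """Largest absolute gap between f and net on an evenly spaced grid."""
    xs = [lo + (hi - lo) * i / samples for i in range(samples + 1)]
    return max(abs(f(x) - net(x)) for x in xs)

small = build_relu_net(math.sin, 0.0, math.pi, hidden_units=4)
large = build_relu_net(math.sin, 0.0, math.pi, hidden_units=64)

print("4 hidden units: ", max_error(math.sin, small, 0.0, math.pi))
print("64 hidden units:", max_error(math.sin, large, 0.0, math.pi))
```

More hidden units means more knots, so the error shrinks as the network gets wider.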


And there's so much we still don't know about the nervous system and cognition.


So you took an undergrad ML course and you're using this as the basis for your conclusions about how ML can scale? You understand modern neural networks as large matrix operations and then attack that idea leading to intelligence as a joke?

I also find it improbable that intelligence will emerge from modern ML without some major leap. But you have added nothing to the discussion, beyond some impressions from undergrad, when we are talking about something that is a very active and evolving research area. It's insulting to researchers and practitioners who have devoted years to studying ML to just dismiss broad areas of applicability because you took a course once.


I'm sorry, I don't mean to insult or offend anyone. I'm just recounting my observations based on my understanding of the subject - and that is really not to disparage the amazing work that's being done, but rather to highlight the scale of the problem you have to solve when you're talking about creating something similar to human intelligence. It's entirely possible I'm wrong about this, and I would love to be proven so.

Do you disagree substantively with anything I have said, or do you just think I could have phrased it better?


Thanks for your reply. I suppose a quick way to summarize my criticism is that it reads to me like you've dismissed the strengths of ML on technical grounds, while you imply you don't have any real technical experience in the field. You make a superficial comparison between the complexity of biology and ML, without providing any real insight, just saying one has lots going on and the other is matrix multiplication.

If your conclusion is that current gradient based methods probably won't scale up to AGI, you're probably right. But if you want to get involved in the discussion of why this is true, what ML actually can and can't do, etc. I would encourage you to learn more about the subject and the current research areas, and draw on that for your discussion points.

Otherwise, it comes across as "I once saw a podcast that said..." type stuff that is hard to take seriously.

No doubt I come across as condescending, please take what I say with the usual weight you'd assign to the views of a random guy on the internet :)


You don’t have to be an expert in a field to recognize that the current popular approaches to something aren’t even close to getting there.


Actually you do have to be an expert to make sweeping statements with any credibility in a young field making advances every day. Huge ones and surprising ones every year.

If you can’t characterize the technical problem that creates a limitation then you are just expressing an uninformed opinion.

Even if you were an expert!


Not to get into the rest of the discussion, but I disagree with the classification of ML as a young field. AI is an established field and I would argue that nothing in modern ML is _fundamentally_ so different that it would justify classifying it as a new field.


Yeah, it's almost as old as computing - likely 60-70 years old. The thing about it is we had the blueprints for a lot of stuff like neural networks almost at the dawn of computing, but it took almost half a century for us to even begin to try out some of the ideas, because the computing hardware wasn't even close - it would have been like trying to build a CPU out of vacuum tubes.

Once we finally had the tools to even start trying, in the late 80s/early 90s, it took us a very long time to "calibrate" these general ideas and figure out the "devils in the details" that were necessary to make certain ideas viable (for example, neural networks were discarded as a dead end in the 80s, and only considerably later were we able to discover that multi-layer networks essentially "salvaged" the idea).

Machine learning without the era of "modern computers" was a bit like flight before we'd really mastered the internal combustion engine - we understood quite a bit about it, and had theories about a lot of stuff (like the basic shape of a wing), and could successfully build gliders and such. Contrary to a lot of propaganda, the Wright Brothers didn't just arrive in the world like "lightning from a clear sky", but ... it had to become practical to do for us to then move on to putting the ideas through the paces, and all of the established theory from beforehand ran into the usual treatment of "no plan of battle survives contact with the enemy".


In terms of its origins, and the core algorithms, I agree neural networks have many decades on them.

However until the hardware and software support for mainstream massively parallel execution became available it was a niche tool.

So the level of adoption, experimentation, deployment and research resources available are multiple orders of magnitude greater than 20 or 30 years ago.

As a practitioner (for my entire career) the field still operates as a new field, with enormous areas for new experimentation and interesting new creative advances happening quickly.

So we are still at the beginning.


Thanks for your insight!


I'm in favor of changing the terminology from AI and ML to something along the lines of 'prediction model' so that the idea of machines 'thinking' is replaced with them 'predicting'. it's just easier for our mushy meat brains to think that AI and ML means that it'll lead to general AI or as I like to call it 'general purpose decision maker'. it's all about the language!


In the not so distant past, there was another popular expression - "computer-aided ...", which fit the practical use quite well (like CAD for design, CAT for translation etc)

Perhaps, CAI for inference or insight would express it more fairly.

Alternatively, AI could've stood for 'automated inference', but sure it's all too late to rebrand.

We humans are still not clear about the nature of our own intelligence, yet already claim to be able to manufacture it.


I think inference isn't the right term either. I think current ML is more like automated inductive reasoning.


Automated inductive reasoning sounds a lot like artificial intelligence to me...


Idk maybe it's semantics, inference to me sounds more like a logical leap is happening, whereas in my mind the simplest form of inductive reasoning is just expecting a pattern to repeat itself.


Expecting a pattern to repeat itself may not be sufficient to count as intelligence, but general purpose pattern recognition certainly seems to fit the bill.


Computer Aided Pattern Recognition sounds reasonable in setting public expectations.


I think you're right if you eliminate the word "aided"


I like the term “data driven algorithm”. It makes it clear to everyone involved that what we’re doing is just adjusting an algorithm based on the data we have. No one in their right mind would confuse that with building a true “A.I.”.


To be frank: that very much does not make it clear to everyone involved. If you told the average Joe you had a “data driven algorithm” instead of “AI” you would likely get a blank stare in return.


confusion is better than wrongful understanding?


What about “data derived algorithm”? The algorithm itself isn’t really driven by data after it has been designed anymore.


I mean if we want to be really accurate, we could say something like "highly dimensional data-derived function"


why stop at that? 'high dimensional matrix parameterised data derived non linear function optimisation and unique hypothesis generation' just rolls off the tongue doesn't it xD


I'm sorry to say that I don't see any clear line separating "data driven algorithms" from the embodied minds that we are.


why? we don't understand the architecture, but the brain certainly uses electrical signals in an algorithmic way


We already have "Pattern Recognition", not sure why it got absorbed by Machine Learning (the two terms seemed to co-exist with some overlap on what they covered), and then ML got absorbed by AI.


ML is still widely used and is much more common than AI as a term. So I wouldn't say that it has been absorbed by AI but their use sometimes overlaps depending on the target audience.


ML seems to be an ok term to me? It's the "intelligence" part in AI that needs a disclaimer.


When I first encountered the term ML, I decided to learn what this great new stuff was. To my great surprise, it was a typical old-new thing: what was called "statistical inference" in the days when I was working as a statistician.

There were a few new things, like ignoring the model and the choice of the right variables; whatever was available was thrown into the equation. If it was clear that such a "model" was over-fitted, there were some methods to overcome this by adding some random coefficients to the model that smoothed it a little.

So, the naming is there... it could be modified by adding some clarification that we don't care that much about understanding the model we plan to use.
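One standard way to "smooth" an over-fitted model is an L2 (ridge) penalty. That is a swapped-in illustration and not necessarily the method described above. A minimal sketch for a one-parameter model y = w * x, with made-up data and a made-up `fit_slope` helper:

```python
def fit_slope(xs, ys, penalty=0.0):
    """Slope w minimizing sum((y - w*x)^2) + penalty * w^2 (ridge, no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)

# Toy data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.2, 1.9, 3.2, 3.9]

for penalty in (0.0, 1.0, 10.0):
    print(penalty, fit_slope(xs, ys, penalty))
```

A larger penalty pulls the fitted slope toward zero, trading some fit on the training data for a less extreme model.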


I propose “heuristic optimization”.


I like this the most, but you can also generate things, which isn’t implied by optimization strongly.


Iirc, predictive coding is a well known branch of math that's said to be the next big step towards AI.


Do I think or predict?


I predict therefore I will be


> I don’t see any path from continuous improvements to the (admittedly impressive) ‘machine learning’ field that leads to a general AI

> I share the skepticism towards any progress towards 'general AI' - I don't think that we're remotely close or even on the right path in any way.

This isn't how science works though. Quoting the wikipedia page for Thomas Kuhn's "The Structure of Scientific Revolutions" (https://en.wikipedia.org/wiki/The_Structure_of_Scientific_Re...):

"Kuhn challenged the then prevailing view of progress in science in which scientific progress was viewed as "development-by-accumulation" of accepted facts and theories. Kuhn argued for an episodic model in which periods of conceptual continuity where there is cumulative progress, which Kuhn referred to as periods of "normal science", were interrupted by periods of revolutionary science."

I think this is the accepted model in the philosophy of science since the 1970s. That's why I find this argument about AI so strange, especially when it comes from respected science writers.

The idea that accumulated progress along the current path is insufficient for a breakthrough like AGI is almost obviously true. Your second point is important here. Most researchers aren't concerned with AGI because incremental ML and AI research is interesting and useful in its own right.

We can't predict when the next paradigm shift in AI will occur. So it's a bit absurd to be optimistic or skeptical. When that shift happens we don't know if it will catapult us straight to AGI or be another stepping stone on a potentially infinite series of breakthroughs that never reaches AGI. To think of it any other way is contrary to what we know about how science works. I find it odd how much ink is being spent on this question by journalists.


I think you're misunderstanding Kuhn slightly. He invented the term paradigm shift. What he means by normal science with intertwined spurts of revolution is more provocative. He means that in order to observe periods of revolution, the "dogma" of normal science must be cast aside and new normal must move in to replace it. Normal science hits a wall, gets stuck in a "rut" as Kuhn describes it.

I think, in a way, Doctorow is making that same argument for the current state of ML: "I don't think that we're remotely close or even on the right path in any way". In other words, general thinking that ML will lead to AGI is stuck in a rut and needs a new approach and no amount of progressive improvement on ML will lead to AGI. I don't think Doctorow's opinion here is especially insightful, he's just a writer so he commits thoughts to words and has an audience. I don't even know whether I agree or not. But I do think this piece comes off as more in the spirit of Kuhn than you're suggesting.

And of course you can interpret Kuhn however you want. I don't think Kuhn was saying you shouldn't use/apply the tools built by normal science to everyday life. But he, subtly, argues that some level of casting off entrenched dogmatic theories, in the academic domain, is a requirement for revolutionary progress. Kuhn agrees that rationalism is a good framework for approaching reality, but also equates phases of normal science to phases of religious domination that predated it. Essentially truly free thought is really really hard because society invents normals (dogma) and makes it hard to deviate. Academia is no exception. Science, during periods of normals, is (or can become) essentially over-calibrated and over-dependent on its own contemporary zeitgeist. If some contemporary theory that everyone bases progressive research off of is not quite right, it kinda spoils the derivative research. Not always true because sometimes the theories are correct.


This is an excellent post. Thank you!

I felt like the part that wasn't in line with Kuhn was the idea that there was something wrong with a field if incremental improvement couldn't lead to a breakthrough like AGI. You're right. He's arguing Kuhn's point. But he seems to use it to conclude that machine learning is a dead end when it comes to AGI. Further, he seems to think this means AGI won't happen any time soon.

But, if I'm not misinterpreting Kuhn again, knowing that a revolution is necessary to overturn the current dogma (which I would argue is deep learning) doesn't tell us anything about when the revolution will occur. It could be tomorrow or 50 years from now or never. So, specifically, it doesn't tell us anything about machine learning in general, whether AGI is possible, or when AGI will happen.


>So it's a bit absurd to be optimistic or skeptical.

We skeptics aren't skeptical that AI is possible, we're skeptical of specific claims. I think it's perfectly reasonable to be skeptical of the optimistic estimates, since they really are little more than guesses with little or no foundation in evidence.


This seems akin to Asimov's "Elevator Effect": https://baixardoc.com/preview/isaac-asimov-66-essays-on-the-... starting p 221.

I agree that one would think that Science Fiction writers would have enough of an imagination to be able to consider alternate futures (Cory CYA's by saying such a scenario would make a good SF story) - but there are already promising approaches to AGI: Minsky's "Society of Mind", Jeff Hawkins' neuro-based approaches, the fairly new Hinton idea GLOM: https://www.technologyreview.com/2021/04/16/1021871/geoffrey... .

“By 2029, computers will have human-level intelligence,” Kurzweil said in an interview at SXSW 2017.

Time to get to work, eh? https://www.timeanddate.com/countdown/to?msg=Kurzweil%20AGI%...


1960s - Herbert Simon predicts "Machines will be capable, within 20 years, of doing any work a man can do."

1993 - Vernor Vinge predicts super-intelligent AIs 'within 30 years'.

2011 - Ray Kurzweil predicts the singularity (enabled by super-intelligent AIs) will occur by 2045, 34 years after the prediction was made.

So until his revised timeline for 2029 the distance into the future before we achieve strong AI and hence the singularity was, according to its most optimistic proponents, receding by more than 1 year per year.

I wonder what it was that led him to revise his timeline so aggressively. I think all of those predictions were unfounded: until we have a solid concept for an architecture and a plan for implementing it, an informed timeline isn't possible.
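The "more than 1 year per year" arithmetic can be checked with a short sketch. The 1960s prediction has no exact year in the list above, so 1965 is an assumption here; `recession_rates` is a made-up helper name.

```python
# (year the prediction was made, years until strong AI / the singularity)
predictions = [
    (1965, 20),  # the 1960s prediction; 1965 is an assumed year
    (1993, 30),  # Vinge, "within 30 years"
    (2011, 34),  # Kurzweil, singularity by 2045
]

def recession_rates(preds):
    """Years the target date moved per calendar year between successive predictions."""
    targets = [(made, made + horizon) for made, horizon in preds]
    rates = []
    for (made_a, target_a), (made_b, target_b) in zip(targets, targets[1:]):
        rates.append((target_b - target_a) / (made_b - made_a))
    return rates

rates = recession_rates(predictions)
print(rates)
```

Both ratios come out above 1, meaning the predicted arrival date moved away faster than the calendar advanced. Any other year in the 1960s gives the same qualitative result.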



That's funny. Of course, I was referring to Asimov's Elevator Effect, which is that if aliens visited NYC with some probe in 1800 and then in 1950, they would be astonished at all the very tall buildings, and would have to assume people were now living in these tall towers for reasons TBD. They would not know that elevators had been invented, and hence, the buildings would only be occupied 8 hours per day or so; and nobody would live in them. Elevators allowed this major unexpected result. There is more, I couldn't find the actual essay.


>I think this is the accepted model in the philosophy of science since the 1970s.

Perhaps, but "philosophy of science" has never been something the majority of practicing scientists consider relevant, care about, or are influenced by, since forever.


is this related to Foucault? in an old debate with Chomsky, Foucault spends a lot of time on a concept similar to what you are talking about


> I share the skepticism towards any progress towards 'general AI' - I don't think that we're remotely close or even on the right path in any way.

I actually think that AGI is deceptively simple. I don't have a proof, but I have a (rather embryonic, frankly) theory of how it is gonna work.

I believe AGI is an analogue of third Futamura projection, but for (reinforcement) learners and not compilers.

So the first level is you have a problem and a learner, and you teach the learner to solve the problem. The representation of the problem is implicit in the learner.

The second level is that you have a language, which can describe the problem and its solution, and a (2nd level) learner, and you teach the 2nd level learner to create (1st level) solvers of the problem based on the problem description language. The ability to interpret the problem description language is implicit in the 2nd level learner.

The third level is, you have a general description language that is capable of describing any problem description language, and you teach the 3rd level learner to take a description of the problem description language, and produce 2nd level learners that can use this language to solve problems created in it.

Now, just like in Futamura projections, this is where it stops. You have a "generally intelligent" creature on the 3rd level. You can talk to them on level of how to effectively describe or solve problems (create a specialized language for it) and they will come all the way down with the way to attack (solve) them.

In humans, the 3rd level, general intelligence (AKA "sentience"), evolved eventually from the 2nd level, and it was a creation of the general internal language (which probably co-evolved to be shared). The 2nd level is an internal representation of the world that can be manipulated, but only ever refer to the external world, not itself, so it allows creatures to make conscious plans, but lack the ability to reflect on the planning (and also learning) process itself. The "bicameral mind" is a theory how we acquired 3rd level from the 2nd, and the 3rd level is why "we are strange loops".

Anyway, the problem is, the higher you go up the chain, the harder it becomes to create the learner, it's a lot more general problem. But I think the ladder must be, and should be, climbed. I believe that Deepmind (and RL research) has solved the 1st level, is now working on the 2nd level, but they already somewhat dimly see the 3rd level.
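The nesting of the three levels can be sketched as plain higher-order functions. This is only a toy under heavy assumptions: nothing is learned, the "languages" are hand-written word-to-operation tables, and `level3`, `Solver`, `SolverMaker`, and `Language` are hypothetical names. It shows the shape of the idea, not an implementation of it.

```python
from typing import Callable, Dict

Solver = Callable[[], float]                            # level 1: solves one fixed problem
SolverMaker = Callable[[str], Solver]                   # level 2: problem description -> solver
Language = Dict[str, Callable[[float, float], float]]   # description of a problem language

def level3(language: Language) -> SolverMaker:
    """Level 3: take a description of a problem language, return a level-2 maker."""
    def level2(problem: str) -> Solver:
        # Level 2: interpret a problem written in that language.
        left, op, right = problem.split()
        operation = language[op]
        def level1() -> float:
            # Level 1: the problem is now implicit in this solver.
            return operation(float(left), float(right))
        return level1
    return level2

# A tiny hand-written "problem description language".
words: Language = {"plus": lambda a, b: a + b, "times": lambda a, b: a * b}

make_solver = level3(words)        # level 3 -> level 2
solve = make_solver("3 times 4")   # level 2 -> level 1
print(solve())
```

In the comment's proposal each of these functions would be a learner rather than hand-written code, which is where all of the difficulty lies.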


> I think it's important to make this distinction and for some reason it's left implicit or it's purposefully omitted from the article

I beg to disagree. They clearly state your opinion at the end of the piece, using the metal-beating analogy. Great things were done by blacksmiths beating metal, but not an ICE.


I am both.

Why I'm pro-AI: Neural nets.

I worked on object detection for several years at one company using traditional methods, predating TensorFlow by a few years. We had a very sophisticated pipeline that had a DSP front end and a classical boundary detection scheme with a little neural net. The very first SSDMobileNet we tried blew away 5 years worth of work with about two weeks of training and tuning.

Other peers of mine work in industrial manufacturing, and classification and segmentation with off the shelf NN's has revolutionized assembly line testing almost overnight.

So yes, DNNs absolutely do some things vastly better than previous technology. Hands down.

Why I'm Anti-AI: hype

Recent developments in NN/DNN software have failed horribly in scaling to even modestly real-world, rational multi-tasking. ADAS level 5 is the poster child. When hype master Elon Musk backs away, that is telling.

We're on the bleeding edge here, IMHO we NEED to try everything. There's no telling which path has fruit. Look at elliptic curves: half a century with no applications, now they are the backbone of the internet. Yes, there will be BS, hype, snake oil, vaporware, but there will also be some amazing tech.

I say be patient and skeptical.


> it's left implicit or it's purposefully omitted from the article

It's explicitly right there in the essay...

> Machine learning has bequeathed us a wealth of automation tools that operate with high degrees of reliability to classify and act on data acquired from the real world. It’s cool!

> Brilliant people have done remarkable things with it.

You seem to be in agreement with the article but don't realize it.


> It doesn't need to solve all of humanity's problems to be a great tool.

As a side note, I'd like to say humanity's own intelligence is actually able to come up with solutions to its problems, we don't need AGI for that. Humanity is unable to implement those solutions for reasons beyond technical. How an AGI would get over those hurdles I have no idea


Humanity has been able to introduce at least as many problems as it managed to solve. The big question is what AI/ML is trying to accomplish.


There's good reason to be skeptical of AI as it is. Here are a couple of reasons:

Racial bias in facial recognition: "Error rates up to 34% higher on dark-skinned women than for lighter-skinned males." "Default camera settings are often not optimized to capture darker skin tones, resulting in lower-quality database images of Black Americans" https://sitn.hms.harvard.edu/flash/2020/racial-discriminatio...

Chicago’s “Heat List” predicts arrests, doesn’t protect people or deter crime: https://mathbabe.org/2016/08/18/chicagos-heat-list-predicts-...


It's very easy to fix these problems though. There's nothing inherently broken about the models or direction that prevents error rates from being made more uniform. In fact, newer facial recognition models with better datasets do perform approximately equally well across skin tones and sex.


Easy to fix technically, but first the issue must be recognized and demonstrated, and then comes the delicate process of negotiating the social and economic realities in which the technology operates.

And that's the problem with ML in general: its failure to recognize the implicit biases in choice of dataset and training and the resulting problems, of which Microsoft's racist chatbot Tay[1] is merely the most blatantly ludicrous.

1 https://spectrum.ieee.org/in-2016-microsofts-racist-chatbot-...


And the first cars didn't have seatbelts.

It's fine, these are not complicated problems, and they are much easier to spot and fix than most problems in software engineering at scale. Don't be fooled by the negative PR campaigns and clickbait, there's no reason to be skeptical about ML in general because of this.

Also, Tay attempted to solve a much harder problem than image classification. It's hard to build a safe hyperloop. It's no longer hard to build a safe microwave oven.


Forgive me, because I’m not an expert in ML. If this is an easy problem to solve why is it still a problem years after it’s so widespread that msm both knows about it and have written continual investigative journalism about it? It’s clearly not cutting edge anymore once it gets to that point and yet it’s still a problem. Why?


It takes some work, but having been at companies that deal with very similar problems, I'd say it's not hard to solve technically. The main difficulty is less technical and more about the investment needed versus the value, and part of that investment falls outside the modeling engineers who build the system. Part of the improvement can be done by classical computer vision techniques. But mixing classical computer vision techniques with modern ones both feels somewhat like a hack and complicates the system. The other big area though is dataset improvement. Engineers building ML systems and the people collecting and organizing the needed datasets are normally different people with only mild connections to each other. For companies that rely mostly on existing datasets and fine-tune from them, having to add a data curation process is a big pain point. Most companies have immature data curation processes. Many of the popular open source ML datasets have poor racial diversity. The most popular face generation dataset is CelebA, full of celebrities (mostly white ones).

Another issue is that for many of these systems, a racial bias in the error rate has only a mild business impact, which makes fixing it harder to prioritize. The last issue is that the work needed to fix this tends to be less interesting than most of the other work that goes into the system.

So overall, the main issues are the lack of good open source fair datasets with loose licensing, the cross-organizational effort needed to solve it (engineers cannot code up a fair dataset), and business prioritization.

edit: Also, "solve" here means getting accuracy across races to be close, not getting errors to zero. ML models will always have an error rate, and if your goal is 0 errors related to racial factors, that is extremely hard. Modeling is about making estimates of data, not knowing the truth of that data.
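A minimal sketch of what "close, not zero" could look like as a measurement, assuming you already have predictions, true labels, and a group label per example. The records, group names, and helper names are all made up for illustration.

```python
from collections import defaultdict

def error_rates_by_group(records):
    """records: iterable of (group, predicted_label, true_label) tuples."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        if predicted != actual:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

def error_rate_gap(rates):
    """Difference between the worst and best per-group error rate."""
    return max(rates.values()) - min(rates.values())

# Hypothetical evaluation results, not real data.
records = [
    ("group_a", 1, 1), ("group_a", 0, 1), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_b", 1, 1), ("group_b", 0, 1), ("group_b", 1, 0), ("group_b", 0, 0),
]

rates = error_rates_by_group(records)
print(rates, error_rate_gap(rates))
```

The target in this framing is a small gap between groups, with every group still having a nonzero error rate.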


Overfitting is also a technically easy problem to solve, but high profile cases in which it's not solved with obvious negative consequences could also lead to investigative journalism.


The short answer to your question is the same as the one to a lot of programming questions: It's not a technical problem, it's a people problem. Just getting the industry to recognize and acknowledge bias took investigative reporting. The prime example really is the situation with social media and targeted advertising algorithms. We still have people, influential people, like Mark Zuckerberg going around saying that ML isn't really a problem, everything's fine, social media isn't playing any role in destabilizing democracy, targeted ads aren't a threat to anyone's safety, and neither of them has anything to do with the breathtaking levels of economic inequality we see.

No doubt there are still plenty of other issues with ML that haven't (yet) made it to popular attention, and the people employing it aren't making decisions based on social value or common good, but simply invoking free markets and capitalism as their guiding philosophies.


I'm afraid the problems with ML are less like "whoops. we don't have seatbelts" and more "surely internal combustion engines optimized for power and mass production couldn't cause problems. It's not like there are going to be millions of them crammed together in lines 3 or 4 across crawling around at 10mph every day. Plus, fossil fuels are cheap, plentiful, and really have no downside we know of. Way better than coal at least - much less awful black smoke!"


Isn’t that just human bias seeping through into the data set, so of course the neural net trained on that will show similar biases? The problem here is the human element.


As they say, it's not a technical problem, it's a people problem. But it's not "just" human, it's that the field in general is elevating ML, AI, whatever you want to call it, with hype like "algorithms aren't biased like a human would be", which is technically true, but also trivial. The people creating these systems didn't even consider that they would reflect and even enshrine, with all kinds of high-priest-of-technology blessings, their biases. That's why we got Tay and why things like PredPol are terrible. The key is to acknowledge and actively protect against systematic bias, not make a business of it (coughtwitterfacebookcough).


I'm curious how the physics of light is termed racial bias, it's skin-colour bias if anything -- you can be "black" and be lighter skinned than a "white" person, for example -- but surely it's a consequence of how cameras/light works rather than a bias.

Of course if you don't take account of the difficulties that come with using the tool then you might be acting with racial bias, but that's different. Or, all cameras/eyes/visual imaging means are "racist".


Well, if you really want to know, I have done the research and can recommend several other papers in addition to the one linked. The short answer is that it's not "just physics", and choices made by the chemists and technicians at Kodak, Fuji, Ilford, Agfa, etc to decide how films depicted skin tones were made with racial bias. Digital imaging built on the color rendering tools and tests that originated in the film industry, and thus inherited their flaws.


Sure, that would be interesting to read about - it would be weird not to adjust your film sensitivity according to market, if that were possible. Always happy to learn, link me up.


Isn't that what's meant by "admittedly impressive"?



