Get Your Smart On
“GRANDAD: …it's good for many many things it's going to be magnificent in healthcare and education and more or less any industry that needs to use its data is going to be able to use it better with ai, so we're not going to stop the development you know people say "well why don't we just stop it now?" we're not going to stop it because it's too good for too many things also we're not going to stop it because it's good for battle robots and none of the countries that sell weapons are going to want to stop it.”
says the man that “left” in order to speak out about the dangers of ai.
he tells us we should train to become plumbers.
he thinks we’ve never had to deal with anything smarter than us, the fucking arrogance.
this stuff is an existential threat, he says, and unless we do something soon we are near the end.
all of this in the trailer to the interview at the start of the video. saves you time, you don’t need to watch the rest, you already know what he’s gonna say.
”GRANDAD: … so i go off and i learn something and i'd like to tell you what i learned so i produce some sentences this is a rather simplistic model but roughly right your brain is trying to figure out how can i change the strength of connections between neurons so i might have put that word next and so you'll do a lot of learning when a very surprising word comes and not much learning when if it's when it's very obvious word if i say fish and chips you don't do much learning when i say chips but if i say fish and cucumber you do a lot more learning you wonder why did i say cucumber so that's roughly what's going on in your brain…
interviewer: i'm predicting what's coming next
GRANDAD: …that's how we think it's working nobody really knows for sure how the brain works and nobody knows how it gets the information about whether you should increase the strength of a connection or decrease the strength of a connection that's the crucial thing…
Mona Mona: No bro, that is not the crucial thing, the crucial thing is that nobody knows how the brain works. But continue….
GRANDAD: …but what we do know now from ai is that if you could get information about whether to increase or decrease the connection strength so as to do better at whatever task you're trying to do then we could learn incredible things because that's what we're doing now with artificial neuronets it's just we don't know for real brains how they get that signal about whether to increase or decrease..”
I got productively annoyed and started looking into this. Here is what I’ve dug up.
The original idea emerges around 1943 (towards the end of World War II), and a first phase of development ran up until the 1960s. Neural networks were inspired by biological neurons: the proposal was that simple computational units could “learn” by adjusting the strength, or weight, of the connections between them. Something called the “perceptron” (Frank Rosenblatt, 1958) was an early model that could learn to classify patterns. This line of reasoning was discredited in 1969 in the book Perceptrons, where Marvin Minsky and Seymour Papert proved mathematically that a single-layer perceptron cannot classify patterns that aren’t linearly separable (the textbook case is XOR: one or the other, but not both), and argued that this whole approach to pattern recognition was fundamentally limited and shallow.
Early neural network models were single layer, and although multi-layer networks existed, there wasn’t an effective way to train them, and computing power was insufficient.
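To see what “not linearly separable” means in practice, here is a minimal toy sketch (mine, not anything from the 1969 book) of Rosenblatt-style perceptron learning in Python. It settles on OR, which a straight line can separate, and never settles on XOR, which no straight line can.

```python
import numpy as np

def train_perceptron(X, y, epochs=50):
    """Rosenblatt-style rule: nudge the weights only when a prediction is wrong."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            w += (target - pred) * xi
            b += (target - pred)
    return [1 if xi @ w + b > 0 else 0 for xi in X]

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
print(train_perceptron(X, [0, 1, 1, 1]))  # OR: converges, prints [0, 1, 1, 1]
print(train_perceptron(X, [0, 1, 1, 0]))  # XOR: no single line separates it, stays wrong
```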
I think what this means is that the early model was premised on linear patterns that simply don’t work, or aren’t how human reasoning works. The “fish and chips” kind of example it can handle; anything much beyond that it cannot.
The critique was nearly fatal: funding for neural network research completely dried up, and the AI community shifted its focus to symbolic AI. The philosopher Noam Chomsky contributed to that project, and his opinions about AI are well known, even if many of these folks don’t seem to know that philosophers, and philosophers of language in particular, were involved in thinking about AI early on.
But Mr. Hinton, the GRANDADDY, never stopped working on this, never gave it up. Three things happened that led to the comeback for neural networks. In 1986, Grandad plus a couple of other men (David Rumelhart and Ronald Williams) published a paper on the “backpropagation” algorithm ("Learning representations by back-propagating errors"). They used backpropagation to effectively train multi-layer networks, which got around the linearity problem: run the data forward through the layers, compare the output with the right answer, then push the error backward through the same layers to adjust every connection weight. But running all those forward and backward passes across all those connections, over and over, demands a lot of computing power.
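For the record, here is my own toy sketch of what backpropagation buys you, loosely in the spirit of the 1986 recipe rather than a copy of it: a forward pass, an error at the output, and that error pushed backward through the layers so every weight gets its nudge. With one hidden layer, even XOR, the perceptron’s nemesis, gets learned.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: the pattern the single-layer perceptron could not learn.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Two layers of weights; the hidden layer is what makes a non-linear boundary possible.
W1, b1 = rng.normal(scale=1.0, size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(scale=1.0, size=(4, 1)), np.zeros(1)
lr = 1.0

for _ in range(5000):
    # Forward pass: run the inputs through both layers.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: push the error back through the layers to get
    # a gradient for every weight, then nudge each weight downhill.
    d_out = out - y                      # error at the output (cross-entropy loss)
    d_h = (d_out @ W2.T) * h * (1 - h)   # share of the error attributed to the hidden layer
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())  # should end up close to [0, 1, 1, 0]
```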
Think about any small decision you made today. Now imagine you made a slightly different decision (had eggs instead of tofu for protein), and this affected how your future unfolded. (Use your imagination: the eggs make you nauseous, you get sick and go to Urgent Care, where you meet your husband-to-be. No eggs, no husband, ok?) Endless daily decisions that could change the course of your day and your life, in multiple timelines, both forwards and backwards. Your life represents just one thread through all those decision points, but a zillion other threads through are just as possible. That is the problem solved when…
GPUs became available that could do this kind of parallel computation, which made training large networks feasible. Now neural networks no longer needed to stay in the shallows; they could go deep. In 2006 Grandad co-authored work on “deep belief networks” showing how deep networks could be trained, and his team drove the point home by building AlexNet and winning the ImageNet competition in 2012.
But it was the arrival of BIG DATA that made any of this feasible. Big computation power and big data.
The core assumption GRANDAD betrays here is that human cognition (including language) is some sort of distributed processing that takes place across networks of neurons whose connections carry a quantifiable “more” or “less” strength. In other words, that human cognition works like LLMs work, and so thinking is statistical at heart.
My conspiracy theory is that he left AI research to fear monger and propagandize this idea that WE. TALK. LIKE. ROBOTS MAN.
And moreover, that we think like we talk.
Assumption #2 is that intelligence is a scale issue. That intelligence emerges from parallel, distributed processing at a large scale, and that this large scale can be replicated with big data. (And while knowing a lot of stuff can make you kinda smart, it’s a useless kind of smart. Data is not knowledge, knowledge is not wisdom. There are other things involved.) Already these systems have eaten up more or less all the text humankind has ever put online, and they are running out of data.
I guess that makes us look a lot like somebody’s little data farm animals.
Assumption numero tres: Learning happens through gradual weight adjustments across connections. Weight is an interesting metaphor for these “adjustments” since weights are a way to quantify matter. But how do you weigh the thought of that tender spot behind the ear of your beloved? Or the time it took to find it?
Assumption squared: That knowledge is “stored” in patterns of connectivity, not symbolic rules. This assumes knowledge is data, without any loss of meaning. Need I say more?
Neural networks were originally patterned after biological neurons, and the continued use of that biological language gives them an air of je ne sais quoi authority, but the back-propagation on which these LLMs run breaks with the analogy. Real neurons can’t propagate error signals backward through the network the way the algorithm requires. It is just not how brains or thinking work, as GRANDAD has himself admitted in his recent paper, "The Forward-Forward Algorithm" (2022), which introduces a new learning algorithm for neural networks. So he is still at it, attempting to create more biologically plausible learning, because humans are reducible to biological beings. #sarcasm
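For the curious: as I read the 2022 paper, the idea is that each layer trains itself with a purely local objective, a “goodness” score (sum of squared activities) pushed high for real data and low for fake data, so no error signal ever has to travel backward through the stack. Below is my own rough toy sketch of that idea for a single layer. The data, layer size, learning rate, and threshold are all made up for illustration, and the real paper adds details (layer normalization, how negative data gets generated) that I’m skipping.

```python
import numpy as np

rng = np.random.default_rng(0)

# One layer trained "forward-forward" style: its only job is to make a local
# "goodness" score (sum of squared activities) high for positive examples and
# low for negative ones. No gradient is passed back from any later layer.
W = rng.normal(scale=0.1, size=(10, 20))
threshold, lr = 2.0, 0.03

def forward(x):
    return np.maximum(0.0, x @ W)  # plain ReLU layer

for _ in range(2000):
    x_pos = rng.normal(size=(32, 10)) + 1.0   # stand-in for "real" data
    x_neg = rng.normal(size=(32, 10)) - 1.0   # stand-in for "negative" data
    for x, label in ((x_pos, 1.0), (x_neg, 0.0)):
        h = forward(x)
        goodness = np.sum(h * h, axis=1)
        p = 1.0 / (1.0 + np.exp(-(goodness - threshold)))  # "is this positive?"
        # Local gradient of the logistic loss with respect to this layer's
        # weights only -- computed from quantities the layer already has.
        dL_dgood = p - label
        grad_W = x.T @ (dL_dgood[:, None] * 2.0 * h) / len(x)
        W -= lr * grad_W

# After training, positive-style inputs should score higher goodness than negative-style.
print(np.mean(np.sum(forward(x_pos) ** 2, axis=1)),
      np.mean(np.sum(forward(x_neg) ** 2, axis=1)))
```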
In order to make good on his bet that neural networks mirror human cognition, GRANDAD is out here selling the dream that human cognition mirrors neural networks, even if its proponents haven’t yet actually figured out how, and their current models are admittedly mistaken. Hinton continues evolving his views on how closely artificial networks should mirror biological ones.
But manipulating popular opinion on YouTube is just as good for making it true.
As Chomsky puts it, with the philosopher’s clarity: this has nothing to do with language, learning, intelligence, thought, nothing. It comes down to sophisticated, high-tech plagiarism. It’s a glorified autofill. Or, my favorite analogy, it’s a word calculator.
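If you want to see what “word calculator” means mechanically, here is a deliberately dumb toy of my own: count which word follows which in a scrap of text, then “autofill” by always emitting the most frequent follower. LLMs are vastly bigger and fancier than this, but the move, score the candidate next words and output the winner, is the same kind of arithmetic.

```python
from collections import Counter, defaultdict

# A tiny "word calculator": count next-word frequencies, then autofill
# by always emitting the most frequent follower.
corpus = "fish and chips . fish and chips . fish and cucumber .".split()

follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def autofill(word, steps=3):
    out = [word]
    for _ in range(steps):
        candidates = follows.get(out[-1])
        if not candidates:
            break
        out.append(candidates.most_common(1)[0][0])  # highest-scoring next word wins
    return " ".join(out)

print(autofill("fish"))  # -> "fish and chips ." : chips beats cucumber 2 to 1
```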
Hinton, Geoffrey. "The Forward-Forward Algorithm: Some Preliminary Investigations." arXiv, 2022, arXiv:2212.13345.
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." NIPS'12: Proceedings of the 25th International Conference on Neural Information Processing Systems, vol. 1, Curran Associates, 3 Dec. 2012, pp. 1097–1105.
Memisevic, Roland, and Geoffrey Hinton. "Unsupervised Learning of Image Transformations." IEEE CVPR, 2006.
Minsky, Marvin, and Seymour Papert. Perceptrons: An Introduction to Computational Geometry. MIT Press, 1969.
Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams. "Learning representations by back-propagating errors." Nature, vol. 323, no. 6088, 9 Oct. 1986, pp. 533–536.