Above figure: A section of adult rat hippocampus stained with antibody, courtesy of EnCor Biotechnology Inc.

The Divide

In May 2019, after the International Conference on Learning Representations, ML Engineer Chip Huyen posted a summary of the trends she’d witnessed at the conference. Her 6th point on the list was “The lack of biologically inspired deep learning”. Indeed, across 500 talks, only 2 were about biologically inspired Artificial Neural Network (ANN) architectures. This was nothing new. For years, the conversation between Neuroscience and AI has been disappointingly unfruitful.

Why is this a shame?

Well, to begin with, a quick look at an ANN next to a real system of brain neurons shows that something is a little off.

The ANNs we use within computers are orderly and sequential, while the webs of brain neurons are chaotic. This stark contrast exists not only structurally, but also behaviourally. That is to say, there are behaviours within the brain that are not captured by ANNs.

Take for example the growth of Dendritic Spines in relation to new experiences:

And there is another reason the lack of biologically inspired AI is disappointing: AI research is in a rut.

The Holy Grail

Across most of AI research, the “Grand Slam” would be, without a doubt, Artificial General Intelligence, usually abbreviated as AGI. There is no single definition of the exact threshold at which AI becomes AGI (and, to be fair, a perfect definition is likely impossible). The idea is still easy to summarize: an AI system which can interpret the world and think critically and emotionally in a similar fashion, and to a similar cognitive degree, as humans do.

There have been fantastic gains across all sub-domains of AI in the past decade, but as of late our quest for AGI has been hindered by three primary ‘sticking points’.

1) Deeper does not equal smarter

When the “Deep Learning Revolution” began in 2012 with AlexNet, it was immediately clear that deep 10+ layer ANNs were the future. However, depth only helps to a point. The benefits of adding layers diminish, eventually flattening out. Even past 1000 layers, although performance is good, we don’t arrive at anything close to human-level thought (He et al. 2016).
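As an aside, the trick that makes 1000-plus-layer networks trainable at all in He et al. 2016 is the residual (shortcut) connection. Below is a minimal sketch of a residual block, assuming PyTorch (the paper’s 1000-layer experiments actually use a pre-activation variant), offered purely as an illustration of how depth is scaled, not of how intelligence scales with it.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """A minimal residual block: output = relu(F(x) + x).

    The identity shortcut lets gradients flow through very deep stacks,
    which is what makes 100- or 1000-layer networks trainable at all.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # the shortcut connection

# Stacking many such blocks adds depth, but accuracy gains flatten out.
deep_stack = nn.Sequential(*[ResidualBlock(64) for _ in range(100)])
```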

In 2018, AI heavyweight Yann LeCun even went as far as saying “Deep Learning is dead”.

2) Contemporary AI is fragile

AI is easy to break. David Cox, director of the MIT-IBM Watson AI Lab, gives an example of this in a 2020 MIT talk. Specifically, he talks of Adversarial Attacks, in which a network can be tricked into thinking it is seeing something which it, in fact, is not. Take this example of a friendly dog, which an ANN correctly identifies as a dog. However, an Adversarial Algorithm can adjust the pixels in such a way that the network believes it is instead looking at another animal – here, an ostrich (Szegedy et al. 2014).

The difference is strikingly subtle. Certainly any human would still agree the 3rd picture is of a dog.

A similar technique can be used to trick ANNs into being unable to identify people in video.
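To make the idea concrete, here is a hedged sketch of one standard adversarial attack, the Fast Gradient Sign Method (a simpler successor to the optimization-based attack of Szegedy et al. 2014, not the exact algorithm behind the dog/ostrich example). It assumes PyTorch, a classifier `model`, and images scaled to [0, 1].

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.007):
    """Fast Gradient Sign Method: nudge every pixel a tiny step in the
    direction that increases the classification loss. The change is
    imperceptible to a human but can flip the model's prediction.

    image: tensor of shape (1, C, H, W) with values in [0, 1]
    label: tensor of shape (1,) holding the correct class index
    """
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()
```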

3) AI Lacks Robustness

In the field of computer vision, it’s fairly common to hear of AI performing at “super-human” levels (e.g. Langlotz et al. 2019). But, to take another point from David Cox, this is somewhat misleading. The primary reason is that, in competitive image datasets (such as ImageNet), images tend to be framed quite generously: properly centred, correctly oriented, well lit, in a familiar setting, and overall very ‘typical’. When we instead use datasets containing more comprehensive examples (a knife lying on the side of a sink, a chair lying on its side, a T-shirt crumpled on a bed, etc.) the drop in performance is drastic. Here we see a 25–40% performance drop when moving from ImageNet to a more comprehensive dataset (ObjectNet).

Source: Borji et al. 2020

The Brain Gain

In December 2019, an AI research body in Montréal (imaginatively named Montréal.AI) hosted an AI debate between Yoshua Bengio and Gary Marcus. What made this talk stand out is that, while Yoshua is an AI specialist, Gary’s domain is cognitive science.

In December 2020, on account of the first debate being well received, the company hosted a second debate. Instead of a two-way debate, the panel consisted of 17 speakers. About one third were AI specialists, while the other two thirds consisted of neuroscientists and psychologists.

What transpired in both of these multi-disciplinary debates was quite astonishing. The ideas and expectations flowed, often fuelled by the AI side, but regularly put in check by the neuroscience side. Additionally, insights into potentially fruitful new directions more often than not came from the psychologists, based on how humans learn and develop. In sum, the interdisciplinary advantage shone through brightly. I would like to highlight some of the ideas that stood out the strongest.

The Cat Carousel

One of the ideas brought forward by Fei-Fei Li of Stanford University is the historical ‘Cat Carousel’ experiment (Held & Hein, 1963). One kitten is free to explore, while a second kitten is tethered to the first.

They both explore their surroundings, but only one of them actually does the moving. The experiment shows that only the active kitten develops normal depth perception, while the passive kitten leaves the experiment uncoordinated and clumsy. The lesson is that learning must be accompanied by movement. Barbara Tversky, professor at Columbia University, added another example: humans often move their hands when watching lectures, drawing shapes in physical space to better memorize concepts.

AI researchers have taken note of this. At DeepMind, researchers have created a simulated “Playroom” in which an AI agent can interact with its surroundings.

Reproduced from DeepMind 2020.

The AI system is fed both visual input regarding its surroundings and commands in natural language. In turn, it translates these into decisions about how to move and how to respond. One may ask which architecture allows for this. The answer is rooted in Neurosymbolic AI.

Neurosymbolic AI

The concept of Neurosymbolic AI originated during the very first steps into AI (see, for example, McCarthy et al. 1955, and Minsky 1991). The researchers hoped to create a system capable of storing and retrieving information related to objects and concepts. For example, if asked “what do you know about apples?” the system could reply that they are green, red, or yellow, that they can be sweet, sour, or tart, and that they can be used to make pies (and so on and so forth).
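Purely for illustration, a toy version of the kind of symbolic knowledge store those early researchers had in mind might look like the sketch below (plain Python; the stored facts and function names are my own, not anything from McCarthy’s or Minsky’s systems):

```python
# A toy symbolic knowledge base: concepts map to attributes and values.
knowledge = {
    "apple": {
        "is_a": "fruit",
        "colours": ["green", "red", "yellow"],
        "tastes": ["sweet", "sour", "tart"],
        "used_for": ["pies"],
    },
}

def what_do_you_know_about(concept):
    """Retrieve every stored fact about a concept as readable strings."""
    facts = knowledge.get(concept, {})
    return [f"{concept}: {attribute} -> {value}" for attribute, value in facts.items()]

print("\n".join(what_do_you_know_about("apple")))
```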

Some of the basic ideas of Neurosymbolic AI are now being revamped to tackle more complex AI challenges – challenges that convolutional neural networks are unable to tackle alone. Take for example this question:

Reproduced from this lecture by David Cox.

As humans, we would likely approach this problem in 3 steps: first counting the number of large objects, then the number of metal spheres, and then quickly checking that the two numbers match. In an attempt to recreate this behaviour, a Neurosymbolic AI system may look something like this:

Reproduced from this lecture by David Cox.

The network contains a computer vision component which segments the image and then classifies the objects within the image into a table. Simultaneously, an NLP routine translates the question into a number of programming steps, which can be applied to the table. However, in order to train this section of the network, the programming steps must be differentiable (so that gradient descent can be applied). Hence, this step within the Neurosymbolic AI system is sometimes called Differentiable Programming.
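To make the symbolic half concrete, here is a toy, non-differentiable sketch in plain Python. Assume the vision module has already emitted a table of detected objects and the language module has parsed the question into filter/count/compare steps; the object table and function names are illustrative. In a real Neurosymbolic system these steps would be relaxed into differentiable operations so the whole pipeline can be trained with gradient descent.

```python
# Table of objects, as if produced by the vision module (illustrative values).
objects = [
    {"size": "large", "material": "metal",  "shape": "cube"},
    {"size": "large", "material": "rubber", "shape": "sphere"},
    {"size": "small", "material": "metal",  "shape": "sphere"},
    {"size": "large", "material": "metal",  "shape": "sphere"},
]

def filter_objects(table, **attrs):
    """Keep only the rows whose attributes all match."""
    return [o for o in table if all(o[k] == v for k, v in attrs.items())]

# Program steps, as if produced by the language module for the question
# "are there as many large objects as metal spheres?"
n_large = len(filter_objects(objects, size="large"))
n_metal_spheres = len(filter_objects(objects, material="metal", shape="sphere"))
answer = n_large == n_metal_spheres

print(n_large, n_metal_spheres, answer)  # 3 2 False
```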

Thinking Fast and Slow

When given an unfamiliar problem like the one above, our mind works slowly and deliberately. Conversely, when emptying the dishwasher or tying our shoelaces, we can operate quickly without much thought. This two-tier behaviour is what Daniel Kahneman describes as System 1 (fast, instinctive behaviour) and System 2 (slow, logical processing). Sometimes, this fast/slow dichotomy can be quite puzzling. Take this for example:

If asked to recognize a face, humans can instantaneously answer. It feels as though this happens without even thinking. But given a math problem we have to slowly think, working through the calculation step-by-step.

But, curiously, computers have the opposite skill-set. Facial recognition is a relatively recent skill, which has reached maturity only in the past two decades. It generally requires fairly heavy-duty architecture. On the other hand, basic arithmetic calculators have been around since antiquity. Take, for example, the Antikythera Mechanism, a very primitive computer dating from around 200 BC.

So, why is this the case?

Evolutionary Priors

Perhaps unsurprisingly, arithmetic was historically not as useful to us as facial recognition. Being able to recognize friends from foes was much more important to us than adding numbers. Some recent research has shed light on exactly how this evolutionarily hard-coded facial recognition operates, at least in monkeys.

Data Reproduced from Doris Tsao’s Montréal.AI talk, December 2020.

What exactly have we found? The system responsible for Face Recognition is fairly compact, consisting of roughly 200 neurons. The neurons project down to a roughly 50-dimensional feature space, in which each dimension corresponds to a multitude of actual facial features (e.g. hairline, eye position, lips, etc.), as opposed to just one. Fascinatingly, researchers can work backwards from the recorded neuron activations and reconstruct the face the monkey is looking at without actually seeing it themselves!
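A rough illustration of why such a compact, roughly linear code makes reconstruction possible: if each face is a point in a ~50-dimensional feature space and each of ~200 cells responds approximately linearly to those features, then a simple least-squares decoder can recover the features from the responses. The toy NumPy simulation below is entirely synthetic (made-up data, shapes, and decoder); it is not Tsao’s actual analysis pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n_faces, n_features, n_neurons = 2000, 50, 200

# Synthetic "face space" coordinates and linear tuning of each face cell.
features = rng.normal(size=(n_faces, n_features))
tuning = rng.normal(size=(n_features, n_neurons))
responses = features @ tuning + 0.1 * rng.normal(size=(n_faces, n_neurons))

# Fit a linear decoder mapping neural responses back to face features.
decoder, *_ = np.linalg.lstsq(responses, features, rcond=None)

# Decode a new, held-out face from its (noiseless) responses.
new_face = rng.normal(size=(1, n_features))
reconstructed = (new_face @ tuning) @ decoder
print("max reconstruction error:", np.abs(reconstructed - new_face).max())  # small
```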

Understanding this system is a huge step forward in understanding what are called ‘priors’. That is, systems which have been ‘pre-loaded’ into our brains by evolution. The question is: where should we draw the line? For example, should an AI system come with an entire Neurosymbolic AI architecture as a prior, or should this be developed through learning over time, born of some simpler and more fundamental system?

There is still no consensus, and a lot more work, in neuroscience, psychology, and AI, will be needed before an answer is clear.

Let’s Talk Numbers

Can we gauge how close we are to AGI by looking at the numbers? That is, how close is digital architecture to that of our brain? What if we consider the number of operations/second?

  • Brain: ~10¹⁷ operations/second (Source)
  • NVIDIA Titan V GPU: 10¹⁵ operations/second (Source)

This would suggest we are about a factor of 100 away from AGI. What if we instead compared the number of connections?

  • Brain: ~1.5×10¹⁴ Synapses (roughly equivalent to parameters) (Source)
  • Microsoft ZeRO & DeepSpeed: 10¹¹ Parameters (Source)
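As a quick sanity check, the back-of-the-envelope arithmetic behind these two comparisons (in Python, using only the figures quoted above) is:

```python
# Ratios implied by the figures above.
brain_ops, gpu_ops = 1e17, 1e15            # operations per second
brain_synapses, model_params = 1.5e14, 1e11

print(brain_ops / gpu_ops)                 # 100.0  -> "a factor of 100"
print(brain_synapses / model_params)       # 1500.0 -> "closer to a factor of 1000"
```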

So here it looks closer to a factor of 1000. But these numbers obscure the fact that the correlation between brain metrics and intelligence is currently a complete black box. To illustrate this, take for example the fact that elephants have roughly three times more neurons than humans, and likely more synapses too.

There were those who believed the intelligence of humans came from the number of neurons in their cerebral cortex – an area associated with consciousness. However, orcas have more than twice the number of cerebral cortex neurons as humans.

So, all this to say, the numbers do very little to shed light on how far we are from AGI. We currently have no idea what is so special about the human brain that it conveys the intelligence required to become the dominant species, rather than elephants or orcas.

The Four Missing Pillars

What would represent a good first step towards AGI? Well, Doris Tsao offers four interesting neuroscience avenues, each of which could hold some keys towards unlocking AGI.

Reproduced from Doris Tsao’s Montréal.AI talk, December 2020.

The four directions are:

  1. A better understanding, from a neuroscience point of view, of exactly how the human mind learns.
  2. Uncovering the source of the apparent robustness of human vision. That is to say, our lack of susceptibility to Adversarial Attacks, our ability to overcome noise, and our skill at recognizing objects in a wide range of settings and orientations.
  3. Understanding the Binding Problem: how the mind separates different objects in our field of vision, and additionally combines multiple inputs into a single unified ‘experience’.
  4. Understanding what exactly is going on when we ‘see’ our thoughts (imagination, dreams, consciousness).

The Pessimists

Calling Christof Koch, president and chief scientist at the Allen Institute for Brain Science, a pessimist is perhaps unfair. Maybe ‘realist’ would be better.

Regardless of which title you accord him, Christof Koch argues that looking to neuroscience for help with AI is a hopeless pursuit. He points out some humbling facts about the human brain.

  • Neuroscientists have identified over 1000 different types of neurons. In ANNs we usually treat all neurons as identical in their functioning.
  • Some neurons have “dendritic trees” with on the order of 10,000 inputs and outputs, larger than what we find in most ANNs.
  • Current neuroscience is unable to simulate functional behaviour even when studying systems on the order of 100 neurons (e.g. C. elegans, see figure below).
  • In trying to do so with the human brain, we would be working with a system on the order of 10¹¹ neurons.

Although we now have a complete wiring diagram of the C. elegans nematode’s neurons, we have no idea how the system translates to behaviour. (Source: Varshney et al. 2011)

And there’s another angle here: there is no reason to believe the human brain is in any way ‘optimal’. Our brain has evolved under a number of different constraints, and the set of constraints for AI systems will be completely different.

  • Metabolic Constraints: The human brain must satisfy a basic need for nutrients.
  • Material Constraints: The human body is mostly composed of carbon, hydrogen, oxygen, and nitrogen.
  • Evolutionary Constraints: The human brain was shaped to optimize performance in the context of very specific conditions, many of which existed only millions of years ago.

To get a feel for how many evolutionary artifacts still linger in our minds, take a quick look at the list of cognitive biases we have identified. There is no reason to think it necessary to replicate these biases in AI systems when pushing towards AGI.

Bringing it all Together

Even though Christof Koch remains a pessimist, he does concede one point.

Brains primarily provide an existence proof that adaptive intelligence is possible in physical hardware.

In our quest for AGI, the relationship between Neuroscience research and AI research does not have only two possible presets – that is, “married together” and “never talking”. In sum, AI researchers should not ignore the insights coming from Neuroscience and Psychology, but neither should they aim to replicate those insights exactly within digital hardware. It is not necessary to refer to the brain when designing AI, and indeed we should view AGI not as a chance to replicate the human mind, but as a chance to reinvent it. That being said, when we find ourselves stumped, we shouldn’t forget that we already have a working intelligence, built on physical hardware. The interdisciplinary approach can be both beautiful and fruitful. So why not peek into the mind from time to time?