Assuring Superintelligence

The single most important question of the coming century is this: how can we trust superintelligence? Imagine an algorithm that claims to have the cure for cancer, but whose solution is incomprehensible to human doctors and demands an immense investment in infrastructure to realise. Do we take the leap of faith? How do we organise our society around solutions to problems that we don’t understand? I argue that faith in machines at this scale will completely erode the civilisations we have worked so hard to build. Instead, to borrow a phrase from Robert Brandom, one of the pre-eminent inheritors of Hegel’s vision, we must build a “spirit of trust” in the human-machine assemblages that will come to define our century. To do this we must think not in terms of AI Safety or AI Alignment but in terms of AI Assurance - this is the deepest moral imperative. If we cannot trust a machine, it is not doing lasting good for our society.

Recall the machine portrayed in the previous vignette. The purpose of that piece of writing is not really to depict a concrete future we may find ourselves in: I don’t think we will be ruled by a single benevolent superintelligence in a state of blissful unity. The picture is better read as a metaphor for what a positive outcome might look like in a society that willingly gives up decision-making to intelligent machines deeply embedded in every element of our lives. It is intended to provoke the question of how we might thrive in such a world and, if we can answer that question convincingly, what the benefits of such a world might be.

The answer to the former question is a programme for Assuring Superintelligence. Here I do not mean Bostrom’s massive, general-purpose superintelligence - the concept responsible for a quite religious fervour in the Bay Area. Instead I mean the superintelligence of AlphaGo: a narrow, specified intelligence that goes beyond human ability in a given domain and can subsequently teach us new concepts about that domain. It is surely not controversial to suggest that in the coming years we will see this kind of superintelligence seeping through every area of our lives. Some might indeed be alarmed to find out that these machines are no new thing! TikTok’s Monolith architecture is a superintelligent system designed to choose which video to show you next, and it does so with an accuracy that no human could ever replicate, certainly not at that scale. Those who have lost themselves to a many-hour fall into the infinite-scroll abyss know exactly how powerless we are to fight against a machine that knows us better than we know ourselves. The horror in this instance is that this highly pressurised algorithmic atmosphere causes us to willingly rescind our own decision-making for hours at a time, and that it feels so damn good when we do! ByteDance distils the magic of Vegas into a single superintelligent software architecture.

Putting the possibility of an artificial general superintelligence aside for a minute, it seems clear to me that a society permeated by narrow superintelligences of this sort is powerful enough to radically alter the fabric of our *Geist*. I have chosen a particular superintelligence here that I am not so fond of; imagine this power in drug discovery, however! Imagine a Ukrainian family torn apart by an unjust war having to spend less time in the firing line. As with every coin there are two sides, and if the flip lands our way, what a future we could inherit.

But what is our way? What does a positive future look like? The answer I would like to give is one of trust: a world in which, when we interact with a superintelligence for a given task, we are happy to accept its decisions not because we hope it is right, nor because we have to overcome a hurdle of apprehension every time, but because we have a genuine human feeling of trust directed towards it. When Move 37 happened in AlphaGo’s famous match against Lee Sedol, the team were sent out to sea without a map: they didn’t know whether their system had completely failed or had played a genius, creative move beyond human intelligibility. If that system had been in a safety-critical environment - government decision-making, cancer diagnosis, a defensive capability - the human structure around it would have collapsed. The doctor would have reverted to traditional methods, the government would have fallen back into bureaucracy, the general would have surrendered precious ground. If they could have known, however, that they could trust the machine’s decision even when their understanding of it failed, what an advantage they would have had. We would be able to cut through the previously impenetrable ceiling of possibilities afforded by human intelligence, and willingly dive into the unknown beyond.

Trust will be the defining feature of the superintelligent systems that make the world a better place, and so we must work out, immediately, what it means to assure them.