Transcript of EP 183 – Forrest Landry Part 2: AI Risk

The following is a rough transcript which has not been revised by The Jim Rutt Show or Forrest Landry. Please check with us before using any quotations from this transcript. Thank you.

Jim: Today’s guest is Forrest Landry back for a part two, basically extending and deepening our conversation from back on March 14th on EP 181. Welcome, Forrest.

Forrest: Good to be here. Nice to chat with you again.

Jim: It’s amazing. It’s only been 17 days since we chatted last, but it feels like a hundred years. You know the concept of dog years, that a dog year is seven human years. I’m getting this sense that large language model years are something like 25 or 50 or a thousand, something like that, because the day we chatted was the day that GPT-4 was released, with the 99-page report, which I had read in the morning. I hadn’t actually played with it. Later after we finished up, I played with GPT-4 a bit and I started posting results on Twitter and all kinds of stuff flew from that, and I connected with other people working in the field.

And then again, this sense of, “Whoa,” the API. I got access to the GPT-4 API through my OpenAI commercial account. I went back and looked, it was only four days ago, and I’ve already written four programs that use it, and I’ve come up with a scientific paper that I think will be interesting, we’ll see, that is a new way to reduce hallucinations.

So it’s like, “Whoa.” So I continue to have this truly liminal feeling about where we are that reminds me a lot of PCs in 1979 and ’80, where suddenly a shitload of stuff becomes possible and isn’t even very hard. That’s the amazing thing about it. And of course, we all need to keep that in mind as we start digging into some ways this might go wrong.

Forrest: Understood, agreed. So, how would you like to start? I mean, we can talk obviously about what has happened.

Jim: Yeah, before we do that, let’s do a little bit of a review from last time. One of the things that we may well talk about again, and that I think was a useful center of our discussion last time, is: what is Rice’s theorem?

Forrest: Basically, it’s an observation that from the content of a message, or the content of a piece of software, a computer program, we can’t have some method, some algorithm or some methodology that would allow us to assert for certain that some arbitrary algorithm or some arbitrary message has some characteristic feature.

It’s essentially an extension of the halting theorem, which asks: if I had a program, could I determine, just by analyzing the program, whether or not it will ever halt, whether it’ll ever stop?
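To give the flavor of that argument, here is a minimal sketch, for illustration only, of the classic diagonal contradiction behind the halting theorem; Rice’s theorem extends the same move to any non-trivial question about a program’s behavior. Nothing in this block comes from the conversation itself.

```python
def paradox_builder(halts):
    """Given any claimed halting checker halts(program) -> bool,
    build a program that the checker must get wrong."""
    def paradox():
        if halts(paradox):   # checker claims paradox halts...
            while True:      # ...so loop forever instead
                pass
        # checker claims paradox loops forever, so halt immediately
        return
    return paradox

# Whatever a proposed halts() answers about paradox, it is wrong, so no
# correct, total halting checker can exist. Rice's theorem generalizes this:
# no algorithm can decide any non-trivial semantic property of programs,
# e.g. "will this program ever produce a harmful output?"
```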

So the Rice theorem essentially is a bit like if we received a message from some alien civilization, for example, would we be able to assert that that message was something that would actually be good for us to read? Or would it actually cause some fundamental harm to our society or our species or our civilization? And so we would like to basically assess the safety characteristics associated with an incoming message from an alien species.

In the same sort of way, we’d like to assess the degree to which an artificial intelligence is going to be aligned or misaligned with human interests, or with the interests of life itself in a broad sense. So there’s a question of: can we determine whether or not alignment exists for a given AI system? And moreover, what sort of conditions would be necessary for us to establish that? That question is essentially connected to Rice’s theorem.

Jim: Well, the bottom line is that we cannot get 100% certainty about something like alignment or frankly any specific outcome, at least in a program that’s complicated enough.

Forrest: Well, the thing goes somewhat farther. It’s not saying, for example, that we can’t get arbitrarily close to certainty. It’s actually saying that to some extent that the answer to a question like that is unknowable using the tools involved.

So in effect, it’s stating a limit on what can be done with algorithms or what can be done with purely causal processes. So in effect, there’s a notion here that it’s not just can we get to anywhere near or approximately sure about an answer, but can we get any answer at all? So in other words, if we’re trying to say, can we assert that such and such a system will be at least 95% aligned over the next 10 years, we’re basically saying that from the Rice theorem that in the general case that we can’t assert whether or not it will be even 10% aligned over even 10 minutes.

And so in effect, there’s a sense here of without actually running the program, we don’t know what it’s going to do. But of course, if you’re running the program, you’ve already taken the risk. If you’ve read the message from the alien civilization, then you’ve already subjected yourself to whatever the effects of that are going to be.

So there’s an important result here, which is basically to say, if we’re trying to establish that an artificial intelligence is going to be aligned with human wellbeing over the long term, there are certain features that any algorithm would need to have. One of them being that we could essentially know something about the inputs, know something about the nature of how the AI worked, i.e., what are its internals? Can we have a model of that? Which would itself be needed to basically simulate what the AI would do with those inputs. Then we would need to have some sort of comparison: given what it would do, does that actually match with what alignment would be, what safety would be? So there’s a characterization and comparison aspect. And then finally, if it turns out that the outputs of the AI as simulated would not be in alignment, we would want to have some sort of constraint, some sort of conditionality, to essentially either prevent those inputs from arriving or those outputs from being generated.

So there’s basically five aspects there. There’s know the inputs, be able to model the system, be able to predict or simulate the outputs, be able to assess whether those outputs are aligned or not, and be able to control whether or not those inputs or outputs are occurring.
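As a rough illustration, my own sketch rather than Forrest’s formalism, those five conditions can be read as a verification pipeline where every stage has to work to some threshold; the function and parameter names below are purely hypothetical.

```python
def assure_alignment(candidate_inputs, model_of, simulate, is_aligned, constrain):
    """Illustrative pipeline for the five conditions: know the inputs, model
    the system, simulate its outputs, assess them, and constrain them."""
    for x in candidate_inputs:          # 1. know the inputs
        model = model_of()              # 2. have a model of the AI's internals
        predicted = simulate(model, x)  # 3. predict / simulate the outputs
        if not is_aligned(predicted):   # 4. assess against an alignment spec
            constrain(x)                # 5. constrain the inputs or outputs
    return True

# The claim in the episode is that, for general learning systems, none of
# these five stages can be implemented to the needed threshold, and the space
# of candidate_inputs cannot be fully characterized in the first place.
```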

And it turns out, because of things like Rice’s theorem and a few other results having to do with control theory and modeling and such, that exactly none of those five conditions obtain. All of them would be necessary, and any one of them failing would effectively prevent us from being able to establish the safety or alignment of AI systems. We can get approximately an understanding of the inputs and approximately an understanding of what the thing might do, and we can sometimes compare relatively weakly. But in effect, to have control, to establish safety to even reasonable thresholds, like you said, we can’t get to a hundred percent, but at least we should be able to get relatively close, proportional to however severe the risk is.

If we’re looking at say a bridge or something like that and you have trucks going over it and stuff like that, we would want to say, “Hey, we want to build the bridge so that the chances of something going wrong, even if there’s extreme weather, a lot of rain, a lot of snow and wind and so on, that even with heavy loads, that there’d be enough of a margin of safety, that it’s really, really unlikely that the bridge would collapse and people would get killed.” Similar with aircraft and things like that, that we want to establish at least a threshold that means that from an engineering point of view, that the systems are reasonably robust against most categories of hazards for most of the time that that system would be used. The Federal Aviation Administration has gone through a lot of effort to assure that planes don’t fall out of the sky or collide with buildings and things.

So in the same sort of sense, it’s not just that we’re trying to achieve a notion of safety, we’re trying to exceed a threshold, a certain minimum threshold of assurance, over a reasonable period of time for which the systems are used. And so in that sense, for each of the five characteristics, input, modeling, simulation, assessment, and essentially constraint, each of these needs to be accurate or complete to a threshold; they each need to pass a certain level of capability. And so when we say that there’s an absence of constraint, it’s not only the case that we can’t ensure alignment or safety to those thresholds, it means that we can’t even achieve any of those five aspects to a degree sufficient to pass the same sort of requirements that, say, the Federal Aviation Administration would use for a product like an aircraft being deployed in commercial or civil settings.

Jim: Well, let me make a little distinction here. You mentioned bridges, it’s a good example. I actually took a few civil engineering courses when I was a lad, and one of the things we learned is you cannot guarantee that a bridge won’t fall down. Design it to survive a Richter 11 earthquake right underneath it? Probably not. Design it to survive a 500 mile an hour wind? And then more subtly, there’s the famous Tacoma Narrows Bridge; you see all the famous pictures of it twisting and getting into resonance and stuff. While you can play whack-a-mole and put all kinds of damping into a system, you can’t quite even guarantee that you won’t get a runaway harmonic from cars driving over the bridge. But what you can do is squeeze down the risk to the point where you say it’s safe enough to proceed.

So it strikes me that Rice’s theorem is not particularly relevant, because even in something as mundane as a bridge, we can’t get to absolute certainty about the safety of a design. And so rather than being a mathematics problem, it has to be thought of as an engineering problem. How would you respond to that?

Forrest: I would say the model of the bridge is actually very different than the model of the AI system and the constraints involved there. So in one sense, when we’re saying, “Hey, we can converge towards a safety thing,” we need to have certain predictability elements built into the design. So when we’re doing the kind of calculations that are involved to say, assess how would the system respond under earthquake conditions or under wind conditions? Would there be resonances? Would there be the kinds of conditions that would cause a failure?

We can actually assess things like that. So engineering has, for the kinds of equations that we’re talking about with the stresses and the forces involved in a bridge, enough capacity to be able to predict the outcome. So in other words, we can model the system and have some assurance that the model and the actual physical system will behave in a similar way.

Whereas when we’re dealing with AI systems, we don’t actually have models like that. We don’t have the kinds of dynamics which would converge to a known state. It’s a bit like when we’re looking at infinite sum equations. There are some that will converge to a definite value, but unless certain conditions hold, the summation doesn’t converge, it doesn’t have a definite value. The notion of having a definite value for the summation just turns out not to work at all.
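A standard concrete pair, added here purely for illustration: the geometric series 1/2 + 1/4 + 1/8 + … converges to exactly 1, while the harmonic series 1 + 1/2 + 1/3 + 1/4 + … grows without bound, so there is simply no number to assign as the value of that sum.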

So in the same sort of sense, the kinds of things that we’re talking about with artificial intelligence systems are actually very different from the kinds of things associated with the bridge, because things like Rice’s theorem say not only that we can’t predict it perfectly, but that we can’t predict it at all. It’s not that we’re converging on a thing or getting consistently better approximations, it’s that we don’t have any approximation. The system is, in a certain sense, fundamentally chaotic.

And so there’s a kind of inherent limit to the predictability itself, which is not tractable using the kinds of tools of algorithms and causal process that work really well in other fields of engineering. And this is to some extent actually pretty important, because we can say, “Hey, there are certain limits or certain thresholds of safety that we must achieve over certain durations. If I don’t get the model at least approximately right, then it just isn’t going to pass.” And so in effect, what the arguments are showing is that it’s not that we can get arbitrarily close to certainty, it’s that we have no certainty at all. We have literally no information about what the interior state of this thing is going to be at some arbitrary point in the future.

Jim: That is true. On the other hand, we do have the possibility of doing external ensemble testing in the same way we do automated testing on other complicated software, where, let’s say it’s a large language model, you send it a million test probes and you see what comes out the other side. You get some sense of the statistics of the relationship between inputs and outputs.

And I think it’s particularly useful to think about that in the case of these large language models, which are extraordinarily simple, right? They’re feed-forward networks. They aren’t doing any cycling back, they don’t have any memory at all. There’s one pass through, with a few little internal loops, but they loop once and then keep going. And so they’re remarkably simple. And on some of them, I don’t know if it’s GPT or not, you can actually set a seed number so that you’ll get the exact same result. No, not on GPT, but on some of the image generators you set a seed number, so you get the exact same result every time with no [inaudible 00:13:42]. And that potentially gives us, at least for this one class of technology, which everybody’s kind of in an uproar about, the opportunity to do probe-result testing at scale to get a sense of what the statistics are.
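A minimal sketch of the kind of ensemble probe testing Jim describes, assuming the OpenAI Python SDK; the model name, the probe set, and the crude refusal heuristic are all placeholders, and temperature=0 only reduces sampling variance rather than guaranteeing identical outputs from a hosted model.

```python
# Sketch of probe-result testing at scale against a hosted language model.
from collections import Counter
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

probes = [f"Probe prompt #{i} ..." for i in range(1000)]  # placeholder test inputs
stats = Counter()

for prompt in probes:
    reply = client.chat.completions.create(
        model="gpt-4",            # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,            # damp randomness; not a true deterministic seed
    )
    text = reply.choices[0].message.content
    stats["refused" if "I can't" in text else "answered"] += 1  # crude classifier

print(stats)  # rough input/output statistics over the probe ensemble
```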

Forrest: Well, that does make sense, and I basically agree with the overall idea that you have. But there’s a couple of things that we really need to be mindful of, one of which is the dimensionality associated with the statistics. If I have a rough sense as to how to characterize the inputs to the system, and I basically am saying, “Okay, for this range of inputs, this system will produce this range of outputs.” And in effect, I can characterize the input output relation reasonably well. The trouble happens when to some extent there’s the emergence of a relationship between the outputs and the subsequent inputs.

So in any real-world situation, for example, like in the case of GPT-4, obviously people such as ourselves are collecting the output from the ChatGPT thing and saying, “Hey, this is really interesting.” They make an article about it and it appears up on the web, which then appears as part of the crawl input for the next version of ChatGPT.

So in effect, the past outputs of the system can potentially become the basis for the training of the next system. So there’s essentially a feedback curve that starts to emerge over time. And of course, this is a very slow-acting process, because obviously the outputs are not necessarily being put back as inputs or put back as training material right away. Although, when you look at things like the AlphaGo program, or the kinds of things that Google was doing to train that, they were basically creating a sort of self-simulation setup where the outputs were essentially treated as inputs for its own training purposes. And effectively, the system was able to train itself within a few days.

The result here is that once you start introducing these kinds of feedback loops, it’s no longer possible to characterize the dimensionality of the input space or of the output space, and therefore of the statistics or the statistical distributions that are involved. And so in effect, what you start ending up with is an increasing convergence towards what might be called Schelling points, or stable points, within the hyperspace of all possible behaviors. The trouble is that we can’t necessarily know in advance whether or not one of those convergence points is itself representing some sort of black swan condition, or some sort of catastrophic outcome that results from a kind of feedback cycle within the dynamic of how the system is working.
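A toy illustration of that feedback dynamic, my own construction rather than anything from the episode: each generation here is “trained” on nothing but the previous generation’s samples, and sampling noise compounds until the distribution usually collapses onto a single stable point.

```python
# Toy model of model outputs becoming the next model's training data.
# "Training" is just estimating token frequencies; the point is how noise
# compounds once outputs feed back as inputs.
import random

def train(data):
    freqs = {}
    for token in data:
        freqs[token] = freqs.get(token, 0) + 1
    total = sum(freqs.values())
    return {t: count / total for t, count in freqs.items()}

def sample(dist, n):
    tokens, weights = zip(*dist.items())
    return random.choices(tokens, weights=weights, k=n)

data = ["a"] * 15 + ["b"] * 10 + ["c"] * 5   # initial human-written corpus
for generation in range(300):
    dist = train(data)
    data = sample(dist, 30)   # the next corpus is entirely model output

print(train(data))  # usually ends up concentrated on one token
```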

So in this particular sense, it’s not so much that I’m worried about things like instrumental convergence happening, and some sort of foom hypothesis of, oh, this thing’s going to take over the world and start turning us into paperclips. It’s more the case that through the feedback dynamics that we’re looking at, there start to be more and more destabilizing effects within civilization and civilizational process, i.e., how human beings make sense of things, and over time a greater and greater displacement of humans from the loop. So in a sense, we start to see an economic decoupling.

And then over an even longer period of time there’s a kind of necessary convergence associated with the dynamics of how, and I’m saying evolutionary process, but I’m not talking about it as applied to biological things, but actually just as a mathematical model applied to technological things, which effectively compels the systems to essentially converge towards states or capacities, which effectively have as a side effect deep environmental harms, not just to humanity, but to life and nature itself. And these sorts of factors might not be the kind of thing that would show up in say a few years or even a few decades, but do show up over hundreds of years.

And in that particular case, we’re actually talking about something that’s quite pernicious, but at that particular point, the issue is that there aren’t humans in the loop at all to constrain the behavior of the system on any basis. And that we can’t replace human oversight with machine oversight because of things like the Rice theorem, which effectively is a result having to do with the dimensionality of the space itself.

So those aspects where we’re basically saying, “Hey, we can’t predict in advance whether or not the statistics that we’re doing on the system have the right kind of dimensionality to even observe the sort of outcomes that are relevant,” it’s inherent in the nature of the system itself that we wouldn’t be able to characterize what dimensionalities would be relevant in advance using any algorithm or any kind of structural basis at all.

Jim: Yep. Yeah, let me take that idea, and other conversations I’ve been having with people, and suggest sort of three big clumps of AI risk. I’m going to suggest it’s useful for people to clump AI risk these three ways so that they don’t get confused. When I read the stuff in the popular press, I go, “Jesus Christ, these people aren’t thinking clearly. It’s like listening to grandpa talk about rock and roll in 1959.” And maybe we can help get some language out there. I’d love to get your reaction to this, and then I’m going to suggest that part three opens up into a number of more nuanced sub-ideas that you have.

So number one, and this is where so much of the traditional AI risk work, the research, has been, is what I’ve started to call Yudkowskian risk, after Eliezer Yudkowsky. The idea of an AI, in fact, it’s the Vernor Vinge and [inaudible 00:19:11] singularity: we get a machine that’s 1.1 times as smart as a human, and we give it the job of designing its successor. Six hours later, it’s a billion times smarter than we are, and by the end of the afternoon it kills us, right?

Forrest: I call that the foom hypothesis, F-O-O-M, it just blows up, basically.

Jim: Yeah. Yep. Fast takeoff.

Forrest: Category one risk, what I call instrumental convergence risk.

Jim: Yes. And that’s instrumental convergence risk.

The second risk is, and this one I’ve actually been pushing for years, saying I am glad people like Eliezer are out there, and [inaudible 00:19:44] and Tegmark and others, because there is that risk. And whether it’s a fast takeoff or a slow takeoff frankly doesn’t matter on galactic time scales. If it’s 10,000 years or if it’s 20 minutes, it really doesn’t matter that much. It does tactically. But on the scale of the history of the universe, it doesn’t matter.

But there’s a much earlier risk, which is people doing bad things with strong narrow AIs, right? Let’s take for instance the surveillance state that’s emerging in China. You don’t need anything close to an AGI, artificial general intelligence, to do facial identifications, keep track of where everybody is, be able to vector a cop to harass them when the algorithm says they got a 25% chance of participating in a demonstration. You could really build a state-of-the-art police state with the technologies we have today, which is probably a long way from AGI.

And then other examples. Let’s imagine, and we may be getting close to this with GPT-4 and probably closer still with GPT-5, the ability to write advertising copy that’s so amazingly persuasive that it really overcomes human resistance. Whether that’s possible or not, I don’t know. But the whole art and science of advertising has been to attempt to do just that, at least since Bernays in the twenties. If our AIs are put to this task and specifically designed for this task, they could well get there.

So that’s class two: people doing what most of us would agree are intrinsically bad things with AIs that are far short of a Yudkowskian paperclip maximizer.

Forrest: I tend to refer to this class of things as essentially inequity issues. So in other words, everywhere that you destabilize human sense making, everywhere you destabilize human culture, human economic processes or social political processes, using AI to essentially swing a vote for a given candidate, for example.

So whether we’re talking a totalitarian state or democratic one or an economic process or an advertising process or one subgroup gaining some sort of power over some other subgroup, that all of these things show up as inequity issues. And so I think of this as essentially being, as you were mentioning it, sort of the second category of AI risk. But in this particular sense, it’s a fairly broad one because it attaches to both narrow AI and general AI.

Jim: And then the third, and this is where I think your conversation has been helpful, and other people are starting to see this too in different ways, is that even if we were able to have a regulatory rulebook to avoid the worst of the obvious bad narrow AI things, and somehow we convince people not to proceed to create the first class, the paperclip maximizer that’s a billion times smarter than we are, there’s a third risk, which is that these AIs are increasing the rate and increasing the capacity of a series of players that are already on a doom loop, the so-called metacrisis. The fact that so many of our businesses and nation states are caught in multipolar traps where they’re forced essentially to respond to each other’s behavior by doing more and more bad stuff. AI could be thought of as an accelerator on that whole system. So even if nothing explicitly bad was being done, and even if billion-times-human intelligences aren’t being created, the mere acceleration of what we call Game A trends that are already in place is by itself a very dangerous and bad thing.

Forrest: While I will acknowledge that we could categorize this as a third category of risk, for my own part, I tend to lump it in with the second category. The third category for me is what I call the substrate needs convergence, which maybe is a flavor of what you’re calling a third category.

So when I think of the business systems, for example, competing against one another, effectively I’m treating the businesses as if they are a component within the AI system, rather than treating the AI system as a component within the business. Because a business is an abstract entity, it’s not a thing in the world in the sense of trees and buildings. It’s a series of agreements and legal structures. So in this particular sense, between the idea of there being a system associated with business, and again this sort of abstract notion of a system associated with AI, which of course is encoded in substrate the same way that businesses are encoded in buildings and people and such, there is a sense in which businesses, or institutions in general, and artificial intelligence architectures are not actually that dissimilar from one another.

So in this particular sense, I would say yes, there is a kind of multipolar trap dynamic. But the nature of the risk shows up in the feedback loop, in the sense that in the same sort of way that institutions, for example, may compete against one another and end up in a kind of arms race, there is a sort of side effect that occurs with respect to the environment, or the playing field, in which the contest is happening.

So in effect, it’s not usually the case when we see two teams of football players playing a game, for example, that we’re that concerned about the AstroTurf of the stadium being destroyed as a result of them playing the game. But in the case of things having to do with, say, geopolitics, or institutions such as governments competing against one another, or institutions such as businesses, many of which are at a scale that is already larger than most countries in the world, it does become an outcome of the game that the territory in which the game is being played is manifestly damaged by the act of playing, or in the case of war and so on.

So in this situation, we’re basically saying, “Okay, there’s an environmental hazard.” There’s a kind of background process that is essentially going on, that is basically converging towards the needs of the institutions, or towards the needs of the artificial substrate, the AI systems themselves, and that doesn’t really have very much regard for the wellbeing of the playing field or the wellbeing of the environment; in this case, the human world, human interactions, social and cultural dynamics, and/or ecosystem dynamics involving all the rest of life. So when we think about the kinds of things that are needed to promote organic chemistry, in the sense of people making love with one another and producing children, almost all of the reactions involved use a simple alphabet of elements, carbon, nitrogen, phosphorus, sulfur and things like that, at ordinary temperatures and pressures, i.e. room-temperature-type conditions, with pressures and humidity levels that are relatively benign. But when you think about the kinds of things that machines are made out of, or that institutions are made out of, the businesses, and the buildings and so on, the conditions necessary for the formation of those kinds of things involve temperatures well north of 500 degrees centigrade, and don’t even really get interesting until you’re up to 1500 degrees centigrade, i.e. way too hot for the kinds of conditions at which cellular-based life or organic life would be able to endure at all.

And then when we’re looking at the operating conditions in which most computers or systems would want to operate, it’s much, much colder and much more sterile, in the sense of water and so on. In effect, any choices that the machines or the institutions make to favor themselves are fundamentally choices that are hostile to life, hostile to humanity. So in this sense, this third category of existential risk is to say that when we’re looking at the degree to which these AI systems are interacting with one another, or the businesses using them are interacting with one another, the playing field that they’re damaging is essentially human relationships, human cultures, life systems, planetary systems, ecologies and so on. And so in effect, it’s not something that shows up in the short term, it shows up over a hundred years or so, but the level of convergence, or the level of perniciousness, or the degree to which there’s essentially an asymmetry that favors system interaction over life-world interaction, tends to be very, very, very strong.

It’s very, very asymmetric, in the same sort of way that human processes for the most part haven’t had much regard for the natural world. To some extent we’re in a sense creating conditions which are fundamentally hostile to human life. And this, again, is an existential risk category that is basically building on what your third category was, but strengthening it to say that it’s not just that these things are competing with one another and gaining in capacity, but that, as a side effect of their gaining capacity, they are effectively displacing choice and capacity from ecosystems and from human civilization itself.

Jim: Now at the basic level of material, let’s say the material depletion of the world, yes, if we turn the whole world into a data center or a chip foundry, those would not be places suitable for our kind of life. But we would have to go a very, very, very, very long way, more than a hundred years, I think before any significant percentage of the earth’s surface was covered with data centers and chip foundries.

Forrest: Well, actually it’s not as slow as you think. So for instance, look at just the last 200 years and the degree to which technology has affected the surface of the earth. If we were to basically look at what percentage of the surface of the earth is covered by roads, or by buildings, or has been converted to human use, i.e. is in some way constrained by or defined by what would be our intentions or purposes for that land, the amount of wild, unaffected space has been going down as a percentage of the earth’s surface by a very noticeable amount over that period of time. And that’s just 200 years.

If we’re looking at the part of the earth that’s affected by issues having to do with global warming, or the distribution of plastics, or an increase in lead in the atmosphere, or an increase in the level of radioactivity or any of those kinds of things, it isn’t the case that obviously that the whole surface of the earth would be converted to foundries, but the toxic side effects of those things happening don’t just stay where the foundry is, they go basically everywhere else. So in a sense, what we’re looking at is what is the degree of toxicity associated with the deployment of technology or the deployment of systems essentially that aren’t necessarily designed with any kind of life world concerns in mind?

So in this particular sense, we are actually seeing very strong convergence dynamics. And again, as you mentioned with the first existential risk, it might not be the case that instrumental convergence happens in 1 year, or 10 years, or maybe even 100 years, although I think that’s a bit of a long timeline for that particular concern. But for these convergence dynamics, we can calculate something about how they’re going to look. And basically inside of a thousand years, you’re looking at very, very strong and very, very high probabilities of levels of toxicity which would be essentially fundamentally disabling of all cellular life.

So in this particular sense, and this is an example of the kinds of geometric trends that we’re looking at, is that it’s actually quite hard for us as human beings using ordinary sense and reason to think about these kinds of things. But if you look at, for example, the level of energy usage that has happened over the last thousand years and the degree to which energy usage per capita has increased, there’s a fairly steady and significant increase in the amount of energy per capita across the whole of history. And that curve has a very specific growth rate. It has a very specific exponent. And if we basically say, okay, well if the growth of energy usage happens at the same sort of rate that it has for the last thousand years, then we can pretty confidently predict that well before 400 years from now, that the surface of the earth would be hotter than the surface of the sun just from the waste heat.
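For anyone who wants to check the flavor of that extrapolation, here is a back-of-envelope sketch using the Stefan-Boltzmann law; the consumption figure and the growth rate below are rough assumptions of mine, not numbers from the episode, the horizon you get is extremely sensitive to the growth rate you plug in, and this crude model ignores incoming solar energy entirely.

```python
# Back-of-envelope: if human energy use grows at a fixed annual rate, when
# would waste heat alone have to be radiated at a given surface temperature?
import math

SIGMA = 5.67e-8          # Stefan-Boltzmann constant, W / (m^2 K^4)
EARTH_AREA = 5.1e14      # Earth's surface area, m^2
CURRENT_POWER = 1.8e13   # ~18 TW of human energy use today (rough)
GROWTH = 0.023           # assumed 2.3% annual growth, roughly the historical trend

def years_until_temperature(kelvin):
    target_power = SIGMA * EARTH_AREA * kelvin ** 4  # power radiated at that temperature
    return math.log(target_power / CURRENT_POWER) / math.log(1 + GROWTH)

print(round(years_until_temperature(373)))   # boiling-point surface: ~450 years with these inputs
print(round(years_until_temperature(5800)))  # solar-surface temperature: ~940 years with these inputs
```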

So in this sense, there are some very clear indications that our future can’t resemble our past in things like energy usage, or increase in economic development, or rate of invention and so on and so forth, without having these sorts of exponent-based equations basically describe things that are so different from the world now that the very idea of how life would happen just doesn’t even make sense anymore. So in this sense, when we’re saying, hey, there’s a dynamic occurring in this third category of existential risk, which is accelerated by the second category of existential risk and combines particularly badly with the first category of existential risk associated with artificial intelligence, then the overall picture starts to become very sobering indeed.

Jim: Yeah. I’m still not sure I buy, maybe my vision isn’t out far enough, the environmental degradation from our IT children. Because when we look at where the real environmental degradation has occurred, it’s absolutely dominated by agriculture. A relatively low-tech, though getting more high-tech, activity. That is what’s killed off species, that’s what’s made us and our mammal servants, like cows and sheep and stuff, outweigh all the other mammals about 20 to 1. It’s agriculture. And agriculture operates on a vast scale. And if you had to say a second thing, it’s roads and cities, and they’re again nowhere near as vast as agriculture, as it turns out. If you add up the land use, it’s a small percentage compared to agriculture. But the area under pavement, let’s say, is really, really big compared to any imaginable, maybe it’s just a lack of my imagination, cluster of data centers and chip foundries.

We’d have to have a whole lot of growth. There’s only a few hundred chip factories in the world; they probably fit in less than a hundred square miles. Data centers are bigger than that, but I don’t know, they’d probably all fit in a thousand square miles. And further, and this is hugely important because this is perhaps part of the remediation path or the heading it off at the pass, is you talk about these exponentials. Yeah, exponentials keep going up until they don’t. Famously, Herb Stein said, “If something can’t go on forever, it won’t.” And this is the essence of the Game B critique of Game A, is that Game A was built to drive exponentials as rapidly as possible, which worked fine in 1700 when the world’s population was 650 million and the average human was using on the order of a few hundred watts of power per person.

And we’ve already overshot the edges of the sustainable boundaries when we have 8 billion people, about 12 times as many, each one consuming more than 10 times as much. So we’re at roughly 130 times as much degradation of the environment as we were causing in 1700. And so what was reasonable to do in 1700, drive exponentials, is now literally impossible. We can’t do it. We will die. So I’m not too worried that we keep building data centers until the earth’s surface is hotter than the sun. That ain’t going to happen. And there is fortunately a fairly simple rule, though of course it’s going to be a collective action problem to get there, and that’s of course part of what Game B’s thinking is all about. For instance, suppose we were to cap energy at 4,000 watts continuous per person, which is about the level of energy consumption in Portugal, which is not a super built-up fancy place, but it’s not a hellhole either. It’s a pretty nice place actually. It shows that good-quality human life is possible at 3,500 watts, essentially.

And if we did that, and we look at the learning curves and production rates on our low-carbon energy technologies, it looks quite plausible that we could reach zero carbon at 3,500 or 4,000 watts per person for everybody on earth. And if we have that as an external boundary condition, that by itself stops the runaway depredation of the environment of the sort you’re talking about.
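As rough arithmetic on that cap, my own back-of-envelope using only the per-person figure Jim mentions and an approximate current-consumption number:

```python
# Rough arithmetic on a per-person energy cap, for illustration.
population = 8.0e9       # people
cap_watts = 4000         # continuous watts per person (the Portugal-level figure)
capped_total = population * cap_watts
current_total = 1.8e13   # ~18 TW, approximate current global human energy use

print(f"Capped world total:  {capped_total / 1e12:.0f} TW")   # 32 TW
print(f"Current world total: ~{current_total / 1e12:.0f} TW")
# A hard per-capita cap turns the exponential into a bounded quantity:
# total demand then scales only with population, not with compounding
# per-capita growth.
```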

Forrest: Well, again, you’re talking about it as a kind of collective action problem. We’d have to have some sort of choice that was collective enough and essentially understood by the major stakeholders in a way that was actually favoring that outcome. So in this particular sense, I agree that the real leverage is to develop the capacities among human people to really be able to make choices like that that are actually favoring our own wellbeing and favoring the wellbeing of the world that we live in. And I agree with you that it’s not like I’m saying the whole world’s surface will be converted into chip foundries, but I am saying that in the same sort of way that the degree to which agriculture has been an issue is largely driven by the population and the number of people in cities, because it’s not that everybody in the city is out in the country tilling the fields, but that there’s a demand created in the cities for there to be existing fields of that kind.

And so in this sense, we’re basically suggesting that over time that there is essentially more and more demand for technology type processes to be happening just anywhere in the world. And that in effect, it’s not the case that we can just have technology, which is a linear process. It takes materials, consumes them and produces machinery, and then eventually pollution of one sort or another. So when we’re talking about fundamental toxicity, we’re saying either there’s a depletion of something or there’s too much of something. And so in this particular case, we’re saying that as a side effect of the demand first created by human beings for technology, and then as a result of technology creating a demand for itself, because eventually you’re decoupling the human elements away from the world of what the machinery is using. So for instance, at this point I would say there’s three basic markets.

There’s the market of physical labor, i.e. we can pick up something and carry it to some other place, or we can build something, think of that as essentially involving atoms. And then we can talk about a second market, which has to do with things like intelligence, and creativity and the ability to solve problems. That market obviously is something that we’re talking a lot about these days. It’s already been the case that human labor, for example, is not necessarily treated as the best way to get things done in many cases. If we can replace a person with robots, a factory owner is largely going to want to do that, because over the long term, the robot’s going to be cheaper to use as a way of producing products than say a person would be who has all sorts of legal issues associated with that and needs various kinds of environmental conditions to suit them and so on.

So when we’re looking at the third market, basically being the market of reproduction, i.e. what produces families, there’s a demand for people to have the resources to do that. So in this particular sense, we can say very strongly that the machines are already faster, and stronger, and more durable and robust than human beings are. And at this particular point, with things like ChatGPT and the various kinds of things that we’re starting to use now as tools for making art and so forth, there is starting to be a condition where we’re saying, well, there might not necessarily be, 10 years from now as these systems become more well-developed, very much of a circumstance in which human beings are going to be able to participate in the economic process. In other words, if you look at the degree to which human labor, or human intelligence, or human reproductive capacity is still needed; obviously the machines, we’re not mating with robots per se.

I mean, I’m not thinking about cyborg issues per se as being a thing that the machines would demand. That would be something that human beings might want. But the idea here is that we’re being factored out of the economic system, both on a labor basis and on an intelligence basis, and we were never going to be part of it on a reproductive basis. So in this particular sense, speaking to the inequity risk, which was the second one, over time the human utility value goes to zero. And so the asymptotic approach of that basically means that there’s less and less constraint on the third risk, of technology essentially wanting to produce more technology for the sake of technology. In other words, it becomes a self-driving system the same way that human beings want to produce more human beings. So in the same way that the population of human beings in the world has climbed from roughly a few million maybe 10,000 years ago to something like 8 or 9 billion over the span of our species, in the future you would have essentially technology reproducing itself for its own sake.

And this of course represents a real problem, because in that sense, it is an environmental hazard. It’s not necessarily just the foundries, but all of the mining that happens to supply them, all of the special and exotic chemistry that’s involved, and all of the systems that are needed to do that. A foundry currently involves something like six continents’ worth of resources and coordination to basically be able to produce one microchip for a cell phone, for example. And the level of technological development associated with the foundries in Taiwan represents essentially million-man-year or billion-man-year kinds of efforts, and that’s with the creativity that we have currently. But say, for example, that most of that creativity gets replaced with machines basically designing machines, then at some point there are no real brakes on the degree to which technology becomes self-sustaining and self-reproducing.

At that point, it may be just a matter of time before essentially economic processes and or environmental concerns continue to converge to something where the economic system displaces human beings completely, and or the technology system displaces life completely. And it won’t necessarily be because it’s the same thing everywhere. It’ll be just because it’s different from what life is today.

Jim: Ah. But yes, I could see how you could run into a scenario where that could happen. And I think, well, you labeled in your writing this additional thing as the economic decoupling. And I do see how that does fit pretty well with calling the second one an inequity, or what I might call an asymmetry, issue. Of course, some people call that possibly a good thing. There’s this concept of fully automated luxury communism, which is when we get to the point where machines can do everything better than we can, at least with respect to provisioning, we should be happy to let them do it, and we can spend our time singing, and dancing, and fucking, and eating, and drinking, and rolling up a fat doobie, getting a banjo out and plunking away. I mean, we can do some really interesting things. And maybe there are some things that only humans can do, like science, until we get to AGI at least. It may not be possible for narrow AIs to actually be able to do science.

But even if there isn’t anything else for us to do, we can nonetheless take the provisioning problem away, which has been the fundamental driver of all life. The idea that what drives Darwinism is making a living and then sometimes failing to make a living. And then in human cultures, it’s been the provisioning question brought more at a higher level where we had famines, and die offs, and people conquering other people and stealing their, et cetera. Maybe we get to a point where the provisioning problem goes away and we have a grand, glorious future for humanity.

Forrest: I would love to believe that. But I find myself remembering the whole Luddite movement, and this has been part of the online conversation too. So you had a group of people who were making textiles, and you had owners of the kinds of companies that would employ textile makers, and then somebody invented a device that would effectively make the textiles in an automated way. And the people were basically saying, “Hey, well, what do we do?” And there were prosperity claims being made: the machine would do the work for the human beings and everybody would have more leisure time. And in fact, that argument that we would have more leisure time was widely promoted around the early part of the 19th century. And so in effect, we would say, okay, well, that’s an optimistic perspective.

But what actually happened was that the proceeds from the automation weren’t evenly distributed. In other words, the people that purchased the equipment to make the textiles, the owners of the capital in that particular case, were the beneficiaries of the profits associated with the increase in productivity from the machinery. And those benefits weren’t distributed to the workers that were displaced. They were effectively made to be in service to the people who owned the machinery, in this case, the factory. And so in this particular case, the technology resulted in a fundamental increase in the level of inequality associated with that society. And similar things are happening today.

In other words, we’re basically suggesting that the kinds of economic drivers, and the kinds of human factors about just what people choose to do and how they make choices, are the same. The artificial intelligence systems require large data centers and essentially the pooling of huge amounts of data produced for the commons by billions of people operating on the internet, and then the capacity to gather all of that information which was produced for the commons and put it into service for private interests. The equipment and the kinds of costs associated with processing all that data and so on basically mean that the artificial intelligence systems are a bit like the textile manufacturing machines: they’re expensive to own, they’re expensive to operate, and more than that, they’re way complicated.

So in this particular sense, to even just understand how to use these systems, or how to build them, or how to engineer them, let alone to be able to afford to pay engineers to do this, or to pay people to collect the data, or to essentially develop the data centers to do the training for the AI systems, that in all cases somebody’s going to end up paying for that, and whoever has the capacity to pay for that is also going to be the person that’s going to be asking for a return on their investment.

Jim: Well, that’s only if you accept the current capitalist paradigm. I mean, there’s a reason that the person titled the book Fully Automated Luxury Communism. The assumption was the means of production would be socially owned. And that’s not that big a leap. And again, I’ve studied a lot of this stuff, and one of the reasons that communism didn’t work in the 20th century was the Ludwig von Mises calculation problem. It turns out that it’s just really, really hard to figure out the distribution, consumption, production, savings and investment cycle in any kind of centralized way. At least it was really hard in 1922 when he wrote the paper, and it remained hard at least until 1999, maybe. But with systems of this sort, it may well be quite possible that we can automate the calculation problem and get something like actual social ownership of the means of production. And particularly when we have such high capacity to make things relative to a limited human capacity to consume, we can actually square the circle and make social ownership work at a high enough level of efficiency that it beats the alternatives.

Forrest: Unfortunately, I must disagree, because the problem that you’re describing as to why socialism didn’t work, I would characterize that as essentially asking the wrong question. So in effect, when we say, okay, well what is it that actually went wrong? We can say, well, maybe we weren’t able to calculate how to do the distribution of resources

Jim: And production and everything, what to make, how much, when. All the problems of the whole arc of provisioning are very, very complex, literally, and were intractable to central planning.

Forrest: I would say the notion of central planning is based upon algorithm or system, or more fundamentally on the notion of causation. The real issue is the relationship between change, which is something that’s just going to happen in the world. It’s going to happen naturally no matter what; we can’t constrain change. But we can use causation, and systems, and calculations and so on and so forth to try to design outcomes that are consistent with what we want. But somewhere along the way we have to actually account for what it is that we want and what is meant by the values that would be the basis on which we would do resource allocations, or think about what the notion of fairness actually means, or what it means to have a high-quality life.

And in this particular sense, the issue is that you can’t replace choice with machinery. You can’t replace essentially human values with an algorithm. And the effort to do that is essentially to be blind to the kinds of issues that were the reason things like socialist systems failed: they didn’t actually account for the nature of the human animal and the kinds of biases at a psychological level that show up at a sociological level. At this point, although we’ve become very, very skillful at working with causation, we haven’t developed a similar level of skillfulness in thinking about choice, individually or collectively, in a way that compensates for the kinds of things that have happened as a result of evolution, let alone for the kinds of things that have been enabled by technology.

So in this particular sense, I don’t think it was because we couldn’t do the calculations, but because there was essentially a very strong incentive not to. In a lot of ways we can’t just ignore the effect of human beings being willing to do this, or to try to subvert the results, or at an even more basic level, to try to use the causation associated with an algorithm or with some system to essentially favor their own benefit against the benefit of the commons. So in a sense, anytime you have a causal system in place, or anytime you have a system in place like a system of laws or a system of dynamics, it’s going to be the case, just as a side effect of the fact that there’s some causal relationship between inputs and outputs, that some creative person is going to figure out a way to arrange the inputs in such a way that the outputs are going to favor their private benefit.

So in a sense, if you’re looking at that without actually looking at the nature of how we make choices collectively, to essentially be to both private and public benefit, to have a level of choice-making process that is fundamentally anti-corruptible, then in effect trying to have things work on the basis of some ultimate system is just going to fail.

Jim: Yeah. No, and I think you may be straw-manning this a little bit. At least my vision of using the algorithms to figure out the production, distribution, consumption, savings, investment and innovation curve is about coordination, not necessarily the decisions. And we would still use social mechanisms of some sort, not yet designed, to choose what should be made, approximately at least. So I think we’re not assuming taking humans out of the loop entirely and just having the algorithms make whatever the hell they want. Clearly there has to be a whole series of rather sophisticated loops, within loops, within loops, which transfer steering information, governance of the machine, from the human to the automata. And I think that’s a very different thing than what you were using there, which I might describe as a bit of a straw man.

Forrest: Well, if you’re basically having the AI systems make choices on our behalf, then my argument would apply. If you have-

Jim: On how. Let’s say we have the machines where we give it general guidance. We want enough clothes, we want enough choices in clothes, we want enough food, the following amount of choices in food. We give some significant parameterization of what our desires are for the coming month, let’s say, by some form of social process. And then say, “Mr. AI, go do that.”

Forrest: How do you know that you’ve solved the principal-agent problem? As the principal, you’re saying, “Hey, I want you to operate to the benefit of humanity.” But at that particular point, having the means of production and the capacity to arrange outcomes and so on and so forth, why wouldn’t it basically look like it’s following your instructions, and seemingly for a short while do so, but ultimately favor its own self? How do you know that the system itself doesn’t become corrupt to favor its own production?

Jim: You’d certainly have to watch it very carefully. And fortunately, that’s one of the good things about the coming world, though it’s also a bad thing about the coming world: we will soon have radical transparency about everything. We’ll be able to see what’s going on minute to minute, and if the system starts to diverge, we can intervene.

Forrest: Well, again, once you have visibility of that particular kind, it represents a kind of power. And so now we’re basically saying, “Well, who has the power?” And if they have the power, then effectively they have the option to affect the thing in a conditional way, i.e. causal process is being applied. And anywhere you have causal process, I’m just going to re-raise the argument about how do we know that that causal process isn’t going to favor the choices of some minority over some majority? So in effect, anywhere that you have causation that can be leveraged in service to the choice of some and against the choices of others, that’s fundamentally what a weapon is. A weapon is any process of a person using a causal system to suppress the choices of some other person. And so in effect, when we’re looking at technology itself, it’s creating essentially a high level of causal dynamics, in the same sense you were saying earlier, that there are linear systems, they take inputs, they process them, and they produce outputs. In a certain sense, we can even regard a lot of the ones that exist today as being deterministic.

But that just basically means that the people that are controlling the inputs, or that are designing the systems, are to some extent investing those systems with their own agency. They’re investing those systems with their own desires for the outputs to be shaped a certain way. And so in this particular sense, you can ask the same question: I might be able to type into a Google search bar a query term, and it might produce search results that come back, but how do I know that the search results that are coming back aren’t just favoring some corporations over others, or some advertisements or some candidates over others? So in effect, the system seems like it’s operating with respect to my interests, but it isn’t actually. It’s operating with respect to the interests of whoever’s paid to have this particular link promoted over that one.

And so in this particular sense, I’m saying that AI, to the degree that it becomes more and more intractable or inscrutable to human understanding and human interest and so on and so forth, then the degree to which it can become in service to corruption of various kinds, increases without limit.

Jim: Yeah, that’s the class two problem. That people use AI for, let’s call it, just bad purposes. I would suggest that it seems unlikely, though I know Eliezer disagrees, that feed-forward neural nets like the current LLMs… And I know a lot about cognitive psychology and the science of consciousness and agency and causality, and seeing agency arise from the current class of AI strikes me as unlikely.

Forrest: Agreed.

Jim: And that gets you to the third level. And again, this third problem, which is AIs embedded in what’s already a corrupting and difficult system of multipolar traps, people trying to exploit each other, et cetera, is where the real danger comes in. And this is where we’re about to pivot to another part, which is talking about what we do about this. But before we do that, I realized I forgot something. Which is, you made a great point last time, something I had never caught before. Which is that this class three, let’s say, multipolar traps, corporations, militaries, et cetera, armed with decent but not overpowering AIs, potentially has a ratchet effect that could easily lead through the multipolar traps essentially to creation, accidentally or on purpose, of the type one risk, the Yudkowskian fast-takeoff brain. Why don’t you talk about that a little bit and then we’ll switch to civilization design.

Forrest: Well, just to finish off the last piece, which is just that over time, when we’re looking at the level of technological development, like the arms race you’re speaking to, the current systems, as you said, probably don’t have agency. But with an arms race of the kind that you’re talking about, with one government basically trying to figure out how to build systems that are more autonomous, you would end up with things having agency relatively quickly. I mean, if you’re looking at the Russia-Ukraine war for example, there are already strong efforts on both sides to basically try to develop autonomous vehicles of one sort or another that effectively act as war fighting machinery to replace the soldiers. And so in effect, you end up with a certain level of autonomy built into the system much the same way we were talking about last time, which is the principal-agent problem again: we give it a general instruction, “Hey, we would like you to do X,” and then it’s left up to the system itself to figure out how to implement X in its current environment.

So in this particular sense, when we’re looking at what capacities are needed for an autonomous vehicle like that, well, one of them might be to preserve its own survival, to preserve its own capacity to function. And maybe that becomes an instrumental convergence thing in the sense of, well, if you can, make more of yourself so that you become an army rather than just an individual element on the field. In this particular sense, there’s a risk that when we’re saying, “Hey, win this war for me,” which would be a very general instruction, it has as sub-components survive, endure, reproduce, and to a large extent it becomes harder and harder for us to know when the brakes would be put on, when there would be a kind of, “Okay, the war is won, you can stop now.” That particular instruction might not arrive with the same level of authority or with the same level of commitment once the thing has 1,000 instances of itself on the field.

So in this particular sense, there is a sort of multipolar trap dynamic, which can drive an instrumental convergence factor. But to make this a little more visceral for you, when we’re thinking about, say, what is the likelihood that we would end up with this thing developing agency of its own, it’s really important to keep track of the time scales in which we’re making these arguments. So for instance, yes, it might be the case that things like the large language models are linear, feed-forward dynamics, and in some sense deterministic in that particular way. That doesn’t necessarily mean that over time we won’t see feedback loops emerge that result in the kinds of general intelligence that would have the kind of agency that we would be concerned with. Even with the LLMs as they stand now, that’s already a possibility.
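
To make the feed-forward versus feedback-loop distinction concrete, here is a minimal sketch; nothing in it comes from the conversation, and the model call is a trivial stand-in rather than any real API. A single call is a stateless mapping from input to output, but once the same call is wrapped in a loop that carries a goal, remembers its own outputs, and acts on an environment, the composite system starts to behave like an agent even though the underlying model never changed.

```python
# Sketch only: the model call is a toy stand-in, not a real API; the point is
# the structural difference between a stateless call and a feedback loop.

class ToyEnvironment:
    """A trivial world that counts how many actions have been applied to it."""
    def __init__(self):
        self.steps_taken = 0

    def execute(self, action: str) -> str:
        self.steps_taken += 1
        return "done" if self.steps_taken >= 3 else f"observed effect of: {action}"

def call_model(prompt: str) -> str:
    # Stateless feed-forward step: output depends only on the current input.
    return f"action derived from [{prompt[-40:]}]"

def single_shot(question: str) -> str:
    # Pure feed-forward use: no memory, no persistent goal, no side effects.
    return call_model(question)

def agent_loop(goal: str, env: ToyEnvironment, max_steps: int = 10) -> list[str]:
    # The same stateless function, wrapped in a loop that feeds its own outputs
    # and the world's responses back in, with a goal that persists across steps.
    # The loop structure, not the model, is what starts to look like agency.
    history: list[str] = []
    for _ in range(max_steps):
        prompt = f"Goal: {goal} | History: {history}"
        action = call_model(prompt)
        observation = env.execute(action)
        history.append(f"{action} -> {observation}")
        if observation == "done":
            break
    return history

if __name__ == "__main__":
    print(single_shot("What is 2 + 2?"))
    print(agent_loop("tidy the warehouse", ToyEnvironment()))
```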

When we’re looking at, say, the rate at which the capacity of these systems is increasing, and again, we’re going back to some of these geometric curves that we mentioned earlier, we can start to say, okay, well how big is the human brain? We know that’s a general agency thing, and obviously, for whatever we would think is the meaning of consciousness or intelligence or agency, clearly something like that is going on in individual people. And in this particular sense we could say, okay, well how powerful are these large language models, and what is the rate of increase given Moore’s law and various analogs of that, as shown in the rate of manufacturing capacity or the degree of miniaturization and so on? So in this sense, we can say, okay, given current trends of technology development, and of course we don’t know if Moore’s law might quit at some point or another, but given current trends and economic drivers and so on and so forth, there’s a large number of reasonably well-calibrated reasons to basically say that the compute power and the complexity levels associated with these large language models, or analogous systems of whatever type that are developed for intelligence, will be at the same level of compute capacity as an individual human brain, or a large collection of human brains, probably by 2035 at the latest, and maybe considerably sooner than that.

So for instance, if we’re saying that the artificial intelligence systems that we’re building have basically no gain, no efficiency increase, no capacity increase, and can’t be designed in any more efficient way than neural networks in wetware, then in some respects we’re saying, okay, even as the most conservative outside estimate of when equivalence would be achieved, it would be well before 2035.
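
As a rough illustration of the kind of back-of-envelope extrapolation being gestured at here, the short sketch below is purely illustrative: the brain-equivalent figure, the starting compute, and the doubling time are all assumed placeholder numbers, not values from the conversation, chosen only to show how a crossover year falls out of a steady geometric growth curve.

```python
# Back-of-envelope sketch of a "compute parity by roughly 2035" style argument.
# Every number below is an assumed placeholder, not a measured or cited value.

import math

BRAIN_OPS_PER_SEC = 1e16       # assumed rough figure for one human brain
CURRENT_OPS_PER_SEC = 1e15     # assumed sustained compute of a large AI cluster today
DOUBLING_TIME_YEARS = 1.5      # assumed effective doubling time (hardware plus efficiency)
START_YEAR = 2023

def crossover_year(target_ops, current_ops, doubling_years, start_year):
    """Year at which compute reaches target_ops under steady geometric growth."""
    doublings_needed = math.log2(target_ops / current_ops)
    return start_year + doublings_needed * doubling_years

print(round(crossover_year(BRAIN_OPS_PER_SEC, CURRENT_OPS_PER_SEC,
                           DOUBLING_TIME_YEARS, START_YEAR), 1))
# With these placeholder inputs the crossover lands around 2028; a slower doubling
# time or a larger brain estimate pushes it out toward, or past, the 2035 figure.
```

With the same placeholder numbers but a three-year doubling time, the crossover lands around 2033, which is the sense in which 2035 reads here as a conservative outer bound rather than a prediction.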

Jim: This is where, with some of the things I know about, I can push back on that. Things like agency, consciousness, the nature of cognition-

Forrest: I haven’t made any claims about that, I’m just talking about sheer bandwidth. That’s it.

Jim: Okay. Okay. Okay, good. Because I was going to say architecture matters a lot. And you can have all the bandwidth you want, but if it’s not organized in a certain way, you’re going to have a completely different kind of cognition. So okay, let’s say it has the pure power of a human… And we’re probably not far off; I would say by ’27 or ’28 we’ll be at the point where we have the computational bandwidth of a human relatively easily.

Forrest: When I say 2035, I’m being really conservative, right?

Jim: Yeah. 2027, 2028, because you even take today-

Forrest: You’re more optimistic than I am. I get it.

Jim: Yeah. The Nvidia H100 Tensor Core boxes, you put them on a super high-speed switch, embedded in the context of not that much actual CPU, because you don’t need that much linear CPU with all this other stuff-

Forrest: I get it.

Jim: But anyway, pure bandwidth is not the same as the equivalent of human capacity.

Forrest: Understood. But on the other hand, if we’re looking at it from an architectural point of view, the question then becomes: is it more likely that we would be able to design an architecture that works, given the results that we’ve achieved already? So in other words, to a large extent we’re not actually that far away from being able to say things like, well, we were actually surprised. Most people are surprised that even with a relatively weak architecture, such as those associated with large language models currently, the preponderance of evidence suggests that we would be able to have architectural convergence to at least agentic systems.

The notion of intelligence is just to say something like: it is appropriately responsive to its environment, or it’s appropriately responsive given these inputs. That’s already happening. The next question becomes agency, and I don’t need to care about consciousness. Of course, as a metaphysicist I could speak to the issue of consciousness in ways that may be surprising for a lot of people, but leaving that aside, we can basically point to the notion of: is agency a thing that happens?

Well, actually, surprisingly enough, it’s already happening to a large extent. Because, for example, for any code that is produced for an app that’s on your cell phone, whose agency does that express? It expresses the agency of the developer. And then, as a secondary thing, the agency of the app user. So in effect, the notion of software, the notion of code as an embodiment of the agency of the developer, is already happening. There’s a sense here in which, when we think about the idea of writing, for example, we say, well, the writing encodes the meaning of the author; in the same way, software encodes the agency of the developer.

And so in this particular sense, when we’re looking at large language models and such, it’s no longer clear where the agency is coming from, but it’s still there. It’s still a system that is impressible with agency. It just happens to be agency that is, at this moment, partially a combination of the developers of the large language models, and partly a result of the sheer number of people that wrote all the text that’s being used as input for the training model. But in this particular sense, it would be naive to say that there’s no agency in the system. It’s just that at this particular point, it’s implicit and latent.

Jim: Well, and particularly, let’s take the class three risk, where the agency comes from all the players and the AIs are enablers of the players. So even if the AIs have no agency themselves, the emergent memeplexes, the egregores, as people call them, obviously have something like agency, in the sense that a corporation maximizing profit makes a whole bunch of decisions about that constantly in a very distributed fashion. And so in that regard, the class three risk agency is already there as part of the complex architecture that is-

Forrest: Well, there are three agencies involved. There’s the agency that is essentially the people that are using the system, the leaders and the owners, the ones that are effectively trying to create inputs in the sense of: what do we want the distribution of resources to be? Then you have the agency that was encoded in the system in a latent way by the data inputs of the large language model, and by the developers of the large language system that are training it to try to ensure, as exists with ChatGPT for example, that if you ask it a question about some controversial topic it won’t give an offensive answer. So to some extent, there’s already a large effort on their part to shift the agency of the system, or shift the capacity of the system, to produce answers that are consistent with the agency of the developers. So in this sense, the system, again, is encoding the latent agency of the developers. So that represents the second group.

The third group is basically the people that are affected by the outputs of the system, in this case, the general public. So if we’re looking at some, say, future utopian hypothesis, I’ll throw that word out, about what might be a perfect Marxist system and so on and so forth, I’m going to basically start off with a fairly high level of skepticism. Because the level at which the agency of the rulers and the owners, the level of the latent agency in the system itself, and the level of the agency of the public, the general membership of that union, for example, these are not going to be equal. They’re not going to be equally represented. There’s going to be a surprisingly large amount of latent agency in the second class, held within the machine itself, and very, very little of the agency of the general public represented.

And unfortunately, it will seem for a period of time that the leaders, the people that are putting the inputs to the system as to, quote-unquote, what is the utopian society, have agency. They will believe that they have agency at first, but because of the dynamics of the system itself and the feedback loops that are involved in it, based upon substrate-type arguments, we can basically say that it converges to the case where the agency of both the first and the third class converges to zero, in favor of all agency accumulating in the machine itself. And this, again, is over a long period of time. This isn’t something that happens overnight in some sort of FOOM hypothesis.

I have to admit that I’m somewhat skeptical about that hypothesis of: we give it a 0.1% advantage over humans and overnight it’s going to turn into some sort of super singularity, superintelligence kind of thing. My feeling about the instrumental convergence argument is, yes, that could probably happen, but it’ll probably take a lot longer, maybe a decade or two. And at the very least, even if the fast version doesn’t happen, the instrumental convergence is shown to be inexorable. So in effect, now it’s a question of, well, what sort of things can we do to prevent this happening?

Jim: We went a little further here than I wanted, but it was some great stuff, so I’m glad we did it. But now let’s pivot to exactly this question. I’ll point out, you said the rulers, the people; I mean, those are institutional design questions. We don’t have to have a system of billionaires and oligarchs and such. That’s what we have; it’s a frozen accident. That’s where we’re at today. But these systems, the Federal Reserve banking system, fractional reserve banking, were not brought down from Mount Sinai by Moses. All this shit is relatively recent human invention, in fact, almost all of it in the last 300 years. So let’s roll up our sleeves here and think a little bit from a civilizational design perspective. In the face of this risk that we have drawn in great detail, how should we be thinking about how to design a civilization that can survive this risk, and can indeed take advantage of it to some degree, and prosper?

Forrest: This is a broad enough question that we might need a third interview in order for that to be articulated even reasonably well. But I can give some hints.

Jim: Maybe we’ll see how this goes. If it does, we’ll get to come back for a third round.

Forrest: Well, that would be welcome. But in any case, the idea here is, as you’ve mentioned already, that the notion of institution, as it’s received currently from all the history that has been part of the human species, isn’t something that was designed; it’s something that emerged. And there’s a reality to the momentum of how quickly or how fully we can transition to another model of human interrelationships and human communication and so on.

But the one thing that I can certainly point out is that this really goes back all the way to the emergence of cities and the invention of agriculture, and how that changed human interrelationships. So in effect, there’s a sort of notion here that when we’re thinking about large-scale human interrelationships, there’s a sort of Dunbar number. There’s a sense that we can’t keep track of all of the people, or inner models of what their motivations are and therefore what kinds of choices they will make. So the notion of trust itself has to become, to some extent, coordinated in a larger sense than that which would be held just in terms of human understanding. And so in effect, when we’re thinking about institutional design, we’re actually seeing a kind of compensation for our own limits, our own cognitive limits.

So in this particular space, what I’m first of all noticing is that the kind of trade-off that has been made has largely been a result of our incapacity to keep track of a billion people and to have relationships with a billion people. But to some extent, there’s a need for us to actually care about one another. So in effect, it is the case that we can care for one another at a first-person level, like our friends and our family and people we have relationships with on a first-name basis. And when we’re talking about economic systems or exchange systems or dynamics of resource allocation and so on and so forth, we’re needing to think about civilization design in terms of these sort of ultimate notions of value and care. They’re not the kinds of things that are currently well represented in institutional design, because institutional design, again, in trying to minimize the degree to which we had to care, of course replaced relationships that were intimate, say between partners, for example, or between mother and child, with things that were hierarchical or defined by roles and procedures and dynamics of one formalized kind or another. In other words, we used causal methodologies to essentially solve the problem of human coordination.

Jim: What I like to say about that, the way I describe it, is that since the acceleration around 1870, we have given up on the Dunbar-number, face-to-face community as our major source of security and provisioning. That was very human, warm or cold as it may have been, but it was very human and high-dimensional, et cetera, and we replaced it with two lower-dimensional systems that are very cold and inhuman: on one side the market and on the other side government. And in some sense, that is what has happened over the last 150 years.

Forrest: I would’ve said hierarchy and transaction as being the two things, and government and institutionalized religions and so on and so forth. And similar to yourself, though at a much earlier date, I would’ve said the rise of the city-state. So go back to Sumeria, for example, and the first places where you had cities of any scale larger than a few hundred people.

Jim: Yeah. Well, again, it’s actually been going on since that time. But anyway, let’s move on and let’s now try to ground it in the discussion of AI. So, take these principles of what civilization design needs to do and how would we… I mean, again, I realize we’re sketching here with construction crayon on a brown paper grocery bag. This is not CAD/CAM design by any means. Well, let’s throw out a real one. A couple of days ago, an open letter was put out calling for the big boys to stop training any models more powerful than GPT-4 for the next six months. Just as an example of an out-of-the-blue attempt by some fairly prominent people in the AI space to say pause. I personally don’t think it’ll happen. I did sign that letter, interestingly, but for a contrarian reason, which is that I would like the open source large language models to catch up with the big boys, and a six-month pause by the big boys will not affect the little boys because they’re not yet up to GPT-4. And so at the end of six months, if the open source boys are up to GPT-4, it reduces the chance of a winner-take-most outcome for the big boys. But anyway, that’s neither here nor there.

How do you think about civilization design when it comes to making some kinds of decisions about this otherwise inexorable growth in the power of AIs?

Forrest: Well, as I mentioned, I’m thinking about it in the sense of how to actually have care relationships at scale. And so in effect, it’s rather than, as you mentioned, degrading care relationships into transactional relationships or hierarchy relationships, which effectively are a compensation for our limited cognitive capacity as individuals, that we can actually think about governance architectures, small group processes and ways of thinking about human interactions that emerge a level of wisdom that can in fact make choices at scale with the kinds of cares and values that we have in a distributed way.

So in this particular sense, probably in significant contrast to most thinkers in this space, I do think it is possible for human intelligence to be effective at making wise choices at scale, in a way that is genuinely reflective of the health and wellbeing of all concerned. So in that particular sense, it becomes more, to me, a question of: is technology helpful to this? Is there a role for technology in governance process, given that I believe it is possible for human beings to do good governance with just the fundamental nature of what they already are? And yes, I’m aware that this hasn’t been done before and we don’t have a prior example of something like this working. But I think that that’s largely because of the same sort of things you mentioned earlier, which is that there’s a sort of emergent quality to how the civilization that we have has come about. And I don’t think that until relatively recently we’ve had the psychological and sociological tools and information, or just a sheer understanding of how cognition works, deep enough to actually understand what might be possible that isn’t already predicated on the results of sheer evolutionary process.

Our capacities to relate to one another as human beings have obviously been emergent over the last million years of social interaction. And nature has, through its evolutionary process, provided us with certain capacities to naturally engage at tribal scales. But for us to actually be able to engage in a healthy, humane, and wisdom-oriented way, to actually be able to address things like existential risk or the things associated with AI and so on, we are going to need to understand how to compensate for the biases that evolution has built into us to solve certain problems. Evolution wouldn’t have even known to try to solve for the kinds of problems that technology has introduced in the last 5,000 years or so, let alone the last 200 or so.

Jim: Or the last four weeks.

Forrest: The last four weeks, exactly. So in this particular sense, we can’t really rely on the built-in evolutionary techniques to give us the answers. We’re actually going to have to, as you mentioned, do design thinking about this sort of thing with a mind to all of the things that are genuinely relevant as far as human physiology and psychological development and evolutionary psychology and things of that nature. So at this point, it’s basically, we’ve only recently gotten to the place where we have the tools and the understanding necessary to even be good engineers in this space. It’s a bit like expecting a caveman to build the Golden Gate Bridge. They just don’t have the mathematical background to understand what that even means. Whereas at this particular point, we’re just crossing into the threshold where that becomes possible, and hopefully we’ll actually be able to implement some things in this space before artificial intelligence systems and the sheer momentum of commercial process as currently given displaces our chance, our one chance, to essentially become relevant and wise in these particular ways.

So in this case, to go back to the question of, well, what is the role of AI in this sort of space? What is the role of technology, more broadly speaking, in this space?

Jim: And the other way around, what do the design principles need to do to get a grasp on technology to keep one of these runaway race conditions from occurring? Both questions simultaneously, right?

Forrest: Well, the second question is a little more specific. So when I’m thinking about the relationship between, say, nature as a category of process, humanity as a category of process, and technology as a category of process, I’m basically saying that nature was only able to inform humanity to a certain extent. And so in this particular sense, if we’re thinking about what is the right relationship between, say, man, machine, and nature, we’re wanting the choice-making processes that humanity applies to what happens with machinery and how it affects nature to be ones that effectively enable nature to be healthier, to be more capable of actually being nature, and humanity itself to be more capable of actually being human.

So in this particular sense, we’re looking at a sort of forward cycle. In the same way that nature has supported humanity and humanity is currently supporting technology, we need technology to support nature. And in a broad sense, what that looks like in terms of actual practices is that rather than having machinery make choices for us, which is largely the way in which people are thinking about the use of artificial intelligence, I give it a general instruction and it makes all the specific decisions, what we’re really looking to do is to compensate for the kinds of things that would have been biases in our choice-making process. Our biases, for the most part, are heuristics that evolution has built into us to deal with situations that would occur naturally in the world. If I hear a stick break and my first response is to crouch or to hide or to defend myself against the possibility that that was the signal of a tiger, 99% of the time that’s the right response, even if there’s no tiger there.

So in this particular sense, there’s a sense in which those heuristics have been super helpful for us up until this point, but what got us here won’t get us there. And so there’s a sense in which we now need to make choices which are not structured by biases, not structured by heuristics, but structured in terms of actual grounded principles. And the grounded principles themselves are going to be the kinds of things that emerge from a deep understanding of human psychology, a deep understanding of social dynamics and a deep understanding of things like the fundamental relationship between choice, change and causation. So in this particular sense, the tools or the intellectual capacities that are provided by, in this particular case, a certain category of metaphysical thinking allows us to understand the relationship between choice, change and causation fundamentally, and therefore to be able to say something productive about what is a good relationship between man, machine, and nature.

Jim: Okay. So I can certainly see that, and this sort of fits pretty closely with the Game B view that we have to learn, and not just learn but actually do, living in balance with natural systems, and in fact give back considerable room to natural systems. We have trespassed beyond sustainable limits on natural systems, and having technology help us do that is a good thing.

Forrest: Exactly. So we’re effectively using technology to correct the damages associated with technology. So in effect, we’re compensating for the toxicity that has already shown up. In that sense, if we went cold turkey and stopped all technology usage the world over, nature would still be disabled. It would have already suffered the harms of all the things that have happened already. So in this particular sense we’re saying, okay, we want to use the right level of technology, technology that has essentially a healing impact. Because obviously, with just all of the human beings trying to do this stuff, there isn’t enough energy, there isn’t enough capacity just in our bodies, to heal the ecosystems that have been damaged by technological usage.

So in this sense we’re saying, okay, we want to step down the level of technology a little bit to correct for the past usages of technology, to restore not only the capacity for ecosystems to thrive, but also for human cultures to thrive, for human beings to be living in a fully-embodied way. But this doesn’t resemble something that has come from the past. It doesn’t resemble a capitalistic system and sure as heck, it doesn’t resemble a socialist system. It doesn’t resemble either of these in the same sense that neither change nor causation represents choice. So in effect, we’re actually dealing with a new kind of category of process or a new understanding of how to relate to the world that is defined in terms of embodied values or embodied choices as made by living systems in order to be able to support the wellbeing of living systems.

And treating technology as a kind of adjunct to that process, as a kind of support infrastructure for that process, that effectively allows us to do the kinds of things that nature itself just can’t do. For example, just to give a wild example, I would love to see technology used to put up mountains in various places in the North African desert, basically to turn it into something that’s lush. With the right kinds of understanding of wind currents and evaporation rates and the ways in which moisture moves around and so on and so forth, we could turn deserts back into rainforest conditions in places on the Earth where, right now, nature would never have that outcome because the land just isn’t shaped properly.

Jim: Or have been so degraded that it can’t recover. Remember the Fertile Crescent in the Middle East? Now it’s all desert and it has no way on its own to come back.

Forrest: Exactly, but with the right sorts of understanding of how to shape and do geoengineering, we could… And again, geoengineering in the sense of in service to nature rather than in service to profits. But what happens is that by doing that, we could turn the Fertile Crescent back into the Fertile Crescent, and that would effectively create the kinds of ecosystems that would support thriving human life as well as every other kind of life that’s imaginable. And so in this sense, there’s a very strong need for us to utilize technology in a way that is actually compassionate with respect to human culture and human understanding. But that basically means for us to embrace choice rather than to displace it. Economic actors would basically suggest that all of the profits and proceeds associated with that are essentially accrued by the sheer fact of not having to pay people to make choices on your behalf.

But in this particular sense, think about what it means if I can get the machine to make my choices for me. In a corporation, for example, you hire people to make choices for you and you pay them money. The money becomes essentially a choice capacity you give to them, and they choose, as employees, to align their choices to your business intention, whatever it is. And so in this particular sense, what we’re basically saying is that the economic hype associated with artificial intelligence systems is largely a perceived side effect of choice displacement. But if we’re really going to understand the right relationship between technology, humanity, and nature, or between artificial intelligence and ourselves as a culture, we’re going to need to get very, very wise to the nature of how to make good choices, what that actually looks like, and what is supportive of that process versus what is not.

In this particular sense, I’m applying the idea that love is that which enables choice, as in enabling human choices or life’s choices rather than machine choices. And so, while it might be the case that, in the same way as with the loom, for example, the labor associated with individual people basically doing grunt work gets displaced, and obviously that’s a good thing, if we don’t think about the nature of the choices being restored to them, rather than just accruing to the owners of the machinery, we’re going to see a replay of that same dynamic: lots of hype being made, the hype being used to essentially authorize the production of these machines, the machines being owned by the ultra-wealthy and the most capable people, and ultimately the benefits and the proceeds of that production capacity, whether it be of ideas or of things in the world, accruing basically to the few and not to the many. And at that point, you’re going to end up with increased economic dysregulation and/or eventually war.

Jim: Oh, yeah. My favorite is that, at some point, and I think we’re actually very close to that point, people just say, “Fuck it, there’s nothing in it for me. Let’s get the guillotines out.” And I think I’m going to be talking to some really rich fucks next week, I think it is, or the week after. Amazingly, somebody hired me to give a presentation to some of the richest people in Europe. And I’m going to start with a picture of a guillotine and a black swan walking up to it.

Forrest: Oh, man.

Jim: And I’m going to say, “People, your thinking about the right tail of the power law distribution is way inadequate, right? We are definitely within the range of a measurable probability each year that the guillotines come out.” And that’s not a good way to do a social transition, but it has happened in the past and there’s a chance that if the powers that be don’t change, it could happen again. So anyway, I’m getting a little out of control here, so let me calm down a little bit.

The other reaction I wanted to say is that your vision reminds me a whole lot of Tyson Yunkaporta’s concept of humans as a custodial species, that we have got too much power for too long and we can’t just walk away, even if we wanted to, which we don’t. We have a duty to use the power we have to make the Earth wonderful and beautiful again, and we could do that. So I think many of us are on the same page with you on that regard. I very much like the way you frame this of humans, nature and technology. How does that cycle make sense? Instead of being what it is today, where the inner loop of money-on-money return drives technology in a relentless exponential, as fast as anybody can drive it, further accelerated by multi-polar traps around things like nation state military competition, also coupled to money-on-money return through defense contractors-

Forrest: Well, this is part of the reason-

Jim: Let me just sort of finish the horror story, which we all know, of course. We call it the litany of shit, some of us. That it’s the exact opposite of this closed loop of humans, nature and technology. So there’s got to be a gigantic institutional shift to get from here to there. What’s that look like?

Forrest: Well, this is part of the reason why I was distinguishing between institutions and communities. Institutions are going to be basically based upon transactional relationships and hierarchical process, whereas communities are always going to be based upon care relationships. So if it’s basically a group of people interacting on the basis of their care for one another, then in effect it is legitimate to call it a community. But in this sense, to speak a little bit about what you were calling institutional change, we do need to talk about why that happened. How is it that we arrived in this circumstance? What is fundamentally the thing that occurred that made it so that the things that are happening are actually happening? And part of that has to do with what might be thought of as social psychology or psychological process, like how we individually make choices.

Because in the same sense that we have individual biases in terms of how we evaluate the relative safety or danger of a particular situation and what our responses would be, we are driven by relatively basic needs: food and shelter and sexuality and things like that. So in a sense, when we’re thinking about how to coordinate choices to actually favor the vitality of the commons, essentially to do the governance thing of protecting the land and the people, or better, helping the land and the people thrive, we are in a sense needing to understand the ways in which our individual choices can oftentimes be co-opted by, literally, biological process. And so this is why I was saying earlier that I do see that there is a kind of way in which we become more discerning and more attuned to the kinds of things that would allow us to develop the wisdom necessary to make choices in this space.

But to some extent, that basically means things like not getting involved in the cardinal sins, rage or the sort of gluttony kinds of things that are largely defining relationships at a social and political level currently. So in this sense, I’m basically saying that part of it is that individually and collectively, we need to get a lot more savvy about how we each individually make choices and whether those choices are genuinely reflective of our embodied values. And so in this particular sense, there’s a large level of unconsciousness with the way most people make choices. They see something and they feel a desire or attractiveness for it, but they don’t necessarily know that that desire is itself a reflection of an even deeper desire. And that by adhering to that even deeper desire, that they can actually have a longer lasting and more fulfilling outcome both for themselves and for their families and friends.

And so there’s a kind of skillfulness of not just being skillful in the world of being able to build systems and technology centered around symmetry and causation, but to be able to look inward and have equal skillfulness of knowing oneself and having coherency and being able to operate as a unified and integral being. So in a sense there’s, for lack of a better word, a set of spiritual principles that develop continuity and coherency in terms of one’s own choices and capacity for choices that allow them to sync up and to join with the choices of others and to join with the choices of communities as a whole, that effectively represent fundamentally the kinds of dynamics that we’re looking for when we say large-scale institutional change. But this is a different mindset.

So in effect now having to think about how to bring people into the awareness of what does that feel like and what does it look like, and what are the techniques by which this sort of discernment is developed, and the ways in which we can become more aware of our communication processes in terms of how they facilitate things like sense making. So in this sense, when we’re starting to do this sort of work, we’re starting to say, “Okay. Well, it’s not so much that we’re starting with strategy, which would be an engineering institutional top-down sort of perspective, but that we’re starting with culture and we’re thinking about communication dynamics and what does it mean to be a healthy individual and what does it mean to be a healthy family and a healthy community, so that we can have a real sense of what does it mean to be a healthy world.” So in this sense, there’s a series of deep principles that effectively inform what are the practices. And those principles are as much reflective of what’s going on inside of our psyches as something that’s going on outside of ourselves.

Jim: And of course, as we’re both aware, there are groups of people starting to make a little bit of traction on these issues. And we’re very early; we don’t know what the hell we’re doing. It requires a lot more experimentation to figure out where theory and practice come together. But in any case, it’s very, very early. It’s a few hundred thousand people probably at most, and that may well be very generous. And yet, we’re talking about ChatGPT 5 or GPT-5 being here in a year, and something that we can’t even imagine by 2027 or 2028. There seems to be a giant mismatch between the cycles of maturation that are going to be necessary to follow this path you’ve pointed out, which strikes me as correct, and the driving factor of money-on-money-return-driven exponential tech around AI. What do we do? I mean, I suppose one thing we could do is just say, stop until we mature. That we are 12-year-olds who have found granddad’s World War II machine gun in the attic. Just fucking stop for a while.

Forrest: Well, we could hope that there would be a closing of the gap between what we can do and an awareness of what we should do. So this is classically thought of as the ethical gap, that just because you can do something doesn’t mean you should do it. And so in effect, there is a need for us to become aware of, well, what do we really want? And want is not even the right word. It’s more like what do we have a passion for? What is the space from which we are actually operating when we’re thinking about these kinds of choices? And so in effect, when we’re thinking about short-term hedonistic satisfaction in the sense of, “Wow, I just got a little bit more powerful today,” relative to, “Wow, I just had a joyful experience with my friends and family,” there is a kind of reconciliation here where we’re basically saying, do we truly want the choices about the future of the world made on the basis of those people who have become most skillful at winning games of power? Or do we really want to have the choices of the world made on the basis of what does the world really need? What does it really have a passion for? Where is the thrivingness coming from?

Because we can certainly say, as many people have, “Hey, I experimented with ChatGPT. I’ve experimented with the Facebook system that was promoted recently.”

Jim: LLaMA? Yep.

Forrest: Yeah. That whole thing is such that we can notice that we’re scared and we can-

Jim: Or excited.

Forrest: Well, yes, there’s both, but to some extent, regardless of which emotion we’re having, the emotion is pointing back to something. It’s pointing back to a set of values; it’s pointing back to an underlying series of cares. If it’s fear, it’s a fear of what we’re losing or what we’re potentially about to lose. And if it’s excitement, it’s some sort of passion of, “Wow, there’s a sense of thrivingness here that I feel is possible and that I want more of.” But in either case, we want to clarify what it is that is the actual basis of choice, what it is that actually matters to us. And not matters to us in some abstract way, like that I got another zero on a bank account or a few bits shifted in some database.

What I’m really looking for is the kinds of things that would be basically enlivening, at an actual visceral, grounded level. And to really be connected to that and to really understand that as essentially an orientation from which we live. This is where we make the transition from self-actualized, on Abraham Maslow’s scale, to world-actualized, which is that sixth, unnamed level that he didn’t mention until later on in his life. So in this particular sense, it really is important for us as a species to become world-actualized, which is essentially a new level of discernment or a new level of psychological development. And if we are going to make a transition to that kind of thing, then it means naming that and valuing that and basically saying, “Well, who are the people that I know or that I’m connected to who are essentially an embodiment of that? And how can I be supportive of that process? How can I be connected to the things that I can recognize as being world-actualized? Who are the exemplars of this? Who are the people who are mindful of their relationships with ecological process, or people who know about permaculture, who are connected to the wildness of the world and genuinely can be supportive of this?”

So in this particular sense, it might be things like actually involving indigenous people as stakeholders in decisions that affect the ecosystem. They have, as a firsthand contact, a deep knowledge of nature; they might not know anything about technology. And so in this particular sense, there is a kind of conscientiousness of: how do we do this given everything we know, not just what the indigenous know, and not just what the engineers know, but literally what the people know as fundamentally their desires for thriving, not just short-term hedonistic gains?

Jim: Yeah. Again, that’s essentially the Game B story, but again, it ain’t going to get even close to done by 2027 or 2028.

Forrest: You have no idea. I mean, we can see that, given sufficient pain, people will at some point or another rise up. I think the real risk here is that if you end up with a totalitarian dystopian state, a sort of 1984 kind of thing, then even if they were to rise up at that point, it would be too late. So in this particular case, it may be the case that we want people to be more and more afraid of what they’re losing so that they actually value it.

Jim: Yeah, that would be nice. We shall see. We shall see. I will say, I’m going to give the one positive thing, why I’m having so much fun with these new technologies and networking with some people with a similar vision. I’m going to make the argument that yes, the one, two, and three risks are all there in the various forms we’ve talked about. However, there’s also the fact that the nature of this particular technology, the nature of the LLMs, is that they’re not that hard to build, and that they are potentially an empowering of the periphery in the same way the PC was. The PC brought down the big glass computer room, and then even the minicomputers, and it gave power to the people. This was the Steve Jobs insight. The early online systems before the internet and the web, the same. The internet and the web also empowered the periphery. In fact, folks like you and I would never have met, probably.

Forrest: It certainly did at first. But then you have the rise of companies like Facebook and Google and the rest. And you might end up with empowering the periphery at first, but unless we put discernment practices in, like the kinds of things where people say, “No, we’ve seen this game of centralization played over and over and over again throughout history. This time we need it to be different because the stakes are too high.” So on one hand, I love the idea of empowering the periphery, but now I want the periphery to not be naive. I want the periphery to essentially understand what the stakes are, and to be able to notice the encroachment of civilization processes that favor centralization. And to say, “Hey, you know what? It’s not efficiency that we’re looking for here. It’s vitality. And if I trade away vitality for efficiency, it’s a bad deal, and I don’t want someone to convince me that it’s a good deal.”

So for those people that currently feel disempowered because they were laid off recently from their tech jobs, or larger segments of the artist community, for example, who are basically recognizing that their economic viability has shifted very recently, in a lot of ways we’re now needing to be much more existentially conscientious about the basis of our choices. So in this sense, there’s a kind of empowerment of the periphery in the sense of: what questions are we asking? Are we aware of the implications of those questions? Can we become more discerning about the encroachments of centralizing forces? Because they’re going to continue to come, with the hype cycles and with the same sort of dog and pony show of how wonderful this is. But it’s wonderful for them, not for you; it’s just that that part of the message hasn’t been included.

Every time we make a deal or think about a trade of some sort, know that there’s cost, there’s benefit, and there’s risk. People will sometimes talk about the benefits. Well, they’ll almost always talk about the benefits. They’ll sometimes talk about the costs, but they’ll almost never mention the risks, and the risks are to you or to someone else who’s not even part of the transaction. So without factoring in some notion of what is really involved in the whole of the thinking of the process, unless we’re actually discerning about risk factors at least as much as we are about cost factors and benefit factors, we’re not making the choices holistically enough to genuinely be a true response.
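
A minimal sketch of the fuller accounting being described, with numbers invented purely for illustration: the usual pitch counts only the benefit and sometimes the cost, while a more complete evaluation also subtracts the expected value of the risks, including the ones that fall on people who are not even party to the transaction.

```python
# Illustrative numbers only; the point is the shape of the accounting, not the values.

def pitched_value(benefit, cost=0.0):
    # How a deal is usually presented: benefits always, costs sometimes, risks never.
    return benefit - cost

def holistic_value(benefit, cost, risks):
    # risks: list of (probability, harm, falls_on_third_party) tuples.
    # Expected harm is subtracted whether or not it lands on the parties making the deal.
    expected_risk = sum(p * harm for p, harm, _ in risks)
    return benefit - cost - expected_risk

risks = [
    (0.10, 50.0, False),   # a risk the buyer bears directly
    (0.05, 400.0, True),   # a larger risk externalized onto people outside the deal
]

print(pitched_value(benefit=100.0, cost=30.0))                 # 70.0, looks clearly worthwhile
print(holistic_value(benefit=100.0, cost=30.0, risks=risks))   # 45.0, less so once risk is counted
```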

Jim: Ah, that’s very true. Very true. I could go down another rabbit hole, but I think that’s a lot of really good stuff we talked about here today. I really want to thank you for this. This is just like… I love talking to you.

Forrest: Awesome. It’s been enjoyable, thank you.