Motivation

I’ve become interested in what algorithms or computations are and what constitutes their similarity, in order to understand better how large a universe would have to be for evidential cooperation in large worlds (ECL, formerly “multiverse-wide superrationality,” MSR) to be interesting. Here is an insightful article on the topic. This post only documents a self-experiment I conducted to form some intuitions for the topic. It doesn’t contain any qualitative insights.

In particular, it seems to me that either (1) the universe is infinite in some way or (2) it is not. If it’s not, (a) it may be too small for ECL to be relevant or (b) it may be big enough. In case 2a, ECL does not go through; in case 2b, ECL is fine. But in case 1, which is maybe most likely, there’s a sense in which ECL is more exposed to infinite ethics than all the rest of aggregative consequentialism in that it both depends on and is threatened by infinity. Aggregative consequentialism is only threatened by infinity. This could be alleviated if it turned out that the expected gains from trade from ECL are considerable even in a universe that is only, say, “15 million times as large as the volume we can observe.” The article suggests that this size and even much greater finite sizes are plausible. Unfortunately, this is mostly about the volume of space and not about the amount of energy in it.

Please see the summaries by Caspar Oesterheld or Lukas Gloor for an introduction to evidential cooperation in large worlds. I’ll give a quick, narrative introduction in the following.

Another World

In the following, I’ll try to sketch the idea behind ECL in intuitive terms. This attempt may easily be flawed. So please see Caspar Oesterheld’s paper before deriving strong conclusions about ECL. I wouldn’t want you to dismiss ECL because I made a mistake in my presentation or endorse it for reasons it can’t deliver on.

That said, imagine a world that is different from ours in that superrationality is somehow much more widely known and considered than in ours. Maybe Douglas Hofstadter, Gary Drescher,1 or someone from the Center on Long-Term Risk who has worked on ECL has become very famous in that world. (Even more famous than Douglas Hofstadter is in ours already.)

Further imagine, much like in our world, that there are different camps of conscientious altruists – some highly skeptical of all interventions that don’t have a long, well-studied track record of having any effect, and others highly skeptical of such effects. The first group is enthusiastic about GiveWell’s top charities and the second group is enthusiastic about more research into long-term effects. Both of them, however, are even more enthusiastic about the Uncontroversial Missions Force (UMF), which somehow combines the best of both worlds. These facts are known to both camps.

All of these altruists are very busy, so they just have time to keep well-networked within their intellectual community and don’t engage much with the other community.

In this world the coordination problem goes through roughly two stages:

  1. Much like in our world, the altruists plan to donate purely within their respective camps and hope that the other camp will fill the funding gap of the UMF. Sadly, this gives them evidence that the other altruists will do the same, and UMF will go underfunded. This is in neither camp’s interest.
  2. But then they realize that if their acting this way gives them evidence that the other altruists will do the same, then this should hold also in cases where they act differently. “Differently” is quite a wide spectrum, though, so that alone is unhelpful to select the optimal alternative action. But a Schelling point comes to mind: They can fill 50% of the UMF funding gap. This action seems like enough of a Schelling point that they believe that the other group is fairly likely to choose the same.

This might’ve worked without the premise of superrationality being well known, but then the other group is a lot less likely to think of this solution, and that reduces the first group’s confidence that the plan will work.

The Real, Large World

ECL makes one additional crucial assumption: that the universe is infinite in time or space (i.e. that it has infinite volume and matter, is not merely without edge, and that the matter is not arranged in some repetitive way), and that any solution to the problem of infinite ethics that we may find or settle on doesn’t upset it more than it will upset all of aggregative consequentialism.

This assumption has two related convenient implications:

  1. Not everyone needs to have thought of ECL. You can just cooperate with those who have.
  2. It doesn’t matter in ECL whether anyone tries to freeride on the compromise. You can just cooperate with those who don’t.

Self-Similarity Experiment

Going into this, I didn’t know how similar people tend to be or how to measure that, but three years ago, Tobias Baumann came up with an operationalization that at least allowed him to approximate an upper bound on the similarity of two people by comparing different person moments of his own person:

I just cooked up and carried out a small experiment: I went through archives of online chess games I played last month and sample random positions (filtering those where the best move is obvious) from these games. Then, I think about what move to make, and check whether my favored move matches the move my past self actually played.

The results surprised me a lot. My prior was that I’d play the same move as my past self most of the time (maybe 80% or so). But actually, I chose the same move only 40% of the time (10 out of 25)! Sometimes my past self played “surprising” moves that didn’t even cross my (present) mind.

Of course, I’m not claiming that this is any serious evidence about correlations, but I still found it interesting. Contextual differences (like not actually being in a game, time of day, how exactly the browser window looks, …) are the most plausible explanation for this effect – it’s unlikely that my chess play has radically changed since then. Still, I would have expected my choices to be more “convergent” or stable, and not depend so much on random stuff.

So I also wanted to test how much you could know about me today based on information about a past version of me.

I don’t know chess, so I couldn’t reproduce this experiment exactly, but I have been playing Othello occasionally over the past years and know enough strategy to have some seemingly nonrandom feeling about most situations. In particular I was able to find a few dozen games from September 2015, which I used as a basis for my experiment.

My main training in Othello happened in 2005–2006. Since then I’ve played it again during phases of about a month each in 2015, 2018, and now, 2020. There might’ve been more such phases between 2006 and 2015. My impression was generally that I started from a similar level each time, improved slightly over the course of a month, but then lost some of that ability again over the intervening years. So I surely haven’t, across the board (no pun intended), improved since 2015.

Experiment Setup

I selected 20 games from 2015 in which I played black and an AI (DroidZebra, today Reversatile) played white. I divided them into segments of first 10 moves, then 11 times 4 moves, and finally the rest of the moves for a total of 240 positions. The rationale was that we played the same openings repeatedly, which would’ve made many of the positions redundant, so I skipped many moves at the beginning until all games were distinct. I also skipped the final moves because I’d often be able to count out the last few moves precisely, which I could’ve also done in 2015, so that differences in performance here would indicate only that I must’ve blitzed the game in 2015 or been too sleepy to care. In one case, I chose a position two moves later than the preordained cut because my next move would’ve been a pass.

I pasted the game transcripts (to me unintelligible alphanumeric strings like f5d6c3d3c4f4c5b4d2e3) into a spreadsheet and set it up such that for each position all further moves were hidden from me. You can find the spreadsheet here. Then I pasted the start of the move sequence that I could see into WZebra and tried to find the best move or moves.
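
The segmentation described above can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet’s actual logic: the function names are hypothetical, the transcript format is assumed to be two characters per move as in the example string, and passes are ignored (the one position that was shifted by two moves because of a pass would need manual handling).

```python
def split_moves(transcript: str) -> list[str]:
    """Split a transcript such as 'f5d6c3' into moves ['f5', 'd6', 'c3']."""
    if len(transcript) % 2 != 0:
        raise ValueError("expected two characters per move")
    return [transcript[i:i + 2] for i in range(0, len(transcript), 2)]


def cut_points(first_cut: int = 10, step: int = 4, extra_cuts: int = 11) -> list[int]:
    """Number of moves already played at each sampled position.

    With the defaults this yields 10, 14, ..., 54, i.e. 12 cuts per game.
    All cuts are even, so black is to move as long as nobody has passed.
    """
    return [first_cut + step * k for k in range(extra_cuts + 1)]


def visible_prefixes(transcript: str) -> list[str]:
    """Transcript prefixes to paste into the analysis program, one per cut.

    Cuts at or beyond the end of the game are dropped because no move
    would be left to predict.
    """
    moves = split_moves(transcript)
    return ["".join(moves[:cut]) for cut in cut_points() if cut < len(moves)]
```

With 12 cuts per game, 20 games give the 240 positions mentioned above, assuming every game lasted more than 54 moves.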

For each position, I recorded one move that I would choose today, all moves that I considered plausible today, the number of all available moves, and afterwards also the perfect moves according to WZebra’s AI. (AIs are vastly superior to all humans in Othello.) The plausible moves were moves where I imagined I’d be almost unsurprised if they turned out as good as or better than the move that I had chosen. I sometimes picked an arbitrary chosen move from among the plausible moves. Originally, I had planned to record the number of all moves that were not obviously terrible (the intersection of the set of all moves that I found obviously terrible and the worst moves according to the AI), but I couldn’t find any principled threshold for what moves I should consider obviously terrible according to the AI.

I skipped 90 positions to save time and skipped 6 more because of misclicks that led me to see the perfect moves before I had made my choice. The final sample was one of 144 positions.

Evaluation

The last sheet in the spreadsheet calculates the number of times the 2015 move coincides with the 2020 move, the number of times the 2015 move was among the 2020 plausible moves, and two custom scores. In the following, I’ll call it a hit if the 2015 move coincided with the 2020 move or if the 2015 move was among the 2020 plausible moves.

I called the scores “peculiarity” and “plausibility.” They range from −1 to +1. Peculiarity is just a special case of plausibility, so I’ll focus on explaining the latter. The idea is to weigh a miss by the negative of the probability that a randomly chosen move would have been a hit, and a hit by the (positive) probability that a randomly chosen move would have been a miss. This way, I don’t have to exclude any “obvious” or forced moves from the start but can penalize them later. A forced move has no influence on the score, a hit among 10 possible moves has a strong positive influence on the score, and a miss among 2 possible moves has a medium-strong negative influence on the score.

\(All\): set of all legal moves at a position

\(Plausible\): set of 2020 plausible moves at a position

\(old\): 2015 chosen move

$$ plausibility = \begin{cases} 1 - \frac{|Plausible|}{|All|} & \text{if } old \in Plausible \\ - \frac{|Plausible|}{|All|} & \text{if } old \notin Plausible \\ \end{cases} $$

The peculiarity is just the plausibility where \(Plausible = \{new\}\).
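
For readers who prefer code, here is a minimal Python sketch of the two scores for a single position. It follows the formula above directly; the function and parameter names are my own choice for illustration, and exact fractions are used only to keep the example outputs free of rounding noise.

```python
from fractions import Fraction


def plausibility(old: str, plausible: set[str], n_legal: int) -> Fraction:
    """Score one position.

    old       -- the move chosen in 2015
    plausible -- the set of moves considered plausible in 2020
    n_legal   -- the number of legal moves at the position
    """
    if not 1 <= len(plausible) <= n_legal:
        raise ValueError("need between 1 and n_legal plausible moves")
    chance = Fraction(len(plausible), n_legal)  # probability of a random hit
    return 1 - chance if old in plausible else -chance


def peculiarity(old: str, new: str, n_legal: int) -> Fraction:
    """Plausibility with the single 2020 chosen move as the plausible set."""
    return plausibility(old, {new}, n_legal)


print(peculiarity("c4", "c4", 1))   # forced move: 0
print(peculiarity("c4", "c4", 10))  # hit among 10 moves: 9/10
print(peculiarity("c4", "d3", 2))   # miss among 2 moves: -1/2
```

The three printed values correspond to the three examples in the text: a forced move, a hit among 10 possible moves, and a miss among 2 possible moves.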

Results

  1. The positions had 1–20 legal moves, 9 on average.
  2. I chose the same move in 57% of positions for a peculiarity of 0.41. (As opposed to 40% in Tobias’s experiment.)
  3. The 2015 move was among the 2020 plausible moves in 76% of positions for a plausibility of 0.52.
  4. I got somewhat worse at the game, picking one of the perfect moves 54% of the time in 2020 as opposed to 62% of the time in 2015.
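
As a rough consistency check, both cases of the plausibility formula can be written as a hit indicator minus the chance level. If the reported scores are plain averages over the 144 positions (an assumption on my part; the spreadsheet may aggregate differently), the mean score is then the hit rate minus the mean chance level:

$$ plausibility = [old \in Plausible] - \frac{|Plausible|}{|All|} \quad\Rightarrow\quad \overline{plausibility} = \text{hit rate} - \overline{\left(\frac{|Plausible|}{|All|}\right)} $$

Under that assumption, the figures above would imply that on average about \(0.76 - 0.52 = 0.24\) of the legal moves were marked as plausible, and that the mean of \(1/|All|\) was about \(0.57 - 0.41 = 0.16\), up to rounding. The latter is compatible with an average of 9 legal moves, since the mean of \(1/|All|\) can never be smaller than one over the mean of \(|All|\), here \(1/9 \approx 0.11\).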

Reservations

  1. I’m quite enthusiastic about ECL: insofar as unknown biases have crept into my experiment, they more likely overstate my similarity to my 2015 version.
  2. There seems to be less controversy over what makes an Othello move better than there is over what makes a decision theory better. The only disagreements that I could see in the first case are that there may be moves A and B, where A is better than B under perfect play such that:
    1. The sequence after A may be less intuitive for the player or more intuitive for the opponent than the sequence after B, making B better in some sense, or
    2. A and B are both winning moves under perfect play but the sequence after B allows for fewer losing mistakes than the sequence after A.
  3. This experiment took longer than I thought it was worth, so I resorted to solving positions also when I was somewhat tired. These are unlikely to coincide with the times when I was tired while playing in 2015, a source of lower similarity. This lower similarity is likely uninformative as ECL-relevant decisions are probably made over the course of months and years rather than hours.
  4. There may be differences between playing a whole game in one go (or rather Othello) compared to solving a particular position, but I don’t know how informative these differences are.
    1. I sometimes made mistakes because I overlooked pieces in key positions, which I imagine is less likely if my mental picture of the board forms step by step throughout the game.
    2. I may have more time to notice mistakes if I think about a future move or a move sequence over the course of several moves. On the other hand, I also have a less clear picture of the future board setup when it’s still several moves away.
    3. I may overlook mistakes I made visualizing a future board position if I just carry out a preordained move sequence, so just solving one position may solve this issue.
  5. I sometimes got curious about the reasons for why a particular move of mine was a mistake and experimented with different moves to try to understand it. I didn’t realize that I may thus see WZebra’s evaluation of future positions that I would later have to solve. This happened rarely, I avoided it when I noticed the failure mode, and I don’t think I ever directly remembered a perfect move, but I may have remembered general ideas like “Diagonal control is important in this position.” Since my 2015 gameplay was better than today’s, this might push in the direction of overstating similarity.
  6. I noticed thinking patterns along the lines of “I’m unsure between moves A and B. Today, I’d play the more risky one to potentially learn something interesting, but I was more risk averse in 2015 and so probably played the safer one.” This might push in the direction of overstating similarity since I likely remember more about 2015 than I know about most near-copies of mine, but I never tested whether I was even right about my 2015 version.
  7. I tried to find the best moves in every position even if all options were obviously terrible. Back in 2015, I probably gave up on a game at some point and hardly thought about the moves anymore when it became clear that I had lost by a large margin. This may understate my similarity.

Further Research

I’m not planning to prioritize this investigation again any time soon (within a year or so), but if I do, I’d like to:

  1. Compare a future version of myself to today’s version, ideally a shorter time in the future than 5 years but enough to forget the 144 positions.
    1. This would be more interesting as an upper bound on the similarity between two people at the cost of a bit more risk of overfitting. (Even if I don’t remember the particular positions that I’m solving, I may have learned particular principles this time that I’ll still apply next time. A milder form of overfitting. But I think the risk is low.)
  2. Compare that future version of myself on the basis of solving positions only, not continuous game play vs. solving positions.
    1. This will happen automatically if I use the same positions.
  3. Select games of people who are around my level and solve positions from their games.
    1. This way, I can test how different the similarity between my own person moments is compared to interpersonal similarities.
  4. Try to play intentionally similarly to a near-copy of myself.
    1. In ECL, the cooperation partners try to agree on a compromise utility function without being able to communicate. They do this not by selecting one action that is the perfect Schelling point because that action would likely be morally irrelevant for most of them. Rather they do this by selecting a bargaining solution that is a Schelling point and apply it to their best guess of the distribution of goals.
    2. If Othello is to serve as a model for that, we can hold constant the equivalent of the bargaining solution and will, by necessity, hold constant the goals. As a result, it should be very easy to cooperate with a near-copy by always playing the same moves.
    3. I’d like to test how well this works by thinking about my general strategy, solving some positions with this acausal cooperation goal in mind, waiting a year for my memory to fade, then trying again, and finally comparing the results.
    4. The scoring could be based on points that I get when I play the perfect move or the same move with some downward adjustment when there are several perfect moves.

Acknowledgements

Thanks for feedback and suggestions to Max Daniel and Daniel Kokotajlo. No chiasmus intended.


  1. His last name is strangely easy to type for me.