Monday, February 29, 2016

Counterpossibles

I'm casually interested in counterpossible reasoning (or "logical counterfactuals"), asking what would be true if something logically impossible were true. Here's an example due to Carrie Jenkins:
  1. If the square root of 2 were rational, it could be expressed as n/m, with n, m integers.
  2. If the square root of 2 were rational, it couldn't be so expressed.
It sure seems to me, and to her, that 1 is true and 2 is false. However, most logics don't handle this sort of thing naturally with their ordinary conditionals: if we assume that root 2 is rational and then prove that it is irrational, we can prove any other statement from the resulting contradiction (the principle of explosion). If you build counterpossible worlds this way, they all turn out to be worlds where everything is both true and false.
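The explosion step above is a one-liner in a proof assistant. Here's a sketch in Lean 4 (my illustration, not from Jenkins): given any proposition P together with its negation, we can derive an arbitrary Q.

```lean
-- Principle of explosion: from P and ¬P together, any Q follows.
-- This is why naive counterpossible worlds collapse: assuming
-- "root 2 is rational" lets us derive everything.
example (P Q : Prop) (h : P) (hn : ¬P) : Q :=
  absurd h hn
```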

I'd especially like to understand counterpossible reasoning because it seems like it could be part of a theory of reasonable decision-making. Suppose a computer program is making a decision; it should evaluate what would happen if it chose option 1, 2, or 3. However, since it is a program, it will in fact choose exactly one of these options, so all but one of the statements "the program chooses option i" are logically false (or maybe all of them are, if the program fails to halt). The program should be using counterpossible reasoning to figure out what would happen if it made each choice.
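Concretely, the structure looks something like this toy sketch (my own illustration; the payoff numbers and function names are made up). The agent evaluates "what would happen if I chose a?" for every option, even though all but one of those choices will turn out to be counterfactual, and for a deterministic program, counterpossible:

```python
def evaluate(action):
    # Hypothetical world model: what would happen if this action
    # were taken. The payoffs here are arbitrary stand-ins.
    payoffs = {1: 10, 2: 3, 3: 7}
    return payoffs[action]

def decide(actions):
    # Reason counterpossibly about every action, then pick the best.
    # Only one of these "if I chose a" premises will end up true.
    return max(actions, key=evaluate)

print(decide([1, 2, 3]))  # -> 1
```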

It seems suggestive to me that the program "doesn't yet know" which choice it will make while it's reasoning, and so it will reason the same way about the choice it will make and the choices it won't. Maybe this gives us some clue, or limits the ways it can do counterpossible reasoning? On the other hand, this kind of counterpossible reasoning might be specific to decision-makers, and be unsuitable for e.g. imagining counterpossible mathematical systems like Carrie's root-2-rational world above.


I have a few desiderata for counterpossible reasoning (which I really haven't done serious reading on, so for all I know these are well-known proposals):
  1. There should be exactly one world W defined by counterpossibly assuming S. (A world is an assignment of truth values to all statements.)
  2. W should assign each statement either "true" or "false", not both or neither.
  3. S should be true in W.
  4. If S is actually true, then W should just be the normal assignment of truth values to statements.
  5. If S is not actually true, then in most cases W should also assign some other statements S', S'', etc. the opposite of their normal truth values. These are the "counterpossible consequences" of S. All other statements are "counterpossibly independent" of S.
Desideratum 5 is interesting because it rules out choosing the W that is minimally different from the actual world in its truth assignments: that would be the world where only S differs, which is not very satisfying. These desiderata clearly aren't enough to pin down a particular way of reasoning; making 5 precise is an obvious next step, but I'm not sure how to do it.
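To see why the minimal-flip world fails desideratum 5, here's a small sketch (my own toy model, with statements as strings mapped to truth values): flipping only S leaves its intuitive consequences untouched.

```python
def minimal_flip(world, s):
    # Naive construction: flip only S, leave every other
    # statement's truth value exactly as it was.
    w = dict(world)
    w[s] = not w[s]
    return w

actual = {
    "sqrt2 is rational": False,
    "sqrt2 = n/m for integers n, m": False,  # should follow from rationality
}
w = minimal_flip(actual, "sqrt2 is rational")
# S is true in W (desideratum 3 holds), but its intuitive
# consequence stays false -- nothing else changed, which is
# exactly what desideratum 5 says shouldn't happen.
print(w)
```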


One possibility that tempts me is to work in a proof system P instead of a complete world W (the two differ because of incompleteness: P need not decide every statement). To counterpossibly assume S, add S as an axiom of P, and add a rule to P that says something like "whenever a rule of P would let you conclude not-S, conclude S instead". I think this satisfies at least the "not both" half of 2, since we can never conclude not-S and the proof system is otherwise unchanged. I'm somewhat worried about 4: the mangled proof system seems like it could have trouble with proofs by contradiction, since it can't assume Q and conclude not-S to get a contradiction. This construction would also make statements not provable in P counterpossibly independent of S relative to P, which seems OK to me. Is this construction satisfactory overall? I'm not sure, and I'm out of blogging time for today! Maybe I'll come back to it -- writing this post makes me more excited about the possibility.
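The mangled-proof-system idea can be sketched with a toy forward-chaining prover (my own illustration; the encoding of statements as strings with "~" for negation, and the rule format, are assumptions). Any rule that would fire to conclude not-S concludes S instead:

```python
# Toy forward-chaining prover. Statements are strings; "~X" means not-X.
# Rules are (frozenset of premises, conclusion) pairs.

def neg(s):
    return s[1:] if s.startswith("~") else "~" + s

def close(axioms, rules, counterpossible=None):
    """Derive every provable statement. If `counterpossible` is set,
    any rule firing for its negation concludes the assumption instead."""
    known = set(axioms)
    if counterpossible:
        known.add(counterpossible)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known:
                if counterpossible and conclusion == neg(counterpossible):
                    conclusion = counterpossible  # the mangled rule
                if conclusion not in known:
                    known.add(conclusion)
                    changed = True
    return known

# Encode the root-2 example: rational -> expressible as n/m,
# and expressible -> ~rational.
rules = [
    (frozenset({"rational"}), "expressible"),
    (frozenset({"expressible"}), "~rational"),
]
world = close(set(), rules, counterpossible="rational")
# Derives "expressible" (sentence 1 above holds) but never
# "~rational", so no explosion.
print(world)
```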

I'll leave you with something that seems to me like a nice application of counterpossible mathematical reasoning:

Sunday, February 28, 2016

Deep learning in simple words

I wanted to try out the Up-Goer Five text editor, which lets you use only the 1,000 most commonly used English words (compiled by Randall Munroe), and I settled on trying to explain deep learning. Here's what I ended up with:
Sometimes, we want a computer to answer some kind of question, like who is in a picture or what the best move is in a game. To do this, we tell the computer to pretend it has a lot of little parts; each of the parts will look at the question, do a very easy bit of work, and then give its work to the next parts of the computer. At first, the computer doesn't know what each part should do, so it just picks any bit of work without thinking about it. The last parts of the computer will give us the answer.
At first, the parts don't know what to do in order to answer the questions. To help the computer learn, we write down many questions and the right answers. Then, we show the questions to the computer one at a time. The computer uses its parts to guess the answer and checks if it got it right.
Then comes the most important step: if it guessed the answer wrong, the computer changes all of its parts just a little so that it gets a little closer to the right answer. If it does this enough times, it will start to guess the right answers to new questions more and more often.
In some old learning computers, not very many parts worked on a question one after the other. Since one part can't do very much, and the parts couldn't share their little bits of work with each other, the old computers couldn't learn to answer very hard questions. Deep learning is when many parts of the computer work on the question one after the other, so that each part can build on the work of the other parts to answer hard questions.
When lots of parts work on questions one after the other, it's harder to figure out how to change the parts to make the guesses better, and sometimes the parts never figure out the right things to do. But now, some people figured out how the computer should change its parts, and computers are much better at learning to do lots of things.
I'm pretty happy with how the explanation came out. An interesting effect of using the editor was the sense of focus it gave me: it pushed me to prioritize what I see as the most critical aspects of deep learning, instead of more tangential connections between deep learning and other things. I think the result accurately represents my understanding, and someone who understands DL better could use it to diagnose holes in my picture of how DL works, which is nice.

This leaves something to be desired; for example, I love Chris Olah's explanation of DL as differentiable functional programming, and my description didn't really explain anything about backpropagation or the importance of differentiability. Still fun!