Tuesday, August 30, 2016

Preludes and Haiku

Over the weekend I saw an awesome performance of Chopin's Preludes. The full set runs about 45 minutes and is made up of 24 short pieces; the shortest is only 12 bars long! I really loved listening to these pieces, and I think the short (episodic?) structure contributed to this -- each one is a taste of some particular thing, so the whole set is kind of like a multi-course meal.

I gather that a lot has been written about the Preludes from a musical point of view, but the thing that caught my eye was that preludes before Chopin were almost always literal preludes to longer pieces: introductions to fugues, or improvised (?!) introductions to other compositions. (When did improvisation stop happening during classical music performances?) Chopin's Preludes, on the other hand, weren't written to introduce anything, unless you count introducing one another. They apparently elevated the prelude to an art form in its own right.

This reminded me of haiku, which, I learned from the excellent book Bashō and His Interpreters, used to be just the opening stanza of renga, a form of long collaborative poem composed at poetry gatherings, before being elevated to a stand-alone form. No great moral; I just wonder if there are other opportunities to split off prelude-like things into forms of their own. Maybe movie trailers?

Incidentally, that Bashō book is pretty great! The author, Makoto Ueda, gives a selection of poems from throughout Bashō's life, interspersed with biographical context about what Bashō was up to (including maps of his travels!). Each poem is given in its original Japanese, a word-for-word English gloss, and Ueda's own translation into idiomatic English; Ueda adds notes explaining the context of the poem's composition and any references you need to understand it, e.g. to Japanese plays or traditions or puns; then he lists commentary from poetry critics throughout history. I'd love to read more poetry books that do this, since otherwise I'm not in a position to really understand the poems. I think it's probably the best overall introduction to haiku I've seen, and I definitely composed a few haiku after reading it.

Saturday, August 27, 2016

Acts, omissions, friends, and enemies

The act-omission distinction plays a role in some ethical theories. It doesn't seem relevant to me, because I'm much more concerned with what happens to people than with whether a particular actor's behavior meets some criterion. (Of course, if a consequence is more beneficial or harmful depending on whether it's caused by an act or an omission, I'd care about that.)

Supererogatory acts, and the broader idea that some acts are required or forbidden while others are not, also play a role in lots of ethical theories, but don't seem relevant to me either. To me, ethics is entirely about figuring out which acts or consequences are better or worse, and that framing offers no obvious opening for making some acts required or forbidden.

I recently had an idea about why these concepts appear in some ethical theories: acts/omissions and supererogatory acts seem useful for identifying allies and enemies. This is roughly because acts tend to be costly (in terms of attention and other resources), and supererogatory acts tend to be especially costly. There's not a lot more to say:

  • Allies will pay the cost to help you through their acts; supererogatory acts are especially good indicators of allies.
  • Enemies will pay the cost to hurt you through their acts.
  • Neutral parties may hurt or help you through omissions, but since these aren't costly, they don't carry much information about whether that party is an ally or an enemy; they don't seem to be thinking about you much. (The toy calculation below makes the information point concrete.)
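Here's a toy Bayesian sketch of that information point, with made-up numbers (my framing, not anything rigorous): a costly act is much likelier from an ally than from anyone else, so observing one moves our belief a lot, while a cheap omission is about equally likely from anyone and so moves it barely at all.

```python
# Toy Bayesian sketch of the signaling idea; hypothetical numbers throughout.
# Costly acts are far likelier from allies than from neutral parties, so they
# shift our belief a lot; cheap omissions barely shift it.

def posterior_ally(prior, p_given_ally, p_given_other):
    """P(ally | observation) via Bayes' rule with two hypotheses."""
    num = prior * p_given_ally
    return num / (num + (1 - prior) * p_given_other)

prior = 0.1
# A costly helpful act: allies do it often, others rarely.
print(posterior_ally(prior, p_given_ally=0.8, p_given_other=0.05))  # ~0.64
# A helpful omission: cheap, so nearly everyone "does" it.
print(posterior_ally(prior, p_given_ally=0.9, p_given_other=0.8))   # ~0.11
```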

From my perspective, this is a tentative debunking of these concepts' role in ethics, since allies and enemies don't belong in ethics as far as I can tell. For others, allies and enemies might be legitimate ethical concepts, and maybe this idea could help explain the role these concepts play in their theories.

Final note: I remember hearing about supererogatory acts' evil twins, i.e. acts that are not forbidden but are morally blameworthy, sometimes called "suberogatory acts". These might be useful for identifying allies, who will avoid suberogatory acts, but they don't seem to play much of a role in any ethical theory.

Sunday, August 14, 2016

Do reinforcement learning systems feel pain or pleasure?

Do reinforcement learning systems have valenced subjective experiences -- do they feel pain or pleasure? If so, I'd think they mattered morally.

Let's assume for the time being that they can have subjective experiences at all, that there's something it's like to be them. Maybe I'll come back to that question at some point. For now, I want to present a few ideas that could bear on whether RL systems have valenced experiences. The first is an argument that Brian Tomasik points out in his paper on the subject, but that I don't think he gives enough weight to:

Pleasure and pain aren't strongly expectation-relative; learning is not necessary for valenced experience

An argument I see frequently in favor of RL systems having positive or negative experiences relies on an analogy with animal brains: animal brains seem to learn to predict whether a situation is going to be good or bad, and dopamine bursts (known to have something to do with how much a human subjectively likes or wants something) transmit prediction errors around the brain. For example, given an unexpected treat, dopamine bursts cause an animal's brain to update its predictions about when it'll get such treats in the future. Tomasik touches on this argument in his paper. This might lead us to think that (1) noticing errors in predicted reward and transmitting them around the brain via dopamine might indicate valenced experience in humans, and, by analogy, (2) this same kind of learning might indicate valenced experience in machines.
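To make the analogy concrete, here's a minimal sketch of this kind of reward-prediction learning, in the style of a TD(0) update (my own illustration, not code from Tomasik's paper; the `prediction_error` term plays the role of the dopamine burst):

```python
# Minimal sketch of reward-prediction-error learning, TD(0)-style.
# My own illustration of the analogy, not anything from Tomasik's paper.

def td_update(value, state, next_state, reward, alpha=0.1, gamma=0.9):
    """Nudge the predicted value of `state` toward the observed outcome."""
    predicted = value[state]
    observed = reward + gamma * value[next_state]
    prediction_error = observed - predicted  # the analog of a dopamine burst
    value[state] += alpha * prediction_error
    return prediction_error

# An unexpected treat (reward where none was predicted) yields a large
# positive prediction error, which updates future predictions:
value = {"cue": 0.0, "treat": 0.0}
print(td_update(value, "cue", "treat", reward=1.0))  # 1.0: big surprise
print(value["cue"])                                  # 0.1: prediction updated
```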

However, I think there is a serious issue with this reward-prediction-learning story: in humans, how painful or pleasurable an experience is turns out to be only loosely related to how painful or pleasurable it was expected to be. Expecting a pain or a pleasure might blunt my experience of it somewhat, but my valenced experience doesn't seem closely tied to my prediction error; a fully predicted valence doesn't go away, and in some cases anticipating pain or pleasure might even intensify it.

Biologically, it shouldn't be too surprising if updating our predictions of rewards isn't tightly linked to actually liking an experience. There seems to be a difference between "liking" and "wanting" an experience, and in some extreme cases liking and wanting can come apart altogether. Predicting rewards seems very likely to be closely tied to wanting a thing (because the predictions are used to steer us toward it), but seems less likely to be closely tied to liking it. It seems quite possible to enjoy something completely expected, and to learn nothing new in the process.

In a nutshell, I'm saying something like:
In humans, the size of error between actual drive satisfaction and predicted drive satisfaction doesn't seem strongly linked to valenced experience. Valenced experience seems more strongly linked to actual drive satisfaction.
This seems to me like evidence against RL systems having valenced experience by virtue of predicting rewards and updating based on errors between predicted and actual rewards. Since this is the main thing that RL systems are doing, maybe they don't have valenced experiences.
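The sketch above shows the problem directly (reusing the hypothetical `td_update` from before): once a reward is fully predicted, the prediction error, which is the thing the analogy says tracks valence, shrinks to zero, even though the reward itself keeps arriving.

```python
# Continuing the sketch above (assumes td_update is defined as before).
# After enough trials the treat is fully predicted, so the prediction
# error vanishes -- but the reward doesn't.
value = {"cue": 0.0, "treat": 0.0}
for _ in range(200):
    err = td_update(value, "cue", "treat", reward=1.0)
print(round(err, 4))           # 0.0: nothing left for a "dopamine burst" to signal
print(round(value["cue"], 4))  # 1.0: yet the reward is still delivered every trial
```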

If valenced experience doesn't consist of noticing differences between expected and actual rewards and updating on those differences to improve future predictions, what might it consist of? It still seems closely linked to something like reward, but not to reward's use in updating predictions. Maybe it's related to the production of reward signals (i.e. figuring out which biological drives aren't well-satisfied and incorporating those into a reward signal; salt tastes better when you're short on it, etc.), or maybe to some other use of rewards. One strong contender is reward's role in directing attention, and the relationship between attention and valenced experience.
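Here's a toy sketch of the reward-production idea, with hypothetical numbers, just to make the salt example concrete: the reward a stimulus produces scales with how far the relevant drive is from its set point, rather than with any prediction error.

```python
# Toy sketch of reward *production* (not prediction): reward for a stimulus
# depends on how far a biological drive is from its set point.
# Hypothetical numbers, purely illustrative.

def salt_reward(salt_level, set_point=1.0, gain=2.0):
    """Salty food is more rewarding the more salt-deprived you are."""
    deficit = max(0.0, set_point - salt_level)
    return gain * deficit

print(salt_reward(salt_level=0.2))  # deprived: salty food tastes great (1.6)
print(salt_reward(salt_level=1.0))  # sated: same food, little reward (0.0)
```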

The unnoticed stomachache

Consider the following situation (based on an example told to me by Luke about twisting an ankle but not noticing right away):
A person has a stomachache for one day -- their stomach and the nerves running from their stomach to their brain are in a state normally associated with reported discomfort. However, this person doesn't ever notice that they have a stomachache, and doesn't notice any side-effects of this discomfort (e.g. lower overall mood).
Should we say that this person has had an uncomfortable or negative experience? Does the stomachache matter morally?

My intuitions here are mixed. On the one hand, if the person never notices, then I'm inclined to say that they weren't harmed, that they didn't have a bad experience, and that it doesn't matter morally -- it's as if they were anesthetized, or distracted from a pain in order to reduce it. If I had the choice of giving a Tums to one person who did notice their stomachache or to a large number of people who didn't, I would choose the person who noticed.

On the other hand, I'm not totally sure, and enough elements of discomfort are present that I'd be nervous about policies that resulted in a lot of these kinds of stomachaches -- maybe there is a sense in which part of the person's brain and body are having bad experiences, and maybe that matters morally, even though the attending/reporting part of the person never had those experiences. Imagine a human and a dog; the dog is in pain, but the human doesn't notice this. Maybe part of our brain is like the dog, and the attentive part of our brain is like the human, so that part of the brain is suffering even though the rest of the brain doesn't notice. This seems a little far-fetched to me, but not totally implausible.

If the unnoticed stomachache is not a valenced experience, then I'd want to look more at the relationship between reward and attention in RL systems. If it is, then I'd want to look at other processes that produce or consume reward signals and see which ones seem to track valenced experience in humans.

Either way, I think the basic argument for RL systems having valenced experience doesn't work very well; none of their uses of reward signals "look like" pleasure or pain to me.