Monday, December 19, 2011

Information and entropy

I have a correction to make. A few weeks ago, I wrote that information increases as entropy increases. I was wrong. The relationship between entropy and information depends on who you ask.

Sean Carroll, in his book From Eternity to Here, repeatedly states that information decreases as entropy increases. I take this to be a view more common than mine, though I had long thought it was mistaken. Reading this book changed my mind. It turns out there's more than one way to define information, and unsurprisingly, not everyone chooses to define it as though they're a computer scientist.

Why there exist different definitions of information, some leading to opposite descriptions of the world around us, is something of a riddle. Today I'm going to describe that riddle.

First, take the view that information and entropy are inversely proportional—i.e., the view Carroll expresses in From Eternity to Here. Carroll uses the example of a glass containing warm water and an ice cube. The ice cube melts, and the water ends up cool. This change entails an increase in entropy. But as Carroll puts it, information is lost along the way. That's because the situation ends with a glass of cool water, and a glass of cool water can result either from an ice cube melting in warm water or from a glass whose water was cool to begin with. Two possible states evolved into one—i.e., information decreased.
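To make the many-to-one idea concrete, here's a toy sketch in Python. The state names and the evolve map are purely illustrative, not anything from Carroll's book:

```python
# Toy illustration: two distinct starting macrostates evolve into the same
# final macrostate, so the final state alone can't tell us which history
# produced it.
evolve = {
    "warm water + ice cube": "cool water",
    "cool water all along": "cool water",
}

final = "cool water"
possible_pasts = [start for start, end in evolve.items() if end == final]
print(possible_pasts)  # two possible pasts for one present: that distinction is gone
```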

However, that's not how I normally think about entropy. I take the view that information and entropy are directly proportional. To see it my way, take another example. Imagine you flip a coin a thousand times, and it comes up heads every time. That's an unlikely, low-entropy result. It's also the simplest result to describe; the two-word description "1000 heads" suffices. Contrast that with any high-entropy result you're likely to get from a fair coin, where no discernible pattern emerges. In a patternless result, the only way to describe all the tosses is to list each one individually—e.g., heads, heads, tails, heads, tails, tails, tails, etc. That's what patternless means. Thus, higher entropy requires more information to describe what's going on.
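Here's a crude way to see this in Python, using a run-length style description as a stand-in for "the shortest description." The describe helper is my own toy, and real descriptions could of course be cleverer, but the contrast holds:

```python
import random
from itertools import groupby

def describe(tosses):
    """A crude run-length description: collapse runs of identical tosses."""
    return ", ".join(f"{len(list(run))} {symbol}" for symbol, run in groupby(tosses))

all_heads = ["heads"] * 1000
fair_coin = [random.choice(["heads", "tails"]) for _ in range(1000)]

print(len(describe(all_heads)))  # 10 characters: "1000 heads"
print(len(describe(fair_coin)))  # thousands of characters: no pattern to collapse
```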

So what explains the difference between these two scenarios? Which is the right way of looking at the relationship between information and entropy? I wish Carroll had elaborated on this in his book, but From Eternity to Here is chiefly about time, not information, so I didn't learn why physicists find it compelling to regard entropy and information as inversely proportional. I understand only my own view, which stems from a background in computation.

My perspective comes from dealing with computer stuff, including data compression and the shortest program to do X. Put simply: the more random something is, the less it can be compressed and the longer a program must be to produce it. That leads computer scientists to the counterintuitive notion that high-entropy randomness is full of information, whereas patterns are not. That means we regard a TV showing static as containing more information than a TV showing a show, just as a shredded book contains more information than an intact book. As you may imagine, this view takes some getting used to.
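As a quick sanity check of the compression point, here's a small Python sketch using the standard zlib module. The exact byte counts vary by run and by zlib version, so treat the numbers in the comments as ballpark figures:

```python
import random
import zlib

pattern = b"H" * 1000                                        # the "1000 heads" case
static = bytes(random.getrandbits(8) for _ in range(1000))   # TV-static-like noise

print(len(zlib.compress(pattern)))  # a handful of bytes: the pattern compresses away
print(len(zlib.compress(static)))   # roughly 1000 bytes, maybe a bit more:
                                    # randomness refuses to compress
```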

As for the riddle of the two examples, the difference between the two views comes down to whether one looks at things macroscopically or microscopically. The macroscopic view leads to the physics perspective, where molecules are coarse-grained into big states, such as warm water with an ice cube. As entropy increases, the number of possible macroscopic states decreases, and that's perceived as information loss.

The microscopic view leads to the computer science perspective, where there is no coarse-graining and one keeps track of each individual bit. As entropy increases, the number of possible microscopic states increases, and that's perceived as information gain.
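To see both perspectives side by side, here's a small Python sketch of coarse-graining. The setup is mine, purely for illustration: ten coin tosses serve as the microstates, and "number of heads" serves as the macrostate:

```python
from collections import Counter
from itertools import product
import math

N = 10  # ten coin tosses; each H/T sequence is one microstate

# Coarse-grain: lump every microstate into a macrostate, the number of heads.
multiplicity = Counter(seq.count("H") for seq in product("HT", repeat=N))

for heads, count in sorted(multiplicity.items()):
    # log2(count) is how many bits of microscopic detail the macrostate hides.
    print(f"{heads:2d} heads: {count:4d} microstates, {math.log2(count):5.2f} hidden bits")
```

The extreme macrostates (all heads, all tails) each correspond to exactly one microstate, while the middling, high-entropy macrostate (five heads) is compatible with 252 of them. As entropy rises, knowing only the macrostate tells you less and less about the microscopic details, which is the physics sense of losing information; spelling out the exact microstate costs more and more bits, which is the computer science sense of gaining it.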

That solves that riddle, but it suggests another riddle entirely: what is information, really? May we say something objective about it?

2 comments:

Shafik said...

Good stuff!

Here's how I look at it: Information is the amount of Order a certain entity or state has, and Entropy is the opposite - it's the amount of Chaos (or Disorder) present in a state.

This means I agree with Sean Carroll's view of entropy (inverse proportionality). Let's go back to his example, a water cup with ice cubes in it. In a very real sense, when the ice melts, the amount of chaos in the combined state of the water molecules *increases*. I now need more Information to describe what's going on in the cup. Since I need more information to describe the current state, that means I *lost some of it in the process*. Hence the cup has lost information.

It feels kind of backward, but this is also compatible with Claude Shannon's idea of information. See "Self-Information": http://en.wikipedia.org/wiki/Self-information. Like Carroll, Shannon would think that the ice-cube state is less probable than the warm water state, and thus contains more information.


Now consider the case of compressing simple text files. Suppose file-1 contains a string of 100 "A"s, and file-2 contains 100 random letters. In your interpretation, you see file-1 as having much less information, since it's easily described ("100 As"). That description is only six characters long. file-2, on the other hand, has no discernible pattern, and hence must be described by spelling out the entire 100-letter sequence ("XJLAINBCXETLHHWQ ...").

Another way you can look at this scenario, though, is that file-1 has much more Order and much less Disorder than file-2. file-1 in a sense is "easily describable" or perhaps "filled with information". It is "filled with information" precisely because it is *easy* to describe. file-2, on the other hand, has so much Disorder and not much Order. I'll have a *hard* time describing it. I have to provide a lot of information on my own in order to describe it, which means the amount of information it intrinsically has is *low*.


That's how I think of this at least. You're right though, the definition of "information" is quite hard to pin down once you really want to get rigorous. It'd be really cool to dive deeper into Information Theory to investigate this issue further.

Craig Brandenburg said...

Shafik— The fight over defining “information” seems arbitrary to me, like labeling electrical charge positive or negative. People would like the rigorous, scientific definition of information to be compatible with how we use the term in casual speech. But the problem is that people casually use the term in both ways. That is, if you speak random nonsense to me, you're not informative. But if you speak totally predictably, you're also not informative. Casual use of the term lands somewhere in the middle. So I don't see it as important to define the term one way or the other—unless you want to fit in academically, of course.