I have a correction to make. A few weeks ago, I wrote that information increases as entropy increases. I was wrong. The relationship between entropy and information depends on who you ask.
Sean Carroll, in his book From Eternity to Here, repeatedly states that information decreases as entropy increases. I take this to be a view more common than mine, though I had long thought it was mistaken. Reading this book changed my mind. It turns out there's more than one way to define information, and unsurprisingly, not everyone chooses to define it as though they're a computer scientist.
Why there exist different definitions of information, some leading to opposite descriptions of the world around us, is something of a riddle. Today I'm going to describe that riddle.
First, take the view that information and entropy are inversely related, i.e., the view Carroll expresses in From Eternity to Here. Carroll uses the example of a glass containing warm water and an ice cube. The ice cube melts, cooling the water. This change entails an increase in entropy. But as Carroll puts it, information is lost along the way. That's because the situation ends with a glass of cool water, and a glass of cool water can result either from an ice cube melting in warm water or from a glass whose water was cool to begin with. Two possible states evolved into one; information decreased.
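To see the many-to-one character of that evolution in miniature, here's a toy Python sketch (my own illustration, not Carroll's), where a two-entry dictionary stands in for the melting process:

```python
# Two distinct starting macrostates...
evolve = {
    "warm water + ice cube": "cool water",
    "cool water": "cool water",
}

# ...both end up in the same final macrostate, so the evolution can't be
# run backward: seeing "cool water" no longer tells you which past produced it.
print(len(evolve), "possible pasts ->", len(set(evolve.values())), "possible present")
```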
However, that's not how I normally think about entropy. I take the view that information and entropy are directly related. To see it my way, take another example. Imagine you flip a coin a thousand times and it comes up heads every time. That's an unlikely, low-entropy result. It's also the simplest result to describe: the two-word description "1,000 heads" suffices. Contrast that with any high-entropy result you're likely to get from a fair coin, where no discernible pattern emerges. In a patternless result, the only way to describe the coin tosses is to list each toss individually ("heads, heads, tails, heads, tails, tails, tails," and so on); that's what patternless means. Thus, higher entropy requires more information to describe what's going on.
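To make the comparison concrete, here's a minimal Python sketch that uses zlib's compressed size as a rough stand-in for description length (the sequences, the seed, and the library choice are mine, not anything from the book):

```python
import random
import zlib

# A patterned, low-entropy result: 1,000 heads in a row.
all_heads = "H" * 1000

# A typical fair-coin result: no discernible pattern.
random.seed(0)
fair_coin = "".join(random.choice("HT") for _ in range(1000))

# zlib's compressed size is a crude proxy for how long a description
# of each sequence has to be.
print(len(zlib.compress(all_heads.encode())))   # tiny: the pattern compresses away
print(len(zlib.compress(fair_coin.encode())))   # much larger: each toss must be spelled out
```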
So what accounts for the difference between these two scenarios? Which is the right way of looking at the relationship between information and entropy? I wish Carroll had elaborated on this in his book, but From Eternity to Here is chiefly about time, not information, so I didn't learn why physicists find it compelling to treat entropy and information as inversely related. I understand only my own view, which stems from a background in computation.
My perspective comes from computing, in particular data compression and the idea of the shortest program that produces a given output (Kolmogorov complexity). Put simply: the more random something is, the less it can be compressed, and the longer a program must be to reproduce it. That leads computer scientists to the counterintuitive notion that high-entropy randomness is full of information, whereas patterns contain little. By this reckoning, a TV showing static contains more information than a TV showing a broadcast, just as a shredded book contains more information than an intact one. As you may imagine, this view takes some getting used to.
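One way to put a number on that intuition is to compute the empirical Shannon entropy of the byte frequencies, i.e., the average number of bits an optimal symbol-by-symbol code would spend per byte. The sketch below is my own illustration: uniformly random bytes stand in for TV static, and a repetitive phrase stands in for structured content.

```python
import os
from collections import Counter
from math import log2

def bits_per_symbol(data: bytes) -> float:
    """Empirical Shannon entropy: the average number of bits an optimal
    symbol-by-symbol code would need for each byte of this data."""
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * log2(n / total) for n in counts.values())

static = os.urandom(4096)               # "TV static": uniformly random bytes
show = b"the quick brown fox " * 205    # repetitive stand-in for structured content

print(bits_per_symbol(static))  # close to 8 bits per byte: nearly incompressible
print(bits_per_symbol(show))    # well under 8: structure means less information per symbol
```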
As to the riddle of the two examples, the difference between the two views comes down to whether one looks at the world macroscopically or microscopically. The macroscopic view leads to the physics perspective, where molecules are coarse-grained into big states such as "warm water with an ice cube." As entropy increases, distinct macroscopic states merge into fewer possibilities, and that's perceived as information loss.
The microscopic view leads to the computer science perspective, where there is no coarse-graining and one keeps track of every individual bit. As entropy increases, the number of possible microscopic states increases, and that's perceived as information gain.
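Here's a small Python sketch of that coarse-graining, with ten coins standing in for the water molecules (my own toy setup): each full sequence of heads and tails is a microstate, and its head count is the macrostate.

```python
from collections import Counter
from itertools import product
from math import log2

N = 10  # a toy "glass of water" made of ten coins

# Microscopic view: keep track of every individual coin.
microstates = list(product("HT", repeat=N))

# Macroscopic view: coarse-grain each sequence down to its head count.
macro_sizes = Counter(seq.count("H") for seq in microstates)

# Compare a low-entropy macrostate (all heads) with a high-entropy one (half heads).
for heads in (N, N // 2):
    size = macro_sizes[heads]
    print(f"{heads} heads: {size} compatible microstates, "
          f"{log2(size):.1f} bits to single one out")
```

The low-entropy macrostate "all heads" pins down the microstate exactly, while the high-entropy macrostate "half heads" is compatible with 252 of them. At the macro level, distinctions have been blurred away; at the micro level, it takes about eight more bits to say which sequence you actually have. Same toy system, two opposite-sounding verdicts about information.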
That solves that riddle, but it suggests another riddle entirely: what is information, really? May we say something objective about it?