
Information: what exactly is it?

I was walking to the tennis courts in Battersea Park a few years back when I heard something on my Walkman radio. It stuck with me for years, and until tonight I hadn’t followed up on it, read about it or written about it. Though I have told everyone at work, which has resulted, as usual, in groans about how nerdy I am (and genuine amazement that I could spend valuable time pondering these things).

What I heard was a very short anecdote about someone who wrote a little-regarded paper in the 1940s (see ref below) in which he made an attempt to define a ‘measure’ for information. Although I never read any more about it (until today), what I heard was enough to set me thinking…

————–

Now, if you know lots about this subject then bear with me. For those readers who don’t know what he came up with, I challenge you with this question:

  • which contains more information: a phone number, a ringtone or a photo?

Are they even comparable?

Bits & Bytes…

In this computer age, we already have some clues. We know that text doesn’t use up much disk space, and that photos & video can fill up the memory stick much quicker.

But what about ZIP files? These are a hint that file-size is not a very accurate measure of information content.

So what is a megabyte? Is it just so many transistors on a microchip? Happily, it’s not; it’s something much more intuitive and satisfying.

Information: what is it?

If you go to Wikipedia and try to look up Information Theory, within a few seconds you are overrun with jargon and difficult concepts like Entropy; I hope to avoid that.

Let’s think instead about 20 Questions. 20 Questions is the game where you have 20 questions to home in on the ‘secret’ word/phrase/person/etc. The key, however, is that the questions must elicit a yes/no response.

To define information simply: the more questions you need in order to identify a ‘piece of information’, the more information content is embodied in that piece of information (and its context).

This helps us to answer questions like: “How much information is in my telephone number?”

Let’s play 20 questions on this one. How would you design your questions? (Let’s assume we know it has 7 digits)

You could attack it digit by digit: “Is the first digit ‘0’? Is the first digit ‘1’?” – moving on to the next digit when you get a yes. For a 7-digit number this could take up to 70 questions (though in fact, if you think a little, you will never need more than 9 per digit, and on average you’ll only need about 5½ per digit – roughly 38 in total).

But can you do better? What is the optimum strategy?

Well let’s break down the problem. How many questions do we really need per digit?

We know that there are 10 choices. You could take pot luck: you might get the right digit first time, or you might need to go all the way to the 9th question (after 9 wrong guesses you don’t need a 10th – the last remaining digit must be it). On average, though, this strategy needs about 5½ questions.
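If you want to check that average, here is a quick simulation sketch (my own, in Python – not from the original anecdote):

    import random

    # A rough simulation of the naive strategy for one decimal digit:
    # ask "is it 0?", "is it 1?", and so on. After nine "no" answers the
    # last remaining digit is known without a further question.
    def questions_for_digit(digit):
        return digit + 1 if digit < 9 else 9

    trials = [questions_for_digit(random.randrange(10)) for _ in range(100_000)]
    print(sum(trials) / len(trials))   # about 5.4 questions per digit on average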

What about the divide and conquer method? “Is it less than 5?” If yes, you have halved the options from 10 to 5. “Is it less than 3?” Now you have either 2 or 3 options left. So you will need 3 or 4 questions, depending on your luck, to identify the digit.

Aside for nerds: Note now that if your number system only allowed 8 options (the so-called octal system), you would always be able to get to the answer in 3. If you had 16 options (hexadecimal), you would always need 4.

For the decimal system, you could try this on a few hundred random digits and find that you need, on average, a little under 3½ questions per digit – the theoretical figure is 3.3219… questions. This is the same as asking: “how many times do you need to halve the options until no more than one option remains?”
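Here is a small Python sketch (again my own) of the halving strategy, which also reproduces the octal and hexadecimal cases from the aside above:

    import math

    # A sketch of the halve-the-options strategy for a single digit. In the
    # worst case each answer leaves us stuck with the larger half.
    def worst_case_questions(options):
        questions = 0
        while options > 1:
            options = math.ceil(options / 2)   # the unlucky half
            questions += 1
        return questions

    print(worst_case_questions(10))   # 4 at worst (3 if you are lucky)
    print(worst_case_questions(8))    # 3 for octal, as in the aside above
    print(worst_case_questions(16))   # 4 for hexadecimal
    print(math.log2(10))              # 3.3219... the long-run average per digit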

Aside 2 for nerds: The mathematicians amongst you will have spotted that 2^3.3219… = 10.

Now, we could use 4 questions (I don’t know how to ask 0.32 questions) on each of the 7 digits and get the phone number, improving from roughly 38 questions (and variable) to a guaranteed 28.

But we could take on the entire number with the divide and conquer method. There are 10^7 (10 million) options (assuming you can have any number of leading zeroes). How many times would you need to halve that?

1. 5 000 000
2. 2 500 000
3. ….

22. 2.38…
23. 1.19…
24. 0.59…

So we only needed 24 questions. Note that calculators (and MS Excel) have a shortcut to calculate this sort of thing: log2(10^7) = ~23.25…
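For anyone who wants to see the halving and the shortcut side by side, a rough Python sketch:

    import math

    # Halve the 10 million possible 7-digit numbers until no more than one remains.
    options = 10 ** 7
    questions = 0
    while options > 1:
        options /= 2
        questions += 1

    print(questions)              # 24
    print(math.log2(10 ** 7))     # 23.25... the spreadsheet shortcut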

OK, so we have played 20 questions. Why? How is the number of questions significant? Because it is actually the accepted measure of information content! This is the famous ‘bit’ of information. Your 7-digit number contains about 24 bits of information!

Epilogue

As you play with the concept, you will quickly see that the amount of information in a number (say the number 42) depends hugely on how many possible numbers it could have been. If it could have been literally any number (an infinite set) then, technically speaking, it contains infinite information (see, I’ve proven the number 42 is all-knowing!).

But the numbers we use daily all have context; without context they have no practical use. Any system that, as part of its workings, required ‘any’ number from an infinite set would be unworkable, so this doesn’t crop up often.

Computer programmers are constantly under pressure to ‘dimension’ their variables to the smallest size they can get away with. Once a variable is dimensioned, the number of bits available for its storage is set, and it doesn’t matter what number you store in it: it will always require all those bits, because it is the number of possibilities that defines the information content of a number, not the size of the number itself.
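To make that concrete, here is a tiny Python sketch (my own) of the ‘number of questions’ idea – the bits needed depend only on how many values were possible:

    import math

    # The bits a value needs are set by how many values it *could* have been,
    # not by the value itself.
    def bits_needed(possibilities):
        return math.ceil(math.log2(possibilities))

    print(bits_needed(2))         # 1 bit: a yes/no flag
    print(bits_needed(256))       # 8 bits: one byte, whether it holds 0 or 255
    print(bits_needed(10 ** 7))   # 24 bits: our 7-digit phone number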

————

I hope that was of interest! Please let me know if I’ve made any errors in my analysis – I do tend to write very late at night 😉

References:

1. Claude E. Shannon, “A Mathematical Theory of Communication”, Bell System Technical Journal, 1948

Analogies not equations, please!

Have you ever noticed how equations look far more complicated and hard to understand than the concept they represent?

I sometimes get myself stuck having to read other people’s work (it’s the ‘peer review process’) and when I first read it, I am often utterly confused, like a person stumbling around a dark room they’ve never been in before. However, because I am expected to make intelligible commentary, I soldier on until I understand what is being said.

Once you understand something, it is hard to remember what you felt like before you understood it. How did that equation look the first time you saw it? I have been thinking about this…

Let’s consider ‘equations’ – a common part of many technical documents. I have found that when I first see the equations, I always overestimate how clever or difficult the underlying work really is. So what does this mean?

It means that by using equations to teach, we risk turning people off by giving them the impression that the work is harder than it is.

Let me give an example:

Maxwell’s wave equations. These are (rightly) considered a cornerstone of physics, as they model the behaviour of waves in the inter-related electric and magnetic fields. When I first read them, they were ‘Greek’ to me – literally. Here’s a small one:

∇ × E = −∂B/∂t   (the Maxwell–Faraday equation)

Obviously, you need to know more to understand what they are about. You need to know what each symbol represents – and you need to know what the operators (the × in this case) actually do. Anyone who has not specifically studied maths at university would then need to backtrack quite far, because in this case the ‘×’ is not the ‘×’ most folks know and love: it’s the ‘cross product’, which applies to vectors. That leaves even most science graduates cold, draining the joy of discovery for a few hours or days while you go away to learn (or remember) what the heck it means.

But is it all worth it? Is the complexity of partial differential equations and matrix multiplication really required in order to understand what the equation is describing?

Of course not!

So why are equations always wheeled out to ‘explain’ phenomena? This is a failure of teaching. Of science communication. Surely concepts can be explained much better by the use of anecdotes, metaphors & illustrations?

Scientists working at the bleeding edge of science have to be very precise in their logic, and when communicating with one another, equations are undoubtedly very efficient ways to describe hypotheses. And so, while they are good ways for experts to relate, they make it harder for newbies to “break in”, and are dreadful teaching tools.

The Maxwell equations really just describe how waves propagate in a medium – really it’s just the full 3-D version of waves in a slinky, or ripples in a pond. The equations, while drawing on complex (and difficult) maths, describe something the human brain already has an intuitive grip on, because we’ve seen it!

I’m not suggesting we could do away with equations – they are valuable in the predictions they make for those who already understand what they represent – I am just suggesting that equations should be de-emphasised, and only dragged out when the student starts to feel the need to describe the phenomenon mathematically.

So my message to all university lecturers and text-book writers is: describe a phenomenon with the use of analogy, please!

Imaginary numbers challenge

I have a challenge for people who understand imaginary numbers (if that is indeed possible).

Now, I have seen how imaginary numbers can be useful. Just as negative numbers can.

For example, what is 4-6+9? 7. Easy. But your working memory may well have stored ‘-2’ in its mind’s eye during that calculation. Yet we cannot have -2 oranges. Or travel -2 metres. Oh sure, you can claim 2 metres backwards is -2 metres. I say it’s +2 metres, the other way (the norm of the vector).

What about a negative bank balance? I say that’s still platonic, a concept. In the real world it means I should hand you some (positive) bank notes.

We use negative numbers as the “left” to the positive’s “right”. Really they are both positive, just in different directions.

Now for imaginary numbers. I have seen how they allow us to solve engineering problems, how the equations for waves seem to rely on them, and how the solutions of the differential equations in feedback control loops seem to require them.

But I argue that they are just glorified negative numbers. The logarithmic version of the negative number.

So what is my challenge?

Well, the history of mathematics is intertwined with the history of physics. Maths has made predictions that have subsequently helped us to understand things in the real world. Maths models the world well, such as the motion of the planets, or the forces suffered by current-carrying wires in magnetic fields.

But the question is: is there any basis in reality for imaginary numbers? Or the lesser challenge, negative numbers? 

Is there a real world correlation to “i” ? Or is it a mere placeholding convenience?

Or perhaps positive numbers also lack this correlation?

In Praise of Logarithms

It occurs to me that powers (or logarithms) form an equally justifiable numbering system of their own; indeed they may be more meaningful and representative of ‘reality’ than the linear numbering systems we use so often. What I am referring to is a numbering system where consecutive ‘numbers’ are not simply the last number +1, but the last number multiplied by some factor. So: 0 1 2 3 may be used to represent 1 10 100 1000 in the case of a base-10 system, or 1 2 4 8 in the case of a base-2 system. (You can see that the represented values are simply obtained by raising the base (10 or 2) to the power of the number in question – so these really are just the logarithmic version of normal numbering.)

But wait! The notable thing here is that this system has no apparent negative numbers: -2 -1 0 1 2 3 becomes 0.01 0.1 1 10 100 1000 for base 10.

Aside 1: you will note that addition in this system is ‘altered’. Addition and multiplication are mixed up! 2*3=5 while 6*9=15 (true for all bases!) and on the other hand 2+2=3 (in base 2) while 2+2=2.3010… (in base 10).
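If you’d like to play with this, here is a small Python sketch (my own notation, nothing standard) of base-10 log-numbers:

    import math

    # Base-10 'log numbers': the log-number x stands for the ordinary value 10**x.
    def multiply(x, y):
        return x + y                            # 10**x * 10**y = 10**(x + y)

    def add(x, y):
        return math.log10(10 ** x + 10 ** y)    # convert, add, convert back

    print(multiply(2, 3))   # 5, as in the aside: 2*3=5
    print(multiply(6, 9))   # 15
    print(add(2, 2))        # 2.3010... (it would be exactly 3 in base 2)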

Aside 2: Negative numbers? The concept of negative numbers in this context has a strong (and genuine) relationship with imaginary numbers in conventional numbering systems. The only way to obtain negative numbers is to raise your base to the power of an imaginary number: e^(i*pi) = -1 being the famous example of this.
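You can check both directions of that claim with any complex-number library; a two-line Python sketch:

    import cmath
    import math

    print(cmath.exp(1j * math.pi))   # (-1+1.22e-16j): -1, up to rounding error
    print(cmath.log(-1))             # 3.14159...j, i.e. i*pi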

There is, however, much benefit to be had in no longer thinking of these numbers as powers, but as numbers in their own right – a numbering system extending from -infinity (representing the infinitely small) to +infinity (representing the infinitely large). This is much more in keeping with reality – in which negative numbers don’t really exist! In fact we are so used to them that we have forgotten that they are just as weird as imaginary numbers (they were the imaginary numbers of their time). They, just like imaginary numbers, are so darn useful and sensible that we forget they really don’t have any basis in reality. They are firmly stuck in the platonic world.

So what of reality then? Imagine two points in open space. How far apart are they? A yard? A mile? One cannot say, as the space has no reference measure besides the two points themselves. The only definition we might attach is to say the distance is “1” – i.e. we define all distance in that world relative to the distance between the two points. If you added more points, the distances between them could then be expressed as multiples of the length AB. Using (for example) a base-2 system – because base 2 gives us doubling (or splitting in half) with each increment – if CD were AB doubled 10 times then its length would be 10, and if EF were AB halved ten times then its length would be -10.
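In Python terms, taking AB as the unit of length (a sketch of my own):

    import math

    ab = 1.0               # take AB as the unit of length
    cd = ab * 2 ** 10      # AB doubled ten times
    ef = ab / 2 ** 10      # AB halved ten times

    print(math.log2(cd / ab))   #  10
    print(math.log2(ef / ab))   # -10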

So what’s the point? This ‘numbering’ system allows a better basis for attacking the big cosmological question: What is the nature of space?

Aside 3: “Information” has been shown to be binary (each bit of info halves the unknowns). If you have two boxes, 1 bit will tell you which one has the prize; 4 boxes will need 2 bits; 32 boxes, 5 bits. There is no such concept as negative bits. This numbering system linearises information content.

Aside 4: Which base? Well, I am not tied down on this yet. 2 is good and e has a strong case. 10 is probably not as useful as we think (Q: why is 10 ‘special’? A: It’s not.)

Please, o blogosphere, dispense thine thoughts!

The scientific method defined (well, hypothesised at any rate)

I recently realised that the jury is out on exactly what science and the scientific method are (or should be, at least).

Some would say that science is the endeavour to understand the world, answer the “how” behind the ocean tides, rainbows or seed germination. So the scientific method is any way we might do this. Sounds reasonable to me.

However, some would say that science is the business of ‘facts’ or ‘truth’ and proofs. We do experiments to ‘prove’ our hypothesis. This is the definition I would like to take issue with.

Theories and facts confused…

I get really agitated when I hear people say that evolution is a ‘fact’. Not because I’m a nutty young-earth creationist (I’m not), but because no-one has yet furnished a proof. But, you may argue, there’s loads of evidence; it’s clearly a fact.

But evidence is not the same as proof.

Even if something is 99.999% sure, it is still not sure.

I think the trouble comes because people are never taught that those ‘theorems’ and ‘proofs’ they learned in maths class are not quite the same as the theories and evidence in the scientific method.

So is maths a science? Well, yes, sort of. But while it can deal with real things, like counting sheep, it actually deals with a sort of imaginary world (the so-called Platonic ‘world of ideas’). The whole of maths is a mental construct with no known (‘proven’) basis in reality. But nonsense, you say, of course there are numbers in the real world! Well, so there are, but there are no proofs!

Proofs are only possible in a fully ‘understood’ world, and because the world of maths is underpinned by a set of axioms, it is, more or less, ‘understood’. But the real world in which we live is not like that. We don’t understand how the brain works, we don’t know how many dimensions there are, and we don’t even know if there is a god.

So does that mean we don’t know anything? The media (and opponents of science) use this uncertainty to undermine science. “You can’t prove there is no God, because there is!” Hey presto, a proof of God.

No, science and the scientific method don’t do proofs and facts. So what do they do?

Let’s consider the old chestnut, evolution. People had a book that explained the marvellous spectrum of life, from the caterpillar to the jellyfish. This was good enough for many years. But some clever folks started to question why God would bother to make different tortoises on different islands, and why He would go to all the trouble of putting dinosaur bones in certain rocks and why he would disguise their uranium-lead isotopes to make them look millions of years old.

So a theory was proposed (Darwin’s natural selection) that explained the incredible story of species and, for good measure, predicted that humans are apes, which went down well in the church.

Since then, loads and loads of observations have been made that confirm the theory (with the odd tweak). It’s a theory that would have been easy to disprove: if it were wrong, animals that couldn’t logically be explained by the theory would have cropped up. But they haven’t.

But all this evidence is not proof. And the lack of a disproof isn’t a proof.

The same is true for all accepted theories. The sun and the moon are thought to cause the tides. Is that a fact?

If you ask a scientist, even a good one, he/she may well say yes, it’s a fact. Because it is so darn likely to be right. Because there is no good alternative theory. Because no-one is disputing it. Because the maths is just so neat. Because the theory can make predictions. All good reasons to accept a theory. But they do not make it fact.

So we do know ‘stuff’, plenty of stuff, facts to all intents and purposes, but not strictly facts in the sense of logical proof.

So what is the scientific method, then?

Science is the system of theories and hypotheses about the nature of reality that have not yet been disproven and which are ranked by the weight of evidence in their favour.

It is like a model of the world that we are forever refining, chucking out wrong theories and refining the ones that work. The scientific method is that refinement process. Well, that is my hypothesis – the truth may be altogether different!