Monthly Archives: December 2008

Information: what exactly is it?

December 21, 2008 | Education, Mathematics, Physics, Science, Science communication, Things Explained | computers, maths, Science | jarrodhart

I was walking to the tennis courts in Battersea Park a few years back, when I heard something on my Walkman radio. It stuck with me for years, but until tonight I hadn’t followed up on it, read about it or written about it. I have, though, told everyone at my work, which has resulted, as usual, in groans about how nerdy I am (and genuine amazement at how I could spend valuable time pondering these things).

What I heard was a very short anecdote about someone who wrote a little-regarded paper in the 1940s (see ref below) in which he made an attempt to define a ‘measure’ for information. Although I never read any more about it (until today), what I heard was enough to set me thinking…

————–

Now, if you know lots about this subject then bear with me. For those readers who don’t know what he came up with, I challenge you with this question:

What contains more information: a phone number, a ringtone or a photo?

Are they even comparable?

Bits & Bytes…

In this computer age, we already have some clues. We know that text doesn’t use up much disk space, and that photos & video can fill up the memory stick much quicker.

But what about ZIP files? These are a hint that file-size is not a very accurate measure of information content.
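The ZIP hint can be tried out in a few lines. This is only a rough sketch, assuming Python's standard zlib module as a stand-in for a ZIP program; the two test inputs (a repeated digit pattern and a block of random bytes) are my own made-up examples. Both start out exactly the same size, but only the predictable one shrinks.

```python
import os
import zlib

size = 100_000
repetitive = b"0123456789" * (size // 10)   # highly predictable bytes
random_bytes = os.urandom(size)             # essentially unpredictable bytes

packed_repetitive = zlib.compress(repetitive)
packed_random = zlib.compress(random_bytes)

# Same size going in, very different sizes coming out.
print(len(repetitive), "->", len(packed_repetitive))
print(len(random_bytes), "->", len(packed_random))
```

The repeated pattern collapses to a tiny fraction of its size, while the random bytes barely change, which is the sense in which file size and information content part company.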

So what is a megabyte? Is it just so many transistors on a microchip? Happily, it’s not; it’s something much more intuitive and satisfying.

Information: what is it?

If you go to Wikipedia and try to look up Information Theory, within a few seconds you are overrun with jargon and difficult concepts like Entropy; I hope to avoid that.

Let’s rather think about 20 questions. 20 Questions is the game where you have 20 questions to home in on the ‘secret’ word/phrase/person/etc. The key, however, is that the questions need to elicit a yes/no response.

To define information simply: the more questions you need in order to identify a ‘piece of information’, the more information content is embodied in that piece of information (and its context).

This helps us to answer questions like: “How much information is in my telephone number?”

Let’s play 20 questions on this one. How would you design your questions? (Let’s assume we know it has 7 digits)

You could attack it digit by digit: “Is the first digit ‘0’? Is the first digit ‘1’?”, moving on to the next digit when you get a yes. If the number is 7 digits long, this may take up to 70 questions (though in fact, if you think a little, you will never need more than 9 per digit, and on average you’ll only need about 5 per digit – averaging ~35 in total).

But can you do better? What is the optimum strategy?

Well let’s break down the problem. How many questions do we really need per digit?

We know that there are 10 choices. You could take pot luck, and you could get the right number first time, or you might get it the 9th time (if you get it wrong 9 times, you don’t need a 10th question). However, this strategy will need on average about 5 questions.
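The pot-luck average is easy to check by brute force. Here is a minimal sketch; the function name questions_sequential is made up for illustration, and it assumes you ask “is it 0?”, “is it 1?” and so on in order, with every digit equally likely.

```python
def questions_sequential(secret, options=10):
    """Count yes/no questions when asking 'is it 0?', 'is it 1?', ... in order."""
    for asked, guess in enumerate(range(options - 1), start=1):
        if guess == secret:
            return asked
    # Every option but the last was ruled out, so no further question is needed.
    return options - 1

counts = [questions_sequential(d) for d in range(10)]
average = sum(counts) / len(counts)
print(counts)    # questions needed for each possible digit
print(average)   # average over all ten digits
```

Under those assumptions the average comes out at 5.4, close to the rough figure of 5 used in the text.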

What about the divide and conquer method? Is it less than 5? If yes, you have halved the options from 10 to 5. Is it less than three? Now you have either 2 or 3 options left. So you will need 3 or 4 questions, depending on your luck, to ID the number.

Aside for nerds: Note now that if your number system only allowed 8 options (the so-called octal system), you would always be able to get to the answer in 3. If you had 16 options (hexadecimal), you would always need 4.
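The divide and conquer counts can be checked the same way. This is a sketch rather than the one true strategy: the helper questions_halving is a hypothetical name, and it always splits the remaining options as evenly as it can, which is one reasonable reading of the method described above.

```python
def questions_halving(secret, options):
    """Count 'is it less than m?' questions until only one option is left."""
    low, high = 0, options      # remaining candidates are low .. high - 1
    asked = 0
    while high - low > 1:
        mid = (low + high) // 2
        asked += 1
        if secret < mid:
            high = mid          # answer was yes: keep the lower part
        else:
            low = mid           # answer was no: keep the upper part
    return asked

for options in (8, 10, 16):
    counts = [questions_halving(s, options) for s in range(options)]
    print(options, min(counts), max(counts), sum(counts) / options)
```

With 8 options every secret takes exactly 3 questions and with 16 exactly 4, while with 10 options some digits take 3 and others 4.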

For the decimal system, you could take a few hundred random digits as one long number, and find out that you need, on average, 3.3219… questions per digit (taken one digit at a time, halving averages 3.4). This is the same as asking “how many times do you need to halve the options until no more than one option remains?”

Aside 2 for nerds: The mathematicians amongst you will have spotted that 2^3.3219 ≈ 10

Now, we could use 4 questions (I don’t know how to ask 0.32 questions) on each of the 7 digits, and get the phone number, and we will have improved from 35 questions (though variable) to a certain 28 questions.

But we could take the entire number with the divide and conquer method. There are 10⁷ (10 million) options (assuming you can have any number of leading zeroes). How many times would you need to halve that?

1. 5 000 000
2. 2 500 000
3. ….
…
22. 2.38…
23. 1.19…
24. 0.59…

So we only needed 24 questions. Note that calculators (and MS Excel) have a shortcut to calculate this sort of thing: log₂(10⁷) ≈ 23.25…, which rounds up to 24 whole questions.
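For anyone who would rather not halve ten million by hand, the same count can be reproduced in a few lines. This is just a sketch using Python's standard math module; the variable names are mine.

```python
import math

options = 10 ** 7            # every 7-digit string, leading zeroes allowed
remaining = options
halvings = 0
while remaining > 1:         # stop once no more than one option remains
    remaining /= 2
    halvings += 1

print(halvings)                        # whole questions needed
print(math.log2(options))              # the same quantity as a fraction
print(math.ceil(math.log2(options)))   # rounded up to whole questions
```

The loop and the logarithm agree: the fractional answer is a little over 23, and you need 24 whole questions.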

OK, so we have played 20 questions. Why? How is the number of questions significant? Because it is actually the accepted measure of information content! This is the famous ‘bit’ of information. Your 7-digit number contains about 23.25 bits of information, or 24 whole questions!

Epilogue

As you play with this concept, you will quickly see that the amount of information in a number (say the number 42) depends hugely on the number of possible numbers the number could have been. If it could have been literally any number (an infinite set) then, technically speaking, it contains infinite information (see, I’ve proven the number 42 is all-knowing!).

But the numbers we use daily all have context, without context they have no practical use. Any system that may, as part of its working, require ‘any’ number from an infinite set would be unworkable, so this doesn’t crop up often.

Computer programmers are constantly under pressure to ‘dimension’ their variables to the smallest size they can get away with. Once a variable is dimensioned, the number of bits available for its storage is set, and it doesn’t matter what number you store in that variable: it will always require all those bits, because it is the number of possibilities that defines the information content of a number, not the size of the number itself.
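Here is one way to see the ‘dimensioned variable’ point in practice. It is only an illustration, assuming a 32-bit unsigned integer as the variable and using Python's standard struct module to mimic fixed-size storage; the two values stored are arbitrary picks of mine.

```python
import struct

# "<I" means a little-endian unsigned 32-bit integer: always 4 bytes.
small = struct.pack("<I", 7)
large = struct.pack("<I", 4_000_000_000)

print(len(small), len(large))   # bytes used to store each value

# By contrast, the bits needed just to write each value down differ a lot.
print((7).bit_length(), (4_000_000_000).bit_length())
```

Both values occupy the same 4 bytes once the variable is sized, even though 7 on its own could be written in 3 bits.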

————

I hope that was of interest! Please let me know if I’ve made any errors in my analysis – I do tend to write very late at night 😉

References:

1. Claude Shannon, “A Mathematical Theory of Communication”, Bell System Technical Journal, 1948

Skepticism: religion’s cancer

December 19, 2008 | Atheism, Evolution, In the media, Skepticism | Atheism, cancer, carcinogenic, epidemiology, god, religion, Skepticism | jarrodhart

Religion has been described as a virus. This is not because it’s ‘bad for you’ necessarily, but rather due to the way it spreads.

It’s not hard to see the parallel: like viruses (and bacteria), religions exist within a population and spread from person to person.

But what about atheism? Is it a viral idea (meme) too?

I will argue that it isn’t. Perhaps it’s more like a cancer, a ‘mutation’ that kills off religious infections.

Cancers are sneaky, because they can occur spontaneously, almost by chance, and are therefore a very statistical phenomenon: your chance of getting cancer is affected by (a) your exposure (to carcinogens causing mutation events), and (b) your predisposition (genes affecting your ability to cope with these mutations).

Your chance of becoming an atheist is likewise affected by (a) your exposure (to information about how the world works) and (b) your predisposition (intelligence, or ability to apply logic to the information).

I.e. atheism differs from religion in the same way that carcinogens differ from viruses.

Can we develop this idea? I think so.

Let’s look at how you ‘get’ atheism…

Picture it: you’ve been brought up in a good god-fearing, church-going family. You went to Sunday school, you know which of Cain and Abel was the baddy and you can explain to people about how there is good evidence for The Flood. You also have a healthy fear of sex and the other sins.

But you go to school and you learn about plate tectonics and see how well South America slots into Africa, and then you learn how European bees are not quite the same as African ones, just like Toyota Corollas aren’t, and one day, while looking at the grille of your step-mother’s 1.3GL, and daydreaming about the A-team, a thought strikes you, like a shot of cancer-causing sunshine on that patch of skin on the back of your right shoulder, that cars evolve differently in different countries and maybe that explains all the animals and perhaps God didn’t make a woman out of Adam’s rib after all, ’cos that never did make much sense, because a rib is a pretty silly thing to make a woman out of anyway.

Catching a dose of Christianity on the other hand, does not come from inside, as the result of reasoning, it comes from outside, from other people.

Most often you will be born into a house absolutely soaked in the infection, you will be infected soon enough, prayers will be said at mealtimes, the church is so big and grand, and the hymns are so catchy, and then they wheel out Christmas and baby Jesus (or baby ‘cheeses’ as my son says)…

But even if you’re not so lucky, there’s hope. You can drop in at a church any time (though Sundays are best I’m told) and the chances are, even if you are down on your luck, short of friends, and even if you aren’t very nice, the sweet people there are quite likely to help you. That feeling of family, of unquestioning acceptance – brings a special warmth to the cockles of the heart.

Once you’re in the door, religion, having evolved pretty niftily, can now play you like a violin. Your emotions, developed to help promote clan solidarity, are hijacked and kick in nicely. Did you know that if you really listen to what these folks say, and really try to feel God’s love, you will indeed feel something? Now that’s a clever infection…

A house price prediction…

December 10, 2008 | Economics, In the media, The scientific method | baby-boom, Britian, Economics, economy, houses, UK | jarrodhart

House prices, like the stock market, are tricky to predict.

As with the stock market, there are two classes of parameters that affect the prices: on the one hand the so-called ‘fundamentals’, like supply and demand and the price-to-earnings ratio, and on the other the more transient effects, like the economic climate and the ever-slippery ‘confidence’.

There has been feverish speculation for years in the UK, and the prices rose for 15 consecutive years, and are at last dropping.

So why did the prices get so high? Many economists would argue it was a classic “bubble”, a self-perpetuating cycle of confidence building more confidence; in other words the fundamentals were being ignored.

Of course, people found fundamentals they claimed justified the prices, in particular increased demand. Folks living longer, divorce, folks marrying later, immigration, and the breakdown of the family unit: all these things mean we need more houses.

But if these fundamentals were the whole reason, the prices wouldn’t be dropping as they are now. OK, so now most will admit it got out of hand and this is a correction. But how far has it got to correct?

The bubble, it seems to be agreed, was really helped by two factors:

Firstly, there was a throttle on the supply – planning permission is notoriously hard to get, and the government probably knew it and were happy with prices rising, as it made everyone feel prosperous. On a more sinister front, housing developers may have been sitting on prime real estate to deliberately keep prices high.

Secondly, there was easy credit – anyone and their dog could get the cash so people who really shouldn’t have been in the game got in and are now out of their league.

But there is a third factor I’ve not seen discussed in the media: the baby-boom generation.

Hasn’t this bubble coincided with the baby boomers’ ‘rich’ phase – the age from 45-60 when the kids are off and 25 years of mortgage payments have built up the asset list? Surely this is the age-group that is most likely to own big houses, or multiple houses for that matter?

So what will happen now? The bubble has burst, the correction is in full swing, but what will happen in the next 10 years as the baby boomers start retiring, downsizing, and dying? Will this coincide with the next bubble-burst? Will the industry and government look at the population age profile during planning?

I personally hope this is why the market is cock-eyed – why it is that a professional engineer in his mid-thirties with an internationally comparable salary can’t afford more than a mid-terrace house with a 5×5-metre garden…

So I predict (well pray really, if that’s possible for atheists) that we will get into an oversupply situation and that house prices should correct from this ‘second-order’ bubble.

Of course, even if I am right, it may be that the prices are kept up by nasty developers identifying whole towns to ‘let go to ruin’ just to keep the prices high in the next town along…

Celebrity Dynamics

December 10, 2008 | Economics, In the media, Science communication, The scientific method | analysis, celebrity, fame | jarrodhart

The list of people we all ‘know’ isn’t that long. Yes, it is probably thousands – politicians, actors, singers, historical figures, sports stars – but in a country like the UK, it is still a remarkably small fraction of the populace.

Of course, there are ‘spheres’ – people interested in politics know more politicians, sports fans have more sporting heroes – we here in Cornwall have our local ‘Cornish’ celebrities.

However, if we remembered every celebrity, we would soon run out of space in the public ‘memory’, so we have to be selective.

The media know this – they constantly face choices of which story to follow, and the decisions will often be arbitrary; two minor celebrities did two things today, and we only have 45 seconds of time to fill in our variety news programme – which shall we choose?

This decision process is simple – the editor will pick the celebrity who has more recent ‘hits’ in the news.

Why? Because they know that the audience is more likely to recognise the name – and they know that if the audience hear that name twice it reinforces the memory.

This simple logic creates a very interesting system in which the rise to fame becomes ‘autocatalytic’ – a self-perpetuating, accelerating process. All you need to do is pass some ‘critical point’ of news coverage and you may be in for a ride!

However, we can only hold so many names in the list, so anyone who is out of the news for a time drops off the radar pretty fast, even if they did once enjoy high exposure.
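To make the ‘autocatalytic’ idea a little more concrete, here is a deliberately crude toy model. Everything in it is an assumption of mine rather than anything measured: two made-up celebrities, A and B, a 7-day ‘public memory’ window, and an editor who always covers whoever has more hits inside that window.

```python
from collections import deque

def simulate(initial_hits, days, window=7):
    """Each day the editor covers whoever has the most hits in the last `window` days."""
    # One slot per remembered day: 1 = in the news that day, 0 = not.
    # Assumes each starting hit count is no larger than the window.
    recent = {
        name: deque([1] * hits + [0] * (window - hits), maxlen=window)
        for name, hits in initial_hits.items()
    }
    for _ in range(days):
        chosen = max(recent, key=lambda name: sum(recent[name]))
        for name, history in recent.items():
            # The oldest day falls out of memory as the new one is added.
            history.append(1 if name == chosen else 0)
    return {name: sum(history) for name, history in recent.items()}

print(simulate({"A": 3, "B": 2}, days=14))
```

In this toy version a one-hit head start is enough: A ends up filling the whole window and B drops off the radar entirely. Real editors are obviously not this mechanical, so treat it as a cartoon of the argument rather than evidence for it.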

If you are like me, you’ll be thinking of exceptions – folks who just stay famous regardless – do they buck this logic? I don’t think so.

Such people most likely still get exposure, even if it’s not them in the news – perhaps we see their CD on our shelf, or we talk about their ‘field’ (Thatcherism, Darwinism, Keynesian economics), and this may be accentuated if their field gets in the news – as has recently been the case for Keynes.

So what value does this theory have?

I think it explains:

why so many great deeds don’t lead to fame
why often only one person from a high achieving team is ‘selected’ for fame
why there’s no such thing as bad publicity
why local fame does not easily turn to national fame

It also suggests that if you want to be famous, you should:

stage a series of newsworthy events in succession, which is probably better than a single highly newsworthy achievement
be the leader or public face of any group/team/band you are in
associate yourself with a newsworthy field, and ideally become the posterboy/girl for the field, always dragged out when the field is in the news

And if you want to stay famous once you are, you should keep in the public eye:

associate yourself with newsworthy events
differentiate yourself from other celebrities in your ‘space’ or
gang together with other celebrities to create newsworthy events
become the posterboy/girl for a newsworthy field/subject, the one dragged out when the field is in the news

Aside: There seems to be another way to maintain fame: create mystique, the image of privilege, of some higher plane of existence away from the mundanity of everyday life. People say they like down-to-earth celebrities – that’s because they are very rare – you have to be ‘proper’ famous to stay famous without this tactic!

Of course, this all assumes you want to be famous! You can equally use the theory to keep a low profile 😉

Good luck either way!

The Provincial Scientist

Out on a limb…