Category: Technology

Programming, gadgets (reviews thereof) and computers

Some notes on SQL: 1 – creation

These are some notes on SQL a language for creating and querying databases, I’m learning it because a couple of programming projects I have in mind for work and home will need it. The source for these notes is  Head First SQL book. I’ve used a previous book in this series and I quite like the presentational style. I’m using MySQL via it’s command line client to do the exercises, because it’s about as straightforward as you can get. The code is formatted for display using this SQL to HTML formatter. Notepad++ recognises SQL as a language and will do syntax highlighting, so I type my commands into it and copy them into the MySQL command line client.

SQL is quite an old language and the convention is to write keywords in block capitals (reminds me of FORTRAN!). Command sequences are terminated by a semi-colon.

To start, this sequence creates a database, sets it as active and then adds a table containing a range of fields of different types, the command DESC shows the layout of a table:

CREATE DATABASE my_database;
USE my_database
contact_id INT NOT NULL,
first_name VARCHAR(20),
last_name VARCHAR(20),
birthday DATE,
life_story BLOB,
weight DEC(3, 2) NOT NULL DEFAULT 80.00,
state_code CHAR(2),
appointment DATETIME

Desc contacts;

NOT NULL keywords are used if a field must be specified on INSERT. Once created data can be added to the table using the INSERT command:

INSERT INTO contacts (contact_id,first_name,last_name,birthday,life_story,weight,
(1,‘Ian’,‘Hopkinson’,‘1970-24-04’,‘A very long text string’,80.0,
’10:30am 2010-21-06′

INSERT can be used with no specified fields (in which case values for all supplied fields must be provided for all fields), or with a subset of fields. In order to add the ‘ character we can either use ” or \’ (where ” is two single quote characters, rather than a double quote).

To delete a table:

DROP TABLE contacts;

This command should be used with care since it deletes the table whether or not it contains data. The next post should be on the SELECT command. 

Of course you can find SQL cheatsheets elsewhere.


That’s nice, dear

This blog post is about programming, for people that don’t program – at least that’s the effect I’m aiming for. The title is in recognition of my tolerant wife, The Inelegant Gardener, who has learnt the appropriate response to my enthusiastic displays of the results of my programming: “That’s nice, dear”!.

I started programming a long time ago – in around 1980, at the school computer club, when I was 10. Since then I’ve been taught odd bits of programming by scientists, and done quite a lot of programming as part of my scientific job. I’ve started to get more interested in proper software engineering in the last few years. This is a roundabout way of saying I am an enthusiastic amateur.

People associate programming with the mathematically minded, but this isn’t necessarily the case: the codebreakers at Bletchley Park, who were amongst the first users of electronic computers, had a range of skills – amongst them were linguists and crossword wizards. I was talking to a Fellow in linguistics, who’d helped write his college’s library software – as he pointed out: a very logical view of language is a great benefit for a programmer. Programming is about giving an idiot very exact instructions, if the instructions concern maths then you need to know maths – otherwise you don’t.

The core of programming is still what I learned years ago, data (numbers or letters) is stored in “variables” that have names. There are conditional statements: “If [something is true] Then [do this] or else [do the other]”. There are looping statements: “Do this 100 times”. And there are functions: “add 2 to this number, square it, add the number you first thought of and tell me the answer” or “how many times does the letter a occur in this sentence”.

These simple statements are being buried under an increasing depth of additional ideas. Since the 80’s the big thing in programming has been “object-orientation”. In object-orientated programming you package up data of a particular sort with functions that relate to that data. So if you had data modelling an octopus you would include functions such as “wave-tentacles” and “change colour”, such functions would be useless for data describing a horse. The real benefit to this is comprehending larger software systems, because a sea of functions and data is grouped together into logical islands. Beyond this there are design patterns – reoccurring systems of objects which I haven’t entirely go the hang of.

In addition to the changes in language, there are changes in the tools used to program: syntax highlighting is nice, it amounts to colouring the verbs, nouns and proper names in programming in different colours – makes it easier to spot mistakes. Auto-completion is another handy tool, in a well-designed language there are only a limited number of next possible statements when you are programming – auto-completion presents you with them as you type. Sites like Stackoverflow are great for asking programming questions, and there no end of function libraries available on the web to help you out.

I have a number of little software projects on the go, you can see them in much the same way as woodworking projects, suduko or crosswords: they keep me out the way, muttering quietly to myself and exercising my brain. It doesn’t matter that what I’m doing isn’t groundbreaking and new.

Programming does lead to some odd habits; when I started programming it was useful to know binary and hexadecimal number systems, as a consequence I believe that numbers such as 1024 and 128 are nice and round. I’ve come to appreciate a wide range of bracket styles [] (){} since they are all used for different things and the semi-colon is one of the most important pieces of punctuation in my life. If I program for too long in a stretch I start to forget how to speak to people.

And just to show off the results of my latest fiddlings: maps of the UK election results. I got interested in doing this just after the General Election. The Guardian has published a lovely spreadsheet of election results, including data on every single candidate. You see lots of maps of data of this sort, I wanted to know how it was done. (Technical details beyond the maps.)

First of all the gender of MP’s by constituency: constituencies represented by ladies are marked pink, those by men marked blue:

The black constituency in northern England is Thirsk and Malton, which held its election on 27th May, following the death of one of the candidates during the general election campaign.
The population of each constituency is also interesting, here I have coloured the constituencies with 9 different shades of green, the palest shade corresponds to a voting population of between 20,000 and 30,000, the darkest shade corresponds to a population of between 100,000 and 110,000:
The Western Isles (now known by it’s Gaelic name: Na h-Eileanan an Iar) has the smallest population at about 22,000 and the Isle of Wight has the largest population with just under 110,000 potential voters. I used ColorBrewer to find a nice set of colours.
Finally here’s a map of which party came second in each constituency in the 2010 General Election:
Red for Labour, blue for the Conservatives, orange for Liberal Democrats, yellow for Scottish Nationalists, pale green for Plaid Cymru, dark green for Sinn Fein, blue for Ulster Conservatives and Unionists, and there are a few independents and minor Northern Island parties which are all coloured white.


So the task is to get the spreadsheet data into a map: To get started I did a bit of memory trawling and googling, a couple of people have written about colouring in maps: this one uses shapefile format map data and the R programming language, whilst this one uses SVG format map data and Python (another programming language). It turns out the shapefile format data for constituencies is a little difficult to get – you have to fill in forms! However enterprising people on Wikipedia have made SVG format constituency maps available. SVG stands for Scaleable Vector Graphics, it’s an XML format which means it’s plaintext and there are standard means to extract data from it and manipulate it. The only real problem is that the constituency names in the spreadsheet don’t exactly match the names inside the SVG format map – I had to resort to some horrible constituency by constituency coding for a load of them. To do this I used the C# programming language, largely because Visual Studio Express C# is a very nice, free development environment which I’ve used before. To view the SVG maps inside my application I used the Webkit .NET library to provide a webbrowser control (which wraps up the rendering engine used in the Safari and Google Chrome browsers) – the native C# webbrowser control is based on Internet Explorer – which doesn’t render SVG. Output to bitmaps is a bit clumsy, Inkscape (a free SVG editor) wasn’t keen on displaying the original constituency map, so I resorted to viewing the map in Google Chrome and taking a screen shot (a terrible bodge).

A brief history of gadgets

This is a post about gadgets and my relationship with them, spurred by my purchase of the latest gadget: an HTC Desire smartphone aka “Shiny”.

I have a suspicious relationship with telephones, basically I consider talking to people at a distance a form of devilry and if you expect me to type messages in a 26 letter alphabet using a 12 key keyboard you’ve got another thing coming. Telephone use at SomeBeans Towers is simple: most nights Mrs SomeBeans rings her dad, once a week on Sunday my mum rings me, roughly once a month my dad rings me with a list of computer problems for solution and once every three weeks I ring Majestic to arrange wine delivery : “Simples”. All phone calls beyond this are a cause for chaos, consternation, confusion etc. I appreciate this makes us “anomalous” but I’m too old to care.

Mobile phone use is even more occasional, whilst skiing we sometimes arrange slope side meet-ups via mobile phone. Mrs SomeBeans was reduced to hysterics watching me typing a text message, moving my lips as I did it and eventually giving up because of cramp. When buying a house use of the devil’s mobile speech-horn is inevitable. At the start of the last house purchase I sat on the train with my mobile phone ringing from my pocket thinking: “Why isn’t that person answering their phone? It sounds an awful lot like mine (I think)”, on the previous house buying occasion I attempted to recharge my phone via it’s headphone socket, it died.

My last trip to the phone shop was rather embarrassing for all concerned, I bought a Samsung E250 slider phone. I completely missed the point of the “slider” bit and asked the phone-geek whether it was a touch screen phone: “No, sir”. I expressed a desire for a phone embedded in an SLR camera lens, at which point the phone-geek claimed that the picture from some phone was as good as an SLR, so I like to think I wasn’t the only one to come away from the experience looking like a complete idiot. The Shiny was bought over the internet to avoid embarrassment.

I did wonder about the touch screen aspects of the Shiny, I believe there a two types of people in the world: those that are happy to smear their horrible, greasy fingerprints across displays, and those that wish to kill them. I fall into the second group, so there was some risk I would have a touch-screen phone which I was psychologically incapable of touching. Fortunately this has turned out not to be the case, whilst transferring over numbers from the old Samsung I repeatedly tried to use the screen as a touch-screen, to the chagrin of all involved.

Why the HTC Desire? I do a bit of programming and the Android operating system on which it runs is relatively straightforward for me to program on (the iPhone requires you to use a Mac). Android has such magic software as Google Goggles which carries out picture based searching – it works too: it successfully recognised “Luncheon of the Boating Party” by Renoir on our wall, as well as a rather more obscure photo, and Tasty by Kelis (front or back cover). It has a radio too (you notice phone functionality is pretty much the last thing on my mind here). Chatting to friends of a more phone-friendly nature I got the impression that the Desire was the way to go. Blackberries looked a bit serious and business-like (and don’t have radios).

The predecessors of the Shiny are my Psion 5mx, a fabulous PDA with a lovely almost proper keyboard and built-in software which did not need to be supplemented, I gave it up after many years because the connector between screen an keyboard started to break regularly, I followed this with a Dell Axim x51v which I never really loved. My phone history is completely unmemorable, my first mobile phone looked like a toy bone for a dog, and made young people laugh. I’ve had two other phones since then but I scarcely used them. In other gadgets I got a bluetooth GPS to talk to my PDA so I could geotag my photos, then I got a Garmin GPS60 which did it rather more robustly.

I joined the digital camera revolution rather early with a Kodak DC210, this was a revelation to me since previously my photography experience was limited to taking a few shots on film in a weeks holiday, finally filling a film after 18 months having completely forgotten where I was or what technically I had done to achieve the effect I had. After the Kodak I got an Olympus C750UZ, in parallel I also had a Casio Exilim S20 for it’s extreme compactness, then I went SLR picking up a Canon 300D from my father-in-law on which I became completely hooked, upgrading to a 400D shortly thereafter. The reason for going to the SLR was that even with a reasonably good point-and-shoot I was finding there were photos I knew I just couldn’t take because I couldn’t control the camera. An SLR gets round this problem by having a decent set of real controls, like a focus ring on the lenses, rather than some buried menu options and octopus-friendly button pressing. The thing with an SLR is that the camera body serves the function of a gateway drug, your dealer makes the real money on the lenses*. There’s probably a business model in giving away SLR camera bodies and making your profit solely on lenses and accessories. I’ve subsequently got a second Casio Exilim S10 to fill my dinky camera needs, this camera will take photos when people smile and claims that it can recognise different people and prioritise snapping according to your preference. This seems like a new way of offending friends and family, is making your wife anything other than top smiling priority grounds for divorce?

Returning to the HTC Desire: it’s fantastic! It can replace phone, pda, GPS and dinky camera: all in a beautiful package. The interface is a joy to use, wave your finger around on it and you go skittering off to different parts as if you’re skimming across the surface of a large desk. My old PDA felt like you were peering into a tiny fixed porthole on it’s innards. Tapping buttons on the screen gives you a touch of vibration feedback. Even the internet is pretty usable, as is the 200 page PDF manual.

I have a bit of a mixed attitude to gadgets: things that are nice to use like the Psion, Canon 400D and Shiny, I really like. Things with crap interfaces, I can’t abide: programming video recorders and central heating systems I hate for their horribly kludgey 20 random keystrokes with no user feedback nastiness.

And now, if you don’t mind, I will return to fondling Shiny.

*For the camera fans I have the 10-22mm50mm f/1.8, 18-55mm kit, 28-135mm, 100mm macroEF 70-300mm lenses.

The Presidents of the Royal Society (reprise)

In my previous post I described how I downloaded and played with the data on the fellows of the Royal Society, including a plot of the presidents of the society and their terms of office. I was a bit unhappy with this plot, I felt like it could be a bit more interactive. So I’ve been busy. The plot below shows you who was in office when you wave your mouse over it, and highlights their term. On the face of it this looks like I’ve done very little, but it took me six hours of playing with Protovis (a Javascript visualization library) top achieve this! You can do lots of very neat things with Protovis, and having done it one visualisation it should be easier to do the next one.

The Royal Society and the data monkey

This year finds the Royal Society celebrating its 350th anniversary. The Royal Society is Britain’s national academy of science, one of the first of such societies to be founded in Europe. My brief investigations suggest that only the Italian Accademia dei Linceis and the German Academy of Sciences are older, and then only by a relatively small margin. The goals of the Royal Society were to report on the experiments of its members and communicate with like-minded fellows across Europe.

The Gentleman Administrator is planning some historical blogging on the Royal Society this year, starting with this post on the founding of the society and the role that Charles II played in it. On the face of it this post is about the history of the Royal Society, but in truth it says more about me as a data monkey than it does about the Royal Society. I shall explain.

The Royal Society supply a list of previous members as a pair of PDF format files, these contain each fellow of the Royal Society with their election date, their membership type and, for some, the dates of their birth and death. The PDF is formatted in a standard way suggesting to me that it could be read by a computer and the data therein analysed. I suspect there is an easier way to do this: ask the Royal Society whether they can supply the data in a form more amenable to analysis such as a spreadsheet or a database. But where’s the fun in that?

As an experimental physicist, getting data in various formats into computer programs for further analysis is what I do. This arises when I want to apply an analysis to data beyond that which the manufacturer of the appropriate instrument supplies in their own software, when I get data from custom-built equipment, when I trawl up data from other sources. I received a polite “cease and desist” message at work after I successfully worked out how to extract the text of internal reports from the reports database, they shouldn’t have said it couldn’t be done! I will save you the gory details of exactly how I’ve gone about extracting the data from the Royal Society lists, suffice to say I enjoyed it.

First up, we can identify the Presidents of the Royal Society, and their terms of office from the PDF files – this information is in the name entry for each of them. We can look this data up too). I’ve plotted these below in a manner reminiscent of the displays of the earth’s magnetic field reversal, each coloured stripe represents a presidency, and the colours alternate for clarity. The width of the stripe shows you how long each was president:

In the earlier years of the Royal Society’s history the Presidential term varied quite considerably: Sir Isaac Newton served for 24 years (1703-1727), and Sir Joseph Banks for 42 years (1778-1820). Since 1870 the period of the office seems to have been fixed at 5 years.

Next, we can work out the size of the fellowship in any particular year, basically we go through each fellow in the membership list and see when they were elected to the society and when they died: between these two years they were members. These data are plotted below:

We can see that membership in the early years of the 19th century started to rise significantly but then after 1850 it started to fall again.

This fits in with historical records, in the earlier years of the 19th century some younger fellows pointed out that the Royal Society was starting to turn into a fancy dining club and that most of the fellows had published very little, in particular Charles Babbage published Reflections on the decline of Science in England, and on some of its causes. Wheels ground slowly but finally, in 1846, a committee was set up to consider the charter of Society and how to curb its ever growing membership. I’ve not found the date on which the committee reported but subsequent to this date, admission to the society was much more strictly controlled. Election to the Royal Society is still a mark of a scientist a little above the ordinary.

The data on birth and death dates starts getting sparse after about 1950, presumably since many of the fellows are still alive and were reluctant to reveal their ages. Doing analysis like this starts to reveal the odd glitch in the data. For example,Christfried Kirch appears to have died two years before being elected. At the moment I’m not handling uncertainty in dates very well, and I learnt that the letters “fl” before a date range indicate that and individual “flourished” in that period, which is nice.

If anyone is interested in further data in this area, then please let me know in the comments below. I intend adding further data to the set (i.e. hunting down birth and death dates) and if there is an analysis you think might be useful then I’m willing to give it a try. I’ve uploaded the basic data to Google Docs.

The illustration at the top of this piece is from the frontspiece of William Sprat’s The History of the Royal Society of London, for the Improving of Natural Knowledge, published in 1722.