Book review: Fire & Steam by Christian Wolmar

I’ve long been a bit of a train enthusiast, reflected in my reading of biographies of Brunel and Stephenson and, more recently, Christian Wolmar’s The Subterranean Railway about the London Underground. This last one was my inspiration for reading Wolmar’s Fire & Steam: How the railways transformed Britain, a more general history of railways in Britain.

Fire & Steam follows the arc of railway development from the earliest signs: railed ways built to carry minerals from mine to water, with wagons hauled by horses or men.

The railways appeared at a happy confluence of partly developed technologies. In the latter half of the 18th century the turnpike road and canal systems were taking shape, but both were limited in their capabilities. However, they demonstrated the feasibility of large civil engineering projects. Steam engines were becoming commonplace but were too heavy and cumbersome for the road system, and the associated technologies (steering, braking, suspension and so forth) were not yet ready. From a financial point of view, the railways were the first organisations to benefit from limited liability partnerships of more than six partners.

Wolmar starts his main story with the Liverpool & Manchester (L&M) line, completed in 1830, arguing that the earlier Stockton & Darlington line (1825) was not the real deal: it was much in the spirit of the earlier mine railways, and its passenger traffic was a surprising success. The L&M was a twin-track line between two large urban centres, with trains pulled by steam engines. Although it was intended as a freight route, passenger transport was built in from the start.

After a period of slow growth, limited by politics and economics, the 1840s saw an explosion in the growth of the railway system. The scale of this growth was staggering: in 1845, 240 bills were put to parliament representing approximately £100 million of work, at the time some 150% of Gross National Product (GNP). Currently GNP is approximately £400 billion, and HS2 is expected to cost approximately £43 billion – so about 10% of GNP. Wolmar reports the opposition to the original London & Birmingham line in 1832; it sounds quite familiar. Opposition came from several directions: some from the owners of canals and turnpike roads, some from landowners unwilling to give up any of their land, some from opportunists.

The railways utterly changed life in Britain. At the beginning of the century travel beyond your neighbouring villages was hard, but by the time of the Great Exhibition in 1851 a third of the population was able to get to London, mostly by train. This was simply part of the excursion culture: trains had been whizzing people off to the seaside, the races and other events in great numbers almost from the beginning of the railway network. No longer were cows kept in central London in order to ensure a supply of fresh milk.

In the 19th century, financing and building railways was left to private enterprise. The government’s role was in approving new schemes, controlling fares and conditions of carriage, and largely preventing amalgamations. There was no guiding mind at work designing the rail network: companies built what they could and competed with their neighbours. This led to a network which was in some senses excessive, giving multiple routes between population centres, but that excess gave it resilience.

The construction of the core network took the remainder of the 19th century; no major routes were built in the 20th century, and this century has seen only HS1, the fast line running from London to the Channel Tunnel, completed.

The 20th century saw the decline of the railways, commencing after the First World War when the motor car and the lorry started to take over, relatively uninhibited by regulation and benefitting from state funding for infrastructure. The railways were requisitioned for war use during both world wars, and were hard used by war – suffering a great deal of wear and tear for relatively little compensation. War also seems to have given governments a taste for control: after the First World War the government forced a rationalisation of the many railway companies into the “Big Four”, and after the Second World War the railway was fully nationalised. For much of the next 25 years it suffered considerable decline, a combination of a lack of investment and a reluctance to move away from steam power to much cheaper diesel and electric propulsion, culminating in the Beeching “rationalisation” of the network in the 1960s.

The railways picked up during the latter half of the seventies with electrification, new high-speed trains and the InterCity branding. Wolmar finishes with the rail privatisation of the late 1990s, of which he has a rather negative view.

Fire & Steam feels a more rounded book than The Subterranean Railway, which to my mind became a somewhat claustrophobic litany of lines and stations in places. Fire & Steam focuses on the bigger picture, and there is a grander sweep to it.

Book Review: The Idea Factory by Jon Gertner


I’ve read about technology and innovation in post-war Britain, in the form of Empire of the Clouds, A Computer called Leo and Backroom Boys. Now I turn to American technology, in the form of The Idea Factory: Bell Labs and the Great Age of American Innovation by Jon Gertner.

Bell Laboratories was the research and development arm of the American Telephone and Telegraph Company (AT&T), which held a monopoly position in the US telephone market for over half a century. Bell Labs still exists today as a subsidiary of Alcatel-Lucent, but it is much reduced from its former glory.

What did they invent at Bell Laboratories?

An embarrassment of things: the transistor, the charge-coupled device, photovoltaic solar cells, the UNIX operating system, and the C and C++ programming languages. They also discovered the cosmic microwave background. They were the main contractor for some of the earliest passive and active communications satellites and the earliest cell phone systems. Claude Shannon worked at Bell Laboratories, where he published his paper on information theory; in computing, Shannon is pretty much the equal of Turing in terms of influence on the field. If statistics is more your thing, then John Tukey is a Bell Labs alumnus.

This is a seriously impressive track record: Bell Laboratories boasts 7 Nobel prizes for work done at the laboratory. To get an idea of the scale of this achievement, the equivalent figure for Cambridge University is 17, Oxford University 8 and MIT 18. IBM has 5. See for yourself here.

I was semi-aware of all of these inventions but hadn’t really absorbed that they were all from Bell Labs.

For something over 50 years Bell Laboratories benefitted from a state-mandated monopoly, which only came to an end in the mid-eighties. AT&T had argued in the 1920s that it needed a monopoly to build the infrastructure required to connect a (large) nation. In the early days that infrastructure was a system of wires and poles spanning the country, then cables crossing the ocean, then automatic telephone exchanges, first valve-based then solid-state. The company developed a habit of in-depth research, in the early days into improving the longevity of telegraph poles and the leather belts of line engineers, moving on to solid-state physics after the war. In exchange for its monopoly it was restricted in the areas of business it could enter and obliged to license its patents on generous terms.

It’s interesting to compare the development of the vacuum tube as an electronic device with that of the transistor. In both cases the early versions were temperamental, expensive and bulky, but through a process of development over many years they became commodity devices. Bell pushed ahead with the development of the solid-state transistor with its optimisation of vacuum tubes as a guide to what was possible.

During the Second World War, Bell Laboratories and its staff were heavily involved in the war effort, in particular the development of radar, which to my surprise was a programme 50% larger than the Manhattan Project in cost terms. Bell Laboratories’ most expensive project was the first electronic switching station, first deployed in 1964. This is a company that strung cables across continents and oceans and launched satellites, and the most expensive thing it ever did was build a blockhouse full of electronics!

Ultimately the AT&T monopoly gave it huge and assured revenue for a long period, relatively free of government interference. The money flowed from captive telephone customers, not the government, and the only requirement from AT&T’s point of view was to ensure government did not break its monopoly. In the UK, by contrast, the fledgling computer industry suffered from the lack of a large “home” market, whilst the aircraft industry suffered from having an unreliable main market in the form of the UK government.

Despite my review, which I see makes almost no mention of the people, The Idea Factory is written around people, both the managers and the scientists on the ground. Bell Labs was successful because of the quality of the people it attracted; it sought them out through a personal network spanning the universities of the US. It kept them because they could work in a stimulating and well-funded environment which tolerated sometimes odd behaviour.

It does bring to mind the central research laboratories of some of the UK’s major companies with which I am familiar, including ICI, Unilever and Courtaulds. Of these only Unilever’s survives, and in much reduced form.

The Idea Factory is well-written and engaging, telling an interesting story. It lacks context on what was going on outside Bell Laboratories, but then this is not an area it claims to cover.

Book review: Learning SPARQL by Bob DuCharme


This review was first published at ScraperWiki.

The NewsReader project on which we are working at ScraperWiki uses semantic web technology and natural language processing to derive meaning from the news. We are building a simple API to give access to the NewsReader datastore, whose native interface is SPARQL. SPARQL is a SQL-like query language used to access data stored in the Resource Description Framework (RDF) format.

I reached Bob DuCharme’s book, Learning SPARQL, through an idle tweet mentioning SPARQL, to which the book’s Twitter account replied. The book covers the fundamentals of the semantic web and linked data, the RDF standard, the SPARQL query language, performance, and building applications on SPARQL. It also talks about ontologies and inferencing, which are built on top of RDF.

As someone with a slight background in SQL and table-based databases, my previous forays into the semantic web have been fraught, since I typically start by asking what the schema for an RDF store is. The answer to this question is “That’s the wrong question”. The triplestore is the basis of all RDF applications; as the name implies, each row contains a triple (i.e. three columns), traditionally labelled subject, predicate and object. I found it easier to think in terms of resource, property name and property value. To give a concrete example: “David Beckham” is a resource, his height is the name of one of his properties and, according to dbpedia, the value of this property is 1.8288 (metres, we must assume). The resource and property names must be provided in the form of URIs (uniform resource identifiers); the property value can be a URI or an ordinarily typed value such as a string or an integer.
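Written down in the Turtle notation, whose triple syntax SPARQL borrows, the Beckham triple looks something like the sketch below. The dbr: and dbo: prefixes and the exact property name are my assumptions for illustration, not taken from the book:

    # subject (resource)   predicate (property name)   object (property value)
    @prefix dbr: <http://dbpedia.org/resource/> .
    @prefix dbo: <http://dbpedia.org/ontology/> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    dbr:David_Beckham  dbo:height  "1.8288"^^xsd:double .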

The triples describe a network of nodes (the resources and property values), with the property names being the links between them; with this infrastructure any network can be described by a set of triples. SPARQL is a query language that superficially looks much like SQL. It can extract arbitrary sets of properties from the network using the SELECT command, get a valid sub-network described by a set of triples using the CONSTRUCT command, and answer a yes/no question using the ASK command. It can also tell you “everything” it knows about a particular URI using the DESCRIBE command, where “everything” is subject to the whim of the implementor. It supports a bunch of other commands which will feel familiar to SQListas, such as LIMIT, OFFSET, FROM, WHERE, UNION, ORDER BY, GROUP BY and AS. In addition there are BIND, which allows the transformation of variables by functions, and VALUES, which allows you to make little data structures for use within queries. PREFIX provides shortcuts to domains of URIs; for example http://dbpedia.org/resource/David_Beckham can be written dbpedia:David_Beckham, where dbpedia: is the prefix. SERVICE allows you to make queries across the internet to other SPARQL providers, and OPTIONAL allows the addition of a variable which is not always present.
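To give a flavour of how these pieces fit together, here is a minimal sketch of a SELECT query, again assuming dbpedia’s dbo:height property rather than quoting the book; it pulls back the ten tallest resources in the store, tallest first:

    PREFIX dbo: <http://dbpedia.org/ontology/>

    # Find resources with a height property, tallest first
    SELECT ?resource ?height
    WHERE {
      ?resource dbo:height ?height .
    }
    ORDER BY DESC(?height)
    LIMIT 10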

The core of a SPARQL query is a list of triples which act as selectors for the triples required, and FILTERs which further filter the results by carrying out calculations on the individual members of the triple. Each selector triple is terminated with “ .” or a “ ;”, the latter indicating that the next selector shares its first element (the subject) with the current one. I mention this because Googling for the meaning of punctuation is rarely successful.
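For instance, here is a sketch combining the “ ;” shorthand with a FILTER; the dbo:team property name is once more an assumption on my part:

    PREFIX dbo: <http://dbpedia.org/ontology/>

    # The " ;" reuses the subject: both patterns apply to ?player
    SELECT ?player ?height ?team
    WHERE {
      ?player dbo:height ?height ;
              dbo:team   ?team .
      FILTER(?height > 1.9)   # keep only heights over 1.9 (metres, we assume)
    }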

Whilst reading this book I’ve moved from finding SPARQL queries by search, to writing queries by slight modification of existing ones, to celebrating writing my own queries, to writing successful queries no longer being a cause for celebration!

There are some features in SPARQL that I haven’t yet used in anger: “paths”, which expand queries so that a selector matches not just a single link between nodes but longer chains of links, and inferencing. Inferencing allows the creation of virtual triples. For example, if we know that Brian is the patient of a doctor called Jane, and we have an inferencing engine which also contains the information that “patient” is the inverse of “doctor”, then we don’t need to specify separately that Jane has a patient called Brian.
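A path query looks something like the sketch below, under an invented ex: vocabulary; the + asks for chains of one or more ex:parent links rather than a single link:

    PREFIX ex: <http://example.org/>

    # Ancestors of Brian at any depth: parent, grandparent and so on
    SELECT ?ancestor
    WHERE {
      ex:Brian ex:parent+ ?ancestor .
    }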

The book ends with a cookbook of queries for exploring a new data source, which is useful but needs to be used with a little caution when querying large databases. Most of the book is oriented around running a SPARQL client against files stored locally. I skipped this step, mainly using YASGUI to query the NewsReader data and the SNORQL interface to dbpedia.

Overall summary: a readable introduction to the semantic web and the SPARQL query language.

If you want to see the fruits of my reading then there are still places available on the NewsReader Hack Day in London on 10th June.

Sign up here!

Book review: The Undercover Economist Strikes Back by Tim Harford

What have you been reading?
Tim Harford’s latest book, The Undercover Economist Strikes Back. It’s about macroeconomics: a sort of blagger’s guide.

Who’s Tim Harford?

Tim Harford is a writer and broadcaster. I’ve also read his books The Undercover Economist, about microeconomics, and Adapt, about trial and error in business, government and aid. When I get the time I listen to his radio programme More or Less, about statistics and numbers, and also read his newspaper column.

Hey, what’s going on here? You keep writing down the questions I’m asking!
Yes, this is how Strikes Back is written. At the beginning I found it a bit irritating, but as you can see I’ve taken to it. It recalls the Socratic dialogue method and Galileo’s book, Dialogue Concerning the Two Chief World Systems. The advantage is that it structures the text very nicely, and it is likely rather SEO-friendly.

OK, I’ll play along – tell me more about the book

The book starts by introducing Bill Phillips and his MONIAC machine, which simulated the economy, in macroeconomic terms, using water, pipes, tanks and valves.

That’s a bizarre idea, why didn’t he use a computer?

Phillips was working in the period immediately after the Second World War, when computers weren’t that common. Also, it turns out that solving certain types of equations is more easily done using an analogue computer – such as MONIAC.

Back up a bit, what’s macroeconomics?

Macroeconomics is the study of the large-scale features of the economy, such as the growth in Gross Domestic Product (GDP), unemployment, inflation and so forth. Contrast this with microeconomics, which is about how much you pay for your cup of tea (and other things).

What’s the point of this, didn’t someone describe economics as the “dismal science”?

Yes, they did, but this treats economics a little unfairly. One of Harford’s pleas in the book is to accept the humanity of economists. They aren’t just interested in numbers, they are interested in making numbers work for people. In particular, unemployment is recognised as a great ill which should be minimised, and the argument is over how this should be achieved rather than whether it should be achieved.

Tell me something about macroeconomics

There is a great divide in economics between the Keynesians and the classical economists, and the crux of it is how they treat a recession. The former believe that the economy needs stimulus in times of recession, in the form of increased “printing of money”. The latter believe that the economy is a well-oiled machine that is derailed by external shocks; in happy times there are other external shocks that pass off relatively benignly. The classicists are less keen on stimulus, believing that the economy will sort itself out naturally as it responds to the external shocks. These approaches can be captured in toy economies.

Tim Harford cites two examples: a babysitting collective in Washington DC and the economy of a prisoner of war camp. The former is a case of a malfunctioning economy fixed by Keynesian means. The collective worked by parents agreeing to babysit in exchange for vouchers, each representing a period of babysitting, but the number of vouchers available was limited, so parents were reluctant to spend them on a night out because they were scarce. In the first instance this was resolved by printing more babysitting vouchers: a Keynesian stimulus.

The prisoner of war camp suffered a different problem: towards the end of the war the price of goods went up as the supply of Red Cross parcels dried up. Here there was nothing to be done; the de facto unit of currency was the cigarette, the supply of cigarettes was limited, and nothing could be done to increase that supply.

It’s all about money, isn’t it?

Yes. Harford highlights that money fulfils three different functions. It’s a medium of exchange, saving us from bartering. It’s a store of value: we can keep money under the bed for the future, something we couldn’t do if our valuables were perishable goods. And it is a “unit of account”, a way of summing up your net worth over a range of assets.

Is The Undercover Economist Strikes Back worth reading?

I’d say a definite “yes”. We’ve all been watching macroeconomics playing out in lively form over the last few years as the recession hit and now recedes. Harford gives a clear, intelligent guide to the issues at hand and some of the background that is left unstated by politicians and in the news. He points out that our political habits don’t really match our economic needs: ideally we would have abstemious, right-wing governments in the boom years and somewhat more spendthrift left-wing ones during recessions. He ends with a call for more experimentation in macroeconomics, harking back to his book Adapt, and also highlights some shortcomings of macroeconomics as studied today: it does not consider behavioural economics, complexity theory or even banks.

There’s much more in the book than I’ve summarised here.

The London Underground: Should I walk it?


This post was first published at ScraperWiki.

With a second tube strike scheduled for Tuesday, I thought I should provide a useful little tool to help travellers cope! It is not obvious from the tube map, but London Underground stations can be surprisingly close together, well within walking distance.

Using this tool, you can select a tube station and the map will show you those stations which are within a mile and a half of it. 1.5 miles is my definition of a reasonable walking distance. If you don’t like it you can change it!

The tool is built using Tableau. The tricky part was allowing the selection of one station and measuring distances to all the others. Fortunately it’s a problem which has been solved, and documented, by Jonathan Drummey over on the Drawing with Numbers blog.

To demonstrate, I used Euston as the origin station in the image below. I’ve been working at the Government Digital Service (GDS), sited opposite Holborn underground station, for the last couple of months; Euston is my mainline arrival station and I walk down the road to Holborn. Euston is coloured red in the map, and stations within a mile and a half of it are coloured orange. The label for Holborn does not appear by default, but it’s the station between Chancery Lane and Tottenham Court Road. In the bottom right is a table which lists the walking distance to each station; Holborn appears just off the bottom, at a 17-minute walk – which is about right.

Should I walk it

The map can be controlled by moving to the top left and using the controls that should appear there; shift+left mouse button allows panning of the map view. A little glitch which I haven’t sorted out is that when you change the origin station the table of stations does not re-sort automatically; the user must click around the distance label to re-sort it. Any advice on how to make this happen automatically would be most welcome.

Distances and timings are approximate. I have the latitude and longitude of all the stations from my earlier London Underground project, which you can see here. I calculate distances by taking the Euclidean distance between stations in angular units and multiplying by a factor chosen to give distances approximately the same as those in Google Maps; so it isn’t a true “as the crow flies” distance, but it is proportional to one. The walking times are calculated by assuming a walking speed of 3 miles an hour. If you put your cursor over a station you’ll see its name with the walking time and distance from your origin station.
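Written as formulas, my reconstruction of the sums looks like this, with phi and lambda the latitude and longitude in angular units and k the factor calibrated against Google Maps:

    d = k\sqrt{(\phi_1 - \phi_2)^2 + (\lambda_1 - \lambda_2)^2},
    \qquad t_{\text{minutes}} = \frac{60\,d}{3} = 20\,d

At 3 miles an hour each mile costs 20 minutes, so the 17-minute walk to Holborn quoted above corresponds to about 0.85 miles.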

A more sophisticated approach would be to extract walking routes from Google Maps and use those to calculate distances and times. This would be rather more complicated to do and most likely not worth the effort, except if you are going south of the river.

Mine is not the only effort in this area; you can see a static map of walking distances here.