Book review: You look like a thing and I love you by Janelle Shane

You look like a thing and I love you by Janelle Shane is a non-technical overview of machine learning. This isn’t to say it doesn’t go into some depth, or that if you are an experienced practitioner in machine learning you won’t learn something. The book is subtitled “How Artificial Intelligence Works and Why It’s Making the World a Weirder Place” but Shane makes clear at the outset that it is all about machine learning – Artificial Intelligence is essentially the non-specialist term for the field.

Machine learning is based around training an algorithm with a set of data which represents the task at hand. It might be a list of names (of kittens, for example) where essentially we are telling the algorithm “all these things here are examples of what we want”. Or it might be a set of images where we indicate the presence of dogs, cats or whatever we are interested in. Or, to use one of Shane’s examples, it might be sandwich recipes labelled as “tasty” or “not so tasty”. After training, the algorithm will be able to generate names consistent with the training set, label images as containing cats or dogs, or tell you whether a sandwich is potentially tasty.
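This train-then-predict loop can be sketched in miniature. The example below is my own toy illustration, not from the book: the “recipes” and the crude word-counting “model” are invented, and a real system would use a proper learning algorithm, but the shape – labelled examples in, predictions on new inputs out – is the same.

```python
from collections import Counter

# Invented toy training data: sandwich recipes labelled by tastiness.
training_data = [
    ("ham cheese mustard", "tasty"),
    ("peanut butter jam", "tasty"),
    ("cheese tomato basil", "tasty"),
    ("marmite gravel ice", "not so tasty"),
    ("chocolate sardine mud", "not so tasty"),
    ("jam pickle cement", "not so tasty"),
]

# "Training": count how often each word appears under each label.
word_counts = {"tasty": Counter(), "not so tasty": Counter()}
for recipe, label in training_data:
    word_counts[label].update(recipe.split())

def predict(recipe):
    """Label a new recipe by which class its words were seen under more often."""
    scores = {
        label: sum(counts[word] for word in recipe.split())
        for label, counts in word_counts.items()
    }
    return max(scores, key=scores.get)

print(predict("ham cheese tomato"))   # -> tasty
print(predict("sardine cement mud"))  # -> not so tasty
```

The point of the sketch is that the system never sees a definition of “tasty”; it only ever sees labelled examples, which is exactly why the quality of the training data matters so much.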

The book has grown out of Shane’s blog AI Weirdness where she began posting about her experiences of training recurrent neural networks (a machine learning algorithm) at the beginning of 2016. This started with her attempts to generate recipes. The results are, at times, hysterically funny. Following attempts at recipes she went on to the naming of things, using neural networks to generate the names of kittens, guinea pigs, craft beers, Star Wars planet names and to generate knitting patterns. More recently she has been looking at image labelling using machine learning, and at image generation using generative adversarial networks.

The “happy path” of machine learning is interrupted by a wide range of bumps in the road which Shane identifies; these include:

  • Messy training data – the recipe data, at one point, had ISBN numbers mixed in which led to the neural network erroneously trying to include ISBN-like numbers in recipes;
  • Biased training data – someone tried to analyse the sentiment of restaurant reviews but found that Mexican restaurants were penalised because the Word2vec embeddings (Word2vec is a popular word-embedding library which they used in their system) associated “Mexican” with “illegal”;
  • Not detecting the thing you thought it was detecting – Shane uses giraffes as an example: image labelling systems have a tendency to see giraffes where they don’t exist. This is because if you train a system to recognise animals then in all likelihood you will not include pictures with no animals. Therefore, shown an image of some fields and trees with no animals in it, a neural network will likely “see” an animal because, to its knowledge, animals are always found in such scenes. And neural networks just like giraffes;
  • Inappropriate reward functions – you might think you have given your machine learning system an appropriate “reward function”, aka a measure of success, but is it really the right one? For example the COMPAS system, which assesses whether prisoners in the US should be recommended for parole, was trained using a reward based on re-arrest, not re-offence. Therefore it tended to recommend against parole for black prisoners because they were more likely to be arrested (not because they were more likely to re-offend);
  • “Hacking the Matrix” – in some instances you might train your system in a simulation of the real world, for example if you want to train a robot to walk then rather than trying to build real robots you would build virtual robots and try them out in a simulated environment. The problem comes when your virtual robot works out how to cheat in the simulated environment, for example by exploiting limitations of collision detection to generate energy;
  • Problems unsuited to machine learning – some tasks are not amenable to machine learning solutions. For example, in the recipe generation problem the “memory” of the neural network limits the recipes generated because by the time a neural network has reached the 10th ingredient in a list it has effectively forgotten the first ingredient. Furthermore, once trained on one task, a neural network will “catastrophically forget” how to do that task if it is subsequently trained to do another task – machine learning systems are not generalists.
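The biased-training-data bullet above comes down to vector geometry: word embeddings place words that occur in similar contexts close together, so unwanted associations in the training text become measurable similarities that downstream systems inherit. A sketch with tiny hand-written three-dimensional vectors (invented purely for illustration – real Word2vec vectors have hundreds of dimensions and are learned from large corpora):

```python
import math

# Invented toy embeddings for illustration only; real vectors are learned,
# and the bias arises from the text they are trained on.
vectors = {
    "mexican": [0.9, 0.3, 0.1],
    "illegal": [0.8, 0.4, 0.2],
    "italian": [0.1, 0.9, 0.3],
}

def cosine_similarity(a, b):
    """Standard cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# In a biased embedding, "mexican" sits much closer to "illegal" than
# "italian" does, so a sentiment score built on these vectors is skewed.
print(cosine_similarity(vectors["mexican"], vectors["illegal"]))
print(cosine_similarity(vectors["italian"], vectors["illegal"]))
```

Nothing in the pipeline is deliberately prejudiced; the geometry simply preserves whatever associations were present in the training text.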

My favourite of these is “Hacking the matrix” where algorithms discover flaws in the simulations in which they run, or flaws in their reward system, and exploit them for gain. This blog post on AI Weirdness provides some examples, and links to original research.

Some of this is quite concerning, and the examples Shane finds are the obvious ones – the flight simulator which found that it could meet the goal of a “minimum force” landing by making the landing force enormous, overflowing the variable that stored the value and making it zero. This is catastrophic from the pilot’s point of view, but it would have been a very obvious problem, identifiable without systematic testing. What if the problem is not so obvious but equally catastrophic when it occurs?
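The overflow failure is easy to reproduce. Python’s own integers don’t overflow, but a fixed-width variable does wrap around; a sketch of an (invented) 16-bit unsigned force reading, not the actual simulator’s representation:

```python
BITS = 16
MODULUS = 2 ** BITS  # a 16-bit unsigned variable holds 0..65535

def store_force(force):
    """Simulate writing a value into a fixed-width unsigned variable:
    anything past the maximum wraps around to the start of the range."""
    return force % MODULUS

print(store_force(3))       # -> 3: a gentle landing stores a small force
print(store_force(65536))   # -> 0: an enormous landing wraps to zero
print(store_force(65540))   # -> 4: ...or to some other tiny value
```

An optimiser rewarded for a small stored value has no reason to prefer the genuinely gentle landing over the catastrophic one – both score well.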

A comment that struck me towards the end of the book was that humans “fake intelligence” with prejudices and stereotypes; it isn’t just machines that use shortcuts when they can.

The book finishes with how Shane sees the future of artificial intelligence, essentially in a recognition that these systems have strengths and weaknesses and that the way forward is to combine artificial and human intelligence.

Definitely worth a read!

Review of the year: 2019

My blogging this year has been entirely book reviews, you can see a list on the index page. I still find blogging a useful discipline to go with my non-fiction reading but my readership is so low there seems little point in writing other things for a wider audience.

A number of the books I reviewed related to music, I feel this is cheating a bit on the book reviewing front since these are typically teaching books which are more guided exercises than prose. I also read Ian S. Port’s book The Birth of Loud which is about the origins of the electric guitar from the point of view of Leo Fender and Les Paul. The music books are reflected by an increased collection of musical instruments in the household: we started the year with my electric guitar and electric and acoustic guitars for Thomas. We have since gained a bass guitar, for Sharon; a ukulele, for travel; an electric drum kit; an acoustic guitar for me (it’s very pretty – a Fender Newporter, pictured below); and for Christmas a keyboard for the family.


To learn to drum I got the Melodics app, which plugs into the drum kit and gives direct feedback as to whether I am hitting the right thing at the right time. I found this really useful, but discovered as a result that my guitar playing involved a lot of pausing for thought between passages, so I’ve started using Yousician for guitar, which has similar feedback functionality. The musical year finished with us getting a family present of a keyboard, and now I discover the theoretical side of music is so much easier on a keyboard – on a guitar the notes wrap across the fretboard so you can access a couple of octaves with one hand in one place. This is convenient but it means note positions are not as obvious as on a keyboard, where everything is laid out in a nice straight line.

Beyond music my reading has been quite eclectic this year. I started with Mapping Society on the use of maps to communicate data about society, moved on to a biography of Hedy Lamarr, the Hollywood star who patented the frequency hopping method for secure communications. I went through a spell of reading more work-relevant books – a couple of books on JavaScript, a book on marketing, one on international culture and how it impacts business interactions, and a book on rapid prototyping in business. I read several fairly academic history books: Higher and Colder on extreme physiology experiments on Everest and at the poles, Gods and Robots about representations of robots and similar in Greek and other mythology, and Empires of Knowledge about some of the correspondents in the Republic of Letters. I also read the sumptuously illustrated catalogue of the Matthew Boulton exhibition. I read a couple of more data science oriented books (Designing Data Intensive Applications and Deep learning with Python). Angela Saini’s book Superior, on race science, was a highlight. Returning to my roots I also read Lost in Math by Sabine Hossenfelder, which is about how theoretical physics has lost itself in a search for mathematical beauty.

This year’s holiday was in Benllech once again, only a short drive for us from Chester. Thomas has been learning to swim, and Sharon and Thomas rather enthusiastically flung themselves into the chilly Irish Sea. To be fair the weather wasn’t too bad; we had a couple of really warm days – I got sunburnt feet – and only a couple of heavy downpours, in parts of the day when it didn’t matter.


Politics has been largely miserable over the last year. Brexit failed to happen several times through the year, but only after it felt we had been brought to the brink of crashing out of the EU with no deal, which I found stressful. Now we have a Tory government with a significant majority, led by someone unsuited to run a whelk stall, which will take us out of the EU – probably as a cliff edge towards the end of the year. I suspect the General Election was won in part because voters are fed up with Brexit paralysis; even those that wish to remain were probably not greatly enthused by the prospect of a second referendum, which had every sign of being as tightly contested as the first.

On the positive side, my own team, the Liberal Democrats, has seen a general rise in its fortunes. We have been consistently taking council seats from Labour and the Tories, with gains considerably above expectations in the May elections. In the unexpected EU elections in the summer the Liberal Democrats polled second with 19.6% of the vote, with only the Brexit Party ahead of them; I tend to see EU elections as indicative of general support in the absence of the First Past the Post system. In parliament we saw mixed fortunes: we increased the number of MPs by defection and by-election to 21, then dropped back to 11 seats in the December General Election, losing our leader Jo Swinson in the process. This is despite growing our vote share from 7.4% to 11.6% – you’ve got to love the First Past the Post system!

Ever keen to be forced to do new things by apps, I’ve started learning Arabic on Duolingo; I have to admit this is largely due to finding Arabic script attractive. I suspect I cheat quite a lot by using minimal pattern matching, rather than fully understanding the language, to get some answers right.

Book review: Higher and Colder by Vanessa Heggie

Higher and Colder by Vanessa Heggie is a history of extreme physiological research in the later nineteenth and twentieth centuries. It is on the academic end of the spectrum I read – it is not a tale of individual heroics – although I found it quite gripping.

The action takes place largely in extreme environments such as very high mountains, and the polar regions. There are some references to high temperature environments but these are an aside. One of the themes of the book is the tension between laboratory physiological experiments, such as the barometric chamber work of Bert in 1874, and experiments and experiences in the field. It turns out it is hard to draw useful conclusions on survival in extreme environments from laboratory studies. Much of this work was done to support exploratory expeditions, mountaineering, military applications and more recently athletic achievement. The question is never “Can a human operate at an altitude of over 8000 metres?”, or the like, it is “Can Everest be scaled by a human with or without supplementary oxygen?”. So factors other than the “bare” physiology are also important.

Some of the discussion towards the end of the book regarding death and morbidity in expeditions to extreme environments brought to mind the long distance marine expeditions of the 18th century. It’s not discussed in the book but it seems like these extreme physiology field programmes go beyond simple field research; they are often parts of heroic expeditions to the ends of the earth.

The book opens with a discussion of mountain sickness and whether its cause is purely down to low oxygen or whether other factors are important. One section is titled “Only rotters would use oxygen?” – the idea being that climbing Everest was held back by a reluctance to use supplementary oxygen. In fact oxygen apparatus only really became practical for climbers in the 1950s, so the reluctance is more to do with technology than honour. The climbing problem is different from a military aircraft, where weight is relatively unimportant. Fundamentally there is no short term acclimatisation to altitude. Himalayan populations show some long term adaptations, but Andean populations are quite different in terms of evolutionary-scale adaptation – populations in the Himalayas have been there much longer. Mentioned towards the end of the book is the fact that human foetuses spend their time in a low oxygen environment, so these physiological experiments have applications well beneath the mountains and the skies.

The selection of participants into the field, both as experimenter and subject, was based on previous experience, gender, class and connections. This means they were almost entirely white and male, particularly those going to Antarctica, where the US military refused to transport women for a considerable spell. The extreme physiology community is quite close-knit and difficult for outsiders to penetrate; there is a degree of nostalgia and heritage to their discussions of themselves. Although women played a part in missions dating back into the earlier 20th century their presence is hidden – publication culture would typically not name those considered to be assistants. The first woman to overwinter at the British Antarctic base did so in 1996.

Native people are similarly elided from discussion although they were parts of a number of experiments and many missions. An interesting vignette: the conventional ergometer which measures human power output was found not to be well-suited to Sherpas since it was based on a bicycle, utterly unfamiliar to a population living in the high Himalayas where bicycles are uncommon. Also the oxygen masks used by Western climbers needed to be adapted to suit the differing face shapes of Sherpas. Heggie introduces the idea of thinking of native technology as part of bioprospecting. I was intrigued to learn that “igloo” originally meant something very specific, one of a class of structures from compacted snow, but it was corrupted to mean any building made of compacted snow. Pemmican is another technology drawn from the natives of Arctic lands. These technologies are usually adapted, and there is a degree to which they are not adopted until they have been “scientifically proven” by Western scientists.

It turns out that participants in polar expeditions don’t experience much cold – they are too well equipped and often expending a lot of energy. Cold is different from altitude: altitude is relatively inescapable, whilst cold can be mitigated by technologies dating back centuries.

I was broadly familiar with some of the material in this book from reading about attempts on Everest and Antarctic and Arctic expeditions, but this work is much more focussed on the experiments than the men. I am contaminated with the knowledge that Heggie has worked with Simon Schaffer, and felt that Higher and Colder has something of the style of Leviathan and the Air-Pump, particularly the language around objects and artefacts, and their movement being about communication.

I found this a gentle introduction to the practice of historiography, it is related to the tales of adventure and individual heroism around scaling Everest and reaching the South Pole but quite different in its approach.

Book review: Deep learning with Python by François Chollet

Deep learning with Python by François Chollet is the third book I have reviewed on deep learning neural networks. Despite these reviews only spanning a couple of years it feels like the area is moving on rapidly. The biggest innovations I see from this book are in the use of pre-trained networks, and the dominance of the Keras/Tensorflow/Python ecosystem in doing deep learning.

Deep learning is a type of artificial intelligence based on many-layered neural networks. This is where the “deep” comes in – it refers to the number of layers in the networks. The area has boomed in the last few years with the availability of massive datasets on which to train, improvements in numerical algorithms for training neural networks and the use of GPUs to further accelerate deep learning. Neural networks have been used in production since the 1990s – by the US Postal Service for reading handwritten ZIP codes.

Chollet works on artificial intelligence at Google and is the author of the Keras deep learning library. Google is also the home of Tensorflow, a lower level library which is often used as a backend to Keras. This is a roundabout way of saying we should expect Chollet to be expert and authoritative in this area.

The book starts with some nice background to machine learning. I liked Chollet’s description of machine learning (deep learning included) being about finding a representation of data which makes the problem at hand trivial to solve. Imagine taking two pieces of coloured paper, placing them one on top of the other and then crumpling them into a ball. Machine learning is the process of un-crumpling the ball.

As an introduction to the field, Deep learning with Python runs through some examples of deep learning applied to various classes of problem, including movie review sentiment analysis, classifying newswire articles and predicting house prices, before going back to discuss some issues these problems raise. A recurring theme is the problem of overfitting. Deep learning models can learn their training data really well – essentially they memorise the answers to questions – and so when they are faced with questions they have not seen before they perform badly. Overfitting can be addressed with a range of techniques.
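Overfitting in its most extreme form is just memorisation. A toy illustration of my own (not from the book): a “model” that memorises its training data outright scores perfectly on what it has seen and is useless on anything new, whereas one that captures the underlying rule generalises.

```python
# Toy training data: pairs of inputs mapped to their sum (the hidden rule).
training_data = {(1, 2): 3, (2, 3): 5, (10, 20): 30}

def memorising_model(x, y):
    """Overfitted extreme: a lookup table of the training data."""
    return training_data.get((x, y))  # None for anything unseen

def generalising_model(x, y):
    """A model that has captured the underlying rule instead."""
    return x + y

print(memorising_model(1, 2))    # -> 3: perfect on training data
print(memorising_model(4, 4))    # -> None: fails on an unseen input
print(generalising_model(4, 4))  # -> 8: generalises to unseen inputs
```

Real overfitting is less absolute than a lookup table, but the failure mode is the same: excellent scores on data the model has seen, poor ones on data it hasn’t.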

One twist I had not seen before is the division of the labelled data used in machine learning into three, not two, parts: training, validation and test. The use of training and validation parts is commonplace: the training set is used for training, and the validation set is used to test the quality of a model after training. The third component which Chollet introduces is the “test” set; this is like the validation set but it is only used when your model is about to go into production, to see how it will perform in real life. The problem it addresses is that machine learning involves a large number of hyperparameters (things like the type of machine learning model, the number of layers in a deep network, the form of the activation function) which are not changed during training but are changed by the data scientist, quite possibly automatically and systematically. The hyperparameters can be overfitted to the validation set, hence a model can perform well on validation data (that it has seen before) but not on test data, which represents real life.
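A minimal sketch of that three-way split, in pure Python (real code would use a library utility, and the 70/15/15 fractions here are my own arbitrary choice – the idea is just slicing a shuffled dataset):

```python
import random

def train_val_test_split(data, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle the data, then slice off test and validation portions."""
    data = list(data)
    random.Random(seed).shuffle(data)
    n = len(data)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = data[:n_test]               # held back until just before production
    val = data[n_test:n_test + n_val]  # used to compare models/hyperparameters
    train = data[n_test + n_val:]      # used to fit the model itself
    return train, val, test

train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # -> 70 15 15
```

The discipline is entirely in how the test slice is used: because hyperparameter tuning never touches it, its score is an honest estimate of real-life performance.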

A second round of examples looks at deep learning in computer vision, using convolutional neural networks (convnets). These are related to the classic computer vision processes of convolution and image morphology. Also introduced here are recurrent neural networks (RNNs) for applications in processing sequences, such as time series data and language. RNNs have memory across the steps of a sequence, which dense and convolutional networks don’t; this makes them effective for problems where the order of the data is important.
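The “classic” convolution a convnet layer is built on is just a small kernel slid across the input. A pure-Python one-dimensional sketch (my own toy example; in a convnet the kernel values are learned during training rather than hand-written, and strictly deep learning libraries compute this unflipped, cross-correlation form):

```python
def convolve1d(signal, kernel):
    """Valid-mode 1D convolution: slide the kernel along the signal,
    taking a weighted sum at each position."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

# A step in the signal; the [-1, 1] difference kernel lights up at the edge,
# which is exactly how an edge-detecting filter works in image processing.
signal = [0, 0, 0, 1, 1, 1]
print(convolve1d(signal, [-1, 1]))  # -> [0, 0, 1, 0, 0]
```

A convnet stacks many such learned filters in two dimensions, so early layers respond to edges and later layers to combinations of them.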

The final round of examples is in generative deep learning including generating text, the DeepDream system, image style transfer and generating images of faces.

The book ends with some thoughts of the future. Chollet comments that he doesn’t like to use the term neural networks, which implies the ability to reason and abstract in the way that humans do. One of the limitations of deep learning is that, as currently used, it does not have the ability to abstract or generate programmatic descriptions of solutions. You would not use deep learning to launch a rocket – we have detailed knowledge of the physics of rockets, gravity and the atmosphere which makes a physics-based approach far better.

As I read I realised that keeping up with what was new in machine learning was a critical and challenging task; Chollet answers this question exactly, suggesting three approaches to keeping abreast of new developments:

  1. Kaggle – the machine learning competition site;
  2. ArXiv – the preprint server, in particular http://www.arxiv-sanity.com/ which is a curated view of the machine learning part of arXiv;
  3. Keras – keeping up with developments in the Keras ecosystem.

If you’re going to read one book on deep learning this should probably be the one: it is readable, covers the field pretty well, and Chollet is an authority in this area who, in my view, has particularly acute insight into deep learning.

Book review: Superior by Angela Saini

Next I turn to Superior: the return of race science by Angela Saini, having recently read Inferior by the same author. Inferior discusses how men of science have been obsessed with finding differences in all manner of human abilities on the basis of gender. Superior does the same for race.

In both cases largely male, white scientists spend inordinate amounts of time and effort trying to demonstrate the superiority of white males. There is a still pervasive view amongst scientists that what they do is objective and somehow beyond the reach of society. However, there is a choice to be made in what is studied which goes beyond the bounds of science. Unlike Inferior, Superior reveals explicit funding and political support for racist ideas which stretch to the present day.

For Saini this is somewhat personal since she is of Indian origin, and considered a “Black member” by the NUJ. This highlights one of the core issues with race: the limited palette of races introduced in the 18th century ignored the huge variations across Africa and India to render the world down to White, Black, Indian, Chinese.

Furthermore the genetic variations within races are bigger than those between races. Race was a construct invented long before we knew anything about genes, and it was a construct assembled for specific geopolitical purposes. The fundamental problem with race science is that it is literally skin deep; you might as well try to establish the superiority or otherwise of people having brown eyes, or red hair. The variations in genes amongst red-heads are as large as those between red-heads and blondes.

The account is historical, starting with the first “research” into race, when Britain, France and other countries were building empires by colonisation and the slave trade was burgeoning. It became important to rationalise the mistreatment of people from other countries, and race was the way to do it. White scientists neatly delineated races, and asserted that the white race was at the top of the pile, and thus had the right to take the land of other races, who were not using it correctly, and subjugate them as slaves.

Darwin’s work on evolution in the 19th century gave race science new impetus: white superiority could be explained in terms of survival of the fittest – a natural law. These ideas grew into the science of eugenics, which had the idea of improving human stock through breeding. This wasn’t a science practised in the margins; still-renowned figures at the heart of statistics and biology were eugenicists.

Eugenics increased in importance prior to the Second World War but the behaviour of Hitler and the Nazis meant it fell out of favour thereafter. This is not to say race science disappeared. In 1961 a number of academics set up the journal Mankind Quarterly, funded by Wickliffe Draper’s Pioneer Fund. This had the appearance of a respectable academic journal but was in fact an echo chamber for what were essentially white supremacists. Similar echo chambers were set up by the tobacco and oil industries for smoking and climate change. They look sufficiently like academic science to fool outsiders, and for politicians to cite them in times of need, but the rest of their parent fields look on in horror. Mankind Quarterly is still published to this day; in fact, within the last couple of years Toby Young was forced to resign as director of the Office for Students, having attended meetings at University College London organised by Mankind Quarterly. University College London has a troubled relationship with race science.

This isn’t to say that all race science is maliciously racist. The human genome project led to plans to establish the diversity of the human species by sequencing the DNA of “isolated” groups, which typically meant indigenous people. Those promoting this diversity work were largely liberal and well-meaning, if somewhat paternalistic, but their work was riven by ethical concerns and the natural concerns of the indigenous people they sought to sample.

Touching on origins, Saini observes that once Neanderthal DNA was found in white Western Europeans the species experienced something of a revival in reputation. Once a by-word for, well, being Neanderthal, they are now seen as rather more sophisticated. It turns out the 10,000 year old Cheddar Man was surprisingly dark-skinned, certainly to Britons wishing to maintain that their ancestry was white. The key revelation for me in this section was the realisation that large scale migration in prehistoric times was an on-going affair, not a one-off. Waves of people moved around the continents, replacing and merging with their predecessors.

It has been observed that some races are prone to particular medical conditions (at least if they are living in certain countries, which should be a clue), therefore we seek a genetic explanation for these differences. This approach is backed by the FDA’s approval of a drug combination specifically marketed to African Americans for hypertension. Essentially this was a marketing ploy: African Americans experience significant environmental challenges which are risk factors for hypertension, and hypertension is a complex condition for which there is no simple genetic explanation.

Even for sickle cell anaemia, for which there is a strong genetic basis, using race as a proxy is far from ideal – the rate of the sickle cell anaemia gene varies a great deal across Africa and is also common in Saudi Arabia and India. A former colleague of mine from Northern Italy had the condition.

For a middle-aged white Western European male scientist Superior is salutary reading. As with Inferior, men like me have repeatedly asked “What makes us so special?” – it is long past time to stop.