Book Review: Hidden Figures by Margot Lee Shetterly

Hidden Figures by Margot Lee Shetterly tells the story of the African American women who worked as “computers” at NASA and its predecessor NACA during and after the Second World War.

In a first, this means I am currently reading both fiction and non-fiction by African American women. (I’m also reading Parable of the Sower by Octavia E. Butler.)

The Hidden Figures worked initially in the West Area Computing Group at the NACA Langley Research Centre in Hampton, Virginia, which did research on aircraft and then rocket design. The Computing groups carried out calculations at the behest of engineers from around the Centre; this was at a time when calculation was manual or semi-manual. Over time they were co-opted directly into research groups, some of them ultimately becoming engineers. The West Computing group was mirrored by the East Area Computing Group, which was made up of white women.

There is some history of women acting as “computers”, and the necessities of the Second World War led the government to take on African American women for the job, in the face of historic segregation. For African American women this was a rare opportunity; until then the only recourse for an African American woman with advanced training in maths was teaching. For a very few the Computing group ultimately acted as a stepping stone to working as an engineer.

Shetterly sees these women as a vanguard for the African Americans in the modern US who have every opportunity open to them. This jars a little with me when I see constant news from the US of, for example, black people being more likely to be killed by the police, or a senior African American being brought together by the President with the policeman that aggressively interviewed him on his doorstep because the house looked too nice to belong to a black man. Or African Americans being purposefully disenfranchised.

The shocking thing to me, as a Brit, was the degree to which US society was absolutely, formally segregated on racial grounds. In Virginia, where this story is set, segregation was preserved by the Democratic Party (perhaps some explanation as to why African Americans are not necessarily whole-heartedly Democrats). In Prince Edward County, Virginia they went as far as shutting down all the public schools for five years in order that black and white children would not be educated together – white children were given grants to study at private schools. Britain may have been racist in the past, and it may still be racist today, but it never enshrined racism so deeply and widely into law.

In response to this African Americans ran a parallel community. Segregation didn’t end simply because the segregation laws were repealed; it ended because African Americans saw the end of those laws as a door ajar which needed a serious push to pass through. Thus when Rosa Parks sat on the bus, Katherine Goble (from this book) went to university and Ruby Bridges went to school, they didn’t do so entirely alone. They had the support of their community and the organisation of the NAACP to help them. They had to be twice as good as a white person to get the same job. At the same time they also saw themselves as representatives of their race, and examples to their children.

When you look at a man the age of Donald Trump, 70, it’s worth bearing in mind that his teenage years were spent during the end of segregation by law and his parents were the white generation which fought so hard to keep it.

The focus of the book is mainly the personal lives and ambitions of the women. There is some description of the work they and the Research Centre did, but not in any great depth. The book highlights again the transformative effect of the Second World War, in particular, on society in the US. The seeds of these changes could be seen after the First World War. This mirrors similar changes in society in the UK.

Once “computing” became the realm of high capital machinery the importance of women as computers waned, since high capital machinery was the preserve of men. We see the consequences of this even now.

The book finishes with the part Katherine Johnson, in particular, played in John Glenn’s first trip into orbit and her subsequent work on the Apollo moon landing and Apollo 13 recovery. Shetterly emphasises the legacy of this group of women, who normalised the idea that African American women could ultimately become engineers, scientists or any other sort of professional.

Interestingly my wife and I disagreed on the prominence of the men on the cover of the book (see above). She thought they were central and thus important; I thought they were small and thus unimportant. In the text the men are bit-part players: they are husbands and sons, or drift in and out of the narrative having spoken their line.

Book review: Working effectively with legacy code by Michael C. Feathers

Working Effectively with Legacy Code by Michael C. Feathers is one of the programmer’s classic texts. I’d seen it lying around the office at ScraperWiki but hadn’t picked it up since I didn’t think I was working with legacy code. I returned to read it having found it at the top of the list of recommended programming books from Stack Overflow at dev-books. Reading the description I learnt that it’s more a book about testing than about legacy code. Feathers defines legacy code simply as code without tests; he is of the Agile school of software development, for whom tests are central.

With this in mind I thought it would be a useful read for me to improve my own code with the application of better tests, and perhaps to pick up incidentally some object-oriented style, in which I am currently lacking.

Following the theme of my previous blog post on women authors I note that there are two women authors in the 30 books on the dev-books list. It’s interesting that a number of books in the style of Working Effectively explicitly reference women as project managers or testers in the text, i.e. as part of the team – I take this as a recognition that there exists a problem which needs to be addressed, and this is pretty much the least you can do. However, beyond the family, friends and publishing team, the acknowledgements mention one woman in a lengthy list.

The book starts with a general overview of the techniques it will introduce, including the tools used to address them. These come down to testing frameworks and the refactoring tools found in many IDEs. The examples in the book are typically written in C++ or Java. I particularly liked the introduction of the ideas of the “seam”, a place where behaviour can be changed without editing the code and the “enabling point” – the place where a change can be made at that seam. A seam may be a class that can be replaced by another one, or a value altered. In desperate cases (in C) the preprocessor can be used to invoke test-time changes in the executed code.
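The book’s own examples are in C++ and Java, but the seam idea translates readily. What follows is only a minimal Python sketch of what an object seam might look like; the class and method names (InvoiceMailer, SmtpSender, RecordingSender) are hypothetical and invented for illustration, not taken from the book.

```python
class SmtpSender:
    """Hypothetical production collaborator that would talk to a mail server."""

    def send(self, address, body):
        raise RuntimeError("would contact a real mail server")


class RecordingSender:
    """Test double that records messages instead of sending them."""

    def __init__(self):
        self.sent = []

    def send(self, address, body):
        self.sent.append((address, body))


class InvoiceMailer:
    """Hypothetical legacy class we want to get under test."""

    def __init__(self, sender=None):
        # Enabling point: the caller decides which sender object is used.
        self.sender = sender if sender is not None else SmtpSender()

    def send_reminder(self, address, amount):
        body = f"You owe {amount:.2f}"
        # Seam: what happens here depends on the object held in self.sender,
        # so behaviour can change without editing this method.
        self.sender.send(address, body)
        return body
```

In a test, passing a RecordingSender to the constructor lets send_reminder run without any mail infrastructure, and the recorded list can then be inspected.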

There are then a set of chapters that answer questions that a legacy code-ridden developer might have such as:

  • I can’t get this class into a test harness
  • How do I know that I’m not breaking anything?
  • I need to make a change. What methods should I test?

This makes the book easy to navigate, if a bit inelegant. It seems to me that the book addresses two problems in getting suitably sized pieces of code into a test harness. One of these is breaking the code into suitably sized pieces by, for example, extracting methods. The second is making the pieces of code independent enough that they can be tested without building a huge infrastructure to support them.

Although I’ve not done any serious programming in Java or C++ I felt I generally understood the examples presented. My favoured language is Python, and the problems I tackle tend to be more amenable to a functional style of programming. Despite this I think many of the methods described are highly relevant – particularly those describing how to break down monster functions. The book is highly pragmatic: it accepts that the world is not full of applications in which beautiful structure diagrams are replicated by beautiful code.

There are differences between these compiled object-oriented languages and Python though. C#, Java, and C++ all have a collection of keywords (like public, private, protected, static and final) which control who can see what methods exist on a class and whether they can be overridden or replaced. These features present challenges for bringing legacy code under test. Python, on the other hand, has a “gentleman’s agreement” that method names starting with an underscore are private, but that’s it, and there are no mechanisms to prevent you using these “private” functions! Similarly, pretty much any method in Python can be overridden by monkey-patching. That’s to say, if you don’t like a function in an imported library you can simply overwrite it with your own version after you’ve imported the library. This is not necessarily a good thing. A second difference is that Python comes with a unit testing framework and a mocking library in its standard library, rather than these being third-party additions. Although, to be fair, the mocking library in Python was originally third party.
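To make the Python side of this concrete, here is a small sketch using only the standard library. The Ledger class and the timestamped function are hypothetical names made up for the example; time.time and unittest.mock.patch are the real standard library calls.

```python
import time
from unittest import mock


class Ledger:
    """Hypothetical class with a 'private' helper method."""

    def _total(self):
        return 42

    def report(self):
        return f"total={self._total()}"


def timestamped(message):
    """Prefix a message with the current Unix time in whole seconds."""
    return f"{int(time.time())}: {message}"


# Nothing stops outside code calling the underscore-prefixed method.
private_value = Ledger()._total()

# Monkey-patching by hand: overwrite the method on the class itself.
original_total = Ledger._total
Ledger._total = lambda self: 0
patched_report = Ledger().report()
Ledger._total = original_total  # put the original back

# The same idea with unittest.mock, which restores time.time on exit.
with mock.patch("time.time", return_value=1000.0):
    patched_stamp = timestamped("hello")
```

The hand-rolled patch has to be undone explicitly, which is one reason the context-manager form from unittest.mock tends to be the safer choice in tests.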

I’ve often felt I should programme in a more object-oriented style but this book has made me reconsider. It’s quite clear that spaghetti code can be written in an object oriented language as well as any other. And I suspect the data processing for which I am normally coding fits very well with a functional style of coding. The ideas of single responsibility functions, and testing still fit well with more functional programming styles.

Working effectively is readable and pragmatic. I suspect the developer’s dirty secret is that actually we wrote the legacy code that we’re now trying to fix.

Women Writers

Over the past year or so I’ve been making an effort to read more books authored by people who aren’t white men. I suppose the trigger for this was my post on feminism in which I realised that women live quite different lives from me. I thought it would be interesting to find out more, and since I read a lot this seems a natural place to start.

My reading divides into three broad categories:

  • Fiction, quite often science fiction – I don’t tend to blog about this;
  • Technical books in the area of programming and machine learning;
  • Other non-fiction – mainly the history of science or industrial history;

These categories differ in the way that I select books to read and my reason for reading them. Fiction I tend to read in bed shortly before I go to sleep, whilst non-fiction I read earlier in the day (when I can take notes). Fiction I read entirely for entertainment whereas non-fiction I enjoy but I’m normally reading for some purpose.

Non-fiction I select from my interests, and recommendations on twitter. For example, I’m interested in James Clerk Maxwell and the number of book-length biographies of Maxwell is approximately 2. Fiction I’ve tended to select from prize winners, from recommendations by Amazon or similar, or from habit.

It turns out switching to reading more women authors of fiction was pretty straightforward. Of the fiction I’ve read over the last couple of years, about 60% was written by women and 70% was written by women or men who were not white Westerners (I read the Cixin Liu trilogy and a couple of books by Ramez Naam, who is Egyptian by birth).

Are these books by women different from those written by male authors? One obvious difference is the main protagonist is more often female, and themes around sexual ambiguity are more common. It feels like there is a bit more inner emotional life to characters, and their interactions with others. This is all subjective since I didn’t make these readings blind. Books like “The Left Hand of Darkness” by Ursula K. Le Guin and the Imperial Radch trilogy by Ann Leckie are amongst the best fiction I’ve read.

In the past I would likely have picked books by male authors because that is the sort of book I felt would interest me; I associated women authors with girly things in which a boy should not be involved. There’s a huge range of science fiction written by women so it was easy to change my habits.

Outside of fiction I have had more trouble. On the non-fiction side the fraction of women authors in the books I read is about 14%. This is a little odd since I can easily list several very good women authors in the area in which I read – Lisa Jardine, Andrea Wulf, Jenny Uglow, and Georgina Ferry. It seems likely this low proportion is in part driven by a lower proportion of books written by women in the areas in which I am interested. The proportion of women winners and short-listed authors in the Royal Society Science Book prize, going back to 1988, is about 8% (see the spreadsheet here). I struggle to discern a difference between these books, predominately on the history of science, written by men and women. More generally it seems like the role of women in the development of science is more widely recognised and written about than it was perhaps 30 years ago. Looking back at the authors I have enjoyed I see they have written other books that interest me, and the Royal Society Science Book list looks like a good source for more.

In technical books the proportion of women authors I have read is even lower, at 6%. This corresponds to one author so it’s something of an uncertain figure: that’s to say chance could have easily given me no female authors or twice the number. This seems to be approximately reflective of the proportion of technical books with women authors. Of the O’Reilly books in their “Python” section 6 of 46 authors were women (corresponding to 13%); for Manning 3 of 70 authors in their Software Engineering section are women (corresponding to about 4.3%). It also seems to be roughly in line with the proportion of women contributors to Open Source projects on GitHub (at about 6%).
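For anyone wanting to check the arithmetic, and to get a feel for how uncertain proportions from small counts are, here is a rough sketch. It uses the Wilson score interval, which is my choice of method for illustration rather than anything used to produce the figures above; only the counts 6 of 46 and 3 of 70 come from the text.

```python
import math


def proportion_with_interval(successes, total, z=1.96):
    """Return (proportion, low, high) using the Wilson score interval.

    z=1.96 corresponds to an approximate 95% interval.
    """
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    )
    return p, centre - half_width, centre + half_width


# Counts quoted in the text: O'Reilly "Python" section and Manning
# Software Engineering section.
oreilly = proportion_with_interval(6, 46)
manning = proportion_with_interval(3, 70)

for name, (p, low, high) in (("O'Reilly", oreilly), ("Manning", manning)):
    print(f"{name}: {p:.1%} (roughly {low:.1%} to {high:.1%})")
```

With counts this small the intervals are wide, which is the point being made about a figure based on a single author.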

In my non-fiction and technical reading I felt I had no prior bias as to the gender of the author; I selected based on my interests (primarily the history of science) or what I felt I needed to learn from the point of view of professional development. As a result I read a low proportion of women authors in these areas largely because there is a lower proportion of books authored by women.

You can see all the books I have read on my Goodreads profile, although the dates and sequences of reading go to pot in mid-2015.

Book review: Weapons of Math Destruction by Cathy O’Neil

Obviously for any UK anglophone the title of Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy by Cathy O’Neil is going to be a bit grating. The book is an account of how algorithms can ruin people’s lives. To a degree the “Big Data” in the subtitle is incidental.

Cathy O’Neil started her career as a mathematician, then worked for the hedge fund D. E. Shaw as a quant before moving to Intent Media to work as a data scientist. It’s nice to know that I’m not the only person to have become a data scientist largely by writing “data scientist” on their CV! Nowadays she is an activist in the Occupy movement.

The book is the result of O’Neil’s revelation that algorithms were often used destructively, and are responsible for gross injustices. Algorithms in this case are models that determine how companies, and sometimes government, deal with their employees, customers and citizens; whether they are offered loans, adverts of a particular sort, employment, termination or a lengthy prison sentence.

The book starts with her experience at Shaw where she saw the subprime mortgage crisis from quite close up. In a nutshell: the subprime mortgage crisis happened because it was in the interests of most of the players in the industry for the stated risk of these mortgages to be minimised. The ratings agencies were paid by the aggregators of these mortgages to rate their risk, and the purchasers of these risk ratings had an interest in the rated risk being low – the ratings agencies duly obliged.

The book goes on to cover a number of other “Weapons of Math Destruction”, including models for recruitment, insurance, credit rating, scheduling (for work), politics and policing. So, for example, there are the predictive policing algorithms which direct the police to particular parts of town in an effort to reduce serious crime. Once there, the police record more anti-social behaviour, which leads the algorithm to send them there again; it turns out that serious crime is quite rare but anti-social behaviour isn’t, so there’s more data to draw on. And the police in a number of countries are following the “zero-tolerance” model, which says that if you address minor misdemeanours then more serious crimes are fixed automatically. The problem in the US with this approach is that the police are sent to black neighbourhoods repeatedly (rather than, say, college campuses) and the model is self-reinforcing.
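The self-reinforcing nature of such a model can be seen in a toy simulation. This is only a sketch under made-up assumptions: two districts with identical underlying behaviour, a patrol that always goes wherever more incidents have been recorded, and minor offences recorded only where the patrol is. None of the numbers come from the book.

```python
def simulate_patrols(rounds=5, minor_rate=10, initial=(6, 5)):
    """Toy feedback loop for two districts with identical behaviour.

    All parameters are hypothetical: `initial` is the recorded incident
    count per district at the start, `minor_rate` is how many minor
    offences a patrol records per round wherever it is sent.
    """
    recorded = list(initial)
    history = []
    for _ in range(rounds):
        # Send the patrol where most incidents have been recorded so far.
        target = 0 if recorded[0] >= recorded[1] else 1
        # Minor offences are only recorded where the police are present.
        recorded[target] += minor_rate
        history.append(target)
    return recorded, history


recorded, history = simulate_patrols()
print(recorded, history)
```

A one-incident difference at the start is enough for the first district to receive every patrol, and its recorded total pulls further ahead each round even though the two districts behave identically.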

O’Neil identifies several systematic problems which are typical of Weapons of Math Destruction. These are the use of proxies rather than “real outcomes”, the lack of feedback from outcomes to the model, the scale on which the model impacts people, the lack of fairness built into the model, the opacity of the models and the damage the models can do. The damage is extensive: these WMDs can lead to you being arrested, incarcerated for lengthy periods, denied a job, denied medical insurance, and offered loans at extortionate rates to complete courses at rather low-rated universities.

The book is focused almost entirely on the US; in fact the only mention of a place outside the US is of policing in the “city of Kent”. However, O’Neil does seem to rate the data and privacy legislation in Europe – where consumers should be told of the purposes to which data will be put when they supply it. Even in the States the law provides some limits on certain types of model (such as credit scoring) but these laws have not kept pace with new developments, nor are they necessarily easy to use. For example, if your credit score is wrong, fixing it, although legally mandated, is not quick and easy.

Perhaps her most telling comment is that computers don’t understand fairness, and certainly don’t exhibit fairness if they are not asked to optimise for it. Which does lead to the question “How do you implement fairness?”. In some cases it is obvious: you shouldn’t make use of algorithms which explicitly take into account gender, race or disability. But it’s easy to inadvertently bring in these parameters by, for example, postcode being correlated with race. Or part-time working being correlated with gender or disability.

As a middle aged, middle class white man with a reasonably well-paid job, living in a nice part of town I am least likely to find myself on the wrong end of an algorithm and ironically the most likely to be writing such algorithms.

I found the book very thought-provoking; it will certainly lead me to ask myself whether the algorithms and data that I am generating are fair and what the cost of any unfairness is.

Book review: I contain multitudes by Ed Yong

This book was a Christmas gift, for which I’m very grateful! I Contain Multitudes: The Microbes Within Us and a Grander View of Life by Ed Yong is all about bacteria.

Bacteria are somewhat neglected in the popular science literature. I think the closest I can come is The Eighth Day of Creation by Horace Freeland Judson, which is about the discovery of DNA and its role in molecular biology, in which bacteria and viruses play a part.

Yong’s book is about the relationship between bacteria and other organisms, humans included. It reveals a world where bacteria are not simply passengers on oblivious hosts but are a heavily integrated part of the host’s life cycle.

The study of the “microbiome” is relatively recent. Unravelling the members of a microbial community prior to the invention of cheap and easy DNA sequencing was hard. Carl Woese pioneered this approach in the 1970s, and used it to discover the archaea, a whole new Kingdom of life (plants and animals are two of the other Kingdoms, to give you an idea of the magnitude of this discovery). Sequencing of the bacterial inhabitants of humans gained pace in the 2000s when it was discovered that we all carry a rich community of bacteria which varies from site to site around the body, let alone from individual to individual. What is true for humans is true for other organisms.

The book continues with an overview of how important bacteria can be to an organism’s life. For example choanoflagellates, typically single-celled organisms, only form colonies in the presence of certain bacteria. And bobtail squid rely on bacterial partners to provide their luminescence. The standard lab animals (mice, zebrafish, flies) have been raised in germ-free environments and whilst they do not die, they do not flourish – even in the comfortable environment of the lab. Wolbachia bacteria interfere with the sex lives of their insect hosts: they are only passed down via the eggs of the female, and so they arrange by various means that there are more eggs and females than sperm.

These partnerships are not accidental, in the sense that organisms often provide specific structures to support their bacterial partners and exchange specific molecular markers with them. In some cases the host is essential to the survival of the bacteria it contains because they have given up on carrying out tasks essential to their continued existence, for example in the supply of essential nutrients. This is true on many scales: animals from termites to cows have digestive systems designed to accommodate a particular bacterial support team to enable them to digest what would otherwise be food of low nutritional value. The early years of a human infant’s life are shaped by its acquisition of the right microbiome to prime the immune system and aid digestion.

The reason that bacteria are so effective in providing support services to their hosts is their high rate of evolution. Not only do they replicate fast, they have a promiscuous approach to DNA they come across in their environment. This means that if any bacterial species evolves a useful trait, such as the ability to digest seaweed, then its neighbours in the gut can pick up that ability via its DNA. These genes can, eventually, end up in the genome of their hosts.

Japanese people who eat nori seaweed, which contains carbohydrates which the human body can’t digest on its own, host bacteria which can. Moreover, the genes those bacteria use to carry out this digestion were acquired from marine bacteria.

Yong is not misty-eyed about his bacterial subjects. As he points out, their symbiosis with other organisms is not altogether harmonious – in the end the bacteria are in it for themselves.

The book finishes with some examples of how bacteria can be used to support human health, and speculates how this approach – currently only used in curing persistent C. difficile infections – could be extended to all manner of ailments including blood pressure and mental health problems.

I’ve been following Ed Yong on twitter for quite a while, and where he found the time to write a book as well as everything else he seems to do is a mystery to me! His style, as a science journalist, can be seen in the book, both in the presentation of the story, with brief character sketches of the scientists involved and quotes from them, and in the titles of the chapters, which are entertaining but not necessarily informative. The book is thick with examples which build into larger themes; turn to the back of the book and you’ll find references to the primary literature.

Bacteria deserve our attention, and this book is a great introduction to how they shape the lives of “higher” organisms.