Author's posts
Apr 17 2023
Book review: Data Points by Nathan Yau
I picked up Data Points by Nathan Yau as a recommended book on exploratory data analysis in Storytelling with data. I have previously read Nathan Yau’s book Visualize This.
Visualize This was very focussed on the technical side of producing data visualisations, with code samples and so forth. Data Points is a “bigger picture” book divided into three sections: context, exploration and presentation.
Context can be summarised as: who, how, what, when, where, why. Context is covered explicitly in the first chapter using the medium of Yau’s wedding photos as an example. Spinning off from here is a mention of the Quantified Self movement. There was a time a few years ago when this was popular – people would record aspects of their life in great detail and build visualisations from them. This was enabled by the growth of the first generations of smartphones, which made this sort of data collection easy. Yau points out – and I think all data scientists can agree with this – that most of the job is actually collecting together the data required for a project and getting it into a shape to visualise.
The “Exploration” chapters start with an overview of what a data visualisation is. One of the strengths of this book is the many examples of visualisations, in this case going as far back as William Playfair in 1786 with the invention of the bar chart. This chapter also highlights that a data visualisation can be a flow chart, or it can be an abstract piece of art which is based on data. Yau cites John Tukey’s Exploratory Data Analysis a number of times; it was published in the 1970s, at a time when the author felt the need to explain that a “bold” effect can be achieved using a pen rather than a pencil. The point being that we now have immense power in readily available software to produce visualisations at the click of a button which would have taken an expert many hours of manual labour in the relatively recent past.
The next chapters provide a summary of how we build a data visualisation starting with the fundamental building blocks: title, visual cues (the data), coordinate system, scale and context elements. The visual cues are further broken down into attributes like position, length, angle, direction, shapes and so forth.
The book finishes with a chapter on technologies. Some of them, such as R, Adobe Illustrator, Microsoft Excel, Google Sheets and Tableau, are still around and remain good choices. Yau’s favoured combination is R with Adobe Illustrator used to polish the results. The JavaScript library Data Driven Documents (d3) and Processing are still active. Other systems like IBM’s Many Eyes project and MapBox’s TileMill have disappeared. The JavaScript libraries Raphael and the JavaScript InfoVis Toolkit appear dormant, in the sense that the activity on their GitHub repositories is minimal. Nobody talks about Flash and ActionScript anymore.
Data Points is much more a book about exploratory data visualisation than Storytelling with data; I think Yau believes that exploratory data analysis is an exercise in storytelling. The strength of this book is the wide range of examples used to illustrate the points being made through the book. The style is chatty and it is not a difficult read. It is less focussed on delivering specific lessons in making data visualisations than Storytelling with data.
Apr 08 2023
Book review: Margaret the First – A Biography of Margaret Cavendish by Douglas Grant
I have come across Margaret Cavendish a number of times when reading about the history of science, I think most recently in a biography of Christiaan Huygens. She is noted for attending a Royal Society meeting in 1666, and for being one of the earliest published female authors in England. She sounded very interesting so I picked up Margaret the First: A biography of Margaret Cavendish by Douglas Grant – one of the few biographies about her.
Margaret Cavendish was born in 1623 to the aristocratic Lucas family of Colchester and died at the relatively early age of 50 in 1673. As a child she was a keen writer, and picked up an interest in science from her brother John, although as a girl her formal education was limited.
The Lucases were fairly heavily involved in the Civil War on the Royalist side. Margaret joined the household of the queen, Henrietta Maria, as a maid of honour in 1643. She fled to Paris with the queen’s household in 1644. At this point William Cavendish (1st Duke of Newcastle), later to become Margaret’s husband, enters the story – he was immensely wealthy and was Captain-General to the Royalist army in the North of England. Following the Battle of Marston Moor he too fled to Europe – to Hamburg in the first instance.
William Cavendish was a widower – his first wife, Elizabeth, having died in 1643. Margaret and William met in Paris and were married in late 1645. Having read quite a lot of scientific biography I am starting to get a feel for what written resources are available to the biographer – in this case I suspect it was Margaret’s published writings and the financial records of her husband which were most important. In exile William Cavendish was always struggling for money, although he seems to have had the gift of the gab: a number of times they appear on the brink of destitution, which is resolved when William goes and talks to his creditors!
Whilst in Paris, Margaret dined with at least René Descartes and Thomas Hobbes – there was a fairly active salon culture in Paris at the time in which I believe women were moderately involved. In England involvement in intellectual circles appears to have been forbidden for women but perhaps it was a little more open in Paris.
The couple moved to Antwerp in 1648, where they lived in Rubens’ old house, again surviving on credit which William Cavendish often seemed to spend on horses! It was at this time that Margaret started to write for publication. Grant’s broad view of her output could be summarised as "needed an editor": she appeared to write straight to publication with little sign of returning to work to correct and edit for structure and coherence.
Her early books were poetic with a theme of natural philosophy. This isn’t as outlandish as it first sounds – Erasmus Darwin was to write poetically about natural philosophy in the following century. Her atomic theories would read oddly to our eyes but were not inconsistent with prevailing theories of the time. She sat within the Classical / Cartesian school of natural philosophy with an emphasis on pure thought, which in the second half of the 17th century was being displaced by a science driven by observation and experiment. In fact she wrote some criticism of the newly invented microscope. Her writing covers a wide range of forms (poetry, prose, plays, orations, letters), and a substantial fraction of it is what you might describe as romantic fiction – although The Blazing World has been described as proto-science fiction.
Margaret and her husband returned to England in 1660 following the death of Oliver Cromwell in 1658 and the Restoration of Charles II. After spending some time in London, whilst William Cavendish regained possession of his estates, the couple retired to the country from where Margaret promoted her writing – providing free copies of her books to universities and individuals. It is during this period that she attended a meeting of the Royal Society. Samuel Pepys was quite critical of her and the general impression was that men felt she shouldn’t have been there.
She died rather suddenly in 1673, a few years before her much older husband who died in 1676.
It would seem that Margaret Cavendish was a very bright young woman, who missed out almost entirely on any sort of education because she was a woman. Her interest in science was promoted by her older brother John, her husband and his brother as well as extensive correspondence and dinners with leading intellectuals of the day arising from her time in Paris and Antwerp. Her work was published and promoted broadly most likely because of the power of her husband, which also served to mute criticism. She was widely seen as a rather eccentric character. In part this seems to be down to a vintage dress sense, but simply her writing would probably have been a factor too.
It would be nice to report that Margaret Cavendish was a pioneer, soon followed by other women into the public, scientific sphere but she wasn’t. Caroline Herschel’s work was presented to the Royal Society in 1788 – over 100 years later. Exceptionally, Queen Victoria became a member of the Royal Society, but it wasn’t until 1945 that Kathleen Lonsdale and Marjory Stephenson became the first female fellows of the Royal Society. The first women to study for undergraduate degrees started in 1880, with Oxford and Cambridge not awarding degrees to women until 1920 and 1948 respectively.
This book was published in 1956. There are a limited number of biographies of Margaret Cavendish and, although this one was entirely acceptable, it is a bit dated and I can’t help feeling there will have been a lot of scholarly work done on her life in the intervening years.
Mar 31 2023
Book review: Storytelling with data by Cole Nussbaumer Knaflic
This book, Storytelling with data by Cole Nussbaumer Knaflic, fits in with my work, and my interests. It relates to data visualisation, an area in which I have read a number of books including The Visual Display of Quantitative Information by Edward R. Tufte, Visualize This by Nathan Yau, Data Visualization: a successful design process by Andy Kirk and Interactive Data Visualization for the web by Scott Murray. These range from the intensely theoretical (Tufte) to the deeply technical (Murray).
Storytelling with data is closest in content to Andy Kirk’s book and his website is cited in the (very good) additional resources list. A second similarity with Andy Kirk’s book is that Storytelling is “the book of the course” – the book is derived from the author’s training courses.
The differentiating factor with Knaflic’s book is the focus on storytelling, presenting a case to persuade rather than focussing on the production of a data visualisation, although that is part of the process. The book is divided into 6 key lessons, each of which gets a chapter; with a couple of chapters of examples, an introduction and an epilogue this makes 10 chapters. The six key lessons are:
1. understand the context
2. choose an appropriate visual display
3. eliminate clutter
4. focus attention where you want it
5. think like a designer
6. tell a story
I think I got the most out of the understand the context and tell a story chapters. Technically I am quite experienced but my knowledge is around how to make charts and process the data to make charts rather than telling a story. The understanding the context chapter talks about the “Big Idea” and the “3-minute story”. The Big Idea is the single idea you are trying to get across in a presentation, and the 3-minute story is the elevator pitch – how you would put your story into 3 minutes. I liked a callout box with a list of verbs (accept, agree, begin, believe…) used to prompt you for what action you want your audience to take having seen your presentation.
The chapter on choosing an appropriate visual display is quite straightforward. Knaflic presents the 12 types of display she finds herself using frequently (which includes simple text, and text tables). This is a fairly small set since variations of bar charts – horizontal, vertical, stacked and waterfall – cover off 5 types. This is appropriate: if you are telling a story to persuade then you don’t want to be spending your time explaining how your esoteric display works. Knaflic steers away from specific technology, only mentioning at the beginning of the book that all the charts shown were made in Microsoft Excel and that Adobe Illustrator was sometimes used to get a chart looking just right at the end of the process.
There is a list of sins in data visualisation including the reviled pie chart, 3D plots and, perhaps surprisingly, the use of secondary axes to plot data on different scales together.
The chapters on eliminate clutter, focus attention where you want it, and think like a designer are all about making sure that the viewer is paying attention where you want them to pay attention. Some of this is about the Tuftian “eliminate clutter”, much of which creeps into charts through default behaviour in software. Some is about using gestalt theories of attention to group items together through similarity, proximity and so forth and some is about using pre-attentive attributes such as colour and typeface to draw attention to certain elements. This reminded me of The Programmer’s Brain by Felienne Hermans, which links theories of how our brain works with the practices of programming.
The chapter on tell a story introduces some resources on storytelling from playwrights and screenwriters – basically the idea of the three act play with a setup, conflict and resolution. This is a different way of thinking for me. My presentations tend to follow the traditional structure of a scientific paper but it is interesting to see the link with creative writing and drama – which is generally excluded from scientific writing.
One of the lessons I learnt from this book was to make better use of chart titles and PowerPoint titles. I tend to go for descriptive chart titles (“Ticket Trend”, to use an example from the book) and PowerPoint titles which simply label a section of a talk (“Methodology”). Knaflic encourages us to use this valuable “real estate” in a presentation for a call to action: “Please Approve the Hire of 2 FTEs”.
The six lessons are reinforced with a chapter which covers a single worked example from beginning to end, and another chapter of case studies which looks at fixing particular issues with single charts.
I enjoyed this book, it’s beautifully produced and fairly easy reading. It also led me to buy two more books, Resonate by Nancy Duarte and Data Points by Nathan Yau, and so the “to be read” pile grows again!
Mar 19 2023
Book review: The Man from the Future by Ananyo Bhattacharya
The Man from the Future by Ananyo Bhattacharya has been sitting on my bedside table in the "to be read" pile for a little while. I was aware of Von Neumann largely through his work on computers, and game theory.
The book is organised thematically, firstly on Von Neumann’s early years then on the various fields in which he made contributions.
Neumann János Lajos was born in Budapest in 1903 – the Hungarian style is to put the family name first. His father was ennobled in 1913, hence the "von", and he Anglicised his forename to John when he moved to America in 1930. Hungary, and Budapest, in Von Neumann’s time was a hotbed of intellectuals, many of whom fled Europe to America with the rise of the Nazis. For someone with a background in physics it is a bit of a Who’s Who – Eugene Wigner, Leo Szilard, Theodore von Kármán, Edward Teller, Dennis Gabor – were all his contemporaries and he seemed to know them personally.
Von Neumann’s first contributions to the academic world were in set theory. He published a paper on defining cardinal and ordinal numbers in 1921 which still stands today. This was at a time when maths was undergoing a foundational crisis, which Einstein described as "Froschmäusekrieg" – a war of frogs and mice – a term I aim to use in future!
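For the curious, the definition usually credited to Von Neumann treats each ordinal as the set of all smaller ordinals: 0 is the empty set and the successor of n is n ∪ {n}. Below is a minimal Python sketch of the finite case only. The function name is made up for illustration and nothing here is taken from the book.

```python
def von_neumann_ordinal(n):
    """Build the finite ordinal n as nested frozensets (illustrative helper)."""
    ordinal = frozenset()              # 0 is the empty set
    for _ in range(n):
        ordinal = ordinal | {ordinal}  # successor: n + 1 = n ∪ {n}
    return ordinal


two = von_neumann_ordinal(2)
three = von_neumann_ordinal(3)
print(len(three))    # the ordinal n has exactly n elements
print(two in three)  # "smaller than" is simply set membership
```

The neat part of the construction is that ordering comes for free: one ordinal is smaller than another exactly when it is a member (and also a proper subset) of it.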
The set theory paper was written whilst he was still at school. He then moved on to study simultaneously a degree in Chemistry at Berlin, chemical engineering in Zurich at ETH and a doctorate in maths at Budapest – passing all with flying colours. He then moved on to Göttingen in about 1925 where Heisenberg was working. Von Neumann’s contribution was Mathematical Foundations of Quantum Mechanics published in 1932 – not translated into English for 20 years. His key contribution was demonstrating that Heisenberg’s matrix mechanics and Schrödinger’s wave equation theories of quantum mechanics were equivalent. To a degree I feel his contribution held back the field, backing as it did the Copenhagen interpretation of quantum mechanics (i.e. "shut up and calculate") – it wasn’t until the late 1950s that others started probing the philosophical foundations of quantum mechanics in more depth.
It was during this period he was enticed to Princeton and the Institute for Advanced Study. As German science declined under the Nazis due to their purges of "undesirables" from the civil service and universities, American science, which had been in the doldrums, rose – one at the cost of the other.
Von Neumann was clearly politically astute and had seen war coming in the early thirties. In the late thirties he was pro-actively trying to join the US army – fortunately he was redirected into the Manhattan Project (a project stuffed with scientists later to become Nobel Prize winners). His key contributions were in the simulations done for the implosion bomb (at a time when the idea of computer simulations was radical and new and not yet expressed). What I hadn’t realised before was that airburst bombs are used because they are more destructive than the same explosives detonated at ground level; this is why the Trinity test was executed on a tower. Von Neumann was also on the committee that chose the targets for the atomic bombs dropped on Japan at the end of the war.
Von Neumann’s work on the Bomb, and his mathematical interests, led him naturally into computing. Prior to the war, as part of the fundamentals of mathematics, Kurt Gödel, Alan Turing and Alonzo Church had done work essential to the foundation of computing. Turing’s work in particular demonstrated that theoretically a machine could be built which could carry out any computation but Gödel had shown that not all problems were computable. Von Neumann met with Alan Turing in 1942. It is not clear what they talked about, but I imagine both the Bomb and Turing’s codebreaking work at Bletchley Park were topics of conversation.
Von Neumann had worked with computing devices on implosion calculations, an activity in which his second wife Klára Dán von Neumann was heavily involved. After the war a number of groups were working on computers, and he was convinced that the computer would be more revolutionary than the atomic bomb. His key contribution was a draft report on the EDVAC computer being built at the Moore School of Engineering in the University of Pennsylvania. The significance of this report was that it described clearly the architecture of a modern computer with input and output units, a central processor, memory and so forth – previously computers had largely been designed for very specific tasks and appear to have been logically complex. Von Neumann’s report was widely circulated much to the chagrin of his collaborators who had hoped for lucrative patents on the design of computers.
Stepping back in time a bit, Von Neumann had started working on what would come to be known as "game theory" in the 1920s, publishing his first paper in this area in 1926, followed by another in 1937 and finally a book written with Oskar Morgenstern, Theory of Games and Economic Behavior, in 1944. After the Second World War mathematicians started to infiltrate economics departments and apply game theory ideas to economic problems. This has resulted in some very lucrative public auctions (designed using ideas stemming from game theory), and a fair number of Nobel Prizes in economics.
After the Second World War the US government set up the RAND Corporation which was a think tank, possibly the original think tank. They undertook a wide range of research, including operations research, trying to maintain the spirit that drove the development of the atomic bomb, radar and codebreaking during the Second World War. Von Neumann acted as a consultant and was seen very much as the father of the organisation without necessarily holding an exalted formal position. It was at this time, when they had the only nuclear weapons, that the US contemplated a first strike against the Soviets. Von Neumann started quite hawkish but became more dovish over time.
The final chapter of the book is on cellular automata, stimulated by Alan Turing’s universal machine, and also how life works – in the post-war period the structure and mechanism by which DNA works was being elucidated and a number of physicists were interested in both the structure of DNA and how it transmits information. Cellular automata are perhaps best known through John Conway’s Game of Life. Conway’s work was prompted by Von Neumann’s, although Von Neumann’s book on cellular automata was not published until 10 years after his death in 1957 from bone cancer.
I must admit the book made me think about the nature of a biography. This one is quite heavily focused on scientific themes – Von Neumann is usually introduced at the beginning of the chapter with an outline of his contributions but then a wider cast of characters is brought in. The alternative is more focussed on the minutiae of the central character’s life.
From a personal point of view we find Von Neumann is a bit of a party animal, married twice with one daughter. His wives found him rather absorbed in his work. His occasionally harsh exterior harboured a more caring private side.
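To give a flavour of how simple a cellular automaton can be, here is a minimal Python sketch of one generation of Conway's Game of Life on an unbounded grid. It uses the standard rules (a cell is alive next turn if it has exactly three live neighbours, or two if it is already alive). The function name and the set-of-coordinates representation are my own choices for illustration, not something from the book.

```python
from collections import Counter


def life_step(live):
    """Return the next generation from a set of live (x, y) cells."""
    # Count how many live neighbours every candidate cell has.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbours, survival on 2 or 3.
    return {
        cell
        for cell, n in counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# A "blinker": three cells in a row that flip between horizontal and vertical.
blinker = {(0, 1), (1, 1), (2, 1)}
print(sorted(life_step(blinker)))
```

Applying `life_step` twice to the blinker returns the original pattern, which is the simplest example of the repeating structures the game is known for.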
The Man from the Future is an enjoyable read if you have some interest in computing and physics, although deep knowledge of those areas is not required.
Feb 28 2023
Book review: On Savage Shores by Caroline Dodds Pennock
Another book from those I follow on Twitter is On Savage Shores by Caroline Dodds Pennock, which is about the Indigenous people who came to Europe in the early years of the invasion of the Americas.
The book is divided thematically into six chapters titled Slavery, Go-betweens (covering translators), Kith and Kin (the transport of families, and the adoption of Indigenous people – mainly boys – by Spanish men), the Stuff of Life (about products such as potatoes, tomatoes, tobacco and so forth), Diplomacy, and Spectacle and Curiosity (about Indigenous people as entertainment).
The focus is on Meso- and South America and the 16th century, when most of the interactions were with Spain and Portugal. There is some mention of the colonisation of North America, which was more related to Britain, and of Brazil, which was of interest to the French.
I think the thing that struck me most was the number of Indigenous people in Europe, particularly in Spain, from the very start of the 16th century. I had been aware from reading the history of various scientific expeditions that one or two Indigenous people were often brought back to show off in court. But On Savage Shores highlights that in fact thousands of people were brought, often crossing the Atlantic several times over a period of years. Many were brought explicitly as slaves but others came as diplomats, translators, companions although it is unclear in many cases how voluntary their travel was.
The second aspect which struck me was how active, and engaged in the Spanish legal systems and the Royal courts, the Indigenous visitors often were. This was in part because Indigenous people were familiar with legal processes in their own countries. Furthermore courts, both legal and Royal, are an excellent source of primary documents; they are one of very few ways the Indigenous people were documented. Documents generated by Indigenous people are rather more sparse – there are a handful of pre-invasion codices, some spoken poetry captured in writing at a later date and legal-like documents created to support land claims and the like in Spanish courts. Many of the European records are of those seeking emancipation, quite often successfully.
Columbus very clearly went to the New World with a view to capturing slaves – he had visited the Portuguese slaving fortress, Castle of Sao Jorge da Mina in Ghana prior to his visit and was evaluating the Indigenous people and their suitability for slavery from his first visits to the Americas. To the end of the 16th century something between 1 and 2 million Indigenous people were taken as slaves with most remaining in the Americas but some being brought back to Spain. In the same period about 300,000 Africans were enslaved and taken to the Americas. In theory Spain banned slavery in the mid-16th century, however it wrote itself a number of exceptions which meant the practice was to effectively continue in large volumes for many years.
As well as slaves the Europeans took people they saw as suitable as translators. They also took the children of important Indigenous leaders and some that acted as diplomats – taking their cases to the Spanish Court. For these people the level of coercion is difficult to ascertain. There were certainly a number who came to Europe voluntarily but others had little choice.
A recurring theme is the adoption of sons into the families of, for example, Walter Raleigh, Christopher Columbus, and Hernando Cortés. This practice seems to have some basis in Indigenous practices and the adopted sons often gained relatively high social positions back in Europe. Similarly there is a Brazilian boy, Essomericq, taken at age 15 by the French in 1504, who became a pillar of the community in France before dying at the age of 90 – although his story is somewhat in question, having been recorded sometime after he died by a descendant with a point to prove.
There was a huge population collapse across the Americas due to European diseases in the fifty years after Columbus landed; the diseases travelled faster than the European invaders. The movements of Indigenous people need to be seen in this context. First of all the trans-Atlantic passage was a long grim voyage for all – taking in excess of 6 weeks in the 16th century. Added to this Indigenous people were vulnerable to European diseases, and frequently died in transit or within a few weeks of arriving in Europe. All Europe got in exchange was syphilis. Some of the Indigenous people travelling to Europe were looking for advantage from Spanish support back in their home countries which were in turmoil.
On Savage Shores was revelatory for me, it changed the way I thought about Indigenous people and, to a smaller degree, the Spanish invaders. The switch in viewpoint makes Indigenous people just that – people – rather than exhibits. On Savage Shores is also an enjoyable read, and the focus on one period and one region probably keeps it to a manageable length. It feels like there is scope for a second book focussed on North America.