Tag: language

Book review: Writing and Script – A Very Short Introduction by Andrew Robinson

A very short book for this review: Writing and Script – A Very Short Introduction by Andrew Robinson. It fits in with my previous review of Kingdom of Characters by Jing Tsu. In some ways the “very short” format stymies my reviewing process, which involves writing notes on a longer book!

Robinson makes a distinction between proto-writing and full writing. Proto-writing, isolated symbols which clearly meant something, dates back to 20,000BC, whilst the first full writing, defined as “a system of graphic symbols which can be used to convey any and all thought”, dates back to some time around 3300BC in Mesopotamia and Egypt. It first appeared in India in 2500BC, Crete (Europe) in 1750BC, China in 1200BC and Meso-America in 900BC. In common with humanity itself there is a lively single origin / multi-origin debate: did writing arise in one place and then travel around the world, or arise separately in different places?

When I am reading books relating to history I am very keen to pin down “firsts” and “dates”; I suspect this is not a good obsession. As for mathematics, the earliest full writing was used for accountancy and bureaucracy!

An innovation in writing is the rebus principle, which allows a word to be written as a series of symbols representing sounds, whilst those symbols by themselves might convey a different meaning.

I was excited to learn a new word: boustrophedon – which means writing which goes from left to right and then right to left on alternate lines – it is from the Greek for “like the ox turns”. Writing in early scripts was often in both left-right and right-left form and only stabilised to one form (typically left-right) after a period of time. I have been learning Arabic recently (which is read right-left) and was surprised how easy this switch was from my usual left-right reading in English.

Another revelation for me is that a written script is not necessarily a guide to pronunciation. In English it broadly is, and some languages do a better job of describing pronunciation in the written form, but in other languages, like Chinese, the core script is largely about transmitting ideas. Arabic holds an intermediate position – accents were added to an alphabet comprised of consonants to provide vowels and thus clarify pronunciation.

As well as the appearance of scripts, Robinson also discusses their disappearance, which happens mainly for political reasons. For example, Egyptian Hieroglyphics fell into decline through invasions by the Greeks and then Romans, who used different scripts. Cuneiform was in use for 3000 years before dying out in around 75AD.

Deciphering scripts gets a chapter of its own, which classifies decipherment efforts in terms of whether the script was known or unknown and whether the language it represented was known or unknown. Given a sample of a script the first task is to determine the writing direction, and the number of distinct elements. This second task can be challenging given the variations in individual writing styles and, for example, the use of capitalisation. The next step is to identify the type of script (an alphabet – standing for vowels and consonants, a syllabary – standing for whole syllables, or logograms – standing for whole words) on the basis of the character count and other clues. The final step, of decipherment, requires something like the Rosetta stone – the same text written in multiple languages where at least one is known. Names of people and places are often key here. A broad knowledge of languages living and dead is also a help.

The chapter on How writing systems work expands further on the alphabet/syllabary/logogram classification, with a separate chapter on alphabets – I particularly liked the alphabet family tree. Greek is considered the first alphabet which included both consonants and vowels; earlier systems were syllabaries or just contained consonants.

Japanese and Chinese writing systems are covered in a separate chapter. I don’t think I had fully absorbed that Chinese characters were a writing system equivalent to the Latin alphabet, and so can express multiple languages. The focus of Kingdom of Characters on Chinese elided the fact that Japanese has troubles of its own, particularly in the number of homophones. Japanese speakers sometimes sketch disambiguating characters with their hands to clarify their meaning.

The book finishes with an obligatory Writing goes electronic chapter which highlights that text speak (e.g. m8 for mate) is an example of the rebus principle in action. Robinson also highlights that electronic publishing has not ended or diminished the importance of the physically printed language; the opposite is true in fact.

This book packs a lot into a short space, and it provides the reader with interesting new facts to share (I liked boustrophedon). It would not be a substantial holiday read but it is a great introduction to the field.

Book Review: Kingdom of Characters by Jing Tsu

Kingdom of Characters: A Tale of Language, Obsession, and Genius in Modern China by Jing Tsu describes the evolution of technology to handle the Chinese language from the start of the 20th century until pretty much today (2022). As such it blends technology, linguistics and politics.

The "issue" with Chinese as a language is that it is written as characters with each character representing a whole word, in contrast to English and similar languages which build words from a relatively small number of alphabetic characters. The Chinese language uses thousands of characters regularly, and including rarer forms brings the number to tens of thousands. This means that technologies to input, print, transmit, and index Chinese language material must be changed fairly radically from the variants used to handle alphabetic languages.

The written Chinese language has been around in fairly similar form for getting on for 3000 years, and it was in China that printing was invented in around 600AD – several hundred years before it was invented in Western Europe by Gutenberg. "Penmanship" – how someone writes characters – is still seen as an important personal skill, in a way that handwriting in English is not.

Aside from the linguistic and technological aspects of the process, politics plays an important part.

Kingdom of Characters covers the modernisation of the Chinese language and its use in new technology in seven chapters, in chronological order. Each chapter focuses on one or two individuals, and some attempt is made to fill out their backgrounds. The first chapter covers the standardisation of the written language to Mandarin, which culminated in the 1913 conference of the Commission on the Unification of Pronunciation.

The next step in the modernisation of Chinese was the invention of the Chinese character typewriter, developed by Zhou Houkun and Shu Zhendong and commercialised by the Commercial Press from 1926.

I found the telegraphy chapter quite telling, not through its solution but as a demonstration of what happened when China was not at the table when systems were designed – they were condemned to use a numerical code system which was more expensive than sending alphabetic letters. Interestingly the global telegraphy system seemed to spend a great deal of time trying to stop people sending encoded messages because they saw it as "fare dodging", and Chinese was caught up in this effort. Numbers were more expensive to send than letters, but representing whole words with numbers was seen as encoding.

Cataloguing gets a chapter of its own. The chapter covers the period from the late 1920s until the 1950s but it feels like a continuation of other discussions on how to break the tens of thousands of characters down into a smaller set of ordered elements in a consistent and memorable fashion. There is a precedent for this: Chinese characters are written in a standard order, stroke by stroke, and there has existed for a long time the idea of "radicals", a small set of foundational strokes. It means that the challenge is two-fold: technical but also linguistic.

In a reprise of the standardisation discussion the fifties saw the simplification of Chinese characters, followed by the introduction of Pinyin – a phonetic system for Chinese. This replaced the Wade-Giles phoneticisation, developed by two Westerners. Growing up in the seventies I first learned that Peking (under the Wade-Giles) was the capital of China, for it to be replaced by Beijing (the Pinyin) in the eighties. The new system also included Chinese tones which don’t have an equivalent in English or other Western languages.

The chapter entitled "Entering into the computer (1979)" is largely about using computers to do photo-typesetting to print Chinese. I suspect the Chinese invention of vector-based character representations may have leapfrogged Western technology. This work was born during the Cultural Revolution which from 1966-76 impacted technological progress rather seriously. I recall in the late eighties a Chinese academic who was visiting the research group where I did my final year undergraduate project, he had worked in the fields during the Cultural Revolution – not voluntarily, and he had a better time of it than many.

The final chapter is on the burgeoning Chinese internet, with a proliferation of input methods and an audience several times larger than the US audience. It starts, though, with the introduction of Unicode in 1988, and the standing group tasked with the addition of new Chinese characters to the standard from ever more esoteric literary sources.

The broad political context of the work is the decline of China in the 19th century under the Qing Dynasty – forced to open up to foreign influences by the Opium Wars. Towards the end of this time the Chinese language, tied to the ruling dynasty was seen as part of the problem – holding China back from becoming a modern nation. In the 20th century 1912 saw the formation of the republican, Nationalist government although it was in regular conflict with the communists, and then the Japanese in the Second Sino-Japanese War which ended with the defeat of the Japanese in the Second World War. The People’s Republic of China was founded in 1949 with a renewed interest in preserving the Chinese language, but with the interests of the worker at its heart – under the Qing Dynasty literacy, and the use of the written language, was a preserve of the ruling class.

Kingdom of Characters is pretty readable, and will appeal to those interested in radically different writing systems (when compared to alphabetic languages).

Book review: Nabokov’s Favourite Word is Mauve by Ben Blatt

Nabokov’s Favourite Word is Mauve by Ben Blatt is an exploration of language through numbers. To set the scene Blatt discusses the attribution of The Federalist Papers – a set of essays written, anonymously, by one or more of Alexander Hamilton, James Madison and John Jay in support of ratification of the new US constitution. The problem was solved in 1963 by Frederick Mosteller and David Wallace, who looked at the frequency of different words in the essays and how they compared to the frequencies of words in writings known to be by the three authors. They found that Madison had written all of the disputed essays. An example of their approach: Madison used the word “whilst” in many of his known works but never the word “while”. Hamilton, on the other hand, never used the word “whilst”. Combining the frequencies of a number of such words provides a fingerprint for the writing style of an author. What struck me was that the “fingerprint” words are not at all exceptional.

In the sixties this type of frequency analysis was exceedingly tedious – Mosteller and Wallace physically cut up the essays and made little piles of words in order to count them! This type of heroic manual analysis was not uncommon across many quantitative sciences prior to the widespread availability of computers. These days such analyses are straightforward. The full texts of many books are freely downloadable, and there are programming libraries such as the Natural Language Toolkit (NLTK) in Python which provide functions for word counting and other more sophisticated analyses.
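To give a flavour of how little code this kind of counting now takes, here is a minimal sketch using only the Python standard library rather than NLTK. The function name `rate_per_thousand` and the sample sentences are my own inventions for illustration (they are not from The Federalist Papers), and this is only the counting step, not Mosteller and Wallace's full statistical method.

```python
import re
from collections import Counter


def rate_per_thousand(text, markers):
    """Return how often each marker word occurs per 1000 words of text."""
    # Lower-case the text and pull out runs of letters (and apostrophes) as words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words)
    # Guard against empty input so we never divide by zero.
    return {m: (1000 * counts[m] / total) if total else 0.0 for m in markers}


# Invented sample sentences, purely for illustration.
sample_a = "Whilst the states deliberate, the union waits whilst time passes."
sample_b = "While the states deliberate, the union waits."

for name, sample in [("sample_a", sample_a), ("sample_b", sample_b)]:
    print(name, rate_per_thousand(sample, ["while", "whilst"]))
```

Comparing the rates for a handful of such marker words across texts of known and unknown authorship is the "fingerprint" idea described above. A real analysis would need far longer texts than these two sentences.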

Blatt takes the opportunity to extend word counting analysis to more topics and a much extended collection of texts. These include best selling novels, fan fiction, classic fiction and US and UK English corpora (large bodies of expertly selected text). The books are all in English but with some foreign translations, and they are biased to the US market.

The topics covered include: the overuse of adverbs, particularly those ending -ly; he vs she – how male authors sometimes write almost entirely without mentioning “she” whilst the most extreme female authors still write about 20% “he”; differences between US and UK writers – it comes down to blokes, blimey and brilliant; and how the reading age of popular fiction has dropped over the years. Here there is a diversion into Dr Seuss’ Cat in the Hat and its 220 word vocabulary, given to Dr Seuss by Rudolf Flesch, who was interested in readability. In fact I’ve recently used the Flesch-Kincaid readability index which he helped develop.
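For the curious, the Flesch-Kincaid grade level is a simple formula over words per sentence and syllables per word. The sketch below is mine, not Blatt's: the function names are hypothetical, and the syllable counter is a crude assumption (it just counts groups of vowels), so the scores are only rough.

```python
import re


def count_syllables(word):
    """Crude syllable estimate: count runs of vowels (an assumption, not a dictionary lookup)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level: 0.39 * words/sentences + 11.8 * syllables/words - 15.59."""
    # Split on sentence-ending punctuation and discard empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


simple = "The cat sat. The dog ran."
dense = "Comprehensive evaluation necessitates considerable deliberation."
print(flesch_kincaid_grade(simple), flesch_kincaid_grade(dense))
```

Short sentences of one-syllable words score very low (even below zero), whilst long words in long sentences push the grade up, which is the effect Blatt tracks across decades of popular fiction.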

The title of the book comes from an analysis of favourite words of authors, those words which they use significantly more frequently than others. Nabokov is an interesting case since he uses all words about colour significantly more frequently than other authors. This is likely linked to his synaesthesia, of which he has written. Ray Bradbury, on the other hand, is a fan of “cinnamon”, whilst Michael Connelly likes “nodding” and its variants. The chapter on favourite words also covers repeated words and clichés. Blatt is not judgemental about these habits; sometimes they have a dramatic effect.

As almost an aside Blatt reveals some of the commercial side of the publishing industry. I was struck by the “Big Name Author with …” phenomenon where a big name author such as James Patterson or Tom Clancy publishes with a lesser known or unknown author. Analysis along the lines of Mosteller and Wallace shows that these co-authors write the books with the Big Name providing story outlines (Patterson is straightforward that this is the case). Another example is the Stratemeyer Syndicate, who published The Hardy Boys and Nancy Drew series which I recall from my childhood in the seventies. These books purport to have a named author but actually the author is a fiction and the books are written to a formula by a variety of writers (spread over more years than a living author might achieve). Finally, there is the phenomenon of the gigantic author credit on the front cover – Stephen King is an example of this: his name covered only 3% of the front cover of his first book, but towards the end of the eighties it approached 40%!

The book finishes with an analysis of first and last lines.

The emphasis of the book is very much on the numbers with fairly cursory examination of the reasons for the numbers found, that said the book is an easy and thought provoking read.

Book Review: The Etymologicon by Mark Forsyth

“The Etymologicon” by Mark Forsyth is a book of the origin of words, of their etymology. It’s based on the author’s blog, the Inky Fool, and it reads very much as a sequence of blog posts strung together. This isn’t necessarily bad but does sometimes make it feel like an unrelenting, whirlwind tour of the origins of English words.

Although English was never my strongest subject at school this combination of history and language has always fascinated me. I thought I’d pluck out and summarise a few little gems that caught my eye:

Romany people have received a range of names, based on mistaken assumptions about their origins. The term “gypsy” arises from those that thought they came from Egypt. Most bizarrely, the Spanish at one point seemed to believe they came from Flemish Belgium, hence the word “Flamenco”. The Roma ultimately come from India, their language having its roots in Hindi and Sanskrit.

Wamblecropt, meaning “afflicted by nausea” appeared in Cawdrey’s Table Alphabetical of 1604 which Forsyth cites as the first dictionary not directed to the aid of translators. He also highlights that the fame of Dr Johnson’s dictionary is not in its novelty as a type of book but in Johnson’s fame as a learned man. I feel there is a need to randomly reintroduce such words to the language, to see if their time has returned. “I am often wamblecropt on the train into work”.

I’d always assumed that Nazi was a contraction of Nationalsozialistische Deutsche Arbeiterpartei, which it is, but it was also a pre-existing term of abuse relating to Bavarian peasants who were the butt of jokes in Germany in the early 20th century. Nazi is a contraction of Ignatius, a common Bavarian name.

“Terrorism”, it seems was coined in English after the French Revolution to describe a system of government based on terror.

Rather romantically, the ring finger is so-named because early medicine held that the ring finger was directly connected to the heart and could be treated as a proxy for the treatment of heart problems, and so when we marry we put a ring on our ring finger. My wedding ring is the only jewellery I wear. Somewhat insensitively, when choosing a ring for Mrs SomeBeans at which point the issue of a ring for me was first raised, I exclaimed that I wasn’t particularly interested if it cost as much as hers did. I relented fairly soon afterwards, having saved on an engagement ring which Mrs SomeBeans wouldn’t have been able to wear as at the time she worked in the food industry.

You’ll be pleased to know that there was a “gorm” to go with “gormless”: gorm was a 12th century Scandinavian word meaning sense or understanding. Similarly, there were once also “feck” and “reck”, now only found in “feckless” and “reckless”. Happily there was also a “gruntle”, which is now only found in “disgruntled”. To gruntle is to grunt often, as pigs might do; in this instance the dis- prefix is an intensifier.

Obviously I could go on, but it would be repetitive.

It’s difficult to know with a book like this the level of referencing which is desirable, it is light on references but the author acknowledges this at the end of the book, providing a brief bibliography and some more detailed references as an example.

Books similar to this include Lynne Truss’s “Eats, shoots and leaves” and David Crystal’s “The Cambridge Encyclopedia of The English Language”. The most useful part of my library membership is online access to the Oxford English Dictionary, which is also a goldmine for etymologists.

All in all an entertaining read, and compatible with the stresses of new parenthood.