Nov 17 2023
Whilst economising during a period without work I thought I would turn to other books in the house to read and review. This is how I came to How the World Thinks: A Global History of Philosophy by Julian Baggini. This is not to say I am uninterested in philosophy but, as a scientist in the Western tradition, philosophy was a substrate on which I worked without thinking.
How the World Thinks aims to provide an outline of the major schools of philosophy around the world. Baggini alludes to the fact that in the Western world university philosophy departments are more accurately described as “Western philosophy” departments; comparative philosophy, apparently, is not really a thing. Baggini also talks about how “academic” philosophy impacts the culture in which it sits – a process called sedimentation. He cites the 5th-3rd centuries BCE, known as the Axial Age, as the period when the major philosophical traditions were born and understanding of the world started moving from myth to some sort of reason.
How the World Thinks is divided into four parts plus a concluding part; these cover the nature of philosophy in different traditions, the nature of the world, who we are, and how philosophy impacts the way we live. The text typically covers the Far Eastern (Chinese and Japanese), Indian, Islamic and Western traditions, with some references to African philosophy. Rather strangely, he mentions Russian philosophy in the final part only to say that he hasn’t really discussed it!
Western philosophy is built around “reason” and nowadays is largely separate from theology; there are empiricist and rationalist schools within it. Empiricists believe in observing the world and building models based on observation, whilst rationalists believe the world can be understood by pure thought. East Asian philosophy is more concerned with a “way” of living in the world which is difficult, if not impossible, to explain in words. Indian philosophy lies between these two. Interestingly, yoga is part of a philosophical tradition which sees it as a way of better seeing how the world really is.
The next part of the book concerns the processes that govern the world: time, karma, emptiness, naturalism, unity, and reductionism. Karma is a particularly Indian concept, and is linked by Baggini to the caste system which DNA evidence dates back to the 6th century AD. East Asian philosophy is more concerned with emptiness / nothingness than Western philosophy – it struck me, reading The First Astronomers, that Australian Aboriginal constellations incorporate the absence of stars as well as their presence. Naturalism, a regard for nature which links the natural world to the human, is stronger in East Asian philosophies – Chinese art incorporated natural scenes long before Western art. Islamic philosophy is strong on unity, whilst Western philosophy likes reductionism.
Part 3 concerns the self, contrasting the East Asian view of the self, which is defined in relationship to others (similarly in Africa), with the indivisible, individualistic self of the West. There is even the idea that the self does not exist as such. Baggini refers to the indivisible self as “atomistic”, which harks back to the ancient Greek definition, but for a modern scientist this is a bit confusing because an atom is now a very different thing. Indian philosophy thinks in terms of a self that is reborn but need not hold any recollection of previous selves. Perhaps not made explicit in this part, but one gets the feeling that these other philosophies are strongly concerned with individual self-improvement: by acting in the right way and leading the right life one improves through each rebirth.
The final part of the book concerns how the world lives – how the philosophy discussed in earlier chapters is reflected in culture. This starts with a consideration of the idea of “harmony” in China, which can have elements of hierarchy and misogyny, although Baggini highlights that hierarchy is not understood to be bad in all cases, or even most. There is a chapter on “virtue” which, as much as anything, highlights how the meanings of words can shift in translation. We might think about the importance of “ritual” in Far Eastern cultures, but equally we could call it “cultural grammar”, which has different connotations in English.
I found How the World Thinks straightforward enough to read: the chapters are a convenient size and the style is readable. It is also thought-provoking, in that it challenges the deepest assumptions about the way I lead my intellectual life – in some ways it parallels The First Astronomers by Duane Hamacher in this respect.
Oct 24 2023
In an earlier blog post I explained the motivation for a series of “Rosetta Stone” posts which described the ecosystem for different programming languages. This post covers TypeScript, the associated GitHub repository is here. This blog post aims to provide some discussion around technology choices whilst the GitHub repository provides details of what commands to execute and what files to create.
I was curious to try this exercise on a language, TypeScript, which I had not previously used or even considered. It so happens that TypeScript came up in a recent discussion, so I thought I would look at it.
So for this post in particular, factual errors are down to ignorance on my part! I welcome corrections.
How is the language defined?
The homepage for TypeScript is https://www.typescriptlang.org/. There appears to be no formal, up-to-date language specification. The roadmap for the language is published by Microsoft, and it develops through a process of Design Proposals. TypeScript releases a new minor version every three months, and once 10 minor versions have been released the major version is incremented. There is some debate about this strategy since it follows neither conventional semantic nor date-based versioning.
TypeScript compilers and runtimes
The TypeScript compiler, tsc, is distributed as an npm package and runs on node.js, so the first step is to install node.js. The install of node.js appears trivial, but on Windows machines there is a lengthy install of the Visual Studio build tools, the chocolatey package manager and Python after node.js itself has been installed!
Once node.js is installed, installing TypeScript is a simple package installation; it can be installed globally or at a project level. Typically, getting started guides assume a global install, which simplifies paths.
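As a sketch, assuming node.js and npm are already on the path, a global install is a single command, after which `tsc --version` should confirm the compiler is available:

npm install -g typescript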
tsc is configured with a `tsconfig.json` file – a template can be generated by running `tsc --init`.
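The generated template contains a large number of commented-out options; a minimal sketch of the sort of configuration it might be trimmed down to (the specific options here are illustrative, not a recommendation) is:

```json
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",
    "rootDir": "./src",
    "outDir": "./dist",
    "strict": true,
    "esModuleInterop": true
  },
  "include": ["src"]
}
```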
Details of installation can be found here in the accompanying GitHub repository.
Type declarations for JavaScript libraries are published as separate `@types` packages; for example, the node.js declarations are installed as a development dependency with:

npm install @types/node --save-dev
Local packages can be installed for development, as described here.
npm has neat functionality whereby scripts for executing the project tests, linting, formatting and whatever else you want can be specified in the `package.json` file.
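As a sketch, assuming Jest, ESLint and Prettier have been installed as development dependencies, the `scripts` section of `package.json` might look like the following; each entry is then run with `npm run test`, `npm run lint` and so on:

```json
{
  "scripts": {
    "build": "tsc",
    "test": "jest",
    "lint": "eslint . --ext .ts",
    "format": "prettier --write ."
  }
}
```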
Python has long had the concept of a virtual environment, where the Python interpreter and installed packages can be specified at a project level to simplify dependency management. npm essentially has the same functionality through the use of saved dependencies which are installed into a `node_modules` folder. The node.js version can be specified in the `package.json` file, completing the isolation from the global installation.
Project layout for package publication
There is no formally recommended project structure for TypeScript/npm packages. However, Microsoft has published a Node starter project which we must assume reflects best practice. An npm project contains a `package.json` file at the root of the project and puts locally installed, project-level packages into a `node_modules` directory.
Based on the node starter project, a reasonable project structure would contain the following folders, with configuration files in the project root directory:
- docs – contains output from documentation generation packages;
- node_modules – created by npm, contains copies of the packages used by the project;
- src – contains TypeScript source files;
- tests – contains TypeScript test files.
How this works in practice is shown in the accompanying GitHub repository. TypeScript is often used for writing web applications in which case there would be separate directories for web assets like HTML, CSS and images.
Jest is installed with npm alongside ts-jest and the Jest TypeScript types, and configured using a `jest.config.json` file in the root of the project. In its simplest form this configuration file provides a selector for finding tests, and a transform rule to apply ts-jest to TypeScript files before execution. Details can be found in the accompanying GitHub repository.
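A minimal sketch of such a configuration (the test file pattern is an assumption that matches the `tests` folder above) is:

```json
{
  "testMatch": ["**/tests/**/*.test.ts"],
  "transform": {
    "^.+\\.ts$": "ts-jest"
  }
}
```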
Static analysis and formatting tools
ESLint is the usual choice of linter for TypeScript, and there is an ESLint extension for Microsoft Visual Code. Prettier is the usual choice of formatter, and there is a Prettier extension for Visual Code too.
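As a sketch, installing them as development dependencies – along with the typescript-eslint packages that give ESLint an understanding of TypeScript syntax – might look like:

npm install --save-dev eslint prettier @typescript-eslint/parser @typescript-eslint/eslint-plugin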
I was struck by the similarities between Python and TypeScript tooling, particularly around configuring a project. The npm package.json configuration file is very similar in scope to the Python pyproject.toml file. Npm has the neat additional features of adding packages to package.json when they are installed and generating the equivalent of the requirements.txt file automatically. It also allows the user to specify a set of “scripts” for running tests, linting and so forth – in Python I typically use a separate tool, `make`, to do this.
I welcome comments, probably best on Mastodon where you can find me here.
Oct 24 2023
In an earlier blog post I explained the motivation for a series of “Rosetta Stone” posts which described the ecosystem for different programming languages. This post covers Python, the associated GitHub repository is here. This blog post aims to provide some discussion around technology choices whilst the GitHub repository provides details of what commands to execute and what files to create.
For Python my knowledge of the ecosystem is based on considerable experience as a data scientist working predominantly in Python, rather than as a full-time software developer or computer scientist, although much of what I learned about the Python ecosystem came from working on a data mesh project as, effectively, a developer.
Python is a dynamically typed language, invented by Guido van Rossum with the first version released in 1991. It was intended as a scripting language which fell between shell scripting and compiled languages like C. As of 2023 it is the most popular language in the TIOBE index, and also on GitHub.
How is Python defined?
The home for Python is https://www.python.org/ where it is managed by the Python Software Foundation. The language is defined in the Language Reference, although this is not a formal definition. Python has a regular release schedule, with a new version appearing every year or so, and a well-defined life cycle process. As of writing (October 2023) Python 3.12 has just been released. In the past the great change was from Python 2 to Python 3, which was released in December 2008 and introduced breaking changes. The evolution of the language happens through PEPs (Python Enhancement Proposals) – PEP documents are an excellent resource for understanding new features.
The predominant Python interpreter is CPython, which is what you get if you download Python from the official website. Personally, I have tended to use the Anaconda distribution of Python for local development. I started doing this 10 years or so ago when installing some libraries on Windows machines was a bit tricky and Anaconda made it possible, or at least easy. It also has nice support for virtual environments – in particular it allows the Python version for the virtual environment to be defined. However, I keep thinking I should review this decision since Anaconda includes a lot of things I don’t use, they recently changed their licensing model, which makes it more difficult to use in organisations, and the issues with installing libraries are much reduced these days.
CPython is not the only game in town though: there is Jython, which compiles Python to Java bytecode, IronPython, which compiles it to the .NET intermediate language, and PyPy, which is written in (a restricted subset of) Python and includes a just-in-time compiler. These alternatives generally have the air of being interesting technical demonstrations rather than fully viable alternatives to CPython.
Typically I invoke Python scripts using a command line in Git Bash like (the script name is illustrative):

./script.py

This works because I start all of my Python scripts with a shebang line:

#!/usr/bin/env python

More generally Python scripts are invoked like:

python script.py
Python has always come with a pretty extensive built-in library – “batteries included” is how it is often described. I am a data scientist, and rather harshly I often judge programming languages as to whether they include a built-in library for reading and writing CSV files (Python does)!
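As a small illustration, reading a CSV file needs nothing beyond the standard library (the file name here is hypothetical):

```python
# Minimal sketch using the built-in csv module; "data.csv" is an illustrative file name.
import csv

with open("data.csv", newline="") as f:
    for row in csv.DictReader(f):
        print(row)
```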
The most common method for managing third-party libraries is the `pip` package manager. By default this installs packages from the Python Package Index (PyPI) repository. The Anaconda distribution includes the `conda` package manager, which I have occasionally used to install tricky packages, and there are the `pipenv` and `poetry` tools, which handle virtual environments as well as dependencies.
With pip, installing a package is done using a command like:
pip install scikit-learn
If required, a specific version can be specified (e.g. `pip install scikit-learn==1.3.2`), or a minimum version (`pip install "scikit-learn>=1.3"`). A list of dependencies can be installed from a plain text file:
pip install -r requirements.txt
The dependencies of a project are defined in the `pyproject.toml` file which configures the project. These are often described as being abstract – i.e. they indicate which packages are required, with perhaps version limits if the host project requires functionality only available after a certain version. The `requirements.txt` file is often found in projects; this should be a concrete specification of the package versions on the developer's machine. It is the “Works for me(TM)” file. I must admit I only understood this distinction after looking at the node.js package manager, npm, where the `pyproject.toml` equivalent is updated when a new package is installed, and the `requirements.txt` equivalent – `package-lock.json` – is updated with the exact version of each package actually installed.
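A sketch of how abstract dependencies appear in `pyproject.toml` (the project name and version bounds here are illustrative):

```toml
[project]
name = "example-package"
version = "0.1.0"
dependencies = [
    "scikit-learn>=1.3",
    "requests",
]
```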
In Python local code can be installed as a package like:
pip install -e .
This so-called “editable” installation means that a package can be used elsewhere on the same machine whilst keeping up to date with the latest changes to the code.
Python has long supported the idea of a “virtual environment” – a project-level installation of Python which largely isolates it from other projects on the same machine by installing packages locally.
This very nearly became mandatory (see PEP-0704) – however, virtual environments don’t work very well for certain use cases (for example continuous development pipelines), and it turns out that `pip` sits outside the PEP process so the PEP had no authority to mandate a change in `pip`!
The recommended approach to creating virtual environments is the built-in `venv` library. I use the Anaconda package manager since it allows the base version of Python to be varied on a project by project basis (or even allowing multiple versions for the same project). virtualenv, pipenv and poetry are alternatives.
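As a sketch, creating and activating a virtual environment with the built-in `venv` library from Git Bash on Windows looks something like the following (on Linux or macOS the activation script lives in `.venv/bin` rather than `.venv/Scripts`):

python -m venv .venv
source .venv/Scripts/activate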
IDEs like Visual Code allow the developer to select which virtual environment a project runs in.
Project layout for package publication
Tied in with the installation of packages is the creation and publication of packages. This is quite a complex topic, and I wrote a whole blog post on it. Essentially Python is moving to a package publication strategy which stores all configuration in a `pyproject.toml` file (toml is a simple configuration file format) rather than in an executable Python file (typically called setup.py). This position has evolved over a number of years, and the current state is still propagating through the ecosystem. An example layout is shown below.
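(This is a sketch of one common arrangement, using the `src` layout; the package and module names are illustrative.)

```
example-package/
    pyproject.toml
    setup.py
    README.md
    src/
        example_package/
            __init__.py
            functions.py
    tests/
        test_functions.py
```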
setup.py is a legacy from former package structuring standards. The __init__.py files are an indication to Python that a directory contains package code.
Python has long included the `unittest` package as a built-in package – it was inspired by the venerable JUnit test library for Java. `pytest` is an alternative I have started using recently which has better support for reusable fixtures and a simpler, implicit syntax (which personally I don’t like). Readers will note that I have a tendency to use built-in packages wherever possible; this is largely to limit the process of picking the best of a range of options, and to hedge against a package falling into disrepair. Typically I use Visual Code to run tests, which has satisfying green tick marks for passing tests and uncomfortable red crosses for failing tests.
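A minimal sketch of a pytest-style test – the function under test is purely illustrative, and pytest discovers files and functions whose names start with `test`:

```python
# tests/test_functions.py (illustrative): pytest collects functions prefixed with "test_".
def add(a, b):
    return a + b


def test_add():
    assert add(2, 3) == 5
```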
Integrated Development Environments
The choice of Integrated Development Environment is a personal one; Python is sufficiently straightforward that it is easy to use a text editor and command line to complete development-related tasks. I use Microsoft Visual Code, having moved from the simpler Sublime Text. Popular alternatives are the PyCharm IDE from JetBrains and the Spyder editor. There is even a built-in IDE called IDLE. The Jupyter Notebook is used quite widely, particularly amongst data scientists (personally I hate the notebook paradigm, having worked with it extensively in Matlab), but this is more suited to exploratory data analysis and visualisation than code development. I use IPython, a simple REPL, a little to confirm syntax.
Static Analysis and Formatting Tools
I group static analysis and formatting tools together because for Python static analysers tend to creep into formatting. I have started using static analysis tools and a formatter since using Visual Code, whose Python support builds them in, and since using development pipelines when working with others. For static analysis I use a combination of pyflakes and pylint, which are pretty standard choices, and for formatting I use black.
For Python the common standard for formatting is PEP-8, which describes the style used in the Python standard library (the interpreter’s C codebase has its own style guide, PEP-7).
I use sphinx for generating documentation; the process is described in detail in this blog post. There is a built-in library, pydoc, which I didn’t realise existed! Doxygen, the de facto standard for C++ documentation generation, will also work with Python.
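As a sketch, the docstrings these tools work from look like this (the function is illustrative); `python -m pydoc <module>` will render them at the command line, and sphinx (via its autodoc extension) can pull them into generated documentation:

```python
def mean(values):
    """Return the arithmetic mean of a sequence of numbers.

    :param values: a non-empty sequence of numbers
    :return: the mean as a float
    """
    return sum(values) / len(values)
```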
In writing this blog post I discovered a couple of built-in libraries that I was not currently using (pydoc and venv). In searching for alternatives I also saw that, over a period of a few years, packages go in and out of favour, or at least out of support.
I welcome comments, probably best on Mastodon where you can find me here.
Oct 24 2023
The Rosetta Stone is a stone slab dating to 196 BC on which the same decree is written in two languages and three scripts – Egyptian hieroglyphic, Demotic and ancient Greek; it was key to deciphering ancient Egyptian in the modern era.
It strikes me that learning a new programming language is not really an exercise in learning the syntax of a new language, for vast swathes of languages those things are very similar. For an experienced programmer the learning is in the ecosystem. What we need is a Rosetta Stone for software development in different languages that tells us which tools to use for different languages, or at least gives us a respectable starting point.
To my mind the ecosystem specific to a programming language includes the language specification and evolution process, compiler/interpreter options, package/dependency management, virtual environments, project layout, testing, static analysis and formatting tools, and documentation generation. Package management, virtual environments and project layout are inter-related, certainly in Python (my primary programming language).
In researching these tools I was curious about their history. Compilers have been around since nearly the beginning of electronic computing in the late forties and early fifties. Modern testing frameworks generally derive from Smalltalk’s SUnit, published in 1989. Testing clearly went on prior to this – I am sure it is referenced in The Mythical Man-Month, and Grace Hopper is cited for her work in testing components and computers.
I tend to see package/dependency management as the system by which I install packages from an internet repository such as Python’s PyPI – in which case the first of these was CPAN, for the Perl language, first online in 1995, not long after the birth of the World Wide Web.
Separate linters date back to 1978. Indent was the first formatter, written in 1976. The first documentation generation tools arose towards the end of the eighties (link), with JavaDoc, which I suspect inspired many subsequent implementations, appearing in the mid-nineties.
Tool choices are not as straightforward as they seem; in nearly all cases there are multiple options, as a result of an evolution in the way programming is done more generally, or of developers seeking to improve what they see as pain points in current implementations. Some elements are down to personal choice.
For my first two Rosetta Stone blog posts I look at Python and TypeScript. My aim is that each blog post will discuss the options and a GitHub repository will demonstrate one set of options in action. I am guided by my experience of working in a team on a Python project where we needed to agree a tool set and best practices. The use of development pipelines, which run linters, formatters and tests automatically before code changes are merged, drove a lot of this work. The aim of these blog posts is, therefore, not simply to get an example of a programming language running but to create a project that software developers would be content to work with. The code itself is minimal, although I may add some more involved code in future.
I wrote the TypeScript post in my “initial release” to see how the process would work for a language with which I was not familiar – it helped me understand the Python ecosystem better and gave me “feature envy”!
I found myself referencing numerous separate blog posts in writing these first two posts, which suggests this Rosetta Stone is a worthwhile exercise. I also found my search results were not great, contaminated by a great deal of poorly written, perhaps automatically generated, material.
There are other, generic, parts of the ecosystem – such as the operating system on which the code will run, the source control system and the Integrated Development Environment the developer uses – which I will not generally discuss. I work almost exclusively on Windows but I prefer Git Bash as my shell. I use git with GitHub for source control and Visual Code as my editor/IDE.
When I started this exercise I thought that there might be specific Integrated Development Environments used for specific languages. In the eighties and nineties, when you bought a programming language, the Integrated Development Environment was often part of the deal. This seems not to be the case anymore; most IDEs these days can be extended with plugins specific to a language, so which IDE you start with is immaterial. In any case, any language can be used with a combination of a text editor and command line tools.
I have been programming since I was a child in the early eighties – first in BASIC, then at university in FORTRAN, then in industry in MATLAB before moving to Python. During that time I have also dabbled in C++ and Java, but largely from a theoretical point of view. Although I have been programming for a long time it has generally been in the role of scientist / data scientist producing code for my own use; only in the last few years have I written code intended to be consumed by others.
These are my first two “Rosetta Stone” blog posts:
Oct 13 2023
This review is of Broad Band by Claire L. Evans, subtitled The Untold Story of the Women Who Made the Internet. It is arranged thematically, with each chapter focusing on a couple of women, moving in time from the first chapter, about Ada Lovelace in the 19th century, through to the early years of the 21st century. The first part of the book covers the early period of computing up to the mid-sixties, the second part the growth of networked computing through the seventies and eighties, with the final part covering the rise of the World Wide Web and services devoted to women.
The first chapter introduces us to Ada Lovelace, sometimes heralded as the first programmer, which is a somewhat disputable claim. More importantly, she was clearly a competent mathematician and excelled in democratising and explaining the potential of the mechanical computing engines that Charles Babbage was trying, and largely failing, to build. More broadly this chapter covers the work of the early human “computers”, often women, employed to carry out calculations for astronomical or military applications. Following on from this role, by 1946 250,000 women were working in telephone exchanges (presumably in the US).
Women gained this role as “computers” for a range of reasons. In the 19th century it was seen as acceptable work for educated women whose options were severely limited – as they would be for many years to come, excepting wartime. The lack of alternatives meant they were very cheap to employ. Under the cover of this apparently administrative role of “computer”, women made useful, original contributions to science, albeit not recognised as such. Women were seen as good at this type of meticulous, routine work.
When the first electronic computers were developed in the later years of the Second World War it was unsurprising that women were heavily involved in their operation, partly because of their previous roles and partly because men had been sent to fight. There appears to have been an attitude that the design and construction of such machines was men’s work while their actual use, the physical act of programming, was women’s work – often neglected by the men who built the machines.
It was in this environment that the now renowned Grace Hopper worked. She started writing what we would now describe as compilers to make the task of programming computers easier. She was also instrumental in creating the COBOL programming language, reviled by computer scientists in subsequent years but comprising 80% of the world’s code by the end of the 20th century. The process that Hopper used to create the language – a committee involving multiple companies working towards a common, useful goal – looks surprisingly modern.
In the sixties there was a sea-change for women in computing: it was perceived that there was a shortage of programmers, and the solution was to turn programming into an engineering science, which had the effect of gradually pushing women out of computing through the seventies. It was at this time that the power of computer networks started to be realised.
The next part of the book covers networking via a brief diversion into mapping the Mammoth Cave system in Kentucky, which became the basis of the first network computer game: Colossal Cave Adventure. I was particularly impressed by Project One, a San Francisco commune which housed a mainframe computer (a Scientific Data Systems 940) that had been blagged from a company by Pam Hardt-English. In the early seventies it became the first bulletin board system (BBS) – a type of system which was to persist all the way through to the creation of the World Wide Web (and beyond). Broad Band also covers some of the later bulletin board systems founded by women, which evolved into women’s places on the Web; BBSs were majority-male spaces for a long time. In the meantime Resource One, the computing project housed at Project One, also became the core of the San Francisco Social Services Referral Directory, which persisted through until 2009 – a radical innovation at the time, computers being used for a social purpose outside of scientific or military applications.
The internet as we know it started with ARPANET in 1969. Broad Band covers two women involved in the early internet. The first is Elizabeth (Jake) Feinler, who was responsible for the Resource Handbook – a manually compiled directory of computers, and their handlers, on ARPANET. This evolved, under her guidance, to become the WHOIS service and the host.domain naming convention for internet addresses. The second is Radia Perlman, who invented the Spanning Tree Protocol for ethernet whilst at DEC in 1984.
This brings us, in time, to the beginning of the World Wide Web, which grew out of the internet. Hypertext systems had been mooted since the end of the Second World War but it wasn’t until the eighties that they became technically feasible on widely available hardware. Broad Band cites the British computer scientist Wendy Hall, and Cathy Marshall at Rank Xerox, as contributors to the development of hypertext systems. These were largely swept away by Tim Berners-Lee’s HTML format, which had the key feature of hyperlinking across different computers, even if this made the handling of those links prone to decay – something handled better by other, non-networked hypertext systems. The World Wide Web grew ridiculously quickly in the early nineties: Berners-Lee demonstrated a rather uninspiring version at HyperText ’91, and by HyperText ’94 he was the keynote speaker.
There is a brief chapter devoted to women in gaming. Apparently Barbie Fashion Designer sold 600,000 units in 1996, more than Doom and Quake! There was a brief period when games were made very explicitly for girls – led to a degree by Brenda Laurel, whose extensive research showed that boys strive for mastery in games whilst girls look for a collaborator to complete a task. These ideas held sway for a while before a more diverse gaming market took hold which didn’t divide games so much by gender.
It is tempting for me to say that where women have made their mark in computing and the internet is in forming communities, communicating the benefits of technology and making them easier to use – in a reprise of the early pioneering women in science – because that is what women are good at. However, this is the space in which women have been allowed by men – it is not a question of innate ability alone.
I found this book really interesting; it is more an entry point into the topic of women in computing than a comprehensive history. It has made me nostalgic for my computing experiences of the eighties and nineties, and I have added a biography of Grace Hopper to my reading list.