Friday, July 4, 2014

Programmers also need Moral Philosophy

I stand corrected; programmers need knowledge of moral philosophy too.  I realized this after hearing this BBC story about developers of self-driving cars explicitly asking philosophers for help in formulating which person the car should hit.

When first starting my project to teach other programmers all the practical concepts I was learning from Philosophy, I focused on ontology, the study of describing the world. Philosophy has 2500 years of study of this topic that computer science naively leaves up to intuition. I thought that only the IS side of Hume's IS/OUGHT divide would be relevant to actual programming.  It turns out that real programmers doing real software development need the OUGHT side too.

Hume's IS/OUGHT Divide

The philosopher David Hume distinguished two categories of statements: descriptive statements about "what IS" and prescriptive statements about "what OUGHT to be". He argued that an OUGHT cannot be derived from IS statements alone. Even so, one can't sensibly judge what ought to be without a clear, accurate understanding of what is.

One of the basic tasks of Philosophy is to try to explain and justify one's intuition and gut reactions via a set of explicit logical rules.  The IS side of things worries about the best way to describe and categorize things, and how we can justify that we know what we think we know.  The OUGHT side of things worries about rules guiding "moral" decisions, and which rules apply in which situations, and what are the overriding goals of each rule system. In other words, what is the "right" thing to do.  In both of these categories, it turns out that our intuitions often result in conflicting answers, thus the need to analyze and sort them out (ahem, easier said than done).

The IS statements are the ones that first come to mind when developing a self-driving car. What IS the terrain, the car's speed, the distance to the curb, the position of the car in the next lane two seconds from now? These questions are the kind covered in Artificial Intelligence classes, the ones that must be answered to be able to drive at all. They let a system detect potential collisions and formulate the set of options available to avoid them.

But IS statements don't say which of those options is the "right one", the choice the car OUGHT to make. It is only after you realize that sometimes there is no purely "good" option, no option that leaves everyone unscathed, that you realize you will have to program the car to decide who to hit! How does the poor programmer decide that?! Luckily, some programmers had the wisdom to call a philosopher for help encoding moral rules rather than blindly using their programmer's intuition.

So, if a self-driving car hits someone, who OUGHT to have responsibility? The auto maker? The car owner? The car's software developer who programmed its rules? If a bicycle darts in front of the car, and swerving to avoid an otherwise inevitable collision will itself cause a collision with someone else, who OUGHT to be hit? The "at fault" bike? The more-likely-to-survive but "innocent" car in the next lane? Or should the car override the "never cross the double yellow line" rule and swerve into oncoming traffic (potentially resulting in a chain reaction)?
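To make the gap between IS and OUGHT concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Option` class, the harm numbers, and the two policy functions are hypothetical and are not taken from any real self-driving system. The point is only that the same IS data can lead to different choices depending on which OUGHT rule the programmer encodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One maneuver the car could take (hypothetical model)."""
    name: str
    expected_harm: float    # IS: a predicted quantity, in made-up units
    breaks_hard_rule: bool  # IS: does the maneuver violate a "never" rule?


def minimize_harm(options):
    """OUGHT policy A: pick whatever is predicted to cause the least harm."""
    return min(options, key=lambda o: o.expected_harm)


def rules_first(options):
    """OUGHT policy B: never break a hard rule if a rule-abiding option exists."""
    lawful = [o for o in options if not o.breaks_hard_rule]
    return min(lawful or options, key=lambda o: o.expected_harm)


# Invented numbers for the bicycle scenario; only their ordering matters.
options = [
    Option("brake in lane", expected_harm=0.6, breaks_hard_rule=False),
    Option("swerve into next lane", expected_harm=0.4, breaks_hard_rule=False),
    Option("cross double yellow line", expected_harm=0.3, breaks_hard_rule=True),
]

print(minimize_harm(options).name)  # cross double yellow line
print(rules_first(options).name)    # swerve into next lane
```

Neither function is "the" correct policy. The sensors and predictions supply the `Option` list; choosing between `minimize_harm` and `rules_first` is exactly the decision the IS data cannot make for you.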

"Moral" Philosophy

When looking at the language describing the scenarios above, we see words like "action", "choice", "responsibility", "cause", "result", "fault", "innocent", "never", and "more likely to survive". These lead to classic concepts in moral philosophy like Action and Agency, Causation, Free Will vs Determinism, Moral Responsibility vs Moral Luck, Desert (i.e. who deserves what) and Legal Punishment, which are intertwined in the following way: we expect those making decisions to be morally/legally responsible for the consequences of their actions, assuming that they were able to make a free choice.

But there are debates about whether the ends justify the means (Consequentialism) or whether a bad deed is a bad deed regardless of its outcome (Deontological Ethics). There are also debates about what the overarching goals should be: the most good for the most people (Utilitarianism), priority for the worst off (Prioritarianism), the most freedom (Libertarianism), the most equality (Egalitarianism), etc.

These are just the tip of the iceberg, but they are worth the study since they provide a language for documenting and explaining your ultimate set of rules, as well as making you aware of the many non-trivial scenarios. Lest you programmers think that Philosophy is overkill, take a look at books like "The Pig That Wants to Be Eaten", which catalog the many well-known moral paradoxes that result from relying on intuition and gut reactions.

Saturday, February 15, 2014

It's about Time, It's about Space

A 1960s TV series theme song began, "It's about time, it's about space...". Some, from Physics to Philosophy, say it's about both, claiming they are each aspects of a single space-time. Computer systems developers need to consider this as they build GIS applications.

Ontology, being the branch of philosophy concerned with describing "what exists", tackles the topics of Space and Time since they are often used to describe things. An Introduction to Ontology[1] devotes a chapter to each. As usual, things are more complicated than our initial intuition expects, and debate continues about different viewpoints. In a nutshell, the following are discussed:
  • Space is usually defined in terms of "regions".
  • Space is either absolute or relative.
  • Space is either something things "are in", or it is synonymous with the thing itself.
    I.e. either regions only have properties like size and location, or a region itself has the property blue if the stuff in it is blue.
  • Space is either Euclidean or not (i.e. flat or curved).
  • Space is either separate from Time, or the two are parts of the same thing: space-time.

Most programmers today, in the age of Map apps, Geographic Information Systems, and geocoding, take the view that an entity such as a business or address is located at some location. The location ideally could be defined as a collection of regions defined by GPS coordinates. Often, the location is (over)simplified to a single point on a map.

While it is recognized that many problems exist with actual databases of geocoded entities, it is usually assumed that they are in the realm of epistemology rather than ontology. In other words, it is assumed the problem is with "our knowledge", due to inaccuracies in the set of GPS coordinates, and not that locations lack a definite set of coordinates in the first place.

However, not every entity that takes up space has a well-defined and unchanging mapping to a set of GPS coordinates.  ZIP codes, for example, are not defined in terms of geography but rather as collections of delivery routes. Another example, as shown in the title insurance case study below, is in real estate legal descriptions. In addition to a knowledge problem caused by ambiguous language used in these descriptions, they can also refer to ephemeral landmarks.

While data model designs often naively assume that space is separate from time, entities like ZIP codes and legal descriptions require a time dimension to be completely accurate. It turns out that the mapping of ZIP codes to postal routes changes several times a year. And landmarks referred to in property descriptions can change location and shape over time.
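One way to give such an entity a time dimension is to store each mapping together with the period during which it was valid, and to ask every question "as of" a date. The Python sketch below is only an illustration under stated assumptions: the `RouteAssignment` class, the `routes_on` helper, the ZIP code "00000", and the route identifiers are all hypothetical, not real postal data or a real GIS API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RouteAssignment:
    """A ZIP code's delivery routes during one validity period (hypothetical)."""
    zip_code: str
    routes: frozenset
    valid_from: date           # inclusive
    valid_to: Optional[date]   # exclusive; None means "still in effect"


def routes_on(assignments, zip_code, on):
    """Return the routes that made up zip_code on the given date, or None."""
    for a in assignments:
        started = a.valid_from <= on
        not_ended = a.valid_to is None or on < a.valid_to
        if a.zip_code == zip_code and started and not_ended:
            return a.routes
    return None


# Invented history: the same ZIP code covers different routes over time.
history = [
    RouteAssignment("00000", frozenset({"R001", "R002"}),
                    date(2013, 1, 1), date(2013, 7, 1)),
    RouteAssignment("00000", frozenset({"R001", "R003"}),
                    date(2013, 7, 1), None),
]

print(sorted(routes_on(history, "00000", date(2013, 3, 15))))  # ['R001', 'R002']
print(sorted(routes_on(history, "00000", date(2014, 2, 15))))  # ['R001', 'R003']
```

With this shape, "where is ZIP code X?" is no longer a complete question; the query has to say when. The same idea applies to a landmark whose position is recorded per survey date rather than once.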

Case Study: TICOR Title Insurance System

OMEX was a startup that was an early pioneer in creating optical disk technology for data storage. It took on a contract to produce a computer system to support TICOR, the largest title insurance company in the U.S. TICOR itself had the contract to keep backup copies of all the real estate transactions filed with Los Angeles County. As a part of archiving copies of the documents, it was free to use the information in them, and hence support its business of providing title insurance.

The computer system was to replace using microfilm photos of the documents with optical disk storage of the images.  It would link these images with a structured database of information related to each property. One of the goals of the database was to enable answering basic questions about property locations.

The programmers, having a naive notion of how property boundaries were defined, were surprised to see that a common method is "metes and bounds", which uses plain English descriptions based on landmarks. For example: "beginning with a corner at the intersection of two stone walls near an apple tree on the north side of Muddy Creek road one mile above the junction of Muddy and Indian Creeks, north for 150 rods to the end of the stone wall bordering the road, then northwest along a line to a large standing rock on the corner of the property now or formerly belonging to John Smith, thence west 150 rods to the corner of a barn near a large oak tree, thence south to Muddy Creek road, thence down the side of the creek road to the starting point."

As can be seen, it would be difficult to translate this into a collection of GPS coordinates. But even if you did, you would not be done with the problem. Like ZIP codes, which change over time, the locations and shapes of creeks, rivers, etc. change over time. Lest you think this is a merely theoretical problem: for centuries, States have sued each other over land ownership due to border rivers migrating over time.[2]
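The difficulty shows up as soon as you try to model the example description as data. The Python sketch below is a hypothetical model, not the TICOR system's design: the `Call` class and its fields are assumptions made for illustration. It splits the quoted description into legs and flags which ones cannot be turned into coordinates from the text alone. The only hard fact used is the unit conversion: one rod is 16.5 feet, or 5.0292 meters.

```python
from dataclasses import dataclass
from typing import Optional

ROD_IN_METERS = 5.0292  # 1 rod = 16.5 feet


@dataclass(frozen=True)
class Call:
    """One leg of a metes-and-bounds description (hypothetical model)."""
    text: str
    bearing: Optional[str] = None          # None if the leg just follows a landmark
    distance_rods: Optional[float] = None  # None if no distance is stated
    landmark: Optional[str] = None         # where the leg ends

    @property
    def distance_m(self):
        if self.distance_rods is None:
            return None
        return self.distance_rods * ROD_IN_METERS

    @property
    def computable(self):
        """True only if the text gives both a direction and a distance."""
        return self.bearing is not None and self.distance_rods is not None


calls = [
    Call("north for 150 rods to the end of the stone wall",
         "north", 150, "end of stone wall"),
    Call("then northwest along a line to a large standing rock",
         "northwest", None, "large standing rock"),
    Call("thence west 150 rods to the corner of a barn",
         "west", 150, "corner of barn"),
    Call("thence south to Muddy Creek road",
         "south", None, "Muddy Creek road"),
    Call("thence down the side of the creek road to the starting point",
         None, None, "starting point"),
]

# Legs that need a surveyor (or the landmark's position on some date) to resolve.
unresolved = [c.text for c in calls if not c.computable]
print(len(unresolved))                  # 3
print(round(calls[0].distance_m, 2))    # 754.38
```

Even the two "computable" legs are only relative: they start from a corner defined by stone walls and an apple tree, so every coordinate still depends on locating a landmark.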

Ultimately, the computer system wound up just using unstructured text fields to contain the legal description, rather than the more ambitious GIS database that had originally been promised.

[1] An Introduction to Ontology, Nikk Effingham, Polity Press, 2013
[2] A River Runs Thru It, How the States Got Their Shapes, History Channel, 2011