Tuesday, December 25, 2007

Shakespeare is (not) Shakespeare

In the early part of the book The Stuff of Thought by Steven Pinker, the problem of what-a-name-names is explored with the example of Shakespeare. Pinker distinguishes between Shakespeare: the historical figure, and Shakespeare: the author of the plays, such as Hamlet, that are attributed to him.

In my earlier post, it was somewhat easy to see that there were multiple aspects to Superman because each aspect already had its own name: Superman vs Clark Kent. With Shakespeare, however, it is much more subtle because the different aspects have the same name: Shakespeare. Additionally, we are not used to thinking of them as different aspects that can be independent of each other, any more than we think of Cher-the-person and Cher-the-singer as being independent things. But, as discussed in the book, many people over the centuries have debated whether the author of Hamlet et al. was really Francis Bacon, Christopher Marlowe, Queen Elizabeth, etc.

The interesting thing is that Shakespeare is SO ingrained as the name of the playwright that even if Sir Francis were proven to be the author, the headline would be "Bacon is the REAL Shakespeare!", which is absurd because clearly, Shakespeare-the-historical-figure is the "real" Shakespeare. Changing the human associated with the author-of-Hamlet concept will not change the concept's name; it will remain "Shakespeare's Hamlet (written by Bacon)" and not "Bacon's Hamlet".

So, when assigning ID#(s) to putatively single entities, flexibility should be built in to allow ad-hoc collections of attributes of any entity to be grouped and named and referenced separately. Otherwise, the system would not be able to represent the statement: Shakespeare is not Shakespeare, Bacon is.
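As a minimal sketch of that flexibility (in Python, using hypothetical names such as `Aspect` and `same_entity` and made-up ID#s, none of which come from a real system), each named grouping of attributes could get its own ID# and merely point at the entity it is currently believed to belong to:

```python
from dataclasses import dataclass, field

# Hypothetical entity ID#s, chosen only for this illustration.
SHAKESPEARE_THE_MAN = 1
FRANCIS_BACON = 2

@dataclass
class Aspect:
    """A named, separately referenceable group of attributes."""
    aspect_id: int                 # the aspect's own ID#
    name: str                      # the name people use for this aspect
    entity_id: int                 # entity currently associated with it
    attributes: dict = field(default_factory=dict)

def same_entity(a: Aspect, b: Aspect) -> bool:
    """True when both aspects currently point at the same entity."""
    return a.entity_id == b.entity_id

historical = Aspect(10, "Shakespeare", SHAKESPEARE_THE_MAN,
                    {"role": "historical figure"})
author = Aspect(11, "Shakespeare", SHAKESPEARE_THE_MAN,
                {"role": "author of Hamlet"})

before = same_entity(author, historical)   # both point at the man

# Suppose Bacon were proven the author: only the association changes.
author.entity_id = FRANCIS_BACON
after = same_entity(author, historical)

print(author.name, before, after)
```

The author aspect keeps its name and its own ID# after the reassignment, which is what lets the system say "Shakespeare is not Shakespeare, Bacon is" without contradiction.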

Wednesday, December 19, 2007

Clark Kent is (not) Superman

It delights me to find out that what I thought had been a particular nugget of wisdom, specific to building Identity matching computer systems, actually has a deep principle at work. While working on one of these systems, I learned the strategy of NOT merging all variations of an individual identity's name/address/phone/etc into a single canonical version. It turns out that the need to keep, and assign a unique key to, every variation of identity data (as opposed to only the "canonical" one) has deep roots in language itself...

While reading the Intellectual Devotional (which I highly recommend), I came across its page about "Philosophy of Language" and it had an immediate resonance with a project at my current client. The page describes the "problem of reference" where ideas about what a name "means" have been debated and changed over time.

One theory says that "names" don't have any meaning in and of themselves; they merely refer to some thing that has meaning. Hence, Shakespeare's quote "A rose by any other name would smell as sweet" summarizes the position that the word "rose" is not meaningful, and could be exchanged with any other word that refers to the thing "rose". That is why "gulaab" (the Urdu word for rose) can work just as well for speakers of Urdu.

Another, more modern theory, though, says that names not only refer to some thing, they also carry the connotation of "in what sense" the thing is being referenced. The book illustrates this with the example of Superman and Clark Kent both being names for the same thing (the being Superman), but they are not interchangeable. Clark Kent (mild-mannered reporter) has a work address of the Daily Planet whereas Superman (superhero able to leap tall buildings) does not. It matters which name is used when talking about Superman.

So, in the same way that Clark Kent and Superman both refer to different aspects of the same entity, and are thus not interchangeable, a computer system managing legal entity identity data cannot translate name/address variations into a single entity ID# when those variations actually refer to different aspects of the entity. For example, if there is data that is specific to a particular store branch, that branch needs its own well-known ID# even though it is only a portion of a single legal entity. Further, since legal entity names are not unique (for example, I personally own two different corporations with the identical legal name), the entire name/address/phone/etc combination needs managing rather than separate "alternate name" lists. It is also not sufficient to support alternate name/address records merely as search aids that still ultimately result in the ID# of the entity-as-a-whole. Otherwise one would lose track of the fact that we were talking about Clark Kent, not Superman.
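Here is a rough sketch of the idea in Python. The class names (`IdentityRecord`, `IdentityRegistry`), the ID# values, and the "Example Corp" data are all invented for illustration and are not taken from the actual system; the point is only that every variation keeps its own key, and a lookup returns that key rather than collapsing to the entity-as-a-whole:

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional

@dataclass(frozen=True)
class IdentityRecord:
    record_id: int               # unique key for this exact variation
    entity_id: int               # the entity-as-a-whole it belongs to
    name: str
    work_address: Optional[str]  # None when this aspect has no such address

class IdentityRegistry:
    def __init__(self):
        self._ids = count(1)     # simple generator of record keys
        self._records = {}       # record_id -> IdentityRecord

    def add(self, entity_id, name, work_address=None):
        record = IdentityRecord(next(self._ids), entity_id, name, work_address)
        self._records[record.record_id] = record
        return record

    def find_by_name(self, name):
        """Return the matching variations themselves, not just entity ID#s."""
        return [r for r in self._records.values() if r.name == name]

registry = IdentityRegistry()
SUPERMAN = 100  # hypothetical entity ID#

clark = registry.add(SUPERMAN, "Clark Kent", "Daily Planet")
hero = registry.add(SUPERMAN, "Superman")

# Two distinct legal entities sharing one legal name (made-up data).
corp_a = registry.add(200, "Example Corp", "1 First St")
corp_b = registry.add(201, "Example Corp", "2 Second St")

match = registry.find_by_name("Clark Kent")[0]
print(match.record_id, match.entity_id, match.work_address)
```

Because the search result is the Clark Kent record itself, the caller still knows which aspect was meant, and can reach the whole entity through `entity_id` only when that is what it wants. The same structure keeps the two identically named corporations apart, since the name alone is never the key.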

Sunday, December 2, 2007

Cubist Programming?

In The Stuff of Thought, Steven Pinker talks about human minds being able to hold multiple viewpoints simultaneously about the same event, and that resonates with my ideas for Existential Programming. It also reminded me of my thought that Cubism tried to illustrate the same idea. Cubist paintings simultaneously attach different viewpoints of something to the same single "object" (just as Existential Programming would have different viewpoints/models of the same "data entity" kept together in the same object).
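To make the "several viewpoints in one object" idea a little more concrete, here is a small Python sketch. It is only one possible reading of the idea, and the class `MultiViewEntity`, its method names, and the billing/shipping data are all hypothetical:

```python
class MultiViewEntity:
    """One data entity that keeps several viewpoints side by side."""

    def __init__(self, entity_id):
        self.entity_id = entity_id
        self._views = {}            # viewpoint name -> attribute dict

    def describe(self, viewpoint, **attributes):
        """Record (or extend) what this entity looks like from one viewpoint."""
        self._views.setdefault(viewpoint, {}).update(attributes)

    def as_seen_from(self, viewpoint):
        """Return a copy of one viewpoint's attributes (empty if unknown)."""
        return dict(self._views.get(viewpoint, {}))

    def viewpoints(self):
        return sorted(self._views)

# Made-up example: two departments disagree about the same customer.
customer = MultiViewEntity(entity_id=42)
customer.describe("billing", name="J. Smith", city="Boston")
customer.describe("shipping", name="Jane Smith", city="Chicago")

print(customer.viewpoints())
print(customer.as_seen_from("billing"))
print(customer.as_seen_from("shipping"))
```

Neither viewpoint overwrites the other; the conflicting values simply coexist on the same object, much as a Cubist canvas shows a face from the front and in profile at once.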

So, Existential Programming might well have been named Cubist Programming!