Wednesday, December 19, 2007

Clark Kent is (not) Superman

It delights me to find out that what I thought had been a particular nugget of wisdom, specific to building Identity matching computer systems, actually has a deep principle at work. While working on one of these systems, I learned the strategy of NOT merging all variations of an individual identity's name/address/phone/etc into a single canonical version. It turns out that the need to keep, and assign a unique key to, every variation of identity data (as opposed to only the "canonical" one) has deep roots in language itself...

While reading the Intellectual Devotional (which I highly recommend), I came across its page about "Philosophy of Language" and it had an immediate resonance with a project at my current client. The page describes the "problem of reference" where ideas about what a name "means" have been debated and changed over time.

One theory says that "names" don't have any meaning, in and of themselves, they merely refer to some
thing that has meaning. Hence, Shakespeare's quote "A rose by any other name would smell as sweet" summarizes the position that the word "rose" is not meaningful, and could be exchanged with any other word that refers to the thing "rose". That is why "gulaab" (the Urdu word for rose) can work just as well for speakers of Urdu.

Another more modern theory though, says that names not only refer to some thing, they also carry the connotation of "in what sense" is the thing being referenced. The book illustrates the example of Superman and Clark Kent both being names for the same thing (the being Superman), but they are not interchangeable. Clark Kent (mild mannered reporter) has a work address of the Daily Planet whereas Superman (superhero able to leap tall buildings) does not. It matters which name is used when talking about Superman.

So, in the same way that Clark Kent and Superman both refer to different aspects of the same entity, and are thus not interchangeable, a computer system managing legal entity identity data can not translate name/address variations into a single entity ID# when those variations actually refer to different aspects of the entity. For example, if there is data that is specific to a particular store branch, that branch needs its own well-known ID# even though it is only a portion of a single legal entity. Further, since legal entity names are not unique (for example, I personally own two different corporations with the identical legal name), the entire name/address/phone/etc combination needs managing rather than separate "alternate name" lists. It is also not sufficient to support alternate name/address records merely as search aids that still ultimately result in the ID# of the entity-as-a-whole. Otherwise one would loose track of the fact that we were talking about Clark Kent, not Superman. 


No comments:

Post a Comment