Tuesday, December 25, 2007

Shakespeare is (not) Shakespeare

In the early part of the book The Stuff of Thought by Steven Pinker, the problem of what-a-name-names is explored with the example of Shakespeare. Pinker distinguishes between Shakespeare: the historical figure, and Shakespeare: the author of numerous plays like Hamlet attributed to Shakespeare.

In my earlier post, it was somewhat easy to see that there were multiple aspects to Superman because each aspect already had its own name; Superman vs Clark Kent. With Shakespeare however, it is much more subtle because the different aspects have the same name: Shakespeare. Additionally, we are not used to thinking that they are different aspects that can be independent of each other, any more than we think of Cher-the-person and Cher-the-singer as being independent things. But, as discussed in the book, many people over the centuries have debated whether the author of Hamlet, et al was really Francis Bacon, Christopher Marlowe, Queen Elizabeth, etc.

The interesting thing is that Shakespeare is SO ingrained as the name of the playwright that even if Sir Francis were to be proven the author, the headline would be "Bacon is the REAL Shakespeare!" which is absurd because clearly, Shakespeare-the-historical-figure is the "real" Shakespeare. Changing the human associated with the author-of-Hamlet concept will not change the concept's name; it will remain "Shakespeare's Hamlet (written by Bacon)" and not "Bacon's Hamlet".

So, when assigning ID#(s) to putatively single entities, flexibility should be built in to allow ad-hoc collections of attributes of any entity to be grouped and named and referenced separately. Otherwise, the system would not be able to represent the statement: Shakespeare is not Shakespeare, Bacon is.
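
As a minimal sketch of that flexibility (in Python, with every class, ID scheme, and attribute name below invented for illustration), a registry could hand out IDs not only to whole entities but also to named groupings of attributes, and let a grouping be re-pointed at a different entity without being renamed:

```python
from dataclasses import dataclass, field
from itertools import count

_next_id = count(1)  # hypothetical ID generator


@dataclass
class Entity:
    """A putatively single thing, e.g. a historical person."""
    label: str
    id: int = field(default_factory=lambda: next(_next_id))


@dataclass
class Aspect:
    """A named, separately referenceable grouping of attributes.

    It has its own ID and its own name, and merely points at the
    entity currently believed to stand behind it.
    """
    name: str
    attributes: dict
    entity: Entity
    id: int = field(default_factory=lambda: next(_next_id))


shakespeare_the_man = Entity("William Shakespeare of Stratford")
bacon = Entity("Sir Francis Bacon")

# The author-of-Hamlet concept is named "Shakespeare" in its own right.
the_playwright = Aspect(
    name="Shakespeare",
    attributes={"wrote": ["Hamlet"]},
    entity=shakespeare_the_man,
)

# Hypothetically, Bacon is proven to be the author: only the link changes.
the_playwright.entity = bacon

# "Shakespeare is not Shakespeare, Bacon is."
print(the_playwright.name, "is", the_playwright.entity.label)
```

The aspect keeps its name and its ID through the change, which is what lets the system state the paradoxical-sounding sentence without contradiction.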






Wednesday, December 19, 2007

Clark Kent is (not) Superman

It delights me to find out that what I thought had been a particular nugget of wisdom, specific to building Identity matching computer systems, actually has a deep principle at work. While working on one of these systems, I learned the strategy of NOT merging all variations of an individual identity's name/address/phone/etc into a single canonical version. It turns out that the need to keep, and assign a unique key to, every variation of identity data (as opposed to only the "canonical" one) has deep roots in language itself...

While reading the Intellectual Devotional (which I highly recommend), I came across its page about "Philosophy of Language" and it had an immediate resonance with a project at my current client. The page describes the "problem of reference" where ideas about what a name "means" have been debated and changed over time.

One theory says that "names" don't have any meaning, in and of themselves, they merely refer to some thing that has meaning. Hence, Shakespeare's quote "A rose by any other name would smell as sweet" summarizes the position that the word "rose" is not meaningful, and could be exchanged with any other word that refers to the thing "rose". That is why "gulaab" (the Urdu word for rose) can work just as well for speakers of Urdu.

Another more modern theory though, says that names not only refer to some thing, they also carry the connotation of "in what sense" is the thing being referenced. The book illustrates the example of Superman and Clark Kent both being names for the same thing (the being Superman), but they are not interchangeable. Clark Kent (mild mannered reporter) has a work address of the Daily Planet whereas Superman (superhero able to leap tall buildings) does not. It matters which name is used when talking about Superman.

So, in the same way that Clark Kent and Superman both refer to different aspects of the same entity, and are thus not interchangeable, a computer system managing legal entity identity data cannot translate name/address variations into a single entity ID# when those variations actually refer to different aspects of the entity. For example, if there is data that is specific to a particular store branch, that branch needs its own well-known ID# even though it is only a portion of a single legal entity. Further, since legal entity names are not unique (for example, I personally own two different corporations with the identical legal name), the entire name/address/phone/etc combination needs managing rather than separate "alternate name" lists. It is also not sufficient to support alternate name/address records merely as search aids that still ultimately result in the ID# of the entity-as-a-whole. Otherwise one would lose track of the fact that we were talking about Clark Kent, not Superman.
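
A rough sketch of the idea, assuming a made-up record layout and made-up keys (nothing here is taken from the actual client system): each complete name/address/phone combination is a record with its own key, and a lookup returns that record rather than collapsing straight to the ID of the entity-as-a-whole.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IdentityRecord:
    """One complete name/address/phone combination with its own key."""
    record_id: str   # key of this particular variation
    entity_id: str   # key of the legal entity it belongs to
    name: str
    address: str
    phone: str


# Hypothetical data: two records are aspects of one legal entity (LE-1),
# and a second, unrelated legal entity (LE-2) has the identical name.
RECORDS = [
    IdentityRecord("R-100", "LE-1", "Acme Corp", "1 Main St", "555-0100"),
    IdentityRecord("R-101", "LE-1", "Acme Corp", "9 Mall Rd (branch)", "555-0101"),
    IdentityRecord("R-200", "LE-2", "Acme Corp", "77 Other Ave", "555-0200"),
]


def find(name: str, address: str) -> Optional[IdentityRecord]:
    """Match on the whole combination and return the variation itself."""
    for record in RECORDS:
        if record.name == name and record.address == address:
            return record
    return None


branch = find("Acme Corp", "9 Mall Rd (branch)")
# The caller still knows it was talking about the branch (Clark Kent),
# and can reach the entity-as-a-whole (Superman) when it wants to.
print(branch.record_id, branch.entity_id)
```

Matching on the name alone would be ambiguous here, since three records share it; matching on the full combination is what tells the branch, the head office, and the unrelated corporation apart.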


Sunday, December 2, 2007

Cubist Programming?

In The Stuff of Thought by Steven Pinker, the book talks about human minds being able to hold multiple viewpoints simultaneously about the same event, and that resonates with my ideas for Existential Programming. It also reminded me of my thoughts that Cubism tried to illustrate the same idea. Cubist paintings simultaneously attach different viewpoints of something to the same single "object" (as Existential Programming would have different viewpoints/models of the same "data entity" kept together in the same object).



So, Existential Programming might well have been named Cubist Programming!

Saturday, November 10, 2007

Subjective, Objective, Relative, Existential

In the imposing, but handy, Oxford Companion to Philosophy[1], there are entries about "objectivism and subjectivism"[2], and "relativism, epistemological"[3] that lead to the following observations:

  • Objectivism says that some statements are objective, in that they are true independent of anyone's opinion. e.g. This ball is red. Alternatively, values assigned to properties can be dependent on other factors and be described by a function rather than a simple value. e.g. The color of Ayers Rock is F(time-of-day, weather).
  • Subjectivism says that (potentially all) statements are subjective, in that they are dependent on the opinion of the person making the statement. e.g. This cake is delicious.
  • Relativism says that statements are always subjective even when the decider thinks he's made an objective evaluation. I.E. no evaluations are objective because man is always biased by his particular cultural, historical, religious, etc viewpoint, and no particular viewpoint is "the right one". e.g. This society is primitive.
  • Existential Programming philosophy says that even if something is supposedly scalar & objective, and even if one does not subscribe to relativism (thus implying that there is no need to ascribe to values a particular data source), the reliability of any particular data source is never perfect, and thus one needs to model data as if relativism were true. I.E. keep track of "says who" for each "fact" and therefore be prepared to simultaneously handle/store multiple values for everything, tagging them with "says who", "said when", etc. So, in effect, there are no scalar values, only functions with at least a data source as a parameter.
[1] Oxford Companion to Philosophy, 2nd Ed, 2005
[2] ibid, pg 667
[3] ibid, pg 800
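
The last point above can be sketched in a few lines. This is only an illustration under assumed names (`Assertion`, `Property`, `according_to` are all invented here): a property holds every claimed value together with "says who" and "said when", and reading it always takes the data source as a parameter.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Assertion:
    """One claimed value for a property, tagged with its provenance."""
    value: object
    says_who: str
    said_when: date


class Property:
    """A property is a function of (at least) the data source."""

    def __init__(self):
        self._assertions = []

    def assert_value(self, value, says_who, said_when):
        self._assertions.append(Assertion(value, says_who, said_when))

    def according_to(self, says_who):
        """Return the most recent value claimed by one source, or None."""
        claims = [a for a in self._assertions if a.says_who == says_who]
        if not claims:
            return None
        return max(claims, key=lambda a: a.said_when).value

    def all_values(self):
        """Return every distinct value currently on record."""
        return {a.value for a in self._assertions}


ball_color = Property()
ball_color.assert_value("red", "catalog", date(2007, 1, 5))
ball_color.assert_value("orange", "customer review", date(2007, 3, 2))
ball_color.assert_value("crimson", "catalog", date(2007, 6, 1))

print(ball_color.according_to("catalog"))
print(sorted(ball_color.all_values()))
```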




Sunday, November 4, 2007

"Strong" typing not so strong?

I've been in debates about the wisdom of defining web services with "strongly" typed messages (with specific fields of "specific" types) versus more general messages with, say, fields containing XML content. I have argued that so-called strong types like integer and string are no more "strong" than a segment of XML. This is because the string (and to a lesser extent integer, float, etc) data types are so broad that they can't really be called "strong". Strings are almost always further interpreted by the software and people on both ends of a message.

This notion is embodied by the testing technique known as Equivalence Partitioning. [A good summary is in the article "Make Child's Play of Equivalence Class Partitioning".] It says that there is more detailed structure to data than string and integer and to only test at that level is to miss many crucial test cases. It defines subranges and such to guide what to test, in effect, defining more granular data types. Languages like Pascal and Ada had more refined data types like subranges but those ideas were lost when C-like languages (including C++ & Java) came into vogue.

Existential Programming would change the way data types are defined to get away from simply defining a structure that is an aggregation of "primitive data types". It would let data types define a function that computes whether (or how closely) a putative value meets the requirements for that data type. While one could put that logic in a "setter" method of a class, that doesn't work with direct assignments (via assignment operators) of values (at least in Java-like languages).

To see where even subrange definitions like Pascal/Ada allowed (e.g. 3..27) are not sufficient, look at the logic required to validate the month/day/year values of a date as detailed in the ECP article. Only an algorithm can encode what is required for legal values, not simple declarations of string and integer properties.
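
To make that concrete, here is a small sketch of a "type" defined by a function that reports how closely a putative value fits. The scoring rule (one third per part that is in range) is purely an assumption for illustration, not something taken from the ECP article.

```python
import calendar


def month_day_year_fit(month, day, year):
    """Return how closely (0.0 to 1.0) three integers form a legal date.

    The scoring is an illustrative assumption: each of the three
    parts that is in range contributes one third.
    """
    year_ok = 1 <= year <= 9999
    month_ok = 1 <= month <= 12
    # The legal range of `day` depends on month AND year (leap years),
    # which a plain subrange such as 1..31 cannot express.
    if month_ok and year_ok:
        last_day = calendar.monthrange(year, month)[1]
    else:
        last_day = 31
    day_ok = 1 <= day <= last_day
    return (year_ok + month_ok + day_ok) / 3


def is_valid_date(month, day, year):
    """Strict membership test: every part must fit."""
    return month_day_year_fit(month, day, year) == 1.0


print(is_valid_date(2, 29, 2008), is_valid_date(2, 29, 2007))
```

February 29 is accepted for 2008 and rejected for 2007, a distinction that no declaration of three independent integer subranges can capture.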

So, "strong" typing is a relative term.

Sunday, October 21, 2007

atonal versus pantonal

While reading the Intellectual Devotional[1], I came across its page about the composer Arnold Schoenberg that described how his new musical composition technique (12-tone serialism) was described as "atonal" but he preferred the term "pantonal". This resonated with my desire that Existential Programming be recognized as multi-type programming rather than "type-less".

The analogy here is between the key signature of a piece of music, and an ontology of classes used in a program.

A key signature (e.g. C-major, d-minor) defines which notes are to be used, and the special relationships between different notes. To be "tonal" is to follow the rules of a particular key signature. To be "atonal" is to NOT follow the rules of any key signature. To be "pantonal" is to simultaneously follow the rules of multiple key signatures.

An ontology defines which types/classes are to be used, and the special relationships between different classes. To be "strongly-typed" is to follow the rules of a particular ontology. To be "type-less" is to NOT follow the rules of any ontology. To be "multi-typed" (i.e. existential programming) is to simultaneously follow the rules of multiple ontologies.

[1] "The Intellectual Devotional", 2006, Kidder, Oppenheim
http://www.theintellectualdevotional.com/index.shtml


Wednesday, October 17, 2007

What is redundant?

In the handy Philosopher's Toolkit book[1], there is a section[2] explaining the difference between "analytic" statements and "synthetic" statements, and in another section[3], "a priori" versus "a posteriori". They are both similar in that they are trying to distinguish between what is known about X just from its definition versus things that would have to be thought through or experiments done to discover them. E.G. "all triangles have 3 sides" and "Joe, the bachelor, is unmarried" are obvious just from the definition of triangle and bachelor. Other statements like "water boils at 212 degrees Fahrenheit" and "Joe, the bachelor, has red hair" are not "entailed by" or "contained within" the definitions of water and bachelor.

There are arguments by philosophers, however, about what IS entailed by a concept or contained within a definition. Some would argue that the boiling point of water *is* contained within the concept of water whether a particular person happens to know that or not. Similarly the statement "all triangles have 180 degrees as the sum of their internal angles" is true independently of whether we know about it (but a lot of people *wouldn't* know that).

So, in general, there will not be agreement from one ontology/model to another about which statements (i.e. data/attributes/relationships) are redundant. Should the "age" of an applicant for a job be stored because it was a field in the application form? Or, should it always be calculated from the birth date and the date of the application? Should a "short cut" relationship between 2 distantly related entities be explicitly supported by a model, or should it always be "calculated" by traversing all the intermediate links through the "six degrees of separation"?

Even though E/R models will normalize data to an objective standard, different modelers will break normalization for "performance reasons" on a subjective (and thus inconsistent) basis. Existential Programming counts this lack of agreement on "what is redundant" as yet another reason to embrace/support multiple ontologies simultaneously.
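
The "age" question can be illustrated with a tiny sketch that keeps both views side by side. The record layout and field names are hypothetical.

```python
from datetime import date


def age_on(birth_date, on_date):
    """Whole years elapsed between birth_date and on_date."""
    had_birthday = (on_date.month, on_date.day) >= (birth_date.month, birth_date.day)
    return on_date.year - birth_date.year - (0 if had_birthday else 1)


# Hypothetical applicant record that keeps BOTH views of "age".
applicant = {
    "birth_date": date(1980, 6, 15),
    "application_date": date(2007, 10, 17),
    "age_as_stated_on_form": 27,
}

derived = age_on(applicant["birth_date"], applicant["application_date"])
# One model calls the stated age redundant; another treats a mismatch
# between stated and derived age as information worth keeping.
print(derived, derived == applicant["age_as_stated_on_form"])
```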

[1] The Philosopher's Toolkit, Julian Baggini and Peter S. Fosl, Blackwell Publishers, 2003, ISBN: 0631228748
[2] ibid, section 4.3
[3] ibid, section 4.1

Tuesday, October 16, 2007

Relativism, Absolutism, and Existential Programming

In the handy Philosopher's Toolkit book[1], there is a section[2] explaining the difference between relative statements and absolute statements (and similarly relativism and absolutism). As a prototypical example, it explains how before Einstein, the "time" an event occurred was considered an absolute statement. In other words, the whole universe would know what it meant because time was the same everywhere (just different time zones). However, Einstein revealed that time is relative to the location and speed of the observer and can't be the same everywhere. Plus, since there is no place and speed that could/should be considered the "official" one, all "times" are equally valid.

Because there are many aspects of reality and opinion that are considered relative by some number of people, Existential Programming counts this as yet another reason to embrace/support multiple ontologies simultaneously. Absolute vs Relative points of view are yet another aspect of modeling the world that traditional object-oriented and relational database modeling make assumptions about.

[1] The Philosopher's Toolkit, Julian Baggini and Peter S. Fosl, Blackwell Publishers, 2003, ISBN: 0631228748
[2] ibid, section 4.2

Friday, October 12, 2007

A Scanner Darkly: Identification via Accidental Properties

It is the case that while it is "essential" properties that (by definition) define something, people very often identify types and individuals via a collection of "accidental" properties. For example, in a computer system, accidental (i.e. changeable) properties like name, address, phone number, etc are used to find/recognize/identify people even though they all can change thus producing the age-old problem of "identity over time".

Visually, people normally identify/recognize other people from their accidental attributes like face/hair/sex/etc. These can all be altered and not change their DNA or soul but that's what people normally use. If these change over time people use other attributes, or in the cliche science fiction scenarios where someone has been completely changed / possessed / transferred-essences-with-an-alien / etc, a set of secret shared knowledge is used. "Jim, it's really me, otherwise how would I know the name of your pet hamster on Rigel Seven?!" [Banks, Credit Cards, etc have recognized this and have started asking questions like "what's your favorite movie?" before letting you change your password.]

But for visual only recognition, say, for tracking someone through a crowd via a security camera, you are dependent on those accidental attributes not changing in real time. That is the premise for the "camouflage" in the movie "A Scanner Darkly". There is a cloak that constantly changes the appearance of every piece of the face/clothes, and this is illustrated via the rotoscoping and animation used in the movie. This would confuse automated tracking software and you would get lost in a crowd.

In thinking about this, though, I realized that people could still actually recognize the people in these cloaks BECAUSE WE DO IT WHILE WE ARE WATCHING THE MOVIE! A *real* cloak would be much more effective if it didn't change *pieces* of faces, but rather change whole faces at a time. It is easy to pick out the person in the funny changing cloak unless *everyone* is wearing one. Even then, humans can still keep track of who is who (short of a big crowd) because we track the various individuals at a "thing" level. I.E. I can see that there are some "things" in the room and when they move I can still tell who's who because despite their constantly changing "skin", they are still contiguous space-time blobs of matter (that walk a lot like people!).

Wednesday, September 26, 2007

Introduction to Existential Programming Blog

This first entry for "Existential Programming" is meant to act as an introduction to the topic and an explanation for the backdated entries to be added over time. Around May, 2006 I had a series of brain farts (ahem, epiphanies) about computer software engineering that led to a theory I decided should be called "Existential Programming". Because I have now collected about a year and a half of unedited notebook entries, and I now have enough of an idea of what I mean by Existential Programming to write it up, but because I don't yet have the time to polish a Manifesto into a magazine article, much less an academic paper, I have decided to start this blog and back fill it with my notebook contents, as well as putting future entries here. The goal is to plant my flag on the topic and its ideas now, even if I can't write "Existential Programming, the Book" yet. [BTW, see my std blog disclaimers.]
A meta-idea above Existential Programming itself, is the more general notion that there should be much more explicit cross fertilization between ideas from Philosophy (with a capital "P") and ideas from software engineering (as practiced in industry). I call this "Philosophical Programming". Early on during my epiphanies, I had the intuition that Philosophy probably had something to say about my topic (even though I'd never taken a philosophy class). So, at age 50, I started reading Philosophy 101 books. It quickly became obvious that Philosophy has SO MUCH to say about data/class modeling topics that it is criminal how little explicit reference to it there is in the software practice literature. I distinguish between industry practice (and their books, blogs, magazines oriented towards tools and "best practices") versus academia. After I learned enough terminology to search for papers covering ideas similar to mine, I found that there is a whole subculture writing academic conference papers that don't really bleed over into industry conferences for things like Java or AJAX or SOA. So, my general "project" these days is to try to come up with practical application techniques based on otherwise esoteric topics.
What is "Existential Programming" and what are the central ideas associated with it? In a nutshell, it means to embrace the notion that "existence precedes essence". I.E. develop data models, object models, programming frameworks, etc. without imposing a single E/R model, OO class hierarchy, ontology, etc. By using techniques where "objects" and "entities" exist independently of a single "strong type", they can integrate multiple "strongly typed" data models by letting objects simultaneously reflect all those models. This differs from weakly typed programming, or completely type-less programming, as can be found in scripting languages like JavaScript. Ironically, it takes a "type-less" foundation to really seriously do strong types in a "multi-cultural" world. [And actually, JavaScript is not a bad platform to implement these ideas precisely because of its class-less orientation.]

The ZEN thought here is that until one can create a type-less object, one cannot create an object which can be all types simultaneously.
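
A minimal sketch of what this could look like, assuming invented names throughout (`Thing`, `Model`, and the three example models are illustrations, not an existing framework): the object itself has no class-imposed shape, and each "strong type" is an external model that the object either satisfies or does not.

```python
class Thing:
    """A type-less object: it merely exists and carries attributes."""

    def __init__(self, **attributes):
        self.attributes = dict(attributes)


class Model:
    """One 'strong type' from one ontology: a name plus required fields."""

    def __init__(self, name, required):
        self.name = name
        self.required = required  # maps attribute name -> expected Python type

    def accepts(self, thing):
        """True if the thing satisfies every requirement of this model."""
        return all(
            key in thing.attributes and isinstance(thing.attributes[key], expected)
            for key, expected in self.required.items()
        )


def types_of(thing, models):
    """All models the same object conforms to at the same time."""
    return [m.name for m in models if m.accepts(thing)]


# Hypothetical models drawn from unrelated ontologies.
hr_employee = Model("hr.Employee", {"name": str, "salary": int})
crm_customer = Model("crm.Customer", {"name": str, "account_no": str})
gis_location = Model("gis.Location", {"lat": float, "lon": float})

joe = Thing(name="Joe Smith", salary=50000, account_no="A-17")
print(types_of(joe, [hr_employee, crm_customer, gis_location]))
```

The same object is simultaneously an `hr.Employee` and a `crm.Customer` while failing `gis.Location`, so each model stays strict even though the object was never declared to be of any one type.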

The ideas that flow from this general notion (or more correctly, caused me to refactor the general notion out of these specific ideas), fall into general categories like:

  • mixins
  • data integration using semantic mapping
  • code integration of uncooperative APIs and Frameworks
  • promoting "roles" over "is-a subclassing"
  • open classes and external methods
  • ontology mediation
  • object evolution and class/API versioning

My Blog Disclaimers

The following are my standard disclaimers that apply to this entire blog...

Disclaimer 1: my usual mode of operation is to "discover" something for myself and then later search the literature to discover that someone else (say, Plato) has already had that brain storm. Thus my intuition that "this means something, this is important" only provokes a "DUH!" if you already know that it was first written by a philosopher twenty-five hundred years ago, or in a Norwegian PhD thesis 10 years ago. Have patience with the wide-eyed innocence which will eventually be made wise. These entries are a journey.


Disclaimer 2: in transcribing my notes a year and a half after the fact, I will sometimes use a philosophical term (e.g. essential/accidental) that I did not know at the time, in order to make it more clear what I was trying to say at that time.


Disclaimer 3: having just started teaching myself Philosophy 101, I know just enough to be dangerous. However, I have 30+ years of hands-on software engineering knowledge and experience, so I feel qualified to recognize gaps in today's state of the art (as practiced in the real world, i.e. corporations, not academia) which can be filled with concepts developed in Philosophy.

POSTSCRIPT - March 1st, 2010
I found a quote that echoes my disclaimer number one. From "Corporate Entity" by Jan Dejnožka: "It is usual to say that in studying philosophy of X, that we do best to acquire a pre-philosophical understanding of X before philosophizing about it. The more we know about X pre-philosophically, the better our philosophy of X will be."

Friday, September 21, 2007

The Savant Semantic Network Database

Back in 1983, when "expert systems" were just starting to be all the rage, I had a contract (and was later hired as Chief Scientist) with a startup company that wanted to develop several microcomputer based "business" applications. I realized that they were all essentially database applications with a forms-centric user interface. [Mind you, this was back before there were real databases on PCs, and 80x24 color text was the state of the art for user interfaces.] I decided to build a general database and forms UI engine from scratch, and then quickly create the business apps (e.g. contacts, H/R data, calendar, memos, etc) using the engine. The engine eventually became a product in its own right and was dubbed Savant. Mark Wozniak (brother of the famous Woz) ran an early computer store in Silicon Valley and was an early enthusiast for the technology.

Having always been interested in AI and general knowledge representation, and with expert systems looking hot, I took a semantic networking approach. Savant had an interactive development mode where the user could move the cursor around the screen and dynamically define form fields, associating each field with an EAV (entity/attribute/value) data triple (way before the term EAV was widespread). The name of each form field was the "attribute", the value entered into the field was the "value", and the "entity" was the value of some other form field that the user specified. Savant let users create their app, complete with database, on the fly, with the form definitions stored in the same EAV database. I had some novel techniques to optimize the data retrieval (using my written-from-scratch B+/B* tree DB kernel). All this was written in UCSD Pascal (grandpa to Java) such that it ran on IBM PC, Radio Shack TRS80, Apple ][, and exotic 32-bit computers, and ran on FLOPPIES! Only shortly later, when Corvus hard drives for Apple ][ and the IBM XT came out, did it benefit from their size and speed. To UCSD's credit, I didn't really have to change any code.

One of my inspirations was to make a more intuitive approach for non-technical users than was required by a popular database of the day, pfs:File. It also let users create form screens and save the data from each screen into its own "table". I thought that it was too complicated to know which screen each bit of data you wanted was on, plus, you couldn't share data between screens! My goal was to have all data available from all screens/forms without having to know about tables and schemas. One would just design forms and the data would magically go to the right place.

The other day, I stumbled across MindModel, a product that looks very similar to Savant (except it's written for the new-fangled graphic user interface :-) It also looks like it may have the same fatal flaw that my system had. Namely, Savant didn't (because I didn't) understand that the value of a "name" attribute of an entity is not the same thing as the entity itself (nor is it a key). E.G. (JoeSmith,phone#,123-456-7890) does not represent the same information as (key123,name,JoeSmith) plus (key123,phone#,123-456-7890). So, if you change JoeSmith to JoeBlow, you either lose the association with other attributes tied to JoeSmith, or you must remap them all. So, Savant changed all the JoeSmith references to JoeBlow in all triples, but then what to do about the phone number changing? If you change all 123-456-7890 references to 321-654-9999, you have not only changed JoeSmith's phone number but anyone else who shared that number. In other words, it didn't really understand the concepts of entity and attribute.
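
The difference between the two encodings can be sketched with plain tuples. This is a reconstruction for illustration only, not Savant's actual code; `JaneDoe`, `key456`, and both helper functions are made up.

```python
# Flawed scheme: the name doubles as the entity.
by_name = {
    ("JoeSmith", "phone#", "123-456-7890"),
    ("JaneDoe", "phone#", "123-456-7890"),  # shares Joe's number
}

# Keyed scheme: the name is just another attribute of an opaque key.
by_key = {
    ("key123", "name", "JoeSmith"),
    ("key123", "phone#", "123-456-7890"),
    ("key456", "name", "JaneDoe"),
    ("key456", "phone#", "123-456-7890"),
}


def set_value(triples, entity, attribute, new_value):
    """Replace the value of one attribute of one entity."""
    kept = {t for t in triples if not (t[0] == entity and t[1] == attribute)}
    return kept | {(entity, attribute, new_value)}


def replace_everywhere(triples, old, new):
    """Rewrite every occurrence of a value, wherever it appears."""
    return {tuple(new if part == old else part for part in t) for t in triples}


# Keyed scheme: renaming Joe and changing his number touches only key123.
renamed = set_value(by_key, "key123", "name", "JoeBlow")
rephoned = set_value(renamed, "key123", "phone#", "321-654-9999")

# Flawed scheme: the global rewrite changes Jane's number as well.
clobbered = replace_everywhere(by_name, "123-456-7890", "321-654-9999")
print(sorted(clobbered))
```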

I see the same fuzzy understanding of entities and attributes in many articles about the Semantic Web and how easy it is to encode data such that any lay person can do it intuitively. They show the same (JoeSmith,phone#,123-456-7890) flavor examples. We all need more basic metaphysics and ontology training in grade school!

Sunday, September 9, 2007

Quantum Math for Fuzzy Ontologies

In my earlier post "Existential Programming as Quantum States", I mused that objects that were simultaneously carrying properties from multiple ontologies (i.e. multiple class hierarchies or data models), were like Quantum States in Quantum Physics.  This led me later to wonder what math had been developed to work with quantum states...i.e. is there some sort of quantum algebra that might be applicable to Existential Programming? It is needed because, in Existential Programming, a property of an object might carry multiple conflicting values simultaneously, each with varying degrees of certainty or confidence or error margins.

I found the Wikipedia page on quantum indeterminacy, which looks applicable:
"Quantum indeterminacy can be quantitatively characterized by a probability distribution on the set of outcomes of measurements of an observable. The distribution is uniquely determined by the system state, and moreover quantum mechanics provides a recipe for calculating this probability distribution.
Indeterminacy in measurement was not an innovation of quantum mechanics, since it had been established early on by experimentalists that errors in measurement may lead to indeterminate outcomes. However, by the later half of the eighteenth century, measurement errors were well understood and it was known that they could either be reduced by better equipment or accounted for by statistical error models. In quantum mechanics, however, indeterminacy is of a much more fundamental nature, having nothing to do with errors or disturbance."
AHA! It dawns on me that going beyond the mere fuzzy logic idea of values having a probability or certainty factor, Existential Programming could have a fuzziness value for the property as a whole...as in "it is not certain that this property even applies to this object"...and even further it could mean "it is not certain that this property even applies to the entire Class".  A FUZZY ONTOLOGY: method of associating attributes/relationships with entities where each entity is not conclusively known. The value of a property may be certain (i.e. not vague or probabilistic), but whether that property belongs to this object is fuzzy.

Why would you want that ability? How about data mining web pages where several people's names and a single birth-date (or phone number, address, etc) are found. Even though it isn't known which person's name is associated with the birthday, one could associate the birth-date with each person with some fractional probability. With enough out of focus wisps of data like this, from many web pages, the confidence factor of the right birthdate with the right person would rise to the top of the list of all possible dates (analogous to the way that very long range telescopes must accumulate lots of individual, seemingly random, photons to build up a picture of the stars/galaxies being imaged). The fractional probability assigned could be calculated with heuristics like "the probability assigned falls as the lexical distance between the date and the name grows". This could make the "value" of a scalar property (like birth-date), in reality, the summarization of a complete histogram of values-by-source-web-pages.
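
Here is a small sketch of that accumulation. The even split of each page's evidence among the names on it is a deliberately crude assumption standing in for whatever distance-based heuristic would really be used, and the page data is made up.

```python
from collections import defaultdict


def spread_evidence(pages):
    """Accumulate fractional name/birth-date associations across pages.

    Each page is (names_found, birth_date_found). As a deliberately
    simple assumption, a page's single unit of evidence is split
    evenly among the names on it; a real heuristic might weight by
    how close the name is to the date in the text.
    """
    histogram = defaultdict(lambda: defaultdict(float))
    for names, birth_date in pages:
        share = 1.0 / len(names)
        for name in names:
            histogram[name][birth_date] += share
    return histogram


def best_guess(histogram, name):
    """The date with the most accumulated weight for this person."""
    candidates = histogram[name]
    return max(candidates, key=candidates.get)


# Hypothetical scraped pages.
pages = [
    (["Ann", "Bob"], "1970-01-01"),
    (["Ann", "Cid"], "1970-01-01"),
    (["Bob", "Cid"], "1982-05-05"),
    (["Bob"], "1982-05-05"),
]
hist = spread_evidence(pages)
print(best_guess(hist, "Ann"), best_guess(hist, "Bob"))
```

No single page says whose birthday is whose, yet the per-person histograms separate once a few pages overlap, which is the telescope effect described above.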



Friday, August 31, 2007

Fuzzy Unit Testing, Performance Unit Testing

In reading Philosophy 101, about Truth with a capital "T", and the non-traditional logics that use new notions of truth, we of course arrive at Fuzzy Logic with its departure from simple binary true/false values, and embrace of an arbitrarily wide range of values in between.

Contemplating this gave me a small AHA moment: Unit Testing is an area where there is an implicit assumption that "Test Passes" has either a true or false value.  How about Fuzzy Unit Testing where there is some numeric value in the 0...1 range which reports a degree of pass/fail-ness? i.e. a percentage pass/fail for each test.  For example, testing algorithms that predict something could be given a percentage pass/fail based on how well the prediction matched the actual value.  Stock market predictions, bank customer credit default prediction, etc come to mind.  This sort of testing of predictions about future defaults (i.e. credit grades) is just the sort of thing that the BASEL II accords are forcing banks to start doing.
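
A sketch of what a fuzzy test result might look like, assuming a linear fall-off and an arbitrary tolerance (both are illustration choices, and the default-rate figures are made up):

```python
def fuzzy_score(predicted, actual, tolerance):
    """Degree of pass-ness in the 0..1 range.

    1.0 means an exact match; the score falls off linearly and
    reaches 0.0 once the error is as large as `tolerance`.
    The linear fall-off is an arbitrary choice for illustration.
    """
    error = abs(predicted - actual)
    return max(0.0, 1.0 - error / tolerance)


def run_fuzzy_suite(cases, tolerance):
    """Return a per-test score instead of a bare pass/fail flag."""
    return {
        name: fuzzy_score(predicted, actual, tolerance)
        for name, (predicted, actual) in cases.items()
    }


# Hypothetical default-rate predictions versus observed rates.
cases = {
    "grade_A": (0.02, 0.02),
    "grade_B": (0.05, 0.07),
    "grade_C": (0.10, 0.25),
}
scores = run_fuzzy_suite(cases, tolerance=0.10)
for name, score in scores.items():
    print(name, round(score, 2))
```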

Another great idea (if I do say so myself) that I had a few years ago was the notion that there is extra meta-data that could/should be gathered as a part of running unit test suites; specifically, the performance characteristics of each test run.  The fact that a test still passes, but is 10 times slower than the previous test run, is a very important piece of information that we don't usually get.  Archiving and reporting on this meta-data about each test run can give very interesting metrics on how the code changes are improving/degrading performance on various application features/behavior over time.  I can now see that this comparative performance data would be a form of fuzzy testing.
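
As a rough sketch of gathering that meta-data (the function names and the result layout are assumptions, and a real harness would persist the results between runs), each test can be timed as it runs and then compared with the archived timing from the previous run:

```python
import time


def run_timed(tests):
    """Run each test callable and record (passed, seconds) per test."""
    results = {}
    for name, test in tests.items():
        start = time.perf_counter()
        try:
            test()
            passed = True
        except AssertionError:
            passed = False
        results[name] = (passed, time.perf_counter() - start)
    return results


def slowdowns(previous, current, factor=10.0):
    """Names of tests that still pass but got `factor` times slower."""
    flagged = []
    for name, (passed, seconds) in current.items():
        if name in previous and passed:
            _, old_seconds = previous[name]
            if old_seconds > 0 and seconds >= factor * old_seconds:
                flagged.append(name)
    return flagged


# Hypothetical archived and latest timings, in seconds.
archived = {"test_lookup": (True, 0.002), "test_save": (True, 0.010)}
latest = {"test_lookup": (True, 0.030), "test_save": (True, 0.011)}
print(slowdowns(archived, latest))
```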

Sunday, August 26, 2007

Birth of a Blog

On this date I had the blinding realization that after a year and a half of thinking and note scribbling about this thing I was calling Existential Programming, I should grab the domain name and start a blog to plant my flag on the name and topic.  I had the thought while recording notes on the original ideas that led to Existential Programming...

Recalling the inspiration for Existential Programming, the concept

While working on a JavaScript project in early 2006, I became aware of the differences between Java class-based objects and JavaScript's prototype-based objects. I wanted to work with Java-like classes and researched existing attempts to implement them and what it would take to "do it right".  In the course of comparing different attempts, I came across different definitions of "class".

Once I realized the ability to dynamically add and delete attributes of an object in JavaScript, I realized that one could nicely map them onto semantic-network relations/tuples. I already knew about EAV database schemas.  I had the epiphany that O/O and E/R and S/N modeling could all be made isomorphic once one had class-less objects as JavaScript had.

Once I realized that classes were isomorphic to semantic networks, and having already known that computer ontologies were anything but universally agreed upon, I had the epiphany that OO class structures were too restrictive because they assume a single ontology.  On the other hand, I did not want to give up the benefits of strong typing.  So I had the idea that objects should be able to simultaneously house the attributes of multiple ontologies, and with class-less objects I could see how to implement the whole system.

Together, these trains of thought somehow gave rise to the intuition "I'll bet people have already thought about this...like maybe in Philosophy?". I quickly discovered that Philosophy had everything to do with this topic. And once I found out what Existentialism was, it became clear that class-less objects embody the same deep idea.

So, I coined the term "Existential Programming" to refer to the whole "project" of exploiting type-less objects to implement multiple strong types simultaneously.

Which came first - the Chip or the Socket?

Q: Which came first, the integrated circuit chip, or the socket into which it fits?

A: The socket came first (at least conceptually). Before a chip is designed, there is a framework defined into which the chip will integrate: Digital vs analog signals, voltage levels for digital one and digital zero, power requirements, clock rates, transistor type, etc. This usually results in the design of whole families of chips that are intended to work together.

SO, no chip is created in isolation, just as no software component should be. Otherwise, the result would be that there is no common framework with which other components can be attached (Super-glue, play-doh, and duct tape notwithstanding).

Of course, as I wrote back in 2000, there is no such thing as a component (in the same way that there is no such thing as a donut hole).

Sunday, August 12, 2007

Killer App of Existential Programming?

Ok, with all this theorizing about "existential programming", what the heck is it good for?
I.E. What is the killer app of existential programming?

The answer could be "integration of uncooperating systems"...
  • Data Integration
  • Data Model (i.e. schema) Integration
  • Application Integration
  • Frameworks API Integration
By developing an information (code/data) environment where different paradigms can coexist without even being aware of each other, one can mediate between uncooperating systems that otherwise could not be integrated.

Saturday, July 28, 2007

Things vs Stuff, Individuals vs Commodities

In thinking about "what is a thing?", I had this brain fart about the dividing line between things that have an individual identity and "commodities" or "substances". That dividing line is between those things that are chosen or manipulated by "amounts" rather than "names". I.E. When you ask for X number of things (a dozen donuts), or Y ounces of stuff (an ounce of prevention ;-) (i.e. mass nouns), instead of asking for "that one" (i.e. using an indexical), you are referring to things that may actually exist as separate things (e.g. pork bellies, batteries, molecules) but they are so similar that we quit referring to individual ones as individuals. EVEN WHEN picking out a single screw from the basket of screws at the hardware store, it doesn't really have an individual identity! We picked "one", not "that one", much less "Joe".
In fact, to give "names" back to things that have become commodities, we resort to assigning serial numbers to make them each unique.

[Ed. Note - 11/21/07: as per my disclaimers, once I start looking for my epiphanies on the net, I find them. E.G. in this case, see "Conceiving of entities as objects and as stuff". Congrats Bruce, you've just discovered that there are mass nouns vs count nouns.]


Monday, June 11, 2007

Different orders of Polymorphism

When most computer science texts refer to polymorphism (in the context of object oriented programming) they are referring to either refining the implementation of a method via subclasses overriding that method, OR, by defining more than one behavior (method) with the same name. The former case (hopefully) preserves the semantics of the method name, the latter overloads multiple semantics on the same name. In either case, it is only the method that is multifaceted, not the entire object much less the class hierarchy.

With Existential Programming, a different order of polymorphism (i.e. "a single entity can take on multiple forms") is at work. Here a single entity in its entirety can take on multiple forms. Heck, it can take on multiple ontologies (aka class hierarchies). This means either supporting multiple (albeit, fundamentally similar) views/conceptions of an entity, OR, trying to knit together very different models where the object instance is in some sort of oscillating Schrödinger's Cat-like state (except that we can look at it and not affect its state :-).

Monday, May 14, 2007

Rationalism of the Semantic Web a Flaw?

In thinking about how knowledge is represented in the real world, it occurs to me that, in addition to the traditional rational representation via human and formal languages, there are at least two other biggies: DNA and neural nets.

Many, if not all, of the things we learn are represented in our brains as complex networks of (mathematical) functions that are simulated in computers via the technique of neural networks. And, if you have ever looked at the data associated with a neural net that has been trained to, say, recognize a handwritten letter "A", you've seen how there isn't any way to put that knowledge into words. In fact, it is not clear by simply looking at a net's matrix of fuzzy weights that it knows anything at all, much less what it knows!

In a similar vein, DNA obviously encodes tremendous amounts of knowledge, like how to make a beating heart, that are also very hard to translate into explicit "facts". In fact, DNA, like self-modifying source code in Lisp, is hard to verbalize because what it "does" can be very far removed from what it "says".

Now because the whole theory and strategy of representing knowledge via "facts", whether in a traditional database, or a semantic network, assumes that everything can be put into a fact format, there are huge amounts of knowledge in the world that can't be used. Yes, one can simplistically use a BLOB to stuff anything reducible to a number string into a database tuple, but that doesn't really put the knowledge into a form that can be "reasoned" about via inferencing rules. And it is that sort of inferencing that is the rationale behind the Semantic Web and why we should bother to create it.

Is this a (fatal) flaw in the foundation of the Semantic Web? Many philosophers have claimed it is a fatal flaw in Rationalism which is the philosophical equivalent to semantic networks. Phenomenologists insisted that many other flavors of knowledge had to be handled beyond the ones covered in rationalism and logic. How will the semantic web handle these? Is it even possible philosophically?

== considered harmful

Once one realizes that objects aren't entities, and that they very often are representations of real world things, one realizes that the "==" (equals) operator is dangerous!




As shown in the diagram above, it is quite possible that two different objects can represent the same real world thing, and the same object can, at different times, represent different things.  Since "==" simply says that two different references to objects point to the same object, it has little to do with the references pointing to the same "thing".
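
Here is a small sketch of the problem, written in Python, where the "is" operator plays the part of a reference-comparing "==" (as in Java). The Record class, its thing_id field, and the same_thing helper are hypothetical names invented for this illustration; how one actually decides that two objects denote the same thing is the hard part and is simply assumed here.

```python
class Record:
    """An object that *represents* a real-world thing.

    `thing_id` stands in for whatever identifies the thing represented;
    it is an assumption made for this sketch.
    """

    def __init__(self, thing_id, name):
        self.thing_id = thing_id
        self.name = name


def same_thing(a, b):
    """Compare what two objects represent, not which objects they are."""
    return a.thing_id == b.thing_id


# Two different objects representing the same real-world thing
clark = Record("person-1", "Clark Kent")
superman = Record("person-1", "Superman")

identical_objects = clark is superman          # reference identity
same_referent = same_thing(clark, superman)    # identity of the thing

# One object that, over time, comes to represent a different thing
slot = Record("person-1", "Clark Kent")
earlier_ref = slot
slot.thing_id, slot.name = "person-2", "Bruce Wayne"
still_same_object = earlier_ref is slot
still_same_thing = same_thing(earlier_ref, clark)
```

Reference identity answers "no" in the first case and "yes" in the second, which is exactly backwards from what we want to know about the things being represented.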


Monday, April 16, 2007

What is the difference between an Ontology and a Taxonomy

The following has been posted on a few sites and I don't know which was the original. In any event, it is very germane to my purpose, so I republish it here...

A controlled vocabulary is a list of terms that have been enumerated explicitly. This list is controlled by and is available from a controlled vocabulary registration authority. All terms in a controlled vocabulary should have an unambiguous, non-redundant definition. This is a design goal that may not be true in practice. It depends on how strict the controlled vocabulary registration authority is regarding registration of terms into a controlled vocabulary. At a minimum, the following two rules should be enforced:

  1. If the same term is commonly used to mean different concepts in different contexts, then its name is explicitly qualified to resolve this ambiguity.
  2. If multiple terms are used to mean the same thing, one of the terms is identified as the preferred term in the controlled vocabulary and the other terms are listed as synonyms or aliases.
A taxonomy is a collection of controlled vocabulary terms organized into a hierarchical structure. Each term in a taxonomy is in one or more parent-child relationships to other terms in the taxonomy. There may be different types of parent-child relationships in a taxonomy (e.g., whole-part, genus-species, type-instance), but good practice limits all parent-child relationships to a single parent to be of the same type. Some taxonomies allow poly-hierarchy, which means that a term can have multiple parents. This means that if a term appears in multiple places in a taxonomy, then it is the same term. Specifically, if a term has children in one place in a taxonomy, then it has the same children in every other place where it appears.

A thesaurus is a networked collection of controlled vocabulary terms. This means that a thesaurus uses associative relationships in addition to parent-child relationships. The expressiveness of the associative relationships in a thesaurus varies and can be as simple as "related to term" as in term A is related to term B.

People use the word ontology to mean different things, e.g. glossaries & data dictionaries, thesauri & taxonomies, schemas & data models, and formal ontologies & inference. A formal ontology is a controlled vocabulary expressed in an ontology representation language. This language has a grammar for using vocabulary terms to express something meaningful within a specified domain of interest. The grammar contains formal constraints (e.g., specifies what it means to be a well-formed statement, assertion, query, etc.) on how terms in the ontology's controlled vocabulary can be used together.

People make commitments to use a specific controlled vocabulary or ontology for a domain of interest. Enforcement of an ontology's grammar may be rigorous or lax. Frequently, the grammar for a "light-weight" ontology is not completely specified, i.e., it has implicit rules that are not explicitly documented.

A meta-model is an explicit model of the constructs and rules needed to build specific models within a domain of interest. A valid meta-model is an ontology, but not all ontologies are modeled explicitly as meta-models. A meta-model can be viewed from three different perspectives:

  1. as a set of building blocks and rules used to build models
  2. as a model of a domain of interest, and
  3. as an instance of another model.
When comparing meta-models to ontologies, we are talking about meta-models as models (perspective 2).

Note: Meta-modeling as a domain of interest can have its own ontology. For example, the CDIF Family of Standards, which contains the CDIF Meta-meta-model along with rules for modeling and extensibility and transfer format, is such an ontology. When modelers use a modeling tool to construct models, they are making a commitment to use the ontology implemented in the modeling tool. This model making ontology is usually called a meta-model, with "model making" as its domain of interest.

Bottom line: Taxonomies and Thesauri may relate terms in a controlled vocabulary via parent-child and associative relationships, but do not contain explicit grammar rules to constrain how to use controlled vocabulary terms to express (model) something meaningful within a domain of interest. A meta-model is an ontology used by modelers. People make commitments to use a specific controlled vocabulary or ontology for a domain of interest.

Friday, April 13, 2007

Don't confuse (OOP) Objects with Entities

AHA! (OOP language) Classes are too general to think of as synonymous with (Philosophical) Entities! OOP classes/objects can represent roles, or values of properties (e.g. Red), or PropertyTypes (e.g. Color), NONE OF WHICH are Entities. SO, the "isA" relationship between a superClass and its subClasses is not always analogous with the "isA" relationship between an Entity and the collection of Entities constituting an entityType.

"Red isA Color" is different than "Joe isA Human" or "Human isA Mammal".
 Red is not an entity, Joe is and a Human is. Why? Red is not countable and not distinguishable from another "instance" of red; Joe, humans, and mammals are. [Ed. note: this intuition will later be recognized as the problem of universals as the author reads more books. ;-) ]

With "domain objects" (aka "business objects") being roughly equivalent to philosophical entities, it is dawning on the IT industry (e.g. POJOs, "domain driven development") that there are at least 2 kinds of classes/objects: "domain" and "non-domain".  There are wildly different ideas of what "non-domain" classes are, but they generally are meant to include all that programming implementation logic kind of stuff rather than the direct modeling of the world.

The main point here (and the AHA moment), is recognizing that lots of confusion about how to divide a problem into programming language classes (along with what logic goes into their "equals" methods) boils down to a lack of understanding about first philosophy concepts of entities, universals, etc.  We programmers confuse the ability of programming languages to implement everything as a class/object with the idea that down deep everything is the same kind of thing.
Philosophy would say that red is fundamentally different than Fred, and both are fundamentally different than a "process" (i.e. an OO method) or an "event" (i.e. an OO "message").

Thursday, April 12, 2007

"isA" and "asA" Relationships

In modeling the world, (object-oriented, entity-relationship, etc), the emphasis has been on distinguishing between "is-a" relationships and "has-a" relationships. (human isA mammal, human hasA head) There is another fundamental relationship that is under-emphasized in modeling; namely, the "as-a" relationship. This is the relationship between an entity and a "role" that that entity can take on. Many putative "entities" (e.g. customer, employee, etc) are not really entities at all, but are "roles" that the actual entity (e.g. a person) can take on. [see a case study here]

Roles are often implemented as classes, and multiple-inheritance is used [or worse, lots of glue code is written] to gain the lexical convenience of referencing joe.employeeID and joe.resign() versus joe.employeeRole.getID() and joe.employeeRole.resign(). Of course, "static" classes can be used to encapsulate the role details resulting in AsEmployee(joe).resign() references.
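
A minimal sketch of the last option, translated into Python, might look like this. The Person and EmployeeRole classes, the roles dictionary, and the as_employee function are all hypothetical names; the point is only that the role is a separate object attached to the entity rather than a superclass of it.

```python
class Person:
    """The actual entity; roles are attached to it, not inherited."""

    def __init__(self, name):
        self.name = name
        self.roles = {}


class EmployeeRole:
    """A role a person can take on (and later give up)."""

    def __init__(self, employee_id):
        self.employee_id = employee_id
        self.active = True

    def resign(self):
        self.active = False


def as_employee(person):
    """View a person 'as an' employee, or fail if they hold no such role."""
    try:
        return person.roles["employee"]
    except KeyError:
        raise LookupError(f"{person.name} is not an employee") from None


joe = Person("Joe")
joe.roles["employee"] = EmployeeRole(employee_id=1001)
as_employee(joe).resign()
```

Joe stays the same Person object before, during, and after his employment; only the attached role changes state.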

Monday, March 26, 2007

THE PHILOSOPHERS' PROJECT

Here is a bit of Philosophy jargon that will be new to programmers: "project".  It may best be explained via this excerpt from the world-wide best selling book,  Sophie's World...
THE PHILOSOPHER'S PROJECT

Here we are again! We'll go directly to today's lesson without detours around white rabbits and the like.

I'll outline very broadly the way people have thought about philosophy, from the ancient Greeks right up to our own day. But we'll take things in their correct order.

Since some philosophers lived in a different age--and perhaps in a completely different culture from ours--it is a good idea to try and see what each philosopher's project is. By this I mean that we must try to grasp precisely what it is that each particular philosopher is especially concerned with finding out. One philosopher might want to know how plants and animals came into being. Another might want to know whether there is a God or whether man has an immortal soul.

Once we have determined what a particular philosopher's project is, it is easier to follow his line of thought, since no one philosopher concerns himself with the whole of philosophy.
 

Monday, February 12, 2007

Captain's Log: My (new) niche in the World

Captain's Log, Lincoln's Birthday: After being absorbed in my new pastime these past several months trying to learn enough Philosophy to know whether it is worth learning at all to be of any practical use in my Software Engineering career, I have decided a decidedly *yes*. Not only is it worth continuing to pursue, it could well be my next career: Exploring and writing about the intersection between Philosophy and Computer Science. i.e. "Experimental Philosophy" or "Empirical Philosophy" or some such.
I can already see that Philosophy is a deep ocean to get drowned in; better to take on the much smaller pond that is cross-disciplinary studies. Nobody in mainstream computer programming is aware of this stuff! I know because I have a computer science degree and have been doing hands-on development for 30+ years from the days of spaghetti code thru structured programming thru modular programming thru embedded programming thru object oriented programming thru distributed programming thru AJAX. I think I saw one article about Plato in CACM once. There is a book in this; dare I say a series? At the very least I ought to start a blog to put these notebook entries someplace in the meantime.

Thursday, February 1, 2007

Identifying Ontologies with URIs

As noted before in my original epiphanies #10 & #15 and Level III Existential Programming, there should be a whole collection of meta-data around each "fact" documenting the source of that data, the time that data was acquired, etc, etc. This meta-data in effect constitutes an ontology definition; one per each different source/timestamp/etc. To be sure, these various ontologies are largely similar to each other and could benefit from an inheritance tree (à la OO class hierarchies).

In Java, such collections of definitions can be grouped into Packages. In XML, they can be grouped into Namespaces. So, a pragmatic way to factor out this metadata (such that it is not copied into every "fact") is to give the ontology a name and tag each fact with that name (in the same way as each Java value is "tagged" with a data type that includes the Package name/path). And since XML has already defined URIs as the format for namespace identifiers, and it is easy to map Package names to URIs, Existential Programming systems/languages could use URIs to identify the ontology associated with some set of facts. Since Existential Programming systems seek to seamlessly convert between an O/O and an E/R and an S/N representation, having OO packages easily mappable to data exchange mechanisms like XML is a good thing.
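
As a rough sketch of the tagging idea: each fact carries one extra field naming its ontology by URI, and a package-style name is mapped to a URI by a simple rule. The Fact tuple, the function names, the example.org base address, and the dots-to-slashes mapping are all assumptions made for illustration; no standard defines this particular mapping.

```python
from collections import namedtuple

# A "fact" tagged with the URI of the ontology it belongs to
Fact = namedtuple("Fact", "subject attribute value ontology")


def package_to_uri(package, base="http://example.org/ontology/"):
    """Map a Java-style package name to a URI (hypothetical convention)."""
    return base + package.replace(".", "/")


def facts_in(facts, ontology_uri):
    """Select only the facts asserted under one ontology."""
    return [f for f in facts if f.ontology == ontology_uri]


hr = package_to_uri("com.example.hr")
crm = package_to_uri("com.example.crm")

facts = [
    Fact("joe", "title", "Engineer", hr),
    Fact("joe", "segment", "Gold", crm),
    Fact("joe", "hired", "2006-03-01", hr),
]
hr_facts = facts_in(facts, hr)
```

The source, timestamp, and other meta-data then need to be recorded only once, against the ontology URI, instead of being copied into every fact.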

POSTSCRIPT - Nov 23rd, 2007
It is hard to tell which of my epiphanies are new and which are things I read long ago but later remembered as if they were a new idea...I'm sure I read this URI stuff in the Semantic Web articles (see Principle #1), but only "remembered" it when stumbling back across it today. At any rate, it was only recently that Tim Berners-Lee wrote this post about Linked Data using URIs.

Sunday, January 21, 2007

Imaginary Numbers paradigm for Existential Programming

It occurs to me that one of the things that Existential Programming hopes to enable is the ability to continue working with data that is vague, fuzzy, semi-inconsistent instead of screeching to a halt as would be the case with a strongly-typed implementation of a single ontology.

An analog to this is the invention (discovery?) of Imaginary Numbers in mathematics. The imaginary number "i" is defined to be the square root of -1. Now the mildly mathematical reader will note that a negative number can't have a real square root, because squaring any real number never gives a negative result. So, when early mathematicians came to a point in their formulas where a square root of a negative number was required, they were stuck. By creating a way to talk about and manipulate numbers that "can't exist" (i.e. imaginary numbers), formulas could be worked through such that "real" answers could eventually emerge.
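
A classic worked case (my choice of example, not something from the history above) is the cubic x^3 = 15x + 4. Cardano's formula for it passes through the square root of -121, yet the final answer is the perfectly real number 4. The sketch below just follows the formula using Python's built-in complex numbers.

```python
import cmath

# Cardano's formula for x**3 = p*x + q:
#   x = cbrt(q/2 + s) + cbrt(q/2 - s),  where s = sqrt((q/2)**2 - (p/3)**3)
p, q = 15, 4

# (q/2)**2 - (p/3)**3 = 4 - 125 = -121, so s is purely imaginary
s = cmath.sqrt((q / 2) ** 2 - (p / 3) ** 3)

# Principal cube roots of the two complex conjugates 2 + 11i and 2 - 11i
u = (q / 2 + s) ** (1 / 3)
v = (q / 2 - s) ** (1 / 3)

# The imaginary parts cancel, leaving a real root (up to rounding error)
x = u + v
```

The intermediate values u and v are "illegal" as far as real arithmetic is concerned, but carrying them along anyway is what lets the calculation finish.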

By developing techniques to work with data that is not consistent with a single ontology (i.e. existential programming), programs can get past the "that's not legal data" stage and work their way to answers that ultimately do result in "legal data".