Wednesday, September 26, 2007

Introduction to Existential Programming Blog

This first entry for "Existential Programming" is meant to introduce the topic and explain the backdated entries that will be added over time. Around May 2006, I had a series of brain farts (ahem, epiphanies) about computer software engineering that led to a theory I decided to call "Existential Programming". I have now collected about a year and a half of unedited notebook entries, and I have enough of an idea of what I mean by Existential Programming to write it up; but because I don't yet have the time to polish a manifesto into a magazine article, much less an academic paper, I have decided to start this blog, backfill it with my notebook contents, and put future entries here. The goal is to plant my flag on the topic and its ideas now, even if I can't write "Existential Programming, the Book" yet. [BTW, see my standard blog disclaimers below.]
A meta-idea above Existential Programming itself is the more general notion that there should be much more explicit cross-fertilization between ideas from Philosophy (with a capital "P") and ideas from software engineering (as practiced in industry). I call this "Philosophical Programming". Early on during my epiphanies, I had the intuition that Philosophy probably had something to say about my topic (even though I'd never taken a philosophy class). So, at age 50, I started reading Philosophy 101 books. It quickly became obvious that Philosophy has SO MUCH to say about data/class modeling that it is criminal how little explicit reference to it there is in the software practice literature. I distinguish here between industry practice (with its books, blogs, and magazines oriented towards tools and "best practices") and academia. Once I had learned enough terminology to search for papers covering ideas similar to mine, I found that there is a whole subculture writing academic conference papers that never really bleed over into industry conferences for things like Java or AJAX or SOA. So, my general "project" these days is to come up with practical application techniques based on otherwise esoteric topics.
What is "Existential Programming" and what are the central ideas associated with it? In a nutshell, it means to embrace the notion that "existence precedes essence". I.E. develop data models, object models, programming frameworks, etc. without imposing a single E/R model, OO class hierarchy, ontology, etc. By using techniques where "objects" and "entities" exist independently of a single "strong type", they can integrate multiple "strongly typed" data models by letting objects simultaneously reflect all those models. This differs from weakly typed programming, or completely type-less programming, as can be found in scripting languages like JavaScript. Ironically, it takes a "type-less" foundation to really seriously do strong types in a "multi-cultural" world. [And actually, JavaScript is not a bad platform to implement these ideas precisely because of its class-less orientation.]

The ZEN thought here is that until one can create a type-less object, one cannot create an object which can be all types simultaneously.
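
To make this concrete, here is a minimal JavaScript sketch (all names are hypothetical illustrations, not from any real framework) of a single class-less object simultaneously reflecting two different data models:

    // One class-less object accumulates properties from two different models.
    var entity = {};

    // Facet from a CRM model: a "Customer"...
    entity.customerId = 42;
    entity.creditLimit = 5000;

    // ...and a facet from an HR model: an "Employee".
    entity.employeeId = "E-17";
    entity.salary = 65000;

    // A "type" is then just a view: does the object carry the
    // properties that a given model expects?
    function hasType(obj, requiredProps) {
        for (var i = 0; i < requiredProps.length; i++) {
            if (!(requiredProps[i] in obj)) { return false; }
        }
        return true;
    }

    console.log(hasType(entity, ["customerId", "creditLimit"])); // true: a Customer
    console.log(hasType(entity, ["employeeId", "salary"]));      // true: also an Employee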

The ideas that flow from this general notion (or, more correctly, the specific ideas from which I refactored the general notion) fall into general categories like:

  • mixins (see the sketch after this list)
  • data integration using semantic mapping
  • code integration of uncooperative APIs and Frameworks
  • promoting "roles" over "is-a subclassing"
  • open classes and external methods
  • ontology mediation
  • object evolution and class/API versioning
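
As a taste of the first category above, here is a minimal mixin sketch in JavaScript (helper names are hypothetical; nothing beyond the core language is assumed): a "role" is grafted onto an existing object after the fact, rather than being fixed at birth by an is-a class hierarchy:

    // A mixin: a bag of methods that can be grafted onto any object.
    var Persistable = {
        save: function () {
            console.log("saving entity " + this.id);
        }
    };

    // Copy the mixin's methods onto an existing object, after the fact.
    function mixInto(target, mixin) {
        for (var name in mixin) {
            if (mixin.hasOwnProperty(name)) {
                target[name] = mixin[name];
            }
        }
        return target;
    }

    var account = { id: 7, balance: 100 };
    mixInto(account, Persistable);
    account.save(); // the object now plays the "persistable" role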

My Blog Disclaimers

The following are my standard disclaimers that apply to this entire blog...

Disclaimer 1: my usual mode of operation is to "discover" something for myself and only later search the literature, where I find that someone else (say, Plato) has already had that brainstorm. Thus my intuition that "this means something, this is important" will only provoke a "DUH!" if you already know that it was first written down by a philosopher twenty-five hundred years ago, or in a Norwegian PhD thesis ten years ago. Have patience with the wide-eyed innocence that will eventually be made wise. These entries are a journey.


Disclaimer 2: in transcribing my notes a year and a half after the fact, I will sometimes use a philosophical term (e.g. essential/accidental) that I did not know at the time, in order to make clearer what I was trying to say then.


Disclaimer 3: having just started teaching myself Philosophy 101, I know just enough to be dangerous. However, I have 30+ years of hands-on software engineering knowledge and experience, so I feel qualified to recognize gaps in today's state of the art (as practiced in the real world, i.e. corporations, not academia) which can be filled with concepts developed in Philosophy.

POSTSCRIPT - March 1st, 2010
I found a quote that echoes my disclaimer number one. From "Corporate Entity" by Jan Dejnožka: "It is usual to say that in studying philosophy of X, that we do best to acquire a pre-philosophical understanding of X before philosophizing about it. The more we know about X pre-philosophically, the better our philosophy of X will be."

Friday, September 21, 2007

The Savant Semantic Network Database

Back in 1983, when "expert systems" were just starting to be all the rage, I had a contract (and was later hired as Chief Scientist) with a startup company that wanted to develop several microcomputer-based "business" applications. I realized that they were all essentially database applications with a forms-centric user interface. [Mind you, this was back before there were real databases on PCs, and 80x24 color text was the state of the art for user interfaces.] I decided to build a general database and forms-UI engine from scratch, and then quickly create the business apps (e.g. contacts, H/R data, calendar, memos, etc.) using the engine. The engine eventually became a product in its own right and was dubbed Savant. Mark Wozniak (brother of the famous Woz) ran an early computer store in Silicon Valley and was an early enthusiast for the technology.

Having always been interested in AI and general knowledge representation, and with expert systems looking hot, I took a semantic-network approach. Savant had an interactive development mode in which the user could move the cursor around the screen and dynamically define form fields, associating each field with an EAV (entity/attribute/value) data triple (long before the term EAV was widespread). The name of each form field was the "attribute", the value entered into the field was the "value", and the "entity" was the value of some other form field that the user specified. Savant let users create their app, complete with database, on the fly, with the form definitions stored in the same EAV database. I had some novel techniques to optimize the data retrieval (using my written-from-scratch B+/B* tree database kernel). All this was written in UCSD Pascal (grandpa to Java) such that it ran on the IBM PC, Radio Shack TRS-80, Apple ][, and exotic 32-bit computers, and it ran on FLOPPIES! Only shortly later, when Corvus hard drives for the Apple ][ and the IBM XT came out, did it benefit from their size and speed. To UCSD's credit, I didn't really have to change any code.
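
For readers who haven't met EAV, here is a minimal JavaScript sketch (names are hypothetical; Savant itself was Pascal, not JavaScript) of how form fields map onto entity/attribute/value triples:

    // An EAV store is just a list of (entity, attribute, value) triples.
    var triples = [];

    function assert(entity, attribute, value) {
        triples.push({ e: entity, a: attribute, v: value });
    }

    // Filling in a form whose designated "entity" field holds "JoeSmith"
    // generates one triple per remaining field.
    assert("JoeSmith", "phone#", "123-456-7890");
    assert("JoeSmith", "dept", "Engineering");

    // Any form can then look up any attribute of any entity.
    function valueOf(entity, attribute) {
        for (var i = 0; i < triples.length; i++) {
            if (triples[i].e === entity && triples[i].a === attribute) {
                return triples[i].v;
            }
        }
        return undefined;
    }

    console.log(valueOf("JoeSmith", "phone#")); // "123-456-7890"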

One of my inspirations was to offer non-technical users a more intuitive approach than that required by a popular database of the day, pfs:File. pfs:File also let users create form screens, saving the data from each screen into its own "table". I thought it was too complicated to have to know which screen each bit of data you wanted was on; plus, you couldn't share data between screens! My goal was to have all data available from all screens/forms without having to know about tables and schemas. One would just design forms and the data would magically go to the right place.

The other day I stumbled across MindModel, a product that looks very similar to Savant (except it's written for the new-fangled graphical user interface :-) It also looks like it may have the same fatal flaw that my system had. Namely, Savant didn't understand (because I didn't understand) that the value of a "name" attribute of an entity is not the same thing as the entity itself (nor is it a key). E.g. (JoeSmith,phone#,123-456-7890) does not represent the same information as (key123,name,JoeSmith) plus (key123,phone#,123-456-7890). So, if you change JoeSmith to JoeBlow, you either lose the association with the other attributes tied to JoeSmith, or you must remap them all. Savant changed all the JoeSmith references to JoeBlow in all triples; but then what to do when the phone number changes? If you change all 123-456-7890 references to 321-654-9999, you have changed not only JoeSmith's phone number but also that of anyone else who shared that number. In other words, it didn't really understand the concepts of entity and attribute.
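
Here is a small JavaScript sketch of the flaw and the repair (the surrogate key is my illustration; Savant had no such notion): give every entity an opaque key and demote "name" to just another attribute:

    // Flawed: the name doubles as the entity identifier.
    var flawed = [
        { e: "JoeSmith", a: "phone#", v: "123-456-7890" }
    ];
    // Renaming JoeSmith means rewriting every triple that mentions him,
    // and updating a phone number by value clobbers anyone sharing it.

    // Repaired: an opaque surrogate key identifies the entity,
    // and "name" becomes just another attribute.
    var repaired = [
        { e: "key123", a: "name",   v: "JoeSmith" },
        { e: "key123", a: "phone#", v: "123-456-7890" }
    ];

    // Renaming is now one safe update to one triple...
    repaired[0].v = "JoeBlow";
    // ...and the phone number stays attached to key123, untouched.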

I see the same fuzzy understanding of entities and attributes in many articles about the Semantic Web and how easy it is to encode data such that any lay person can do it intuitively. They show the same (JoeSmith,phone#,123-456-7890) flavor of examples. We all need more basic metaphysics and ontology training in grade school!

Sunday, September 9, 2007

Quantum Math for Fuzzy Ontologies

In my earlier post "Existential Programming as Quantum States", I mused that objects simultaneously carrying properties from multiple ontologies (i.e. multiple class hierarchies or data models) were like quantum states in quantum physics. This led me later to wonder what math had been developed to work with quantum states... i.e. is there some sort of quantum algebra that might be applicable to Existential Programming? Something like it is needed because, in Existential Programming, a property of an object might carry multiple conflicting values simultaneously, each with varying degrees of certainty, confidence, or error margin.

I found the Wikipedia page on quantum indeterminacy, which looks applicable:
"Quantum indeterminacy can be quantitatively characterized by a probability distribution on the set of outcomes of measurements of an observable. The distribution is uniquely determined by the system state, and moreover quantum mechanics provides a recipe for calculating this probability distribution.
Indeterminacy in measurement was not an innovation of quantum mechanics, since it had been established early on by experimentalists that errors in measurement may lead to indeterminate outcomes. However, by the later half of the eighteenth century, measurement errors were well understood and it was known that they could either be reduced by better equipment or accounted for by statistical error models. In quantum mechanics, however, indeterminacy is of a much more fundamental nature, having nothing to do with errors or disturbance."
AHA! It dawns on me that, going beyond the mere fuzzy-logic idea of values having a probability or certainty factor, Existential Programming could attach a fuzziness value to the property as a whole... as in "it is not certain that this property even applies to this object"... and, even further, it could mean "it is not certain that this property even applies to the entire Class". A FUZZY ONTOLOGY: a method of associating attributes/relationships with entities where each association is not conclusively known. The value of a property may be certain (i.e. not vague or probabilistic), but whether that property belongs to this object at all is fuzzy.
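
A minimal JavaScript sketch of what a fuzzy property slot might look like (all names hypothetical): it carries a distribution of candidate values and a separate confidence that the property applies to the object at all:

    // A fuzzy property: candidate values with confidences, plus a separate
    // confidence that the property applies to this object at all.
    var person = {
        name: "JoeSmith",
        birthdate: {
            applies: 0.9, // how sure are we this object even HAS a birthdate?
            candidates: [
                { value: "1957-06-01", confidence: 0.7, source: "pageA.html" },
                { value: "1961-02-14", confidence: 0.3, source: "pageB.html" }
            ]
        }
    };

    // The "scalar" value is really just the best-supported candidate.
    function bestValue(prop) {
        var best = null;
        for (var i = 0; i < prop.candidates.length; i++) {
            if (best === null || prop.candidates[i].confidence > best.confidence) {
                best = prop.candidates[i];
            }
        }
        return best;
    }

    console.log(bestValue(person.birthdate).value); // "1957-06-01"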

Why would you want that ability? How about data-mining web pages where several people's names and a single birth-date (or phone number, address, etc.) are found together. Even though it isn't known which person's name goes with the birth-date, one could associate the birth-date with each person with some fractional probability. With enough out-of-focus wisps of data like this, from many web pages, the confidence factor of the right birth-date with the right person would rise to the top of the list of all possible dates (analogous to the way that very-long-range telescopes must accumulate lots of individual, seemingly random, photons to build up a picture of the stars/galaxies being imaged). The fractional probability assigned could be calculated with heuristics like "the probability assigned is inversely proportional to the lexical distance between the birth-date and the name". This would make the "value" of a scalar property (like birth-date) in reality a summarization of a complete histogram of values-by-source-web-page.
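
And a toy JavaScript sketch of that accumulation idea (the decay heuristic and all names are my hypothetical illustrations): each web page contributes a weighted vote, and the right date "develops" out of the histogram:

    // Histogram of birth-date evidence for one person, built from many pages.
    var birthdateVotes = {}; // date -> accumulated weight

    // Each sighting votes with a weight that decays with the lexical
    // distance (in characters) between the name and the date on the page.
    function addSighting(date, lexicalDistance) {
        var weight = 1 / (1 + lexicalDistance);
        birthdateVotes[date] = (birthdateVotes[date] || 0) + weight;
    }

    addSighting("1957-06-01", 12);  // date appears right next to the name
    addSighting("1961-02-14", 300); // date far away on the page: weak hint
    addSighting("1957-06-01", 45);  // another page corroborates

    // The most likely birth-date "develops" like a long-exposure photograph.
    function mostLikely(votes) {
        var best = null;
        for (var date in votes) {
            if (best === null || votes[date] > votes[best]) {
                best = date;
            }
        }
        return best;
    }

    console.log(mostLikely(birthdateVotes)); // "1957-06-01"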