Tuesday, December 25, 2007
In my earlier post, it was somewhat easy to see that there were multiple aspects to Superman because each aspect already had its own name; Superman vs Clark Kent. With Shakespeare however, it is much more subtle because the different aspects have the same name: Shakespeare. Additionally, we are not used to thinking that they are different aspects that can be independent of each other, any more than we think of Cher-the-person and Cher-the-singer as being independent things. But, as discussed in the book, many people over the centuries have debated whether the author of Hamlet, et al was really Francis Bacon, Christopher Marlowe, Queen Elizabeth, etc.
The interesting thing is that Shakespeare is SO ingrained as the name of the playwright that even if Sir Francis were proven to be the author, the headline would be "Bacon is the REAL Shakespeare!", which is absurd because clearly Shakespeare-the-historical-figure is the "real" Shakespeare. Changing the human associated with the author-of-Hamlet concept will not change the concept's name; it will remain "Shakespeare's Hamlet (written by Bacon)" and not "Bacon's Hamlet".
So, when assigning ID#(s) to putatively single entities, flexibility should be built in to allow ad-hoc collections of attributes of any entity to be grouped and named and referenced separately. Otherwise, the system would not be able to represent the statement: Shakespeare is not Shakespeare, Bacon is.
Wednesday, December 19, 2007
While reading the Intellectual Devotional (which I highly recommend), I came across its page about "Philosophy of Language" and it had an immediate resonance with a project at my current client. The page describes the "problem of reference" where ideas about what a name "means" have been debated and changed over time.
One theory says that "names" don't have any meaning, in and of themselves, they merely refer to some thing that has meaning. Hence, Shakespeare's quote "A rose by any other name would smell as sweet" summarizes the position that the word "rose" is not meaningful, and could be exchanged with any other word that refers to the thing "rose". That is why "gulaab" (the Urdu word for rose) can work just as well for speakers of Urdu.
Another more modern theory though, says that names not only refer to some thing, they also carry the connotation of "in what sense" is the thing being referenced. The book illustrates the example of Superman and Clark Kent both being names for the same thing (the being Superman), but they are not interchangeable. Clark Kent (mild mannered reporter) has a work address of the Daily Planet whereas Superman (superhero able to leap tall buildings) does not. It matters which name is used when talking about Superman.
So, in the same way that Clark Kent and Superman both refer to different aspects of the same entity, and are thus not interchangeable, a computer system managing legal entity identity data cannot translate name/address variations into a single entity ID# when those variations actually refer to different aspects of the entity. For example, if there is data that is specific to a particular store branch, that branch needs its own well-known ID# even though it is only a portion of a single legal entity. Further, since legal entity names are not unique (for example, I personally own two different corporations with the identical legal name), the entire name/address/phone/etc combination needs managing rather than separate "alternate name" lists. It is also not sufficient to support alternate name/address records merely as search aids that still ultimately result in the ID# of the entity-as-a-whole. Otherwise one would lose track of the fact that we were talking about Clark Kent, not Superman.
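A minimal sketch of that idea, assuming a hypothetical `Aspect` record and `Registry` class (neither comes from any real system): each aspect gets its own ID plus a link to the entity-as-a-whole, and a name search returns the aspect that matched rather than collapsing to the entity ID.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Aspect:
    """A separately addressable view of an entity (hypothetical model)."""
    aspect_id: str                  # well-known ID# of this aspect
    entity_id: str                  # ID# of the entity-as-a-whole
    name: str
    address: Optional[str] = None


class Registry:
    def __init__(self):
        self._aspects = {}

    def add(self, aspect):
        self._aspects[aspect.aspect_id] = aspect

    def find_by_name(self, name):
        """Return the matching aspects themselves, not just the entity ID."""
        return [a for a in self._aspects.values() if a.name == name]

    def aspects_of(self, entity_id):
        """All known aspects of one underlying entity."""
        return [a for a in self._aspects.values() if a.entity_id == entity_id]


registry = Registry()
registry.add(Aspect("A1", "E1", "Superman"))
registry.add(Aspect("A2", "E1", "Clark Kent", address="Daily Planet"))

hit = registry.find_by_name("Clark Kent")[0]
print(hit.aspect_id, hit.entity_id, hit.address)
```

Because the search result is the aspect, the caller still knows it was talking about Clark Kent, and can follow `entity_id` only when the entity-as-a-whole is what it wants.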
Sunday, December 2, 2007
So, Existential Programming might well have been named Cubist Programming!
Saturday, November 10, 2007
- Objectivism says that some statements are objective, in that they are true independent of anyone's opinion. e.g. This ball is red. Alternatively, values assigned to properties can be dependent on other factors and be described by a function rather than a simple value. e.g. The color of Ayers Rock is F(time-of-day, weather).
- Subjectivism says that (potentially all) statements are subjective, in that they are dependent on the opinion of the person making the statement. e.g. This cake is delicious.
- Relativism says that statements are always subjective even when the decider thinks he's made an objective evaluation. I.E. no evaluations are objective because man is always biased by his particular cultural, historical, religious, etc viewpoint, and no particular viewpoint is "the right one". e.g. This society is primitive.
- Existential Programming philosophy says that even if something is supposedly scalar & objective, and even if one does not subscribe to relativism (thus implying that there is no need to ascribe to values a particular data source), the reliability of any particular data source is never perfect, and thus one needs to model data as if relativism were true. I.E. keep track of "says who" for each "fact" and therefore be prepared to simultaneously handle/store multiple values for everything, tagging them with "says who", "said when", etc. So, in effect, there are no scalar values, only functions with at least a data source as a parameter.
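The last bullet can be sketched in a few lines. This is only an illustration under assumed names (`Assertion`, `Property`, and the sample sources are all hypothetical): every value is stored with its "says who" and "said when", and reading a value always takes a source as a parameter.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Assertion:
    value: object
    source: str      # "says who"
    said_on: date    # "said when"


class Property:
    """A property modeled as a function of its data source, not a scalar."""

    def __init__(self):
        self._assertions = []

    def assert_value(self, value, source, said_on):
        self._assertions.append(Assertion(value, source, said_on))

    def according_to(self, source):
        """Most recent value asserted by the given source, or None."""
        matches = [a for a in self._assertions if a.source == source]
        if not matches:
            return None
        return max(matches, key=lambda a: a.said_on).value

    def all_values(self):
        """Every value anyone has asserted, held simultaneously."""
        return {a.value for a in self._assertions}


color = Property()
color.assert_value("red", "tourist brochure", date(2007, 1, 5))
color.assert_value("orange", "geological survey", date(2007, 3, 1))
color.assert_value("purple", "tourist brochure", date(2007, 6, 9))

print(color.according_to("tourist brochure"))
print(sorted(color.all_values()))
```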
 ibid, pg 667
 ibid, pg 800
Sunday, November 4, 2007
This notion is embodied by the testing technique known as Equivalence Partitioning. [A good summary is in the article "Make Child's Play of Equivalence Class Partitioning".] It says that there is more detailed structure to data than string and integer and to only test at that level is to miss many crucial test cases. It defines subranges and such to guide what to test, in effect, defining more granular data types. Languages like Pascal and Ada had more refined data types like subranges but those ideas were lost when C-like languages (including C++ & Java) came into vogue.
Existential Programming would change the way data types are defined to get away from simply defining a structure that is an aggregation of "primitive data types". It would let data types define a function that computes whether (or how closely) a putative value meets the requirements for that data type. While one could put that logic in a "setter" method of a class, that doesn't work with direct assignments (via assignment operators) of values (at least in Java-like languages).
To see where even subrange definitions like Pascal/Ada allowed (e.g. 3..27) are not sufficient, look at the logic required to validate the month/day/year values of a date as detailed in the ECP article. Only an algorithm can encode what is required for legal values, not simple declarations of string and integer properties.
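As a rough illustration of a type defined by an algorithm rather than by declarations, here is a sketch with a hypothetical `DateType` whose `accepts` function decides membership. It covers only the yes/no case, not the "how closely" case, and the 1..9999 year limit is an assumption made for the example.

```python
import calendar


class DateType:
    """Hypothetical data type whose legal values are defined by a function."""

    @staticmethod
    def accepts(month, day, year):
        # Structural check: three integers. This is all a plain declaration gives.
        if not all(isinstance(v, int) for v in (month, day, year)):
            return False
        # Subrange checks, as Pascal/Ada subranges could express.
        if not 1 <= year <= 9999 or not 1 <= month <= 12:
            return False
        # The part only an algorithm can express: the legal range of `day`
        # depends on the month and on whether the year is a leap year.
        days_in_month = calendar.monthrange(year, month)[1]
        return 1 <= day <= days_in_month


print(DateType.accepts(2, 29, 2008))   # leap year
print(DateType.accepts(2, 29, 2007))   # not a leap year
print(DateType.accepts(4, 31, 2007))   # April has 30 days
```

A subrange such as 1..31 for the day would accept both February 29, 2007 and April 31, which is exactly the gap described above.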
So, "strong" typing is a relative term.
Sunday, October 21, 2007
The analogy here is between the key signature of a piece of music, and an ontology of classes used in a program.
A key signature (e.g. C-major, d-minor) defines which notes are to be used, and the special relationships between different notes. To be "tonal" is to follow the rules of a particular key signature. To be "atonal" is to NOT follow the rules of any key signature. To be "pantonal" is to simultaneously follow the rules of multiple key signatures.
An ontology defines which types/classes are to be used, and the special relationships between different classes. To be "strongly-typed" is to follow the rules of a particular ontology. To be "type-less" is to NOT follow the rules of any ontology. To be "multi-typed" (i.e. existential programming) is to simultaneously follow the rules of multiple ontologies.
 "The Intellectual Devotional",2006, Kidder, Oppenheim
Wednesday, October 17, 2007
There are arguments by philosophers, however, about what IS entailed by a concept or contained within a definition. Some would argue that the boiling point of water *is* contained within the concept of water whether a particular person happens to know that or not. Similarly the statement "all triangles have 180 degrees as the sum of their internal angles" is true independently of whether we know about it, (but a lot of people *wouldn't* know that).
So, in general, there will not be agreement from one ontology/model to another about which statements (i.e. data/attributes/relationships) are redundant. Should the "age" of an applicant for a job be stored because it was a field in the application form? Or, should it always be calculated from the birth date and the date of the application? Should a "short cut" relationship between 2 distantly related entities be explicitly supported by a model, or should it always be "calculated" by traversing all the intermediate links through the "six degrees of separation"?
Even though E/R models will normalize data to an objective standard, different modelers will break normalization for "performance reasons" on a subjective (and thus inconsistent) basis. Existential Programming counts this lack of agreement on "what is redundant" as yet another reason to embrace/support multiple ontologies simultaneously.
 The Philosopher's Toolkit, Julian Baggini and Peter S. Fosl, Blackwell Publishers, 2003, ISBN: 0631228748
 ibid, section 4.3
 ibid, section 4.1
Tuesday, October 16, 2007
Because there are many aspects of reality and opinion that are considered relative by some number of people, Existential Programming counts this as yet another reason to embrace/support multiple ontologies simultaneously. Absolute vs Relative points of view are yet another aspect of modeling the world that traditional object-oriented and relational database modeling make assumptions about.
 The Philosopher's Toolkit, Julian Baggini and Peter S. Fosl, Blackwell Publishers, 2003, ISBN: 0631228748
 ibid, section 4.2
Friday, October 12, 2007
Visually, people normally identify/recognize other people from their accidental attributes like face/hair/sex/etc. These can all be altered without changing a person's DNA or soul, but they are what people normally use. If these change over time people use other attributes, or in the cliche science fiction scenarios where someone has been completely changed / possessed / transferred-essences-with-an-alien / etc, a set of secret shared knowledge is used. "Jim, it's really me, otherwise how would I know the name of your pet hamster on Rigel Seven?!" [Banks, Credit Cards, etc have recognized this and have started asking questions like "what's your favorite movie?" before letting you change your password.]
But for visual only recognition, say, for tracking someone through a crowd via a security camera, you are dependent on those accidental attributes not changing in real time. That is the premise for the "camouflage" in the movie "A Scanner Darkly". There is a cloak that constantly changes the appearance of every piece of the face/clothes, and this is illustrated via the rotoscoping and animation used in the movie. This would confuse automated tracking software and you would get lost in a crowd.
In thinking about this, though, I realized that people could still actually recognize the people in these cloaks BECAUSE WE DO IT WHILE WE ARE WATCHING THE MOVIE! A *real* cloak would be much more effective if it didn't change *pieces* of faces, but rather change whole faces at a time. It is easy to pick out the person in the funny changing cloak unless *everyone* is wearing one. Even then, humans can still keep track of who is who (short of a big crowd) because we track the various individuals at a "thing" level. I.E. I can see that there are some "things" in the room and when they move I can still tell who's who because despite their constantly changing "skin", they are still contiguous space-time blobs of matter (that walk a lot like people!).
Wednesday, September 26, 2007
A meta-idea above Existential Programming itself, is the more general notion that there should be much more explicit cross fertilization between ideas from Philosophy (with a capital "P") and ideas from software engineering (as practiced in industry). I call this "Philosophical Programming". Early on during my epiphanies, I had the intuition that Philosophy probably had something to say about my topic (even though I'd never taken a philosophy class). So, at age 50, I started reading Philosophy 101 books. It quickly became obvious that Philosophy has SO MUCH to say about data/class modeling topics that it is criminal how little explicit reference to it there is in the software practice literature. I distinguish between industry practice (and their books, blogs, magazines oriented towards tools and "best practices") versus academia. After I learned enough terminology to search for papers covering ideas similar to mine, I found that there is a whole subculture writing academic conference papers that don't really bleed over into industry conferences for things like Java or AJAX or SOA. So, my general "project" these days is to try to come up with practical application techniques based on otherwise esoteric topics.
The ideas that flow from this general notion (or more correctly, caused me to refactor the general notion out of these specific ideas), fall into general categories like:
- data integration using semantic mapping
- code integration of uncooperative APIs and Frameworks
- promoting "roles" over "is-a subclassing"
- open classes and external methods
- ontology mediation
- object evolution and class/API versioning
Disclaimer 1: my usual mode of operation is to "discover" something for myself and then later search the literature to discover that someone else (say, Plato) has already had that brainstorm. Thus my intuition that "this means something, this is important" only provokes a "DUH!" if you already know that it was first written by a philosopher twenty-five hundred years ago, or in a Norwegian PhD thesis 10 years ago. Have patience with the wide-eyed innocence which will eventually be made wise. These entries are a journey.
Disclaimer 2: in transcribing my notes a year and a half after the fact, I will sometimes use a philosophical term (e.g. essential/accidental) that I did not know at the time, in order to make it more clear what I was trying to say at that time.
Disclaimer 3: having just started teaching myself Philosophy 101, I know just enough to be dangerous. However, I have 30+ years of hands-on software engineering knowledge and experience, so I feel qualified to recognize gaps in today's state of the art (as practiced in the real world, i.e. corporations, not academia) which can be filled with concepts developed in Philosophy.
|POSTSCRIPT - March 1st, 2010|
|I found a quote that echoes my disclaimer number one. From "Corporate Entity" by Jan Dejnožka: "It is usual to say that in studying philosophy of X, that we do best to acquire a pre-philosophical understanding of X before philosophizing about it. The more we know about X pre-philosophically, the better our philosophy of X will be."|
Friday, September 21, 2007
Having always been interested in AI and general knowledge representation, and with expert systems looking hot, I took a semantic networking approach. Savant had an interactive development mode where the user could move the cursor around the screen and dynamically define form fields, associating each field with an EAV (entity/attribute/value) data triple (way before the term EAV was widespread). The name of each form field was the "attribute", the value entered into the field was the "value", and the "entity" was the value of some other form field that the user specified. Savant let users create their app, complete with database, on the fly, with the form definitions stored in the same EAV database. I had some novel techniques to optimize the data retrieval (using my written-from-scratch B+/B* tree DB kernel). All this was written in UCSD Pascal (grandpa to Java) such that it ran on IBM PC, Radio Shack TRS80, Apple ][, and exotic 32-bit computers, and ran on FLOPPIES! Only shortly later, when Corvus hard drives for Apple ][ and the IBM XT came out, did it benefit from their size and speed. To UCSD's credit, I didn't really have to change any code.
One of my inspirations was to make a more intuitive approach for non-technical users than was required by a popular database of the day, pfs:File. It also let users create form screens and save the data from each screen into its own "table". I thought that it was too complicated to know which screen each bit of data you wanted was on, plus, you couldn't share data between screens! My goal was to have all data available from all screens/forms without having to know about tables and schemas. One would just design forms and the data would magically go to the right place.
The other day, I stumbled across MindModel, a product that looks very similar to Savant (except it's written for the new-fangled graphic user interface :-) It also looks like it may have the same fatal flaw that my system had. Namely, Savant didn't (because I didn't) understand that the value of a "name" attribute of an entity is not the same thing as the entity itself (nor is it a key). E.G. (JoeSmith,phone#,123-456-7890) does not represent the same information as (key123,name,JoeSmith) plus (key123,phone#,123-456-7890). So, if you change JoeSmith to JoeBlow, you either lose the association with other attributes tied to JoeSmith, or you must remap them all. So, Savant changed all the JoeSmith references to JoeBlow in all triples, but then what to do about the phone number changing? If you change all 123-456-7890 references to 321-654-9999, you have changed not only JoeSmith's phone number but also that of anyone else who shared the number. In other words, it didn't really understand the concepts of entity and attribute.
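The difference can be shown with plain tuples. This is a sketch, not Savant's actual code: `JaneDoe`, `key456`, and both helper functions are hypothetical additions made for the example.

```python
# Flawed style: the name doubles as the entity, so updates are value rewrites.
naive = {
    ("JoeSmith", "phone#", "123-456-7890"),
    ("JaneDoe", "phone#", "123-456-7890"),   # hypothetical housemate
}


def rename_value_everywhere(triples, old, new):
    """Global search-and-replace on a value, wherever it appears."""
    return {tuple(new if part == old else part for part in t) for t in triples}


changed = rename_value_everywhere(naive, "123-456-7890", "321-654-9999")
# JaneDoe's number has changed too, although only JoeSmith moved.

# Keyed style: the entity is an opaque key; name and phone are attributes.
keyed = {
    ("key123", "name", "JoeSmith"),
    ("key123", "phone#", "123-456-7890"),
    ("key456", "name", "JaneDoe"),
    ("key456", "phone#", "123-456-7890"),
}


def set_attribute(triples, entity, attribute, value):
    """Replace one entity's value for one attribute; touch nothing else."""
    kept = {t for t in triples if not (t[0] == entity and t[1] == attribute)}
    return kept | {(entity, attribute, value)}


keyed = set_attribute(keyed, "key123", "name", "JoeBlow")
keyed = set_attribute(keyed, "key123", "phone#", "321-654-9999")

print(sorted(changed))
print(sorted(keyed))
```

With the key in place, renaming JoeSmith is a one-triple change and his new phone number leaves the other entity's triple alone.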
I see the same fuzzy understanding of entities and attributes in many articles about the Semantic Web and how easy it is to encode data such that any lay person can do it intuitively. They show the same (JoeSmith,phone#,123-456-7890) flavor examples. We all need more basic metaphysics and ontology training in grade school!
Sunday, September 9, 2007
I found the Wikipedia page on quantum indeterminacy, which looks applicable.
Quantum indeterminacy can be quantitatively characterized by a probability distribution on the set of outcomes of measurements of an observable. The distribution is uniquely determined by the system state, and moreover quantum mechanics provides a recipe for calculating this probability distribution.
Indeterminacy in measurement was not an innovation of quantum mechanics, since it had been established early on by experimentalists that errors in measurement may lead to indeterminate outcomes. However, by the later half of the eighteenth century, measurement errors were well understood and it was known that they could either be reduced by better equipment or accounted for by statistical error models. In quantum mechanics, however, indeterminacy is of a much more fundamental nature, having nothing to do with errors or disturbance.
AHA! It dawns on me that going beyond the mere fuzzy logic idea of values having a probability or certainty factor, Existential Programming could have a fuzziness value for the property as a whole...as in "it is not certain that this property even applies to this object"...and even further it could mean "it is not certain that this property even applies to the entire Class". A FUZZY ONTOLOGY: method of associating attributes/relationships with entities where each entity is not conclusively known. The value of a property may be certain (i.e. not vague or probabilistic), but whether that property belongs to this object is fuzzy.
Why would you want that ability? How about data mining web pages where several people's names and a single birth-date (or phone number, address, etc) are found. Even though it isn't known which person's name is associated with the birthday, one could associate the birth-date with each person with some fractional probability. With enough out of focus wisps of data like this, from many web pages, the confidence factor of the right birthdate with the right person would rise to the top of the list of all possible dates (analogous to the way that very long range telescopes must accumulate lots of individual, seemingly random, photons to build up a picture of the stars/galaxies being imaged). The fractional probability assigned could be calculated with heuristics like "lexical-distance-between-age-and-name is proportional to the probability assigned". This could make the "value" of a scalar property (like birth-date), in reality, the summarization of a complete histogram of values-by-source-web-pages.
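A toy version of that accumulation, under a deliberately simple assumption: each page splits one unit of evidence evenly among the names it mentions, instead of weighting by lexical distance as suggested above. The function names and the sample pages are hypothetical.

```python
from collections import defaultdict


def accumulate(pages):
    """pages: list of (names_on_page, birth_date_on_page).

    Each page contributes one unit of evidence, split evenly among its names
    (an assumption for this sketch; a real heuristic could weight by distance).
    """
    scores = defaultdict(lambda: defaultdict(float))
    for names, birth_date in pages:
        share = 1.0 / len(names)
        for name in names:
            scores[name][birth_date] += share
    return scores


def best_guess(scores, name):
    """The date with the most accumulated evidence for this name."""
    by_date = scores[name]
    return max(by_date, key=by_date.get)


pages = [
    (["Ann", "Bob"], "1970-05-01"),
    (["Ann", "Cy"], "1970-05-01"),
    (["Bob", "Cy"], "1982-11-23"),
    (["Bob"], "1982-11-23"),
]

scores = accumulate(pages)
print(best_guess(scores, "Ann"))
print(best_guess(scores, "Bob"))
print(dict(scores["Bob"]))   # the full histogram behind the "scalar" value
```

No single page says whose birthday is whose, yet the per-name histograms separate once enough pages are combined.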
Friday, August 31, 2007
Contemplating this gave me a small AHA moment: Unit Testing is an area where there is an implicit assumption that "Test Passes" has either a true or false value. How about Fuzzy Unit Testing where there is some numeric value in the 0...1 range which reports a degree of pass/fail-ness? i.e. a percentage pass/fail for each test. For example, testing algorithms that predict something could be given a percentage pass/fail based on how well the prediction matched the actual value. Stock market predictions, bank customer credit default prediction, etc come to mind. This sort of testing of predictions about future defaults (i.e. credit grades) is just the sort of thing that the Basel II accords are forcing banks to start doing.
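One way this could look, as a sketch only: a score of 1 minus the relative error, clamped to the 0...1 range, with an optional threshold for anyone who still wants a yes/no answer. The scoring rule, the function names, and the sample forecasts are all assumptions made for the example.

```python
def fuzzy_score(predicted, actual):
    """Degree of pass in [0, 1]: 1.0 is exact, 0.0 is off by 100% or more."""
    if actual == 0:
        return 1.0 if predicted == 0 else 0.0
    relative_error = abs(predicted - actual) / abs(actual)
    return max(0.0, 1.0 - relative_error)


def run_fuzzy_suite(cases, threshold=0.9):
    """cases: dict of test name -> (predicted, actual).

    Returns test name -> (score, passed), where passed is score >= threshold.
    """
    report = {}
    for name, (predicted, actual) in cases.items():
        score = fuzzy_score(predicted, actual)
        report[name] = (score, score >= threshold)
    return report


report = run_fuzzy_suite({
    "price_forecast_a": (95.0, 100.0),   # 5% off
    "price_forecast_b": (60.0, 100.0),   # 40% off
})

for name, (score, passed) in report.items():
    print(f"{name}: {score:.0%} pass ({'ok' if passed else 'below threshold'})")
```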
Another great idea (if I do say so myself) that I had a few years ago was the notion that there is extra meta-data that could/should be gathered as a part of running unit test suites; specifically, the performance characteristics of each test run. The fact that a test still passes, but is 10 times slower than the previous test run, is a very important piece of information that we don't usually get. Archiving and reporting on this meta-data about each test run can give very interesting metrics on how the code changes are improving/degrading performance on various application features/behavior over time. I can now see that this comparative performance data would be a form of fuzzy testing.
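A small sketch of gathering and comparing that meta-data, assuming each run is archived as a dict of test name to (passed, seconds). The function names and the sample timings are hypothetical; only `time.perf_counter` is a real library call.

```python
import time


def timed_run(test_fn):
    """Run one test and return (passed, elapsed_seconds)."""
    start = time.perf_counter()
    passed = bool(test_fn())
    return passed, time.perf_counter() - start


def slowdown_flags(previous, current, factor=10.0):
    """Names of tests that still pass but are at least `factor` times slower.

    previous/current: dict of test name -> (passed, seconds).
    """
    flagged = []
    for name, (passed, seconds) in current.items():
        if name in previous and passed:
            _, old_seconds = previous[name]
            if old_seconds > 0 and seconds >= factor * old_seconds:
                flagged.append(name)
    return flagged


# Archived timings from two runs (made-up numbers for illustration).
previous = {"test_login": (True, 0.02), "test_search": (True, 0.10)}
current = {"test_login": (True, 0.25), "test_search": (True, 0.11)}

print(slowdown_flags(previous, current))
print(timed_run(lambda: sum(range(1000)) == 499500))
```

Both tests are green in both runs, yet the comparison singles out the one whose run time grew by more than the chosen factor.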
Sunday, August 26, 2007
A: The socket came first (at least conceptually). Before a chip is designed, there is a framework defined into which the chip will integrate: Digital vs analog signals, voltage levels for digital one and digital zero, power requirements, clock rates, transistor type, etc. This usually results in the design of whole families of chips that are intended to work together.
SO, no chip is created in isolation, just as no software component should be. Otherwise, the result would be that there is no common framework with which other components can be attached (Super-glue, play-doh, and duct tape notwithstanding).
Sunday, August 12, 2007
I.E. What is the killer app of existential programming?
The answer could be "integration of uncooperating systems"...
- Data Integration
- Data Model (i.e. schema) Integration
- Application Integration
- Frameworks API Integration
Saturday, July 28, 2007
In fact, to give "names" back to things that have become commodities, we resort to assigning serial numbers to make them each unique.
[Ed. Note - 11/21/07: as per my disclaimers, once I start looking for my epiphanies on the net, I find them. E.G. in this case, see "Conceiving of entities as objects and as stuff". Congrats Bruce, you've just discovered that there are mass nouns vs count nouns.]
Monday, June 11, 2007
With Existential Programming, a different order of polymorphism (i.e. "a single entity can take on multiple forms") is at work. Here a single entity in its entirety can take on multiple forms. Heck, it can take on multiple ontologies (aka class hierarchies). This means either supporting multiple (albeit, fundamentally similar) views/conceptions of an entity, OR, trying to knit together very different models where the object instance is in some sort of oscillating Schrodinger's Cat-like state (except that we can look at it and not affect its state :-).
Monday, May 14, 2007
Monday, April 16, 2007
The following has been posted on a few sites and I don't know which was the original. In any event, it is very germane to my purpose, so I republish it here...
A controlled vocabulary is a list of terms that have been enumerated explicitly. This list is controlled by and is available from a controlled vocabulary registration authority. All terms in a controlled vocabulary should have an unambiguous, non-redundant definition. This is a design goal that may not be true in practice. It depends on how strict the controlled vocabulary registration authority is regarding registration of terms into a controlled vocabulary. At a minimum, the following two rules should be enforced:
- If the same term is commonly used to mean different concepts in different contexts, then its name is explicitly qualified to resolve this ambiguity.
- If multiple terms are used to mean the same thing, one of the terms is identified as the preferred term in the controlled vocabulary and the other terms are listed as synonyms or aliases.
A thesaurus is a networked collection of controlled vocabulary terms. This means that a thesaurus uses associative relationships in addition to parent-child relationships. The expressiveness of the associative relationships in a thesaurus varies and can be as simple as "related to term" as in term A is related to term B.
People use the word ontology to mean different things, e.g. glossaries & data dictionaries, thesauri & taxonomies, schemas & data models, and formal ontologies & inference. A formal ontology is a controlled vocabulary expressed in an ontology representation language. This language has a grammar for using vocabulary terms to express something meaningful within a specified domain of interest. The grammar contains formal constraints (e.g., specifies what it means to be a well-formed statement, assertion, query, etc.) on how terms in the ontology's controlled vocabulary can be used together.
People make commitments to use a specific controlled vocabulary or ontology for a domain of interest. Enforcement of an ontology's grammar may be rigorous or lax. Frequently, the grammar for a "light-weight" ontology is not completely specified, i.e., it has implicit rules that are not explicitly documented.
A meta-model is an explicit model of the constructs and rules needed to build specific models within a domain of interest. A valid meta-model is an ontology, but not all ontologies are modeled explicitly as meta-models. A meta-model can be viewed from three different perspectives:
- as a set of building blocks and rules used to build models
- as a model of a domain of interest, and
- as an instance of another model.
Note: Meta-modeling as a domain of interest can have its own ontology. For example, the CDIF Family of Standards, which contains the CDIF Meta-meta-model along with rules for modeling and extensibility and transfer format, is such an ontology. When modelers use a modeling tool to construct models, they are making a commitment to use the ontology implemented in the modeling tool. This model making ontology is usually called a meta-model, with "model making" as its domain of interest.
Bottom line: Taxonomies and Thesauri may relate terms in a controlled vocabulary via parent-child and associative relationships, but do not contain explicit grammar rules to constrain how to use controlled vocabulary terms to express (model) something meaningful within a domain of interest. A meta-model is an ontology used by modelers. People make commitments to use a specific controlled vocabulary or ontology for a domain of interest.
Friday, April 13, 2007
"Red isA Color" is different than "Joe isA Human" or "Human isA Mammal". Red is not an entity, Joe is and a Human is. Why? Red is not countable and not distinguishable from another "instance" of red; Joe, humans, and mammals are. [Ed. note: this intuition will later be recognized as the problem of universals as the author reads more books. ;-) ]
Thursday, April 12, 2007
Monday, March 26, 2007
THE PHILOSOPHER'S PROJECT
Here we are again! We'll go directly to today's lesson without detours around white rabbits and the like.
I'll outline very broadly the way people have thought about philosophy, from the ancient Greeks right up to our own day. But we'll take things in their correct order.
Since some philosophers lived in a different age--and perhaps in a completely different culture from ours--it is a good idea to try and see what each philosopher's project is. By this I mean that we must try to grasp precisely what it is that each particular philosopher is especially concerned with finding out. One philosopher might want to know how plants and animals came into being. Another might want to know whether there is a God or whether man has an immortal soul.
Once we have determined what a particular philosopher's project is, it is easier to follow his line of thought, since no one philosopher concerns himself with the whole of philosophy.
Monday, February 12, 2007
I can already see that Philosophy is a deep ocean to get drowned in; better to take on the much smaller pond that is cross-disciplinary studies. Nobody in mainstream computer programming is aware of this stuff! I know because I have a computer science degree and have been doing hands-on development for 30+ years from the days of spaghetti code thru structured programming thru modular programming thru embedded programming thru object oriented programming thru distributed programming thru AJAX. I think I saw one article about Plato in CACM once. There is a book in this; dare I say a series? At the very least I ought to start a blog to put these notebook entries someplace in the meantime.
Thursday, February 1, 2007
In Java, such collections of definitions can be grouped into Packages. In XML, they can be grouped into Namespaces. So, a pragmatic way to factor out this metadata (such that it is not copied into every "fact"), is to give the ontology a name and tag each fact with that name (in the same way as each Java value is "tagged" with a data type that includes the Package name/path). And since, XML has already defined URIs as the format for namespace identifiers, and it is easy to map Package names with URIs, Existential Programming systems/languages could use URIs to identify the ontology associated with some set of facts. Since Existential Programming systems seek to seamlessly convert between an O/O and an E/R and an S/N representation, having OO packages easily mappable to data exchange mechanisms like XML is a good thing.
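A sketch of the tagging idea, with loud assumptions: the package-to-URI rule below (reverse the first two segments into a domain, turn the rest into a path) is just one plausible convention, and `Fact`, `package_to_uri`, `facts_in`, and the `com.example` packages are hypothetical names.

```python
from collections import namedtuple

# Each fact carries only the *name* of its ontology, not the ontology itself.
Fact = namedtuple("Fact", "ontology subject attribute value")


def package_to_uri(package):
    """Map a Java-style package name to a URI.

    Assumes the first two segments are a reversed domain name,
    e.g. com.example.hr -> http://example.com/hr
    """
    parts = package.split(".")
    domain = ".".join(reversed(parts[:2]))
    path = "/".join(parts[2:])
    return f"http://{domain}/{path}" if path else f"http://{domain}/"


HR = package_to_uri("com.example.hr")
CRM = package_to_uri("com.example.crm")

facts = [
    Fact(HR, "key123", "title", "Engineer"),   # "title" means job title here
    Fact(CRM, "key123", "title", "Mr."),       # ...and form of address here
]


def facts_in(ontology, facts):
    """Only the facts stated in terms of the given ontology."""
    return [f for f in facts if f.ontology == ontology]


print(HR)
print(facts_in(CRM, facts))
```

The same entity and the same attribute name can then hold different values without conflict, because each fact says which set of definitions it is written in.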
|POSTSCRIPT - Nov 23rd, 2007|
|It is hard to tell which of my epiphanies are new and which are things I read long ago but later remembered as if they were a new idea...I'm sure I read this URI stuff in the Semantic Web articles (see Principle #1), but only "remembered" it when stumbling back across it today. At any rate, it was only recently that Tim Berners-Lee wrote this post about Linked Data using URIs.|
Sunday, January 21, 2007
An analog to this is the invention (discovery?) of Imaginary Numbers in mathematics. The imaginary number "i" is defined to be the square root of -1. Now the mildly mathematical reader will note that no real number can be the square root of a negative number, because the square of any real number is never negative. So, when early mathematicians came to a point in their formulas where a square root of a negative number was required, they were stuck. By creating a way to talk about and manipulate numbers that "can't exist" (i.e. imaginary numbers), formulas could be worked through such that "real" answers could eventually emerge.
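A classic illustration, usually attributed to Bombelli, is the cubic x^3 = 15x + 4. Cardano's formula for it passes through the square root of -121 even though the answer is the perfectly real x = 4. The check below uses Python's built-in complex numbers; the variable names are just for this sketch.

```python
import cmath

# Solve x**3 = p*x + q with Cardano's formula:
#   x = cbrt(q/2 + sqrt((q/2)**2 - (p/3)**3)) + cbrt(q/2 - sqrt(...))
p, q = 15, 4

inner = cmath.sqrt((q / 2) ** 2 - (p / 3) ** 3)   # sqrt(-121): "can't exist"
u = (q / 2 + inner) ** (1 / 3)                    # principal cube root
v = (q / 2 - inner) ** (1 / 3)                    # principal cube root
x = u + v                                         # imaginary parts cancel

print(inner)
print(x)
print(abs(x ** 3 - (p * x + q)))   # residual of the original equation
```

The intermediate values are complex, but they are conjugates of each other, so their sum lands back on the real line (up to floating-point rounding).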
By developing techniques to work with data that is not consistent with a single ontology (i.e. existential programming), programs can get past the "that's not legal data" stage and work their way to answers that ultimately do result in "legal data".