Pro Forma
This article is a follow-up to yesterday's article on metadata. It's not metadata per se that is the fundamental concern; it is data in general. Or, more to the point, it is the Pro Forma sampling of reality that databases and computing technology in general force us to do.
Anyone who uses data or computers on a regular basis is probably so used to the idea of data that questioning it seems easy to dismiss. Yet the way we relate to data is vastly different from the way we relate to reality outside of computing, and I'll try to step through the major differences here.
With software systems, there are two major logical pieces: algorithms and data. Some folks (function point counters, especially) call these "data in motion" and "data at rest". Let's look at how both of these pieces get created in software, and compare this process to how human beings interact with information outside of computing systems.
Algorithms are defined as follows (with variations):
ALGORITHM - (I) A finite set of step-by-step instructions for a problem-solving or computation procedure, especially one that can be implemented by a computer. [RFC2828] A mathematical procedure that can usually be explicitly encoded in a set of computer language instructions that manipulate data. (from this site)
Programs are defined this way:
PROGRAM - A complete sequence of computer software instructions necessary to provide an application, solve a specific problem, perform an action, or respond to external stimuli in a prescribed manner. As a verb, it means to develop a program. (from this site)
Software is defined this way (with variations):
SOFTWARE - Computer programs; instructions that make hardware work. Two main types of software are system software (operating systems), which control the workings of the computer, and applications, such as word processing programs, spreadsheets, and databases. (from this site)
All point to concepts of "instructions" or "well-defined rules". So, algorithms get created and put into place through this process:
- Define the problem
- Design a finite set of instructions to solve the problem in a finite number of steps
- Codify the solution by writing software to execute these instructions and respond to external input, as appropriate (mouse, keyboard, other input devices)
- Test the code, eliminate errors
- Deploy the code to other computers (servers, desktops, mobile phones, etc.)
- Execute the code (algorithm) in real time.
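As a minimal sketch of the steps above, here is one small algorithm carried through definition, design, codification, and testing. The problem (greatest common divisor) and the function name `gcd` are my own choices for illustration; nothing in the definitions above prescribes them.

```python
# Define the problem: find the largest integer that divides both a and b.
# Design: Euclid's algorithm, a finite set of instructions that is
# guaranteed to finish because the second value shrinks on every pass.

def gcd(a: int, b: int) -> int:
    """Return the greatest common divisor of a and b."""
    a, b = abs(a), abs(b)
    while b != 0:
        # Replace the pair (a, b) with (b, a mod b) until b reaches zero.
        a, b = b, a % b
    return a


# Test the code: every case checked here was decided before execution.
assert gcd(48, 18) == 6
assert gcd(17, 5) == 1
assert gcd(0, 5) == 5
```

Every decision in this sketch, including which inputs are acceptable and what counts as an answer, was made before the code ever ran.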
Data is defined this way:
DATA - "A formalized representation of facts or concepts suitable for communication, interpretation, or processing by people or automated means." The term "data" is often used to refer to the information stored in the computer. Webster's Dictionary of Computer Terms (3d ed. 1988). (From this site)
DATA - A representation of facts, concepts, or instructions in a formal manner suitable for communication, interpretation, or processing by human beings or by computers. (From this site)
These definitions of data point to concepts of "representation of concepts". Data as I'm discussing it here could include files, and it's not important where the data is stored (disk, memory, etc) for this discussion. Data ("data at rest") is created and put into place through this general process:
- Define the scope of the data
- Define a conceptual container for the data (file, database, etc)
- Prescribe a meaning to instances of the data ("Name" in a database, "Cell" in a spreadsheet, etc.). This meaning may be generic or specific ("Content" or "CustomerID").
- Create computer-based handling of the data (program, file store, database, etc.)
- Test the data handler
- Deploy the data handler
- Create instances of the data handler, adding data
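The same process can be sketched for data. The names below (`CustomerRecord`, `CustomerStore`, and the sample values) are hypothetical and exist only for illustration; the point is that the container and the meaning of each field are fixed before any data arrives.

```python
from dataclasses import dataclass


# Prescribe a meaning to instances of the data: exactly two fields.
@dataclass
class CustomerRecord:
    customer_id: int
    name: str


# Create computer-based handling of the data: a small in-memory store.
class CustomerStore:
    def __init__(self) -> None:
        self._records: dict[int, CustomerRecord] = {}

    def add(self, record: CustomerRecord) -> None:
        # Only the predefined record type is accepted.
        if not isinstance(record, CustomerRecord):
            raise TypeError("only CustomerRecord instances can be stored")
        self._records[record.customer_id] = record

    def get(self, customer_id: int) -> CustomerRecord:
        return self._records[customer_id]


# Create instances of the data handler, adding data.
store = CustomerStore()
store.add(CustomerRecord(customer_id=1, name="Ada"))
```

A fact that was not anticipated when the record was defined has nowhere to go: calling `CustomerRecord(customer_id=2, name="Bob", nickname="Bobby")` raises a `TypeError`, because `nickname` was never part of the prescribed meaning.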
In the case of both algorithms and data, there is a generic pattern of development that applies to each:
- Define, up front, the boundary of the computing solution
- Prescribe a solution
- Codify this prescribed solution
- Release the solution into the computing environment
And in all cases, computing solutions follow this pro-forma pattern of up-front predeterminedness.
Not all computing programs express their predeterminedness in the same way; many programs are generic enough that users are able to use them in a very flexible manner. This is accomplished by defining the original problem in a very abstract manner, and then configuring processing flow and instructions during runtime based on external inputs. This is NOT accomplished by getting away from the predetermined-instruction, predefined-data paradigm. Even when a program follows a heuristic algorithm, the heuristic itself is a predetermined set of instructions.
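To make that concrete, here is a rough sketch of a "generic" program whose processing flow is configured at runtime. The operation names and the `run_pipeline` function are invented for this example. The caller chooses the steps and their order freely, but only from a menu that was written in advance.

```python
# The predetermined menu of instructions, fixed when the program is written.
OPERATIONS = {
    "strip": str.strip,
    "upper": str.upper,
    "reverse": lambda s: s[::-1],
}


def run_pipeline(text: str, steps: list[str]) -> str:
    """Apply the named steps to text, in the order given at runtime."""
    for step in steps:
        try:
            operation = OPERATIONS[step]
        except KeyError:
            # Anything outside the predefined menu cannot be expressed.
            raise ValueError(f"unknown step: {step!r}") from None
        text = operation(text)
    return text


# The flow is chosen at runtime; the instructions themselves are not.
result = run_pipeline("  hello ", ["strip", "upper", "reverse"])
assert result == "OLLEH"
```

The flexibility is real, but it lives entirely inside the boundary drawn by `OPERATIONS`; asking for a step that nobody defined up front simply fails.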
Contrast this to the way human beings interact with reality. We anticipate the future, but not in a set of predetermined instructions. We gather information, but generally it's stored in context of other information that we have already learned. We can even pre-determine what it is we seek to see in the real world: how often have you purchased something, only to then notice that thing everywhere? For example, when I last bought a bicycle, I started seeing people riding bicycles everywhere. I recently started driving an old British sportscar, and now I'm seeing little British sportscars everywhere.
But this is different in kind from the way that computers gather data. How I discover and solve problems (including my occasional approach of ignoring them or putting them off until they go away) is also completely different from the predetermined, algorithm-based computing approach.
The pro-forma approach that is inherent in computing demands of programmers and people that we come up with better and better models of reality (or problem space, or business, etc). These models can be specific or abstract, but they remain predetermined models.
© Copyright 2005 Steve Land. Last update: 4/21/2005; 8:22:06 AM.