[continued from blog entry]
Here's how I want to work: I want to be able to just note stuff down (in my ideal world, wherever I am at that moment) and have it get stored and automatically categorized and organized -- by timestamp, at least, but ideally also in some kind of navigable taxonomy. In my ideal world, this is all implementation-neutral: I could do this by voice, by scribbling on a pad, by wiggling my fingers, whatever, wherever. But for now, I'll settle for having this work as data input to a computer, networked at least part of the time.
Then I'd want to be able to take the elements from this central datastore and shoot them in whatever direction I wanted -- shoot them to my blog, to my address book, to my weblinks, to some centralized, network-accessible repository, or even -- and this is really the critical part -- over to another datastore. Say, from the one on my handheld to the one on my desktop, or from desktop to a backup on the web.
If I write something I want to publish, I want to be able to take some simple action that publishes it -- sends it to any place that I normally publish things. That may be my private journal; it may be something like what a wiki is supposed to be, a datastore on some or many subjects. It shouldn't matter.
When it's published, it shouldn't matter where I was when I wrote it, or what technology I used to write it; and I should be able to use the same piece of my own data to publish to many different places -- my own places, or someone else's.
This is something I've fantasized about for a long time. Within the last two to three years, it's sometimes felt almost palpably close. Yet it's really as far away as ever. And there's no really good reason for that. At least, no technical reason. Everything I describe above has been technically feasible for ten years.
Let's break down the requirements, here, so we can see a few of the barriers.
The first part, I'll call the "journaling editor": Its purpose is to record and store data in a publishable format. It would need to write data to a standard format that can be interchanged losslessly with any application. If I unpack the requirements, I can see that this implies that it should write some kind of XML format, and that it must have a cleanly defined API. Less obvious: It must be client-executing, with access to the client filesystem (so conventional web-served Java is out), must have a rich UI (which is to say, it can't be web-based in the traditional sense), and must be platform agnostic at least with regard to Windows/*nix. (Note that *nix implies Macintosh, as well.)
It would not matter if the journaling editor were an actual, particular piece of software, or simply described any software that was capable of meeting the interface requirements. It's clear that it's really not that difficult a piece of software to write; a skilled XUL or GTK hacker could probably pull something ugly together in a few days. An Emacs maven would already be working out the rough details of how to code it in macros in their head as they read this.
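To make the "standard format, interchanged losslessly" requirement concrete, here is a minimal sketch of what a journal entry might look like going to XML and back. The element names (`entry`, `category`, `body`) and the `JournalEntry` class are my own placeholders, not any existing standard; the only point is that a round trip returns exactly what went in.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class JournalEntry:
    """One note in the datastore (hypothetical schema, not a real standard)."""
    body: str
    created: datetime
    categories: list = field(default_factory=list)


def entry_to_xml(entry: JournalEntry) -> str:
    """Serialize an entry; the timestamp is stored as an ISO 8601 attribute."""
    root = ET.Element("entry", created=entry.created.isoformat())
    for name in entry.categories:
        ET.SubElement(root, "category").text = name
    ET.SubElement(root, "body").text = entry.body
    return ET.tostring(root, encoding="unicode")


def entry_from_xml(xml_text: str) -> JournalEntry:
    """Parse the XML produced by entry_to_xml back into an entry."""
    root = ET.fromstring(xml_text)
    return JournalEntry(
        body=root.findtext("body", default=""),
        created=datetime.fromisoformat(root.get("created")),
        categories=[c.text for c in root.findall("category")],
    )


note = JournalEntry(
    body="Call the plumber & check the <pipes>",
    created=datetime(2004, 5, 1, 12, 0, tzinfo=timezone.utc),
    categories=["todo", "house"],
)
assert entry_from_xml(entry_to_xml(note)) == note
```

Any editor that can read and write this shape would qualify as a journaling editor, whatever toolkit it is built with.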
The second part, I'll call the transmission layer: Its purpose is to publish/syndicate (for present purposes, they're synonymous). It must have clean, standard interfaces with the journaling editor, and exist on all the same platforms. It must be capable of abstracting to whatever submission protocols are required for its source platform (the journaling editor) and target platforms.
Again, there are existing toolsets to do a lot of this. The syndication layers for major weblogging and aggregation software could, in effect, be coaxed into serving as this layer.
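As a rough illustration of the transmission layer, the sketch below treats each destination as a plain function that accepts a serialized entry. The `Transmitter` class and the destination names are invented for this example; a real target would wrap whatever submission protocol its platform requires, which is left out here.

```python
from typing import Callable, Dict, List

# A target takes the serialized entry and delivers it somewhere.
# How it delivers (HTTP post, file copy, email) is hidden behind the call.
Target = Callable[[str], None]


class Transmitter:
    """Hypothetical dispatcher: one payload, many named destinations."""

    def __init__(self) -> None:
        self._targets: Dict[str, Target] = {}

    def register(self, name: str, target: Target) -> None:
        self._targets[name] = target

    def publish(self, payload: str, destinations: List[str]) -> List[str]:
        """Send payload to each known destination; return the names reached."""
        delivered = []
        for name in destinations:
            target = self._targets.get(name)
            if target is None:
                continue  # unknown destination: skip it, keep going
            target(payload)
            delivered.append(name)
        return delivered


# In-memory stand-ins for a weblog and a backup datastore.
blog, backup = [], []
transmitter = Transmitter()
transmitter.register("blog", blog.append)
transmitter.register("backup", backup.append)
transmitter.publish("<entry/>", ["blog", "backup"])
```

The same piece of data goes to both places, and the editor that produced it never needs to know what either destination is.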
The third part, I'll call the presentation layer: Its purpose is to take what I want to publish and present it to its audience. It must be capable of losslessly reading the item I'm publishing, and handling it in a way that does not cause its own presentation to break. So if I embed an ActiveX object, and my target platform doesn't like those, it should be able to not show them. If my content includes a large image, the target platform ought to be provided enough information to resize or omit it.
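A sketch of that behavior, under some loud assumptions: embedded objects appear as `embed` elements directly under the entry, each with a `type` and optionally a `width` attribute, and the target platform declares which types it accepts and how wide an item it can show. None of these names come from a real format. The function only clamps the `width` hint; actually resampling the image would be the target's job.

```python
import xml.etree.ElementTree as ET


def adapt_for_target(xml_text: str, supported_types: set, max_width: int) -> str:
    """Drop embeds the target can't show and clamp oversized width hints."""
    root = ET.fromstring(xml_text)
    for embed in root.findall("embed"):  # direct children only
        if embed.get("type") not in supported_types:
            root.remove(embed)  # the target doesn't like these: omit
            continue
        width = embed.get("width")
        if width is not None and int(width) > max_width:
            embed.set("width", str(max_width))
    return ET.tostring(root, encoding="unicode")


source = (
    '<entry><body>Trip photos</body>'
    '<embed type="activex" src="viewer.ocx" />'
    '<embed type="image" src="beach.png" width="1600" />'
    '</entry>'
)
adapted = adapt_for_target(source, supported_types={"image"}, max_width=800)
```

The original entry in the datastore is untouched; each target gets its own adapted copy.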
Right now, all of these layers tend to be wrapped up in a single application, with some qualified exceptions, and excepting that the applications tend to talk to like-peers (e.g., Drupal to MT) via their syndication interfaces. Parts of the journaling editor and the transmission layer can be wrapped up in widgets that facilitate blogging from your browser or your desktop. They're clients of the presentation layer, but they don't separate the data from the weblog system. Similarly, all three layers can be collapsed, though badly, into a News Aggregator, which then interacts with the blogging software via some standard posting API.
That all these functions are wrapped together into a single piece of software that works "well enough" is clearly one of the biggest barriers. There's no good incentive to, say, keep an XML datastore of your Drupal (or MT or Wordpress) weblog. (Other than "backup", of course. [Ahem.])
This isn't really a visionary manifesto. People cook up the functional equivalent of this for their own use, all the time. But nobody ever pulls it together into pieces that could be used by ordinary folks.
I can only recall encountering a small number of systems that were designed to create client-side datastores that then updated a server-side repository. The principal one, and the relevant one for this discussion: Radio. (Notes is another good example, but it leads the diatribe in an entirely different direction, so I'll ignore it for now.) And Radio was only ever intended to do that in a fairly primitive, unsophisticated way: By uploading already-formatted HTML pages -- and often, duplicates of those pages -- to a webserver. Later revisions let it integrate with Manila servers; but since Manila was never very widely used, that limited its interoperability significantly.
Quite aside from its many, many other practical flaws, Radio fails to meet my requirements on two very important criteria: It doesn't have a rich UI (programmable APIs don't count -- anyone who says they do is not user-focused), and it's tightly coupled. It's still a single application; if you use Radio on the client, you use Radio or Manila on the server. End of discussion, unless you want to roll your own exports, of course, but again, APIs are not user interfaces.
The key point to all of this is that it's achievable now -- for one or two or a small group of people, who can agree to share the same software. It's been achievable on that level for a long time.
I wrote today in another context that the problem with how we find information (call it the "information location problem") isn't really a problem with how we store data, but with the metaphor we use when creating it. Since personal computing began, the dominant paradigm has been what could be called either the "file-centric" or the "application-centric" paradigm: You start an application and use it to create a file that contains your data. You always need to know if it's an Excel file or a Word file or a Trapeze file.
As a remedy to that, and starting back in the 1970s, some researchers began to suggest what would come to be known as a "document-centric" paradigm. The idea was that you would create or open documents, without needing to know what the application was that created them. Apple's Lisa OS, and then the Macintosh OS, were strongly influenced by this idea. The idea had a huge impact on the design of applications, too: AppleWorks/ClarisWorks, Microsoft Works, GeoWorks, and many other application mini-suites were inspired to a greater or lesser degree by the same ideas. But those were still app-centric solutions, as was the Mac OS implementation: You still needed that specific application at that place and time to open that document.
At the system level, Microsoft's OLE [Object Linking and Embedding, later to diverge in part into COM (Component Object Model) and ActiveX] and the ambitious and promising Apple-led OpenDoc effort hoped to solve the problem in a more fundamental way. Microsoft never really wanted to make the data objects independent of the application -- they were an app-software vendor, after all -- but OpenDoc, at least as I recall it, aimed to remove the boundaries between different types of data.
It was ambitious, promising, and doomed to failure. The systems in place were "good enough"; there wasn't sufficient incentive to replace them. Businesses (who drove most development) were quite comfortable, conceptually, with separating document types by application. It was just another form of regimentation, and post-Fordist business is all about regimentation, no matter what anybody wants to tell you about "creative destruction" and cheese-moving.
What I'm looking for is something both less and more ambitious. More ambitious, because if things happen the way I hope they will, there won't be any reason not to replace all note-taking and most publication with instances of journal editor objects. Certainly you'll still need to pack up spreadsheets or formal, rote-printable documents in some specific software format (Microsoft Project, for example); but in the world I envision, embedding those once into a journal editor object could make them available for distribution as email, posting to a portal, or tracking in a CVS repository, all conditioned only by where you wanted to send them.
Less ambitious, because I don't imagine people getting rid of their old applications, and I don't envision people switching to this mode of information creation in large numbers over a short period of time. The beauty part, and the most important take-away from this, is that they would not need to. If the systems are standards-compliant -- if they use free and open and well-defined interfaces -- then there should be no problems creating gateways or abstraction layers into Word or Excel or Outlook or Sharepoint or Domino or Notes or Groove or Movable Type.
And that's enough for now.
Meta-note: I had to post the story this way -- a blog entry linking to a longer piece -- because of a core weakness in Drupal's Node model. It seems that "teaser" sections can only be set at one of several fixed lengths, none greater than 2000 characters. So I could only post this on the front page on pain of driving all other front-page content well down the page. So one obvious workaround: Post a blog entry that points to a real story. So here it is.