Taking back our information — even as we leave our information where it is

This is the second in a series of posts on the theme of “Taking back our information”. The first post in this series had the subtitle: “Let’s get out of the export/import business & build places of our own for digital information”

There are good reasons to get out of the export/import business when it comes to the management of our personal information. Even as initiatives like the Data Liberation Front work towards making things easier, information export from an application (and then import into other applications) will remain a hassle for basic reasons. The export/import process takes time, is fundamentally “lossy” and often leaves us wishing we didn’t have to choose between application A and B. Why not both? Instead of moving our information between applications, why can’t applications come to our information?

We get a small glimpse of how this might work by considering a lengthening list of Web-based applets for editing the content of photographs: Fauxto, Flauntr, Fotoflexer, Lunapic, Phixr, Phoenix, Photoshop.com, Picture2Life, Pixenate, Pixer.us, Pixlr, Preloadr, PXN8, Snipshot, Snipshot Pro, Splashup. Applets work with pictures saved in standard formats. Applets let us browse to these pictures as files in our local file system.

More interesting, many of these applets let us work with our pictures as stored in a service like Facebook or Flickr. If we have a URL for the photo, we can open the photo directly into a photo-editing applet. Otherwise, we can often browse to a desired photo by first selecting a containing folder or “album”. Behind the scenes, applets make use (directly and indirectly via dialogs such as “File Open”) of relevant APIs. For photos stored in files locally in the user’s file system, applets might use one of the Windows APIs (e.g. Win32) or one of the Mac OS X APIs. For photos stored in a Web service such as Facebook or Flickr, applets might use the Graph API or the Flickr API.

Applets come close to supporting a model we are familiar with from our work with files stored locally on our personal computer: Browse, select, open, edit…

And…? We might like to complete the sequence with “save & close”. Or, perhaps even better, we might like support of a model in which changes are immediate (but with an option to undo) as, for example, they are when we edit notes in OneNote.

Alas, things aren’t so simple. A privileged applet may have a special relationship with the hosting service so that the modified photo is seamlessly saved back in place of the original. This currently seems to be the case, for example, for Aviary as the privileged photo-editing applet for Flickr. In other cases, we must save the modified version back as a new photo instead. Or, if we are able to save back over the photo (e.g. as kept in a file locally), we may find that tags and other metadata have not been preserved.

Even so, these photo applets — with all their current limitations — help us to imagine a better world for the management of our digital information. The approach is potentially a win both for the makers of these applets and for us as users of these applets:

1. Applet makers can focus on surface features for viewing and working with pictures. Makers leverage existing support for storage and related backend features for 24x7 operation, security, sharing, synchronization, etc. Applets contrast and compete with each other with respect to features such as special effects, maximum resolution, support for multiple layers and overall ease of use.

2. As users of these applets, we have more freedom to explore and to mix & match. Our decision concerning which front-end applications to use can be separated from decisions concerning the back-end services to use for storage, sharing, security, etc.
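The separation described in these two points can be sketched in a few lines of Python. Everything here is hypothetical: `PhotoStore`, `InMemoryStore` and `apply_applet` are illustrative names, not part of any real service's API, and the "photo" is just a string of bytes. The sketch only shows that an applet written against a small storage interface need not know which back end holds the picture.

```python
from typing import Callable, Dict, Protocol


class PhotoStore(Protocol):
    """Hypothetical minimal interface a storing service might expose."""

    def load(self, photo_id: str) -> bytes: ...
    def save(self, photo_id: str, data: bytes) -> None: ...


class InMemoryStore:
    """Stand-in for a back end (local files, a Web service, etc.)."""

    def __init__(self, photos: Dict[str, bytes]) -> None:
        self._photos = dict(photos)

    def load(self, photo_id: str) -> bytes:
        return self._photos[photo_id]

    def save(self, photo_id: str, data: bytes) -> None:
        self._photos[photo_id] = data


def apply_applet(store: PhotoStore, photo_id: str,
                 edit: Callable[[bytes], bytes]) -> None:
    """Open a photo in place, apply an edit, and save it back in place."""
    store.save(photo_id, edit(store.load(photo_id)))


def reverse_bytes(data: bytes) -> bytes:
    """Toy 'special effect' standing in for a real image operation."""
    return data[::-1]


# The same applet works against two independent back ends.
local = InMemoryStore({"beach": b"abc"})
web = InMemoryStore({"beach": b"xyz"})
apply_applet(local, "beach", reverse_bytes)
apply_applet(web, "beach", reverse_bytes)
```

Swapping in a different `edit` function changes the front end; swapping in a different store changes the back end. Neither choice constrains the other, which is the "mix & match" freedom described above.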

In a similar “in situ” spirit we might expect growing lists of applets that let us work in-place with other chunks of content (other information items) such as audio recordings, video recordings and text (plain or in HTML format).

As we use these applets, a healthy change can begin to occur in the relationship between us, “our” information and the “owning” application (whether this is an application with origins on the desktop, such as MS OneNote, or a Web-based service such as Evernote or Facebook).  Have we thought about switching between applications – moving, for example, from OneNote to Evernote? It’s easier to contemplate doing this if we know that some of the applets we’ve come to depend upon can be used in conjunction with either application. Conversely, we may more happily stay with the current owning application knowing that we are no longer completely hostage to its own “built-in” or “privileged partner” support for viewing and editing our information.

We begin to take back our information even as we leave it where it is. We might do this, anyway, for selected forms of information item such as pictures and for selected activities such as viewing and editing. Applets that work in-place with our information actually invoke a more original meaning of the word “application” as in “the act of applying or laying on”. Let the applications come to our information, not vice versa. Let the application be a thing we apply to our information rather than a thing to which we “submit”.

This is a beginning, but only a beginning. Taking back our information, i.e. getting more control over our information, is a key part of a larger effort to improve our practices of personal information management (PIM). And PIM is about more than viewing and editing individual items of information content. PIM is also about meta-level activities involving larger groupings of information. We need to ensure that our information is properly maintained (backed-up, archived, updated) and organized. We need to manage for privacy, security and an overall flow of information (i.e., who sees what when? Who/what gets our attention and when?). We need to measure and evaluate (e.g. “What did I get done today?” “Am I spending too much time in email and in overall communication?”). Above all, we need to make sense of our information (“What do I have here? What is this telling me?”).

These activities each depend, in one way or another,  upon an ability to structure and, more specifically, to group our information. It is our information structure, not our items of information content, which is most likely to be mangled or left behind altogether in an exercise of export/import. It is this information structure that poses the biggest challenges to our efforts to take back our information.

Why does taking back our information structure (as well as content) matter? And why is it so hard to do?

The first question is readily answered with reference to the listing above of PIM meta-level activities. When the structures we build and use are balkanized into the “silos” of different applications, it is nearly impossible for us to develop an overall organization for our information. We work instead with fragments of the same or _nearly_ the same structure in several different applications.

The structure of notebooks and tags we create in Evernote or the structure of section tabs and pages we create in MS OneNote may resemble the folder structure we have in our local file system or the outline we’ve developed in MS Word. But these correspondences are only partial and easily undermined. Update one structure… and now what about the others? Name a project, task or topic one way in one application; name it another way in another and then wonder later “which name did I use and where?”. (There are basic reasons for this “verbal disagreement” in our use of names.)

With our information structures fragmented by the applications we use, each of the meta-level activities is harder to do. How can we establish a coherent, consistent, comprehensive policy for the update, backup and archival of our information? Or a comparable policy for sharing and the “flow” of information from and to us? (Who sees what when? Who/what gets our attention and when?) How can we measure and evaluate our practice of PIM? (Where does our time go? On which projects? With which people? Regardless of specific application?) Above all, how can we make sense of our information when our structures for doing so are fragmented across so many different applications?

We might like to extract our structures from various applications in a manner similar to the extraction of a photograph or a song or a video clip or a chunk of text – even if only for a little while. We might like to do this, for example, in order to compare and contrast our organization of information across different applications. Which ones are old, abandoned and possibly way out of date with respect to our current ways of thinking about and using our information? Conversely, which organizations work best? We might like to extract their structures in order to copy and use elsewhere. A good organization deserves to be used and re-used.  

And then, thinking farther out, the ability to extract structure from our applications is a big step towards making structure “first class” as a thing that exists independently of our applications and that serves to integrate _all_ of our digital information. Structure can then evolve into a personal unifying taxonomy (a “PUT”) – all of our digital information in its place, and a place for all of our digital information.

Ok, so taking back our information structures is important. Why is doing this so hard? The reasons are summarized under two words, “disparate” and “diffuse”:

Disparate. Our structures come in many different forms according to the applications we use and the nature of the information we wish to structure. In Facebook, for example, we can structure our pictures into an “Album”. We can structure information about our friends into a “FriendList”. We structure information about a group of people (e.g., our bridge or poker group or a project team at work) into a “Group”. Evernote provides for “Notebooks”, “Notebook stacks” and “Tags”. MS OneNote, as an alternate note-taking application, provides for “Notebooks”, “Section groups”, “Sections”, “Pages” and “Sub-pages”. And of course we still have the folders and subfolders of our local file systems.

Different applications may use the same names for distinctly different ways of structuring. In Evernote, for example, the “tags” that are applied to structure notes can themselves be structured into a hierarchy. “Tags” in OneNote, on the other hand, can each be given a font, a color and an icon but cannot be structured into a hierarchy.

Diffuse. Leave aside the scattering of our information structures into different applications and the disparity in the ways to structure. Even within an application, structures often seem to be spread out. How to work with a “chunk” of structure in the way that we might work with a picture or a song or a paragraph of text? We expect to work on pictures independently of one another. Can we do something similar with a unit of structure? Can we do so quickly and efficiently? And what about the structures we want to share for group work as, for example, when we work through a wiki or Dropbox or Google Drive or SkyDrive? Can we work on these structures in ways that minimize the risk of collision and conflict?

We need a chunk (a unit of structure) that, by analogy to a picture (or song, clip, paragraph or some other chunk of content), might be viewed and edited in place, quickly, easily and with minimal chances for conflict. We expect several features in such a unit of structure. It should be:

• Modular with respect to the operations we’d like to perform. We might like, for example, to change the pictures that go into an “Album” or the notes that display on a “Page”. More generally, for any structure, through any application or service, we need to be able to change the groupings that are established through the structure.

• Small – as small as possible but no smaller. Smaller size supports handling speed and also modularity. But, in many examples to be considered in the next post, the unit can’t be broken down further into smaller units of structure (such as individual links) without serious loss of information.

• Well-defined. It should be clear how any given structure can be decomposed into these structural units.

• Fully expressive, in aggregate, of the overall structure. It should be possible to express a targeted information structure (e.g., as housed in an application like MS OneNote or a service like Facebook) through a simple collection of these structural units.

We need a name for our unit of structure. Our structures are information in their own right. By extension, our unit of structure is a special kind of information item. Call it a “grouping item”.  A grouping item is an information item whose primary purpose is to form a grouping of other information items (some of which themselves might be grouping items).  As we shall see in subsequent posts, our structures in the applications we use can be reduced to collections of grouping items. “Albums”, “Notebooks”, “Tags”, “Folders”, etc. are all variations of the grouping item.
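As a rough illustration only, here is one way a grouping item might look as a data structure, along with a decomposition of a nested folder tree into a flat collection of grouping items and a reconstruction from that collection. The class name, field names and the slash-separated identifiers are all assumptions made for this sketch (it also assumes item names contain no “/”); nothing here is a proposed standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GroupingItem:
    """One unit of structure: a named grouping of other items.

    `kind` records the storing application's own name for the grouping
    ("Album", "Notebook", "Folder", ...). `members` holds identifiers of
    the grouped items, some of which may themselves be grouping items.
    """
    item_id: str
    name: str
    kind: str
    members: List[str] = field(default_factory=list)


def decompose(tree: dict, kind: str, path: str = "") -> List[GroupingItem]:
    """Flatten a nested dict (name -> subtree, or name -> None for a
    content item) into a flat collection of grouping items."""
    items: List[GroupingItem] = []
    for name, subtree in tree.items():
        item_id = f"{path}/{name}"
        if subtree is None:
            continue  # content item: referenced by its parent only
        members = [f"{item_id}/{child}" for child in subtree]
        items.append(GroupingItem(item_id, name, kind, members))
        items.extend(decompose(subtree, kind, item_id))
    return items


def rebuild(items: List[GroupingItem], root_id: str) -> dict:
    """Recover the nested form from the flat collection."""
    by_id: Dict[str, GroupingItem] = {it.item_id: it for it in items}

    def expand(item_id: str):
        if item_id not in by_id:
            return None  # not a grouping item, so a content item
        return {m.rsplit("/", 1)[1]: expand(m) for m in by_id[item_id].members}

    return {by_id[root_id].name: expand(root_id)}


folders = {"Projects": {"Kitchen": {"budget.xls": None,
                                    "Photos": {"before.jpg": None}}}}
units = decompose(folders, kind="Folder")
```

Each unit can be viewed or edited on its own (modular, small), the decomposition rule is explicit (well-defined), and `rebuild(units, "/Projects")` returns the original tree, which is what “fully expressive, in aggregate” asks for.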

Recall that the mix & match use of applets as applied in-place to a picture depends upon two things:  1. The use of an API as supported by the application through which the picture is stored. We have called this the “owning application” but from now on we’ll call this the storing application.  2. The retrieval of the picture in a standard format such as JPEG. This is a format that applets know how to parse and manipulate.

As we’ll explore in later posts (and can see now through inspection of API documentation) popular applications/services such as Evernote and Facebook also support the retrieval of structure which is often packaged into grouping items (under various names). By extension of the analogy to pictures, we might then look for a standard format for grouping items that is comparable, for example, to the JPEG format for pictures.

Alas, we’ll find nothing directly suited to the diversity of structures we are able to build through our applications. Moreover, for reasons we’ll explore in subsequent posts, such a “format” for the grouping item needs to be both much more basic and much more flexible (extensible, customizable) than formats used for chunks of content.

Finally – and this is most important – the representation of a grouping item for purposes of a mix and match of applications (as “applied” in place to the structure) is only a shadow of a grouping item’s actual representation within the storing application. We’ll call it metadata – a fragment of metadata to represent (“shadow”) a grouping item.

Fortunately, in XML, and especially in its conventions for the use of namespaces, we have a very powerful language for creating such a format or, more accurately, an XML schema. In subsequent posts, we will explore features that this schema for the grouping item needs to have. We will see how we can begin to use this schema without extra, explicit support from our applications or the blessings of a standards-making body. And we will hear a tale of how we might take back our information even as we leave our information in the applications we currently use. A tale that might actually come true.
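To give a feel for why namespaces help, here is a small sketch using Python's standard `xml.etree.ElementTree`. The element names, attribute names and both namespace URIs are invented for illustration; the actual schema is the subject of later posts. The sketch shows a fragment that mixes a basic “core” vocabulary with an application-specific extension, and a reader that understands only the core yet still recovers the grouping.

```python
import xml.etree.ElementTree as ET

# Both namespace URIs are invented for this sketch; no such schema is published.
CORE = "http://example.org/grouping-item/core"
EVERNOTE = "http://example.org/grouping-item/evernote-extension"

ET.register_namespace("gi", CORE)
ET.register_namespace("en", EVERNOTE)

# A fragment that "shadows" a grouping item kept in a storing application.
fragment = ET.Element(f"{{{CORE}}}groupingItem", {
    f"{{{CORE}}}name": "Trip planning",
    f"{{{EVERNOTE}}}kind": "Notebook",  # application-specific extension
})
for target in ("note-001", "note-002"):
    ET.SubElement(fragment, f"{{{CORE}}}member", {f"{{{CORE}}}ref": target})

xml_text = ET.tostring(fragment, encoding="unicode")

# A reader that knows only the core namespace still recovers the grouping.
parsed = ET.fromstring(xml_text)
name = parsed.get(f"{{{CORE}}}name")
refs = [m.get(f"{{{CORE}}}ref") for m in parsed.findall(f"{{{CORE}}}member")]
```

Because the extension attribute lives in its own namespace, a tool that has never heard of it can skip it without misreading the rest, which is one way a format can be both very basic and open to extension.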

But first, we need to take a more thorough look at the grouping item in some of its many manifestations across the landscape of our digital information. The grouping item comes in many different forms and yet, across these forms, there is a core commonality that provides a basis for the “taking back” and integration of our information. How can we represent both the diversity between and core commonality among the many forms of the grouping item? And what, in turn, can grouping items – in all their different expressions – say about us and the ways we think and live? These are topics for the next post in this series.


