As World War II mercifully drew to a close, Vannevar Bush, President Truman's Director of Scientific Research, surveyed the post-war landscape and laid out what he viewed as the most important forthcoming challenges to humankind.9 In his oft-cited article, he also described a hypothetical information storage device called the "memex," intended to tackle the information overload problem that was already formidable in 1945. In Bush's own words:
"Consider a future device for individual use, which is a sort of mechanized private file and library. It needs a name, and to coin one at random, 'memex' will do. A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory."
He went on to specify that the user should be able to "add marginal notes and comments," and "build a trail of his interest" through the larger information space. And Bush's emphasis throughout the article was on expanding our own powers of recollection: "Man needs to mechanize his record more fully," he says, if he is not to "become bogged down...by overtaxing his limited memory."
According to Bush, this kind of ubiquitously available digital assistant, capturing and faithfully reproducing a person's thoughts, sources, and organization of information, would be more than anything else the key to maximizing mankind's potential in the coming information age. Granted, the vision he described has inspired a variety of other research endeavors, from information retrieval to distributed hypertext systems. But a great deal of his design can be seen as an example of what this article will characterize as a personal knowledge base.
Personal knowledge bases have existed in some form since humankind felt compelled to manage information: card files, personal libraries, Da Vinci's notebooks. Today there are literally dozens of software products attempting to satisfy these needs. Designers approach the problem in different ways, but have the same aim. And no wonder. If a "memex" was needed in Bush's day, then today's information explosion20 makes it an order of magnitude more important. If a human's only tool for retaining what they learn is their biological memory, their base of knowledge will be porous indeed.
Yet the problem is deceptively difficult to solve. How to design a system that attempts to capture human memories? To interrelate heterogeneous information, assimilated from numerous diverse sources and filtered through an individual's subjective understanding? To give users a natural way to search and correlate and extend that information? These are problems that many have attempted to solve, but the solutions remain tantalizingly incomplete.
In this article, we look at a number of these systems and provide a taxonomy for classifying their approaches.
We define a Personal Knowledge Base, or PKB, as an electronic tool through which an individual can express, capture, and later retrieve the personal knowledge he or she has acquired. Our definition has three components:
Personal: Like Bush's memex, a PKB is intended for private use, and its contents are custom tailored to the individual. It contains trends, relationships, categories, and personal observations that its owner sees but which no one else may agree with. Many of the issues involved in PKB design are also relevant in collaborative settings, as when a homogeneous group of people is jointly building a shared knowledge base. In this case, the knowledge base could simply reflect the consensus view of all contributors; or, perhaps better, it could simultaneously store and present alternate views of its contents, so as to honor several participants who may organize it or view it differently. This can introduce another level of complexity.
Knowledge: A PKB primarily contains knowledge, not information. That is, its purpose is not simply to aggregate all the information sources one has seen, but to preserve the select knowledge that one has learned from those sources. Psychologically, knowledge is the formation of a mental model in one's own mind that corresponds to the information encountered.21
Base: A PKB preserves knowledge for the long haul. It is ideally future-proof: immune to shifts in technology and to disaster. It can be fluidly searched and browsed. It also forms a consolidated, integrated whole, without partitions that might isolate some areas of knowledge from others. This is because it is a reflection of one's own memory, which, as Bush and many others have observed, can freely associate any two thoughts together, without restriction.3,9
What can be gained from using a PKB? An idea of the presumed advantages can be gleaned from the way in which today's numerous solutions are "pitched":
Knowledge generation and formulation. Here the emphasis is on procedure, not persistence; it is the act of simply using the tool to express one's knowledge that helps, rather than the ability to retrieve it later. Systems boast that they can "convert random thoughts generated while you are the most creative into the linear thoughts needed most when communicating;" "help you relate and arrange random ideas;" and "stimulate your brain" (http://mindmappersusa.com).
Knowledge capture. PKBs do not merely allow one to express knowledge, but also to capture it before it elusively disappears. The point is to lower the burden of jotting down one's thoughts so that neither task nor thought process is interrupted. Stay At-Play's Idea Knot, for example, asserts that "it is very quick to open a...document and within seconds record the essence of that new idea without distractions, while your mind is focused on it and without disturbing the flow of your current work." (http://www.stayatplay.com).
Knowledge organization. A short study on note-taking habits found that "better organization" was the improvement people most desired in their own information recording practices.24 PKB systems like Aquaminds Notetaker (http://aquaminds.com) profess to answer this need, allowing one to "organize personal information," and claiming to be "a more productive way to stay organized."
Knowledge management and retrieval. Perhaps the most critical aspect of a PKB is that the knowledge it stores is permanent and accessible, ready to be retrieved at any later time. PersonalKnowbase (http://bitsmithsoft.com) claims it will "give you a place to stash all those stray snips of knowledge where they can be quickly recalled when you need them," and MicroLogic's InfoSelect lets you "find any data in an instant, no matter where or how you entered it" (http://miclog.com).
A plethora of candidate PKB systems have emerged over the past decades. Here, we give an overview of some of the more notable efforts from three distinct research communities.
Graphical knowledge capture tools. Much fanfare has been generated in the last 30 years around pictorial knowledge representations. Some claim that drawing informal diagrams to represent abstract knowledge is an excellent way to communicate complex ideas, enhance learning, and even to "unlock the potential of the brain."
"Mind mapping" and "concept mapping" are the two most popular paradigms in graphical knowledge capture. A mind map is essentially nothing more than a visual outline, in which a main idea or topic is written in the center of the diagram, and subtopics radiate outward in increasing levels of specificity. The primary value is in the freeform, spatial layout, and the ability for a software application to hide or reveal select levels of detail. The open source Freemind project (see Figure 1) is just one of literally dozens of such tools.
Concept maps34 are based on the premise that newly encountered knowledge must be related to one's prior knowledge in order to be properly understood. Concept maps help depict such connections graphically (see Figure 2). Like mind maps, they feature evocative words or phrases in boxes connected by lines. However, there are important differences in the underlying data model (tree vs. graph) that we will discuss shortly.
Hypertext systems. The hypertext community proudly points to Bush's article as the cornerstone of their heritage. Hence the development of hypertext techniques, while seldom applied specifically toward PKB solutions, is historically important. Doug Engelbart, who began developing the first viable hypertext system in 1959, stated his purpose as "the augmentation of man's intellect."15 In other words, Engelbart's goal was to use the hypertext model specifically to model abstract knowledge.
The TextNet37 and NoteCards23 (see Figure 3) systems further explored this idea. TextNet revolved around "primitive pieces of text connected with typed links to form a network similar in many ways to a semantic network."16
The subsequent NoteCards effort, one of the most influential hypertext efforts in history, was similarly designed to "formulate, structure, compare, and manage ideas." The popular HyperCard program for the Apple Macintosh22 offered similar functionality through its notion of a stack of digital cards that could be flexibly interconnected.
The sweeping vision of the Xanadu project33 was for a web of documents composed of individually addressable chunks of any granularity. Any part of one document could freely refer to any part of another, and this reference would persist even when the contents of one or both documents were updated. An interface based on the parallel visualization of texts and "transpointing windows" made navigating this intricate structure possible. Though not yet fully implemented, Xanadu's design also makes heavy use of transclusion, as described later. Other examples of knowledge-based hypertext tools include Compendium,12 PersonalBrain (http://thebrain.com), and Tinderbox.4
Note-taking applications. The most explicit attempt to create a PKB as we have defined it comes from the area of note-taking applications. These software tools allow a user to create bits of text and then organize or categorize them in some way. They draw heavily on the "note-taking" metaphor since it is a familiar operation for users to carry over from their experiences with pen and paper (see Figure 4).
Supporting bodies of research. In addition to the kind of user interface exhibited by the systems mentioned here, several underlying technologies are vital for a PKB to function properly. Here we give a brief overview of some of the research efforts that are applicable to PKB implementation.
Graph-based data storage and retrieval. Since nearly all PKB systems employ some form of graph or tree as their underlying data model, research from the database community on semi-structured data storage and retrieval is relevant here. Stanford University's Lore project30 was an early implementation of a DBMS solution specifically designed for graph-based data. The architecture provided an efficient storage mechanism, and the Lorel language permitted precise queries to be posed in the absence of a fixed schema. Lore's notion of a "DataGuide" (a summary of the graph structure of a knowledge base, encoding all possible path traversals) could even provide a basis for assisting users in query formulation, a particular challenge when a knowledge base grows in haphazard fashion.
UnQL7 was an even more expressive query language than Lorel, based on structural recursion that could be applied to tree-structured data as well as arbitrary graphs. Buneman et al. demonstrated that the language can be efficiently implemented in a way that queries are guaranteed to terminate, and can be optimized just as in the relational algebra. StruQL,16 though originally designed for the specific task of Web site development, was in fact a general-purpose graph query language with similar features.
More recently, the World Wide Web Consortium's Semantic Web initiative has prompted numerous implementations of RDF triple stores, or databases that house graph-structured data. Notable examples include Jena and Sesame.6 SPARQL,36 an RDF query language for selective retrieval of graph data, was recently adopted as a W3C Recommendation. For tree (as opposed to graph) data, a multitude of XML query languages exist, including XQuery and the recently proposed extension XQFT,2 which enhances explicit tree navigation constructs with information retrieval on the nodes' free text.
All of these efforts lend considerable optimism to PKB implementers since they provide techniques for efficiently storing and retrieving the kind of data most PKBs are likely to store.
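To make the idea of schema-free graph retrieval concrete, the following is a minimal sketch in Python of pattern matching over subject-predicate-object triples, the shape of data that RDF triple stores hold. It is an illustration only: it does not reflect how Lore, Jena, or Sesame are implemented, and it is far less expressive than Lorel or SPARQL. The function name `match` and the sample triples are our own assumptions.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# All names and data here are illustrative assumptions.
TRIPLES = [
    ("memex", "proposed_by", "Vannevar Bush"),
    ("Xanadu", "proposed_by", "Ted Nelson"),
    ("Xanadu", "uses", "transclusion"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# No fixed schema is needed: any position can be constrained or left open.
proposals = match(TRIPLES, predicate="proposed_by")
xanadu_facts = match(TRIPLES, subject="Xanadu")
```

A real query language adds joins across patterns, path expressions, and optimization, but the wildcard pattern is the basic building block.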
Graph data integration. A large portion of a user's knowledge consists of bits of information from the external sources they have assimilated. In some cases, these external sources may contain structured or semi-structured data. This would be the case if one PKB wanted to subsume part of another, say, or if the information source itself was a relational database or came from a graph-structured knowledge store, expressed in RDF or another graph-based representation. In these cases, the research findings of the data integration community can be brought to bear. Stanford University's pioneering work on the OEM model and query language35 illustrated a standard mechanism for the exchange of data between diverse and dynamic sources. McBrien and Poulovassilis29 showed how XML data sources can be semantically integrated by reducing disparate schemas into a common, lower-level graph (actually, hypergraph) language. Vdovjak and Houben38 provided a framework for a unified interface to query heterogeneous RDF data sources. These successes place the PKB-related task of subsuming external information on a firm theoretical footing.
Data provenance. When assimilating external knowledge, a PKB should also track and retain source information. Research in managing data provenance (sometimes known as data lineage) has produced numerous relevant results here. Buneman et al.8 devised a system for tracking users' browsing and collection activities so that they can be queried later on. They accomplished this by supplementing the user's personal database with a separate provenance database that links items to their original sources, and to previous versions within the local database. The Trio system39 also automatically tracks when and how data items came to exist, whether imported from outside sources, or computed from other known facts. This allows a history of each item to be reconstructed, and the database to be selectively filtered based on source or time information. Bhagwat et al.5 specifically studied the propagation of annotations, so that as data evolves over time source data can be recovered. These techniques are applicable to PKB implementation as well, to enable users to browse and collect information and know that source information will automatically be tracked.
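The notion of a separate provenance store that links each item to its source and to its previous local version can be sketched as follows. This is a simplified illustration under our own assumptions, not the design of the cited systems; the class names, field names, and `example.org` URL are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvenanceRecord:
    item_id: str
    source: str                              # where the item came from
    previous_version: Optional[str] = None   # earlier local version, if any


class ProvenanceLog:
    """Kept apart from the knowledge items themselves (hypothetical design)."""

    def __init__(self):
        self._records = {}

    def record(self, item_id, source, previous_version=None):
        self._records[item_id] = ProvenanceRecord(item_id, source, previous_version)

    def history(self, item_id):
        # Follow previous-version links back to the original capture.
        chain = []
        while item_id is not None:
            rec = self._records[item_id]
            chain.append(rec)
            item_id = rec.previous_version
        return chain

    def items_from(self, source):
        # Filter the knowledge base by where items originated.
        return sorted(r.item_id for r in self._records.values() if r.source == source)


log = ProvenanceLog()
log.record("note-1", source="https://example.org/article")
log.record("note-1-rev", source="user edit", previous_version="note-1")
log.record("note-2", source="https://example.org/article")
```

Here `log.history("note-1-rev")` walks from the revised note back to the original capture, and `log.items_from(...)` selects everything collected from one source.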
Kaplan et al. stated it well when they observed in 1990 that "dominant database management paradigms are not well suited for managing personal data," since "personal information is too ad hoc and poorly structured to warrant putting it into a record-oriented online database."25 Clearly this is the case; when we want to jot down and preserve a book recommendation, directions to a restaurant, or scattered lecture notes, a rigidly structured relational database table is exactly the wrong prescription. The random information we collect defies categorization and quantization, and yet it demands some sort of structure, both to match the organized fashion in which we naturally think and to facilitate later retrieval. The question is, what sort of data model should a PKB provide?
A few definitions are in order. First, we will use the term "knowledge element" to refer to the basic building blocks of information that a user creates and works with. Most systems restrict knowledge elements to be simple words, phrases, or concepts, although some (especially note-taking systems) permit larger blocks of free text, which may even include hyperlinks to external documents. Second, the term "structural framework" will cover the rules about how these knowledge elements can be structured and interrelated.
This section presents and critiques the five principal PKB structural frameworks (tree, graph, tree plus graph, spatial, and category). The vast majority of PKB tools are based on one of these five principal frameworks, although a handful of alternates have been proposed (for example, Lifestreams' chronological approach17 and Aquanet's n-ary relations27). We will then give particular attention to Ted Nelson's ZigZag paradigm,32 a more flexible model than any of these five whose expressive power can subsume all of them. Later, the key characteristic of transclusion and its influence on the various frameworks will be addressed.
Tree. Systems that support a tree model allow knowledge elements to be organized into a containment hierarchy, in which each element has one and only one "parent." This takes advantage of the mind's natural tendency to classify objects into groups, and to further break up each classification into subclassifications.
All of the applications for creating mind maps are based on a tree model, because a mind map is a tree. And most of the "notebook-based" note-taking systems use a tree model by allowing users to partition their notes into sections and subsections. Some tools extend this paradigm by permitting "crosslinks" between items (or Mind Manager's "floating topics," which are not anchored to the hierarchy.) The fact that such features are included betrays the inherent limitations of the strict tree as a modeling technique: it is simply inadequate for representing much complex information.
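The single-parent constraint, and the limitation it imposes, can be seen in a minimal Python sketch. The class `TreeElement` and its methods are hypothetical names chosen for illustration and are not taken from any of the tools mentioned.

```python
class TreeElement:
    """A knowledge element in a strict containment hierarchy (hypothetical)."""

    def __init__(self, text):
        self.text = text
        self.parent = None
        self.children = []

    def add_child(self, child):
        # The tree model's defining rule: one and only one parent.
        if child.parent is not None:
            raise ValueError(f"{child.text!r} already has a parent")
        child.parent = self
        self.children.append(child)

    def path(self):
        # Walking up the parent chain always yields a single, unambiguous context.
        names = []
        node = self
        while node is not None:
            names.append(node.text)
            node = node.parent
        return list(reversed(names))


root = TreeElement("Notes")
books = TreeElement("Books")
travel = TreeElement("Travel")
root.add_child(books)
root.add_child(travel)
item = TreeElement("Productivity book")
books.add_child(item)
```

In this sketch, `item.path()` gives the element's one context, while an attempt to also file it with `travel.add_child(item)` raises an error. That is the restriction that features such as crosslinks try to work around.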
Graph. Graph-based systems, including hypertext systems and concept-mapping tools, allow users to create knowledge elements and then to interconnect them in arbitrary ways. In many systems, links between items can optionally be labeled with a word or phrase indicating the nature of the relationship, and adorned with arrowheads on one or both ends to indicate navigability.
An alluring feature of the graph data model is that it is essentially equivalent to a "semantic network,"3,40 believed by many psychologists to be an excellent model for human memory. Just as humans perceive the world in terms of concepts and the relationships between them, so a graph depicts a web of interconnected entities. The ideal PKB would supplement this model with alternative retrieval mechanisms (such as full-text indexing, or suggesting "similar items" the user has not explicitly linked) so as to compensate for the human mind's shortcomings. But if a primary goal of a PKB is to capture the essence of a human's thoughts, then using a graph data model as the foundation is powerfully attractive.
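As a rough sketch of this model, the Python below stores knowledge elements with optionally labeled, optionally bidirectional links, and adds a simple text search as one example of a retrieval mechanism that does not depend on explicit links. The class `KnowledgeGraph` and its method names are assumptions made for illustration.

```python
class KnowledgeGraph:
    """Elements connected by arbitrary, optionally labeled links (hypothetical)."""

    def __init__(self):
        self.elements = set()
        self.links = {}  # element -> list of (label, target) pairs

    def link(self, source, target, label=None, bidirectional=False):
        self.elements.update((source, target))
        self.links.setdefault(source, []).append((label, target))
        if bidirectional:
            # An arrowhead on both ends: navigable in either direction.
            self.links.setdefault(target, []).append((label, source))

    def neighbors(self, element):
        return [target for _, target in self.links.get(element, [])]

    def search(self, word):
        # Retrieval that works even for items the user never linked.
        return sorted(e for e in self.elements if word.lower() in e.lower())


g = KnowledgeGraph()
g.link("Vannevar Bush", "memex", label="proposed")
g.link("memex", "hypertext", label="inspired")
g.link("concept map", "semantic network", bidirectional=True)
```

Unlike the tree, nothing prevents an element from being the target of any number of links, so "hypertext" could be reached from as many other elements as the user cares to connect.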
Tree plus graph. Although graphs are a strict superset of trees, trees offer some important advantages in their own right: simplicity, familiarity, ease of navigation, and the ability to conceal details at any level of abstraction. Indeed, the problem of "disorientation" in hypertext navigation11,26 largely disappears with the tree model; one is never confused about "where one is" in the larger structure, because traversing the parent hierarchy gives the context of the larger surroundings. For this reason, several graph-based systems have incorporated special support for trees as well, to combine the advantages of both approaches.
One of the earliest systems to combine tree and graph primitives was TEXTNET, which featured two types of nodes: "chunks" (that contained content to be browsed and organized) and "table of contents" nodes (or "tocs.") Any node could freely link to any other, permitting an unrestricted graph. But a group of tocs could be combined to form a tree-like hierarchy that bottomed out in various chunk nodes. In this way, any number of trees could be superimposed upon an arbitrary graph, allowing it to be viewed and browsed as a tree, with all the requisite advantages.
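The idea of superimposing trees on an unrestricted graph can be sketched in a few lines of Python. This is our own simplified reading of the description above, not TEXTNET's actual implementation; the `Node` class, its fields, and the `outline` function are hypothetical.

```python
class Node:
    def __init__(self, name, kind):
        self.name = name
        self.kind = kind      # "chunk" (content) or "toc" (table of contents)
        self.links = []       # unrestricted graph links to any other node
        self.contents = []    # ordered members; used only by toc nodes


def outline(toc, depth=0):
    """Render the hierarchy under a toc node as indented lines."""
    lines = ["  " * depth + toc.name]
    for child in toc.contents:
        if child.kind == "toc":
            lines.extend(outline(child, depth + 1))
        else:
            lines.append("  " * (depth + 1) + child.name)
    return lines


memex_notes = Node("Memex notes", "chunk")
xanadu_notes = Node("Xanadu notes", "chunk")
memex_notes.links.append(xanadu_notes)   # a plain graph link between chunks

hypertext = Node("Hypertext", "toc")
hypertext.contents += [memex_notes, xanadu_notes]
reading = Node("Reading", "toc")
reading.contents.append(hypertext)

history = Node("History", "toc")          # a second tree over the same graph
history.contents.append(memex_notes)
```

Both `outline(reading)` and `outline(history)` present a tree view, yet they share the same underlying chunk, and the graph link between the two chunks is unaffected by either hierarchy.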
Spatial. In the opposite direction, some designers have shunned links between elements altogether, favoring instead spatial positioning as the sole organizational paradigm. Capitalizing on the human's tendency to implicitly organize through clustering, making piles, and spatially arranging, some tools offer a 2D workspace for placing and grouping items. This provides a less formal (and perhaps less intimidating) way for a user to gradually introduce structure into a set of items as it is discovered.
This approach originated from the spatial hypertext community, demonstrated in projects like VIKI/VKB28 (see Figure 5). With these programs, users place information items on a canvas and can manipulate them to convey organization imprecisely. VIKI and VKB are especially notable for their ability to automatically infer the structure from a user's freeform layout: a spatial parser examines which items have been clustered together, colored or otherwise adorned similarly, and so on, and makes judgments about how to turn these observations into machine-processible assertions.
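One ingredient of such inference, grouping items that sit near each other on the canvas, can be illustrated with a small Python sketch. It is only an assumption-laden toy: a real spatial parser also considers alignment, color, and other visual cues, and the function name, threshold, and coordinates below are invented for the example.

```python
from math import dist


def cluster_by_proximity(positions, threshold):
    """Group items whose canvas positions lie within `threshold` of each
    other, directly or through a chain of nearby items."""
    names = list(positions)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if dist(positions[a], positions[b]) <= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


canvas = {
    "memex": (10, 10),
    "hypertext": (14, 12),
    "transclusion": (18, 15),
    "groceries": (200, 180),
    "recipes": (205, 184),
}
clusters = cluster_by_proximity(canvas, threshold=10)
```

With these coordinates the three hypertext-related items fall into one group and the two household items into another, which a tool could then turn into an explicit, machine-processible grouping.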
Certain note-taking tools (for example, Microsoft OneNote) also combine an overarching tree structure with spatial freedom on each "page." Users can access a particular page of the notebook with basic search or tree navigation facilities, and then lay out notes and images on the page as desired. Tinderbox,4 in addition to supporting the graph model, also makes heavy use of the spatial paradigm, which lets users express less formal affinities between items.
Category. The fifth fundamental structural framework that PKB systems use is that of categories. Users may think of categories as collections, in which the category somehow encloses or "owns" the items within it. Alternatively, they may think of labeling items with custom-defined keywords, thereby implicitly creating a category. The important point is that a given item can be simultaneously present in multiple categories, relieving the tree model's most restrictive constraint.
The first popular application to embrace the category approach was the original Agenda, which later became a commercial product and spawned many imitations. Personal Knowbase (http://bitsmithsoft.com; see Figure 6), Haystack, and Chandler (http://osafoundation.org) are more modern examples.
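The key property of the category framework, that one item can belong to several categories at once, can be sketched in Python as follows. The `CategoryStore` class and its methods are hypothetical and do not describe how Agenda or its successors work internally.

```python
from collections import defaultdict


class CategoryStore:
    """Items labeled with any number of categories (hypothetical sketch)."""

    def __init__(self):
        self.by_category = defaultdict(set)
        self.by_item = defaultdict(set)

    def tag(self, item, *categories):
        for category in categories:
            self.by_category[category].add(item)
            self.by_item[item].add(category)

    def items_in(self, *categories):
        # Items present in every one of the requested categories.
        sets = [self.by_category.get(c, set()) for c in categories]
        return sorted(set.intersection(*sets)) if sets else []


store = CategoryStore()
store.tag("Call John about the drill", "neighbors", "to-do")
store.tag("Block party invitation", "neighbors", "events")
```

The same note appears under both "neighbors" and "to-do" without being copied, and `store.items_in("neighbors", "to-do")` narrows retrieval by combining categories.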
We treat ZigZag32 separately from the five common models since it is so unique, and represents a paradigm shift in knowledge modeling. Its core idea is very simple: knowledge elements can be related to one another sequentially along any number of dimensions.
In some ways, ZigZag is an extension of the structure of an ordinary spreadsheet: just as in a spreadsheet, each cell is positioned in sequence relative to other cells both horizontally and vertically, ZigZag allows a cell to participate in many such sequences. This seemingly simple extension actually has broad impact: it turns out to be more general than any of the previous five models, and each of them can be expressed by it. Its principal liability has been difficulty of understanding: users (and even researchers) steeped in the traditional paradigms sometimes struggle to break free of old assumptions. Yet if adopted on a wide scale, ZigZag's so-called "hyper-thogonal" structure offers the possibility of an ultra-flexible PKB, capable of adapting to all of a user's needs.
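A heavily simplified Python sketch may help convey the core idea: each cell has at most one neighbor in each direction along any named dimension, but may participate in as many dimensions as desired. This is our own illustration and not Nelson's implementation; `Cell`, `connect`, `rank`, and the dimension names are assumptions.

```python
class Cell:
    def __init__(self, value):
        self.value = value
        self.next = {}  # dimension name -> following cell
        self.prev = {}  # dimension name -> preceding cell


def connect(a, b, dimension):
    # Along one dimension a cell has a single successor and predecessor.
    if dimension in a.next or dimension in b.prev:
        raise ValueError(f"already connected along {dimension!r}")
    a.next[dimension] = b
    b.prev[dimension] = a


def rank(cell, dimension):
    """Return the values of the whole sequence through `cell` along a dimension."""
    head = cell
    while dimension in head.prev and head.prev[dimension] is not cell:
        head = head.prev[dimension]
    values = []
    node = head
    while node is not None:
        values.append(node.value)
        node = node.next.get(dimension)
        if node is head:  # guard against circular sequences
            break
    return values


john, mary, pat = Cell("John"), Cell("Mary"), Cell("Pat")
ann, raj = Cell("Ann"), Cell("Raj")
connect(john, mary, "neighbors")
connect(mary, pat, "neighbors")
connect(ann, john, "guests")
connect(john, raj, "guests")
```

The cell for John sits in two independent sequences at once, just as a spreadsheet cell sits in both a row and a column, except that the number of dimensions is not limited to two.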
The term "transclusion," first coined by Ted Nelson,31 has been used in several senses. In general it means including an excerpt from one document into another, such that the including document maintains some kind of reference to the included document. The simplest form of transclusion would be a simple copy-and-paste operation wherein a link to the original source was maintained. A stronger form is when the transcluded content is not copied, but referenced. This can allow any updates to the referred-to document to be instantly seen by the referring one, or, in an even more sophisticated scheme, it allows the referring document to maintain access to the transcluded content as it originally appeared, and also any more recent versions of it. (The Xanadu project design was based on this latter formulation.)
In the context of PKBs, transclusion means the ability to view the same knowledge element (not a copy) in multiple contexts. It is so pivotal because it is central to how the human mind processes information. We think associatively, and with high fan-out. I may consider John, for instance, as a neighbor, a fellow sports enthusiast, a father of small children, a Democrat, an invitee to a party, and a homeowner who owns certain power tools, all at once. Each of these different contexts places him in relationship to a different set of elements in my mind. Without delving into psychological research to examine exactly how the mind encodes such associations, it seems clear that if we are to build a comprehensive personal knowledge base containing potentially everything a person knows, it must have the ability to transclude knowledge elements.
Bush's original design of the memex explicitly prescribed the transclusion concept, for instance in his notion of a "skip trail." "The historian," he writes, "with a vast chronological account of a people, parallels it with a skip trail that stops only at the salient items." In this way, the full account of the subject can be summarized in a sort of digest that refers to select items from the original, larger trail. A modern example of transclusion is MediaWiki, the software used to host, among other sites, Wikipedia. Its use of template tags permits a source page's current contents to be dynamically included and embedded within another page.
Adding transclusion to the tree model effectively turns it into a directed acyclic graph (DAG), in which an item can have multiple parents. This is what Trigg and Halasz achieved with their extensions to the tree model.23,37
A similar mechanism can be applied to graph models, as with Tinderbox's "alias" feature. In Tinderbox, information is broken up into "notes," which can appear on the screen as spatially laid out rectangles with links between them. By creating an "alias" for a note, one can summon its appearance on a graph layout other than the one in which the note originally appeared. Compendium also allows its nodes to be present on multiple views, and the Popcorn data model13 was based entirely on transclusion. This seems closer to how the mind operates: we associate ideas with contexts, but we do not embed ideas irreversibly into the first context we happened to place them in. Tightly binding an element to its original context, therefore, seems like the wrong approach.
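The difference between viewing the same element in several contexts and copying it can be shown in a short Python sketch. It is a generic illustration that relies on object references, not the mechanism used by Tinderbox, Compendium, or Popcorn; the `Element` and `View` classes are hypothetical.

```python
class Element:
    def __init__(self, text):
        self.text = text


class View:
    """A context that refers to elements rather than copying them."""

    def __init__(self, name):
        self.name = name
        self.members = []

    def include(self, element):
        self.members.append(element)  # stores a reference, not a copy

    def render(self):
        return [element.text for element in self.members]


john = Element("John")
neighbors = View("Neighbors")
guests = View("Party guests")
neighbors.include(john)
guests.include(john)

# One edit to the element is visible in every context that includes it.
john.text = "John (moved away)"
```

Because both views hold the same element, the element has in effect two parents, which is the directed acyclic graph structure described above.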
The idea of a PKB gives rise to some important architectural considerations. While not constraining the nature of what knowledge can be expressed, the architecture nevertheless affects matters such as availability and workflow.
File based. The vast majority of solutions mentioned in this article use a simple storage mechanism based on flat files in a file system. This is true of virtually all the mind-mapping tools, concept-mapping tools, and note-taking tools, and even a number of hypertext tools (for example, NoteCards, Tinderbox). Typically, the main "unit" of a user's knowledge design (whether that be a mind map, a concept map, an outline, or a "notebook") is stored in its own file somewhere in the file system. The application can find and load such files via the familiar "File | Open..." paradigm, at which point it typically maintains the entire knowledge structure in memory.
This approach takes advantage of the average user's familiarity with file "open" and "save" operations, but does have ramifications on its utility as a PKB. Users must choose one of two basic strategies: either store all of their knowledge in a single file; or else break up their knowledge and store it across a number of different files, presumably according to subject matter and/or time period. The first choice can result in insurmountable scalability problems if the system is heavily used, while the second may force an unnatural partitioning of topics and an inability to link disparate items together.
Database based. Architectures involving a database to store user knowledge address these concerns. Knowledge elements reside in a global space, which allows any idea to relate to any other: now a user can relate a book he read on productivity not only to other books on productivity, but also to "that hotel in Orlando that our family stayed in last spring," because that is where he remembers having read the book. Though such a relationship may seem "out of bounds" in traditional knowledge organization, it is exactly the kind of retrieval path that humans often employ in retrieving memories.3
Agenda25 and gIBIS12 were two early tools that incorporated a relational database backend in their architecture. More generally, the issues surrounding storage of graph-based data in relational databases6,18 or in special-purpose databases30 have received much attention, giving PKB designers ample guidance for how to architect their persistence mechanism.
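As a minimal sketch of the global-space idea, the Python below uses the standard `sqlite3` module to keep every knowledge element in one table and every relationship in another, so that any element can be linked to any other. The table and column names are our own assumptions and are not drawn from Agenda, gIBIS, or any other cited system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for the example
conn.executescript("""
    CREATE TABLE element (id INTEGER PRIMARY KEY, text TEXT NOT NULL);
    CREATE TABLE link (
        source INTEGER NOT NULL REFERENCES element(id),
        target INTEGER NOT NULL REFERENCES element(id),
        label  TEXT
    );
""")


def add_element(text):
    cursor = conn.execute("INSERT INTO element (text) VALUES (?)", (text,))
    return cursor.lastrowid


def related(element_id):
    # Everything an element points to, regardless of subject area.
    rows = conn.execute(
        "SELECT e.text, l.label FROM link AS l "
        "JOIN element AS e ON e.id = l.target "
        "WHERE l.source = ? ORDER BY e.text",
        (element_id,),
    )
    return rows.fetchall()


book = add_element("Book on productivity")
other = add_element("Another productivity book")
hotel = add_element("Hotel in Orlando, last spring")
conn.execute("INSERT INTO link VALUES (?, ?, ?)", (book, other, "same topic"))
conn.execute("INSERT INTO link VALUES (?, ?, ?)", (book, hotel, "read while staying at"))
```

Because all elements live in one space, the link from the book to the hotel is no harder to express than the link to another book, which is not the case when the two would be stored in separate files.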
Client server. Decoupling the actual knowledge store from the PKB user interface can achieve architectural flexibility. As with all client-server architectures, the benefits include load distribution, platform interoperability, data sharing, and ubiquitous availability. Increased complexity and latency are among the liabilities, which can indeed be considerable factors in PKB design. Examples of client-server PKBs include MyBase Networking Edition (http://wjjsoft.com), and Haystack's three-tiered architecture.1
A variation of the client-server approach is of course Web-based systems, in which the client system consists of nothing but a (possibly enhanced) browser. This gives the same ubiquitous availability that client-server approaches do, while minimizing (or eliminating) the setup and installation required on each client machine.
Handheld devices. Lastly, we mention mobile devices as a possible PKB architecture. Storing all of one's personal knowledge on a handheld computer would solve the availability problem, of course, and even more completely than would a client-server or Web-based architecture. The safety of the information is an issue, since if the handheld device were to be lost or destroyed, the user could face irrevocable data loss; this is easily remedied, however, by periodically synchronizing the handheld device's contents with a host computer. More problematic are the limitations of the hardware itself. Screen real estate, processing power, and storage capacity are of course much more limited, and this hampers the overall effectiveness of such devices.
Personal knowledge management is a real and pressing problem, as the sheer number of products included in this article attests. Yet it does not appear that Vannevar Bush's dream has yet been fully realized on a wide scale. Nearly every system mentioned here has its circle of loyal adherents ("I find Tinderbox indispensable for my work and every update makes it that much more mind-blowing." "The Greatest Invention in Human History? I vote for Microsoft OneNote."). But certainly when compared with word processors, spreadsheets, or Web browsers, PKB usage lags far behind.
What would it take for a true PKB solution to appeal to a wide audience and generate the kinds of benefits Bush envisioned? Synthesizing lessons from the analysis here, the following recommendations for future research seem apparent:
To pull all these ideas together, imagine a distributed system that securely stores your personal knowledge and is available to you anywhere, anytime: from any computer, or from a handheld device that you always carry with you. Furthermore, the knowledge it contains is in a flexible form that can readily accommodate your very thoughts. It contains all the concepts you have perceived in the past and want to recall (historical events, business plans, phone numbers, scientific formulas) and does not encourage you to isolate them from one another, or to prematurely commit to a structure that you might find restrictive later. The concepts can be linked together as in a graph, clustered visually on canvases, classified in multiple categories, and/or arranged hierarchically, all for different purposes. As further information is encountered, whether from reading documents, brainstorming project plans, or just experiencing life, it is easy to assimilate into the tool, either by capturing snippets of text and relating them to what is already known, or by creating new concepts and combining them with the easily retrievable old. External documents can be linked into the knowledge structure in key places, so that they can be classified and easily retrieved. It is effortless to augment the content with annotations, and to rearrange it to reflect new understandings. And this gold mine of knowledge is always exportable in a form that is compatible with other, similar systems that have different features and price points.
Such a tool would surely be a boon to anyone who finds their own mind to be insufficient for retaining and leveraging the knowledge they acquire. As Vannevar Bush stirringly wrote: "Presumably man's spirit should be elevated if he can better review his shady past and analyze more completely and objectively his present problems."
1. Adar, E., Karger, D. and Stein, L.A. Haystack: Per-user information environments. In Proceedings of the 8th International Conference on Information and Knowledge Management, (Kansas City, MO, 1999), 413–422.
10. Canas, A.J., Hill, G., Carff, R., Suri, N., Lott, J., Gomez, G., Eskridge, T.C., Arroyo, M. and Carvajal, R. CmapTools: A knowledge modeling and sharing environment. In Proceedings of the 1st International Conference on Concept Mapping, (Pamplona, Spain, 2005), 125–133.
13. Davies, S., Allen, S., Raphaelson, J., Meng, E., Engleman, J., King, R., and Lewis, C. Popcorn: The Personal Knowledge Base. In Proceedings of the 6th ACM Conference on Designing Interactive Systems (2006), 150–159.
14. Dittrich, J.P. and Salles, M.A.V. iDM: A unified and versatile data model for personal dataspace management. In Proceedings of the 32nd International Conference on Very Large Data Bases (2006), 367–378.
17. Fertig, S., Freeman, E. and Gelernter, D. Lifestreams: An alternative to the desktop metaphor. In Proceedings of the Conference on Human Factors in Computing Systems (Vancouver, B.C., 1996), 410–411.
20. Gantz, J. The Diverse and Exploding Digital Universe. White paper. International Data Corporation, Framingham, MA, (Mar. 2008); http://www.emc.com/collateral/analyst-reports/diverse-exploding-digital-universe.pdf.
27. Marshall, C., Halasz, F.G., Rogers, R.A. and Janssen, W.C. Aquanet: A hypertext tool to hold your knowledge in place. In Proceedings of the 3rd Annual ACM Conference on Hypertext, (San Antonio, TX, 1991), 261–275.
29. McBrien, P. and Poulovassilis, A. A semantic approach to integrating XML and structured data sources. In Proceedings of the 13th International Conference on Advanced Information Systems Engineering. Springer-Verlag, London, U.K., 2001, 330–345.
35. Papakonstantinou, Y., Garcia-Molina, H., and Widom, J. Object exchange across heterogeneous information sources. In Proceedings of the 11th International Conference on Data Engineering (1995), 251–260.
38. Vdovjak, R. and Houben, G.J. RDF based architecture for semantic integration of heterogeneous information sources. In Proceedings of the Workshop on Information Integration on the Web (2001), 51–57.
Figure 6. Personal Knowbase, a note-taking system based on the category structural framework. Multiple customized user keywords (listed in far-left pane) can be assigned to each note (comprised of a small text document and a title). These keywords can be combined with Boolean operations in order to retrieve notes.
©2011 ACM 0001-0782/11/0200 $10.00
The following letter was published in the Letters to the Editor in the April 2011 CACM (http://cacm.acm.org/magazines/2011/4/106576).
I found much to agree with in Stephen Davies's article "Still Building the Memex" (Feb. 2011) but also feel one of his suggestions went off in the wrong direction. Davies imagined that a system for managing what he termed a "personal knowledge base" would be a "distributed system that securely stores your personal knowledge and is available to you anywhere..." and toward this end dismissed handheld devices due to their hardware limitations, while still including them as a way to access the networked distributed system.
Just as Davies might have imagined future built-in network capabilities able to guarantee access anywhere anytime at desirable speeds to the desired information (presumably at reasonable cost), the rest of us, too, can imagine personal devices with all the necessary capabilities and interface control. Such devices would be much closer to Vannevar Bush's Memex vision.
Bush was clearly writing about a personal machine to store one's collection of personal information, and a personal device functioning as one's extended memory would be far preferable to a networked distributed system. But why would any of us trust our personal extended memory to some networked distributed resource, given how often we are unable to find something on the Web we might have seen before?
In my own exploration of Bush's Memex vision ("Memex at 60: Internet or iPod?" Journal of the American Society for Information Science and Technology (July 2006), 1233–1242), I took a stab at how such a personal information device might be assembled and function, comparing it to a combination iPod and tablet PC, resulting in a personal information pod.
Ultimately though, I do fully agree with Davies as to the desirability of a tool that benefits any of us whose "own mind" is simply "insufficient for retaining and leveraging the knowledge [we] acquire."
Richard H. Veith
Port Murray, NJ
Veith makes a fair point. For users, what ultimately matters is whether their knowledge base is ubiquitously available and immune to data loss. The distributed solution I described provides both, but a handheld device with synchronized backups would handle them just as well.