Information seekers tend to employ two major strategies for locating information: an analytic, or query formation, strategy and a browsing strategy. Which one they choose depends on their information needs, personal characteristics, and the system [2]. Thus the first component of our work is the provision of multiple location techniques to support statistical information seeking: a natural language processing (NLP) environment and a browsing/exploration environment (the Exploratory Overview technique). A user might also retrieve both pre-made summary tables and the raw datasets from which such tables are built (by choosing specific variables, specific values of variables, and so on). The Exploratory Overview technique enables browsing these datasets. Once users identify the table or tables of interest, they are likely to want to display a table and begin to work with it. The Table Browser provides a variety of manipulation and explanation functions to support this work. Underlying these three technical components is a rich knowledge of statistical information seeking behavior and the barriers users experience as they work with statistical tables.
Getting people to the right table or set of tables is a complex query interpretation task. The team built a statistical-query sublanguage grammar using NLP techniques [1]. This grammar will enable the system to automatically recognize predictable aspects of statistical queries and map them into the pre-established statistical query frames that, in turn, will be matched against the metadata describing the content of each table. The NLP capabilities enable accurate, efficient mapping from the elements and relationships expressed in a user’s statement of their information need to a table or tables that have the potential for addressing that information need.
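The mapping from a query to a frame, and from a frame to table metadata, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three frame slots, the tiny vocabularies, the keyword-spotting "parser" (a stand-in for the actual sublanguage grammar, which is not described here), and the catalog of table metadata.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabularies; a real sublanguage grammar would be far richer.
MEASURES = {"unemployment", "income", "population"}
REGIONS = {"ohio", "texas", "california"}


@dataclass
class QueryFrame:
    """A simplified statistical query frame with three illustrative slots."""
    measure: Optional[str] = None
    region: Optional[str] = None
    year: Optional[int] = None


def parse_query(text: str) -> QueryFrame:
    """Fill frame slots by keyword spotting (a stand-in for real NLP)."""
    frame = QueryFrame()
    for token in re.findall(r"[a-z]+|\d+", text.lower()):
        if token in MEASURES:
            frame.measure = token
        elif token in REGIONS:
            frame.region = token
        elif token.isdigit() and len(token) == 4:
            frame.year = int(token)
    return frame


def score(frame: QueryFrame, metadata: dict) -> int:
    """Count how many filled slots the table's metadata satisfies."""
    hits = 0
    if frame.measure and frame.measure in metadata["measures"]:
        hits += 1
    if frame.region and frame.region in metadata["regions"]:
        hits += 1
    if frame.year and frame.year in metadata["years"]:
        hits += 1
    return hits


def rank_tables(frame: QueryFrame, catalog: dict) -> list:
    """Return table names with at least one matching slot, best match first."""
    scored = [(score(frame, meta), name) for name, meta in catalog.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for s, name in ranked if s > 0]


# Made-up metadata for three hypothetical tables.
CATALOG = {
    "table_a": {"measures": {"unemployment"}, "regions": {"ohio", "texas"}, "years": {1999, 2000}},
    "table_b": {"measures": {"income"}, "regions": {"ohio"}, "years": {2000}},
    "table_c": {"measures": {"population"}, "regions": {"california"}, "years": {1990}},
}

frame = parse_query("What was the unemployment rate in Ohio in 2000?")
print(rank_tables(frame, CATALOG))  # ['table_a', 'table_b']
```

The point of the frame is that it separates interpretation from retrieval: once the query has been reduced to slots, matching against table metadata is a simple comparison, and partially matching tables can still be offered as candidates.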
As data volumes grow, the potential increases for user frustration, wasted network capacity, and increased server loads. We believe effective overviews and previews of databases and specific data sets can simultaneously improve the user experience and lighten system loads. The Exploratory Overview Panel allows users to see the distribution of data visually with histograms, maps, or textual lists prior to making choices and retrieving data. For example, in Figure 1, the panel enables users to make queries incrementally and visually by selecting items from a set of bar charts. Users get continuous feedback on the data distribution and result set size as they continue their selections, thereby avoiding wasted time on zero-hit or mega-hit queries.
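The incremental querying described above can be sketched as follows. This is a simplified illustration, not the panel's implementation: the records, attribute names, and function names are all hypothetical, and the bucket counts here are computed only over the records that match the current selections.

```python
from collections import Counter

# Made-up records standing in for the rows of a dataset.
RECORDS = [
    {"state": "Ohio", "year": 1999, "sector": "Retail"},
    {"state": "Ohio", "year": 2000, "sector": "Retail"},
    {"state": "Ohio", "year": 2000, "sector": "Mining"},
    {"state": "Texas", "year": 1999, "sector": "Mining"},
    {"state": "Texas", "year": 2000, "sector": "Retail"},
]


def matches(record, selections):
    """True when, for every attribute with a selection, the record's
    value is among the selected buckets."""
    return all(
        record[attr] in chosen
        for attr, chosen in selections.items()
        if chosen
    )


def preview(records, selections, attributes):
    """Return the result-set size and per-attribute bucket counts
    for the current selections, without retrieving the data itself."""
    hits = [r for r in records if matches(r, selections)]
    distributions = {a: Counter(r[a] for r in hits) for a in attributes}
    return len(hits), distributions


ATTRS = ["state", "year", "sector"]

# No selection yet: the overview shows the whole distribution.
size, dist = preview(RECORDS, {}, ATTRS)
print(size, dict(dist["state"]))  # 5 {'Ohio': 3, 'Texas': 2}

# Selecting one bucket updates every bar chart and the hit count.
size, dist = preview(RECORDS, {"state": {"Ohio"}}, ATTRS)
print(size, dict(dist["year"]))  # 3 {1999: 1, 2000: 2}

# A combination with no data is visible before any query is sent.
size, _ = preview(RECORDS, {"state": {"Ohio"}, "sector": {"Mining"}, "year": {1999}}, ATTRS)
print(size)  # 0
```

Because the counts are recomputed after each selection, a user can see that a combination would return nothing (or far too much) before committing to a retrieval.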
Our third technology concerns the representations of data that a user can access or create on the fly. The Table Browser tool ([3], Figure 2) provides basic tabular functionality (for example, “sticky” headings that do not scroll with the data, and data reorganization capabilities) and the ability to retrieve explanatory metadata (via pop-ups and mouseovers, among others), all while minimizing the perceptual and cognitive effort required of users. The tool underwent a series of usability studies and two eye-tracking studies [2]. The Java applet prototype reads XML files (of tabular data and associated metadata). It is available as an open source package from www.ils.unc.edu/idl/projects.html#stats.
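To make the idea of tabular data bundled with explanatory metadata concrete, here is a small sketch that reads such a file and produces the text a tool-tip might show. The XML layout (`table`, `column`, `note`, `row`, `cell`) is invented for this example and is not the Table Browser's actual format; the values are made up, and Python is used here only for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema and made-up values, for illustration only.
SAMPLE = """
<table title="Rate by region">
  <column id="region" label="Region"/>
  <column id="rate" label="Rate" unit="percent">
    <note>Annual average.</note>
  </column>
  <row><cell col="region">Region A</cell><cell col="rate">4.0</cell></row>
  <row><cell col="region">Region B</cell><cell col="rate">4.4</cell></row>
</table>
"""


def load_table(xml_text):
    """Parse the table title, per-column metadata, and rows."""
    root = ET.fromstring(xml_text.strip())
    columns = {}
    for col in root.findall("column"):
        columns[col.get("id")] = {
            "label": col.get("label"),
            "unit": col.get("unit"),        # None when the attribute is absent
            "note": col.findtext("note"),   # None when there is no <note>
        }
    rows = [
        {cell.get("col"): cell.text for cell in row.findall("cell")}
        for row in root.findall("row")
    ]
    return root.get("title"), columns, rows


def explain(columns, col_id):
    """Build the explanatory text a pop-up could show for a column."""
    meta = columns[col_id]
    parts = [meta["label"]]
    if meta["unit"]:
        parts.append(f"({meta['unit']})")
    if meta["note"]:
        parts.append("- " + meta["note"])
    return " ".join(parts)


title, columns, rows = load_table(SAMPLE)
print(title)                     # Rate by region
print(explain(columns, "rate"))  # Rate (percent) - Annual average.
print(rows[0])                   # {'region': 'Region A', 'rate': '4.0'}
```

Keeping the metadata in the same file as the data means an explanation can be looked up from any cell's column identifier without a second request.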
This work has also provided insights into how electronic tables (e-tables) can differ from the traditional static tabular representation by responding to a given user’s particular knowledge and needs. Users will be able to generate tables that match their needs exactly rather than retrieving a pre-made table and deriving the appropriate information from it. Within a given table, they might also be able to perform certain arithmetic, sorting, and comparison operations easily. The notion of a set of tables that a user might need to juxtapose in his or her own physical or virtual world is likely to be replaced by the completely customized table built from rawer components. In the electronic environment, an e-table and its contents (cells, rows, columns, and so on) will be linked to the metadata that provides context and explanation, thus enabling appropriate and informed usage and letting users learn as they go.
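As a rough illustration of a customized table built from rawer components, the sketch below cross-tabulates raw records on attributes chosen by the user and then applies simple sorting and arithmetic within the result. The records, attribute names, and helper functions are hypothetical.

```python
from collections import defaultdict

# Made-up raw records; each one is a single observation.
RAW = [
    {"region": "A", "year": 1999, "count": 10},
    {"region": "A", "year": 2000, "count": 12},
    {"region": "B", "year": 1999, "count": 7},
    {"region": "B", "year": 2000, "count": 9},
    {"region": "A", "year": 2000, "count": 3},
]


def build_table(records, row_attr, col_attr, value_attr):
    """Cross-tabulate: sum value_attr for each (row, column) pair."""
    cells = defaultdict(int)
    for r in records:
        cells[(r[row_attr], r[col_attr])] += r[value_attr]
    row_keys = sorted({key[0] for key in cells})
    col_keys = sorted({key[1] for key in cells})
    body = {rk: [cells.get((rk, ck), 0) for ck in col_keys] for rk in row_keys}
    return col_keys, body


def sort_rows(col_keys, body, by, descending=True):
    """Order row labels by the value in one chosen column."""
    idx = col_keys.index(by)
    return sorted(body, key=lambda rk: body[rk][idx], reverse=descending)


def difference(col_keys, body, later, earlier):
    """Per-row arithmetic: value in `later` minus value in `earlier`."""
    i, j = col_keys.index(later), col_keys.index(earlier)
    return {rk: vals[i] - vals[j] for rk, vals in body.items()}


cols, body = build_table(RAW, "region", "year", "count")
print(cols, body)                          # [1999, 2000] {'A': [10, 15], 'B': [7, 9]}
print(sort_rows(cols, body, by=2000))      # ['A', 'B']
print(difference(cols, body, 2000, 1999))  # {'A': 5, 'B': 2}
```

Swapping the row and column attributes, or choosing a different value attribute, yields a different table from the same raw records, which is the sense in which the table is generated to fit the need rather than retrieved pre-made.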
Figures
Figure 1. The data distribution information is attached to the buckets of three attributes expanded in the user view and multiple selections are made on these buckets.
Figure 2. (a) depicts the basic TB interface with a hierarchical list of tables in the upper-left pane, an extended metadata pane in the lower left, and the main table with a tool-tip explanation in the main pane. (b) shows four tables juxtaposed for comparison.