Communications of the ACM

Personal knowledge questions for fallback authentication: security questions in the era of Facebook

Security questions (or challenge questions) are commonly used to authenticate users who have lost their passwords. We examined the password retrieval mechanisms for a number of personal banking websites, and found that many of them rely in part on security questions with serious usability and security weaknesses. We discuss patterns in the security questions we observed. We argue that today's personal security questions owe their strength to the hardness of an information-retrieval problem. However, as personal information becomes ubiquitously available online, the hardness of this problem, and security provided by such questions, will likely diminish over time. We supplement our survey of bank security questions with a small user study that supplies some context for how such questions are used in practice.

BSkyTree: scalable skyline computation using a balanced pivot selection

Skyline queries have gained a lot of attention for multi-criteria analysis in large-scale datasets. While existing skyline algorithms have focused mostly on exploiting data dominance to achieve efficiency, we propose that data incomparability should be treated as another key factor in optimizing skyline computation. Specifically, to optimize both factors, we first identify common modules shared by existing non-index skyline algorithms, and then analyze them to develop a cost model that guides a balanced pivot point selection. Based on the cost model, we implement our balanced pivot selection in two algorithms, BSkyTree-S and BSkyTree-P, treating both dominance and incomparability as key factors. Our experimental results demonstrate that the proposed algorithms outperform state-of-the-art skyline algorithms by up to two orders of magnitude.
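
The dominance relation at the heart of skyline computation can be illustrated with a minimal sketch: a naive block-nested-loops skyline, not the BSkyTree algorithms themselves. Names and the smaller-is-better convention are ours:

```python
def dominates(p, q):
    """p dominates q if p is no worse in every dimension and strictly
    better in at least one (assuming smaller values are preferred)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points):
    """Naive block-nested-loops skyline: keep each point that is not
    dominated by any other point.  Points that neither dominate each
    other are incomparable, and all survive."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

pts = [(1, 9), (3, 3), (9, 1), (4, 4), (2, 8)]
print(skyline(pts))  # (4, 4) is dominated by (3, 3); the rest are incomparable
```

The quadratic loop above is exactly what pivot-based methods avoid: a well-chosen pivot partitions the data so that whole regions can be skipped as dominated or processed independently as incomparable.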

An experimental investigation of set intersection algorithms for text searching

The intersection of large ordered sets is a common problem in the context of evaluating boolean queries to a search engine. In this article, we propose several improved algorithms for computing the intersection of sorted arrays, and in particular for searching sorted arrays in the intersection context. We perform an experimental comparison with the algorithms from the previous studies by Demaine, López-Ortiz, and Munro [ALENEX 2001] and by Baeza-Yates and Salinger [SPIRE 2005]; in addition, we implement and test the intersection algorithm from Barbay and Kenyon [SODA 2002] and its randomized variant [SAGA 2003]. We consider the random data set from Baeza-Yates and Salinger, the Google queries used by Demaine et al., a corpus provided by Google, and a larger corpus from the TREC Terabyte 2006 efficiency query stream, along with its own query log. We measure performance both in terms of the number of comparisons and searches performed, and in terms of CPU time on two different architectures. Our results confirm or improve the results of both previous studies in their respective contexts (the comparison model on real data, and CPU measures on random data) and extend them to new contexts. In particular, we show that value-based search algorithms perform well on posting lists in terms of the number of comparisons performed.
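
One classic adaptive strategy in this family is galloping (exponential) search: probe at doubling offsets, then binary-search the final window. The sketch below is our own illustration under our own naming, not the authors' implementation:

```python
from bisect import bisect_left

def gallop(arr, target, lo):
    """Return the first index i >= lo with arr[i] >= target, by probing
    at doubling offsets and then binary-searching the final window."""
    step = 1
    while lo + step < len(arr) and arr[lo + step] < target:
        step *= 2
    return bisect_left(arr, target, lo, min(lo + step + 1, len(arr)))

def intersect(a, b):
    """Intersect two sorted arrays, galloping through the longer one.
    Each search resumes where the previous one stopped, so the cost
    adapts to how interleaved the two arrays are."""
    if len(a) > len(b):
        a, b = b, a
    out, j = [], 0
    for x in a:
        j = gallop(b, x, j)
        if j == len(b):
            break
        if b[j] == x:
            out.append(x)
    return out

print(intersect([2, 4, 8, 16], [1, 2, 3, 8, 9, 16, 20]))  # [2, 8, 16]
```

For a short posting list intersected with a long one, galloping performs roughly logarithmically many comparisons per element of the short list, which is why value-based searches do well on real query logs.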

Adding an interactive display to a public basketball hoop can motivate players and foster community

Interactive displays that aim to engage people through play have been successfully deployed in urban environments. However, there has been little work bringing interactive displays into existing public game spaces like outdoor basketball courts. To explore this, we designed an interactive display for a public half-court basketball hoop. We studied the impact of 3 different display modes over a 10-week period through interviews with players, spectators, and passers-by. Our findings suggest 3 dimensions for the design space of such interactive displays: balancing noticeability across different user groups, support for different play actions, and support for connecting user groups. We also present 6 design tactics along these dimensions to help designers create engaging interactive displays for public game spaces.

Social audio features for advanced music retrieval interfaces

The size of personal music collections has constantly increased over recent years. As a result, the traditional metadata-based lists used to browse these collections have reached their limits. Interfaces based on music similarity offer an alternative and are thus gaining increasing attention. Music similarity is typically derived either from audio features (the objective approach) or from user-driven information sources, such as collaborative filtering or social tags (the subjective approach). Studies show that the latter techniques outperform audio-based approaches when it comes to describing perceived music similarity. However, subjective approaches typically define only pairwise relations, as opposed to the global notion of similarity given by audio-feature spaces. Many of the proposed interfaces for similarity-based music access inherently depend on this global notion and are thus not applicable to user-driven music similarity measures. The first contribution of this paper is a high-dimensional music space based on user-driven similarity measures. It combines the advantages of audio-feature spaces (a global view) with those of subjective sources, which better reflect the users' perception. The proposed space represents similarity compactly and is therefore well suited for offline use, such as in mobile applications. To demonstrate its practical applicability, the second contribution is a comprehensive mobile music player that incorporates several smart interfaces for accessing the user's music collection. Based on this application, we finally present a large-scale user study that underlines the benefits of the introduced interfaces and shows their high user acceptance.

Finding Probabilistic k-Skyline Sets on Uncertain Data

The skyline of a dataset is the set of points that are not dominated by any other point. For uncertain objects, the probabilistic skyline has been studied, which computes the objects with a high probability of being in the skyline. While useful for selecting individual objects, it is not sufficient for scenarios where we wish to compute a subset of skyline objects, i.e., a skyline set. In this paper, we generalize the notion of the probabilistic skyline to probabilistic k-skyline sets (Pk-SkylineSets), which computes k-object sets with a high probability of being a skyline set. We present an efficient algorithm for computing probabilistic k-skyline sets. It uses two heuristic pruning strategies and a novel data structure based on the classic layered range tree to compute the skyline-set probability for each instance set within a worst-case time bound. Experimental results on the real NBA dataset and on synthetic datasets show that Pk-SkylineSets is interesting and useful, and that our algorithms are efficient and scalable.
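
Under the usual discrete model, where each uncertain object materializes exactly one of its weighted instances independently of the other objects, the skyline probability of a single object can be sketched as follows. This is a naive computation for illustration, not the paper's range-tree algorithm, and the names are ours:

```python
def dominates(p, q):
    """p dominates q: no worse in every dimension, strictly better in
    at least one (smaller values preferred)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline_prob(obj, others):
    """Probability that an uncertain object is a skyline point.
    `obj` and each element of `others` are lists of (instance, prob)
    pairs; the probabilities within one object sum to 1, and objects
    materialize their instances independently."""
    total = 0.0
    for u, pu in obj:
        not_dominated = 1.0
        for other in others:
            # probability that `other` materializes an instance dominating u
            p_dom = sum(pv for v, pv in other if dominates(v, u))
            not_dominated *= 1.0 - p_dom
        total += pu * not_dominated
    return total

A = [((1, 5), 0.5), ((5, 5), 0.5)]   # two equally likely instances
B = [((2, 2), 1.0)]                  # one certain instance
print(skyline_prob(A, [B]))          # only the (1, 5) instance survives B: 0.5
```

Extending this from single objects to k-object sets multiplies the number of instance combinations, which is what the paper's pruning strategies and layered range tree are designed to tame.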

Sampling Big Trajectory Data

The increasing prevalence of sensors and mobile devices has led to an explosive growth in the scale of spatio-temporal data in the form of trajectories. A trajectory aggregate query, a fundamental operation for measuring trajectory data, retrieves statistics about the trajectories passing through a user-specified spatio-temporal region. A large-scale spatio-temporal database with big disk-resident data takes a very long time to produce exact answers to such queries. Hence, approximate query processing with a guaranteed error bound is a promising solution in many scenarios with stringent response-time requirements. In this paper, we study approximate query processing for trajectory aggregate queries. We show that the problem boils down to distinct-value estimation, which has been proven very hard, with powerful negative results when no index is available. By utilizing a well-established spatio-temporal index and introducing an inverted index over the trajectory data, we design a random index sampling (RIS) algorithm that estimates the answers with a guaranteed error bound. To further improve system scalability, we extend RIS to a concurrent random index sampling (CRIS) algorithm that processes a number of trajectory aggregate queries arriving concurrently with overlapping spatio-temporal query regions. To demonstrate the efficacy and efficiency of our sampling and estimation methods, we applied them to a real large-scale user trajectory database collected from a cellular service provider in China. Our extensive evaluation shows that both RIS and CRIS outperform exhaustive search for single and concurrent trajectory aggregate queries by two orders of magnitude in query processing time, while keeping the relative error below 10%, at only 1% of the search cost of the exhaustive method.
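
The flavor of index sampling can be conveyed with a deliberately simplified estimator: draw trajectory ids uniformly from the index, probe each sampled trajectory against the query region, and scale the hit fraction up to the population. This is a stand-in for illustration only; the paper's RIS estimator samples through a spatio-temporal inverted index and carries a guaranteed error bound, and all names here are ours:

```python
import random

def estimate_passing(traj_index, region_pred, sample_size, seed=0):
    """Crude index-sampling estimate of how many trajectories satisfy
    `region_pred`.  `traj_index` maps trajectory id -> trajectory;
    `region_pred` tests whether a trajectory passes the query region.
    Sampling with replacement keeps the sketch simple."""
    rng = random.Random(seed)
    ids = list(traj_index)
    sample = [rng.choice(ids) for _ in range(sample_size)]
    hits = sum(1 for tid in sample if region_pred(traj_index[tid]))
    return len(ids) * hits / sample_size

# Hypothetical toy data: 1,000 one-point trajectories, 100 in the region.
traj_index = {i: (i,) for i in range(1000)}
est = estimate_passing(traj_index, lambda t: t[0] < 100, sample_size=2000)
```

A standard binomial concentration argument turns the sample size into a relative-error guarantee, which is the role the error bound plays for RIS and CRIS.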

CHI '07: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems

Welcome to the CHI 2007 proceedings. We believe the technical papers and notes herein present some of the best current work in the diverse and dynamic field of human-computer interaction (HCI).

CHI is the leading HCI conference. Creating the technical program requires a huge investment of time and effort from members of the research community. 840 submissions were processed (571 papers, 269 notes), requiring over 3,000 reviews. We thank all the reviewers for the dedication with which they undertook this task. We are particularly indebted to the papers and notes program committee members, also known as the Associate Chairs (ACs). Balancing areas of expertise, ACs were selected from the field's leading researchers. The AC role included recruiting all reviewers; moderating and supervising the review process to ensure a high-quality set of reviews was obtained; initiating and organizing author rebuttals and reviewer discussions; and approving final submissions. The estimated time expenditure to serve as an AC was 11 days of full-time work; many committee members spent more time than that.

Papers ACs came to San Jose in December 2006 from around the world for two intense days of review, debate, and deliberation; Notes ACs who could not attend the parallel notes meeting in San Jose engaged in a virtual conference. The committee was extremely serious and careful in making CHI paper and note decisions, with many submissions receiving multiple discussions before and during the program committee meetings. No review process can guarantee perfect decisions, but we are confident that every possible effort was made to ensure a fair process and high-quality decision-making. This year's program committee certainly has our respect and gratitude, and deserves the sincere appreciation of the entire HCI community. We would also like to thank the ACs and their organizations for underwriting the travel expenses for the meeting.

CHI is both a journal-quality archival forum and a community-building conference. To encourage quality in the written presentation of accepted work, all of the 142 full paper and 40 note acceptances were provisional. As a result, authors actively responded and incorporated feedback from the reviews into the final versions of the papers that appear here.

Twenty-eight accepted papers and four accepted notes (5% of submissions) deemed to make an especially noteworthy contribution to human-computer interaction research were nominated by the program committee for Best Paper and Best Note Awards; these nominated papers and notes are identified in the Final Program. At the conference, up to six of these will be announced as winners of a CHI Best Paper Award (1% of submissions), and one note will be selected as an exemplary note. While all papers accepted into the CHI technical papers program have passed a rigorous examination of their quality, the Best Paper and Best Note Awards signal and reward particularly outstanding contributions each year.

Exact $L_{\infty}$ nearest neighbor search in high dimensions

We present an algorithm for solving the nearest neighbor problem with respect to the $L_{\infty}$ distance. It requires no preprocessing and storage only for the point set $P$ itself. Its average runtime, assuming that the set $P$ of $n$ points is drawn uniformly at random from the unit cube $[0,1]^{d}$, is essentially $\Theta(nd/\ln n)$, thereby improving on the brute-force method by a factor of $\Theta(1/\ln n)$. Several generalizations of the method are also presented, in particular to other “well-behaved” probability distributions and to the important problem of finding the $k$ nearest neighbors of a query point.
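
One simple way to beat the full $nd$ coordinate scan on average is early abandoning: while accumulating a candidate's coordinate-wise maximum, stop as soon as any single coordinate difference already exceeds the best $L_{\infty}$ distance found so far. The sketch below illustrates this idea under our own naming and may differ from the authors' algorithm:

```python
def nn_linf(points, q):
    """Nearest neighbor of q under the L-infinity metric, with early
    abandoning: scan each candidate's coordinates and abandon the
    candidate as soon as one coordinate difference reaches the best
    distance so far.  No preprocessing, no storage beyond the points."""
    best, best_d = None, float("inf")
    for p in points:
        d = 0
        for a, b in zip(p, q):
            diff = abs(a - b)
            if diff >= best_d:   # partial max already too large: give up on p
                d = None
                break
            if diff > d:
                d = diff
        if d is not None:        # p survived every coordinate, so d < best_d
            best, best_d = p, d
    return best, best_d

print(nn_linf([(0, 0), (3, 1), (1, 2)], (1, 1)))  # ((0, 0), 1)
```

On uniform data the running best distance shrinks quickly, so most candidates are abandoned after a few coordinates, which is the intuition behind an average cost well below $nd$.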