Communications of the ACM


Aegis+: A Context-aware Platform-independent Security Framework for Smart Home Systems

The introduction of modern Smart Home Systems (SHSs) is redefining the way we perform everyday activities. Today, myriad SHS applications and the devices they control are widely available to users. Specifically, users can easily download and install the apps from vendor-specific app markets, or develop their own, to effectively implement their SHS solutions. However, despite their benefits, app-based SHSs introduce diverse security risks. Several attacks against SHSs have already been reported, and current security solutions consider smart home devices and apps only individually to detect malicious actions, rather than the context of the SHS as a whole. Thus, the current security solutions applied to SHSs cannot capture user activities and sensor-device-user interactions in a holistic fashion. To address these limitations, in this article, we introduce Aegis+, a novel context-aware platform-independent security framework to detect malicious behavior in an SHS. Specifically, Aegis+ observes the states of the connected smart home entities (sensors and devices) for different user activities and usage patterns in an SHS and builds a contextual model to differentiate between malicious and benign behavior. We evaluated the efficacy and performance of Aegis+ in multiple smart home settings (i.e., single bedroom, double bedroom, duplex) and platforms (i.e., Samsung SmartThings, Amazon Alexa) where real users perform day-to-day activities using real SHS devices. We also measured the performance of Aegis+ against five different malicious behaviors. Our detailed evaluation shows that Aegis+ can detect malicious behavior in an SHS with high accuracy (over 95%) and secure the SHS regardless of the smart home layout and platforms, device configurations, installed apps, controller devices, and enforced user policies. Finally, Aegis+ incurs minimal overhead on the SHS, ensuring effective deployability in real-life smart environments.
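To make the abstract's core idea concrete, here is a minimal sketch of context-aware anomaly detection. This is NOT the Aegis+ implementation (which builds a richer contextual model over sensor and device states); it only illustrates the general principle of learning benign (context, transition) pairs and flagging device transitions never observed in that context. All device and sensor names below are hypothetical.

```python
# Minimal sketch of context-aware anomaly detection for smart home events.
# Not the Aegis+ model: it only illustrates learning benign
# (context, transition) pairs and flagging out-of-context transitions.
# All device names and events are hypothetical.

from collections import defaultdict

class ContextModel:
    def __init__(self):
        # context (frozenset of active sensor states) -> observed device transitions
        self.benign = defaultdict(set)

    def train(self, context, transition):
        """Record a device transition observed during benign operation."""
        self.benign[frozenset(context)].add(transition)

    def is_malicious(self, context, transition):
        """Flag a transition never seen in this context during training."""
        return transition not in self.benign[frozenset(context)]

model = ContextModel()
# Benign pattern: the front door unlocks only while hallway motion is active.
model.train({"hallway_motion:active"}, ("front_door", "unlocked"))
model.train({"hallway_motion:active", "light:on"}, ("front_door", "unlocked"))

# An app unlocking the door with no motion is out of context.
print(model.is_malicious({"hallway_motion:inactive"}, ("front_door", "unlocked")))  # True
print(model.is_malicious({"hallway_motion:active"}, ("front_door", "unlocked")))    # False
```

A real system would, as the abstract notes, also model user activity sequences and usage patterns rather than single transitions.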


Keeping science on keel when software moves

An approach to reproducibility problems related to porting software across machines and compilers.


Learning from Family Mysteries: Accounting for Untold Stories in Family Memory Practices

Given the importance and social significance of passing down family stories to each generation, why do important family stories not get told? How should designers of digital family storytelling platforms address missing or incomplete parts of narratives? Drawing from the results of an interview-based, practice-oriented inquiry, we argue that non-telling should be considered an important and integral part of family storytelling. Our findings show that non-telling is not simply silence. Non-telling allows family members to observe protective and discretionary values essential to the identity-making and relational goals of family storytelling. We also show ways that a person's reticence is situated and may change over time. In our discussion, we provide design strategies for family storytelling technologies to make room for silence and incorporate the values, purposes, and practices of non-telling.


Characterizing Stage-aware Writing Assistance for Collaborative Document Authoring

Writing is a complex non-linear process that begins with a mental model of intent, and progresses through an outline of ideas, to words on paper (and their subsequent refinement). Despite past research in understanding writing, Web-scale consumer and enterprise collaborative digital writing environments are yet to greatly benefit from intelligent systems that understand the stages of document evolution, providing opportune assistance based on authors' situated actions and context. In this paper, we present three studies that explore temporal stages of document authoring. We first survey information workers at a large technology company about their writing habits and preferences, concluding that writers do in fact conceptually progress through several distinct phases while authoring documents. We also explore, qualitatively, how writing stages are linked to document lifespan. We supplement these qualitative findings with an analysis of the longitudinal user interaction logs of a popular digital writing platform over several million documents. Finally, as a first step towards facilitating an intelligent digital writing assistant, we conduct a preliminary investigation into the utility of user interaction log data for predicting the temporal stage of a document. Our results support the benefit of tools tailored to writing stages, identify primary tasks associated with these stages, and show that it is possible to predict stages from anonymous interaction logs. Together, these results argue for the benefit and feasibility of more tailored digital writing assistance.


Data-driven Distributionally Robust Optimization For Vehicle Balancing of Mobility-on-Demand Systems

With the transformation to smarter cities and the development of technologies, a large amount of data is collected from sensors in real time. Services provided by ride-sharing systems such as taxis, mobility-on-demand autonomous vehicles, and bike sharing systems are popular. This paradigm provides opportunities for improving transportation systems’ performance by allocating ride-sharing vehicles toward predicted demand proactively. However, how to deal with uncertainties in the predicted demand probability distribution for improving the average system performance is still a challenging and unsolved task. Considering this problem, in this work, we develop a data-driven distributionally robust vehicle balancing method to minimize the worst-case expected cost. We design efficient algorithms for constructing uncertainty sets of demand probability distributions for different prediction methods and leverage a quad-tree dynamic region partition method for better capturing the dynamic spatial-temporal properties of the uncertain demand. We then derive an equivalent computationally tractable form for numerically solving the distributionally robust problem. We evaluate the performance of the data-driven vehicle balancing algorithm under different demand prediction and region partition methods based on four years of taxi trip data for New York City (NYC). We show that the average total idle driving distance is reduced by 30% with the distributionally robust vehicle balancing method using quad-tree dynamic region partitions, compared with vehicle balancing methods based on static region partitions without considering demand uncertainties. This amounts to roughly a 60-million-mile, or an 8-million-dollar, annual cost reduction in NYC.
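The distributionally robust objective the abstract describes can be illustrated with a toy example: choose an allocation minimizing the worst-case expected cost over a finite uncertainty set of demand distributions. This is only a sketch of the concept; the paper derives a tractable reformulation rather than enumerating, and all numbers below are hypothetical.

```python
# Toy illustration of distributionally robust optimization (not the
# paper's method): pick the vehicle allocation minimizing the WORST-CASE
# expected cost over an uncertainty set of demand distributions.
# All numbers are hypothetical.

from itertools import product

REGIONS = 2
FLEET = 4
# Demand scenarios: one demand value per region.
scenarios = [(1, 3), (3, 1), (2, 2)]
# Uncertainty set: candidate probability distributions over the scenarios.
uncertainty_set = [
    (0.6, 0.2, 0.2),
    (0.2, 0.6, 0.2),
    (0.2, 0.2, 0.6),
]

def cost(alloc, demand):
    # Unserved demand as a cost proxy (idle driving distance plays this
    # role in the real system).
    return sum(max(d - a, 0) for a, d in zip(alloc, demand))

def worst_case_expected_cost(alloc):
    return max(
        sum(p * cost(alloc, s) for p, s in zip(dist, scenarios))
        for dist in uncertainty_set
    )

# Enumerate every way to split the fleet across regions.
allocations = [a for a in product(range(FLEET + 1), repeat=REGIONS) if sum(a) == FLEET]
best = min(allocations, key=worst_case_expected_cost)
print(best, worst_case_expected_cost(best))
```

The robust allocation hedges against all three candidate distributions at once, which is exactly the guarantee the abstract's worst-case formulation provides.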


Exploring State-of-the-Art Nearest Neighbor (NN) Search Techniques

Finding nearest neighbors (NN) is a fundamental operation in many diverse domains such as databases, machine learning, data mining, information retrieval, and multimedia retrieval. Due to the data deluge and the need for fast performance in the many applications that issue nearest neighbor queries, efficient index structures are required to speed up finding nearest neighbors. Different application domains have different data characteristics and, therefore, require different types of indexing techniques. While the internal indexing and searching mechanism is generally hidden from the top-level application, it is beneficial for a data scientist to understand these fundamental operations and choose a correct indexing technique to improve the performance of the overall end-to-end workflow. Choosing the correct searching mechanism to solve a nearest neighbor query can be a daunting task, however. A wrong choice can potentially lead to low accuracy, slower execution time, or in the worst case, both. The objective of this tutorial is to present the audience with the knowledge to choose the correct index structure for specific applications. We present the state-of-the-art Nearest Neighbor (NN) indexing techniques for different data characteristics. We also present the effect, in terms of time and accuracy, of choosing the wrong index structure for different application needs. We conclude the tutorial with a discussion on the future challenges in the Nearest Neighbor search domain.
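As a concrete baseline for the indexing techniques the tutorial surveys, here is a textbook 2-D k-d tree nearest-neighbor search, checked against brute force. It is a generic illustration, not any specific index discussed in the tutorial.

```python
# Illustrative pure-Python 2-D k-d tree nearest-neighbor search, checked
# against brute force. A generic textbook structure, not a specific
# index from the tutorial.

import random

def build_kdtree(points, depth=0):
    if not points:
        return None
    axis = depth % 2
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
        "axis": axis,
    }

def dist2(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def nn(node, query, best=None):
    if node is None:
        return best
    if best is None or dist2(query, node["point"]) < dist2(query, best):
        best = node["point"]
    axis = node["axis"]
    diff = query[axis] - node["point"][axis]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nn(near, query, best)
    # Descend the far side only if the splitting plane could hide a closer point.
    if diff ** 2 < dist2(query, best):
        best = nn(far, query, best)
    return best

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(200)]
tree = build_kdtree(pts)
q = (0.5, 0.5)
assert nn(tree, q) == min(pts, key=lambda p: dist2(q, p))  # matches brute force
```

The plane-distance pruning test is the step that makes the search sublinear on low-dimensional data; its effectiveness degrades in high dimensions, which is one reason the choice of index matters.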


Relative Worst-order Analysis: A Survey

The standard measure for the quality of online algorithms is the competitive ratio. This measure is generally applicable, and for some problems it works well, but for others it fails to distinguish between algorithms that have very different performance. Thus, ever since its introduction, researchers have worked on improving the measure, defining variants, or proposing measures based on other concepts. Relative worst-order analysis (RWOA) is one of the most thoroughly tested such proposals. With RWOA, many separations of algorithms not obtainable with competitive analysis have been found.

In RWOA, two algorithms are compared directly, rather than indirectly as is done in competitive analysis, where both algorithms are compared separately to an optimal offline algorithm. If, up to permutations of the request sequences, one algorithm is always at least as good and sometimes better than another, then the first algorithm is deemed the better algorithm by RWOA.

We survey the most important results obtained with this technique and compare it with other quality measures. The survey includes a fairly comprehensive set of references.
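The comparison the survey describes can be computed directly on small instances. The sketch below, an illustration of the definition rather than a proof technique, takes each algorithm's worst cost over all permutations of a request sequence and compares those worst-order costs, here for LRU versus flush-when-full (FWF) paging.

```python
# Toy illustration of the relative worst-order comparison: for each
# algorithm, take its worst cost over all permutations of the request
# sequence, then compare those worst costs. LRU vs. flush-when-full
# (FWF) paging; both have competitive ratio k, so competitive analysis
# alone does not separate them.

from itertools import permutations

def faults(requests, k, policy):
    """Count page faults for a size-k cache under 'lru' or 'fwf'."""
    cache = []
    count = 0
    for r in requests:
        if r in cache:
            if policy == "lru":
                cache.remove(r)
                cache.append(r)      # move to most-recently-used position
        else:
            count += 1
            if len(cache) == k:
                if policy == "fwf":
                    cache = []       # flush-when-full empties the cache
                else:
                    cache.pop(0)     # LRU evicts the least recently used
            cache.append(r)
    return count

def worst_order_cost(seq, k, policy):
    return max(faults(p, k, policy) for p in permutations(seq))

seq = (1, 2, 3, 1, 2, 1)
print(worst_order_cost(seq, 2, "lru"), worst_order_cost(seq, 2, "fwf"))  # 5 6
```

On this instance FWF's worst ordering costs strictly more than LRU's, the kind of distinction RWOA formalizes asymptotically over families of sequences.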


Optimal Substring Equality Queries with Applications to Sparse Text Indexing

We consider the problem of encoding a string of length n from an integer alphabet of size σ so that access, substring equality, and Longest Common Extension (LCE) queries can be answered efficiently. We describe a new space-optimal data structure supporting logarithmic-time queries. Access and substring equality query times can furthermore be improved to the optimal O(1) if O(log n) additional precomputed words are allowed in the total space. Additionally, we provide in-place algorithms for converting between the string and our data structure.

Using this new string representation, we obtain the first in-place subquadratic algorithms for several string-processing problems in the restore model: The input string is rewritable and must be restored before the computation terminates. In particular, we describe the first in-place subquadratic Monte Carlo solutions to the sparse suffix sorting, sparse LCP array construction, and suffix selection problems. With the sole exception of suffix selection, our algorithms are also the first running in sublinear time for small enough sets of input suffixes. Combining these solutions, we obtain the first sublinear-time Monte Carlo algorithm for building the sparse suffix tree in compact space. We also show how to build a correct version of our data structure using small working space. This leads to the first Las Vegas in-place algorithm computing the full LCP array in O(n log n) time w.h.p. and to the first Las Vegas in-place algorithms solving the sparse suffix sorting and sparse LCP array construction problems in O(n^1.5 √log σ) time w.h.p.
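The standard technique underlying Monte Carlo substring equality is Karp-Rabin fingerprinting: linear-time precomputation of prefix fingerprints lets any two substrings be compared in O(1) with small, bounded error probability. The sketch below illustrates that classic idea only; it is not the paper's space-optimal structure.

```python
# Karp-Rabin fingerprinting: the classic idea behind Monte Carlo
# substring equality (an illustration, not the paper's space-optimal
# data structure). Prefix fingerprints allow O(1) substring comparison
# with a small error probability.

import random

MOD = (1 << 61) - 1  # a Mersenne prime keeps collisions rare

class Fingerprinter:
    def __init__(self, s):
        self.base = random.randrange(2, MOD - 1)
        self.pref = [0] * (len(s) + 1)   # pref[i] = fingerprint of s[:i]
        self.pow = [1] * (len(s) + 1)    # pow[i] = base**i mod MOD
        for i, c in enumerate(s):
            self.pref[i + 1] = (self.pref[i] * self.base + ord(c)) % MOD
            self.pow[i + 1] = (self.pow[i] * self.base) % MOD

    def fp(self, i, j):
        """Fingerprint of s[i:j], computed from two prefix values."""
        return (self.pref[j] - self.pref[i] * self.pow[j - i]) % MOD

    def equal(self, i, j, length):
        """Monte Carlo test: s[i:i+length] == s[j:j+length]?"""
        return self.fp(i, i + length) == self.fp(j, j + length)

s = "abracadabra"
f = Fingerprinter(s)
print(f.equal(0, 7, 4))   # "abra" vs "abra" -> True
print(f.equal(0, 1, 3))   # "abr" vs "bra" -> False (w.h.p.)
```

Equal substrings always compare equal; unequal ones collide with probability at most about length/MOD per query, which is the Monte Carlo guarantee the abstract's algorithms build on.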


Fine-grained Complexity Analysis of Two Classic TSP Variants

We analyze two classic variants of the TRAVELING SALESMAN PROBLEM (TSP) using the toolkit of fine-grained complexity.

Our first set of results is motivated by the BITONIC TSP problem: given a set of n points in the plane, compute a shortest tour consisting of two monotone chains. It is a classic dynamic-programming exercise to solve this problem in O(n^2) time. While the near-quadratic dependency of similar dynamic programs for LONGEST COMMON SUBSEQUENCE and DISCRETE FRÉCHET DISTANCE has recently been proven to be essentially optimal under the Strong Exponential Time Hypothesis, we show that bitonic tours can be found in subquadratic time. More precisely, we present an algorithm that solves bitonic TSP in O(n log^2 n) time and its bottleneck version in O(n log^3 n) time. In the more general pyramidal TSP problem, the points to be visited are labeled 1, …, n and the sequence of labels in the solution is required to have at most one local maximum. Our algorithms for the bitonic (bottleneck) TSP problem also work for the pyramidal TSP problem in the plane.
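For reference, the classic O(n^2) dynamic program the abstract mentions is sketched below. The paper's contribution is beating this bound; this is only the textbook quadratic baseline.

```python
# The classic O(n^2) dynamic program for bitonic TSP (the textbook
# baseline; the paper's contribution is a subquadratic algorithm).

from math import dist, inf

def bitonic_tsp(points):
    """Shortest bitonic tour over points with distinct x-coordinates."""
    pts = sorted(points)
    n = len(pts)
    # b[i][j] (i < j): min total length of two disjoint x-monotone chains
    # covering pts[0..j], one ending at pts[i], the other at pts[j].
    b = [[inf] * n for _ in range(n)]
    b[0][1] = dist(pts[0], pts[1])
    for j in range(2, n):
        for i in range(j - 1):
            # pts[j-1] and pts[j] lie on the same chain.
            b[i][j] = b[i][j - 1] + dist(pts[j - 1], pts[j])
        # pts[j] extends the chain ending at some earlier pts[i].
        b[j - 1][j] = min(b[i][j - 1] + dist(pts[i], pts[j]) for i in range(j - 1))
    # Close the tour by joining the two rightmost chain ends.
    return b[n - 2][n - 1] + dist(pts[n - 2], pts[n - 1])

# Triangle: the only tour is its perimeter, 2 + 2*sqrt(2).
print(bitonic_tsp([(0, 0), (1, 1), (2, 0)]))
```

The inner min over all earlier chain ends is what costs quadratic time overall; the paper's subquadratic algorithm avoids recomputing it from scratch.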

Our second set of results concerns the popular k-OPT heuristic for TSP in the graph setting. More precisely, we study the k-OPT decision problem, which asks whether a given tour can be improved by a k-OPT move that replaces k edges in the tour by k new edges. A simple algorithm solves k-OPT in O(n^k) time for fixed k. For 2-OPT, this is easily seen to be optimal. For k=3, we prove that an algorithm with a runtime of the form Õ(n^(3−ε)) exists if and only if ALL-PAIRS SHORTEST PATHS in weighted digraphs has such an algorithm. For general k-OPT, it is known that a runtime of f(k) · n^(o(k/log k)) would contradict the Exponential Time Hypothesis. The results for k=2,3 may suggest that the actual time complexity of k-OPT is Θ(n^k). We show that this is not the case, by presenting an algorithm that finds the best k-move in O(n^(⌊2k/3⌋+1)) time for fixed k ≥ 3. This implies that 4-OPT can be solved in O(n^3) time, matching the best-known algorithm for 3-OPT. Finally, we show how to beat the quadratic barrier for k=2 in two important settings, namely, for points in the plane and when we want to solve 2-OPT repeatedly.
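The simple quadratic algorithm for the 2-OPT decision problem mentioned above can be sketched directly: try every pair of tour edges and check whether swapping them shortens the tour. The distance matrix below is hypothetical.

```python
# The simple O(n^2) algorithm for the 2-OPT decision problem: does
# replacing some pair of tour edges with two new edges shorten the tour?
# A sketch on a symmetric distance matrix; the instance is hypothetical.

def best_2opt_move(tour, d):
    """Return (gain, i, j) for the best 2-OPT move, or None if 2-optimal."""
    n = len(tour)
    best = None
    for i in range(n - 1):
        for j in range(i + 1, n):
            a, b = tour[i], tour[(i + 1) % n]
            c, e = tour[j], tour[(j + 1) % n]
            # Replace edges (a,b) and (c,e) with (a,c) and (b,e),
            # i.e., reverse the tour segment between b and c.
            gain = d[a][b] + d[c][e] - d[a][c] - d[b][e]
            if gain > 1e-12 and (best is None or gain > best[0]):
                best = (gain, i, j)
    return best

# 4-city instance where the tour 0-2-1-3 "crosses itself" (length 6),
# while 0-1-2-3 has length 4.
d = [
    [0, 1, 2, 1],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [1, 2, 1, 0],
]
move = best_2opt_move([0, 2, 1, 3], d)
print(move)  # a positive-gain move exists
```

Adjacent edge pairs always yield zero gain on a symmetric instance, so the check needs no special-casing; this exhaustive pair scan is exactly the O(n^k) baseline the abstract refers to for k=2.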


How Do You Feel Online: Exploiting Smartphone Sensors to Detect Transitory Emotions during Social Media Use

Emotions are an intrinsic part of the social media user experience that can evoke negative behaviors such as cyberbullying and trolling. Detecting the emotions of social media users may enable responding to and mitigating these problems. Prior work suggests this may be achievable on smartphones: emotions can be detected via built-in sensors during prolonged input tasks. We extend these ideas to a social media context featuring sparse input interleaved with more passive browsing and media consumption activities. To achieve this, we present two studies. In the first, we elicit participants' emotions using images and videos and capture sensor data from a mobile device, including data from a novel passive sensor: its built-in eye-tracker. Using this data, we construct machine learning models that predict self-reported binary affect, achieving 93.20% peak accuracy. A follow-up study extends these results to a more ecologically valid scenario in which participants browse their social media feeds. The study yields high accuracies for both self-reported binary valence (94.16%) and arousal (92.28%). We present a discussion of the sensors, features, and study design choices that contribute to this high performance and that future designers and researchers can use to create effective and accurate smartphone-based affect detection systems.
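The modeling pipeline the abstract describes, sensor-derived features predicting self-reported binary affect, can be sketched in miniature. This is purely illustrative: the paper's models, sensors, and feature sets are far richer, and the feature names and data below are invented for the example.

```python
# Illustrative sketch only (the paper's models and features are richer):
# a tiny logistic-regression classifier mapping hypothetical smartphone
# sensor features (e.g., mean touch pressure, gaze-fixation rate) to
# binary affect labels, trained with stochastic gradient descent.

import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def train(X, y, lr=0.5, epochs=2000):
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi                 # gradient of the log loss
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    return int(sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5)

# Hypothetical feature vectors: [touch_pressure, fixation_rate]
X = [[0.2, 0.1], [0.3, 0.2], [0.8, 0.9], [0.9, 0.7]]
y = [0, 0, 1, 1]                         # 0 = negative, 1 = positive affect
w, b = train(X, y)
print([predict(w, b, x) for x in X])     # fits the training set
```

In practice such models would be evaluated with held-out participants and cross-validation, as the accuracies reported in the abstract are.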