The introduction of modern Smart Home Systems (SHSs) is redefining the way we perform everyday activities. Today, myriad SHS applications and the devices they control are widely available to users. Specifically, users can easily download and install apps from vendor-specific app markets, or develop their own, to implement their SHS solutions. However, despite their benefits, app-based SHSs introduce diverse security risks. Several attacks on SHSs have already been reported, and current security solutions consider smart home devices and apps only individually when detecting malicious actions, rather than the context of the SHS as a whole. Thus, the current security solutions applied to SHSs cannot capture user activities and sensor-device-user interactions in a holistic fashion. To address these limitations, in this article, we introduce A
An approach to reproducibility problems related to porting software across machines and compilers. 2021-01-25
Given the importance and social significance of passing down family stories to each generation, why do important family stories not get told? How should designers of digital family storytelling platforms address missing or incomplete parts of narratives? Drawing from the results of an interview-based, practice-oriented inquiry, we argue that non-telling should be considered an important and integral part of family storytelling. Our findings show that non-telling is not simply silence. Non-telling allows family members to observe protective and discretionary values essential to the identity-making and relational goals of family storytelling. We also show ways that a person's reticence is situated and may change over time. In our discussion, we provide design strategies for family storytelling technologies to make room for silence and incorporate the values, purposes, and practices of non-telling. 2021-01-05
Writing is a complex non-linear process that begins with a mental model of intent, and progresses through an outline of ideas, to words on paper (and their subsequent refinement). Despite past research in understanding writing, Web-scale consumer and enterprise collaborative digital writing environments have yet to benefit greatly from intelligent systems that understand the stages of document evolution and provide opportune assistance based on authors' situated actions and context. In this paper, we present three studies that explore temporal stages of document authoring. We first survey information workers at a large technology company about their writing habits and preferences, concluding that writers do in fact conceptually progress through several distinct phases while authoring documents. We also explore, qualitatively, how writing stages are linked to document lifespan. We supplement these qualitative findings with an analysis of the longitudinal user interaction logs of a popular digital writing platform over several million documents. Finally, as a first step towards facilitating an intelligent digital writing assistant, we conduct a preliminary investigation into the utility of user interaction log data for predicting the temporal stage of a document. Our results support the benefit of tools tailored to writing stages, identify primary tasks associated with these stages, and show that it is possible to predict stages from anonymous interaction logs. Together, these results argue for the benefit and feasibility of more tailored digital writing assistance. 2021-01-05
With the transformation to smarter cities and the development of technologies, a large amount of data is collected from sensors in real time. Services provided by ride-sharing systems such as taxis, mobility-on-demand autonomous vehicles, and bike sharing systems are popular. This paradigm provides opportunities for improving transportation systems’ performance by proactively allocating ride-sharing vehicles toward predicted demand. However, how to deal with uncertainties in the predicted demand probability distribution to improve the average system performance is still a challenging and unsolved task. To address this problem, we develop a data-driven distributionally robust vehicle balancing method that minimizes the worst-case expected cost. We design efficient algorithms for constructing uncertainty sets of demand probability distributions for different prediction methods and leverage a quad-tree dynamic region partition method to better capture the dynamic spatial-temporal properties of the uncertain demand. We then derive an equivalent computationally tractable form for numerically solving the distributionally robust problem. We evaluate the performance of the data-driven vehicle balancing algorithm under different demand prediction and region partition methods based on four years of taxi trip data for New York City (NYC). We show that the average total idle driving distance is reduced by 30% with the distributionally robust vehicle balancing method using quad-tree dynamic region partitions, compared with vehicle balancing methods based on static region partitions without considering demand uncertainties. This is about a 60-million-mile or an 8-million-dollar cost reduction annually in NYC. 2021-01-04
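The idea of minimizing the worst-case expected cost can be illustrated with a toy example. The sketch below is not the article's method: it assumes a finite list of candidate demand distributions in place of the constructed uncertainty sets, a handful of discrete decisions in place of the vehicle balancing problem, and plain enumeration in place of the tractable reformulation. All names and numbers are hypothetical.

```python
def expected_cost(cost_row, distribution):
    """Expected cost of one decision under one demand distribution."""
    return sum(c * p for c, p in zip(cost_row, distribution))


def robust_choice(costs, uncertainty_set):
    """Pick the decision whose worst-case expected cost is smallest.

    costs[d][s] is the (hypothetical) cost of decision d when demand
    scenario s occurs; uncertainty_set is a finite list of candidate
    probability distributions over the scenarios.
    """
    worst = {
        d: max(expected_cost(row, q) for q in uncertainty_set)
        for d, row in costs.items()
    }
    best = min(worst, key=worst.get)
    return best, worst[best]


# Two demand scenarios: demand concentrates in the north or in the south.
costs = {
    "send_north": [0, 10],
    "send_south": [10, 0],
    "split": [4, 4],
}
# The predicted distribution is uncertain: either of these could be true.
uncertainty_set = [(0.75, 0.25), (0.25, 0.75)]

decision, worst_case = robust_choice(costs, uncertainty_set)
print(decision, worst_case)  # split 4.0
```

Committing all vehicles to one side is cheaper under one of the candidate distributions but costs 7.5 in expectation under the other, so the robust choice hedges by splitting.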
Finding nearest neighbors (NN) is a fundamental operation in many diverse domains such as databases, machine learning, data mining, information retrieval, and multimedia retrieval. Due to the data deluge and the use of nearest neighbor queries in many applications where fast performance is necessary, efficient index structures are required to speed up finding nearest neighbors. Different application domains have different data characteristics and, therefore, require different types of indexing techniques. While the internal indexing and searching mechanism is generally hidden from the top-level application, it is beneficial for a data scientist to understand these fundamental operations and choose a correct indexing technique to improve the performance of the overall end-to-end workflow. Choosing the correct searching mechanism to solve a nearest neighbor query can be a daunting task, however. A wrong choice can potentially lead to low accuracy, slower execution time, or in the worst case, both. The objective of this tutorial is to present the audience with the knowledge to choose the correct index structure for specific applications. We present the state-of-the-art Nearest Neighbor (NN) indexing techniques for different data characteristics. We also present the effect, in terms of time and accuracy, of choosing the wrong index structure for different application needs. We conclude the tutorial with a discussion on the future challenges in the Nearest Neighbor search domain. 2021-01-02
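For reference, the operation that index structures accelerate can be written as an exact linear scan. The sketch below is only a baseline under assumed conditions (points as numeric tuples, Euclidean distance); it is not one of the indexing techniques covered by the tutorial, and the function name is hypothetical.

```python
import heapq
import math


def knn_brute_force(points, query, k):
    """Return the k points closest to query by scanning every point.

    Exact but linear in the number of points per query, which is the
    cost that index structures are meant to reduce.
    """
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))


points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 2.0)]
print(knn_brute_force(points, (0.9, 0.9), k=2))  # [(1.0, 1.0), (0.0, 0.0)]
```

A scan like this always returns the true neighbors, so it can serve as ground truth when measuring the accuracy of an approximate index.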
The standard measure for the quality of online algorithms is the competitive ratio. This measure is generally applicable, and for some problems it works well, but for others it fails to distinguish between algorithms that have very different performance. Thus, ever since its introduction, researchers have worked on improving the measure, defining variants, or defining measures based on other concepts to improve on the situation. Relative worst-order analysis (RWOA) is one of the most thoroughly tested such proposals. With RWOA, many separations of algorithms not obtainable with competitive analysis have been found.
In RWOA, two algorithms are compared directly, rather than indirectly as is done in competitive analysis, where both algorithms are compared separately to an optimal offline algorithm. If, up to permutations of the request sequences, one algorithm is always at least as good and sometimes better than another, then the first algorithm is deemed the better algorithm by RWOA.
We survey the most important results obtained with this technique and compare it with other quality measures. The survey includes a quite complete set of references. 2021-01-02
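The phrase "up to permutations of the request sequences" in the RWOA abstract above can be made concrete with a small brute-force sketch. The choice of bin packing with Next-Fit and First-Fit is an assumption made for illustration only and is not taken from the abstract; the full RWOA definition is asymptotic and ratio-based, whereas this sketch compares worst-order costs on a single small multiset of requests.

```python
from itertools import permutations


def next_fit(items, capacity):
    """Number of bins used by Next-Fit: only the newest bin stays open."""
    bins = 0
    remaining = 0
    for x in items:
        if x > remaining:
            bins += 1
            remaining = capacity
        remaining -= x
    return bins


def first_fit(items, capacity):
    """Number of bins used by First-Fit: use the earliest bin with room."""
    free = []
    for x in items:
        for i, room in enumerate(free):
            if x <= room:
                free[i] = room - x
                break
        else:
            free.append(capacity - x)
    return len(free)


def worst_order_cost(algorithm, items, capacity):
    """Cost of the algorithm on its worst permutation of the same requests."""
    return max(algorithm(p, capacity) for p in set(permutations(items)))


items, capacity = [3, 3, 2, 2], 5
print(next_fit(items, capacity), first_fit(items, capacity))  # 3 2
print(worst_order_cost(next_fit, items, capacity))            # 3
print(worst_order_cost(first_fit, items, capacity))           # 3
```

On the given order the two algorithms differ, but once each is evaluated on its own worst ordering of the same requests they tie on this instance. Separating two algorithms therefore requires looking across request sets, which is what the analysis does.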
We consider the problem of encoding a string of length n from an integer alphabet of size σ so that access, substring equality, and Longest Common Extension (LCE) queries can be answered efficiently. We describe a new space-optimal data structure supporting logarithmic-time queries. Access and substring equality query times can furthermore be improved to the optimal O(1) if O(log n) additional precomputed words are allowed in the total space. Additionally, we provide in-place algorithms for converting between the string and our data structure.
Using this new string representation, we obtain the first in-place subquadratic algorithms for several string-processing problems in the restore model: The input string is rewritable and must be restored before the computation terminates. In particular, we describe the first in-place subquadratic Monte Carlo solutions to the sparse suffix sorting, sparse LCP array construction, and suffix selection problems. With the sole exception of suffix selection, our algorithms are also the first running in sublinear time for small enough sets of input suffixes. Combining these solutions, we obtain the first sublinear-time Monte Carlo algorithm for building the sparse suffix tree in compact space. We also show how to build a correct version of our data structure using small working space. This leads to the first Las Vegas in-place algorithm computing the full LCP array in O(n log n) time w.h.p. and to the first Las Vegas in-place algorithms solving the sparse suffix sorting and sparse LCP array construction problems in O(n^1.5 √(log σ)) time w.h.p. 2020-12-31
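To show what substring equality and LCE queries compute, here is a minimal sketch using the standard textbook technique of Karp-Rabin prefix fingerprints with binary search. It is not the space-optimal or in-place structure described above: it stores O(n) extra words, and the class name and parameters are hypothetical. Like the Monte Carlo solutions mentioned in the abstract, it can err with small probability, here through fingerprint collisions.

```python
import random


class FingerprintLCE:
    """Monte Carlo substring equality and LCE via prefix fingerprints."""

    def __init__(self, text, base=None, mod=(1 << 61) - 1):
        self.n = len(text)
        self.mod = mod
        # A random base keeps the collision probability low for any input.
        base = random.randrange(256, mod) if base is None else base
        self.prefix = [0] * (self.n + 1)
        self.power = [1] * (self.n + 1)
        for i, ch in enumerate(text):
            self.prefix[i + 1] = (self.prefix[i] * base + ord(ch) + 1) % mod
            self.power[i + 1] = (self.power[i] * base) % mod

    def fingerprint(self, i, length):
        """Fingerprint of text[i : i + length] in O(1) time."""
        return (self.prefix[i + length]
                - self.prefix[i] * self.power[length]) % self.mod

    def equal(self, i, j, length):
        """True if the two substrings match (up to fingerprint collisions)."""
        return self.fingerprint(i, length) == self.fingerprint(j, length)

    def lce(self, i, j):
        """Length of the longest common prefix of text[i:] and text[j:]."""
        lo, hi = 0, self.n - max(i, j)
        while lo < hi:  # binary search for the longest matching length
            mid = (lo + hi + 1) // 2
            if self.equal(i, j, mid):
                lo = mid
            else:
                hi = mid - 1
        return lo


index = FingerprintLCE("abracadabra")
print(index.lce(0, 7))  # 4, the shared prefix "abra"
print(index.lce(0, 3))  # 1, the shared prefix "a"
```

Each LCE query uses O(log n) fingerprint comparisons, which matches the logarithmic query time in spirit but not the space bounds of the structure in the abstract.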
We analyze two classic variants of the T
Our first set of results is motivated by the B
Our second set of results concerns the popular k-
Emotions are an intrinsic part of the social media user experience that can evoke negative behaviors such as cyberbullying and trolling. Detecting the emotions of social media users may enable responding to and mitigating these problems. Prior work suggests this may be achievable on smartphones: emotions can be detected via built-in sensors during prolonged input tasks. We extend these ideas to a social media context featuring sparse input interleaved with more passive browsing and media consumption activities. To achieve this, we present two studies. In the first, we elicit participants' emotions using images and videos and capture sensor data from a mobile device, including data from a novel passive sensor: its built-in eye-tracker. Using this data, we construct machine learning models that predict self-reported binary affect, achieving 93.20% peak accuracy. A follow-up study extends these results to a more ecologically valid scenario in which participants browse their social media feeds. The study yields high accuracies for both self-reported binary valence (94.16%) and arousal (92.28%). We present a discussion of the sensors, features, and study design choices that contribute to this high performance and that future designers and researchers can use to create effective and accurate smartphone-based affect detection systems. 2020-12-17