Communications of the ACM

Effective reinforcement learning through evolutionary surrogate-assisted prescription

There is now significant historical data available on decision making in organizations, consisting of the decision problem, what decisions were made, and how desirable the outcomes were. Using this data, it is possible to learn a surrogate model, and with that model, evolve a decision strategy that optimizes the outcomes. This paper introduces a general approach of this kind, called Evolutionary Surrogate-Assisted Prescription, or ESP. The surrogate is, for example, a random forest or a neural network trained with gradient descent, and the strategy is a neural network that is evolved to maximize the predictions of the surrogate model. ESP is further extended in this paper to sequential decision-making tasks, which makes it possible to evaluate the framework in reinforcement learning (RL) benchmarks. Because the majority of evaluations are done on the surrogate, ESP is more sample efficient and has lower variance and lower regret than standard RL approaches. Surprisingly, its solutions are also better because both the surrogate and the strategy network regularize the decision-making behavior. ESP thus forms a promising foundation for decision optimization in real-world problems.
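
The surrogate-then-evolve idea described in this abstract can be sketched on a toy problem. This is a minimal illustration under loud assumptions, not the paper's implementation: the surrogate here is a k-nearest-neighbor average (swapped in for the random forest or neural network the abstract mentions), the strategy is a two-parameter linear rule rather than a neural network, and the names true_outcome, make_history, surrogate, fitness, and evolve are all hypothetical.

```python
import random

def true_outcome(context, action):
    # Hidden ground truth, used only to generate history: best action is 2 * context.
    return -(action - 2.0 * context) ** 2

def make_history(rng, n=200):
    # Synthetic historical records of (context, action, outcome).
    rows = []
    for _ in range(n):
        c = rng.uniform(-1.0, 1.0)
        a = rng.uniform(-3.0, 3.0)
        rows.append((c, a, true_outcome(c, a)))
    return rows

def surrogate(history, context, action, k=5):
    # Assumed surrogate: mean outcome of the k nearest historical records.
    nearest = sorted(
        history, key=lambda r: (r[0] - context) ** 2 + (r[1] - action) ** 2
    )[:k]
    return sum(r[2] for r in nearest) / k

def fitness(history, weights, contexts):
    # Score the linear strategy action = w0 + w1 * context on the surrogate only.
    w0, w1 = weights
    return sum(surrogate(history, c, w0 + w1 * c) for c in contexts) / len(contexts)

def evolve(history, rng, contexts, generations=30, offspring=8, sigma=0.3):
    # Simple elitist evolution: keep a mutated child only if the surrogate rates it higher.
    best = (0.0, 0.0)
    best_fit = fitness(history, best, contexts)
    for _ in range(generations):
        for _ in range(offspring):
            child = (best[0] + rng.gauss(0, sigma), best[1] + rng.gauss(0, sigma))
            f = fitness(history, child, contexts)
            if f > best_fit:
                best, best_fit = child, f
    return best, best_fit

rng = random.Random(0)
history = make_history(rng)
contexts = [i / 5.0 - 1.0 for i in range(11)]
initial_fit = fitness(history, (0.0, 0.0), contexts)
strategy, evolved_fit = evolve(history, rng, contexts)
print("evolved strategy weights:", strategy)
print("surrogate score before/after:", initial_fit, evolved_fit)
```

Every candidate strategy is scored against the surrogate, never against true_outcome, which is the property the abstract credits for sample efficiency.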


Advanced statistical analysis of empirical performance scaling

Theoretical running time complexity analysis is a widely adopted method for studying the scaling behaviour of algorithms. However, theoretical analysis remains intractable for many high-performance, heuristic algorithms. Recent advances in statistical methods for empirical running time scaling analysis have shown that many state-of-the-art algorithms can achieve significantly better scaling in practice than expected. However, current techniques have only been successfully applied to study algorithms on randomly generated instance sets, since they require instances that can be grouped into "bins", where each instance in a bin has the same size. In practice, real-world instance sets with this property are rarely available. We introduce a novel method that overcomes this limitation. We apply our method to a broad range of scenarios and demonstrate its effectiveness by revealing new insights into the scaling of several prominent algorithms; e.g., the SAT solver lingeling often appears to achieve sub-polynomial scaling on prominent bounded model checking instances, and the training times of scikit-learn's implementation of SVMs scale as a lower-degree polynomial than expected (≈ 1.51 instead of 2).
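
For orientation, the simplest form of empirical scaling analysis fits a power law t = a * n^b to measured running times. The sketch below is an assumption-laden illustration, not the method introduced in the paper: it uses plain least squares in log-log space, the helper name fit_power_law is hypothetical, and the measurements are synthetic. It does show that such a fit needs no grouping of instances into equal-size bins.

```python
import math

def fit_power_law(sizes, times):
    # Ordinary least squares on (log n, log t): log t = log a + b * log n.
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b

# Synthetic, made-up measurements that follow t = 0.002 * n^1.5 exactly.
# The instance sizes are deliberately irregular (no two share a "bin").
sizes = [130, 257, 410, 999, 2048, 5003]
times = [0.002 * n ** 1.5 for n in sizes]
a, b = fit_power_law(sizes, times)
print("fitted constant a:", a)
print("fitted exponent b:", b)
```

A point estimate like this says nothing about uncertainty; a statistical treatment would add confidence intervals around the fitted exponent.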


Space-Efficient k-d Tree-Based Storage Format for Sparse Tensors

Computations with tensors are widespread in many scientific areas. The tensors involved are usually very large but sparse, i.e., the vast majority of their elements are zero. The space complexity of sparse tensor storage formats varies significantly. For overall efficiency, it is important to reduce the execution time and additional space requirements of the initial preprocessing (i.e., converting tensors from common storage formats to the given internal format).

The main contributions of this paper are threefold. Firstly, we present a new storage format for sparse tensors, which we call the succinct k-d tree-based tensor (SKTB) format. We compare the space complexity of common tensor storage formats and of the SKTB format and demonstrate the viability of using a tree as a data structure for sparse tensors. Secondly, we present a parallel space-efficient algorithm for converting tensors to the SKTB format. Thirdly, we demonstrate the computational efficiency of the proposed format in sparse tensor-vector multiplication.
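
To make the storage trade-off concrete, here is a minimal sketch of a coordinate-style sparse tensor and a tensor-vector multiplication over it. It does not implement the SKTB format, whose layout the abstract does not describe; the coordinate representation, the function name ttv_coo, and the example tensor are all assumptions chosen for illustration.

```python
from collections import defaultdict

def ttv_coo(entries, vector):
    # Contract the last mode of a 3rd-order tensor with a vector:
    # Y[i, j] = sum over k of X[i, j, k] * vector[k]
    # Only the stored non-zero elements are visited.
    result = defaultdict(float)
    for (i, j, k), value in entries.items():
        result[(i, j)] += value * vector[k]
    return dict(result)

# Hypothetical 100 x 100 x 100 tensor with only three non-zero elements,
# stored as a mapping from coordinates to values.
shape = (100, 100, 100)
entries = {(0, 0, 1): 2.0, (0, 0, 3): 4.0, (99, 5, 0): -1.0}
vector = [float(k) for k in range(shape[2])]

y = ttv_coo(entries, vector)

# Rough space comparison: a dense array holds every cell, while the
# coordinate form holds three indices plus one value per non-zero element.
dense_cells = shape[0] * shape[1] * shape[2]
coo_numbers = len(entries) * (len(shape) + 1)
print(y[(0, 0)], dense_cells, coo_numbers)
```

The coordinate form repeats full index tuples for every element; tree-based formats such as the one proposed aim to reduce that overhead.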


Spying on the Floating Point Behavior of Existing, Unmodified Scientific Applications

Scientific (and other) applications are critically dependent on calculations done using IEEE floating point arithmetic. A number of concerns have been raised about correctness in such applications given the numerous gotchas the IEEE standard presents for developers, as well as the complexity of its implementation at the hardware and compiler levels. The standard and its implementations do provide mechanisms for analyzing floating point arithmetic as it executes, making it possible to find and track problematic operations. However, this capability is seldom used in practice. In response, we have developed FPSpy, a tool that provides this capability when operating underneath existing, unmodified x64 application binaries on Linux, including those using thread- and process-level parallelism. FPSpy can observe application behavior without any cooperation from the application or developer, and can potentially be deployed as part of a job launch process. We present the design, implementation, and performance evaluation of FPSpy. FPSpy operates conservatively, getting out of the way if the application itself begins to use any of the OS or hardware features that FPSpy depends on. Its overhead can be throttled, allowing a tradeoff between which and how many unusual events are to be captured, and the slowdown incurred by the application, with the low point providing virtually zero slowdown. We evaluated FPSpy by using it to methodically study seven widely used applications/frameworks from a range of domains (five of which are in the NSF XSEDE top-20), as well as the NAS and PARSEC benchmark suites. All told, these comprise about 7.5 million lines of source code in a wide range of languages and parallelism models (including OpenMP and MPI). FPSpy was able to produce trace information for all of them. The traces show that problematic floating point events occur in both the applications and the benchmarks.
Analysis of the rounding behavior captured in our traces also suggests the feasibility of an approach to adding adaptive precision underneath existing, unmodified binaries.


Advancing Computational Reproducibility in the Dataverse Data Repository Platform

Recent reproducibility case studies have raised concerns showing that much of the deposited research has not been reproducible. One of their conclusions was that the way data repositories store research data and code cannot fully facilitate reproducibility due to the absence of a runtime environment needed for the code execution. New specialized reproducibility tools provide cloud-based computational environments for code encapsulation, thus enabling research portability and reproducibility. However, they often do not enable research discoverability, standardized data citation, or long-term archival like data repositories do. This paper addresses the shortcomings of data repositories and reproducibility tools and how they could be overcome to remedy the current lack of computational reproducibility in published and archived research outputs.


VizSciFlow: A Visually Guided Scripting Framework for Supporting Complex Scientific Data Analysis

Scientific workflow management systems, such as Galaxy, Taverna, and Workspace, have been developed to automate scientific workflow management and are increasingly being used to accelerate the specification, execution, visualization, and monitoring of data-intensive tasks. For example, the popular bioinformatics platform Galaxy is installed on over 168 servers around the world and the social networking space myExperiment shares almost 4,000 Galaxy scientific workflows among its 10,665 members. Most of these systems offer graphical interfaces for composing workflows. However, while graphical languages are considered easier to use, graphical workflow models are more difficult to comprehend and maintain as they become larger and more complex. Text-based languages are considered harder to use but have the potential to provide a clean and concise expression of workflow even for large and complex workflows. A recent study showed that some scientists prefer script/text-based environments to perform complex scientific analysis with workflows. Unfortunately, such environments are unable to meet the needs of scientists who prefer graphical workflows. To address the needs of both types of scientists while retaining script-based workflow models for their underlying benefits, we propose a visually guided workflow modeling framework that combines interactive graphical user interface elements in an integrated development environment with the power of a domain-specific language to compose independently developed and loosely coupled services into workflows. Our domain-specific language provides scientists with a clean, concise, and abstract view of workflow to better support workflow modeling. As a proof of concept, we developed VizSciFlow, a generalized scientific workflow management system that can be customized for use in a variety of scientific domains. As a first use case, we configured and customized VizSciFlow for the bioinformatics domain.
We conducted three user studies to assess its usability, expressiveness, efficiency, and flexibility. Results are promising; in particular, our user studies show that users find VizSciFlow more desirable than either Python or Galaxy for solving complex scientific problems.


Audience Management Practices of Live Streamers on Twitch

Live streaming is a unique medium that merges different layers of communication by facilitating individual, group, and mass communication simultaneously. Streamers who broadcast themselves on live streaming platforms such as Twitch are their own media entity and have the challenge of having to manage interactions with many different types of online audiences beyond the translucent platform interfaces. Through qualitative interviews with 25 Twitch streamers, in this paper we share streamers’ practices of discovering audience composition, categorizing audience groups, and developing appropriate mechanisms to interact with them despite geographical, technological, and temporal limitations. We discuss streamers’ appropriation of real-time signals provided by these platforms as sources of information, and their dependence on both technology and voluntary human labor to scale their media entity. We conclude with design recommendations for streaming platforms to provide streamer-centric tools for audience management, especially for knowledge discovery and growth management.


Augmenting TV Viewing using Acoustically Transparent Auditory Headsets

This paper explores how acoustically transparent auditory headsets can improve TV viewing by intermixing headset and TV audio, facilitating personal, private auditory enhancements and augmentations of TV content whilst minimizing occlusion of the sounds of reality. We evaluate the impact of synchronously mirroring select audio channels from the 5.1 mix (dialogue, environmental sounds, and the full mix), and selectively augmenting TV viewing with additional speech (e.g., Audio Description, Director's Commentary, and Alternate Language). For TV content, auditory headsets enable better spatialization and more immersive, enjoyable viewing; the intermixing of TV and headset audio creates unique listening experiences; and private augmentations offer new ways to (re)watch content with others. Finally, we reflect on how these headsets might facilitate more immersive augmented TV viewing experiences within reach of consumers.


Assessing Social Text Placement in Mixed Reality TV

TV experiences are often social, be it at-a-distance (through text) or in-person (through speech). Mixed Reality (MR) headsets offer new opportunities to enhance social communication during TV viewing by placing social artifacts (e.g., text) anywhere the viewer wishes, rather than being constrained to a smartphone or TV display. In this paper, we use VR as a test-bed to evaluate different text locations for MR TV specifically. We introduce the concepts of wall messages, below-screen messages, and egocentric messages in addition to state-of-the-art on-screen messages (i.e., subtitles) and controller messages (i.e., reading text messages on the mobile device) to convey messages to users during TV viewing experiences. Our results suggest that a) future MR systems that aim to improve viewers’ experience need to consider the integration of a communication channel that does not interfere with viewers’ primary task, that is, watching TV, and b) independent of the location of text messages, users prefer to be in full control of them, especially when reading and responding to them. Our findings pave the way for further investigations into social at-a-distance communication in Mixed Reality.