NDM 2014 > Workshop Program

Accepted Papers

Proceedings ISBN: 978-1-4799-7019-3

5 papers were accepted out of 11 submissions.

Paper 4:
Adaptation and Policy-Based Resource Allocation for Efficient Bulk Data Transfers in High Performance Computing Environments [slides] [paper]
Ann L. Chervenak (USC/ISI), Alex Sim (LBNL), Junmin Gu (LBNL), Robert Schuler (USC/ISI) and Nandan Hirpathak (USC)

Abstract: Many science applications increasingly make use of data-intensive methods that require bulk data movement, such as staging of large datasets in preparation for analysis on shared computational resources, remote access to large datasets, and data dissemination. Over the next 5 to 10 years, these datasets are projected to grow to exabytes of data, and continued scientific progress will depend on efficient methods for data movement between high performance computing centers. We study two techniques that improve the use of available resources for large, long-running, multi-file transfers. First, we show the effect of adaptation of transfer parameters for multi-file transfers, where the adaptation is based on recent performance. Second, we use Virtual Organization and site policies to influence the allocation of resources, such as available transfer streams, to clients. We show that these techniques improve completion times for large multi-file data transfers by approximately 20% on resource-constrained infrastructure.
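The first technique above, adapting transfer parameters to recently observed performance, can be sketched as follows (a minimal illustration, not the authors' implementation; the 5% thresholds and the stream limits are invented for the example):

```python
# Sketch: adapt the number of concurrent transfer streams for a
# multi-file transfer based on recently measured throughput.

def adapt_streams(current_streams, recent_gbps, previous_gbps,
                  min_streams=1, max_streams=16):
    """Increase parallelism while throughput keeps improving;
    back off once adding streams stops helping."""
    if recent_gbps > previous_gbps * 1.05:    # still improving
        return min(current_streams + 1, max_streams)
    if recent_gbps < previous_gbps * 0.95:    # regressing
        return max(current_streams - 1, min_streams)
    return current_streams                    # hold steady
```

The policy-based part of the paper would additionally cap `max_streams` per client according to Virtual Organization and site policy.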

Paper 8:
Analysis of the Effect of Core Affinity on High-Throughput Flows [slides] [paper]
Nathan Hanford (UC Davis), Vishal Ahuja (UC Davis), Matthew Farrens (UC Davis), Dipak Ghosal (UC Davis), Mehmet Balman (LBNL), Eric Pouyoul (ESnet, LBNL) and Brian Tierney (ESnet, LBNL)

Abstract: Network throughput is scaling up to higher data rates while end-system processors are scaling out to multiple cores. In order to optimize high-speed data transfer into multicore end-systems, techniques such as network adapter offloads and performance tuning have received a great deal of attention. Furthermore, several methods of multithreading the network receive process have been proposed. However, thus far attention has been focused on how to set the tuning parameters and which offloads to select for higher performance, and little has been done to understand why the settings do (or do not) work. In this paper we build on previous research to track down the source(s) of the end-system bottleneck for high-speed TCP flows. For the purposes of this paper, we consider protocol processing efficiency to be the amount of system resources used (such as CPU and cache) per unit of achieved throughput (in Gbps). The amounts of the various system resources consumed are measured using low-level system event counters. Affinitization, or core binding, is the decision about which processor cores on an end system are responsible for interrupt, network, and application processing. We conclude that affinitization has a significant impact on protocol processing efficiency, and that the performance bottleneck of the network receive process changes drastically with three distinct affinitization scenarios.
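Affinitization as described above can be illustrated with the Linux-only `os.sched_setaffinity` call from the Python standard library (a generic sketch of core binding, not the instrumentation used in the paper; the core numbers are arbitrary placeholders):

```python
# Sketch: bind the current process to a chosen set of CPU cores,
# so application processing runs only on those cores.
import os

def pin_to_cores(cores):
    """Restrict the calling process (pid 0 = self) to the given cores."""
    os.sched_setaffinity(0, set(cores))

if hasattr(os, "sched_setaffinity"):   # Linux-only API
    pin_to_cores({0, 1})               # e.g. application on cores 0-1
    print(os.sched_getaffinity(0))
```

Interrupt affinity (which core services the NIC's interrupts) is set separately, e.g. via `/proc/irq/<n>/smp_affinity` on Linux; the paper's three scenarios vary how these assignments relate to each other.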

Paper 10:
Flexible Scheduling and Control of Bandwidth and In-transit Services for End-to-End Application Workflows [slides] [paper]
Mehmet Fatih Aktas (Rutgers University), Georgiana Haldeman (Rutgers University) and Manish Parashar (Rutgers University)

Abstract: Emerging end-to-end scientific applications that integrate high-end experiments and instruments with large-scale simulations and end-user displays require complex couplings and data sharing between distributed components, involving large data volumes and varying hard (in-time data delivery) and soft (in-transit processing) quality of service (QoS) requirements. As a result, efficient data transport is a key requirement of such workflows. In this paper, we leverage software-defined networking (SDN) to address issues of data transport service control and resource provisioning to meet varying QoS requirements from multiple coupled workflows sharing the same service medium. Specifically, we present a flexible control and a disciplined resource scheduling approach for data transport services of science networks. Furthermore, we emulate an SDN testbed on top of the FutureGrid virtualized testbed and use it to evaluate our approach for a realistic scientific workflow. Our results show that SDN-based control and resource scheduling based on simple intuitive models can meet the coupling requirement with high resource utilization.

Paper 11:
Towards Energy Awareness in Hadoop [slides] [paper]
Krish K.R. (Virginia Tech), M. Safdar Iqbal (Virginia Tech), M. Mustafa Rafique (IBM Research Ireland) and Ali R. Butt (Virginia Tech)

Abstract: With the rise in the use of data centers comprised of commodity clusters for data-intensive applications, the energy efficiency of these setups is becoming a paramount concern for data center operators. Moreover, applications developed for the Hadoop framework, which has now become a de facto implementation of the MapReduce framework, now comprise complex workflows that are managed by specialized workflow schedulers, such as Oozie. These schedulers assume cluster resources to be homogeneous and often consider data locality to be the only scheduling constraint. However, this is increasingly not the case in modern data centers. The addition of low-power computing devices and regular hardware upgrades have made heterogeneity the norm, in that clusters are now comprised of several logical sub-clusters, each with its own performance and energy profile. In this paper we present εSched, a workflow scheduler that profiles the performance and the energy characteristics of applications on each hardware sub-cluster in a heterogeneous cluster in order to improve the application-resource match while ensuring energy efficiency and performance-related Service Level Agreement (SLA) goals. εSched borrows from our earlier work, φSched, a hardware-aware scheduler that improves the resource-application match to improve application performance. We evaluate εSched on three clusters with different hardware configurations and energy profiles, where each sub-cluster comprises five homogeneous nodes. Our evaluation of εSched shows that application performance and power characteristics vary significantly across different hardware configurations. We show that hardware-aware scheduling can perform 12.8% faster, while saving 21% more power than hardware-oblivious scheduling for the studied applications.
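The profile-driven matching described above can be caricatured as follows (a hypothetical sketch; `pick_subcluster`, the profile numbers, and the single-deadline SLA model are all invented for illustration and are not εSched's actual algorithm):

```python
# Sketch: given per-sub-cluster profiles of an application, pick the
# lowest-energy sub-cluster that still meets the SLA deadline.

def pick_subcluster(profiles, deadline_s):
    """profiles: {name: (runtime_s, energy_j)} measured per sub-cluster.
    Returns the feasible sub-cluster with minimum energy, or None."""
    feasible = {name: (t, e) for name, (t, e) in profiles.items()
                if t <= deadline_s}
    if not feasible:
        return None
    return min(feasible, key=lambda name: feasible[name][1])

profiles = {"atom":  (900, 40_000),   # slow, frugal
            "xeon":  (300, 90_000),   # fast, power-hungry
            "mixed": (450, 60_000)}
print(pick_subcluster(profiles, deadline_s=600))   # -> 'mixed'
```

With a looser deadline the frugal sub-cluster wins; with a tight one only the fast hardware qualifies, which is the performance/energy trade-off the scheduler navigates.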

Paper 12:
Towards Managed Terabit/s Scientific Data Flows [slides] [paper]
Artur Barczyk (Caltech), Azher Mughal (Caltech), Harvey Newman (Caltech), Iosif Legrand (Caltech), Michael Bredel (Caltech), Ramiro Voicu (Caltech), Vlad Lapadatescu (Caltech) and Tony Wildish (Princeton University)

Abstract: Scientific collaborations on a global scale, such as the LHC experiments at CERN, rely today on the presence of high performance, high availability networks. In this paper we review the developments performed over the last several years on high-throughput applications, multilayer software-defined network path provisioning, path selection and load balancing methods, and the integration of these methods with the mainstream data transfer and management applications of CMS, one of the major LHC experiments. These developments are folded into a compact system capable of moving data among research sites at the 1 Terabit per second scale. Several aspects that went into the design and target different components of the system are presented, including: evaluation of 40 and 100 Gbps capable hardware on both the network and server side, data movement applications, flow management, and the network-application interface leveraging advanced network services. We report on comparative results between several multi-path algorithms and the performance increase obtained using this approach, and present results from the related SC'13 demonstration.