Data-Intensive Scalable Computing Laboratory (DISCL)
The Data-Intensive Scalable Computing Laboratory (DISCL) at Texas Tech University has broad research interests in parallel and distributed computing, high-performance
computing, cloud computing, computer architectures, and systems software, with a focus on building scalable computing systems for data-intensive applications in
high-performance scientific computing and high-end enterprise computing.
To view all of our publications and other information about our research group, please visit our website.
Scalable I/O Architectures for Data-Intensive Computing
High-performance computing (HPC) has entered the post-petascale era and is quickly approaching the exaflop range. Many scientific computing applications and engineering simulations in critical areas of research, such as nanotechnology, astrophysics, climate modeling and weather forecasting, drug discovery, petroleum engineering, and high-energy physics, are becoming increasingly data-intensive. I/O has become a crucial performance bottleneck of high-performance computing, especially for data-intensive applications. There is a critical and widening gap between applications' I/O demands and the HPC I/O system's capability, which can lead to severe overall performance degradation. New mechanisms and new I/O architectures need to be developed to solve this 'I/O wall' problem. In this research, we investigate new solutions for building a next-generation I/O architecture that scales well and meets applications' growing I/O demands. We explore new storage devices, including flash-based solid-state drives (SSDs), general storage-class memory (SCM), and hybrid storage systems, to build the hardware component of the new I/O architecture, and we explore new designs in parallel file systems, parallel I/O middleware, and parallel programming models to build its software component. The objective of this research is to provide a new I/O architecture that fundamentally addresses the I/O bottleneck for data-intensive high-performance computing.
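To illustrate one piece of such an architecture, the sketch below shows a hybrid-storage placement policy that routes small or random requests to a flash tier and large sequential requests to a disk tier. This is a minimal illustration of the general idea, not any specific DISCL design; the tier names and the 64 KiB threshold are assumptions made for the example.

```python
SSD_THRESHOLD = 64 * 1024  # illustrative cutoff: requests below this favor flash


def place_request(size_bytes, sequential):
    """Choose a storage tier for a single I/O request.

    Small or random requests go to the SSD tier, which handles them with
    low latency; large sequential requests go to the HDD tier, which
    streams them efficiently.
    """
    if size_bytes < SSD_THRESHOLD or not sequential:
        return "ssd"
    return "hdd"
```

A real hybrid system would also migrate data between tiers based on access history, but even this static policy captures why mixing device types can help data-intensive workloads.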
Intelligent I/O Optimizations for High-End Computing
The widely adopted multicore/manycore architectures have significantly increased the computational performance of high-end computers. Compared to advances in processor technology, however, data-access performance has improved orders of magnitude more slowly, which significantly limits the sustained deliverable performance of high-end computing systems. Data-access delay, rather than computational speed, has become the major concern in high-end computing. In this research, we investigate intelligent techniques to improve I/O access efficiency for high-end computing. We explore novel data prefetching, data caching, access scheduling, and data layout optimization techniques that considerably reduce I/O access latency and improve I/O bandwidth. The objective of this research is threefold: 1) increasing the fundamental understanding of I/O behavior in high-end computing applications; 2) providing novel data-access optimization techniques using scheduling, caching, prefetching, and layout strategies based on that understanding; and 3) improving the productivity of high-end computing applications.
Multicore Architectures and Data-Access Optimizations
The advent of multicore processors has completely changed the landscape of computing by bringing task-level parallel processing into a single processor. On one hand, it further widens the performance gap between data processing and data access. On the other hand, it calls for a rethinking of system design to exploit the potential of multicore architectures. We believe the key to utilizing multicore processors is reducing data-access delay, and in this research we focus on reducing that delay for multicore architectures. Our approach is threefold: special hardware for swift data access, core-aware and context-aware scheduling and prefetching, and integrated cache management. We have introduced the data access history cache architecture to support dynamic hardware prefetching and smart data management, developed core-aware memory access scheduling, and integrated cache management with prefetching and scheduling. Many issues remain open, however, and we continue to explore data-access optimization techniques for multicore architectures in this project.
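In the spirit of the access-history-driven hardware prefetching described above, the sketch below models a per-instruction history table that detects a repeating stride and predicts the next address. The table layout and field names are assumptions made for the illustration; an actual data access history cache would be a hardware structure with bounded capacity.

```python
class StrideHistoryTable:
    """Track (last address, last stride) per instruction to drive prefetching."""

    def __init__(self):
        self.table = {}    # pc -> (last_addr, last_stride)

    def access(self, pc, addr):
        """Record a memory access; return a prefetch address, or None."""
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = (addr, 0)    # first sighting: no stride yet
            return None
        last_addr, last_stride = entry
        stride = addr - last_addr
        self.table[pc] = (addr, stride)
        # Prefetch only after the same non-zero stride repeats, which
        # filters out one-off jumps in the access stream.
        if stride != 0 and stride == last_stride:
            return addr + stride
        return None
```

A load instruction walking an array with 8-byte elements, for instance, establishes a stride of 8 after two accesses, and every subsequent access triggers a prefetch one element ahead.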