Compression

Enhanced Definition

Compression, in the mainframe context, refers to the process of reducing the physical size of data to optimize storage space on direct access storage devices (DASD) or tape, and to decrease the volume of data transmitted over networks. Its primary purpose is to improve I/O performance, reduce storage costs, and enhance data transfer efficiency by minimizing the amount of data that needs to be read, written, or sent.

Key Characteristics

    • Lossless Nature: Mainframe data compression is almost exclusively lossless, meaning that the original data can be perfectly reconstructed from the compressed version without any loss of information, which is critical for data integrity (see the round-trip sketch after this list).
    • Variety of Algorithms: z/OS supports various compression algorithms, including general-purpose ones (e.g., based on Lempel-Ziv variants) and specialized ones optimized for specific data types or hardware.
    • Hardware vs. Software: Compression can be implemented in software (e.g., by utilities like DFSMSdss, IEBCOPY, or database managers like DB2) or accelerated by dedicated hardware (e.g., zEDC Express adapter, FICON channel compression).
    • Dynamic vs. Static: Data can be compressed statically when written to storage (e.g., a compressed data set) or dynamically during I/O operations, often transparently to applications.
    • Resource Consumption: While compression saves space and I/O, the compression and decompression processes consume CPU cycles, which can be significant for large volumes of data if not offloaded to specialized hardware.
    • Applicability: Can be applied to various data types, including sequential data sets, VSAM data sets, PDS/PDSE members, DB2 tablespaces, IMS segments, and network data streams.
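
The losslessness guarantee is easy to demonstrate. Below is a minimal round-trip sketch using Python's zlib module as a stand-in for the Lempel-Ziv-family algorithms mentioned above (zlib's DEFLATE is also the format that zEDC implements in hardware); the sample record content is invented for illustration:

    import zlib

    # Any repetitive record compresses well; the content is illustrative only.
    original = b"PAYROLL MASTER 000001 DEPT 042 STATUS ACTIVE\n" * 10_000

    compressed = zlib.compress(original, level=6)
    restored = zlib.decompress(compressed)

    # Lossless: the decompressed output is byte-for-byte identical.
    assert restored == original
    print(f"{len(original):,} -> {len(compressed):,} bytes "
          f"({len(original) / len(compressed):.1f}:1)")

The assertion is the point: nothing is approximated, which is why compressed data sets remain safe for financial and operational records.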

Use Cases

    • DASD Storage Optimization: Reducing the physical space occupied by large data sets, such as logs, archives, historical data, or infrequently accessed files, thereby extending the life of existing storage and deferring upgrades.
    • Tape Storage Efficiency: Minimizing the number of tape volumes required for backups, archives, and disaster recovery, leading to reduced media costs and faster backup/restore operations (a back-of-the-envelope savings estimate follows this list).
    • Database Performance: Compressing DB2 tablespaces or IMS segments to reduce the amount of data read from DASD, which can significantly improve query response times and transaction throughput by lowering I/O latency.
    • Network Data Transfer: Accelerating data transmission between LPARs, to remote systems, or across a WAN by reducing the volume of data sent, improving network bandwidth utilization and reducing transfer times.
    • Data Migration and Copy: Speeding up data migration processes (e.g., using DFSMSdss) or copying large data sets between volumes or systems by reducing the amount of data to be moved.
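
The tape and network use cases above lend themselves to a quick feasibility check. The sketch below estimates cartridge and transfer-time savings; every input figure is a hypothetical assumption to be replaced with measured values:

    import math

    backup_tb    = 12.0   # nightly backup size, uncompressed (assumed)
    ratio        = 3.0    # assumed compression ratio; workload-dependent
    cartridge_tb = 5.0    # assumed native cartridge capacity
    link_gbps    = 8.0    # assumed effective network throughput

    compressed_tb = backup_tb / ratio

    def cartridges(tb):
        return math.ceil(tb / cartridge_tb)

    def transfer_hours(tb):
        return tb * 8_000 / link_gbps / 3_600   # TB -> gigabits -> hours

    print(f"cartridges:    {cartridges(backup_tb)} -> {cartridges(compressed_tb)}")
    print(f"transfer time: {transfer_hours(backup_tb):.1f} h -> "
          f"{transfer_hours(compressed_tb):.1f} h")

At the assumed 3:1 ratio, the same backup needs a third of the cartridges and a third of the transfer window; actual ratios vary widely by data type and must be measured per workload.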

Related Concepts

Compression is deeply integrated into the z/OS ecosystem. DFSMS components like DFSMSdss and DFSMShsm leverage compression for efficient data movement, backup, and hierarchical storage management. Database systems such as DB2 and IMS provide native compression features for their data structures, typically exploiting hardware assists such as the CMPSC dictionary-compression instruction to keep CPU overhead low. The zEDC (zEnterprise Data Compression) Express adapter is a key hardware component that provides high-performance, low-latency DEFLATE compression and decompression, significantly reducing the CPU impact on general-purpose processors; on IBM z15 and later machines, this function moved onto the processor chip as the Integrated Accelerator for zEDC. Furthermore, network protocols and hardware can apply compression to data streams, complementing storage-level compression.

Best Practices

    • Evaluate CPU vs. I/O Trade-offs: Carefully assess the CPU overhead of compression against the benefits of reduced I/O and storage savings; the level-comparison sketch after this list illustrates the trade-off. For high-volume, performance-critical workloads, hardware compression (e.g., zEDC) is often the preferred solution.
    • Monitor Compression Ratios and Performance: Regularly monitor the effectiveness of compression (compression ratio) and its impact on system performance (CPU utilization, I/O rates) to ensure optimal configuration and identify areas for improvement.
    • Choose an Appropriate Compression Method: Select the compression technique (e.g., DB2 native compression, DFSMSdss compression, zEDC) that best suits the data type, access patterns, and performance requirements of the application.
    • Test Thoroughly: Always test compression strategies in a non-production environment to understand their impact on application performance, batch run times, and resource consumption before deploying to production.
    • Consider Data Volatility: Data that is frequently updated or has a high rate of change may not be an ideal candidate for certain types of compression, because of the overhead of re-compressing modified blocks or records.
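
The first two practices can be prototyped off-platform before any z/OS configuration changes. The sketch below, again using Python's zlib, compresses the same synthetic data at several levels and reports ratio versus throughput; the record layout and levels are illustrative, and on z/OS the equivalent numbers would come from sources such as SMF and RMF reports rather than a script like this:

    import time
    import zlib

    # Synthetic sample: repetitive enough to compress, varied enough
    # that the level comparison is meaningful. Illustrative only.
    data = b"".join(
        b"REC%08d DEPT%03d AMT%09d STATUS A\n" % (i, i % 40, (i * 7919) % 10**9)
        for i in range(200_000)
    )

    for level in (1, 6, 9):
        start = time.perf_counter()
        out = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        print(f"level {level}: ratio {len(data) / len(out):4.1f}:1, "
              f"{len(data) / 1e6 / elapsed:6.1f} MB/s")

Higher levels typically buy a slightly better ratio at a steep throughput cost, which is precisely the CPU-versus-I/O decision described above and the reason hardware offload is attractive for hot paths.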

Related Vendors

    • ASE (3 products)
    • MacKinney Systems (54 products)
    • IBM (646 products)
    • Broadcom (235 products)

Related Categories

    • Compression (49 products)
    • Files and Datasets (168 products)
    • Performance (171 products)
    • Databases (211 products)