
Apache Commons Compress™

The Apache Commons Compress library defines an API for working with ar, cpio, Unix dump, tar, zip, gzip, XZ, Pack200, bzip2, 7z, arj, lzma, snappy, DEFLATE, lz4, Brotli, Zstandard, DEFLATE64 and Z files.

The code in this component has many origins:

  • The bzip2, tar and zip support came from Avalon's Excalibur and, as far as its life at Apache goes, originated in Ant. The tar package was originally Tim Endres' public domain package. The bzip2 package is based on the work done by Keiron Liddle as well as Julian Seward's libbzip2. The code has migrated via:
    Ant -> Avalon-Excalibur -> Commons-IO -> Commons-Compress.
  • The cpio package was contributed by Michael Kuss and the jRPM project.

Status

The current release is 1.20 and requires Java 7.

Below we highlight some new features. For a full list of changes, see the Changes Report.

What's new in 1.20?

  • SevenZFile now supports random access.
  • The zip package now supports split archives.
  • The tar package now supports reading sparse entries.

What's new in 1.19?

  • ParallelScatterZipCreator now writes entries in the same order they have been added to the archive.
  • ZipArchiveInputStream and ZipFile are more forgiving when parsing extra fields by default now.
  • TarArchiveInputStream has a new lenient mode that may allow it to read certain broken archives.

What's new in 1.18?

  • The CPIO package now properly handles file names using a multi-byte encoding.
  • ZipArchiveInputStream can now deal with APK files containing an APK signing block.
  • It is now possible to specify various parameters for Zstd output.

What's new in 1.17?

  • A new InputStreamStatistics interface is implemented by many streams and may be used to provide feedback or detect abnormally high compression ratios that may indicate a ZIP bomb during decompression.
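
The ratio check described above can be sketched with the JDK alone. The example below does not use InputStreamStatistics itself; it reads the equivalent counters from java.util.zip.Inflater while decompressing a zlib stream. The class name RatioGuard, its methods and the threshold are assumptions made for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

// Hypothetical helper; it is not part of Commons Compress.
class RatioGuard {

    // Compresses data into a zlib stream so the check below has input to work on.
    static byte[] deflate(byte[] data) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(bos)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Inflates the stream chunk by chunk and returns true as soon as the
    // uncompressed/compressed byte ratio seen so far exceeds maxRatio.
    static boolean exceedsRatio(byte[] compressed, long maxRatio) {
        Inflater inflater = new Inflater();
        byte[] buf = new byte[8192];
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(compressed), inflater)) {
            while (in.read(buf) != -1) {
                long consumed = Math.max(1, inflater.getBytesRead()); // compressed bytes so far
                long produced = inflater.getBytesWritten();           // uncompressed bytes so far
                if (produced / consumed > maxRatio) {
                    return true; // a caller would stop reading here
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            inflater.end(); // an Inflater passed in from outside is not released by close()
        }
    }
}
```

Checking inside the read loop, rather than after the fact, is what lets a caller abort before a suspicious stream has been fully expanded.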

What's new in 1.16.1?

  • Fixed the OSGi manifest that was broken in Compress 1.16.

What's new in 1.16?

  • Support for Zstandard compression.
  • Read-only support for DEFLATE64 compression as a stand-alone CompressorInputStream and as a method used in ZIP and 7z archives.

What's new in 1.15?

  • Added Automatic-Module-Name so the module name will be org.apache.commons.compress when the jar is used as an automatic module in Java 9+.
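
As an illustration, a modular application could refer to the jar by that name in its module descriptor. The module name com.example.app is a placeholder, not something defined by Commons Compress.

```java
// module-info.java of a hypothetical application module
module com.example.app {
    // Resolved against the Automatic-Module-Name of the Commons Compress jar.
    requires org.apache.commons.compress;
}
```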

What's new in 1.14?

  • Added support for writing the Snappy format
  • Added support for the LZ4 compression format
  • Added read-only support for Brotli decompression by using the Google Brotli decoder.

What's new in 1.13?

  • The 7z package as well as ZipArchiveOutputStream and ZipFile can now use SeekableByteChannel when random access is needed. This allows archives to be read from inputs and written to outputs that are seekable but are not represented by Files.
  • It is now possible to add Compressor- and ArchiverStream implementations using the JDK's ServiceLoader mechanism. Please see Extending Commons Compress.
  • Added support for writing the legacy LZMA format as a compressor stream and inside 7z archives. This requires XZ for Java 1.6.

What's new in 1.12?

  • Added support for the Snappy dialect used in iWork archives.
  • SevenZFile throws an IllegalStateException for empty entries.
  • BZip2CompressorOutputStream no longer tries to finish the output stream in finalize. This is a breaking change for code that relied on the finalizer.
  • Various fixes and improvements for tar, cpio and zip.

What's new in 1.11?

  • Added read-only support for BZIP2 compression used inside of ZIP archives.
  • Speed improvements in SevenZFile
  • Various fixes and improvements for tar, ar, snappy and zip.

What's new in 1.10?

  • The old org.apache.commons.compress.compressors.z._internal_ is now org.apache.commons.compress.compressors.lzw and the code is now an official part of Commons Compress' API.
  • Added support for parallel ZIP compression.
  • Added support for raw transfer of entries from one ZIP file to another without uncompress/compress.
  • Performance improvements for creating ZIP files with lots of small entries.
  • Added auto-detection for LZMA.

What's new in 1.9?

  • Support for raw DEFLATE streams.

What's new in 1.8.1?

Compress 1.8.1 is a bug fix release with fixes for the tar, ar and snappy formats as well as the IOUtils class. In addition, CompressorStreamFactory can now autodetect the .Z compress format.

What's new in 1.8?

  • Bug fixes to the tar, zip and 7z packages
  • Access to metadata when reading gzip streams
  • Finer grained control over the methods used when creating 7z archives and support for some of the most important filter methods used in 7z archives.

What's new in 1.7?

  • Read-only support for the Snappy compression.
  • Read-only support for the traditional Unix compress format used for .Z files.

What's new in 1.6?

  • Support for the 7z format.
  • Read-only support for uncompressed ARJ archives.
  • Read-only support for the "stand-alone" LZMA format.

Documentation

The compress component is split into compressors and archivers. While compressors (un)compress streams that usually store a single entry, archivers deal with archives that contain structured content represented by ArchiveEntry instances which in turn usually correspond to single files or directories.
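
To make the distinction concrete, here is a minimal sketch that uses only the JDK's java.util.zip classes rather than the Commons Compress API. gzip stands in for a compressor (one stream in, one stream out) and zip for an archiver (several named entries). The class and method names are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class; it is not part of Commons Compress.
class CompressorVsArchiver {

    // Compressor: wraps a single stream of bytes and knows nothing about entries.
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Reverses gzip(): reads the single compressed stream back into bytes.
    static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            int n;
            while ((n = in.read(buf)) != -1) {
                bos.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Archiver: stores several named entries, each with its own content.
    static byte[] zip(Map<String, byte[]> files) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bos)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                out.putNextEntry(new ZipEntry(file.getKey()));
                out.write(file.getValue());
                out.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Walks the archive entry by entry and collects the entry names in order.
    static List<String> entryNames(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

The compressor methods only ever see bytes, while the archiver methods have to iterate over entries. That is the same split the compressors and archivers packages make.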

Currently the bzip2, Pack200, XZ, gzip, lzma, brotli, Zstandard and Z formats are supported as compressors. gzip support is mostly provided by the java.util.zip package and Pack200 support by the java.util.jar package of the Java class library. XZ and lzma support is provided by the public domain XZ for Java library. Brotli support is provided by the MIT licensed Google Brotli decoder. Zstandard support is provided by the BSD licensed Zstd-jni. As of Commons Compress 1.20, support for the DEFLATE64, Z and Brotli formats is read-only.

The ar, arj, cpio, dump, tar, 7z and zip formats are supported as archivers, where the zip implementation provides capabilities that go beyond the features found in java.util.zip. As of Commons Compress 1.20, support for the dump and arj formats is read-only. 7z can read most compressed and encrypted archives but can only write unencrypted ones. LZMA(2) support in 7z requires XZ for Java as well.

The compress component provides abstract base classes for compressors and archivers together with factories that can be used to choose implementations by algorithm name. In the case of input streams the factories can also be used to guess the format and provide the matching implementation.
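
How a format can be guessed from an input stream is sketched below with the JDK only. This is not the implementation used by the Commons Compress factories, just an illustration of the general idea: peek at the leading signature bytes, rewind the stream, and map the signature to a format name. The class FormatSniffer and the returned names are assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch; the Commons Compress factories are not used here.
class FormatSniffer {

    // Leading signature bytes of a few formats, paired by index with NAMES.
    private static final byte[][] SIGNATURES = {
        {0x1f, (byte) 0x8b},                               // gzip
        {'B', 'Z', 'h'},                                   // bzip2
        {(byte) 0xfd, '7', 'z', 'X', 'Z', 0x00},           // XZ
        {'P', 'K', 0x03, 0x04},                            // zip (local file header)
        {'7', 'z', (byte) 0xbc, (byte) 0xaf, 0x27, 0x1c},  // 7z
    };
    private static final String[] NAMES = {"gz", "bzip2", "xz", "zip", "7z"};

    // Returns an illustrative format name, or null if no signature matches.
    // The stream is rewound afterwards, so it must support mark/reset.
    static String detect(InputStream in) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] head = new byte[6];
        int len = 0;
        try {
            in.mark(head.length);
            int n;
            while (len < head.length
                    && (n = in.read(head, len, head.length - len)) != -1) {
                len += n;
            }
            in.reset(); // leave the stream positioned at its first byte again
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (int i = 0; i < SIGNATURES.length; i++) {
            if (startsWith(head, len, SIGNATURES[i])) {
                return NAMES[i];
            }
        }
        return null;
    }

    // True if the first len bytes of head begin with the given signature.
    private static boolean startsWith(byte[] head, int len, byte[] signature) {
        if (len < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if (head[i] != signature[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would typically wrap a plain stream in a BufferedInputStream first, since detection of this kind only works on streams that can be rewound.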

Releases

Download now!