module filehelper.compression_helper

Short summary

module pyquickhelper.filehelper.compression_helper

Functions about compressing files.

source on GitHub

Functions

function – truncated documentation

  • gzip_files – Compresses all files from an iterator in a zip file and then in a gzip file.

  • un7zip_files – Unzips files from a zip archive compressed with 7z.

  • ungzip_files – Uncompresses files from a gzip file.

  • unrar_files – Uncompresses files from a rar archive, using 7z on Windows or unrar on Linux.

  • untar_files – Uncompresses files from a tar file.

  • unzip_files – Unzips files from a zip archive.

  • zip7_files – If 7z is installed, the function uses it to compress files into 7z format. The file filename_7z must not exist. …

  • zip_files – Zips all files from an iterator.

Documentation

pyquickhelper.filehelper.compression_helper.gzip_files(filename, file_set, encoding=None, fLOG=<function noLOG>)

Compresses all files from an iterator in a zip file and then in a gzip file.

Parameters:
  • filename – final gzip file (double compression, the extension should be something like .zip.gz)

  • file_set – iterator over the files to add

  • encoding – encoding of the input files (when set, there is no double compression)

  • fLOG – logging function

Returns:

bytes (if filename is None) or None
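
The double compression described above can be illustrated with the standard library alone. This is a minimal sketch, not the pyquickhelper implementation: the helper name zip_then_gzip and the in-memory (name, content) pairs are hypothetical stand-ins for the file iterator.

```python
import gzip
import io
import zipfile


def zip_then_gzip(files):
    """Zip (name, content) pairs in memory, then gzip the zip archive."""
    zip_buffer = io.BytesIO()
    with zipfile.ZipFile(zip_buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, content in files:
            zf.writestr(name, content)
    # Second layer: gzip the bytes of the zip archive.
    return gzip.compress(zip_buffer.getvalue())


# Hypothetical sample content, only used to exercise the sketch.
payload = zip_then_gzip([("a.txt", b"hello"), ("b.txt", b"world")])
```

Returning the bytes mirrors the documented behaviour when filename is None; writing payload to a .zip.gz file would cover the other case.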

pyquickhelper.filehelper.compression_helper.un7zip_files(zipf, where_to=None, fLOG=<function noLOG>, fvalid=None, remove_space=True, cmd_line=False)

Unzips files from a zip archive compressed with 7z.

Parameters:
  • zipf – archive (or bytes or BytesIO)

  • where_to – destination folder (can be None, in which case the result is a list of tuples)

  • fLOG – logging function

  • fvalid – function which takes two paths (zip name, local name) and returns True if the file must be unzipped, False otherwise; if None, the default answer is True

  • remove_space – remove spaces (and the characters ',()) from the created local paths

  • cmd_line – use command line instead of module pylzma

Returns:

list of unzipped files

The function requires the module pylzma. See "Why module pylzma does not work?" below.

pyquickhelper.filehelper.compression_helper.ungzip_files(filename, where_to=None, fLOG=<function noLOG>, fvalid=None, remove_space=True, unzip=True, encoding=None)

Uncompresses files from a gzip file.

Parameters:
  • filename – final gzip file (double compression, the extension should be something like .zip.gz)

  • where_to – destination folder (can be None, in which case the result is a list of tuples)

  • fLOG – logging function

  • fvalid – function which takes two paths (zip name, local name) and returns True if the file must be unzipped, False otherwise; if None, the default answer is True

  • remove_space – remove spaces (and the characters ',()) from the created local paths

  • unzip – unzip file after gzip

  • encoding – encoding

Returns:

list of unzipped files
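
As a rough illustration of reading a doubly compressed .zip.gz payload, the sketch below removes the gzip layer and then lists the zip members in memory. It relies only on the standard library and is not the pyquickhelper code; gunzip_then_unzip is a hypothetical name, and returning (name, content) tuples is an assumption modelled on the where_to=None case.

```python
import gzip
import io
import zipfile


def gunzip_then_unzip(data):
    """Return [(name, content), ...] from gzip-compressed zip bytes."""
    zipped = gzip.decompress(data)
    with zipfile.ZipFile(io.BytesIO(zipped)) as zf:
        return [(info.filename, zf.read(info)) for info in zf.infolist()]


# Build a small .zip.gz payload in memory to exercise the function.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("notes.txt", b"first")
    zf.writestr("data.csv", b"1,2,3")
members = gunzip_then_unzip(gzip.compress(buffer.getvalue()))
```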

pyquickhelper.filehelper.compression_helper.unrar_files(zipf, where_to=None, fLOG=<function noLOG>, fvalid=None, remove_space=True)

Uncompresses files from a rar archive, using 7z on Windows or unrar on Linux.

Parameters:
  • zipf – archive (or bytes or BytesIO)

  • where_to – destination folder (can be None, in which case the result is a list of tuples)

  • fLOG – logging function

  • fvalid – function which takes two paths (zip name, local name) and returns True if the file must be unzipped, False otherwise; if None, the default answer is True

  • remove_space – remove spaces (and the characters ',()) from the created local paths

Returns:

list of unzipped files

pyquickhelper.filehelper.compression_helper.untar_files(filename, where_to=None, fLOG=<function noLOG>, encoding=None)

Uncompresses files from a tar file.

Parameters:
  • filename – final tar file (double compression, the extension should be something like .zip.gz)

  • where_to – destination folder (can be None, in which case the result is a list of tuples)

  • fLOG – logging function

  • encoding – encoding

Returns:

list of unzipped files
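
Reading the members of a tar archive without writing them to disk can be sketched with the standard tarfile module. This is only an illustration under assumptions, not the pyquickhelper implementation: read_tar_members is a hypothetical helper, and the (name, content) tuples are modelled on the where_to=None case.

```python
import io
import tarfile


def read_tar_members(data):
    """Return [(name, content), ...] for the regular files in tar bytes."""
    result = []
    # Mode "r:*" lets tarfile detect the compression (none, gz, bz2, xz).
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tar:
        for member in tar.getmembers():
            if member.isfile():
                result.append((member.name, tar.extractfile(member).read()))
    return result


# Build a small gzip-compressed tar archive in memory.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
    content = b"hello tar"
    info = tarfile.TarInfo(name="folder/readme.txt")
    info.size = len(content)
    tar.addfile(info, io.BytesIO(content))
members = read_tar_members(buffer.getvalue())
```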

pyquickhelper.filehelper.compression_helper.unzip_files(zipf, where_to=None, fLOG=<function noLOG>, fvalid=None, remove_space=True, fail_if_error=True)

Unzips files from a zip archive.

Parameters:
  • zipf – archive (or bytes or BytesIO)

  • where_to – destination folder (can be None, in which case the result is a list of tuples)

  • fLOG – logging function

  • fvalid – function which takes two paths (zip name, local name) and returns True if the file must be unzipped, False otherwise; if None, the default answer is True

  • remove_space – remove spaces (and the characters ',()) from the created local paths

  • fail_if_error – if True, fail when an error is encountered (typically an unusual character in a filename); otherwise a warning is issued

Returns:

list of unzipped files
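
The roles of fvalid and remove_space can be sketched with the standard zipfile module. The code below is an assumption-laden illustration rather than the library's implementation: unzip_in_memory is a hypothetical helper, the exact set of removed characters is inferred from the parameter description, and the (local name, content) tuples are modelled on the where_to=None case.

```python
import io
import zipfile


def unzip_in_memory(zip_bytes, fvalid=None, remove_space=True):
    """Return [(local_name, content), ...] for the members accepted by fvalid."""
    result = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            local_name = info.filename
            if remove_space:
                # Assumption: spaces and the characters ',() are dropped.
                for char in " ',()":
                    local_name = local_name.replace(char, "")
            # Without a filter, every file is kept (the default answer is True).
            if fvalid is None or fvalid(info.filename, local_name):
                result.append((local_name, zf.read(info)))
    return result


# Build a small archive in memory, then keep only the text file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("my report (v1).txt", b"text")
    zf.writestr("image.png", b"\x89PNG")
only_text = unzip_in_memory(
    buffer.getvalue(), fvalid=lambda zip_name, local: local.endswith(".txt"))
```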

pyquickhelper.filehelper.compression_helper.zip7_files(filename_7z, file_set, fLOG=<function noLOG>, temp_folder='.')

If 7z is installed, the function uses it to compress files into 7z format. The file filename_7z must not exist.

Parameters:
  • filename_7z – final destination

  • file_set – list of files to compress

  • fLOG – logging function

  • temp_folder – the function stores the list of files in a file in the folder temp_folder; that file is removed afterwards

Returns:

number of added files

Why module pylzma does not work?

The module pylzma failed to decompress the file produced by the latest version of 7z (2016-09-23). The compression was changed by tweaking the command line: LZMA is used instead of LZMA2. The current version does not include this commit. Or you can clone the package sdpython.pylzma and build it yourself with python setup.py bdist_wheel.
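
To make the list file and the LZMA tweak concrete, here is a hedged sketch of how such a 7z command line could be assembled. It is not taken from the pyquickhelper source: build_7z_command is a hypothetical helper, and the exact switches the library passes are an assumption. The sketch uses the 7z switches "a" (add), "-t7z", "-m0=lzma" and the "@listfile" syntax, and it only builds the command without running it.

```python
import os
import tempfile


def build_7z_command(filename_7z, file_set, temp_folder="."):
    """Write the file list to temp_folder and return (command, list_path)."""
    if os.path.exists(filename_7z):
        raise FileExistsError(f"{filename_7z!r} already exists")
    fd, list_path = tempfile.mkstemp(suffix=".txt", dir=temp_folder, text=True)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write("\n".join(file_set))
    # "a" adds files, "-m0=lzma" selects LZMA instead of LZMA2,
    # "@file" tells 7z to read the names to compress from a list file.
    command = ["7z", "a", "-t7z", "-m0=lzma", filename_7z, "@" + list_path]
    return command, list_path


with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "archive.7z")
    command, list_path = build_7z_command(target, ["a.txt", "b.txt"], folder)
    with open(list_path, encoding="utf-8") as f:
        listed = f.read().splitlines()
    os.remove(list_path)  # the list file is only temporary
```

On a machine where 7z is installed, the command could then be passed to subprocess.run(command, check=True) before the list file is removed.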

pyquickhelper.filehelper.compression_helper.zip_files(filename, file_set, root=None, fLOG=<function noLOG>)

Zips all files from an iterator.

Parameters:
  • filename – final zip file (can be None)

  • file_set – iterator over the files to add

  • root – if not None, all paths are relative to this path

  • fLOG – logging function

Returns:

number of added files (or content if filename is None)

If filename is None, the function compresses into bytes without saving the result.
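
The effect of root and of compressing into bytes can be sketched with the standard library. This is an illustration under assumptions, not the pyquickhelper code: zip_to_bytes is a hypothetical helper, and returning both the count and the bytes is a choice made for this sketch only.

```python
import io
import os
import tempfile
import zipfile


def zip_to_bytes(file_set, root=None):
    """Zip the given files in memory and return (count, zip_bytes)."""
    buffer = io.BytesIO()
    count = 0
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in file_set:
            # Store names relative to root when root is given.
            arcname = os.path.relpath(path, root) if root else path
            zf.write(path, arcname=arcname)
            count += 1
    return count, buffer.getvalue()


# Create two temporary files, one of them in a subfolder, and zip them.
with tempfile.TemporaryDirectory() as folder:
    os.makedirs(os.path.join(folder, "sub"))
    paths = [os.path.join(folder, "a.txt"), os.path.join(folder, "sub", "b.txt")]
    for path in paths:
        with open(path, "w", encoding="utf-8") as f:
            f.write("content")
    count, data = zip_to_bytes(paths, root=folder)

names = zipfile.ZipFile(io.BytesIO(data)).namelist()
```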
