GZIP, which stands for GNU Zip, is a file compression and decompression tool found in most Linux distributions. It compresses each file individually: every input file is replaced by its own compressed file with a “.gz” suffix. For more information, read the man page of the GZIP command by typing the following command in your Linux terminal.
man gzip
Difference Between GZIP and ZIP in Linux
Like GZIP, ZIP is a file compression and decompression tool. GZIP by itself compresses only a single file or stream and cannot bundle several files, so multiple files are usually first archived into a single tarball with tar, and the tarball is then compressed with GZIP. Because the whole tarball is compressed as one stream, this often gives a better compression ratio than ZIP. ZIP, on the other hand, compresses each file individually and then adds it to the archive. Therefore, to extract a single file from a GZIP-compressed tarball, the archive has to be decompressed up to that file, whereas ZIP can extract one file directly.
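As a rough sketch of the tarball approach, the commands below bundle a few files with tar, compress the bundle with gzip through tar's -z option, and then pull a single file back out. The directory and file names (demo, a.txt, b.txt, demo.tar.gz, extracted) are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample files.
mkdir demo
printf 'first file\n' > demo/a.txt
printf 'second file\n' > demo/b.txt

# tar bundles the files; -z runs the bundle through gzip.
tar -czf demo.tar.gz demo

# List the archive contents, then extract only one member.
tar -tzf demo.tar.gz
mkdir extracted
tar -xzf demo.tar.gz -C extracted demo/b.txt
```

Only demo/b.txt ends up under the extracted directory, although tar still has to read through the compressed stream to reach it.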
How Does Data Compression Work?
Compression works because of a condition called data redundancy. In simple terms, data redundancy means that the same piece of data is stored in multiple places. Assume you stored the following in a text file.
Now that text file has data redundancy because it contains the same pieces of data in multiple places. For example, there are five “A” characters, four “B” characters, and so on. During compression, the program keeps track of how many times each piece of data repeats and then eliminates the extra copies, keeping just one instance of each together with its count. For example, after compression the text file above would look like the following.
It contains the same information as the uncompressed file, but it is simplified and uses less space. (Each number gives the number of times the letter that follows it repeats.)
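The counting idea described above is known as run-length encoding, and it can be sketched with standard shell tools. This is only an illustration of the principle: the sample string is an assumption (five A's, four B's and two C's), the helper name rle is hypothetical, and gzip itself uses the more sophisticated DEFLATE algorithm, which combines LZ77 matching of repeated strings with Huffman coding.

```shell
# Run-length encode text read from standard input:
# count consecutive repeats of each character.
# Assumes the input contains only non-blank ASCII characters.
rle() {
    fold -w1 | uniq -c | awk '{ printf "%s%s", $1, $2 } END { print "" }'
}

# Hypothetical sample text: five A's, four B's and two C's.
encoded=$(printf 'AAAAABBBBCC\n' | rle)
echo "$encoded"   # prints 5A4B2C
```

Here fold puts each character on its own line, uniq -c counts consecutive identical lines, and awk glues each count to its character.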
Using GZIP Command in Linux
The syntax for the GZIP command is:
gzip [Options] [filenames]
touch test
gzip test
The commands above create a compressed file called test.gz and delete the original test file.
Imagine you create another file with the same name “test” and try to GZIP it. Because test.gz already exists, GZIP will report that the file already exists and ask whether to overwrite it (or skip the file when it is not running in a terminal). To compress the file anyway and overwrite the existing test.gz, use “-f” as shown below.
gzip -f test
By default, GZIP deletes the original file after compression. If you want to compress the file and keep the original, use “-k” as shown below.
gzip -k test
mkdir samplefolder
cd samplefolder
touch test1 test2 test3
mkdir samplefolder2
cd samplefolder2
touch test1 test2 test3
cd ../..
gzip -r samplefolder
With “-r”, gzip compresses every file in a folder and its subfolders. It doesn’t create one file called samplefolder.gz; instead it traverses the directory and compresses each file individually.
The “-v” flag enables verbose mode, which displays the name and percentage reduction for each file compressed or decompressed.
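A minimal sketch of verbose mode follows. The file name sample.txt and its contents are assumptions chosen only to give gzip something compressible, and the exact wording of the report may differ between gzip versions.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample file: the numbers 1 to 1000, one per line.
seq 1 1000 > sample.txt

# -v reports the file name and the percentage saved.
gzip -v sample.txt
```

After the command finishes, sample.txt has been replaced by sample.txt.gz.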
To decompress a compressed file using GZIP, use the “-d” flag.
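The round trip can be sketched like this, assuming a small hypothetical file called note.txt:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample file.
printf 'hello gzip\n' > note.txt

gzip note.txt          # creates note.txt.gz and removes note.txt
gzip -d note.txt.gz    # restores note.txt and removes note.txt.gz

cat note.txt           # prints hello gzip
```

Just as compression replaces the original file, decompression replaces the .gz file with the restored original.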
You can also trade compression speed against compression ratio using “-#”, where # is a number from 1 to 9. The default is “-6”. “-1” gives the fastest compression but the lowest compression ratio, while “-9” gives the slowest compression but the highest compression ratio.