SQLUNINTERRUPTED

I am just a medium, SQL Server the Goal

Tag Archives: compression

Zipping and Unzipping Files – The Fast and Furious Way

Recently while working on a customer project, we were required to zip and unzip hundreds of files of varying size. The objective was to import IIS Logs from over 100+ servers to Microsoft Analytics platform system for further analytics. The SLA’s we were working on was 1 hour, i.e. APS should have data available with a maximum latency of 1 hour. The processing required us to copy Logs from these IIS servers to a central NAS and from there to the APS landing zone. The hop to the central NAS was required because of some regulatory requirements at the client.

The IIS logs which were getting created every 1 hour, were anywhere between 1 MB – 50 MB, depending on the activity on the servers. Coping the raw files was out of question for obvious reasons. So we had to zip the files (thankfully IIS logs give tremendous compression) and copy them to the central NAS. From there copy and unzip the files on the Landing Zone

We tired multiple options like, calling exe’s (rar, unrar, 7-zip) from PowerShell, but that did not scale well. Eventually we settled on the .NET Compression routine (.NET 4 and above) to this.  We saw overwhelming results when using .NET routines (~13-15 times better execution speed) during compression and decompression.

For the purpose of this blog, I created a test folder, and filled it with 204 different types of files (images from our last trip, some PowerPoint presentations etc.) of varying size (max size 7 MB, min size 5 KB), for a total size of 320 MB.

Compression Results using rar/7-zip

$files = Get-ChildItem -Recurse "C:\Intel\BlogTests\TestFolder"
$starttime = Get-Date
foreach ($file in $files)
{
    $ZipUtility = 'C:\Program Files\WinRAR\rar.exe'
    $fileToCompress = $file.FullName
    $outputFile =  "C:\Intel\BlogTests\ZipFiles\" +$file.Name.Replace(".","_") + ".rar" 
    $zipCommandparam = " a " + $outputFile + " " + $fileToCompress
    $ps = Start-Process -FilePath $ZipUtility -ArgumentList $zipCommandparam -NoNewWindow -PassThru -Wait
    $ps.WaitForExit() # block till exe finish
}
$endTime = Get-Date
Write-Host "Time to Complete Using RAR -: "$endTime.Subtract($starttime).ToString()

Results With Rar.exe  —> Total Duration – 00:03:28 (3 minutes, 28 Seconds)image
Results With 7-zip.exe —> Total Duration – 00:03:48 (3 minutes 48 Seconds)

Compression Results using .Net Compression Routine


[System.Reflection.Assembly]::LoadWithPartialName('System.IO.Compression.FileSystem')
$files = Get-ChildItem -Recurse "C:\Intel\BlogTests\TestFolder"
$starttime = Get-Date
foreach ($file in $files)
{
$fileToCompress = $file.FullName
$directorypath = "C:\Intel\BlogTests\ZipFiles\" + $file.Name.Substring(0,$file.Name.LastIndexOf("."))
$outputzipfile =  "C:\Intel\BlogTests\ZipFiles\" + $file.Name.Replace(".","_") + ".zip"

#Write-Host $directory
$directory = [IO.Directory]::CreateDirectory($directorypath)
$filetocopy = $directorypath + "\" + $file.Name
[IO.File]::Copy($file.FullName,$filetocopy)
[System.IO.Compression.ZipFile]::CreateFromDirectory($directorypath,$outputzipfile)
[IO.Directory]::Delete($directorypath,$true)
}
$endTime = Get-Date
Write-Host "Time to Complete Using .Net Routine -: "$endTime.Subtract($starttime).ToString()

Results With .Net Routine —> Total Duration – 00:00:17 (17 Seconds)
image_thumb.png

We achieved similar results while decompressing the files.

Advertisement