Batch download files in an efficient way

Introduction

Whenever I have a need to use PowerShell, I will gladly do so. A couple of weeks ago, part of an assignment was to create a script that downloads a given set of files from the internet and sorts them into folders on the hard drive. Because there are many files, they are quite large, and their update frequency is unknown, it is much more efficient to first determine whether the version on the internet differs from the version on the local hard drive. The result is a useful script that you can schedule to run, say, once a day. It eliminates a number of manual tasks for the user, and files that haven't changed are no longer downloaded.

My test to check whether a file has changed is a comparison of the file size of the remote file with that of the file we might already have on the local hard drive. You could add more, or more advanced, checks if needed.
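As a minimal sketch of what such an extra check could look like, the helper below combines the size comparison with an optional date comparison. The function name `Test-RemoteFileChanged` and its parameters are hypothetical and not part of the original script, and the date check assumes the server reports a reliable last-modified time.

```powershell
# Hypothetical helper (not in the original script): decides whether a download is needed.
function Test-RemoteFileChanged {
    param(
        [Parameter(Mandatory)] [string] $LocalPath,
        [Parameter(Mandatory)] [long] $RemoteLength,
        [Nullable[datetime]] $RemoteLastModified = $null
    )

    # No local copy yet: always download.
    if (-not (Test-Path -LiteralPath $LocalPath -PathType Leaf)) { return $true }

    $local = Get-Item -LiteralPath $LocalPath

    # Primary check from the article: compare the file sizes.
    if ($local.Length -ne $RemoteLength) { return $true }

    # Optional extra check (an assumption): same size, but the server copy is newer.
    if ($null -ne $RemoteLastModified) {
        return ($RemoteLastModified.ToUniversalTime() -gt $local.LastWriteTimeUtc)
    }

    return $false
}
```

The remote size would come from the Content-Length header of a HEAD request, as in the script. If the server also sends a Last-Modified header, its parsed value could be passed as `-RemoteLastModified`.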

PowerShell code
# Suppress non-terminating errors so an unattended (scheduled) run keeps going
$ErrorActionPreference = 'SilentlyContinue'

# Source URL, target root folder, and the files with their matching subfolders
$Base_URL = "https://www.wimgielis.com/xlwdfiles/"
$Download_Folder = "D:\test\"
$array_Files = @(
    "excel_kruiswoordraadsel_NL.xlsm",
    "excel_kruiswoordraadsel_NL.xlsm",
    "excel_kruiswoordraadsel_NL.xlsm",
    "excel_kruiswoordraadsel_NL.xlsm")
$array_Folders = @(
    "Folder 1",
    "Folder 2",
    "Folder 3",
    "Folder 4")

# Mail settings (Office 365)
$MailTo = "YOUR_OFFICE_365_ACCOUNT"
$MailFrom = "YOUR_OFFICE_365_ACCOUNT"
$Subject = "Downloads: changed file"
$Server = "smtp.office365.com"
$Port = 587
# Named $MailPassword rather than $Pwd, which is PowerShell's automatic current-location variable
$MailPassword = "YOUR_OFFICE_365_PASSWORD"

$SmtpClient = New-Object Net.Mail.SmtpClient($Server, $Port)
$SmtpClient.EnableSsl = $true
$SmtpClient.Credentials = New-Object System.Net.NetworkCredential($MailFrom, $MailPassword)

$wc = New-Object System.Net.WebClient

$i = 0
foreach ($f in $array_Files) {
    $rf = $Base_URL + $f                           # remote file URL
    $hfld = $Download_Folder + $array_Folders[$i]  # local target folder

    # Create the target folder if it does not exist yet (alternative: md -Force $hfld)
    if (!(Test-Path $hfld)) {
        New-Item -ItemType Directory -Force -Path $hfld > $null
    }

    $hf = $hfld + "\" + $f                         # local file path

    # Size of the local copy (0 when there is none)
    $rse = 0
    if (Test-Path $hf) { $rse = (Get-Item $hf).Length }

    # Size of the remote file, read from a HEAD request
    $rs = (Invoke-WebRequest $rf -Method Head).Headers.'Content-Length'

    # Download and notify only when the sizes differ
    if ($rs -ne $rse) {
        $wc.DownloadFile($rf, $hf)
        $SmtpClient.Send($MailFrom, $MailTo, $Subject, $hf)
    }

    $i++
}

The user is notified by a customized email whenever a new file has been downloaded. No email is sent when a file has not changed. One email is sent per changed file, through the Office 365 account.
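If you want to customize the email further, one option is to build a message object per file. The sketch below is only an illustration: the helper name `New-DownloadNotification` and the subject and body wording are assumptions, not part of the original script.

```powershell
# Hypothetical helper (not in the original script): builds one notification per changed file.
function New-DownloadNotification {
    param(
        [Parameter(Mandatory)] [string] $From,
        [Parameter(Mandatory)] [string] $To,
        [Parameter(Mandatory)] [string] $FilePath
    )

    # Put the file name in the subject so each mail is recognizable at a glance.
    $name = Split-Path -Path $FilePath -Leaf
    $subject = "Downloads: changed file $name"
    $body = "A new version was downloaded to:`r`n$FilePath"

    # MailMessage(from, to, subject, body) is a standard .NET constructor.
    return New-Object System.Net.Mail.MailMessage($From, $To, $subject, $body)
}
```

The resulting object could then be handed to the script's existing client with `$SmtpClient.Send($message)`, which accepts a MailMessage as well as the four separate strings.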

If you are eager to see the PowerShell magic in action, try it with a website of your liking, your own files to download, and folders on your hard drive. Enter the parameters and check it out! For example, if the files you are downloading are zip files, you can extend the script to unzip them automatically with the Expand-Archive command. Zipping is the inverse operation and is done with Compress-Archive.
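The unzip step could look roughly like the sketch below. The helper name `Expand-DownloadedZip` and the choice to extract into a sibling folder named after the archive are assumptions for illustration. Expand-Archive itself requires PowerShell 5 or later.

```powershell
# Hypothetical helper (not in the original script): unzips a downloaded archive next to itself.
function Expand-DownloadedZip {
    param([Parameter(Mandatory)] [string] $ZipPath)

    # Only act on .zip files; leave everything else alone.
    if ([System.IO.Path]::GetExtension($ZipPath) -ne '.zip') { return $null }

    # Extract into a sibling folder named after the archive (an assumed layout).
    $parent = Split-Path -Path $ZipPath -Parent
    $target = Join-Path $parent ([System.IO.Path]::GetFileNameWithoutExtension($ZipPath))

    # -Force overwrites files left over from an earlier extraction.
    Expand-Archive -LiteralPath $ZipPath -DestinationPath $target -Force
    return $target
}
```

In the download loop, a call such as `Expand-DownloadedZip -ZipPath $hf` right after `$wc.DownloadFile($rf, $hf)` would unpack only the files that were actually refreshed.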




About Wim

Wim Gielis is a Business Intelligence consultant and Excel expert
