Downloading multiple files with IDLnetURL

You can download a file from a remote server with the Get method of the IDLnetURL object. I often need to download multiple files from a server and want to skip files that have already been downloaded. To this end, I developed the program webcopy, which takes the following arguments and keywords:

IDL> webcopy,url_files,outdir,/clobber,url_username=username,url_password=password

The first argument is an array of remote URLs to download; a single URL also works. The second (optional) argument is the output directory, which defaults to the current working directory. You will get an error if the output directory doesn’t exist or isn’t writable. By default, the program loops through the URLs and downloads each file only if a copy doesn’t already exist in the output directory. To force a fresh download of a file that already exists, use the /clobber keyword.
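
The skip-unless-clobber behavior can be sketched roughly as follows. This is not the actual webcopy source: the procedure name sketch_webcopy is made up for illustration, and it assumes the local file name is simply the last path component of the URL, which does not hold for query-string URLs.

```idl
; Hypothetical sketch, not the real webcopy: download each URL unless a
; local copy already exists, or always when /clobber is set.
pro sketch_webcopy, url_files, outdir, clobber=clobber

  compile_opt idl2

  ; Default to the current working directory.
  if n_elements(outdir) eq 0 then cd, current=outdir

  ; Stop early if the output directory is missing or not writable.
  if ~file_test(outdir, /directory) || ~file_test(outdir, /write) then $
    message, 'Output directory missing or not writable: ' + outdir

  ourl = obj_new('IDLnetURL')

  for i = 0, n_elements(url_files) - 1 do begin

    ; Assumption: the local name is the last path component of the URL.
    local = filepath(file_basename(url_files[i]), root_dir=outdir)

    ; Skip files downloaded previously, unless /clobber is set.
    if file_test(local) && ~keyword_set(clobber) then continue

    ; Get writes the file and returns the path it was written to.
    result = ourl->Get(url=url_files[i], filename=local)

  endfor

  obj_destroy, ourl

end
```

A failed transfer makes Get throw an error, so a more careful version would wrap the call in a CATCH block and destroy the object before returning.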

The program has some additional nice features:

  • It checks the “Content-Disposition” field in the HTTP response header, which servers often use to designate the name a file should be given when it is downloaded. This is particularly useful when the URL is a query string rather than a file name.
  • If the remote server requires login username and password credentials, they can be passed as keywords. This feature is useful for downloading from FTP sites, which webcopy also supports.
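
As a rough illustration of the Content-Disposition check, the function below pulls the suggested file name out of a raw response header string, such as the one IDLnetURL exposes through its RESPONSE_HEADER property after a request. The function name sketch_disposition_name is hypothetical, and webcopy itself may obtain and parse the header differently.

```idl
; Hypothetical helper: return the file name suggested by a
; Content-Disposition header line, or '' if none is present.
function sketch_disposition_name, header

  compile_opt idl2

  ; Header lines are separated by CR/LF; split on either character.
  lines = strsplit(header, string([13b, 10b]), /extract)

  ; Capture the value of filename=..., with or without double quotes.
  match = stregex(lines, 'content-disposition:.*filename="?([^";]+)', $
                  /subexpr, /extract, /fold_case)

  ; Row 1 holds the captured subexpression for each header line.
  found = where(match[1, *] ne '', count)

  return, (count gt 0) ? strtrim(match[1, found[0]], 2) : ''

end
```

After a call to Get, the header could be retrieved with ourl->GetProperty, response_header=header and passed to this function; an empty return value means the server did not suggest a name and the caller has to fall back to one derived from the URL.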

Here is an example of downloading multiple files for the HMI instrument on the Solar Dynamics Observatory (SDO), using /verbose to provide informative output:

IDL> print,files
http://sao.virtualsolar.org/cgi-bin/VSO/drms_export.cgi?series=hmi__Ic_45s;record=12852482-12852482
http://sao.virtualsolar.org/cgi-bin/VSO/drms_export.cgi?series=hmi__Ic_45s;record=12852483-12852483
http://sao.virtualsolar.org/cgi-bin/VSO/drms_export.cgi?series=hmi__Ic_45s;record=12852484-12852484
IDL> webcopy,files,/verbose
% WEBCOPY_MAIN: Successfully downloaded to /private/tmp/hmi.ic_45s.2011.05.01_00:01:30_TAI.continuum.fits
% WEBCOPY_MAIN: Successfully downloaded to /private/tmp/hmi.ic_45s.2011.05.01_00:02:15_TAI.continuum.fits
% WEBCOPY_MAIN: Successfully downloaded to /private/tmp/hmi.ic_45s.2011.05.01_00:03:00_TAI.continuum.fits

Note that the URLs here are query strings; they are translated into local file names using the Content-Disposition header field.
