
Sending a HEAD request with IDLnetURL

The IDLnetURL object provides a GET method for downloading files but no HEAD method. I often find it convenient to check whether a remote file is available, or to examine its properties (such as its size in bytes), before downloading it. After some experimenting, I came up with a solution that uses the IDLnetURL callback function feature to read the HTTP response header without downloading the whole file. The program is called webhead.pro.

Here is an example of its use:

IDL> f="http://sohowww.nascom.nasa.gov/pickoftheweek/old/17oct2014/Oct_C3_halo.jpg"
IDL> h=webhead(f) 
IDL> print,h 

HTTP/1.1 200 OK 
Date: Thu, 30 Oct 2014 15:46:31 GMT 
Server: Apache/2.4.10 (Unix) 
Last-Modified: Fri, 17 Oct 2014 20:05:21 GMT 
ETag: "24b46-505a3e28764e6" 
Accept-Ranges: bytes 
Content-Length: 150342 
Content-Type: image/jpeg
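
Since the file size is one of the properties mentioned above, here is a minimal sketch of pulling the Content-Length value out of the returned header. The procedure name webhead_size_demo is hypothetical and not part of webhead.pro, and the sketch assumes the header comes back either as a single string or as an array of lines.

```idl
PRO webhead_size_demo, url

  ; Fetch only the response header (webhead is the function from this post)
  h = webhead(url)

  ; Capture the digits after "Content-Length:". /FOLD_CASE is used because
  ; HTTP header field names are case-insensitive.
  m = STREGEX(h, 'content-length: *([0-9]+)', /SUBEXPR, /EXTRACT, /FOLD_CASE)

  ; m[0,*] holds the full match, m[1,*] the captured number (as a string)
  ok = WHERE(m[1,*] NE '', count)

  IF count GT 0 THEN BEGIN
    nbytes = LONG64(m[1,ok[0]])
    PRINT, 'Remote file size (bytes): ', nbytes
  ENDIF ELSE BEGIN
    PRINT, 'No Content-Length found in header'
  ENDELSE

END
```

Calling webhead_size_demo, f with the URL above should report the size given in the Content-Length line of the header.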

Here is how it works. First, I create a callback function that the GET method will call:

FUNCTION WEBHEAD_CALLBACK, status, progress  

;-- since we only need the response header, we just read until
;   the first set of non-zero bytes is encountered.

;-- return 0 to return to main caller

IF (progress[0] EQ 1) && (progress[2] GT 0) then RETURN,0

;-- otherwise return 1 to keep reading

RETURN,1

END

When the GET method invokes this function, it passes in the vector PROGRESS, of which the callback uses two elements:

PROGRESS[0] – Contains 1 when the Progress array contains valid data, or 0 when the array does not contain valid data

PROGRESS[2] – Contains the number of bytes that have been downloaded during the current GET

As soon as the first bytes arrive, PROGRESS[0] is 1 and PROGRESS[2] is greater than zero, so the callback returns 0. This cancels the GET before any more of the file is read, but by then the response header has already been received.
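
To watch these values change during a transfer, a diagnostic callback along the following lines may help. This is only a sketch: the name webhead_verbose_callback is hypothetical, and the third argument is declared on the assumption that IDL's documented callback signature is (StatusInfo, ProgressInfo, CallbackData); it is not used here.

```idl
FUNCTION webhead_verbose_callback, status, progress, data

  ; status carries text describing the current state of the connection
  PRINT, status

  ; progress[2] is meaningful only when progress[0] is 1
  IF (progress[0] EQ 1) THEN PRINT, 'Bytes downloaded so far: ', progress[2]

  ; return 1 so that the transfer continues to completion
  RETURN, 1

END
```

Setting CALLBACK_FUNCTION='webhead_verbose_callback' on an IDLnetURL object before calling GET should print a line or more per callback invocation, which makes it easier to see when the byte count first becomes non-zero.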

Next is the driver function, which sets the callback function as a property of the IDLnetURL object:

FUNCTION WEBHEAD,url,STATUS_CODE=status_code

IF N_ELEMENTS(url) EQ 0 then RETURN,''

o=OBJ_NEW('IDLnetURL')

;-- establish callback function to interrupt GET.
;-- this function is invoked when the GET method is called

o->SETPROPERTY, CALLBACK_FUNCTION='webhead_callback'

;-- use a CATCH since canceling the GET from the callback raises an error

header=''
error=0
CATCH, error
IF (error EQ 0) THEN output=o->GET(/BUFFER,url=url) ELSE CATCH,/CANCEL

;-- we get here after GET completes or after CATCH is triggered
 
o->GETPROPERTY,RESPONSE_HEADER=header

OBJ_DESTROY,o

;--  extract HTTP status code from the response header 
;    by using a regular expression to search for the pattern HTTP/1.1 xxx.
;    With /SUBEXPR, u[0,*] is the full match, u[1,*] is the optional "s",
;    and u[2,*] is the status code, so the code is read from u[2,*].

u=STREGEX(header,'http(s)?/[0-9]\.[0-9] +([0-9]+)',/SUBEXPR,/EXTRACT,/FOLD_CASE)

chk=WHERE(u[2,*] NE '',count)
IF count GT 0 THEN status_code=FIX(u[2,chk[0]]) ELSE status_code=404

RETURN,header 
END
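
The status-code extraction can be tried on its own, without any network access. The lines below are a sketch that uses a hard-coded sample status line and a simplified pattern with a single subexpression, so the captured code lands in element 1 of the result; the variable names are illustrative only.

```idl
; Stand-alone check of a status-line pattern on a sample string
line = 'HTTP/1.1 200 OK'
u = STREGEX(line, 'http/[0-9]\.[0-9] +([0-9]+)', /SUBEXPR, /EXTRACT, /FOLD_CASE)

; u[0] is the whole match ('HTTP/1.1 200'); u[1] is the captured code ('200')
code = FIX(u[1])
PRINT, code
```

These lines can be typed at the IDL prompt. Each parenthesized group in the pattern adds one element to the result, so adding or removing a group shifts the index at which the status code appears.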