Monthly Archives: November 2014

Downloading parts of a file with IDLnetURL

With HTTP byte-serving, a client can request specific parts of a remote file by using the Range request header. This technique is useful if a file contains several images but you are only interested in extracting a specific image, or part of an image, without downloading the entire file. The program webread.pro allows you to do this by including a Range request header in the IDLnetURL object when sending a GET request. First, here is an example of reading bytes 100-200 of a remote JPEG image:

IDL> f='http://sohowww.nascom.nasa.gov/pickoftheweek/old/17oct2014/Oct_C3_halo.jpg'
IDL> o=webread(f,response=response,range=[100,200])
IDL> help,o
O               BYTE      = Array[101]

IDL> print,response

HTTP/1.1 206 Partial Content
Date: Thu, 23 Oct 2014 15:56:31 GMT
Server: Apache/2.4.10 (Unix)
Last-Modified: Fri, 17 Oct 2014 20:05:21 GMT
ETag: "24b46-505a3e28764e6"
Accept-Ranges: bytes
Content-Length: 101
Content-Range: bytes 100-200/150342
Content-Type: image/jpeg

The HTTP/1.1 206 status code returned in the response header indicates that the Range request was accepted and satisfied by the server. Here is how the technique works:

FUNCTION webread,url,RANGE=range,RESPONSE=response

IF N_ELEMENTS(url) EQ 0 THEN RETURN,''
;-- create the IDLnetURL object

o=OBJ_NEW('IDLnetURL')

;-- verify that range values are entered and create a RANGE request header that
;   is passed to the object as a property.

nrange=N_ELEMENTS(range)
range_requested= (nrange EQ 1) || (nrange EQ 2)

IF range_requested THEN BEGIN
 header='Range: bytes='+STRTRIM(range[0],2)+'-'
 IF nrange EQ 2 THEN header=header+STRTRIM(range[1],2)
 o->SETPROPERTY,HEADER=header
ENDIF

;-- send the GET request using /BUFFER to return the requested data in a byte array

output=o->GET(/BUFFER,url=url)

;-- extract the RESPONSE_HEADER property to verify that the request succeeded

o->GETPROPERTY,RESPONSE_HEADER=response
OBJ_DESTROY,o
RETURN,output
END
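
A server that ignores the Range header typically replies with 200 OK and the whole file, so a caller may want to check the status line before trusting the returned bytes. A minimal sketch, assuming the RESPONSE string has the form printed above (the variable name partial is illustrative only, not part of webread.pro):

IDL> o=webread(f,response=response,range=[100,200])
IDL> partial=TOTAL(STREGEX(response,'HTTP/[0-9.]+ +206',/BOOLEAN)) GT 0   ; 1 if a 206 status line is present
IDL> IF partial EQ 0 THEN PRINT,'Range not honored; received '+STRTRIM(N_ELEMENTS(o),2)+' bytes'

If the request was redirected, the response header may contain more than one status line, so this test only confirms that a 206 appears somewhere in it.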

The RANGE keyword is a 2-element vector [b1,b2] containing the start and end bytes of the request. It is passed to the IDLnetURL object as a property in the string format:

Range: bytes=b1-b2
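
As a quick check, the string that webread builds for range=[100,200] in the first example can be reproduced at the command line. The byte offsets are zero-based and inclusive, which is why that request returned 200-100+1 = 101 bytes:

IDL> range=[100,200]
IDL> print,'Range: bytes='+STRTRIM(range[0],2)+'-'+STRTRIM(range[1],2)
Range: bytes=100-200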

Entering a single-element RANGE (a scalar b1) sends an open-ended request that reads from byte b1 to the end of the file. Not all web servers accept Range requests. You can determine in advance whether a server supports byte-serving by sending a HEAD request using the program webhead.pro:

IDL> header=webhead(f)
IDL> print,header

and examining the header for the following line, which indicates that the server accepts Range requests:

Accept-Ranges: bytes
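
This check can also be scripted. A minimal sketch, assuming webhead returns the response header as a string or string array (the variable name byte_serving is illustrative only):

IDL> header=webhead(f)
IDL> byte_serving=TOTAL(STREGEX(header,'Accept-Ranges: *bytes',/BOOLEAN,/FOLD_CASE)) GT 0
IDL> IF byte_serving THEN PRINT,'Server advertises byte-serving' ELSE PRINT,'Range requests may be ignored'

A server is not required to advertise Accept-Ranges, so a missing line does not prove that Range requests will fail. The 206 status in the response to the actual request is the more reliable indicator.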

A more interesting and useful example is provided by reading the header of a FITS (Flexible Image Transport System) formatted file. In a standard FITS file, the ASCII text header is stored in one or more 2880-byte blocks, so the first 2880 bytes hold the primary header or at least its beginning. Hence, if you are interested in examining the header before downloading the entire file, you could try this:

IDL> f='http://stereo-ssc.nascom.nasa.gov/data/ins_data/secchi/L0/b/img/euvi/20140901/20140901_111530_n4euB.fts'
IDL> header=webread(f,range=[0,2879])
IDL> help,header
HEADER          BYTE      = Array[2880]

The above example reads the header of a STEREO spacecraft FITS image and returns it as a byte array. You can examine the header by converting it to a string and printing it as follows:

IDL> print,string(header)
SIMPLE  =                    T / Written by IDL:  Wed Sep  3 14:09:07 2014      
BITPIX  =                   16 /                                                
NAXIS   =                    2 /                                                
NAXIS1  =                 2048 /                                                
NAXIS2  =                 2048 /                                                
DATE-OBS= '2014-09-01T11:15:56.207' /                                           
FILEORIG= 'E90102FE.448'       /
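
Since a FITS header is a sequence of 80-character records ("cards") terminated by an END card, the byte array can also be split into one string per card and checked for completeness. A minimal sketch, assuming header holds the 2880 bytes read above (the names cards, iend and count are illustrative only):

IDL> cards=STRING(REFORM(header,80,36))     ; 36 strings of 80 characters each
IDL> iend=WHERE(STRTRIM(STRMID(cards,0,8),2) EQ 'END',count)
IDL> IF count EQ 0 THEN PRINT,'No END card in the first 2880 bytes; the header continues in the next block'

If no END card is found, the next block could be requested in the same way with range=[2880,5759].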