
How to stream an image using IDLnetURL

The IDLnetURL object can stream data from a server into a buffer that a client application can then load. I’ll present a simple example of how to stream a JPEG image from a remote website and embed it in another web page without saving the image to local storage. The key steps are as follows:

o=obj_new('IDLnetURL')                               ;-- initialize IDLnetURL
buffer = o->get(/buffer,url=url)                     ;-- read URL data into buffer stream (byte) array 
imgdata=idl_base64(buffer)                           ;-- encode stream to BASE64 
src='<img src="data:image/jpeg;base64,'+imgdata+'" />' ;-- insert as a data URI into an HTML img tag

After initializing the IDLnetURL object, I call its get method with the /buffer keyword, which reads the image at the given url (passed as a keyword) into local memory as a byte array. The trick to getting a client to recognize the byte array is to use the idl_base64 function to convert the array into a MIME Base64-encoded scalar string. Once converted, I insert the encoded string as a data URI in an HTML img tag. The data URI syntax informs the client that the encoded string has a MIME type of image/jpeg so that the client (e.g. a browser) will render it correctly. For example, I can embed the URI into a simple HTML web page as follows:

openw,lun,'test_image.html',/get_lun
printf,lun,'<html>'
printf,lun,'<body>'
printf,lun,src
printf,lun,'</body>'
printf,lun,'</html>'
free_lun,lun                                         ;-- close the file and release the LUN allocated by /get_lun

As an exercise, try testing the above commands using the image URL:

url='http://www.heliodocs.com/data/stereo.jpg'

The commands will create an HTML file called test_image.html. Open it in your preferred browser and you should see the image.

I can get a performance boost in streaming speed if the remote image is compressed (e.g. gzip format). In this case, I use the zlib_uncompress function to uncompress the buffer before encoding as follows:

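buffer = o->get(/buffer,url=url)                     ;-- read the gzip-compressed stream into a byte array
buffer=zlib_uncompress(buffer,/gzip)                 ;-- uncompress to recover the original JPEG bytes
imgdata=idl_base64(buffer)                           ;-- encode stream to BASE64
src='<img src="data:image/jpeg;base64,'+imgdata+'" />'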
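
The encode-and-embed step is the same in both variants, so it can be wrapped in a small helper. The following is only a sketch: the function name data_uri_img and its optional mime argument are hypothetical and not part of IDL.

```idl
function data_uri_img, bytes, mime
  compile_opt idl2
  ;-- default to JPEG when no MIME type is given (assumption for this sketch)
  if n_elements(mime) eq 0 then mime = 'image/jpeg'
  ;-- idl_base64 converts a byte array into a Base64-encoded scalar string
  return, '<img src="data:' + mime + ';base64,' + idl_base64(bytes) + '" />'
end
```

With such a helper, the streaming steps would reduce to src=data_uri_img(o->get(/buffer,url=url)).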

How to write a Client-Server application using Sockets

With the release of IDL 8.5, the socket procedure supports two new keywords: Listen and Accept. The Listen keyword instructs a socket to listen on a specified port, while the Accept keyword accepts a connection pending on a listening socket’s logical unit number (LUN). In this post, I will demonstrate how to use these keywords to write a simple and practical IDL client-server application that can send data from a client to a server and execute a remote data operation on the server. The application behaves much like a Remote Procedure Call (RPC), but is written entirely in IDL.

Let’s start with the server application SockServer that will listen for client connections:

pro SockServer,port
if n_elements(port) eq 0 then port=21038
socket, ListenerLUN, port, /listen, /get_lun
ID = Timer.Set(.1, "ListenerCallback", ListenerLUN)
return & end

The SockServer procedure opens a unit number ListenerLUN on a specified port. I use a default port of 21038, but be sure not to use a port that is already allocated for other applications such as HTTP (80) or FTP (21). The Listen keyword instructs the socket to listen for connections. These connections will be handled by a callback function ListenerCallback which is invoked by calling IDL’s timer object using the syntax:

ID = Timer.Set( Time, Callback , UserData)

The Timer object creates an asynchronous timer that calls the callback after a specified time with an optional user input argument. In SockServer, ListenerCallback is called after 0.1 seconds with the socket’s ListenerLUN as input. Note that the first argument of a callback must be the timer ID, which is the value returned by Timer.Set.

Let’s look at ListenerCallback:

pro ListenerCallback,ID,ListenerLUN
status = File_Poll_Input(ListenerLUN, Timeout = .1d)
if status then begin
 socket, ClientLUN, accept = ListenerLUN, /get_lun
 message,'Client connection established on LUN '+strtrim(clientlun,2),/info
 ID = Timer.Set(.1, "ServerCallback", ClientLUN)
endif
ID = Timer.Set(.1, "ListenerCallback", ListenerLUN)
return & end

The Listener callback function literally listens on ListenerLUN for a client connection by using the file_poll_input function:

status = File_Poll_Input(ListenerLUN, Timeout=value) 

When a client attempts to connect to the server, file_poll_input returns a status of true which then triggers another call to socket. This socket call accepts the connection and assigns a ClientLUN unit number to the client. All subsequent data communication between the client and server occurs over this ClientLUN. Once this handshaking is complete, the Listener callback function calls another callback function ServerCallback (via the Timer object) which performs the actual data processing. If there is no connection during a specified Timeout period, file_poll_input returns a status of false and the Listener callback function is called again by the Timer object.

Let’s look at ServerCallback:

pro ServerCallback,ID,ClientLUN
status=File_Poll_Input(ClientLUN, Timeout = .01)
if status then begin
 command=""
 dsize=lonarr(6)
 readu,ClientLun,dsize                  ;-- read data size and type
 data=make_array(size=dsize)            ;-- reconstruct data type
 readu,ClientLun,data                   ;-- read the data
 readf,Clientlun,command                ;-- read the command to execute on the data
 s=execute(command)
endif
ID=Timer.Set(.1, "ServerCallback", ClientLUN)
return & end

The Server callback function behaves in a similar fashion to ListenerCallback. It uses file_poll_input to listen for data sent from the client on ClientLUN. When a true status is signaled, the callback first reads the size and type of the data using readu (the 6-element dsize matches the size vector of a 3-D array, such as the true-color image sent below), and prepares a data variable using make_array into which the data will be read. Lastly, the callback reads the string variable command using readf and passes it to the execute function to operate on the data.
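
Because dsize is fixed at six elements, the exchange above only works for 3-D arrays, whose size vector is [3, d1, d2, d3, type, n_elements]. One possible generalization is to send the length of the size vector first. This is a sketch under stated assumptions rather than the protocol used by SockServer: the routine names send_array and recv_array are hypothetical, and only numeric arrays are handled.

```idl
pro send_array, lun, data
  compile_opt idl2
  dsize = size(data)                       ;-- [ndims, dims..., type code, n_elements]
  writeu, lun, long(n_elements(dsize))     ;-- send the length of the size vector first
  writeu, lun, long(dsize)                 ;-- then the size vector as 32-bit integers
  writeu, lun, data                        ;-- then the array itself
end

function recv_array, lun
  compile_opt idl2
  n = 0L
  readu, lun, n                            ;-- how many elements the size vector has
  dsize = lonarr(n)
  readu, lun, dsize                        ;-- dimensions and type of the incoming array
  data = make_array(size=dsize)            ;-- allocate an array of matching shape and type
  readu, lun, data                         ;-- fill it with the transmitted values
  return, data
end
```

Since writeu and readu behave the same way on files and sockets, the pair can be tried out against a scratch file before being used over a ClientLUN.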

Let’s finish with the client application SockClient that will establish the connection to the server:

pro SockClient, ID, port,server, lun=ServerLUN
if n_elements(port) eq 0 then port = 21038
if n_elements(server) eq 0  then server="localhost"
socket, ServerLUN, server, port, /get_lun, error=error
if error ne 0 then ID=timer.set(.1,"SockClient",port) else $
 message,'Server connection established on LUN '+strtrim(ServerLUN,2),/info
return & end

As with SockServer, I use a default port of 21038 for the connection. In principle, I can connect to a SockServer application running on a different IP address but security restrictions may prevent that. Hence, for this test example, I default to using localhost such that server and client are running on the same system. The physical connection is made by calling socket with the server and port as input arguments. If the connection is successful (i.e. the server accepted the connection), the socket returns an error value of 0 and the client is ready to send data over the assigned ServerLUN unit number. If not successful, I call SockClient again (via a Timer) and try reconnecting.

To demonstrate the socket client-server application, try the following sample program SockTest which will send a JPEG image from a client to a server where it will be displayed:

pro SockTest,serverLUN
read_jpeg,'stereo.jpg',data,/true          ;-- read JPEG image into data array
writeu, ServerLUN,size(data)               ;-- send the image size and type to the server
writeu, ServerLUN,data                     ;-- send the image data to the server
command="a=image(data,/no_tool)"           ;-- create the command to execute on the server
printf,serverLUN,command                   ;-- send the command to the server
return & end

You will need to download SockClient, SockServer, SockTest, and the JPEG image into your local directory. The programs and image are available at GitHub. Run the test as follows: start two separate IDL sessions; type SockServer in the first session; and type SockClient in the second followed by SockTest. If the connection is successful, you will see the following messages:

IDL> SockServer               ;-- start socket server
% LISTENERCALLBACK: Client connection established on LUN 101
IDL> SockClient               ;-- start socket client
% SOCKCLIENT: Server connection established on LUN 100

IDL> SockTest,100             ;-- send data and command to server via LUN 100

If all goes well, the image will be displayed in the server’s IDL session.

Using the Socket procedure to download a file

The IDL socket procedure is a powerful tool for connecting to and accessing content on remote servers. In this post, I will demonstrate how to use the socket procedure to download a file by sending an HTTP GET request.

The socket command syntax is:

 socket,unit,host,port,/get_lun

where:

  • Unit = logical unit number
  • Host = remote host name
  • Port = remote host port number [80 for HTTP]
  • Get_lun = keyword to request a logical unit number

In the following example, I open a socket to the mission website for NASA’s Solar Dynamics Observatory and send a GET request for a JPEG image of the Sun at http://sdo.gsfc.nasa.gov/assets/img/latest/f_211_193_171_512.jpg.

IDL> socket,lun,"sdo.gsfc.nasa.gov",80,/get_lun
IDL> printf,lun,"GET /assets/img/latest/f_211_193_171_512.jpg HTTP/1.1"
IDL> printf,lun,"Host: sdo.gsfc.nasa.gov:80"
IDL> printf,lun,string(10b)

Note the following:

  • the GET request is sent as text by using printf
  • the remote file must be given as a full path name followed by the protocol version, which is HTTP/1.1 in this example.
  • the request must include a Host header of the form Host: name[:port]. Port 80 is assumed if not specified.
  • the last command sends a blank line (actually an LF), which marks the end of the request headers.

Once the server receives and interprets the GET request, it responds with header text which I read into a string array using readf as follows:

header=""                                ;-- initialize strings
text="xxx"
while text ne "" do begin                ;-- read each line of text
 readf,lun,text
 header=[header,text]                    ;-- append to header array
endwhile

In this example, the HTTP header looks like this:

IDL> print, header

HTTP/1.1 200 OK
Date: Thu, 19 Mar 2015 18:37:02 GMT
Server: Apache/2.2.15 (CentOS)
Last-Modified: Thu, 19 Mar 2015 18:23:06 GMT
ETag: "35180008-f460-511a84a91b35f"
Accept-Ranges: bytes
Content-Length: 62458
Content-Type: image/jpeg

The header contains useful information about the remote file:

  • HTTP/1.1 200 OK                        => the request was successful
  • Content-Length: 62458               => the file size in bytes
  • Content-Type: image/jpeg          => the file type is a JPEG image

After reading the response header, I use readu to read (i.e. download) the actual byte data from the open socket. Since I know from the header that the JPEG file size is 62458 bytes, I initialize the byte array to this size before reading it:

IDL> data=bytarr(62458)                ;-- create byte array for data
IDL> readu,lun,data                    ;-- read data from socket
IDL> free_lun,lun                      ;-- close socket and release the LUN
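
Rather than copying the file size from the printed header by hand, the Content-Length value can be extracted from the header array. A minimal sketch, assuming the header was read into a string array as above; the function name header_content_length is hypothetical:

```idl
function header_content_length, header
  compile_opt idl2
  ;-- match "Content-Length: nnn" (case-insensitive) and capture the digits
  m = stregex(header, '^content-length: *([0-9]+)', /subexpr, /extract, /fold_case)
  idx = where(m[1,*] ne '', count)
  ;-- return -1 when the server did not send a Content-Length field
  if count eq 0 then return, -1LL
  return, long64(m[1, idx[0]])
end
```

The byte array could then be sized with data=bytarr(header_content_length(header)), after checking that the result is not -1.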

After reading the data, I write it to a local file (e.g. output.jpeg):

IDL> openw,lun,"output.jpeg",/get_lun
IDL> writeu,lun,data
IDL> free_lun,lun

Finally, I read and display the downloaded JPEG file:

IDL> read_jpeg,"output.jpeg",image,/true
IDL> tv,image,/true


Downloading multiple files with IDLnetURL

You can download a file from a remote server by using the GET method of the IDLnetURL object. I often need to download multiple files from a server and would like to skip files that have been downloaded previously. To this end, I developed the program webcopy, which has the following arguments and keywords:

IDL> webcopy,url_files,outdir,/clobber,url_username=username,url_password=password

The first argument is the array of remote URLs that you wish to download; a single URL also works. The second (optional) argument is the output directory, which defaults to the current working directory. You will get an error if the output directory doesn’t exist or isn’t writable. By default, the program loops through each URL and downloads a copy only if one doesn’t already exist in the output directory. To force a new download of a file that already exists, use the /clobber keyword.

The program has some additional nice features:

  • It checks the “Content-Disposition” field in the HTTP response header, which often designates the name to give the file when it is downloaded. This is especially useful when the URL is a query string rather than a file name.
  • If the remote server requires login username and password credentials, they can be passed as keywords. This feature is useful for downloading from FTP sites, which webcopy also supports.

Here is an example of downloading multiple files for the HMI instrument on the Solar Dynamics Observatory (SDO), using /verbose to provide informative output:

IDL> print,files
http://sao.virtualsolar.org/cgi-bin/VSO/drms_export.cgi?series=hmi__Ic_45s;record=12852482-12852482
http://sao.virtualsolar.org/cgi-bin/VSO/drms_export.cgi?series=hmi__Ic_45s;record=12852483-12852483
http://sao.virtualsolar.org/cgi-bin/VSO/drms_export.cgi?series=hmi__Ic_45s;record=12852484-12852484
IDL> webcopy,files,/verbose
% WEBCOPY_MAIN: Successfully downloaded to /private/tmp/hmi.ic_45s.2011.05.01_00:01:30_TAI.continuum.fits
% WEBCOPY_MAIN: Successfully downloaded to /private/tmp/hmi.ic_45s.2011.05.01_00:02:15_TAI.continuum.fits
% WEBCOPY_MAIN: Successfully downloaded to /private/tmp/hmi.ic_45s.2011.05.01_00:03:00_TAI.continuum.fits

Note that the remote file names are query strings, which are translated into local file names using the Content-Disposition field.
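
The source of webcopy is not shown here, so the following is only a guess at how such a lookup could be done, not webcopy’s actual implementation. It assumes the response header is available as a string (or string array) and uses a hypothetical function name, disposition_filename:

```idl
function disposition_filename, header
  compile_opt idl2
  ;-- capture the value after filename=, with or without surrounding double quotes
  pattern = 'content-disposition:.*filename="?([^";]+)'
  m = stregex(header, pattern, /subexpr, /extract, /fold_case)
  idx = where(m[1,*] ne '', count)
  ;-- return an empty string when no file name is advertised
  return, (count gt 0) ? m[1, idx[0]] : ''
end
```

An empty result would mean falling back to a name derived from the URL itself.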

Downloading parts of a file with IDLnetURL

With HTTP byte-serving, a client can request specific parts of a remote file by using the Range request header. This technique is useful if a file contains several images, but you are only interested in extracting a specific image or parts of an image without downloading the entire file. The program webread.pro allows you to do this by including a Range request header in the IDLnetURL object when sending a GET request. First, here is an example of reading bytes 100-200 of a remote JPEG image:

IDL> f='http://sohowww.nascom.nasa.gov/pickoftheweek/old/17oct2014/Oct_C3_halo.jpg'
IDL> o=webread(f,response=response,range=[100,200])
IDL> help,o
O               BYTE      = Array[101]

IDL> print,response

HTTP/1.1 206 Partial Content
Date: Thu, 23 Oct 2014 15:56:31 GMT
Server: Apache/2.4.10 (Unix)
Last-Modified: Fri, 17 Oct 2014 20:05:21 GMT
ETag: "24b46-505a3e28764e6"
Accept-Ranges: bytes
Content-Length: 101
Content-Range: bytes 100-200/150342
Content-Type: image/jpeg

The HTTP/1.1 206 code returned in the response header indicates that the Range request was accepted and satisfied by the server. Here is how the technique works:

FUNCTION webread,url,RANGE=range,RESPONSE=response

IF N_ELEMENTS(url) EQ 0 THEN RETURN,''

;-- create the IDLnetURL object

o=OBJ_NEW('IDLnetURL')

;-- verify that range values are entered and create a RANGE request header that
;   is passed to the object as a property.

nrange=N_ELEMENTS(range)
range_requested= (nrange EQ 1) || (nrange EQ 2)

IF range_requested THEN BEGIN
 header='Range: bytes='+STRTRIM(range[0],2)+'-'
 IF nrange EQ 2 THEN header=header+STRTRIM(range[1],2)
 o->SETPROPERTY,HEADERS=header
ENDIF

;-- send the GET request using /BUFFER to return the requested data in a byte array

output=o->GET(/BUFFER,url=url)

;-- extract the RESPONSE_HEADER property to verify that the request succeeded

o->GETPROPERTY,RESPONSE_HEADER=response
OBJ_DESTROY,o
RETURN,output
END

The RANGE keyword is a 2-element vector [b1,b2] containing the start and end bytes of the request. Byte offsets are zero-based and inclusive, which is why [100,200] returned 101 bytes. The range is passed to the IDLnetURL object as a header property in the string format:

Range: bytes=b1-b2

Passing a single value b1 requests everything from byte b1 to the end of the file. Not all web servers accept Range requests. You can determine in advance whether a server supports byte-serving by sending a HEAD request using the program webhead.pro:

IDL> header=webhead(f)
IDL> print,header

and examining the header for the response below that indicates that the server accepts RANGE requests:

Accept-Ranges: bytes
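
This check can also be done in code instead of by eye. A small sketch, assuming the header is a string or string array such as the one returned above; the function name server_accepts_ranges is hypothetical:

```idl
function server_accepts_ranges, header
  compile_opt idl2
  ;-- /boolean returns 1 for each element that contains the pattern, else 0
  hits = stregex(header, 'accept-ranges: *bytes', /boolean, /fold_case)
  ;-- true if any header line advertises byte ranges
  return, max(hits) gt 0
end
```

A result of 0 does not prove that a Range request will fail, since a server may honor ranges without advertising them, but a result of 1 is a good sign that it will work.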

A more interesting and useful example is provided by reading the header of a FITS (Flexible Image Transport System) formatted file. In a standard FITS file, the ASCII text header is stored in blocks of 2880 bytes (36 cards of 80 characters each), so the first 2880 bytes hold the start of the header and, for short headers, all of it. Hence, if you are interested in examining the header before downloading the entire file, you could try this:

IDL> f='http://stereo-ssc.nascom.nasa.gov/data/ins_data/secchi/L0/b/img/euvi/20140901/20140901_111530_n4euB.fts'
IDL> header=webread(f,range=[0,2879])
IDL> help,header
HEADER          BYTE      = Array[2880]

The above example reads the header of a STEREO spacecraft FITS image and returns it as a byte array. You can examine the header by converting it to a string and printing it as follows:

IDL> print,string(header)
SIMPLE  =                    T / Written by IDL:  Wed Sep  3 14:09:07 2014      
BITPIX  =                   16 /                                                
NAXIS   =                    2 /                                                
NAXIS1  =                 2048 /                                                
NAXIS2  =                 2048 /                                                
DATE-OBS= '2014-09-01T11:15:56.207' /                                           
FILEORIG= 'E90102FE.448'       /   
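
Since a FITS header consists of fixed-width 80-character cards, the byte array can also be split into one string per card, which makes it easier to search for individual keywords. A minimal sketch; the function name fits_header_cards is hypothetical:

```idl
function fits_header_cards, bytes
  compile_opt idl2
  ;-- number of complete 80-byte cards in the buffer (integer division)
  ncards = n_elements(bytes) / 80
  if ncards eq 0 then return, ''
  ;-- STRING converts each 80-byte row of the 2-D byte array into one string
  return, string(reform(bytes[0:ncards*80-1], 80, ncards))
end
```

For example, cards=fits_header_cards(header) followed by chk=where(strtrim(cards,2) eq 'END',count) tests whether the END card was reached. If count is 0, the header continues into the next 2880-byte block and a larger range is needed.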

Sending a HEAD request with IDLnetURL

The IDLnetURL object provides a GET method for downloading files but not a HEAD method. I often find it convenient to check if a remote file is available or examine its properties (such as the size in bytes) before downloading it. After some experimenting, I came up with a solution that uses the IDLnetURL callback function feature to read the HTTP header without downloading the file. The program is called webhead.pro.

Here is an example of its use:

IDL> f="http://sohowww.nascom.nasa.gov/pickoftheweek/old/17oct2014/Oct_C3_halo.jpg"
IDL> h=webhead(f) 
IDL> print,h 

HTTP/1.1 200 OK 
Date: Thu, 30 Oct 2014 15:46:31 GMT 
Server: Apache/2.4.10 (Unix) 
Last-Modified: Fri, 17 Oct 2014 20:05:21 GMT 
ETag: "24b46-505a3e28764e6" 
Accept-Ranges: bytes 
Content-Length: 150342 
Content-Type: image/jpeg

Here is how it works. First, I create a callback function that will be called by the GET method:

FUNCTION WEBHEAD_CALLBACK, status, progress  

;-- since we only need the response header, we just read until
;   the first set of non-zero bytes is encountered.

;-- return 0 to return to main caller

IF (progress[0] EQ 1) && (progress[2] GT 0) then RETURN,0

;-- otherwise return 1 to keep reading

RETURN,1

END

When the GET method calls this function, it passes in the PROGRESS vector, of which two elements are used here:

PROGRESS[0] – Contains 1 when the Progress array contains valid data, or 0 when the array does not contain valid data

PROGRESS[2] – Contains the number of bytes that have been downloaded during the current GET

When the first valid bytes are received, PROGRESS[0] and PROGRESS[2] become non-zero and the function returns 0, which cancels the GET before any more of the file is read.

Next is the driver function that sets the callback function as a property to the IDLnetURL object:

FUNCTION WEBHEAD,url,STATUS_CODE=status_code

IF N_ELEMENTS(url) EQ 0 then RETURN,''

o=OBJ_NEW('IDLnetURL')

;-- establish callback function to interrupt GET.
;-- this function is invoked when the GET method is called

o->SETPROPERTY, CALLBACK_FUNCTION='webhead_callback'

;-- use a CATCH since canceling the callback triggers it

header=''
error=0
CATCH, error
IF (error EQ 0) THEN output=o->GET(/BUFFER,url=url) ELSE CATCH,/CANCEL

;-- we get here when CATCH is triggered 
 
o->GETPROPERTY,RESPONSE_HEADER=header

OBJ_DESTROY,o

;--  extract HTTP status code from the response header 
;    by using a regular expression to search for the pattern HTTP/1.1 xxx
;    (the single subexpression, returned in u[1,*], captures the status code)

u=STREGEX(header,'http/[0-9]\.[0-9] +([0-9]+)',/SUB,/EXTR,/FOLD)

chk=WHERE(u[1,*] NE '',COUNT)
IF count GT 0 THEN status_code=FIX(u[1,chk[0]]) ELSE status_code=404

RETURN,header 
END