Topic: APLX Help : Help on APL language : System Functions & Variables : ⎕NA Define External Function

www.microapl.co.uk

⎕NA Define External Function


The ⎕NA ("name association") system function allows you to associate a name in the APL workspace with an external (non-APL) function call. It thus defines an 'external function' which can be used like an ordinary APL function, but which calls out to the operating system or to a shared library (DLL).

Describing the external function

⎕NA takes as its right argument a function descriptor which specifies the library in which the external function resides, its name, and the calling conventions of the function. You can also supply an optional left argument which is the name to be used for the function in the APL workspace. If you omit the left argument, the APL name is the same as the external routine name.

A typical routine descriptor looks like this:

'I4 msupport.dll|AddNode I4 U I4 ={I[2] U I[4]}'

where the first field (I4) is the return type and the next (msupport.dll) is the library. The function name follows the vertical bar '|' character, and is AddNode. This is followed by the function parameters. Thus, the above descriptor contains the following information:

       I4             Return type = 4-byte integer
       msupport.dll   DLL name/path
       AddNode        Function name
       I4             1st parameter: Integer
       U              2nd parameter: Unsigned integer
       I4             3rd parameter: Integer
       ={I[2] U I[4]} 4th parameter: Pointer to structure of 
                      2 Signed Ints, 1 Unsigned Int, 4 Signed Ints
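
The breakdown above can be mimicked outside APLX. The following Python sketch is not how APLX itself parses a descriptor; it is a minimal illustration which assumes that the library name contains no spaces and that structures are not nested. The function name parse_descriptor is hypothetical.

```python
import re

def parse_descriptor(descriptor):
    """Split an NA-style descriptor into its fields (illustrative only)."""
    head, _, tail = descriptor.partition("|")
    head_fields = head.split()
    # The library is the last field before '|'; a return type, if any, precedes it.
    library = head_fields[-1]
    return_type = head_fields[0] if len(head_fields) > 1 else None
    # After '|' comes the function name, then the parameters. A parameter is
    # either a brace-delimited structure (optionally qualified by < > or =)
    # or a run of non-blank characters.
    tokens = re.findall(r"[<>=]?\{[^}]*\}|\S+", tail)
    return {
        "return_type": return_type,
        "library": library,
        "function": tokens[0],
        "parameters": tokens[1:],
    }

fields = parse_descriptor("I4 msupport.dll|AddNode I4 U I4 ={I[2] U I[4]}")
print(fields["function"], fields["parameters"])
```

The structure parameter is kept as a single token, qualifier included, so the four parameters line up with the table above.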
 

The supported basic types are:

        I or I4        4-byte signed integer
        I8             8-byte signed integer (supported in APLX64 only)
        I2             Signed short (2-byte integer)
        I1             Signed byte (1-byte integer)
        U or U4        4-byte unsigned integer
        U8             8-byte unsigned integer (supported in APLX64 only)
        U2             Unsigned short (2-byte integer)
        U1             Unsigned byte (1-byte integer)
        D or F8 or D8  8-byte double-precision floating point
        F or F4 or D4  4-byte single-precision floating point
        C or CT        Character (with translation)
        CU             Character (without translation)
        W              Wide (Unicode) character, translated to APLX character
        P or PT        Pascal string character (with translation)
        PU             Pascal string character (without translation)

The supported qualifiers are:

        <             Pointer, in only.  
                      Data is passed to the function; nothing is returned.
        >             Pointer, out only.
                      The data written by the function is returned in the nested result.
        =             Pointer, in and out
        {}            Encloses structure elements
        []            Encloses array elements

Arrays are typically fixed-size, and are represented using the [] notation. This corresponds to an APL vector of the same length. You can specify an array of indeterminate length using * as a wildcard character (e.g. C[*]), but only when passing a pointer to an array into the external function, not for returned values or inside structures.

Structures are represented using the {} notation, and the corresponding APL data is a nested array. Arrays and structures are typically passed by pointer, i.e. the external routine expects a pointer to the array or structure. This is indicated by one of the < = or > qualifiers, depending on whether the data is passed into the function (case '<'), returned by the function (case '>'), or both (case '='). If data is returned into a structure or array, APLX includes it in the nested vector which the function returns.

The result of the APL function is either a scalar (the explicit result of the external function, whose type is given by the first field of the descriptor), or a nested vector consisting of that result followed by one element per > or = field in the descriptor. If there is no explicit result to the function (void in the C language), you can simply omit the result type.

Strings and Character Arrays

There are various ways in which you can specify strings and character arrays. Usually, you should use an array of C or CT type, to indicate a character string which will be translated between APL's internal representation (⎕AV position) and the normal non-APL equivalent (extended ASCII). It will also be null-terminated as is usual in the C language, by adding a null byte at the end of strings passed from APL to the external function, and setting the length of any string returned to APL to one byte less than the position of the first null character. In both cases, the string will also be terminated if the maximum length you specify is reached. For example, the type C[64] indicates a C string of up to 64 bytes including any final null character. The type CU is similar, but no translation takes place.
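
The null-termination rules can be sketched in Python as follows. This is an illustration of the convention, not APLX's own code. Two assumptions are made: Latin-1 stands in for the ⎕AV translation, and a string which is too long is cut so that the terminating null still fits in the buffer. The function names are hypothetical.

```python
def to_c_buffer(text, size):
    """Encode text for a C[size] parameter: truncate, then null-terminate.

    Assumption: the terminator must fit inside the size-byte buffer,
    so at most size - 1 characters are kept.
    """
    raw = text.encode("latin-1")[: size - 1]
    return raw + b"\x00"

def from_c_buffer(buffer, size):
    """Decode a returned C[size] buffer.

    The string ends just before the first null byte, or at size bytes
    if the buffer contains no null at all.
    """
    raw = buffer[:size]
    end = raw.find(b"\x00")
    if end == -1:
        end = len(raw)
    return raw[:end].decode("latin-1")

print(to_c_buffer("APLX", 64))
print(from_c_buffer(b"C:\\WINNT\x00leftover", 256))
```

Anything after the first null byte in a returned buffer is ignored, which is why the leftover bytes in the second call do not appear in the result.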

Wide or Unicode characters (type W) are similar. These are 16-bit characters; if you use this type, text is automatically translated from APLX's internal representation to Unicode before making the call. Any returned Unicode text is translated back to APLX's internal representation after making the call. Any Unicode characters which are not in the APLX character set are converted to question marks. (If you want to retrieve the raw Unicode values, use type U2 for unsigned short, which will retrieve them as integers).
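
The relationship between type W and type U2 can be seen by looking at the bytes involved. The Python sketch below assumes little-endian byte order (as on Intel processors) and uses hypothetical function names; it only shows that the same 16-bit units can be read either as characters or as unsigned short integers.

```python
import struct

def to_wide(text):
    """Encode text as 16-bit little-endian code units (UTF-16)."""
    return text.encode("utf-16-le")

def as_u2(raw):
    """Reinterpret the same bytes as unsigned 16-bit integers (type U2)."""
    count = len(raw) // 2
    return list(struct.unpack("<%dH" % count, raw))

wide = to_wide("APL")
print(len(wide))    # two bytes per character
print(as_u2(wide))  # the raw Unicode values
```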

The Pascal string types are valid only in array definitions, not as simple data elements. They indicate a Pascal-type string where the first byte is the length of the string, followed by the string. APLX will automatically add the length byte before calling the external function, and will remove it before returning any result to APL. For example, P[63] indicates a Pascal string of up to 63 characters (i.e. 64 bytes including the length byte). PU[63] is the same, except that no character translation takes place.
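
The Pascal string layout is easy to reproduce by hand. The following Python sketch is illustrative only: Latin-1 stands in for the character translation, the buffer is padded with null bytes to its full size (an assumption), and the function names are hypothetical.

```python
def to_pascal(text, max_chars):
    """Build a P[max_chars] buffer: one length byte, then the characters."""
    if not 0 <= max_chars <= 255:
        raise ValueError("a Pascal string holds at most 255 characters")
    raw = text.encode("latin-1")[:max_chars]
    # Total buffer size is max_chars + 1 because of the leading length byte.
    return (bytes([len(raw)]) + raw).ljust(max_chars + 1, b"\x00")

def from_pascal(buffer):
    """Strip the length byte and return only the characters it counts."""
    length = buffer[0]
    return buffer[1 : 1 + length].decode("latin-1")

packed = to_pascal("APLX", 63)
print(len(packed), packed[:5])
print(from_pascal(packed))
```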

Structure alignment

When the function descriptor defines a structure using the {} notation, APLX will automatically try to ensure that the elements in the structure are aligned correctly relative to the start of the structure. This generally means that 2-byte elements (I2 and U2) are aligned on 2-byte boundaries, and 4-byte elements (I, I4, U, U4 and pointers) are aligned on 4-byte boundaries. APLX will automatically allow for any padding bytes required to achieve this.

Occasionally you may need to modify this default behavior. You can do this by adding an alignment modifier enclosed in curly brackets, just after the library name. This takes the form {a=N} where N is one of the values 1, 2, or 4. The effect of this is to specify the maximum alignment of any element; if the size of an element is less than this maximum, it will be aligned on its natural boundary. This gives the following results:

  • {a=1} No padding is inserted; elements will be packed together.
  • {a=2} Two-, four- and eight-byte elements (and pointers) will all be aligned on 2-byte boundaries only.
  • {a=4} Two-byte elements will be aligned on 2-byte boundaries. Four- and eight-byte elements (and pointers) will be aligned on 4-byte boundaries.
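
The effect of the three settings can be worked through with a short calculation. The Python sketch below follows the rule stated above (each element is aligned on its natural boundary, capped at N). It is an illustration rather than APLX's actual algorithm, it ignores any padding at the end of the structure, and the function name layout is hypothetical. Each element is given as a (size, natural alignment) pair, so a 64-byte Pascal string buffer has a natural alignment of 1.

```python
def layout(elements, max_align):
    """Return (offsets, total_size) for a list of (size, natural_align) pairs."""
    offsets = []
    position = 0
    for size, natural_align in elements:
        align = min(natural_align, max_align)  # natural boundary, capped at a=N
        position += -position % align          # padding needed to reach the boundary
        offsets.append(position)
        position += size
    return offsets, position

# A structure {I2 I4 P[63]}: 2-byte integer, 4-byte integer, 64-byte Pascal string
fsspec = [(2, 2), (4, 4), (64, 1)]
for a in (1, 2, 4):
    print(a, layout(fsspec, a))
```

With a=1 or a=2 the 4-byte integer starts at offset 2 and no padding is needed; with a=4 it moves to offset 4, and two padding bytes follow the first element.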

For example, under MacOS the QuickTime routines are in a library called 'QuickTimeLib'. This expects structures to be aligned according to Motorola 68000-style alignment rules, with four-byte elements aligned on 2-byte boundaries. The function OpenMovieFile takes as its first argument a pointer to an FSSpec structure comprising a 2-byte integer, a 4-byte integer and a 63-long Pascal string (see Example 6 below for more details). The second argument is a pointer to a location where a 2-byte integer will be written, and the third is a 1-byte integer. It returns a two-byte integer error code. You can define this function and specify the correct structure alignment as follows:

      ⎕NA 'I2 QuickTimeLib{a=2}|OpenMovieFile <{I2 I4 P[63]} >I2 I1'

Under MacOS, a special case applies for calls to the operating system shared library 'CarbonLib', which also uses 68000-style alignment rules. This special case is detected automatically by APLX, so you do not need to specify the alignment explicitly.

Using external functions

Once an external function has been defined using ⎕NA, you can use it much as you would an ordinary user-defined monadic (or possibly niladic) APL function. However, the right argument to the function must correspond to the data type which the external function expects. This means that where there is a parameter of type I or U, an APL integer value must be provided. (APLX will automatically convert your data to and from the 1-byte and 2-byte integer forms if appropriate.) Where character data or a Pascal string is specified, you must provide an APL character scalar or vector. Arrays must be represented by APL vectors, and structures by APL nested arrays.

External functions are saved with the workspace, and they can be copied using )COPY or placed in overlays using ⎕OV. You do not need to run ⎕NA again to re-enable the name association, although there is no harm in doing so.

Shared libraries under Windows

Under Windows, shared libraries usually have the file extension .dll (Dynamic Link Library). Windows will search for the DLL in the following places:

If 'SafeDllSearchMode' is enabled (the default for Vista and XP SP2 or later):

  1. The directory from which APLX was loaded.
  2. The system directory.
  3. The 16-bit system directory.
  4. The Windows directory.
  5. The current directory.
  6. The directories that are listed in the PATH environment variable.

For older versions of Windows:

  1. The directory from which APLX was loaded.
  2. The current directory.
  3. The system directory.
  4. The 16-bit system directory.
  5. The Windows directory.
  6. The directories that are listed in the PATH environment variable.

32-bit calls can be made to functions in a DLL which follow either the 'cdecl' or 'stdcall' calling convention. APLX automatically fixes up the stack as necessary.

Shared libraries under MacOS

The original shared library format under MacOS versions prior to MacOS X was accessed through the 'Code Fragment Manager'. In this format, shared libraries were files of type 'shlb' with an appropriate 'cfrg' resource. An example is 'CarbonLib', the library which implements most of the MacOS user-interface routines. APLX can call routines inside shared libraries directly using the ⎕NA mechanism. For this case, you should omit the file path and just give the name of the shared library.

In MacOS X, Apple introduced an extended shared-library file type known as a 'bundle'. In addition, a special directory known as a 'framework' (containing one or more bundles) is also available. (The core operating system libraries are contained in frameworks.) You can call both of these using ⎕NA. To access a 'bundle', the library name should be the full path to the bundle, for example 'WorkDisk:RocketScience:newton.bundle'. To access a framework, you should omit the full path, and give just the framework name, for example 'System.framework' or 'Carbon.framework'.

Shared libraries under Linux

Under Linux, shared libraries usually have the file extension .so. Linux will search for the shared library in the following places:

  1. The directories specified in the LD_LIBRARY_PATH environment variable, if any.
  2. /lib
  3. /usr/lib

Note that it is very common in Linux to have file links to shared libraries, sometimes with multiple versions of a library. For example, the current version of the library might be /usr/lib/libisc.so.9.1.5, specifying the exact version and build number, with a link /usr/lib/libisc.so pointing at it.
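
If you are unsure which file a link refers to, you can resolve it before writing the descriptor. The Python sketch below builds a scratch directory containing a versioned file and an unversioned link (the name libdemo is hypothetical) and resolves the link with os.path.realpath. It assumes a POSIX system on which symbolic links can be created.

```python
import os
import tempfile

def resolve_library(path):
    """Follow any symbolic links and return the real file a library name refers to."""
    return os.path.realpath(path)

def demo():
    """Create a versioned file plus an unversioned link, then resolve the link."""
    with tempfile.TemporaryDirectory() as folder:
        real = os.path.join(folder, "libdemo.so.9.1.5")  # hypothetical library file
        link = os.path.join(folder, "libdemo.so")
        open(real, "wb").close()
        os.symlink("libdemo.so.9.1.5", link)
        return os.path.basename(resolve_library(link))

print(demo())
```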

Example 1 (Windows)

The Windows GetTickCount function retrieves the number of milliseconds that have elapsed since Windows was started. The function prototype given in the Windows documentation is:

DWORD GetTickCount(VOID);

This means that the function takes no argument, and returns a DWORD (4-byte unsigned integer) result. The Windows documentation also states that this function is implemented in the library 'kernel32'. To declare this function in APLX for Windows, you would therefore write:

            ⎕NA 'U kernel32|GetTickCount'

This will create a niladic function in the APL workspace called GetTickCount. Each time you run it, it will return an integer giving the number of milliseconds that have passed since Windows started:

            GetTickCount
      22340594
            GetTickCount
      22342907

If you wanted to give a different APL name to the same function, you could provide a left argument to ⎕NA:

            'GETMS' ⎕NA 'U kernel32|GetTickCount'
            GETMS
      22343405

Example 2 (Windows)

The Windows GetSystemDirectory function retrieves the path of the Windows system directory. It exists in two versions, one using single-byte characters (GetSystemDirectoryA) and one for Unicode characters (GetSystemDirectoryW). The function prototype given in the Windows documentation is:

UINT GetSystemDirectoryA(LPTSTR lpBuffer, UINT uSize);

This means that the function takes two arguments, and returns a 4-byte unsigned integer result. The arguments are a pointer to a buffer where the result will be returned, and an integer giving the length of the buffer. The Windows documentation also states that this function is implemented in the library 'kernel32'. To declare this function in APLX for Windows, you would therefore write something like:

            'GetSystemDirectory' ⎕NA 'U kernel32|GetSystemDirectoryA >C[256] U'

This will create a monadic function in the APL workspace called GetSystemDirectory. Its right argument is a two-element vector: a dummy element for the buffer parameter (its value is not used, but an element must be supplied), and an integer giving the length of the buffer (up to 256). The function returns a nested vector. The first element is the result of the function itself; according to the Windows documentation, this is the length of the string copied to the buffer (or the buffer size required, if the buffer supplied is too small), or 0 if an error occurs. The second element is the string itself, corresponding to the > pointer argument. For example:

            GetSystemDirectory '' 255
       17 C:\WINNT\System32

Example 3 (Linux)

Under Linux, the getuid function returns the user ID. The function prototype given in the Linux documentation is:

uid_t getuid(void);

where the return type uid_t is a 4-byte integer result. The function exists in the 'libc' library, which on a typical system is held in '/lib/libc.so.6'. To declare this function in APLX for Linux, you would therefore write something like:

            ⎕NA 'U /lib/libc.so.6|getuid'
            getuid
      203

Example 4 (AIX)

Under AIX, the getpid function returns the process ID. The function prototype given in the AIX documentation is:

pid_t getpid(void);

where the return type pid_t is a 4-byte integer result. The function exists in the AIX kernel, which in AIX is referenced using the library name '/unix'. To declare this function in APLX for AIX, you would therefore write something like this (here we name the function PROCID in the APL workspace):

            'PROCID' ⎕NA 'U /unix|getpid'
            PROCID
      15482

Example 5 (MacOS Shared Library)

Under MacOS, the SetWTitle function allows you to change a window's title. The function prototype given in the MacOS documentation is:

void SetWTitle (WindowRef window, ConstStr255Param title);

where the first parameter is a 4-byte window reference (the same as the Handle property of a window in ⎕WI), and the second is a pointer to a Pascal string of up to 255 characters. The function exists in the 'CarbonLib' library under MacOS 9 and MacOS X. To declare and use this function in APLX for MacOS, you could write:

            'SETTITLE' ⎕NA 'Carbon.framework|SetWTitle U <P[255]'
            'W' ⎕WI 'New' 'Window'
            H←'W' ⎕WI 'Handle'
            SETTITLE H 'My new window'

You can also use the GetWTitle call to return the window's title as a Pascal string. The function prototype is:

void GetWTitle (WindowRef window, Str255 title);

To use this from APL, you could type:

            'GETTITLE' ⎕NA 'Carbon.framework|GetWTitle U >P[255]'
            GETTITLE H ''
      My new window

Example 6 (MacOS Shared Library)

This example shows a more complex case involving a structure. Under MacOS, the FSMakeFSSpec function is used to take a filename, a volume reference, and a directory reference, and fill in a structure of type FSSpec which can be used to access the file. The function prototype given in the MacOS documentation is:

OSErr FSMakeFSSpec (short vRefNum, long dirID, ConstStr255Param fileName, FSSpec *spec);

where OSErr is a two-byte signed integer, and the FSSpec structure is:

struct FSSpec {
   short         vRefNum;
   long          parID;
   StrFileName   name; /* a Str63 on MacOS*/
};

(This structure is aligned according to 68000-style alignment rules, so there is no padding). The third item in the structure is a Pascal string of up to 63 characters.

You could define and use this in APLX as:

            ⎕NA 'I2 Carbon.framework|FSMakeFSSpec I2 I4 <P[255] >{I2 I4 P[63]}'
            FSMakeFSSpec 0 0 ':APLX' (0 0 '')
      0  ¯2  ¯2078211007  APLX

In this example, we have passed in 0 for both the vRefNum and dirID parameters, and ':APLX' for the Pascal string fileName. We have used (0 0 '') as the dummy argument for the spec parameter pointer, which receives the result. The result is a two-element nested vector. The first element is the function return code, in this case 0. The second element is a nested array representing the structure. MacOS has filled in this structure with ¯2 for vRefNum, ¯2078211007 for parID, and 'APLX' for the Pascal string name.

Example 7 (MacOS X Framework)

Under MacOS X, a standard Unix-style programming environment exists in addition to the familiar Macintosh user-interface. This can be accessed through the System.framework interface. An example of a simple routine in System.framework is the gethostname function, which returns the network name of the Macintosh system. The function prototype given in the MacOS (BSD) documentation is:

int gethostname(char *hostname, int namelen);

where hostname is the address of a buffer where the returned string should be placed, and namelen is the length of the buffer. The function returns 0 if it is successful, else an error code. To declare this function in APLX for MacOS, you could write:

            ⎕NA 'I System.framework|gethostname >C[256] I'
            gethostname '' 256
      0 PowerMacG4

APLX has returned a two-element vector, with the first element being the explicit result 0, and the second the string which has been placed in the 256-long buffer. This example will not work under MacOS 9, which does not implement frameworks or the BSD interface.
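
As a cross-check on the prototype, the same C routine can be called from another foreign-function interface. The Python sketch below uses the standard ctypes module on a POSIX system. It is not APLX code, and it assumes that the C library is already loaded into the running process and that the length parameter is a size_t, as in current POSIX documentation. Like the APLX function, it returns the explicit result together with the contents of the output buffer.

```python
import ctypes

def gethostname(buffer_size=256):
    """Call the C gethostname routine; return (status, name)."""
    libc = ctypes.CDLL(None)  # symbols already loaded into this process (POSIX only)
    libc.gethostname.argtypes = [ctypes.c_char_p, ctypes.c_size_t]
    libc.gethostname.restype = ctypes.c_int
    buffer = ctypes.create_string_buffer(buffer_size)  # plays the role of >C[256]
    status = libc.gethostname(buffer, buffer_size)
    # buffer.value stops at the first null byte, like a returned C string
    return status, buffer.value.decode()

status, name = gethostname()
print(status, name)
```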

Errors

Note that, when you define the function using ⎕NA, only minimal error checking is done; if the routine descriptor is invalid you will get a DEFN ERROR when you try to use the function. If the library cannot be found, LOGICAL UNIT NOT FOUND will be generated. If the library is found, but the function name is not exported by the library, APLX will generate COMPONENT NOT FOUND. If the argument given to the function does not match the routine descriptor, APLX will report DOMAIN ERROR or possibly LENGTH ERROR.

Finally, you should be aware that a descriptor which does not correctly match the parameters expected by the external function may cause a fatal error and possibly crash APL.

Special considerations for Client-Server implementations of APLX

In Client-Server implementations of APLX, the APLX interpreter itself (the "Server") runs on one system, and the front-end which implements the user-interface (the "Client") runs either on a different machine, or under a different operating environment on the same machine. Typically, the Client might be a 32-bit Windows application running on a desktop PC, and the Server might be a 64-bit interpreter (APLX64) running either on the same machine, or on a different machine (such as a Linux server) over a network.

In such systems, ⎕NA allows you to specify whether the call should take place on the Client or Server side. You do this by preceding the ⎕NA function descriptor with either an Up Arrow (↑) to indicate that the call should take place on the Client, or a Down Arrow (↓) to indicate that the call should take place on the Server. If you do not specify, the default is that the call should take place on the Client.
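
The rule can be summarised in a few lines. The following Python sketch is purely illustrative (the function name call_side is hypothetical, and this is not how APLX is implemented): it inspects the first character of a descriptor and reports where the call would take place.

```python
UP_ARROW = "\u2191"    # precedes a descriptor to request a Client-side call
DOWN_ARROW = "\u2193"  # precedes a descriptor to request a Server-side call

def call_side(descriptor):
    """Return (side, descriptor without any arrow prefix)."""
    if descriptor.startswith(DOWN_ARROW):
        return "server", descriptor[1:]
    if descriptor.startswith(UP_ARROW):
        return "client", descriptor[1:]
    return "client", descriptor  # no prefix: the Client is the default

print(call_side(DOWN_ARROW + "kernel32|GetSystemInfo"))
print(call_side("U kernel32|GetTickCount"))
```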

In practice, there are two main cases to consider:

Client and Server run on the same physical machine

This case occurs if you are running APLX64 on a desktop 64-bit PC, because the front-end (Client) is still a 32-bit application, and the interpreter (Server) is a 64-bit application. Thus, you make 32-bit operating-system or library calls from the Client side, and 64-bit operating-system or library calls from the Server side.

For example, consider the Windows operating-system call GetSystemInfo. This takes as an argument a pointer to a SYSTEM_INFO structure, which it fills in with details of the current system:

typedef struct _SYSTEM_INFO {
    DWORD  dwOemId;
    DWORD  dwPageSize;
    LPVOID lpMinimumApplicationAddress;
    LPVOID lpMaximumApplicationAddress;
    DWORD  dwActiveProcessorMask;
    DWORD  dwNumberOfProcessors;
    DWORD  dwProcessorType;
    DWORD  dwAllocationGranularity;
    WORD   wProcessorLevel;
    WORD   wProcessorRevision;
} SYSTEM_INFO;

The fields of type WORD and DWORD are 16-bit and 32-bit unsigned integers respectively. The fields of type LPVOID are pointers, i.e. 32-bit values in 32-bit versions of Windows, and 64-bit values in 64-bit versions of Windows. Under Windows XP64, 32-bit applications (such as the Client program) run within a virtual 32-bit environment, and access different versions of the operating-system libraries (confusingly, the 32-bit versions reside in C:\WINDOWS\SysWOW64, and the 64-bit versions in C:\WINDOWS\system32). We can therefore make either the 32-bit or 64-bit version of this system call from within APLX64 for Windows:

Making the 32-bit call:

      ⎕NA 'kernel32|GetSystemInfo >{U4 U4 U4 U4 U4 U4 U4 U4 U2 U2}'    
      GetSystemInfo ''
 0  4096  65536  2147418111  1  1  586  65536  15  9218

Making the 64-bit call:

      'GetSystemInfo64' ⎕NA '↓kernel32|GetSystemInfo >{U4 U4 U8 U8 U4 U4 U4 U4 U2 U2}'
      GetSystemInfo64 ''
 9  4096  65536  8796092956671  1  0  1  8664  0  1

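In the second (64-bit) example, the down arrow has been used to force the call to take place on the Server side. Note that the third and fourth elements in the structure (the minimum and maximum application addresses) are 32-bit integers (U4) in the 32-bit version, and 64-bit integers (U8) in the 64-bit version. Note also that the fourth value returned, which is the maximum application address, is limited (by Windows) to 2GB in the 32-bit case. Be aware that current Windows headers declare dwActiveProcessorMask as DWORD_PTR, which is 8 bytes wide on 64-bit Windows; if that applies to your system, the fifth element of the 64-bit descriptor should be U8 rather than U4. This would explain why the 64-bit output above shows 0 for the number of processors and 8664 (the AMD64 processor type) in the allocation-granularity position.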

Client and Server run on different physical machines

In this case, any ⎕NA calls made on the Client side will occur on the desktop machine, and any calls made on the Server side will occur on the remote server machine. The two systems may be running different operating systems and possibly even run under completely different processor architectures.

As an example, suppose you are running the APLX64 Server on a twin-processor Xeon Linux 64-bit system, and running the 32-bit APLX client on a 32-bit Windows PC. In these circumstances, you can define two external functions, one which makes a call similar to Example 2 above (a 32-bit Windows call) on the Client machine, and another which makes a 64-bit Linux call on the Server:

      'GetSystemDirectory' ⎕NA '↑U kernel32|GetSystemDirectoryA >C[256] U'
      GetSystemDirectory '' 255
 17 C:\WINNT\system32
      ⎕NA '↓/lib64/libc.so.6|getcwd >C[512] U8'
      getcwd '' 512
/home/david/aplx64
