Topic: APLX Help : Help on APL language : System Functions & Variables : ⎕SQL Interface to External Database
[ Previous | Next | Contents | Index | APL Home ]

www.microapl.co.uk

⎕SQL Interface to External Database


The system function ⎕SQL provides an extensible interface to a standard relational database such as Oracle, SQL Server, IBM DB2, MySQL, or PostgreSQL, using Structured Query Language (SQL). It can also be used (via ODBC) to access data from spreadsheets, Microsoft Access, DB2 files, and many other data sources. You can use it to retrieve data from relational databases as nested arrays in the workspace, or to update a database directly from within APL.

To use ⎕SQL, you need to be familiar with database concepts and with the SQL language. Depending on the database you are using, you may also need to know something about setting up ODBC. There are many standard textbooks and websites where you can learn more about these topics. In addition, most database systems are supplied with a sample set of tables which you can use to work through examples of SQL syntax.

In most cases, ⎕SQL interfaces to the database by calling ODBC (Open DataBase Connectivity). Under Windows, this should normally be installed as part of the system software, but you will probably need to configure the data sources which it interfaces to (see the Windows Help system, or the Microsoft web site http://www.microsoft.com for further information). Under Linux and AIX, APLX uses unixODBC; this is included in many Linux distributions, or can be downloaded from http://www.unixodbc.org. Under MacOS X, APLX interfaces to the open-source iODBC software, which is distributed as part of the MacOS X system software (see the iODBC web site http://www.iodbc.org). ⎕SQL is not supported under Mac OS 9.

To make full use of ⎕SQL, you may also need a fully-featured relational database. Nearly all database systems can be accessed via ODBC, so ⎕SQL can be used to interface to all the major commercial database systems such as Oracle, Sybase and SQL Server. There are also several open-source databases available free of charge; these include MySQL and PostgreSQL (either or both of these are included in many Linux distributions). IBM's DB2 Personal Edition is also available free of charge (subject to IBM's license terms), and provides a very high-specification database system for single-machine use.

Note: Because APLX ships on multiple platforms, and ⎕SQL can be used to interface to hundreds of different databases and other data sources, the examples shown below will probably behave differently or need to be adapted for your system and database manager.

Syntax

In most cases, ⎕SQL will be used monadically. The right argument is usually a nested vector, the first element of which is a keyword (such as 'Connect' or 'Execute'), and the remaining elements of which are parameters specific to the operation being carried out. (For certain operations, which take no parameters, a simple character vector argument of just the keyword can be supplied.) Keywords are case-insensitive.

If you are connecting simultaneously to more than one database, then you also need to supply a left argument to ⎕SQL. This is a numeric integer which is used as a tie number to indicate which database you wish to access. If you omit the left argument, a default tie number of 0 is assumed.

Result

In general, the result of ⎕SQL is a length-three nested vector, comprising:

  • An integer vector giving information about whether the operation was successful. This consists of four integers.

    The first element is a code indicating the source of any error. A 0 indicates that no error occurred. A non-zero first element indicates that an error was reported, with 1 meaning that APLX itself reported an error, 2 indicating that the interface layer between APLX and the Database Driver reported an error, and 3 indicating that the driver or underlying database reported an error.

    The second element is the numeric error code, if any, which was returned by the software which reported the error. For example, for an ODBC interface it is the ODBC error number.

    The third element is a 'warning' code. 0 indicates no warning.

    The fourth element indicates whether more data might be available, for cases where you are retrieving data in chunks. 0 means that no more data is available. 1 means that more data might be available.

  • The second item of the nested vector result is a character string, containing any error message as text. It is an empty vector if there is no error message.

  • The third item of the nested vector result is the data (if any) returned by the SQL operation. If no data is returned, it is an empty vector.

It is recommended that you always use multiple assignment to split the result from ⎕SQL into these three constituent parts, as in this example:

      (RC ERRMSG DATA)←⎕SQL 'Connect' 'aplxdb2'
      RC
3 ¯1024 0 0  
      ERRMSG      
[IBM][CLI Driver] SQL1024N  A database connection does not exist.  SQLSTATE=08003

In this example, an error has been reported by IBM's DB2 driver software.

This method makes it very easy to check for errors by seeing if the first element of RC is non-zero.

Connecting to a database

Syntax:
[Tie] ⎕SQL 'Connect' Interface ConnectionString
[Tie] ⎕SQL 'Connect' Interface DataSourceName [User] [Password]

Before carrying out any database operation, you need to cause APLX to load the appropriate interface library, and connect to the underlying database. This is done using the ⎕SQL 'Connect' operation. As shown above, there are two variants of the syntax, depending on how you want to specify the database connection.

The first parameter is the name of the interface driver. This is a shared library which interfaces between APLX and the database manager. It can be either a simple shared-library name (such as 'aplxodbc'), or a full path name (such as 'c:\apl\drivers\aplxodbc.dll'). In the former case, which is normally preferred because it is operating-system independent, APLX will load the driver from the bin directory of the APLX installation; the shared-library extension (.dll for Windows, .so for Linux and AIX, or .bundle for Mac OS X) will be added automatically. In the latter case, the file name will be used as supplied, and you need to provide the file extension. If the library cannot be loaded, either because the APLX interface driver is not found, or because the underlying database manager library is not properly installed, APLX will report the error LOGICAL UNIT NOT FOUND.

Normally, you will use one of the drivers supplied as standard with APLX. These are currently:

For 32-bit versions of APLX:

  • aplxodbc - Interface to Microsoft ODBC (under Windows), or UnixODBC (under Linux x86 or AIX), or iODBC (under MacOS X).
  • aplxdb2 - Direct interface to IBM's DB2 database.

For APLX64:

  • aplx64odbc - Interface to Microsoft ODBC (under Windows XP 64), or UnixODBC (under Linux x86_64).

The second and subsequent parameters depend on the exact database interface you are using, and on how your system is set up. If you are using ODBC, you can specify either a 'Connection String' or a 'Data Source Name' (DSN). Other database interfaces use a Database Alias, which is similar to a DSN.

Specifying the database with a Connection String:

A Connection String is a text vector containing a series of semicolon-delimited phrases, each of which is of the form 'keyword=value;' . These phrases fully or partially define the connection to the database. The parameters typically include the driver name, user name, connection parameters and so on, plus any driver-specific information such as the name of an Excel spreadsheet, the name of the database alias, or the network address of the database server. (Technical note: This method of connection uses the underlying SQLDriverConnect ODBC call.)

Under Windows, if the Connection String is incomplete (i.e. if it does not fully define all the parameters needed to make the connection), the ODBC sub-system will display a dialog which allows you to set up the remaining parameters of the connection interactively. You can even specify an empty vector, in which case you can choose the ODBC driver, database, and all other parameters which are required. For MacOS, Linux, and AIX you need to specify a full connection string.

On successful connection, ⎕SQL returns as the third element of the result the complete connection string; you can store this, and use it for subsequent connections (without any dialog displaying).

For example, suppose you want to use ⎕SQL under Windows to access a table saved in an Excel spreadsheet. You can specify an empty connection string:

    (RC ERRMSG CNS) ← ⎕SQL 'Connect' 'aplxodbc' ''

A dialog should appear. Select the 'File DSN' tab, and press the New... button. Select the driver 'Microsoft Excel Driver', and save the DSN in some suitable scratch directory. When you click Next.. , the ODBC Excel driver should ask you for the name of the spreadsheet containing the data. (You can use the sample spreadsheet 'alcohol.xls' which is supplied with APLX in the 'ws' directory of the APLX installation, usually C:\Program Files\MicroAPL\APLX\ws). This should establish a connection to the Excel spreadsheet as though it were a database, and the CNS variable defined above should contain the full connection string which you can use to repeat the connection without the dialog:

      CNS
DBQ=C:\Program Files\MicroAPL\APLX\ws\alcohol.xls;DefaultDir=C:\Program Files\MicroAPL\
      APLX\;Driver={Microsoft Excel Driver (*.xls)};DriverId=790;FIL=excel 8.0;FI
      LEDSN=C:\Program Files\Common Files\ODBC\Data Sources\sample.dsn;MaxBufferSi
      ze=2048;MaxScanRows=8;PageTimeout=5;ReadOnly=1;SafeTransactions=0;Threads=
      3;UID=admin;UserCommitSync=Yes;

You should now be able to access the data using SQL commands, as described below:

      (RC ERRMSG DATA) ← ⎕SQL 'Do' 'select * from alcohol'
      RC
0 0 0 0      
      ⍴DATA
30 18
      3 8↑DATA
 Australia  9.4 11.6 12.9 11.7 10.5   10  9.7
 Austria   10.9 13.9 13.8 12.1 12.6 12.8 12.1
 Belgium    8.9 12.3   14 12.9 12.1 11.5 11.7

Specifying the database with a Data Source Name:

If the third parameter to 'Connect' is a text vector which does not end in a semi-colon and is not empty, APLX will assume that it represents a Data Source Name (DSN). A DSN is effectively a pre-defined Connection String, which encapsulates all the details needed to connect to the database. It can be stored in a text file as a "File DSN", or in the registry (or UnixODBC configuration file) as a "User DSN" or "System DSN". It is identified by a simple name. In some cases, the User Name and Password have to be entered as extra parameters. (Technical note: This method of connection uses the underlying SQLConnect ODBC call.)

Under Windows, you set up DSNs using Microsoft's 'ODBC Data Source Administrator' dialog. Under Linux or MacOS, you set them up using UnixODBC or iODBC configuration files. See the appropriate ODBC documentation for details, which will depend on the database you are using and where the data is held.

Example using ODBC to access data from an Oracle Database. The Data Source Name has been configured in advance to be 'Oracle Sample', but you need to specify the user name and password:

      ⎕SQL 'Connect' 'aplxodbc' 'Oracle Sample' 'scott' 'tiger'
 0 0 0 0

Specifying the database with a Database Alias:

For DB2 and other database-specific (non-ODBC) drivers, the second parameter to Connect is the database alias, which is similar to a DSN. The third and fourth parameters are the User ID and Password, if these are required by the database.

Example using IBM's DB2, assuming you have a database alias called 'DB2' defined:

      ⎕SQL 'Connect' 'aplxdb2' 'DB2'
 0 0 0 0

Connecting to multiple databases

If you want to connect to more than one database simultaneously, you will need to specify an integer left argument to ⎕SQL when you connect. You then use this tie number in all subsequent operations, to identify which database you want to access.

Checking which connections are open

Syntax: ⎕SQL 'Connections'

This returns an integer vector of the ⎕SQL tie numbers in use, or an empty vector if there are none. (Note that this is an exception to the general rule that ⎕SQL returns a nested vector of Return Codes, Error Message, Data).

Disconnecting from a database

Syntax: [Tie] ⎕SQL 'Disconnect'

You must always disconnect from the database when you have finished using it, otherwise you may continue to tie up resources and in some cases may prevent other users from accessing the database.

Disconnecting automatically closes any database statements you have open. Depending on the underlying driver, it will usually also roll back any uncommitted transactions.

You can disconnect from all active databases using the statement:

      0 0⍴(⎕SQL 'Connections') ⎕SQL ¨⊂'Disconnect'

Executing simple database operations

Syntax: [Tie] ⎕SQL 'Do' Statement

The 'Do' keyword allows you to execute almost any arbitrary SQL operation immediately. It takes a single parameter, which is a text vector containing the SQL statement you want to perform. For example, this can be a SELECT or UPDATE, or a more specialized operation such as GRANT. (However, it should not be a COMMIT or ROLLBACK, since these are done using the specific syntax described below). Any data returned by the database will be returned in the third element of the result from ⎕SQL. To avoid getting a WS FULL error, make sure you do not retrieve large data sets using this method.

For example:

      ⎕DISPLAY ⎕SQL 'DO' 'SELECT * FROM CL_SCHED'
┌→────────────────────────────────────────────────────┐
│ ┌→──────┐ ┌⊖┐ ┌→──────────────────────────────────┐ │
│ │0 0 0 0│ │ │ ↓ ┌→──────┐   ┌→───────┐ ┌→───────┐ │ │
│ └~──────┘ └─┘ │ │CL3RPN │ 3 │09:00:00│ │10:30:00│ │ │
│               │ └───────┘   └────────┘ └────────┘ │ │
│               │ ┌→──────┐   ┌→───────┐ ┌→───────┐ │ │
│               │ │CL4SDM │ 3 │11:00:00│ │12:30:00│ │ │
│               │ └───────┘   └────────┘ └────────┘ │ │
│               │ ┌→──────┐   ┌→───────┐ ┌→───────┐ │ │
│               │ │CL4DSE │ 2 │09:00:00│ │10:30:00│ │ │
│               │ └───────┘   └────────┘ └────────┘ │ │
│               └∊──────────────────────────────────┘ │
└∊────────────────────────────────────────────────────┘

In this example, ⎕SQL has returned a three-element nested vector as normal. The first element is the vector of error information, with all zeroes indicating that no error occurred and that no more data is available. The second element is the error message, which is empty because no error occurred.The third element is the result from the SQL query. In this case, a nested matrix of three rows of data has been returned, with one column per table column.

Using the 'Do' keyword has the advantage of being very simple. However, it has two disadvantages which mean that it will often be better to use the multi-stage 'Prepare' and 'Execute' operations described below. These are, firstly, that you have no control over the amount of data which is returned to you, and secondly that you cannot 'bind' APL data to your SQL operation at run-time - any parameters need to be hard-coded as text in the SQL statement which you execute. For these reasons, we recommend that you do not use the 'Do' keyword in applications for executing SQL 'Select' statements, unless you are certain that the data set will always be small and that there is therefore no danger of a WS FULL error being generated at some time in the future when the database may have grown.

Dynamic SQL statements

As an alternative to executing a statement directly using the 'Do' keyword, you can prepare a statement in advance, optionally bind parameters to it, and then execute it, retrieving any results in chunks rather than all in one go. To do this, you carry out the following steps:

  • Prepare a named SQL statement using the 'Prepare' keyword. This statement will usually be a SELECT, UPDATE or INSERT. You can have several such statements open simultaneously. The statement can optionally include place-holders for parameters which are supplied later from APL variables.
  • Execute the named SQL statement using the 'Execute' keyword. If the SQL statement which you have prepared requires parameters, these are supplied at this stage.
  • If the statement which you have executed returns a result set, you can now loop round using the 'Fetch' keyword, retrieving the results in stages (you specify the maximum number of rows to retrieve each time).
  • Finally, you use the 'Close' keyword to close the statement and release resources associated with it. Alternatively, you can re-execute it with a different set of parameters.

Preparing the dynamic SQL statement

Syntax: [Tie] ⎕SQL 'Prepare' Name Statement

This operation creates a new SQL statement. The parameter Name is a character vector which is the internal name by which you will refer to the statement subsequently (names are case-sensitive).

The parameter Statement is the SQL statement you want to execute. This can optionally include ? characters; these act as placeholders for APL data which will be substituted when the statement is executed. For example:

      ⎕SQL 'Prepare' 'cs1' "UPDATE CL_SCHED SET DAY=? WHERE CLASS_CODE=?"
 0 0 0 0

In this example, 'cs1' is the name of the statement, which will be used to identify it subsequently. The SQL statement is an UPDATE, with two parameters which will be filled in when it is executed.

Executing the prepared statement

Syntax: [Tie] ⎕SQL 'Execute' Name [Param1] [Param2]...

This causes the prepared statement to be submitted to the database for execution. Name is the name of the statement (i.e. the name you supplied when it was prepared). If there were any ? placeholders in the statement, you need to provide one element for each such placeholder; these will be substituted at this stage. For example:

      ⎕SQL 'Execute' 'cs1' 2 'CL4DSE'
 0 0 0 0   SQL_CURSH200C4

In this example, the numeric data 2 is substituted for the first ? in the SQL statement, the character vector 'CL4DSE' is substituted for the second, and the statement is executed.

The result of the operation follows the standard pattern. If a data set is available (i.e. the statement is something like a SELECT), a 1 will be placed in the fourth element of the return-code part of the result - you can retrieve the data using the 'Fetch' keyword.

When you use the 'Execute' keyword, the third element of the result of ⎕SQL is the name of the SQL cursor (if any) allocated by the underlying database. This can be used for advanced update operations using the SQL phrase 'WHERE CURRENT OF'.

Fetching the result set

Syntax: [Tie] ⎕SQL 'Fetch' Name [Count]

If the statement you have executed returns a result set, you can fetch the data in chunks, using the 'Fetch' keyword. Name is the name of the statement you have just executed. The optional Count parameter is the maximum number of rows to return (if you omit this, APLX will try to return all the rows). The result follows the standard pattern, with a 1 in the fourth element of the return code vector indicating that more data might be available, so you should call Fetch again. The data itself is returned as a matrix (usually nested), as the third element of the result.

For example, suppose we have a table with just three rows. We can retrieve these immediately using the 'Do' keyword:

      ⎕SQL 'DO' 'SELECT * FROM CL_SCHED'
 0 0 0 0    CL3RPN  3 09:00:00 10:30:00
            CL4SDM  3 11:00:00 12:30:00
            CL4DSE  2 09:00:00 10:30:00

Alternatively, we can use dynamic SQL with parameter substitution as follows, in this case retrieving one row at a time:

      ⎕SQL 'Prepare' 'cs2' "SELECT * FROM CL_SCHED where Starting=?"
 0 0 0 0
      ⎕SQL 'Execute' 'cs2' '9:00'
 0 0 0 1   SQL_CURSH200C4
      ⎕SQL 'Fetch' 'cs2' 1
 0 0 0 1    CL3RPN  3 09:00:00 10:30:00
      ⎕SQL 'Fetch' 'cs2' 1
 0 0 0 1    CL4DSE  2 09:00:00 10:30:00
      ⎕SQL 'Fetch' 'cs2' 1
 0 0 0 0    
      ⎕SQL 'Close' 'cs2'
 0 0 0 0

In this example, we have selected only those rows where the column 'Starting' (a time) is 9:00. The first time we call 'Fetch', we get the first row only because we have set the maximum number of rows to be returned to be 1. The return code indicates that more data might be available, so we call 'Fetch' again to retrieve the next row. Finally, the call to 'Fetch' returns an empty matrix as the third element (because we have now successfully read all the data), and the 'more' indicator is 0, so there is no more data to be read. We 'Close' the statement.

In a real example, using a large database, there might be many thousands of rows returned. For efficiency, you would probably retrieve these in blocks of several hundred rows (depending on the workspace available and your processing requirements).

Note that the 'more' indicator of 1 means only that more data might be available; when you come to fetch it, the database may report that you are in fact at the end of the data set, and return no more rows. Once this happens, the 'more' indicator will revert to 0.

Describing the result set

Syntax: [Tie] ⎕SQL 'Describe' Name

After you have prepared or executed a statement, you can use the 'Describe' keyword to determine the types of the columns (if any) which will be returned. (Doing this before the staement is executed is valid, but may be inefficient, depending on the underlying database). The data description is returned as a four-row matrix as the third element of the result of ⎕SQL, with one column per column of the result set. The four rows are Column Name, Column Label, SQL Data type string (eg 'DECIMAL 7.2'), and a binary flag indicating whether NULL is valid for the column. For example:

      ⎕SQL 'Prepare' 'cs1' 'SELECT * FROM CL_SCHED'
 0 0 0 0
      ⎕SQL 'Execute' 'cs1' 
 0 0 0 1   SQL_CURSH200C4
      ⎕SQL 'Describe' 'cs1' 
 0 0 0 0    CLASS_CODE      DAY STARTING ENDING
            CLASS_CODE      DAY STARTING ENDING
                CHAR 7 SMALLINT     TIME   TIME
                     0        1        1      1

In this example, the first column of the data set is called CLASS_CODE, and its label is also CLASS_CODE. It is of type CHAR, with a length of 7 characters. It cannot be NULL. The second column is called DAY, and is a small integer, which can be NULL. The third and fourth columns are called STARTING and ENDING, are of type TIME, and can be NULL.

Closing the prepared statement

Syntax: [Tie] ⎕SQL 'Close' Name

This closes the prepared statement and releases resources associated with it.

Transactions and Isolation Levels

One of the key features of most modern relational database systems is the ability to provide a consistent view of the data even when multiple users are changing it simultaneously, and to roll-back any changes if an error occurs within a sequence of updates. ⎕SQL provides access to these facilities (provide they are supported by the underlying database) as follows:

Syntax: [Tie] ⎕SQL 'Autocommit' [Value]

This feature allows you to specify whether each SQL operation should be regarded as a separate transaction. If you set Value to 1 (which is the default), a COMMIT will implicitly be executed after every SQL operation you perform. This means that you will not be able to roll-back changes if an error occurs. If Value is set to 0, you will need to use the 'Commit' or 'Rollback' keywords explicitly when you make any changes to the database. If you omit Value, the current state of the parameter is returned as the third element of the result of ⎕SQL. Note that only full database systems such as Oracle or DB2 support transactions; if you are using ODBC to access simple filing systems such as Excel tables or text files, Autocommit is always 1.

Caution: If you set 'Autocommit' to 0, you must remember to Commit any changes before ending the session, otherwise they will be rolled back when you disconnect from the database.

Syntax: [Tie] ⎕SQL 'Commit'

Causes any pending changes to the database to be permanently committed. This has no effect if 'Autocommit' is set to 1.

Syntax: [Tie] ⎕SQL 'Rollback'

Causes any pending changes to the database to be thrown away, and the database returned to the state it was in at the time of the most recent COMMIT. This has no effect if 'Autocommit' is set to 1, since in this case all changes will automatically have already been committed.

Syntax: [Tie] ⎕SQL 'Isolation' [Code]

Allows you to set or query the method used to handle simultaneous changes to the database by multiple users (if this is supported by the underlying database). The parameter Code is a two-character string, which sets the isolation level as follows:

Code ODBC Name DB2-style name
UR SQL_TXN_READ_UNCOMMITTED Uncommitted Read
CS SQL_TXN_READ_COMMITTED Cursor Stability
RS SQL_TXN_REPEATABLE_READ Read Stability
RR SQL_TXN_SERIALIZABLE Repeatable Read

If you omit the Code parameter, the current setting is returned as the third element of the result.

Note that not all database systems and drivers support all these isolation levels, or any at all; if you try to use an unsupported case, an SQL error will be returned. For example, the Excel driver under ODBC does not support isolation levels and rollbacks:

      ⎕SQL 'Isolation'
 3 106 0 0  [Microsoft][ODBC Excel Driver]Optional feature not implemented

Determining available Data Source Names

Syntax: [Tie] ⎕SQL 'Datasources' Interface

This call allows you to determine which Data Sources are available. It takes a single parameter, which is the name of the APLX database interface (usually 'aplxodbc' for 32-bit versions of APLX, or 'aplx64odbc' for APLX64). The data returned as the third element of the result is a nested matrix of two columns. The first column is the Data Source Name (i.e. the name you should supply to 'Connect'), the second the driver description. The results will depend on what User and System Data Sources you have set up on your system. In this example, we have five DSNs defined:

      ⎕DISPLAY ⎕SQL 'Datasources' 'aplxodbc'
┌→───────────────────────────────────────────────────────────────────────────┐
│ ┌→──────┐ ┌⊖┐ ┌→─────────────────────────────────────────────────────────┐ │
│ │0 0 0 0│ │ │ ↓ ┌→─────────────────┐  ┌→──────────────────────────────┐  │ │
│ └~──────┘ └─┘ │ │MS Access Database│  │Microsoft Access Driver (*.mdb)│  │ │
│               │ └──────────────────┘  └───────────────────────────────┘  │ │
│               │ ┌→──────────┐         ┌→─────────────────────────────┐   │ │
│               │ │dBASE Files│         │Microsoft dBase Driver (*.dbf)│   │ │
│               │ └───────────┘         └──────────────────────────────┘   │ │
│               │ ┌→───────────┐        ┌→─────────────────────────────┐   │ │
│               │ │Excel Sample│        │Microsoft Excel Driver (*.xls)│   │ │
│               │ └────────────┘        └──────────────────────────────┘   │ │
│               │ ┌→───┐                ┌→─────────┐                       │ │
│               │ │MQIS│                │SQL Server│                       │ │
│               │ └────┘                └──────────┘                       │ │
│               │ ┌→──┐                 ┌→──────────────────┐              │ │
│               │ │DB2│                 │IBM DB2 ODBC DRIVER│              │ │
│               │ └───┘                 └───────────────────┘              │ │
│               └∊─────────────────────────────────────────────────────────┘ │
└∊───────────────────────────────────────────────────────────────────────────┘

Note: Microsoft's ODBC driver will not allow you to use this call once you have connected to a data source.

Examining the catalog of tables

Syntax: [Tie] ⎕SQL 'Tables' [Catalog] [Schema] [Table] [Types]

Returns (as the third element of the result) a nested array giving details of the tables and pseudo-tables available in the database (including system tables and views). The details returned are contained in a five-column array, where each row represents a table in the database. The columns are:

  • The name of the catalog
  • The name of the schema
  • The name of the table or pseudo-table
  • The type of table or pseudo-table, one of: 'TABLE', 'VIEW', 'INOPERATIVE VIEW', 'SYSTEM TABLE', 'ALIAS', or 'SYNONYM'
  • Descriptive information about the table or pseudo-table

The result from this call can be very large, so in most cases you should provide parameters to limit the scope of the enquiry. This is done by providing up to four character vectors as parameters. The first three are pattern-matching strings for the Catalog, Schema and Table name respectively. An underscore (_) character stands for any single character, a percent (%) character stands for any sequence of zero or more characters, and other characters stand for themselves. The case of a letter is significant. An empty vector matches against any string. The fourth parameter contains a list of table types you are interested in, separated by commas if there are more than one. Valid table type identifiers are: TABLE, VIEW, SYSTEM TABLE, ALIAS, SYNONYM. If this parameter is an empty vector, this is equivalent to specifying all of the possibilities for the table type identifier.

For example, this statement returns all user tables, but not Views or System tables:

      ⎕DISPLAY ⎕SQL 'TABLES' '' '' '' 'TABLE'
┌→─────────────────────────────────────────────────────┐
│ ┌→──────┐ ┌⊖┐ ┌→───────────────────────────────────┐ │
│ │0 0 0 0│ │ │ ↓ ┌⊖┐ ┌→──┐ ┌→───────┐   ┌→────┐ ┌⊖┐ │ │
│ └~──────┘ └─┘ │ │0│ │RPN│ │CL_SCHED│   │TABLE│ │0│ │ │
│               │ └~┘ └───┘ └────────┘   └─────┘ └~┘ │ │
│               │ ┌⊖┐ ┌→──┐ ┌→─────────┐ ┌→────┐ ┌⊖┐ │ │
│               │ │0│ │RPN│ │DEPARTMENT│ │TABLE│ │0│ │ │
│               │ └~┘ └───┘ └──────────┘ └─────┘ └~┘ │ │
│               │ ┌⊖┐ ┌→──┐ ┌→──────┐    ┌→────┐ ┌⊖┐ │ │
│               │ │0│ │RPN│ │EMP_ACT│    │TABLE│ │0│ │ │
│               │ └~┘ └───┘ └───────┘    └─────┘ └~┘ │ │
│               │ ┌⊖┐ ┌→──┐ ┌→────────┐  ┌→────┐ ┌⊖┐ │ │
│               │ │0│ │RPN│ │EMP_PHOTO│  │TABLE│ │0│ │ │
│               │ └~┘ └───┘ └─────────┘  └─────┘ └~┘ │ │
│               │ ┌⊖┐ ┌→──┐ ┌→─────────┐ ┌→────┐ ┌⊖┐ │ │
│               │ │0│ │RPN│ │EMP_RESUME│ │TABLE│ │0│ │ │
│               │ └~┘ └───┘ └──────────┘ └─────┘ └~┘ │ │
│               │ ┌⊖┐ ┌→──┐ ┌→───────┐   ┌→────┐ ┌⊖┐ │ │
│               │ │0│ │RPN│ │EMPLOYEE│   │TABLE│ │0│ │ │
│               │ └~┘ └───┘ └────────┘   └─────┘ └~┘ │ │
│               │ ┌⊖┐ ┌→──┐ ┌→──────┐    ┌→────┐ ┌⊖┐ │ │
│               │ │0│ │RPN│ │IN_TRAY│    │TABLE│ │0│ │ │
│               │ └~┘ └───┘ └───────┘    └─────┘ └~┘ │ │
│               │ ┌⊖┐ ┌→──┐ ┌→──┐        ┌→────┐ ┌⊖┐ │ │
│               │ │0│ │RPN│ │ORG│        │TABLE│ │0│ │ │
│               │ └~┘ └───┘ └───┘        └─────┘ └~┘ │ │
│               │ ┌⊖┐ ┌→──┐ ┌→──────┐    ┌→────┐ ┌⊖┐ │ │
│               │ │0│ │RPN│ │PROJECT│    │TABLE│ │0│ │ │
│               │ └~┘ └───┘ └───────┘    └─────┘ └~┘ │ │
│               │ ┌⊖┐ ┌→──┐ ┌→────┐      ┌→────┐ ┌⊖┐ │ │
│               │ │0│ │RPN│ │SALES│      │TABLE│ │0│ │ │
│               │ └~┘ └───┘ └─────┘      └─────┘ └~┘ │ │
│               │ ┌⊖┐ ┌→──┐ ┌→────┐      ┌→────┐ ┌⊖┐ │ │
│               │ │0│ │RPN│ │STAFF│      │TABLE│ │0│ │ │
│               │ └~┘ └───┘ └─────┘      └─────┘ └~┘ │ │
│               └∊───────────────────────────────────┘ │
└∊─────────────────────────────────────────────────────┘

The following variant limits the selection further, to tables whose name begins with 'EMP':

      ⎕DISPLAY ⎕SQL 'TABLES' '' '' 'EMP%' 'TABLE'
┌→─────────────────────────────────────────────────────┐
│ ┌→──────┐ ┌⊖┐ ┌→───────────────────────────────────┐ │
│ │0 0 0 0│ │ │ ↓ ┌⊖┐ ┌→──┐ ┌→──────┐    ┌→────┐ ┌⊖┐ │ │
│ └~──────┘ └─┘ │ │0│ │RPN│ │EMP_ACT│    │TABLE│ │0│ │ │
│               │ └~┘ └───┘ └───────┘    └─────┘ └~┘ │ │
│               │ ┌⊖┐ ┌→──┐ ┌→────────┐  ┌→────┐ ┌⊖┐ │ │
│               │ │0│ │RPN│ │EMP_PHOTO│  │TABLE│ │0│ │ │
│               │ └~┘ └───┘ └─────────┘  └─────┘ └~┘ │ │
│               │ ┌⊖┐ ┌→──┐ ┌→─────────┐ ┌→────┐ ┌⊖┐ │ │
│               │ │0│ │RPN│ │EMP_RESUME│ │TABLE│ │0│ │ │
│               │ └~┘ └───┘ └──────────┘ └─────┘ └~┘ │ │
│               │ ┌⊖┐ ┌→──┐ ┌→───────┐   ┌→────┐ ┌⊖┐ │ │
│               │ │0│ │RPN│ │EMPLOYEE│   │TABLE│ │0│ │ │
│               │ └~┘ └───┘ └────────┘   └─────┘ └~┘ │ │
│               └∊───────────────────────────────────┘ │
└∊─────────────────────────────────────────────────────┘

Examining the catalog of columns

Syntax: [Tie] ⎕SQL 'Columns' [Catalog] [Schema] [Table] [Column]

This works in a similar way to the 'Tables' keyword. Again, you should normally use parameters to limit the size of the returned result. All the parameters are pattern-matching strings, following the same rules as for 'Tables'.

The result has 18 columns, as follows:

  1. The name of the Catalog
  2. The name of the Schema
  3. Name of the table or pseudo-table
  4. Name of the column
  5. SQL data type code of the column, as an integer
  6. Character string representing the name of the data type
  7. If the column type is a character or binary string, the maximum length in characters for the column. For date, time, timestamp data types, this is the total number of characters required to display the value when converted to character. For numeric data types, this is the total number of digits
  8. The maximum number of bytes needed to store data from this column as text.
  9. The scale of the column. An empty numeric vector is returned for data types where scale is not applicable.
  10. The radix expressing the precision (either 10, 2 or NULL). If the data type is an approximate numeric data type, this column contains the value 2, and the column size is expressed in bits. If the data type is an exact numeric data type, this column contains the value 10, and the column size is expressed in decimal digits. An empty vector is returned for data types where radix is not applicable.
  11. 0 if the column does not accept NULL values, 1 if the column accepts NULL values.
  12. Descriptive information about the column, if available.
  13. The column's default value. If the default value is a numeric literal, then this column contains the character representation of the numeric literal with no enclosing single quotes. If the default value is a character string, then this column is that string enclosed in single quotes. If the default value a pseudo-literal, such as for DATE, TIME, and TIMESTAMP columns, then this column contains the keyword of the pseudo-literal (e.g. CURRENT DATE) with no enclosing quotes. If NULL was specified as the default value, then this column returns the word NULL, not enclosed in quotes. If the default value cannot be represented without truncation, then this column contains TRUNCATED with no enclosing single quotes. If no default value was specified, then this field is an empty vector.
  14. SQL data type
  15. The subtype code for datetime data types
  16. The maximum length in bytes for a character data type column. For all other data types it is an empty vector.
  17. The ordinal position of the column in the table. The first column in the table is number 1.
  18. Contains the string 'NO' if the column is known to be not nullable; and 'YES' otherwise.

For example:

      ⎕DISPLAY 3⊃⎕SQL 'COLUMNS' '' '' 'CL%' 
┌→──────────────────────────────────────────────────────────────────────────────────┐
↓ ┌⊖┐ ┌→──┐ ┌→───────┐ ┌→───────┐    ┌→───┐     ┌⊖┐ ┌⊖┐   ┌⊖┐ ┌⊖┐   ┌⊖┐       ┌→──┐ │
│ │0│ │RPN│ │CL_SCHED│ │CLASS_CD│ 1  │CHAR│ 7 7 │0│ │0│ 1 │0│ │0│ 1 │0│ 7   1 │NO │ │
│ └~┘ └───┘ └────────┘ └────────┘    └────┘     └~┘ └~┘   └~┘ └~┘   └~┘       └───┘ │
│ ┌⊖┐ ┌→──┐ ┌→───────┐ ┌→──┐         ┌→──┐                ┌⊖┐ ┌⊖┐   ┌⊖┐ ┌⊖┐   ┌→──┐ │
│ │0│ │RPN│ │CL_SCHED│ │DAY│      5  │INT│  5 2 0   10  1 │0│ │0│ 5 │0│ │0│ 2 │YES│ │
│ └~┘ └───┘ └────────┘ └───┘         └───┘                └~┘ └~┘   └~┘ └~┘   └───┘ │
│ ┌⊖┐ ┌→──┐ ┌→───────┐ ┌→───────┐    ┌→───┐         ┌⊖┐   ┌⊖┐ ┌⊖┐       ┌⊖┐   ┌→──┐ │
│ │0│ │RPN│ │CL_SCHED│ │STARTING│ 92 │TIME│ 8 6 0   │0│ 1 │0│ │0│ 9 2   │0│ 3 │YES│ │
│ └~┘ └───┘ └────────┘ └────────┘    └────┘         └~┘   └~┘ └~┘       └~┘   └───┘ │
│ ┌⊖┐ ┌→──┐ ┌→───────┐ ┌→─────┐      ┌→───┐         ┌⊖┐   ┌⊖┐ ┌⊖┐       ┌⊖┐   ┌→──┐ │
│ │0│ │RPN│ │CL_SCHED│ │ENDING│   92 │TIME│ 8 6 0   │0│ 1 │0│ │0│ 9 2   │0│ 4 │YES│ │
│ └~┘ └───┘ └────────┘ └──────┘      └────┘         └~┘   └~┘ └~┘       └~┘   └───┘ │
└∊──────────────────────────────────────────────────────────────────────────────────┘

Examining details of the connection

Syntax: [Tie] ⎕SQL 'Describe'

This uses the same keyword 'Describe' which is used to provide details of a result set (see above). However, if no cursor name is supplied as a parameter, APLX returns details of the connection as a nested vector of 9 elements, as follows:

  1. Data source name
  2. Driver ODBC version
  3. DBMS name
  4. DBMS version
  5. Driver name
  6. Driver version
  7. Database name
  8. User name
  9. SQL conformance, expressed as a string "Minimum SQL Grammar", "Core SQL Grammar", or "Extended SQL Grammar".

The following are the minimum sets of features available at each conformance level (each level includes the features of the lower levels, and a specific driver may optionally implement further features):

Minimum SQL Grammar:

  • Data Definition Language (DDL): CREATE TABLE, DROP TABLE.
  • Data Manipulation Language (DML): simple SELECT, INSERT, UPDATE and DELETE
  • Simple Expressions: (such as A < B - C).
  • Simple character-only data types: CHAR, VARCHAR, LONG VARCHAR.

Core SQL Grammar:

  • DDL: ALTER TABLE, CREATE INDEX, DROP INDEX, CREATE VIEW, DROP VIEW, GRANT, and REVOKE.
  • DML: full SELECT.
  • Expressions: Subqueries, set functions such as SUM and MIN.
  • Numeric Data types: DECIMAL, NUMERIC, SMALLINT, INTEGER, REAL, FLOAT, DOUBLE

Extended SQL Grammar:

  • DML: Outer Joins, positioned UPDATE, positioned DELETE, SELECT FOR UPDATE, and Unions.
  • Expressions: Scalar functions such as SUBSTRING and ABS, date, time, and timestamp literals.
  • Data types: BIT, TINYINT, BIGINT, BINARY, VARBINARY, LONG VARBINARY, DATE, TIME, TIMESTAMP
  • Batch SQL statements.
  • Procedure calls.

For example, DB2 provides full SQL facilities:

      ⊃ 3⊃⎕SQL 'DESCRIBE' 
SAMPLE
03.00
DB2/NT
08.01.0003
DB2CLI.DLL
08.01.0003
SAMPLE
ALICE
Extended SQL Grammar

Retrieving the Version number of the APLX database driver

Syntax: [Tie] ⎕SQL 'Version' Interface

This call returns (as the third element of the explicit result) an integer scalar of the APLX database driver version; 100 corresponds to version 1.00. The Interface parameter is the same as that of the Connect keyword, for example 'aplxodbc'.

Retrieving the database connection and statement handles

Sometimes, you may need to access the underlying ODBC, CLI or other database driver directly, for example using ⎕NA to access a facility not supported by ⎕SQL. For this to work, you need the appropriate low-level handles, which you can retrieve using the 'Handle' keyword.

Syntax: [Tie] ⎕SQL 'Handle' [Name]

The information returned (as the third element of the explicit result) is system-specific; for ODBC, if you do not specify a statement name, you get a two-element result of the Environment and Database Connection handles. If you specify a statement, you get a scalar which is the statement handle:

      ⎕DISPLAY ⎕SQL 'Handle'
┌→──────────────────────────────────┐
│ ┌→──────┐ ┌⊖┐ ┌→────────────────┐ │
│ │0 0 0 0│ │ │ │52237280 52237480│ │
│ └~──────┘ └─┘ └~────────────────┘ │
└∊──────────────────────────────────┘
      ⎕DISPLAY ⎕SQL 'Handle' 'cs1'
┌→───────────────────────┐
│ ┌→──────┐ ┌⊖┐          │
│ │0 0 0 0│ │ │ 52238968 │
│ └~──────┘ └─┘          │
└∊───────────────────────┘

Data Conversions

Because the underlying database will usually support a set of data type for columns which are more extensive than those available in APL, APLX automatically converts data types as follows:

SQL Data Type APLX Data Type
Integer, Small Integer, Tiny Integer, Bit Integer
64-bit Integer Float (under 32-bit versions of APLX), or Integer (under APLX64)
Numeric, Decimal, Float, Real, Double Float
Date, Time, DateTime, Timestamp, Interval Character vector
Char, VarChar, LongVarChar, CLOB, GUID Character vector
GUID, DataLink Character vector
Graphic (i.e. non-ASCII character), VarGraphic, LongVarGraphic Character vector
UnicodeChar, UnicodeVarChar, UnicodeLongVarChar Character vector
Binary, VarBinary, LongVarBinary, Blob, OLE object, User-defined Type Raw (untranslated) Character vector

See also ⎕MC, which contains the character used to replace Unicode characters which cannot be represented in APLX.


NULL handling

An important concept in SQL is that of NULL data items, i.e. items which have no value. A NULL is handled in a very special (and often counter-intuitive) way in SQL statements; for example, in a SELECT statement, a NULL does not match anything, including another NULL. For a character field, a NULL is not the same as a zero-length string.

⎕SQL follows a very simple convention to represent NULL data items, both in parameter substitution (using 'Execute') and in returning result sets. A NULL is represented by an empty numeric vector. Note that this can easily be distinguished from a zero-length string, which will be represented by an empty character vector.


Summary of ⎕SQL Syntax

Connect to database:
 (RC ERRMSG CNS) ← [Tie] ⎕SQL 'Connect' Interface ConnectionString
 (RC ERRMSG EMPTY) ← [Tie] ⎕SQL 'Connect' Interface DSN [User] [Password]

Execute SQL statement immediately:
 (RC ERRMSG RESULT) ← [Tie] ⎕SQL 'Do' Statement

Prepare SQL statement for later execution:
 (RC ERRMSG EMPTY) ← [Tie] ⎕SQL 'Prepare' Name Statement

Execute prepared statement:
 (RC ERRMSG CURSORNAME) ← [Tie] ⎕SQL 'Execute' Name [Param1] [Param2]...

Fetch rows from statement result set:
 (RC ERRMSG ROWS) ← [Tie] ⎕SQL 'Fetch' Name [Count]

Describe result set:
 (RC ERRMSG DESCRIPTION) ← [Tie] ⎕SQL 'Describe' Name

Close statement
 (RC ERRMSG EMPTY) ← [Tie] ⎕SQL 'Close'

Set/Query AutoCommit:
 (RC ERRMSG EMPTY) ← [Tie] ⎕SQL 'Autocommit' Flag
 (RC ERRMSG FLAG) ← [Tie] ⎕SQL 'Autocommit'

Set/Query Isolation Level:
 (RC ERRMSG EMPTY) ← [Tie] ⎕SQL 'Isolation' Code
 (RC ERRMSG CODE) ← [Tie] ⎕SQL 'Isolation'

Commit/Rollback transaction:
 (RC ERRMSG EMPTY) ← [Tie] ⎕SQL 'Commit'
 (RC ERRMSG EMPTY) ← [Tie] ⎕SQL 'Rollback'

Retrieving the catalog of tables and columns:
 (RC ERRMSG TABLES) ← [Tie] ⎕SQL 'Tables' [Catalog] [Schema] [Table] [Types]
 (RC ERRMSG COLUMNS) ← [Tie] ⎕SQL 'Columns' [Catalog] [Schema] [Table] [Column]

Examining details of the connection:
 (RC ERRMSG DETAILS) ← [Tie] ⎕SQL 'Describe'

Retrieving the Database Connection and Statement handles
 (RC ERRMSG CONNHANDLE) ← [Tie] ⎕SQL 'Handle'
 (RC ERRMSG STMTHANDLE) ← [Tie] ⎕SQL 'Handle' Name

Return list of tie numbers in use:
 TIES ← ⎕SQL 'Connections'

Disconnect from database:
 (RC ERRMSG EMPTY) ← [Tie] ⎕SQL 'Disconnect'

Query available Data Source Names
 (RC ERRMSG DSNS) ← [Tie] ⎕SQL 'Datasources' Interface

Query Version Number of the APLX database interface:
 (RC ERRMSG VERSION) ← ⎕SQL 'Version' Interface


Topic: APLX Help : Help on APL language : System Functions & Variables : ⎕SQL Interface to External Database
[ Previous | Next | Contents | Index | APL Home ]