Chapter 18: Sets and Classes

In this chapter we look at more of the built-in functions of J. The connecting theme is, somewhat loosely, working with sets and classes.

Suppose that, for some list, for the purpose at hand, the order of the items is irrelevant and the presence of duplicate items is irrelevant. Then we can regard the list as (representing) a finite set. In the abstract, the set 3 1 2 1 is considered to be the same set as 1 2 3. The word "class" we will use in the sense in which, for example, each integer in a list belongs either to the odd class or to the even class.

18.1 Sets

18.1.1 Membership

There is a built-in verb e. (lowercase e dot, called "Member"). The expresssion x e. y tests whether x matches any item of y, that is, whether x is a member of the list y. For example:

y=: 'abcde' 'a' e. y 'w' e. y 'ef' e. y
abcde
1
0
1 0

Evidently the order of items in y is irrelevant and so is the presence of duplicates in y.

z=: 'edcbad' 'a' e. y 'w' e. y 'ef' e. y
edcbad
1
0
1 0

We can test whether a table contains a particular row:

t =: 4 2 $ 'abcdef' 'cd' e. t
ab 
cd 
ef 
ab
1

18.1.2 Less

There is a built-in verb -. (minus dot, called "Less"). The expression x -. y produces a list of the items of x except those which are members of y.

x =: 'consonant' y =: 'aeiou' x -. y
consonant
aeiou
cnsnnt

Evidently the order of items in y is irrelevant and so is the presence of duplicates in y.

18.1.3 Nub

There is a built-in verb ~. (tilde dot, called "Nub"). The expression ~. y produces a list of the items of y without duplicates.

nub =: ~. y =: 'hook' nub y
~.
hook
hok

We can apply nub to the rows of a table:

t nub t
ab 
cd 
ef 
ab
ab 
cd 
ef

18.1.4 Nub Sieve

The verb "nub sieve" (~:) gives a boolean vector which is true only at the nub.

y b =: ~: y b # y nub y
hook
1 1 0 1
hok
hok

18.1.5 Functions for Sets

The customary functions on sets, such as set-union, set-intersection or set-equality, are easily defined using the built-in functions available. For example two sets are equal if all members of one are members of the other, and vice versa.

   seteq =: *./ @: (e. , e.~)

1 2 3 seteq 3 1 2 1 1 2 3 seteq 1 2
1
0

18.2 The Table Adverb

Recall that the adverb / generates a verb; for example +/ is a verb which sums lists. More precisely, it is the monadic case of +/ which sums lists. The dyadic case of +/ generates a table:

x =: 0 1 2 y =: 3 4 5 6 z =: x +/ y
0 1 2
3 4 5 6
3 4 5 6 
4 5 6 7 
5 6 7 8

The general scheme is that if we have

         z =:  x f/ y

then z is a table such that the value at row i column j is given by applying f dyadically to the pair of arguments i{x and j{y. That is, z contains all possible pairings of an item of x with an item of y. Here is another example:

x =: 'abc' y =: 'face' x =/ y
abc
face
0 1 0 0 
0 0 0 0 
0 0 1 0

The result shows, in the first row, the value of 'a' = 'face', in the second row the value of 'b' ='face' and so on.

18.3 Classes

18.3.1 Self-Classify

Consider the problem of finding the counts of letters occurring in a string (the frequency-distribution of letters). Here is one approach.

We form a table testing each letter for equality with the nub.

y =: 'hook' nub y (nub y) =/ y
hook
hok
1 0 0 0 
0 1 1 0 
0 0 0 1

The expression ((nub y) = / y) can be abbreviated as (= y). The monadic case of the built-in verb = is called "Self-classify").

y nub y (nub y) =/ y = y
hook
hok
1 0 0 0 
0 1 1 0 
0 0 0 1
1 0 0 0 
0 1 1 0 
0 0 0 1

If we sum each row of = y we obtain the counts, in the order of the letters in the nub.

y = y +/ " 1 =y
hook
1 0 0 0 
0 1 1 0 
0 0 0 1
1 2 1

The counts can be paired with the letters of the nub:

y nub y (nub y) ;" 0 (+/ " 1 =y)
hook
hok
+-+-+ 
|h|1| 
+-+-+ 
|o|2| 
+-+-+ 
|k|1| 
+-+-+

18.3.2 Classification Schemes

Gardeners classify soil-types as acid, neutral or alkaline, depending on the pH value. Suppose that a pH less than 6 is classed as acid, 6 to 7 is neutral, and more than 7 as alkaline. Here now is a verb to classify a pH value, returning A for acid, N for neutral and L for alkaline (or limy).

   classify =: ({ & 'ANL')  @: ((>: & 6) + (> & 7))

classify 6 classify 4.8 5.1 6 7 7.1 8
N
AANNLL

The classify function we can regard as defining a classification scheme. The letters ANL, which are in effect names of classes, are called the keys of the scheme.

18.3.3 The Key Adverb

Given some data (a list, say), we can classify each item to produce a list of corresponding keys.

data =: 7 5 6 4 8 k =: classify data
7 5 6 4 8
NANAL

We can select and group together all the data in, say, class A (all the data with key A):

data k k = 'A' (k = 'A') # data
7 5 6 4 8
NANAL
0 1 0 1 0
5 4

Now suppose we wish to count the items in each class. That is, we aim to apply the monadic verb # separately to each group of items all of the same key. To do this we can use the built-in adverb /. (slash dot, called "Key").

data k =: classify data k # /. data
7 5 6 4 8
NANAL
2 2 1

For another example, instead of counting the members we could exhibit the members, by applying the box verb <.

data k =: classify data k < /. data
7 5 6 4 8
NANAL
+---+---+-+ 
|7 6|5 4|8| 
+---+---+-+

The verb we apply can discover for itself the class of each separate argument, by classifying the first member: Here the verb u produces a boxed list: the key and count:

   u =: (classify @: {.) ; #
   

data k =: classify data k u /. data
7 5 6 4 8
NANAL
+-+-+ 
|N|2| 
+-+-+ 
|A|2| 
+-+-+ 
|L|1| 
+-+-+

The general scheme for the "Key" adverb is as follows. In the expression x u /. y, we take y to be a list, and x is a list of keys of corresponding items of y according to some classification scheme, and u is the verb to be applied separately to each class. The scheme is:

          x u /. y    means   (= x) (u @ #) y

To illustrate:

   y =: 4 5 6 7 8 
   x =: classify y
   u =: <

y x = x (= x) (u @ #) y x u /. y
4 5 6 7 8
AANNL
1 1 0 0 0 
0 0 1 1 0 
0 0 0 0 1
+---+---+-+ 
|4 5|6 7|8| 
+---+---+-+
+---+---+-+ 
|4 5|6 7|8| 
+---+---+-+

We see that each row of =x selects items from y, and u is applied to this selection.

18.3.4 Letter-Counts Revisited

Recall the example of finding the counts of letters in a string.

y =: 'LETTUCE' = y (nub y) ; " 0 +/ "1 (= y)
LETTUCE
1 0 0 0 0 0 0 
0 1 0 0 0 0 1 
0 0 1 1 0 0 0 
0 0 0 0 1 0 0 
0 0 0 0 0 1 0
+-+-+ 
|L|1| 
+-+-+ 
|E|2| 
+-+-+ 
|T|2| 
+-+-+ 
|U|1| 
+-+-+ 
|C|1| 
+-+-+

Here is a variation. We note that we have in effect a classification scheme where we have as many different classes as different letters: each letter is (the key of) its own class. Thus we can write an expression of the form y u /. y.

The applied verb u will see, each time, a list of letters, all the same. It counts them, with #, and takes the first, with {., to be a label for the class.

   u =: {. ; #

y = y y u /. y
LETTUCE
1 0 0 0 0 0 0 
0 1 0 0 0 0 1 
0 0 1 1 0 0 0 
0 0 0 0 1 0 0 
0 0 0 0 0 1 0
+-+-+ 
|L|1| 
+-+-+ 
|E|2| 
+-+-+ 
|T|2| 
+-+-+ 
|U|1| 
+-+-+ 
|C|1| 
+-+-+

This is the end of Chapter 18


NEXT
Table of Contents


Copyright © Roger Stokes 2000. This material may be freely reproduced, provided that this copyright notice and provision is also reproduced.

last updated 17 Mar 00