Sequences: Comments

Sequences Commentary

General: Sequences in Glee manage heterogeneous data (e.g. character strings in the same vector with numeric vectors). Glee does this by maintaining pointers to the actual Glee objects, so Glee sequences are vectors of pointers and are themselves homogeneous. This capability gives Glee much of its power and flexibility. Any Glee object may be a member (referent) of a Glee sequence. Glee uses an internal reference counting mechanism, which allows it to maintain references to any number of Glee objects with a storage requirement of only 4 bytes (the size of a 32-bit C++ pointer) per reference. No copy of the referent is made. If the referent changes, all references to it implicitly change with it. The referent will not be discarded as long as there are outstanding references to it. In Glee, Sequences are themselves Glee objects. Therefore, Glee Sequences may contain Sequences (which may contain Sequences, etc.). Glee has a current depth limit of 5. This is like a set of Russian nesting dolls limited to 5 shells. The limit is entirely arbitrary. It is deemed adequate for real work and is included as protection against sequences which directly or indirectly reference themselves.
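
The reference behaviour described above can be imitated in Python, whose lists also hold references rather than copies. This is a sketch by analogy only, not Glee code: the names `shared`, `seq_a`, `seq_b`, `depth`, and `MAX_DEPTH` are invented for illustration, `array` stands in for a Glee numeric vector, and the depth guard merely mimics the limit of 5 mentioned above.

```python
from array import array

# Analogy only: Python lists, like Glee sequences, store references to objects.
MAX_DEPTH = 5  # mirrors the arbitrary Glee limit described above (assumed behaviour)

shared = array("d", [1.0, 2.0, 3.0])  # the referent, standing in for a numeric vector
seq_a = ["label", shared]              # heterogeneous: a string and a numeric vector
seq_b = [shared, shared]               # two more references; no copy is made

shared.append(4.0)  # change the referent once; every reference sees the change


def depth(obj, level=0):
    """Return 0 for simple objects and 1 or more for (nested) sequences.

    Raises ValueError past MAX_DEPTH, which also stops runaway recursion
    on a sequence that directly or indirectly contains itself.
    """
    if not isinstance(obj, list):
        return level
    if level >= MAX_DEPTH:
        raise ValueError("sequence nested deeper than MAX_DEPTH")
    return max((depth(item, level + 1) for item in obj), default=level + 1)


print(list(seq_a[1]))             # the appended 4.0 is visible through seq_a
print(depth(shared), depth(seq_a))
```

Python happens to use reference counting as well, so the referent here likewise stays alive for as long as any list still points at it.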

Simple Output: This is an example of one way Glee can present results. Note: The parentheses are not necessary. You will get the same result without them.

Verbose Output: What we really see in the previous example is a sequence of two elements. The first element is a character string containing 'answer is '. The second element is a numeric vector (with one element) containing 123.45. In quiet mode, as in the previous example, Glee displays a sequence by catenating the quiet displays of its referenced elements. This example is the same as the previous one except that I display in verbose mode. When I turn on verbosity (the $V), the individual elements become visible: the first is a character string and the second is a numeric vector.
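
As a rough illustration of the difference between the two display modes, here is a Python sketch. The function names `quiet` and `verbose` are hypothetical, a plain float stands in for the one-element numeric vector, and the `[n]:` labelling only loosely imitates the style of Glee's verbose reports; Glee's actual output format may differ.

```python
def quiet(seq):
    """Catenate the quiet displays of the elements into a single string."""
    return "".join(str(item) for item in seq)


def verbose(seq):
    """Show each element separately, tagged with its 1-based index."""
    return [f"[{i}]:{item!r}" for i, item in enumerate(seq, start=1)]


result = ["answer is ", 123.45]  # a string and a number in one sequence

print(quiet(result))             # answer is 123.45
for line in verbose(result):
    print(line)
```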

Sequence Indexing: You can index Sequences just like you can other Glee vector objects.

Sequence Results: Many Glee operators, out of necessity, return sequence results. One such operator is the (``) Indices of operator. In this example, I generate a vector of ten 5's (5*->10). For each element of that vector, I create a random number between 1 and 5 using the (v ? 2) Random operator (the 2 right argument is the random seed, assuring that you see what I see when you try this). I assign this result to x and display it (=> x $;). x is now a vector of 10 random numbers between 1 and 5. I turn on verbosity ($V;). I perform the Indices of operation on the contents of x for values of 1 through 5 (5`) with (x `` (5`)). Our result is a 5 element sequence vector (one element for each element in the right argument). Because the result is heterogeneous (numeric vectors of different lengths), a sequence is the only way to capture it. It reports [1]:1 2 (1 is at indices 1 and 2); [2]:NULL (the number 2 doesn't occur); [3]:6 8 10; [4]:4 5 7 9; and [5]:3. A real life application of this facility might be in analyzing accounts in transaction data.
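
For readers who do not have the Glee example at hand, the Indices of operation can be sketched in Python. The function name `indices_of` is invented here, and the vector `x` is reconstructed from the reported result rather than generated with Glee's seeded Random operator.

```python
# Reconstructed from the reported result: 1 at indices 1 and 2, 5 at index 3, etc.
x = [1, 1, 5, 4, 4, 3, 4, 3, 4, 3]


def indices_of(vector, targets):
    """For each target value, list the 1-based positions where it occurs."""
    return [
        [i for i, value in enumerate(vector, start=1) if value == target]
        for target in targets
    ]


result = indices_of(x, range(1, 6))
for n, positions in enumerate(result, start=1):
    print(n, positions)
# result == [[1, 2], [], [6, 8, 10], [4, 5, 7, 9], [3]]
```

The inner lists have different lengths, which is why a flat numeric vector cannot hold the answer and a sequence is needed.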

Count with (#ea '#'): This is another taste of the power of Glee. Taking the preceding example and applying Glee's count operator (#) to each (#ea) element of the sequence, I get a sequence of counts: [1]:2 (2 elements); [2]:0 (no elements); [3]:3; [4]:4; and [5]:1. In other languages, this might be valuable information for setting up storage areas for rendering results. In Glee that usually isn't an issue because Glee is so dynamic.

Another way to do each: (@&): Since creating the example above, I have created more powerful ways of dealing with the elements of sequences. The "@&" operator (read "at each") takes the operation on the right and applies it to each element of the sequence on the left. For a simple operator, as illustrated here with "#" (count), just the operator may be given. Notice that a sequence is returned. You have to disclose it with "<" to obtain a simple numeric vector. For more complicated requirements, a whole set of operations can be given within parentheses. This is shown in a later example. Finally, if there is a block within the parentheses, anonymous blocks of Glee code and namespace protection can be applied to the elements of the sequence. A simple example of this is also given later below.

Scan counting: (/#): Counting the elements of sequence elements is such a common need that I created this count scan operator. As indicated, it is "count" applied "at-each element" of a sequence and the results returned element by element as with a normal scanning of elements. In the examples before, we obtained the same result but as a sequence of counts which we needed to disclose. Here, we obtain the desired numeric vector directly. There are many ways to do the same thing in Glee. This scan counting operator does the operation orders of magnitude faster than the other examples and is the recommended way of finding the count of elements in sequence elements. If the elements are simple objects like numeric vectors or strings, the count is the number of those elements. If the element is a sequence itself, the count is for the elements in that sequence.
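
The two result shapes discussed above (a sequence of counts versus a flat vector of counts) can be contrasted in a short Python sketch. The names `count_each` and `scan_count` are hypothetical, and the sketch says nothing about the relative speed of the Glee operators.

```python
# Result of the indices-of example, as reported in the commentary above.
groups = [[1, 2], [], [6, 8, 10], [4, 5, 7, 9], [3]]


def count_each(seq):
    """Count applied to each element, returned as a sequence of singletons."""
    return [[len(element)] for element in seq]


def scan_count(seq):
    """Count applied to each element, returned directly as a flat vector."""
    return [len(element) for element in seq]


print(count_each(groups))  # [[2], [0], [3], [4], [1]]
print(scan_count(groups))  # [2, 0, 3, 4, 1]

# len() counts characters in a string and members of a nested sequence alike.
print(scan_count(["abc", [1, 2], [[1, 2], [3]]]))  # [3, 2, 2]
```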

Disclose ( < ) and depth ( ** ): As a rule, Glee doesn't scrutinize its results. But in this example we see how the programmer, knowing what to expect, can have Glee return results more useful to him. From the example before, we see that when we count the elements in each sequence element, our result is homogeneous. Each element is a numeric singleton. Glee has the disclose ( < ) operator. It is the Less Than symbol. Used monadically, to Glee it means disclose. How was the symbol chosen? Notice that it opens to the right (to Glee, it opens up or discloses its left argument). There is another symbol, as we'll see in the next example, that encloses (guess what symbol I chose for that). If you ask Glee to disclose a sequence, it scans to see if it is homogeneous (all the same kind of data). If it is homogeneous, Glee will create a single vector of that type. Otherwise it just returns the sequence left argument. The programmer can test for this by checking the depth of the object with Glee's depth operator ( ** ). This operator returns 0 for simple objects (like numeric vectors) and 1 for sequence objects (or more than 1 if the sequence contains a sequence). In this example, every element in the sequence is numeric so Glee is able to create a numeric vector using the disclose operator. If it couldn't, it would just return the original sequence. In my example, I take these counts stored in z and add them up ( \+ ). Sure enough, I've accounted for all 10.
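
The disclose rule described above (merge when homogeneous, otherwise hand back the sequence untouched) might be sketched in Python as follows. The function name `disclose` and the representation are assumptions made for illustration: tuples stand in for numeric vectors, `str` for character strings, and lists for sequences. How Glee merges a sequence of strings is not stated above, so the string branch is a guess.

```python
def disclose(seq):
    """Merge a homogeneous sequence into one simple vector, else return it as is."""
    if seq and all(isinstance(e, tuple) for e in seq):
        return tuple(n for e in seq for n in e)  # one numeric vector
    if seq and all(isinstance(e, str) for e in seq):
        return "".join(seq)                      # one character string (assumed)
    return seq                                   # heterogeneous: unchanged


counts = [(2,), (0,), (3,), (4,), (1,)]  # the sequence of singleton counts
z = disclose(counts)
print(z, sum(z))                          # (2, 0, 3, 4, 1) 10

mixed = ["total", (10,)]                  # a string and a numeric vector
print(disclose(mixed) is mixed)           # True: nothing to merge
```

Checking whether the result is still a list plays the role of the depth test here: a tuple or string corresponds to depth 0, a list to depth 1 or more.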

Enclose ( > ) Implicit and Explicit: This is another compound example. First I illustrate that for character strings, without explicit catenation, Glee does an implicit enclose of each string, returning a sequence. Glee hates inconsistency, but as I illustrate, it doesn't do this implicit enclose with numeric vectors. The reason is a pragmatic one. In Glee, when we're working with explicit numeric vectors, we're usually groping around in our data to see what we have. Implicit enclose would get in our way. On the other hand, when we're dealing with character strings (e.g. lines of a computer program) we usually want to deal with them as individual entities, so implicit enclosure is what we want. I retain the right to change my mind on this as I develop real live programs solving real live problems. You will soon see there are times we need to be able to form sequences explicitly. That is the purpose of the enclose operator ( > ). Notice how it visually closes down on, or encloses, its argument to the left.

Multiple Enclosure: Here I illustrate a reasonably practical example of multiple enclosure. Some APL dialects called this nested arrays. Notice how Glee pays no attention to line feeds in instructions. Instructions are delimited by semicolons. Otherwise, code entry is entirely free form. This example illustrates how to form multiple enclosures. First, I have two character strings forming an implicit sequence of 2 elements. I parenthesize to group these elements. Then I have a character string and numeric vector forming another implicit sequence of 2 elements. I parenthesize that into a group. Now I have two groups, each containing two elements. If the groups were standing alone, there would be no enclosure ... no added depth. However, when two groups are stranded together as we have here, the groups are implicitly enclosed and catenated. They become elements of a newly formed sequence. This example also shows how Glee displays multiple enclosures or nested sequences. The asterisks ( * ) in the display indicate the level of depth. I admit this display isn't as pretty as it could be. Cosmetics are not on my critical path.

Grade and Sort: Sequences can be graded and sorted just like string and numeric objects. When comparisons are made between objects of different types, like-types are grouped together and then sorted. Strings are sorted ignoring case, white space, and punctuation.
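
A Python sketch of this comparison rule follows. The names `sort_key` and `grade` are invented, plain numbers stand in for numeric objects, and the order in which the type groups appear (numbers before strings) is an assumption, since the commentary does not say which group comes first. Grade is taken to mean the 1-based indices that would sort the sequence.

```python
import string

IGNORED = set(string.whitespace + string.punctuation)


def sort_key(item):
    """Group by type first, then order within the group."""
    if isinstance(item, str):
        # Ignore case, white space, and punctuation when comparing strings.
        stripped = "".join(ch for ch in item.lower() if ch not in IGNORED)
        return (1, stripped)
    return (0, item)  # numbers first: the group order is an assumption


def grade(seq):
    """Return the 1-based indices that would put the sequence in sorted order."""
    return sorted(range(1, len(seq) + 1), key=lambda i: sort_key(seq[i - 1]))


data = ["b-eta", 3, "Alpha", 1, " gamma"]
print(sorted(data, key=sort_key))  # [1, 3, 'Alpha', 'b-eta', ' gamma']
print(grade(data))                 # [4, 2, 3, 1, 5]
```

Putting the type tag first in the key means a number is never compared directly with a string, which is what keeps like-types grouped together.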

Ravel ( , ) and Expose With Separator ( ,, ): Sequences can be created by raveling any object. If the object is a sequence of depth other than 1, it is flattened to a sequence of depth one. Raveling a string, number, or other object turns it into a one element sequence. Separator elements can be inserted during the raveling. This is called "expose with separator". Any object can be used as a raveling separator. However, blank is so common that the monadic version of ( ,, ) ravels in blanks at the deepest point and newline characters at lesser depths. Dyadic use of this operator ravels in other objects given as the right argument. Exposing with separators may be useful when sequence building techniques are used to quickly achieve readable displays of complicated data objects. Note: Whereas the ravel (,) operator yields a result of depth 1, the expose with separator (,,) does not change depth. If depth is zero, expose with separator has no effect.
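
The ravel half of this description can be sketched in Python as shown here. The function name `ravel` is hypothetical, lists stand in for sequences, and anything that is not a list is treated as a simple object. Expose with separator is deliberately left out, because its exact placement rules are not spelled out above.

```python
def ravel(obj):
    """Flatten any nesting to a depth-1 sequence; wrap a simple object in one."""
    if not isinstance(obj, list):
        return [obj]  # a string, number, etc. becomes a one element sequence
    flat = []
    for item in obj:
        flat.extend(ravel(item))  # nested sequences are spliced in place
    return flat


nested = [["first", "second"], ["total", 42]]  # a depth-2 sequence
print(ravel(nested))   # ['first', 'second', 'total', 42]
print(ravel("alone"))  # ['alone']
```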

Disclose and display: Using a combination of ravel, disclose, and the %* operator you can manipulate sequence data into displayable form.

Strands: Strands are formed by delivering variable names to the interpreter with no intervening operator. The interpreter makes a sequence with each element being the contents of the variable. This is a common method of forming argument lists for blocks and named blocks (programs).

At Each ( @& ) application of monadic: The at (@) each (&) operator facilitates the processing of elements in a sequence. Used niladically, it marks an object for processing element by element. For simple objects, it converts them into sequences. For sequence objects, it merely marks them for subsequent "each" processing. This example illustrates at each followed by a monadic operator. The @& operator serves two purposes. First, it takes simple objects like numeric or character vectors and returns a sequence. Each element of the resulting sequence is an element of the source object. This is illustrated in the first example pair. The most common use should be for preparing numeric data for entry into fields of record sets. Secondly, as shown in the second example pair, @& marks sequences such that the subsequent operator (in this case monadic "-") operates on the sequence elements individually rather than on the sequence as a whole. Note that I need to parenthesize the expression to force this contrived example.

At Each (@&) application of dyadic: Illustrated here is the behavior of @& when used in a dyadic sense. Each element of the sequence is delivered as the left argument to the operator. The dyadic operation is then performed and the element result becomes an element in the statement result.
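
The monadic and dyadic uses of at each described above can be approximated in Python. All names here (`to_sequence`, `at_each_monadic`, `at_each_dyadic`, `negate`, `add`) are invented for the sketch; tuples stand in for numeric vectors and lists for sequences.

```python
def to_sequence(simple):
    """Turn a simple vector into a sequence with one element per source element."""
    return list(simple)


def at_each_monadic(seq, op):
    """Apply a one-argument operation to each element individually."""
    return [op(element) for element in seq]


def at_each_dyadic(seq, op, right):
    """Deliver each element as the left argument of a two-argument operation."""
    return [op(element, right) for element in seq]


def negate(vector):
    return tuple(-n for n in vector)


def add(left, right):
    return tuple(n + right for n in left)


seq = [(1, 2), (3,)]
print(to_sequence((7, 8, 9)))        # [7, 8, 9]
print(at_each_monadic(seq, negate))  # [(-1, -2), (-3,)]
print(at_each_dyadic(seq, add, 10))  # [(11, 12), (13,)]
```

In both cases the result has one element per element of the input sequence, matching the statement that each element result becomes an element of the statement result.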

At Each (@& ) and anonymous blocks: This example is complicated. First I create a reference to an anonymous numeric object. I call this reference "p". I then use "p" to accumulate the elements of a numeric vector in an implicit "at each" loop. With each iteration, "at each" delivers an element. In the first example, I capture the element explicitly in "i". I strand "i" with the reference "p" and create a namespace (without changing the names). This namespace becomes the argument to the anonymous block where I do the processing. The self explanatory processing just accumulates the "i"'s in "p". In the second example, I do the same thing but I don't explicitly name the element delivered by "at each". It's just there, and it looks like I create the namespace of two things out of just one thing. Having seen the first example, you know such is not the case. You know about the phantom first element in the strand. Be aware that since we are dealing with a block here, any objects created in the block disappear after this operation. This is because they are in the block's namespace. The block itself is discarded after the operation and with it, its namespace. It's for this reason that I pass the accumulator "p" as a reference in a namespace.
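
The reason for passing the accumulator by reference can be seen in a Python analogy. This is a sketch under several assumptions: a one-slot list plays the part of the reference "p", a dictionary plays the part of the namespace, the function `block` stands in for the anonymous block, and "accumulate" is read as summing. None of these names or choices come from Glee itself.

```python
p = [0]  # a one-slot list acting as a reference to a numeric object


def block(namespace):
    """Stand-in for the anonymous block; its local names vanish on return."""
    i = namespace["i"]     # the element delivered by "at each"
    ref = namespace["p"]   # the shared accumulator reference
    scratch = i * 2        # a local object: discarded with the block
    ref[0] += i            # the update survives because p is shared


for element in (10, 20, 30):           # the implicit "at each" loop
    block({"i": element, "p": p})      # strand i with p into a namespace

print(p[0])                            # 60
```

Had the block been given a plain number instead of a reference, the running total would have been lost each time the block's namespace was discarded.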