New Release 77

General: Work focused on bringing GREP (Global Regular Expression Print) to GLEE. This should please new GLEE users coming from PERL (Practical Extraction and Report Language, or Pathologically Eclectic Rubbish Lister?). The Borland C++ Builder runtime includes PCRE (the PERL-Compatible Regular Expressions project). Thus, I have added GREP capability with virtually no size cost to GLEE. I suspect there will be considerable refactoring. However, the examples here at least illustrate how you could use GLEE to learn GREP. I had to decide which combination of the 4 mode switches would constitute GLEE's default. Having virtually no knowledge of GREP before this exercise, my choices may need adjustment later. The defaults I chose turn all four on: (m):MULTILINE, (i):CASELESS, (s):DOTALL and (x):EXTENDED. The GREP expression can use (?-misx) to turn all of these off, or something like (?mi-sx) to set Multiline and Caseless and clear Dotall and Extended.
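
For readers who want to experiment with the four switches outside GLEE, here is a minimal sketch in Python. It assumes Python's `re` flags behave like the PCRE switches named above, which is only approximately true: Python can switch a flag off only inside a scoped group such as `(?-i:...)`, not for the whole pattern. The sample text is made up.

```python
import re

# Assumed Python counterparts of the four PCRE switches named above:
# (m) MULTILINE, (i) CASELESS, (s) DOTALL, (x) EXTENDED.
ALL_FOUR = re.MULTILINE | re.IGNORECASE | re.DOTALL | re.VERBOSE

text = "The Ram\nram and RAM"

# All four on: case is ignored, and EXTENDED mode ignores the
# spaces written inside the pattern.
all_on = [m.group() for m in re.finditer(r"r a m", text, ALL_FOUR)]

# A scoped (?-i:...) group switches CASELESS off for that part only.
case_only = [m.group() for m in re.finditer(r"(?-i:ram)", text, ALL_FOUR)]

# MULTILINE lets ^ match at the start of every line, not just the text.
line_firsts = [m.group() for m in re.finditer(r"^ \w+", text, ALL_FOUR)]

print(all_on, case_only, line_firsts)
```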





The #grep type: The GREP facility is wrapped by GLEE as a new type, the GREP type. As a type, it has methods and properties which can be explored with the : query operator. Right now it only displays the Regular Expression (:RE). In this example I give the GREP Regular Expression pattern when I create the object "p". It's the left argument string. The next example shows changing the RE. Either way, whenever the RE is set, GLEE compiles the pattern and saves the compiled result. The GREP object can then be used for matching against any string. In this case I'm using the `` (indices of) operator. With a GREP object right argument and string left argument, the pattern is run against the text. Where the pattern matches, a two-element numeric is created. The first element is the index in the string of the beginning of the match. The second element is the end of the match plus 1. These can be used for displaying the text in the vicinity of the match, as illustrated.
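
The same idea (compile once, then collect a begin index and an end-plus-1 index per match) can be sketched in Python. This is not GLEE code. The helper names `match_spans` and `context` and the sample sentence are hypothetical, and Python indices start at 0, which may differ from GLEE's.

```python
import re

def match_spans(pattern, text):
    """Return (begin, end + 1) index pairs, one per match."""
    return [m.span() for m in pattern.finditer(text)]

def context(text, span, width=5):
    """Return the matched text plus `width` characters on each side."""
    begin, end = span
    return text[max(0, begin - width):end + width]

# Compile once; the compiled object can be reused on any string.
p = re.compile("ram")
text = "A wolf met a ram by the river."

spans = match_spans(p, text)
for span in spans:
    print(span, repr(context(text, span)))
```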




Comparing Glee with GREP: In this example I show finding the word "ram" in all of Aesop's fables. I first read the file into "t". I then use Glee's word search ``& to get the word indices and display the word in context. I do the same thing using a GREP pattern. For some reason, GREP does not find the last occurrence.
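
To make the comparison concrete, here is a rough Python sketch of the two approaches: a hand-written word search and a regular expression with word boundaries. Neither reproduces Glee's ``& operator. The function names, the punctuation set and the sample text are assumptions chosen for illustration.

```python
import re

PUNCT = ".,;:!?\"'"

def word_starts(text, word):
    """Start index of each whitespace-separated word equal to `word`,
    ignoring case and surrounding punctuation."""
    starts, pos = [], 0
    for token in text.split():
        pos = text.index(token, pos)             # where this token begins
        lead = len(token) - len(token.lstrip(PUNCT))
        if token.strip(PUNCT).lower() == word.lower():
            starts.append(pos + lead)
        pos += len(token)
    return starts

def regex_starts(text, word):
    """The same search expressed as a regular expression."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [m.start() for m in pattern.finditer(text)]

sample = "The Ram saw a frame. A ram, a RAM!"
print(word_starts(sample, "ram"), regex_starts(sample, "ram"))
```

Without the `\b` word boundaries the pattern would also match inside "frame", so the two result lists would differ.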




Relative Timing: This example runs each technique 1,000 times. It reveals that GREP is much faster than Glee in this example. However, finding all occurrences of a word in a 66K file in 7 ms is acceptable for most applications. And where it is not, there is GREP for those who understand how to use it.
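
A timing comparison of this kind can be sketched with Python's `timeit`. The numbers it prints describe Python on a small made-up text, not Glee or PCRE on the fables file, so only the method carries over. The two search functions are simplified stand-ins.

```python
import re
import timeit

# Made-up test text; each sentence contains the word "ram" once.
text = "The shepherd led the ram past the frame of the gate. " * 50
pattern = re.compile(r"\bram\b", re.IGNORECASE)

def by_regex():
    return [m.start() for m in pattern.finditer(text)]

def by_words():
    # Words are separated by exactly one space in this test text.
    starts, pos = [], 0
    for token in text.split(" "):
        if token.strip(".,").lower() == "ram":
            starts.append(pos)
        pos += len(token) + 1
    return starts

RUNS = 1000  # same repeat count as the example above
regex_seconds = timeit.timeit(by_regex, number=RUNS)
words_seconds = timeit.timeit(by_words, number=RUNS)
print(f"regex: {regex_seconds:.4f}s  word search: {words_seconds:.4f}s")
```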




Why is GREP missing a match? I still don't know why GREP misses one occurrence when working in bulk. In this example, I run GREP line by line (which is typical of how we see it used) and it finds all the occurrences.
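
One way to hunt for a missing match is to run the same pattern in bulk and line by line and compare the two result sets. The Python sketch below shows that check. It does not reproduce the problem described above, and the helper names and sample text are hypothetical.

```python
import re

def bulk_hits(pattern, text):
    """Spans found by running the pattern over the whole text at once."""
    return [m.span() for m in pattern.finditer(text)]

def line_hits(pattern, text):
    """Spans found line by line, shifted back to whole-text indices."""
    hits, offset = [], 0
    for line in text.splitlines(keepends=True):
        for m in pattern.finditer(line):
            hits.append((offset + m.start(), offset + m.end()))
        offset += len(line)
    return hits

pattern = re.compile(r"\bram\b", re.IGNORECASE)
text = "A ram stood.\nThe Ram ran.\nNo match here.\nLast line: ram"

bulk = bulk_hits(pattern, text)
by_line = line_hits(pattern, text)
# Any span listed here was found line by line but missed in bulk.
missing_in_bulk = sorted(set(by_line) - set(bulk))
print(bulk, missing_in_bulk)
```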




Line by Line processing and timing. This illustrates that the Glee word search is nearly twice as fast as GREP on a line-by-line basis. Until I find out why GREP isn't finding all occurrences in bulk, I will have to recommend this approach. In the meantime I think I'll have to favor Glee for doing bulk searches.




Patterns and subpatterns: I'm going to have to get together with a GREP jock before settling on exactly how Glee will ultimately support GREP. In the meantime I've chosen to use nesting to collect the GREP results for patterns and subpatterns. There are three examples here. All three match on "cat" (1-3 ... remember the 2nd index is the beginning of the next character). The first example then matches "aract" (4-8). The second example matches "erpillar" (4-11). The last one matches nothing more (4-4 is a 0 length field). In these cases I may want that to show 0 0 or _1 _1. I just don't know yet. If someone does, please message me at <feedback@WithGLEE.com>.
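
As one point of comparison for the empty-subpattern question, here is how Python's `re` reports subpattern positions. The two patterns are guesses at what the examples might use, not the actual patterns, and Python counts from 0. Python distinguishes a subpattern that matched an empty string (a zero-length span) from one that did not take part in the match at all, which it reports as (-1, -1).

```python
import re

def group_spans(pattern, word):
    """Return the (begin, end + 1) span of every subpattern."""
    m = re.match(pattern, word)
    return [m.span(g) for g in range(1, m.re.groups + 1)]

# Hypothetical pattern 1: the second subpattern may match an empty string.
EMPTY_ALLOWED = r"(cat)(aract|erpillar|)"
# Hypothetical pattern 2: the second subpattern is optional as a whole.
OPTIONAL = r"(cat)(aract|erpillar)?"

for word in ("cataract", "caterpillar", "cat"):
    print(word, group_spans(EMPTY_ALLOWED, word), group_spans(OPTIONAL, word))
```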




Parsing: In those cases where you want to parse the text based on matches, Glee as usual gives you several choices. This example shows the indirect method of securing field start and end (+1) indices and using the field start for the cut point. I implemented a more direct method for GREP types that allows you to use the pattern to do the cutting directly. Again, please grant me that this is an experimental work in progress. It's going to take me a while to determine if I like GREP's matching powers better than Glee's. That can't happen until I really master GREP and then know how to knit it into Glee philosophically. One thing has come out of this little diversion: when someone quips that APL, J, K, and now Glee are write-only languages, I can drag out some GREP patterns.
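
The indirect method (collect the field start indices, then cut the text at each start) might look like this in Python. It is a sketch under assumptions: the function name, the `\w+` field pattern and the sample text are invented for illustration, and Glee's cutting operators may behave differently at the edges.

```python
import re

def cut_at_match_starts(pattern, text):
    """Cut `text` into pieces, each beginning where a match begins.
    Any text before the first match becomes the first piece."""
    starts = [m.start() for m in pattern.finditer(text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    bounds = starts + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:])]

fields = re.compile(r"\w+")
pieces = cut_at_match_starts(fields, "alpha beta gamma")
print(pieces)
```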




Indices Bracketing: The ``& operator with a #GREP right argument naturally produces indices for the beginning and ending+1 of words. However, for a simple string right argument, ``& produces only the index of the beginning of each word. A new operator ``&<> produces the #GREP style result for such an argument.
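
The difference between the two result shapes can be sketched in Python with a plain substring search. The function names are hypothetical, the search is a literal one rather than Glee's word matching, and indices start at 0.

```python
def start_indices(text, word):
    """Beginning index of each occurrence of `word` (starts only)."""
    starts, pos = [], text.find(word)
    while pos != -1:
        starts.append(pos)
        pos = text.find(word, pos + len(word))
    return starts

def bracket_indices(text, word):
    """Beginning and ending + 1 of each occurrence, as index pairs."""
    return [(s, s + len(word)) for s in start_indices(text, word)]

sample = "ram, Ram, ram"
print(start_indices(sample, "ram"), bracket_indices(sample, "ram"))
```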




Marking Beginning and Ending of words. With Glee's flexible matching capabilities, the length of the match varies with each match. Thus you need to be able to mark the beginning as well as the ending of matches. With all the operators that do matches explicitly (*&) or implicitly (->>&), I needed a way to signal the operator. I used the same technique I used for signaling exact (@==) and Glee (@=) sorting. For beginning I use (@<) before the operator. For ending I use (@>) to signal I'm interested in the end of the matched string.




Query. The choice of @> to signal ending forced me to find another glyph set for the End Of String query. The ? is the natural query glyph, so I am now employing it wherever the operator is about making a query. So @> now becomes ?@> (are we at the end?) and @< becomes ?@< (are we at the beginning?). The index of the previous cursor position is ?`@<<- and the current cursor position is ?`@.



