This note describes XyWrite III+ in some detail, with emphasis on (a) its overall architecture and on "internals," and (b) its "macro"/programming/automation facilities. The target audience is threefold: (1) XyWrite III+ users who want more understanding, mostly of XyWrite XPL programming, (2) others who have never used XyWrite, and "wonder what the fuss is/was all about," and (3) myself -- since nothing clarifies one's thoughts about a given topic as much as trying to explain the topic to someone else.
The description presented here will not appeal to everyone -- it is directed at people who tend to organize their understanding of things around the ideas of "how things work." The note will probably be much less useful for people who thrive on lists of "how to do this" and "how to do that," but have little interest in why things are done the way that they are.
XyWrite III+ is a product that many users still feel is the best writing tool they have ever experienced. But, due to some misestimation by XyQuest (XyWrite's developer) as to how much MS Windows would damage the DOS applications market, plus an untimely, misguided, and costly partnership between XyQuest and IBM, XyQuest failed at about the time MS Windows emerged. XyWrite development largely ceased soon thereafter.
In my view, many of the concepts that made XyWrite great have never been articulated, and many of them died when XyQuest died. This note attempts to explore and lay out some of those concepts, in a way that they might be appreciated even by someone who has never used the product, in the hope that some of these concepts might emerge in some measure in future "word processing" software. This hope, however, is perhaps a rather slim one -- nothing will make a person into a XyWrite fan as much as actually using the product will.
For all that it did "right," XyWrite did a fairly large number of things "wrong." In this note, I will attempt not to hide any of my opinions about the latter.
This note does not constitute complete documentation of XyWrite, by any means. For the most complete documentation available, I recommend Herbert Tyson's "XyWrite Revealed" book (Windcrest, 1990, ISBN 0-8306-8459-X), which I have found to be much more useful than the documentation included with XyWrite. Unfortunately, that book may be hard to come by.
XyWrite III+ version 3.58B is the final release by XyQuest of a word processor called XyWrite, whose first incarnation occurred in the mid 1980s, and which then evolved until about 1991. XyWrite was designed for MS/PC DOS. XyWrite had an avid following, and was the "official word processor" of the New York Times from 1989 to 1993.
Unlike other "word processors," which start by defining a proprietary format for their data files, XyWrite starts by largely adopting standard DOS ASCII files as its basic data format.
Word processors differ from simple text editors largely in the fact that their target is the printed page, which usually incorporates such niceties as "proportional fonts" and "microjustification" (e.g., inserting tiny amounts of extra space between the letters on each line of text so that the right margin of the lines in a paragraph aligns as perfectly as does the left margin). To accomplish such goals, word processors do things like "reflowing" the text of paragraphs to optimally fit text within a defined set of margins. And, to do that, the word processor needs a certain amount of "control information" -- the desired margin settings on the page, for example -- over and above the actual text of the document.
But, for many files, particularly files destined for other than printing, little if any "word processor" control information need be added, and in such cases, XyWrite allows one to "end up with" a simple ASCII file, rather than any of the myriad of proprietary format alternatives offered by other "word processors."
In XyWrite, any additional word processing control information added to an ASCII file is added by including "embedded commands" within the file. These embedded commands are short sequences of normal ASCII characters, enclosed between a left and a right guillemet. For example, «RM72» is the embedded command that would be used to set the Right Margin to 72. Because "word processing" control information is added by XyWrite in such a simple, consistent, and transparent way (relative to what is done by other word processors), "filters" which translate XyWrite files into practically any end use target (e.g., HTML, programming language source files, newspaper printing input, etc.) are relatively easy to write.
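To give a feel for how little such a filter needs to do, here is a minimal sketch in Python (not anything XyQuest shipped). It assumes the file has already been decoded into a string, that embedded commands do not nest, and that no alternate character encodings are present; the name split_embedded is my own invention for illustration.

```python
import re

# Assumption: an embedded command is a non-nested run of text between
# a left guillemet and a right guillemet.
EMBEDDED = re.compile("«([^«»]*)»")

def split_embedded(text):
    """Return (plain_text, commands) for a decoded XyWrite-style string."""
    commands = EMBEDDED.findall(text)
    plain = EMBEDDED.sub("", text)
    return plain, commands

plain, commands = split_embedded("«RM72»Dear reader,\r\n")
print(repr(plain))    # the text with the command removed
print(commands)       # the list of command bodies that were found
```

A real filter would map each command body (such as RM72) to whatever the target format needs, rather than simply discarding it.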
XyWrite III+ was, in some sense, superseded by a program called "Signature," which was developed by XyQuest in partnership with IBM, but which IBM abandoned at just about the time Signature was to be first released. Largely due to IBM's involvement, Signature was a somewhat bloated program, that sought to include "everything but the kitchen sink." It was released well before being debugged, and was a market disaster.
Signature then morphed into XyWrite IV, as XyQuest attempted to recoup from its bloated development effort with IBM. But too much time and money had been expended, and too much unneeded complexity had been added to XyWrite IV for it to be maintainable at reasonable cost. XyQuest as a viable corporation failed soon thereafter.
XyQuest was bought by The Technology Group, which carried XyWrite IV forward to version 4.018 (and XyWrite for Windows to version 4.012) before abandoning XyWrite as not commercially viable.
I made some effort to see if I could migrate my XyWrite activities to XyWrite IV, version 4.018, and found the program very buggy, being unable, for example, to even reliably recognize (or correctly respond to being told) the height of the command window in which the program was being run. A large number of things that I tried also caused the program to freeze -- part of the learning curve seems to be learning what to avoid. I also found it to be about 2 orders of magnitude slower than XyWrite III+, in loading a sequence of files (one at a time) for searching purposes -- a function which is critical to me. This could in part be related to the environment in which XyWrite IV is run -- the XyWrite load module (at 681K, 3.7 times the size of XyWrite III+) is large enough that EMS or XMS memory is used for much file buffering, so the XMS and/or EMS implementation and functionality in XyWrite IV's execution environment are very likely critical.
One of XyWrite's long term blemishes comes from XyQuest's tendency to be overly terse, using 2 character names for the vast majority of its commands, options, and many, many, many "switches." Along the same line is XyQuest's use, for example, of key numbers like "59" and "60" as the designators (in a XyWrite "keyboard file") for the F1 and F2 keys, in preference to somewhat more user friendly strings like "F1" and "F2". All of this terseness was probably reasonable -- even desirable perhaps -- for XyWrite III+, which ran on many, many machines with as little as 256KB of memory. But with XyWrite IV, the number of such terse names expanded greatly, making a serious problem very much worse -- and the existence of 256K machines was not a justifying factor for not improving this state of affairs in the XyWrite IV timeframe. In my opinion, XyQuest's willingness to allow so much further deterioration on this "user-friendliness" front -- without even so much as a hint that this issue had been rethought in the context of the more powerful machines then being made -- was a significant factor in contributing to XyWrite IV's rapid demise.
Nonetheless, XyWrite IV does continue to "live" within a group of users, and some of the original XyWrite "gurus" have migrated to XyWrite IV as the "XyWrite of choice." XyWrite's XPL programming language was enhanced in XyWrite IV, so XyWrite IV XPL programs are considerably easier (but still not easy) to write. A huge "library" of XyWrite IV add on XPL programs can be found at "www.serve.com/xywwweb/", but the body of work there is really in a form that isn't very useful for folks that don't have a working XyWrite IV installation, and who are still unsure of whether obtaining and installing a copy of XyWrite IV is what they want to do.
But, taking another crack at exploring XyWrite IV's virtues remains on my maybe to do list. Some folks highly recommend it.
There was/is also an incarnation of XyWrite known as NotaBene, originally marketed by a firm called Dragonfly Software, but now simply called NotaBene. Dragonfly apparently obtained some kind of rights from XyQuest to use the XyWrite code as a base. In the period when XyQuest was marketing XyWrite III+, the NotaBene executable was almost indistinguishable in functionality from the XyWrite III+ executable. But, Dragonfly released the executable under the NotaBene label, while "enhancing" the overall product offering by adding functionality in the area of "scholarly writing." The enhancements consisted of a good deal of tailoring of XyWrite via the use of gigantic keyboard files, help files, and via XyWrite's XPL programming language, plus some added "database" tools for maintaining and using bibliographic data. NotaBene has also had enhancements in support of languages such as Hebrew, Greek and Cyrillic.
For my uses at least, Dragonfly subtracted more from XyWrite III+ than it added. Its approach to product licensing was very draconian compared to XyQuest's. Its use of XyWrite's programming facilities was undocumented, making additional tailoring without conflicts nearly impossible. And the XPL that NotaBene added was largely encrypted, further making it difficult to add much of anything in the way of additional customization that had any synergy at all with Dragonfly's work. In my view, NotaBene was for all practical purposes not user customizable at all.
As best as I can determine, NotaBene obtained all remaining rights to the XyWrite code base as The Technology Group folded, and now (2008) holds those rights. NotaBene (corporation) still has a NotaBene for Windows offering, based on some version of the XyWrite code base, but the emphasis is still almost exclusively on NotaBene's "scholarly writing" added functionality and tool set, with the underlying XyWrite functionality and versatility still remaining mostly hidden and undocumented (and therefore difficult, if not impossible, to use).
The first impression that one might have of XyWrite is that of a simple, "full screen" multi file, ASCII text editor. XyWrite will edit up to nine files concurrently.
In such an editor, we know what the "a" key will do -- it will typically place the letter "a" into the file being edited. We also know what the cursor keys will probably do. But how about other keys, such as a PC's so called "function keys"?
XyWrite answers that question by providing for a "keyboard file." One of the keyboard files shipped with XyWrite starts with a "picture," which, in part, looks like this:
; ╔════╦════╦════╦════╦════╦════╦════╦════╦════╦════
; ║ Esc║ 1  ║ 2  ║ 3  ║ 4  ║ 5  ║ 6  ║ 7  ║ 8  ║ 9
; ║ 1  ║ 2  ║ 3  ║ 4  ║ 5  ║ 6  ║ 7  ║ 8  ║ 9  ║ 10
; ╠════╩╦═══╩╦═══╩╦═══╩╦═══╩╦═══╩╦═══╩╦═══╩╦═══╩╦═══
; ║ Tab ║ Q  ║ W  ║ E  ║ R  ║ T  ║ Y  ║ U  ║ I  ║ O
; ║ 15  ║ 16 ║ 17 ║ 18 ║ 19 ║ 20 ║ 21 ║ 22 ║ 23 ║ 24
; ╠═════╩╦═══╩╦═══╩╦═══╩╦═══╩╦═══╩╦═══╩╦═══╩╦═══╩╦══
; ║ Ctrl ║ A  ║ S  ║ D  ║ F  ║ G  ║ H  ║ J  ║ K  ║ L
; ║ 29   ║ 30 ║ 31 ║ 32 ║ 33 ║ 34 ║ 35 ║ 36 ║ 37 ║ 3
; ╠════╦═╩══╦═╩══╦═╩══╦═╩══╦═╩══╦═╩══╦═╩══╦═╩══╦═╩══
; ║Shft║ \  ║ Z  ║ X  ║ C  ║ V  ║ B  ║ N  ║ M  ║ ,
; ║ 42 ║ 43 ║ 44 ║ 45 ║ 46 ║ 47 ║ 48 ║ 49 ║ 50 ║ 51
; ╠════╩══╦═╩════╩════╩════╩════╩════╩════╩════╩════
; ║ Alt   ║ Space Bar
; ║ 56    ║ 57
; ╚═══════╩═════════════════════════════════════════
where the ";" in column one indicates that these lines are only "comments" to XyWrite. The picture allows the user to associate a number with each key.
Later in the keyboard file, we would find, for example
SHIFT=42,54
telling XyWrite which keys were to be interpreted as "shift" keys. A user can get "creative" here, and try to change the norms as to which keys on the keyboard are the various Shift/Ctrl/Alt keys and such. But doing so works less well when you run XyWrite in the "DOS boxes" of Windows or Linux, because Windows and/or Linux is also interpreting keys like the shift keys, and the resulting inconsistencies usually lead to problems.
Later in the keyboard file, there are entries like
TABLE=
...
14=BD
, telling XyWrite that key 14 is the Backspace Delete key, or
65=CP
, telling XyWrite that key 65 is the key for CoPying marked text, or
30=a
, telling XyWrite that key 30 is the letter "a". (The "TABLE=" statement identifies the entries which follow as being associated with unshifted use of the indicated keys -- for each numbered key, there are similar entries later in the keyboard file which are preceded by entries like "TABLE=SHIFT" or "TABLE=CTRL".)
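The general shape of the file is simple enough that reading it takes only a few lines of code. The following Python sketch is only an illustration under my own assumptions: it handles just the ";" comments, the "TABLE=" headers, and the "number=definition" entries described above, silently skips everything else (including "SHIFT=42,54"), and the function name parse_keyboard_file is hypothetical.

```python
def parse_keyboard_file(text):
    """Collect 'number=definition' entries under each 'TABLE=' header.

    Returns {table_name: {key_number: definition}}; the unshifted table
    (a bare 'TABLE=') is stored under the empty string.
    """
    tables = {}
    current = None
    for line in text.splitlines():
        if not line.strip() or line.startswith(";"):
            continue                      # blank line or comment
        name, sep, value = line.partition("=")
        if not sep:
            continue                      # not a 'name=value' line
        name = name.strip()
        if name.upper() == "TABLE":
            current = tables.setdefault(value.strip().upper(), {})
        elif name.isdigit() and current is not None:
            current[int(name)] = value    # keep the definition verbatim
    return tables

sample = """; comment line
SHIFT=42,54
TABLE=
14=BD
65=CP
30=a
TABLE=SHIFT
30=A
"""
tables = parse_keyboard_file(sample)
print(tables[""][65])
print(tables["SHIFT"][30])
```

The definition is kept verbatim rather than stripped, on the assumption that a definition may legitimately be a single blank.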
It's pretty easy to understand the general syntax of a XyWrite keyboard file, once you've examined one.
As an aside: the keyboard file format exemplifies one of XyQuest's major failings -- in the area of "personalizing" the program, XyQuest seemed to give near-zero priority to making the process easier for humans of the type that live on planet Earth. There is little reason why the "65=CP" entry could not have just as easily read "F7=CP", thus saving the user the trouble of endless references to the keyboard numbering diagram when working with a XyWrite keyboard file. My guess is that, if you asked the folks at XyQuest "why '65=' rather than 'F7=' in the keyboard file?", they would tell you that numbers (e.g., '65') can be parsed faster and with fewer lookup tables than parsing 'F7' would require, during each startup of the XyWrite program -- and they would wonder why you asked.
Although -- after personalization has been done -- XyWrite may be the fastest and bestest tool on the planet for folks who like to write, XyWrite remained, even through its Signature and XyWrite IV incarnations, one of the most time consuming tools on the planet to actually personalize. I suspect that many folks who made it through their personalization experiences with XyWrite III+ did so grudgingly, and only because the challenge XyWrite created for them made them refuse to give up, once they had started. But asking folks to repeat the personalization nightmare -- only worse -- for Signature or XyWrite IV, once they had already been through it with XyWrite III+, is hard to see as much of anything other than a (largely successful) death wish.
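To show how small the missing lookup table would have been, here is a hypothetical preprocessor sketch in Python that rewrites symbolic entries into the numeric form. The assumption that F1 through F10 occupy consecutive key numbers 59 through 68 is mine; it agrees with the 59, 60, and 65 values quoted in this note, but XyWrite itself offers no such names.

```python
# Assumption: F1..F10 map to consecutive key numbers 59..68, which agrees
# with the values 59 (F1), 60 (F2) and 65 (F7) quoted in the text.
KEY_NAMES = {f"F{n}": 58 + n for n in range(1, 11)}

def to_numeric_entry(entry):
    """Rewrite 'F7=CP' as '65=CP'; leave any other entry unchanged."""
    name, sep, definition = entry.partition("=")
    number = KEY_NAMES.get(name.strip().upper())
    if not sep or number is None:
        return entry
    return f"{number}={definition}"

print(to_numeric_entry("F7=CP"))
print(to_numeric_entry("14=BD"))
```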
But -- to continue our examination: The letter "a" in the key 30 entry is pretty self explanatory. The "BD" in the key 14 entry or "CP" in the key 65 entry is somewhat less self explanatory. "BD" and "CP" are two of about 300 two-letter codes that represent "functions" or "primitives" that can be assigned to keys.
The number of available primitives is formidable, and well exceeds the number of (unshifted) keys on a keyboard, as one discovers when one goes about assigning these primitives to keys. But it's not as huge as it may at first seem - for example, there are 3 sequences of 36 code assignments each (e.g., @A-@Z, @0-@9) and another sequence of 9, where the "functionality" is essentially the same for all 36/9, with only an implicit embedded 1 letter/number "argument" being the difference. Also for some functions which many editors do only on a character basis, such as cursor-right-1-character or delete-character, XyWrite provides additional similar primitives based on words, sentences, lines, and/or paragraphs (such as "RP", which Rubs out (deletes) the Paragraph containing the cursor).
For folks (like me) who feel up-front that remembering key assignments for each of these would be a painful learning chore not always worth undertaking, there is always the option of not assigning such primitives to any key at all, since the functions involved can always be accomplished (albeit somewhat more slowly) using just the character based versions of these primitives. Or, one can take a wait-and-see approach -- assign essential primitives right away, and assign less essential primitives to keys as the need arises and it seems appropriate. Here again, though, is the kind of area where XyQuest missed the mark -- helpful items, such as a list of an essential subset of primitives needed to get up and running, were not provided. So the user is stuck wading through all 300 primitives, just to get started.
XyWrite provides a default keyboard file, in which the large number of available primitives is dealt with by allowing, for example, the A, Shift-A, Ctrl-A, Alt-A, Ctrl-Alt-A, Ctrl-Shift-A, Shift-Alt-A, and Ctrl-Shift-Alt-A "shifted keys" to each have different primitive (and/or character) assignments. But with a 100 key keyboard, that could be about 800 key combinations to remember, and it's not clear that this approach works all that well for many people.
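The arithmetic behind that estimate is just a count of the subsets of the three modifier keys. A small Python sketch, with the 100 key keyboard taken from the text as a round figure:

```python
from itertools import combinations

MODIFIERS = ("Ctrl", "Shift", "Alt")

def shift_states(modifiers=MODIFIERS):
    """List every subset of the modifiers, from none held to all held."""
    return [combo
            for n in range(len(modifiers) + 1)
            for combo in combinations(modifiers, n)]

states = shift_states()
print(len(states))          # 8 shift states per key
print(len(states) * 100)    # 800 combinations on a 100 key keyboard
```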
In our examples above, each key was associated with a single primitive or letter. But entries in the keyboard file can also be associated with a sequence of letters and/or primitives. For example,
14=BD,BD,BD
would result in a backspace key that backspaces 3 characters, and
30=NO,J,o,e, ,S,m,i,t,h
would result in an "a" key that places the name "Joe Smith" in a file, rather than the letter "a".
Elements of a multi-element definition are separated by commas, as we can see in the examples. For definitions which consist of multiple elements, XyWrite requires that the first element be a primitive, so to get our "Joe Smith" key, we started with a "NO" (No Operation) primitive, which does nothing other than satisfy this XyWrite requirement.
After either learning a (possibly temporary) usable subset of the default key assignments, or creating your own (possibly temporary) keyboard file with key assignments which you find to be a usable subset, the XyWrite program can be explored in more detail. On invocation, one finds that the program has one screen line reserved as a "command line," another couple of lines reserved for status, error, and "ruler" indications of various types, and 22 lines (on a 25 line screen/window, or 47 on a 50 line screen/window) reserved as the text editing area.
XyWrite can edit up to 9 files concurrently, in a set of buffers which are numbered from 1 to 9. When editing multiple files, you can simply display only one file at a time on screen, and then use some of the keyboard primitives to switch the display to a particular numbered buffer or to cycle through the buffers until you find the one you want. Or, there are also facilities for creating different "windows" for each of the buffers, within the XyWrite display area, thus allowing you to see portions of more than one file at the same time. When multiple windows are in use, only one buffer (and its associated window) is selected or "active" at any given time.
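The rule can be sketched in a few lines of Python. This is a guess at the logic, not XyWrite's actual parser: it assumes that any two-character element is a primitive and any one-character element is literal text, and it makes no attempt to handle a literal comma. Both function names are hypothetical.

```python
def parse_definition(definition):
    """Split a key definition on commas and classify each element.

    Assumption: two characters means a primitive, one character means
    literal text; anything else is rejected.
    """
    elements = []
    for item in definition.split(","):
        if len(item) == 2:
            elements.append(("primitive", item))
        elif len(item) == 1:
            elements.append(("char", item))
        else:
            raise ValueError(f"unrecognized element: {item!r}")
    if len(elements) > 1 and elements[0][0] != "primitive":
        raise ValueError("a multi-element definition must start with a primitive")
    return elements

def typed_text(elements):
    """Return just the literal characters that a definition would type."""
    return "".join(value for kind, value in elements if kind == "char")

print(typed_text(parse_definition("NO,J,o,e, ,S,m,i,t,h")))   # Joe Smith
```

Under these assumptions "J,o,e" is rejected, while "NO,J,o,e" is accepted, which mirrors the reason for the leading "NO" in the example.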
Regardless of the number of "windows," there is only one command line, and commands entered on the command line apply to the currently selected buffer/window. The command line serves for entering both "immediate" and "embedded" commands.
An immediate command might be something like "edit a:\myfile", to initiate editing "myfile" on the A: disk in XyWrite. Commands of this sort are familiar to anyone who has used most any fullscreen editor under DOS.
An embedded command would be something like "RM 72", to set the Right Margin to 72. Commands of this sort simply "embed" the command, bracketed by guillemets, at the cursor location of the current file being edited -- the text embedded for a "RM 72" command would be «RM72». An embedded command, such as «RM72», is subsequently acted upon whenever the setting that it conveys is relevant.
Once it has been entered into a file, an embedded command can also be edited or changed, simply by editing the corresponding text that was placed in the file.
Embedded commands generally don't "print" as such. For the most part, they supply information as to exactly how printing is to be done.
The left («) and right (») guillemets, as used to enclose the RM72 as it appears within a data file, are used to enclose any and all of XyWrite's embedded commands. This strategy means that -- in implementing the notion of embedded "word processing codes," only two characters (the left and right guillemets) require "special interpretation," above and beyond their normal ASCII/code page 437 meanings. These characters correspond to decimal codes 174 (0AEh) and 175 (0AFh) respectively, and hence these codes don't interfere with files (such as computer programs in nearly any programming language) which use only the "usual" ASCII characters (those below 128/80h).
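These byte values are easy to check for yourself. The sketch below is Python, uses the standard "cp437" codec, and the two helper names are my own; it simply tests whether a file's bytes stay below 80h and whether either guillemet byte is present.

```python
LEFT, RIGHT = 0xAE, 0xAF   # guillemets in code page 437 (decimal 174, 175)

def is_plain_ascii(data):
    """True when every byte is below 80h, i.e. 'usual' ASCII only."""
    return all(byte < 0x80 for byte in data)

def has_guillemets(data):
    """True when the data contains either embedded command delimiter."""
    return LEFT in data or RIGHT in data

source_code = b"int main(void) { return 0; }\r\n"
document = bytes([LEFT]) + b"RM72" + bytes([RIGHT]) + b"Some text.\r\n"

print(is_plain_ascii(source_code), has_guillemets(source_code))
print(is_plain_ascii(document), has_guillemets(document))
print(bytes([LEFT, RIGHT]).decode("cp437"))
```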
To facilitate the WYSIWYG (What You See Is What You Get) properties which differentiate word processors from editors, XyWrite supports both a normal and an "expanded" display mode for displaying file contents on screen. In normal mode, XyWrite formats the data as requested by the various embedded commands which have been accumulated into the file, so that paragraphs look like paragraphs, indented areas are indented, margins are at the desired point, and so forth. A small triangle, occupying a single character position (and whose display can be suppressed, if desired), remains on screen as a reminder as to where embedded commands have been entered, but the embedded commands are not otherwise displayed (unless the cursor is placed at the associated triangle). This is XyWrite's "WYSIWYG" mode of display.
Many writers (myself included) find on-screen formatting of paragraphs, and the various indentation constructs such as numbered lists, to be essential to their thought processes when writing and revising documents. As one writes, XyWrite does the necessary on-screen formatting adjustments virtually instantly.
In contrast, the "expanded" display mode ignores the content of any embedded commands, instead simply displaying the codes as they appear throughout the file, each being bracketed by the guillemet characters that enclose embedded commands. This renders embedded commands as instantly editable. Switching between normal and expanded modes is done by primitives that can be assigned to any key or keys, via the keyboard file. The cursor location within a file is essentially unchanged when the display mode is changed between normal and "expanded," making it easy to retain focus on any item of interest, when changing between normal and expanded mode display.
In some ways, the XyWrite "expanded mode" display is like the "reveal codes" window in WordPerfect. But, in comparing the two, I think that the XyWrite expanded display is vastly superior. Except for the absence of WYSIWYG formatting and the explicit inclusion of embedded commands, XyWrite's expanded mode display looks just like the normal mode display. When you switch modes, the cursor stays at the same point in the file, and you can continue editing in either mode without breaking your stride. No new skills or new habits are needed to edit in expanded mode. And there is no need to waste half of the screen, just to display what is really redundant data. And, unlike WordPerfect, you won't find a lot of extra embedded commands (such as "soft returns") ostensibly entered "on your behalf" by the word processor. And the embedded commands should be instantly recognizable, because they are what *you* entered.
The "embedded command" strategy is one place where XyWrite really starts to shine. Dealing with, say, the right margin settings, requires knowing the "RM nn" command and construct, but it requires knowing ONLY that construct (as pertains to right margin functionality), whether it be for purposes of entering the setting initially, examining a setting already entered, or editing/changing a setting already entered. The same kind of uniformity/simplicity applies to all other embedded commands as well. And it's all done without menus, dialogue boxes, or special "reveal codes" screens with their special abbreviations, that clutter up most of the other word processors.
XyWrite approaches the character set question by embracing "code page 437," which is the character set defined by IBM for the original PC, and which was built into the early display cards for the PC. Code page 437 matches the ASCII code for the displayable ASCII characters with values from 20h through 7Eh, which is the range where ASCII defines some kind of graphic for each ASCII code point.
If both your "DOS mode" on screen display and your printer support it, XyWrite is equally usable with code page 850.
In DOS files, the CR-LF (Carriage Return-Line Feed) sequence is normally thought of as denoting the end of a line. In XyWrite, CR-LF denotes the end of a unit of text that XyWrite will "reflow" or "word-wrap" as a unit, which in turn is usually thought of as corresponding to the notion of a paragraph. This may seem like a conflict, but in fact does not turn out to be so. If it isn't clear to you why not, then I would suggest that you spend a little time with a program like the "Notepad" program in Microsoft Windows, which works the same way as XyWrite does in this regard.
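For anyone who would rather see the idea than experiment with Notepad, here is a minimal Python sketch of the same convention, using the standard textwrap module. It treats each CR-LF delimited unit as one paragraph and wraps it for display; equating the right margin with a simple character width is my simplification, and the function name reflow is hypothetical.

```python
import textwrap

def reflow(data, right_margin=72):
    """Wrap each CR-LF delimited unit of text to at most right_margin columns."""
    paragraphs = data.split("\r\n")
    if paragraphs and paragraphs[-1] == "":
        paragraphs.pop()              # the final CR-LF ends the last paragraph
    lines = []
    for paragraph in paragraphs:
        # textwrap.wrap returns [] for an empty paragraph; show a blank line.
        lines.extend(textwrap.wrap(paragraph, width=right_margin) or [""])
    return lines

stored = "one two three four\r\nfive\r\n"
for line in reflow(stored, right_margin=9):
    print(line)
```

The stored data never changes; only the on-screen (or printed) line breaks move when the margin does.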
Relative to code page 437, we noted earlier that XyWrite usurped code points 0AEh and 0AFh (the guillemets) for its own "word processor command" encoding purposes. So -- what happens if a user wants to use these characters as normal, printable characters?
Somewhere along the line, XyWrite designers observed that this and similar considerations for some other characters would mean that it needed more than just the 256 characters that can be encoded in a byte. So XyWrite devised a supplemental "3 byte encoding" scheme for what are "logically" single characters. By "logically single characters," I mean entities that are treated as a unit while editing -- they can only be created, deleted, moved, or copied as an encapsulated 3 byte unit, and are not generally accessible for editing as 3 individual bytes or characters. Each of these encodings appears on the display as only a single character. Given all of these aspects of XyWrite's handling of 3 byte encoded characters, a novice XyWrite user might remain unaware of their 3 byte nature for quite some time.
The encoding used for 3 byte encoded characters consists of defining an "escape" or "trigger" character (which happens to have the value 0FFh), which can always be used to identify the beginning of a 3 byte encoding, and then 2 more bytes which actually encode the "meaning" of a given 3 byte encoding. Thus, for about 16 of the 256 characters in code page 437, there are two commonly used encodings -- the regular encoding in 1 byte, and an "alternate" encoding which is encoded in 3 bytes. For a few characters, ONLY the 3 byte encoding is ever used or usable in a file -- the 0FFh code point, 1Ah code point (the DOS End-Of-File marker), and the 00h code point being 3 examples.
"Alternate encodings" of simple data characters, as generated by XyWrite, always consists of the 0FFh "trigger", and two ASCII characters each in the range '0' to '9' or 'A' to 'F'. These two characters, interpreted as an expression in hex, give the value of the 1 byte encoding of the same character.
In decoding 3 byte encodings of simple data characters, XyWrite (III+) will recognize that such an encoding exists, for any 0FFh byte followed by a byte of less than 080h. This leaves 15 bits (in the last two bytes) to represent only 256 characters, which is a lot more bits than necessary. It appears that any values are accepted for these bytes, without complaint, and they are converted to a single byte char with an algorithm equivalent to the following (assuming the first byte is in AL, and the second in AH):
        call    do_one
        xchg    al,ah
        call    do_one
        shl     al,4
        shr     ax,4
where the procedure "do_one" looks like
do_one  proc    near
        sub     al,'0'
        cmp     al,10
        jb      isdec
        sub     al,7
        ;above maps 'A'-'F' such that low nibble is 0Ah-0Fh
        ;note that *every* byte other than '0' to '9' is "bumped" by 7
        and     al,0Fh
isdec:  ret
do_one  endp
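For readers who don't read x86 assembly, the same conversion can be sketched in Python. This is my own transliteration of the routine above, not XyQuest's code; encode_alternate is a hypothetical helper that produces the 0FFh plus two hex digit form which XyWrite is described as generating, so that the pair can be tested as a round trip.

```python
TRIGGER = 0xFF

def do_one(ch):
    """Mimic the do_one procedure on one byte, using 8 bit arithmetic."""
    al = (ch - ord("0")) & 0xFF      # sub al,'0'
    if al < 10:                      # cmp al,10 / jb isdec
        return al
    return (al - 7) & 0x0F           # sub al,7 / and al,0Fh

def decode_pair(first, second):
    """Combine the two bytes after the trigger into a single byte value."""
    return (do_one(first) << 4) | do_one(second)

def encode_alternate(value):
    """Build the 3 byte form: trigger, then two upper-case hex digits."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a value in the range 0..255")
    return bytes([TRIGGER]) + format(value, "02X").encode("ascii")

encoded = encode_alternate(0xAE)
print(encoded)                                  # b'\xffAE'
print(hex(decode_pair(encoded[1], encoded[2]))) # 0xae
```

Because every byte outside '0' to '9' is reduced by 7 and masked to its low nibble, lower-case 'a' to 'f' happen to decode to the same values as 'A' to 'F' under this algorithm.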
Although 3 byte encodings are usually treated as an encapsulated unit, there are some circumstances in XyWrite where the individual bytes of a 3 byte encoding can be separately manipulated. Many string operations in XPL -- XyWrite's built in "programming language" -- ignore the encapsulation, and allow manipulation of the individual bytes of such an encoding. Also, when these encapsulations appear on the command line, they are also displayed as and treated as 3 individual bytes, which can be manipulated separately.
The character 0FDh is also used as an escape or trigger character for some encapsulated sequences, which are from 5 to 7 bytes long. 0FDh triggered sequences, however, are only used internally by XyWrite, and are never recorded in the on-disk version of the associated file. They are used as part of a strategy by XyWrite to accelerate screen refresh. I know of no way in which these sequences reveal themselves to, or are in any way relevant to, anything that a user does or sees when using XyWrite normally. (If you know enough about the sequences, you can locate them via the search command, in normal display mode, however.) To my knowledge, these sequences have only been observed by users who have examined XyWrite's memory image with a debugger (or similar), while XyWrite was in operation. By far the main reason that a XyWrite user needs to know anything about 0FDh sequences at all (IMO) is that XyWrite's 0FDh encapsulations create a need to use the 3 byte encoding for the 0FDh character, in all cases where this character is desired for use as user data. If you manage to get a 1 byte encoded 0FDh character into a file as user data (with some program other than XyWrite, for example), editing the file with XyWrite will likely cause some amount of data corruption in the vicinity of the 0FDh character, and it may even cause XyWrite to crash.
Like the 0FDh character, a 1 byte 01Ah (normally, End of File) should be studiously avoided within user file data (other than at end of file). As with the 0FFh and 0FDh characters, XyWrite is checking for this character continuously, most of the time, when it is processing any kind of stream. If found, the 1Ah character generally terminates processing with the character before the 1Ah, just as if the length count of the stream had terminated the stream at that point.
For the most part, 3 byte alternate encoded characters appear on screen exactly the same way as their 1 byte counterparts. But, for characters which have some special meaning and interpretation to XyWrite in their 1 byte form (such as the Guillemets), a 3 byte encoding negates or cancels XyWrite's special interpretation of that character. This, for example, makes the guillemets available for simple printing by a user, but does so without (erroneously) triggering XyWrite to look for an embedded command.
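The effect can be illustrated with a small Python sketch that walks a byte stream and marks which characters arrived in 3 byte form. It is an illustration under my own assumptions, not XyWrite's scanner: it only recognizes a trigger followed by two upper-case hex digits, and the names tokens and command_openers are hypothetical.

```python
TRIGGER = 0xFF
LEFT = 0xAE                   # left guillemet in code page 437
HEX_DIGITS = b"0123456789ABCDEF"

def tokens(data):
    """Yield (byte_value, is_literal) pairs from a XyWrite-style stream.

    is_literal is True when the value came from a 3 byte encoding, in
    which case any special meaning of the 1 byte form is to be ignored.
    """
    i = 0
    while i < len(data):
        if (data[i] == TRIGGER and i + 2 < len(data)
                and data[i + 1] in HEX_DIGITS and data[i + 2] in HEX_DIGITS):
            yield int(data[i + 1:i + 3].decode("ascii"), 16), True
            i += 3
        else:
            yield data[i], False
            i += 1

def command_openers(data):
    """Count left guillemets that would start an embedded command."""
    return sum(1 for value, literal in tokens(data)
               if value == LEFT and not literal)

stream = b"\xaeRM72\xafHe said \xffAEhello\xffAF"
print(command_openers(stream))    # 1: only the 1 byte guillemet counts
```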
The characters which are frequently seen in their 3 byte forms in XyWrite are the null byte (00h), backspace (08h), tab (09h), LF (0Ah), CR (0Dh), dc1 (11h), EOF (1Ah), esc (1Bh), blank (20h), dash (2Dh), the decimal point character (normally=2Eh), del (7Fh), the left and right guillemets (0AEh and 0AFh), plus the 0FFh and 0FDh "trigger" characters. You can get the 3 byte version of these characters in either of two ways (a) by entering the character via the "ASCII decimal code" method, or (b) by entering the character via a "type 2 help frame." Neither of these methods, however, will yield a 3 byte encoding for other than just these 16 characters.
(Note that the decimal point character can be set to other then it's default value of "."(2Eh) -- a "default DP=," command will set it to a comma (",",2Ch), for example. If DP is set to ",", then a "," entered either via the "ASCII decimal code" method or a type 2 help frame will produce a 3 byte encoding, and the "." will no longer do so. In a pinch, I suppose, setting the DP to any character would apparently give a user a way to generate a 3 byte encoding of that character
The DP setting defines the character that is the basis for column alignment when "decimal tabs" are used. Having a 3 byte version of the DP character -- whatever the character may be -- provides a way to put the DP character on a line WITHOUT that character instance triggering the "decimal tab" alignment function. Note that the DP setting affects only the "decimal tabs" function -- it does not, for example, change the character used by XyWrite when it generates (within XPL) or displays a numerical result containing a XyWrite generated "decimal point.")
(The "ASCII decimal code" method is similar to the mode provided by a PC's BIOS under DOS, for entering any code page 437 character. For the BIOS/DOS case, this consists of holding down the Alt key while keying in the decimal value of the character on the keyboard's numeric keypad. In XyWrite's case, the idea is the same, but the actual keys used are defined via the keyboard file, and are usually different from those used by the BIOS for this function. The "type 2 help frame" is a mechanism provided within the XyWrite help system, wherein the graphics for all available characters can be displayed, at which point one "enters" a character by navigating to it with the cursor and pressing the Enter key.)
For the 16 characters indicated, either method (ASCII decimal code, type 2 help frame) generates a 3 byte encoded character, when used to enter a character in a file editing window. In XyWrite 3.57 at least, the ASCII decimal code method has a bug, and it often also enters a null (00h) character following the 3 byte encoded character, for data entered into a file. Such nulls, however, will be purged from the file the next time the file is loaded from disk by XyWrite, so they usually don't cause a problem.
When used on the command line, the "ASCII decimal code" method also produces 3 byte encodings for these 16 characters and 1 byte codes for all others. But, on the command line, the bug which produces an extraneous null byte for each 3 byte encoding does not occur, and no such null is generated.
In contrast to the "ASCII decimal code" method, the "type 2 help frame" method for entering characters ALWAYS creates 1 byte encodings, when it is used to enter data on the command line -- even for the 16 characters listed. This gives you the capability to create virtually any 3 byte encoding on the command line that you want to, including invalid encodings, by creating the three constituent bytes of that encoding, one byte at a time. Used unwisely, to create some types of invalid encodings, this capability will cause XyWrite to crash.
Although XyWrite will normally only generate 3 byte alternate encodings for the 16 (of 256 possible) characters indicated, it will recognize 3 byte encodings for the other 240 possible 3 byte encoded characters (utilizing the same encoding scheme), without any problem. XPL programs can easily be written to generate the other 240. They are not of great usefulness, but can be marginally useful, because they respond to (or fail to respond to) different search arguments than their 1 byte counterparts.
As was indicated, 3 byte encoded characters appear the same on screen as their 1 byte encoded equivalents. So, a question arises: how does one tell whether a specific character is 1 byte vs. 3 byte encoded? XyWrite provides no direct mechanism for knowing the difference (for most characters), but a small XPL program can easily be created to do a test. The 3 byte versions are detected by virtue of the fact that moving the cursor over a 3 byte encoding advances the cursor position (readable within XPL programs, via the «CP» command) within the file by 3, rather than by the usual 1.
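The cursor-advance test can be illustrated outside XyWrite with a small Python sketch. This is not XPL and not XyWrite's own code; it only models two rules stated in this note: a 0FFh byte starts a 3 byte unit, and a 1 byte 1Ah ends processing. The function names are hypothetical, the two bytes after 0FFh are treated as opaque placeholders (the sketch does not assume any particular encoding for them), and the 0FDh trigger is ignored entirely.

```python
TRIGGER = 0xFF   # starts a 3 byte unit, per the text
EOF_BYTE = 0x1A  # ends processing when met as a 1 byte character


def unit_widths(stream: bytes) -> list[int]:
    """Return the byte width of each on-screen unit in the stream."""
    widths = []
    i = 0
    while i < len(stream):
        b = stream[i]
        if b == EOF_BYTE:
            break  # stream ends with the character before the 1Ah
        width = 3 if b == TRIGGER else 1
        if i + width > len(stream):
            break  # truncated 3 byte unit (assumption: simply stop)
        widths.append(width)
        i += width
    return widths


def is_three_byte(stream: bytes, index: int) -> bool:
    """True if the unit at on-screen position `index` occupies 3 bytes."""
    return unit_widths(stream)[index] == 3
```

Stepping a cursor across the second unit of a stream such as "a", 0FFh plus two placeholder bytes, "b" advances the byte position by 3, which is exactly the signal an XPL test program would look for.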
Having defined 0FFh as a trigger character for 3 byte alternate encodings of the normal (code page 437) characters, XyWrite programmers apparently observed that they had burdened themselves with the rather tedious chore of *ALWAYS* having to look for and recognize 0FFh triggered 3 byte encodings, whenever they processed *any* file data for nearly any purpose. So having (unavoidably, I think, for any "word processor") introduced this kind of burden (or one similar to it), they decided to get the most out of it, and they added another group of 0FFh triggered, 3 byte encoded entities in addition to the 3 byte encodings of normal (code page 437) characters.
In discussing the keyboard file above (see "Overview - Keyboard"), we noted a set of about 300 two character primitive codes, like the "BD" code which represents the Backspace-Delete primitive. In the keyboard file, such codes can be distinguished from simple data codes (e.g., "a") because the primitive codes are a pair of characters, whereas simple data codes are a single character. But, in order to be able to distinguish between primitives and normal data in keyboard multi-element definitions, the keyboard file syntax requires a comma between each character or primitive in a definition. That's okay for a keyboard file, but is not an acceptable scheme in various other contexts.
So the second group of 0FFh triggered 3 byte encodings that XyWrite defined is a set of codes that correspond to the 300 or so two character primitive codes that are available to assign to a key in a keyboard file. When one of these codes appears in any file, XyWrite displays it on screen as a two character code in bold (i.e., highlighted), followed by what appears to be a blank. The two highlighted characters are the same 2 characters that would be used to invoke the same primitive function within a keyboard file.
The 3 byte encoded primitives are encoded as a 0FFh byte, followed by a byte having the value 80h, 81h, or 82h, followed by a byte which is always odd (not divisible by 2). There is no algorithm by which one can infer the two character code which will correspond to a particular 3 byte primitive encoding -- the conversion (in either direction) requires lookup tables.
It appears that, in decoding functional primitive triples, XyWrite III+ takes bytes 2 and 3 as a value, subtracts 8000h, resets the low order bit of the result, and uses that as an offset into one table of character pairs to get the two letter code for the function (overrun of the table is not checked for), and into another table to get a pointer to a procedure which implements the primitive's function. The use of only odd values in the third byte of the encoding seems to be designed to avoid problems with characters like nul (00h) and EOF (1Ah), but, barring problematic characters such as those, reducing the third byte by one seems to have no effect.
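The arithmetic just described is small enough to write out. The following Python sketch follows the apparent decoding steps; it is my reading of the observed behavior, not XyWrite's actual code. The function name is hypothetical, and treating byte 2 as the high-order byte of the value is an assumption (it is the reading under which "subtract 8000h" makes sense for second bytes of 80h, 81h, and 82h). Unlike XyWrite, the sketch rejects second bytes outside that range rather than overrunning a table.

```python
def primitive_table_offset(triple: bytes) -> int:
    """Compute the table offset for a 3 byte encoded primitive."""
    if len(triple) != 3 or triple[0] != 0xFF:
        raise ValueError("not a 0FFh triggered 3 byte unit")
    if triple[1] not in (0x80, 0x81, 0x82):
        raise ValueError("second byte is outside the 80h-82h primitive range")
    # Bytes 2 and 3 taken as one 16 bit value (assumed: byte 2 is the high byte).
    value = (triple[1] << 8) | triple[2]
    # Subtract 8000h, then reset (clear) the low order bit.
    return (value - 0x8000) & ~1
```

Because the low order bit is cleared, FF 80 01 and FF 80 00 give the same offset, which matches the observation that reducing the third byte by one has no effect. The offsets are always even, which fits a table whose entries are two character pairs.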
So. We have introduced 3 byte encoded primitives in our overview of XyWrite's "character set." Was that right -- should we think of them as "characters"? Certainly they are not, in the sense that they weren't "invented" for the purpose of printing and reading, as were the code page 437 characters. But, as a XyWrite user starts using the more advanced capabilities of XyWrite, these encodings will appear in files that the user is editing. The user will soon discover that 3 byte primitive encodings mostly behave in the same way as other "characters" -- in essence, the 3 byte encoded primitive is packaged as a unit and can only be manipulated as a unit.
But the importance of 3 byte encoded primitives as characters goes beyond the editing properties indicated above, in a way that may seem trivial, but which I think is actually quite profound. To illustrate that, we need to briefly look at XyWrite's programming language, called XPL. But, since we haven't yet even done our overview of XPL, I will use pseudo 'C' syntax for our discussion. (I say "pseudo 'C'" because I am assuming a language with a variable length string type, and which can do string assignments.)
XPL has only one function -- "RC" (Read Character) -- by which an XPL program can "read" input from the keyboard. It also has a command -- 'PV' -- which "puts" characters into a file, onto the command line, or wherever.
Now, consider the following program, which is simply a two statement infinite loop:
string s; while (true) {s = RC(); PV(s); }
This is a program that, for the most part, "works" in XyWrite. It completely transforms XyWrite, while at the same time seeming to change nothing at all.
When this program is running, keystrokes no longer *directly* do anything at all, as they would do when no XPL program is running. Instead, they are queued, until either there is no XPL program running (at which time XyWrite executes both queued and subsequent keystrokes), or until they are read by an RC() function call in an XPL program.
But what's significant here is that, having read the keystroke with RC(), all that an XPL program has to do to "pass on," or "execute," the keystroke, in the same way as would occur if the program wasn't there, is to 'put' the keystroke "character," using PV(). If the keystroke was an 'X', an 'X' character is inserted in the file (or onto the command line) just as it would have been if the above program were not there. If the keystroke was a CR (Cursor Right) primitive call, the cursor moves one to the right, again, just as if the program were not there. So, when the program above is running on XyWrite, a person at the keyboard sees no change in behavior at all -- he can't even tell that the program is running. (This is actually a bit of an overstatement, because there are a handful of primitives that don't stand on their own, and only have meaning when augmented by an additional keystroke. So there are a few primitives that take special handling which is a little more involved than the loop above. But, absent use of those few primitives, the above loop really does work, and is not readily detectable.)
The loop above "works," in essence, for both "data" keystrokes and for "functional" keystrokes (like CR), BECAUSE the primitive functions are included in the class called "characters", which is what the RC() (Read Character) function "reads."
Notice what we would need to do to change the above code into a "keystroke macro recorder." All that is required is to write:
string s, macro; while (true) {s = RC(); macro=concat(macro,s); PV(s); }
We are, of course, ignoring the code required to turn macro recording off and on. This is no easier or harder in XPL than with any other editor programming language.
To "play" our recorded keyboard macro, all we need do is write:
PV(macro);
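For readers who would like to see the record and playback idea actually run, here is a toy Python simulation. It is a sketch only: the Editor class, its three primitives (CL, CR, BD), and the convention that data characters are 1 character strings while primitives are 2 character strings are all assumptions made for illustration, and none of it reflects how XyWrite is implemented. A list of queued keystrokes stands in for the keyboard that RC() would read from.

```python
class Editor:
    """Toy stand-in for a XyWrite editing window (hypothetical model)."""

    def __init__(self):
        self.text = []    # one list element per data character
        self.cursor = 0   # insertion point, 0..len(text)

    def pv(self, units):
        """'Put' each unit: execute primitives, insert data characters."""
        for u in units:
            if u == "CR":      # Cursor Right
                self.cursor = min(self.cursor + 1, len(self.text))
            elif u == "CL":    # Cursor Left
                self.cursor = max(self.cursor - 1, 0)
            elif u == "BD":    # Backspace-Delete
                if self.cursor > 0:
                    self.cursor -= 1
                    del self.text[self.cursor]
            elif len(u) == 1:  # plain data character
                self.text.insert(self.cursor, u)
                self.cursor += 1
            else:
                raise ValueError(f"primitive not modeled: {u}")


def record(editor, keystrokes):
    """Pass each keystroke through to the editor while recording it."""
    macro = []
    for s in keystrokes:   # s = RC()
        macro.append(s)    # macro = concat(macro, s)
        editor.pv([s])     # PV(s)
    return macro
```

Recording ["a", "b", "c", "CL", "BD", "X"] in one Editor and then calling pv(macro) on a fresh Editor leaves both with the same text and cursor position. The point of the sketch is that record() needs no special case for primitives: data and functions travel through the same channel.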
In considering what we have just described, I should point out that the loop above that I described as a "keystroke macro recorder" really isn't exactly that. It's really a "keystroke function macro recorder." For example, I could, in the keyboard file, (stupidly) specify
59=DF
60=DF
which would result in both F1 (key 59) and F2 (key 60) being bound to the "DeFine text" (DF) primitive function. If I record a "keyboard macro" where either or both of these keys are pressed, all that I get in the keyboard macro are indications of the DF primitive function, but I have no way of determining from the macro whether it was F1 or F2 that was actually pressed when the macro was recorded. In my view, this is an advantage. Among other things, it generally means a XyWrite macro can be "played back" on a machine with different key bindings, but the macro will still do the intended function.
But, sometimes, identification of the actual key pressed is useful. As it turns out, not being able to identify the actual key apparently did turn out to be a problem for XyQuest, when it came to implementing the spell checker. When a misspelled word is found, the spell checker puts up a menu giving the user 8 options. For six of these options, the user selects the option via the F1 through F6 keys, respectively. Since the spell checker really doesn't know what primitives are assigned to these keys, it would have trouble determining when these keys have been pressed.
To solve this problem for the spell checker, XyWrite introduced 8 new primitives -- Q1 through Q8. In XyWrite's default keyboard file, primitives Q1 through Q6 are assigned to F1 through F6, in addition to whatever else is assigned to those keys. For F1, for example, the keyboard file entry is
59=Q1,DF
The idea here is that, in normal operation, the Q1-Q8 primitives are ignored, so the Q1 has no effect on F1's "normal" definition of DF (DeFine text). During spell check menu operation, all primitives EXCEPT Q1-Q8 are ignored, and F1 is detected via invocation of the Q1 primitive.
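The two-mode filtering described above can be sketched in a few lines of Python. This is only a model of the rule as stated in this note; the function name is hypothetical, a key definition is represented as a plain list of primitive codes, and how XyWrite treats ordinary data characters while the spell check menu is up is not modeled.

```python
# The eight special primitives, Q1 through Q8.
Q_PRIMS = {f"Q{n}" for n in range(1, 9)}


def effective_primitives(definition, spell_menu_active):
    """Return the primitives from a key definition that take effect."""
    if spell_menu_active:
        # Spell check menu: everything EXCEPT Q1-Q8 is ignored.
        return [p for p in definition if p in Q_PRIMS]
    # Normal operation: Q1-Q8 are ignored.
    return [p for p in definition if p not in Q_PRIMS]
```

With the F1 definition Q1,DF, normal operation yields just DF, and spell check menu operation yields just Q1, so the menu can recognize F1 without knowing what else is bound to that key.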
But the Q1-Q8 "hack" is not really a general solution to the problem, and there are other menus in XyWrite that require navigation via the keyboard. So there are still questions here as to how all of this should, and does, work, when the keyboard is used for purposes other than normal command line and file window editing.
So. Having briefly covered the XyWrite character set, with both 1 byte and 3 byte representations for data, and 3 byte representations for primitive functions, comes the bad news -- all of this is virtually undocumented in the XyWrite documentation. It's as if XyQuest was scared that any mention of 3 byte codes would terrify users and scare them away. But you don't have to use XyWrite all that long before you will get hopelessly confused by what is happening, if you don't have at least some idea of the underlying details of XyWrite's extended "character set." Fortunately, emergence of some amount of third party documentation, in user groups and books, helped bridge the huge documentation gap.
Having discussed the subtleties of the XyWrite character set, it is worth noting that a great deal of what was said above does not apply to processing of characters as they appear on the command line. On the command line, generally speaking, all characters are simply treated as 1 byte data characters. This includes the 0h, 0Ah, 0Dh, 1Ah, 0FDh, and the 0FFh characters.
When a string in a Save/Get (S/G) (discussed below) is transferred to the command line via a "«PV»" command (also discussed below), ALL 3 byte encoded data characters in that string are converted to 1 byte data characters before being transferred to the command line. This provides one of two mechanisms for introducing ANY 1 byte character onto the command line, including the 0h, 1Ah, 0FDh, and 0FFh characters. The other mechanism that can be used to introduce ANY 1 byte character onto the command line is the use of a type 2 help frame -- but this mechanism is of course available only if the loaded help file contains suitable type 2 help frames.
Although ANY 1 byte character can be introduced onto the command line by transferring a 3 byte encoded version to the command line from a S/G (a 1 byte version in the S/G will also do, for most characters), there are four characters that can only be introduced onto the command line from a S/G by using something that I call the "three byte ruse" (or similar techniques). These characters are the 0h (nul) character, the 0Ah (LF) character, the 0Dh (CR) character, and the 1Bh (Esc) character. The "three byte ruse" consists of placing two extra characters in front of the desired character within the S/G, and also placing 4 primitives (CL BD BD CR ) in the S/G following the character, before doing the transfer. The first of the two added characters is a 3 byte encoded 0FFh character, and the second character can be nearly any character (I usually just use an 'X'). The transfer of the first 0FFh character to the command line effectively causes XyWrite to "believe" that a 3 byte encoded data character or a 3 byte encoded primitive is being passed to the command line. This causes XyWrite to suspend any attempt to interpret or translate either of the next two characters. After all three characters have been passed, the four added primitives are executed, which do a cursor left, backspace delete, backspace delete, and cursor right sequence, to delete the (now 1 byte) 0FFh character and the 'X' from the command line, thus leaving only the desired character.
For 0h, 0Ah, 0Dh, and 1Bh, failure to use the "three byte ruse" results in various actions when one attempts to transfer the character to the command line. A 0h is simply discarded. A 1Bh (Esc) char is translated to 10h. A 0Ah (LF) char is translated to a 1Bh (Esc). And a 0Dh (CR) character causes the same action as the XC primitive -- nothing is transferred, but the command line is executed as it stood at the time the 0Dh byte transfer was attempted. Any of these actions can be avoided by using the "three byte ruse."
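The transfer rules and the "three byte ruse" can be traced step by step with a toy Python model. This is a sketch under stated assumptions, not a description of XyWrite's code: the function name is hypothetical, integers stand for 1 byte data characters (that is, after any 3 byte data encodings in the S/G have already been converted to 1 byte), strings stand for primitives, only CL, CR, and BD are modeled, and the model simply stops when a 0Dh triggers execution.

```python
def transfer_to_command_line(units):
    """Model a S/G to command line transfer; returns (line, executed)."""
    line, cursor = [], 0
    literal = 0  # data bytes still exempt from translation after a 0FFh
    for u in units:
        if isinstance(u, str):            # a primitive
            if u == "CL":
                cursor = max(cursor - 1, 0)
            elif u == "CR":
                cursor = min(cursor + 1, len(line))
            elif u == "BD" and cursor > 0:
                cursor -= 1
                del line[cursor]
            continue
        if literal:
            literal -= 1                  # passed through untouched
        elif u == 0xFF:
            literal = 2                   # next two bytes are not translated
        elif u == 0x00:
            continue                      # nul is discarded
        elif u == 0x1B:
            u = 0x10                      # Esc becomes 10h
        elif u == 0x0A:
            u = 0x1B                      # LF becomes Esc
        elif u == 0x0D:
            return line, True             # CR executes the line as it stands
        line.insert(cursor, u)
        cursor += 1
    return line, False


def three_byte_ruse(byte):
    """Wrap one byte in the ruse: 0FFh, filler 'X', byte, CL BD BD CR."""
    return [0xFF, ord("X"), byte, "CL", "BD", "BD", "CR"]
```

Tracing three_byte_ruse(0x0D): the line becomes FF, 'X', 0D with the cursor at the end; CL steps back over the 0D; the two BDs remove the 'X' and then the FF; CR steps forward again, leaving a lone 0Dh on the line, unexecuted.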
When the command line is subsequently actually executed, any 1Bh (Esc) character is translated to a 0Dh,0Ah (CR,LF) pair, and any 10h byte is translated to a 1Bh (Esc). A 0h byte followed by an A, L, N, S, X, or W, within a search argument, is interpreted as a "wildcard," for any Alphanumeric, Letter, Number, Separator char, any single char, or any string of length 80 or less, respectively. Note that this means that there is no way in XyWrite to specify a 10h character as part of either a search or change-to argument, since any 10h character is translated to a 1Bh (Esc) character before the command line is interpreted.
Similarly, when a command line is captured by an XPL program via S/G 00 («IS00») (discussed below), any 1Bh (Esc) character is translated to a 0Dh,0Ah (CR,LF) pair. But 10h characters are NOT translated (back) to 1Bh characters in the process of capturing a command line.
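The two translations just described (at execution time, and at capture time via S/G 00) differ only in how they treat 10h, which a short Python sketch makes easy to see. The function names are hypothetical, and the sketch assumes the translation is a single pass over the line, so that a 1Bh produced from a 10h is not then turned into a CR,LF pair; that reading fits the Esc to 10h to Esc round trip, but I have not verified how XyWrite does it internally.

```python
def translate_for_execution(line: bytes) -> bytes:
    """Translation applied when the command line is executed."""
    out = bytearray()
    for b in line:
        if b == 0x1B:
            out += b"\r\n"    # Esc becomes a CR,LF pair
        elif b == 0x10:
            out.append(0x1B)  # 10h becomes Esc
        else:
            out.append(b)
    return bytes(out)


def translate_for_capture(line: bytes) -> bytes:
    """Translation applied when an XPL program captures the line."""
    out = bytearray()
    for b in line:
        if b == 0x1B:
            out += b"\r\n"    # Esc becomes a CR,LF pair
        else:
            out.append(b)     # 10h is left alone here
    return bytes(out)
```

Either way, no input byte can come out the other side as 10h on execution, which is the reason a 10h cannot be used in a search or change-to argument.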
The WA /WL /WN /WS /WX /WW "wildcard" primitives place the same 0h,A/L/N/S/X/W (respectively) two byte sequences onto the command line that we noted above for wildcard activity, and, as search arguments, these behave the same as the 0h,A/L/N/S/X/W sequences we were able to introduce to the command line via the "three byte ruse." The WA /WL /WN /WS /WX /WW introduced sequences display differently, however, and are displayed as a single A/L/N/S/X/W character byte in reverse video, rather than being displayed as two "normal" characters (the first of these -- the 0h byte -- appearing as a blank). The WA /WL /WN /WS /WX /WW primitives are, of course, the "normal" way for introducing these wildcards onto the command line.
To search for a 3 byte encoded data character, or a 3 byte encoded primitive, within the text of a file, a user must place 3 separate characters -- characters which correspond to the three bytes of the encoded character or primitive -- onto the command line. The same technique can also be used in the "change-to" argument string of a change command, to create 3 byte encodings in file text as a result of a change command.