The Jargon File, Version 2.9.10, 01 Jul 1992
:Hacker Writing Style:
======================
We've already seen that hackers often coin jargon by overgeneralizing grammatical rules. This is one aspect of a more general fondness for form-versus-content language jokes that shows up particularly in hackish writing. One correspondent reports that he consistently misspells `wrong' as `worng'. Others have been known to criticize glitches in Jargon File drafts by observing (in the mode of Douglas Hofstadter) "This sentence no verb", or "Bad speling", or "Incorrectspa cing." Similarly, intentional spoonerisms are often made of phrases relating to confusion or things that are confusing; `dain bramage' for `brain damage' is perhaps the most common (similarly, a hacker would be likely to write "Excuse me, I'm cixelsyd today", rather than "I'm dyslexic today"). This sort of thing is quite common and is enjoyed by all concerned.
Hackers tend to use quotes as balanced delimiters like parentheses, much to the dismay of American editors. Thus, if "Jim is going" is a phrase, and so are "Bill runs" and "Spock groks", then hackers generally prefer to write: "Jim is going", "Bill runs", and "Spock groks". This is incorrect according to standard American usage (which would put the continuation commas and the final period inside the string quotes); however, it is counter-intuitive to hackers to mutilate literal strings with characters that don't belong in them. Given the sorts of examples that can come up in discussions of programming, American-style quoting can even be grossly misleading. When communicating command lines or small pieces of code, extra characters can be a real pain in the neck.
Consider, for example, a sentence in a {vi} tutorial that looks like this:
     Then delete a line from the file by typing "dd".
Standard usage would make this
     Then delete a line from the file by typing "dd."
but that would be very bad -- because the reader would be prone to type the string d-d-dot, and it happens that in `vi(1)' dot repeats the last command accepted. The net result would be to delete *two* lines!
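A toy model makes the hazard concrete. The sketch below is not vi; it assumes a hypothetical `run_keys` helper that understands only "dd" (delete the current line) and "." (repeat the last command), with the cursor always on the first line.

```python
def run_keys(lines, keys):
    """Apply a tiny subset of vi normal-mode keys to a list of lines.

    Only "dd" (delete current line) and "." (repeat last command) are
    modelled, and the cursor is assumed to stay on the first line.
    """
    buffer = list(lines)
    last_command = None
    i = 0
    while i < len(keys):
        if keys.startswith("dd", i):
            command = "dd"
            i += 2
        elif keys[i] == ".":
            command = last_command  # repeat whatever ran last, if anything
            i += 1
        else:
            raise ValueError(f"unsupported key: {keys[i]!r}")
        if command == "dd" and buffer:
            del buffer[0]
            last_command = "dd"
    return buffer


text = ["one", "two", "three"]
print(run_keys(text, "dd"))   # ['two', 'three']
print(run_keys(text, "dd."))  # ['three']
```

The stray dot inside the quotes costs the reader a second line, which is exactly the objection to American-style quoting here.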
The Jargon File follows hackish usage throughout.
Interestingly, a similar style is now preferred practice in Great Britain, though the older style (which became established for typographical reasons having to do with the aesthetics of comma and quotes in typeset text) is still accepted there. `Hart's Rules' and the `Oxford Dictionary for Writers and Editors' call the hacker-like style `new' or `logical' quoting.
Another hacker quirk is a tendency to distinguish between `scare' quotes and `speech' quotes; that is, to use British-style single quotes for marking and reserve American-style double quotes for actual reports of speech or text included from elsewhere. Interestingly, some authorities describe this as correct general usage, but mainstream American English has gone to using double-quotes indiscriminately enough that hacker usage appears marked [and, in fact, I thought this was a personal quirk of mine until I checked with USENET -- ESR]. One further permutation that is definitely *not* standard is a hackish tendency to do marking quotes by using apostrophes (single quotes) in pairs; that is, 'like this'. This is modelled on string and character literal syntax in some programming languages (reinforced by the fact that many character-only terminals display the apostrophe in typewriter style, as a vertical single quote).
One quirk that shows up frequently in the {email} style of UNIX hackers in particular is a tendency for some things that are normally all-lowercase (including usernames and the names of commands and C routines) to remain uncapitalized even when they occur at the beginning of sentences. It is clear that, for many hackers, the case of such identifiers becomes a part of their internal representation (the `spelling') and cannot be overridden without mental effort (an appropriate reflex because UNIX and C both distinguish cases and confusing them can lead to {lossage}). A way of escaping this dilemma is simply to avoid using these constructions at the beginning of sentences.
There seems to be a meta-rule behind these nonstandard hackerisms to the effect that precision of expression is more important than conformance to traditional rules; where the latter create ambiguity or lose information they can be discarded without a second thought. It is notable in this respect that other hackish inventions (for example, in vocabulary) also tend to carry very precise shades of meaning even when constructed to appear slangy and loose. In fact, to a hacker, the contrast between `loose' form and `tight' content in jargon is a substantial part of its humor!
Hackers have also developed a number of punctuation and emphasis conventions adapted to single-font all-ASCII communications links, and these are occasionally carried over into written documents even when normal means of font changes, underlining, and the like are available.
One of these is that TEXT IN ALL CAPS IS INTERPRETED AS `LOUD', and this becomes such an ingrained synesthetic reflex that a person who goes to caps-lock while in {talk mode} may be asked to "stop shouting, please, you're hurting my ears!".
Also, it is common to use bracketing with unusual characters to signify emphasis. The asterisk is most common, as in "What the *hell*?" even though this interferes with the common use of the asterisk suffix as a footnote mark. The underscore is also common, suggesting underlining (this is particularly common with book titles; for example, "It is often alleged that Joe Haldeman wrote _The_Forever_War_ as a rebuttal to Robert Heinlein's earlier novel of the future military, _Starship_Troopers_."). Other forms exemplified by "=hell=", "\hell/", or "/hell/" are occasionally seen (it's claimed that in the last example the first slash pushes the letters over to the right to make them italic, and the second keeps them from falling over). Finally, words may also be emphasized L I K E T H I S, or by a series of carets (^) under them on the next line of the text.
There is a semantic difference between *emphasis like this* (which emphasizes the phrase as a whole), and *emphasis* *like* *this* (which suggests the writer speaking very slowly and distinctly, as if to a very young child or a mentally impaired person). Bracketing a word with the `*' character may also indicate that the writer wishes readers to consider that an action is taking place or that a sound is being made. Examples: *bang*, *hic*, *ring*, *grin*, *kick*, *stomp*, *mumble*.
There is also an accepted convention for `writing under erasure'; the text
     Be nice to this fool^H^H^H^Hgentleman, he's in from corporate HQ.
would be read as "Be nice to this fool, I mean this gentleman…". This comes from the fact that the digraph ^H is often used as a print representation for a backspace. It parallels (and may have been influenced by) the ironic use of `slashouts' in science-fiction fanzines.
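For the curious, here is roughly what a terminal would do with that text if each ^H digraph were a real backspace. This is a minimal sketch; `apply_erasures` is a name invented for illustration.

```python
def apply_erasures(text):
    """Render text in which the digraph ^H stands for one backspace."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("^H", i):
            if out:
                out.pop()  # a backspace rubs out the previous character
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


sample = "Be nice to this fool^H^H^H^Hgentleman, he's in from corporate HQ."
print(apply_erasures(sample))
# Be nice to this gentleman, he's in from corporate HQ.
```

The joke, of course, depends on the erasure *not* being applied: the reader is meant to see both the word and its retraction.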
In a formula, `*' signifies multiplication but two asterisks in a row are a shorthand for exponentiation (this derives from FORTRAN). Thus, one might write 2 ** 8 = 256.
Another notation for exponentiation one sees more frequently uses the caret (^, ASCII 1011110); one might write instead `2^8 = 256'. This goes all the way back to Algol-60, which used the archaic ASCII `up-arrow' that later became the caret; this was picked up by Kemeny and Kurtz's original BASIC, which in turn influenced the design of the `bc(1)' and `dc(1)' UNIX tools, which have probably done most to reinforce the convention on USENET. The notation is mildly confusing to C programmers, because `^' means bitwise {XOR} in C. Despite this, it was favored 3:1 over ** in a late-1990 snapshot of USENET. It is used consistently in this text.
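The clash is easy to demonstrate in any language that, like C, reads `^' as bitwise XOR. Python is used here as an illustration because it happens to have both the FORTRAN-style `**' and the C-style `^'; the variable names are arbitrary.

```python
power = 2 ** 8   # exponentiation, FORTRAN style
xor = 2 ^ 8      # bitwise XOR, as in C: 0b0010 XOR 0b1000 = 0b1010
caret_bits = format(ord("^"), "07b")  # the caret's 7-bit ASCII code

print(power)       # 256
print(xor)         # 10
print(caret_bits)  # 1011110
```

So a formula such as `2^8 = 256' is fine as prose notation, but pasted into code of this kind it quietly evaluates to 10.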
In on-line exchanges, hackers tend to use decimal forms or improper fractions (`3.5' or `7/2') rather than `typewriter style' mixed fractions (`3-1/2'). The major motive here is probably that the former are more readable in a monospaced font, together with a desire to avoid the risk that the latter might be read as `three minus one-half'. The decimal form is definitely preferred for fractions with a terminating decimal representation; there may be some cultural influence here from the high status of scientific notation.
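The `three minus one-half' misreading is not hypothetical; it is what you get if the mixed fraction is treated as an arithmetic expression. A small Python sketch (the variable names are illustrative only):

```python
from fractions import Fraction

improper = Fraction(7, 2)   # the improper fraction 7/2
decimal = float(improper)   # the decimal form, 3.5
misread = 3 - 1 / 2         # `3-1/2' read as arithmetic

print(decimal)   # 3.5
print(misread)   # 2.5
```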
Another on-line convention, used especially for very large or very small numbers, is taken from C (which derived it from FORTRAN). This is a form of `scientific notation' using `e' to replace `*10^'; for example, one year is about 3e7 seconds long.
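The estimate is easy to check. The sketch below assumes a 365-day year and uses Python, which accepts the same `e' notation in literals; note that code must spell the power as `**' even though this text writes `^'.

```python
approx = 3e7                           # `e' notation for 3 * 10^7
seconds_per_year = 365 * 24 * 60 * 60  # assuming a 365-day year

print(approx == 3 * 10 ** 7)        # True
print(seconds_per_year)             # 31536000
print(f"{seconds_per_year:.0e}")    # 3e+07
```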
The tilde (~) is commonly used in a quantifying sense of `approximately'; that is, `~50' means `about fifty'.
On USENET and in the {MUD} world, common C boolean, logical, and relational operators such as `|', `&', `||', `&&', `!', `==', `!=', `>', `<', `>=', and `<=' are often combined with English. The Pascal not-equals, `<>', is also recognized, and occasionally one sees `/=' for not-equals (from Ada, Common Lisp, and Fortran 90). The use of prefix `!' as a loose synonym for `not-' or `no-' is particularly common; thus, `!clue' is read `no-clue' or `clueless'.
A related practice borrows syntax from preferred programming languages to express ideas in a natural-language text. For example, one might see the following:
     I recently had occasion to field-test the Snafu
     Systems 2300E adaptive gonkulator. The price was
     right, and the racing stripe on the case looked kind
     of neat, but its performance left something to be
     desired.

     #ifdef FLAME
     Hasn't anyone told those idiots that you can't get
     decent bogon suppression with AFJ filters at today's
     net speeds?
     #endif /* FLAME */

     I guess they figured the price premium for true
     frame-based semantic analysis was too high.
     Unfortunately, it's also the only workable approach.
     I wouldn't recommend purchase of this product unless
     you're on a *very* tight budget.

     #include <disclaimer.h>
     --
                      == Frank Foonly (Fubarco Systems)
In the above, the `#ifdef'/`#endif' pair is a conditional compilation syntax from C; here, it implies that the text between (which is a {flame}) should be evaluated only if you have turned on (or defined on) the switch FLAME. The `#include' at the end is C for "include standard disclaimer here"; the `standard disclaimer' is understood to read, roughly, "These are my personal opinions and not to be construed as the official position of my employer."
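To make the reading rule explicit, here is a minimal sketch of how a (purely hypothetical) reader program might `evaluate' such a post. It is not a C preprocessor: it assumes no nesting and no `#else', treats every `#include' as the standard disclaimer, and the names `render' and `STANDARD_DISCLAIMER' are invented for illustration. The sample post is an abridged paraphrase of the one above.

```python
STANDARD_DISCLAIMER = (
    "These are my personal opinions and not to be construed as "
    "the official position of my employer."
)


def render(message, defined=()):
    """Drop #ifdef NAME ... #endif sections unless NAME is in `defined`.

    Simplifications: no nesting, no #else, and any #include line is
    replaced by the standard disclaimer regardless of the file named.
    """
    kept = []
    skipping = False
    for line in message.splitlines():
        words = line.split()
        if words and words[0] == "#ifdef":
            skipping = len(words) < 2 or words[1] not in defined
        elif words and words[0] == "#endif":
            skipping = False
        elif words and words[0] == "#include":
            if not skipping:
                kept.append(STANDARD_DISCLAIMER)
        elif not skipping:
            kept.append(line)
    return "\n".join(kept)


post = "\n".join([
    "Its performance left something to be desired.",
    "#ifdef FLAME",
    "Hasn't anyone told those idiots about bogon suppression?",
    "#endif /* FLAME */",
    "#include <disclaimer.h>",
])

print(render(post))                     # flame omitted
print(render(post, defined={"FLAME"}))  # flame included
```

A reader with FLAME undefined sees only the polite review and the disclaimer; defining FLAME turns the rant back on.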
Another habit is that of using angle-bracket enclosure to genericize a term; this derives from conventions used in {BNF}. Uses like the following are common:
     So this <ethnic> walks into a bar one day, and…
Hackers also mix letters and numbers more freely than in mainstream usage. In particular, it is good hackish style to write a digit sequence where you intend the reader to understand the text string that names that number in English. So, hackers prefer to write `1970s' rather than `nineteen-seventies' or `1970's' (the latter looks like a possessive).
It should also be noted that hackers exhibit much less reluctance to use multiply nested parentheses than is normal in English. Part of this is almost certainly due to influence from LISP (which uses deeply nested parentheses (like this (see?)) in its syntax a lot), but it has also been suggested that a more basic hacker trait of enjoying playing with complexity and pushing systems to their limits is in operation.
One area where hackish conventions for on-line writing are still in some flux is the marking of included material from earlier messages -- what would be called `block quotations' in ordinary English. From the usual typographic convention employed for these (smaller font at an extra indent), there derived the notation of included text being indented by one ASCII TAB (0001001) character, which under UNIX and many other environments gives the appearance of an 8-space indent.
Early mail and netnews readers had no facility for including messages this way, so people had to paste in copy manually. BSD `Mail(1)' was the first message agent to support inclusion, and early USENETters emulated its style. But the TAB character tended to push included text too far to the right (especially in multiply nested inclusions), leading to ugly wraparounds. After a brief period of confusion (during which an inclusion leader consisting of three or four spaces became established in EMACS and a few mailers), the use of leading `>' or `> ' became standard, perhaps owing to its use in `ed(1)' to display tabs (alternatively, it may derive from the `>' that some early UNIX mailers used to quote lines starting with "From" in text, so they wouldn't look like the beginnings of new message headers). Inclusions within inclusions keep their `>' leaders, so the `nesting level' of a quotation is visually apparent.
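The mechanics of `>'-inclusion are simple enough to sketch. The two helpers below (`include' and `nesting_level', both names invented here) are an illustration of the convention, not the code of any actual mailer or newsreader; the message text is made up.

```python
def include(text, leader="> "):
    """Prefix every line of text with a citation leader."""
    return "\n".join(leader + line for line in text.splitlines())


def nesting_level(line, mark=">"):
    """Count leading citation marks, ignoring spaces between them."""
    level = 0
    for ch in line:
        if ch == mark:
            level += 1
        elif ch != " ":
            break
    return level


first = "Is TAB indentation good enough?"
reply = include(first) + "\nNo, it wraps too soon."
followup = include(reply) + "\nAgreed."

print(followup)
# > > Is TAB indentation good enough?
# > No, it wraps too soon.
# Agreed.
print([nesting_level(line) for line in followup.splitlines()])  # [2, 1, 0]
```

Because each inclusion simply prepends another leader, the depth of the leader records how many messages back a line originated.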
A few other idiosyncratic quoting styles survive because they are automatically generated. One particularly ugly one looks like this:
     /* Written hh:mm pm  Mmm dd, yyyy by user@site in <group> */
     /* ---------- "Article subject, chopped to 35 ch" ---------- */
     <quoted text>
     /* End of text from local:group */
It is generated by an elderly, variant news-reading system called `notesfiles'. The overall trend, however, is definitely away from such verbosity.
The practice of including text from the parent article when posting a followup helped solve what had been a major nuisance on USENET: the fact that articles do not arrive at different sites in the same order. Careless posters used to post articles that would begin with, or even consist entirely of, "No, that's wrong" or "I agree" or the like. It was hard to see who was responding to what. Consequently, around 1984, new news-posting software evolved a facility to automatically include the text of a previous article, marked with "> " or whatever the poster chose. The poster was expected to delete all but the relevant lines. The result has been that, now, careless posters post articles containing the *entire* text of a preceding article, *followed* only by "No, that's wrong" or "I agree".
Many people feel that this cure is worse than the original disease, and there soon appeared newsreader software designed to let the reader skip over included text if desired. Today, some posting software rejects articles containing too high a proportion of lines beginning with `>' -- but this too has led to undesirable workarounds, such as the deliberate inclusion of zero-content filler lines which aren't quoted and thus pull the message below the rejection threshold.
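A sketch of such a check, and of why padding defeats it, follows. The 50% threshold, the decision to ignore blank lines, and the names `quoted_fraction' and `accept' are all assumptions made for illustration; the text does not say what rule any real posting program applies.

```python
def quoted_fraction(article):
    """Fraction of non-blank lines that begin with the `>' leader."""
    lines = [line for line in article.splitlines() if line.strip()]
    if not lines:
        return 0.0
    quoted = sum(1 for line in lines if line.startswith(">"))
    return quoted / len(lines)


def accept(article, max_quoted=0.5):
    """Hypothetical posting rule: reject if too much of the text is quoted."""
    return quoted_fraction(article) <= max_quoted


lazy = "> line one\n> line two\n> line three\nI agree"
padded = lazy + "\nfiller\nfiller\nfiller"

print(quoted_fraction(lazy), accept(lazy))   # 0.75 False
print(accept(padded))                        # True
```

Three lines of filler drag the quoted share from 3/4 down to 3/7, and the article sails through with no more content than before.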
Because the default mailers supplied with UNIX and other operating systems haven't evolved as quickly as human usage, the older conventions using a leading TAB or three or four spaces are still alive; however, >-inclusion is now clearly the prevalent form in both netnews and mail.
In 1991 practice is still evolving, and disputes over the `correct' inclusion style occasionally lead to {holy wars}. One variant style reported uses the citation character `|' in place of `>' for extended quotations where original variations in indentation are being retained. One also sees different styles of quoting a number of authors in the same message: one (deprecated because it loses information) uses a leader of `> ' for everyone, another (the most common) is `> > > > ', `> > > ', etc. (or `>>>> ', `>>> ', etc., depending on line length and nesting depth) reflecting the original order of messages, and yet another is to use a different citation leader for each author, say `> ', `: ', `| ', `} ' (preserving nesting so that the inclusion order of messages is still apparent, or tagging the inclusions with authors' names). Yet *another* style is to use each poster's initials (or login name) as a citation leader for that poster. Occasionally one sees a `# ' leader used for quotations from authoritative sources such as standards documents; the intended allusion is to the root prompt (the special UNIX command prompt issued when one is running as the privileged super-user).
Finally, it is worth mentioning that many studies of on-line communication have shown that electronic links have a de-inhibiting effect on people. Deprived of the body-language cues through which emotional state is expressed, people tend to forget everything about other parties except what is presented over that ASCII link. This has both good and bad effects. The good one is that it encourages honesty and tends to break down hierarchical authority relationships; the bad is that it may encourage depersonalization and gratuitous rudeness. Perhaps in response to this, experienced netters often display a sort of conscious formal politesse in their writing that has passed out of fashion in other spoken and written media (for example, the phrase "Well said, sir!" is not uncommon).
Many introverted hackers who are next to inarticulate in person communicate with considerable fluency over the net, perhaps precisely because they can forget on an unconscious level that they are dealing with people and thus don't feel stressed and anxious as they would face to face.
Though it is considered gauche to publicly criticize posters for poor spelling or grammar, the network places a premium on literacy and clarity of expression. It may well be that future historians of literature will see in it a revival of the great tradition of personal letters as art.