git/Documentation/technical/protocol-common.txt
Jeff King 6cf378f0cb docs: stop using asciidoc no-inline-literal
In asciidoc 7, backticks like `foo` produced a typographic
effect, but did not otherwise affect the syntax. In asciidoc
8, backticks introduce an "inline literal" inside which markup
is not interpreted. To keep compatibility with existing
documents, asciidoc 8 has a "no-inline-literal" attribute to
keep the old behavior. We enabled this so that the
documentation could be built on either version.

It has been several years now, and asciidoc 7 is no longer
in wide use. We can now decide whether or not we want
inline literals on their own merits, which are:

  1. The source is much easier to read when the literal
     contains punctuation. You can use `master~1` instead
     of `master{tilde}1`.

  2. They are less error-prone. Because of point (1), we
     tend to make mistakes and forget the extra layer of
     quoting.

This patch removes the no-inline-literal attribute from the
Makefile and converts every use of backticks in the
documentation to an inline literal (they must be cleaned up,
or the example above would literally show "{tilde}" in the
output).

Problematic sites were found by grepping for '`.*[{\\]' and
examined and fixed manually. The results were then verified
by comparing the output of "html2text" on the set of
generated html pages. Doing so revealed that in addition to
making the source more readable, this patch fixes several
formatting bugs:

  - HTML rendering used the ellipsis character instead of
    literal "..." in code examples (like "git log A...B")

  - some code examples used the right-arrow character
    instead of '->' because they failed to quote

  - api-config.txt did not quote tilde, and the resulting
    HTML contained a bogus snippet like:

      <tt><sub></tt> foo <tt></sub>bar</tt>

    which caused some parsers to choke and omit whole
    sections of the page.

  - git-commit.txt confused ``foo`` (backticks inside a
    literal) with ``foo'' (matched double-quotes)

  - mentions of `A U Thor <author@example.com>` used to
    erroneously auto-generate a mailto footnote for
    author@example.com

  - the description of --word-diff=plain incorrectly showed
    the output as "[-removed-] and {added}", not "{+added+}".

  - using "prime" notation like:

      commit `C` and its replacement `C'`

    confused asciidoc into thinking that everything between
    the first backtick and the final apostrophe were meant
    to be inside matched quotes

  - asciidoc got confused by the escaping of some of our
    asterisks. In particular,

      `credential.\*` and `credential.<url>.\*`

    properly escaped the asterisk in the first case, but
    literally passed through the backslash in the second
    case.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-26 13:19:06 -07:00

96 lines
2.7 KiB
Text

Documentation Common to Pack and Http Protocols
===============================================
ABNF Notation
-------------
ABNF notation as described by RFC 5234 is used within the protocol documents,
except the following replacement core rules are used:
----
HEXDIG = DIGIT / "a" / "b" / "c" / "d" / "e" / "f"
----
We also define the following common rules:
----
NUL = %x00
zero-id = 40*"0"
obj-id = 40*(HEXDIGIT)
refname = "HEAD"
refname /= "refs/" <see discussion below>
----
A refname is a hierarchical octet string beginning with "refs/" and
not violating the 'git-check-ref-format' command's validation rules.
More specifically, they:
. They can include slash `/` for hierarchical (directory)
grouping, but no slash-separated component can begin with a
dot `.`.
. They must contain at least one `/`. This enforces the presence of a
category like `heads/`, `tags/` etc. but the actual names are not
restricted.
. They cannot have two consecutive dots `..` anywhere.
. They cannot have ASCII control characters (i.e. bytes whose
values are lower than \040, or \177 `DEL`), space, tilde `~`,
caret `^`, colon `:`, question-mark `?`, asterisk `*`,
or open bracket `[` anywhere.
. They cannot end with a slash `/` nor a dot `.`.
. They cannot end with the sequence `.lock`.
. They cannot contain a sequence `@{`.
. They cannot contain a `\\`.
pkt-line Format
---------------
Much (but not all) of the payload is described around pkt-lines.
A pkt-line is a variable length binary string. The first four bytes
of the line, the pkt-len, indicates the total length of the line,
in hexadecimal. The pkt-len includes the 4 bytes used to contain
the length's hexadecimal representation.
A pkt-line MAY contain binary data, so implementors MUST ensure
pkt-line parsing/formatting routines are 8-bit clean.
A non-binary line SHOULD BE terminated by an LF, which if present
MUST be included in the total length.
The maximum length of a pkt-line's data component is 65520 bytes.
Implementations MUST NOT send pkt-line whose length exceeds 65524
(65520 bytes of payload + 4 bytes of length data).
Implementations SHOULD NOT send an empty pkt-line ("0004").
A pkt-line with a length field of 0 ("0000"), called a flush-pkt,
is a special case and MUST be handled differently than an empty
pkt-line ("0004").
----
pkt-line = data-pkt / flush-pkt
data-pkt = pkt-len pkt-payload
pkt-len = 4*(HEXDIG)
pkt-payload = (pkt-len - 4)*(OCTET)
flush-pkt = "0000"
----
Examples (as C-style strings):
----
pkt-line actual value
---------------------------------
"0006a\n" "a\n"
"0005a" "a"
"000bfoobar\n" "foobar\n"
"0004" ""
----