From 209f129857bddd653e4ea2af2d310cbb8d8c8b5f Mon Sep 17 00:00:00 2001
From: "Shawn O. Pearce"
Date: Thu, 8 Feb 2007 01:35:37 -0500
Subject: [PATCH 1/4] Correct ^0 asciidoc syntax in fast-import docs.

I wrote this documentation with asciidoc 7.1.2, but apparently
asciidoc 8 assumes ^ means superscript. The solution was already
documented in rev-parse's manpage and is to use {caret} instead.

Signed-off-by: Shawn O. Pearce
---
 Documentation/git-fast-import.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
index 0b64d3348b..0c4476109d 100644
--- a/Documentation/git-fast-import.txt
+++ b/Documentation/git-fast-import.txt
@@ -380,9 +380,9 @@ current branch value should be written as:
 ----
 	from refs/heads/branch^0
 ----
-The `^0` suffix is necessary as gfi does not permit a branch to
+The `{caret}0` suffix is necessary as gfi does not permit a branch to
 start from itself, and the branch is created in memory before the
-`from` command is even read from the input. Adding `^0` will force
+`from` command is even read from the input. Adding `{caret}0` will force
 gfi to resolve the commit through Git's revision parsing library,
 rather than its internal branch table, thereby loading in the
 existing value of the branch.

From f842fdb01da6037a8be4cf7f084bc6030f1eea5f Mon Sep 17 00:00:00 2001
From: "Shawn O. Pearce"
Date: Thu, 8 Feb 2007 01:53:48 -0500
Subject: [PATCH 2/4] Correct some language in fast-import documentation.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Minor documentation improvements, as suggested on the Git mailing
list by Horst H. von Brand and Karl Hasselström.

Signed-off-by: Shawn O. Pearce
---
 Documentation/git-fast-import.txt | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
index 0c4476109d..01f4c8aadc 100644
--- a/Documentation/git-fast-import.txt
+++ b/Documentation/git-fast-import.txt
@@ -181,7 +181,7 @@ If the local offset is not available in the source material, use
 ``+0000'', or the most common local offset. For example many
 organizations have a CVS repository which has only ever been accessed
 by users who are located in the same location and timezone. In this
-case the offset from UTC can be easily assumed.
+case a reasonable offset from UTC could be assumed.
 +
 Unlike the `rfc2822` format, this format is very strict. Any
 variation in formatting will cause gfi to reject the value.
@@ -190,7 +190,7 @@ variation in formatting will cause gfi to reject the value.
 	This is the standard email format as described by RFC 2822.
 +
 An example value is ``Tue Feb 6 11:22:18 2007 -0500''. The Git
-parser is accurate, but a little on the lenient side. Its the
+parser is accurate, but a little on the lenient side. It is the
 same parser used by gitlink:git-am[1] when applying patches
 received from email.
 +
@@ -205,14 +205,15 @@ contained in an RFC 2822 date string is used to adjust the date
 value to UTC prior to storage. Therefore it is important that
 this information be as accurate as possible.
 +
-If the source material is formatted in RFC 2822 style dates,
+If the source material uses RFC 2822 style dates,
 the frontend should let gfi handle the parsing and conversion
 (rather than attempting to do it itself) as the Git parser has
 been well tested in the wild.
 +
 Frontends should prefer the `raw` format if the source material
-is already in UNIX-epoch format, or is easily convertible to
-that format, as there is no ambiguity in parsing.
+already uses UNIX-epoch format, can be coaxed to give dates in that
+format, or its format is easiliy convertible to it, as there is no
+ambiguity in parsing.

 `now`::
 	Always use the current time and timezone. The literal

From 882227f117a1356a7132cfdc24b1e2389b16133b Mon Sep 17 00:00:00 2001
From: "Shawn O. Pearce"
Date: Thu, 8 Feb 2007 13:49:06 -0500
Subject: [PATCH 3/4] Correct spelling of fast-import in docs.

Its spelled 'fast-import', not 'gfi'. Linus and Dscho have both
recently pointed this out to me on the mailing list.

Signed-off-by: Shawn O. Pearce
---
 Documentation/git-fast-import.txt | 156 +++++++++++++++---------------
 1 file changed, 78 insertions(+), 78 deletions(-)

diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
index 01f4c8aadc..2a5052072a 100644
--- a/Documentation/git-fast-import.txt
+++ b/Documentation/git-fast-import.txt
@@ -15,15 +15,15 @@ DESCRIPTION
 This program is usually not what the end user wants to run directly.
 Most end users want to use one of the existing frontend programs,
 which parses a specific type of foreign source and feeds the contents
-stored there to git-fast-import (gfi).
+stored there to git-fast-import.

-gfi reads a mixed command/data stream from standard input and
+fast-import reads a mixed command/data stream from standard input and
 writes one or more packfiles directly into the current repository.
 When EOF is received on standard input, fast import writes out
 updated branch and tag refs, fully updating the current repository
 with the newly imported data.

-The gfi backend itself can import into an empty repository (one that
+The fast-import backend itself can import into an empty repository (one that
 has already been initialized by gitlink:git-init[1]) or incrementally
 update an existing populated repository. Whether or not incremental
 imports are supported from a particular foreign source depends on
@@ -34,7 +34,7 @@ OPTIONS
 -------
 --date-format=::
 	Specify the type of dates the frontend will supply to
-	gfi within `author`, `committer` and `tagger` commands.
+	fast-import within `author`, `committer` and `tagger` commands.
 	See ``Date Formats'' below for details about which formats
 	are supported, and their syntax.

@@ -65,28 +65,28 @@ OPTIONS
 	have been completed.

 --quiet::
-	Disable all non-fatal output, making gfi silent when it
+	Disable all non-fatal output, making fast-import silent when it
 	is successful. This option disables the output shown by
 	\--stats.

 --stats::
-	Display some basic statistics about the objects gfi has
+	Display some basic statistics about the objects fast-import has
 	created, the packfiles they were stored into, and the
-	memory used by gfi during this run. Showing this output
+	memory used by fast-import during this run. Showing this output
 	is currently the default, but can be disabled with \--quiet.


 Performance
 -----------
-The design of gfi allows it to import large projects in a minimum
+The design of fast-import allows it to import large projects in a minimum
 amount of memory usage and processing time. Assuming the frontend
-is able to keep up with gfi and feed it a constant stream of data,
+is able to keep up with fast-import and feed it a constant stream of data,
 import times for projects holding 10+ years of history and containing
 100,000+ individual commits are generally completed in just 1-2
 hours on quite modest (~$2,000 USD) hardware.

 Most bottlenecks appear to be in foreign source data access (the
-source just cannot extract revisions fast enough) or disk IO (gfi
+source just cannot extract revisions fast enough) or disk IO (fast-import
 writes as fast as the disk will take the data). Imports will run
 faster if the source data is stored on a different drive than the
 destination Git repository (due to less IO contention).
@@ -94,28 +94,28 @@ destination Git repository (due to less IO contention).

 Development Cost
 ----------------
-A typical frontend for gfi tends to weigh in at approximately 200
+A typical frontend for fast-import tends to weigh in at approximately 200
 lines of Perl/Python/Ruby code. Most developers have been able to
 create working importers in just a couple of hours, even though it
-is their first exposure to gfi, and sometimes even to Git. This is
+is their first exposure to fast-import, and sometimes even to Git. This is
 an ideal situation, given that most conversion tools are throw-away
 (use once, and never look back).


 Parallel Operation
 ------------------
-Like `git-push` or `git-fetch`, imports handled by gfi are safe to
+Like `git-push` or `git-fetch`, imports handled by fast-import are safe to
 run alongside parallel `git repack -a -d` or `git gc` invocations,
 or any other Git operation (including `git prune`, as loose objects
-are never used by gfi).
+are never used by fast-import).

-gfi does not lock the branch or tag refs it is actively importing.
-After the import, during its ref update phase, gfi tests each
+fast-import does not lock the branch or tag refs it is actively importing.
+After the import, during its ref update phase, fast-import tests each
 existing branch ref to verify the update will be a fast-forward
 update (the commit stored in the ref is contained in the new
 history of the commit to be written). If the update is not a
-fast-forward update, gfi will skip updating that ref and instead
-prints a warning message. gfi will always attempt to update all
+fast-forward update, fast-import will skip updating that ref and instead
+prints a warning message. fast-import will always attempt to update all
 branch refs, and does not stop on the first failure.

 Branch updates can be forced with \--force, but its recommended that
@@ -125,35 +125,35 @@ is not necessary for an initial import into an empty repository.

 Technical Discussion
 --------------------
-gfi tracks a set of branches in memory. Any branch can be created
+fast-import tracks a set of branches in memory. Any branch can be created
 or modified at any point during the import process by sending a
 `commit` command on the input stream. This design allows a frontend
 program to process an unlimited number of branches simultaneously,
 generating commits in the order they are available from the source
 data. It also simplifies the frontend programs considerably.

-gfi does not use or alter the current working directory, or any
+fast-import does not use or alter the current working directory, or any
 file within it. (It does however update the current Git repository,
 as referenced by `GIT_DIR`.) Therefore an import frontend may use
 the working directory for its own purposes, such as extracting file
 revisions from the foreign source. This ignorance of the working
-directory also allows gfi to run very quickly, as it does not
+directory also allows fast-import to run very quickly, as it does not
 need to perform any costly file update operations when switching
 between branches.

 Input Format
 ------------
 With the exception of raw file data (which Git does not interpret)
-the gfi input format is text (ASCII) based. This text based
+the fast-import input format is text (ASCII) based. This text based
 format simplifies development and debugging of frontend programs,
 especially when a higher level language such as Perl, Python or
 Ruby is being used.

-gfi is very strict about its input. Where we say SP below we mean
+fast-import is very strict about its input. Where we say SP below we mean
 *exactly* one space. Likewise LF means one (and only one) linefeed.
 Supplying additional whitespace characters will cause unexpected
 results, such as branch names or file names with leading or trailing
-spaces in their name, or early termination of gfi when it encounters
+spaces in their name, or early termination of fast-import when it encounters
 unexpected input.

 Date Formats
@@ -164,7 +164,7 @@ in the \--date-format= command line option.

 `raw`::
 	This is the Git native format and is `