Commit graph

106 commits

Author SHA1 Message Date
Shawn O. Pearce 7f09ac4714 Merge branch 'maint'
* maint:
  fast-import: grow tree storage more aggressively
2007-03-12 15:04:46 -04:00
Jeff King f022f85f6d fast-import: grow tree storage more aggressively
When building up a tree for a commit, fast-import
dynamically allocates memory for the tree entries. When more
space is needed, the allocated memory is increased by a
constant amount. For very large trees, this means
re-allocating and memcpy()ing the memory O(n) times.

To compound this problem, releasing the previous tree
resource does not free the memory; it is kept in a pool
for future trees. This means that each of the O(n)
allocations will consume increasing amounts of memory,
giving O(n^2) memory consumption.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-12 15:01:44 -04:00
Junio C Hamano f45fa2a073 Merge branch 'master' of git://repo.or.cz/git/fastimport
* 'master' of git://repo.or.cz/git/fastimport:
  Allow fast-import frontends to reload the marks table
  Use atomic updates to the fast-import mark file
  Preallocate memory earlier in fast-import
2007-03-07 23:10:05 -08:00
Shawn O. Pearce e8438420bb Allow fast-import frontends to reload the marks table
I'm giving fast-import a lesson on how to reload the marks table
using the same format it outputs with --export-marks.  This way
a frontend can reload the marks table from a prior import, making
incremental imports less painful.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-07 18:07:26 -05:00
Shawn O. Pearce 60b9004cdb Use atomic updates to the fast-import mark file
When we allow fast-import frontends to reload a mark file from a
prior session we want to let them use the same file as they exported
the marks to.  This makes it very simple for the frontend to save
state across incremental imports.

But we don't want to lose the old marks table if anything goes wrong
while writing our current marks table.  So instead of truncating and
overwriting the path specified to --export-marks we use the standard
lockfile code to write the current marks out to a temporary file,
then rename it over the old marks table.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-07 18:05:38 -05:00
Shawn O. Pearce 93e72d8d8f Preallocate memory earlier in fast-import
I'm about to teach fast-import how to reload the marks file created
by a prior session.  The general approach that I want to use is to
immediately parse the marks file when the specific argument is found
in argv, thereby allowing the caller to supply multiple marks files,
as the mark space can be sparsely populated.

To make that work out we need to allocate our object tables before
we parse the command line options.  Since none of these tables
depend on the command line options, we can easily relocate them.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-07 17:11:02 -05:00
Shawn O. Pearce 6777a59fcd Use off_t in pack-objects/fast-import when we mean an offset
Always use an off_t value in pack-objects anytime we are dealing
with an offset to some data within a packfile.

Also fixed a minor uintmax_t that was incorrectly defined before.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 11:06:33 -08:00
Shawn O. Pearce c4001d92be Use off_t when we really mean a file offset.
Not all platforms have declared 'unsigned long' to be a 64 bit value,
but we want to support a 64 bit packfile (or close enough anyway)
in the near future as some projects are getting large enough that
their packed size exceeds 4 GiB.

By using off_t, the POSIX type that is declared to mean an offset
within a file, we support whatever maximum file size the underlying
operating system will handle.  For most modern systems this is up
around 2^60 or higher.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 11:06:25 -08:00
Shawn O. Pearce 3a55602eec General const correctness fixes
We shouldn't attempt to assign constant strings into char*, as the
string is not writable at runtime.  Likewise we should always be
treating unsigned values as unsigned values, not as signed values.

Most of these are very straightforward.  The only exception is the
(unnecessary) xstrdup/free in builtin-branch.c for the detached
head case.  Since this is a user-level interactive type program
and that particular code path is executed no more than once, I feel
that the extra xstrdup call is well worth the easy elimination of
this warning.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 10:47:10 -08:00
Shawn O. Pearce 6b4318e604 Merge branch 'maint'
* maint:
  fast-import: Fail if a non-existant commit is used for merge
  fast-import: Avoid infinite loop after reset

[sp: Minor evil merge to deal with type_names array moving
 to be private in 'master'.]
2007-03-05 12:50:29 -05:00
Shawn O. Pearce 2f6dc35d2a fast-import: Fail if a non-existant commit is used for merge
Johannes Sixt noticed during one of his own imports that fast-import
did not fail if a non-existant commit is referenced by SHA-1 value
as an argument to the 'merge' command.  This allowed the user to
unknowingly create commits that would fail in fsck, as the commit
contents would not be completely reachable.

A side effect of this bug was that a frontend process could mark
any SHA-1 object (blob, tree, tag) as a parent of a merge commit.
This should also fail in fsck, as the commit is not a valid commit.

We now use the same rule as the 'from' command.  If a commit is
referenced in the 'merge' command by hex formatted SHA-1 then the
SHA-1 must be a commit or a tag that can be peeled back to a commit,
the commit must already exist, and must be readable by the core Git
infrastructure code.  This requirement means that the commit must
have existed prior to fast-import starting, or the commit must have
been flushed out by a prior 'checkpoint' command.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-05 12:43:14 -05:00
Shawn O. Pearce 734c91f9e2 fast-import: Avoid infinite loop after reset
Johannes Sixt noticed that a 'reset' command applied to a branch that
is already active in the branch LRU cache can cause fast-import to
relink the same branch into the LRU cache twice.  This will cause
the LRU cache to contain a cycle, making unload_one_branch run in an
infinite loop as it tries to select the oldest branch for eviction.

I have trivially fixed the problem by adding an active bit to
each branch object; this bit indicates if the branch is already
in the LRU and allows us to avoid trying to add it a second time.
Converting the pack_id field into a bitfield makes this change take
up no additional memory.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-05 12:31:09 -05:00
Nicolas Pitre 21666f1aae convert object type handling from a string to a number
We currently have two parallel notation for dealing with object types
in the code: a string and a numerical value.  One of them is obviously
redundent, and the most used one requires more stack space and a bunch
of strcmp() all over the place.

This is an initial step for the removal of the version using a char array
found in object reading code paths.  The patch is unfortunately large but
there is no sane way to split it in smaller parts without breaking the
system.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-27 01:34:21 -08:00
Nicolas Pitre df8436622f formalize typename(), and add its reverse type_from_string()
Sometime typename() is used, sometimes type_names[] is accessed directly.
Let's enforce typename() all the time which allows for validating the
type.

Also let's add a function to go from a name to a type and use it instead
of manual memcpy() when appropriate.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-27 01:34:21 -08:00
Junio C Hamano 599065a3bb prefixcmp(): fix-up mechanical conversion.
Previous step converted use of strncmp() with literal string
mechanically even when the result is only used as a boolean:

    if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo")))

This step manually cleans them up to read:

    if (!prefixcmp(arg, "foo"))

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-20 22:03:15 -08:00
Junio C Hamano cc44c7655f Mechanical conversion to use prefixcmp()
This mechanically converts strncmp() to use prefixcmp(), but only when
the parameters match specific patterns, so that they can be verified
easily.  Leftover from this will be fixed in a separate step, including
idiotic conversions like

    if (!strncmp("foo", arg, 3))

  =>

    if (!(-prefixcmp(arg, "foo")))

This was done by using this script in px.perl

   #!/usr/bin/perl -i.bak -p
   if (/strncmp\(([^,]+), "([^\\"]*)", (\d+)\)/ && (length($2) == $3)) {
           s|strncmp\(([^,]+), "([^\\"]*)", (\d+)\)|prefixcmp($1, "$2")|;
   }
   if (/strncmp\("([^\\"]*)", ([^,]+), (\d+)\)/ && (length($1) == $3)) {
           s|strncmp\("([^\\"]*)", ([^,]+), (\d+)\)|(-prefixcmp($2, "$1"))|;
   }

and running:

   $ git grep -l strncmp -- '*.c' | xargs perl px.perl

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-20 22:03:15 -08:00
Junio C Hamano 9360f27ff7 Merge branch 'maint'
* maint:
  Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c.
2007-02-20 22:02:15 -08:00
Jason Riedy 3efb1f343a Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c.
Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for
the clean-up.  Defining the C99 standard PRIuMAX when necessary
replaces UM_FMT and the awkward UM10_FMT.  There are no direct
C99 translations for other uses of NO_C99_FORMAT in git, alas.

Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-20 19:10:57 -08:00
Junio C Hamano 2b2b892e36 Merge branch 'maint'
* maint:
  Obey NO_C99_FORMAT in fast-import.c.
  Add a compat/strtoumax.c for Solaris 8.
  git-clone: Sync documentation to usage note.
2007-02-19 18:29:41 -08:00
Jason Riedy e326bce65c Obey NO_C99_FORMAT in fast-import.c.
Define UM_FMT and UM10_FMT and use in place of %ju and %10ju,
respectively.  Both format as unsigned long long, so this
assumes the compiler supports long long.

Signed-off-by: Jason Riedy <jason@acm.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-19 18:20:49 -08:00
Junio C Hamano 4a164d48df Merge branch 'jc/merge-base' (early part)
This contains an evil merge to fast-import, in order to
resolve in_merge_bases() update.
2007-02-13 16:54:35 -08:00
Shawn O. Pearce ea5e370aa9 fast-import: Support reusing 'from' and brown paper bag fix reset.
It was suggested on the mailing list that being able to use `from`
in any commit to reset the current branch is useful in some types of
importers, such as a darcs importer.

We originally did not permit resetting an existing branch with a
new `from` command during a `commit` command, but this restriction
was only to help debug the hacked up cvs2svn that Jon Smirl was
developing in parallel with git-fast-import.  It is probably more
of a problem to disallow it than to allow it.  So now we permit a
`from` during any `commit`.

While making the changes required to permit multiple `from`
commands on the same branch, I discovered we no longer needed the
last_commit field to be set to 0 during a reset, so that was removed.
(Reset was originally setting the field to 0 to signal cmd_from()
that it was OK to execute on the branch.)

While poking around in this section of fast-import I also realized
the `reset` command was not working as intended if the corresponding
`from` command was omitted (as allowed by the BNF grammar and the
code).  If `from` was omitted we cleared out the tree but we left
the tree SHA-1 and parent commit SHA-1 intact.  This is not what
the user intended in this case.  Instead they would be trying to
reset the branch to have no parent and to have no tree, making the
branch look new-born during the next commit.  We now clear these
SHA-1 values during `reset`, ensuring the branch looks new-born if
`from` does not get supplied.

New test cases for these were also added.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-12 12:17:31 -05:00
Shawn O. Pearce bdf1c06dc1 fast-import: Hide the pack boundary commits by default.
Most users don't need the pack boundary information that fast-import
was printing to standard output, especially if they were calling
it with --quiet.

Those users who do want this information probably want it captured
so they can go back and use it to repack the imported repository.
So dumping the boundary commits to a log file makes more sense then
printing them to standard output.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-11 19:45:56 -05:00
Johannes Schindelin 40db58b8dc fast-import: Fix compile warnings
Not on all platforms are size_t and unsigned long equivalent.
Since I do not know how portable %z is, I play safe, and just
cast the respective variables to unsigned long.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-07 09:28:23 -08:00
Shawn O. Pearce 22c9f7e4c5 Don't crash fast-import if the marks cannot be exported.
Apparently fast-import used to die a horrible death if we
were unable to open the marks file for output.  This is
slightly less than ideal, especially now that we dump
the marks as part of the `checkpoint` command.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-07 02:46:35 -05:00
Shawn O. Pearce 820b931012 Dump all refs and marks during a checkpoint in fast-import.
If the frontend asks us to checkpoint (via the explicit checkpoint
command) its probably because they are afraid the current import
will crash/fail/whatever and want to make sure they can pickup from
the last checkpoint.  To do that sort of recovery, we will need the
current tip of every branch and tag available at the next startup.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-07 02:42:44 -05:00
Shawn O. Pearce c499d76849 Teach fast-import how to sit quietly in the corner.
Often users will be running fast-import from within a larger frontend
process, and this may be a frequent periodic tool such as a future
edition of `git-svn fetch`.  We don't want to bombard users with our
large stats output if they won't be interested in it, so `--quiet`
is now an option to make gfi be more silent.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-07 02:19:31 -05:00
Shawn O. Pearce 825769a8fe Teach fast-import how to clear the internal branch content.
Some frontends may not be able to (easily) keep track of which files
are included in the branch, and which aren't.  Performing this
tracking can be tedious and error prone for the frontend to do,
especially if its foreign data source cannot supply the changed
path list on a per-commit basis.

fast-import now allows a frontend to request that a branch's tree
be wiped clean (reset to the empty tree) at the start of a commit,
allowing the frontend to feed in all paths which belong on the branch.

This is ideal for a tar-file importer frontend, for example, as
the frontend just needs to reformat the tar data stream into a gfi
data stream, which may be something a few Perl regexps can take
care of. :)

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-07 02:03:03 -05:00
Junio C Hamano 9981b6d915 S_IFLNK != 0140000
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-06 16:08:30 -05:00
Shawn O. Pearce 7073e69e38 Don't do non-fastforward updates in fast-import.
If fast-import is being used to update an existing branch of
a repository, the user may not want to lose commits if another
process updates the same ref at the same time.  For example, the
user might be using fast-import to make just one or two commits
against a live branch.

We now perform a fast-forward check during the ref updating process.
If updating a branch would cause commits in that branch to be lost,
we skip over it and display the new SHA1 to standard error.

This new default behavior can be overridden with `--force`, like
git-push and git-fetch.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-06 16:08:06 -05:00
Shawn O. Pearce 63e0c8b364 Support RFC 2822 date parsing in fast-import.
Since some frontends may be working with source material where
the dates are only readily available as RFC 2822 strings, it is
more friendly if fast-import exposes Git's parse_date() function
to handle the conversion.  This way the frontend doesn't need
to perform the parsing itself.

The new --date-format option to fast-import can be used by a
frontend to select which format it will supply date strings in.
The default is the standard `raw` Git format, which fast-import
has always supported.  Format rfc2822 can be used to activate the
parse_date() function instead.

Because fast-import could also be useful for creating new, current
commits, the format `now` is also supported to generate the current
system timestamp.  The implementation of `now` is a trivial call
to datestamp(), but is actually a whole whopping 3 lines so that
fast-import can verify the frontend really meant `now`.

As part of this change I have added validation of the `raw` date
format.  Prior to this change fast-import would accept anything
in a `committer` command, even if it was seriously malformed.
Now fast-import requires the '> ' near the end of the string and
verifies the timestamp is formatted properly.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-06 14:58:30 -05:00
Shawn O. Pearce e7d06a4b70 Remove unnecessary null pointer checks in fast-import.
There is no need to check for a NULL pointer before invoking free(),
the runtime library automatically performs this check anyway and
does nothing if a NULL pointer is supplied.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-06 12:05:51 -05:00
Shawn O. Pearce e5b1444b96 Correct minor style issue in fast-import.
Junio noticed that I was using a different style in fast-import
for returned pointers than the rest of Git.  Before merging this
code into the main git.git tree I'd like to make it consistent,
as this style variation was not intentional.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-06 00:43:59 -05:00
Shawn O. Pearce 10e8d68820 Correct compiler warnings in fast-import.
Junio noticed these warnings/errors in fast-import when compiling
with `-Werror -ansi -pedantic`.  A few changes are to reduce compiler
warnings, while one (in cmd_merge) is a bug fix.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-06 00:26:49 -05:00
Shawn O. Pearce 0b868e0240 Remove --branch-log from fast-import.
The --branch-log option and its associated code hasn't been used in
several months, as its not really very useful for debugging fast-import
or a frontend.  I don't plan on supporting it in this state long-term,
so I'm killing it now before it gets distributed to a wider audience.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-06 00:15:37 -05:00
Shawn O. Pearce 6c3aac1c69 Don't support shell-quoted refnames in fast-import.
The current implementation of shell-style quoted refnames and
SHA-1 expressions within fast-import contains a bad memory leak.
We leak the unquoted strings used by the `from` and `merge`
commands, maybe others.  Its also just muddling up the docs.

Since Git refnames cannot contain LF, and that is our delimiter
for the end of the refname, and we accept any other character
as-is, there is no reason for these strings to support quoting,
except to be nice to frontends.  But frontends shouldn't be
expecting to use funny refs in Git, and its just as simple to
never quote them as it is to always pass them through the same
quoting filter as pathnames.  So frontends should never quote
refs, or ref expressions.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-05 20:30:37 -05:00
Shawn O. Pearce 10831c5513 Reduce memory usage of fast-import.
Some structs are allocated rather frequently, but were using integer
types which were far larger than required to actually store their
full value range.

As packfiles are limited to 4 GiB we don't need more than 32 bits to
store the offset of an object within that packfile, an `unsigned long`
on a 64 bit system is likely a 64 bit unsigned value.  Saving 4 bytes
per object on a 64 bit system can add up fast on any sizable import.

As atom strings are strictly single components in a path name these
are probably limited to just 255 bytes by the underlying OS.  Going
to that short of a string is probably too restrictive, but certainly
`unsigned int` is far too large for their lengths.  `unsigned short`
is a reasonable limit.

Modes within a tree really only need two bytes to store their whole
value; using `unsigned int` here is vast overkill.  Saving 4 bytes
per file entry in an active branch can add up quickly on a project
with a large number of files.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-05 16:34:56 -05:00
Shawn O. Pearce 8c1f22da9f Include checkpoint command in the BNF.
This command isn't encouraged (as its slow) but it does exist and
is accepted, so it still should be covered in the BNF.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-05 16:05:11 -05:00
Shawn O. Pearce 76db9dec81 Merge branch 'master' into sp/gfi
git-fast-import requires use of inttypes.h, but the master branch has
added it to git-compat-util differently than git-fast-import originally
had used it.  This merge back of master to the fast-import topic is to
get (and use) inttypes.h the way master is using it.

This is a partially evil merge to remove the call to setup_ident(),
as the master branch now contains a change which makes this implicit
and therefore removed the function declaration. (commit 01754769).

Conflicts:

	git-compat-util.h
2007-01-30 11:07:24 -05:00
Shawn O. Pearce b715cfbba4 Accept 'inline' file data in fast-import commit structure.
Its very annoying to need to specify the file content ahead of a
commit and use marks to connect the individual blobs to the commit's
file modification entry, especially if the frontend can't/won't
generate the blob SHA1s itself.  Instead it would much easier to
use if we can accept the blob data at the same time as we receive
each file_change line.

Now fast-import accepts 'inline' instead of a mark idnum or blob
SHA1 within the 'M' type file_change command.  If an inline is
detected the very next line must be a 'data n' command, supplying
the file data.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-01-18 15:17:58 -05:00
Shawn O. Pearce 3b4dce0275 Support delimited data regions in fast-import.
During testing its nice to not have to feed the length of a data
chunk to the 'data' command of fast-import.  Instead we would
prefer to be able to establish a data chunk much like shell's <<
operator and use a line delimiter to denote the end of the input.

So now if a data command is started as 'data <<EOF' we will look
for a terminator line containing only the string EOF on that line.
Once found, we stop the data command.  Everything between the two
lines is used as the data value.

The 'data <<' syntax is slower than 'data n', as we don't know how
many bytes to expect and instead must grow our buffer on the fly.
It also has the problem that the frontend must use a string which
will not appear on a line by itself in the input, and the data
region will always end in an LF.  For these reasons real import
frontends are encouraged to continue to use _only_ 'data n'.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-01-18 13:25:37 -05:00
Shawn O. Pearce e5808826c4 Remove unnecessary options from fast-import.
The --objects command line option is rather unnecessary.  Internally
we allocate objects in 5000 unit blocks, ensuring that any sort
of malloc overhead is ammortized over the individual objects to
almost nothing.  Since most frontends don't know how many objects
they will need for a given import run (and its hard for them to
predict without just doing the run) we probably won't see anyone
using --objects.  Further since there's really no major benefit
to using the option, most frontends won't even bother supplying
it even if they could estimate the number of objects.  So I'm
removing it.

The --max-objects-per-pack option was probably a mistake to even
have added in the first place.  The packfile format is limited
to 4 GiB today; given that objects need at least 3 bytes of data
(and probably need even more) there's no way we are going to exceed
the limit of 1<<32-1 objects before we reach the file size limit.
So I'm removing it (to slightly reduce the complexity of the code)
before anyone gets any wise ideas and tries to use it.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-01-18 12:02:37 -05:00
Shawn O. Pearce ebea9dd4f1 Use fixed-size integers when writing out the index in fast-import.
Currently the pack .idx file format uses 32-bit unsigned integers
for the fan-out table and the object offsets.  We had previously
defined these as 'unsigned int', but not every system will define
that type to be a 32 bit value.  To ensure maximum portability we
should always use 'uint32_t'.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-01-18 11:30:17 -05:00
Shawn O. Pearce 566f44252b Always use struct pack_header for pack header in fast-import.
Previously we were using 'unsigned int' to update the hdr_entries
field of the pack header after the file had been completed and
was being hashed.  This may not be 32 bits on all platforms.
Instead we want to always uint32_t.

I'm actually cheating here by just using the pack_header like the
rest of Git and letting the struct definition declare the correct
type.  Right now that field is still 'unsigned int' (wrong) but a
pending change submitted by Simon 'corecode' Schubert changes it
to uint32_t.  After that change is merged in fast-import will do
the right thing all of the time.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-01-18 11:26:06 -05:00
Shawn O. Pearce 69e74e7412 Correct packfile edge output in fast-import.
Branches are only contained by a packfile if the branch actually
had its most recent commit in that packfile.  So new branches are
set to MAX_PACK_ID to ensure they don't cause their commit to list
as part of the first packfile when it closes out if the commit was
actually in existance before fast-import started.

Also corrected the type of last_commit to be umaxint_t to prevent
overflow and wraparound on very large imports.  Though that is
highly unlikely to occur as we're talking 4 billion commits, which
no real project has right now.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-01-17 02:42:43 -05:00
Shawn O. Pearce fd99224eec Declare no-arg functions as (void) in fast-import.
Apparently the git convention is to declare any function which
takes no arguments as taking void.  I did not do this during the
early fast-import development, but should have.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-01-17 01:47:25 -05:00
Shawn O. Pearce 6f64f6d9d2 Correct a few types to be unsigned in fast-import.
The length of an atom string cannot be negative.  So make it
explicit and declare it as an unsigned value.

The shift width in a mark table node also cannot be negative.
I'm also moving it to after the pointer arrays to prevent any
possible alignment problems on a 64 bit system.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-01-17 01:13:22 -05:00
Shawn O. Pearce 2104838bf9 Corrected BNF input documentation for fast-import.
Now that fast-import uses uintmax_t (the largest available unsigned
integer type) for marks we don't want to say its an unsigned 32
bit integer in ASCII base 10 notation.  It could be much larger,
especially on 64 bit systems, and especially if a frontend uses
a very large number of marks (1 per file revision on a very, very
large import).

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-01-17 00:33:18 -05:00
Shawn O. Pearce 2369ed7907 Print out the edge commits for each packfile in fast-import.
To help callers repack very large repositories into a series of
packfiles fast-import now outputs the last commits/tags it wrote to
a packfile when it prints out the packfile name.  This information
can be feed to pack-objects --revs to repack.  For the first pack
of an initial import this is pretty easy (just feed those SHA1s on
stdin) but for subsequent packs you want to feed the subsequent
pack's final SHA1s but also all prior pack's SHA1s prefixed with
the negation operator.  This way the prior pack's data does not
get included into the subsequent pack.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-01-16 16:18:44 -05:00
Shawn O. Pearce a7ddc48765 Correct object_count type and stat output in fast-import.
Since object_count is limited to 'unsigned long' (really an
unsigned 32 bit integer value) by the pack file format we may as
well use exactly that type here in fast-import for that counter.
An earlier change by me incorrectly made it uintmax_t.

But since object_count is a counter for the current packfile only,
we don't want to output its value at the end.  Instead we should
sum up the individual type counters and report that total, as that
will cover all of the packfiles.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-01-16 04:55:41 -05:00