The tar importer in `contrib/fast-import/import-tars.perl` has a very
convenient feature: if _all_ paths stored in the imported `.tar` start
with a common prefix, e.g. `git-2.26.0/` in the tar at
https://github.com/git/git/archive/v2.26.0.tar.gz, then this prefix is
stripped.
This feature makes a ton of sense because it is relatively common to
import two or more revisions of the same project into Git, and obviously
we don't want all files to live in a tree whose name changes from
revision to revision.
Now, the problem with that feature is that it breaks down if there is a
`pax_global_header` "file" located outside of said prefix, at the top of
the tree. This is the case for `.tar` files generated by Git's very own
`git archive` command: it inserts that header, and `git archive` allows
specifying a common prefix (that the header does _not_ share with the
other files contained in the archive) via `--prefix=my-project-1.0.0/`.
Let's just skip any global header when importing `.tar` files into Git.
Note: this global header might contain useful information. For example,
in the output of `git archive`, it lists the original commit, which _is_
useful information. A future improvement to the `import-tars.perl`
script might be to include that information in the commit message, or do
other things with the information (e.g. use `mtime` information
contained in the global header as date of the commit). This patch does
not prevent any future patch from making that happen, it only prevents
the header from being treated as if it was a regular file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Importing gcc tarballs[1] with import-tars script (in contrib) fails
when hitting a pax extended header.
Make sure we always read the extended attributes from the pax entries,
and store the 'path' value if found to be used in the next ustar entry.
The code to parse pax extended headers was written consulting the Pax
Pax Interchange Format documentation [2].
[1] http://ftp.gnu.org/gnu/gcc/gcc-7.3.0/gcc-7.3.0.tar.xz
[2] https://www.freebsd.org/cgi/man.cgi?manpath=FreeBSD+8-current&query=tar&sektion=5
Signed-off-by: Pedro Alvarez <palvarez89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, we simply treated hard links as if they were plain files
with size 0, ignoring the link type "1" and hence the link target.
What we should do instead, of course, is to use the link target to get
at the import mark for the contents, even if we cannot recreate the hard
link per se, as Git has no concept of hard links.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also handle the extensions .tlz and .txz, aliases for .tar.lzma and
.tar.xz respectively.
Signed-off-by: Ingmar Vanhassel <ingmar@exherbo.org>
Liked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes an obvious syntax error that snuck in commit 7e787953:
syntax error at /home/ingmar/bin//git-import-tars line 143, near "/^$/ { "
syntax error at /home/ingmar/bin//git-import-tars line 145, near "} else"
syntax error at /home/ingmar/bin//git-import-tars line 152, near "}"
Signed-off-by: Ingmar Vanhassel <ingmar@exherbo.org>
Acked-and-Tested-by: Peter Krefting <peter@softwolves.pp.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the "--metainfo=<ext>" option is given on the command line, a file
called "<filename.tar>.<ext>" will be used to create the commit message
for "<filename.tar>", instead of using "Imported from filename.tar".
The author and committer of the tar ball can also be overridden by
embedding an "Author:" or "Committer:" header in the metainfo file.
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without this patch, symbolic links are turned into empty files.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The import-tars script is typically employed to (re)create the past
history of a project from stored tars. Although assigning authorship in
these cases can be a somewhat arbitrary process, it makes sense to set
the author to whoever created the tars in the first place (if it's
known), and (s)he can in general be different from the committer
(whoever is running the script).
Implement this by having separate author and committer data, making them
settable from the usual GIT_* environment variables.
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
format-patch: add MIME-Version header when we add content-type.
Fixed link in user-manual
import-tars: Use the "Link indicator" to identify directories
git name-rev writes beyond the end of malloc() with large generations
Documentation/branch: fix small typo in -D example
Earlier, we used the mode to determine if a name was associated with
a directory. This fails, since some tar programs do not set the mode
correctly. However, the link indicator _has_ to be set correctly.
Noticed by Chris Riddoch.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Randal L. Schwartz pointed out multiple times that we should be
testing the length of the name string here, not if it is "true".
The problem is the string '0' is actually false in Perl when we
try to evaluate it in this context, as '0' is 0 numerically and
the number 0 is treated as a false value. This would cause us
to break out of the import loop early if anyone had a file or
directory named "0".
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This extension allows GNU tar to process file names in excess of the 100
characters defined by the original tar standard. It does this by faking a
file, named '././@LongLink' containing the true file name, and then adding
the file with a truncated name. The idea is that tar without this
extension will write out a file with the long file name, and write the
contents into a file with truncated name.
Unfortunately, GNU tar does a lousy job at times. When truncating results
in a _directory_ name, it will happily use _that_ as a truncated name for
the file.
An example where this actually happens is gcc-4.1.2, where the full path
of the file WeThrowThisExceptionHelper.java truncates _exactly_ before the
basename. So, we have to support that ad-hoc extension.
This bug was noticed by Chris Riddoch on IRC.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* maint:
http.c: Fix problem with repeated calls of http_init
Add missing reference to GIT_COMMITTER_DATE in git-commit-tree documentation
Fix import-tars fix.
Update .mailmap with "Michael"
Do not barf on too long action description
Catch empty pathnames in trees during fsck
Don't allow empty pathnames in fast-import
import-tars: be nice to wrong directory modes
git-svn: Added 'find-rev' command
git shortlog documentation: add long options and fix a typo
This heeds advice from our resident Perl expert to make sure
the script is not confused with a string that ends with /\n
Signed-off-by: Junio C Hamano <junkio@cox.net>
Some tars seem to have modes 0755 for directories, not 01000755. Do
not generate an empty object for them, but ignore them.
Noticed by riddochc on IRC.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
add a / between the prefix and name fields of the tar archive if prefix
is non-empty.
Signed-off-by: Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Not everyone has gzcat or bzcat installed on their system, but
gunzip -c and bunzip2 -c perform the same task and are available
if the user has installed gzip support or bzip2 support.
Signed-off-by: Michael Loeffler <zvpunry@zvpunry.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
There is a bug with this $git_mode variable which should be 0644
or 0755, but nothing else I think.
Signed-off-by: Michael Loeffler <zvpunry@zvpunry.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This is an example fast-import frontend, in less than 100 lines
of Perl. It accepts one or more tar archives on the command line,
passes them through gzcat/bzcat/zcat if necessary, parses out the
individual file headers and feeds all contained data to fast-import.
No temporary files are involved.
Each tar is treated as one commit, with the commit timestamp coming
from the oldest file modification date found within the tar.
Each tar is also tagged with an annotated tag, using the basename
of the tar file as the name of the tag.
Currently symbolic links and hard links are not handled by the
importer. The file checksums are also not verified.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>