Commit graph

14 commits

Author SHA1 Message Date
Jonathan Nieder
7e2fe3a9fc vcs-svn: remove buffer_read_string
All previous users of buffer_read_string have already been converted
to use the more intuitive buffer_read_binary, so remove the old API to
avoid some confusion.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-26 00:17:35 -05:00
Jonathan Nieder
26557fc1b3 vcs-svn: make buffer_copy_bytes return length read
Currently buffer_copy_bytes does not report to its caller whether
it encountered an early end of file.

Add a return value representing the number of bytes read (but not
the number of bytes copied).  This way all three unusual conditions
can be distinguished: input error with buffer_ferror, output error
with ferror(outfile), early end of input by checking the return
value.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22 16:40:26 -05:00
Jonathan Nieder
d234f54b2f vcs-svn: make buffer_skip_bytes return length read
Currently there is no way to detect when input ended if it ended
early during buffer_skip_bytes.  Tell the calling program how many
bytes were actually skipped for easier debugging.

Existing callers will still ignore early EOF.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22 16:39:54 -05:00
Jonathan Nieder
93b709c79e vcs-svn: improve support for reading large files
Move from uint32_t to off_t as the fundamental unit of length used by
the line_buffer library.  Performance would get worse if anything but
I think it's worth it for support of deltas that need to skip large
pieces (> 4 GiB).

Exception: buffer_read_string still takes a uint32_t, since it keeps
its result in an in-core obj_pool.

Callers still have to be updated to take advantage of this.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22 16:32:09 -05:00
Jonathan Nieder
efc749b48f vcs-svn: allow input errors to be detected promptly
The line_buffer library silently flags input errors until
buffer_deinit time; unfortunately, by that point usually errno is
invalid.  Expose the error flag so callers can check for and
report errors early for easy debugging.

	some_error_prone_operation(...);
	if (buffer_ferror(buf))
		return error("input error: %s", strerror(errno));

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-07 01:32:51 -06:00
Jonathan Nieder
b1c9b798a6 vcs-svn: teach line_buffer about temporary files
It can sometimes be useful to write information temporarily to file,
to read back later.  These functions allow a program to use the
line_buffer facilities when doing so.

It works like this:

 1. find a unique filename with buffer_tmpfile_init.
 2. rewind with buffer_tmpfile_rewind.  This returns a stdio
    handle for writing.
 3. when finished writing, declare so with
    buffer_tmpfile_prepare_to_read.  The return value indicates
    how many bytes were written.
 4. read whatever portion of the file is needed.
 5. if finished, remove the temporary file with buffer_deinit.
    otherwise, go back to step 2,

The svn support would use this to buffer the postimage from delta
application until the length is known and fast-import can receive
the resulting blob.

Based-on-patch-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-02-26 04:59:37 -06:00
Jonathan Nieder
cb3f87cf1b vcs-svn: allow input from file descriptor
Based-on-patch-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-02-26 04:59:37 -06:00
Jonathan Nieder
cc193f1f0b vcs-svn: allow character-oriented input
buffer_read_char can be used in place of buffer_read_string(1) to
avoid consuming valuable static buffer space.  The delta applier will
use this to read variable-length integers one byte at a time.

Underneath, it is fgetc, wrapped so the line_buffer library can
maintain its role as gatekeeper of input.

Later it might be worth checking if fgetc_unlocked is faster ---
most line_buffer functions are not thread-safe anyway.

Helpd-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-02-26 04:59:37 -06:00
Jonathan Nieder
e832f43c1d vcs-svn: add binary-safe read function
buffer_read_string works well for non line-oriented input except for
one problem: it does not tell the caller how many bytes were actually
written.  This means that unless one is very careful about checking
for errors (and eof) the calling program cannot tell the difference
between the string "foo" followed by an early end of file and the
string "foo\0bar\0baz".

So introduce a variant that reports the length, too, a thinner wrapper
around strbuf_fread.  Its result is written to a strbuf so the caller
does not need to keep track of the number of bytes read.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-02-26 04:59:37 -06:00
Jonathan Nieder
e5e45ca1e3 vcs-svn: teach line_buffer to handle multiple input files
Collect the line_buffer state in a newly public line_buffer struct.
Callers can use multiple line_buffers to manage input from multiple
files at a time.

svn-fe's delta applier will use this to stream a delta from svnrdump
and the preimage it applies to from fast-import at the same time.

The tests don't take advantage of the new features, but I think that's
okay.  It is easier to find lingering examples of nonreentrant code by
searching for "static" in line_buffer.c.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-02-26 04:57:59 -06:00
Jonathan Nieder
d350822fa7 vcs-svn: collect line_buffer data in a struct
Prepare for the line_buffer lib to support input from multiple files,
by collecting global state in a struct that can be easily passed
around.

No API change yet.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-02-26 04:57:58 -06:00
Jonathan Nieder
deadcef4c1 vcs-svn: replace buffer_read_string memory pool with a strbuf
obj_pool is inherently global and does not use the standard growing
factor alloc_nr, which makes it feel out of place in the git codebase.
Plus it is overkill for this application: all that is needed is a
buffer that can grow between requests to accomodate larger strings.
Use a strbuf instead.

As a side effect, this improves the error handling: allocation
failures will result in a clean exit instead of segfaults.  It would
be nice to add a test case (using ulimit or failmalloc) but that can
wait for another day.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-02-26 04:57:58 -06:00
Jonathan Nieder
4d21bec0d2 vcs-svn: eliminate global byte_buffer
The data stored in byte_buffer[] is always either discarded or
written to stdout immediately.  No need for it to persist between
function calls.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-02-26 04:57:58 -06:00
David Barr
3bbaec00a8 Add stream helper library
This library provides thread-unsafe fgets()- and fread()-like
functions where the caller does not have to supply a buffer.  It
maintains a couple of static buffers and provides an API to use
them.

[rr: allow input from files other than stdin]
[jn: with tests, documentation, and error handling improvements]

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-14 19:35:37 -07:00