git/t/t5551-http-fetch-smart.sh
Jeff King 6bc0cb5176 http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request.  In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.

In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:

  1. Apache proxies the request to the CGI, http-backend.

  2. http-backend gzip-inflates the data and sends
     the result to upload-pack.

  3. upload-pack acts on the data and generates output over
     the pipe back to Apache. Apache isn't reading because
     it's busy writing (step 1).

This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).

We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.

The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:

  1. We limit the in-memory buffer to prevent an obvious
     denial-of-service attack. This is a new hard limit on
     requests, but it's unlikely to come into play. The
     default value is 10MB, which covers even the ridiculous
     100,000-ref negotation in the included test (that
     actually caps out just over 5MB). But it's configurable
     on the off chance that you don't mind spending some
     extra memory to make even ridiculous requests work.

  2. We must take care only to buffer when we have to. For
     pushes, the incoming packfile may be of arbitrary
     size, and we should connect the input directly to
     receive-pack. There's no deadlock problem here, though,
     because we do not produce any output until the whole
     packfile has been read.

     For upload-pack's initial ref advertisement, we
     similarly do not need to buffer. Even though we may
     generate a lot of output, there is no request body at
     all (i.e., it is a GET, not a POST).

[1] http://article.gmane.org/gmane.comp.version-control.git/269020

Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 20:43:18 -07:00

269 lines
7.3 KiB
Bash
Executable File

#!/bin/sh
test_description='test smart fetching over http via http-backend'
. ./test-lib.sh
if test -n "$NO_CURL"; then
skip_all='skipping test, git built without http support'
test_done
fi
. "$TEST_DIRECTORY"/lib-httpd.sh
start_httpd
test_expect_success 'setup repository' '
git config push.default matching &&
echo content >file &&
git add file &&
git commit -m one
'
test_expect_success 'create http-accessible bare repository' '
mkdir "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
(cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
git --bare init
) &&
git remote add public "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
git push public master:master
'
setup_askpass_helper
cat >exp <<EOF
> GET /smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1
> Accept: */*
> Accept-Encoding: gzip
> Pragma: no-cache
< HTTP/1.1 200 OK
< Pragma: no-cache
< Cache-Control: no-cache, max-age=0, must-revalidate
< Content-Type: application/x-git-upload-pack-advertisement
> POST /smart/repo.git/git-upload-pack HTTP/1.1
> Accept-Encoding: gzip
> Content-Type: application/x-git-upload-pack-request
> Accept: application/x-git-upload-pack-result
> Content-Length: xxx
< HTTP/1.1 200 OK
< Pragma: no-cache
< Cache-Control: no-cache, max-age=0, must-revalidate
< Content-Type: application/x-git-upload-pack-result
EOF
test_expect_success 'clone http repository' '
GIT_CURL_VERBOSE=1 git clone --quiet $HTTPD_URL/smart/repo.git clone 2>err &&
test_cmp file clone/file &&
tr '\''\015'\'' Q <err |
sed -e "
s/Q\$//
/^[*] /d
/^$/d
/^< $/d
/^[^><]/{
s/^/> /
}
/^> User-Agent: /d
/^> Host: /d
/^> POST /,$ {
/^> Accept: [*]\\/[*]/d
}
s/^> Content-Length: .*/> Content-Length: xxx/
/^> 00..want /d
/^> 00.*done/d
/^< Server: /d
/^< Expires: /d
/^< Date: /d
/^< Content-Length: /d
/^< Transfer-Encoding: /d
" >act &&
test_cmp exp act
'
test_expect_success 'fetch changes via http' '
echo content >>file &&
git commit -a -m two &&
git push public
(cd clone && git pull) &&
test_cmp file clone/file
'
cat >exp <<EOF
GET /smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1 200
POST /smart/repo.git/git-upload-pack HTTP/1.1 200
GET /smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1 200
POST /smart/repo.git/git-upload-pack HTTP/1.1 200
EOF
test_expect_success 'used upload-pack service' '
sed -e "
s/^.* \"//
s/\"//
s/ [1-9][0-9]*\$//
s/^GET /GET /
" >act <"$HTTPD_ROOT_PATH"/access.log &&
test_cmp exp act
'
test_expect_success 'follow redirects (301)' '
git clone $HTTPD_URL/smart-redir-perm/repo.git --quiet repo-p
'
test_expect_success 'follow redirects (302)' '
git clone $HTTPD_URL/smart-redir-temp/repo.git --quiet repo-t
'
test_expect_success 'redirects re-root further requests' '
git clone $HTTPD_URL/smart-redir-limited/repo.git repo-redir-limited
'
test_expect_success 'clone from password-protected repository' '
echo two >expect &&
set_askpass user@host pass@host &&
git clone --bare "$HTTPD_URL/auth/smart/repo.git" smart-auth &&
expect_askpass both user@host &&
git --git-dir=smart-auth log -1 --format=%s >actual &&
test_cmp expect actual
'
test_expect_success 'clone from auth-only-for-push repository' '
echo two >expect &&
set_askpass wrong &&
git clone --bare "$HTTPD_URL/auth-push/smart/repo.git" smart-noauth &&
expect_askpass none &&
git --git-dir=smart-noauth log -1 --format=%s >actual &&
test_cmp expect actual
'
test_expect_success 'clone from auth-only-for-objects repository' '
echo two >expect &&
set_askpass user@host pass@host &&
git clone --bare "$HTTPD_URL/auth-fetch/smart/repo.git" half-auth &&
expect_askpass both user@host &&
git --git-dir=half-auth log -1 --format=%s >actual &&
test_cmp expect actual
'
test_expect_success 'no-op half-auth fetch does not require a password' '
set_askpass wrong &&
git --git-dir=half-auth fetch &&
expect_askpass none
'
test_expect_success 'redirects send auth to new location' '
set_askpass user@host pass@host &&
git -c credential.useHttpPath=true \
clone $HTTPD_URL/smart-redir-auth/repo.git repo-redir-auth &&
expect_askpass both user@host auth/smart/repo.git
'
test_expect_success 'disable dumb http on server' '
git --git-dir="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
config http.getanyfile false
'
test_expect_success 'GIT_SMART_HTTP can disable smart http' '
(GIT_SMART_HTTP=0 &&
export GIT_SMART_HTTP &&
cd clone &&
test_must_fail git fetch)
'
test_expect_success 'invalid Content-Type rejected' '
test_must_fail git clone $HTTPD_URL/broken_smart/repo.git 2>actual
grep "not valid:" actual
'
test_expect_success 'create namespaced refs' '
test_commit namespaced &&
git push public HEAD:refs/namespaces/ns/refs/heads/master &&
git --git-dir="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
symbolic-ref refs/namespaces/ns/HEAD refs/namespaces/ns/refs/heads/master
'
test_expect_success 'smart clone respects namespace' '
git clone "$HTTPD_URL/smart_namespace/repo.git" ns-smart &&
echo namespaced >expect &&
git --git-dir=ns-smart/.git log -1 --format=%s >actual &&
test_cmp expect actual
'
test_expect_success 'dumb clone via http-backend respects namespace' '
git --git-dir="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
config http.getanyfile true &&
GIT_SMART_HTTP=0 git clone \
"$HTTPD_URL/smart_namespace/repo.git" ns-dumb &&
echo namespaced >expect &&
git --git-dir=ns-dumb/.git log -1 --format=%s >actual &&
test_cmp expect actual
'
cat >cookies.txt <<EOF
127.0.0.1 FALSE /smart_cookies/ FALSE 0 othername othervalue
EOF
cat >expect_cookies.txt <<EOF
127.0.0.1 FALSE /smart_cookies/ FALSE 0 othername othervalue
127.0.0.1 FALSE /smart_cookies/repo.git/info/ FALSE 0 name value
EOF
test_expect_success 'cookies stored in http.cookiefile when http.savecookies set' '
git config http.cookiefile cookies.txt &&
git config http.savecookies true &&
git ls-remote $HTTPD_URL/smart_cookies/repo.git master &&
tail -3 cookies.txt > cookies_tail.txt
test_cmp expect_cookies.txt cookies_tail.txt
'
# create an arbitrary number of tags, numbered from tag-$1 to tag-$2
create_tags () {
rm -f marks &&
for i in $(test_seq "$1" "$2")
do
# don't use here-doc, because it requires a process
# per loop iteration
echo "commit refs/heads/too-many-refs-$1" &&
echo "mark :$i" &&
echo "committer git <git@example.com> $i +0000" &&
echo "data 0" &&
echo "M 644 inline bla.txt" &&
echo "data 4" &&
echo "bla" &&
# make every commit dangling by always
# rewinding the branch after each commit
echo "reset refs/heads/too-many-refs-$1" &&
echo "from :$1"
done | git fast-import --export-marks=marks &&
# now assign tags to all the dangling commits we created above
tag=$(perl -e "print \"bla\" x 30") &&
sed -e "s|^:\([^ ]*\) \(.*\)$|\2 refs/tags/$tag-\1|" <marks >>packed-refs
}
test_expect_success 'create 50,000 tags in the repo' '
(
cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
create_tags 1 50000
)
'
test_expect_success EXPENSIVE 'clone the 50,000 tag repo to check OS command line overflow' '
git clone $HTTPD_URL/smart/repo.git too-many-refs &&
(
cd too-many-refs &&
test $(git for-each-ref refs/tags | wc -l) = 50000
)
'
test_expect_success EXPENSIVE 'http can handle enormous ref negotiation' '
git -C too-many-refs fetch -q --tags &&
(
cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
create_tags 50001 100000
) &&
git -C too-many-refs fetch -q --tags &&
git -C too-many-refs for-each-ref refs/tags >tags &&
test_line_count = 100000 tags
'
stop_httpd
test_done