From b92387cd55f9fa66edd6e646ec5c17be8d72609b Mon Sep 17 00:00:00 2001 From: Jeff King Date: Tue, 12 May 2026 12:26:19 -0400 Subject: [PATCH] http: handle absolute-path alternates from server root When a dumb http server reports alternates with an absolute path, we try to paste that onto the root of the URL we're trying to fetch from. So if we go to "http://example.com/path/to/child.git" and it tells us about an alternate at "/parent.git", we'll hit "http://example.com/parent.git". But there's a bug in computing the base when the URL does not have any path component at all, like "http://example.com". When looking for the first slash after the host, strchr() returns NULL, and we compute a nonsense value for the length of the host portion. And then when we use that length to copy the base of the URL into a strbuf, we're likely to fail. The security implications are minimal here. We store the nonsense length ("serverlen") as an int, so on a 64-bit system it may effectively be anything (it is zero minus a 64-bit heap pointer, then truncated to 32-bits and stuffed into a signed value). When we feed that length to strbuf_add(), it is cast into a size_t and one of four things will happen: 1. If serverlen was negative, it will turn into a very large positive value and strbuf_add() will fail to allocate, ending the program. Ditto if serverlen was positive but just very large. This doesn't really get an attacker anything; the victim will just fail to clone their evil repo. 2. If serverlen was small enough, we'll successfully extend the target strbuf, and then copy an arbitrary set of bytes from "base". And then one of these is true: a. That set of bytes is much larger than the length of the "base" string. This is an out-of-bounds read, but there's no out-of-bounds write, since the strbuf code both allocates and copies using the same size_t. This is likely to cause a segfault as we try to read unmapped pages of memory. b. Like (2a), but if the set of bytes is small enough we might not segfault. We might read random memory from the process and copy it into the "target" strbuf. What happens then? We know that "base" ends with a NUL terminator, which will be copied into "target" as well. So even though target.len might be 1000 bytes (or whatever), when interpreted as a NUL-terminated string, target.buf is still the exact same string as "base". And that's all we ever do with target: pass it around as a C string, and then eventually strbuf_detach() it to become a C string. So even though there was arbitrary memory copied into the strbuf, we never access it. c. The other interesting case is when serverlen is actually _shorter_ than the length of base. And there we truncate the string. Probably in a way that makes it totally invalid, but if you were very unlucky you could turn something like: http://victim.com.evil.domain:8000 into: http://victim.com Which looks like the start of a redirect attack, except that the attacker could just have written "http://victim.com" in the first place! Either way we feed it to is_alternate_allowed(), which is where we check redirect and protocol rules. I think we can just treat this like a regular bug. And it's quite a weird setup in the first place, as it implies that the root of the web server is serving a repository (i.e., that you can get something useful from "http://example.com/info/refs"). The bug has been there since b3661567cf ([PATCH] Add support for alternates in HTTP, 2005-09-14) without anybody noticing. I kind of doubt anybody really cares about making this work, but it's easy enough to do so: the host-portion of the URL ends at either the first slash or the end-of-string. So we can just replace strchr() with strchrnul(). The test setup is a little gross, as we take over the httpd document root by shoving our bare-repo components into it. But it demonstrates the problem and shows that our solution actually allows the alternate to function, if the server is configured to allow it. Reported-by: slonkazoid Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- http-walker.c | 2 +- t/t5550-http-fetch-dumb.sh | 20 ++++++++++++++++++++ 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/http-walker.c b/http-walker.c index e886e64866..badf07e499 100644 --- a/http-walker.c +++ b/http-walker.c @@ -268,7 +268,7 @@ static void process_alternates_response(void *callback_data) */ const char *colon_ss = strstr(base,"://"); if (colon_ss) { - serverlen = (strchr(colon_ss + 3, '/') + serverlen = (strchrnul(colon_ss + 3, '/') - base); okay = 1; } diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh index ed0ad66fad..802400b404 100755 --- a/t/t5550-http-fetch-dumb.sh +++ b/t/t5550-http-fetch-dumb.sh @@ -530,4 +530,24 @@ test_expect_success 'dumb http can fetch index v1' ' git -C idx-v1 fsck ' +test_expect_success 'absolute-path alternate when url has no path' ' + src=$HTTPD_DOCUMENT_ROOT_PATH/repo.git && + alt=absolute-alt.git && + git clone --bare --shared "$src" "$alt" && + + # Our repo has an alternate pointing to the absolute filesystem path, + # but that will not make any sense to an http client. So we will + # manually give it the equivalent path that the http server will + # understand. + echo "/dumb/repo.git/objects" >"$alt/objects/info/http-alternates" && + + # Now make our alt repository available at the root of the http + # server without any path (i.e., just http://localhost:1234). + git -C "$alt" update-server-info && + mv absolute-alt.git/* "$HTTPD_DOCUMENT_ROOT_PATH" && + + git -c http.followRedirects=true clone "$HTTPD_URL" alt-clone.git 2>err && + test_grep "adding alternate object store: $HTTPD_URL/dumb/repo.git" err +' + test_done