urlmatch: add url_normalize_pattern() helper

In a following commit, we will need to normalize a URL glob pattern
(which may contain '*' in the host portion) and extract its component
offsets (host, path, etc.) for separate matching. Let's export a
dedicated helper function url_normalize_pattern() for that purpose.

It works like url_normalize(), but passes allow_globs=true to the
internal url_normalize_1(), so that '*' characters in the host are
accepted rather than rejected.
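
As a rough illustration of this wrapper pattern only, here is a minimal
standalone sketch. The sketch_host_ok*() names and the simplified host
check are hypothetical and are not part of urlmatch.c; the real
url_normalize_1() also handles the scheme, port, path and so on.

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for url_normalize_1(): accepts a non-empty host
 * made of alphanumerics, '-' and '.', and accepts '*' only when
 * allow_globs is set. This is an assumption made for illustration, not
 * the validation that urlmatch.c actually performs.
 */
static bool sketch_host_ok_1(const char *host, bool allow_globs)
{
	if (!*host)
		return false;
	for (; *host; host++) {
		unsigned char c = (unsigned char)*host;

		if (isalnum(c) || c == '-' || c == '.')
			continue;
		if (allow_globs && c == '*')
			continue;
		return false;
	}
	return true;
}

/* Strict wrapper, analogous to url_normalize(). */
static bool sketch_host_ok(const char *host)
{
	return sketch_host_ok_1(host, false);
}

/* Glob-tolerant wrapper, analogous to url_normalize_pattern(). */
static bool sketch_host_ok_pattern(const char *host)
{
	return sketch_host_ok_1(host, true);
}
```

In this sketch, "*.example.com" is rejected by the strict wrapper and
accepted by the pattern wrapper, while both wrappers share a single
validation routine.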

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Christian Couder
2026-05-27 16:08:15 +02:00
committed by Junio C Hamano
parent ee7ea4907c
commit 58880c82fe
2 changed files with 17 additions and 0 deletions

@@ -441,6 +441,11 @@ char *url_normalize(const char *url, struct url_info *out_info)
	return url_normalize_1(url, out_info, false);
}

char *url_normalize_pattern(const char *url, struct url_info *out_info)
{
	return url_normalize_1(url, out_info, true);
}

char *url_parse(const char *url_orig, struct url_info *out_info)
{
	struct strbuf url;

@@ -37,6 +37,18 @@ struct url_info {
char *url_normalize(const char *, struct url_info *);
char *url_parse(const char *, struct url_info *);

/*
 * Like url_normalize(), but also allows '*' glob characters in the host
 * portion. Use this when normalizing URL patterns from user configuration.
 *
 * Note that '*' is a valid path character per RFC 3986 (as a sub-delim),
 * so glob patterns using '*' in the path are also accepted.
 *
 * Returns a newly allocated normalized string and fills out_info if
 * non-NULL, or NULL if the pattern is invalid.
 */
char *url_normalize_pattern(const char *url, struct url_info *out_info);

struct urlmatch_item {
	size_t hostmatch_len;
	size_t pathmatch_len;