versioncmp: use earliest-longest contained suffix to determine sorting order

When comparing tagnames, it is possible that a tagname contains more
than one of the configured prerelease suffixes around the first
different character.  After fixing a bug in the previous commit such a
tagname is sorted according to the contained suffix which comes first
in the configuration.  This is, however, not quite the right thing to
do in the following corner cases:

  1.   $ git -c versionsort.suffix=-bar
             -c versionsort.suffix=-foo-baz
             -c versionsort.suffix=-foo-bar
             tag -l --sort=version:refname 'v1*'
       v1.0-foo-bar
       v1.0-foo-baz

     The suffix of the tagname 'v1.0-foo-bar' is clearly '-foo-bar',
     so it should be listed last.  However, as it also contains '-bar'
     around the first different character, it is listed first instead,
     because that '-bar' suffix comes first the configuration.

  2. One of the configured suffixes starts with the other:

       $ git -c versionsort.prereleasesuffix=-pre \
             -c versionsort.prereleasesuffix=-prerelease \
             tag -l --sort=version:refname 'v2*'
       v2.0-prerelease1
       v2.0-pre1
       v2.0-pre2

     Here the tagname 'v2.0-prerelease1' should be the last.  When
     comparing 'v2.0-pre1' and 'v2.0-prerelease1' the first different
     characters are '1' and 'r', respectively.  Since this first
     different character must be part of the configured suffix, the
     '-pre' suffix is not recognized in the first tagname.  OTOH, the
     '-prerelease' suffix is properly recognized in
     'v2.0-prerelease1', thus it is listed first.

Improve version sort in these corner cases, and

  - look for a configured prerelease suffix containing the first
    different character or ending right before it, so the '-pre'
    suffixes are recognized in case (2).  This also means that
    when comparing tagnames 'v2.0-pre1' and 'v2.0-pre2',
    swap_prereleases() would find the '-pre' suffix in both, but then
    it will return "undecided" and the caller will do the right thing
    by sorting based in '1' and '2'.

  - If the tagname contains more than one suffix, then give precedence
    to the contained suffix that starts at the earliest offset in the
    tagname to address (1).

  - If there are more than one suffixes starting at that earliest
    position, then give precedence to the longest of those suffixes,
    thus ensuring that in (2) the tagname 'v2.0-prerelease1' won't be
    sorted based on the '-pre' suffix.

Add tests for these corner cases and adjust the documentation
accordingly.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
SZEDER Gábor
2016-12-08 15:24:00 +01:00
committed by Junio C Hamano
parent b8231660fa
commit 51acfa9db5
3 changed files with 60 additions and 10 deletions

View File

@@ -27,14 +27,18 @@ static int initialized;
/*
* off is the offset of the first different character in the two strings
* s1 and s2. If either s1 or s2 contains a prerelease suffix containing
* that offset, then that string will be forced to be on top.
* that offset or a suffix ends right before that offset, then that
* string will be forced to be on top.
*
* If both s1 and s2 contain a (different) suffix around that position,
* their order is determined by the order of those two suffixes in the
* configuration.
* If any of the strings contains more than one different suffixes around
* that position, then that string is sorted according to the contained
* suffix which comes first in the configuration.
* suffix which starts at the earliest offset in that string.
* If more than one different contained suffixes start at that earliest
* offset, then that string is sorted according to the longest of those
* suffixes.
*
* Return non-zero if *diff contains the return value for versioncmp()
*/
@@ -44,27 +48,40 @@ static int swap_prereleases(const char *s1,
int *diff)
{
int i, i1 = -1, i2 = -1;
int start_at1 = off, start_at2 = off, match_len1 = -1, match_len2 = -1;
for (i = 0; i < prereleases->nr; i++) {
const char *suffix = prereleases->items[i].string;
int j, start, suffix_len = strlen(suffix);
int j, start, end, suffix_len = strlen(suffix);
if (suffix_len < off)
start = off - suffix_len + 1;
start = off - suffix_len;
else
start = 0;
for (j = start; j <= off; j++)
if (i1 == -1 && starts_with(s1 + j, suffix)) {
end = match_len1 < suffix_len ? start_at1 : start_at1-1;
for (j = start; j <= end; j++)
if (starts_with(s1 + j, suffix)) {
i1 = i;
start_at1 = j;
match_len1 = suffix_len;
break;
}
for (j = start; j <= off; j++)
if (i2 == -1 && starts_with(s2 + j, suffix)) {
end = match_len2 < suffix_len ? start_at2 : start_at2-1;
for (j = start; j <= end; j++)
if (starts_with(s2 + j, suffix)) {
i2 = i;
start_at2 = j;
match_len2 = suffix_len;
break;
}
}
if (i1 == -1 && i2 == -1)
return 0;
if (i1 == i2)
/* Found the same suffix in both, e.g. "-rc" in "v1.0-rcX"
* and "v1.0-rcY": the caller should decide based on "X"
* and "Y". */
return 0;
if (i1 >= 0 && i2 >= 0)
*diff = i1 - i2;
else if (i1 >= 0)