#626 closed enhancement (fixed)
intbitset: harmonise pop() behaviour
| Reported by: | simko | Owned by: | Samuele Kaplun <samuele.kaplun@…> |
|---|---|---|---|
| Priority: | minor | Milestone: | |
| Component: | MiscUtil | Version: | |
| Keywords: | | Cc: | |
Description
As mentioned in ticket:621, we may want to improve the description
and/or behaviour of intbitset's pop(). Notably, people may be using
search engine API functions that return intbitsets, which look like
lists. For lists, pop() has an ordered meaning (it removes the last
element), while for sets it removes an arbitrary element:
In [2]: from invenio.intbitset import intbitset
In [3]: l = [1, 10, 2, 20]
In [4]: s = set(l)
In [5]: i = intbitset(l)
In [6]: l, s, i
Out[6]: ([1, 10, 2, 20], set([1, 10, 20, 2]), intbitset([1, 2, 10, 20]))
In [7]: xl, xs, xi = l.pop(), s.pop(), i.pop()
In [8]: l, s, i
Out[8]: ([1, 10, 2], set([10, 20, 2]), intbitset([2, 10, 20]))
In [9]: xl, xs, xi
Out[9]: (20, 1, 1)
The difference between lists and intbitsets is, strictly speaking, OK,
because intbitsets emulate the API of sets, so pop() removes an
arbitrary set element. However, behind the scenes intbitset's pop()
calls intBitSetGetNext(), which does an ordered removal, not an
"arbitrary" one; so we can document this better for end users.
More to the point, intbitset has a native notion of element order,
being a set of increasing integers; it resembles an ordered list of
integers in this respect. intbitset can be considered a kind of
ordered set of increasing integers that emulates the set API, thus
having some facets of lists and some facets of sets, as it were.
Therefore we may want to improve the docstring of intbitset's pop()
to reflect this mixed nature of intbitsets: (i) at least by
documenting the non-arbitrary parts, but (ii) more to the point, we
may want to change what pop() returns, so that intbitset would
resemble lists more (i.e. returning the last, not the first,
element). It would still be an "arbitrary" removal from the set API
point of view, but it would be closer to what people may be used to
from the list API, if they think of intbitsets as lists of
increasing integers.
P.S. Not thinking here about list-specific calls like pop(n).
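To make the two candidate behaviours discussed in this description concrete, here is a minimal sketch that imitates them on a plain Python set. It does not use invenio.intbitset; the helper names pop_first and pop_last are hypothetical and exist only for illustration.

```python
def pop_first(s):
    """Remove and return the smallest element.

    This mirrors what the session above shows intbitset.pop() doing today.
    """
    if not s:
        raise KeyError("pop from an empty set")
    x = min(s)
    s.remove(x)
    return x


def pop_last(s):
    """Remove and return the largest element.

    This mirrors the list-like alternative proposed in option (ii).
    """
    if not s:
        raise KeyError("pop from an empty set")
    x = max(s)
    s.remove(x)
    return x


values = [1, 10, 2, 20]
as_list = list(values)
ordered = set(values)  # stand-in for a set of increasing integers

print(as_list.pop())       # 20: a list pops its last element
print(pop_first(ordered))  # 1: smallest element first
print(pop_last(ordered))   # 20: largest element, list-like
print(sorted(ordered))     # [2, 10]
```

Both helpers are still valid "arbitrary" removals from the set API point of view; they differ only in which end of the ordering they take from.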
Change History (7)
comment:1 Changed 22 months ago by skaplun
comment:2 Changed 22 months ago by Samuele Kaplun <samuele.kaplun@…>
- Owner set to Samuele Kaplun <samuele.kaplun@…>
- Resolution set to fixed
- Status changed from new to closed
I agree with you that pop() in intbitset is not implemented as an arbitrary removal. However, for performance reasons, it might be nicer to still return the smallest element, because in order to find the last element to return, the implementation would have to scan the whole bitset.
So what if we fully document this behavior? Alternatively, one could imagine adding a flag to the .pop() function saying:
In this way the faster implementation is used by default, but one can still request the slower, more stack-friendly one.
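As an illustration only, the flag idea could look roughly like the sketch below. The class ToyBitSet and the keyword argument last are hypothetical names chosen for this example; they are not the intbitset API, and the sketch says nothing about the relative cost of the two paths in the real C implementation.

```python
class ToyBitSet:
    """Toy bitset of non-negative integers backed by a Python int.

    A stand-in for illustration, not the Invenio implementation.
    """

    def __init__(self, values=()):
        self._bits = 0
        for v in values:
            self._bits |= 1 << v

    def __iter__(self):
        # Yield members in increasing order.
        bits, index = self._bits, 0
        while bits:
            if bits & 1:
                yield index
            bits >>= 1
            index += 1

    def pop(self, last=False):
        """Remove and return one element.

        By default the smallest element is returned; with last=True
        (hypothetical flag) the largest one is returned, as for a list.
        """
        if not self._bits:
            raise KeyError("pop from an empty bitset")
        if last:
            index = self._bits.bit_length() - 1
        else:
            # bits & -bits isolates the lowest set bit.
            index = (self._bits & -self._bits).bit_length() - 1
        self._bits &= ~(1 << index)
        return index


b = ToyBitSet([1, 10, 2, 20])
print(b.pop())           # 1
print(b.pop(last=True))  # 20
print(list(b))           # [2, 10]
```

Keeping the smallest-element path as the default preserves the existing behaviour, so callers would only opt in to the list-like order when they need it.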