Generally I need a size-efficient data structure similar to std::priority_queue but stable (preserving the order of insertion).
By adding just 4 bytes to the object I could have 1 byte serving as priority and 3 bytes as a counter to keep the order of insertion. The closest competitor – std::forward_list – would add an overhead of 5 bytes (in practice 8, due to alignment): one for priority and 4 for the link (on a 32-bit architecture). It would also be slower than a max-heap, due to the traversal needed when adding a new element.
The problem with a counter is the container's behavior when the counter overflows. In Boost the counter is 64 bits long, and when it overflows the container throws an exception ( http://www.boost.org/doc/libs/1_49_0/doc/html/heap/concepts.html#heap.concepts.stability ).
One solution to this problem that comes to mind is resetting the counter whenever the queue gets empty, but this is only a partial solution to the problem.
Is there a generic (and optimal – without traversing the whole heap each time) way this could be solved? I planned to use std::push_heap() and std::pop_heap() from <algorithm> with an array of raw storage, so I have access to the "internals" of the entries. If possible, I'd prefer to stick to these standard functions instead of implementing my own heap (or any other data structure) that would be stable.
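To make the intended layout concrete, here is a minimal sketch of such a stable max-heap built on std::push_heap/std::pop_heap. The names (Entry, heapLess, push, pop) and the packed 32-bit key are illustrative assumptions, and the 24-bit counter still has the overflow problem described above:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Sketch (not a final design): each entry carries one 32-bit key --
// 1 byte of priority in the top bits and a 24-bit insertion counter in
// the low bits -- so std::push_heap/std::pop_heap can be used directly.
struct Entry {
    std::uint32_t key;  // [31:24] priority, [23:0] insertion counter
    int payload;        // stands in for the real message
};

// Max-heap comparison: higher priority wins; among equal priorities the
// LOWER (older) counter wins, which preserves FIFO (stable) order.
inline bool heapLess(const Entry& a, const Entry& b) {
    const std::uint32_t pa = a.key >> 24, pb = b.key >> 24;
    if (pa != pb)
        return pa < pb;
    return (a.key & 0xffffffu) > (b.key & 0xffffffu);
}

std::uint32_t counter = 0;  // 24-bit counter; overflow is the open problem

void push(std::vector<Entry>& q, std::uint8_t priority, int payload) {
    q.push_back({(std::uint32_t{priority} << 24) | (counter++ & 0xffffffu),
                 payload});
    std::push_heap(q.begin(), q.end(), heapLess);
}

Entry pop(std::vector<Entry>& q) {  // precondition: !q.empty()
    std::pop_heap(q.begin(), q.end(), heapLess);
    Entry e = q.back();
    q.pop_back();
    return e;
}
```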
EDIT:
As requested in a comment, I’m adding some more information about my use case here.
I need this data structure to implement a message queue in the RTOS I’m developing ( https://github.com/DISTORTEC/distortos ). This message queue must follow all POSIX requirements, and one of them is “stability” – new entries with the same priority must be placed AFTER older entries:
From http://pubs.opengroup.org/onlinepubs/9699919799/functions/mq_send.html# :
A message shall be inserted after other messages in the queue, if any, with equal msg_prio.
This is an RTOS for embedded microcontrollers, so I cannot give any specific usage scenario, because there will be many. Some devices will run for a second and shutdown, other devices may run for years without reboot. As I’m aiming for a design with no limitations, I’m mostly interested in a solution that works in all cases, without limits like the one imposed by simple counter – which fails when the counter overflows.
Because this is for a microcontroller, I'd really prefer the solution to be size-efficient, so solutions like "use a 128-bit counter" are not acceptable.
Generally I see two options – using singly-linked list or using max-heap (as I originally intended).
Using a singly-linked list would add 5 bytes (in practice 8, due to alignment) to each stored object – 1 for priority, 4 for the link. This solution is stable "by design", but insertion of an object can be slow when there are many objects in the list, because many nodes have to be traversed to find the spot for insertion.
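For comparison, the list option could look like this intrusive sketch (Node and insert are illustrative names; insertion is O(n) in the number of stored objects):

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the list option: an intrusive singly-linked list kept sorted
// by descending priority. A new node goes AFTER all nodes of equal or
// higher priority, which makes the order stable by construction -- no
// counter, and therefore no overflow problem.
struct Node {
    std::uint8_t priority;
    Node* next = nullptr;
};

// Returns the (possibly new) head of the list.
Node* insert(Node* head, Node* n) {
    if (head == nullptr || head->priority < n->priority) {
        n->next = head;
        return n;
    }
    Node* prev = head;
    while (prev->next != nullptr && prev->next->priority >= n->priority)
        prev = prev->next;  // O(n) traversal -- the cost noted above
    n->next = prev->next;
    prev->next = n;
    return head;
}
```

Extraction from the head is then a single pointer update, which is the "instantaneous" pop mentioned below.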
Using a heap I could try to limit the overhead to 4 bytes – 1 for priority, 3 for a counter. This option could be faster than the list, but requires a solution for counter overflow. I see several options here (several can be combined):
- use a 56-bit counter (total overhead would be 8 bytes per object),
- reset the counter whenever the queue gets empty (only a partial solution),
- when an overflow happens, reset the counter, traverse the whole heap and update each visited entry's counter to a "low" value.
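The last option could be sketched as follows; this particular version uses O(n) temporary space for a pointer array and O(n log n) time, and all names (Entry, renumber) are illustrative:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct Entry {
    std::uint8_t priority;
    std::uint32_t counter;  // 24 bits in the real layout
};

// Renumbering on overflow: reassign every live counter to 0..n-1 while
// keeping the entries' relative age order. The heap property is
// untouched, because the relative order of the counters is preserved.
std::uint32_t renumber(std::vector<Entry>& heap) {
    std::vector<Entry*> byAge;  // O(n) temporary working space
    byAge.reserve(heap.size());
    for (auto& e : heap)
        byAge.push_back(&e);
    std::sort(byAge.begin(), byAge.end(),
              [](const Entry* a, const Entry* b) {
                  return a->counter < b->counter;  // oldest first
              });
    std::uint32_t next = 0;
    for (auto* e : byAge)
        e->counter = next++;
    return next;  // first free counter value after renumbering
}
```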
Of course I know I could just ignore the overflow, but if there is a real solution, I’d like to implement it.
As there seems to be no simple, robust and deterministic solution to the counter overflow problem, I'm starting to lean towards the singly-linked list – I expect there won't be many entries in the message queue (most microcontrollers have really limited RAM), so the speed benefit of using a heap would probably be negligible, especially once I account for all the copying during insertion/deletion, while the contents of the singly-linked list would be mostly static in this regard. And there's also "instantaneous" extraction from the head of the list…
I don’t think there’s any good way to fix your overflow problem with the constraints given.
Renumbering your heap periodically means a stop-the-world housekeeping operation, which I reckon needs O(n log n) time without extra allocation, or perhaps O(n) time with O(n) temporary space.
Note that if O(n log n) stop-the-world operations aren’t acceptable – you did say it’s an RTOS – then you need to make sure there’s always room to allocate the temporary working space. If you’re reserving that anyway, you could equally spend twice as much space overhead per object on a structure that doesn’t need stop-the-world housekeeping operations in the first place.
The simplest solution I can think of is to amortize that overhead over multiple objects, rather than adding a pointer to each instance. So, let’s split it into two issues:
- FIFO ordering of elements with the same priority: you can just use a FIFO-ordered container with constant-time pop_front and push_back – this is a deque. You could try std::deque, and write a custom implementation if you need finer control over the block size, for example. The per-block overhead is amortized over all the objects in that block, so you can control the tradeoff between slack allocation and per-object overhead.
- Key-ordered container of priority levels:
  - Since 8 bits are enough to store your priority, an array of 256 levels (deques) would be the simplest solution.
  - If that wastes too much slack space for you, a regular priority queue will still work, but pop_front/pop_heap should pop from the front priority-level object, and only pop that level from the overall queue when it is empty. Note that you can write this logic as a thin wrapper around the existing pop_heap to re-use its heap logic, but the same isn't true of push_heap – in fact that will end up slower in this scheme (you'll have to search for the insertion point to see if the level exists, and then sift it up if it doesn't).
  - This should have notionally constant-time everything, so it is asymptotically better than a heap; whether it's fast enough in practice will depend on the platform, cache behaviour, and how well you can tune the allocators, block sizes, etc.
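A minimal sketch of the simplest variant (the array of 256 deques); StableQueue and Message are illustrative names, and the linear scan in pop() could be replaced by the heap-of-nonempty-levels refinement:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

struct Message { int payload; };

// One deque per priority level: push_back/pop_front per level makes the
// queue stable by construction, with no counter and no overflow.
class StableQueue {
    std::array<std::deque<Message>, 256> levels_;
    std::size_t size_ = 0;
public:
    void push(std::uint8_t priority, Message m) {
        levels_[priority].push_back(m);  // after equal-priority messages
        ++size_;
    }
    Message pop() {                       // precondition: !empty()
        for (int p = 255; p >= 0; --p) {  // highest priority first
            if (!levels_[p].empty()) {
                Message m = levels_[p].front();
                levels_[p].pop_front();
                --size_;
                return m;
            }
        }
        return {};                        // unreachable when non-empty
    }
    bool empty() const { return size_ == 0; }
};
```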
Space overhead is a few bytes (something like 2-4 pointers) per distinct priority level, plus hopefully 1 byte or fewer per element (depending on the deque block size and number of elements per block). So, this will only save space if you have a sufficiently large number of elements per priority level. However, it can never encounter your overflow problem, doesn’t require any stop-the-world cleanup, and doesn’t degrade significantly with time or size.