I am writing a C API for retrieving data from a hardware device. The data will be returned as a string with approximately 30 bytes of text per item. The problem is that there may be any number of items: there could be just 10, or there could be 10,000. So the returned data could range from 10 x 30 = 300 bytes to 10,000 x 30 = 300,000 bytes.
Initially, the function to get the data was like this:
int get_devicelist(char* devicelist);
So the onus is on the caller to know what they are doing and supply a big enough buffer. The trouble is that software using the API could end up in many end-user applications, so the programmer has no end-user-specific knowledge when writing code against the API. The only option then is to supply a buffer (on the stack or on the heap) that is large enough to hold the largest possible amount of data returned.
So for that reason I am thinking about changing the API as follows:
char* get_devicelist();
void free_devicelist(char* devlist);
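For example, in the implementation:
char* get_devicelist(void) {
    size_t n = get_devices_size();        /* number of devices */
    char* devlist = malloc(n * 30 + 1);   /* about 30 bytes per item, plus a terminator */
    if (devlist) {
        /* populate devlist */
    }
    return devlist;                       /* NULL if the allocation failed */
}
void free_devicelist(char* devlist) {
    free(devlist);
}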
Or should I just tell users to call free on devlist when it is no longer needed?
I would change the API code to get the number of devices and then allocate an appropriate amount of memory for devicelist. The only downside is that the user is then required to call free_devicelist to free the allocated memory.
Is the second API better? Is there an alternative way?
2
The cleanest way to do this is to essentially expose an iterator which the caller can use to pull one fixed-size item out at a time.
In C, this can be done like the opendir/readdir/closedir POSIX interface.
The first call allows the library to allocate some opaque iterator context, the second returns the current value from the sequence and advances the iterator, and the last destroys the iterator context.
So, your interface would look something like:
struct DEVICELIST; /* this is opaque to the caller */
struct Device {
    char description[30];
    /* or whatever data you want to return */
};
struct DEVICELIST *open_devicelist(void);
int read_device(struct DEVICELIST *, struct Device *current);
void close_devicelist(struct DEVICELIST *);
and your client code like:
struct Device dev;
struct DEVICELIST *dl = open_devicelist();
if (dl == NULL) return;
while (read_device(dl, &dev) == 0) {
    /* data has been copied into dev here, to use for whatever.
       Note it will be overwritten by the next device when we loop round.
    */
}
close_devicelist(dl);
If you want something less complex (I know you’re just returning a string right now), your second suggestion is much better than the original.
Relying on the user to call your cleanup function is perfectly reasonable, and obviously no worse than the above.
(Note, if you look up the readdir call I mentioned, I’m actually using something more like readdir_r. The original non-reentrant version is a bit icky).
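For illustration, here is a minimal sketch of what the library side of such an iterator might look like. The fake_devices table and the layout of struct DEVICELIST are assumptions made up for this example; a real implementation would query the hardware instead, and the struct definition would live in the library's private source file so that it stays opaque to callers.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct Device {
    char description[30];
};

/* Private to the library in real code; callers only see the forward declaration. */
struct DEVICELIST {
    size_t next;   /* index of the next device to hand out */
    size_t count;  /* total number of devices in this listing */
};

/* Hypothetical stand-in for the data read from the hardware. */
static const char *const fake_devices[] = { "sensor-a", "sensor-b", "sensor-c" };

struct DEVICELIST *open_devicelist(void)
{
    struct DEVICELIST *dl = malloc(sizeof *dl);
    if (dl != NULL) {
        dl->next = 0;
        dl->count = sizeof fake_devices / sizeof fake_devices[0];
    }
    return dl;   /* NULL if the allocation failed */
}

/* Returns 0 and fills *current while devices remain, -1 once the list is exhausted. */
int read_device(struct DEVICELIST *dl, struct Device *current)
{
    if (dl == NULL || current == NULL || dl->next >= dl->count)
        return -1;
    /* snprintf truncates if needed and always null-terminates the description */
    snprintf(current->description, sizeof current->description,
             "%s", fake_devices[dl->next]);
    dl->next++;
    return 0;
}

void close_devicelist(struct DEVICELIST *dl)
{
    free(dl);   /* free(NULL) is a no-op, so closing a failed open is harmless */
}
```

Because the caller owns the struct Device it passes in, the library never has to hand out memory that the caller must remember to free; only the iterator context needs a matching close call.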
3
I’d recommend looking at how your OS’s APIs handle this. For the following example, I refer to the Windows API.
A number of functions with variable buffer outputs follow these rules:
int GetSomeString( char * pBuf, int nSize);
Return value:
- If called with pBuf == NULL and nSize == 0, the function returns the required size of the buffer in characters, including the terminating null character.
- If called with pBuf != NULL, the nSize parameter should specify the size of *pBuf in characters, including the terminating null character. The function will fill the buffer with up to nSize chars and return the number of characters written to the buffer.
The user is responsible for allocating and freeing the memory.
As a user of the API, I personally find that quite handy. Also, because it is a common pattern, it will be easily understood.
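As a rough sketch of how this two-call pattern could be applied to the device list from the question: the get_devicelist signature below, the device_data string, and the fetch_devicelist helper are all assumptions for illustration, not part of any real API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical device data standing in for what the hardware would report. */
static const char device_data[] = "sensor-a\nsensor-b\nsensor-c\n";

/*
 * With buf == NULL, returns the required buffer size in characters,
 * including the terminating '\0'.
 * Otherwise copies at most size - 1 characters into buf, terminates it,
 * and returns the number of characters written, including the '\0'.
 */
int get_devicelist(char *buf, int size)
{
    int needed = (int)sizeof device_data;   /* sizeof counts the '\0' */
    int n;

    if (buf == NULL)
        return needed;   /* size query */
    if (size <= 0)
        return 0;        /* no room for anything, not even the terminator */

    n = (size < needed ? size : needed) - 1;   /* characters to copy */
    memcpy(buf, device_data, (size_t)n);
    buf[n] = '\0';
    return n + 1;
}

/* Caller side: ask for the size, allocate, then fetch. The caller frees the result. */
char *fetch_devicelist(void)
{
    int size = get_devicelist(NULL, 0);
    char *buf = malloc((size_t)size);
    if (buf != NULL)
        get_devicelist(buf, size);
    return buf;
}
```

One thing to keep in mind with this pattern: if the set of devices can change between the size query and the second call, the caller should compare the second call's return value with the size it allocated and retry if the data was truncated.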
1
The iterator approach mentioned by Useless is pretty good, though it may be enhanced by having the “read more” function allow the caller to indicate how many items it is ready to accept. A limitation with that approach is that it may not be possible for a method to return a “snapshot” without tying up resources which could be left dangling if the caller fails to call a “close” method.
An alternative approach is to define a structure:
struct ThingProcessor {
int (*process)(struct ThingProcessor*, struct Thing *, int n);
};
and then have a method:
void getThings(struct ThingProcessor *proc);
which will read groups of one or more items and, for each group read into an array struct Thing someThings[n], invoke proc->process(proc, someThings, n) [if the supplied method returns a negative value, getThings should exit early]. Provided that the supplied methods always return, the getThings() method will be able to ensure that any required resources get released.
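A minimal sketch of this callback approach follows. The contents of struct Thing, the batch size, the fixed total of ten items, and the SumProcessor type are all assumptions invented for the example; only the ThingProcessor struct and the getThings signature come from the description above.

```c
struct Thing {
    int id;   /* hypothetical payload */
};

struct ThingProcessor {
    int (*process)(struct ThingProcessor *, struct Thing *, int n);
};

/* Hands out ids 0..9 in groups of up to 4; stops early on a negative return. */
void getThings(struct ThingProcessor *proc)
{
    enum { TOTAL = 10, BATCH = 4 };
    struct Thing someThings[BATCH];
    int next = 0;

    while (next < TOTAL) {
        int n = 0;
        while (n < BATCH && next < TOTAL)
            someThings[n++].id = next++;
        if (proc->process(proc, someThings, n) < 0)
            break;   /* the caller asked to stop */
    }
    /* any resources acquired for the listing would be released here */
}

/* Embedding the processor as the first member lets the callback recover its own state. */
struct SumProcessor {
    struct ThingProcessor base;
    int count;
    int sum;
};

static int sum_things(struct ThingProcessor *self, struct Thing *things, int n)
{
    /* Valid because base is the first member of struct SumProcessor. */
    struct SumProcessor *sp = (struct SumProcessor *)self;
    for (int i = 0; i < n; i++) {
        sp->sum += things[i].id;
        sp->count++;
    }
    return 0;   /* non-negative: keep going */
}
```

A caller would initialise a struct SumProcessor with sum_things as its process pointer and pass the address of its base member to getThings. Since control always returns to getThings after each batch, cleanup does not depend on the caller remembering a separate close call.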