I’m trying to construct a random, yet predictable cronjob schedule for a monthly and daily cronjob based upon arbitrary user-provided data. The daily and monthly cronjobs should run at different hours.
The goal is that if a user provides the same input repeatedly, the cronjobs will always run at the exact same time. Which data yields what schedule should “feel” random, but doesn’t have to be cryptographically random. However, if the user just slightly changes the input, the cronjobs should run at totally different (i.e. not similar) times. If the user then changes the input back to the original value, the cronjobs should run at the exact same time as before.
There’s no way of storing the resulting cronjob schedule persistently, i.e. it must be constructed strictly from the user data. The user data is arbitrary, i.e. we mustn’t expect it to have a certain form or length: it can be an empty string, or a string with 1 gigabyte of random data.
The cronjob schedule should be constructed using Bash with just Busybox tools installed.
My idea was the following:
- The schedule of the monthly cronjob might be represented by an integer representing the nth minute of the month. So, n=123 represents the 123rd minute of the month, or 2:03 on the 1st day of the month (cronjob schedule 3 2 1 * *). In contrast, n=12345 represents the 12,345th minute of the month, or 13:45 on the 9th day of the month (cronjob schedule 45 13 9 * *). Since the cronjob wouldn’t run in February otherwise, we accept 23:59 on the 28th day max. Thus we need an integer between 0 and 40319 (= 28 days * 24 hours * 60 minutes - 1).
  For this we might create a __crontab_monthly() function, accepting an arbitrary integer. Since the input integer isn’t necessarily within the expected range, we first perform a modulo operation of 40320. We then perform modulo operations and divisions to get the respective day, hour, and minute. Lastly we concatenate the cronjob schedule.
  The same principle also works for the daily cronjob, just limited to an integer between 0 and 1439 (= 24 hours * 60 minutes - 1). We might create a similar __crontab_daily() function for that.
  I don’t really have an idea yet about how to ensure that the monthly and daily cronjobs run at different hours, besides just offsetting the value by some magic value… Any ideas?
- However, for this to work we first need to calculate a random, yet persistent integer from the user data to feed into our __crontab_{monthly,daily}() functions. Since we must accept arbitrary user data, my idea was to first calculate the md5 hash of the user data. This ensures that the result is perceived as random (tiny changes in the input yield a totally different result), yet it is predictable and yields consistent results for the same data.
  The reason why an md5 hash might be a good starting point is that an md5 hash is just the hexadecimal string representation of a 128-bit number. However, we can’t do math with a 128-bit number in Bash. I thus had the idea of uniformly condensing each 128-bit hash into a 64-bit integer. My approach was to first split the hash into two 64-bit slices, perform modulo operations of 2^32 to condense the slices down to 32 bit each and then concatenate the two binary numbers. This should yield a 64-bit signed integer which can then be fed into the __crontab_{monthly,daily}() functions.
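To make the reduction step more concrete, here is a minimal sketch of how I understand it. It relies on the observation that taking a 64-bit slice modulo 2^32 is the same as keeping the last 8 hex digits of that slice, so no arithmetic on the full slices is needed. The function name __hash_to_int is made up for illustration, and clearing the topmost bit is an extra assumption on my side so that later modulo operations never see a negative number; I haven't verified this on Busybox.

```shell
#!/usr/bin/env bash
# Hypothetical helper: condense the md5 hash of "$1" into a non-negative integer.
__hash_to_int() {
    local HASH
    HASH="$(md5sum <<< "$1" | cut -d ' ' -f 1)"
    # The low 32 bits of each 64-bit slice are simply its last 8 hex digits.
    # Assumption: mask the topmost bit so the combined value stays non-negative.
    local HIGH=$(( 0x${HASH:8:8} & 0x7FFFFFFF ))
    local LOW=$(( 0x${HASH:24:8} ))
    # Concatenate both values: HIGH fills bits 32..62, LOW fills bits 0..31.
    echo $(( (HIGH << 32) | LOW ))
}

__hash_to_int "some user data"
```

Because the result is never negative, the sign handling inside the schedule functions would become unnecessary with this variant.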
I came up with the following solution so far. The user input is stored inside the $USER_DATA variable. I’m rather certain that the __crontab_{monthly,daily}() functions do what they are supposed to do, but the __crontab_reference() function is more tricky… I’m just not sure whether the calculations are correct, binary isn’t “my thing”. Can someone help please?
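__crontab_reference() {
    # md5 hash of the user data as a 32-digit hex string
    local HASH="$(md5sum <<< "$1" | cut -d ' ' -f 1)"
    # reduce both 16-digit halves, print them as hex, and concatenate the results
    echo $(( 0x$(printf '%x\n' $(( 0x${HASH:0:16} % 2147483648 )))$(printf '%x\n' $(( 0x${HASH:16:16} % 2147483648 ))) ))
}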
__crontab_daily() {
    local NUMBER=$(( "$1" % 1440 ))
    # absolute value: multiply by the sign of NUMBER
    NUMBER=$(( NUMBER * ((NUMBER>0) - (NUMBER<0)) ))
    local HOUR=$(( NUMBER / 60 ))
    local MINUTE=$(( NUMBER % 60 ))
    echo "$MINUTE $HOUR * * *"
}
__crontab_monthly() {
    local NUMBER=$(( "$1" % 40320 ))
    # absolute value: multiply by the sign of NUMBER
    NUMBER=$(( NUMBER * ((NUMBER>0) - (NUMBER<0)) ))
    local DAY=$(( NUMBER / 1440 + 1 ))
    local HOUR=$(( NUMBER % 1440 / 60 ))
    local MINUTE=$(( NUMBER % 1440 % 60 ))
    echo "$MINUTE $HOUR $DAY * *"
}
CRONTAB_REFERENCE="$(__crontab_reference "$USER_DATA")"
echo "Daily cronjob schedule: $(__crontab_daily "$CRONTAB_REFERENCE")"
echo "Monthly cronjob schedule: $(__crontab_monthly "$CRONTAB_REFERENCE")"
For USER_DATA="" this yields:
Daily cronjob schedule: 36 1 * * *
Monthly cronjob schedule: 36 1 20 * *
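Regarding the still open requirement that both jobs run at different hours: one idea I could imagine, instead of a magic offset, is to derive the daily hour relative to the monthly hour by shifting it by 1 to 23 hours, which can never land on the same hour again. The following is only a rough sketch under that assumption. The name __crontab_pair is hypothetical, and the function expects a non-negative reference integer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the daily schedule (line 1) and the monthly
# schedule (line 2) such that both never share the same hour.
# Assumption: "$1" is a non-negative integer.
__crontab_pair() {
    local MONTHLY=$(( $1 % 40320 ))   # minute of the month, 0..40319
    local REST=$(( $1 / 40320 ))      # remaining bits drive the daily job
    local M_DAY=$(( MONTHLY / 1440 + 1 ))
    local M_HOUR=$(( MONTHLY % 1440 / 60 ))
    local M_MINUTE=$(( MONTHLY % 60 ))
    local D_MINUTE=$(( REST % 60 ))
    # Shift by 1..23 hours, so the daily hour always differs from M_HOUR.
    local SHIFT=$(( REST / 60 % 23 + 1 ))
    local D_HOUR=$(( (M_HOUR + SHIFT) % 24 ))
    echo "$D_MINUTE $D_HOUR * * *"
    echo "$M_MINUTE $M_HOUR $M_DAY * *"
}

__crontab_pair 12345
# prints:
# 0 14 * * *
# 45 13 9 * *
```

The downside is that for small reference values the daily schedule barely varies (REST is 0 for everything below 40320), so this only makes sense with a large, hash-derived input.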
Last but not least, does anyone have ideas (ideally including PoC code) for alternative approaches?