I have questions about how consistency levels (CL) behave in suboptimal conditions, such as a catastrophic failover or replication falling behind.
I have tried testing some of the scenarios below but have not been very successful because replication happens very quickly.
Even a manual failover appears to be more of a graceful failover than a catastrophic one. I do wish Microsoft gave you the ability to trigger a catastrophic failover.
Anyway, here are my questions.
- If you’re using Strong CL with a single write region in East US and 2 read regions in East US and West US, would Cosmos DB stop taking writes if East US went offline since it can’t replicate the data?
- If you’re using Bounded Staleness CL with a single write region in East US and 2 read regions in East US and West US, would Cosmos DB stop taking writes if the East US staleness threshold has been exceeded?
- If you’re using Strong CL with a single write region in East US and 3 read regions in East US, Central US, and West US, would Cosmos DB stop taking writes if East US went offline since it can’t replicate the data?
- If you’re using Bounded Staleness CL with a single write region in East US and 3 read regions in East US, Central US, and West US, would Cosmos DB stop taking writes if the East US staleness threshold has been exceeded?
- For Consistent Prefix CL, let’s assume you have multiple write regions, with East US as priority 1 and West US as priority 2. When writing to the same partition key in different regions, will you get incorrect results if replication is lagging? Let’s assume this scenario:
Write To East US @ 2:10 PM UTC: partitionNumber: 1, id: 1, firstName: John
Write To West US @ 2:11 PM UTC: partitionNumber: 1, id: 2, firstName: Bob
If you run statement:
SELECT * FROM c WHERE c.partitionNumber = '1'
- I assume East US would give you id 1 with firstName John, if replication has not happened?
- I assume West US would give you id 2 with firstName Bob, if replication has not happened?
- What result would the Cosmos portal give you?
- If you have multiple write regions, do all CLs become Eventual?
Let’s assume this complex scenario that would happen with any CL (except Strong) with multi-region writes.
This is just an example and not necessarily how it should be designed.
Let’s assume you have a UI where you give commands to servers. A Cosmos container has a partition key of serverName. The server calls an API at a 1-minute interval. You only want to send each command once, so you have a flag that states whether it was sent.
Let’s say these events occur:
UI does an insert in East US @ 2:10:10 PM UTC: id: 1, serverName: 'MyServer', command: 'apt update', commandSent: false
UI does an insert in West US @ 2:10:12 PM UTC: id: 1, serverName: 'MyServer', command: 'apt upgrade', commandSent: false
UI does an insert in East US @ 2:10:22 PM UTC: id: 1, serverName: 'MyServer', command: 'apt install nginx', commandSent: false
UI does an insert in West US @ 2:10:30 PM UTC: id: 1, serverName: 'MyServer', command: 'run antivirus program', commandSent: false
UI writes command to East US @ 2:10:55 PM UTC: id: 1, serverName: 'MyServer', command: 'shutoff', commandSent: false
Now the server calls an API that gets the commands to run; the API sets ApplicationRegion to West US. At the time the server calls the API, East US hasn’t replicated to West US. So West US runs the SQL statement:
SELECT * FROM c WHERE c.serverName = 'MyServer' AND c.commandSent = false
My assumption is that this query would return:
apt upgrade
run antivirus program
Now the API sets commandSent to true for all the commands that were sent.
Before the server calls the API a second time, replication happens.
Now the server calls the API on West US again, and it runs the query:
SELECT * FROM c WHERE c.serverName = 'MyServer' AND c.commandSent = false
My assumption is that this query would return:
apt update
run antivirus program
apt install nginx
shutoff
So, instead of the commands being run in the correct order of:
apt update
apt upgrade
apt install nginx
run antivirus program
shutoff
They would be run in this order:
apt upgrade
run antivirus program
apt update
run antivirus program
apt install nginx
shutoff
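
To make the scenario easier to reason about, here is a minimal toy simulation in plain Python. It does not use the Cosmos SDK and is only a sketch of my mental model, not of how Cosmos DB actually replicates. The names Region, insert, replicate_to, and poll_unsent are hypothetical, and I key each document by its write time instead of id so that the five inserts stay distinct (an assumption, since the example above uses id: 1 everywhere). It models only East US lagging behind West US and has no notion of consistency levels or conflict resolution.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Toy stand-in for one write region: a local store plus unreplicated writes."""
    name: str
    store: dict = field(default_factory=dict)    # write time -> document
    pending: list = field(default_factory=list)  # keys written here, not yet shipped

    def insert(self, key, command):
        self.store[key] = {"serverName": "MyServer", "command": command, "commandSent": False}
        self.pending.append(key)

    def replicate_to(self, other):
        # Ship every locally written document the other region has not seen yet.
        for key in self.pending:
            other.store.setdefault(key, dict(self.store[key]))
        self.pending.clear()

    def poll_unsent(self):
        # Mimics: SELECT * FROM c WHERE c.serverName = 'MyServer' AND c.commandSent = false
        # followed by the API setting commandSent = true on every returned document.
        unsent = [doc for _, doc in sorted(self.store.items()) if not doc["commandSent"]]
        for doc in unsent:
            doc["commandSent"] = True
        return [doc["command"] for doc in unsent]


def simulate():
    east, west = Region("East US"), Region("West US")
    east.insert("14:10:10", "apt update")
    west.insert("14:10:12", "apt upgrade")
    east.insert("14:10:22", "apt install nginx")
    west.insert("14:10:30", "run antivirus program")
    east.insert("14:10:55", "shutoff")

    first_poll = west.poll_unsent()   # East US has not replicated yet
    east.replicate_to(west)           # replication catches up
    second_poll = west.poll_unsent()  # second call from the server
    return first_poll + second_poll


if __name__ == "__main__":
    print(simulate())
```

In this toy model the server still receives the commands out of order (the West US commands first, then the East US ones). Note that the model does not return run antivirus program a second time, because it was already flagged as sent in West US; whether the real service could return it again is part of what I am asking.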