My environment runs multiple instances.
Here is the code logic (a minimal sketch of one worker step follows the list):
- Read a file (containing ~15k entries) and insert each entry into queue-A with its entry ID.
- An instance picks up one entry from queue-A, creates a record in the DB, and logs it whether it succeeds or fails, then inserts the entry ID into queue-B.
- An instance picks up one entry from queue-B, creates property-X for the record in the DB, and logs it whether it succeeds or fails, then inserts the property ID into queue-C.
- An instance picks up one entry from queue-C, creates property-Y for the record in the DB, and logs it whether it succeeds or fails.
The flow is sequential.
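For context, here is a minimal sketch of what one worker step (step 2: queue-A -> DB record -> queue-B) looks like. The type and member names (IQueueClient, IRecordRepository, CreateRecordAsync) are hypothetical placeholders, not our actual code:

    using System;
    using System.Threading.Tasks;
    using Serilog;

    public interface IQueueClient
    {
        Task<string> DequeueAsync();            // returns the next entry ID
        Task EnqueueAsync(string id);
    }

    public interface IRecordRepository
    {
        Task CreateRecordAsync(string entryId); // creates the DB record
    }

    public class RecordWorker
    {
        private readonly IQueueClient _queueA;
        private readonly IQueueClient _queueB;
        private readonly IRecordRepository _db;

        public RecordWorker(IQueueClient queueA, IQueueClient queueB, IRecordRepository db)
        {
            _queueA = queueA;
            _queueB = queueB;
            _db = db;
        }

        public async Task ProcessOneAsync()
        {
            var entryId = await _queueA.DequeueAsync();
            try
            {
                await _db.CreateRecordAsync(entryId);
                // A success log is always written here...
                Log.Information("Step 2 succeeded for entry {EntryId}", entryId);
            }
            catch (Exception ex)
            {
                // ...and a failure log here, yet some entries show neither in Datadog.
                Log.Error(ex, "Step 2 failed for entry {EntryId}", entryId);
            }
            await _queueB.EnqueueAsync(entryId);
        }
    }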
When we check Datadog, we find that logs for some entries are missing in steps 2 and 3: those entries have neither a success nor a failure log. The records are created in the DB; only the logs are missing.
Assumptions (each checked and ruled out; configuration sketches follow the list):
- Is it because Serilog doesn't write properly in a multi-instance environment? No, we have enabled shared writes in the Serilog settings:
  <add key="serilog:write-to:RollingFile.shared" value="true"/>
  and I can see different instances writing logs into the same file.
- Does the log hit a line/file limit? No. If we use Serilog Async, the buffer feeding Async() is fixed at 10,000 items by default and does not block when full, and each log file only contains about 3-14k records, which is not big.
- Is CPU usage too high to keep up? No, it is at about 20%.
- Is Datadog missing some log files? No. Our log files use the name format File-Data-sequentialNumber, and I can see a continuous run of numbers across the files.
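For reference, the XML setting above should be roughly equivalent to the following programmatic configuration, assuming the Serilog.Sinks.RollingFile package (the path format here is a hypothetical placeholder, not our real one):

    using Serilog;

    // shared: true allows multiple processes to append to the same rolling file.
    Log.Logger = new LoggerConfiguration()
        .WriteTo.RollingFile("logs/File-Data-{Date}.txt", shared: true)
        .CreateLogger();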
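And this is roughly how the Async wrapper mentioned above is configured; the bufferSize and blockWhenFull values shown are the Serilog.Sinks.Async defaults described above (again, the path is a placeholder):

    using Serilog;

    // Async() puts a bounded in-memory buffer in front of the inner sink.
    // With blockWhenFull: false (the default), events that arrive while the
    // buffer already holds 10,000 items are dropped rather than queued.
    Log.Logger = new LoggerConfiguration()
        .WriteTo.Async(
            a => a.RollingFile("logs/File-Data-{Date}.txt", shared: true),
            bufferSize: 10000,
            blockWhenFull: false)
        .CreateLogger();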
Observations:
- In the Prod environment, about 60 logs are missing in step 2 and 60 in step 3, out of 12k entries.
- In the NFT environment, about 1/4 of the logs are missing in both step 2 and step 3: roughly 1 log is missed for every 4 records processed, sometimes for every 3. The pattern is stable within a given time frame.
Can anyone help me identify why the logs are being missed?