I have a file, metrics.log, that is rotated daily at 23:59 by log4j's RollingFileAppender.
The rotated file is kept in another directory as metrics-$DATE.log.
It contains JSON-formatted log data, including a timestamp from when the log line originated.
Exactly once per minute, a new log line is written to metrics.log.
We recently noticed that many log lines arrive all at once around 7 a.m.
To investigate the source of the problem further, I added more timestamps.
The setup is as follows:
Elasticsearch/Kibana running in Docker on an Ubuntu machine.
Logstash and metrics.log on a Windows machine.
I added a 'received' timestamp for the moment a log line is received by Logstash on the Windows machine (via a ruby filter in the pipeline: event.set('received', Time.now.utc)). I also added an ingest pipeline that adds a 'delivered' timestamp when Elasticsearch receives the line.
Those two timestamps are always close together, so I assume this part of the ELK stack is fine.
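(The ingest pipeline is basically just a set processor on the ingest timestamp; roughly like this, where the pipeline name is only illustrative:)

PUT _ingest/pipeline/add-delivered
{
  "processors": [
    { "set": { "field": "delivered", "value": "{{_ingest.timestamp}}" } }
  ]
}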
But the timestamp of the line in metrics.log is sometimes hours or even days apart from those two.
I can find those mismatched entries with:
POST /logs-metrics-prod/_search
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": "doc['delivered'].value.getMillis() - doc['timestamp'].value.getMillis() > 10000",
            "lang": "painless"
          }
        }
      }
    }
  }
}
and the results look like this:
{
  "took": 22,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3983,
      "relation": "eq"
    },
    "max_score": 0,
    "hits": [
      {
        "_index": ".ds-logs-metrics-prod-2024.08.02-000001",
        "_id": "ae7ZEZEBOYDKk8qKQos1",
        "_score": 0,
        "_source": {
          "anzahlPicker": {
            "PUPT": 0,
            "GTHY": 0
          },
          "@timestamp": "2024-08-01T22:00:08.515Z",
          "log": {
            "file": {}
          },
          "data_stream": {
            "namespace": "prod",
            "type": "logs",
            "dataset": "metrics"
          },
          "host": {},
          "@version": "1",
          "received": "2024-08-02T06:48:55.932Z",
          "delivered": "2024-08-02T06:48:56.470630659Z",
          "offenePickauftraege": {
            "PUPT": 0,
            "GTHY": 0
          },
          "event": {},
          "type": "metric",
          "timestamp": "2024-08-01T22:00:08.515Z"
        }
      },
      {
        "_index": ".ds-logs-metrics-prod-2024.08.02-000001",
        "_id": "5-7ZEZEBOYDKk8qKQouD",
        "_score": 0,
        "_source": {
          "anzahlPicker": {
            "PUPT": 0,
            "GTHY": 0
          },
          "@timestamp": "2024-08-01T22:01:08.519Z",
          "log": {
            "file": {}
          },
          "data_stream": {
            "namespace": "prod",
            "type": "logs",
            "dataset": "metrics"
          },
          "host": {},
          "@version": "1",
          "received": "2024-08-02T06:48:55.933Z",
          "delivered": "2024-08-02T06:48:56.706188482Z",
          "offenePickauftraege": {
            "PUPT": 0,
            "GTHY": 0
          },
          "event": {},
          "type": "metric",
          "timestamp": "2024-08-01T22:01:08.519Z"
        }
      },
[..]
By 2024-08-02T06:48:55.933Z, metrics.log has already been rotated (at 23:59), so it shouldn't still contain a line with "@timestamp": "2024-08-01T22:01:08.519Z"; that is a lag of almost nine hours between the original timestamp and received.
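To check whether these delayed lines really all arrive in one burst around 07:00, I can combine the same filter with a date_histogram on received (just a sketch; it assumes received is mapped as a date field):

POST /logs-metrics-prod/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": "doc['delivered'].value.getMillis() - doc['timestamp'].value.getMillis() > 10000",
            "lang": "painless"
          }
        }
      }
    }
  },
  "aggs": {
    "arrivals_per_hour": {
      "date_histogram": {
        "field": "received",
        "calendar_interval": "hour"
      }
    }
  }
}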
The pipeline configuration is as follows:
input {
  file {
    path => ["C:/wms/metrics.log"]
    start_position => "beginning"
    mode => "tail"
    codec => plain { charset => "ISO-8859-1" }
    #sincedb_path => "C:/Program Files/Logstash/sincedb/sincedb_wms_metrics"
    #codec => "json"
  }
}

filter {
  json {
    source => "message"
  }
  date {
    match => ["timestamp", "ISO8601"]
    timezone => "Europe/Berlin"
    target => "@timestamp"
  }
  ruby {
    # format: 2024-07-25T06:52:59.493Z
    code => "event.set('received', Time.now.utc)"
  }
  mutate {
    copy => { "@timestamp" => "timestamp" }
    add_field => {
      "[data_stream][type]" => "logs"
      "[data_stream][dataset]" => "metrics"
      "[data_stream][namespace]" => "prod"
    }
    remove_field => ["[host][name]", "[log][file][path]", "[event][original]", "message"]
  }
}

output {
  pipeline { send_to => ["es-pipeline"] }
}
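(The es-pipeline referenced in the output is a second, pipeline-to-pipeline stage that does the actual shipping to Elasticsearch. It looks roughly like the following; the host is a placeholder, the real connection settings are omitted, and the ingest pipeline name matches the illustrative one above:)

input {
  pipeline { address => "es-pipeline" }
}

output {
  elasticsearch {
    hosts => ["https://my-elastic-host:9200"]   # placeholder
    data_stream => true
    pipeline => "add-delivered"   # the ingest pipeline that sets 'delivered'
  }
}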
I copy @timestamp back into timestamp because metrics.log emits local time rather than UTC.
A raw log line looks like this:
{"timestamp":"2024-08-05T00:12:12.379774800","type":"metric","offenePickauftraege":{"PUPT":0,"GTHY":0},"anzahlPicker":{"PUPT":0,"GTHY":0}}
I think I have now tried every combination of start_position => "beginning"/"end" and mode => "tail"/"read".
Also, I can't find anything of interest in the Logstash log files (even at log level INFO).
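For the next run I will probably turn up the file input's logging at runtime via the Logstash API (a sketch; it assumes the default API port 9600, and the filewatch logger name is my guess):

curl -X PUT "localhost:9600/_node/logging" -H "Content-Type: application/json" -d '
{
  "logger.logstash.inputs.file": "TRACE",
  "logger.filewatch": "TRACE"
}'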
Between tries I delete the logs-metrics-prod data stream and restart Logstash for a clean start, let it run through the day, and check again the next morning.
So, could this be a problem with the log rotation? Any other ideas or hints on where to look next? I'm out of ideas. Unfortunately, changing the program that emits the logs, or the log rotation (e.g. to unique file names), is not possible.
Thank you.