I see more and more articles about logging in JSON. You can also find one on the NodeJS blog. Why does everyone like it so much? All I can see is that it adds more operations:
- A couple new objects being created.
- Stringifying the objects, which involves either calculating the string length up front or making multiple string allocations.
- GCing all the crap that was created.
Are there any performance comparisons between JSON logging and plain string logging? Do people use JSON logging in enterprise projects?
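To make it concrete, here is roughly what I mean by the two styles (a minimal Node.js sketch; the field names and values are made up):

    const userId = 42; // example value

    // Plain string logging: one template string, one write.
    console.log(`${new Date().toISOString()} INFO user ${userId} logged in`);

    // JSON logging: build an object, stringify it, then write.
    console.log(JSON.stringify({
      time: new Date().toISOString(),
      level: 'info',
      msg: 'user logged in',
      userId,
    }));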
JSON logging gives you the ability to parse the log file programmatically even if the format has changed over time.
A good example is Apache logs. By default, Apache uses the Common Log Format for access.log:

    "%h %l %u %t \"%r\" %>s %b"
Say that you have built an offline parser that takes one of those log files and calculates some statistics from it.
At some point you introduce subdomains to your application and add the virtual host to your logs (just so you can debug if problems appear with one of the subdomains):

    "%v %h %l %u %t \"%r\" %>s %b"
Your parser doesn't make use of the virtual host, but you still need to adapt it to:
- accept the new log format (notice the %v at the head of the format)
- still support the old log format (for older log files)
But if you log in JSON, your parser won’t even notice the added field and can happily parse the new logs as well as old logs. And some other parser can make use of the added fields if they exist.
And of course for you, parsing JSON is easier than writing regexps to parse string logs.
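As a rough sketch of what I mean (assuming one JSON object per line; the file name and field names are made up), a parser that only reads the fields it cares about keeps working when new fields show up:

    const fs = require('fs');

    // Count requests per HTTP status; extra fields like virtual_host are simply ignored.
    function countStatuses(path) {
      const counts = {};
      for (const line of fs.readFileSync(path, 'utf8').split('\n')) {
        if (!line.trim()) continue;
        const entry = JSON.parse(line);
        counts[entry.status] = (counts[entry.status] || 0) + 1;
      }
      return counts;
    }

    console.log(countStatuses('access.log.json'));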
If your machine runs so close to its limits that such issues really matter, you most likely have more serious problems. There may be exceptional situations where this makes some difference, but many applications (maybe most) run on machines where it doesn't matter at all whether you log JSON, plain text, or records to a database. Object, string and other conversions have to be done in most cases anyway (unless you log raw binary); you may simply not see them, because the default classes you use handle them in the background (as when you write to a database).
If you need performance numbers for this, you have to measure them yourself, on the machine your code will run on and with the programming environment you use every day. Whether there is a big overhead, or any at all, depends on many things. If you write a website in Ruby on Rails, for example, your data is in most cases a hash; converting that to JSON costs you next to nothing, since the internal representation isn't far from what you want to write (and it's typical for Rails code to pass such objects and data structures around all the time).
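If you want a rough number for your own setup, a micro-benchmark along these lines is usually enough to see whether the stringify step even shows up (Node.js here, since the question mentions it; the entry contents and iteration count are made up):

    // Build the same log entry a million times as a plain string and as JSON,
    // accumulating lengths so the work isn't optimized away.
    const entry = { time: Date.now(), level: 'info', msg: 'user logged in', userId: 42 };
    const N = 1e6;
    let total = 0;

    console.time('plain string');
    for (let i = 0; i < N; i++) {
      total += `${entry.time} ${entry.level} ${entry.msg} ${entry.userId}`.length;
    }
    console.timeEnd('plain string');

    console.time('JSON.stringify');
    for (let i = 0; i < N; i++) {
      total += JSON.stringify(entry).length;
    }
    console.timeEnd('JSON.stringify');

    console.log(total); // keep the result observable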
The advantages again depend on your tools. If JSON support is built into your libraries, you can easily read the logs back and display them in some form. Again as an example: if you had an admin interface for your web site and wanted to show logging information stored as JSON, the read-and-display-as-HTML step can, in Ruby, be a single line of code in some cases.
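The same read-and-display step is also short outside Ruby; as a sketch in Node (the file name and field names are made up), turning a JSON-lines log into an HTML table:

    const fs = require('fs');

    // Render a JSON-lines log file as a minimal HTML table (sketch only).
    const rows = fs.readFileSync('app.log.json', 'utf8')
      .split('\n')
      .filter(Boolean)
      .map(line => JSON.parse(line))
      .map(e => `<tr><td>${e.time}</td><td>${e.level}</td><td>${e.msg}</td></tr>`)
      .join('\n');

    console.log(`<table>${rows}</table>`);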