I was recently asked in an interview whether I know any garbage collection algorithms.
I knew what garbage collection is, but I had never really thought about learning the algorithms behind it. As a developer I never had to worry about it, since the garbage collector does all the hard work for me.
Do you think Java developers should know about garbage collection algorithms? If so, which ones should I look into?
I think knowing garbage collection algorithms isn’t important at all if you develop “standard software” and not software platforms. You should have a basic understanding of how a garbage collector works, and that’s about it, unless you experience critical delays in your software caused by garbage collection or you need to optimize memory usage.
If you’re interested in those algorithms, please see this post of mine: What are the algorithms behind low pause GC?
Garbage collection is an interesting, non-trivial computer science problem.
Being able to explain an algorithm for it indicates a fairly deep interest in, and understanding of, how such systems work. Even if you haven’t studied Java’s GC algorithm, it would impress me if you could give a reasonable description of the data structures and algorithms that might be used.
For a Java programmer specifically, it would be good to be able to describe the advantages and disadvantages of GC, which requires a bit of knowledge about how it’s implemented. This shows an interest in how the tools you use work rather than just passively using them. Knowing the costs also helps you program in a way that minimizes them.
I wouldn’t say this is “required knowledge” to make a living as a Java developer, but a plus skill that shows you’re able and willing to go a little deeper than what you need to know to get today’s job done.
I see two reasons why one should know how a garbage collector (or any algorithm or technology) works. Here they are:
1. You get a better knowledge of what is going on beneath the code you write. This can often help you write more efficient code, which tends to mean better performance. In some cases this can be vital. (I’ve had an unpleasant experience where GWT was relying on the browser’s garbage collector and we had a huge memory leak in Chrome, so we had to find out what exactly caused the leak.)
2. Such algorithms are always (or almost always, no, always) entrusted to smart, skilled, qualified and experienced developers. So studying their approach can be very useful.
I see another reason why you were asked such a question in an interview. Some developers (an ex-colleague of mine in particular) think that a developer isn’t smart or hard-working enough if he/she doesn’t know such things. I disagree with this view. But anyway, knowing such things is often a good way to impress your interviewer.
You should know about generational garbage collection, and the specifics about Java garbage collection (the PermGen, Eden and Tenured spaces). You should also be familiar with garbage collection in general (like why reference counting is usually a bad idea, and why mark-and-sweep is better). I’d also recommend reading up on some alternate implementations (like the “pauseless” GC in Azul’s Zing JVM and IBM’s real-time Metronome project).
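To make the reference counting versus mark-and-sweep point concrete, here is a minimal sketch over a toy object graph. Everything in it (`ToyObject`, `ToyGcDemo`, the `refCount` field) is made up for illustration and is not how the JVM represents or collects objects. It only shows why two objects that reference each other are never freed by naive reference counting, while a reachability trace from the roots finds them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy object: a name, outgoing references, and an incoming-reference count.
final class ToyObject {
    final String name;
    final List<ToyObject> refs = new ArrayList<>();
    int refCount = 0; // used only by the reference-counting model

    ToyObject(String name) { this.name = name; }

    void pointTo(ToyObject target) {
        refs.add(target);
        target.refCount++;
    }

    void dropRef(ToyObject target) {
        if (refs.remove(target)) {
            target.refCount--;
        }
    }
}

final class ToyGcDemo {
    // Mark phase: everything reachable from the roots is considered live.
    static Set<ToyObject> mark(List<ToyObject> roots) {
        Set<ToyObject> marked = new HashSet<>();
        Deque<ToyObject> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            ToyObject current = pending.pop();
            if (marked.add(current)) {
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    // Sweep phase: anything on the heap that was not marked is garbage.
    static List<ToyObject> sweep(List<ToyObject> heap, Set<ToyObject> marked) {
        List<ToyObject> garbage = new ArrayList<>();
        for (ToyObject o : heap) {
            if (!marked.contains(o)) {
                garbage.add(o);
            }
        }
        return garbage;
    }

    // Naive reference counting frees only objects whose count has reached zero.
    static List<ToyObject> refCountGarbage(List<ToyObject> heap) {
        List<ToyObject> garbage = new ArrayList<>();
        for (ToyObject o : heap) {
            if (o.refCount == 0) {
                garbage.add(o);
            }
        }
        return garbage;
    }

    public static void main(String[] args) {
        ToyObject root = new ToyObject("root");
        root.refCount = 1; // models a reference held by the stack
        ToyObject a = new ToyObject("a");
        ToyObject b = new ToyObject("b");
        root.pointTo(a);
        a.pointTo(b);
        b.pointTo(a);    // a and b now form a cycle
        root.dropRef(a); // the cycle becomes unreachable from the root

        List<ToyObject> heap = List.of(root, a, b);
        System.out.println("ref counting frees: " + refCountGarbage(heap).size() + " objects");
        System.out.println("mark-and-sweep frees: " + sweep(heap, mark(List.of(root))).size() + " objects");
    }
}
```

After the root drops its reference, `a` and `b` still hold a count of one each because of the cycle, so the counting model frees nothing, while the trace from `root` never reaches them and the sweep reports both as garbage.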
You should have SOME knowledge about how the garbage collection for Java works for two reasons:
First, if you don’t know how it works, then you may accidentally make design decisions that lead to worst-case performance in your actual application. This becomes less and less likely as the GC improves, but if you have a choice of algorithms in your app, then knowing something about the GC means you can pick one with knowledge of what it’s going to do, instead of finding out that it causes bad behavior.
Second, if you don’t know how it works, you can’t possibly tune the GC for a given application. Most Java programmers never need to tune the GC, as the default parameters work well enough most of the time. If you do something that falls outside that ‘most of the time’, then you may find yourself tuning the GC parameters. Doing so without knowledge of the GC is just randomly turning knobs: you might get something useful out of it, but more likely you’ll just make things worse.
So, while I wouldn’t expect a good Java programmer to know everything under the sun about GC, I’d expect that programmer to know at some level how the GC in the JVM they are using functions, and what the tradeoffs are for that GC algorithm.
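A first step before turning any knobs is simply finding out which collectors your JVM is running and how much work they have done. The sketch below uses the standard `java.lang.management` API for that; the class name `GcStats` is made up for illustration, and the collector names it prints depend on the JVM and the flags it was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: lists the collectors the running JVM exposes via JMX.
final class GcStats {
    static List<GarbageCollectorMXBean> collectors() {
        return ManagementFactory.getGarbageCollectorMXBeans();
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : collectors()) {
            // Count and time are cumulative since JVM start; -1 means "not available".
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Running this before and after a workload gives a rough idea of how often each generation is collected. For more detail, GC logging on the command line (for example `-verbose:gc`, or `-Xlog:gc` on JDK 9 and later) is usually the next thing to try.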
Yes, every Java developer should definitely know what’s going on behind the scenes of the virtual machine, and that includes the work of the garbage collector.
The level of knowledge, however, is another question. I wouldn’t expect a typical developer to explain the details of an actual implementation (I would have to do some research on that myself). However, the basic principle of what a GC does, and its pros and cons compared with managing memory yourself, should be clear.