tl;dr: Why would I ever choose to write/compile unmanaged1 code?
Let's assume that I am starting a new project. I have decided to write it in a C-like language – probably one of C, C++, C# or Java. C# and Java are managed (well, Java runs in the JVM, but let's assume that amounts to the same thing). Managed GCC plugins exist (although I have not used any, so I do not know how good they are). Managed code has lots of advantages:
- You do not have to collect garbage (free memory) explicitly, which is fiddly
- It is much harder to leak memory
- It is trivial to adopt – just choose the right language or the right compiler flags
- e.g. use C#, or use MCP for GCC
- Improved interoperability if you are compiling into IL
- Type safety, array bounds and index checking, etc.
In my opinion (which could be wrong – and I am trying to ascertain why it is wrong) it is almost always better to use managed code. However, unmanaged languages exist and managed languages have unmanaged flags, so there must be some reasons for choosing unmanaged over managed code. What are they?
Edit: As Deduplicator points out, it is possible to use a GC in non-managed code (either a third-party one or one you write yourself). This leads to a related question: does everybody compiling native C/C++ code use a GC? If not, what are the advantages of not using one?
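For instance, the Boehm–Demers–Weiser collector is a drop-in conservative GC for C and C++. A minimal sketch, assuming libgc is installed and the program is linked with -lgc (the header may be <gc/gc.h> on some systems):

```cpp
#include <gc.h>      // Boehm–Demers–Weiser collector; link with -lgc
#include <cstdio>

int main() {
    GC_INIT();       // initialise the collector before the first allocation

    long sum = 0;
    for (int i = 0; i < 1000000; ++i) {
        // Allocate from the GC heap. Nothing here ever calls free/delete;
        // blocks that become unreachable are reclaimed by the collector.
        int* p = static_cast<int*>(GC_MALLOC(64 * sizeof(int)));
        p[0] = i;
        sum += p[0];
    }
    std::printf("sum = %ld\n", sum);
    return 0;
}
```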
1. For the purpose of this question I am borrowing John Bode’s definition of “Managed” (below) – anything that runs in a virtual machine, not just the Microsoft definition of managed code. I count Java as a “managed” language in that it runs in a VM.
- Legacy – native code long predates managed1 code, and there are still native apps to maintain;
- Performance – all things being equal, native code should be faster and have a smaller memory footprint than managed code (things are rarely equal, though, and for I/O-bound tasks the difference is negligible); also, non-deterministic garbage collection can play hell with realtime code;
- System or hardware-specific hacks – native code may have access to system calls or libraries that managed code doesn't (such as for parallel processing or vector processing, although I imagine most managed languages should be able to expose an API for such operations); see the sketch after this list;
- No VM available – you may be targeting a system for which no VM has been developed, or which doesn't have the resources to run managed code (such as a microcontroller or other embedded system); also, your target system may not need the capabilities of a full-up VM (predominantly CPU-bound tasks, limited I/O, etc.).
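To make the hardware-specific point concrete, here is a minimal sketch (the data and output are hypothetical) of calling x86 SSE intrinsics directly from C++, the sort of instruction-level access native code gets without going through whatever vector API a managed runtime chooses to expose:

```cpp
#include <immintrin.h>   // x86 SSE intrinsics; SSE2 is baseline on x86-64
#include <cstdio>

int main() {
    float a[4]   = {1.0f, 2.0f, 3.0f, 4.0f};
    float b[4]   = {10.0f, 20.0f, 30.0f, 40.0f};
    float out[4];

    __m128 va = _mm_loadu_ps(a);             // load four unaligned floats
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));  // one packed-add instruction

    std::printf("%g %g %g %g\n", out[0], out[1], out[2], out[3]);
    return 0;
}
```

(As the caveat in the list notes, managed platforms increasingly expose similar vector APIs; the difference is that native code can reach the instruction set directly.)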
I work exclusively with native code (C++ on Linux). There’s no reason beyond inertia that we couldn’t use managed code, but for our purposes it works pretty well. Given that we run several hundred client instances on a single server, we could use every cycle we can get.
1. By which I mean anything that runs in a virtual machine, not just the Microsoft definition of managed code. I count Java as a “managed” language in that it runs in a VM.
A lot of it is inertia. C has become the lingua franca of computers. Practically every computer architecture, no matter how obscure or exotic, has a C compiler. Practically every OS defines its binary interface for system calls and dynamically linked libraries in terms of C. Most higher level languages have a mechanism for calling C code. Therefore, if you want a library to be usable in the widest possible variety of hardware and programming languages, you’d write it in C. If you don’t necessarily need it to run on every single computer architecture but still want it to be usable from pretty much every language, you could write it in a higher level language but provide a C interface.
In the same vein, there's ease of interoperability. One of the reasons to use C++ is that it's almost a strict superset of C, so you don't need to do much of anything to make use of existing C code or to wrap your own code in a C interface.
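As a sketch of that, here is a hypothetical C++ class wrapped in a C interface (the Counter names are made up for illustration); anything that can call C, including managed runtimes through their FFI mechanisms, can then use it via the opaque handle:

```cpp
// counter.cpp - a C++ implementation exposed through a plain C interface.
class Counter {
public:
    void add(int n) { total_ += n; }
    int total() const { return total_; }
private:
    int total_ = 0;
};

// extern "C" gives these functions unmangled names and C linkage, so a C
// header can declare them against an opaque `struct Counter*` handle.
extern "C" {
    Counter* counter_create()                { return new Counter(); }
    void     counter_add(Counter* c, int n)  { c->add(n); }
    int      counter_total(const Counter* c) { return c->total(); }
    void     counter_destroy(Counter* c)     { delete c; }
}
```

The C side never sees the class layout, only the four functions, which is why this pattern works from virtually any language.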
For extremely performance-critical, array-intensive code, it may be necessary to go with unchecked array access, which is something higher level languages generally don’t provide (other than through calling C code) since you get undefined behavior if you get it wrong.
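As an illustration, the same loop can be written either way in C++; the two hypothetical functions below differ only in whether the index is checked:

```cpp
#include <cstddef>
#include <cstdio>
#include <numeric>
#include <vector>

// Hot loop without bounds checks: operator[] does no validation, so an
// out-of-range index is undefined behaviour rather than an exception.
double sum_unchecked(const std::vector<double>& v) {
    double total = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v[i];
    return total;
}

// Same loop with checked access: .at() validates the index and throws
// std::out_of_range if it is bad, at some cost per access.
double sum_checked(const std::vector<double>& v) {
    double total = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v.at(i);
    return total;
}

int main() {
    std::vector<double> v(1000);
    std::iota(v.begin(), v.end(), 0.0);
    std::printf("%g %g\n", sum_unchecked(v), sum_checked(v));
    return 0;
}
```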
Another reason to use a language with manual memory allocation is for writing extremely safety-critical, reliable, real-time software. For example, NASA's Jet Propulsion Laboratory's coding standard for C forbids recursion and dynamic memory allocation after initialization, so that an upper bound on memory usage and on allocation/deallocation time can be guaranteed statically. However, real-time garbage-collection algorithms that guarantee predictable performance do exist, so it's not impossible to write predictable software in higher level languages. Maybe there's just no combination of suitable higher level language implementations and static analysis tools available for the hardware and OS they use.
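As a sketch of what "no dynamic allocation after initialization" can look like, here is a hypothetical fixed-capacity queue whose storage is reserved at compile time, so the worst-case memory footprint is known before the program runs and push/pop never touch the heap:

```cpp
#include <array>
#include <cstddef>
#include <cstdio>

struct Message { int id; double payload; };

// Fixed-capacity ring buffer: all storage lives inside the object itself,
// so there is no allocation or deallocation after the object is created.
template <std::size_t Capacity>
class StaticQueue {
public:
    bool push(const Message& m) {
        if (count_ == Capacity) return false;      // full: report failure, never grow
        buf_[(head_ + count_) % Capacity] = m;
        ++count_;
        return true;
    }
    bool pop(Message& out) {
        if (count_ == 0) return false;             // empty
        out = buf_[head_];
        head_ = (head_ + 1) % Capacity;
        --count_;
        return true;
    }
private:
    std::array<Message, Capacity> buf_{};          // storage fixed at compile time
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};

int main() {
    StaticQueue<8> q;                              // capacity chosen up front
    q.push({1, 3.14});
    Message m;
    if (q.pop(m)) std::printf("%d %g\n", m.id, m.payload);
    return 0;
}
```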