I get confused when people try to make a distinction between compiled languages and managed languages. From experience, I understand that most people consider compiled languages to be C and C++, while managed languages are Java and C# (there are obviously more, but these are just a few examples). But what exactly is the core difference between the two types of languages?
My understanding is that any program, regardless of what language you use, is essentially “compiled” into low-level machine code which is then interpreted. So does that kinda make managed languages a subset of compiled languages (that is, all managed languages are compiled languages, but not the other way around)?
8
The difference is not in “compiled” vs. “managed”; these are two orthogonal axes. By “managed”, people normally mean the presence of garbage-collected memory management and/or a virtual machine infrastructure. Neither has anything to do with compilation or with whatever people deem to be its opposite.
All these “differences” are quite blurred, artificial and irrelevant, since it is always possible to mix managed and unmanaged memory in a single runtime, and the difference between compilation and interpretation is very vague too.
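To illustrate the point about mixing managed and unmanaged memory in one runtime, here is a minimal sketch in Python using the standard `ctypes` module. The variable names are mine and purely illustrative. The `bytearray` is an ordinary garbage-collected object, while the second buffer is read and written through a raw integer address, much as C code would do it.

```python
import ctypes

# Managed side: an ordinary Python object; the runtime decides when to free it.
managed = bytearray(b"managed")

# Unmanaged-style side: a raw 16-byte buffer that we address by a plain integer.
# (The buffer's lifetime is still tied to the `raw` object in this sketch.)
raw = ctypes.create_string_buffer(16)
address = ctypes.addressof(raw)

# Copy bytes into the raw buffer by address, much as C's memmove would.
ctypes.memmove(address, bytes(managed), len(managed))

# Read the bytes back directly from that address.
copied = ctypes.string_at(address, len(managed))
print(copied)
```

Nothing checks the address or the length here, so passing a wrong value can crash the process, which is exactly the kind of risk a managed runtime normally hides from you.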
3
To quote Wikipedia:
Managed code is a term coined by Microsoft to identify computer program source code that requires and will only execute under the management of a Common Language Runtime virtual machine (resulting in bytecode).
Managed code needs a runtime (like the .NET CLR) to execute.
3
I think there is a distinction to be made; however, it is not necessarily between “compiled” and “managed”. These are not opposites; a language can be compiled and not managed, or interpreted (not compiled) and managed, or both, or even neither.
A “compiled” language is simply one in which there is a step that transforms the source code written by the developer into some lower-level, more regular form (native machine code or “bytecode”), which is what is executed by the machine. The “machine” can be the actual processor, or a “virtual machine” that performs additional operations on the bytecode to translate it to “native” machine instructions. The antonym for a “compiled” language is an “interpreted” language, in which the source code is translated at runtime, statement by statement as it is executed, without a separate compilation step. A hybrid between them is “jitting”, from “JIT” (Just In Time), which is usually compilation performed as a one-time step by the executing machine at runtime; a piece of code (a function, a source file, or a frequently executed path) is translated when first run, but the native instructions produced are kept in memory so the runtime doesn’t have to redo the translation on subsequent executions of that code.
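A small sketch may help show how blurry the compiled/interpreted line is. CPython, which is usually called “interpreted”, actually has a compilation step that turns source text into bytecode, which its virtual machine then executes; the built-in `compile()` exposes that step. The second half is a toy cache (the `run` helper and `_cache` dictionary are hypothetical names of mine) that only mimics the “translate once, reuse afterwards” idea behind JIT. A real JIT emits native machine code, whereas this sketch merely caches bytecode.

```python
source = "total = sum(n * n for n in range(4))"

# Step 1: a compilation step turns the source text into a bytecode object.
code = compile(source, "<example>", "exec")

# Step 2: the virtual machine executes that bytecode.
namespace = {}
exec(code, namespace)
print(namespace["total"])  # 0 + 1 + 4 + 9 = 14

# Toy "translate on first use, then reuse" cache, loosely analogous to JIT.
_cache = {}
compile_count = 0

def run(expression):
    """Evaluate an expression, compiling it only the first time it is seen."""
    global compile_count
    if expression not in _cache:
        compile_count += 1
        _cache[expression] = compile(expression, "<cached>", "eval")
    return eval(_cache[expression])

results = [run("2 ** 10") for _ in range(3)]
print(results, compile_count)  # three evaluations, one compilation
```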
A “managed” language is a language designed to produce programs that are consumed within a specific runtime environment, which almost always includes a bytecode interpreter; a “virtual machine” that takes the program’s code and performs some additional machine- or environment-specific transformation. The environment may also include memory management, such as a “garbage collector”, and other “security” features meant to keep the program operating within its “sandbox” of space and tools; however, such features are not the sole domain of “managed” runtimes. Virtually all interpreted languages could be considered managed, because they require the interpreter to be running underneath the lines of “user” code being executed. In addition, JVM and .NET languages (Java, Scala, C#, VB, F#, IronWhatever) are compiled into an intermediate form (Java bytecode on the JVM, intermediate language or IL on .NET), which is superficially similar in form and function to a binary assembly language, but doesn’t adhere 100% to any “native” instruction set. These instructions are executed by the JVM, or by .NET’s CLR, which effectively translates them to native binary instructions specific to the CPU architecture and/or OS of the machine.
So, languages can generally be described as “compiled” or “interpreted”, and as “unmanaged” (or “native”) or “managed”. There are languages that can be described as any combination of these except possibly “interpreted native” (which would only be true for hand-written hexadecimal opcodes, where what is written by the developer is what is executed); if you consider the interpretation layer as a “runtime” (which is easy to argue for and hard to argue against), then all interpreted languages are “managed”.
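As a rough illustration of the memory-management service such a runtime can provide, here is a sketch that assumes CPython and its standard `gc` and `weakref` modules. The `Node` class is a made-up example. Two objects point at each other, the program drops its own references, and the runtime's collector reclaims the cycle without any explicit free call.

```python
import gc
import weakref

class Node:
    """Hypothetical object that can point at another Node."""
    def __init__(self):
        self.other = None

gc.disable()  # keep the automatic collector out of the way for the demo

a, b = Node(), Node()
a.other, b.other = b, a      # build a reference cycle
probe = weakref.ref(a)       # lets us observe `a` without keeping it alive

del a, b                     # the program no longer holds any reference
still_alive = probe() is not None   # the cycle keeps both objects alive

gc.collect()                 # ask the runtime's collector to run
reclaimed = probe() is None  # the collector found and freed the cycle

gc.enable()
print(still_alive, reclaimed)
```

In an unmanaged language the equivalent program would have to break the cycle and free both objects itself.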
If you want to get technical, almost all programs targeting a multitasking OS nowadays are “managed”; the OS will create a “virtual machine” for each program that is running, in which the program thinks (or at least doesn’t have to know otherwise) that it is the only thing running. The code may make calls within itself and to other referenced libraries as if that program were the only thing loaded in memory; similarly, calls to allocate memory to store and manipulate data and to control devices are coded as if the entire memory architecture were available. The VM (and the OS behind it) then translates the various memory pointers to the actual location of the program, its data, and hooks to device drivers etc. Conceptually this amounts to applying a memory offset (each VM gets a block of 2GB or whatever of memory, starting at address X, which the program can treat as if X were address 0), although modern hardware does the mapping page by page; either way it is very cheap to do. There are other things the OS kernel is responsible for, such as process scheduling and inter-process communication, which are trickier to manage. However, this basic pattern is generally not considered “managed”, as the program doesn’t have to know that it’s being run by a virtual machine and is often still responsible for keeping its allocated memory “clean”. A program that was designed to be run on the MS-DOS command line can be run on newer Windows OSes that don’t even have the MS-DOS environment underneath them anymore; the program is instead given a “virtual console” environment, and provided it doesn’t try to leave this “sandbox” by trying to directly access protected areas of memory, it will run quite happily.
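The offset idea above can be sketched as a toy model. This is only the simplest “base and bounds” scheme, with a class name, addresses and sizes that I made up for illustration; real operating systems map memory in pages through the hardware's memory management unit rather than with one offset per program.

```python
class AddressSpace:
    """Toy base-and-bounds model of one program's private view of memory."""

    def __init__(self, base, size):
        self.base = base    # where the block really starts (the "X" above)
        self.size = size    # how much memory the program was given

    def translate(self, virtual_address):
        """Map an address as the program sees it to the real location."""
        if not 0 <= virtual_address < self.size:
            # The program tried to leave its sandbox.
            raise MemoryError(f"address {virtual_address:#x} is out of bounds")
        return self.base + virtual_address

# Two programs both believe their memory starts at address 0.
program_a = AddressSpace(base=0x1000, size=0x400)
program_b = AddressSpace(base=0x8000, size=0x400)

print(hex(program_a.translate(0x10)))  # 0x1010
print(hex(program_b.translate(0x10)))  # 0x8010
```

Both programs use the same address `0x10`, yet they end up in different real locations, and an address outside the block is rejected instead of touching another program's memory.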
7
A managed language, in simple terms, is a high-level language that depends on services provided by a runtime environment in order to execute. Garbage collection is the service that gives “managed” its name in general, but it is not the only one; others include security services, exception handling, and standard types. .NET languages use the Common Language Runtime (CLR) to execute, while Java uses a virtual environment, the Java Virtual Machine (JVM).
An unmanaged language is a lower-level language whose compiled code is executed directly by the operating system, without the need for virtual runtime services or an intermediate language; C and C++ are examples. Unmanaged code produced by such languages uses library routines that are dynamically linked at run time, such as DLLs (Dynamic Link Libraries) on Windows. Unmanaged code accesses memory directly, which is why it is often faster than managed code. But unless you are building a hardware driver or a sophisticated video game, you don’t really want to use unmanaged languages, as they can get dangerous to work with, especially for inexperienced developers; as the saying goes, with great power comes great responsibility. That is why managed languages exist: to help developers produce extensible code without diving into the bottom of the system. You can still create mixed code if you need to; these articles explain it all:
An Overview of Managed/Unmanaged Code Interoperability
Sample: Mixing Unmanaged C++, C++/CLI, and C# code