I’m trying to determine the technical details of why software built for certain operating systems works only with them.
It is my understanding that binaries are specific to certain processors due to the processor specific machine language they understand and the differing instruction sets between different processors. But where does the operating system specificity come from?
I used to assume it was APIs provided by the OS but then I saw this diagram in a book:
Operating Systems – Internals and Design Principles 7th ed – W. Stallings (Pearson, 2012)
As you can see, APIs are not indicated as a part of the operating system.
If for example I build a simple program in C using the following code:
#include <stdio.h>

int main(void)
{
    printf("Hello World");
    return 0;
}
Is the compiler doing anything OS specific when compiling this?
You ask why, if code is specific to a CPU, it must also be specific to an OS. This is actually a more interesting question than many of the answers here have assumed.
CPU Security Model
The first program run on most CPU architectures runs inside what is called the inner ring, or ring 0. How a specific CPU architecture implements rings varies, but nearly every modern CPU has at least two modes of operation: a privileged one, which runs ‘bare metal’ code that can perform any legal operation the CPU supports, and an untrusted one, which runs protected code that can only perform a defined, safe set of capabilities. Some CPUs have far finer granularity, however, and to use VMs securely at least one or two extra rings are needed (often labelled with negative numbers), but this is beyond the scope of this answer.
Where the OS comes in
Early single-tasking OSes
In very early DOS and other early single-tasking systems, all code ran in the inner ring. Every program you ran had full power over the whole computer and could do literally anything if it misbehaved, including erasing all your data or even, in a few extreme cases, causing hardware damage, such as by setting invalid display modes on very old monitors. Worse, this could be caused by merely buggy code with no malice whatsoever.
This code was in fact largely OS-agnostic: as long as you had a loader capable of loading the program into memory (pretty simple for early binary formats) and the code did not rely on any drivers, implementing all hardware access itself, it would run under any OS, provided it was run in ring 0. (A very simple OS like this is usually called a monitor if it is used only to launch other programs and offers no additional functionality.)
Modern multi-tasking OSes
More modern operating systems, including UNIX, Windows versions from NT onwards, and various now-obscure OSes, decided to improve on this situation. Users wanted additional features such as multitasking, so they could run more than one application at once, and protection, so that a bug (or malicious code) in one application could no longer cause unlimited damage to the machine and its data.
This was done using the rings mentioned above: the OS took the sole place in ring 0, and applications ran in the outer, untrusted rings, able to perform only the restricted set of operations that the OS allowed.
However, this increased utility and protection came at a cost. Programs now had to work with the OS to perform tasks they were not allowed to do themselves. They could no longer, for example, take direct control of the hard disk by accessing its memory and changing arbitrary data; instead, they had to ask the OS to perform these tasks for them, so that the OS could check that they were allowed to perform the operation and were not changing files that did not belong to them, and also verify that the operation was valid and would not leave the hardware in an undefined state.
Each OS decided on a different implementation of these protections, based partly on the architecture the OS was designed for and partly on the design and principles of the OS in question. UNIX, for example, focused on making machines good for multi-user use, while Windows was designed to be simpler and to run on slower hardware with a single user. The way user-space programs talk to the OS is also completely different on x86 than on ARM or MIPS, for example, forcing a multi-platform OS to make decisions based on the hardware it targets.
These OS-specific interactions are usually called “system calls”, and they entirely define how a user-space program interacts with the hardware through the OS. They differ fundamentally from one OS to another, and so a program that does its work through system calls must be OS-specific.
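As a concrete illustration, here is a minimal sketch (Linux-specific, and assuming the POSIX syscall() wrapper is available) of invoking the write system call directly instead of going through a portable library function. The equivalent Windows program would use an entirely different call, such as WriteFile().

#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
    /* Ask the kernel directly: system call number SYS_write,
       file descriptor 1 (stdout), buffer, length. */
    syscall(SYS_write, 1, "Hello World\n", 12);
    return 0;
}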
The Program Loader
In addition to system calls, each OS provides a different method of loading a program from secondary storage into memory. To be loadable by a specific OS, a program must contain a special header which describes to the OS how it may be loaded and run.
This header used to be simple enough that writing a loader for a different format was almost trivial. However, with modern formats such as ELF, which support advanced features like dynamic linking and weak declarations, it is now nearly impossible for an OS to load binaries that were not designed for it. This means that even without the system call incompatibilities, it is immensely difficult just to place a program in RAM in a way in which it can be run.
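To make the format differences tangible, here is a small sketch that merely inspects the first bytes of an executable file. ELF binaries begin with the bytes 0x7F 'E' 'L' 'F', while Windows executables begin with 'MZ'; a loader that expects one rejects the other outright. (The output strings are just for illustration.)

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "rb");
    if (!f) {
        perror("fopen");
        return 1;
    }
    unsigned char magic[4] = {0};
    fread(magic, 1, 4, f);      /* read the file's magic number */
    fclose(f);
    if (memcmp(magic, "\x7f" "ELF", 4) == 0)
        puts("ELF header: built for a UNIX-style loader");
    else if (magic[0] == 'M' && magic[1] == 'Z')
        puts("MZ/PE header: built for a DOS/Windows loader");
    else
        puts("unrecognized executable format");
    return 0;
}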
Libraries
Programs rarely use system calls directly, however. They almost exclusively gain their functionality through libraries which wrap the system calls in a slightly friendlier format for the programming language. For example, C has the C standard library, provided by glibc under Linux and similar systems and by the Win32 libraries under Windows NT and above, and most other programming languages have similar libraries which wrap system functionality in an appropriate way.
These libraries can, to some degree, even overcome the cross-platform issues described above. A range of libraries, such as SDL, are designed to provide a uniform platform to applications while internally managing calls to a wide range of OSes. This means that though programs cannot be binary-compatible, programs which use these libraries can share common source between platforms, making porting as simple as recompiling.
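For instance, here is a minimal sketch using SDL2 (assuming the SDL2 development headers are installed): the same source compiles unchanged on Windows, Linux, and macOS, because the library translates each call into the native system calls underneath; only the resulting binaries differ.

#include <SDL.h>

int main(int argc, char *argv[])
{
    /* SDL picks the right OS facilities (Win32, X11/Wayland, Cocoa...)
       behind this uniform interface. */
    if (SDL_Init(SDL_INIT_VIDEO) != 0)
        return 1;
    SDL_Window *win = SDL_CreateWindow("Hello",
        SDL_WINDOWPOS_CENTERED, SDL_WINDOWPOS_CENTERED,
        640, 480, 0);
    if (!win) {
        SDL_Quit();
        return 1;
    }
    SDL_Delay(2000);           /* keep the window visible briefly */
    SDL_DestroyWindow(win);
    SDL_Quit();
    return 0;
}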
Exceptions to the Above
Despite all I have said here, there have been attempts to overcome the limitation of not being able to run programs on more than one operating system. A good example is the Wine project, which has successfully emulated the Win32 program loader, binary format, and system libraries, allowing Windows programs to run on various UNIXes. There is also a compatibility layer allowing several BSD UNIX operating systems to run Linux software, and of course Apple’s own shim allowing old Mac OS software to run under Mac OS X.
However, these projects work only through enormous amounts of manual development effort. Depending on how different the two OSes are, the difficulty ranges from a fairly small shim to near-complete emulation of the other OS, which is often more complex than writing an entire operating system itself, so this is the exception and not the rule.
As you can see, APIs are not indicated as a part of the operating system.
I think you are reading too much into the diagram. Yes, an OS will specify a binary interface for how operating system functions are called, and it will also define a file format for executables, but it will also provide an API, in the sense of providing a catalog of functions that can be called by an application to invoke OS services.
I think the diagram is just trying to emphasize that operating system functions are usually invoked through a different mechanism than a simple library call. Most common OSes use processor interrupts to access OS functions. A typical modern operating system is not going to let a user program directly access any hardware. If you want to write a character to the console, you have to ask the OS to do it for you. The system call used to write to the console varies from OS to OS, so right there is one example of why software is OS-specific.
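Here is a hedged sketch of that contrast, writing to the console through each OS’s own interface rather than through portable stdio: on Windows it calls Win32’s WriteFile() on the console handle, while on POSIX systems it calls write() on file descriptor 1.

#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#endif

int main(void)
{
#ifdef _WIN32
    /* Win32 path: console output goes through a HANDLE. */
    DWORD written;
    WriteFile(GetStdHandle(STD_OUTPUT_HANDLE),
              "Hello World\n", 12, &written, NULL);
#else
    /* POSIX path: console output goes through file descriptor 1. */
    write(1, "Hello World\n", 12);
#endif
    return 0;
}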
printf is a function from the C runtime library, and in a typical implementation it is a fairly complex function. If you google, you can find the source for several versions online. See this page for a guided tour of one. Down in the grass, though, it ends up making one or more system calls, and each of those system calls is specific to the host operating system.
Is the compiler doing anything OS specific when compiling this?
Probably. At some point during the compiling and linking process, your code is turned into an OS-specific binary and linked with any required libraries. Your program has to be saved in a format that the operating system expects, so that the OS can load the program and start executing it. Furthermore, you’re calling the standard library function printf(), which at some level is implemented in terms of the services that the operating system provides.
Libraries provide an interface — a layer of abstraction from the operating system and the hardware — and that makes it possible to recompile your program for a different operating system or different hardware. But that abstraction exists at the source level — once the program is compiled and linked, it’s connected to a specific implementation of that interface that’s specific to a given OS.
There are a number of reasons, but one very important reason is that the Operating System has to know how to read the series of bytes that make up your program into memory, find the libraries that go with that program and load them into memory, and then start executing your program code. In order to do this, the creators of the OS create a particular format for that series of bytes so that the OS code knows where to look for the various parts of the structure of your program. Because the major Operating Systems have different authors, these formats often have little to do with each other. In particular, the Windows executable format has little in common with the ELF format most Unix variants use. So all this loading, dynamic linking and executing code has to be OS specific.
Next, each OS provides a different set of libraries for talking to the hardware layer. These are the APIs you mention, and they are generally libraries that present a simpler interface to the developer while translating it into more complex, more specific calls into the depths of the OS itself, those calls often being undocumented or secured. This layer is often quite grey, with newer “OS” APIs built partially or entirely on top of older ones. For example, in Windows, many of the newer APIs Microsoft has created over the years are essentially layers on top of the original Win32 APIs.
An issue that does not arise in your example, but that is one of the bigger ones developers face, is interfacing with the window manager to present a GUI. Whether the window manager is part of the “OS” sometimes depends on your point of view, as well as on the OS itself: the GUI in Windows is integrated with the OS at a deep level, while the GUIs on Linux and OS X are more clearly separated. This matters because what people today call “the operating system” is a much bigger beast than what textbooks tend to describe, including many, many application-level components.
Finally, not strictly an OS issue, but an important one in executable file generation: different machines have different assembly language targets, so the actual generated object code must differ. This is not, strictly speaking, an “OS” issue but rather a hardware issue, but it does mean you need different builds for different hardware platforms.
From another answer of mine:
Consider early DOS machines, and what Microsoft’s real contribution to the world was: AutoCAD had to write drivers for each printer it could print to. So did Lotus 1-2-3. In fact, if you wanted your software to print, you had to write your own drivers. If there were 10 printers and 10 programs, then 100 different pieces of essentially the same code had to be written separately and independently.

What Windows 3.1 tried to accomplish (along with GEM and so many other abstraction layers) was to make it so the printer manufacturer wrote one driver for their printer, and the programmer wrote one driver for the Windows printer class.

Now with 10 programs and 10 printers, only 20 pieces of code have to be written, and since the Microsoft side of the code was the same for everyone, examples from MS meant that you had very little work to do.

Now a program wasn’t restricted to just the 10 printers it chose to support, but could use all the printers whose manufacturers provided drivers for Windows.
So the OS provides services to applications so that applications don’t have to do redundant work.
Your example C program uses printf, which sends characters to stdout – an OS-specific resource that will display the characters on a user interface. The program doesn’t need to know where the output goes: it could be a DOS console, a graphical window, or a pipe feeding the output to another process as input.
Because the OS provides these resources, programmers can accomplish much more with little work.
However, even starting a program is complicated. The OS expects an executable file to have certain information at the beginning that tells the OS how it should be started, and in some cases (more advanced environments like android or iOS) what resources will be required that need approval since they touch resources outside the “sandbox” – a security measure to help protect users and other apps from misbehaving programs.
So even if the executable machine code is the same, and there are no OS resources required, a program compiled for Windows won’t run on an OS X operating system without an additional emulation or translation layer, even on the same exact hardware.
Early DOS-style operating systems could often share programs, because they implemented the same API in firmware (the BIOS) and the OS hooked into it to provide services. So if you wrote and compiled a COM program – which is just a memory image of a series of processor instructions – you could run it on CP/M, MS-DOS, and several other operating systems. In fact, you can still run COM programs on modern Windows machines. Other operating systems don’t use the same BIOS API hooks, so COM programs won’t run on them without, again, an emulation or translation layer. EXE programs follow a structure that includes much more than mere processor instructions, so along with the API issues, they won’t run on a machine that doesn’t understand how to load them into memory and execute them.
Actually, the real answer is that if every OS did understand the same executable binary file layout, and you only limited yourself to standardized functions (like in the C standard library) that the OS provided (which OSes do provide), then your software would, in fact, run on any OS.
Of course, the reality is that’s not the case. An EXE file doesn’t have the same format as an ELF file, even though both contain binary code for the same CPU.* So each operating system would need to be able to interpret all file formats, and they simply didn’t do this in the beginning, and there was no reason for them to start doing so later (almost certainly for commercial reasons rather than technical ones).
Furthermore, your program probably needs to do things that the C library doesn’t define how to do (even for simple things like listing the contents of a directory), and in those cases every OS provides its own function(s) for achieving your task, naturally meaning there won’t be a lowest common denominator for you to use (unless you make that denominator yourself).
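To illustrate, here is a minimal sketch of the directory-listing case mentioned above (POSIX-specific, using <dirent.h>). A Windows build would instead have to call FindFirstFile()/FindNextFile() from <windows.h>, so the source itself is forced to pick a platform or wrap both.

#include <stdio.h>
#include <dirent.h>

int main(void)
{
    DIR *dir = opendir(".");    /* POSIX call; no standard-C equivalent */
    if (!dir) {
        perror("opendir");
        return 1;
    }
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL)
        puts(entry->d_name);    /* print each directory entry's name */
    closedir(dir);
    return 0;
}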
So in principle, it’s perfectly possible. In fact, WINE runs Windows executables directly on Linux.
But it’s a ton of work and (usually) commercially unjustified.
*Note: There’s a lot more to an executable file than just binary code. There’s a ton of information that tells the operating system what libraries the file depends on, how much stack memory it needs, what functions it exports to other libraries that may depend on it, where the operating system might find relevant debug information, how to “re-locate” the file in memory if necessary, how to make exception handling work correctly, and so on. Again, there could be a single format for all of this that everyone agrees on, but there simply isn’t.
The diagram has the “application” layer (mostly) separated from the “operating system” layer by the “libraries”, and that implies that “application” and “OS” don’t need to know about each other. That is a simplification in the diagram, but it’s not quite true.
The problem is that the “library” actually has three parts to it: the implementation, the interface to the application, and the interface to the OS. In principle, the first two can be made “universal” as far as the OS is concerned (it depends on where you slice it), but the third part – the interface to the OS – generally cannot. The interface to the OS will necessarily depend on the OS, the APIs it provides, the packaging mechanism (e.g. the file format used by Windows DLLs), etc.
Because the “library” is generally made available as a single package, once the program picks a “library” to use, it commits to a specific OS. This happens in one of two ways: a) the programmer picks completely in advance, so the binding between the library and the application can be universal, but the library itself is bound to the OS (a sketch of this option follows below); or b) the programmer sets things up so the library is selected when you run the program, but then the binding mechanism itself, between the program and the library, is OS-dependent (e.g. the DLL mechanism in Windows). Each has its advantages and disadvantages, but either way you have to make a choice in advance.
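Here is a tiny sketch of option (a), with hypothetical names: the application-facing interface is identical everywhere, while the OS-facing implementation is chosen at build time, binding the resulting binary to one OS.

/* console.h - application-facing interface: identical on every OS */
#ifndef CONSOLE_H
#define CONSOLE_H
void console_write(const char *msg);
#endif

/* console_posix.c - OS-facing implementation, linked into UNIX builds */
#include <string.h>
#include <unistd.h>
#include "console.h"
void console_write(const char *msg)
{
    write(1, msg, strlen(msg));    /* wraps the POSIX write system call */
}

/* console_win32.c - OS-facing implementation, linked into Windows builds */
#ifdef _WIN32
#include <string.h>
#include <windows.h>
#include "console.h"
void console_write(const char *msg)
{
    DWORD n;
    WriteFile(GetStdHandle(STD_OUTPUT_HANDLE), msg,
              (DWORD)strlen(msg), &n, NULL);
}
#endif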
Now, this doesn’t mean that it’s impossible to do, but you have to be very clever. To overcome the problem, you would have to go the route of picking the library at run time, and you would have to come up with a universal binding mechanism that doesn’t depend on the OS (so you are responsible for maintaining it, which is a lot more work). Sometimes it’s worth it.
You don’t have to, but if you are going to put in the effort to do that, there is a good chance you don’t want to be tied to a specific processor either, so you will write a virtual machine and compile your program to a processor-neutral code format.
By now you should have noticed where I’m going. Language platforms like Java do exactly that. The Java runtime (library) defines the OS-neutral binding between your Java program and the library (how the Java runtime opens and runs your program), and it provides an implementation specific to the current OS. .NET does the same thing to an extent, except that Microsoft doesn’t provide a “library” (runtime) for anything but Windows (but others do; see Mono). And Flash actually does the same thing too, although it is more limited in scope to the browser.
Finally, there are ways to do the same thing without a custom binding mechanism. You could use conventional tools, but defer the binding step to the library until the user picks the OS. That’s exactly what happens when you distribute the source code. The user takes your program and binds it to the processor (compile it) and OS (link it) when the user is ready to run it.
It all depends on how you slice the layers. At the end of the day, you always have a computing device made with specific hardware running specific machine code. The layers are there largely as a conceptual framework.
Software is not always OS-specific. Both Java and the earlier p-code system (and even ScummVM) allow for software that is portable across operating systems. Infocom (makers of Zork and the Z-machine) also had a relational database based on another virtual machine. However, at some level something has to translate even those abstractions into actual instructions to be executed on the computer.
You say
software produced using programming languages for certain operating systems only work with them
But the program you give as an example will work on many operating systems, and even some bare-metal environments.
The important thing here is the distinction between the source code and the compiled binary. The C programming language is specifically designed to be OS-independent in source form. It does this by leaving the interpretation of things like “print to the console” up to the implementer. But C may be compiled to something which is OS-specific (see other answers for reasons), for example the PE or ELF executable formats.
Other people have covered the technical details well, I’d like to mention a less technical reason, the UX/UI side of things:
Write Once, Feel Awkward Everywhere
Every operating system has its own user interface APIs and design standards. It is possible to write one user interface for a program and have it run on multiple operating systems, but doing so all but guarantees that the program will feel out of place everywhere. Making a good user interface requires tweaking the details for each supported platform.
A lot of these are little details, but get them wrong and you’ll frustrate your users:
- Confirm dialogs have their buttons in different orders on Windows and OSX; get this wrong and users will click the wrong button by muscle memory. Windows has “OK”, “Cancel” in that order. OSX swaps the order, and the do-it button’s text is a short description of the action to be performed: “Cancel”, “Move to Trash”.
- “Go back” behavior is different for iOS and Android. iOS applications draw their own back button as needed, typically in the top-left. Android has a dedicated button in the lower-left or lower-right depending on screen rotation. Quick ports to Android will behave incorrectly if the OS back button is ignored.
- Momentum scrolling is different between iOS, OSX, and Android. Sadly, if you’re not writing native UI code, you likely have to write your own scrolling behavior.
Even when it’s technically possible to write one UI codebase that runs everywhere, it’s best to make adjustments for each supported operating system.
An important distinction at this point is separating the compiler from the linker. The compiler most likely produces more or less the same output (the differences mostly being due to various #if WINDOWS sections). The linker, on the other hand, has to handle all the platform-specific stuff: linking the libraries, building the executable file, etc.
In other words, the compiler cares mostly about the CPU architecture, because it’s producing the actual runnable code and has to use the instructions and resources of the CPU (note that .NET’s IL or the JVM’s bytecode would be considered the instruction set of a virtual CPU in this view). This is why you must compile code separately for x86 and ARM, for example.
The linker, on the other hand, has to take all this raw data and instructions, and put it in a format that the loader (these days, this would almost always be the OS) can understand, as well as linking any statically linked libraries (which also includes the code required for dynamic linking, memory allocation etc.).
In other words, you could compile the code only once and have it run on both Linux and Windows – but you would have to link it twice, producing two different executables. In practice, you often have to make allowances in code too (that’s where the (pre-)compiler directives come in), so even compile-once-link-twice isn’t used much. Not to mention that compilation and linking are now usually treated as a single step in the build (just as you no longer care about the individual parts of the compiler itself).
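As a small, hedged sketch of those “allowances in code” (the sleep_ms name here is just an illustrative wrapper, not a standard API): preprocessor directives select the platform-specific call at compile time, so one source file can target either OS even though each build links against different libraries.

#ifdef _WIN32
#include <windows.h>
#define sleep_ms(ms) Sleep(ms)             /* Win32 Sleep() takes milliseconds */
#else
#include <unistd.h>
#define sleep_ms(ms) usleep((ms) * 1000)   /* POSIX usleep() takes microseconds */
#endif

int main(void)
{
    sleep_ms(500);   /* same source line, different OS call underneath */
    return 0;
}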
DOS-era software was often more binary-portable, but you have to understand that it was also compiled not against DOS or Unix, but rather against a certain contract that was common to most IBM-style PCs: what today would be API calls were offloaded to software interrupts. This needed no static linking, since you only had to set the necessary registers and call e.g. int 10h for video functions, and the CPU jumped to a memory pointer declared in the interrupt table. Of course, practice was a lot trickier, because to get pedal-to-the-metal performance you had to write all of those routines yourself, but that basically amounted to getting around the OS altogether. And there is one thing that invariably needs interaction with the OS API: program termination. But still, if you used the simplest format available (e.g. COM on DOS, which has no header, just instructions) and didn’t want to exit, well – lucky you! And of course, you could handle proper termination in the runtime as well, so you could have code for both Unix termination and DOS termination in the same executable and detect at runtime which one to use 🙂