I’ve been using Python for a few days now and I think I understand the difference between dynamic and static typing. What I don’t understand is under what circumstances dynamic typing would be preferred. It is flexible and readable, but at the expense of more runtime checks and additional required unit testing.
Aside from non-functional criteria like flexibility and readability, what reasons are there to choose dynamic typing? What can I do with dynamic typing that isn’t possible otherwise? What specific code example can you think of that illustrates a concrete advantage of dynamic typing?
16
Since you asked for a specific example, I’ll give you one.
Rob Conery’s Massive ORM is 400 lines of code. It’s that small because Rob is able to map SQL tables and provide object results without requiring a lot of static types to mirror the SQL tables. This is accomplished by using the dynamic data type in C#. Rob’s web page describes this process in detail, but it seems clear that, in this particular use case, the dynamic typing is in large part responsible for the brevity of the code.
Compare with Sam Saffron’s Dapper, which uses static types; the SQLMapper class alone is 3000 lines of code.
Note that the usual disclaimers apply, and your mileage may vary; Dapper has different goals than Massive does. I just point this out as an example of something that you can do in 400 lines of code that probably wouldn’t be possible without dynamic typing.
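To give a rough feel for the idea in Python (this is not Massive's code; the table, its columns, and the `query` helper are made up for illustration), a row mapper can return objects whose attributes come straight from the result set, with no class declared per table:

```python
import sqlite3
from types import SimpleNamespace


def query(conn, sql, params=()):
    """Run a query and return one attribute-style object per row.

    Hypothetical helper: attribute names are taken from the cursor's
    column metadata at runtime, so no per-table class is declared.
    """
    cursor = conn.execute(sql, params)
    columns = [col[0] for col in cursor.description]
    return [SimpleNamespace(**dict(zip(columns, row))) for row in cursor]


# In-memory database with a made-up table, just for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, name TEXT, price REAL)")
conn.execute("INSERT INTO products VALUES (1, 'widget', 9.5)")

products = query(conn, "SELECT id, name, price FROM products")
first = products[0]
print(first.name, first.price)
```

A statically typed mapper would typically need a `Product` class (or generated code) for `first.name` to type-check; here the shape of the object is decided by the query.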
Dynamic typing allows you to defer your type decisions to runtime. That’s all.
Whether you use a dynamically-typed language or a statically-typed one, your type choices must still be sensible. You’re not going to add two strings together and expect a numeric answer unless the strings contain numeric data, and if they do not, you’re going to get unexpected results. A statically typed language will not let you do this in the first place.
Proponents of statically typed languages point out that the compiler can do a substantial amount of “sanity checking” on your code at compile time, before a single line executes. This is a Good Thing™.
C# has the dynamic keyword, which allows you to defer the type decision to runtime without losing the benefits of static type safety in the rest of your code. Type inference (var) eliminates much of the pain of writing in a statically-typed language by removing the need to always explicitly declare types.
Dynamic languages do seem to favor a more interactive, immediate approach to programming. Nobody expects you to have to write a class and go through a compile cycle to type out a bit of Lisp code and watch it execute. Yet that’s exactly what I’m expected to do in C#.
15
Phrases like “static typing” and “dynamic typing” are thrown around a lot, and people tend to use subtly different definitions, so let’s start by clarifying what we mean.
Consider a language that has static types that are checked at compile-time. But say that a type error generates only a non-fatal warning, and at runtime, everything is duck-typed. These static types are only for the programmer’s convenience, and do not affect the codegen. This illustrates that static typing does not by itself impose any limitations, and is not mutually exclusive with dynamic typing. (Objective-C is a lot like this.)
But most static type systems do not behave this way. There are two common properties of static type systems that can impose limitations:
The compiler may reject a program that contains a static type error.
This is a limitation because many type safe programs necessarily contain a static type error.
For example, I have a Python script that needs to run as both Python 2 and Python 3. Some functions changed their parameter types between Python 2 and 3, so I have code like this:
if sys.version_info[0] == 2:
    wfile.write(txt)
else:
    wfile.write(bytes(txt, 'utf-8'))
A Python 2 static type checker would reject the Python 3 code (and vice versa), even though it would never be executed. My type safe program contains a static type error.
As another example, consider a Mac program that wants to run on OS X 10.6, but take advantage of new features in 10.7. The 10.7 methods may or may not exist at runtime, and it’s on me, the programmer, to detect them. A static type checker is forced to either reject my program to ensure type safety, or accept the program, along with the possibility of producing a type error (function missing) at runtime.
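The same runtime feature detection is common in Python. A minimal sketch, where `monotonic_ns_compat` is a made-up name: `time.monotonic_ns` only exists on Python 3.7 and later, so the code asks the running interpreter whether it is there instead of trusting what was known when the code was written.

```python
import time


def monotonic_ns_compat():
    """Return a monotonic clock reading in nanoseconds.

    Uses time.monotonic_ns when the running interpreter provides it
    (Python 3.7+), and falls back to time.monotonic otherwise.
    """
    if hasattr(time, "monotonic_ns"):
        return time.monotonic_ns()
    # Fallback: convert the float number of seconds to integer nanoseconds.
    return int(time.monotonic() * 10**9)


print(monotonic_ns_compat())
```

A checker that only knows the older API would flag the first branch, and one that only knows the newer API has no reason to require the fallback.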
Static type checking assumes that the runtime environment is adequately described by the compile time information. But predicting the future is perilous!
Here’s one more limitation:
The compiler may generate code that assumes the runtime type is the static type.
Assuming the static types are “correct” provides many opportunities for optimization, but these optimizations can be limiting. A good example is proxy objects, e.g. remoting. Say you wish to have a local proxy object that forwards method invocations to a real object in another process. It would be nice if the proxy were generic (so it can masquerade as any object) and transparent (so that existing code does not need to know it is talking to a proxy). But to do this, the compiler cannot generate code that assumes the static types are correct, e.g. by statically inlining method calls, because that will fail if the object is actually a proxy.
Examples of such remoting in action include ObjC’s NSXPCConnection or C#’s TransparentProxy (whose implementation required a few pessimizations in the runtime – see here for a discussion).
When the codegen is not dependent on the static types, and you have facilities like message forwarding, you can do lots of cool stuff with proxy objects, debugging, etc.
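As a small illustration in Python, here is a sketch of a generic, transparent proxy. `LoggingProxy` and `shout` are hypothetical names, and a real remoting proxy would send the call to another process rather than record it in a list:

```python
class LoggingProxy:
    """Forward every attribute access to a wrapped target object.

    Hypothetical stand-in for a remoting proxy: a real one would send the
    call to another process instead of recording it in a local list.
    """

    def __init__(self, target):
        self._target = target
        self.calls = []

    def __getattr__(self, name):
        # Only invoked for names that are not found on the proxy itself.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def forward(*args, **kwargs):
            self.calls.append(name)
            return attr(*args, **kwargs)

        return forward


def shout(text_like):
    # Existing code written with str in mind; it knows nothing about proxies.
    return text_like.upper() + "!"


proxied = LoggingProxy("hello")
print(shout(proxied))
print(proxied.calls)
```

This only works because the call to `upper` is looked up at runtime. Operators such as `+` applied to the proxy itself would need extra work, since Python looks up special methods on the type rather than through `__getattr__`.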
So that’s a sampling of some of the stuff you can do if you are not required to satisfy a type checker. The limitations are not imposed by static types, but by enforced static type checking.
4
Duck-typed variables are the first thing everyone thinks of, but in most cases you can get the same benefits through static type inference.
But duck typing in dynamically-created collections is hard to achieve in any other way:
>>> d = JSON.parse(foo)
>>> d['bar'][3]
12
>>> d['baz']['qux']
'quux'
So, what type does JSON.parse return? A dictionary of arrays-of-integers-or-dictionaries-of-strings? No, even that isn’t general enough.
JSON.parse has to return some kind of “variant value” that can be null, bool, float, string, array of any of these types recursively, or dictionary from string to any of these types recursively. The main strengths of dynamic typing come from having such variant types.
So far, this is a benefit of dynamic types, not of dynamically-typed languages. A decent static language can simulate any such type perfectly. (And even “bad” languages can often simulate them by breaking type safety under the hood and/or requiring clumsy access syntax.)
The advantage of dynamically-typed languages is that such types cannot be inferred by static type inference systems. You have to write the type explicitly. But in many such cases, including this one, the code to describe the type is exactly as complicated as the code to parse/construct the objects without describing the type, so that still isn’t necessarily an advantage.
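In Python terms, a sketch of both halves might look like this. The alias name `JSONValue` and the sample document are made up; `json.loads` is the standard-library parser.

```python
import json
from typing import Dict, List, Union

# A recursive "variant" type for JSON values, written out by hand.
# The alias name JSONValue is made up; the string forward references
# let the alias mention itself.
JSONValue = Union[
    None, bool, int, float, str,
    List["JSONValue"], Dict[str, "JSONValue"],
]

foo = '{"bar": [0, 1, 2, 12], "baz": {"qux": "quux"}}'

# json.loads builds the nested structure at runtime without being told
# any of this; the alias above only matters to a static checker.
d = json.loads(foo)
print(d["bar"][3])
print(d["baz"]["qux"])
```

If `d` were annotated as `JSONValue`, a static checker would typically want each level narrowed (is it a dict? a list?) before indexing, which is the explicit work the dynamic version skips.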
5
As every remotely practical static type system is severely limited compared to the programming language it is concerned with, it cannot express all invariants which code could check at runtime. In order to not circumvent the guarantees a type system attempts to give, it hence opts to be conservative and disallow use cases which would pass these checks, but cannot (in the type system) be proven to.
I’ll give an example. Suppose you implement a simple data model to describe data objects, collections of them, etc., which is statically typed in the sense that, if the model says the attribute x of an object of type Foo holds an integer, it must always hold an integer. Because this is a runtime construct, you cannot type it statically. Suppose you store the data described in YAML files. You create a hash map (to be handed to a YAML library later), get the x attribute, store it in the map, get that other attribute which just so happens to be a string, … hold on a second. What’s the type of the_map[some_key] now? Well shoot, we know that some_key is 'x' and the result hence must be an integer, but the type system can’t even begin to reason about this.
Some actively researched type systems may work for this specific example, but these are exceedingly complicated (both for compiler writers to implement and for the programmer to reason in), especially for something this “simple” (I mean, I just explained it in one paragraph).
Of course, today’s solution is boxing everything and then casting (or having a bunch of overridden methods, most of which raise “not implemented” exceptions). But this isn’t statically typed, it’s a hack around the type system to do the type checks at runtime.
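A sketch of that boxing-and-casting workaround, using Python type hints to stand in for a static type system. The model `FOO_MODEL`, the `Boxed` alias, and `to_map` are all hypothetical names:

```python
from typing import Dict, Union

# Hypothetical runtime model: attribute name -> expected Python type.
FOO_MODEL = {"x": int, "label": str}

Boxed = Union[int, str]  # the "box": all a static checker knows per value


def to_map(obj_attrs: Dict[str, Boxed]) -> Dict[str, Boxed]:
    """Copy attributes into a plain map, checking them against the model."""
    the_map: Dict[str, Boxed] = {}
    for key, expected in FOO_MODEL.items():
        value = obj_attrs[key]
        if not isinstance(value, expected):
            raise TypeError(f"{key} must be {expected.__name__}")
        the_map[key] = value
    return the_map


the_map = to_map({"x": 42, "label": "answer"})

# We know 'x' is an int, but the declared type is only Boxed,
# so the check has to be repeated (or cast away) at the use site.
x = the_map["x"]
assert isinstance(x, int)
print(x + 1)
```

The invariant "key 'x' maps to an int" lives in `FOO_MODEL`, which is data, so the annotations cannot express it and the `isinstance` checks do the real work at runtime.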
7
There is nothing you can do with dynamic typing that you can’t do with static typing, because you can implement dynamic typing on top of a statically typed language.
A short example in Haskell:
data Data = DString String | DInt Int | DDouble Double
-- defining a '+'-style operator here, with explicit promotion behavior
-- (named .+. so that it does not clash with the Prelude's +)
DString a .+. DString b = DString (a ++ b)
DString a .+. DInt b = DString (a ++ show b)
DString a .+. DDouble b = DString (a ++ show b)
DInt a .+. DString b = DString (show a ++ b)
DInt a .+. DInt b = DInt (a + b)
DInt a .+. DDouble b = DDouble (fromIntegral a + b)
DDouble a .+. DString b = DString (show a ++ b)
DDouble a .+. DInt b = DDouble (a + fromIntegral b)
DDouble a .+. DDouble b = DDouble (a + b)
With enough cases you can implement any given dynamic type system.
Conversely, you can also translate any statically typed program into an equivalent dynamic one. Of course, you would lose all compile-time assurances of correctness that the statically typed language provides.
Edit: I wanted to keep this simple, but here are more details about an object model:
A function takes a list of Data as arguments and performs calculations with side effects in ImplMonad, and returns a Data.
type Function = [Data] -> ImplMonad Data
DMember is either a member value or a function.
data DMember = DMemValue Data | DMemFunction Function
Extend Data to include Objects and Functions. Objects are lists of named members.
data Data = .... | DObject [(String, DMember)] | DFunction Function
These static types are sufficient to implement every dynamically typed object system I’m familiar with.
6
Membranes:
A membrane is a wrapper around an entire object graph, as opposed to a wrapper for just a single object. Typically, the creator of a membrane starts out wrapping just a single object in a membrane. The key idea is that any object reference that crosses the membrane is itself transitively wrapped in the same membrane.
Each type is wrapped by a type that has the same interface, but which intercepts messages and wraps and unwraps values as they cross the membrane. What is the type of the wrap function in your favorite statically typed language? Maybe Haskell has a type for such functions, but most statically typed languages don’t, or they end up using Object → Object, effectively abdicating their responsibility as type-checkers.
4
As someone mentioned, in theory there is not much you can do with dynamic typing that you could not do with static typing if you implemented certain mechanisms on your own. Most languages provide type relaxation mechanisms to support type flexibility, like void pointers, a root Object type, or an empty interface.
A better question is why dynamic typing is more suitable and more appropriate in certain situations and for certain problems.
First, let’s define some terms:
Entity – I need a general notion of some entity in the code. It can be anything from a primitive number to complex data.
Behavior – let’s say our entity has some state and a set of methods that allow the outside world to instruct the entity to react in certain ways. Let’s call the state + interface of this entity its behavior. One entity can have more than one behavior, combined in a certain way by the tools the language provides.
Definitions of entities and their behaviors – every language provides some means of abstraction that help you define the behaviors (set of methods + internal state) of certain entities in the program. You can assign a name to these behaviors and say that all instances that have this behavior are of a certain type.
This is probably not that unfamiliar, and you did say you understood the difference, but still. What follows is probably not the most complete or accurate explanation, but I hope it is fun enough to bring some value 🙂
Static typing – the behavior of all entities in your program is examined at compile time, before the code starts to run. This means that if you want, for example, your entity of type Person to have the behavior of (to behave like) a Magician, then you would have to define an entity MagicianPerson and give it the behaviors of a magician, like throwMagic(). If you mistakenly call throwMagic() on an ordinary Person in your code, the compiler will tell you "Error >>> hell, this Person does not have this behavior, dunno throwing magics, no run!".
Dynamic typing – in dynamically typed environments, the available behaviors of entities are not checked until you actually try to do something with a certain entity. Ruby code that calls throwMagic() on a Person will not be caught until execution actually reaches that point. This sounds frustrating, doesn’t it? But it sounds like a revelation as well. Based on this property you can do interesting things. Say you design a game where anything can turn into a Magician and you don’t really know who that will be until you reach a certain point in the code. Then a Frog comes along and you say HeyYouConcreteInstanceOfFrog.extend Magic, and from then on this Frog becomes one particular Frog that has Magic powers. Other Frogs still do not. You see, in statically typed languages you would have to define this relation through some standard means of combining behaviors (like interface implementation). In a dynamically typed language, you can do that at runtime and nobody will care.
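The Frog scenario can be sketched in Python as well. The class, the instance names, and `throw_magic` are invented for the example; `types.MethodType` binds a plain function to a single instance.

```python
import types


class Frog:
    def __init__(self, name):
        self.name = name


def throw_magic(self):
    # Hypothetical behavior that gets attached to one instance at runtime.
    return f"{self.name} throws magic!"


kermit = Frog("Kermit")
other = Frog("Other")

# Bind the function to this one instance only; the Frog class is unchanged.
kermit.throw_magic = types.MethodType(throw_magic, kermit)

print(kermit.throw_magic())
print(hasattr(other, "throw_magic"))
```

Only `kermit` gains the new behavior; every other Frog, existing or created later, is unaffected.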
Most dynamically typed languages have mechanisms to provide a generic behavior that will catch any message that is passed to their interface, for example Ruby’s method_missing and PHP’s __call, if I remember correctly. That means that you can do all kinds of interesting things at runtime and make type decisions based on the current program state. This gives you tools for modeling a problem that are a lot more flexible than in, let’s say, a conservative statically typed language like Java.
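Python's closest counterpart to such a catch-all hook is `__getattr__`, which is called only when normal attribute lookup fails. A small sketch, with a made-up `Finder` class that answers `find_by_<field>` calls that were never defined anywhere:

```python
class Finder:
    """Answer find_by_<field>(value) calls that were never defined.

    Hypothetical example: __getattr__ plays the role of the catch-all
    hook, called only when normal attribute lookup fails.
    """

    def __init__(self, records):
        self.records = records

    def __getattr__(self, name):
        prefix = "find_by_"
        if not name.startswith(prefix):
            raise AttributeError(name)
        field = name[len(prefix):]

        def finder(value):
            return [r for r in self.records if r.get(field) == value]

        return finder


people = Finder([
    {"name": "Ann", "role": "magician"},
    {"name": "Bob", "role": "frog"},
])
print(people.find_by_role("magician"))
```

Which methods "exist" depends on the data and on the name being asked for, so the decision is made while the program runs.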