I think i’ve understood more or less what a parsed Scheme program looks like (a binary tree with atomic values on the leaves, if i have understood correctly). Can anybody please define to me, or give a reference, what a state (or a computation) of a Scheme program is? Is it just the current binding plus a position, or a stack of positions, on the syntax tree? (In such a case, i would appreciate a reference for a formal definition of Scheme binding as well :).)
Is there some simple description, like the one for the Turing Machine (the program + the current content of the tape + the current position on the tape)?
1
Both of the other answers look excellent to me.
I would add the following: the state of a scheme program is… a scheme program! That is, you can define a “meaning” function for scheme programs by showing how programs reduce to other programs. This is called a “small-step” semantics. To show you what I mean: what are the “steps”–in a seventh grade sense–in evaluating
3 * (4 + 5)
? According to the small-step rules provided by most teachers, the first step in evaluating this program is to reduce it to
3 * 20
In exactly the same way, you can define a small-step semantics for Scheme that reduces a program to an answer by taking a series of reduction steps.
12
I’m not aware of any formal definition, but I can speculate a bit.
Basically you can represent any program with a lambda expression in lambda calculus. A binary tree is a simpler view introduced by creators of Scheme, to give a nice idea to newbies in computer science.
Lambda calculus explicitly defines operational semantics for evaluation of lambda expression. Basically this means, in lambda calculus, they define how you will calculate the actual value of any lambda expression, or if we map lambda expressions to programs, how a program will work.
There are different alternatives for defining operational semantics of a programming language. In lambda calculus world these alternatives involve reduction rules and reduction strategy. They always seem like identical but small variations change how lambda expression is reduced. In programming world this corresponds to how program executes. For example different reduction mechanisms might evaluate the actual parameters of a function before or after function definition is evaluated. This corresponds to eager and lazy parameter evaluation mechanisms of different programming languages.
In Scheme it is possible to represent each expression in a simpler binary tree view. Similar to lambda expressions, binary trees are reduced from leaves to root. So each configuration of a Scheme program running is still a binary tree.
However, function calls complicate this. When you call a new function, you have a new Scheme expression to evaluate. It is semantically possible to attach the new expression to the binary tree by replacing the function call. But practically it is not a nice method. I know that List does not use that approach, instead use a stack of configurations, each corresponding to a function. I wouldn’t be surprised if Scheme follows this faction.
Moreover, bindings complicate that issue further. Scheme is not purely functional, it is possible to define bindings and their semantic should be included in the program by obeying scope rules. Even when there are simple scope rules, it is difficult to combine bindings with binary expression trees. Simplest way is using virtual links to bindings (like pointers).
These are only based on my observations from the times I’ve used Scheme, and some of them might false. But I guess it’s clear that a very simple model such as Turing Machine is not purely applicable when factors such as function call and binding go into the picture. Programming language and compiler design is a complex puzzle with many sub-problems inside.
There is book available on web, called An Introduction to Scheme and its Implementations. It might provide more insight and real facts about the Scheme implementation. If you want to learn more on this subject I recommend reading a programming languages book on lambda calculus and a compiler design book (probably the famous dragon book).
1
State and meaning are two very different things, and your question sort of conflates them.
To the question “What does this Scheme program mean?”, one can look at a semantics, either an informal one (a specification written in English, say), or a formal one (usually an operational semantics, but occasionally a denotational semantics, etc.). However, the semantics of a real-world programming language like Scheme is going to be extremely complicated.
Now, if you’ve chosen an operational semantics (as opposed to a denotational semantics), it will have something that looks like a state (you described the state of the operational semantics of a Turing Machine in your question). This may be a reasonably simple object… however, there are many different operational semanticses you can come up with for Scheme (or any real-world language), and each of them will have their state look like something different!
So, essentially, there are infinitely many answers to your question, none of them is going to be illuminating without its corresponding semantics.
The Turing Machine, because the way it’s usually presented is so close to an operational semantics anyways (because it’s so low-level and “state-y”), doesn’t really have this problem, because there’s only one good choice, which you described well in your question.
4
After looking into Alonzo Church’s The calculi of lambda-conversion, and thinking a bit about rewriting systems, different definitions of algorithms, and about Lisp, I have come to mostly agree with John Clements’ answer: that the state of a Scheme program is a “reduced” Scheme program.
I would like however to formulate my own version of it (which i can edit or improve if I learn more).
In my opinion, a state of a Scheme program can be viewed as a sort of a Scheme program, where instead of constant literals, arbitrary constant data values are allowed (they have to be allowed because I am not sure if, for example, an arbitrary float value can be written as a float literal).
For example, a program can execute as follows (the sequence of states):
The program:
(define x 0)
(set! x (lambda (x) (+ x 1)))
(define y 2)
(x y)
Program state after step 1:
(define x (lambda (x) (+ x 1)))
(define y 2)
(x y)
Program state after step 2:
(define y 2)
((lambda (x) (+ x 1)) y)
Program state after step 3:
((lambda (x) (+ x 1)) 2)
Program state after step 4:
(+ 2 1)
Program state after step 5:
3
Update. To confirm this point of view, here are quotes from Racket reference:
1.1 Evaluation Model
Racket evaluation can be viewed as the simplification of expressions to obtain values. For example, just as an elementary-school student simplifies
1 + 1 = 2
Racket evaluation simplifies
(+ 1 1) → 2
The arrow
→
above replaces the more traditional=
to emphasize that evaluation proceeds in a particular direction towards simpler expressions. In particular, a value is an expression that evaluation simplifies no further, such as the number2
.[…]
1.1.4 Top-Level Variables
Given
x = 10
then an algebra student simplifies
x + 1
as follows:x + 1 = 10 + 1 = 11
Racket works much the same way, in that a set of top-level variables are available for substitutions on demand during evaluation. […]
Each evaluation step, then, takes the current set of definitions and program to a new set of definitions and program. Before a define can be moved into the set of definitions, its right-hand expression must be reduced to a value.
defined: evaluate: (begin (define x (+ 9 1)) (+ x 1)) → defined: evaluate: (begin (define x 10) (+ x 1)) → defined: (define x 10) evaluate: (begin (void) (+ x 1)) → defined: (define x 10) evaluate: (+ x 1) → defined: (define x 10) evaluate: (+ 10 1) → defined: (define x 10) evaluate: 11
Using
set!
, a program can change the value associated with an existing top-level variable:defined: (define x 10) evaluate: (begin (set! x 8) x) → defined: (define x 8) evaluate: (begin (void) x) → defined: (define x 8) evaluate: x → defined: (define x 8) evaluate: 8
I have also read about the SECD machine, closures, and environments, but they look to me like just a peculiar implementation method.