Homoiconicity
This article needs additional citations for verification. (April 2012) |
In computer programming, homoiconicity is a property of some programming languages, in which the primary representation of programs is also a data structure in a primitive type of the language itself, from the Greek words homo meaning the same and icon meaning representation. This makes metaprogramming easier than in a language without this property. To put that another way, homoiconicity is where a program's source code is written as a basic data structure that the programming language knows how to access. For example, imagine a programming language that is particularly good at list manipulation. Now, imagine what would happen if you required programs for that language to be written in the form of lists. The result would be that the program could access its own source code while running, and programmatically reprogram itself on the fly.
Contents[hide]
|
[edit] History
The original source is the paper Macro Instruction Extensions of Compiler Languages,[1] according to the early and influential paper TRAC, A Text-Handling Language:[2]
One of the main design goals was that the input script of TRAC (what is typed in by the user) should be identical to the text which guides the internal action of the TRAC processor. In other words, TRAC procedures should be stored in memory as a string of characters exactly as the user typed them at the keyboard. If the TRAC procedures themselves evolve new procedures, these new procedures should also be stated in the same script. The TRAC processor in its action interprets this script as its program. In other words, the TRAC translator program (the processor) effectively converts the computer into a new computer with a new program language -- the TRAC language. At any time, it should be possible to display program or procedural information in the same form as the TRAC processor will act upon it during its execution. It is desirable that the internal character code representation be identical to, or very similar to, the external code representation. In the present TRAC implementation, the internal character representation is based upon ASCII. Because TRAC procedures and text have the same representation inside and outside the processor, the term homoiconic is applicable, from homo meaning the same, and icon meaning representation.
[...]
Following suggestion of McCullough, W. S., based upon terminology due to Peirce, C. S. s McIlroy. M. D., "Macro Instruction Extensions of Compiler Languages," Comm. ACM, p. 214-220; April, 1960.
Alan Kay used and possibly popularized the term "homoiconic" through his use of the term in his 1969 PhD thesis:[3]
A notable group of exceptions to all the previous systems are Interactive LISP [...] and TRAC. Both are functionally oriented (one list, the other string), both talk to the user with one language, and both are "homoiconic" in that their internal and external representations are essentially the same. They both have the ability to dynamically create new functions which may then be elaborated at the users's pleasure. Their only great drawback is that programs written in them look like King Burniburiach's letter to the Sumerians done in Babylonian cuniform! [...]
[edit] Use and advantages
One advantage of homoiconicity is that extending the language with new concepts typically becomes simpler, as data representing code can be passed between the meta and base layer of the program. The abstract syntax tree of a function may be composed and manipulated as a data structure in the meta layer, and then evaluated.
A typical demonstration of homoiconicity is the meta-circular evaluator.
[edit] Examples
Languages which are considered to be homoiconic include members of the Lisp family, Curl, REBOL, SNOBOL, XSLT, XQuery, TRAC, Io, Ioke, Joy, Factor, Pico, PostScript, Prolog, R, Tcl, Mathematica, and Julia.
In Von Neumann architecture systems (including the vast majority of general purpose computers today), raw machine code also has this property, the data type being bytes in memory.
[edit] Homoiconicity in Lisp
Lisp uses S-expressions as an external representation for data and code. S-expressions can be read with the primitive Lisp function READ. READ returns Lisp data: lists, symbols, numbers, strings. The primitive Lisp function EVAL uses Lisp code represented as Lisp data, computes side-effects and returns a result. The result will be printed by the primitive function PRINT, which creates an external S-Expression from Lisp data.
Lisp data, a list using different data types: (sub)lists, symbols, strings and integer numbers.
((:name "john" :age 20) (:name "mary" :age 18) (:name "alice" :age 22))
Lisp code. The example uses lists, symbols and numbers.
(* (sin 1.1) (cos 2.03)) ; in Infix: sin(1.1)*cos(2.03)
Create above expression with the primitive Lisp function LIST and set the variable EXPRESSION to the result
(setf expression (list '* (list 'sin 1.1) (list 'cos 2.03)) ) -> (* (SIN 1.1) (COS 2.03)) ; Lisp returns and prints the result (third expression) ; the third element of the expression -> (COS 2.03)
Change the COS term to SIN
(setf (first (third expression)) 'SIN) ;The expression is now (* (SIN 1.1) (SIN 2.03))
Evaluate the expression
(eval expression) -> 0.7988834
Print the expression to a string
(princ-to-string expression) -> "(* (SIN 1.1) (SIN 2.03))"
Read the expression from a string
(read-from-string "(* (SIN 1.1) (SIN 2.03))") -> (* (SIN 1.1) (SIN 2.03)) ; returns a list of lists, numbers and symbols
[edit] Homoiconicity in Prolog
1 ?- X is 2*5. X = 10. 2 ?- L = (X is 2*5), write_canonical(L). is(_, *(2, 5)) L = (X is 2*5). 3 ?- L = (ten(X):-(X is 2*5)), write_canonical(L). :-(ten(A), is(A, *(2, 5))) L = (ten(X):-X is 2*5). 4 ?- L = (ten(X):-(X is 2*5)), assert(L). L = (ten(X):-X is 2*5). 5 ?- ten(X). X = 10. 6 ?-
On line 4 we create a new clause. The operator ":-" separates the head and the body of a clause. With assert/1* we add it to the existing clauses(add it to the "database"), so we can call it later. In other languages we would call it "creating a function during runtime". We can also remove clauses from the database with abolish/1, or retract/1.
* The number after the clause's name is the number of arguments it can take (it's also called arity)
We can also query the database to get the body of a clause:
7 ?- clause(ten(X),Y). Y = (X is 2*5). 8 ?- clause(ten(X),Y), Y = (X is Z). Y = (X is 2*5), Z = 2*5. 9 ?- clause(ten(X),Y), call(Y). X = 10, Y = (10 is 2*5).
"call" is analogous to Lisp's "eval" function.
[edit] See also
- Concatenative programming language
- Cognitive dimensions of notations, design principles for programming languages' syntax.
[edit] References
- Douglas McIlroy (1960) Macro Instruction Extensions of Compiler Languages
- Calvin Mooers and L. Peter Deutsch (1965) TRAC, A Text-Handling Language
- Alan Kay (1969) The Reactive Engine, PhD thesis (Accessed 20061229)
[edit] External links
- Definition of Homoiconic at the C2 Wiki.
One of the main design goals was that the input script of TRAC (what is typed in by the user) should be identical to the text which guides the internal action of the TRAC processor. In other words, TRAC procedures should be stored in memory as a string of characters exactly as the user typed them at the keyboard. If the TRAC procedures themselves evolve new procedures, these new procedures should also be stated in the same script. The TRAC processor in its action interprets this script as its program. In other words, the TRAC translator program (the processor) effectively converts the computer into a new computer with a new program language -- the TRAC language. At any time, it should be possible to display program or procedural information in the same form as the TRAC processor will act upon it during its execution. It is desirable that the internal character code representation be identical to, or very similar to, the external code representation. In the present TRAC implementation, the internal character representation is based upon ASCII. Because TRAC procedures and text have the same representation inside and outside the processor, the term homoiconic is applicable, from homo meaning the same, and icon meaning representation.
[...]
Following suggestion of McCullough, W. S., based upon terminology due to Peirce, C. S. s McIlroy. M. D., "Macro Instruction Extensions of Compiler Languages," Comm. ACM, p. 214-220; April, 1960.