Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/annenkov/unmix

A simple program specializer (based on partial evaluation) for a subset of Scheme. It includes a binding time analyzer, a residual program generator and an arity raiser.
https://github.com/annenkov/unmix

Last synced: 9 days ago
JSON representation

A simple program specializer (based on partial evaluation) for a subset of Scheme. It includes a binding time analyzer, a residual program generator and an arity raiser.

Awesome Lists containing this project

README

        

THE SPECIALIZER UNMIX
=====================
.. image:: https://api.travis-ci.org/annenkov/unmix.png
:alt: Build status

.. role:: align-center

:align-center:`v4.0 for Racket`

(based on v3.0 https://code.google.com/p/unmix/)

Sergei A.Romanenko

Keldysh Institute of Applied Mathematics

Russian Academy of Sciences

Miusskaya Sq.4, SU-125047, Moscow, Russia

Ported to Racket by Danil Annenkov, Irkutsk, Russia

[email protected]

July, 1990

Revised: December 1992, December 1993, December 2013

COPYRIGHT NOTICE
----------------

Permission to use, copy, modify, and distribute this software and its
documentation for any personal or educational use without fee is hereby
granted, provided that:

* This copyright notice is retained in both source code and
supporting documentation.

* Modified versions of this software cannot be redistributed unless
accompanied by a complete history (date, author, description) of
modifications made; the intension here is to give appropriate
credit to those involved, whilst simultaneously ensuring that any
recipient can determine the origin of the software.

* These same conditions will also be applied to any software system
derived either in full or in part from this system.

The name `Unmix` is not a trademark, registered or otherwise, and you
are free to mention this name in published material, public and private
correspondence, and other documents without restriction or obligation.

`Unmix` is provided `as is` without express or implied warranty.

VERSION 4.0 CHANGES
-------------------

All the functionality of v3.0 is available.

Most significant changes made during porting to Racket:

* used Racket module system ("require" instead of "load");
* added explicit conversion to mutable pairs in some places
(Racket pair is immutable by default and "set-car!" and "set-cdr!"
can't be applied);
* added some macros and utils to handle mutable pairs;
* target programs generated as Racket modules with ".rkt" extension
(and can be ran using Racket);
* Robert Glueck's recursive version (in "resurs" dir) merged to single
"xmainpe.rkt" file (there are three files in v3.0: "xmainpe.scm",
"xpe.scm", "xggg.scm");
* added "xsettings.rkt". There you can set iterative or recursive mode,
target program output format etc;

WHAT IS UNMIX?
--------------

If a program has several input parameters, it can be "specialized"
with respect to the values of some of the parameters, in which case
we get a "residual" program with less input parameters than the original
program.

The parameters whose values are known at the time the program is being
specialized are referred to as "static" parameters, all other parameters
are referred to as "dynamic" ones.

Unmix is a system that specializes programs written in a subset
of Scheme, and consists of two phases: "preprocessor" and "generator of
residual programs". The specialization is done in two steps.

At the first step, preprocessor takes a Scheme program and a descriptor
of the program's input parameters. The program is a non-empty sequence
of function definitions. The first function definition in the program is
assumed to be the "main" function of the program. The descriptor is a
string of letters "s" and "d", the length of the descriptor being equal
to the number of the main function's parameters.

The descriptor of the program's parameters classifies each parameter of
the main function as either static or dynamic. If a parameter is static,
the corresponding letter in the descriptor is "s", otherwise the letter
is "d".

Preprocessor takes as input a program and a descriptor and produces an
"annotated" version of the original program, which contains some
directions for the generator of residual programs.

At the second step, the generator of residual programs takes as input
an annotated program and a sequence of file names (separated by one or
more spaces). The number of file names must be equal to the number of
the program's static input parameters. Each of the files must contain
zero, one, or more Scheme S-expressions to be used as values of the
corresponding static parameters.

THE STRUCTURE OF THE PREPROCESSOR
---------------------------------

The preprocessing consists of several stages.

First, the original program is "desugared", i.e. compiled from Scheme to
Mixwell, which is the internal language of the specializer.

Second, all variables and operations in the program are classified as
either static or dynamic and this information is inserted into the
program. The result is an annotated version of the original program.
Annotated Mixwell programs will be regarded as programs written in
Mixwell-Ann, a version of Mixwell supplemented with additional
constructs to express annotations.

Second, some of the defined function calls in the annotated program are
annotated as "residual" in order to avoid infinite unfolding of function
calls and duplication of function calls that may result from some calls
being unfolded.

THE STRUCTURE OF THE GENERATOR
------------------------------

The generator consists of "partial evaluator", which generates residual
program by symbolically evaluating Mixwell expressions, and
"postprocessor", which performs some additional transformations of the
residual program and then translates the residual Mixwell program to
Scheme.

THE STRUCTURE OF THE POSTPROCESSOR
----------------------------------

The postprocessing comprises the following stages.

1) the first Call Graph Reduction,
2) Arity Raising,
3) the second Call Graph Reduction,
4) Compilation from Mixwell into Scheme.

More information may be found in the source programs of Unmix.

HOW TO RUN UNMIX?
-----------------

This version of Unmix can be run under Racket.

Use
::

racket unmix.scm

to run UNMIX from project root directory.

UNMIX can be ran using DrRacket: load unmix.scm to environment and run it (press F5).

If UNMIX ran from project root, use relative paths to programs to be specialized.
For example::

examples/zip

You can also use unmix.scm from directories with programs to be specialized.
In that case all program files can be accessed without specifying the path, just file name.

When Unmix starts it displays a menu on the screen, which provides further information.

THE INPUT LANGUAGE OF UNMIX
---------------------------

Unmix itself is written in "Scheme with EXtensions" (files with the
extention sex). Before being loaded and executed, the sex-files have to
be compiled to scm-files (with the extension scm), containing programs
in Scheme without extensions.

For bootstrapping reasons, several parts of Unmix have been written in
Scheme directly, so that they are contained in files with the extension
scm and needn't be compiled.

Here is the syntax of the subset of Scheme accepted by the specializer
Unmix::

::= * ;; Program

::= (define ( *) ) ;; Procedure
;; definition

::= ;; Variable
| (quote ) ;; Constant
| ;; Literal constant
| (if ) ;; Conditional
| (let (*) ) ;; Let-expression
| (rcall ( *)) ;; Residual call
| (generalize ) ;; Generalizer
| ( *) ;; Procedure call
| ( *) ;; Macro

::= ( ) ;; Local binding

::= ;; Procedure name
::= ;; Variable name
::= ;; Macro name

::= | |
| |

All procedures called in the program must be without side-effects. For
this reason, the terms "procedure" and "function" will be used in the
description of Unmix interchangeably.

Constructs (generalize ) and (rcall ( *)) are used to
insert into the program hand-made annotations, which permit the user to
control the specializer. They are useless for ordinary programming.

Construct (generalize ) tells the specializer that the result of
specializing must be dynamic, even if is static. When the
program is compiled in the usual way, this construct is equivalent to
.

Construct (rcall ( *)) tells the specializer that the result
of specializing the procedure call ( *) must be a residual
call. When the program is compiled in the usual way, this construct is
equivalent to ( *).

Some useful macro definitions may be found in the file "x-match.sex".
File "x-synt.scm" contains a definition of "extend-syntax", a powerful
tool for defining macro extensions.

THE LANGUAGE MIXWELL
--------------------

Mixwell is the internal language of the specializer Unmix. Here is its
syntax::

::= *

::= ( (*) = )

::= ;; Variable
| (quote ) ;; Constant
| (if ) ;; Conditional
| (call *) ;; Defined function call
| (rcall *) ;; Defined function call
| (xcall *) ;; External function call
| ( *) ;; External function call

::= ;; Procedure name
::= ;; Variable name

The construct (call *) is a call on the procedure
defined in the program, which will be unfolded during partial
evaluation.

The construct (rcall *) is a call on the procedure
defined in the program, which will give rise to a residual call during
partial evaluation.

The construct (xcall *) is a call on the procedure
defined somewhere outside the program. If is different from the
symbols STATIC, IFS, IFD, RCALL, CALL, and XCALL, the keyword XCALL is
omitted and the construct takes the form ( *).

THE LANGUAGE MIXWELL-ANN
------------------------

::

::=
;; Program
::= (*) ;; Residual procedure
;; names
::= ;; Dynamic program
( *)
::= (*) ;; Static program
::=
( = ) ;; Annotated procedure
;; definition
::= (*) ;; Parameter list

::=
;; Variable
| (static ) ;; Static subexpression
| (ifs ) ;; Static conditional
| (ifd ) ;; Dynamic conditional
| (call (*) (*)) ;; Unfoldable defined
;; function call
| (rcall (*) (*)) ;; Residual defined
;; function call
| (xcall *) ;; External function call
| ( *) ;; External function call

EXAMPLES
========

In addition to the files that contain the specializer, there are
several files containing a number of example programs to be
specialized. Here we list the programs with some suggestions about the
way in which they can be specialized.

+---------------------+-------------+----------------------+---------------+
|Program |Source file |Parameter description | Static data |
+=====================+=============+======================+===============+
| Zipper | zip.sex | SD | zip123.dat |
+---------------------+-------------+----------------------+---------------+
| Maximum substring | mcs.sex | SD | mcs123.dat |
+---------------------+-------------+----------------------+---------------+
| MP Interpreter | mp.sex | SD | mprev.dat |
+---------------------+-------------+----------------------+---------------+
| TM Interpreter | tm.sex | SDDD | tmtst.dat |
+---------------------+-------------+----------------------+---------------+
| Parser | prs.sex | SD | prsexp.dat |
+---------------------+-------------+----------------------+---------------+

REFERENCES
----------

[Barzdin 88] G.Barzdin. Mixed Computation and Compiler Basis. In D.Bjorner, A.P.Ershov and N.D.Jones, editors, Partial Evaluation and Mixed Computation, pages 15-26, North-Holland, 1988.

[Beckman 76] L.Beckman, A.Haraldson, O.Oskarsson, E.Sandewall. A Partial
Evaluator, and Its Use as a Programming Tool. Artificial
Intelligence, 7(4):319-357, 1976.

[Bulyonkov 84] M.A.Bulyonkov. Polyvariant Mixed Computation for Analyzer
Programs. Acta Informatica, 21:473-484, 1984.

[Burstall 77] R.M.Burstall and J.Darlington. A Transformation System for
Developing Recursive Programs. Journal of the ACM, 24(1):44-67, 1977.

[Dixon 71] J.Dixon. The Specializer, a Method of Automatically Writing
Computer Programs. Division of Computer Research and Technology,
National Institute of Health, Bethenda, Maryland, 1971.

[Ershov 78] On the Essence of Compilation. In E.J.Neuhold, editor,
Formal Description of Programming Concepts, pages 391-420,
North-Holland, 1978.

[Ershov 81] A.P.Ershov. The Transformational Machine: Theme and
Variations. In J.Grushka and M.Chytil, editors, Mathematical
Foundations of Computer Science, Strbske Pleso, Czechoslovakia, pages
16-32, Lecture Notes in Computer Science, Vol.118, Springer-Verlag,
1981.

[Futamura 71] Partial Evaluation of Computation Process - An Approach to
a Compiler-Compiler. Systems, Computers, Controls, 2(5):45-50, 1971.

[Hughes 88] J.Hughes. Backward Analysis of Functional Programs. In
D.Bjorner, A.P.Ershov and N.D.Jones, editors, Partial Evaluation and
Mixed Computation, pages 187-208, North-Holland, 1988.

[Jones 85] N.D.Jones, P.Sestoft and H.Sondergaard. An Experiment in
Partial Evaluation: The Generation of a Compiler Generator. In
J.-P.Jouannaud, editor, Rewriting Techniques and Applications, Dijon,
France, pages 124-140, Lecture Notes in Computer Science, Vol.202,
Springer-Verlag, 1985.

[Jones 86] N.D.Jones and A.Mycroft. Data Flow Analysis of Applicative
Programs Using Minimal Function Graphs. In Thirteens ACM Symposium on
Principles of Programming Languages, St.Petersburg, Florida, pages
296-306, ACM, 1986.

[Jones 88] Automatic Program Specialization: A Re-Examination from Basic
Principles. In D.Bjorner, A.P.Ershov and N.D.Jones, editors, Partial
Evaluation and Mixed Computation, pages 225-282, North-Holland, 1988.

[Mogensen 88] T.Mogensen. Partially Static Structures in a
Self-Applicable Partial Evaluator. In D.Bjorner, A.P.Ershov and
N.D.Jones, editors, Partial Evaluation and Mixed Computation, pages
325-347, North-Holland, 1988.

[Ostrovski 88] B.N.Ostrowski. Implementation of Controlled Mixed
Computation in System for Automatic Development of Language-Oriented
Parsers. In D.Bjorner, A.P.Ershov and N.D.Jones, editors, Partial
Evaluation and Mixed Computation, pages 385-403, North-Holland, 1988.

[Romanenko 88] S.A.Romanenko. A Compiler Generator Produced by a
Self-Applicable Specializer Can Have a Surprisingly Natural and
Understandable Structure. In D.Bjorner, A.P.Ershov and N.D.Jones,
editors, Partial Evaluation and Mixed Computation, pages 445-463,
North-Holland, 1988.

[Romanenko 90] S.A.Romanenko. Arity Raiser and Its Use in Program
Specialization. In N.Jones, editor, ESOP '90, 3rd European Symposium
on Programming, Copenhagen, Denmark, May 15-18, 1990, pages 341-360,
Lecture Notes in Computer Science, Vol. 432, Springer-Verlag, 1990.

[Sestoft 86] The Structure of a Self-Applicable Partial Evaluator. In
H.Ganzinger and N.D.Jones, editors, Programs as Data Objects,
Copenhagen, Denmark, 1985, pages 236-256, Lecture Notes in Computer
Science, Vol. 217, Springer-Verlag, 1986.

[Schmidt 86] D.A.Schmidt. Denotational Semantics. Allyn and Bacon,
Boston, 1986.

[Sestoft 88] P.Sestoft. Automatic Call Unfolding in a Partial Evaluator.
In D.Bjorner, A.P.Ershov and N.D.Jones, editors, Partial Evaluation
and Mixed Computation, pages 485-506, North-Holland, 1988.

[Turchin 72] V.F.Turchin. Equivalent Transformation of Recursive
Functions Defined in Refal. In Teoriya Yazykov i Metody
Programmirovaniya. Trudy Simposiuma, pages 31-42, Alushta-Kiev, 1972
(in Russian).

[Turchin 79] V.F.Turchin. A Supercompiler System Based on the Language
Refal. SIGPLAN Notices, 14(2):46-54, February 1979.

[Turchin 82] V.F.Turchin, R.M.Nirenberg and D.V.Turchin. Experiments
with a Supercompiler. In 1982 ACM Symposium on Lisp and Functional
Programming, Pittsburgh, Pennsylvania, pages 47-55, ACM, 1982.

[Turchin 86] V.F.Turchin. The Concept of a Supercompiler. ACM
Transactions on Programming Languages and Systems, 8(3):292-325, July
1986.

[Turchin 88] V.F.Turchin. The Algorithm of Generalization in the
Supercompiler. In D.Bjorner, A.P.Ershov and N.D.Jones, editors,
Partial Evaluation and Mixed Computation, pages 531-549,
North-Holland, 1988.

[Wadler 88] P.Wadler. Deforestation: Transforming Programs to Eliminate
Trees. In European Symposium on Programming, Lecture Notes in
Computer Science, Springer-Verlag, 1988.

APPENDIX. SOME MACROS USED IN UNMIX
-----------------------------------

Unmix, as well as the example programs, has been written in Scheme
extended with the following macros.

GENERALIZED CASE-EXPRESSION

::

(MATCH (arg ...)
(pat ... & guard => exp ...) ...)

The expressions "arg ..." are evaluated to produce S-expressions "S-exp
...". "S-exp ..." are then matched against the corresponding patterns
"pat ...". If the matching succeeds for some clause

::

(pat ... & guard => exp ...)

the variables in "pat ..." get bound to the corresponding
subexpressions in "S-exp ...", and then the expression "guard" is
evaluated in the extended environment. If the result of "guard" is not
#f, the expressions "exp ..." are evaluated in the extended
environment, otherwise the next clause is tried. If the guard is #t,
"& guard" may be omitted.

The patterns have the following syntax::

::= ' matches .
| matches .
| matches anything, is bound.
| _ matches anything.
| ( as ) matches , is bound.
| ( . ) matches a pair with 's as elements.

::=
::=
| ()
|
|
|
|
|

GENERALIZED LET-EXPRESSION

::

(WITH ((pat arg) ...) exp ...)

The expressions "arg ..." are evaluated to produce S-expressions "S-exp
...". "S-exp ..." are supposed to match the patterns "pat ...", in which
case the variables in "pat ..." get bound to the corresponding
subexpressions in "S-exp ...", and then the expressions "exp ..." are
evaluated in the extended environment. If some of "S-exp ..." do not
match against patterns "pat ...", the result of the form WITH is
unspecified, because there is no actual analysis of the structure of
"S-exp ...". The syntax of patterns is exactly the same as in the case
of the form MATCH.

The form

::

(WITH* ((pat1 arg1) . (pat arg) ...) exp ...)

is equivalent to

::

(WITH ((pat1 arg1)) (WITH* ((pat arg) ...) exp ...)

RESTRICTED GENERALIZED CASE-EXPRESSION

::

(SELECT (arg ...)
(rpat ... & guard => exp ...) ...)

The expressions "arg ..." are evaluated to produce S-expressions "S-exp
...". "S-exp ..." are then matched against the corresponding restricted
patterns "rpat ...". If the matching succeeds for some clause

::

(rpat ... & guard => exp ...)

the variables in "pat ..." get bound to the corresponding
subexpressions in "S-exp ...", and then the expression "guard" is
evaluated in the extended environment. If the result of "guard" is not
#f, the expressions "exp ..." are evaluated in the extended
environment, otherwise the next clause is tried. If the guard is #t, "&
guard" may be omitted.

The syntax of restricted patterns coincides with that of the ordinary
patterns appearing in the construct MATCH described above, but their
meaning is slightly different.

If a restricted pattern doesn't have the form ( . ),
it has the same meaning as the ordinary pattern .

If an S-expression is not a pair, the result of matching
against a pattern ( . ) is unspecified (i.e.
matching against such a pattern may produce either an error or
unpredictable results).

If an S-expression is a pair ( . ), and a
pattern has the form ( . ()), then matches ,
iff matches . In other words, a restricted pattern of
the form ( . ()) is completely equivalent to the restricted
pattern ( . _).

If an S-expression is a pair ( . ), and a
pattern has the form ( . ), where is not (), then
matches , iff matches and
matches .

The fact that restricted patterns are less careful at examining
S-expressions than the ordinary patterns are, enables them to be
compiled into efficient code.

RCALL

::

(RCALL (fname arg ...))

This construct is used for telling the specializer that the function
call (fname arg ...) is a residual one. In all other respects this
construct is equivalent to (fname arg ...).

GENERALIZE

::

(GENERALIZE exp)

This construct is used for telling the specializer that the result of
specializing (GENERALIZE exp) must be dynamic even if exp is static. In
all other respects this construct is equivalent to exp.