Exotic Programming Ideas, Part 2: Term Rewriting

jswrenn · on Nov 16, 2020

If you're interested in playing around with term rewriting, I highly recommend the Redex framework! https://docs.racket-lang.org/redex/index.html

It's a really neat way to model programming languages and transition systems. Unfortunately, I think the documentation is perhaps more intimidating than the tool (its target audience is programming language theorists; not people who just like noodling around with programming language design).

My favorite Redex hack is that you surface your Redex model as a hashlang (Racket's way of providing replacing/extending the language), and get a fully-featured algebraic stepper in DrRacket! See this repo and PR for details: https://github.com/justinpombrio/RacketSchool/pull/5

boygobbo · on Nov 16, 2020

That's a really useful comment - thanks!

sritchie · on Nov 16, 2020

If anyone wants to play with these ideas in Clojure(script), take a look at the SICMUtils project: https://github.com/littleredcomputer/sicmutils

I picked up co-maintenance on this project a few months ago; it's a Clojure port of the computer algebra system behind Sussmans' "Structure and Interpretation of Classical Mechanics" [0] and "Functional Differential Geometry" [1], and can run all the code from those books. (I'm currently working on integrating it into https://Maria.cloud so all of this will be usable from the browser.)

`sicmutils.generic.simplify` plugs in to a rule-based term rewriting system like Mathematica's, along with polynomial and rational function simplification.

The core idea of the library is that if you build everything on top of extensible generic arithmetic operators (multimethods in Clojure), then a lot of good stuff shows up for free:

- If I implement all the operations for symbols, I get a symbolic arithmetic system for free. - Extend the generics to Dual numbers[2]? Boom, forward-mode AD pops out!

I'm getting carried away. Check it out, get your term-rewriting fix.

[0] https://mitpress.mit.edu/books/structure-and-interpretation-... [1] https://mitpress.mit.edu/books/functional-differential-geome... [2] https://en.wikipedia.org/wiki/Dual_number

agambrahma · on Nov 16, 2020

This is really cool.

Remember coming across SICM a long time ago, feel it gets under-appreciated because it's perhaps too niche (not enough of the "SICP crowd" cares about the physics, people who care about the physics don't come across it).

A Clojure version will ensure it lives on longer (the one downside of these old books is how closely they're tied to MIT-Scheme)

georgewsinger · on Nov 16, 2020

> It is true that this system can be used to program arbitrary logic, it can however become a bit clunky when trying to code imperative logic that works with inplace references and mutable data structures. Nevertheless it is a very powerful domain tool for working in scientific computing and the number of functions related to technical domains that are surfaced in the language represents decades of meticulous expert domain expertise.

Can you explain this more? Why is (i.e.) the Wolfram Programming Language best thought of as a "domain" tool, and not a general purpose programming language? Why couldn't someone build an extraordinarily complex piece of software (say with millions of lines of code) under term-rewriting paradigm (like Wolfram) in a domain that has nothing to do with traditional mathematics/scientific computing?

BTW, my favorite Wolfram fact is that it is the most concise programming language on the Rosetta: https://blog.revolutionanalytics.com/2012/11/which-programmi...

6gvONxR4sf7o · on Nov 16, 2020

You could. But as one of the Mathematica devout, it works be a massive pain in the as compared to other languages better suited for it. Assuming languages are equivalently expressive (in the Turing complete sense), the difference is in what they make easy.

It’s easier to make a small web API in python than assembly, even though they can do the same things. Mathematica is designed to make (symbolic) mathematics easiest. It excelled there at the expense of weirdness elsewhere.

The answer unfortunately reduces to “I dunno, try it and see.”

kmill · on Nov 16, 2020

It's "fine" for imperative programming if you never make a mistake: the problem is that it is extremely permissive and won't stop execution on errors. I usually add my own error checking and default cases to cause programs to halt, and these help a lot, but there are still a number of footguns.

One example is what happens if you map a function f over a list lst: f /@ list. Oh, the variable was called lst and not list? Mathematica will happily evaluate this and result in list, assuming list has not been defined. This is because what /@ does is apply f to each part of the expression, and a Symbol has no parts. If Mathematica had a way to declare which global symbols can be used in local expressions, it might help prevent this class of bugs.

chubot · on Nov 16, 2020

Yeah I'm finding myself explaining the difference between shell and Python a lot:

https://news.ycombinator.com/item?id=24873284

Shell is good for creating Unix systems because of the tools it provides. ... It's hard to explain, but if you work in that area, you'll very quickly see it.

I think many people have learned one language, or 3 languages that are similar, and they think all languages do roughly the same thing.

That's incorrect on all counts: in terms of the core data model, the operators, and the libraries.

If you do a lot of work in a specific domain, you'll start to see all the differences. Shell is the best tool for bringing up machines, which happens a lot in both the cloud/web and embedded spaces.

----

Ditto for R. R makes a lot of tasks really easy, and a lot of other ones really hard (like being 10x slower than Python, or 1000x slower than C, for many tasks). That sounds horrifying, but it really helps you get a lot of work done.

Another thing people have trouble appreciating is the difference between R and Matlab, and Matlab and Mathematica. Those 3 languages are drastically different because of both their core data model and who uses them.

BoiledCabbage · on Nov 17, 2020

The assembly analogy is a very faulty analogy. The reason assembly is harder is because it's a lower level language and requires significantly more code to accomplish the same. Most term re-writing languages are actually higher level than python. And so in theory would take fewer lines.

The real difference is noone has created a set of libraries for the web api domain space in a term re-writing language. If they did, it'd be easy.

And similar to others in this thread, I think term re-writing languages will be prominent in programming future. The same way many years back people could see functional programming over the horizon in programming future.

dTal · on Nov 16, 2020

I would beware of concluding too much about Wolfram Language's expressiveness from that Rosetta fact. The language is notorious for including the kitchen sink in its standard library, to the extent that many common Rosetta tasks will simply be library calls. Which is all wonderful and practical, but doesn't imply very much about the language design.

tikej · on Nov 17, 2020

Term rewriting is an extremely interesting topic that’s more niche than it should be.

I plan to start writing blog posts about undergrad/grad level mathematics and physics where all derivations are done via computer algebra system (can’t quite make up mind whether to use SymPy, Maxima, Reduce, FriCAS/Axiom or SymbolicUtils.jl). Just need to find some interesting topics to start.

Also more broadly I believe that mathematical derivations in scientific papers should be done computer algebra systems and term rewriting. They would be easier to check, reproduce and leave no ambiguities. It’s probably still many years from now but I hope to see that someday.

creatio · on Nov 17, 2020

I'm going to leave this here: http://www.cs.ru.nl/~marko/research/proefschrift/

- Parallel graph rewriting - some contributions to its theory, its implementation and its application

The Clean programming language is build upon this, if I'm not mistaken.

codekilla · on Nov 16, 2020

Maude is a really cool Language based on rewriting logic.

http://maude.lcc.uma.es/maude31-manual-html/maude-manual.htm...

siraben · on Nov 16, 2020

In a similar vein, see[0] which talks about rewrite systems from a more formal viewpoint. On the practice side, GHC performs rewrite rules[1], an example of which is

  "map/map"    forall f g xs.  map f (map g xs) = map (f.g) xs

Combining two calls to map into one. While this seems silly, terms like this can arise as a result of other compiler optimizations. GHC can even remove intermediate data structures[2] entirely, so writing factorial as

  factorial n = product [1..n]

would result in an efficient first-order program without the list at all. Another example of where rewrite rules come in handy is linting[3]. It does seem that a lot of these rewrites are possible only in a pure language, otherwise the compiler would have to resort to analyzing side-effects.

[0] www.cs.tau.ac.il/~nachum/papers/taste-fixed.pdf

[1] https://downloads.haskell.org/~ghc/7.0.3/docs/html/users_gui...

[2] https://users.cs.northwestern.edu/~robby/courses/395-495-201...

[3] https://github.com/ndmitchell/hlint

bjoli · on Nov 18, 2020

I played with these kinds of reductions in a subset of scheme, and the flow analysis you need to do to be able to do it safely is nontrivial at best depending on how you limit the semantics of your language.

It was a lot easier to do with something like srfi-171 [0] due to their semantics. Fusing those steps is as simple as inlining.

[0] transducers, like in clojure. https://srfi.schemers.org/srfi-171/srfi-171.html . I am the srfi author.

siraben · on Nov 18, 2020

That interesting. Reminds me of Haskell's Foldable[0] and Functor typeclasses, which abstracts over containers as well.

[0] https://hackage.haskell.org/package/base-4.14.0.0/docs/Data-...

tgb · on Nov 16, 2020

What's the magic going on in substitutions like:

    Sin[a+b] /. Sin[x_] -> Cos[x]

which yields 'Cos[a+b]'. Is the underscore after the x what indicates that "x_" can take any value? And Mathematica then strips the undescore and replaces 'x' with that value? Why not just use x_ on both sides? And earlier the author was using '_x' instead of 'x_'. It seems rather mysterious to me.

kmill · on Nov 16, 2020

The left-hand side of the arrow is a pattern expression [1] (though usually you'd use the :> arrow to delay evaluation of the right-hand side). The "full" notation is x_H for a pattern variable x that needs to have head H. Dropping the H, it matches anything. Dropping the x, it just doesn't set a variable when matching. Dropping both, it matches anything and doesn't set a variable.

[1] https://reference.wolfram.com/language/guide/Patterns.html

tgb · on Nov 16, 2020

Okay thanks, that all makes more sense now.

6gvONxR4sf7o · on Nov 16, 2020

It matches the expression with x_ as a named wildcard (it has lots of stuff you can do to constrain the matching, like “only match if x is prime”). Then it replaces the whole expression having x bound to whatever x_ matched.

colehasson · on Nov 16, 2020

the underscore makes it a symbolic variable, not one with value

unbound in the lambda sense

tgb · on Nov 16, 2020

So any underscore at all makes it a symbolic variable? (Or trailing/leading underscores only?) And the name of the variable on the right-hand side of the substitution is that name with the (or all?) underscore removed? It seems very clunky, so I think I'm missing something.

Someone · on Nov 16, 2020

Mathematica doesn’t allow underscores in names. I also think the use of _x is an error in this article (disclaimer: I haven’t used mathematics in decades)

⇒ underscores must come at the end.

kmill · on Nov 16, 2020

A pattern like _List matches lists, but the match isn't associated to a variable.

Someone · on Nov 17, 2020

Of course. Thanks! It’s name_type, where both name and type are optional.

tgb · on Nov 16, 2020

So is the right interpretation to think of "_" not as part of the variable name at all but rather as a unary operator (that acts on its left) which denotes a variable as unbound?

jazzyjackson · on Nov 16, 2020

It's not really an operator, it's part of a pattern-matching syntax. "_" is called Blank and matches anything. Putting a name in front of it just makes it available by name -- like referring to capture groups in a regex.

So when you call a function on some argument or list of arguments, the interpreter looks up all the function definitions and finds one with a pattern that matches.

  Double[x_] := 2 * x

Will work for Double[3] but not for Double[3, 2], the latter doesn't match any patterns, and the interpreter doesn't know what to do with it. You would have to define another function with the same name to handle other patterns, like overloading in C++

You can further constrain the pattern to types like this:

  Double[x_Integer] := 2 * x

And now the pattern only matches integers. More in the manual, "Patterns"[0] and "More about Patterns"[1]

[0] https://www.wolfram.com/language/elementary-introduction/2nd...

[1] https://www.wolfram.com/language/elementary-introduction/2nd...

6gvONxR4sf7o · on Nov 16, 2020

At this point you seem too understand the mechanism well enough that the docs will help you with the details best.

comonoid · on Nov 16, 2020

Turchin's REFAL, anyone?

https://en.wikipedia.org/wiki/Refal

hyperpallium2 · on Nov 16, 2020

I want to write a term rewriter for algebraic equations (just commute, assoc, distr, etc), but I wonder about the UI: the best way for a user to specify which parts to transform?

Here's the obvious ways I've thought of - have I missed some?

- Derivations are usually written as before and after, and sometimes the name of the transform, and the reader must work out the arguments. Could have the program check the transform... but I'd like to save the user all that rewriting...

- You could specify the arguments with a function at that point (using location to address them). Seems just as bad.

- An awkward path through the parse tree to specify each argument is also possible.

- In a GUI, arguments could be selected by clicking.

- The user could name the transform, and the program automatically find where it is applicable, and apply it - or give a list of applications, and the user selects which one. But this is a step towards an automatic theorem prover, and is more automation than I want the user to have.

Are there other ways?

bcheung · on Nov 16, 2020

I was thinking something close to Haskell pattern matching with some variation.

Ex:

AdditionRule (x :: NumericExpr) + (y :: NumericExpr) = eval(x) + eval(y)

It's been a while but I also remember seeing a concise DSL for tree rewriting in the ATLR compiler. It also had escape hatches for things like context and precedence calculations.

In terms of GUI and clicking, I think it might need to be more sophisticated than that.

What does clicking mean when you are a clicking on a sum type? Is it just that variant of the ADT, the entire type?

What about for complex expressions? Are you selecting just a term in that complex expression or the entire expression?

I suppose if you were looking at a tree visually then you could select a lot easier.

Seeing how other languages do "destructuring" or "pattern matching" might provide other possible alternatives.

hyperpallium2 · on Nov 17, 2020

That's sophisticated! I'm thinking highschool algebra, like

  a+b     = b+a
  (a+b)+c = a+(b+c)
  a.c+b.c = (a+b).c

Defined by that literal text. Applying them is the step I'm wondering about...

One difficulty comes above the target, when it's not the root. e.g. wanting to commute the target a+b in

  x.(y + z.(a+b))

The other difficuly lies below the target, where each letter in the above rules can represent an arbitary expression - so it's awkward to tell which expression is meant by clicking on it (as you note).

So I'm thinking of clicking on an operator, to uniquely select its two operands - that would work for commute. Other gestures, like long-click, double-click, drag-n-drop etc may be enough. Pop-up menu for the worst case!

I think, the Haskell example defines a rule, rather than applying an existing one? I think, too, a DSL for transforms in ANTLR also would be for defining them.

Similarly, I think destructuring is a kind of definition. The 3 rules defined above are pretty much destructuring and constructuring (heh). The difference is in usage or application: in a program, you need to supply the target as the root, but in highschool algebra, it can be at any depth in the expression.

jdmichal · on Nov 18, 2020

I think you'd be best by defining two classes of rules. The first would be simplifications, and these you want to apply as much as possible. The second would be definitions, which can be used to help make simplifications possible, but otherwise should not be applied just because.

hyperpallium2 · on Nov 18, 2020

Not sure what you mean.

Sometimes you need to "complexify" to put an expression into a form in which you can do something, then simplify it back. You can even generate new terms out of nothing (because b = b + 0 and 0 = a + (-a)) to obtain a specific form. The simplist example I can think of is completing the square:

To get minimum of x^2 + 2x + 2:

  x^2 + 2x         + 2  =
  x^2 + 2x + 1 - 1 + 2  =
     (x+1)^2      + 1
  because (x+1)^2 >= 0, minimum is 1

So it's important that equivalences like ax+bx = (a+b)x can be applied in either direction - not just in the left-to-right, simplifying direction of factoring, but also the right-to-left direction of expansion.

BTW If always "simplifying", the process is contained, and the possibilities are limited. But because of expansion and generation, it is unconstrained, and can be anything.

I think of "definitions" as defining the meanings of new syntax - in effect, they are lemmas. Can you elaborate on what you mean by them?

BTW the option to automatically apply simplifications would be useful in an assistent tool. But what I'm aiming at is a facilitation of each tiny tedious step - so it's not quite as tedious, but you aren't skipping anything.

jdmichal · on Nov 19, 2020

I thought I understood your problem statement as something along the lines of "How do I stop the rule (a+b)->(b+a) from firing just because they can". So my proposal was to only attempt to apply certain rules if they helped reach the goal, where the goal was simplification. They would not fire in and of themselves, but always in a chain that ended in some desired simplification rule also firing.

hyperpallium2 · on Nov 21, 2020

Thanks for explaining. I want the user in manual control, and the problem is how the user specifies what is to be done.

I did entertain some automation as one way to achieve this: to present a set of different things that could be done - and the user could select from that set, instead of specifying the details explicitly. It's not my preferred approach, because it suggest/hints more to the user than I want to ... but it is a solution to the specification problem.

I think I can see your suggestion as a refinement of that approach, of not blindly listing every possible next step, but just the most useful ones. And further, selecting from a set of different series of rule applications. One could also rank them.

Your version of the problem is appropriate for a proof assistent - but I'm aiming at something quite a bit more rudimentary!

----

BTW my motivation for this project is partly to get clear on some formal aspects of proof - the mathematics I've studied so far has only formalized some domains, but not proof itself.

I can also feel it as a fun game, to be able to transform expressions... though whether that turns out to be the case remains to be seen!

jswrenn · on Nov 16, 2020

The `traces` procedure from Redex displays a navigable graph of all applicable transformations and their result.

https://docs.racket-lang.org/redex/The_Redex_Reference.html#...

Syzygies · on Nov 16, 2020

Two term rewriting algorithms of central importance in their domains: Gröbner bases for systems of polynomials in many variables, and the Schreier–Sims algorithm for computing with permutations.

thequux · on Nov 16, 2020

When I was at UCLA, I took a class on compilers where we could write our compiler in any language we wanted. I had recently discovered Stratego/XT in some random wikiwandering, and decided to use it for my compiler. In the end, the typechecker and register allocators were kind of messy, but the rest got written using `aVerySmallSedScript <slides.txt >pass3.str`

Sadly, the developers of the system got their PhD and moved on, and the system has long-since bitrotted :-(

icsa · on Nov 17, 2020

I was a big fan of Stratego/XT as well. Expressing optimizations was very straightforward.

sriram_malhar · on Nov 16, 2020

I like Maude a lot. I recommend Peter Olveczky's book

Designing Reliable Distributed Systems: A Formal Methods Approach Based on Executable Modeling in Maude

dtornabene · on Nov 17, 2020

A good book on Term Re-Writing is "Term Rewriting and All That". It uses Standard ML for implementation of the algorithms, just an excellent text overall. Also, there's a component of BAP, the Binary Analysis Platform, that uses term rewriting situated in lisp, for analysis of program state

kmill · on Nov 16, 2020

Sometimes the way I explain Mathematica to people is that, where Perl is the Perl of string rewriting, Mathematica is the Perl of tree rewriting.

Some comments about the article:

Mathematica has two types of rules, Rule and RuleDelayed, given by a -> b and a :> b. You usually use RuleDelayed unless you want Mathematica to pre-evaluate b. For example, if you create a rule like x_ -> (Print[x]; 1 + x) it will immediately print the symbol x and then result in the rule x_ -> 1 + x, but x_ :> (Print[x]; x + 1) instead evaluates to itself without that side effect.

It might be worth knowing that $Assumptions has a dynamic scope construct (like fluid-let in scheme), where Assuming[a > 0, ...] will temporarily append a > 0 to $Assumptions.

The code for Diff is not correct, since it has patterns on the right-hand side of the rules (and also it's Power, not Pow). For idiomatic Mathematica, you would attach these rules to the Diff symbol itself as "DownValues" so that Mathematica will automatically apply them:

  Diff[f_ g_, x_] := f Diff[g, x] + g Diff[f, x]; 
  Diff[f_ + g_, x_] := Diff[f, x] + Diff[g, x]; 
  Diff[Power[x_, n_Integer], x_] := n*Power[x, n - 1]; 
  Diff[n_?NumericQ, x_] := 0;

These := operators append a rule to DownValues[Diff], where the rule is exactly from replacing := with :>.

  In[14]:= DownValues[Diff]

  Out[14]= {HoldPattern[Diff[f_ g_, x_]] :> f Diff[g, x] + g Diff[f, x],
            HoldPattern[Diff[f_ + g_, x_]] :> Diff[f, x] + Diff[g, x], 
            HoldPattern[Diff[x_^n_Integer, x_]] :> n x^(n - 1), 
            HoldPattern[Diff[n_?NumericQ, x_]] :> 0}

The HoldPatterns in here are to prevent Mathematica from evaluating the patterns, since Diff has all these rules that really want to activate for those left-hand sides. (Yes, patterns, too, are subject to evaluation. It can be helpful in very limited circumstances, like metaprogramming the rules.)

It looks like it's still missing some rules, though:

  In[42]:= Diff[x^2 + 2 x + 1, x]

  Out[42]= 2 x + 2 Diff[x, x]

One fix is to change the power rule to

  Diff[Power[x_, n_Integer: 1], x_] := n*Power[x, n - 1]

which gives n the default value of 1, plugging into a system where Mathematica knows x^1 and x are the same.

---

Something I use Mathematica for is calculations in things like the Temperley-Lieb algebra (for calculating the Jones polynomial of knots). The Temperley-Lieb algebra is pretty much a calculus of path composition, but you also take formal linear combinations of these, and closed loops evaluate to some polynomial. In the following, P[1, 2] would denote a path between points 1 and 2.

  SetAttributes[P, Orderless];
  P /: P[a_, b_] P[b_, c_] := P[a, c];
  P /: P[a_, b_]^2 := P[a, a];
  P /: P[a_, a_] := -A^2 - A^-2;

For example, P[1, 2] P[2, 3] P[3, 1] evaluates to -A^2 - A^-2. (I have a library for these things where the "boxes" it renders to in the Mathematica notebook is a graphical representation of the paths. The graphical paths are the expressions, which has been very helpful to me.)

7thaccount · on Nov 16, 2020

Sometimes I think people don't fully understand how cool the term rewriting is in Mathematica where it continuously expands on anything from text to functions to images and audio...it's pretty neat.

muizelaar · on Nov 16, 2020

If you want to follow along you can use without using Mathematica you can use the Mathematica clone https://github.com/mathics/Mathics

birktj · on Nov 16, 2020

I have always found term rewriting a fascinating idea as a programing paradigm and the Wolfram language seems very cool. It is a shame that it is a proprietary language.

An even more insane language based on term rewriting is the Egison programming language [1]. Egison even supports pattern matching on infinite data structures, it is pretty cool.

[1]: https://www.egison.org/

coliveira · on Nov 17, 2020

Mathematica is a language. The implementation is proprietary, but anyone could create a new implementation, similarly to what Octave did to MATLAB.

birktj · on Nov 17, 2020

That is a good point and there even is a sort of open source implementation of the Wolfram language, Mathics [1]. Although I would assume its value is kind of limited without the standard library included with Mathematica.

[1]: https://mathics.org/

bcheung · on Nov 16, 2020

I've recently become very interested in this space and am wondering what resources are out there for people wanting to learn more and implement their own languages/environments that use term rewriting?

Most of the literature I've seen seems to focus more on the typed lambda calculus aspect of things and less on the term rewriting.

I'm planning on going through:

- The Architecture of Symbolic Computers

- Abstract Computing Machines

- Mathematica documentation / internals

emmanueloga_ · on Nov 16, 2020

I think rewrite is related to program transformation [1]? Most stuff that I've seen looks pretty academic, obscure, not very well maintained, etc ...

Pure [2] is the only practical language I've stumbled upon that appears to be general purpose _and_ based on term rewrite.

I wonder if XSLT could fall under this umbrella? It rewrites trees into other trees ... not sure if that's enough to be considered a term rewrite system. If yes, then maybe CDuce is also an interesting thing to check.

The author of Ragel also has a program transformation system [4], but I'm not sure what state it is at, he seems to have been working on it for a long while but it doesn't seem to be something nearly as popular as Ragel.

Edit: I think specter [5] and meander [6] also could be, perhaps tangentially, related to rewriting. Meander expressly mentions rewrite as inspiration/fondation. Curiously these two look a lil bit like xpath/xslt but with EDN as data model instead of XML infoset.

--

1: https://en.wikipedia.org/wiki/List_of_program_transformation...

2: https://github.com/agraef/pure-lang

3: https://www.cduce.org/#whatis

4: http://www.colm.net/open-source/colm/

5: https://github.com/redplanetlabs/specter

6: https://github.com/noprompt/meander

zem · on Nov 17, 2020

i was hoping the article would focus on pure[0] rather than mathematica - term rewriting is very interesting, but symbolic mathematics seems to be the default use of it, and i never see anyone talking about other applications.

[0] https://agraef.github.io/pure-lang/

birktj · on Nov 17, 2020

I have looked a little bit at Pure before, it seems pretty neat. Do you know if there is anything interesting written about it?

zem · on Nov 17, 2020

very little i could find, sadly. chris double has a few nice blog posts from when he was exploring the language: https://bluishcoder.co.nz/tags/pure/index.html

abqexpert · on Nov 17, 2020

To a large degree you could imagine most of prolog programming as a set of term rewriting steps. As well as most lisp style macro programming languages. I thought the focus on Wolfram was a bit myopic. Even earlier meta and Tree-meta languages used this as well.

The_rationalist · on Nov 17, 2020

If term rewriting interest you : https://github.com/opencog/ure