User Class Hierarchies with RubyWrite

October 2, 2009

I see problems with each of the two approaches that we have tried with RubyWrite. If we create a subclass on the fly (as we were doing earlier), we have no handle on the class name and cannot create multiple instances, which seems desirable in some cases. If we rename methods within the same class (as we are doing currently), we face the hack-ish problem of ensuring that we neither redefine methods nor step on user-defined methods, and we make a redundant check each time a new instance is created. Both leave something to be desired.

How about using a specially defined method to create a new class?

klass = RubyWrite::define do
  def main
    ...
  end
  ...
end
xformer = klass.new

Alternatively,

RubyWrite::define :Klass do
  def main
    ...
  end
  ...
end
xformer = Klass.new

We can now create an internal class and subclass it. Either the subclass is returned (first option) or given the name Klass (second option).
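A minimal sketch of how such a define method could work, assuming it simply builds the subclass with Class.new. The module name RubyWriteSketch and its empty ReWriter class are stand-ins invented for this example, not the actual RubyWrite code.

```ruby
# Hypothetical stand-in for RubyWrite; only the shape of `define` matters.
module RubyWriteSketch
  # Stand-in for the internal base class that gets subclassed.
  class ReWriter
  end

  # Without a name, return an anonymous subclass (first option).
  # With a name, also bind the subclass to a top-level constant
  # (second option).
  def self.define(name = nil, &body)
    klass = Class.new(ReWriter, &body)  # the block is evaluated as the class body
    Object.const_set(name, klass) if name
    klass
  end
end

klass = RubyWriteSketch.define do
  def main
    :anonymous
  end
end

RubyWriteSketch.define :Klass do
  def main
    :named
  end
end

first  = klass.new   # the handle allows any number of instances
second = klass.new
named  = Klass.new
```

Because the caller holds either the returned class or a constant, creating several instances is no longer a problem.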

However, all of the above approaches still have one problem: due to the requirement that the user must create a new class by sub-classing RubyWrite::ReWriter, creating a hierarchy of user classes is not easy. If I wanted to modularize the design of my compiler I couldn’t create a “base” transformer and then subclass it to create other transformers. This might be important for implementing related compiler passes, for example, different data-flow analyses.

Implementing RubyWrite as a module is the obvious solution (already, much of it is in modules).

One possible approach with modules is to require users to use a factory method to create new objects, since we cannot redefine new. The factory method will use singleton methods to wrap user-defined methods. The class hierarchy won’t matter, although we WILL need a way to identify user-defined methods in the hierarchy. Even with the instantiation overhead, this might be a workable solution, because I can’t imagine a scenario where lots of instances of a rewriter would be needed. If a user instance maintains state, the user is aware of that and can work around it if absolutely necessary.
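The factory idea might look roughly like the following sketch. Everything here is an assumption made for illustration: the module name FactorySketch, the factory name create, the trace helper, and in particular the rule that "user-defined" means any public method defined in a class below Object in the hierarchy.

```ruby
# Hypothetical sketch; none of these names come from RubyWrite.
module FactorySketch
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Factory method used in place of `new`.
    def create(*args)
      obj = new(*args)
      # Assumption: user-defined methods are the public methods defined
      # directly in the classes between the receiver and Object.
      user_classes = ancestors.take_while { |a| a != Object }.grep(Class)
      names = user_classes.flat_map { |c| c.public_instance_methods(false) }.uniq
      names.each do |name|
        original = obj.method(name)
        # The singleton method shadows the class-level method on this
        # instance only, so the class itself is never modified.
        obj.define_singleton_method(name) do |*a, &blk|
          trace << name
          original.call(*a, &blk)
        end
      end
      obj
    end
  end

  # Records which wrapped methods ran; a placeholder for whatever
  # bookkeeping the real wrappers would do.
  def trace
    @trace ||= []
  end
end

class BasePass
  include FactorySketch

  def visit(node)
    node
  end
end

class ConstFold < BasePass
  def fold(node)
    visit(node)
  end
end

pass = ConstFold.create
pass.fold(1)
p pass.trace
```

Since the wrapping happens per instance, a user hierarchy such as BasePass and ConstFold works without any special handling; the cost is paid once for every call to create.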

Another possibility is to sidestep the whole issue of monkey-patching by requiring that transforming “methods” be defined using special syntax. E.g.,

class MyTransformer
  rewriter :expRewriter do |node|
    ...
  end

  rewriter :stmtRewriter do |node|
    ...
    n = rewrite someNode, :expRewriter
  end
end

This has the downside of regular methods being different from rewriters (may not be so bad) and the somewhat awkward syntax for invoking rewriters (may be fixable with some creativity).

A hybrid approach is also possible, wherein any “regular” method may serve as a rewriter, but must be invoked with a special syntax when used as one. I see a couple of advantages of this hybrid approach. Users can freely mix rewrite methods with other helper methods, but will always be aware when a method is acting as a rewriter because of the special syntax. Also, it interferes minimally with Ruby class hierarchies and semantics, leaving user classes and objects free to, for example, define their own singleton classes if they so desire.
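Both the special-syntax and the hybrid variants can be sketched together. This is only an illustration under stated assumptions: the module name RewriterSyntaxSketch, the find_rewriter helper, and the choice to name rewriters with Symbols are all invented here. Nodes are plain integers to keep the example short.

```ruby
# Hypothetical sketch; not the actual RubyWrite implementation.
module RewriterSyntaxSketch
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def rewriters
      @rewriters ||= {}
    end

    # Registers a block under a name instead of defining a method,
    # so no user method is ever renamed or redefined.
    def rewriter(name, &body)
      rewriters[name] = body
    end

    # Searches the class hierarchy, so subclasses inherit rewriters.
    def find_rewriter(name)
      owner = ancestors.find { |a| a.respond_to?(:rewriters) && a.rewriters.key?(name) }
      owner && owner.rewriters[name]
    end
  end

  # The explicit invocation syntax. A registered rewriter wins;
  # otherwise fall back to a regular method (the hybrid approach).
  def rewrite(node, name)
    body = self.class.find_rewriter(name)
    body ? instance_exec(node, &body) : send(name, node)
  end
end

class MyTransformer
  include RewriterSyntaxSketch

  rewriter :expRewriter do |node|
    node * 2
  end

  rewriter :stmtRewriter do |node|
    rewrite(node, :expRewriter) + 1
  end

  # A regular helper method, usable as a rewriter via `rewrite`.
  def negate(node)
    -node
  end
end

class DerivedTransformer < MyTransformer
end

t = MyTransformer.new
p t.rewrite(5, :stmtRewriter)
p t.rewrite(5, :negate)
```

The rewrite call is the single point where the library could add its own bookkeeping, which is why neither variant needs to touch the user's method table.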


Rethinking RubyWrite

June 26, 2009

The decision to eliminate “current term” (called context in RubyWrite) and to make RubyWrite re-entrant has a few consequences.

  1. We can no longer write match <pattern>. Since there is no implicit “current term” a node to match must be supplied along with the pattern.
  2. Since the Node class knows nothing about the bindings in the Transformer class, an alternative (and arguably cleaner) syntax, x.match, does not seem possible, at least not at first sight. This syntax was implemented in the earlier incarnation of RubyWrite by looking up the bindings in the current instance of Transformer, but that is precluded by our requirement to be re-entrant.
  3. Traversal methods are passed a block that is called on each node of the tree being traversed. This used to work by setting the current term to the node being traversed before calling the block. Now, the block must accept an argument.
  4. Traversals differ in the way the nodes are visited and the visiting pattern for a node’s children may itself depend on the outcome of the block’s execution. A “failure” is indicated by the block returning nil or false. (This used to be indicated by raising a Fail exception.) In Ruby, a block executes within the scope of the method where it is defined. RubyWrite follows the same scoping principle for Symbol-to-tree bindings. The problem occurs when a block fails after modifying this environment. It is highly desirable to not propagate the partial changes back into the environment. For this, the traversal method must have access to the environment so that it can be rolled back, if needed. This means that, just like match, we can no longer write prog.topdown {...} where prog is a Node object, because topdown must be a method belonging to the Transformer class, not to the Node class.
  5. The case statement needs === to be defined. This can no longer be done for Symbol objects (that bind to portions of trees), since there is no “global” environment to read the bindings from. Instead, the user must first look up the Symbol and use the resulting tree in the case statement.
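The rollback requirement in item 4 can be sketched as follows. The class name RollbackSketch, its Node struct, and the policy of skipping the children of a failed node are assumptions made for this example; the real traversals may choose differently.

```ruby
# Hypothetical sketch of a traversal that owns the environment.
class RollbackSketch
  Node = Struct.new(:label, :children)

  attr_reader :env

  def initialize
    @env = {}
  end

  # Top-down traversal. The block receives the node as an argument
  # (there is no implicit current term). A nil or false result counts
  # as failure: bindings made by the block are discarded and, in this
  # sketch, the node's children are not visited.
  def topdown(node, &block)
    saved = @env.dup   # shallow copy; enough if bindings are only added or replaced
    unless block.call(node)
      @env = saved     # roll back the partial changes
      return nil
    end
    node.children.each { |child| topdown(child, &block) }
    node
  end
end

t = RollbackSketch.new
tree = RollbackSketch::Node.new(:add, [
  RollbackSketch::Node.new(:lit, []),
  RollbackSketch::Node.new(:bad, [])
])

t.topdown(tree) do |node|
  t.env[node.label] = node  # modify the environment first...
  node.label != :bad        # ...then possibly fail
end

p t.env.keys
```

The binding made for the failing node does not survive, which is only possible because topdown lives on the object that holds the environment.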

All of these syntactic issues make RubyWrite more of a traditional library than an embedded DSL. Perhaps it is time to rethink, one more time! Here are a few things to get the rethinking ball rolling:

  1. Ruby 1.9 supports passing blocks to blocks, thus increasing the syntactic flexibility.
  2. The new lambda in 1.9 is a much cleaner closure. It also comes with an alternative compact syntax (the word lambda can be replaced by “->”).
  3. Instead of passing blocks around we could pass arbitrary objects with a simple syntactic trick: prefix the block with an appropriately named method. For example:

      topdown! node, fn{...}

    fn could then wrap the block passed to it in an appropriate object that carries a reference to the Transformer, thus providing access to all of the Transformer’s state while still keeping the code re-entrant. This could help get the compact syntax back.
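A rough sketch of the fn trick, assuming fn returns a small wrapper object. The names FnSketch, Fn, Transformer, bindings, and Collector are invented for this example. It places topdown! on Node to show that a traversal outside the Transformer can still reach the bindings through the wrapper.

```ruby
# Hypothetical sketch; not the actual RubyWrite classes.
module FnSketch
  # A block bundled with the transformer that created it.
  class Fn
    attr_reader :transformer

    def initialize(transformer, block)
      @transformer = transformer
      @block = block
    end

    def call(node)
      @block.call(node)
    end
  end

  class Node
    attr_reader :label, :children

    def initialize(label, *children)
      @label = label
      @children = children
    end

    # The traversal lives on Node, yet it can save and restore the
    # bindings because the Fn object carries the transformer along.
    def topdown!(f)
      bindings = f.transformer.bindings
      saved = bindings.dup
      unless f.call(self)
        bindings.replace(saved)
        return nil
      end
      children.each { |child| child.topdown!(f) }
      self
    end
  end

  class Transformer
    attr_reader :bindings

    def initialize
      @bindings = {}
    end

    # Braces bind the block to fn, so `x.topdown! fn { ... }` passes
    # an ordinary object rather than a block.
    def fn(&block)
      Fn.new(self, block)
    end
  end
end

class Collector < FnSketch::Transformer
  def run(prog)
    prog.topdown! fn { |node|
      bindings[node.label] = node
      node.label != :skip
    }
  end
end

tree = FnSketch::Node.new(:seq, FnSketch::Node.new(:keep), FnSketch::Node.new(:skip))
collector = Collector.new
collector.run(tree)
p collector.bindings.keys
```

No state is global here: two Collector instances traversing the same tree keep separate bindings, which is the re-entrancy the notes ask for.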


Use of Symbol

June 17, 2009

The use of Symbol is overloaded in RubyWrite.

  • It is used to name an internal node in a Node object.
  • It is used as a place holder to match against subtrees in a Node object passed to match.
  • It is used as a variable that points to an already matched subtree within a Node object passed to build.
  • It is used to force unification in a Node object passed to match.

There seems to be no satisfying way to get around it. Ideally, I would like to use Ruby’s local variables to hold subtrees matched by match.
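The four roles can be seen side by side in a toy matcher. This is a sketch under stated assumptions, not RubyWrite's implementation: trees are plain arrays of the form [label, child, ...], and the module name SymbolSketch and the signatures of match and build are invented for the example.

```ruby
# Hypothetical sketch; trees are arrays of the form [label, child, ...].
module SymbolSketch
  # Returns a Hash of bindings on success, nil on failure.
  def self.match(pattern, tree, env = {})
    case pattern
    when Symbol
      if env.key?(pattern)
        env[pattern] == tree ? env : nil  # repeated Symbol: forces unification
      else
        env.merge(pattern => tree)        # first use: a placeholder
      end
    when Array
      # The Symbol in head position names the internal node.
      return nil unless tree.is_a?(Array) && tree.size == pattern.size && tree[0] == pattern[0]
      pattern.zip(tree).drop(1).reduce(env) do |e, (p, t)|
        e && match(p, t, e)
      end
    else
      pattern == tree ? env : nil
    end
  end

  # In a template, a Symbol outside head position is a variable that
  # refers to an already matched subtree.
  def self.build(template, env)
    case template
    when Symbol then env.fetch(template)
    when Array  then [template[0], *template.drop(1).map { |t| build(t, env) }]
    else template
    end
  end
end

env   = SymbolSketch.match([:add, :x, :x], [:add, [:lit, 1], [:lit, 1]])
built = SymbolSketch.build([:mul, [:lit, 2], :x], env)
fail_env = SymbolSketch.match([:add, :x, :x], [:add, [:lit, 1], [:lit, 2]])

p env
p built
p fail_env
```

The sketch also shows the cost of the overloading: a tree whose leaves are themselves Symbols cannot be matched literally, because any Symbol outside head position is read as a variable.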


Minutes of the meetings with Chun-Yu

February 10, 2009

This post belongs to the category that is reserved for noting minutes of the meetings with Chun-Yu.


Minutes of the meetings with Andy

February 10, 2009

This post belongs to the category that is reserved for noting minutes of the meetings with Andy.