Clearing the syntactic confusion

As I mentioned in an earlier post, and Andy corroborated, our overloaded use of blocks is a source of confusion and inadvertent errors. The crux is the global current term and the fact that some of the RubyWrite methods use it (reference or modify it) implicitly. Operating on an explicit term requires passing a block, which is not only unintuitive but also clashes with the special meaning of blocks for operations such as match, build, succeeds?, etc. Another irritant in the current syntax is that it violates the Ruby norm of ending the names of methods with side-effects with “!”. In RubyWrite, all methods have side-effects!

Here is a possible solution. RubyWrite already uses meta-programming to enclose each user-defined method within a similarly named method in a dynamically defined subclass, in order to handle environment setup and cleanup. We can arrange for the subclass method to propagate changes made to the current term back to the caller only if the method name ends in “!”. Thus, user methods whose names end with “!” automatically have side-effects, while others don’t. This brings the syntax in line with the Ruby convention that a trailing “!” warns users of a side-effect, at least as far as the current term is concerned.
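The wrapping idea can be sketched in a few lines of plain Ruby. This is only an illustration under stated assumptions: the names BangRewriter, wrap_user_methods and cur_t are hypothetical and are not RubyWrite's actual API, and the current term is modeled as a simple instance variable.

```ruby
# Hypothetical sketch; BangRewriter, wrap_user_methods and cur_t are
# illustrative names, not RubyWrite's real API.
class BangRewriter
  attr_accessor :cur_t

  def initialize(term)
    @cur_t = term
  end

  # Build a subclass in which every user-defined method is enclosed
  # in a same-named wrapper that does setup and cleanup.
  def self.wrap_user_methods
    user_methods = instance_methods(false)
    Class.new(self) do
      user_methods.each do |name|
        define_method(name) do |*args, &blk|
          saved = cur_t                  # setup: remember the current term
          result = super(*args, &blk)    # run the user's method
          # cleanup: keep the new current term only for "!" methods
          self.cur_t = saved unless name.to_s.end_with?('!')
          result
        end
      end
    end
  end
end

class BangRenamer < BangRewriter
  def rename(new_name)
    self.cur_t = [:Var, new_name]
  end

  def rename!(new_name)
    self.cur_t = [:Var, new_name]
  end
end

wrapped = BangRenamer.wrap_user_methods
r = wrapped.new([:Var, 'x'])
r.rename('y')    # change is discarded by the wrapper
r.rename!('y')   # change is propagated back
```

Note that the sketch restores only the reference to the old term. If a method without “!” mutated the term in place, the wrapper would need to work on a copy to discard the change.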

Also leveraging Ruby’s meta-programming, for each user-method we can define a similarly named method in the Node class, which sets up the current term to the Node object on which it is called, before invoking the user method. This provides a cleaner syntax than blocks, and eliminates the overloaded use of blocks. Notice that a user method with side-effects (i.e., with name ending in !) will affect the Node object on which it is called, not the current term.
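A rough sketch of how such per-term methods might be generated follows. Everything here is an assumption made for illustration: SketchNode stands in for the Node class, and TermRewriter and install_on_node are hypothetical names.

```ruby
# Hypothetical sketch; SketchNode, TermRewriter and install_on_node
# are illustrative names, not RubyWrite's real API.
class SketchNode
  attr_reader :label, :children

  def initialize(label, *children)
    @label = label
    @children = children
  end
end

class TermRewriter
  attr_accessor :cur_t

  # For each named user method, add a same-named method to SketchNode
  # that makes the receiver the current term for the duration of the call.
  def install_on_node(*names)
    rewriter = self
    names.each do |name|
      SketchNode.send(:define_method, name) do |*args, &blk|
        saved = rewriter.cur_t
        rewriter.cur_t = self            # self is the node receiving the call
        begin
          rewriter.send(name, *args, &blk)
        ensure
          rewriter.cur_t = saved         # restore the previous current term
        end
      end
    end
  end

  # A user-defined method that inspects the current term.
  def arity
    cur_t.children.size
  end
end

rw = TermRewriter.new
rw.cur_t = SketchNode.new(:Seq)
rw.install_on_node(:arity)
SketchNode.new(:Assign, 'x', '10').arity   # runs arity with the Assign node as current term
```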

Finally, in order to be able to write traversals with user-specified blocks, which behave like anonymous user-methods, we still need explicit apply and apply! methods. (I used the name xform in my earlier post, but I think apply sounds better.)

In summary, if foo and foo! are user-defined methods, and :Term[...] is a Node object:

# match against the current term
match :Assign[:lhs, :rhs]

# match against the specified term
:Term[...].match :Assign[:lhs, :rhs]

# match against the specified term, and
# succeed only if the block succeeds too
:Term[...].match :Assign[:lhs, :rhs] { foo(:rhs) }

# build
build! :Assign[:lhs, :rhs]

# build, but first update the bindings
# by executing the block
build! :Assign[:lhs, :newrhs] { :newrhs <= foo(:rhs) }

# build, and update the specified term
# instead of the (default) current term
# thus, build!(...) == curT.build!(...)
# this is more useful when :Term[...] is
# replaced by some variable
:Term[...].build! :Assign[:lhs, :rhs]

# changes to current term in foo
# do not propagate back
foo :rhs

# changes to current term in foo
# are propagated back here
foo! :rhs

# called on specified term
:Term[...].foo :rhs

# :Term[...] becomes the current term
# changes to it made within foo! are lost
# again, more useful when :Term[..] is replaced
# by a variable
:Term[...].foo! :rhs

The traversals become slightly simpler than the description in the earlier post.

def all (*args, &f)
  @curT.children.each do |c|
    c.apply!(*args, &f)
  end
end
def topdown (*args, &f)
  apply!(*args, &f)
  @curT.children.each do |c|
    c.apply!(*args) { topdown(*args, &f) }
  end
end
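To see the shape of these traversals in isolation, here is a self-contained toy model. It is an assumption-laden simplification: instead of a global current term, the hypothetical ToyTerm#apply! yields the receiver to the block and lets the block mutate it in place.

```ruby
# Hypothetical toy model; ToyTerm and its methods are illustrative only.
class ToyTerm
  attr_accessor :label, :children

  def initialize(label, *children)
    @label = label
    @children = children
  end

  # Run the block with this term as the "current term"; the block may
  # mutate the term in place.
  def apply!(*args)
    yield self, *args
    self
  end

  # Apply f to every direct child that is itself a term.
  def all(*args, &f)
    children.each { |c| c.apply!(*args, &f) if c.is_a?(ToyTerm) }
    self
  end

  # Apply f to this term first, then recurse into the children.
  def topdown(*args, &f)
    apply!(*args, &f)
    children.each { |c| c.topdown(*args, &f) if c.is_a?(ToyTerm) }
    self
  end

  def to_s
    "#{label}[#{children.join(', ')}]"
  end
end

tree = ToyTerm.new(:Assign,
                   ToyTerm.new(:Var, 'x'),
                   ToyTerm.new(:Add, ToyTerm.new(:Var, 'y'), 'one'))

# Prefix every variable name, visiting parents before children.
tree.topdown('v_') do |t, prefix|
  t.children[0] = prefix + t.children[0] if t.label == :Var
end
puts tree   # prints Assign[Var[v_x], Add[Var[v_y], one]]
```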

2 comments, December 4, 2008

  1. akeep posted the following on December 4, 2008 at 2:27 pm.

    I like this a lot better!

    I also like the idea of using the ! to indicate that we’ve got side-effects happening… Ruby isn’t very consistent about this outside of the base level objects (like String), but I really like the idea of ! indicating there are other things going on. It also means that I can use the same code for both pretty easily by simply saying:


    def foo
    # . . .
    end
    alias foo! foo # alias takes the new name first, then the existing one

    and get both a side-effecting and a non-side-effecting version.

    Just out of curiosity, it looks like I can now call match and build! directly on Node. Can I also call user-defined functions on a node? Do I simply use apply for that?

    So…

    class Transformer < RubyWriter::ReWriter
    def main
    # ...
    end

    def foo
    match :Assign[:rhs,:lhs]
    :lhs.lookup.bar # or :lhs.lookup.apply { bar }
    # . . .
    end

    def bar
    # . . .
    end
    end

    I was just thinking we could get this effect automagically with a little method_missing action…


    class Node
    def method_missing sym, *args
    self.apply { send sym, *args }
    end
    end

    Or something along those lines… of course whether this is a good idea or not is a completely different discussion :)

    1. Arun Chauhan posted the following on December 4, 2008 at 2:43 pm.

      You can invoke all user-defined methods on terms. This is implemented by adding methods to the Node class in an eval loop in ReWriter#run. So, we don’t necessarily have to use method missing magic :).

      Invoking a method on a term now has the consistent meaning that the current term will be set to the receiver just before the method call. This holds for match and build! as well. So, for match, this means that the specified pattern will be matched against the receiver, instead of the current term. For build! this means that the result of the build will replace the receiver, instead of the current term.

      You can use apply anywhere you use a method call, with identical effect. However, apply is really intended to be used with lambdas (or Proc objects). Notice that the “method” to be invoked is passed as a block to apply.

      By the way, I am also thinking of supplying a match? method as a shorthand for succeeds? match, since it appears to be a common case. And extending the same idea as with user-methods ending in “!”, we could have the eval loop create appropriate context for user methods ending in “?” to behave similarly.


Pitfalls of current term

Stratego’s notion of current term works beautifully. But, in embedding some of those capabilities within Ruby I am faced with a dilemma, perhaps not much different from the one posed by implicit globals in Perl (which Ruby imported). While writing a simple dead-code eliminator for a hypothetical imperative language I realized how easy it was to make mistakes. This does not happen so often in Stratego because it is (almost) purely functional. So, the only side-effect, that of modifying the current term, is easily tracked.

In Ruby a method could have other side-effects. The code could work with a parse tree directly, without invoking any RubyWrite methods. Worse, the code could mix the two. And that’s where the trouble starts. By default, RubyWrite’s methods operate on the (global) current term. If one is writing code that directly manipulates the parse tree and is careless about maintaining a consistent value for the current term (as I was) then mixing RubyWrite’s methods (say, match or build) may lead to unexpected results. For example, I was looking for live variables. At one point I used match to search for all the references in the statement that I was processing.

alltd { match(:Var[:name]){live_vars << :name.lookup} }

I meant to search for live variables within the statement currently being processed. The above expression, on the other hand, searches for names within the current term! It took me a while to realize this mistake. The correct expression is:

call_in_context(stmt[i]) do
  alltd { match(:Var[:name]){live_vars << :name.lookup} }
end
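The post does not show how call_in_context is implemented. A minimal sketch of the behavior it needs (temporarily swapping the current term and restoring it afterwards) could look like the following. The class name ContextRewriter and the cur_t reader are assumptions made for this example.

```ruby
# Hypothetical sketch of call_in_context; ContextRewriter and cur_t
# are illustrative names, not RubyWrite's real internals.
class ContextRewriter
  attr_reader :cur_t

  def initialize(term)
    @cur_t = term
  end

  # Make the given term the current term while the block runs, and
  # restore the previous one afterwards, even if the block raises.
  def call_in_context(term)
    saved = @cur_t
    @cur_t = term
    yield
  ensure
    @cur_t = saved
  end
end

rw = ContextRewriter.new(:whole_program)
rw.call_in_context(:one_statement) do
  rw.cur_t   # the statement, not the whole program
end
rw.cur_t     # back to the whole program
```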

I am still unable to decide whether a better approach would be to require the user to always specify the term to operate on so that errors like these could be more easily avoided.

4 comments, December 2, 2008

  1. akeep posted the following on December 3, 2008 at 12:47 pm.

    I agree, this is a pretty challenging issue. Stratego/XT, because it compiles into C code, has an advantage here in that you have the option to call strategies and rules with the implicit current term or an explicit term, which the strategy writer can treat the same way either way. When it compiles the code into C, Stratego can then make all of the implicit current term arguments explicit, which is part of what allows them to be treated the same way.

    Unfortunately in Ruby, we don’t have that luxury. We’ve got part of this working, in that we can pass an explicit term through a block, but because blocks are used for other things in some cases (such as the traversal case, as in your alltd example above) we can’t universally use this to play the same tricks that Stratego can.

    We might be able to achieve something similar, using some of the meta-programming you’re doing to handle scoping issues, by adding a new first argument that is always the explicit term being transformed, but I think it would be awkward to make this seamless in the same way Stratego is.

    I really can’t think of any good alternatives though… on the one hand I like the flexibility of something like:

    match_results = term.match :Pattern[...]

    because it makes it very explicit what is going on, and handles terms in a more object-oriented way. Unfortunately, I don’t think that makes the traversals very easy, since it is unclear how you update the working term.

    If we stick with simple methods we could make the term explicit, but that then forces the “strategy” writer to know that they are getting an explicit first term.

    The parse_tree project - http://parsetree.rubyforge.org/ - (which exposes the Ruby parse tree inside Ruby) allows you to transform the parse tree by writing methods like:

    rewrite_if or process_if

    which would handle processing of all “if” nodes… I’m not too hot on that though as it really denies you a lot of flexibility, and you’re forced to provide some sort of processor for all nodes, which is really annoying, even if you’re just aliasing the same do-nothing method over and over.

    1. Arun Chauhan posted the following on December 3, 2008 at 6:06 pm.

      You make a good point about blocks being used for different purposes. Perhaps, we can reduce the confusion by getting rid of the shortcut of using blocks as current term? There is already the method called call_in_context. But, I can imagine a more intuitive syntax, for example:

      :ThisTerm[..].xform { … }

      The return value of xform is the transformed term. Now, we could write traversals correctly.

      def all (*args, &f)
        @curT.children.each_with_index do |c, i|
          @curT.children[i] = c.xform(*args, &f)
        end
      end

      def topdown (*args, &f)
        @curT = @curT.xform(*args, &f)
        @curT.children.each_with_index do |c, i|
          @curT.children[i] = c.xform(*args) do
            topdown(*args, &f)
          end
        end
      end

      Node#xform sets the current context to self, calls the supplied block with the supplied arguments, and returns the value of curT at the end of executing the block. The current implementation does create a new copy of the subtree to be used as the new curT. I don’t see a way around this (because we don’t know in advance whether or not curT will be modified) except to implement copy-on-write.

      Notice that topdown and bottomup are both expensive, just as they are in Stratego :). A copy-on-write implementation should mitigate that cost.

  2. akeep posted the following on December 3, 2008 at 7:29 pm.

    I like the idea of having a xform method on terms. I see what you mean about it being potentially expensive for topdown and bottomup.

    Will we need to implement copy-on-write semantics to get the traversals to update things properly?

    1. Arun Chauhan posted the following on December 4, 2008 at 11:13 am.

      We won’t need copy-on-write for correctness, only efficiency.

      I have a better idea than having a special xform function. I will post it separately.


Making the syntax more Ruby-esque

It struck me that I might have been going through the entire process the wrong way. Instead of trying to embed Stratego-like syntax in Ruby, it would make much more sense to add similar capabilities while respecting (and leveraging) Ruby syntax. With this realization things start falling into place. In the following examples, foo and bar are user-defined operations.

# on current term
foo

# on a specified term; the block is evaluated
# and the result assigned to curT
foo {:Assign['x', '10']}

# with arguments, on current term
bar('x', 'y')

# with arguments, on a specified term
bar('x','y') {:Var['x']}

# match against current term
match? :Assign[:lhs, :rhs]

# match, with extra conditions (similar to congruence)
# match succeeds if the block succeeds,
# which is executed AFTER match
match :Assign[:lhs, :rhs] { ensure_binop[:rhs] }

# build, with extra conditions (similar to projection)
# the block is executed BEFORE build
build :Assign[:lhs, :rhs] { :lhs <= new_name('local'){:lhs} }

# save the returned value
:a <= foo

# conditional
if succeeds? { match :SomeTerm[:child] }

User methods are defined as regular methods, but some meta-programming magic using instance_methods should let us enclose these methods inside wrappers so we can do setup and cleanup work around each call.

November 15, 2008

Some more syntax

Some more syntactic sugar seems necessary.

Invoking a user-defined method, foo, on an explicit term, rather than the default current term (achieved by defining the << operator for Node type):

  :foo['arg1', 'arg2'] << :SomeTerm[:child1, 'child2']
  match :Assign[-:foo['arg'] << :SimpleAssign['left'],  :rhs]

Notice that there is no + ahead of :foo and how deferred computation is achieved by prefixing the method name with a - in the second line (achieved by defining << operator for Proc). The mnemonic is that deferred (lazy) calls start with -, immediate (eager) calls start with +, but both operate on the hidden current term. Explicit term specification is not lazy, but it isn’t immediate either because the current term must be changed, so it takes no prefix.

Another option would be to use square brackets ([]) to pass the explicit term, which could be achieved by defining the [] operator for Node and Proc:

  :foo['arg1','arg2'][:SomeTerm[:child1, 'child2']]
  match :Assign[-:foo['arg'][:SimpleAssign['left']], :rhs]

I don’t have a strong preference for one or the other, but the first seems somewhat cleaner.

For calling a method on a term:

   :to_string[] >> :a
   :to_string[] >> :SomeTree['child']
   :reverse[] >> ['node1', :Node2['child1', 'child2'], '3']

The >> indicates that a message is being sent to whatever is on the right hand side. The right hand side could be either a term or a Symbol—in the latter case the message is sent not to the symbol, but to the value it is bound to. There is no variant with - prefix in this case. This construct has no equivalent in Stratego.
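The message-send operator can be approximated without touching Symbol. The sketch below is an assumption-heavy stand-in: it uses a hypothetical helper msg(:reverse) in place of the :reverse[] syntax, and a plain hash named SKETCH_BINDINGS in place of RubyWrite's variable bindings.

```ruby
# Hypothetical sketch; Message, msg and SKETCH_BINDINGS are illustrative
# names. The post's :reverse[] syntax would require redefining Symbol#[].
SKETCH_BINDINGS = { a: %w[x y z] }   # stands in for the rewriter's bindings

class Message
  def initialize(name, *args)
    @name = name
    @args = args
  end

  # Send the message to the right-hand side. A Symbol is first
  # dereferenced through the bindings table, so the message goes to
  # the bound value rather than to the symbol itself.
  def >>(target)
    receiver = target.is_a?(Symbol) ? SKETCH_BINDINGS.fetch(target) : target
    receiver.public_send(@name, *@args)
  end
end

def msg(name, *args)
  Message.new(name, *args)
end

msg(:reverse) >> :a           # reverses the array bound to :a
msg(:reverse) >> %w[1 2 3]    # reverses the literal array
msg(:join, '-') >> :a         # extra arguments are forwarded
```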

November 13, 2008

Shortcut syntax for method calls

While writing some example code snippets to use RubyWrite I realized something that might be handy. Term nodes can be any of the following: Node, Symbol, Array, or String. I would like to leverage Ruby’s predefined operations, especially on Symbol, Array, and String types succinctly in my code. For example, if I was trying to match a pattern and I expected an array at a position, then I would like to be able to reverse it in place. In other words, I want a shortcut equivalent to the following:

match :While[:cond, :Body[:r <= -:reverse_array[]]]
...
def reverse_array
  x = id
  x.reverse!
end

Instead of having to write the reverse_array wrapper method, it would be nice if I could somehow directly invoke the Array#reverse method. Maybe:

match :While[:cond, :Body[:r <= -:method[:reverse]]]

The above can be implemented easily.

class ReWriter
  def method (name, *args)
     @curT.send name, *args
  end
end

match will set the current term to the sub-term at the position where :method occurs. So, it will work. But, is this the most compact syntax? Notice that if additional arguments are to be passed to the invoked method, they can be specified as well. I don’t particularly like the syntax, even though it might be the only feasible way.

method will also work on the current term anywhere.

puts method(:prettyprint)  # same as: puts @curT.prettyprint
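A standalone version of the forwarding helper, including the extra-arguments case, might look like this. The class name ForwardingRewriter is hypothetical, and the helper is called invoke here (an assumption, not the post's name) because defining a method named method would shadow Ruby's built-in Object#method.

```ruby
# Hypothetical sketch; ForwardingRewriter and invoke are illustrative names.
class ForwardingRewriter
  def initialize(term)
    @cur_t = term
  end

  # Forward an arbitrary message, with any extra arguments, to the
  # current term.
  def invoke(name, *args)
    @cur_t.public_send(name, *args)
  end
end

rw = ForwardingRewriter.new(%w[a b c])
rw.invoke(:reverse)     # => ["c", "b", "a"]
rw.invoke(:join, '-')   # => "a-b-c"
```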

November 11, 2008