Making the syntax more Ruby-esque

It struck me that I might have been going about the entire process the wrong way. Instead of trying to embed Stratego-like syntax in Ruby, it would make much more sense to add similar capabilities while respecting (and leveraging) Ruby syntax. With this realization, things started falling into place. In the following examples, foo and bar are user-defined operations.

# on current term
foo

# on a specified term; the block is evaluated
# and the result assigned to curT
foo {:Assign['x', '10']}

# with arguments, on current term
bar('x', 'y')

# with arguments, on a specified term
bar('x','y') {Var['x']}

# match against current term
match? :Assign[:lhs, :rhs]

# match, with extra conditions (similar to congruence)
# match succeeds if the block succeeds,
# which is executed AFTER match
match :Assign[:lhs, :rhs] { ensure_binop[:rhs] }

# build, with extra conditions (similar to projection)
# the block is executed BEFORE build
build :Assign[:l, :rhs] { :lhs <= new_name('local'){:lhs} }

# save the returned value
:a <= foo

# conditional
if succeeds? { match :SomeTerm[:child] }

User methods are defined as regular methods, but some meta-programming magic using instance_methods should let us enclose these methods inside wrappers so that we can do setup and cleanup work around each call.
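Here is a minimal sketch of what such wrapping might look like. The class names, the wrap_user_methods helper, and the log array are all hypothetical stand-ins; the log simply marks the places where setup and cleanup (for example, pushing and popping an environment) would go.

```ruby
# Hypothetical base class; @log stands in for real setup/cleanup work.
class WrappingReWriter
  attr_reader :log

  def initialize
    @log = []
  end

  # Wrap every method defined directly in the calling subclass.
  # instance_methods(false) returns only that class's own methods.
  def self.wrap_user_methods
    instance_methods(false).each do |name|
      original = instance_method(name)
      define_method(name) do |*args, &blk|
        @log << "enter #{name}"            # setup would go here
        begin
          original.bind(self).call(*args, &blk)
        ensure
          @log << "leave #{name}"          # cleanup would go here
        end
      end
    end
  end
end

class MyRewriter < WrappingReWriter
  def foo(x)
    x * 2
  end

  # Must come after the user methods have been defined.
  wrap_user_methods
end
```

With this in place, MyRewriter.new.foo(21) still returns 42, but the call is bracketed by the enter and leave entries. A real implementation might use a hook such as method_added instead of an explicit call at the end of the class body.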

November 15, 2008

Some more syntax

Some more syntactic sugar seems necessary.

Invoking a user-defined method, foo, on an explicit term, rather than the default current term (achieved by defining the << operator for Node type):

  :foo['arg1', 'arg2'] << :SomeTerm[:child1, 'child2']
  match :Assign[-:foo['arg'] << :SimpleAssign['left'],  :rhs]

Notice that there is no + ahead of :foo and how deferred computation is achieved by prefixing the method name with a - in the second line (achieved by defining << operator for Proc). The mnemonic is that deferred (lazy) calls start with -, immediate (eager) calls start with +, but both operate on the hidden current term. Explicit term specification is not lazy, but it isn’t immediate either because the current term must be changed, so it takes no prefix.

Another option would be to use square brackets ([]) to pass the explicit term, which could be achieved by defining the [] operator for Node and Proc:

  :foo['arg1','arg2'][:SomeTerm[:child1, 'child2']]
  match :Assign[-:foo['arg'][:SimpleAssign['left']], :rhs]

I don’t have a strong preference for one or the other, but the first seems somewhat cleaner.

For calling a method on a term:

   :to_string[] >> :a
   :to_string[] >> :SomeTree['child']
   :reverse[] >> ['node1', :Node2['child1', 'child2'], '3']

The >> indicates that a message is being sent to whatever is on the right hand side. The right hand side could be either a term or a Symbol—in the latter case the message is sent not to the symbol, but to the value it is bound to. There is no variant with - prefix in this case. This construct has no equivalent in Stratego.
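As a rough illustration, the >> operator could be defined on Node along the following lines. This is only a sketch: the Node class, the Symbol#[] constructor, and the Node.bindings table are assumptions made for the example, not the actual RubyWrite internals.

```ruby
# Hypothetical minimal Node: a label plus its children.
class Node
  attr_reader :value, :children

  def initialize(value, *children)
    @value = value
    @children = children
  end

  class << self
    # Assumed stand-in for the table of user-level symbol bindings.
    def bindings
      @bindings ||= {}
    end
  end

  # :msg[args] >> target sends msg(args) to the target.
  # A Symbol target is replaced by the value it is bound to.
  def >>(target)
    receiver = target.is_a?(Symbol) ? Node.bindings.fetch(target) : target
    receiver.send(value, *children)
  end
end

class Symbol
  # Assumed here: :foo[...] builds a Node labelled :foo.
  # Note that this replaces Ruby's built-in Symbol#[].
  def [](*children)
    Node.new(self, *children)
  end
end

Node.bindings[:a] = 'abc'
upper = :upcase[] >> :a

inner = :Node2['child1', 'child2']
reversed = :reverse[] >> ['node1', inner, '3']
```

In this sketch upper is 'ABC' (the message goes to the bound value, not to the symbol) and reversed is the array with its three elements in the opposite order.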

November 13, 2008

Shortcut syntax for method calls

While writing some example code snippets to use RubyWrite I realized something that might be handy. Term nodes can be any of the following: Node, Symbol, Array, or String. I would like to leverage Ruby’s predefined operations, especially on Symbol, Array, and String types succinctly in my code. For example, if I was trying to match a pattern and I expected an array at a position, then I would like to be able to reverse it in place. In other words, I want a shortcut equivalent to the following:

match :While[:cond, :Body[:r <= -:reverse_array[]]]
...
def reverse_array
  x = id
  x.reverse!
end

Instead of having to write the reverse_array wrapper method, it would be nice if I could somehow invoke the Array#reverse method directly. Maybe:

match :While[:cond, :Body[:r <= -:method[:reverse]]]

The above can be implemented easily.

class ReWriter
  def method(name, *args)
    @curT.send name, *args
  end
end

match will set the current term to the sub-term at the position where :method occurs. So, it will work. But, is this the most compact syntax? Notice that if additional arguments are to be passed to the invoked method, they can be specified as well. I don’t particularly like the syntax, even though it might be the only feasible way.

method will also work on the current term anywhere.

puts method(:prettyprint)  # same as: puts @curT.prettyprint

November 11, 2008

Environment transition in methods

One issue that came up in Friday’s discussion was the problem of isolating the environments of user-defined methods. For example, suppose that the user defined a method called foo and called it within main. Any user-level symbol bindings inside main are visible inside foo and any bindings made inside foo leak out to main. Clearly, this is a bad thing.

We discussed two possible solutions. The first involves requiring users to define each method using our keyword (something like define_strategy), where the code is passed as a block. In this way each user-defined method is, in fact, a Proc, and we can enclose it within another Proc that does the “right thing” to push and pop environments. The problem with this approach is that users must use special syntax for defining methods and, more importantly, we force each user-level method to be a closure, which might have performance issues.

A second “solution” is to use the Kernel#caller method and prefix each symbol with a string representing the call chain. However, this “solution” still has the original problem of an eternally growing table of symbol bindings. Moreover, I realized later, it is not a solution at all! Consider what happens when foo is called successively by main. The two invocations of foo will have identical-looking call chains and their symbols will interfere. This is related to the eternally-growing-table problem, because we never remove any symbols from the table when a method ends.
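The interference is easy to reproduce in a small sketch. The chain_key helper and the plain Hash used as the symbol table are hypothetical; the only real API involved is Kernel#caller.

```ruby
# Build a table key by prefixing the symbol name with the call chain,
# as the second "solution" proposes.
def chain_key(name)
  "#{caller.join('|')}:#{name}"
end

# A user method that binds :x in the shared table.
def foo(table, value)
  key = chain_key(:x)
  table[key] = value
  key
end

table = {}
keys = []
# Two successive invocations of foo from the same place in the caller.
2.times { |i| keys << foo(table, i) }

same_key = keys[0] == keys[1]   # => true
entries  = table.size           # => 1
```

Both invocations produce the same key, so the second binding of :x silently overwrites the first, and the entry is never removed when foo returns.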

Here’s a possible solution. Require that all methods be called with a special keyword, say apply. Users will define their methods as normal Ruby methods, but must invoke them with the apply keyword. Of course, now they must pass the name of the method as a symbol.

apply :foo['arg1', 'arg2']
 

Can we do better? Of course!

+:foo['arg1', 'arg2']

We can define the unary plus on the Node type (which does work, unlike unary plus on Symbol, which does not). We will now require that each user-defined method invocation use square-brackets even if there is no argument, just like C requires parentheses for function calls. A nice thing with this syntax is that all invocations of user-defined methods are uniform, except that immediate invocations use + and deferred invocations use -. (Yes, unfortunately, the users will need to remember where to use deferred execution.) It also serves to perpetuate the illusion that we have certain keywords within RubyWrite, e.g., match and build, which are distinct from methods.

There is one small wrinkle in implementing the Node unary plus. The method foo must be invoked in the context of the ReWriter instance, which is not known to Node. There was a similar situation with unary minus, but there the final invocation of the returned Proc object was within the RubyWrite internal methods, which can arrange to send the ReWriter instance as an additional argument. Here, we do not want to encumber the user with an additional argument each time their method is to be invoked.
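For comparison, the unary minus arrangement could look roughly like the sketch below. The Node class, the Symbol#[] constructor, and TinyRewriter are assumptions made for illustration; the point is only that the returned Proc receives the rewriter instance as an explicit argument when it is finally called.

```ruby
# Hypothetical minimal Node: a label plus its children.
class Node
  attr_reader :value, :children

  def initialize(value, *children)
    @value = value
    @children = children
  end

  # -:foo['a'] does not call foo. It returns a Proc that calls
  # foo('a') on whichever rewriter is passed in later.
  def -@
    ->(rewriter) { rewriter.send(value, *children) }
  end
end

class Symbol
  # Assumed here: :foo[...] builds a Node labelled :foo.
  # Note that this replaces Ruby's built-in Symbol#[].
  def [](*children)
    Node.new(self, *children)
  end
end

# Stand-in for a user's rewriter class.
class TinyRewriter
  attr_reader :calls

  def initialize
    @calls = []
  end

  def shout(word)
    @calls << word
    word.upcase
  end
end

rw = TinyRewriter.new
deferred = -:shout['hi']         # nothing has been invoked yet
calls_before = rw.calls.dup      # => []
result = deferred.call(rw)       # => "HI"
```

Internal methods such as match would be the ones calling the Proc, so they are in a position to supply the instance.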

The solution to this wrinkle involves using Ruby’s class instance variables. Recall that each class in Ruby is really an object. So, even though it sounds like an oxymoron, class instance variables do make sense. Here is the code for Node unary plus.

class Node
  def +@
    ReWriter::reWriterInstance.send value, *@children
  end
end

reWriterInstance is defined within the ReWriter initializer using the attribute setter of the ReWriter class.

class ReWriter
  class << self
    attr_accessor :reWriterInstance
  end

  def initialize
     ...
     ReWriter.reWriterInstance = self
  end
end

As long as the users stick to using the run class method (and do not, for example, instantiate their class directly), the magic of class instance variables makes this work even when there are multiple user classes.
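Putting the pieces together, a self-contained sketch might look like this. The Node constructor, the Symbol#[] definition, the shape of the run class method, and the Demo class are assumptions made for the example; only Node#+@ and the reWriterInstance accessor follow the code above.

```ruby
# Hypothetical minimal Node: a label plus its children.
class Node
  attr_reader :value, :children

  def initialize(value, *children)
    @value = value
    @children = children
  end

  # +:foo['a', 'b'] calls foo('a', 'b') on the active rewriter.
  def +@
    ReWriter.reWriterInstance.send(value, *children)
  end
end

class Symbol
  # Assumed here: :foo[...] builds a Node labelled :foo.
  # Note that this replaces Ruby's built-in Symbol#[].
  def [](*children)
    Node.new(self, *children)
  end
end

class ReWriter
  class << self
    attr_accessor :reWriterInstance

    # Assumed shape of run: create one instance and call its main.
    def run
      new.main
    end
  end

  def initialize
    ReWriter.reWriterInstance = self
  end
end

# A user class with one user-defined method.
class Demo < ReWriter
  def main
    +:greet['hello', 'world']
  end

  def greet(a, b)
    "#{a} #{b}"
  end
end

result = Demo.run   # => "hello world"
```

The user never passes the rewriter around; Node finds it through the ReWriter class itself.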

4 comments, November 8, 2008

  1. akeep posted the following on November 10, 2008 at 4:02 pm.

    I think this is a good idea… though I’m not sure that the class instance variable is going to give you quite the behaviour you want.

    Specifically, I don’t think this will work in the cases where you have more than one instance of the ReWriter class running around in the same Ruby program, since there is only one instance variable defined for the ReWriter class.

    For instance, we can test this out in IRB:


    > class ReWriter
    > class << self
    > attr_accessor :reWriterInstance
    > end
    > def initialize
    > ReWriter.reWriterInstance = self
    > end
    > end
    => nil
    > ReWriter::reWriterInstance
    => nil
    > r1 = ReWriter.new
    => #<ReWriter:0x...>
    > ReWriter::reWriterInstance
    => #<ReWriter:0x...>
    > r2 = ReWriter.new
    => #<ReWriter:0x...>
    > ReWriter::reWriterInstance
    => #<ReWriter:0x...>
    > r1 == ReWriter::reWriterInstance
    => false
    > r2 == ReWriter::reWriterInstance
    => true

    It may still be a limitation worth living with though, at least until we can come up with another clever solution. We might want to add something into ReWriter though to ensure that it only has one instance at a time, and throw an error if it has more than one.

    1. Arun Chauhan posted the following on November 11, 2008 at 11:08 am.

      Of course, if you have more than one instance of any derived class you run into problems. However, a class instance variable does protect you against the situation when you might have two different derived classes active at the same time.

  2. akeep posted the following on November 10, 2008 at 6:04 pm.

    Actually, after reading the posting on class instance variables, I take back what I said earlier. My example was actually a bad one, because what we really have is:

    class Transformer1 < ReWriter
    # . . .
    end
    class Transformer2 < ReWriter
    # . . .
    end

    Unfortunately, this still has the same problem in its simplest form because the instance variable referred to in the initializer is the instance variable from ReWriter, not from the individual transformers.

    1. Arun Chauhan posted the following on November 10, 2008 at 6:10 pm.

      Andy,

      It doesn’t matter, really. There can be exactly one instance of the class running at any given time if you use the “run” class method (so, maybe, using a class instance variable is not even necessary). There is no way to get back to an earlier instance. Like I said in my original post, all bets are off if the users bypass the “run” method.


Exceptions and conditionals

Our convention has been to raise an exception when a method (such as match) fails. This causes a problem with if statements. Often we want to test whether an operation, such as match, succeeded and take an appropriate action. Raising an exception on failure precludes that.

Andy suggested that we could require the use of special method names in conditionals, say ending with ?, and catch the NoMethodError exception to define those methods to return a Boolean value, instead of raising exceptions. So, instead of writing if match, we would write if match?. Unfortunately, that has two problems:

  1. If NoMethodError is raised then either the rest of the computation within main is lost or the entire computation must be restarted (using retry). The first is unacceptable and the second is highly inefficient at best—in the worst case, it would be incorrect if the computations in main have side-effects.
  2. If the user class defined its own method name ending with ? then we have no control over its behavior.

The issue here is similar to what comes up in implementing “projection” and “congruence”. Essentially, we want to defer the computation. In this case, we want to defer computing the conditional expression so that it can first be enclosed inside a rescue block to return a Boolean value, instead of raising an exception. The solution is also similar, by introducing appropriate keywords and requiring that the method name be specified symbolically for later evaluation.

if succeeds :match, with_args(:Assign[...])
  ...
elsif succeeds :myMethod
  ...
else
  ...
end

Of course, with_args is optional, as the elsif clause shows. Finally, fails is a synonym of !succeeds. Once again, my desire to have the most compact syntax trumps other possible solutions that would require enclosing the if-else statement inside a block.

The icing on the cake with this approach is that we can implement environment rollback on failure inside conditionals.
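A sketch of how succeeds might combine the rescue with environment rollback is shown below. Everything here is hypothetical: the Failure exception class, the env Hash standing in for the table of bindings, and the toy bind_lhs operation exist only to make the example runnable.

```ruby
class ConditionalSketch
  # Hypothetical failure type; the convention is only that
  # operations raise an exception when they fail.
  class Failure < StandardError; end

  attr_reader :env

  def initialize
    @env = {}
  end

  # Run the named method, report success as a Boolean, and
  # restore the saved bindings if the method failed.
  def succeeds(meth, *args)
    saved = @env.dup               # shallow snapshot of the bindings
    send(meth, *args)
    true
  rescue Failure
    @env = saved                   # roll back bindings made before the failure
    false
  end

  def fails(meth, *args)
    !succeeds(meth, *args)
  end

  # Toy operation: binds :lhs first, then fails unless term is an Array.
  def bind_lhs(term)
    @env[:lhs] = term
    raise Failure, 'not an assignment' unless term.is_a?(Array)
    term
  end
end

rw = ConditionalSketch.new
first  = rw.succeeds(:bind_lhs, ['x', 1])   # => true, :lhs is bound
second = rw.succeeds(:bind_lhs, 'oops')     # => false, binding rolled back
```

After the second call, env[:lhs] is still ['x', 1], even though bind_lhs overwrote it before failing. A shallow dup is enough for this toy table; nested structures would need a deeper copy.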

2 comments, November 6, 2008

  1. akeep posted the following on November 6, 2008 at 9:56 pm.

    I like this idea. I think it is cleaner than having any kind of implicit automatic method_missing catch (also more efficient).

    Do you think we need the with_args keyword there? Or could we do something like:

    succeeds :match, :Assign[...]

    Then we could define succeeds as something like:

    def succeeds meth, *args
      self.send meth, *args
    rescue
      false
    end

    (Assuming the referenced method returned something other than false or nil).

    Reply to akeep
    1. Arun Chauhan posted the following on November 8, 2008 at 10:29 am.

      OK, so after Friday’s discussion I realized that I don’t have to catch the NoMethodError exception and the first problem goes away. However, the second problem still remains. And as Andy has pointed out, using the succeeds method is cleaner.

      Also, with_args is not necessary. I had that just for readability. But, there is a better method, which is to use the same approach used in handling congruence and projection for constructing closures by defining unary minus on Symbol types. (Yes, it turns out that unary plus doesn’t work.)

      Arun

