Created by: natefaubion
Fixes #298 (closed) plus a few more bugs related to infix matching in various contexts. It's a bit of a doozy, so lets walk through it.
Previously, the special sauce for infix macros and expressions was isolated to the get_expression
helper. It did some magic to continually enforest an expression until it reached a fixed point. This is what would load something like foo => foo + 1
with a =>
lambda macro as a single expression. foo
would be recognized as a single expression, but then it would lookahead to see if there was a macro. It would then expand it and see if it extended the original expression. The problem was get_expression
was not called in all cases, so you'd get into situations where they wouldn't be expanded (like in binary ops and assignment expressions). The solution was to just switch over to get_expression
in these cases, but custom operator put a wrinkle in that.
With fixed binary operators we would just build up BinOp
terms as we went, but with custom operators, the term building is deferred so we can work out the custom precedence and associativity. Sub calls to get_expression
for the right-hand-side of a binary operator are impossible because you lose the operator context (prec, assoc, stack, etc). We have to persist that context, which means the routine for infix expansion had to be moved into enforest/step
instead of get_expression
. In general this is good, because that means infix macros are expanded everywhere, which fixed a number of related bugs. It however opened up another set of issues. Take the example on the homepage:
macro unless {
rule infix { return $value:expr | $guard:expr } => {
if (!($guard)) {
return $value;
}
}
}
function foo(x) {
return true unless x > 42;
return false;
}
This used to work because macro expansion only happened within get_expression
. return
and the expression would be enforested as separate terms, and the unless
infix macro would be run on the next run through enforest. But now that we've shifted everything around, the unless
macro is expanded in the same run of enforest as the lhs expression. This means the lhs expression term will not have been finalized and built yet, so it can't match it as an expression.
So now this means we need a way to finalized the previous expression, but only if we try to match on the expression with an infix macro. To do this, I created a new set of Partial
term trees (PartialExpression
and PartialOperation
). While we are building an expression/operator chain, we wrap out previous syntax and terms in these special terms. What makes PartialExpression
special is that it has a combine
method that knows how to finalize the expression/operator chain its a part of to retrieve the full expression. Now when you try to match on an infix expression, the expander recognizes that its a PartialExpression
and calls its combine
method. Only operator tokens are wrapped in PartialOperation
, so that if you try to match on it directly, it won't be recognized as an expression. The Partial
terms also contain references to the Partial
syntactically to the left so that we can follow the entire chain and work out where it ends.
There's another issue though that operators and infix macros bring up. Say we are using our unless
macro above, but the left hand side expression is a complicated chain of operators 1 * 2 + 3 / 4
that would have several pending combinations on the operator stack. If all we did was splice the syntax, and left the opCtx
as is, all those pending combinations would still exist (even though we've consumed them with the infix macro). This means we have to know how much we've matched on, and rewind the operator context to match the syntax we consumed. To do that we add the current PartialOperation
term to the opCtx
and use it as a marker. When we match on infix syntax in an expression/operator chain, we look at the resulting prevTerms
and find the next PartialOperation
term. We then search the stack for the matching term and slice it there.
But wait! There's more! It turns out, prevStx
and prevTerms
weren't actually being built correctly after combining an operation. When the stepper decides that it needs to combine the term, it persists the previous syntax and terms from the closure it was built. But try and follow along to see why this is wrong. Let's take a simple chain of 1 + 2 + 3
. Here, the 1 +
is on the prevStx
when the combine
closure is built for the first operator. Once we get to the second +
, it combines the first which results in a <BinOp 1 + 2>
term, but with the 1 +
still on the pevious syntax. When the new term is then added to the previous syntax, you get 1 + 1 + 2
. With long chains of operators, you get the syntax duplication gradually building up the longer the chain is. To get the correct prevStx
we actually need to know the prevStx
from before the 1 + 2
, which is not available when we build our combine
closure! To get around this, we have to add the prevStx
and prevTerms
to the opCtx
stack so that we can restore it after combining.
At this point, we really need everything in the opCtx
on the stack to restore state, so instead of just storing a [prec, combine]
tuple, we now just store the entire opCtx
object. This is good because we don't need to do as much copying of the object. We can just push and pop off the stack.
In general I've found this to work really well in my testing. There are only a few edge-cases related to splitting terms. For example, so I want to do this:
macro m {
rule infix { $l:expr + 2 + | } => { $l }
}
1 + 2 + m
It won't work. I'd intuitively expect it to recognize 1
as an expression, but a BinOp
term has already been built with 1 + 2
. We've already matched the + 2
, so if we gave that whole expression it would be wrong. It isn't smart enough to descend into sub-expressions, so it is currently just disallowed and the match is failed.
Fortunately, I can't think of how a use case like that would be useful, so I don't expect it to really come up (famous last words). Otherwise, I think its pretty solid with all tests (plus a few additional) passing. Do try to break it, though.