The Popovitch Blog

Explaining the Abstract Algorithm - Part 1

September 08, 2019

Warnings: Haskell knowledge required.

Creating a Pathological Example

In my last post on the Abstract Algorithm, I explained several important terms such as redex and Optimal. REad that before this one. Now I’m going to explain how the Abstract Algorithm works. Let’s set up an example where the Abstract Algorithm really shines.

id x = x
pair p = (p, p)
part_1 x = x id
part_2 y = pair (y 3)
to_reduce = part_1 part_2
-- to_reduce == (3, 3)

This might seem really convoluted - let’s add some type annotations. We’ll say that the 3 is an Int for simplicity.

id x = x
pair p = (p, p)

part_1 :: ((p -> p) -> t) -> t
part_1 x = x id

part_2 :: (Int -> b) -> (b, b)
part_2 y = pair (y 3)

to_reduce :: (Int, Int)
to_reduce = part_1 part_2
-- to_reduce == (3, 3)

Now, let’s lazily reduce to_reduce. First we replace part_1 with its definition.

to_reduce = (\x -> x id) part_2

Ok, cool, we can just apply the function.

to_reduce = part_2 id

Now the only way to reduce this further is to replace part_2 with its definition.

to_reduce = (\y -> pair (y 3)) id

Then we can just apply the function.

to_reduce = pair (id 3)

Now replace pair with its definiton.

to_reduce = (\p -> (p, p)) (id 3)

Then apply the function.

to_reduce = (id 3, id 3) 

And then I think it’s pretty obvious from here how this turns into (3, 3).

So, there’s the same problem as last time. Namely, the id 3 function gets duplicated and evaluated twice. GHC is smart enough to catch this, even with optimizations turned off, and share the value. This is because GHC’s evaluation strategy is more complicated than just straightforward lazy evaluation. See this StackOverflow question for more info and why Haskell’s strategy does still have issues.

What Is a Virtual Redex

If you remember, a redex is any expression that can be reduced, such as (\x -> x) 3 can be reduced to just 3.

Let’s look at the definition of part_2:

part_2 y = pair (y 3)

What you should notice here is that (y 3) is not a redex, since we don’t know what function y will be. Just looking inside the part_2 function, we have no information that would allow us to reduce part_2. But, later in the reduction of these functions, we will eventually replace y with id, which will allow us to easily reduce it. But as of right now, (y 3) is not a redex since it can’t be reduced. Since we hope that we will one day be able to reduce (y 3), we’ll call it a “virtual redex”.

So, is it possible to reduce a virtual redex? It is, and here’s how. Here is our example again:

id x = x
pair p = (p, p)
part_1 x = x id
part_2 y = pair (y 3)
to_reduce = part_1 part_2

We’re trying to reduce to_reduce, but we want to reduce the (y 3) expression within part_2 early for the sake of demonstration. We identify that we can’t do anything with (y 3) because we don’t know what y represents, and the reason we don’t know what y represents is because it’s passed as a parameter to part_2. So now we “rise up” and replace part_2 with it’s value.

id x = x
pair p = (p, p)
part_1 x = x id
-- part_2 y = pair (y 3)
to_reduce = part_1 (\y -> pair (y 3))

Now, we still don’t know the value of y, so we rise up again:

id x = x
pair p = (p, p)
-- part_1 x = x id
-- part_2 y = pair (y 3)
to_reduce = (\x -> x id) (\y -> pair (y 3))

Now we can reduce this:

id x = x
pair p = (p, p)
-- part_1 x = x id
-- part_2 y = pair (y 3)
to_reduce = (\y -> pair (y 3)) id

And then we can reduce this one more time.

id x = x
pair p = (p, p)
-- part_1 x = x id
-- part_2 y = pair (y 3)
to_reduce = pair (id 3)

And now, finally, we can reduce the (y 3) since we now know it’s really (id 3)

id x = x
pair p = (p, p)
-- part_1 x = x id
-- part_2 y = pair (y 3)
to_reduce = pair 3

It might thing that we’re just reducing until we get a value to pass to the virtual redex. But we’re actually being a bit more direct than that. Visualize it like a tree:

           to_reduce
               $
             /   \  
            /     \  
           /       \  
      part_1 x    part_2 y------------------------------------------------------.
        $            $                                                              \  
      /   \         / \                                                              \   
     /     \       /   \                                                              \  
    /      |       |    \                                                              \ 
   x     id x   pair p   $our virtual redex                                     /
           |            / \          we want to reduce this but we can't              /
           x           y   3         because we get the value of y from up here - ---`

The $ represents function application, same as normal Haskell. When I write part_1 x, it represents the function part_1 which takes the parameter x (I really should write \x -> part_1 x, but we have space constraints here). With this visualization we can present a more principled way of reducing the virtual redex y 3. Also I’ve left off the definition of pair just for brevity’s sake.

So, to evaluate the virtual redex, we start at the $.

           to_reduce
               $
             /   \  
            /     \  
           /       \  
      part_1 x     part_2 y  
        $            $ 
      /   \         / \ 
     /     \       /   \    
    /      |       |    \   
   x     id x   pair p   $current position
           |            / \ 
           x           y   3    

Now we examine the left side of $ (y), and find we don’t know what it is!

           to_reduce
               $
             /   \  
            /     \  
           /       \  
      part_1 x     part_2 y  
        $            $ 
      /   \         / \ 
     /     \       /   \    
    /      |       |    \   
   x     id x   pair p   $   
           |            / \ 
           x           y   3current position

This means that this expression can’t be reduced until we get more information about y. If we knew what y was going to be, we could reduce this further. To figure out what y is, let’s jump to where it is introduced.

           to_reduce
               $
             /   \  
            /     \  
           /       \  
      part_1 x     part_2 ycurrent position
        $            $    
      /   \         / \
     /     \       /   \    
    /      |       |    \   
   x     id x   pair p   $   
           |            / \ 
           x           y   3    

Yes, y is introduced at part_2. As of right now, we still don’t know anything about y, so let’s defer this question right now. For now, we want to answer if part_2 is the left part of a function application, because if it were we could find the value of y just by looking at the right part of that function application. So let’s say we “push the question of ‘what is y’ onto the stack and we arenow trying to see if the part_2 function is applied to any values”. Let’s make a new area to keep track of what mysteries we’re solving.

                                    ----------------------------
           to_reduce               |     mysteries to solve     |
               $                   | -------------------------- |
             /   \                 | what is part_2 applied to? |current mystery
            /     \                |      what is y?            |
           /       \               ---------------------------- 
      part_1 x     part_2 ycurrent position       
        $            $    
      /   \         / \
     /     \       /   \    
    /      |       |    \   
   x     id x   pair p   $   
           |            / \ 
           x           y   3    

You can think of the mysteries as a stack - we’re always trying to find the top one, because we need to to solve any below it. If we knew what part_2 was applied to, whatever it was applied to would be the value of y. So, let’s “rise up” and see where part_2 ends up getting used.

           to_reducecurrent pos  ----------------------------
               $                   |     mysteries to solve     |
             /   \                 | -------------------------- |
            /     \                | what is part_2 applied to? |current mystery
           /       \               |      what is y?            |
      part_1 x     part_2 y         ---------------------------- 
        $            $    
      /   \         / \  
     /     \       /   \     
    /      |       |    \   
   x     id x   pair p   $   
           |            / \ 
           x           y   3    

Now we see part_2 is being passed as a parameter to part_1. This doesn’t really help us yet, we’re looking for a situation where part_2 has a parameter passed to it (we want part_2 to be on the left side of the function application, here it’s on the right side). The procedure now is to go down and see how it’s used.

                     to_reduce                ----------------------------
                         $                   |     mysteries to solve     |
                       /   \                 | -------------------------- |
                      /     \                | what is part_2 applied to? |current mystery
                     /       \               |      what is y?            |
current pospart_1 x     part_2 y         ---------------------------- 
                    $            $    
                  /   \         / \  
                 /     \       /   \     
                /      |       |    \   
               x     id x   pair p   $   
                       |            / \ 
                       x           y   3    

part_1 is comprised of a function application. The parameter x of part_1 is part_2. Let’s go down to see what’s on the left side.

                     to_reduce                ----------------------------
                         $                   |     mysteries to solve     |
                       /   \                 | -------------------------- |
                      /     \                | what is part_2 applied to? |current mystery
                     /       \               |      what is y?            |
                  part_1 x     part_2 y      ---------------------------- 
                    $            $    
                  /   \         / \  
                 /     \       /   \     
                /      |       |    \   
               x     id x   pair p   $   
                       |            / \ 
                       x           y   3    

Finally! x, which represents part_2, is on the left side of a function appplication! Now let’s look at what’s on the right side.

                     to_reduce                ----------------------------
                         $                   |     mysteries to solve     |
                       /   \                 | -------------------------- |
                      /     \                | what is part_2 applied to? |current mystery
                     /       \               |      what is y?            |
                  part_1 x     part_2 y      ---------------------------- 
                    $            $    
                  /   \         / \  
                 /     \       /   \     
                /      |       |    \   
      current posid x   pair p   $   
                       |            / \ 
                       x           y   3   

looks like the id function is on the right side - this solves our first mystery. part_2 is applied to id. Let’s adjust what we’ve written down.

                     to_reduce                ----------------------------
                         $                   |     mysteries to solve     |
                       /   \                 | -------------------------- |
                      /     \                |         what is y?         |current mystery
                     /       \                ---------------------------- 
                  part_1 x     part_2 y     
                    $            $    
                  /   \         / \  
                 /     \       /   \     
                /      |       |    \   
      current posid x   pair p   $   
                       |            / \ 
                       x           y   3     

We now finally know what y is, it’s the id function. With this, we can reduce our to_reduce function. Remember, to_reduce involved applying part_1 to part_2. So the first step is to apply part_1 to part_2.

           to_reduce                         to_reduce       
               $                                 $           
             /   \                             /   \         
            /     \                           /     \        
           /       \                         /       \       
      part_1 x    part_2 y                part_2 y     id x  
        $            $           ->         $            |   
      /   \         / \                   /   \          x   
     /     \       /   \                 /     \             
    /      |       |    \               /      |             
   x     id x   pair p   $            pair     $             
           |            / \                   / \            
           x           y   3                 y   3           

In haskell, this would look like this:

id x = x                      
pair p = (p, p)               id x = x                             id x = x               
part_1 x = x id           ->  pair p = (p, p)                  ->  pair p = (p, p)        
part_2 y = pair (y 3)         part_2 y = pair (y 3)                part_2 y = pair (y 3)  
to_reduce = part_1 part_2     to_reduce = (\x -> x id) part_2      to_reduce = part_2 id  
                                -- replace part_1 with its           -- reduce
                                -- definition. 

(I added the extra step of replacing part_1 with its definition that we don’t have to do when we visualize it as a tree)

Then the next step is to reduce part_2!

       to_reduce                     to_reduce  
           $                             $      
         /   \                         /   \    
        /     \                       /     \   
       /       \                     /       \  
    part_2 y   id x               pair p      $ 
      $          |         ->                / \
    /   \        x                        id x  3
   /     \                                  |      
  /      |                                  x   
pair     $                                      
        / \                                     
       y   3                                    

id x = x                   
pair p = (p, p)        ->  id x = x                               id x = x
part_2 y = pair (y 3)      pair p = (p, p)                    ->  pair p = (p, p)
to_reduce = part_2 id      to_reduce = (\y -> pair (y 3)) id      to_reduce = pair (id 3)
                             -- replace part_2 with its
                             -- definition. 

Then we just apply id!

   to_reduce             to_reduce   
       $                     $       
     /   \                 /   \     
    /     \      ->       /     \    
   /       \             /       \   
pair p      $         pair p      3  
           / \                         
        id x  3                      
          |                          
          x                          


id x = x                     
pair p = (p, p)          ->  pair p = (p, p)                   ->  pair p = (p, p)               
to_reduce = pair (id 3)      to_reduce = pair ((\x -> x) 3)        to_reduce = pair 3
                               -- replace id with its
                               -- definition. 

We have reduced the “virtual redex”! (y 3) has been converted to its true self, 3. But the interesting thing is that we’ve reduced the virtual redex in an optimal number of steps. Let’s be more precice. Remember how we were moving around that current position arrow all around the graph earlier? Let’s label all the edges that we visited on the tree.

           to_reduce            
               $                
             /   \              
           c/     \b            
           /       \            
      part_1 x    part_2 y      
        $            $          
      /   \         / \         
    d/     \e      /   \        
    /      |       |    \       
   x     id x  pair p    $      
           |           a/ \     
           x           y   3    

We started at (y 3), so our initial position is the $ connected to the edge labelled a. You can think of a reduction as combining multiple edges.

           to_reduce                         to_reduce                     to_reduce     
               $                                 $                             $         
             /   \                             /   \                         /   \       
           c/     \b                       bcd/     \e                      /     \      
           /       \                         /       \                     /       \     
      part_1 x    part_2 y                part_2 y     id x             pair p      $    
        $            $           ->         $            |       ->           abcde/ \   
      /   \         / \                   /   \          x                      id x  3  
    d/     \e      /   \                 /     \                                  |      
    /      |       |    \               /      |                                  x      
   x     id x   pair p   $            pair     $                                         
           |           a/ \                  a/ \                                        
           x           y   3                 y   3                                       

First we reduce the application of part_1 to part_2, which has the effect of combining the b, c, and d edges, which we’ll call bcd. Then we reduce the application of part_2 to id, which combines the a, bcd, and e edges. This is the absolute minimum we have to do to convert the virtual redex (y 3) to a the redex (id 3).

What we’re really done here is described in a very high level way how to 1) find the shortest path between a mysterious element in a virtual redex and 2) use that path to optimally reduce a virtual redex. So, any reduction algorithm that is optimal must:

  1. Compute the normal form of am expression without reducing any path corresponding to a virtual redex more than once.

  2. Compute the normal form of am expression without reducing any uneccsary path.

This concept of virtual redexes is very important for understanding the abstract algorithm. For more information, see part 2 once it comes out!


Andre Popovitch

Written by Andre Popovitch, a Michigan State student who wants to change the world. You should follow him on Twitter