🡄 Previous

Next 🡆

Contents > Abstract Machines

Function Emulator

For all n-byte functions f, there exists an MC program, E, that emulates f. This is because, regardless of how f is mathematically defined, it can be replaced with a lookup table, a finite—though potentially large—map from n-byte keys to n-byte values. As such, E simply needs to mimic the following pseudocode, when executed at index x.

for each (k,v) in m
  if ax‥ax+n−1 = k0‥kn−1
    ax‥ax+n−1 ← v0‥vn−1
    break

a is the infinite array abstraction from the previous section. ax‥ax+n−1 is the segment of the array the function operates on.

The code iterates over the key-value pairs of the map. For each pair, if the array segment equal the key, then it overwrites the segment with the value, and it breaks out of the loop to prevent the assigned value from matching the key of a successive entry. If it reaches the end of the loop without finding a match, it leaves the array unchanged.

Unfortunately, MC does not support loops, nor conditional branches (if statements), nor unconditional branches (such as the break statement). But the loop can be unrolled (at the expense of lengthy code) and the branches can be simulated with the assistance of temporary variables appended to the end of the array segment:

p ≡ ax+n

q ≡ ax+n+1

r ≡ ax+n+2

The initial values of those elements do not matter. But they must be reserved since E overwrites them.

The code is transformed through the following steps. First, the break statement is replaced with r acting as a flag:

r ← 0
for each (k,v) in m
  if (ax‥ax+n−1 = k0‥kn−1) and not r
    ax‥ax+n−1 ← v0‥vn−1
    r ← 1

The loop no longer exits early, but r ensures the assignment happens at most once.

Next, the if statement is converted to a ternary operation, where the condition is stored in q:

r ← 0
for each (k,v) in m
  q ← (ax‥ax+n−1 = k0‥kn−1) and not r
  ax‥ax+n−1 ← q ? v0‥vn−1 : ax‥ax+n−1
  r ← r or q

Each iteration overwrites the array segment with itself, except when a key matches for the first time.

Line 3 is expanded into lines 3–6 below.

r ← 0
for each (k,v) in m
  q ← not r
  for i ← 0 to n−1
    p ← ax+i = ki
    q ← q and p
  ax‥ax+n−1 ← q ? v0‥vn−1 : ax‥ax+n−1
  r ← r or q

When the inner loop on line 4 is unrolled, every compare-with-constant, p ← ax+i = ki, is replaced with a byte matcher:

MATCH_ki_C ≡ [ A, C ] ↦ [ A, A = ki ]

A is the array element compared with ki. It passes through unchanged. C is the array element that stores the result of the comparison, which in this case is p.

To make this work, 256 implementations of that byte matcher need to be added to the function catalog, one for every possible value of ki. However, the byte matchers operate on a pair of adjacent elements. So, the following version of the code repeatedly swaps the value in ax+i rightward along the array, until it reaches ax+n−1, the element immediately left of p. Once there, the code performs the comparison. Then it does the swaps in reverse order to restore the moved value back to its original location.

r ← 0
for each (k,v) in m
  q ← not r
  for i ← 0 to n−1
    for j ← i to n−2
      [ax+j,ax+j+1] ← [ax+j+1,ax+j]    
    p ← ax+n−1 = ki
    for j ← n−2 down to i
      [ax+j,ax+j+1] ← [ax+j+1,ax+j]    
    q ← q and p
  ax‥ax+n−1 ← q ? v0‥vn−1 : ax‥ax+n−1
  r ← r or q

Line 11 is expanded into lines 11–13 below.

r ← 0
for each (k,v) in m
  q ← not r
  for i ← 0 to n−1
    for j ← i to n−2
      [ax+j,ax+j+1] ← [ax+j+1,ax+j]    
    p ← ax+n−1 = ki
    for j ← n−2 down to i
      [ax+j,ax+j+1] ← [ax+j+1,ax+j]    
    q ← q and p
  for i ← 0 to n−1
    p ← vi
    ax+i ← q ? p : ax+i
  ax‥ax+n−1 ← q ? v0‥vn−1 : ax‥ax+n−1
  r ← r or q

When the inner loop on line 11 is unrolled, every assign-to-constant, p ← vi, is replaced with a constant function:

CONST_vi ≡ A ↦ vi

A is the array element assigned vi, which in this case is p.

As with the byte matchers, 256 implementations need to be added to the function catalog, one for every possible value of vi.

Additionally, every conditional-assignment, ax+i ← q ? p : ax+i, is replaced with a conditional byte copy function:

COPY_B_A_C ≡ [ A, B, C ] ↦ [ C ? B : A, B, C ].

A is the destination element. When the condition is false, it is also the source element. B is the source element when the condition is true. And C is the condition element.

B and C pass through, unchanged. They are p and q, respectively.

COPY_B_A_C already exists in the function catalog. However, it operates on a triplet of consecutive elements. So, the following version duplicates the strategy used for the byte matchers. It moves the value in ax+i into ax+n−1 via repeated swaps. There, it performs the copy. Then it restores the moved value back to its original location.

r ← 0
for each (k,v) in m
  q ← not r
  for i ← 0 to n−1
    for j ← i to n−2
      [ax+j,ax+j+1] ← [ax+j+1,ax+j]    
    p ← ax+n−1 = ki
    for j ← n−2 down to i
      [ax+j,ax+j+1] ← [ax+j+1,ax+j]    
    q ← q and p
  for i ← 0 to n−1
    p ← vi
    for j ← i to n−2
      [ax+j,ax+j+1] ← [ax+j+1,ax+j]    
    ax+n−1 ← q ? p : ax+n−1
    for j ← n−2 down to i
      [ax+j,ax+j+1] ← [ax+j+1,ax+j]    
  r ← r or q

Next, the placeholders, p, q, and r, are filled in with their corresponding array elements, ax+n, ax+n+1, and ax+n+2, respectively:

ax+n+2 ← 0
for each (k,v) in m
  ax+n+1not ax+n+2
  for i ← 0 to n−1
    for j ← i to n−2
      [ax+j,ax+j+1] ← [ax+j+1,ax+j]    
    ax+n ← ax+n−1 = ki
    for j ← n−2 down to i
      [ax+j,ax+j+1] ← [ax+j+1,ax+j]    
    ax+n+1 ← ax+n+1 and ax+n
  for i ← 0 to n−1
    ax+n ← vi
    for j ← i to n−2
      [ax+j,ax+j+1] ← [ax+j+1,ax+j]    
    ax+n−1 ← ax+n+1 ? ax+n : ax+n−1
    for j ← n−2 down to i
      [ax+j,ax+j+1] ← [ax+j+1,ax+j]    
  ax+n+2 ← ax+n+2 or ax+n+1

Then all references to the array are replaced with MC instructions:

CLEAR n+2
for each (k,v) in m
  COPY_NOT_B_A n+1
  for i ← 0 to n−1
    for j ← i to n−2
      SWAP j    
    MATCH_ki_C n−1
    for j ← n−2 down to i
      SWAP j
    AND_AB_AF n
  for i ← 0 to n−1
    CONST_vi n
    for j ← i to n−2
      SWAP j
    COPY_B_A_C n−1
    for j ← n−2 down to i
      SWAP j
  OR_AB_AF n+1

The code needs two variations of existing functions:

COPY_NOT_B_A ≡ [ D, E ] ↦ [ not E, E ]

OR_AB_AF ≡ [ A, B ] ↦ [ A, A | B ]

The final step is to unroll the loops. As that happens, the (k,v) pairs get encoded into sequences of MATCH_ki_C and CONST_vi instructions. However, since m contains up to 256n entries, the unroll may produce an extremely long program.

Clearly, this technique is not practical or efficient. Nonetheless, it demonstrates an MC program can emulate an arbitrary n-byte function, at least in principle.

🡄 Previous

Next 🡆