What's wrong with Lua upvalues?

2018-02-13

Hello, everyone!

While working on a new release of the Effil project I ran into a pretty interesting problem. It concerns the implementation of upvalues in Lua and affects Lua 5.2 and 5.3. The problem is not new, but there is little information about it, so I decided to describe it one more time.

If you are not familiar with Lua upvalues, I recommend reading about them first: https://www.lua.org/pil/27.3.3.html.

Upvalues are how Lua implements closures: a function can capture variables from an enclosing scope. Like this:

local a = 42
function foo(b)
    return a + b
end

The variable a is declared outside the function foo, yet it is still reachable from inside it. This variable is an upvalue of foo, and it is stored in a dedicated slot that the user can access directly:

print(debug.getupvalue(foo, 1)) -- a    42
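If a function has several upvalues, you can walk through them by index until debug.getupvalue returns nothing. Here is a minimal sketch; list_upvalues is a helper name I made up for illustration, not part of Lua or Effil:

```lua
local a = 42
local function foo(b)
    return a + b
end

-- Hypothetical helper: collect a {name, value} record for every upvalue of f.
local function list_upvalues(f)
    local result = {}
    local i = 1
    while true do
        local name, value = debug.getupvalue(f, i)
        if name == nil then break end  -- no upvalue at this index
        result[i] = { name = name, value = value }
        i = i + 1
    end
    return result
end

for i, up in ipairs(list_upvalues(foo)) do
    print(i, up.name, up.value)
end
```

The loop stops on a nil name rather than a nil value, because an upvalue may legitimately hold nil.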

In the case above everything works well: you capture an outer local variable, it is saved among the function’s upvalues, and you can access it. The one exception is when you compile your Lua code without debug symbols. Let’s see what happens in that case.

Compile sample without symbols:

luac -s sample.lua

And execute the compiled output (luac writes it to luac.out by default):

lua luac.out

Using lua5.2 you will get:

    42

And using lua5.3 you will get:

(*no name)    42

In both cases the name of the variable has disappeared. This looks strange from one point of view and logical from another: you stripped the debug info, so the name of the variable no longer exists.
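You can also reproduce this without luac. The sketch below assumes Lua 5.3 or later, where string.dump accepts a second argument that strips debug information (Lua 5.2 ignores it). Note that a function loaded from a dump gets fresh upvalues, so only the names are compared here, not the values:

```lua
-- Assumes Lua 5.3+: the second argument of string.dump strips debug info.
local a = 42
local function foo(b)
    return a + b
end

-- Dump foo without debug info and load the bytecode back as a new function.
local stripped = assert(load(string.dump(foo, true), "=stripped", "b"))

local original_name = debug.getupvalue(foo, 1)
local stripped_name = debug.getupvalue(stripped, 1)

print(original_name)  -- a
print(stripped_name)  -- a placeholder; the exact text depends on the Lua version
```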

When I started to implement upvalue support for Effil, I thought we could live with this feature of Lua. To support upvalues in our multithreading library we need to transmit upvalues between Lua states. Even if a script is compiled without debug info, we still have the value of each variable, its type and its position in the upvalue storage. So we can transmit a function to another state and restore its upvalues there.
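To make the idea concrete, here is a rough sketch that restores upvalues purely by position. It is not Effil’s actual implementation: it stays inside a single Lua state and only imitates the transfer with string.dump and load, and the helper names capture and rebuild are made up for illustration. It assumes Lua 5.2 or later (load with a binary mode argument).

```lua
local a = 42
local function foo(b)
    return a + b
end

-- Hypothetical helper: read upvalue values by index, ignoring their names.
local function capture(f)
    local values, n = {}, 0
    while true do
        local name, value = debug.getupvalue(f, n + 1)
        if name == nil then break end
        n = n + 1
        values[n] = value
    end
    return values, n
end

-- Hypothetical helper: load the bytecode and put the values back by index.
local function rebuild(bytecode, values, n)
    local f = assert(load(bytecode, "=transferred", "b"))
    for i = 1, n do
        debug.setupvalue(f, i, values[i])
    end
    return f
end

local values, n = capture(foo)
local copy = rebuild(string.dump(foo, true), values, n)
print(copy(1))  -- 43
```

The names never take part in the restore step, which is why stripped bytecode is not a problem as long as every upvalue is a plain value.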

But there is one case where this logic does not work. What happens when the outer variable is a global?

a = 42
function foo(b)
    return a + b
end

print(debug.getupvalue(foo, 1)) -- _ENV    table: 0x55b823d0c830

The actual upvalue is the current _ENV. I find this behavior strange, because all we learn is that the function refers to some global variable; we do not even know which one. OK, in Effil we can work around this: we do not transmit _ENV upvalues between states and simply replace such an upvalue with the current _ENV of the receiving thread. This seems a reasonable solution, given that transmitting the _ENV table is not feasible anyway: _ENV is too big, it contains a lot of useless data, and it may contain unsupported data such as userdata. This gives us one restriction on Effil’s upvalues: avoid global variables in a thread’s function (you can still use one if you recreate it inside the thread’s function).
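The substitution can be sketched like this. Again, this is only an illustration within one Lua state, not Effil’s code: transfer and target_env are hypothetical names, and a small table stands in for the global environment of the receiving thread. It assumes Lua 5.2 or later and a function that still has its debug info.

```lua
x_global = 42          -- a global, reached through _ENV
local offset = 1       -- an ordinary local upvalue

local function foo(b)
    return x_global + offset + b
end

-- Hypothetical helper: copy f, replacing its _ENV upvalue with target_env
-- and copying every other upvalue by value.
local function transfer(f, target_env)
    local copy = assert(load(string.dump(f), "=transferred", "b"))
    local i = 1
    while true do
        local name, value = debug.getupvalue(f, i)
        if name == nil then break end
        if name == "_ENV" then
            debug.setupvalue(copy, i, target_env)
        else
            debug.setupvalue(copy, i, value)
        end
        i = i + 1
    end
    return copy
end

-- The "receiving" environment defines its own x_global.
local copy = transfer(foo, { x_global = 100 })
print(foo(1))   -- 44
print(copy(1))  -- 102
```

The copy sees x_global from the new environment, which is exactly why a global has to be recreated inside the thread’s function.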

Now let’s go back. What happens if we compile the script without symbols?

In that case we lose our only way to determine whether an upvalue is _ENV or not, because the names of the upvalues are gone. So there is no difference between an arbitrary table and the _ENV variable:

a = 42
local t = {v=2}

function foo()
    return a + t.v
end

print(debug.getupvalue(foo, 1)) -- (*no name)      table: 0x55c4a198cb80
print(debug.getupvalue(foo, 2)) -- (*no name)      table: 0x55c4a1993680
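The same effect can be shown in a single script. The sketch below assumes Lua 5.3 or later (for the strip argument of string.dump); env_index is a hypothetical helper that searches for _ENV by name, which is the only check available from Lua code:

```lua
-- Assumes Lua 5.3+: the second argument of string.dump strips debug info.
g = 42
local t = { v = 2 }

local function foo()
    return g + t.v
end

-- Hypothetical helper: index of the upvalue named "_ENV", or nil if none.
local function env_index(f)
    local i = 1
    while true do
        local name = debug.getupvalue(f, i)
        if name == nil then return nil end
        if name == "_ENV" then return i end
        i = i + 1
    end
end

local stripped = assert(load(string.dump(foo, true), "=stripped", "b"))

print(env_index(foo))       -- 1
print(env_index(stripped))  -- nil
```

Both functions use the global g, but after stripping nothing tells us which upvalue carries the environment.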

This feature of Lua looks like an architectural bug. We lose very important information: does the function depend on a global variable (and, even better, on which one)? Given that Lua is meant to be an embeddable language, the loss of such information becomes a real problem.

What’s funnier is that Lua had already solved this problem in an earlier version! In Lua 5.1 the function environment is not stored in the list of upvalues; instead there is a dedicated API for it: setfenv and getfenv. So you can control a function’s environment directly. Yes, this approach still does not tell you the name of the global variable, or whether the function uses a global variable at all, but I think that keeping the function environment separate from the other upvalues is a good idea.

P.S.: Effil shifts this problem onto its users, so if you compile your scripts without symbols it is better to avoid global variables in a thread’s function altogether. We have also added a special flag that disables the use of regular Lua tables as upvalues of Effil functions. With it, Effil raises an error when you use a regular table instead of an effil.table as an upvalue.