If Lua raises an error, we never link the new_script into the list, so we
can't find it later to remove it and free its components. The components
do ultimately get freed by the destruction of the heap, and there's no way
to access them after that, but we'd rather the list be empty and know
nothing is dangling.
This should be replaced by storing the components as Lua objects and
letting Lua's GC take care of them. That way the code could be made much
less fragile.
Also removes a chance of a leak in run_next_script.
Nominally cannot be called as we do not call error-raising Lua functions
outside of protected mode. In the event a panic happens somehow anyway,
Lua calls `lua_abort` which we already handle safely.
Lua calls our registered panic function if a Lua error is raised outside
of any protected mode set up by `lua_pcall` and friends. As scripts
themselves are run in protected mode, such an error will only originate
from our engine code.
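For reference, a minimal sketch of how a panic handler hooks in (the handler name here is illustrative, not the engine's actual function):

    #include <lua.hpp>

    // Sketch only: Lua invokes the registered panic function when an error
    // is raised outside of any protected call. If the handler returns, Lua
    // aborts the process, so a real handler must report and bail out itself.
    static int scripting_panic_handler(lua_State *L) {
        // the error object is on top of the stack; it may not be a string
        const char *msg = lua_tostring(L, -1);
        (void)msg;  // a real handler would report this somewhere useful
        return 0;   // returning here lets Lua call abort()
    }

    static void register_panic_handler(lua_State *L) {
        lua_atpanic(L, scripting_panic_handler);
    }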
The current code handles panics, but does so incorrectly by neglecting
to free the error message buffer during a panic exit. As the underlying
heap is eventually destroyed and recreated, that static buffer pointer
ends up pointing into a heap that no longer exists. If the new state
raises an error, it will try to free that pointer, causing heap
corruption and sadness.
Fix this issue and avoid similar ones by refactoring to run the main
part of the scripting engine in Lua protected mode. This creates exactly
one exit path from the scripting engine and avoids the need for fragile
panic handling infrastructure that duplicates what is built into Lua.
Quoth the Lua manual, "The panic function, as its name implies, is a
mechanism of last resort. Programs should avoid it."
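The shape of the refactor is roughly the following sketch; the function names are illustrative rather than the engine's actual ones:

    #include <lua.hpp>

    // The whole engine body runs as a C function invoked through lua_pcall,
    // so any Lua error raised inside (including memory errors) unwinds back
    // to this single call site instead of reaching the panic handler.
    static int run_engine_body(lua_State *L) {
        // load scripts, run them, report statistics, ...
        return 0;
    }

    static bool run_protected(lua_State *L) {
        lua_pushcfunction(L, run_engine_body);
        if (lua_pcall(L, 0, 0, 0) != LUA_OK) {
            // the one place the error object is reported and discarded
            lua_pop(L, 1);
            return false;
        }
        return true;
    }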
This is unlikely to be triggerable in flight, as the main engine loop
does not appear to use the Lua API in a way that can raise errors, but
the possibility can't be fully excluded. It is, however, possible to
trigger it beforehand using a perfectly wrong heap size that causes a
memory error during initialization.
Note that because the error message buffer is now freed properly, an
error message originating from the engine will not cause a pre-arm
failure. This could be improved in the future.
Now that startup RAM usage is not dependent on the number of bindings,
there is little need to specifically measure that phase. This combines
~700 bytes used by the string setup into the initial state usage. But
that number should very rarely change.
The `luaL_testudata` and `luaL_checkudata` functions correctly treat the
absence of a metatable for the type to check as the type not matching,
so there is no need to have the type's metatable in memory until an
object using it exists. Therefore we can defer the metatable's creation
until the first time an object of that type is created.
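A minimal sketch of the deferred creation, assuming a hypothetical helper rather than the generated binding code:

    #include <lua.hpp>

    // Create the type's metatable on demand, the first time a userdata of
    // that type is constructed. luaL_newmetatable() only creates and
    // registers the table if one with that name doesn't already exist, and
    // leaves it on the stack either way.
    static void *new_typed_userdata(lua_State *L, const char *tname, size_t size) {
        void *ud = lua_newuserdata(L, size);
        if (luaL_newmetatable(L, tname)) {
            // freshly created: a real implementation would also populate
            // __index and the type's methods here
        }
        lua_setmetatable(L, -2);  // attach the metatable to the new userdata
        return ud;
    }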
Saves some 4500 bytes of Lua heap RAM, at least if no such objects are
created. The created metatables are shared between all scripts, so there
is no increase in the total RAM usage if all objects are used. However,
there is some increased risk of heap fragmentation and unexpected
out-of-memory errors on rarely used code paths.
It would be nice to construct the object in the new utility function,
but it's not possible in C++ to take the address of a constructor.
Passing in a lambda instead is ugly, hardly any more efficient, and
screws up constructors which take parameters.
Also optimizes by using the generated userdata creators where possible,
and convincing the compiler to omit unnecessary checks.
This corrupts other MAVLink messages going out at the same time,
particularly in CI, causing tests to fail because scripting restart and
script initial startup messages are destroyed.
Instead use a MAVLink debug statustext, but only send it if the runtime
measurement debug bit is on to avoid an obvious change in behavior and
unnecessary appearance of a new message.
Various places print "error objects", which are nominally things passed
to `error()`. However, the user can pass anything to that function, e.g.
`error(nil)`. This causes `lua_tostring` to return `NULL`, and if that
happens at the top level the string format dereferences that `NULL`
pointer.
Fix the issue by using a helper to convert error objects to strings that
returns a sentinel string if the conversion fails.
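A sketch of such a helper, with an illustrative name and sentinel text:

    #include <lua.hpp>

    // Convert whatever error object is on top of the stack into something
    // printable, falling back to a sentinel when it isn't convertible.
    static const char *error_object_to_string(lua_State *L) {
        const char *msg = lua_tostring(L, -1);  // NULL for e.g. error(nil) or error({})
        return (msg != nullptr) ? msg : "<error object is not a string>";
    }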
It is possible to build for boards without storage (so no POSIX, no FatFs), but still have scripts in ROMFS.
In this case we will use the AP_Filesystem_backend base class when doing file operations. This will always fail to open directories, so when we try to load scripts from SCRIPTS_DIRECTORY it will always fail.
This leads to a warning being emitted:
Lua: State memory usage: 2796 + 5227
AP: Lua: open directory (./scripts) failed
AP: hello, world
Time has wrapped
Which isn't great.
Detect that we are working on this filesystem and don't warn.
In Lua, strings are the only type that comes with a default metatable.
The metatable must be shared by all string objects, and it is set to be
the `string` library table each time that library is opened. In
ArduPilot's scripting engine the library is opened fresh for each
script, so the last script to load ends up with access to the string
metatable, since its `string` library table will have been set as the
metatable.
Therefore, if two scripts are loaded, A first and B second, and script B
executes e.g. `string.byte = "haha"`, then `string.byte()` and
`s:byte()` for script B are broken. Because the metatable is shared,
this also breaks `s:byte()` for script A, which violates the integrity
of the sandbox.
Fix the issue by disabling the metatable setup functionality when the
string library is opened, then manually opening an additional copy of the
library (which won't be given to any script) and setting it as the
string metatable during initialization.
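The idea looks roughly like this sketch, assuming the stock luaopen_string has been modified to skip installing itself as the string metatable:

    #include <lua.hpp>

    static void install_string_metatable(lua_State *L) {
        lua_createtable(L, 0, 1);          // the shared string metatable
        lua_pushcfunction(L, luaopen_string);
        lua_call(L, 0, 1);                 // a private copy of the string library
        lua_setfield(L, -2, "__index");    // metatable.__index = private copy
        lua_pushliteral(L, "");            // any string value will do
        lua_pushvalue(L, -2);
        lua_setmetatable(L, -2);           // sets the metatable for all strings
        lua_pop(L, 2);                     // drop the dummy string and metatable
    }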
This will break any script that modifies the string metatable for
constructive purposes, but such a script would already have been broken
whenever it wasn't the only script running.
Referencing the original function to run is of questionable value, and
the only user uses it to grab the script environment from the upvalues.
Instead, use a reference to the script environment table directly.
Some bits of the code in the require machinery use the `lua_ref` to
access the script environment. However, this can change after the script
is rescheduled and it returns an arbitrary function to run next.
Resolve this by introducing `run_ref` which is specifically a reference
to the function to run next. `lua_ref` is preserved for the script
lifetime.
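Roughly, with an assumed per-script structure for illustration:

    #include <lua.hpp>

    struct script_info {          // assumed shape, for illustration only
        int lua_ref = LUA_NOREF;  // the script's environment table, held for its lifetime
        int run_ref = LUA_NOREF;  // the function to run next, replaced every cycle
    };

    // called once at load time with the environment table on top of the stack
    static void store_environment(lua_State *L, script_info &script) {
        script.lua_ref = luaL_ref(L, LUA_REGISTRYINDEX);
    }

    // called each cycle with the function the script returned on top of the stack
    static void store_next_run(lua_State *L, script_info &script) {
        luaL_unref(L, LUA_REGISTRYINDEX, script.run_ref);  // release the old reference
        script.run_ref = luaL_ref(L, LUA_REGISTRYINDEX);
    }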
On macOS, sometimes ._script.lua is created to store metadata when the
user copies script.lua over to their SD card. Previously, the scripting
engine would barf since the file is not Lua. Now, these files are
ignored.
Also avoids a case where a hidden and valid script might be loaded
without the user's knowledge.
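The filter amounts to something like the following sketch (helper name is illustrative):

    #include <string.h>

    // Skip hidden files, including macOS "._name" metadata files, and only
    // accept a ".lua" extension.
    static bool should_load_script(const char *name) {
        if (name == nullptr || name[0] == '.') {
            return false;   // covers "._script.lua" and other hidden files
        }
        const char *ext = strrchr(name, '.');
        return ext != nullptr && strcmp(ext, ".lua") == 0;
    }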
AP_Logger.h is a nexus of includes; while this is being improved over
time, there's no reason for the library headers to include AP_Logger.h,
as the logger itself is accessed via singleton and the structures are in
LogStructure.h.
This necessitated moving the PID_Info structure out of AP_Logger's
namespace. This cleans up a pretty nasty bit - that structure is
definitely not simply used for logging, but is also used to pass PID
information around to controllers!
There are a lot of patches in here because AP_Logger.h, acting as a
nexus, was providing transitive header file inclusion in many (some
unlikely!) places.
Reschedule scripts from the start of the update.
This allows specifying a return value like "return update, 10" to run
at a near perfect 100Hz, whereas before the script would be run 10 ms
after it had completed its loop. That delay can be highly variable as
the script experiences interrupts from the system, and it leaves the
script author responsible for calculating the desired update rate at the
end. This was always intended to be fixed, but I pushed it back during
the initial development. However, people are beginning to run scripts
that have enough processing, or are rate sensitive enough, that we now
need to correct this, or scripts will have to do their best to guess
the time, which will be inferior to us providing it.
As a note, if you exceed the expected time we will reschedule the
script immediately; it will thus have a schedule time in the past and
will be slotted in accordingly. This can't indefinitely starve other
scripts, as they will still be slotted in, but if you request an update
in 1 ms and took 100 ms to run, we will simply slide you back into the
queue 1 ms after the time you started running.
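The rule boils down to something like this sketch (names are illustrative):

    #include <stdint.h>

    // The next run time is measured from when the script *started* running,
    // so an overrunning script ends up with a time in the past and is simply
    // slotted back into the queue among the other scripts.
    static uint64_t next_run_time_us(uint64_t start_of_run_us, uint64_t requested_delay_ms) {
        return start_of_run_us + requested_delay_ms * 1000ULL;
    }

For example, requesting 1 ms but running for 100 ms yields a next run time roughly 99 ms in the past.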