When receiving a shutdown signal during startup, the Supervisor should
cancel its startup task to ensure a graceful shutdown. This prevents the
Supervisor from accidentally accessing the event loop after it has been
closed by the stop procedure, which otherwise fails with:
RuntimeError: Event loop stopped before Future completed.
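The cancellation pattern described above can be sketched with plain asyncio. This is a minimal illustration, not the Supervisor's actual code: `startup()`, `shutdown()` and `main()` are hypothetical names, and a real implementation would call the shutdown path from a signal handler.

```python
import asyncio


async def startup() -> None:
    """Hypothetical stand-in for a long-running startup routine."""
    await asyncio.sleep(3600)


async def shutdown(startup_task: asyncio.Task) -> str:
    """Cancel startup if it is still running, then wait for it to unwind."""
    if not startup_task.done():
        startup_task.cancel()
    try:
        # Awaiting the task ensures it no longer touches the event loop
        # once the stop procedure continues.
        await startup_task
    except asyncio.CancelledError:
        return "startup cancelled"
    return "startup finished"


async def main() -> str:
    startup_task = asyncio.create_task(startup())
    await asyncio.sleep(0)  # give the startup task a chance to begin
    # In a real service this would be triggered by a shutdown signal.
    return await shutdown(startup_task)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Awaiting the cancelled task before stopping the loop is the important part: it guarantees the startup coroutine has finished unwinding before the loop goes away.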
* Allow arbitrarily nested addon config schemas
* Disallow lists directly nested in another list in addon schema
* Handle arbitrarily nested addon schemas in UiOptions class
* Handle arbitrarily nested addon schemas in AddonOptions class
* Add tests for addon config schemas
* Add tests for addon option validation
* Add endpoint for complete logs of the latest container startup
Add an endpoint that returns the complete logs of the latest startup of a
container, which can be used for downloading Core logs in the frontend.
The realtime filtering header is used for the Journal API, and the
StartedAt parameter from the Docker API is used as the reference point.
This means that any other Range header is ignored for this parameter, yet
the "lines" query argument can be used to limit the number of lines. By
default, an "infinite" number of lines is returned.
Closes #6147
* Implement fallback for latest logs for OS older than 16.0
Implement a fallback that uses the internal CONTAINER_LOG_EPOCH metadata
added to logs created by the Docker logger. Still prefer the time-based
method, as it has lower overhead and uses public APIs.
* Address review comments
* Only use CONTAINER_LOG_EPOCH for latest logs
As pointed out in the review comments, we might not be able to get the
StartedAt for add-ons that are not running. Thus we need to use the only
reliable mechanism available now, which is the container log epoch.
* Remove dead code for 'Range: realtime' header handling
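The epoch-based selection of the latest logs can be illustrated roughly as follows. This is a sketch under stated assumptions, not the Supervisor's implementation: it assumes journal entries are available as dictionaries in chronological order, that every container start gets its own CONTAINER_LOG_EPOCH value, and the helper name `latest_startup_entries` is hypothetical.

```python
def latest_startup_entries(entries: list[dict[str, str]]) -> list[dict[str, str]]:
    """Return the entries that share the epoch of the newest log entry."""
    epochs = [e["CONTAINER_LOG_EPOCH"] for e in entries if "CONTAINER_LOG_EPOCH" in e]
    if not epochs:
        # No entry carries the metadata (assumed to be possible on older setups).
        return []
    latest_epoch = epochs[-1]
    return [e for e in entries if e.get("CONTAINER_LOG_EPOCH") == latest_epoch]


if __name__ == "__main__":
    journal = [
        {"MESSAGE": "old start", "CONTAINER_LOG_EPOCH": "e1"},
        {"MESSAGE": "new start", "CONTAINER_LOG_EPOCH": "e2"},
        {"MESSAGE": "running", "CONTAINER_LOG_EPOCH": "e2"},
    ]
    for entry in latest_startup_entries(journal):
        print(entry["MESSAGE"])
```

Unlike the StartedAt approach, this does not need the container to be running, since the reference point comes from the logs themselves.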
* Bump minimal Docker to 23.0.0
Home Assistant OS 10.0 updated to Docker 23.0.3, so let's make this
Docker version the minimum we support. This will allow us to use
zstd compression for layers (see https://github.com/home-assistant/builder/pull/245).
* Bump minimal Docker version to 24.0.0
* Write cidfiles of Docker containers and mount them individually to /run/cid
There is no standard way to get the container ID in the container
itself, which can be needed for instance for #6006. The usual pattern is
to use the --cidfile argument of the Docker CLI and mount the generated
file to the container. However, this is a feature of the Docker CLI and we
can't use it when creating the containers via the API. To get the
container ID to implement native logging in e.g. Core as well, we need
the help of the Supervisor.
This change implements a similar feature fully in Supervisor's DockerAPI
class that orchestrates the lifetime of all containers managed by
Supervisor. The files are created in the SUPERVISOR_DATA directory, as
they need to be persisted between reboots, just as the instances of
Docker containers are.
Supervisor's cidfile must be created when starting the Supervisor
itself; for that, see home-assistant/operating-system#4276.
* Address review comments, fix mounting of the cidfile
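A rough sketch of the cidfile idea, using only the standard library. The helper names, the `cid_files` subdirectory and the `.cid` suffix are assumptions made for illustration and are not taken from the change; the mount dictionary follows the shape of a Docker Engine API mount object. The file is created empty first because it has to exist before it can be bind-mounted, and the ID is only known once the container has been created.

```python
import tempfile
from pathlib import Path


def prepare_cidfile(data_dir: Path, name: str) -> Path:
    """Create an empty per-container cidfile under the data directory."""
    cid_dir = data_dir / "cid_files"  # hypothetical location
    cid_dir.mkdir(parents=True, exist_ok=True)
    cidfile = cid_dir / f"{name}.cid"
    cidfile.touch()
    return cidfile


def cidfile_mount(cidfile: Path) -> dict[str, object]:
    """Describe a read-only bind mount of one cidfile to /run/cid."""
    return {
        "Type": "bind",
        "Source": str(cidfile),
        "Target": "/run/cid",
        "ReadOnly": True,
    }


def write_container_id(cidfile: Path, container_id: str) -> None:
    """Store the ID once the container has been created."""
    cidfile.write_text(container_id, encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = prepare_cidfile(Path(tmp), "homeassistant")
        print(cidfile_mount(path))
        write_container_id(path, "0123456789ab")
        print(path.read_text(encoding="utf-8"))
```

Mounting each file individually, rather than the whole directory, keeps a container from seeing the IDs of other containers.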
The constant PATH_CLOUD_BACKUP is not used anywhere in the codebase.
Remove it to clean up the code. This is a leftover from a removed
initial cloud backup support implementation and got missed in #5464.
When timedatectl is not available (e.g. in minimal devcontainers),
the code currently fails to set up due to the missing timedate service
on D-Bus. This change makes the code more robust by only checking
for the presence of the service if we are actually going to use it.
* Avoid duplicate evaluate_system() calls during resolution manager setup
During resolution manager initialization, both the initial healthcheck call
and the subsequent setup() call would trigger evaluate_system(), causing
redundant system evaluation. Since all following calls in healthcheck()
are already suppressed during the setup stage, we can optimize this by
calling check_system() directly during load() instead of the full
healthcheck().
This reduces unnecessary processing during Supervisor startup while
maintaining the same functional behavior.
* Call full healthcheck on setup and move diagnostics to core start
The condition of the OS Agent diagnostics if statement already accesses
OS Agent through D-Bus. This makes the exception handling inside the if
statement not really useful.
Move OS Agent diagnostics setting to core start so we can leverage
the existing global Exception handling in start() instead of
having to add another try/except block in setup(). It also covers the
if statement itself.
* Store and persist OS upgrade map to fix update path evaluation
The existing logic calculated OS upgrade paths inline during fetch_data,
so they were not reevaluated when the current OS is unsupported
(JobCondition.OS_SUPPORTED). E.g. after updating from 11.4 to 11.5, the
system wouldn't offer the next available update (15.2) because the
upgrade path calculation relied on fresh data from the blocked fetch
operation.
Changes:
- Add ATTR_HASSOS_UPGRADE constant and schema validation
- Store hassos-upgrade map from version JSON in updater data
- Refactor version_hassos property to use stored upgrade map instead of
inline calculation during fetch_data
- Maintain upgrade path logic: upgrade within major version first, then
jump to next major version when at the latest in current major
- Add type safety checks for version.major access
This ensures upgrade paths work correctly even when update data refresh
is blocked due to unsupported OS versions, fixing the scenario where
HAOS 11.5 wouldn't show 15.2 as the next available update.
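The upgrade path rule listed above (stay within the current major version first, then jump to the next one) can be sketched against a stored map like this. It is a simplified illustration: the function names, the plain `major.minor` parsing and the shape of the map (major version as string key, last release in that major as value) are assumptions, and the real code works with richer version objects.

```python
def parse(version: str) -> tuple[int, ...]:
    """Parse a plain dotted version such as '11.5' (no dev/rc suffixes)."""
    return tuple(int(part) for part in version.split("."))


def next_os_version(current: str, upgrade_map: dict[str, str], latest: str) -> str:
    """Pick the next OS version to offer, based on a stored upgrade map."""
    major = parse(current)[0]
    last_in_major = upgrade_map.get(str(major))
    if last_in_major is not None:
        # First finish the current major version.
        if parse(current) < parse(last_in_major):
            return last_in_major
        # Already at the end of this major: step to the next one if mapped.
        last_in_next_major = upgrade_map.get(str(major + 1))
        if last_in_next_major is not None:
            return last_in_next_major
    # No applicable restriction: offer the latest version.
    return latest


if __name__ == "__main__":
    stored_map = {"11": "11.5"}  # example data, not the real map
    print(next_os_version("11.4", stored_map, "15.2"))
    print(next_os_version("11.5", stored_map, "15.2"))
```

Because the function only depends on the stored map and the current version, it gives the same answer whether or not a fresh fetch was possible, which is the point of persisting the map.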
* Update supervisor/updater.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Address mypy issue
* Fix pytest
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Add availability API for addons
* Add cast back and test for latest version of installed addon
* Make error responses more translation/client library friendly
* Add test cases for install/update APIs
Running tests in the UTC+2 timezone makes some of the tests fail because
the mocked time in the future is actually in the past, as UTC is used as
the new reference point. Adjust the tests to also mock the time when the
first execution of the function happens.
Instances where the second execution happened "immediately" were mocked
to happen 1 ms later. The 1 ms delta must also be added when mocking
time 1 h in the future, otherwise the call will be throttled too.
* Add background option to update/install APIs
* Refactor to use common background_task utility in backups too
* Use a validation_complete event rather than looking for bus events
Under certain (timing) conditions ConnectionResetError can be raised
when the client closes the connection while we are still writing to it.
Make sure to handle the appropriate exceptions to avoid flooding the
logs with stack traces.
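A minimal sketch of the idea, assuming a generic async `write` callable rather than any specific web framework API. The function names are hypothetical, and only ConnectionResetError is shown here; which other exceptions count as "appropriate" is not stated in this message.

```python
import asyncio
import logging
from collections.abc import Awaitable, Callable, Iterable

_LOGGER = logging.getLogger(__name__)


async def stream_to_client(
    write: Callable[[bytes], Awaitable[None]], chunks: Iterable[bytes]
) -> bool:
    """Write chunks to a client; return False if the client went away."""
    try:
        for chunk in chunks:
            await write(chunk)
    except ConnectionResetError as err:
        # An expected condition: log one short line instead of a stack trace.
        _LOGGER.debug("Client closed the connection while streaming: %s", err)
        return False
    return True


async def closed_connection(chunk: bytes) -> None:
    """Hypothetical writer that simulates a client disconnecting mid-write."""
    raise ConnectionResetError("Cannot write to closing transport")


if __name__ == "__main__":
    print(asyncio.run(stream_to_client(closed_connection, [b"log line\n"])))
```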
* Handle missing type attribute in add-on map config
Handle missing type attribute in the add-on `map` configuration key.
* Make sure wrong volumes are cleared in any case
Also add a warning when a string mapping is rejected.
* Add unit tests
* Improve test coverage