validation: dedicated venv + fix python_bin override timing

eval_proc was started before -d/GUI defines reached gd, so
``-d python_bin=...`` and the GUI ``python_bin`` preference were
silently ignored by the very subprocess that runs ``<| ... |>`` evals
(and only took effect for later items once the discovery cache had
already been seeded with the system interpreter). apply_overrides() is
now applied before eval_process_init(), and bins._resolve()'s cache is
keyed by (name, override) so a later param.yaml change re-resolves on
the next lookup.

The validation suite now ships a wrapper (run.sh / run.bat) that
creates a dedicated venv in the system temp dir and pins it via
``-d python_bin=...``. A new ``venv`` item asserts the override took
effect for both eval_proc and py_func paths, with a
``sys.prefix != sys.base_prefix`` marker to catch the case where the
override happens to be a system interpreter (path-equality alone would
miss it, the venv's ``bin/python3`` being a symlink to the host).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
2026-05-19 08:19:57 +02:00
parent 6f832cd67b
commit 4d8cafb5a0
10 changed files with 321 additions and 17 deletions

View File

@@ -114,11 +114,20 @@ To add a new API call usable from subprocesses:
### External interpreter resolution (`bins.py`)
`src/testium/interpreter/utils/bins.py` — single source of truth for the paths to the external Python and Lua interpreters used by subprocesses.
- `python_bin()` / `lua_bin()` : resolve once, cache in memory. User can override via the `python_bin` / `lua_bin` global dict keys (typically populated from the YAML config). Falls back to discovery on PATH (candidates: `python3`/`python` and `lua`/`lua5.5`/`lua5.4`/`lua5.3`/`lua5.2`/`lua5.1`).
- `python_bin()` / `lua_bin()` : resolve and cache. The cache is keyed by `(name, override)` so that a later change to `gd[python_bin]` (typically when a `param.yaml` sets the key) triggers a re-resolution on the next lookup instead of returning the stale auto-discovered path. Falls back to discovery on PATH (candidates: `python3`/`python` and `lua`/`lua5.5`/`lua5.4`/`lua5.3`/`lua5.2`/`lua5.1`).
- `ensure(*names)` : called by `TestSet._validate_runtime_deps()` at test load. Always requires `python` (the eval engine always runs); requires `lua` only if a `lua_func` item is in the tree. Fails fast with a clear error citing tried candidates and override key.
Engines (`PyProcessBase`, `LuaProcessBase`, `EvalExecEngine`) call `bins.python_bin()`/`bins.lua_bin()` themselves — call sites never pass an explicit binary path.
#### Override-timing contract (`apply_overrides`)
`bins.python_bin()` is called for the **first** time inside `eval_process_init()` (the long-lived inline-`<| … |>` subprocess), which happens **before** the YAML param files are loaded. To make `-d python_bin=…` and the GUI `python_bin` preference take effect for `eval_proc` itself, `process.py:run()` applies them to gd **before** `eval_process_init()` via the `apply_overrides()` helper extracted from `update_global()`. The post-load `update_global()` call then re-applies the same overrides (after `prepare_global()` clears gd), keeping the gd value in sync with the cached resolution.
| Override source | `eval_proc` | `py_func` / `cycle` / `post_exec` |
|---|---|---|
| `-d python_bin=…` (CLI) | ✅ | ✅ |
| GUI `python_bin` preference | ✅ | ✅ |
| `python_bin: …` in `param.yaml` | ❌ (eval_proc already started) | ✅ (cache re-resolves on key change) |
## Key files
| Path | Role |
@@ -261,12 +270,18 @@ Both Flatpak and AppImage export `TESTIUM_VERSION` from a launcher (Flatpak: lau
- `unittest` item: renamed from `unittest_file`.
- GUI test tree: check and fold state preserved across same-file reloads.
- Licence: EUPL-1.2.
- Interpreter override timing: `apply_overrides()` extracted from `update_global()` and called by `process.py:run()` before `eval_process_init()`, so `-d python_bin=…` / GUI prefs reach `bins.python_bin()` on its first lookup. `bins._resolve()` cache is now keyed by `(name, override)` so later `param.yaml` changes are picked up by subsequently constructed engines.
## Validation tests
Located in `test/validation/`. Run with `-b` flag:
Located in `test/validation/`. Two entry points:
```
./run.sh -b -- test/validation/main.tum
./test/validation/run.sh # wrapper — uses a dedicated venv (see below)
./run.sh -b -- test/validation/main.tum # direct — testium's own python is used for test execution
```
The `run.sh` / `run.bat` wrappers create a dedicated Python venv at `${TMPDIR:-/tmp}/testium-validation-venv` (Linux) or `%TEMP%\testium-validation-venv` (Windows), with `--system-site-packages` + `pip install junit-xml`, and run the suite with `-d python_bin=…` so every test-execution subprocess (eval_proc, py_func, cycle, post_exec) runs inside the venv. testium itself keeps running in the project's own environment. `clean` as the first argument recreates the venv.
The `venv` item (`test/validation/items/venv/`) asserts that the override actually took effect: `python_bin` is set, `sys.executable` matches it, `sys.prefix == dirname(dirname(python_bin))`, and `sys.prefix != sys.base_prefix` (the last marker catches the case where `python_bin` happens to be a system interpreter, which path-equality alone would miss because the venv's `bin/python3` is a symlink to the host). Both `eval_proc` (inline `<| … |>`) and `py_func` paths are exercised.
Parallel item tests: `test/validation/items/parallel/test.tum`
## Dependencies

View File

@@ -16,6 +16,7 @@ from interpreter.utils.test_init import (
env_init,
prepare_global,
update_global,
apply_overrides,
set_standard_gd_keys,
test_run_init,
test_run_header,
@@ -210,6 +211,19 @@ class TestProcess(Process):
env_init()
# Apply GUI defaults and CLI defines to the global dict
# *before* eval_proc starts: bins.python_bin() reads
# ``python_bin`` from gd on its very first call (during
# eval_process_init) and caches the result. Without this,
# ``-d python_bin=...`` and the GUI ``python_bin`` preference
# would only take effect for items spawned *after* the cache
# was already populated with the auto-discovered interpreter,
# i.e. they would silently be ignored for eval_proc itself.
# _load_initial_params re-applies the same overrides after
# ``prepare_global()`` clears gd, so the gd value stays in
# sync with the cached path.
apply_overrides(self.__defs, self.__gui_defaults)
# Creation of the python evaluation process for loading of the complete test
eval_proc = eval_process_init(api_request, 10, test_dir)
eval_proc.start()

View File

@@ -202,16 +202,24 @@ _SPECS = {
"lua": ("Lua 5.1+", "lua_bin", _LUA_CANDIDATES, _is_lua51),
}
# Cached per (name, override) so that runtime changes to gd[gd_key] —
# e.g. ``python_bin`` set from a YAML config file loaded *after*
# eval_proc has already resolved its own interpreter — are picked up by
# the next lookup instead of returning the stale, auto-discovered path.
# Long-lived subprocesses (eval_proc) keep whatever they captured at
# construction time, but every new PyProcessBase / FuncExecEngine spawned
# afterwards sees the current override.
_resolved = {}
def _resolve(name):
if name in _resolved:
return _resolved[name]
display, gd_key, candidates, validator = _SPECS[name]
override = tm.gd(gd_key, "") or ""
cached = _resolved.get(name)
if cached is not None and cached[0] == override:
return cached[1]
path = ""
if override:
# Absolute path: accept as-is (user knows exactly what they want).
@@ -239,7 +247,7 @@ def _resolve(name):
path = p
break
_resolved[name] = path
_resolved[name] = (override, path)
return path

View File

@@ -165,11 +165,14 @@ def env_init():
_constants_init()
def update_global(config_files, defines, gui_defaults, silent=False):
"""Global dict updated with the content of the config file and a dict provided.
this function returns the resulting dict.
def apply_overrides(defines, gui_defaults):
"""Push GUI defaults then CLI defines into the global dict.
Extracted from update_global so it can be called *before* eval_proc
starts: interpreter overrides (python_bin, lua_bin) must be visible
to bins.python_bin() on its first lookup, which happens during
eval_process_init.
"""
# GUI preferences applied first
for k, v in gui_defaults.items():
try:
val = ast.literal_eval(v)
@@ -177,7 +180,6 @@ def update_global(config_files, defines, gui_defaults, silent=False):
val = v
tm.setgd(k, val)
# Then command line defines
for k, v in defines.items():
try:
val = ast.literal_eval(v)
@@ -185,6 +187,14 @@ def update_global(config_files, defines, gui_defaults, silent=False):
val = v
tm.setgd(k, val)
def update_global(config_files, defines, gui_defaults, silent=False):
"""Global dict updated with the content of the config file and a dict provided.
this function returns the resulting dict.
"""
# GUI preferences applied first, then command line defines
apply_overrides(defines, gui_defaults)
# Then the configuration files
# load global dic before test item
_feed_gd_with_params(config_files, silent)

View File

@@ -1,10 +1,43 @@
# Validation
This directory contains the necessary material to run the testium validation.
This directory contains the testium validation suite.
Here is the documentation on how to configure the validation, run it and check that the
results are correct.
## Running the suite
# Tests
```sh
./test/validation/run.sh # Linux
test\validation\run.bat # Windows
```
TBD
The wrapper creates a dedicated Python venv in the system temp dir
(`${TMPDIR:-/tmp}/testium-validation-venv` on Linux, `%TEMP%\testium-validation-venv`
on Windows), using `--system-site-packages` so existing system packages
stay visible. The validation suite is then run with that venv pinned as
`python_bin`. Every test-execution subprocess (inline `<| ... |>`
evaluation, `py_func`, `cycle`, `post_execution`, ...) runs inside the
venv, while testium itself keeps running in the project's own
environment.
Pass `clean` as the first argument to recreate the venv from scratch
(useful after a system Python upgrade):
```sh
./test/validation/run.sh clean
```
## What is checked
The `venv` item under `items/venv/` asserts that the venv is actually
being used:
* `python_bin` is set in the global dict.
* The eval subprocess (used for `<| ... |>` expressions) has
`sys.executable == python_bin`, `sys.prefix == dirname(dirname(python_bin))`,
and `sys.prefix != sys.base_prefix` (i.e. is actually inside a venv).
* A `py_func` subprocess passes the same three checks.
These checks use `abspath`/`normpath` rather than `realpath` on
purpose: the venv's `bin/python3` is a symlink to the host interpreter,
so `realpath` would map both venv and non-venv interpreters to the same
target. `sys.prefix != sys.base_prefix` is the venv-specific marker
that distinguishes the two cases.

View File

@@ -0,0 +1 @@
no_param: Null

View File

@@ -0,0 +1,53 @@
# venv test: assert that the dedicated validation venv is the python
# being used for every test-execution subprocess (eval_proc / py_func /
# cycle / ...). The venv path is pinned by ``-d python_bin=...`` in
# test/validation/run.sh (or run.bat).
#
# We use ``abspath``/``normpath`` rather than ``realpath`` on purpose:
# the venv's ``bin/python3`` is a symlink to the host python, so
# realpath would map every venv interpreter to the same system path and
# the comparison would silently pass even *without* a venv.
# ``sys.prefix != sys.base_prefix`` is the venv-specific marker that
# catches that case.
- check:
name: python_bin is set in the global dict
key: $(test)_PASS
values:
- <| bool(r"$(python_bin)") |>
- check:
name: eval_proc subprocess runs in the validation venv
key: $(test)_PASS
values:
- <| os.path.normpath(os.path.abspath(sys.executable)) == os.path.normpath(os.path.abspath(r"$(python_bin)")) |>
- check:
name: eval_proc sys.prefix matches python_bin venv root
key: $(test)_PASS
values:
- <| os.path.normpath(os.path.abspath(sys.prefix)) == os.path.dirname(os.path.dirname(os.path.normpath(os.path.abspath(r"$(python_bin)")))) |>
- check:
name: eval_proc is actually inside a venv (sys.prefix != sys.base_prefix)
key: $(test)_PASS
values:
- <| os.path.normpath(os.path.abspath(sys.prefix)) != os.path.normpath(os.path.abspath(sys.base_prefix)) |>
- py_func:
name: py_func subprocess runs in the validation venv
key: $(test)_PASS
file: $(test_path)$(psep)verify_venv.py
func_name: check_sys_executable
- py_func:
name: py_func sys.prefix matches python_bin venv root
key: $(test)_PASS
file: $(test_path)$(psep)verify_venv.py
func_name: check_sys_prefix_in_venv
- py_func:
name: py_func is actually inside a venv
key: $(test)_PASS
file: $(test_path)$(psep)verify_venv.py
func_name: check_is_venv

View File

@@ -0,0 +1,62 @@
import os
import sys
import py_func.tm as tm
def _norm(p):
# normpath + normcase, without resolving symlinks. realpath() would
# follow the venv's ``python3`` symlink to ``/usr/bin/python3.X`` and
# defeat the comparison.
return os.path.normcase(os.path.normpath(os.path.abspath(p)))
def _venv_dir():
# python_bin is at ``<venv>/(bin|Scripts)/python*`` so the venv root
# is two levels above the executable.
exe = tm.gd("python_bin", "")
if not exe:
return ""
return os.path.dirname(os.path.dirname(_norm(exe)))
def check_sys_executable():
"""py_func subprocess: sys.executable must match the configured python_bin."""
expected = _norm(tm.gd("python_bin", ""))
actual = _norm(sys.executable)
if expected and actual == expected:
return True
return (
-1,
f"sys.executable={actual!r} differs from python_bin={expected!r}",
)
def check_sys_prefix_in_venv():
"""py_func subprocess: sys.prefix must match the venv root derived
from python_bin (two levels up from the executable)."""
venv = _venv_dir()
if not venv:
return (-1, "python_bin is not set in the global dict")
actual = _norm(sys.prefix)
if actual == venv:
return True
return (
-1,
f"sys.prefix={actual!r} is not the validation venv {venv!r}",
)
def check_is_venv():
"""py_func subprocess: confirm we are inside a venv, i.e. sys.prefix
differs from sys.base_prefix. This catches the case where python_bin
happens to be a system interpreter and the path-based check would
pass trivially."""
actual = _norm(sys.prefix)
base = _norm(sys.base_prefix)
if actual != base:
return True
return (
-1,
f"sys.prefix == sys.base_prefix == {actual!r}: not running in a venv",
)

61
test/validation/run.bat Normal file
View File

@@ -0,0 +1,61 @@
@echo off
SETLOCAL EnableExtensions
REM Runs the testium validation suite with a dedicated Python venv used
REM by every py_func / cycle / inline-eval subprocess. testium itself
REM keeps running in the project's own environment; the validation venv
REM only isolates *test execution*.
REM
REM test\validation\run.bat [clean] [extra testium args]
REM
REM Requires the project venv to already exist (run the project's
REM run.bat once first, or any other testium install method).
SET "SCRIPT_DIR=%~dp0"
SET "PROJECT_DIR=%SCRIPT_DIR%..\.."
REM Venv in the user temp dir (Windows equivalent of /tmp).
SET "VENV_DIR=%TEMP%\testium-validation-venv"
SET "PROJECT_VENV=%PROJECT_DIR%\test\tmp\testium_venv"
IF /I "%~1"=="clean" (
rmdir /s /q "%VENV_DIR%"
SHIFT
)
REM Locate a host Python.
SET "PYTHON_EXE=python"
py --version >nul 2>&1
IF %ERRORLEVEL% EQU 0 (
SET "PYTHON_EXE=py"
goto :PYTHON_FOUND
)
python --version >nul 2>&1
IF %ERRORLEVEL% EQU 0 (
SET "PYTHON_EXE=python"
goto :PYTHON_FOUND
)
echo ERROR : Python could not be found on this system.
exit /b 1
:PYTHON_FOUND
IF NOT EXIST "%VENV_DIR%" (
echo Creating validation venv at %VENV_DIR%
%PYTHON_EXE% -m venv --system-site-packages "%VENV_DIR%"
IF %ERRORLEVEL% NEQ 0 (
echo ERROR while creating the validation venv.
exit /b 1
)
call "%VENV_DIR%\Scripts\pip" install --quiet --upgrade pip
call "%VENV_DIR%\Scripts\pip" install --quiet junit-xml
)
SET "VENV_PYTHON=%VENV_DIR%\Scripts\python.exe"
IF NOT EXIST "%PROJECT_VENV%" (
echo ERROR : project venv not found at %PROJECT_VENV%. Run the project run.bat once first.
exit /b 1
)
call "%PROJECT_VENV%\Scripts\activate"
python "%PROJECT_DIR%\src\testium" -b -d "python_bin=%VENV_PYTHON%" -- "%SCRIPT_DIR%main.tum" %*

47
test/validation/run.sh Executable file
View File

@@ -0,0 +1,47 @@
#!/bin/bash
# Runs the testium validation suite with a dedicated Python venv used by
# every py_func / cycle / inline-eval subprocess (i.e. everything that
# goes through ``bins.python_bin()``). testium itself keeps running in
# the project's own environment — the validation venv only isolates
# *test execution*.
#
# ./test/validation/run.sh [clean] [extra testium args]
#
# ``clean`` (optional, must be the first arg) removes the venv before
# recreating it; this is the way to refresh the venv after a system
# Python upgrade.
set -e
SCRIPT_PATH="$(readlink -f "$0")"
SCRIPT_DIR="$(realpath "$(dirname "$SCRIPT_PATH")")"
PROJECT_DIR="$(realpath "$SCRIPT_DIR/../..")"
# Venv lives in the system temp dir so it stays out of the project tree
# (and is naturally cleaned up by tmpfiles/reboot on most distros).
VENV_DIR="${TMPDIR:-/tmp}/testium-validation-venv"
if [ "${1:-}" = "clean" ]; then
rm -rf "$VENV_DIR"
shift
fi
if [ ! -d "$VENV_DIR" ]; then
echo "Creating validation venv at $VENV_DIR"
# --system-site-packages so we don't have to reinstall pyside6, lxml
# & friends just to support the validation helpers. We still pip
# install junit-xml below because it is the one dep that does *not*
# ship as a system package on most distros and is required by
# post_execution.py.
python3 -m venv --system-site-packages "$VENV_DIR"
"$VENV_DIR/bin/pip" install --quiet --upgrade pip
"$VENV_DIR/bin/pip" install --quiet junit-xml
fi
VENV_PYTHON="$VENV_DIR/bin/python3"
# Delegate to the project's run.sh so testium itself still runs in the
# project venv (with pyside6, gitpython, ...). ``-d python_bin=...``
# pins every test-execution subprocess to the validation venv.
exec "$PROJECT_DIR/run.sh" -b \
-d "python_bin=$VENV_PYTHON" \
-- "$SCRIPT_DIR/main.tum" "$@"