6 Commits

Author SHA1 Message Date
f2eedb5606 docs: add 0.2.1 release note (load-time optimisations + fix)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 15:33:13 +02:00
f02616dc3a perf(load): flatten step list in one pass; fix nested-list duplication
load_test_recursively expanded nested lists and included 'sequence'
entries by splicing each into the step list and rebuilding the whole
list every time (O(n^2)). The list branch also rebuilt after an in-place
splice, duplicating entries when a nested list held more than one item.

Replace both with a single linear _flatten_actions pass. Build phase
~12% faster at 6k items; the real fix is the duplication (a nested
2-element list now yields a,b,c,d not a,b,c,c,d). Validation suite
identical (post-exec SUCCESS, same verdicts/tracebacks).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 14:40:46 +02:00
5adba7fcd5 perf(load): use libyaml CLoader when available
Base the TUM loaders (and the param-file load) on yaml.CLoader when
PyYAML is built with libyaml, falling back to the pure-Python Loader
otherwise. Same ParserError/ScannerError, same custom !include
constructors. YAML parse time ~8x lower; validation suite identical
(same verdicts, same 8 expected-fail tracebacks, post-exec SUCCESS).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 11:22:26 +02:00
5086aa6c0e perf(load): cache compiled jinja templates, render in memory
Shared jinja Environment + compiled-template cache keyed on (path, mtime,
size), and render to an in-memory StringIO instead of a temp file.
Behaviour unchanged (validation suite passes); template time -10..40x,
total load -20..30% on template-heavy trees.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 10:41:42 +02:00
ef49789780 test: add load-time benchmark (jinja/include trees)
Generator + in-process harness timing the real loader's three stages and
template/YAML call counts, across tunable profiles. cases/ git-ignored;
see test/benchmark/README.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 10:41:42 +02:00
6e31ae971a removed unused robustness. 2026-05-31 10:17:54 +02:00
16 changed files with 686 additions and 258 deletions

View File

@@ -1,3 +1,16 @@
version 0.2.1
==============
- Faster test loading, especially for large tests built from jinja
templates and ``!include``: compiled jinja templates are cached and
reused (a file included many times is compiled once), rendering happens
in memory instead of through a temporary file, and YAML is parsed with
the libyaml C loader when available. Typical load time is 3-6x lower on
include / template-heavy tests; behaviour is unchanged.
- Fix: a nested list holding more than one step under ``steps`` no longer
duplicates its entries while the step tree is built.
- New load-time benchmark under ``test/benchmark/`` (synthetic-tree
generator + in-process timing harness) to measure the load pipeline.
version 0.2
==============
- Test items: each item type now declares its accepted parameters

View File

@@ -29,6 +29,51 @@ def _build_item_path(item) -> str:
return " > ".join(reversed(parts))
def _flatten_actions(actions, out, parent_seq_name):
"""Expand nested lists and included ``sequence`` entries into ``out`` as a
flat list of single test-item dicts, propagating each sequence's source
filename onto its items.
Replaces the previous approach, which spliced each entry into the step
list and rebuilt the whole list every time (O(n^2) over the step list, and
a rebuild that duplicated entries when a nested list held more than one
element). This single forward pass is linear.
"""
for idx, action in enumerate(actions):
# a bare list raises its elements to the same level
if isinstance(action, (list, tuple)):
_flatten_actions(action, out, parent_seq_name)
continue
# a NoneType (e.g. pointing at an unused alias) contributes nothing
if action is None:
continue
# a 'sequence' (an included file) is spliced in, with its filename
# propagated onto each of its items
if isinstance(action, dict) and "sequence" in action:
sequence = action["sequence"]["data"]
f = action["sequence"]["filename"]
if isinstance(sequence, dict):
sequence = [{k: v} for k, v in sequence.items()]
# Case of an empty sequence
elif sequence is None:
tm.print_info(
f"An empty sequence is loaded in '{parent_seq_name}'."
)
sequence = []
elif not isinstance(sequence, list):
raise ETUMSyntaxError(
f"Syntax error in '{parent_seq_name}' step number {idx+1}. Sequence definition: '{str(action)}'",
f
)
for s in sequence:
if isinstance(s, dict) and s:
s[list(s.keys())[0]]["seq_filename"] = f
_flatten_actions(sequence, out, parent_seq_name)
continue
out.append(action)
class TestSet:
def __init__(
self,
@@ -434,56 +479,16 @@ class TestSet:
f"No valid list of actions in sequence {parent_seq_name}",
file_name
)
# first we merged to the same level 'sequence dict entries and list within the list
counter = 0
test_dir = tm.gd("test_directory")
la = len(parent_seq_actions)
while counter < la:
action = parent_seq_actions[counter]
# if action is a list raise up to the the same level,
# ie insert action element into the parent_seq_actions
if isinstance(action, (list, tuple)):
parent_seq_actions[counter : counter + 1] = action
parent_seq_actions = (
parent_seq_actions[:counter]
+ action
+ parent_seq_actions[counter + 1 :]
)
la = len(parent_seq_actions)
continue
# if action is a NoneType skip and continue
# (when pointing to an unused alias for instance)
if action is None:
counter += 1
continue
# if action is a sequence we insert its entry into the action list
if "sequence" in action:
sequence = action["sequence"]["data"]
f = action["sequence"]["filename"]
if isinstance(sequence, dict):
sequence = [{k: v} for k, v in sequence.items()]
# Case of an empty sequence
elif sequence is None:
tm.print_info(
f"An empty sequence is loaded in '{parent_seq_name}'."
)
sequence = []
elif not isinstance(sequence, list):
raise ETUMSyntaxError(
f"Syntax error in '{parent_seq_name}' step number {counter+1}. Sequence definition: '{str(action)}'",
f
)
for s in sequence:
s[list(s.keys())[0]]["seq_filename"] = f
parent_seq_actions = (
parent_seq_actions[:counter]
+ sequence
+ parent_seq_actions[counter + 1 :]
)
la = len(parent_seq_actions)
continue
# Action is now for sure a list of dict of length 1
# Flatten nested lists and included 'sequence' entries to the same level
# in one linear pass (was an in-place splice + full list rebuild per
# entry: O(n^2) over the step list).
flat_actions = []
_flatten_actions(parent_seq_actions, flat_actions, parent_seq_name)
for action in flat_actions:
# Action is now for sure a dict of length 1
k = list(action.keys())[0]
if action[k].get("seq_filename", None) is None:
action[k]["seq_filename"] = file_name
@@ -546,8 +551,6 @@ class TestSet:
action[k]["seq_filename"]
)
counter += 1
return ret
def tree(self):

View File

@@ -6,10 +6,10 @@ from runtime.tum_except import ETUMFileError
from interpreter.utils.template import template_to_test
from copy import copy
from interpreter.utils.globdict import global_dict
from interpreter.utils.yaml_load import yaml_load
from interpreter.utils.yaml_load import yaml_load, YAML_BASE_LOADER
class TUMLoaderNoIncludes(yaml.Loader):
class TUMLoaderNoIncludes(YAML_BASE_LOADER):
def __init__(self, stream):

View File

@@ -1,33 +1,74 @@
import io
import os
from sys import exc_info
from jinja2 import Template
from jinja2 import Environment
from jinja2.exceptions import TemplateSyntaxError, TemplateError, UndefinedError
from tempfile import TemporaryFile
from interpreter.utils.yaml_load import print_yaml
from runtime.tum_except import ETUMSyntaxError
# One Environment reused for every render (default settings, i.e. identical
# behaviour to jinja2.Template), plus a compiled-template cache so a file that
# is included many times — or a test that is reloaded — is compiled only once.
# Jinja compilation is the expensive step; render (variable substitution) stays
# per-call. Cache is keyed on path + mtime + size so an edited file recompiles.
_ENV = Environment()
_template_cache = {} # abspath -> (mtime_ns, size, compiled_template)
class _RenderedStream(io.StringIO):
"""A rendered template kept in memory.
Carries ``root`` (and ``name``) so the YAML loader resolves ``!include``
paths exactly as it did from the on-disk temp file this replaces — without
the write + seek + read round-trip (one temp file per included file). That
round-trip is pure overhead, and especially costly on slow storage.
"""
def _compiled_template(filename: str):
"""Return the compiled jinja template for *filename*, reusing the cached
one when the file is unchanged (path + mtime + size)."""
key = os.path.abspath(filename)
try:
st = os.stat(filename)
except OSError:
st = None
if st is not None:
cached = _template_cache.get(key)
if (cached is not None
and cached[0] == st.st_mtime_ns
and cached[1] == st.st_size):
return cached[2]
with open(filename, "r") as f:
source = f.read()
template = _ENV.from_string(source) # compile (may raise TemplateSyntaxError)
if st is not None:
_template_cache[key] = (st.st_mtime_ns, st.st_size, template)
return template
def template_to_test(filename: str, params: list):
""" Function which processes an eventual jinja2 template to a test file
"""
# Temporary file created to receive the processed include
# file
tmpf = TemporaryFile('w+t')
with open(filename, 'r') as f:
try:
j2_template = Template(f.read())
except TemplateError as e:
# Compile (cached) — a syntax error in the template surfaces here.
try:
j2_template = _compiled_template(filename)
except TemplateError as e:
with open(filename, "r") as f:
print_yaml(f, filename)
type, value, tb = exc_info()
msg = "Template error"
if hasattr(value, 'lineno'):
msg = msg + f" on line {value.lineno}: "
else:
msg += ": "
raise ETUMSyntaxError(msg + str(e), filename)
type, value, tb = exc_info()
msg = "Template error"
if hasattr(value, 'lineno'):
msg = msg + f" on line {value.lineno}: "
else:
msg += ": "
raise ETUMSyntaxError(msg + str(e), filename)
# Render into memory (no temp file).
try:
params["include_directory"] = os.path.dirname(os.path.abspath(filename))
tmpf.write(j2_template.render(params))
rendered = j2_template.render(params)
except TemplateSyntaxError as e:
raise ETUMSyntaxError(f"""Template loading of file '{filename}' with following parameters '{str(params)}'
Syntax error in template: {e.message}""")
@@ -42,8 +83,7 @@ Template rendering error: {e.message}""")
raise ETUMSyntaxError(f"""Template loading of file '{filename}' with following parameters '{str(params)}'
Unexpected error: {str(e)}""")
# return to begining of the temp file
tmpf.seek(0, os.SEEK_SET)
tmpf.root = os.path.dirname(filename)
return tmpf
stream = _RenderedStream(rendered)
stream.root = os.path.dirname(filename)
stream.name = filename
return stream

View File

@@ -11,7 +11,7 @@ import api.testium as tm
import interpreter.utils.globdict as globdict
import interpreter.utils.settings as prefs
from interpreter.utils.paths import testium_path
from interpreter.utils.yaml_load import yaml_load
from interpreter.utils.yaml_load import yaml_load, YAML_BASE_LOADER
from interpreter.utils import clear_recursively
from runtime.tum_except import ETUMSyntaxError
from interpreter.utils.params import expanse, eval_func_init
@@ -89,7 +89,7 @@ def locate_report_file(rep_file):
def yamltodict(param_file, silent=True):
# load of the file
with open(param_file, "r") as fd:
dp = yaml_load(fd, param_file, yaml.Loader)
dp = yaml_load(fd, param_file, YAML_BASE_LOADER)
if dp is None:
tm.print_info(f"The YAML file '{param_file}' is empty.")

View File

@@ -1,3 +1,4 @@
import yaml
from yaml.parser import ParserError
from yaml import load, Loader
from yaml.scanner import ScannerError
@@ -5,6 +6,12 @@ from api.testium import print_debug
from runtime.tum_except import ETUMSyntaxError
import io
# Use the libyaml-backed loader (much faster parsing) when PyYAML was built
# with it, falling back to the pure-Python loader otherwise. The C loader
# raises the same ParserError/ScannerError and supports the same custom
# constructors (!include) and construct_* helpers the TUM loaders rely on.
YAML_BASE_LOADER = yaml.CLoader if getattr(yaml, "__with_libyaml__", False) else yaml.Loader
def print_yaml(file: io.TextIOWrapper, file_name):
""" Prints YAML file if debug mode is activated.
@@ -21,10 +28,10 @@ def yaml_load(file, real_file_name: str, loader: Loader):
return load(file, loader)
except ParserError as e:
if isinstance(file, io.TextIOWrapper):
if isinstance(file, (io.TextIOWrapper, io.StringIO)):
print_yaml(file, real_file_name)
raise ETUMSyntaxError(f"yaml file parsing error: " + str(e), real_file_name)
except ScannerError as e:
if isinstance(file, io.TextIOWrapper):
if isinstance(file, (io.TextIOWrapper, io.StringIO)):
print_yaml(file, real_file_name)
raise ETUMSyntaxError("yaml file scanning error: " + str(e), real_file_name)

1
test/benchmark/.gitignore vendored Normal file
View File

@@ -0,0 +1 @@
cases/

116
test/benchmark/README.md Normal file
View File

@@ -0,0 +1,116 @@
# Load-time benchmark
Measures how long *testium* takes to **load** a `.tum` test tree — template
rendering (jinja) + YAML parsing + test-tree construction — *without* executing
it. Purpose: get reproducible numbers before/after load-path optimisations, and
attribute any gain to a specific part of the pipeline.
It is meant for *very long* tests, the kind you can build with `jinja` loops and
`!include`, where load time becomes noticeable.
## Files
| File | Role |
|------|------|
| `gen_bench_test.py` | Generates a synthetic `.tum` tree (the test input). |
| `load_bench.py` | Drives the **real** loader in-process and times it. |
| `run.sh` | Convenience: generate + time across profiles, using the project venv. |
| `cases/` | Generated trees (git-ignored, recreated on demand). |
The benchmark `.tum` files are **generated**, not committed — the generator is
the artifact. They use only `let` leaves and `group` containers, so loading has
no runtime side effect (no subprocess, no `<| |>` eval) and the timing reflects
the parse/build pipeline alone.
## Quick start
```bash
# default matrix (all profiles), 5 repeats each
./test/benchmark/run.sh
# one profile at one size
./test/benchmark/run.sh repeat 2000
# more repeats for a tighter min
REPEAT=10 ./test/benchmark/run.sh includes 1000
```
`run.sh` uses the project venv at `test/tmp/.venv` (created by `./run.sh`). If it
is missing, run `./run.sh` once first.
To drive the harness directly on any `.tum` (not just generated ones):
```bash
test/tmp/.venv/bin/python3 test/benchmark/load_bench.py --repeat 5 --quiet path/to/main.tum
```
## Profiles
Each profile isolates one cost. `--size` is the profile-specific count.
| Profile | What it builds | Stresses |
|---------|----------------|----------|
| `flat` | one main file, N inline `let` steps | big YAML parse + linear object build |
| `includes` | main `!include`s N **distinct** sub-files | per-include template+YAML+tempfile, `sequence` splice |
| `repeat` | main `!include`s the **same** parametrised leaf N times | jinja **recompilation** of an identical template |
| `jinja` | one main file, `{% for %}` emitting N steps | single large render + single large parse |
| `deep` | nested includes, depth N | include recursion (see caveat) |
| `mix` | groups + jinja loop + distinct + repeated includes | realistic blend |
## Reading the output
```
phase min median
initial 0.1131 0.1285 <- pass 1: discover config files (no includes)
loadtest 1.0724 1.0900 <- config fixpoint loop + full recursive include load
build 0.1850 0.1976 <- TestSet: load_test_recursively tree build
total 1.3886 1.4227
counters (last run):
templates : 1003 calls 0.5247s (exclusive: jinja compile+render+tempfile)
yaml : 1004 parses 1.4696s (inclusive of nested includes)
```
- **min** is the headline (least noisy); median is a sanity check.
- **initial / loadtest / build** map to the three pipeline stages in
`interpreter/process.py` and `interpreter/test_set.py`. The main file is
rendered+parsed across `initial` *and* `loadtest` (the loader does ~3 passes).
- **templates** = number of `template_to_test()` calls and their *exclusive*
wall time (one file render each — pure jinja compile+render+tempfile I/O).
A high count with the same source file = recompilation, the `repeat` case.
- **yaml** = number of `yaml_load()` parses. Its time is *inclusive* of nested
includes, so use the **count** for attribution, not the seconds.
## Mapping to the optimisation axes
| Axis (see DESIGN / discussion) | Watch | Best profile to prove it |
|--------------------------------|-------|--------------------------|
| 1 — cache compiled jinja templates | `templates` time drops, count unchanged | `repeat` |
| 2 — drop the tempfile round-trip | `templates` time drops | `includes`, `repeat`, `mix` |
| 3 — C YAML loader (libyaml) | `yaml` time / `loadtest` drops | `flat`, `jinja` |
| 6 — O(n²) sequence splice | `build` drops | `includes`, `mix` |
## How to compare before/after a change
1. Run the matrix on the current code, keep the output.
2. Apply one axis.
3. Re-run the **same** profiles/sizes; compare `min` per phase and the counters.
Change one axis at a time so the attribution is clean. Run on an idle machine
(and note the disk: on a USB stick the tempfile round-trip of axis 2 weighs
more).
## Caveat: deep includes
The loader is recursive and spends ~10 stack frames per include level, so
`deep` hits Python's `RecursionError` around ~90 nested levels. The harness
reports this cleanly instead of crashing. Real tests are *wide* (many steps /
many includes), not deep, so `includes`/`repeat`/`jinja`/`mix` are the
representative "very long" cases.
## Notes
- No execution is triggered — timing stops where `Batch` would mark the test
*loaded*.
- The profiles contain no `<| |>`, so the external eval process is not started.
Pass `--with-eval` to `load_bench.py` for trees that evaluate at load time.
- Numbers are machine- and disk-specific; only compare runs from the same host.

179
test/benchmark/gen_bench_test.py Executable file
View File

@@ -0,0 +1,179 @@
#!/usr/bin/env python3
"""Generate synthetic ``.tum`` test trees to benchmark *load* time.
The generated trees are deliberately cheap to *build* (only ``let`` leaves and
``group`` containers — no subprocess, no runtime side effect) so the load
benchmark measures the parse / template / tree-build pipeline and nothing else.
Profiles, each targeting a specific cost in the loader:
flat one main file, N inline ``let`` steps, no include, no jinja.
Baseline: YAML parse of a big document + linear object build.
includes main ``!include``s N *distinct* sub-files (a few steps each).
Stresses the per-include template+YAML+tempfile round-trip and the
``sequence`` splice in test_set.load_test_recursively.
repeat main ``!include``s the *same* parametrised leaf file N times.
Stresses jinja *recompilation*: the compiled template is identical
every time, only the render params (idx) differ -> the case a
template cache collapses.
jinja one main file whose ``{% for %}`` loop emits N steps.
Stresses a single large jinja render + a single large YAML parse.
deep nested includes, depth N (main -> d0 -> d1 -> ...).
Stresses include recursion and per-level template+YAML.
mix a realistic blend: groups, a jinja loop, distinct includes and a
repeated parametrised include.
Usage:
gen_bench_test.py --profile repeat --size 1000 --out cases/repeat_1000
-> writes <out>/main.tum (+ includes, + param.yaml) and prints the path.
"""
import argparse
import os
import shutil
def _let(indent, i, name=None):
name = name if name is not None else f"s{i}"
pad = " " * indent
return (
f"{pad}- let:\n"
f"{pad} name: {name}\n"
f"{pad} values:\n"
f"{pad} - k{i}: {i}\n"
)
def gen_flat(out, n):
body = "".join(_let(8, i) for i in range(n))
main = f"main:\n name: bench flat {n}\n steps:\n{body}"
_write(out, "main.tum", main)
def gen_includes(out, n):
steps = "".join(f" - !include inc_{i}.tum\n" for i in range(n))
main = f"main:\n name: bench includes {n}\n steps:\n{steps}"
_write(out, "main.tum", main)
for i in range(n):
# each include is a YAML *sequence* (list of steps)
seq = "".join(_let(0, i * 3 + j, name=f"inc{i}_{j}") for j in range(3))
_write(out, f"inc_{i}.tum", seq)
def gen_repeat(out, n):
steps = "".join(
f" - !include {{file: leaf.tum, idx: {i}}}\n" for i in range(n)
)
main = f"main:\n name: bench repeat {n}\n steps:\n{steps}"
_write(out, "main.tum", main)
leaf = (
"- let:\n"
" name: leaf_{{ idx }}\n"
" values:\n"
" - leaf_{{ idx }}: {{ idx }}\n"
)
_write(out, "leaf.tum", leaf)
def gen_jinja(out, n):
main = (
f"main:\n name: bench jinja {n}\n steps:\n"
"{% for i in range(" + str(n) + ") %}\n"
" - let:\n"
" name: j{{ i }}\n"
" values:\n"
" - k{{ i }}: {{ i }}\n"
"{% endfor %}\n"
)
_write(out, "main.tum", main)
def gen_deep(out, n):
main = (
f"main:\n name: bench deep {n}\n steps:\n"
" - let:\n name: top\n values:\n - a: 0\n"
" - !include d_0.tum\n"
)
_write(out, "main.tum", main)
for i in range(n):
seq = _let(0, i, name=f"d{i}")
if i < n - 1:
seq += f"- !include d_{i + 1}.tum\n"
_write(out, f"d_{i}.tum", seq)
def gen_mix(out, n):
# n groups, each: 2 inline lets, one distinct include, one repeated include,
# plus a small jinja loop. Roughly ~6*n steps.
per = max(1, n)
parts = [f"main:\n name: bench mix {n}\n steps:\n"]
for g in range(per):
parts.append(
f" - group:\n"
f" name: grp{g}\n"
f" steps:\n"
)
parts.append(_let(16, g * 2, name=f"g{g}_a"))
parts.append(_let(16, g * 2 + 1, name=f"g{g}_b"))
parts.append(f" - !include inc_{g}.tum\n")
parts.append(f" - !include {{file: leaf.tum, idx: {g}}}\n")
parts.append(
"{% for i in range(3) %}\n"
f" - let:\n"
f" name: g{g}_j{{{{ i }}}}\n"
f" values:\n"
f" - g{g}_k{{{{ i }}}}: {{{{ i }}}}\n"
"{% endfor %}\n"
)
_write(out, "main.tum", "".join(parts))
for g in range(per):
_write(out, f"inc_{g}.tum", _let(0, g, name=f"mixinc{g}"))
_write(
out,
"leaf.tum",
"- let:\n name: mixleaf_{{ idx }}\n values:\n - mixleaf_{{ idx }}: {{ idx }}\n",
)
PROFILES = {
"flat": gen_flat,
"includes": gen_includes,
"repeat": gen_repeat,
"jinja": gen_jinja,
"deep": gen_deep,
"mix": gen_mix,
}
def _write(out, name, content):
with open(os.path.join(out, name), "w") as f:
f.write(content)
def main():
ap = argparse.ArgumentParser(description=__doc__,
formatter_class=argparse.RawDescriptionHelpFormatter)
ap.add_argument("--profile", required=True, choices=sorted(PROFILES))
ap.add_argument("--size", type=int, default=1000,
help="profile-specific count (steps / includes / depth)")
ap.add_argument("--out", required=True, help="output directory (recreated)")
args = ap.parse_args()
out = os.path.abspath(args.out)
if os.path.isdir(out):
shutil.rmtree(out)
os.makedirs(out)
# minimal config file so the loader does not emit "no param file" noise
_write(out, "param.yaml", "bench_dummy: 1\n")
PROFILES[args.profile](out, args.size)
print(os.path.join(out, "main.tum"))
if __name__ == "__main__":
main()

200
test/benchmark/load_bench.py Executable file
View File

@@ -0,0 +1,200 @@
#!/usr/bin/env python3
"""Time the testium *load* pipeline on a given ``.tum`` tree.
It drives the real loader code (``TestProcess._load_initial_params`` /
``_load_test`` then ``TestSet(...)``) in-process, so the numbers track the
production path and stay honest as the code evolves. Execution is never
triggered — we stop exactly where ``Batch`` would report the test as *loaded*.
Reported per run, over ``--repeat`` iterations (min is the headline, least
noisy):
initial first pass: discover config files (template+YAML, no includes)
loadtest config-file fixpoint loop + full recursive include/template/YAML
build TestSet construction: the load_test_recursively tree build
total sum of the three
Plus instrumentation counters (exact call counts, wall time) for the two
hot leaves the optimisation axes target:
templates jinja template_to_test() calls (axis 1 compile cache, axis 2 tempfile)
yaml yaml_load() parses (axis 3 C loader)
template time is exclusive (one file render); yaml time is wall-inclusive of
nested includes, so lean on the *counts* for attribution.
Must run inside the project venv (jinja2, pyyaml, telnetlib3, ...). The
benchmark profiles contain no ``<| |>`` so the external eval process is not
needed; pass --with-eval to start it for faithfulness on eval-heavy trees.
Usage (see run.sh for the convenience wrapper):
test/tmp/.venv/bin/python3 test/benchmark/load_bench.py [--repeat 5] <main.tum>
"""
import argparse
import os
import statistics
import sys
from queue import Queue
from time import perf_counter
# --- bootstrap: src/testium for flat imports, src for `import testium` --------
HERE = os.path.dirname(os.path.abspath(__file__))
ROOT = os.path.abspath(os.path.join(HERE, "..", ".."))
sys.path.insert(0, os.path.join(ROOT, "src"))
sys.path.insert(0, os.path.join(ROOT, "src", "testium"))
import api.testium as tm # noqa: E402
from interpreter.utils.test_init import env_init, apply_overrides # noqa: E402
from interpreter.utils.test_ctrl import TestSetController # noqa: E402
from interpreter.process import TestProcess # noqa: E402
from interpreter.test_set import TestSet # noqa: E402
from interpreter.utils.py_eval import eval_process_init # noqa: E402
from interpreter.utils.api_srv import api_request # noqa: E402
# --- instrumentation: count + time the two hot leaves -------------------------
import interpreter.process as _proc # noqa: E402
import interpreter.utils.include as _inc # noqa: E402
import interpreter.utils.test_init as _ti # noqa: E402
import interpreter.utils.template as _tpl # noqa: E402
import interpreter.utils.yaml_load as _yl # noqa: E402
_C = {"tpl_n": 0, "tpl_t": 0.0, "yaml_n": 0, "yaml_t": 0.0}
_orig_tpl = _tpl.template_to_test
_orig_yaml = _yl.yaml_load
def _wrap_tpl(*a, **k):
t = perf_counter()
try:
return _orig_tpl(*a, **k)
finally:
_C["tpl_t"] += perf_counter() - t
_C["tpl_n"] += 1
def _wrap_yaml(*a, **k):
t = perf_counter()
try:
return _orig_yaml(*a, **k)
finally:
_C["yaml_t"] += perf_counter() - t
_C["yaml_n"] += 1
# rebind in every module that did `from ... import template_to_test / yaml_load`
for _m in (_proc, _inc):
_m.template_to_test = _wrap_tpl
for _m in (_proc, _inc, _ti):
_m.yaml_load = _wrap_yaml
def _reset_counters():
_C.update(tpl_n=0, tpl_t=0.0, yaml_n=0, yaml_t=0.0)
def load_once(tp, fname, test_dir):
"""One full load (no execution). Returns (initial, loadtest, build) seconds."""
t0 = perf_counter()
init_pf, gv = tp._load_initial_params(test_dir)
t1 = perf_counter()
test_dict, _pf = tp._load_test(init_pf, gv)
t2 = perf_counter()
TestSet(fname, test_dict, Queue())
t3 = perf_counter()
return (t1 - t0, t2 - t1, t3 - t2)
def main():
ap = argparse.ArgumentParser(description=__doc__,
formatter_class=argparse.RawDescriptionHelpFormatter)
ap.add_argument("main_tum", help="path to the generated main.tum")
ap.add_argument("--repeat", type=int, default=5)
ap.add_argument("--with-eval", action="store_true",
help="start the external eval process (needed only for <| |> at load)")
ap.add_argument("--quiet", action="store_true",
help="silence the loader's INFO output during runs")
args = ap.parse_args()
fname = os.path.abspath(args.main_tum)
if not os.path.isfile(fname):
ap.error(f"not found: {fname}")
test_dir = os.path.dirname(fname)
env_init()
apply_overrides({}, {})
eval_proc = None
if args.with_eval:
eval_proc = eval_process_init(api_request, 10, test_dir)
eval_proc.start()
eval_proc.wait_ready(10)
if args.quiet:
# the loader prints a couple of INFO lines per config file; mute stdout
# around the measured section to avoid I/O skew.
devnull = open(os.devnull, "w")
real_stdout = sys.stdout
tp = TestProcess(fname, Queue(), TestSetController(),
config_files=[], defines={}, gui_defaults={}, text_mode=True)
samples = [] # list of (initial, loadtest, build)
last_counters = None
try:
for r in range(args.repeat):
_reset_counters()
if args.quiet:
sys.stdout = devnull
try:
samples.append(load_once(tp, fname, test_dir))
except RecursionError:
if args.quiet:
sys.stdout = real_stdout
print(f"file : {fname}")
print("ERROR : RecursionError during load — the include "
"nesting is too deep for the recursive loader.\n"
" (each include level costs ~10 stack frames; "
"raise sys.setrecursionlimit to probe further.)")
return 2
except Exception as e: # noqa: BLE001 - report, don't crash the bench
if args.quiet:
sys.stdout = real_stdout
print(f"file : {fname}")
print(f"ERROR : load failed: {type(e).__name__}: {e}")
return 2
finally:
if args.quiet:
sys.stdout = real_stdout
last_counters = dict(_C)
finally:
if eval_proc is not None:
eval_proc.stop()
eval_proc.join()
if args.quiet:
devnull.close()
initial = [s[0] for s in samples]
loadtest = [s[1] for s in samples]
build = [s[2] for s in samples]
total = [sum(s) for s in samples]
def stat(xs):
return min(xs), statistics.median(xs)
print(f"file : {fname}")
print(f"repeats : {args.repeat} (showing min | median, seconds)")
print(f"{'phase':<10}{'min':>12}{'median':>12}")
for name, xs in (("initial", initial), ("loadtest", loadtest),
("build", build), ("total", total)):
mn, md = stat(xs)
print(f"{name:<10}{mn:>12.4f}{md:>12.4f}")
if last_counters:
print("counters (last run):")
print(f" templates : {last_counters['tpl_n']:>7d} calls "
f"{last_counters['tpl_t']:>8.4f}s (exclusive: jinja compile+render+tempfile)")
print(f" yaml : {last_counters['yaml_n']:>7d} parses "
f"{last_counters['yaml_t']:>8.4f}s (inclusive of nested includes)")
if __name__ == "__main__":
sys.exit(main() or 0)

49
test/benchmark/run.sh Executable file
View File

@@ -0,0 +1,49 @@
#!/bin/bash
# Load-time benchmark driver: generate synthetic .tum trees and time the
# testium load pipeline on them, using the project venv.
#
# Usage:
# ./test/benchmark/run.sh # default matrix (all profiles)
# ./test/benchmark/run.sh <profile> <size> # one profile at one size
# REPEAT=10 ./test/benchmark/run.sh repeat 2000
#
# Profiles: flat includes repeat jinja deep mix (see gen_bench_test.py)
#
# Generated trees go under test/benchmark/cases/ (git-ignored). The numbers
# are wall-clock; run on an otherwise idle machine and compare min values.
set -e
SCRIPT_DIR="$(realpath "$(dirname "$(readlink -f "$0")")")"
PROJECT_DIR="$(realpath "$SCRIPT_DIR/../..")"
VPY="$PROJECT_DIR/test/tmp/.venv/bin/python3"
CASES="$SCRIPT_DIR/cases"
REPEAT="${REPEAT:-5}"
if [ ! -x "$VPY" ]; then
echo "ERROR: project venv not found at $VPY — run ./run.sh once to create it." >&2
exit 1
fi
bench() {
local profile="$1" size="$2"
local out="$CASES/${profile}_${size}"
local main
main="$("$VPY" "$SCRIPT_DIR/gen_bench_test.py" --profile "$profile" --size "$size" --out "$out")"
echo "===== profile=$profile size=$size ====="
"$VPY" "$SCRIPT_DIR/load_bench.py" --repeat "$REPEAT" --quiet "$main"
echo
}
if [ $# -eq 2 ]; then
bench "$1" "$2"
exit 0
fi
# Default matrix. 'deep' is kept small: the recursive loader hits Python's
# recursion limit around ~90 nested include levels.
bench flat 2000
bench includes 1000
bench repeat 1000
bench jinja 2000
bench deep 40
bench mix 300

View File

@@ -1,69 +0,0 @@
# Main
################################################################################
main:
name: Serial Terminal bug reproducer
version: 0.1
steps:
- group:
name: Test preparation
steps:
- console:
name: Open RSL Simulator Terminal
console_name: RSL_simulator
steps:
- open:
protocol: terminal
terminal_path: $(rslsimulatorpath)
- writeln: "pwd"
- read_until: {expected: "$", timeout: 5}
- writeln: "./RSverify $(rsTx)" # /dev/ttyMUE1
- read_until: {expected: "RSL controller>", timeout: 5}
- writeln: "setportconf 0 115200 none 8 1 1 255"
- read_until: {expected: "RSL controller>", timeout: 5}
- writeln: "send4ever 0 0"
- read_until: {expected: "RSL controller>", timeout: 5}
- console:
name: Open the EUT console
console_name: cons_target
doc: Initiates the console of the target in order
to be ready to capture its traces.
stop_on_failure: True
steps:
- open:
protocol: serial
serial_port: $(rsRx) # /dev/ttyMUE2
serial_baudrate: 115200
- loop:
name: Qualification loop
stop_on_failure: False
steps:
- py_func:
name: Capture the RS serial output
file: $(test_directory)/terminal_bug_reproducer.py
func_name: RetreiveData
param:
- cons_target
- sleep: {timeout: 1}
# Cleanup sequence
#-------------------------------------------------------------------------------
- group:
name: Cleanup
execute_on_stop: True
steps:
- console:
name: Close the target console
console_name: cons_target
execute_on_stop: True
steps:
- close:
- console:
name: Close the RSL_simulator
console_name: RSL_simulator
execute_on_stop: True
steps:
- close:

View File

@@ -1,26 +0,0 @@
import api.testium as tm
def RetreiveData(console_name):
print("--------------- retrieving data ---------------")
result = 0
cons = tm.console(console_name)
if cons is None:
print("--------------- The console does not exist ---------------")
else:
try:
is_finished = False
while not is_finished:
status, d = cons.read_until('\n', timeout=0, return_data=True, mute=True)
if 0 == status:
print("--------------- Data ---------------")
print(d)
else:
print("--------------- No data ---------------")
print("Status: ", status)
is_finished = True
except:
print("--------------- Error retrieving data ---------------")
result = -1
return result

View File

@@ -1,9 +0,0 @@
chars='<=>| -,;:!/."()[]{}*\&#%+012345689abcdefghiklmnopqrstuvwxyzABCD'
for j in {1..256} ;
do
for i in {1..256} ; do
echo -n "${chars:RANDOM%${#chars}:1}"
done
echo
sleep 0.01
done

View File

@@ -1,26 +0,0 @@
import api.testium as tm
def RetreiveData(console_name):
print("--------------- retrieving data ---------------")
result = 0
cons = tm.console(console_name)
if cons is None:
print("--------------- The console does not exist ---------------")
else:
try:
is_finished = False
while not is_finished:
status, d = cons.read_until('\n', timeout=0, return_data=True, mute=True)
if 0 == status:
print("--------------- Data ---------------")
print(d)
else:
print("--------------- No data ---------------")
print("Status: ", status)
is_finished = True
except:
print("--------------- Error retrieving data ---------------")
result = -1
return result

View File

@@ -1,50 +0,0 @@
# Main
################################################################################
main:
name: Terminal bug reproducer
version: 0.1
steps:
- group:
name: Test preparation
steps:
- console:
name: Open the EUT console
console_name: cons_target
doc: Initiates the console of the target in order
to be ready to capture its traces.
stop_on_failure: True
steps:
- open:
protocol: terminal
- loop:
name: Qualification loop
stop_on_failure: False
steps:
- console:
name: write random data
console_name: cons_target
steps:
- writeln: bash $(test_directory)/generate_char.sh
- py_func:
name: Capture the terminal output
file: $(test_directory)/terminal_bug_reproducer.py
func_name: RetreiveData
param:
- cons_target
- sleep: {timeout: 1}
# Cleanup sequence
#-------------------------------------------------------------------------------
- group:
name: Cleanup
execute_on_stop: True
steps:
- console:
name: Close the target console
console_name: cons_target
execute_on_stop: True
steps:
- close: