monitor: Hits deadline exceeded when monitoring a large log
The main monitoring function, https://git.glasklar.is/sigsum/core/sigsum-go/-/blob/62fc0ea6327385a2c6f3f0082146f57705701bee/pkg/monitor/monitor.go#L102, is written to periodically get the latest tree head and retrieve all new leaves.
This is implemented using an updateCtx with a timeout corresponding to the query interval. However, for a large log, getting all leaves may time out, e.g., if I run
go run ./cmd/sigsum-monitor --interval 10s -P sigsum-test1-2025 tests/test.submit2.key.pub
I get about 75000 leaves in the first 10 seconds, and then monitoring fails with
New 4e89cc51651f0d95f3c6127c15e1a42e3ddf7046c5b17b752689c402e773bb4d leaves, count 0, total processed 75264
2026/03/12 09:07:50 [FATA] Alert log 4e89cc51651f0d95f3c6127c15e1a42e3ddf7046c5b17b752689c402e773bb4d: monitoring alert: Log not responding as expected: get-inclusion-proof failed: Get "https://test.sigsum.org/barreleye/get-inclusion-proof/162323/ba32972f13f6f297a838e0778c06c34b827164ae826fff17a6dddb1a59bb5499": context deadline exceeded
A simple fix could be to treat a deadline exceeded error as the end of the current cycle rather than as a fatal alert, and start a new cycle. Alternatively, reorganize completely and use context.Background() when sending requests to the log.
It would be desirable to generate some kind of warning if the monitor has trouble keeping up with log growth, but it is not entirely obvious how to do that.