> Each Tuesday I have been looking at the performance results of all the PRs merged in the past week
I really appreciate the work described here, but it seems to me that the performance tests are in exactly the wrong place. The correct time to benchmark a PR is before it is merged, not after. It's the same for any other kind of test: your tests should pass before you merge, not after.
Sure, there will be exceptions where a performance regression is appropriate. But that should be known and agreed on before merging it.
I don't see a strong reason that the performance tests have to be done afterwards. I presume that it's okay to have a delay between creating the pull request and its merge, since humans will typically want to be able to review it!
> It's the same for any other kind of test; your tests should pass before you merge, not after
Rust is very good about testing before merge, not after. It somewhat pioneered doing complete testing before merge (at least, it built significant tooling and mind-share around doing it): https://graydon2.dreamwidth.org/1597.html and https://bors.tech/
That is to say, if Rust isn't validating something pre-merge, it's more likely than not that there's a good reason for not doing so. I don't personally know the reason in Rust's case, but, for performance testing in general, it's easy to overwhelm infrastructure, because benchmarks should be run on a single physical machine that's otherwise quiet. This means the testing is completely serialised and there's no opportunity for parallelism. Based on https://perf.rust-lang.org/status.html , it looks like the performance tests currently take a bit longer than an hour each.
As others have mentioned, there's automatic tooling for running the benchmarks on PRs that may be performance sensitive, and PRs that accidentally cause a regression are reverted, so the change can be reworked and reanalysed.
I think the heuristic is really easy: "Is it slower than before?"
If it's slower, then you require a manual override to accept the merge.
That will make performance regressions a whole lot less likely. Sure, in some cases they may be necessary, but many are unintentional; those can be fixed and then merged.
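To make the heuristic concrete, here's a minimal sketch of such a gate (hypothetical function names; it assumes benchmark results are available as mean seconds per benchmark, and uses a small tolerance to absorb measurement noise):

```python
# Hypothetical pre-merge gate: compare a PR's benchmark results against
# the baseline and require a manual override for any regression.

def regressions(baseline, candidate, tolerance=0.02):
    """Return benchmarks whose candidate time exceeds the baseline by
    more than `tolerance` (2% here, to absorb noisy measurements)."""
    slow = {}
    for name, base_time in baseline.items():
        new_time = candidate.get(name)
        if new_time is not None and new_time > base_time * (1 + tolerance):
            slow[name] = (base_time, new_time)
    return slow

def gate(baseline, candidate, override=False):
    """Block the merge unless there are no regressions, or a reviewer
    explicitly signed off on them via `override`."""
    slow = regressions(baseline, candidate)
    if slow and not override:
        return False, slow   # reject: needs manual sign-off
    return True, slow        # accept (possibly with agreed-on regressions)

baseline  = {"ctfe": 1.00, "parse": 0.50}
candidate = {"ctfe": 1.08, "parse": 0.50}  # "ctfe" got 8% slower
ok, slow = gate(baseline, candidate)
print(ok, slow)  # -> False {'ctfe': (1.0, 1.08)}
```

The point isn't the exact threshold; it's that the default outcome for "slower than before" is a blocked merge, and accepting a regression becomes a deliberate, recorded decision.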
The linked llvm article about performance ( https://nikic.github.io/2020/05/10/Make-LLVM-fast-again.html ) says:
"Rust does this by running a set of benchmarks on every merge, with the data available on perf.rust-lang.org. Additionally, it is possible to run benchmarks against pull requests using the @rust-timer bot. This helps evaluate changes that are intended to improve compile-time performance, or are suspected of having non-trivial compile-time cost."
So, they do seem to have a bot to perform such tests, though I have no clue how often they use it.
In practice I do not think this will work. There is already a queue of PRs waiting to be tested and merged.
Most changes are not going to change the compiler performance. Better to watch a trend line and revert any change that hurts performance in a significant way. Releases are every 6 weeks so there is plenty of time for this type of analysis and activity.
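Watching a trend line and reverting offenders can be sketched roughly like this (an illustrative toy, not the actual perf.rust-lang.org tooling; the 5% threshold and 10-commit window are made-up parameters):

```python
# Sketch of a post-merge trend watch: flag any commit whose benchmark
# time is significantly above the rolling mean of recent commits.

def flag_regressions(history, window=10, threshold=0.05):
    """history: list of (commit, seconds) in merge order.
    Flags commits more than `threshold` (5%) above the mean of the
    preceding `window` measurements -- candidates for a revert."""
    flagged = []
    for i in range(window, len(history)):
        recent = [t for _, t in history[i - window:i]]
        mean = sum(recent) / window
        commit, t = history[i]
        if t > mean * (1 + threshold):
            flagged.append(commit)
    return flagged

history = [(f"c{i}", 1.0) for i in range(10)] + [("c10", 1.2)]
print(flag_regressions(history))  # -> ['c10']
```

With a 6-week release cadence, running something like this daily over the merged history leaves plenty of time to spot the rare significant regression and revert it, without serialising every merge behind an hour-long benchmark run.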