2020 Benchmark Selections
Hi everyone,
We need to finalize the benchmarks in the next few days. There have been some, but not a ton of comments. My group has gone through the benchmarks, and based on our discussion/review of them, I suggest using the following with some comments/details needed, please let us know of any concerns.
1) As discussed earlier, include the benchmarks considered last year:
https://github.com/verivital/ARCH-COMP19-AINNCS
2) Include the following proposed benchmarks:
ReachNN examples (similar to last year's examples, except combinations of activations in the controllers, need to decide on a subset of these to include): https://cps-vo.org/node/67819
Pendulum (need to decide plant complexity/number of links) and aircraft: https://cps-vo.org/node/67884
VCAS (likely to be challenging due to switching nature of controller, potentially need to pull out a plant model if tools that support ODEs for plant models need to consider it that way, versus considering the neural network plant model): https://cps-vo.org/node/67761
Linear models (need to pull out plant model and make clear what specifications are, or potentially skip this one for this year): https://cps-vo.org/node/68020
For each of these, we need to decide on initial states, and if applicable, specifications. Given most methods are reachability based, we can skip specifications if appropriate, but certainly at least need initial states, and some participating tools may require specifications (e.g., if SMT-based).
This is a proposed subset of benchmarks for this year:
1) Last year competition
- ACC
- Benchmark 9 (Tora)
- Benchmark 10 (Bycicle model)
2) VCAS
- With a linear/nonlinear plant
3) ReachNN Benchmarks
- Example 6 (Tora) with different nonlinear/mixed controllers
4) Stanford Benchmarks
- Double pendulum (which controller?)
- Airplane (controller?)
Please, share your thoughts about these benchmarks.
If we are moving forward with these, I suggest we provide a clear description for the dynamics, not in specific format, but something like:
dx(1) = x(2) + x(3)
dx(2) = 2
dx(3) = x(1)*sin(u)
where u is the input.
We should also provide all the networks in the original format as well as the onnx.
Here is the link to 3 of the benchmarks from last year https://github.com/verivital/ARCH-COMP2020/tree/master/benchmarks, with the controllers and the dynamics files.
Thanks a lot for your write-up concerning the choice of models! About the description of the dynamics in plane ODE form, it sounds like a good idea (unless the ODEs are huge, but i don't think it's the case). About the neural networks, can we have a shared folder with the controllers used? (at least for the new models).
Is it possible to have the last year models in the Keras format? I tried to convert using some tools I found online but there were some problems.
Thank you for taking care of this. I also think it would be nice to have a shared folder or even a github repo with all the info. It's a bit hard to navigate this forum.
Thanks,
Rado
> 3) ReachNN Benchmarks
> - Example 6 (Tora) with different nonlinear/mixed controllers
Do you have a link for this model?
> 3) ReachNN Benchmarks
> - Example 6 (Tora) with different nonlinear/mixed controllers
Do you have a link for this model?
From here:
https://github.com/JmfanBU/ReachNNStar/tree/master/ReachNN/Bernstein_Polynomial_Approximation/systems
We'll extract appropriate controllers and add to the repository Diego shared above.