2020 Benchmark Proposal: Heterogeneous Networks (multiple activation types) by Chao Huang
We test our tool ReachNN on benchmarks that have the same dynamics as those proposed in Sherlock, but with different settings for the neural-network (NN) controllers. The main difference lies in the NN controller: a single controller may combine several activation functions, e.g., Ex. \#1 uses a ReLU+sigmoid NN controller. The input dimension of the NN controller, which equals the dimension of the system state, ranges from 2 to 4. Each NN controller has either 3 or 4 hidden layers, and the width (number of neurons) of each hidden layer ranges from 20 to 100. The detailed settings can be found in~\cite{huang2019reachnn} and at \url{https://github.com/JmfanBU/ReachNNStar}.
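To make the notion of a heterogeneous controller concrete, the following is a minimal Python sketch of a feed-forward network whose hidden layers use different activation functions. It is an illustration only: the layer sizes (2 inputs, three hidden layers of width 20, one output), the assignment of ReLU and sigmoid to particular layers, the linear output layer, and the random weights are all assumptions chosen to fall within the ranges stated above. They are not the trained controllers from the cited repository, and the helper names `make_layer` and `forward` are hypothetical.

```python
import math
import random


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    # Numerically stable form that avoids overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


ACTIVATIONS = {
    "relu": relu,
    "sigmoid": sigmoid,
    "tanh": math.tanh,
    "linear": lambda x: x,
}


def make_layer(n_in, n_out, activation, rng):
    """Build one dense layer with random placeholder weights (assumption:
    real benchmark weights would be loaded from the benchmark files)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, biases, activation


def forward(layers, state):
    """Evaluate the network on a state vector, applying each layer's own activation."""
    x = list(state)
    for weights, biases, activation in layers:
        act = ACTIVATIONS[activation]
        x = [act(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x


rng = random.Random(0)
# Hypothetical ReLU+sigmoid controller: 2-dimensional state, scalar control output.
layers = [
    make_layer(2, 20, "relu", rng),
    make_layer(20, 20, "relu", rng),
    make_layer(20, 20, "sigmoid", rng),
    make_layer(20, 1, "linear", rng),
]
u = forward(layers, [0.5, -0.3])  # control input for one sample state
```

Storing the activation name per layer, rather than once per network, is what lets a verification tool be exercised on mixed-activation controllers without changing the evaluation loop.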
In addition, a navigation control benchmark with a Dubins car model and various NN controllers can be found in~\cite{fan2019knowledgedistillation} and at \url{https://github.com/JmfanBU/NNCS-Dubins-Car}. The goal is to navigate the vehicle through a corridor. The controller drives the vehicle to turn at the first corner and avoid the obstacle in the middle of the corridor. The controller input is the 3-dimensional state, and the controller output is the scalar steering command. Each NN controller has 2 hidden layers with 20 neurons each, and different NN controllers use different activation functions. We ran the simulations for all the benchmarks in MATLAB.
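As a rough illustration of the closed loop described above, the sketch below simulates the textbook Dubins car kinematics (position x, y and heading theta, driven by a scalar steering input) with a forward-Euler step. The constant speed, the time step, the Euler discretization, and the placeholder controller are assumptions made for this example; the exact model and parameters of the benchmark are those in the cited repository and may differ. The function names are hypothetical.

```python
import math


def dubins_step(state, steering, speed=1.0, dt=0.05):
    """One forward-Euler step of the standard Dubins car model:
    x' = v cos(theta), y' = v sin(theta), theta' = steering.
    speed and dt are illustrative values, not benchmark parameters."""
    x, y, theta = state
    return (
        x + dt * speed * math.cos(theta),
        y + dt * speed * math.sin(theta),
        theta + dt * steering,
    )


def simulate(controller, state, steps, speed=1.0, dt=0.05):
    """Closed-loop rollout: the controller maps the 3-dimensional state
    to a scalar steering command at every step."""
    trajectory = [state]
    for _ in range(steps):
        u = controller(state)
        state = dubins_step(state, u, speed=speed, dt=dt)
        trajectory.append(state)
    return trajectory


def placeholder_controller(state):
    # Hypothetical stand-in for an NN controller: steer the heading back toward 0.
    return -state[2]


straight = simulate(lambda s: 0.0, (0.0, 0.0, 0.0), steps=10)
corrected = simulate(placeholder_controller, (0.0, 0.0, 0.5), steps=100)
```

In an actual experiment, `placeholder_controller` would be replaced by the evaluation of one of the NN controllers on the current state.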
What distinguishes our benchmarks is the heterogeneous architecture of the NN controllers, which mix different activation functions, as is common in practice. Such a setting can effectively test the generality of verification techniques.
huang2019reachnn: "ReachNN: Reachability analysis of neural-network controlled systems." ACM Transactions on Embedded Computing Systems (TECS) 18.5s (2019): 1-22.
fan2019knowledgedistillation: "Towards verification-aware knowledge distillation for neural-network controlled systems." 38th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2019. Institute of Electrical and Electronics Engineers Inc., 2019.
I think we should include some of these benchmarks, but we should choose a subset of them and decide which controllers to use for each. Do you have any preference as to which ones we should use?