Getting Started: Network Optimization

The optimization code is a wrapper around the analysis code. Given an initial network configuration of sensors, the code adds a desired number of sensors to the network. The goal of the optimization is to maximize the expected information gain (EIG) of the new sensor network. This is done with a sequential (greedy) optimization that adds sensors one at a time to the initial network. Each sensor is placed using a Bayesian optimization method that constructs a Gaussian process (GP) surrogate model of the EIG optimization surface. The surrogate is built by evaluating many potential new sensor locations and measuring the EIG of each using the analysis code. These data are then used to fit the surrogate and to choose new trial points at which to query the EIG function. The code then returns the new sensor network after the optimal sensors have been added.
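The greedy structure described above can be illustrated with a minimal sketch. It is not the implementation in network_opt.py: a plain random search stands in for the GP-based Bayesian optimization, and a toy objective that rewards spread-out sensors stands in for the EIG. The names toy_objective and greedy_add_sensors, and the bounds used, are hypothetical.

```python
import random


def toy_objective(network):
    """Stand-in for the EIG (hypothetical): rewards sensors that are spread out.

    The real code would call the analysis code here instead.
    """
    if len(network) < 2:
        return 0.0
    # Smallest squared distance between any pair of sensors.
    return min(
        (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
        for i, a in enumerate(network)
        for b in network[i + 1:]
    )


def greedy_add_sensors(initial, n_new, bounds, n_trials=200, seed=0):
    """Add n_new sensors one at a time, keeping the best trial point each step."""
    rng = random.Random(seed)
    (lat_min, lat_max), (lon_min, lon_max) = bounds
    network = list(initial)
    for _step in range(n_new):
        best_point, best_score = None, float("-inf")
        # Random search over trial locations (stand-in for Bayesian optimization).
        for _trial in range(n_trials):
            trial = (rng.uniform(lat_min, lat_max), rng.uniform(lon_min, lon_max))
            score = toy_objective(network + [trial])
            if score > best_score:
                best_point, best_score = trial, score
        # Fix this sensor in place before searching for the next one.
        network.append(best_point)
    return network


initial_network = [(40.0, -111.5)]
search_bounds = ((39.0, 41.0), (-113.0, -110.0))
optimized = greedy_add_sensors(initial_network, n_new=3, bounds=search_bounds)
print(optimized)
```

Because each sensor is fixed before the next one is placed, the search at every step covers only a single new location rather than all new locations jointly, which is what makes the sequential approach tractable.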

The optimization code is contained within the script network_opt.py. This script takes four arguments: a configuration file, an output file name, the path to a folder to save the output and all intermediate files, and a verbosity control. An example configuration file, which we call opt_inputs.dat, might look like this:

1
20
sensor_optimization_boundary.json
2,2,0
0
1024
4096
2
event_sampling_boundary.json
mpiexec --bind-to core --npernode 16 --n 512
unif_prior.py
10
40.0, -111.5,0.1,2,0  

Here, observe several things about the input file:

Line 3 specifies the filename (and path, if in a different directory) of the shapefile used to define the boundary for the sensors. At this time, a boundary file must always be provided (meaning that if we wanted to simply optimize the sensors over a square domain, we would need to provide a file specifying that).
Line 4 specifies that we wish to use sensors with an SNR offset of 2, an output vector length of 2, and of type 0, meaning sensors that detect seismic waves.
Line 9 specifies the file defining the boundary from which events may be sampled.
Line 10 specifies the MPI launch command, here with 16 processes per node and 512 processes in total. If the server being used has a different architecture, this line needs to be modified to match the server.
Line 11 specifies the file that contains the proper functions for sampling events (see Customizing the event prior for details).

For more information on input files, see Writing input files.
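The points listed above can be checked against the example file with a short sketch. It reads only the lines whose meaning is described in this section (3, 4, 9, 10, and 11) and leaves the remaining lines uninterpreted. The function name describe_inputs and the dictionary keys it returns are hypothetical and are not part of network_opt.py.

```python
import math
import shlex

# Contents of the example configuration file shown above.
EXAMPLE_INPUTS = """\
1
20
sensor_optimization_boundary.json
2,2,0
0
1024
4096
2
event_sampling_boundary.json
mpiexec --bind-to core --npernode 16 --n 512
unif_prior.py
10
40.0, -111.5,0.1,2,0
"""


def describe_inputs(text):
    """Extract the fields documented in this section (hypothetical helper)."""
    lines = [line.strip() for line in text.splitlines()]

    # Line 4: SNR offset, output vector length, sensor type.
    snr_offset, output_len, sensor_type = [v.strip() for v in lines[3].split(",")]

    # Line 10: MPI launch command; pull out per-node and total process counts.
    tokens = shlex.split(lines[9])
    per_node = int(tokens[tokens.index("--npernode") + 1])
    total = int(tokens[tokens.index("--n") + 1])

    return {
        "sensor_boundary_file": lines[2],   # line 3
        "snr_offset": float(snr_offset),
        "output_length": int(output_len),
        "sensor_type": int(sensor_type),
        "event_boundary_file": lines[8],    # line 9
        "mpi_command": lines[9],            # line 10
        "processes_per_node": per_node,
        "total_processes": total,
        "nodes_needed": math.ceil(total / per_node),
        "event_prior_file": lines[10],      # line 11
    }


info = describe_inputs(EXAMPLE_INPUTS)
for key, value in info.items():
    print(f"{key}: {value}")
```

For the example file, the MPI command requests 16 processes per node and 512 processes in total, which corresponds to 32 nodes. A check like this can help confirm that the allocation requested from the scheduler matches the launch command in the input file.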

Running the code#

The code can be run interactively, either locally or on an HPC system, or it can be run through an HPC scheduler. This tutorial assumes the HPC system uses Slurm.

The code for optimizing a network is contained in the Python file network_opt.py, which is executed with the following arguments:

Input file: Path to the input file (see Writing input files for more details).
Output file: Path to the location and filename where the outputs will be saved. File must be in .npz format.
Output path: Location where a directory will be created to store the output and temporary files created by the run.
Verbosity: One of two verbosity levels may be specified, 0 or 1:
- 0 produces no output other than the final optimized network.
- 1 is the most verbose, with printed statements throughout the code describing what is going on. This verbosity level also causes each intermediate input file to the eig_calc.py script, the outputs of those intermediate runs, and the intermediate optimization objects to be stored in the output path directory.
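As a minimal sketch of how these four arguments fit together, the helper below assembles the command line in the documented order and checks the two constraints stated above (the .npz extension and a verbosity of 0 or 1). The function build_command is hypothetical and is not provided by the code; it only builds the argument list and does not launch anything.

```python
import sys


def build_command(input_file, output_file, output_path, verbosity,
                  script="network_opt.py"):
    """Return the argument list for a network_opt.py run (hypothetical helper)."""
    if not output_file.endswith(".npz"):
        raise ValueError("output file must be in .npz format")
    if verbosity not in (0, 1):
        raise ValueError("verbosity must be 0 or 1")
    # Order: input file, output file, output path, verbosity.
    return [sys.executable, script, input_file, output_file,
            output_path, str(verbosity)]


cmd = build_command("inputs.dat", "outputs.npz", "output_path", 1)
print(" ".join(cmd))
# To actually launch the run from Python:
#   import subprocess; subprocess.run(cmd, check=True)
```

Validating the arguments before the run starts avoids discovering a misplaced argument only after an allocation has been spent.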

Running using HPC interactively#

To request an interactive job, use the salloc command. The salloc command requests a Slurm allocation and has several flags that specify the details of the allocation. The required flags vary by system, but typically the number of nodes and the allocation time are required:

--nodes: The number of nodes to request.
--time: The length of time the nodes will be allocated to your account.

An example job allocation request looks like this:

salloc --nodes=2 --time=2:00:00

This command requests 2 nodes for 2 hours. For more details on salloc, see the Slurm documentation: https://slurm.schedmd.com/documentation.html.

Once you have an allocation, you can run the job. For example:

python3 network_opt.py inputs.dat outputs.npz output_path 1

which executes the network_opt.py script with Python, reads the input data from inputs.dat, saves the output data to outputs.npz, in a directory called output_path, and uses verbose setting 1.

Running on HPC with script#

A bash script can be written that submits a job to the HPC job queue. This does not require the user to allocate nodes for the job explicitly; nodes are allocated and the job begins automatically once the number of nodes specified in the bash script is available. An example script might look like this:

#!/bin/bash
## Do not put any commands or blank lines before the #SBATCH lines
#SBATCH --nodes=16                   # Number of nodes - all cores
                                     # per node are allocated to the job
#SBATCH --time=04:00:00              # Wall clock time (HH:MM:SS) -
                                     # once the job exceeds this time, it
                                     # will be terminated (default is 5 minutes)
#SBATCH --job-name=net_analysis      # Name of job
#SBATCH --export=ALL                 # Export environment variables from the
                                     # submission environment (modules, bashrc, etc.)

nodes=$SLURM_JOB_NUM_NODES           # Number of nodes you have requested
                                     # (for a list of Slurm environment
                                     # variables see "man sbatch")
cores=16                             # Number of MPI processes to run
                                     # on each node (a.k.a. PPN)

# Arguments: input file, output file (.npz), output path, verbosity
python3 network_opt.py inputs.dat outputs.npz output_path 1

This script can then be submitted using the sbatch command:

sbatch network_opt_batch_submission_script.bash

For a comprehensive list of available options, see the Slurm documentation (https://slurm.schedmd.com/documentation.html), and in particular the Slurm command/option summary.

Optimization output#

When network_opt.py finishes, it displays the optimized sensor network configuration, i.e., for each sensor, its latitude, longitude, noise level, number of output variables, and sensor type. This network configuration is then saved in the output NumPy file (e.g., opt_network.npz).

Additionally, if the verbosity is set to 1, two files are created for each optimization step, that is, each time a new sensor is placed. The first file is result*.pkl, where * is the number of the sensor being placed. This file contains an optimization result object, which holds information about the GP surrogate used to model the optimization objective and the data used to fit it. More information can be found in the scikit-optimize documentation (https://scikit-optimize.github.io/stable/auto_examples/store-and-load-results.html). The second file, result_eigdata*.npz, is a NumPy file that contains three variables: sensors, eigdata_full, and Xs. sensors lists the current network configuration before optimization. eigdata_full contains the [EIG, std EIG, minESS] for each trial sensor location considered for augmenting the current network. Xs contains the trial sensor locations.
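The sketch below shows one way the result_eigdata*.npz file might be inspected with NumPy. It assumes that eigdata_full is a two-dimensional array with one row per trial location and columns ordered as [EIG, std EIG, minESS], and that the rows of Xs line up with those rows; neither the array shapes nor the helper name best_trial are confirmed by this section. The file written here is synthetic and exists only so that the example is self-contained.

```python
import os
import tempfile

import numpy as np


def best_trial(path):
    """Return the trial location with the highest EIG and that EIG value.

    Assumes eigdata_full has shape (n_trials, 3) and Xs has one row per trial.
    """
    with np.load(path) as data:
        eigdata_full = data["eigdata_full"]  # columns: EIG, std EIG, minESS
        trial_locations = data["Xs"]
    best = int(np.argmax(eigdata_full[:, 0]))
    return trial_locations[best], float(eigdata_full[best, 0])


# Write a small synthetic file with the three documented variables.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "result_eigdata0.npz")
    np.savez(
        path,
        sensors=np.array([[40.0, -111.5, 0.1, 2, 0]]),
        eigdata_full=np.array([[1.2, 0.1, 50.0],
                               [1.8, 0.2, 45.0],
                               [1.5, 0.1, 60.0]]),
        Xs=np.array([[39.5, -112.0], [40.5, -111.0], [40.2, -112.5]]),
    )
    location, eig = best_trial(path)

print("best trial location:", location)
print("EIG at that location:", eig)
```

Note that the highest sampled EIG is not necessarily the location the optimizer selects, since the final choice is informed by the fitted surrogate rather than by the raw trial values alone.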

Examples#

For examples of using this code to optimize a sensor network, see the next section, Performing bounded optimization.