Evaluate
The script evaluate.py
is used to evaluate the performance of a given policy (such as infotaxis or an RL policy)
on the source-tracking POMDP.
The script records statistics and monitoring information, and plots the results.
Computations are parallelized with multiprocessing
(default) or with MPI (requires the mpi4py
module).
To use MPI on n_cpus processors, run the following command from a terminal:
mpiexec -n n_cpus python3 -m mpi4py evaluate.py -i custom_params.py
- Outputs generated by the script are:
- *figure_distributions.pdf
This figure summarizes the results. It shows the distributions (pdf and cdf) of the number of steps and hits to find the source, as well as statistics (p_not_found, mean, standard deviation, etc.).
- *figure_convergence.pdf
This figure shows the evolution of the mean number of steps to find the source as a function of the number of episodes simulated. It is useful for assessing the convergence of the statistics.
- *monitoring_summary.txt
Text file summarizing various information, such as the distribution of initial hits, the number of failed episodes, and the computational cost.
- *monitoring_episodes.txt
Text file containing detailed monitoring information for each episode (p_not_found, mean number of steps to find the source, reason for terminating the episode, whether the agent touched a boundary, value of the initial hit, …)
- *statistics.txt
Text file with the statistics (mean, standard deviation, median, etc.) of the distributions of the number of steps and hits to find the source.
- *parameters.txt
Text file summarizing the parameters used.
- *table_CFD_nsteps.npy, *table_CFD_nhits.npy
NumPy arrays containing the cdf (cumulative distribution function) of the number of steps and of the number of hits to find the source, respectively (first row is x, second row is cdf(x)).
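The two-row layout of the cdf tables (first row x, second row cdf(x)) can be read back with NumPy. The sketch below is illustrative only: the helper names `load_cdf_table` and `quantile_from_cdf` are hypothetical, and a small synthetic table stands in for a real output file. It assumes the cdf row is non-decreasing.

```python
import os
import tempfile

import numpy as np


def load_cdf_table(path):
    """Load a 2-row table: row 0 is x, row 1 is cdf(x)."""
    table = np.load(path)
    return table[0], table[1]


def quantile_from_cdf(x, cdf, q):
    """Return the smallest x whose cdf value is >= q (cdf assumed non-decreasing)."""
    idx = np.searchsorted(cdf, q, side="left")
    if idx >= len(x):
        # Can happen if the cdf never reaches q, e.g. when some episodes failed
        raise ValueError("q exceeds the largest cdf value in the table")
    return x[idx]


# Synthetic stand-in for a real output file (values are made up)
demo = np.array([[1, 2, 3, 4, 5], [0.1, 0.3, 0.6, 0.9, 1.0]])
path = os.path.join(tempfile.mkdtemp(), "demo_table_nsteps.npy")
np.save(path, demo)

x, cdf = load_cdf_table(path)
median_steps = quantile_from_cdf(x, cdf, 0.5)
print("median number of steps:", median_steps)
```

To use it on real results, replace `path` with the path of a table written by the script.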
- Parameters of the script are:
- Source-tracking POMDP
- N_DIMS (int > 0)
number of space dimensions (1D, 2D, …)
- LAMBDA_OVER_DX (float >= 1.0)
dimensionless problem size
- R_DT (float > 0.0)
dimensionless source intensity
- NORM_POISSON (‘Euclidean’, ‘Manhattan’ or ‘Chebyshev’)
norm used for hit detections, usually ‘Euclidean’
- N_HITS (int >= 2 or None)
number of possible hit values, set automatically if None
- N_GRID (odd int >= 3 or None)
linear size of the domain, set automatically if None
- Policy
- POLICY (int)
-1: neural network
0: infotaxis (Vergassola, Villermaux and Shraiman, Nature 2007)
1: space-aware infotaxis
2: custom policy (to be implemented by the user)
5: random walk
6: greedy policy
7: mean distance policy
8: voting policy (Cassandra, Kaelbling & Kurien, IEEE 1996)
9: most likely state policy (Cassandra, Kaelbling & Kurien, IEEE 1996)
- STEPS_AHEAD (int >= 1)
number of anticipated moves, can be > 1 only for POLICY=0
- MODEL_PATH (str or None)
path of the model (neural network) for POLICY=-1, None otherwise
- Criteria for episode termination
- STOP_t (int > 0 or None)
maximum number of steps per episode, set automatically if None
- STOP_p (float ~ 0.0)
episode stops when the probability that the source has been found is greater than 1 - STOP_p
- Statistics computation
- ADAPTIVE_N_RUNS (bool)
if True, more episodes will be simulated until the estimated error is less than REL_TOL
- REL_TOL (0.0 < float < 1.0)
if ADAPTIVE_N_RUNS: tolerance on the relative error on the mean number of steps to find the source
- MAX_N_RUNS (int > 0 or None)
if ADAPTIVE_N_RUNS: maximum number of runs, set automatically if None
- N_RUNS (int > 0 or None)
if not ADAPTIVE_N_RUNS: number of episodes to simulate, set automatically if None
- Saving
- RUN_NAME (str or None)
prefix used for all output files, if None will use a timestamp
- Parallelization
- N_PARALLEL (int)
number of episodes computed in parallel (if <= 0, will use all available cpus)
This parameter applies only when using multiprocessing for parallelization (it has no effect with MPI).
Known bug: for large neural networks, the code may hang if N_PARALLEL > 1, so use N_PARALLEL = 1 instead.
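As an illustration of the parameters listed above, a parameter file such as the custom_params.py passed with -i might look like the sketch below. The parameter names come from the list above; all values are arbitrary examples chosen to satisfy the stated constraints, not recommended settings. Whether the script requires every parameter to be present, or falls back to defaults for missing ones, is not stated here, so the sketch simply sets them all.

```python
# custom_params.py (illustrative values only)

# Source-tracking POMDP
N_DIMS = 2                  # int > 0
LAMBDA_OVER_DX = 2.0        # float >= 1.0
R_DT = 2.0                  # float > 0.0
NORM_POISSON = 'Euclidean'  # 'Euclidean', 'Manhattan' or 'Chebyshev'
N_HITS = None               # int >= 2, or None to set automatically
N_GRID = None               # odd int >= 3, or None to set automatically

# Policy
POLICY = 0                  # 0: infotaxis
STEPS_AHEAD = 1             # can be > 1 only for POLICY=0
MODEL_PATH = None           # only needed for POLICY=-1

# Criteria for episode termination
STOP_t = None               # None: set automatically
STOP_p = 1e-6               # stop when P(source found) > 1 - STOP_p

# Statistics computation
ADAPTIVE_N_RUNS = True
REL_TOL = 0.01              # 0.0 < float < 1.0
MAX_N_RUNS = None           # None: set automatically
N_RUNS = None               # used only if ADAPTIVE_N_RUNS is False

# Saving
RUN_NAME = None             # None: use a timestamp as prefix

# Parallelization
N_PARALLEL = -1             # <= 0: use all available cpus (multiprocessing only)
```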