Custom Analyzers

You may eventually find that you would like to analyze data from a simulation run in a way that is not possible with the analyzers that come with VSimComposer. For this, you need to write a custom analyzer. In this section we describe the process of creating custom analyzers written in Python.

For information on analyzer basics and predefined analyzer usage, please refer to User Guide: Data Analysis.

An analyzer is any executable that analyzes data generated by the computational engine (Vorpal). Analyzers can be written in any language, and they can produce output just to the log window, or they can produce data files that can be visualized in the Visualization tab. If custom analyzers are designed with specific syntactical structures, then Composer can automatically integrate these analyzers into the analysis tab, providing for improved workflow management, such as adding graphical widgets corresponding to command line parameters and simplifying file reading and writing.

Creating Custom Python Analyzer Scripts

Given below are the key parts to enable easy compatibility with VSimComposer. These instructions are intended for use with VSim-12.2 or later. For earlier versions of VSim, please consult the appropriate documentation.

Step 1: Import the Standard Modules used for Analysis Scripts

Put the following lines at the top of your analysis script:

import sys
import VpAnalyzer

You may also choose to import additional standard or custom Python modules (like numpy or scipy) at the top level. If possible, it is recommended that other modules get imported within the main() function (see Step 8: Provide a main() Function) as part of a try/except block for error checking.

The VpAnalyzer module defines classes that contain functionality for reading and parsing command line options and arguments, reading and writing of valid VsHdf5-compliant files, and convenience functions related to validation and printing of command line argument inputs. Using the functions in the VpAnalyzer module supplants use of TxPyUtils and VsHdf5 directly, thus simplifying the interfaces for custom analyzers.

Step 2: Write a Class that Inherits from VpAnalyzer

Define a class in your custom analyzer module that inherits from the VpAnalyzer class.

class className(VpAnalyzer.VpAnalyzer):
  def __init__(self):
    super(className, self).__init__()

Deriving from the VpAnalyzer base class gives your analyzer access to all of the VpAnalyzer functionality and internally defined attributes.

Step 3: Add Description Attributes

Set two descriptive strings on the class: one that describes the purpose of the analyzer and one that describes its output. These strings are used to provide structured help and for integration with the Composer interface. The calls below use self, so they belong inside a method of your class, such as __init__().

self.setAnalyzerDescription('This script performs ...')
self.setAnalyzerOutputDescription('This script prints out formatted text... \n'
                                  'It also creates a VizSchema-compliant Hdf5 file that contains...')

Step 4: Adding Command Line Options and Flags

Adding command line options and flags to your analyzer will allow the user to input parameters to be used in the analysis.

Note

The following examples for Command Line Options (simulation name) and Command Line Flag (overwrite) should be included in custom analyzers.

Adding Command Line Options

The positional arguments for the VpAnalyzer.addCommandLineOption() function are

  1. short option

    A string starting with a single dash followed by a single character.

  2. long option

    A string starting with two dashes followed by a longer, descriptive name. Note that the long option name (the part after the two dashes) must be a valid Python variable name; for example, it must not start with a digit or be a reserved Python word.

  3. option description

    A string describing what the command line option is; typically a few sentences.

  4. option type

    A string declaring the primitive type of the parameter. This is one of 'string', 'float', or 'int'.

  5. default value

    The default value for the parameter. For no default value, use None.

  6. required or optional

    Whether or not this parameter is necessary for the analysis to run. A Boolean: True or False.

self.addCommandLineOption('-s', '--simulationName', 'Name of the simulation.', 'string', None, True)

When VpAnalyzer parses the command line arguments, the long option name is converted to a variable of the declared type and made available as an attribute of the analyzer class. In the example above, the class can refer to self.simulationName, whose value is the string passed to the analyzer either as -s sim or as --simulationName sim.
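The mapping from long option name to attribute can be pictured with Python's standard argparse module. This is only an analogy under stated assumptions: how VpAnalyzer parses arguments internally is not shown in this guide, and the helper parse_options() below is a hypothetical name introduced for illustration.

```python
import argparse


def parse_options(argv):
    """Parse a -s/--simulationName option (argparse stand-in, not VpAnalyzer)."""
    parser = argparse.ArgumentParser(description="Stand-in for option parsing")
    # The long option name, minus its two dashes, becomes the attribute name.
    parser.add_argument("-s", "--simulationName", type=str, default=None,
                        required=True, help="Name of the simulation.")
    return parser.parse_args(argv)


# Both spellings populate the same attribute.
opts_short = parse_options(["-s", "sim"])
opts_long = parse_options(["--simulationName", "sim"])
print(opts_short.simulationName, opts_long.simulationName)
```

The same reasoning explains why the long option name must be a valid Python identifier: it is used directly as an attribute name.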

Adding Command Line Flags

A command line flag is like a command line option, except that a flag is Boolean. A flag can be used as a 'switch' to turn certain aspects of the analysis on or off. The positional arguments for the VpAnalyzer.addCommandLineFlag() function are

  1. short flag

    A string starting with a single dash followed by a single character.

  2. long flag

    A string starting with two dashes followed by a longer, descriptive name.

  3. flag description

    A string describing the flag; typically a few sentences.

self.addCommandLineFlag('-w', '--overwrite', 'Whether a dataset or group should be overwritten if it already exists.')

The “long flag” string is converted to a Python Boolean variable (True/False), and so, like the “long option” string, its name must be a valid Python variable name. The value of the variable is True if the flag is given on the command line in either its short or long form, and False otherwise.

The --overwrite flag should be included in your list of command line flags. This flag tells the VsHdf5 file writers whether datasets should be overwritten if they already exist in an output file. By default VsHdf5 will not overwrite existing datasets, so the results of your analyzer may not be written into the output file unless this flag is passed on the command line.
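The effect of a Boolean flag on writing can be sketched without VsHdf5. In the following illustration argparse stands in for the VpAnalyzer parser and a plain dictionary stands in for the output file; parse_flags() and write_dataset() are hypothetical names, and how VsHdf5 actually reports a refused write is not covered here.

```python
import argparse


def parse_flags(argv):
    """Parse a -w/--overwrite switch (argparse stand-in, not VpAnalyzer)."""
    parser = argparse.ArgumentParser()
    # store_true yields False when the flag is absent and True when present.
    parser.add_argument("-w", "--overwrite", action="store_true",
                        help="Overwrite a dataset or group if it already exists.")
    return parser.parse_args(argv)


def write_dataset(store, name, values, overwrite):
    """Write values under name; keep existing data unless overwrite is True.

    'store' is a dict standing in for an output file. Returns True if written.
    """
    if name in store and not overwrite:
        return False
    store[name] = values
    return True


store = {"spectrum": [1.0, 2.0]}
flags = parse_flags([])  # flag not given, so overwrite is False
print(write_dataset(store, "spectrum", [9.0], flags.overwrite), store)
```

Running an analyzer a second time against the same output file is the usual case in which the flag matters.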

Step 5: Add Validation of Command Line Options

This is an opportunity to perform error checking on the command line arguments that were passed. For instance, if an option such as a frequency must be positive, its value can be validated here.

def validateInput(self):
  if self.frequency <= 0.0:
    print('\n[moduleName] Error: command-line argument "frequency" must be greater than 0.0.\n')
    self.printHelp()
    sys.exit(9)

Note that this function is a member of the custom analyzer class that you are providing, and its behavior supersedes that of the abstract function/method VpAnalyzer.validateInput().
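The override can be tried outside of VSim with a small stand-in base class. In this sketch BaseAnalyzer is a hypothetical substitute for VpAnalyzer.VpAnalyzer (its no-op validateInput() and its help text are assumptions), and exit_code_for() is a hypothetical helper that captures the exit status instead of ending the process.

```python
import sys


class BaseAnalyzer:
    """Hypothetical stand-in for VpAnalyzer.VpAnalyzer; not the real class."""

    def validateInput(self):
        pass  # assumed default: accept every input

    def printHelp(self):
        print("usage: analyzer.py --frequency VALUE")


class FrequencyAnalyzer(BaseAnalyzer):
    def __init__(self, frequency):
        self.frequency = frequency

    def validateInput(self):
        # Supersedes the base-class method: reject non-positive frequencies.
        if self.frequency <= 0.0:
            print('[FrequencyAnalyzer] Error: "frequency" must be greater than 0.0.')
            self.printHelp()
            sys.exit(9)


def exit_code_for(frequency):
    """Return the exit status validation would produce (0 when input is valid)."""
    try:
        FrequencyAnalyzer(frequency).validateInput()
    except SystemExit as exc:
        return exc.code
    return 0


print(exit_code_for(2.5), exit_code_for(-1.0))
```

Using a distinct nonzero exit status for validation failures lets a calling script tell bad input apart from other errors.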

Step 6: Write Helper Functions

Helper functions are additional class functions, separate from analyze(), that analyze() calls to do part of its work.

def SMOOTH(self, data, freq, dt):
  LENGTH = len(data)
  LENGTH_INDICES = range(LENGTH)
  PERIOD = 1/(freq*dt)
  [...] etc.
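The body of the helper above is elided in this guide, and the following is not that body. It is a minimal, self-contained sketch of one thing such a helper might do, assuming the goal is a centered moving average over roughly one period of 1/(freq*dt) samples; the function name smooth() and the windowing choice are illustrative assumptions.

```python
def smooth(data, freq, dt):
    """Centered moving average over about one period (illustrative only)."""
    length = len(data)
    period = 1.0 / (freq * dt)  # samples per period, as in the helper above
    half = int(round(period)) // 2  # half-width of the averaging window
    smoothed = []
    for i in range(length):
        # Clip the window at both ends of the data.
        lo = max(i - half, 0)
        hi = min(i + half + 1, length)
        window = data[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


print(smooth([0.0, 3.0, 6.0], 1.0, 0.5))
```

Keeping such logic in its own function makes it possible to test it in isolation, without running the full analyzer.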

Step 7: Write the analyze() Function

The analyze() function is a required class function (see Custom Analyzers Reference & Examples for more specific details about writing an analyze() function). This function is where the actual analysis is performed. The analyze() function is called from the analyzer main() function and has access to all variables set through parsing of command line options and flags, as well as any other class functions that have been defined.

def analyze(self):
  [...] do analysis...

All VsHdf5-compatible file reading and writing that is performed in analyze() or other defined class functions should use the VpAnalyzer convenience class functions instead of direct calls to VsHdf5 functions. An instance of the VpAnalyzer base class contains an attribute that is an instance of the VsHdf5 class. The following VpAnalyzer class functions pass arguments to the owned VsHdf5 instance, thus adding an opaque interface layer to VsHdf5 that separates VsHdf5 data structures and file I/O from the VpAnalyzer class.

Step 8: Provide a main() Function

This is where execution of the analyzer is launched when invoked on the command line. The main() function should be similar in structure to the following:

def main():
  global os, glob
  import os, glob
  global numpy
  try:
    import numpy
  except ImportError:
    print('[className] Could not import numpy. Please make sure it is in your Python path')
    print('Python path is: ')
    print(sys.path)
    sys.exit(1)

  classInstance = className()
  classInstance.parseArgs(sys.argv)
  classInstance.validateInput()
  classInstance.analyze()
  sys.exit(0)

The first part of main() is importing modules that are used in the analyze() or other class-defined functions. Non-system modules should be imported in a try: except: block to catch errors, similar to what is shown above for importing numpy. Modules imported in main() need to be declared as global prior to being imported in order to be used in class functions. Optionally, importing modules can be done at the top-level scope of the analyzer module instead of in the main() function.
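The need for the global declaration can be seen in a small standalone example. Here the standard math module stands in for a third-party module such as numpy, and circle_area() is a hypothetical function playing the role of a class function that uses the imported module.

```python
import sys


def main():
    # Declaring the name global first makes the import bind it at module
    # scope, so functions defined elsewhere in the file can use it.
    global math
    try:
        import math  # stand-in for a third-party module such as numpy
    except ImportError:
        print('Could not import math. Please make sure it is in your Python path')
        sys.exit(1)
    return circle_area(2.0)


def circle_area(radius):
    # 'math' resolves here only because main() declared it global before importing.
    return math.pi * radius ** 2


print(main())
```

Without the global statement, the import would create a name local to main(), and circle_area() would raise a NameError.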

The second part of main() instantiates an analyzer class object, parses the command line arguments that were passed to the analyzer, validates those arguments (optional), performs the analysis, and exits with status 0.

Step 9: Make the Analyzer Executable as a Python Script

The following two-line stanza should be placed at the end of the analyzer. Also ensure that the analyzer file is executable (permissions 755 are recommended on Unix-type systems).

if __name__ == "__main__":
  main()

Finally, to make the analyzer executable from the command line, one should ensure that the first line of the file is:

#!/usr/bin/env python