Auto-generation of namelist.input

Introduction

In the nwpservice layer, the initialization routines for WRF and REAL expect the path to a fully-correct namelist.input as one of the arguments.  In the initial implementation of the full EHRATM workflow, it was also assumed that the user would present a fully-correct namelist.input at the highest layers, and that this file would simply be passed down through the layers.  This decision was a short-term strategy that put the emphasis on developing and testing the workflow rather than on the complexities of creating a correct namelist.input.  That issue was deferred until later.

It is now “later”, and efforts are being made to develop and test the code that will create a custom namelist.input based on parameters defined in the workflow definition at the highest EHRATM layers.  This presumably correct namelist.input will then be presented to the REAL and WRF initialization routines in nwpservice, just as is done now.  In keeping with the layered approach, the strategy is to build a new class, NamelistInputWriter, within the nwpservice layer.  Its purpose is to generate a custom namelist.input based on a provided namelist.input template and a dict of parameter changes to make relative to the template.

We emphasize here that the nwpservice REAL and WRF routines continue to require a fully-correct namelist.input as an input argument, and that this new NamelistInputWriter class is provided as a utility (primarily intended for use in layers above nwpservice) to modify a namelist.input template with data in a provided dict to generate the fully-correct namelist.input.

We also emphasize that down at this nwpservice layer, there is the assumption that the namelist.input template and the data in the dict are correct and error-free.  This is in keeping with the spirit of the low-level nwpservice layer to generally focus on simply getting the work done of running specified NWP and ATM components.  Therefore, there is the implicit assumption that any kind of error checking of the data dict is done in layers above nwpservice.


NamelistInputWriter() Overview

The primary design goal of this class is to support the creation of a namelist.input file specific to a simulation, based on general parameters within a specified namelist.input template file and specific parameters provided in a Python dict.  The following graphic depicts a scenario where a general namelist.input template has been created for a general class of problems (consider, for example, an operational forecast performed four times a day, indefinitely, on the same domain with the same physics options).  In this case, most fields in the template will be mapped as-is to the specific namelist.input for a simulation, but fields related to the time will need to be customized for each simulation.

[Figure: dictplustemplate]

The parameters Python dict contains the fields that will need to be modified in the template in order to create a namelist.input.  In this case, clearly the start and end times will need to be modified relative to the template.  Additionally, just for demonstration purposes, we also show a change in number of simulation hours, and a couple of other parameters.
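As an illustration of how the time-related part of such a parameters dict might be prepared for each forecast cycle, the following is a minimal sketch.  The helper build_time_control() is hypothetical and not part of nwpservice; the key names and the YYYYMMDDhhmmss timestamp format are taken from the demonstration test case later in this document, and it is assumed that every domain shares the same start and end time.

```python
import datetime


def build_time_control(start, run_hours, num_domains):
    """Hypothetical helper: build a 'time_control' section for one cycle.

    Assumes all domains start and end at the same time, and that
    timestamps use the YYYYMMDDhhmmss format seen in the demo dict.
    """
    end = start + datetime.timedelta(hours=run_hours)
    fmt = '%Y%m%d%H%M%S'
    return {
        'run_hours': run_hours,
        'start_date_list': [start.strftime(fmt)] * num_domains,
        'end_date_list': [end.strftime(fmt)] * num_domains,
    }


# One cycle of a recurring forecast: only the times change between cycles
cycle_start = datetime.datetime(2017, 10, 23, 3, 0, 0)
params = {'time_control': build_time_control(cycle_start, 3, 2)}
print(params)
```

A dict built this way would be passed as the parameters dict, while the domain and physics settings stay in the template.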

The default behaviour of the NamelistInputWriter class is as follows:

  • Ensure that the specified template namelist.input file is accessible and read its contents into memory

  • Iterate through each of the sections (e.g. time_control, domains, physics, …) in the parameters dict and ensure that it is a valid section for a correct namelist.input (if an unrecognized section name is found, an exception is raised)

    • Within each section iterate through the items

      • If the item name is not present in the template, we make the judgment that we should not blindly insert it, and so we raise an exception.  This behaviour can be bypassed if desired, allowing for the entry of “new” item names, but it is dangerous, and we generally recommend in such cases that a new template file be created, one that actually contains the item name to be replaced.

      • If the item name is present in the template, then a no-questions-asked replacement is made to that item.  We choose to avoid, at this low level, the great complexity of determining whether values are correct or not, and simply accept them as-is, assuming that an application layer that needs such checking will do the checking itself before calling the routines.

  • Write the modified, custom namelist.input to a file, which can then be passed to the applicable routine
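The steps above can be sketched roughly as follows.  This is not the NamelistInputWriter implementation: the template is represented here as a nested dict rather than a file, apply_overrides() is a hypothetical function, VALID_SECTIONS is only a partial illustrative set, and the exception types are assumptions (the class is only known to raise an exception in these cases).

```python
# Partial, illustrative set of section names (assumption for this sketch)
VALID_SECTIONS = {'time_control', 'domains', 'physics', 'dynamics',
                  'bdy_control'}


def apply_overrides(template, overrides, permit_new_entries=False):
    """Return a copy of template with the override values applied."""
    # Copy each section so the caller's template is left untouched
    result = {section: dict(items) for section, items in template.items()}

    for section, items in overrides.items():
        if section not in VALID_SECTIONS:
            raise ValueError('Unrecognized section: %s' % section)

        for name, value in items.items():
            present = name in result.get(section, {})
            if not present and not permit_new_entries:
                raise KeyError('Item not in template: %s.%s'
                               % (section, name))
            # No-questions-asked replacement: the value is not validated
            result.setdefault(section, {})[name] = value

    return result


template = {
    'time_control': {'run_hours': 6, 'history_interval': 60},
    'domains': {'time_step': 60},
}
custom = apply_overrides(template, {'time_control': {'run_hours': 3}})
print(custom)
```

Returning a modified copy keeps the template reusable across many simulations, which matches the intended use of one general template with a small set of per-simulation changes.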

[Figure: nmllayers]

Although this approach is intended to be generally applicable to a wide range of uses, the vision is to support a collection of simulations, each of which may differ by only a few parameters in their namelist.input files.  In this way, a general namelist.input template can be provided for all simulations and then a small set of custom parameters can be specified for each simulation, with these specific parameters being used to overwrite the respective parameters in the namelist.input template.

The default behaviour of the class is meant to add some safety to the process.  To summarize the above, by default:

  • Any section name in the parameters dict which is not a valid section name (as determined by the authority of the WRF Users Guide) will cause the process to raise an exception.

  • Any item name in the parameters dict that is not present in the template will cause the process to raise an exception.  This is meant to prevent the insertion of typographical errors, or variables that are not compatible with the configuration of the particular template.  If such errors were introduced, they would surely cause the actual real or wrf invocation to crash, and be difficult to debug, so the intent is to catch these early.  Again, this behaviour can be bypassed by setting the permit_new_entries initialization argument to True, but this gets risky.  If a new variable needs to be inserted, it is probably better to create a new template that contains that variable.

  • When an item in the parameters dict is deemed valid for replacement in the template, the replacement is made with no questions asked.  If the item value is incorrect for any reason, then it will be passed through to the real or wrf routine that uses the namelist.input.  Likewise, if the template has been set up for three domains, but the replacement item is a list of only two elements, then the real or wrf invocation will ultimately fail.  This design decision was made with the policy that, at this low nwpservice level, we should not be introducing the immense complexity of judging proper values for a particular case.  Our goal here is to simply get the parameters dict values into the new namelist as easily as possible, assuming that if more error checking on the values is needed, it should be done on the parameters dict before passing it into this routine.
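Because values are passed through unchecked, a layer above nwpservice may want to run its own checks on the parameters dict first.  The following is a minimal sketch of one such check, covering only the case described above of a list with fewer elements than the number of domains.  The function check_list_lengths() is hypothetical, not part of nwpservice, and the caller is assumed to know how many domains the template is set up for.

```python
def check_list_lengths(section_data_dict, num_domains):
    """Hypothetical application-layer check on a parameters dict.

    Returns a list of messages for list values that have fewer
    elements than num_domains.  Scalar values are not examined.
    """
    problems = []
    for section, items in section_data_dict.items():
        for name, value in items.items():
            if isinstance(value, (list, tuple)) and len(value) < num_domains:
                problems.append('%s.%s has %d values, expected at least %d'
                                % (section, name, len(value), num_domains))
    return problems


params = {
    'time_control': {
        'run_hours': 3,
        'start_date_list': ['20171023030000', '20171023030000'],
    },
    'physics': {'radt': [10, 10]},
}

# The same dict is acceptable for two domains but not for three
print(check_list_lengths(params, 2))
print(check_list_lengths(params, 3))
```

A caller could refuse to generate the namelist.input when the returned list is non-empty, so that the problem is reported before real or wrf is ever invoked.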

Demonstration test case

A demonstration of the use of this capability is available in the packages/nwpservice/tests/demos/ directory of the repository.  This directory consists of Python programs meant to test NWP workflows invoked solely through the use of the nwpservice layer.  Most users will never have a reason to do this, but these are provided for those who may want to work down at this level, for various reasons.

The wrf_test_driver.py test case was originally created to demonstrate the use of the nwpservice Wrf() component, and the following excerpt shows how it would normally be called.  Note that a correct namelist.input is assumed, and is simply passed into the initializer for the Wrf() class, ready to run:

.
.
.
namelist_input = self._testcase_data_dir + '/WRF/namelist.input'

self._wrf_obj = wrf.Wrf(
                    wpswrf_distro_path=self._wpswrf_distro,
                    wpswrf_rundir=self._testrun_dir,
                    namelist_input=namelist_input,
                    realdatadir=self._testcase_real_data_dir,
                    output_dir=self._output_stage_dir,
                    numpes=int(self._num_mpi_tasks_wrf),
                    mpirun_path=self._mpirun,
                    log_level=logging.DEBUG
)


self._wrf_obj.setup()
output_manifest = self._wrf_obj.run()
.
.
.

A minor variation of this test, wrf_test_driver_autogen_nml.py, shows how this section of code would be modified to make use of the NamelistInputWriter class and then use the resultant namelist.input as before.  This test is presented primarily for illustrative reasons.  In reality, the NamelistInputWriter class is expected to be invoked from the Application layer above nwpservice, where the values of the parameters dict, and the specific template to use, can be error checked at a higher level:

.
.
.
############### BEGIN namelist.input generation #######################
# This is the code unique to this demo test, relative to
# wrf_test_driver.py.  Instead of using the previously
# created namelist.input, we create a custom namelist.input
# based on a namelist.input.template and a section_data_dict
# specific to this "gfs_spain_simple_twonest" test case.
## namelist_input = self._testcase_data_dir + '/WRF/namelist.input'


namelist_input_template = self._testcase_data_dir + \
                        '/WRF/namelist.input.template'
newnml_data_dict = {
    'time_control' : {
        'run_hours' : 3,
        'start_date_list' : ['20171023030000', '20171023030000'],
        'end_date_list' : ['20171023060000', '20171023060000'],
        'interval_seconds' : 10800,
        'history_interval' : 90
    },
    'domains' : {
        'time_step' : 90
    },
    'physics' : {
        'radt' : [10,10]
    }
}

# Put the new namelist in the default tmp dir, with a random string
# suffix in its name.  This is what will be passed into the wrf
# processing
new_namelist_input = os.path.join(TMP_ROOT_DIR,
                    'newnamelist.input_' + str(uuid.uuid4()))
print('new_namelist_input: %s' % new_namelist_input)

# Create the custom namelist based on the template and the data dict
new_namelist_obj = namelistinput.NamelistInputWriter(
    destpath=new_namelist_input,
    section_data_dict=newnml_data_dict,
    template_nml_path=namelist_input_template
)
new_namelist_obj.write()

# Now, new_namelist_input should contain the correct custom namelist
# and that gets passed into the next call

################ END namelist.input generation ########################


self._wrf_obj = wrf.Wrf(
                    wpswrf_distro_path=self._wpswrf_distro,
                    wpswrf_rundir=self._testrun_dir,
                    namelist_input=new_namelist_input,
                    realdatadir=self._testcase_real_data_dir,
                    output_dir=self._output_stage_dir,
                    numpes=int(self._num_mpi_tasks_wrf),
                    mpirun_path=self._mpirun,
                    log_level=logging.DEBUG
)


self._wrf_obj.setup()
output_manifest = self._wrf_obj.run()
.
.
.

Again, the envisioned use is that the application layer prepares the newnml_data_dict, presumably with a lot of care and error checking.  The application-layer code then uses the nwpservice.wrf.namelistinput.NamelistInputWriter class to generate a correct namelist.input file.  Finally, with a correct namelist.input available, the application layer invokes the nwpservice.wrf.Wrf() (or Real()) class to set up and run the component.

For convenience, the full wrf_test_driver_autogen_nml.py is presented below.  It is run in much the same way as the example presented in Demo of Single-Component Usage:

  • Ensure that the hard-coded paths at the top of the demo driver are correct for your environment

  • Ensure that a suitable Python environment has been set up

  • Ensure that nwpservice/src has been included in the PYTHONPATH

$ conda activate ehratmv1.0
$ export PYTHONPATH=/home/ctbtuser/git/high-res-atm/packages/nwpservice/src


$ ~/git/high-res-atm/packages/nwpservice/tests/demos/wrf_test_driver_autogen_nml.py
setup...
new_namelist_input: /home/ctbtuser/tmp/newnamelist.input_f8a1797a-7dfb-41cb-9678-810b47a0159f
DEBUG      [wrf:wrf.py:__init__:73] --> started
.
.
.
DEBUG      [wrf:wrf.py:run:323] --> the_command: ['/usr/lib64/openmpi/bin/mpirun', '-n', '2', './wrf.exe']
 starting wrf task            1  of            2
 starting wrf task            0  of            2
output_manifest: {'rundir': '/home/ctbtuser/tmp/wrf_test_20240411.233712/WRF', 'wrfout': {'files': {'wrfout_d01_2017-10-23_03:00:00': {'bytes': 5011980}, 'wrfout_d01_2017-10-23_04:30:00': {'bytes': 5011980}, 'wrfout_d01_2017-10-23_06:00:00': {'bytes': 5011980}}}}
.
.
.
Output stage success: True
Keeping the test run dir: /home/ctbtuser/tmp/wrf_test_20240411.233712
$ tree -L 1 -F /home/ctbtuser/tmp/wrf_test_20240411.233712/WRF
/home/ctbtuser/tmp/wrf_test_20240411.233712/WRF
.
.
.
├── wrfbdy_d01 -> /home/ctbtuser/git/high-res-atm/packages/nwpservice/tests/testcase_data/gfs_spain_simple_twonest/WRF/output/wrfbdy_d01
├── wrf.exe -> /scratch/WRFV4.3-Distribution/WRF/main/wrf.exe*
├── wrfinput_d01 -> /home/ctbtuser/git/high-res-atm/packages/nwpservice/tests/testcase_data/gfs_spain_simple_twonest/WRF/output/wrfinput_d01
├── wrfinput_d02 -> /home/ctbtuser/git/high-res-atm/packages/nwpservice/tests/testcase_data/gfs_spain_simple_twonest/WRF/output/wrfinput_d02
├── wrfout_d01_2017-10-23_03:00:00
├── wrfout_d01_2017-10-23_04:30:00
└── wrfout_d01_2017-10-23_06:00:00

0 directories, 92 files

As in the other run directories, this represents a complete run environment, including the newly-created namelist.input.  If there are problems, one could play around in this run directory for debugging and testing.


Code listing of wrf_test_driver_autogen_nml.py

#!/usr/bin/env python3


import datetime
import logging
import os
import shutil
import subprocess
import sys
import uuid

import nwpservice.wrf.namelistinput as namelistinput
import nwpservice.wrf.wrf as wrf

# 01 April 2024 - this is a copy of the demo "wrf_test_driver.py",
# modified so that instead of using a previously prepared "namelist.input"
# it demonstrates the use of the newly created
# "nwpservice.wrf.namelistinput" module to create a custom namelist
# based on a template and a section_data_dict containing items to modify
# relative to the template.  That new namelist is then passed into
# the nwpservice.wrf.Wrf class.
#
# Only the testcase "gfs_spain_simple_twonest" has been set up
# in here for the test, and the newly added code is highly
# specific to that one case.


## Uncomment exactly one of the following
SYSTEM_NAME = 'EHRATM_VM'
#SYSTEM_NAME = 'CTBTO_DEVLAN'

#### These values are generally system-specific
if SYSTEM_NAME == 'EHRATM_VM':
    USER_ROOT = '/home/ctbtuser'

    GIT_REPO_DIR = os.path.join(USER_ROOT, 'git/high-res-atm') # Local git repo
    TMP_ROOT_DIR = os.path.join(USER_ROOT, 'tmp')     # Dir for temp files
    OUTPUT_ROOT_DIR = os.path.join(USER_ROOT, 'tmp')  # Dir for output products
    WPSWRF_DISTRO = '/scratch/WRFV4.3-Distribution'   # Dir of WPS/WRF distro

    MPIRUN = '/usr/lib64/openmpi/bin/mpirun'          # mpirun executable

elif SYSTEM_NAME == 'CTBTO_DEVLAN':

    USER_ROOT = '/dvlscratch/ATM/morton'

    GIT_REPO_DIR = os.path.join(USER_ROOT, 'git/high-res-atm') # Local git repo
    TMP_ROOT_DIR = os.path.join(USER_ROOT, 'tmp')     # Dir for temp files
    OUTPUT_ROOT_DIR = os.path.join(USER_ROOT, 'tmp')  # Dir for output products
    WPSWRF_DISTRO = '/scratch/WRFV4.3-Distribution'   # Dir of WPS/WRF distro
    MPIRUN = '/usr/lib64/openmpi/bin/mpirun'          # mpirun executable

else:

    print('SYSTEM_NAME not supported: %s' % SYSTEM_NAME)
    sys.exit(1)

####  ---------------------------------------------------------
####  These values should not vary across systems
####  ---------------------------------------------------------

####  Possible testcases  - Haven't recently tested all of these in each driver
####
#### TESTCASE = 'gfs_spain_simple_twonest'
#### TESTCASE = 'nam_nodak_simple_wps_wrf'
####
###############################################################
TESTCASE = 'gfs_spain_simple_twonest'             # Testcase to use

# Testcase files, dependent on TESTCASE value
TESTCASE_DATA_DIR = os.path.join(GIT_REPO_DIR,
                                'packages/nwpservice/tests/testcase_data',
                                TESTCASE)
TESTCASE_REAL_DATA_DIR = os.path.join(TESTCASE_DATA_DIR, 'WRF/output')

NUM_MPI_TASKS_WRF = 2

# If set to True, the run directories will be retained
NO_CLEANUP = True



class WrfTestDrive(object):

    def __init__(self):

        self._wpswrf_distro = WPSWRF_DISTRO
        self._tmp_root_dir = TMP_ROOT_DIR
        self._output_root_dir = OUTPUT_ROOT_DIR
        self._testcase_data_dir = TESTCASE_DATA_DIR
        self._testcase_real_data_dir = TESTCASE_REAL_DATA_DIR
        self._num_mpi_tasks_wrf = NUM_MPI_TASKS_WRF
        self._mpirun = MPIRUN
        self._cleanup = not NO_CLEANUP


    def setup(self):
        print('setup...')

        # Set up name for a test directory.  Note that the root portion
        # must already exist (this driver does not check for it)
        utc_timestamp = datetime.datetime.utcnow()
        time_str = utc_timestamp.strftime('%Y%m%d.%H%M%S')
        dirname = "wrf_test_" + time_str
        self._testrun_dir = os.path.join(self._tmp_root_dir, dirname)


        # Set up name for an output dir and create it.  Note that the root
        # portion must already exist
        if self._output_root_dir:
            dirname = "wrf_output_" + time_str
            self._output_stage_dir = os.path.join(self._output_root_dir,
                                                    dirname)
            try:
                os.mkdir(self._output_stage_dir, 0o755)
            except OSError as exc:
                # Keep the original error attached for easier debugging
                raise OSError('Failed to make output stage dir: %s' %
                                self._output_stage_dir) from exc
        else:
            self._output_stage_dir = None

        ############### BEGIN namelist.input generation #######################
        # This is the code unique to this demo test, relative to
        # wrf_test_driver.py.  Instead of using the previously
        # created namelist.input, we create a custom namelist.input
        # based on a namelist.input.template and a section_data_dict
        # specific to this "gfs_spain_simple_twonest" test case.
        ## namelist_input = self._testcase_data_dir + '/WRF/namelist.input'


        namelist_input_template = self._testcase_data_dir + \
                                '/WRF/namelist.input.template'
        newnml_data_dict = {
            'time_control' : {
                'run_hours' : 3,
                'start_date_list' : ['20171023030000', '20171023030000'],
                'end_date_list' : ['20171023060000', '20171023060000'],
                'interval_seconds' : 10800,
                'history_interval' : 90
            },
            'domains' : {
                'time_step' : 90
            },
            'physics' : {
                'radt' : [10,10]
            }
        }

        # Put the new namelist in the default tmp dir, with a random string
        # suffix in its name.  This is what will be passed into the wrf
        # processing
        new_namelist_input = os.path.join(TMP_ROOT_DIR,
                            'newnamelist.input_' + str(uuid.uuid4()))
        print('new_namelist_input: %s' % new_namelist_input)

        # Create the custom namelist based on the template and the data dict
        new_namelist_obj = namelistinput.NamelistInputWriter(
            destpath=new_namelist_input,
            section_data_dict=newnml_data_dict,
            template_nml_path=namelist_input_template
        )
        new_namelist_obj.write()

        # Now, new_namelist_input should contain the correct custom namelist
        # and that gets passed into the next call

        ################ END namelist.input generation ########################

        self._wrf_obj = wrf.Wrf(
                            wpswrf_distro_path=self._wpswrf_distro,
                            wpswrf_rundir=self._testrun_dir,
                            namelist_input=new_namelist_input,
                            realdatadir=self._testcase_real_data_dir,
                            output_dir=self._output_stage_dir,
                            numpes=int(self._num_mpi_tasks_wrf),
                            mpirun_path=self._mpirun,
                            log_level=logging.DEBUG
        )


        self._wrf_obj.setup()


    def run(self):


        output_manifest = self._wrf_obj.run()
        print('output_manifest: %s' % output_manifest)

        stage_success = self._wrf_obj.stage_output(
                            auxfiles=True
        )
        print('Output stage success: %r' % stage_success)


    def cleanup(self):
        if self._cleanup:
            print('Removing test run dir: %s' % self._testrun_dir)
            shutil.rmtree(self._testrun_dir)
        else:
            print('Keeping the test run dir: %s' % self._testrun_dir)






def main():
    TestDrive = WrfTestDrive()
    TestDrive.setup()
    TestDrive.run()
    TestDrive.cleanup()



if __name__=="__main__":
    main()