Pre-processing of CMIP5 data available at ESGF

Introduction

The intention of this document is to describe the procedure that we use to perform WRF simulations using CMIP5 data as initial and boundary conditions.

This is only a working document that we decided to share with the community. It describes our own experience, so some of the comments may not be correct or may not be the best way to achieve the corresponding objectives. Moreover, the recipe has only been tested with a particular set of CMIP5 data, corresponding to the MIROC5 model for the historical experiment, and with some WRF parameterizations.

The CMIP5 data have been obtained from the Earth System Grid Federation (ESGF) portal (http://pcmdi9.llnl.gov/). The availability of the different variables that have been used for WRF ingest is summarised in Appendix A.

Tools

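We have tried to use the minimum set of tools to prepare CMIP5 data to be ingested in WRF. All of them are freely available and are commonly used by the climate community. The commands shown in this document rely on CDO (Climate Data Operators), ncdump (from the netCDF utilities) and grib_set (from the ECMWF GRIB API).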

Step by step procedure

1. Creating the Vtable

We have created a new Vtable based on some of those provided with the WRF distribution:

GRIB1| Level| From |  To  | metgrid  |  metgrid | metgrid                                  |
Param| Type |Level1|Level2| Name     |  Units   | Description                              |
-----+------+------+------+----------+----------+------------------------------------------+
  11 | 100  |   *  |      | TT       | K        | Temperature                              |
  33 | 100  |   *  |      | UU       | m s-1    | U                                        |
  34 | 100  |   *  |      | VV       | m s-1    | V                                        |
  52 | 100  |   *  |      | SPECHUMD | kg kg-1  |                                          |
     | 100  |   *  |      | RH       | %        | Relative Humidity                        |
  11 | 105  |   2  |      | TT       | K        | Temperature                              | At 2 m
  52 | 105  |   2  |      | SPECHUMD | kg kg-1  |                                          | At 2 m
     | 105  |   2  |      | RH       | %        | Relative Humidity at 2 m                 | At 2 m
  33 | 105  |  10  |      | UU       | m s-1    | U                                        | At 10 m
  34 | 105  |  10  |      | VV       | m s-1    | V                                        | At 10 m
  81 |  1   |   0  |      | LANDSEA  |          | Land/Sea flag                            | 
   1 |  1   |   0  |      | PSFC     | Pa       | Surface Pressure                         |
   2 | 102  |   0  |      | PMSL     | Pa       | Sea-level Pressure                       |
  11 |  1   |   0  |      | SKINTEMP | K        | Skin Temperature (and SST)               | 
 144 | 112  |   0  |   5  | SM000010 | fraction | Soil Moist 0-5 cm below grn layer (Up)   |
 144 | 112  |   5  |  25  | SM010040 | fraction | Soil Moist 5-25 cm below grn layer       | 
 144 | 112  |  25  | 100  | SM040100 | fraction | Soil Moist 25-100 cm below grn layer     |
 144 | 112  | 100  | 200  | SM100200 | fraction | Soil Moist 100-200 cm below gr layer     |
  11 | 112  |   0  |   5  | ST000010 | K        | T 0-5 cm below ground layer (Upper)      |
  11 | 112  |   5  |  25  | ST010040 | K        | T 5-25 cm below ground layer (Upper)     |
  11 | 112  |  25  | 100  | ST040100 | K        | T 25-100 cm below ground layer (Upper)   |
  11 | 112  | 100  | 200  | ST100200 | K        | T 100-200 cm below ground layer (Bottom) |
-----+------+------+------+----------+----------+------------------------------------------+

Important comment: Please note that the names of the soil variables do not correspond to the depths indicated in the last column of the Vtable. To simplify the process, we preferred to keep the variable names as they appear in the Registry and in METGRID.TBL. However, in METGRID.TBL the soil layer depths have been modified to match the CMIP5 input data (in our case, for the MIROC5 model: 0-5, 5-25, 25-100 and 100-200 cm).

2. Transforming hybrid levels to pressure levels

Although WRF can handle input data provided in hybrid levels, we found it easier to interpolate the original CMIP5 hybrid-level data to pressure levels.

CDO cannot understand the hybrid-level format as provided in the ESGF database, so it is necessary to supply the z-axis information manually in the right format.

The following CDO commands have to be applied to the following variables: ta, hus, ua, va

2.1. Invert the height levels, from bottom-top to top-bottom. E.g.:

# cdo invertlev ta_6hrLev_MIROC5_historical_r1i1p1_2005010100-2005013118.nc temp1.nc

2.2. Obtain the hybrid levels coefficients.

In the CMIP5 original data, to compute the pressure of each 3D-grid point the following common formula is used:

pres(lat, lon, lev) = a(lev)*p0 + b(lev)*ps(lat, lon)

where the coefficients a and b are provided.

However, CDO assumes the following formula:

pres(lat, lon, lev) = a0(lev) + b(lev)*ps(lat, lon)

where a0(lev) = a(lev)*p0. Therefore, we have to obtain the reference pressure. Maybe the easiest way is using the ncdump command, e.g.:

# ncdump -v p0 ta_6hrLev_MIROC5_historical_r1i1p1_2005010100-2005013118.nc | grep "p0 ="
p0 = 100000 ;
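
As a worked check that both conventions give the same pressure, the two formulas can be evaluated side by side. The following Python sketch is only illustrative: the function names and the surface pressure of 101325 Pa are assumptions, while the coefficient pairs are copied from the a_bnds and b_bnds listings of this step.

```python
P0 = 100000.0  # reference pressure (Pa), as reported by ncdump above


def pressure_cmip5(a, b, ps, p0=P0):
    """CMIP5 convention: pres = a*p0 + b*ps."""
    return a * p0 + b * ps


def pressure_cdo(a0, b, ps):
    """CDO convention: pres = a0 + b*ps, where a0 = a*p0 is already in Pa."""
    return a0 + b * ps


# (a0 in Pa, b) pairs taken from the MIROC5 listings in this step.
INTERFACES = [(0.0, 0.0), (30785.001, 0.0), (20613.031, 0.330), (0.0, 1.0)]
PS = 101325.0  # assumed surface pressure (Pa), for illustration only

for a0, b in INTERFACES:
    a = a0 / P0  # dimensionless coefficient, as stored in the CMIP5 file
    print(f"a0={a0:10.3f} b={b:5.3f} "
          f"cmip5={pressure_cmip5(a, b, PS):10.2f} Pa "
          f"cdo={pressure_cdo(a0, b, PS):10.2f} Pa")
```

Where b = 1 and a0 = 0 (the last interface) the pressure equals the surface pressure; where b = 0 it does not depend on the surface pressure at all.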

The a0 (a multiplied by p0) and b coefficients can be obtained using:

# cdo outputf,%10.3f,2 -mulc,100000.0 -selname,a_bnds temp1.nc
   700.000     0.000
  1452.600   700.000
  2292.700  1452.600
  3038.400  2292.700
  3691.700  3038.400
  4346.200  3691.700
  5047.500  4346.200
  5814.500  5047.500
  6680.900  5814.500
  7676.299  6680.900
  8820.000  7676.299
 10134.000  8820.000
 11643.999 10134.000
 13379.001 11643.999
 15372.000 13379.001
 17663.000 15372.000
 20295.000 17663.000
 23318.001 20295.000
 26792.999 23318.001
 30785.001 26792.999
 28744.824 30785.001
 26400.867 28744.824
 23707.318 26400.867
 20613.031 23707.318
 17212.737 20613.031
 14544.096 17212.737
 12231.271 14544.096
 10274.269 12231.271
  8673.083 10274.269
  7338.764  8673.083
  6093.399  7338.764
  4936.986  6093.399
  3914.007  4936.986
  2979.981  3914.007
  2179.391  2979.981
  1512.229  2179.391
   978.502  1512.229
   533.729   978.502
   222.387   533.729
     0.000   222.387
# cdo outputf,%10.3f,2  -selname,b_bnds temp1.nc
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.000     0.000
     0.066     0.000
     0.142     0.066
     0.230     0.142
     0.330     0.230
     0.441     0.330
     0.528     0.441
     0.603     0.528
     0.666     0.603
     0.718     0.666
     0.762     0.718
     0.802     0.762
     0.840     0.802
     0.873     0.840
     0.903     0.873
     0.929     0.903
     0.951     0.929
     0.968     0.951
     0.983     0.968
     0.993     0.983
     1.000     0.993

The output of these commands can be used to create a Z-axis description file, as required by CDO. Further information can be obtained from CDO documentation (https://code.zmaw.de/embedded/cdo/1.5.5/cdo.html#x1-220001.4). In our case, the file (myzaxisinvert.dat) has the following content:

zaxistype = hybrid 
size      = 40
levels    = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 
vctsize   = 82
vct       = 0.000 700.000 1452.600 2292.700 3038.400 3691.700 4346.200 5047.500
            5814.500 6680.900 7676.299 8820.000 10134.000 11643.999 13379.001 15372.000
            17663.000 20295.000 23318.001 26792.999 30785.001 28744.824 26400.867 23707.318
            20613.031 17212.737 14544.096 12231.271 10274.269 8673.083 7338.764 6093.399
            4936.986 3914.007 2979.981 2179.391 1512.229 978.502 533.729 222.387 
            0.0
            0.0
            0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
            0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.066 0.142 0.230 0.330 0.441 0.528   
            0.603 0.666 0.718 0.762 0.802 0.840 0.873 0.903 0.929 0.951 0.968 0.983 0.993    
            1.000    

This file can be used for all the hybrid-levels variables.
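
Writing the vct array by hand is error-prone, so the file can also be generated from the two columns printed by cdo outputf. The Python sketch below is one possible way to do it; the function names are hypothetical and the three-level input is made up. It assumes, as in the file above, that the vct array holds the n+1 interface values of a0 followed by the n+1 interface values of b.

```python
def interfaces_from_bounds(bounds):
    """Return the n+1 interface values, ordered as in the vct array above.

    `bounds` holds one (first column, second column) pair per level, in the
    order printed by `cdo outputf,%10.3f,2` for the inverted file.
    """
    return [pair[1] for pair in bounds] + [bounds[-1][0]]


def wrap(values, per_line=8, indent=" " * 12):
    """Join formatted values, breaking the line every `per_line` items."""
    chunks = [" ".join(values[i:i + per_line])
              for i in range(0, len(values), per_line)]
    return ("\n" + indent).join(chunks)


def zaxis_description(a0_bounds, b_bounds):
    """Build the text of a hybrid z-axis description file."""
    n = len(a0_bounds)
    vct = interfaces_from_bounds(a0_bounds) + interfaces_from_bounds(b_bounds)
    return "\n".join([
        "zaxistype = hybrid",
        f"size      = {n}",
        "levels    = " + " ".join(str(i) for i in range(1, n + 1)),
        f"vctsize   = {len(vct)}",
        "vct       = " + wrap([f"{v:.3f}" for v in vct]),
    ]) + "\n"


# Toy 3-level example (made-up numbers, not MIROC5 values).
A0_BOUNDS = [(700.0, 0.0), (1500.0, 700.0), (0.0, 1500.0)]
B_BOUNDS = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.5)]
print(zaxis_description(A0_BOUNDS, B_BOUNDS))
```

For the 40 MIROC5 levels the same function gives vctsize = 82, as in the file above.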

2.3 Set the z-axis description in the netCDF file

# cdo setzaxis,myzaxisinvert.dat temp1.nc temp2.nc

2.4 Remove unused variables

Some variables included in the original files have a different size and cannot be handled by the cdo conversion to pressure levels, which aborts with the error message "cdo ml2pl (Abort): Grids have different size!". To remove them:

# cdo delname,a_bnds,b_bnds,ps temp2.nc temp3.nc

2.5 Interpolate to pressure levels

To avoid missing values at grid points where a requested pressure level lies below the ground (its pressure is higher than the surface pressure), we had to use the extrapolation capabilities of cdo. This requires declaring an environment variable in your shell. For example, for bash:

# export EXTRAPOLATE=1

And, finally, the data are interpolated to pressure levels using cdo, e.g.:

# cdo ml2pl,100000,97500,95000,92500,90000,87500,85000,82500,80000,77500,75000,70000,65000,60000,55000,50000,45000,40000,35000,30000,25000,22500,20000,17500,15000,12500,10000,7000,5000,3000 temp3.nc ta_Plev.nc
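
Since the same sequence has to be repeated for ta, hus, ua and va, it may help to generate the commands of steps 2.1 and 2.3 to 2.5 from a script. The Python sketch below only builds and prints the command lines, reproduced as written above; the function name is hypothetical, and the commented subprocess call is an assumption about how one might run them where CDO is installed.

```python
import os
import shlex

# Target pressure levels (Pa), copied from the ml2pl command above.
PLEVS = [100000, 97500, 95000, 92500, 90000, 87500, 85000, 82500, 80000,
         77500, 75000, 70000, 65000, 60000, 55000, 50000, 45000, 40000,
         35000, 30000, 25000, 22500, 20000, 17500, 15000, 12500, 10000,
         7000, 5000, 3000]


def hybrid_to_plev_commands(infile, outfile, zaxis_file="myzaxisinvert.dat"):
    """Return the cdo calls of steps 2.1, 2.3, 2.4 and 2.5 as argument lists."""
    levels = ",".join(str(p) for p in PLEVS)
    return [
        ["cdo", "invertlev", infile, "temp1.nc"],
        ["cdo", f"setzaxis,{zaxis_file}", "temp1.nc", "temp2.nc"],
        ["cdo", "delname,a_bnds,b_bnds,ps", "temp2.nc", "temp3.nc"],
        ["cdo", f"ml2pl,{levels}", "temp3.nc", outfile],
    ]


# Environment for the calls: same effect as `export EXTRAPOLATE=1` in bash.
ENV = dict(os.environ, EXTRAPOLATE="1")

for cmd in hybrid_to_plev_commands(
        "ta_6hrLev_MIROC5_historical_r1i1p1_2005010100-2005013118.nc",
        "ta_Plev.nc"):
    print(shlex.join(cmd))
    # To actually run it where CDO is installed (needs `import subprocess`):
    # subprocess.run(cmd, check=True, env=ENV)
```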

3. Creating grib files for pressure-levels variables

It is necessary to create the grib files from the pressure-level files obtained in the previous step. To do so, the grib code and the level type must be correctly assigned to each variable, following the Vtable definition. For example, for U:

# cdo delname,ps ua_Plev.nc temp1.nc
# cdo -f grb setltype,100 -chparam,-1,33 temp1.nc ua_Plev.grb
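
The codes for the other pressure-level variables can be read off the Vtable of step 1 in the same way. The Python sketch below collects them in a small lookup table and prints the corresponding cdo command for each variable; the dictionary and function names are hypothetical, and the pairing of CMIP5 names with Vtable rows (ta with TT, ua with UU, va with VV, hus with SPECHUMD) is our reading of the tables in this document.

```python
# CMIP5 variable -> (GRIB1 parameter, level type), read off the Vtable of step 1.
PLEV_VARS = {
    "ta": (11, 100),   # TT
    "ua": (33, 100),   # UU
    "va": (34, 100),   # VV
    "hus": (52, 100),  # SPECHUMD
}


def grib_command(var, infile, outfile):
    """Build the cdo call that writes `var` as GRIB with the Vtable codes."""
    param, ltype = PLEV_VARS[var]
    return f"cdo -f grb setltype,{ltype} -chparam,-1,{param} {infile} {outfile}"


for var in PLEV_VARS:
    print(grib_command(var, "temp1.nc", f"{var}_Plev.grb"))
```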

4. Surface and mean sea level pressures

The surface pressure is contained in the files of the previous variables (ta, hus, va, ua) and can be obtained from any of them, e.g.:

# cdo -f grb setltype,1 -chparam,-1,1 -selname,ps ta_6hrLev_MIROC5_historical_r1i1p1_2005010100-2005013118.nc ps.grb

The mean sea level pressure must be obtained from its own file, e.g.:

# cdo -f grb setltype,102 -chparam,-1,2 psl_6hrPlev_MIROC5_historical_r1i1p1_2005010100-2005123118.nc psl.grb

5. Near surface variables

In a similar way, the near surface variables (tas, huss, uas, vas) are converted to grib format. In this case, however, the level at which the variable is defined must be specified. This can be done with a z-axis description file such as the following (myzaxis2m.dat):

zaxistype = height
size      = 1
levels    = 2

Then, for example, the following cdo command can be used:

# cdo -f grb setzaxis,myzaxis2m.dat -setltype,105 -chparam,-1,52 huss_3hr_MIROC5_historical_r1i1p1_200501010000-200512312100.nc thuss.grb
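
According to the Vtable, temperature and humidity are defined at 2 m while the wind components are defined at 10 m, so a second description file is needed for uas and vas. The Python sketch below generates the text of such files and lists which one each variable needs; the names are hypothetical, and myzaxis10m.dat is only a file name chosen by analogy with myzaxis2m.dat.

```python
# CMIP5 near-surface variable -> (GRIB1 parameter, height in m), from the Vtable.
NEAR_SURFACE = {"tas": (11, 2), "huss": (52, 2), "uas": (33, 10), "vas": (34, 10)}


def height_zaxis(height_m):
    """Text of a single-level height z-axis description file."""
    return f"zaxistype = height\nsize      = 1\nlevels    = {height_m}\n"


for var, (param, height) in NEAR_SURFACE.items():
    zaxis_file = f"myzaxis{height}m.dat"  # hypothetical naming scheme
    print(f"{var}: GRIB parameter {param}, level type 105, {zaxis_file}")

print(height_zaxis(10))
```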

6. Skin temperature

We decided to combine two files to obtain the skin temperature: one of them provides the sea surface temperature and the other one the skin temperature over land, e.g.:

# cdo min tslsi_3hr_MIROC5_historical_r1i1p1_200501010000-200512312100.nc tso_3hr_MIROC5_historical_r1i1p1_200501010000-200512312100.nc my_skin.nc
# cdo -f grb -setltype,1 -chparam,-1,11 my_skin.nc my_skin.grb

The minimum of the two datasets is computed for each time and grid point because the missing values are larger than any temperature, so the minimum always keeps the valid value.
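
The reasoning can be illustrated with a few made-up numbers. In the Python sketch below the missing-value marker of 1.0e20 and the temperatures are assumptions chosen for illustration; the argument only holds if the missing values of your files are treated as ordinary large numbers, which should be checked on the actual data.

```python
FILL = 1.0e20  # assumed missing-value marker; check the _FillValue of your files

# Toy values at four grid points (K).
tslsi = [FILL, 285.2, FILL, 270.1]  # defined over land or sea ice only
tso = [290.4, FILL, 288.7, FILL]    # defined over ocean only

# Point-wise minimum: the fill value always loses against a real temperature.
skin = [min(land, sea) for land, sea in zip(tslsi, tso)]
print(skin)
```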

7. Soil temperature and moisture

We found it easier to split the soil levels and recombine only those that we need. For example, to split the soil temperature file:

# cdo splitlevel tsl_Lmon_MIROC5_historical_r1i1p1_185001-201212.nc soilt
# ls soilt*
soilt000003.nc  soilt000009.nc  soilt000.15.nc  soilt0001.5.nc  soilt00.025.nc  soilt00.625.nc
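
To identify which file holds which layer, note that the names in the listing above are consistent with the mid-depth of each soil layer, in metres, written with a zero-padded %06g format. This is inferred from the listing only, not taken from the CDO documentation, so it should be checked against your own output. Under that assumption, the Python sketch below (with hypothetical names) gives the file expected for each of the four layers used in the Vtable.

```python
# Soil layers used in the Vtable: (top, bottom) in cm.
LAYERS_CM = [(0, 5), (5, 25), (25, 100), (100, 200)]


def split_file_name(top_cm, bottom_cm, prefix="soilt"):
    """Expected splitlevel file name, assuming levels are mid-depths in metres."""
    mid_m = (top_cm + bottom_cm) / 2 / 100
    return f"{prefix}{mid_m:06g}.nc"


for top, bottom in LAYERS_CM:
    print(f"{top}-{bottom} cm -> {split_file_name(top, bottom)}")
```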

Then, we can process each of these files using the corresponding z-axis file and the grib codes and level types (from Vtable):

# cdo -f grb setzaxis,myzaxis0_5.dat -setltype,112 -chparam,-1,11 soilt00.025.nc soilt01.grb
# cat myzaxis0_5.dat
zaxistype = depth_below_land 
size      = 1
name      = depth
longname  = depth_below_land
units     = cm
lbounds    = 0   
ubounds    = 5 

Make sure that the units of these variables match the inputs expected by WRF (defined in the Registry). For example, to convert the MIROC5 water content (kg m-2) to the WRF soil moisture fraction (m3 m-3):

# cdo mulc,0.02 soilmois01.grb soilmois01_frac.grb
# cdo mulc,0.005 soilmois02.grb soilmois02_frac.grb
# ...
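
The factors used above follow from dividing the water content (kg m-2) by the density of water and by the layer thickness. The Python sketch below reproduces them for the four layers of the Vtable; the function name is hypothetical and a water density of 1000 kg m-3 is assumed.

```python
RHO_WATER = 1000.0  # kg m-3, assumed density of liquid water

# Soil layers used in the Vtable: (top, bottom) in cm.
LAYERS_CM = [(0, 5), (5, 25), (25, 100), (100, 200)]


def moisture_factor(top_cm, bottom_cm):
    """Factor converting kg m-2 to m3 m-3 for a layer given in cm.

    fraction = content / (rho * thickness_m), with thickness_m = thickness_cm / 100.
    """
    thickness_cm = bottom_cm - top_cm
    return 100.0 / (RHO_WATER * thickness_cm)


for top, bottom in LAYERS_CM:
    print(f"{top}-{bottom} cm: mulc,{moisture_factor(top, bottom):.6f}")
```

The first two layers give 0.02 and 0.005, the values used in the cdo mulc commands above.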

8. Final remarks

To facilitate ingestion by the WRF Preprocessing System (WPS), we also suggest merging all the GRIB files into a single file and, if needed, adjusting GRIB header keys with grib_set:

# cdo merge soilmois*_frac* soilt0* soillev.grb
# ...
# cdo merge atmos.grb surface.grb near_surface.grb my_skin.grb soillev.grb miroc_input.grb

# grib_set -s iDirectionIncrement=1406 miroc_input.grb miroc_input_fixed.grb 

Appendix A.

The following table shows the CMIP5 models and the variables available as of October 2012 (only those that we used for WRF ingest). Note that, although some variables are missing for some models, they may be available at another time frequency or may be replaced by a similar variable.

(yes = available, no = not available)

Frequency & type | 6hr, hybrid levels                    | 3hr, near surface             | 3hr, surface  | Mon, soil     | fixed, mask |
MODEL\VARIABLE   | ta    | hus   | ua    | va    | psl   | tas   | huss  | uas   | vas   | tso   | tslsi | mrlsl | tsl   | sftlf       |
-----------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------------+
ACCESS1.0        | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no    | no    | yes   | yes   | yes         |
ACCESS1.3        | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no    | no    | yes   | yes   | yes         |
BCC-CSM1.1       | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes         |
BCC-CSM1.1(m)    | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no          |
BNU-ESM          | yes   | yes   | yes   | yes   | no    | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no          |
CCSM4            | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no    | no    | no    | no    | yes   | yes   | yes         |
CMCC-CM          | yes   | yes   | yes   | yes   | yes   | yes   | no    | yes   | yes   | yes   | yes   | yes   | yes   | no          |
CNRM-CM5         | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no    | yes   | yes   | yes   | yes         |
CSIRO-Mk3.6.0    | yes   | yes   | yes   | yes   | yes   | no    | no    | no    | no    | no    | no    | no    | no    | yes         |
CanESM2          | yes   | yes   | yes   | yes   | yes   | no    | no    | no    | no    | no    | no    | yes   | yes   | yes         |
EC-EARTH         | yes   | no    | yes   | yes   | yes   | yes   | no    | yes   | yes   | no    | no    | no    | no    | no          |
FGOALS-g2        | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no    | yes   | yes   | yes   | yes         |
GFDL-CM3         | yes   | yes   | yes   | yes   | no    | yes   | yes   | yes   | yes   | no    | no    | no    | yes   | yes         |
GFDL-ESM2G       | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no    | no    | yes   | yes   | yes         |
GFDL-ESM2M       | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no    | no    | yes   | yes   | yes         |
HadGEM2-CC       | yes   | no    | yes   | yes   | yes   | yes   | no    | no    | no    | no    | no    | yes   | yes   | yes         |
HadGEM2-ES       | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no    | no    | yes   | yes   | yes         |
INM-CM4          | yes   | no    | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no    | no    | yes   | yes   | yes         |
IPSL-CM5A-LR     | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no    | no    | no    | yes   | yes         |
IPSL-CM5A-MR     | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no    | no    | no    | yes   | yes         |
IPSL-CM5B-LR     | yes   | no    | yes   | yes   | yes   | no    | no    | no    | no    | no    | no    | no    | yes   | yes         |
MIROC-ESM        | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes         |
MIROC-ESM-CHEM   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes         |
MIROC4h          | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no    | yes         |
MIROC5           | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes         |
MPI-ESM-LR       | yes   | no    | yes   | yes   | yes   | no    | no    | no    | no    | no    | no    | no    | yes   | yes         |
MPI-ESM-MR       | yes   | no    | yes   | yes   | yes   | no    | no    | no    | no    | no    | no    | no    | yes   | yes         |
MRI-CGCM3        | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no    | yes   | yes         |
NorESM1-M        | yes   | yes   | yes   | yes   | yes   | yes   | yes   | no    | no    | no    | yes   | yes   | yes   | yes         |
-----------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------------+

Variable names:

Variable | Variable long name                        |
---------+-------------------------------------------+
ta       | Air Temperature                           |
hus      | Specific Humidity                         |
ua       | Eastward Wind                             |
va       | Northward Wind                            |
psl      | Sea Level Pressure                        |
tas      | Air Temperature (near surface)            |
huss     | Near-Surface Specific Humidity            |
uas      | Eastward Near-Surface Wind                |
vas      | Northward Near-Surface Wind               |
tso      | Sea Surface Temperature                   |
tslsi    | Surface Temperature Where Land or Sea Ice |
mrlsl    | Water Content of Soil Layer               |
tsl      | Temperature of Soil                       |
sftlf    | Land Area Fraction                        |
---------+-------------------------------------------+