gordias.util.cmip_path.build_cmip_like_filename_template
- gordias.util.cmip_path.build_cmip_like_filename_template(files: list[Path] | list[str], include_time_range: bool = True, time_range_placeholder: bool = False) → str
Form a CMIP/CORDEX style filename template based on the given file names.
A CMIP or CORDEX filename has a well-defined structure that follows the DRS (Data Reference Syntax) rules, so several of the data attributes are included in the filename. It generally starts with a variable name and, unless the file holds a dataset with no associated time dimension, ends with a time range. CORDEX filenames also include the frequency.
This function takes a number of filenames (or file paths) and tries to combine them into a single filename template, suitable for use as the output filename when combining the files or storing the result of an analysis of the data in the files.
For example, the two CORDEX-like filenames
tas_element1_element2_element3_day_19710101-19751231.nc
tas_element1_element2_element3_day_19760101-19801231.nc
can generate the following filename templates, depending on settings:
{var_name}_element1_element2_element3_{frequency}_19710101-19801231.nc
{var_name}_element1_element2_element3_{frequency}_{start}-{end}.nc
{var_name}_element1_element2_element3_{frequency}.nc
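Since the templates use brace placeholders, they can be filled with Python's built-in `str.format`. The following is a minimal sketch that does not call gordias itself: the three templates are copied from the example above, and the values supplied for `var_name`, `frequency`, `start` and `end` are illustrative assumptions.

```python
# Templates copied from the example above; the values below are illustrative only.
templates = [
    "{var_name}_element1_element2_element3_{frequency}_19710101-19801231.nc",
    "{var_name}_element1_element2_element3_{frequency}_{start}-{end}.nc",
    "{var_name}_element1_element2_element3_{frequency}.nc",
]

values = {
    "var_name": "tas",
    "frequency": "day",
    "start": "19710101",
    "end": "19801231",
}

# str.format ignores keyword arguments that a template does not use,
# so the same dictionary can fill all three variants.
filenames = [template.format(**values) for template in templates]
for name in filenames:
    print(name)
```

Passing one dictionary with all four keys avoids having to know in advance which of the three template variants was returned.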
Note that if the input filenames contain folder paths, those will be stripped from the returned filename template.
The frequency placeholder will always be added to the template, even though CMIP-like filenames generally do not include the frequency.
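The two behaviours described so far, dropping folder paths and letting the time range span all input files, can be sketched with the standard library alone. This is an illustration under stated assumptions, not the gordias implementation: `overall_time_range` is a hypothetical helper, the `/data/cordex` folder is made up, and the regular expression assumes a trailing `_<start>-<end>.nc` with equal-length date strings.

```python
import re
from pathlib import Path

# Hypothetical pattern, not part of gordias: a trailing "_<start>-<end>.nc".
TIME_RANGE = re.compile(r"_(\d+)-(\d+)\.nc$")


def overall_time_range(files):
    """Return (start, end) spanning all files, or None if a range cannot be parsed."""
    starts, ends = [], []
    for f in files:
        name = Path(f).name  # keep only the filename, dropping any folders
        match = TIME_RANGE.search(name)
        if match is None:
            return None
        starts.append(match.group(1))
        ends.append(match.group(2))
    if not starts:
        return None
    # Assumption: equal-length date strings (e.g. YYYYMMDD) sort chronologically as text.
    return min(starts), max(ends)


files = [
    "/data/cordex/tas_element1_element2_element3_day_19710101-19751231.nc",
    "/data/cordex/tas_element1_element2_element3_day_19760101-19801231.nc",
]
print(Path(files[0]).name)
print(overall_time_range(files))
```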
If the input filenames represent both historical and scenario data, the corresponding attributes will be combined with a hyphen in the template, e.g. historical and rcp45 will be combined into historical-rcp45 in the filename template.
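The hyphen-joining rule can be pictured with a small sketch. `combine_elements` is a hypothetical helper written for illustration only; how gordias decides which filename elements to combine, and in what order, is not stated here, so the first-seen ordering below is an assumption.

```python
def combine_elements(values):
    """Join the distinct values of one filename element with hyphens."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    distinct = list(dict.fromkeys(values))
    return "-".join(distinct)


# Element values taken from two input filenames, one historical and one scenario.
combined = combine_elements(["historical", "rcp45"])
print(combined)

# When all files share the same value, nothing is joined.
unchanged = combine_elements(["historical", "historical"])
print(unchanged)
```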
If the function fails to create an unambiguous filename template, the fallback template {var_name}_{frequency}.nc will be returned.
- Parameters:
files (list[pathlib.Path] | list[str]) – Input file names. Note: no wildcard expansion is run.
include_time_range (bool, optional) – If True, the time range is included in the template. The time period in the template always covers the whole period spanned by all files, provided that the time periods can be parsed from the input filenames. Default True.
time_range_placeholder (bool, optional) – If True, a time range placeholder will be added instead of extracting the time range from the input files, i.e. the template will end with _{start}-{end}.nc. Default False.
- Returns:
filename_template – A format string suitable for use in the construction of an output filename.
- Return type: