Python API
The ds-format Python package provides an API for reading, writing and manipulating data files. The library can be imported with:
import ds_format as ds
Contents
| Function | Description | 
|---|---|
| attr | Get or set a dataset or variable attribute. | 
| attrs | Get variable or dataset attributes. | 
| dim | Get a dimension size. | 
| dims | Get dataset or variable dimensions or set variable dimensions. | 
| find | Find a variable, dimension or attribute matching a glob pattern in a dataset. | 
| findall | Find variables, dimensions or attributes matching a glob pattern in a dataset. | 
| group_by | Group values along a dimension. | 
| merge | Merge datasets along a dimension. | 
| meta | Get or set dataset or variable metadata. | 
| read | Read dataset from a file. | 
| readdir | Read multiple files in a directory. | 
| rename | Rename a variable. | 
| rename_attr | Rename a dataset or variable attribute. | 
| rename_dim | Rename a dimension. | 
| require | Require that a variable, dimension or attribute is defined in a dataset. | 
| rm | Remove a variable. | 
| rm_attr | Remove a dataset or variable attribute. | 
| select | Filter dataset by a selector. | 
| size | Get variable size. | 
| type | Get or set variable type. | 
| var | Get or set variable data. | 
| vars | Get all variable names in a dataset. | 
| with_mode | Context manager which temporarily changes ds.mode. | 
| write | Write dataset to a file. | 
Constants
ds.drivers.netcdf.JD_UNITS
days since -4713-11-24 12:00 UTC
NetCDF units for storing Julian date time variables.
ds.drivers.netcdf.JD_CALENDAR
proleptic_gregorian
NetCDF calendar for storing Julian date time variables.
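As a rough illustration of what these units mean, the sketch below converts a UTC datetime to a Julian date using only the standard library. It relies on the fact that the Unix epoch (1970-01-01 00:00 UTC) corresponds to Julian date 2440587.5. The helper name `to_julian_date` is hypothetical and not part of ds-format.

```python
from datetime import datetime, timezone

# Julian date of the Unix epoch (1970-01-01 00:00 UTC).
UNIX_EPOCH_JD = 2440587.5
SECONDS_PER_DAY = 86400

def to_julian_date(t):
    """Convert a timezone-aware datetime to a Julian date (hypothetical helper)."""
    return t.timestamp() / SECONDS_PER_DAY + UNIX_EPOCH_JD

# J2000.0 epoch: 2000-01-01 12:00 UTC.
jd = to_julian_date(datetime(2000, 1, 1, 12, tzinfo=timezone.utc))
print(jd)  # 2451545.0
```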
Variables
mode
Error handling mode. If “strict”, handle missing variables, dimensions and
attributes as errors. If “moderate”, report a warning. If “soft”, ignore
missing items. Overrides the environment variable DS_MODE.
Examples:
Set error handling mode to strict.
ds.mode = 'strict'
Environment variables
DS_MODE
The same as mode.
Functions
attr
Get or set a dataset or variable attribute.
Usage: attr(d, attr, *value, var=None)
Arguments:
- d: Dataset (dict).
- attr: Attribute name (str).
- value: Attribute value. If supplied, set the attribute value, otherwise get the attribute value.
Options:
- var: Variable name (str) to get or set a variable attribute, or None to get or set a dataset attribute.
Return value:
Attribute value if value is not set, otherwise None.
attrs
Get variable or dataset attributes.
Usage: attrs(d, var=None, *value)
Arguments:
- d: Dataset (dict).
- value: Attributes to set (dict). If supplied, set attributes to value, otherwise get attributes.
Options:
- var: Variable name (str) or None to get dataset attributes.
Return value:
Attributes (dict).
dim
Get a dimension size.
Usage: dim(d, dim, full=None)
Arguments:
- d: Dataset (dict).
- dim: Dimension name (str).
Options:
- full: Return dimension size also for a dimension for which no variable data are defined, i.e. it is only defined in dataset metadata.
Return value:
Dimension size or 0 if the dimension does not exist (int).
dims
Aliases: get_dims
Get dataset or variable dimensions or set variable dimensions.
Usage:
dims(d, var=None, *value, full=False, size=False)
get_dims(d, var=None, full=False, size=False)
The function get_dims (deprecated) is the same as dims, but assumes that size is True if var is None and does not allow setting of dimensions.
Arguments:
- d: Dataset (dict).
- value: A list of dimensions (list of str) or None. If supplied, set variable dimensions, otherwise get dataset or variable dimensions. If None, remove variable dimensions (will be set to autogenerated names on write). If supplied, var must not be None.
Options:
- var: Variable name (str) to get dimensions for, or None to get dataset dimensions.
- full: Get variable dimensions even if the variable is only defined in the metadata (bool).
- size: Return a dictionary containing dimension sizes instead of a list.
Return value:
If size is False, a list of dataset or variable dimension names (list of str). If size is True, a dictionary of dataset or variable dimension names and sizes (dict), where a key is a dimension name (str) and the value is the dimension size (int). The order of keys in the dictionary is not guaranteed. Dataset dimensions are the dimensions of all variables together.
find
Find a variable, dimension or attribute matching a glob pattern in a dataset.
Usage: find(d, what, name, var=None)
If more than one name matches the pattern, raises ValueError.
Arguments:
- d: Dataset (dict).
- what: Type of item to find (str). One of: “var” (variable), “dim” (dimension), “attr” (attribute).
- name: Glob pattern matching a variable, dimension or attribute name (str).
Options:
- var: Variable name (str) or None. Applies only if what is “attr”. If not None, name is a variable attribute name, otherwise it is a dataset attribute name.
Return value:
A variable, dimension or attribute name matching the pattern, or name if no matching name is found (str).
findall
Find variables, dimensions or attributes matching a glob pattern in a dataset.
Usage: findall(d, what, name, var=None)
Arguments:
- d: Dataset (dict).
- what: Type of item to find (str). One of: “var” (variable), “dim” (dimension), “attr” (attribute).
- name: Glob pattern matching a variable, dimension or attribute name (str).
Options:
- var: Variable name (str) or None. Applies only if what is “attr”. If not None, name is a variable attribute name, otherwise it is a dataset attribute name.
Return value:
A list of variables, dimensions or attributes matching the pattern, or [name] if no matching names are found (list of str).
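The matching rules of find and findall can be sketched over a plain list of names. This is a minimal illustration, assuming glob matching behaves like the standard `fnmatch` module; the helpers `findall_names` and `find_name` are hypothetical and are not the library's implementation.

```python
import fnmatch

def findall_names(names, pattern):
    """Return names matching a glob pattern, or [pattern] if none match."""
    matches = [x for x in names if fnmatch.fnmatchcase(x, pattern)]
    return matches if matches else [pattern]

def find_name(names, pattern):
    """Return the single matching name; raise ValueError if ambiguous."""
    matches = findall_names(names, pattern)
    if len(matches) > 1:
        raise ValueError('more than one name matches "%s"' % pattern)
    return matches[0]

names = ['time', 'temperature', 'pressure']
print(findall_names(names, 't*'))    # ['time', 'temperature']
print(find_name(names, 'press*'))    # pressure
print(find_name(names, 'humidity'))  # humidity (no match, the pattern is returned)
```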
group_by
Group values along a dimension.
Usage: group_by(d, dim, group, func)
Each variable with a given dimension dim is split by group into subsets. Each subset is replaced with a value computed by func.
Arguments:
- d: Dataset (dict).
- dim: Dimension to group along (str).
- group: Groups (ndarray or list). Array of the same length as the dimension.
- func: Group function (function). func(y, axis=i) is called for each subset y, where i is the index of the dimension.
Return value:
None
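The grouping step can be illustrated on a single one-dimensional variable held in a plain list. This is a simplified sketch, not the library's code: the helper `group_values` is hypothetical, it ignores the `axis` argument, and ordering the output by sorted group label is an assumption.

```python
from statistics import mean

def group_values(values, group, func):
    """Split values by group label and reduce each subset with func."""
    labels = sorted(set(group))  # assumed output order: sorted unique labels
    return [
        func([v for v, g in zip(values, group) if g == label])
        for label in labels
    ]

temperature = [10.0, 12.0, 20.0, 22.0, 30.0]
day = [1, 1, 2, 2, 3]  # one group label per element along the dimension
print(group_values(temperature, day, mean))  # [11.0, 21.0, 30.0]
```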
merge
Merge datasets along a dimension.
Usage: merge(dd, dim, new=None, variables=None)
Merge datasets along a dimension dim. If the dimension is not defined in the dataset, merge along a new dimension dim. If new is None and dim is not new, variables without the dimension are set with the first occurrence of the variable. If new is not None and dim is not new, variables without the dimension dim are merged along a new dimension new. If variables is not None, only those variables are merged along a new dimension and other variables are set to the first occurrence of the variable.
Arguments:
- dd: Datasets (list).
- dim: Name of a dimension to merge along (str).
Options:
- new: Name of a new dimension (str) or None.
- variables: Variables to merge along a new dimension (list) or None for all variables.
Return value:
A dataset (dict).
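The simplest case described above (new is None and dim already exists) can be sketched with plain dictionaries of lists. This is only an illustration under stated assumptions: the helper `merge_1d` and the separate `dims` mapping are hypothetical, variables are one-dimensional, and real datasets carry their dimensions in metadata instead.

```python
def merge_1d(datasets, dims, dim):
    """Concatenate 1-D variables that have dimension dim; keep the first
    occurrence of every other variable."""
    out = {}
    for d in datasets:
        for name, data in d.items():
            if dims.get(name) == [dim]:
                # Variable has the merge dimension: append its values.
                out.setdefault(name, []).extend(data)
            else:
                # Variable lacks the dimension: first occurrence wins.
                out.setdefault(name, data)
    return out

dims = {'time': ['time'], 'temperature': ['time'], 'station': []}
d1 = {'time': [1, 2], 'temperature': [10.0, 11.0], 'station': 'A'}
d2 = {'time': [3], 'temperature': [12.0], 'station': 'B'}
print(merge_1d([d1, d2], dims, 'time'))
# {'time': [1, 2, 3], 'temperature': [10.0, 11.0, 12.0], 'station': 'A'}
```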
meta
Aliases: get_meta
Get or set dataset or variable metadata.
Usage: meta(d, var=None, meta=None, create=False)
Arguments:
- d: Dataset (dict).
Options:
- var: Variable name (str), or None to get dataset metadata, or an empty string to get dataset attributes.
- meta: Metadata to set (dict) or None to get metadata.
- create: Create a (modifiable/bound) metadata dictionary in the dataset if not defined (bool). If False, the returned dictionary is an empty unbound dictionary if not present in the dataset.
Return value:
Metadata (dict).
read
Read dataset from a file.
Usage: read(filename, variables=None, sel=None, full=False, jd=False)
Arguments:
- filename: Filename (str, bytes or os.PathLike).
- variables: Variable names to read (str or list of str) or None to read all variables.
Options:
- sel: Selector (see select).
- full: Read all metadata (bool).
- jd: Convert time variables to Julian dates (see Aquarius Time) (bool).
Return value:
Dataset (dict).
Supported formats:
- DS: .ds
- JSON: .json
- NetCDF4: .nc, .nc4, .nc3, .netcdf, .hdf, .h5
readdir
Read multiple files in a directory.
Usage: readdir(dirname, variables=None, merge=None, warnings=[], …)
Arguments:
- dirname: Directory name (str, bytes or os.PathLike).
Options:
- variables: Variable names to read (str or list of str) or None to read all variables.
- merge: Dimension name to merge datasets by (str) or None.
- warnings: A list to be populated with warnings (list).
- …: Optional keyword arguments passed to read.
Return value:
A list of datasets (list of dict) if merge is None or a merged dataset (dict) if merge is a dimension name.
rename
Rename a variable.
Usage: rename(d, old, new)
Any dimension with the same name is also renamed.
Arguments:
- d: Dataset (dict).
- old: Old variable name (str).
- new: New variable name (str) or None to remove the variable.
Return value:
None
rename_attr
Rename a dataset or variable attribute.
Usage: rename_attr(d, old, new, var=None)
Arguments:
- d: Dataset (dict).
- old: Old attribute name (str).
- new: New attribute name (str).
Options:
- var: Variable name (str) to rename a variable attribute or None to rename a dataset attribute.
Return value:
None
rename_dim
Rename a dimension.
Usage: rename_dim(d, old, new)
Arguments:
- d: Dataset (dict).
- old: Old dimension name (str).
- new: New dimension name (str).
Return value:
None
require
Require that a variable, dimension or attribute is defined in a dataset.
Usage: require(d, what, name, var=None, full=False)
If the item is not found and the mode is “soft”, returns False. If the mode is “strict”, raises NameError. If the mode is “moderate”, produces a warning and returns False.
Arguments:
- d: Dataset (dict).
- what: Type of item to require. One of: “var” (variable), “dim” (dimension), “attr” (attribute) (str).
- name: Variable, dimension or attribute name (str).
Options:
- var: Variable name (str) or None. Applies only if what is “attr”. If not None, name is a variable attribute name, otherwise it is a dataset attribute name.
- full: Also look for items which are defined only in dataset metadata (bool).
Return value:
True if the required item is defined in the dataset, otherwise False or raises an exception depending on the mode.
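How the three modes differ can be sketched with a membership check over a list of names. This is an illustrative stand-in rather than the library's code: the helper `require_name` and its message text are hypothetical, and the mode is passed as an argument instead of being read from ds.mode.

```python
import warnings

def require_name(names, name, mode='soft'):
    """Check that name is present, reacting to a missing name by mode."""
    if name in names:
        return True
    if mode == 'strict':
        raise NameError('%s: not defined' % name)
    if mode == 'moderate':
        warnings.warn('%s: not defined' % name)
    return False

names = ['time', 'temperature']
print(require_name(names, 'time', mode='strict'))    # True
print(require_name(names, 'pressure', mode='soft'))  # False
try:
    require_name(names, 'pressure', mode='strict')
except NameError as e:
    print('NameError:', e)  # NameError: pressure: not defined
```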
rm
Remove a variable.
Usage: rm(d, var)
Arguments:
- d: Dataset (dict).
- var: Variable name (str).
Return value:
None
rm_attr
Remove a dataset or variable attribute.
Usage: rm_attr(d, attr, var)
Arguments:
- d: Dataset (dict).
- attr: Attribute name (str).
Options:
- var: Variable name (str) to remove a variable attribute or None to remove a dataset attribute.
Return value:
None
select
Filter dataset by a selector.
Usage: select(d, sel)
Arguments:
- d: Dataset (dict).
- sel: Selector (dict). Selector is a dictionary where each key is a dimension name and value is a mask to apply along the dimension or a list of indexes.
Return value:
None
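The two kinds of selector value, a mask and a list of indexes, can be illustrated on a one-dimensional list. This is a simplified sketch: the helper `apply_selector` is hypothetical, and it returns a new list, whereas select itself returns None.

```python
def apply_selector(values, sel):
    """Return the elements of values picked by a boolean mask or index list."""
    if all(isinstance(x, bool) for x in sel):
        # Boolean mask: keep elements where the mask is True.
        return [v for v, keep in zip(values, sel) if keep]
    # Otherwise treat sel as a list of integer indexes.
    return [values[i] for i in sel]

time = [0, 1, 2, 3]
print(apply_selector(time, [True, False, True, False]))  # [0, 2]
print(apply_selector(time, [3, 1]))                      # [3, 1]
```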
size
Get variable size.
Usage: size(d, var)
Variable size is determined based on the size of the variable data if defined, or by variable metadata attribute .size.
Arguments:
- d: Dataset (dict).
- var: Variable name (str).
Return value:
Variable size (list) or None if not defined.
type
Get or set variable type.
Usage: type(d, var, *value)
Variable type is determined based on the type of the variable data if defined, or by variable metadata attribute .type.
Arguments:
- d: Dataset (dict).
- var: Variable name (str).
- value: Variable type (str). One of: float32 and float64 (32-bit and 64-bit floating-point number, resp.), int8, int16, int32 and int64 (8-bit, 16-bit, 32-bit and 64-bit integer, resp.), uint8, uint16, uint32 and uint64 (8-bit, 16-bit, 32-bit and 64-bit unsigned integer, resp.), bool (boolean), str (string) and unicode (Unicode).
Return value:
Variable type (str) or None if not defined.
var
Get or set variable data.
Usage: var(d, var, *value)
Arguments:
- d: Dataset (dict).
- var: Variable name (str).
- value: Variable data. If supplied, set variable data, otherwise get variable data.
Return value:
Variable data (np.ndarray or np.generic) or None if the variable data are not defined or value is supplied. If the variable data are a list or tuple, they are converted to np.ndarray, or to np.ma.MaskedArray if they contain None, which is masked. If the variable data are int, float, bool, str or bytes, they are converted to np.generic. Raises ValueError if the output dtype is not one of float32, float64, int8, int16, int32, int64, uint8, uint16, uint32, uint64, bool, bytes
vars
Aliases: get_vars
Get all variable names in a dataset.
Usage: vars(d, full=False)
Arguments:
- d: Dataset (dict).
Options:
- full: Also return variable names which are only defined in the metadata.
Return value:
Variable names (list of str).
with_mode
Context manager which temporarily changes ds.mode.
Arguments:
- mode: Mode to set (str). See mode.
Examples:
A block of code in which ds.mode is set to “soft”.
with ds.with_mode('soft'):
	...
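One common way to build such a context manager is with `contextlib`, restoring the previous value even if the block raises. The sketch below is an assumption about the general technique, not the library's implementation; `ds_stub` is a hypothetical stand-in for the ds module.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Stand-in for the ds module; only the mode attribute is modelled.
ds_stub = SimpleNamespace(mode='strict')

@contextmanager
def with_mode(mode):
    """Temporarily set ds_stub.mode, restoring the old value on exit."""
    old = ds_stub.mode
    ds_stub.mode = mode
    try:
        yield
    finally:
        ds_stub.mode = old

with with_mode('soft'):
    print(ds_stub.mode)  # soft
print(ds_stub.mode)      # strict
```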
write
Write dataset to a file.
Usage: write(filename, d)
The file type is determined from the file extension.
Arguments:
- filename: Filename (str, bytes or os.PathLike).
- d: Dataset (dict).
Return value:
None
Supported formats:
- NetCDF4: .nc, .nc4, .netcdf
- JSON: .json
- DS: .ds
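Choosing a format from the file extension can be sketched as a dictionary lookup over the extensions listed above for write. The helper `format_for` is hypothetical. Returning None for an unknown extension and matching extensions case-sensitively are both assumptions, not documented behaviour.

```python
import os

# Extensions taken from the "Supported formats" list for write.
WRITE_FORMATS = {
    '.nc': 'NetCDF4',
    '.nc4': 'NetCDF4',
    '.netcdf': 'NetCDF4',
    '.json': 'JSON',
    '.ds': 'DS',
}

def format_for(filename):
    """Return the format name for a filename, or None if unrecognised."""
    ext = os.path.splitext(filename)[1]
    return WRITE_FORMATS.get(ext)

print(format_for('output/data.nc'))  # NetCDF4
print(format_for('data.json'))       # JSON
print(format_for('notes.txt'))       # None
```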