Python API
The ds-format Python package provides an API for reading, writing and manipulating data files. The library can be imported with:
import ds_format as ds
Contents
Function | Description |
---|---|
attr | Get or set a dataset or variable attribute. |
attrs | Get or set variable or dataset attributes. |
dim | Get a dimension size. |
dims | Get dataset or variable dimensions or set variable dimensions. |
find | Find a variable, dimension or attribute matching a glob pattern in a dataset. |
findall | Find variables, dimensions or attributes matching a glob pattern in a dataset. |
group_by | Group values along a dimension. |
merge | Merge datasets along a dimension. |
meta | Get or set dataset or variable metadata. |
read | Read dataset from a file. |
readdir | Read multiple files in a directory. |
rename | Rename a variable. |
rename_attr | Rename a dataset or variable attribute. |
rename_dim | Rename a dimension. |
require | Require that a variable, dimension or attribute is defined in a dataset. |
rm | Remove a variable. |
rm_attr | Remove a dataset or variable attribute. |
select | Filter dataset by a selector. |
size | Get variable size. |
type | Get or set variable type. |
var | Get or set variable data. |
vars | Get all variable names in a dataset. |
with_mode | Context manager which temporarily changes ds.mode. |
write | Write dataset to a file. |
Constants
ds.drivers.netcdf.JD_UNITS
days since -4713-11-24 12:00 UTC
NetCDF units for storing Julian date time variables.
ds.drivers.netcdf.JD_CALENDAR
proleptic_greogorian
NetCDF calendar for storing Julian date time variables.
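As a rough illustration of what these units mean: the reference time in JD_UNITS is the origin of the Julian date scale expressed in the proleptic Gregorian calendar, so a value stored with these units is a Julian date. The sketch below uses only the Python standard library and is not part of ds-format; the names datetime_to_jd and UNIX_EPOCH_JD are hypothetical and chosen for illustration.

```python
from datetime import datetime, timezone

# Julian date of the Unix epoch, 1970-01-01 00:00 UTC.
UNIX_EPOCH_JD = 2440587.5

def datetime_to_jd(t):
    """Convert a timezone-aware datetime to a Julian date (in days)."""
    return t.timestamp() / 86400.0 + UNIX_EPOCH_JD

# 2000-01-01 12:00 UTC is the J2000 epoch.
jd = datetime_to_jd(datetime(2000, 1, 1, 12, tzinfo=timezone.utc))
print(jd)  # 2451545.0
```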
Variables
mode
Error handling mode. If “strict”, handle missing variables, dimensions and attributes as errors. If “moderate”, report a warning. If “soft”, ignore missing items. Overrides the environment variable DS_MODE.
Examples:
Set error handling mode to strict.
ds.mode = 'strict'
Environment variables
DS_MODE
The same as mode.
Functions
attr
Get or set a dataset or variable attribute.
Usage: attr(d, attr, *value, var=None)
Arguments:
- d: Dataset (dict).
- attr: Attribute name (str).
- value: Attribute value. If supplied, set the attribute value, otherwise get the attribute value.
Options:
- var: Variable name (str) to get or set a variable attribute, or None to get or set a dataset attribute.
Return value:
Attribute value if value is not set, otherwise None.
attrs
Get or set variable or dataset attributes.
Usage: attrs(d, var=None, *value)
Arguments:
- d: Dataset (dict).
- value: Attributes to set (dict). If supplied, set attributes to value, otherwise get attributes.
Options:
- var: Variable name (str) or None to get dataset attributes.
Return value:
Attributes (dict).
dim
Get a dimension size.
Usage: dim(d, dim, full=None)
Arguments:
- d: Dataset (dict).
- dim: Dimension name (str).
Options:
- full: Return dimension size also for a dimension for which no variable data are defined, i.e. it is only defined in dataset metadata.
Return value:
Dimension size or 0 if the dimension does not exist (int).
dims
Aliases: get_dims
Get dataset or variable dimensions or set variable dimensions.
Usage:
dims(d, var=None, *value, full=False, size=False)
get_dims(d, var=None, full=False, size=False)
The function get_dims (deprecated) is the same as dims, but assumes that size is True if var is None and does not allow setting of dimensions.
Arguments:
- d: Dataset (dict).
- value: A list of dimensions (list of str) or None. If supplied, set variable dimensions, otherwise get dataset or variable dimensions. If None, remove variable dimensions (will be set to autogenerated names on write). If supplied, var must not be None.
Options:
- var: Variable name (str) to get dimensions for, or None to get dataset dimensions.
- full: Get variable dimensions even if the variable is only defined in the metadata (bool).
- size: Return a dictionary containing dimension sizes instead of a list.
Return value:
If size is False, a list of dataset or variable dimension names (list of str). If size is True, a dictionary of dataset or variable dimension names and sizes (dict), where a key is a dimension name (str) and the value is the dimension size (int). The order of keys in the dictionary is not guaranteed. Dataset dimensions are the dimensions of all variables together.
find
Find a variable, dimension or attribute matching a glob pattern in a dataset.
Usage: find(d, what, name, var=None)
If more than one name matches the pattern, raises ValueError.
Arguments:
- d: Dataset (dict).
- what: Type of item to find (str). One of: “var” (variable), “dim” (dimension), “attr” (attribute).
- name: Glob pattern matching a variable, dimension or attribute name (str).
Options:
- var: Variable name (str) or None. Applies only if what is “attr”. If not None, name is a variable attribute name, otherwise it is a dataset attribute name.
Return value:
A variable, dimension or attribute name matching the pattern, or name if no matching name is found (str).
findall
Find variables, dimensions or attributes matching a glob pattern in a dataset.
Usage: findall(d, what, name, var=None)
Arguments:
- d: Dataset (dict).
- what: Type of item to find (str). One of: “var” (variable), “dim” (dimension), “attr” (attribute).
- name: Glob pattern matching a variable, dimension or attribute name (str).
Options:
- var: Variable name (str) or None. Applies only if what is “attr”. If not None, name is a variable attribute name, otherwise it is a dataset attribute name.
Return value:
A list of variables, dimensions or attributes matching the pattern, or [name] if no matching names are found (list of str).
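The matching rules of find and findall can be sketched with the standard fnmatch module. This illustrates the documented behaviour and is not the library's implementation: find_sketch and findall_sketch are hypothetical names, they take a plain list of names rather than a dataset, and fnmatch-style pattern syntax is an assumption.

```python
import fnmatch

def findall_sketch(names, pattern):
    """Return all names matching the glob pattern, or [pattern] if none match."""
    matches = [n for n in names if fnmatch.fnmatchcase(n, pattern)]
    return matches if matches else [pattern]

def find_sketch(names, pattern):
    """Return the single matching name; raise ValueError if ambiguous."""
    matches = findall_sketch(names, pattern)
    if len(matches) > 1:
        raise ValueError('%s: more than one name matches the pattern' % pattern)
    return matches[0]

names = ['time', 'temperature', 'pressure']
print(findall_sketch(names, 't*'))  # ['time', 'temperature']
print(find_sketch(names, 'p*'))     # pressure
print(find_sketch(names, 'x*'))     # x* (no match, so the pattern is returned)
```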
group_by
Group values along a dimension.
Usage: group_by(d, dim, group, func)
Each variable with a given dimension dim is split by group into subsets. Each subset is replaced with a value computed by func.
Arguments:
- d: Dataset (dict).
- dim: Dimension to group along (str).
- group: Groups (ndarray or list). Array of the same length as the dimension.
- func: Group function (function). func(y, axis=i) is called for each subset y, where i is the index of the dimension.
Return value:
None
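For a one-dimensional variable, the grouping described above can be sketched in plain Python. The function group_by_1d and the sample data are hypothetical, and processing the groups in sorted order of their labels is an assumption. The sketch returns a new list, whereas group_by is documented to return None.

```python
from statistics import mean

def group_by_1d(values, group, func):
    """Split values by group label and reduce each subset with func."""
    labels = sorted(set(group))  # assumption: groups are taken in sorted order
    return [
        func([v for v, g in zip(values, group) if g == label], axis=0)
        for label in labels
    ]

def mean_along(y, axis=0):
    """Group function with the func(y, axis=i) calling convention."""
    return mean(y)

temperature = [10.0, 12.0, 20.0, 22.0]
day = [1, 1, 2, 2]
daily_mean = group_by_1d(temperature, day, mean_along)
print(daily_mean)  # [11.0, 21.0]
```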
merge
Merge datasets along a dimension.
Usage: merge(dd, dim, new=None, variables=None)
Merge datasets along a dimension dim. If the dimension is not defined in the dataset, merge along a new dimension dim. If new is None and dim is not new, variables without the dimension are set with the first occurrence of the variable. If new is not None and dim is not new, variables without the dimension dim are merged along a new dimension new. If variables is not None, only those variables are merged along a new dimension and other variables are set to the first occurrence of the variable.
Arguments:
- dd: Datasets (list).
- dim: Name of a dimension to merge along (str).
Options:
- new: Name of a new dimension (str) or None.
- variables: Variables to merge along a new dimension (list) or None for all variables.
Return value:
A dataset (dict).
meta
Aliases: get_meta
Get or set dataset or variable metadata.
Usage: meta(d, var=None, meta=None, create=False)
Arguments:
- d: Dataset (dict).
Options:
- var: Variable name (str), or None to get dataset metadata, or an empty string to get dataset attributes.
- meta: Metadata to set (dict) or None to get metadata.
- create: Create a modifiable (bound) metadata dictionary in the dataset if not defined (bool). If False, the returned dictionary is an empty unbound dictionary if not present in the dataset.
Return value:
Metadata (dict).
read
Read dataset from a file.
Usage: read(filename, variables=None, sel=None, full=False, jd=False)
Arguments:
- filename: Filename (str, bytes or os.PathLike).
- variables: Variable names to read (str or list of str) or None to read all variables.
Options:
- sel: Selector (see select).
- full: Read all metadata (bool).
- jd: Convert time variables to Julian dates (see Aquarius Time) (bool).
Return value:
Dataset (dict).
Supported formats:
- DS: .ds
- JSON: .json
- NetCDF4: .nc, .nc4, .nc3, .netcdf, .hdf, .h5
readdir
Read multiple files in a directory.
Usage: readdir(dirname, variables=None, merge=None, warnings=[], …)
Arguments:
- dirname: Directory name (str, bytes or os.PathLike).
Options:
- variables: Variable names to read (str or list of str) or None to read all variables.
- merge: Dimension name to merge datasets by (str) or None.
- warnings: A list to be populated with warnings (list).
- …: Optional keyword arguments passed to read.
Return value:
A list of datasets (list of dict) if merge is None or a merged dataset (dict) if merge is a dimension name.
rename
Rename a variable.
Usage: rename(d, old, new)
Any dimension with the same name is also renamed.
Arguments:
- d: Dataset (dict).
- old: Old variable name (str).
- new: New variable name (str) or None to remove the variable.
Return value:
None
rename_attr
Rename a dataset or variable attribute.
Usage: rename_attr(d, old, new, var=None)
Arguments:
- d: Dataset (dict).
- old: Old attribute name (str).
- new: New attribute name (str).
Options:
- var: Variable name (str) to rename a variable attribute or None to rename a dataset attribute.
Return value:
None
rename_dim
Rename a dimension.
Usage: rename_dim(d, old, new)
Arguments:
- d: Dataset (dict).
- old: Old dimension name (str).
- new: New dimension name (str).
Return value:
None
require
Require that a variable, dimension or attribute is defined in a dataset.
Usage: require(d, what, name, var=None, full=False)
If the item is not found and the mode is “soft”, returns False. If the mode is “strict”, raises NameError. If the mode is “moderate”, produces a warning and returns False.
Arguments:
- d: Dataset (dict).
- what: Type of item to require (str). One of: “var” (variable), “dim” (dimension), “attr” (attribute).
- name: Variable, dimension or attribute name (str).
Options:
- var: Variable name (str) or None. Applies only if what is “attr”. If not None, name is a variable attribute name, otherwise it is a dataset attribute name.
- full: Also look for items which are defined only in dataset metadata (bool).
Return value:
True if the required item is defined in the dataset, otherwise False or raises an exception depending on the mode.
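The effect of the three modes on a missing item can be sketched with the standard warnings module. The function require_sketch is a hypothetical stand-in that checks a plain set of names instead of a dataset; it mirrors the behaviour described above and is not the library's code.

```python
import warnings

def require_sketch(names, name, mode='strict'):
    """Return True if name is defined, otherwise react according to mode."""
    if name in names:
        return True
    if mode == 'strict':
        raise NameError('%s: not defined' % name)
    if mode == 'moderate':
        warnings.warn('%s: not defined' % name)
    # Both "moderate" and "soft" report the missing item by returning False.
    return False

names = {'time', 'temperature'}
print(require_sketch(names, 'time'))                   # True
print(require_sketch(names, 'pressure', mode='soft'))  # False
```

In “strict” mode the same call, require_sketch(names, 'pressure'), raises NameError.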
rm
Remove a variable.
Usage: rm(d, var)
Arguments:
- d: Dataset (dict).
- var: Variable name (str).
Return value:
None
rm_attr
Remove a dataset or variable attribute.
Usage: rm_attr(d, attr, var=None)
Arguments:
- d: Dataset (dict).
- attr: Attribute name (str).
Options:
- var: Variable name (str) to remove a variable attribute or None to remove a dataset attribute.
Return value:
None
select
Filter dataset by a selector.
Usage: select(d, sel)
Arguments:
- d: Dataset (dict).
- sel: Selector (dict). Selector is a dictionary where each key is a dimension name and value is a mask to apply along the dimension or a list of indexes.
Return value:
None
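The two kinds of selector values, a mask or a list of indexes, can be illustrated on a plain list standing in for a one-dimensional variable. The helper select_1d is hypothetical and only sketches the documented behaviour; it returns a new list, whereas select is documented to return None.

```python
def select_1d(values, sel):
    """Apply a boolean mask or a list of indexes to a list of values."""
    if all(isinstance(s, bool) for s in sel):
        # Mask: keep the items whose mask entry is True.
        return [v for v, keep in zip(values, sel) if keep]
    # Indexes: pick the items at the given positions.
    return [values[i] for i in sel]

temperature = [10.0, 12.0, 20.0, 22.0]
masked = select_1d(temperature, [True, False, True, False])
indexed = select_1d(temperature, [0, 3])
print(masked)   # [10.0, 20.0]
print(indexed)  # [10.0, 22.0]
```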
size
Get variable size.
Usage: size(d, var)
Variable size is determined based on the size of the variable data if defined, or by variable metadata attribute .size.
Arguments:
- d: Dataset (dict).
- var: Variable name (str).
Return value:
Variable size (list) or None if not defined.
type
Get or set variable type.
Usage: type(d, var, *value)
Variable type is determined based on the type of the variable data if defined, or by variable metadata attribute .type.
Arguments:
- d: Dataset (dict).
- var: Variable name (str).
- value: Variable type (str). One of: float32 and float64 (32-bit and 64-bit floating-point number, resp.), int8, int16, int32 and int64 (8-bit, 16-bit, 32-bit and 64-bit integer, resp.), uint8, uint16, uint32 and uint64 (8-bit, 16-bit, 32-bit and 64-bit unsigned integer, resp.), bool (boolean), str (string) and unicode (Unicode).
Return value:
Variable type (str) or None if not defined.
var
Get or set variable data.
Usage: var(d, var, *value)
Arguments:
- d: Dataset (dict).
- var: Variable name (str).
- value: Variable data. If supplied, set variable data, otherwise get variable data.
Return value:
Variable data (np.ndarray or np.generic) or None if the variable data are not defined or value is supplied. If the variable data are a list or tuple, they are converted to np.ndarray, or to np.ma.MaskedArray if they contain None, which is masked. If the variable data are int, float, bool, str or bytes, they are converted to np.generic. Raises ValueError if the output dtype is not one of float32, float64, int8, int16, int32, int64, uint8, uint16, uint32, uint64, bool, bytes<n>, str<n>, or object for which all items are an instance of str or bytes.
vars
Aliases: get_vars
Get all variable names in a dataset.
Usage: vars(d, full=False)
Arguments:
- d: Dataset (dict).
Options:
- full: Also return variable names which are only defined in the metadata.
Return value:
Variable names (list of str).
with_mode
Context manager which temporarily changes ds.mode.
Usage: with_mode(mode)
Arguments:
- mode: Mode to set (str). See mode.
Examples:
A block of code in which ds.mode is set to “soft”.
with ds.with_mode('soft'):
    ...
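A context manager of this kind can be sketched with contextlib. The Settings class and with_mode_sketch are hypothetical stand-ins for the ds module and ds.with_mode. Restoring the previous mode when the block exits, including after an exception, is an assumption about the library's behaviour.

```python
import contextlib

class Settings:
    """Hypothetical stand-in for a module holding a global mode."""
    mode = 'strict'

@contextlib.contextmanager
def with_mode_sketch(mode):
    """Temporarily set Settings.mode for the duration of a with block."""
    old = Settings.mode
    Settings.mode = mode
    try:
        yield
    finally:
        # Restore the previous mode even if the block raised.
        Settings.mode = old

with with_mode_sketch('soft'):
    inside = Settings.mode
after = Settings.mode
print(inside, after)  # soft strict
```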
write
Write dataset to a file.
Usage: write(filename, d)
The file type is determined from the file extension.
Arguments:
- filename: Filename (str, bytes or os.PathLike).
- d: Dataset (dict).
Return value:
None
Supported formats:
- NetCDF4: .nc, .nc4, .netcdf
- JSON: .json
- DS: .ds
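Choosing the output format from the file extension can be sketched as a lookup table built from the list above. The names WRITE_FORMATS and format_for are hypothetical, and returning None for an unknown extension is an assumption; how write handles an unsupported extension is not stated here.

```python
import os

# Extensions listed under "Supported formats" for write.
WRITE_FORMATS = {
    '.nc': 'NetCDF4',
    '.nc4': 'NetCDF4',
    '.netcdf': 'NetCDF4',
    '.json': 'JSON',
    '.ds': 'DS',
}

def format_for(filename):
    """Return the format name for a str, bytes or os.PathLike filename."""
    ext = os.path.splitext(os.fsdecode(filename))[1]
    return WRITE_FORMATS.get(ext)

print(format_for('dataset.nc'))     # NetCDF4
print(format_for(b'dataset.json'))  # JSON
print(format_for('dataset.txt'))    # None
```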