Command line
The ds-format Python package provides a command ds for reading, writing and
modifying data files.
The command should be run in a shell such as Bash. zsh (the default shell on
macOS) is not well supported because it interprets curly brackets ({, }) as
special characters. zsh can be used if curly brackets are escaped with a
backslash character (\).
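As a rough illustration of the escaping described above, the snippet below prints the arguments a command receives when the brackets are backslash-escaped. printf stands in for ds here (an assumption made so that the snippet runs without a data file), and the array { time temperature } is borrowed from the rm examples on this page.

```shell
# printf stands in for ds: it prints each argument it receives on its own line.
# With the backslashes, the brackets arrive as plain "{" and "}" arguments.
args=$(printf '%s\n' \{ time temperature \})
echo "$args"
```

Applied to a real command, the same escaping would look like ds rm \{ time temperature \} dataset.nc output.nc.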
On Unix-like operating systems, manual pages are available for the commands
with man ds and man ds cmd. Note that you might have to add the manual
page path (usually $HOME/.local/share/man/) to the MANPATH environment
variable in order to access the manual pages.
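If man ds reports that no manual entry exists, the path can be added for the current session along the following lines. This is a sketch that assumes the pages were installed under the usual location named above. A different install prefix needs a different path.

```shell
# Assumption: manual pages live in $HOME/.local/share/man (the usual location).
# Prepend it to MANPATH, keeping any entries that were already set.
export MANPATH="$HOME/.local/share/man:${MANPATH:-}"
echo "$MANPATH"
```

Adding the export line to a shell startup file such as ~/.bashrc makes the setting persistent.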
Synopsis
ds
Tool for reading, writing and modifying dataset files.
Usage:
ds [cmd [args]] [options]
ds --help [cmd]
ds --version
The command line interface is based on the PST format. In all commands, variable, dimension and attribute names are interpreted as glob patterns, unless the -F option is enabled. Note that the pattern has to be enclosed in quotes in order to prevent the shell from interpreting the glob.
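The effect of quoting can be seen without ds at all. The sketch below uses a scratch directory with two hypothetical files, temp1.nc and temp2.nc, and the shell builtin set -- as a stand-in for the argument list that a command would receive.

```shell
# Work in a scratch directory so the demonstration cannot touch real files.
cd "$(mktemp -d)"
touch temp1.nc temp2.nc

# Unquoted: the shell replaces temp* with the matching file names.
set -- temp*
unquoted="$*"

# Quoted: the pattern is passed through unchanged, so the command sees temp*.
set -- 'temp*'
quoted="$*"

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

In the unquoted case a command would receive file names instead of the pattern, which is why the examples on this page write patterns such as 'temp*' in quotes.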
Arguments:
- cmd: Command to execute or show help for. If omitted, ds is a shorthand for the command ls, with the difference that files with the same name as any available command cannot be listed. Available commands are listed below.
- args: Command arguments and options.
Options:
- -F: Interpret variable, dimension and attribute names as fixed strings instead of glob patterns.
- --help: Show this help message or help for a command if cmd is supplied.
- -m: Moderate error handling mode. Report a warning on missing variables, dimensions and attributes. Overrides the DS_MODE environment variable.
- -s: Strict error handling mode. Handle missing variables, dimensions and attributes as errors. Overrides the DS_MODE environment variable.
- -t: Soft error handling mode. Ignore missing variables, dimensions and attributes. Overrides the DS_MODE environment variable.
- -v: Be verbose. Print more detailed information and error messages.
- --version: Print the version number and exit.
Available commands:
- attrs: Print attributes in a dataset.
- cat: Print variable data.
- dims: Print dimensions of a dataset or a variable.
- ls: List variables.
- merge: Merge files along a dimension.
- meta: Print dataset metadata.
- rename: Rename variables and attributes.
- rename_dim: Rename a dimension.
- rm: Remove variables or attributes.
- select: Select and subset variables.
- set: Set variable data, dimensions and attributes in an existing or new dataset.
- size: Print variable size.
- stats: Print variable statistics.
- type: Print variable type.
Supported input formats:
- DS: .ds
- JSON: .json
- NetCDF4: .nc, .nc4, .nc3, .netcdf, .hdf, .h5
Supported output formats:
- DS: .ds
- JSON: .json
- NetCDF4: .nc, .nc4, .netcdf
Environment variables:
- DS_MODE: Error handling mode. If “strict”, handle missing variables, dimensions and attributes as errors. If “moderate”, report a warning. If “soft”, ignore missing items.
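A minimal sketch of setting the variable, either for a whole shell session or for a single command. The values are the ones listed above. sh -c stands in for ds here (an assumption made so that the snippet runs without a data file).

```shell
# Make "soft" the default for every command started from this shell session.
export DS_MODE=soft

# Override the default for one command by prefixing the assignment.
# sh -c stands in for ds and simply reports the value it sees.
DS_MODE=strict sh -c 'echo "child sees DS_MODE=$DS_MODE"'

# The session default is unchanged afterwards.
echo "session still has DS_MODE=$DS_MODE"
```

As noted in the options list, -m, -s and -t on the command line take precedence over whatever DS_MODE contains.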
Commands
| Command | Description |
|---|---|
| attrs | Print attributes in a dataset. |
| cat | Print variable data. |
| dims | Print dimensions of a dataset or a variable. |
| ls | List variables. |
| merge | Merge files along a dimension. |
| meta | Print dataset metadata. |
| rename | Rename variables and attributes. |
| rename_dim | Rename a dimension. |
| rm | Remove variables or attributes. |
| select | Select and subset variables. |
| set | Set variable data, dimensions and attributes in an existing or new dataset. |
| size | Print a variable size. |
| stats | Print variable statistics. |
| type | Print a variable type. |
attrs
Print attributes in a dataset.
Usage: ds attrs [var] [attr] input [options]
The output is formatted as PST.
Arguments:
- var: Variable name or none to print a dataset attribute attr. If omitted, print all dataset attributes.
- attr: Attribute name.
- input: Input file.
- options: See help for ds for global options.
Examples:
Print dataset attributes in dataset.nc.
$ ds attrs dataset.nc
title: "Temperature data"
Print attributes of a variable temperature in dataset.nc.
$ ds attrs temperature dataset.nc
long_name: temperature
units: celsius
Print a dataset attribute title.
$ ds attrs none title dataset.nc
"Temperature data"
Print an attribute units of a variable temperature.
$ ds attrs temperature units dataset.nc
celsius
cat
Print variable data.
Usage:
ds cat var input [options]
ds cat var… input [options]
Data are printed by the first index, one item per line, formatted as PST. If multiple variables are selected, items at a given index from all variables are printed on the same line as an array. The first line is a header containing a list of variables. Missing values are printed as empty rows (if printing a single one-dimensional variable) or as none.
Arguments:
- var: Variable name.
- input: Input file.
- options: See help for ds for global options.
Options:
- -h: Print human-readable values (time as ISO 8601).
- --jd: Convert time variables to Julian date (see Aquarius Time).
Examples:
Print temperature values in dataset.nc.
$ ds cat temperature dataset.nc
temperature
16.000000
18.000000
21.000000
Print time and temperature values in dataset.nc.
$ ds cat time temperature dataset.nc
time temperature
1 16.000000
2 18.000000
3 21.000000
dims
Print dimensions of a dataset or a variable.
Usage: ds dims [var] input [options]
Arguments:
- var: Variable to print dimensions of.
- input: Input file.
- options: See help for ds for global options.
Options:
- -s, --size: If var is defined, print the size of dimensions as an object instead of an array of dimensions. The order is not guaranteed.
Examples:
Print dimensions of a dataset.
$ ds dims dataset.nc
time
Print dimensions of the variable temperature.
$ ds dims temperature dataset.nc
time
ls
List variables.
Usage:
ds [var]… input [options]
ds ls [var]… input [options]
Lines in the output are formatted as PST.
Arguments:
- var: Variable name to list.
- input: Input file.
- options: See help for ds for global options.
Options:
- -l: Print a detailed list of variables (name, type and an array of dimensions), preceded with a line with dataset dimensions.
- a: attrs: Print variable attributes after the variable name and dimensions. attrs can be a string or an array.
Examples:
Print a list of variables in dataset.nc.
$ ds ls dataset.nc
temperature
time
Print a detailed list of variables in dataset.nc.
$ ds ls -l dataset.nc
time: 3
temperature float64 { time }
time int64 { time }
Print a list of variables with an attribute units.
$ ds ls dataset.nc a: units
temperature celsius
time s
Print a list of variables with attributes long_name and units.
$ ds ls dataset.nc a: { long_name units }
temperature temperature celsius
time time s
Print all variables matching a glob “temp*” in dataset.nc.
$ ds ls 'temp*' dataset.nc
temperature
merge
Merge datasets along a dimension.
Usage: ds merge dim input… output [options]
Merge datasets along a dimension dim. If the dimension is not defined in the dataset, merge along a new dimension dim. If new is none and dim is not new, variables without the dimension are set with the first occurrence of the variable. If new is not none and dim is not new, variables without the dimension dim are merged along a new dimension new. If variables is not none, only those variables are merged along a new dimension and other variables are set to the first occurrence of the variable.
Arguments:
- dim: Name of a dimension to merge along.
- input: Input file.
- output: Output file.
- options: See help for ds for global options.
Options:
- new: value: Name of a new dimension.
- variables: { value… }: Variables to merge along a new dimension or none for all variables.
Examples:
Write example data to dataset1.nc.
$ ds set { time none time { 1 2 3 } long_name: time units: s } { temperature none time { 16. 18. 21. } long_name: temperature units: celsius } title: "Temperature data" none dataset1.nc
Write example data to dataset2.nc.
$ ds set { time none time { 4 5 6 } long_name: time units: s } { temperature none time { 23. 25. 28. } long_name: temperature units: celsius } title: "Temperature data" none dataset2.nc
Merge dataset1.nc and dataset2.nc and write the result to dataset.nc.
$ ds merge time dataset1.nc dataset2.nc dataset.nc
Print time and temperature variables in dataset.nc.
$ ds cat time temperature dataset.nc
time temperature
1 16.000000
2 18.000000
3 21.000000
4 23.000000
5 25.000000
6 28.000000
meta
Print dataset metadata.
Usage: ds meta [var] input [options]
The output is formatted as PST.
Arguments:
- input: Input file.
- var: Variable name to print metadata for or “.” to print dataset metadata. If not specified, print metadata for the whole file.
- options: See help for ds for global options.
Examples:
Print metadata of dataset.nc.
$ ds meta dataset.nc
.: {{
title: "Temperature data"
}}
time: {{
long_name: time
units: s
.dims: { time }
.size: { 3 }
}}
temperature: {{
long_name: temperature
units: celsius
.dims: { time }
.size: { 3 }
}}
rename
Rename variables and attributes.
Usage:
ds rename vars input output [options]
ds rename var attrs input output [options]
ds rename { var attrs }… input output [options]
Arguments:
- var: Variable name, or an array of variable names whose attributes to rename, or none to rename dataset attributes.
- vars: Pairs of old and new variable names as oldvar: newvar. If newvar is none, remove the variable.
- attrs: Pairs of old and new attribute names as oldattr: newattr. If newattr is none, remove the attribute.
- input: Input file.
- output: Output file.
- options: See help for ds for global options. Note that with this command options can only be supplied before the command name or at the end of the command line.
Examples:
Rename variables time to newtime and temperature to newtemperature in dataset.nc and save the output in output.nc.
$ ds rename time: newtime temperature: newtemperature dataset.nc output.nc
Rename a dataset attribute title to newtitle in dataset.nc and save the output in output.nc.
$ ds rename none title: newtitle dataset.nc output.nc
Rename an attribute units of a variable temperature to newunits in dataset.nc and save the output in output.nc.
$ ds rename temperature units: newunits dataset.nc output.nc
rename_dim
Rename a dimension.
Usage:
ds rename_dim dims input output [options]
Arguments:
- dims: Pairs of old and new dimension names as olddim: newdim.
- input: Input file.
- output: Output file.
- options: See help for ds for global options. Note that with this command options can only be supplied before the command name or at the end of the command line.
Examples:
Rename dimension time to newtime in dataset.nc and save the output in output.nc.
$ ds -l dataset.nc
time: 3
temperature
time
$ ds rename_dim time: newtime dataset.nc output.nc
$ ds -l output.nc
newtime: 3
temperature
time
rm
Remove variables or attributes.
Usage:
ds rm var input output [options]
ds rm var attr input output [options]
Arguments:
- var: Variable name, an array of variable names or none to remove a dataset attribute.
- attr: Attribute name or an array of attribute names.
- input: Input file.
- output: Output file.
- options: See help for ds for global options.
Examples:
Remove a variable temperature in dataset.nc and save the output in output.nc.
$ ds rm temperature dataset.nc output.nc
Remove variables time and temperature in dataset.nc and save the output in output.nc.
$ ds rm { time temperature } dataset.nc output.nc
Remove a dataset attribute title in dataset.nc and save the output in output.nc.
$ ds rm none title dataset.nc output.nc
Remove an attribute units of a variable temperature in dataset.nc and save the output in output.nc.
$ ds rm temperature units dataset.nc output.nc
select
Select and subset variables.
Usage: ds select [var…] [sel] input output [options]
select can also be used to convert between different file formats (ds select input output).
Arguments:
- var: Variable name.
- sel: Selector as dim: idx pairs, where dim is a dimension name and idx is an index or a list of indexes as { i… }.
- input: Input file.
- output: Output file.
- options: See help for ds for global options. Note that with this command options can only be supplied before the command name or at the end of the command line.
Examples:
Write data to dataset.nc.
$ ds set { time none time { 1 2 3 } long_name: time units: s } { temperature none time { 16. 18. 21. } long_name: temperature units: celsius } title: "Temperature data" none dataset.nc
List variables in dataset.nc.
$ ds dataset.nc
temperature
time
Select variable temperature from dataset.nc and write to temperature.nc.
$ ds select temperature dataset.nc temperature.nc
List variables in temperature.nc.
$ ds temperature.nc
temperature
Subset by time index 0 and write to 0.nc.
$ ds select time: 0 dataset.nc 0.nc
Print variables time and temperature in 0.nc.
$ ds cat time temperature 0.nc
time temperature
1 16.000000
Convert dataset.nc to JSON.
$ ds select dataset.nc dataset.json
$ cat dataset.json
{"time": [1, 2, 3], "temperature": [16.0, 18.0, 21.0], ".": {".": {"title": "Temperature data"}, "time": {"long_name": "time", "units": "s", ".dims": ["time"], ".size": [3]}, "temperature": {"long_name": "temperature", "units": "celsius", ".dims": ["time"], ".size": [3]}}}
set
Set variable data, dimensions and attributes in an existing or new dataset.
Usage:
ds set ds_attrs input output [options]
ds set var [type [dims [data]]] [attrs]… input output [options]
ds set { var [type [dims [data]]] [attrs]… }… ds_attrs input output [options]
Arguments:
- var: Variable name.
- type: Variable type (str), or none to keep the original type if data is not supplied or autodetect based on data if data is supplied.
- dims: Variable dimension name (if single), an array of variable dimensions (if multiple), none to keep original dimension or autogenerate if a new variable, or { } to autogenerate new dimension names.
- data: Variable data. This can be a PST-formatted scalar or an array. none values are interpreted as missing values.
- attrs: Variable attributes or dataset attributes if var is none as attr: value pairs.
- ds_attrs: Dataset attributes as attr: value pairs.
- input: Input file or none for a new file to be created.
- output: Output file.
- options: See help for ds for global options. Note that with this command options can only be supplied before the command name or at the end of the command line.
Examples:
Write variables time and temperature to dataset.nc.
$ ds set { time none time { 1 2 3 } long_name: time units: s } { temperature none time { 16. 18. 21. } long_name: temperature units: celsius } title: "Temperature data" none dataset.nc
Set data of a variable temperature to an array of 16.0, 18.0, 21.0 in dataset.nc and save the output in output.nc.
$ ds set temperature none none { 16. 18. 21. } dataset.nc output.nc
Set a dimension of a variable temperature to time, data to an array of 16.0, 18.0, 21.0, its attribute long_name to “temperature” and units to “celsius” in dataset.nc and save the output in output.nc.
$ ds set temperature none time { 16. 18. 21. } long_name: temperature units: celsius dataset.nc output.nc
Set multiple variables in dataset.nc and save the output in output.nc.
$ ds set { time none time { 1 2 3 } long_name: time units: s } { temperature none time { 16. 18. 21. } long_name: temperature units: celsius } title: "Temperature data" dataset.nc output.nc
Set a dataset attribute newtitle to New title in dataset.nc and save the output in output.nc.
$ ds set newtitle: "New title" dataset.nc output.nc
Set an attribute newunits of a variable temperature to K in dataset.nc and save the output in output.nc.
$ ds set temperature newunits: K dataset.nc output.nc
size
Print a variable size.
Usage: ds size var input [options]
Arguments:
- var: Variable to print the size of.
- input: Input file.
- options: See help for ds for global options.
Examples:
Print the size of a variable temperature in a dataset dataset.nc.
$ ds size temperature dataset.nc
3
stats
Print variable statistics.
Usage: ds stats var input [options]
The output is formatted as PST.
Arguments:
- var: Variable name.
- input: Input file.
- options: See help for ds for global options.
Output description:
- count: Number of array elements.
- max: Maximum value.
- mean: Sample mean.
- median: Sample median.
- min: Minimum value.
Examples:
Print statistics of variable temperature in dataset.nc.
$ ds stats temperature dataset.nc
count: 3 min: 16.000000 max: 21.000000 mean: 18.333333 median: 18.000000
type
Print a variable type.
Usage: ds type var input [options]
Arguments:
- var: Variable to print the type of.
- input: Input file.
- options: See help for ds for global options.
Examples:
Print the type of a variable temperature in a dataset dataset.nc.
$ ds type temperature dataset.nc
float64