Non-substituted keys
These are keys that are understood and used by the interpreter, but do not represent fields that will be expanded/substituted in the command line calls.
is_alias_of
If it exists, its value is the real name of the executable. This allows the creation of
multiple commands using the same underlying executable. If e.g. the command is:
do_something $args -o $ofile $ifiles
and the configuration file contains
is_alias_of: an_executable
args: "-a -b"
ofile: outfile
ifiles: file[1-9].txt
primary: ifiles
then the interpreter will loop through all the files matching the ifiles pattern in target_dir. For the first file, it will execute:
an_executable -a -b -o outfile file1.txt
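A minimal sketch of this expansion, assuming the command uses `$name` placeholders compatible with Python's `string.Template`. The function `build_command` and the way the alias replaces the first word of the command are illustrative assumptions, not VDAT's actual implementation.

```python
from string import Template


def build_command(command, config, primary_file):
    """Expand one command line for a single primary file (illustrative only)."""
    executable, _, arguments = command.partition(" ")
    # Assumption: is_alias_of, when present, replaces the command name.
    executable = config.get("is_alias_of", executable)
    # The primary field is filled with the file of the current iteration.
    fields = {"args": config["args"], "ofile": config["ofile"]}
    fields[config["primary"]] = primary_file
    return f"{executable} {Template(arguments).substitute(fields)}"


config = {
    "is_alias_of": "an_executable",
    "args": "-a -b",
    "ofile": "outfile",
    "ifiles": "file[1-9].txt",
    "primary": "ifiles",
}
first = build_command("do_something $args -o $ofile $ifiles", config, "file1.txt")
print(first)
```

Note that `is_alias_of` and `primary` are consumed by the sketch itself and never substituted into the command string.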
mandatory
List of mandatory fields; field names defined under mandatory must exist in the provided command. Do not provide this key or leave it empty to disable these checks.
mandatory: [field1, field2]
# or equivalently
mandatory:
  - field1
  - field2
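A possible way to perform such a check, sketched with the placeholder regex of Python's `string.Template`. The helper name `missing_mandatory` is hypothetical and the real interpreter may detect fields differently.

```python
from string import Template


def missing_mandatory(command, mandatory):
    """Return the mandatory field names that do not appear in the command."""
    found = set()
    # Template.pattern is the compiled regex used to find $name and ${name}.
    for match in Template.pattern.finditer(command):
        name = match.group("named") or match.group("braced")
        if name:
            found.add(name)
    # A missing or empty mandatory key disables the check.
    return [field for field in (mandatory or []) if field not in found]


command = "do_something $args -o $ofile $ifiles"
print(missing_mandatory(command, ["ifiles", "field1"]))
```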
primary
Name of the field to use as primary. A primary field has a special status: files are collected from the target_dir according to the type of the underlying key, then they are looped over and for each step the command string is created and executed. If the value of any other key or field needs to be built at run time, it will use the primary files to do it. VDAT is shipped with a few primary types.
This key can have either a single value or a list of values. If it has a single value, the corresponding field must be present in the command. If it is a list of values, one and only one of the fields must be present in the command. Multiple primary fields are not allowed.
# single primary
primary: fits
# multiple primaries
primary: [fits1, fits2]
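The "one and only one" rule can be sketched as follows. `select_primary` is a hypothetical helper and `command_fields` stands for the set of field names found in the command.

```python
def select_primary(primary, command_fields):
    """Return the single primary field used by the command (illustrative)."""
    candidates = [primary] if isinstance(primary, str) else list(primary)
    present = [name for name in candidates if name in command_fields]
    if len(present) != 1:
        raise ValueError(
            f"exactly one of {candidates} must be in the command, found {present}"
        )
    return present[0]


print(select_primary(["fits1", "fits2"], {"args", "fits2"}))
```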
filter_selected
Tells the interpreter how to filter the list of primary files. If this option is not found in the configuration or the selected keyword in CommandInterpreter is None, no filtering is performed. Otherwise, for each element in the primary list, the interpreter:
- uses the instructions from the value of filter_selected to extract a string
- checks if the string is in selected.
The value of filter_selected can be any available key type, e.g. the built-in ones described below.
With the following settings:
# Use the value of the header keyword ``IFUSLOT`` to decide whether to
# keep the primary field or not
filter_selected:
  type: header
  keyword: IFUSLOT
the content of the fits header keyword IFUSLOT is extracted and compared with the list provided with the selected option in CommandInterpreter.
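The filtering logic can be sketched like this. To stay self-contained, a plain dictionary stands in for the FITS headers; `filter_primaries` and `fake_headers` are hypothetical names.

```python
def filter_primaries(primaries, extract, selected):
    """Keep the primaries whose extracted string is in ``selected``."""
    if selected is None:
        # No selection given: nothing is filtered out.
        return list(primaries)
    return [item for item in primaries if extract(item) in selected]


# Stand-in for reading the IFUSLOT keyword from each FITS header.
fake_headers = {"a.fits": "073", "b.fits": "106", "c.fits": "073"}
kept = filter_primaries(fake_headers, fake_headers.get, ["073"])
print(kept)
```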
execute
For each iteration of the primary, it tells the interpreter whether to run the command or not. If the option is not found, no filtering will be performed. VDAT ships some execute types.
If the handling of this key raises an exception, it is logged as a warning and the command is executed as if the key returned True.
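The fallback behaviour described above could look roughly like this. `should_execute` and the logger name are assumptions made for illustration only.

```python
import logging

log = logging.getLogger("execute_sketch")


def should_execute(execute_handler, primary):
    """Return the handler's decision, defaulting to True if it fails."""
    if execute_handler is None:
        # Option not configured: always run the command.
        return True
    try:
        return bool(execute_handler(primary))
    except Exception as exc:
        # The failure is only logged; the command is still executed.
        log.warning("execute check failed for %s: %s", primary, exc)
        return True
```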
Built-in primary key/field types
plain
Search for files matching the given pattern in the target directory. If the value of a key is a string, it is interpreted as a plain type.
These three definitions are equivalent:
keyword: 20*.fits
---
keyword: &plain
  type: plain
  value: 20*.fits
---
keyword: {type: plain, value: 20*.fits}
By default, the keyword values are interpreted as shell-style wildcards. As in the fnmatch module, the only special characters are:
Pattern | Meaning
*       | matches everything
?       | matches any single character
[seq]   | matches any character in seq
[!seq]  | matches any character not in seq
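These wildcards can be tried out directly with Python's `fnmatch` module. The file names below are made up, and whether the interpreter matches case-sensitively (as `fnmatchcase` does) is an assumption.

```python
from fnmatch import fnmatchcase

names = ["20180219T071318.8_073LL_sci.fits", "masterbias.fits", "2018.txt"]
matching = [name for name in names if fnmatchcase(name, "20*.fits")]
print(matching)

# Character sets and their negation
print(fnmatchcase("file3.txt", "file[1-9].txt"))
print(fnmatchcase("file0.txt", "file[!0].txt"))
```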
If you need more complex matches, it’s possible to use Python regular expressions. To make the interpreter aware of it you can add the optional key is_regex and set it to True. For example:
keyword:
  type: plain
  value: '(?:e.)?jpes[0-9].*fits'
  is_regex: True
will get all the files in the target_dir whose name matches e.jpes[0-9]*fits or jpes[0-9]*fits, but not, e.g., FEjpes[0-9]*fits.
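The behaviour can be checked with the `re` module. The sketch assumes that the pattern is anchored at the start of the file name (as `re.match` does), which is consistent with the FEjpes case being rejected; the file names are invented.

```python
import re

pattern = re.compile(r"(?:e.)?jpes[0-9].*fits")
names = ["e_jpes1_sci.fits", "jpes2_sci.fits", "FEjpes3_sci.fits"]
# re.match only succeeds if the pattern matches from the first character.
matching = [name for name in names if pattern.match(name)]
print(matching)
```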
If rather than returning the filename we just want to extract some part of it, e.g. just the time stamp, we can add the returns option with the corresponding instructions. The content of returns can be any available secondary keyword:
keyword:
  <<: *plain
  returns:
    type: regex
    match: '.*(\d{8}T\d{6}).*'
    replace: \1
Here the \1 refers to the first regex group returned from the expression in match.
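The effect of this match/replace pair can be reproduced with `re.sub`; the file name is only an example.

```python
import re

filename = "/path/to/20180219T071318.8_073LL_sci.fits"
# The whole name is matched and replaced by the captured time stamp.
timestamp = re.sub(r".*(\d{8}T\d{6}).*", r"\1", filename)
print(timestamp)
```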
loop
This is designed to loop over, for example, IFUs, channels and/or amplifiers. It:
1. collects the keys which have been stored under a yaml key called (a little confusingly) keys
2. cycles through all the possible combinations of them
3. for each combination, replaces the corresponding entries in value (see example below) using the standard Python format string syntax
4. looks for all the files matching the resulting strings
5. if any files are found, constructs a string; if multiple files are found, constructs a single string with the different files separated by a space
6. if the returns option is given, uses it to manipulate the string with the file names (as explained above)
7. yields the string
The entries stored under the keys key are maps between the names of the entries, e.g. ifu, and the values that they can have in the loop described in step (2) above. Their value can be either a list or three comma separated numbers: start, stop, step. The latter case is converted into a list of numbers from start to stop (excluded) every step.
The following configuration:
keyword: &loop
  type: loop
  value: 's[0-9]*{ifu:03d}{channel}{amp}_*.fits'
  keys:  # dictionary of keys to expand in ``value``
    ifu: 1, 100, 1  # start, stop, step values of a slice
    channel: [L, R]  # a list of possible values
    amp:  # alternative syntax for the list
      - L
      - U
cycles through all the possible combinations of the three lists: [1, 2, .., 99], ['L', 'R'] and ['L', 'U']. For the first combination we get ifu: 1, channel: L, amp: L and value becomes s[0-9]*001LL_*.fits. Then all the files matching this pattern are collected.
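The expansion of value over all the combinations can be sketched with `itertools.product` and `str.format`. The helper `expand_values` is hypothetical, and the order in which the real interpreter visits the combinations is an assumption.

```python
from itertools import product


def expand_values(spec):
    """Turn a 'start, stop, step' string into a list; pass lists through."""
    if isinstance(spec, str):
        start, stop, step = (int(part) for part in spec.split(","))
        return list(range(start, stop, step))
    return list(spec)


value = "s[0-9]*{ifu:03d}{channel}{amp}_*.fits"
keys = {"ifu": "1, 100, 1", "channel": ["L", "R"], "amp": ["L", "U"]}

expanded = {name: expand_values(spec) for name, spec in keys.items()}
patterns = [
    value.format(**dict(zip(expanded, combo)))
    for combo in product(*expanded.values())
]
print(patterns[0], len(patterns))
```

Each resulting pattern would then be matched against the files in the target directory.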
As with the plain primary keyword, it’s possible to interpret the strings resulting from filling in the fields in value as regexes by providing the optional key is_regex. As before, one can also extract some part from the file name(s) with the returns key.
groupby
- collects all the files matching value and loops through them
- for each of the files, replaces match with all the values in replace using the regex secondary keyword implementation.
The following configuration:
keyword:
  type: groupby
  value: 'p*[0-9][LR]L_*.fits'
  match: (.*p.*\d[LR])L(_.*\.fits)
  replace:
    - \1U\2
cycles through all the files matching value in the target_dir, e.g. “p2LL_sci.fits”, and for each of them creates a new file name replacing the last “L” with “U”, e.g. “p2LU_sci.fits”. The two files are then returned.
To create multiple files out of the first one, it’s enough to provide other entries to replace. E.g.:
replace: [\1U\2, \1A\2, \2_\1]
will create three new files: “p2LU_sci.fits”, “p2LA_sci.fits” and “_sci.fits_p2L”.
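The file names quoted above can be reproduced with `re.sub`, applying each entry of replace to the first file. This is only a sketch of the renaming step, not of the file collection.

```python
import re

match = r"(.*p.*\d[LR])L(_.*\.fits)"
replace = [r"\1U\2", r"\1A\2", r"\2_\1"]
first = "p2LL_sci.fits"

# The group contains the original file followed by one name per replace entry.
group = [first] + [re.sub(match, template, first) for template in replace]
print(group)
```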
As with the plain primary keyword, it’s possible to interpret the value as a regex by providing the optional key is_regex. All the keywords recognised by the regex secondary keyword are also supported.
all_files
This primary type has the same interface as the plain primary keyword. The behaviour is however different: while the plain primary keyword returns an iterator (or list) of file names or strings, all_files returns a list containing a single string of space separated file names or, when using the returns option, values.
The following configuration collects all the files matching value as explained in plain and returns a list with a single element:
keyword: &all_files
  type: all_files
  value: 20*.fits
If e.g. there are four files matching the pattern, the type returns something
like:
['/path/to/20180219T071318.8_073LL_sci.fits /path/to/20180219T071318.8_073LU_sci.fits /path/to/20180219T072418.2_106RL_sci.fits /path/to/20180219T072418.2_106RU_sci.fits']
For comparison, the plain primary type would return:
['/path/to/20180219T071318.8_073LL_sci.fits',
'/path/to/20180219T071318.8_073LU_sci.fits',
'/path/to/20180219T072418.2_106RL_sci.fits',
'/path/to/20180219T072418.2_106RU_sci.fits']
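In Python terms, the difference between the two return values amounts to a join. The variable names are illustrative only.

```python
files = [
    "/path/to/20180219T071318.8_073LL_sci.fits",
    "/path/to/20180219T071318.8_073LU_sci.fits",
    "/path/to/20180219T072418.2_106RL_sci.fits",
    "/path/to/20180219T072418.2_106RU_sci.fits",
]

plain_result = list(files)            # one element per file
all_files_result = [" ".join(files)]  # a single space-separated element
print(len(plain_result), len(all_files_result))
```

A command using a plain primary would therefore run four times, while one using all_files would run once with all four names on its command line.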
The is_regex and returns options are interpreted as described in plain.
Warning
The filter_selected option is used to select which of the elements returned by the primary key are to be used. It is not used to filter substrings of the elements returned by the primary key. So using filter_selected with all_files might lead to unexpected results and we suggest avoiding it.
Built-in keyword types
plain
A static string. These three definitions are equivalent:
keyword: '-a -b --long option'
---
keyword:
  type: plain
  value: '-a -b --long option'
---
keyword: {type: plain, value: '-a -b --long option'}
regex
Returns a string obtained from primary replacing match with replace. It uses re.subn() to do the substitution. If e.g. the primary is file_001_LL.fits file_001_RL.fits, the following entry returns L001:
keyword:
  type: regex
  match: \S*?_(\d{3})_([LR]).*?\.fits
  replace: \2\1
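What `re.subn()` returns for this pattern can be inspected directly. The sketch shows both the per-file and the whole-string substitution; how the interpreter combines the per-file results is not shown here.

```python
import re

match = r"\S*?_(\d{3})_([LR]).*?\.fits"
primary = "file_001_LL.fits file_001_RL.fits"

# Substitution on each whitespace-separated file name: (new string, count).
per_file = [re.subn(match, r"\2\1", name) for name in primary.split()]
# Substitution on the whole primary string at once.
whole = re.subn(match, r"\2\1", primary)
print(per_file, whole)
```

The second element of each tuple is the number of substitutions performed, which is the quantity that the n_subs option constrains.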
If the substitution fails because of a regex mismatch or because more than one substitution is performed, a CIKeywordError is raised. It is possible to declare the expected number of substitutions or to disable the check altogether via the optional n_subs key:
- if not present: defaults to one if do_split is True, or to the number of input primary files otherwise;
- negative number: the check is disabled;
- positive integer: exactly n_subs substitutions must be performed. E.g.:
  keyword:
    type: regex
    match: \S*?_(\d{3})_([LR]).*?\.fits
    replace: \2\1
    n_subs: 2
  will fail because it requires two substitutions;
- list of integers: the number of substitutions must be one of the list entries:
  keyword:
    type: regex
    match: \S*?_(\d{3})_([LR]).*?\.fits
    replace: \2\1
    n_subs: [1,2]
  will accept either one or two substitutions;
- string: interpreted as a slice [start]:[stop][:step] or a comma separated list of [start],[stop][,step]. The string is used to initialize a SliceLike instance and then to check if the number of substitutions is within the allowed range as defined in the class documentation. E.g. the following will succeed:
  keyword:
    type: regex
    match: \S*?_(\d{3})_([LR]).*?\.fits
    replace: \2\1
    n_subs: 1:10:2
  but using n_subs: 0:10:2 will raise an error.
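The different n_subs forms can be sketched as a single check. This does not use the real SliceLike class: it assumes that a slice string allows exactly the counts in `range(start, stop, step)`, which is consistent with the 1:10:2 and 0:10:2 examples for a single substitution. The function name `n_subs_allowed` is hypothetical.

```python
import re
import sys


def n_subs_allowed(n_done, n_subs):
    """Check a substitution count against an n_subs value (illustrative)."""
    if isinstance(n_subs, int):
        # Negative disables the check; otherwise the count must match exactly.
        return n_subs < 0 or n_done == n_subs
    if isinstance(n_subs, str):
        # '1:10:2' or '1,10,2'; empty parts mean "not given".
        parts = [int(p) if p.strip() else None for p in re.split(r"[:,]", n_subs)]
        allowed = range(*slice(*parts).indices(sys.maxsize))
        return n_done in allowed
    # Otherwise assume a list of accepted counts.
    return n_done in n_subs


print(n_subs_allowed(1, "1:10:2"), n_subs_allowed(1, "0:10:2"))
```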
Finally, the do_split optional key will instruct the function whether to split the primary on white spaces or not. E.g.:
keyword:
  type: regex
  match: \S*?_(\d{3})_([LR]).*?\.fits
  replace: \2\1
  do_split: False
will return L001 R001 from the files Sfile_001_L.fits Sfile_001_R.fits as a single string. If not provided, it defaults to True.
Examples and more information about the Python regex syntax can be found in the official Python documentation.
fplane_map
This type maps from one type of ID to another using the fplane file. The following code shows all the mandatory keys; their explanation can be found below.
keyword: &fplane_map
  type: fplane_map
  fplane_file: /path/to/fplane.txt
  in_id:
    type: regex
    match: '.*?/dither_(\d{3})\.txt'
    replace: \1
  in_id_type: ifuslot
  out_id_type: ifuid
where:
- fplane_file points to the fplane file;
- in_id can be any of the available keyword types and is used to extract the ID from the primaries;
- in_id_type is the type of ID returned by in_id and can be any of the values supported by pyhetdex.het.fplane.FPlane.by_id();
- out_id_type is the type of ID to return and can be any of the ones supported by pyhetdex.het.fplane.IFU.
If the primary is /path/to/dither_073.txt and the fplane file contains the following IFU:
# IFUSLOT X_FP Y_FP SPECID SPECSLOT IFUID IFUROT PLATESC
073 150.0 150.0 04 136 023 0.0 1.0
the above configuration returns the value '023'.
Similarly to the header keyword, by default the ID is cast to a string. The formatter keyword can be used to control the formatting of the output ID. In the following example:
keyword:
  <<: *fplane_map
  out_id_type: specid
  formatter: '{:03d}'  # or '{0:03d}'
the return value is '004'. Without the formatter keyword the output would be '4'.
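The mapping and the effect of formatter can be illustrated without pyhetdex. The toy parser below is not the real fplane parser: `parse_fplane` and `map_id` are hypothetical helpers, and treating specid as an integer while keeping the other IDs as strings is an assumption chosen to match the values quoted above.

```python
import re

FPLANE_TEXT = """\
# IFUSLOT X_FP Y_FP SPECID SPECSLOT IFUID IFUROT PLATESC
073 150.0 150.0 04 136 023 0.0 1.0
"""


def parse_fplane(text):
    """Map each IFUSLOT to a dict of ID columns (toy parser)."""
    table = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        ifuslot, _x, _y, specid, _specslot, ifuid = line.split()[:6]
        # Assumption: specid is numeric, the other IDs are kept as strings.
        table[ifuslot] = {"ifuslot": ifuslot, "ifuid": ifuid, "specid": int(specid)}
    return table


def map_id(primary, out_id_type, formatter="{}"):
    """Extract the IFUSLOT from the primary and convert it to another ID."""
    in_id = re.sub(r".*?/dither_(\d{3})\.txt", r"\1", primary)
    return formatter.format(parse_fplane(FPLANE_TEXT)[in_id][out_id_type])


primary = "/path/to/dither_073.txt"
print(map_id(primary, "ifuid"), map_id(primary, "specid", "{:03d}"))
```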
As in the previous cases, if do_split is present and False, the IDs are extracted from all the primaries and converted; the resulting IDs are concatenated.
For information about the fplane parser, see the pyhetdex.het.fplane documentation.