Hmm - that example is misleading. That file is copied from the upstream sources but isn’t used by Guild in the way you’re expecting there.
Guild doesn’t support multiple flag values as a single assignment. This has come up a few times before — I believe there’s a GitHub issue for that. I’ll revive that issue and copy you there so you have a reference to it.
This is tricky for Guild (I think) as Guild doesn’t support lists for flag values, which is what you’re talking about here.
In the meantime, the official approach for this is to use a single string value and parse the value as you see fit. An easy way to enable support for a list is to use Python’s shlex module to parse a string.
With this example you need to remove nargs='+' from your arg config (this breaks your argparse interface though, not so great):
import shlex
model_dirs = shlex.split(args.model_dirs)
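For reference, here is a slightly fuller sketch of that single-string approach. The --model-dirs option name and the parse_model_dirs helper are made up for illustration, so adapt them to your script. Quoting follows POSIX shell rules, so a value that contains spaces can be kept together with quotes.

```python
import argparse
import shlex


def parse_model_dirs(argv):
    # Hypothetical interface: --model-dirs takes ONE string (no nargs).
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-dirs", default="")
    args = parser.parse_args(argv)
    # Split the single string into a list using POSIX shell rules.
    return shlex.split(args.model_dirs)


if __name__ == "__main__":
    # Three entries: the quoted part stays together as one item.
    print(parse_model_dirs(["--model-dirs", "a b 'c d'"]))
```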
To support nargs='+' with Guild:
import os
import shlex

if os.getenv("GUILD_OP") and args.model_dirs:
    args.model_dirs = shlex.split(args.model_dirs[0])
I think the latter is a better pattern as it preserves your intended interface. The use of Guild is a change, but as Guild formalizes flags as an interface (rather than args) I think this is okay. Of course what’s not okay is the modification to code for Guild’s sake, which we want to avoid.
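To make the second pattern concrete, here is a self-contained sketch. It assumes, as described above, that Guild sets GUILD_OP and passes the whole flag value as a single argument. The get_model_dirs wrapper and its environ parameter are only there so the behavior can be exercised without Guild; they are not part of any Guild API.

```python
import argparse
import os
import shlex


def get_model_dirs(argv, environ=os.environ):
    # The script keeps its intended interface: one or more values.
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-dirs", nargs="+", default=[])
    args = parser.parse_args(argv)
    # Under Guild (assumed to set GUILD_OP), the list arrives as a
    # single string in the first element, so split it here.
    if environ.get("GUILD_OP") and args.model_dirs:
        args.model_dirs = shlex.split(args.model_dirs[0])
    return args.model_dirs
```

Run directly, --model-dirs a b c behaves as usual. Run with GUILD_OP set, a single value such as "a b c" is split into the same three-item list.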
Implementation Notes
Flag attr is-list
I can imagine a flag attr is-list (boolean) that enables this behavior automatically. Guild would detect this when importing the flag if nargs is a list-enabling value. The question I have is whether this string-parsing via shlex is the right interface.
The list notation (e.g. model_dirs=[a,b,c]) is used for generating trials. We can’t use that to get a list value for a single run. I’ve wondered what interface we should provide. Introducing another list specification is tricky. We’re already annoying zsh users with the use of [...] as zsh treats square brackets as tokens. Other tokens to avoid: (...), {...}, #...#, and the list goes on.
Then there’s the issue of lists of lists for multiple runs. E.g. model_dirs=[[a,b,c],[d,e,f]].
Using the shell parsing logic via shlex.split is a way to handle this, as long as Guild knows the flag is a list. I think this is reasonable though: the flag attr type overrides the default parsing rules. So is-list tells Guild to treat values as lists, parsing them with Python’s shlex.split(VAL, posix=True) algorithm.
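A rough sketch of how that could behave, purely as an illustration and not Guild code. Both function names are hypothetical. The idea is that each run's value is one shlex-parseable string, so a sequence of trial values yields one list per run without nesting brackets.

```python
import shlex


def parse_list_flag(val):
    # What is-list might do for a single run: split one string
    # into list items using POSIX shell rules.
    return shlex.split(val, posix=True)


def parse_trial_values(trial_vals):
    # Hypothetical handling of multiple runs: one string per trial,
    # each parsed into its own list.
    return [parse_list_flag(val) for val in trial_vals]


if __name__ == "__main__":
    print(parse_list_flag("a b c"))
    print(parse_trial_values(["a b c", "d e f"]))
```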
Flag attr parser
This could be spelled as a “command line arg parser” rather than “is a list”. This is independent of type, which gives a hint as to how the value might be parsed (e.g. list of int, etc.)
train:
  flags:
    model_dirs:
      parser: shlex
The value for parser could be a reference to a Python function (e.g. utils.parse_model_dirs). The function signature would be (arg_val) or (arg_val, flag_def) (i.e. arity of 1 or 2).
If a string, the reference could be a reference to guild.flag_util.parse_<str>. So e.g. shlex would invoke guild.flag_util.parse_shlex.
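Here is one way the lookup and the arity handling could work, as a sketch only. Nothing here exists in Guild today: BUILTIN_PARSERS stands in for the proposed guild.flag_util.parse_<str> functions, and resolve_parser and apply_parser are made-up names. A short name is looked up in the built-in table, and anything else is treated as a dotted module.function reference.

```python
import importlib
import inspect
import shlex


def parse_shlex(arg_val):
    return shlex.split(arg_val, posix=True)


# Stand-in for the proposed guild.flag_util.parse_<str> functions.
BUILTIN_PARSERS = {"shlex": parse_shlex}


def resolve_parser(ref):
    # Short names map to built-ins; otherwise import module.function.
    if ref in BUILTIN_PARSERS:
        return BUILTIN_PARSERS[ref]
    module_name, _, func_name = ref.rpartition(".")
    if not module_name:
        raise ValueError("unknown parser: %s" % ref)
    return getattr(importlib.import_module(module_name), func_name)


def _required_positional_count(func):
    # Count positional parameters that have no default value.
    params = inspect.signature(func).parameters.values()
    return sum(
        1 for p in params
        if p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
        and p.default is p.empty
    )


def apply_parser(ref, arg_val, flag_def=None):
    func = resolve_parser(ref)
    # Arity 2 gets the flag definition, arity 1 gets the value only.
    if _required_positional_count(func) >= 2:
        return func(arg_val, flag_def)
    return func(arg_val)
```

With this, parser: shlex and parser: json.loads would both resolve, and a two-argument function could use flag_def (for example its type) to convert each item.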
The advantage of this approach over is-list is obviously flexibility: users can concoct their own flag parsing routines as needed (e.g. sets, maps, JSON, etc.) The spelling parser: shlex also provides the important detail that the strings must be formatted as shlex-parseable strings, which is otherwise implicit.
The counterpoint is that this is complicating things to accommodate edge cases. And while this scheme works nicely with Python globals, it does not work seamlessly with argparse or other command line arg interfaces. A simple “list plus type” scheme does. E.g. argparse doesn’t support maps or other non-list structures.
The only case that occurs to me where parser is not a stretch is to specify a different list interface. E.g. parser: csv would indicate that values are parsed using comma delimiters.
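For comparison with the shlex variant, a comma-delimited parser could look something like this. The parse_csv name is hypothetical. It uses Python's csv module so that quoted items may contain commas, which is one possible reading of “comma delimiters” and not a settled design.

```python
import csv


def parse_csv(arg_val):
    # An empty value means an empty list.
    if not arg_val:
        return []
    # Parse the value as a single CSV row; spaces after commas are dropped.
    return next(csv.reader([arg_val], skipinitialspace=True))


if __name__ == "__main__":
    print(parse_csv("a,b,c"))
    print(parse_csv('a,"b,c",d'))
```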