The Command Line Tool is a command-line script (executeworkflow.bat on Windows and executeworkflow.sh on Linux/UNIX) that runs workflows from a terminal.
Download the Taverna 2.2 Command Line Tool.
The executeworkflow script
To get help and a full set of options for the executeworkflow
script type the following in the command prompt:
sh executeworkflow.sh -help
It will give the following usage options:
usage: executeworkflow [options] [workflow]
-clientserver     Connects as a client to a Derby server instance.
-dbproperties     Loads a properties file to configure the database.
-embedded         Connects to an embedded Derby database. This can prevent multiple invocations.
-help             Displays comprehensive help information.
-inmemory         Runs the workflow with data stored in-memory rather than in a database. This can give performance improvements, at the cost of overall memory usage.
-inputdelimiter   Causes an inputvalue or inputfile to be split into a list according to the delimiter. The associated workflow input must expect a list.
-inputdoc         Loads inputs from a Baclava document.
-inputfile        Loads the named input from file or URL.
-inputvalue       Directly uses the value for the named input.
-logfile          The log file to which more verbose logging will be written to.
-outputdir        Saves outputs as files in directory. Default is to make a new directory called workflowName_output.
-outputdoc        Saves outputs to a new Baclava document.
-port             The port that the database is running on. If the script is starting its own internal server, it will be started on this port.
-provenance       Generates provenance information and stores it in the database.
-startdb          Automatically starts an internal Derby database server.
By default, the workflow is executed using the -inmemory
option, and the results are written out to a directory named after the workflow name.
If this directory already exists, a new directory is created with _<n> appended to its name, where n is incremented to the next available index.
Results are written out to files named after the output port for that result. If a result is composed of lists, then a directory is created for the output port and individual list items are named after the list element index (with 1 being the first index). If the output is the result of an error, ‘.error’ is appended to the filename.
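The layout described above makes it easy to check a finished run for failed outputs. The following is only a sketch: it builds a mock output directory with made-up port names (sequence, report, hits) rather than reading a real run, and it assumes that a failed list item is stored as <index>.error inside the port's directory.

```shell
#!/bin/sh
# Sketch: report which outputs of a run ended in an error.
# The directory below is a mock stand-in for a real <workflowName>_output directory;
# the port names "sequence", "report" and "hits" are hypothetical.
outdir=$(mktemp -d)
printf 'ACGT\n' > "$outdir/sequence"                 # normal single-value result
printf 'service failed\n' > "$outdir/report.error"   # result that is an error
mkdir "$outdir/hits"                                 # list result: one file per item
printf 'first item\n' > "$outdir/hits/1"
printf 'item failed\n' > "$outdir/hits/2.error"      # assumed name for a failed list item

# Collect every file whose name ends in .error, relative to the output directory.
failed=$(cd "$outdir" && find . -type f -name '*.error' | sed 's|^\./||' | sort)
echo "$failed"
```

To use this against a real run, point outdir at the directory the script created instead of the mock one.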
You can provide your own output directory with the -outputdir
option. There will be an error if the directory already exists.
You can also record your results to a Baclava document using the -outputdoc option. The document will be overwritten if it already exists. You can use the DataViewer Tool to view Baclava files.
Inputs can be provided in three ways: with -inputfile, -inputvalue or -inputdoc. The -inputfile and -inputvalue options can be used together; the -inputdoc option must be used on its own. The -inputfile and -inputvalue options each take two additional arguments: the name of the port for the input, followed by either a file containing the input data or the input value itself, respectively.
If one or more of your workflow inputs is described as a list, you can create a list by using the -inputdelimiter option, which may be used with either -inputfile or -inputvalue. This option takes two parameters: an input name and the delimiter by which to split the input into a list.
The delimiter may be a simple character, such as a comma or a new-line character, or a regular expression. The input string, or the content of the input file, is then split on the specified delimiter and converted into a list.
If a list of greater depth (i.e. a list of lists or deeper) is required, then you will need to use the -inputdoc option to pass data from a Baclava file. However, if you provide an input of lower depth than that required, it will automatically be wrapped in one or more lists up to the required depth. Providing an input of greater depth than that required will result in an error.
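To make the splitting behaviour concrete, the sketch below mimics it in plain shell. It is an illustration only, not Taverna's own code: the value and the delimiter are made up, and the delimiter is treated as a literal single character rather than a regular expression.

```shell
#!/bin/sh
# Illustration only: how a delimited input string turns into a depth-1 list.
value="alpha,beta,gamma"   # hypothetical input value
delimiter=","              # hypothetical delimiter (literal character here)

index=0
old_ifs=$IFS
IFS=$delimiter             # let the shell split the value on the delimiter
for item in $value; do
    index=$((index + 1))   # list items are counted from 1
    echo "item $index: $item"
done
IFS=$old_ifs

# The corresponding invocation, for a hypothetical port in1 and workflow wf.t2flow:
#   sh executeworkflow.sh -inputvalue in1 "alpha,beta,gamma" -inputdelimiter in1 "," wf.t2flow
```

Each printed item corresponds to one element of the list that the workflow input would receive.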
Running the script with a database
If a workflow has a high memory requirement, then it may be better to run it using a database to store data rather than storing it in memory, which is the default option. There are three options for using a database:
-embedded option runs with an embedded database. This is slightly faster than the -clientserver option (below), but has the limitation that only one executeworkflow script may be executed at a time.
-clientserver option allows the workflow to be executed backed by the database running as a server. By default a database is not started for you, but one may be started using the -startdb option.
-startdb option starts a database. It may be used without providing a workflow to allow a database to be started separately, allowing multiple simultaneous executeworkflow script runs.
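A shared database server followed by several client runs might be scripted roughly as follows. This is a sketch under assumptions: the run helper, the DRY_RUN variable and the workflow file names are made up for illustration, and by default the helper only prints each command. Whether -startdb keeps running in the foreground is not stated here, so in a real session the first command may need its own terminal.

```shell
#!/bin/sh
# Sketch: one separately started database, several workflow runs against it.
# DRY_RUN=1 (the default in this sketch) prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: prints or executes one executeworkflow invocation.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "sh executeworkflow.sh $*"
    else
        sh executeworkflow.sh "$@"
    fi
}

# 1. Start the database on its own, without a workflow argument.
run -startdb

# 2. Run workflows as clients of that server (file names are hypothetical).
run -clientserver wf-a.t2flow
run -clientserver wf-b.t2flow
```

Setting DRY_RUN=0 from the Taverna installation directory would execute the commands instead of printing them.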
More advanced database configurations can be specified using the -dbproperties option, which gives you full control over the database used. It takes a second argument, the filename of the properties file. The following example contains the default settings used:
in_memory = true
provenance = false
connector = derby
port = 1527
dialect = org.hibernate.dialect.DerbyDialect
start_derby = false
driver = org.apache.derby.jdbc.EmbeddedDriver
jdbcuri = jdbc:derby:t2-database;create=true;upgrade=true
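One way to get started is to write these defaults to a file and then edit the values you need. The sketch below does that; the file name db.properties and the workflow name in the closing comment are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: save the default database settings to a properties file.
props=db.properties   # hypothetical file name

cat > "$props" <<'EOF'
in_memory = true
provenance = false
connector = derby
port = 1527
dialect = org.hibernate.dialect.DerbyDialect
start_derby = false
driver = org.apache.derby.jdbc.EmbeddedDriver
jdbcuri = jdbc:derby:t2-database;create=true;upgrade=true
EOF

# The file is then given as the argument to -dbproperties, for example:
#   sh executeworkflow.sh -dbproperties db.properties wf.t2flow
```

The quoted 'EOF' marker keeps the shell from expanding anything inside the file content, so the semicolons and equals signs are written literally.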
If you want to run the Derby database in client/server mode instead of with the embedded driver, change the properties in the -dbproperties file (described above) as follows:
driver=org.apache.derby.jdbc.ClientDriver
start_derby=true
Note that there should be an underscore in the start_derby
option.
If you wish to run your own separate Derby server instance, then do not define the start_derby option, and define the port on which you are running your Derby server as port=<port>.
If you want to use a MySQL database rather than Derby, you first need to drop the MySQL connector jar file into the lib folder of the Taverna installation directory (the Taverna installation directory is where you are running the script from). The supplied MySQL jar is version 5.1.5.
You need to edit the -dbproperties
file as follows:
connector=mysql
jdbcuri=jdbc:mysql://localhost/T2Provenance
dialect=org.hibernate.dialect.MySQLDialect
driver=com.mysql.jdbc.Driver
username=<username>
password=<password>
port=<port>
It is essential that the database name in the jdbcuri
property is T2Provenance, as this database is hard-coded into the MySQL provenance SQL queries.
If you do not specify the port, the script will try to connect to the default port for MySQL server which is 3306.
If you use -dbproperties together with other options, the other options take precedence.
Examples
Some examples of how the script can be invoked are shown below.
sh executeworkflow.sh "/Users/alex/Taverna Workflows/wf-1.t2flow"
Executes the workflow located at /Users/alex/Taverna Workflows/wf-1.t2flow, which has no inputs, and stores data in memory (the default option). Make sure to enclose the file path in quotes if it contains spaces. The <workflowName>_output directory will be created in the current directory and outputs will be written to it.
sh executeworkflow.sh -embedded -inputvalue in1 aaa -inputvalue in2 "bb b" -outputdir /tmp/wf-2/ wf-2.t2flow
Executes the workflow wf-2.t2flow from the current directory, passing the value “aaa” to the input port in1 and the value “bb b” to the input port in2. If input values contain spaces, make sure to enclose them in quotes. Uses the embedded Derby database to store the data. Outputs will be written to the /tmp/wf-2/ directory.
sh executeworkflow.sh -inputvalue in1 aaa -inputfile in2 input2.txt -inputdelimiter in2 "\n" wf-3.t2flow
Executes the workflow wf-3.t2flow from the current directory, passing the value “aaa” to the input port in1, splitting the content of the file input2.txt using “\n” (the new-line character) as the delimiter, and passing the resulting list to the input port in2. Make sure to put the delimiter in quotes, even if it is just a single character, like “;”. The <workflowName>_output directory will be created in the current directory and outputs will be written to it.
sh executeworkflow.sh -inputdoc /tmp/input-doc.xml -outputdoc /tmp/output-doc.xml "/Users/alex/Taverna Workflows/wf-4.t2flow"
Executes the workflow /Users/alex/Taverna Workflows/wf-4.t2flow, loading inputs from the Baclava document /tmp/input-doc.xml and writing the outputs to the Baclava document /tmp/output-doc.xml.