Taverna has now moved to the Apache Software Foundation. For updated information, see Apache Taverna (incubating).

Taverna Command Line Tool 2.3

The Command Line Tool is a command-line script (executeworkflow.bat on Windows and executeworkflow.sh on Linux/UNIX) that runs workflows from a terminal.

Download the Taverna 2.3 Command Line Tool.

The executeworkflow script

To get help and the full set of options for the executeworkflow script, type the following at the command prompt:

sh executeworkflow.sh -help

It will give the following usage options:

usage: executeworkflow [options] [workflow]

 -clientserver      Connect as a client to a derby server instance.

 -cmdir             Absolute path to a directory where Credential Manager's files
                    (keystore and truststore) are located.

 -cmpassword        Indicate that the master password for Credential Manager will be provided on
                    standard input.

 -dbproperties      Load a properties file to configure the database.

 -embedded          Connect to an embedded Derby database. This can prevent multiple invocations.

 -help              Display comprehensive help information.

 -inmemory          Run the workflow with data stored in-memory rather than in a database. This
                    can give performance improvements, at the cost of overall memory usage.

 -inputdelimiter    Cause an inputvalue or inputfile to be split into a list according to the
                    delimiter. The associated workflow input must be expected to receive a list.

 -inputdoc          Load inputs from a Baclava document.

 -inputfile         Load the named input from file or URL.

 -inputvalue        Directly use the value for the named input.

 -janus             Save Janus RDF/XML trace of execution to FILE or 'provenance-janus.rdf'.

 -logfile           The logfile to which more verbose logging will be written.

 -opm               Save Open Provenance Model (OPM) RDF/XML trace of execution to FILE or
                    'provenance-opm.rdf'.

 -outputdir         Save outputs as files in directory, default is to make a new directory
                    workflowName_output.

 -outputdoc         Save outputs to a new Baclava document.

 -port              The port that the database is running on. If requested to start its own
                    internal server, this is the start port that will be used.

 -provenance        Generate provenance information and store it in the database.

 -startdb           Automatically start an internal Derby database server.

By default, the workflow is executed using the -inmemory option, and the results are written out
to a directory named after the workflow name.

If this directory already exists then a new directory is created, and appended with _<n>, where n is
incremented to the next available index.

Results are written out to files named after the output port for that result. If a result is
composed of lists, then a directory is created for the output port and individual list items are
named after the list element index (with 1 being the first index). If the output is the result of
an error, the filename is appended with '.error'.

You can provide your own output directory with the -outputdir option. There will be an error if the
directory already exists.

You can also record your results to a Baclava document using -outputdoc option. The document will be
overwritten if it already exists.

Inputs can be provided in three ways. Both -inputfile and -inputvalue options can be used together;
-inputdoc option must be used on its own. -inputfile and -inputvalue options both take two
additional arguments, the name of the port for the input, and either a file containing the input
data or the input value itself respectively.

If one or more of your workflow inputs is a list, you can create a list input by using the
-inputdelimiter option, which may be used with either -inputfile or -inputvalue. This option takes
two parameters - an input name and the delimiter by which to split the input into a list.

The delimiter may be a simple character, such as a comma or a new-line character, or a regular
expression. The input string, or file, will then be converted into a list being split by the
delimiter specified. Make sure to put the delimiter character in quotes as it may be interpreted by
the shell as a special character, e.g. ;.

If a list of greater depth (i.e. a list of lists or deeper) is required then you will need to use
the -inputdoc option. However, if you provide an input of lower depth to that required, then
it will automatically be wrapped in one or more lists up to the required depth. Providing an
input of greater depth than that required will result in an error.

If a workflow has a high memory requirement, then it may be better to run it using a database to
store data rather than storing it in memory, which is the default option. There are three options
for using a database:

-embedded option, runs with an embedded database. This is slightly faster than the -clientserver
option (below), but has the limitation that only one executeworkflow script may be executed
simultaneously.

-clientserver option allows the workflow to be executed backed by the database running as a server.
By default a database is not started for you, but may be started using -startdb option.

-startdb option starts a database. It may be used without providing a workflow to allow a database
to be started separately, allowing multiple simultaneous executeworkflow script runs.

More advanced database configurations can be specified using -dbproperties option, allowing you
to take full control over the database used. This takes a second argument, the filename of the
properties file, for which the following example contains the default settings used:

in_memory = true
provenance = false
connector = derby
port = 1527
dialect = org.hibernate.dialect.DerbyDialect
start_derby = false
driver = org.apache.derby.jdbc.EmbeddedDriver
jdbcuri = jdbc:derby:t2-database;create=true;upgrade=true

Note that when using -dbproperties together with other options, the other options take precedence.

-cmdir option lets you specify an absolute path to a directory where Credential Manager's files
(keystore and truststore - containing user's credentials and trusted certificates for accessing
secure services) are stored. If not specified and the workflow requires access to these files,
Taverna will try to find them in the default location in /security somewhere inside user's home
directory (depending on the platform).

-cmpassword option can be used to tell Taverna to expect the password for the Credential Manager
on standard input. If the password is not piped in, Taverna will prompt you for it in the terminal
and block until it is entered. Do not enter your password in the command line! If -cmpassword
option is not specified and -cmdir option is used, Taverna will try to find the password in a special
file password.txt in the directory specified with the -cmdir option.

By default, the workflow is executed using the -inmemory option, and the results are written out to a directory named after the workflow name.

If this directory already exists then a new directory is created, and appended with _<n>, where n is incremented to the next available index.

Results are written out to files named after the output port for that result. If a result is composed of lists, then a directory is created for the output port and individual list items are named after the list element index (with 1 being the first index). If the output is the result of an error, the filename is appended with ‘.error’.
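The naming convention above makes it possible to check a finished run for failed outputs with standard tools. The following is a minimal sketch, assuming the output directory follows the layout described here; the function name list_error_outputs is hypothetical and not part of Taverna.

```shell
# Hypothetical helper (not part of Taverna): list every result file whose
# name ends in ".error" under the given output directory.
list_error_outputs() {
    outdir="$1"
    # List outputs live in per-port subdirectories, so the search is
    # recursive; -type f restricts the match to regular files.
    find "$outdir" -type f -name '*.error' | sort
}
```

For example, list_error_outputs wf-1_output would print the paths of any error results under a directory named wf-1_output (an illustrative name).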

You can provide your own output directory with the -outputdir option. There will be an error if the directory already exists.
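Because -outputdir fails when the directory already exists, a script that runs the same workflow repeatedly may want to pick an unused name first. The sketch below imitates the _<n> suffix convention described above; the helper name fresh_output_dir is hypothetical, and starting the index at 1 is an assumption rather than documented Taverna behaviour.

```shell
# Hypothetical helper (not part of Taverna): print a path that does not
# exist yet, trying BASE first and then BASE_1, BASE_2, ...
# Starting the suffix at 1 is an assumption.
fresh_output_dir() {
    base="$1"
    candidate="$base"
    n=1
    while [ -e "$candidate" ]; do
        candidate="${base}_${n}"
        n=$((n + 1))
    done
    printf '%s\n' "$candidate"
}
```

It could then be used as: sh executeworkflow.sh -outputdir "$(fresh_output_dir /tmp/wf-2)" wf-2.t2flow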

You can also record your results to a Baclava document using -outputdoc option. The document will be overwritten if it already exists. You can use the DataViewer Tool to view Baclava files.

Inputs can be provided in three ways. Both -inputfile and -inputvalue options can be used together; -inputdoc option must be used on its own. -inputfile and -inputvalue options both take two additional arguments, the name of the port for the input, and either a file containing the input data, or the input value itself respectively.

If one or more of your workflow inputs is described as a list, you can create a list by using the -inputdelimiter option, which may be used with either -inputfile or -inputvalue. This option takes two parameters: an input name and the delimiter by which to split the input into a list.

The delimiter may be a simple character, such as a comma or a new-line character, or a regular expression. The input string, or file, will then be converted into a list being split by the delimiter specified.
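Before passing a delimited value to a workflow, it can help to preview the list it will produce. The sketch below is only an approximation for a single-character delimiter; it does not reproduce regular-expression delimiters, and the helper name preview_split is hypothetical.

```shell
# Hypothetical helper (not part of Taverna): print one list item per line
# for a value split on a single-character delimiter.
preview_split() {
    value="$1"
    delim="$2"
    # tr replaces every occurrence of the delimiter with a newline.
    printf '%s\n' "$value" | tr "$delim" '\n'
}
```

For instance, preview_split "a;b;c" ";" prints three lines, and the matching invocation would be: sh executeworkflow.sh -inputvalue in1 "a;b;c" -inputdelimiter in1 ";" wf.t2flow (the port name in1 and the file wf.t2flow are placeholders).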

If a list of greater depth (i.e. a list of lists or deeper) is required then you will need to use the -inputdoc option to pass data from a Baclava file. However, if you provide an input of lower depth than that required, then it will automatically be wrapped in one or more lists up to the required depth. Providing an input of greater depth than that required will result in an error.

Running the script with database

If a workflow has a high memory requirement, then it may be better to run it using a database to store data rather than storing it in memory, which is the default option. There are three options for using a database:

  • -embedded option runs with an embedded database. This is slightly faster than the -clientserver option (below), but has the limitation that only one executeworkflow script may be executed at a time.
  • -clientserver option allows the workflow to be executed backed by the database running as a server. By default a database is not started for you, but may be started using -startdb option.
  • -startdb option starts a database. It may be used without providing a workflow to allow a database to be started separately, allowing multiple simultaneous executeworkflow script runs.
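The combination of -startdb and -clientserver can be wrapped in two small shell functions. This is a sketch only: the variable EXECUTEWORKFLOW and the function names are hypothetical, and whether -startdb stays in the foreground until interrupted is not stated here, so it is assumed that it is started in its own terminal.

```shell
# Path to the launcher script; adjust to your installation (assumption).
EXECUTEWORKFLOW="${EXECUTEWORKFLOW:-./executeworkflow.sh}"

# Start a standalone database server; no workflow argument is given.
start_database() {
    sh "$EXECUTEWORKFLOW" -startdb
}

# Run one workflow against a database server that is already running.
# All arguments (options and the workflow) are passed through unchanged.
run_with_server() {
    sh "$EXECUTEWORKFLOW" -clientserver "$@"
}
```

With the server started once via start_database, several terminals could each call run_with_server with a different workflow file.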

More advanced database configurations can be specified using the -dbproperties option, allowing you to take full control over the database used. This takes a second argument, the filename of the properties file, for which the following example contains the default settings used:

in_memory = true
provenance = false
connector = derby
port = 1527
dialect = org.hibernate.dialect.DerbyDialect
start_derby = false
driver = org.apache.derby.jdbc.EmbeddedDriver
jdbcuri = jdbc:derby:t2-database;create=true;upgrade=true

If you want to run the Derby database with client/server setting instead of with the embedded driver, change the properties in the -dbproperties file (described above) as follows:

driver=org.apache.derby.jdbc.ClientDriver
start_derby=true

Note that there should be an underscore in the start_derby option.

If you wish to run your own separate Derby server instance, then do not define the start_derby option and define the port on which you are running your Derby server as port=<port>.

If you want to use a MySQL database rather than Derby, you first need to drop the mysql-java-connector.jar file into the lib folder of the Taverna installation directory (the directory you are running the script from). The supplied MySQL jar is version 5.1.5.

You need to edit the -dbproperties file as follows:

connector=mysql
jdbcuri=jdbc:mysql://localhost/T2Provenance
dialect=org.hibernate.dialect.MySQLDialect
driver=com.mysql.jdbc.Driver
username=<username>
password=<password>
port=<port>

It is essential that the database name in the jdbcuri property is T2Provenance, as this database is hard-coded into the MySQL provenance SQL queries.

If you do not specify the port, the script will try to connect to the default port for MySQL server which is 3306.
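The MySQL settings above can be generated from a script rather than edited by hand. The sketch below writes the same seven properties to a file; the function name write_mysql_properties and its argument order are hypothetical, and restricting the file permissions is a precaution added here, not a Taverna requirement.

```shell
# Hypothetical helper (not part of Taverna): write a MySQL properties file
# for use with -dbproperties.
# Arguments: output file, username, password, optional port (default 3306).
write_mysql_properties() {
    file="$1"
    user="$2"
    pass="$3"
    port="${4:-3306}"
    # The subshell keeps the umask change local; a newly created file is
    # readable by its owner only, since it contains a password.
    (
        umask 077
        printf '%s\n' \
            "connector=mysql" \
            "jdbcuri=jdbc:mysql://localhost/T2Provenance" \
            "dialect=org.hibernate.dialect.MySQLDialect" \
            "driver=com.mysql.jdbc.Driver" \
            "username=$user" \
            "password=$pass" \
            "port=$port" > "$file"
    )
}
```

The resulting file would then be passed as: sh executeworkflow.sh -dbproperties mysql.properties wf-1.t2flow (both file names are placeholders).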

Note: when using -dbproperties together with other options, the other options take precedence.

Examples

Some examples on how the script can be invoked are shown below.

Execute a workflow on local disk with no inputs

sh executeworkflow.sh "/Users/alex/Taverna Workflows/wf-1.t2flow"

Executes the workflow located in /Users/alex/Taverna Workflows/wf-1.t2flow that has no inputs and uses memory for data storage (the default option). Make sure to enclose the file path in quotes if it contains spaces. The <workflowName>_output directory will be created in the current directory and outputs will be written to it.

Execute a workflow from a URL with no inputs

sh executeworkflow.sh "http://www.myexperiment.org/workflows/1005/download?version=2"

Executes the workflow located at http://www.myexperiment.org/workflows/1005/download?version=2 that has no inputs and uses memory for data storage (the default option). Make sure to enclose the URL in quotes as it may contain special characters interpreted by the command prompt shell. The <workflowName>_output directory will be created in the current directory and outputs will be written to it.

Execute a local workflow with two inputs passed as inline values and embedded database for data storage

sh executeworkflow.sh -embedded -inputvalue in1 aaa -inputvalue in2 "bb b" -outputdir /tmp/wf-2/ wf-2.t2flow

Executes the workflow wf-2.t2flow from the current directory, passing the value “aaa” to the input port in1 and the value “bb b” to the input port in2. If input values contain spaces, make sure to enclose them in quotes. Uses the embedded Derby database to store the data. Outputs will be written to the /tmp/wf-2/ directory.

Execute a local workflow with one input passed as an inline value and one as a file, splitting the second input into lines

sh executeworkflow.sh -inputvalue in1 aaa -inputfile in2 input2.txt -inputdelimiter in2 "\n" wf-3.t2flow

Executes the workflow wf-3.t2flow from the current directory, passing the value “aaa” to the input port in1. The content of the file input2.txt is split using “\n” (new line character) as the delimiter, and the resulting list is passed to the input port in2. Make sure to put the delimiter in quotes, even if it is just a single character, like “;”. The <workflowName>_output directory will be created in the current directory and outputs will be written to it.

Execute a local workflow with inputs passed as a Baclava document and save the results to a Baclava document

sh executeworkflow.sh -inputdoc /tmp/input-doc.xml -outputdoc /tmp/output-doc.xml "/Users/alex/Taverna Workflows/wf-4.t2flow"

Executes the workflow /Users/alex/Taverna Workflows/wf-4.t2flow, loading inputs from the Baclava document /tmp/input-doc.xml, and writing the outputs to the Baclava document /tmp/output-doc.xml.

Execute a workflow from a URL that requires access to secure services and Credential Manager

sh executeworkflow.sh -cmdir "/User/alex/Library/Application Support/taverna-2.3.0/security" -cmpassword /tmp/output-doc.xml "http://tinyurl.com/6kv2t5c"

Executes the workflow from http://tinyurl.com/6kv2t5c, which requires no inputs but calls a secure service that needs a username and a password (testuser/testpasswd) from Credential Manager. The path to the Credential Manager files is set to /User/alex/Library/Application Support/taverna-2.3.0/security/. Run this workflow from the Workbench first so that the username and password are saved in Credential Manager, then point to those files from the command line. The Taverna Workbench saves the Credential Manager files to the TAVERNA_HOME/security directory. The -cmpassword option tells the Command Line Tool to prompt for the Credential Manager’s (master) password in the terminal; do not type the password directly into the command line.
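For unattended runs, the master password can be supplied on standard input instead of being typed at the prompt. The sketch below redirects a password file to the script's standard input; the variable EXECUTEWORKFLOW and the function run_with_credentials are hypothetical, and it is an assumption that the password is read as the first line of input.

```shell
# Path to the launcher script; adjust to your installation (assumption).
EXECUTEWORKFLOW="${EXECUTEWORKFLOW:-./executeworkflow.sh}"

# Hypothetical helper (not part of Taverna): run a workflow with the
# Credential Manager master password supplied on standard input.
# Arguments: password file, Credential Manager directory, then the
# workflow and any further options.
run_with_credentials() {
    password_file="$1"
    cmdir="$2"
    shift 2
    # The password never appears on the command line, so it is not
    # exposed through the process list or the shell history.
    sh "$EXECUTEWORKFLOW" -cmdir "$cmdir" -cmpassword "$@" < "$password_file"
}
```

The password file should be readable only by its owner, for example after chmod 600.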