Getting Started
In this section we will run through the Ensemble Toolkit API. We will develop an example application consisting of a simple bag of Tasks.
Note
The reader is assumed to be familiar with the PST Model and to have read through the Introduction of Ensemble Toolkit.
Note
This chapter assumes that you have successfully installed Ensemble Toolkit; if not, see Installation.
You can download the complete code discussed in this section here or find it in your virtualenv under share/radical.entk/user_guide/scripts.
Importing components from the Ensemble Toolkit Module
To create the application in this section, you need to import four classes from the radical.entk module: Pipeline, Stage, Task, and AppManager. We have already discussed these components in the earlier sections.
from radical.entk import Pipeline, Stage, Task, AppManager
Creating the workflow
We first create a Pipeline, Stage and Task object. Then we assign the ‘executable’ and ‘arguments’ for the Task. For this example, we will create one Pipeline consisting of one Stage that contains one Task.
In the snippet below, we first create a Pipeline and then a Stage.
    # Create a Pipeline object
    p = Pipeline()

    # Create a Stage object
    s = Stage()
Next, we create a Task and assign its name, its executable, and the arguments of the executable.
    # Create a Task object
    t = Task()
    t.name = 'my.first.task' # Assign a name to the task (optional, do not use ',' or '_')
    t.executable = '/bin/echo' # Assign executable to the task
    t.arguments = ['Hello World'] # Assign arguments for the task executable
Now that we have a fully described Task, a Stage, and a Pipeline, we create our workflow by adding the Task to the Stage and adding the Stage to the Pipeline.
    # Add Task to the Stage
    s.add_tasks(t)

    # Add Stage to the Pipeline
    p.add_stages(s)
Creating the AppManager
Now that our workflow has been created, we need to specify where it is to be
executed. For this example, we will simply execute the workflow locally. We
create an AppManager object and describe a resource request for 1 core for 10
minutes on localhost, i.e. your local machine. We then assign the resource
request description and the workflow to the AppManager and run our application.
    # Create Application Manager
    appman = AppManager()

    # Create a dictionary describing the mandatory keys:
    # resource, walltime, and cpus
    # resource is 'local.localhost' to execute locally
    res_dict = {

        'resource': 'local.localhost',
        'walltime': 10,
        'cpus': 1
    }

    # Assign resource request description to the Application Manager
    appman.resource_desc = res_dict

    # Assign the workflow as a set or list of Pipelines to the Application Manager
    # Note: The list order is not guaranteed to be preserved
    appman.workflow = set([p])

    # Run the Application Manager
    appman.run()
Warning
If the default Python on your system is Anaconda Python, please change the 'resource' entry of res_dict in the above code block to
'resource': 'local.localhost_anaconda',
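The choice between the two resource labels can be wrapped in a small helper. The sketch below is only an illustration: the function name make_resource_desc and its anaconda flag are hypothetical and not part of Ensemble Toolkit; only the three keys and the two resource labels come from this section.

```python
def make_resource_desc(anaconda=False, walltime=10, cpus=1):
    """Build a resource description dict for local execution.

    Hypothetical helper (not an Ensemble Toolkit API): it only assembles
    the 'resource', 'walltime', and 'cpus' keys used in this section.
    """
    # 'local.localhost_anaconda' is the label this guide suggests when the
    # default Python is Anaconda Python; otherwise use 'local.localhost'.
    resource = 'local.localhost_anaconda' if anaconda else 'local.localhost'
    return {
        'resource': resource,
        'walltime': walltime,  # minutes, as in the example above
        'cpus': cpus,
    }
```

The result would then be assigned as before, for example appman.resource_desc = make_resource_desc(anaconda=True).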
To run the script, simply execute the following from the command line:
python get_started.py
Warning
The first run may fail for different reasons, most of which are
related to setting up the execution environment or requesting the correct
resources. Upon failure, Python may incorrectly raise the exception
KeyboardInterrupt. This may be confusing because it is reported even when
no keyboard interrupt has been issued. We have not yet found a way to
avoid raising that exception.
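Because a spurious KeyboardInterrupt can hide the real cause of a failure, one option is to catch it around the run call and point the user at the logs. This is a minimal sketch under assumptions: run_reporting_interrupt is a hypothetical helper, not part of Ensemble Toolkit, and it accepts any zero-argument callable (for example appman.run) so that it can be tried without a working installation.

```python
import sys


def run_reporting_interrupt(run, stream=sys.stderr):
    """Call run() and report a KeyboardInterrupt instead of a bare traceback.

    Hypothetical helper: 'run' is any zero-argument callable, e.g. appman.run.
    Returns True if run() completed, False if KeyboardInterrupt was raised.
    """
    try:
        run()
    except KeyboardInterrupt:
        # The interrupt may have been raised by a failure rather than by
        # the user, so suggest checking the logs for the underlying error.
        stream.write('Run ended with KeyboardInterrupt; if you did not '
                     'press Ctrl+C, check the logs for the actual failure.\n')
        return False
    return True
```

Other exceptions are deliberately not caught here, so genuine errors still surface with their full traceback.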
And that’s it! Those are all the steps in this example. You can generate more verbose output by setting two environment variables: `export RADICAL_LOG_TGT=radical.log; export RADICAL_LOG_LVL=DEBUG`.
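The same two variables can also be set for a single run from Python rather than exported in the shell. The sketch below is an illustration under assumptions: run_with_debug_log is a hypothetical helper, and it simply launches a script (such as get_started.py) in a child process with RADICAL_LOG_TGT and RADICAL_LOG_LVL added to its environment.

```python
import os
import subprocess
import sys


def run_with_debug_log(script, log_target='radical.log', level='DEBUG'):
    """Run a Python script with the RADICAL logging variables set.

    Hypothetical helper: only the variable names and values come from
    this guide; the child process inherits the rest of the environment.
    """
    env = dict(os.environ)
    env['RADICAL_LOG_TGT'] = log_target
    env['RADICAL_LOG_LVL'] = level
    # Use the current interpreter so the active virtualenv is respected.
    return subprocess.run([sys.executable, script], env=env)
```

Calling run_with_debug_log('get_started.py') would then correspond to exporting both variables and running python get_started.py, without changing the environment of the calling shell.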
After the example has run, you may want to check the output. Under your home folder, you will find a folder named radical.pilot.sandbox. In that folder, there will be a re.session.* folder and a ve.local.localhost folder. Inside re.session.*, there is a pilot.0000 folder, and in there a unit.000000 folder. In the unit folder, you will see several files, including unit.000000.out and unit.000000.err. The unit.000000.out file holds the messages from standard output and unit.000000.err holds the messages from standard error. The unit.000000.out file should contain a Hello World message.
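The directory layout described above can also be searched programmatically. The following sketch assumes exactly that layout (re.session.*/pilot.*/unit.*/unit.*.out below radical.pilot.sandbox in the home folder), which may differ between versions; find_unit_outputs is a hypothetical helper, not part of Ensemble Toolkit.

```python
from pathlib import Path


def find_unit_outputs(sandbox=None):
    """Return the unit.*.out files found below the pilot sandbox, sorted.

    Hypothetical helper: 'sandbox' defaults to ~/radical.pilot.sandbox,
    the location named in this guide.
    """
    root = Path(sandbox) if sandbox else Path.home() / 'radical.pilot.sandbox'
    # The pattern mirrors the layout described above:
    # re.session.*/pilot.*/unit.*/unit.*.out
    return sorted(root.glob('re.session.*/pilot.*/unit.*/unit.*.out'))
```

Reading each returned path, for example with Path.read_text(), should show the Hello World message for this example.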
Let’s look at the complete code for this example:
#!/usr/bin/env python

from radical.entk import Pipeline, Stage, Task, AppManager
import os

# ------------------------------------------------------------------------------
# Set default verbosity
if os.environ.get('RADICAL_ENTK_VERBOSE') is None:
    os.environ['RADICAL_ENTK_REPORT'] = 'True'


if __name__ == '__main__':

    # Create a Pipeline object
    p = Pipeline()

    # Create a Stage object
    s = Stage()

    # Create a Task object
    t = Task()
    t.name = 'my.first.task' # Assign a name to the task (optional, do not use ',' or '_')
    t.executable = '/bin/echo' # Assign executable to the task
    t.arguments = ['Hello World'] # Assign arguments for the task executable

    # Add Task to the Stage
    s.add_tasks(t)

    # Add Stage to the Pipeline
    p.add_stages(s)

    # Create Application Manager
    appman = AppManager()

    # Create a dictionary describing the mandatory keys:
    # resource, walltime, and cpus
    # resource is 'local.localhost' to execute locally
    res_dict = {

        'resource': 'local.localhost',
        'walltime': 10,
        'cpus': 1
    }

    # Assign resource request description to the Application Manager
    appman.resource_desc = res_dict

    # Assign the workflow as a set or list of Pipelines to the Application Manager
    # Note: The list order is not guaranteed to be preserved
    appman.workflow = set([p])

    # Run the Application Manager
    appman.run()