7.1.3. Task API¶
-
class
radical.entk.
Task
(from_dict=None)[source] A Task is an abstraction of a computational unit. In this case, a Task consists of its executable along with its required software environment, files to be staged as input and output.
At the user level a Task is to be populated by assigning attributes. Internally, an empty Task is created and population using the from_dict function. This is to avoid creating Tasks with new uid as tasks with new uid offset the uid count file in radical.utils and can potentially affect the profiling if not taken care.
-
arguments
List of arguments to be supplied to the executable
Getter: returns the list of arguments of the current task Setter: assigns a list of arguments to the current task Arguments: list of strings
-
copy_input_data
Copies data (filenames in a list) from another task to a current task (or data staging area) before it starts.
The following is an example
t2.copy_input_data = ['$Pipeline_%s_Stage_%s_Task_%s/output.txt' % (p.name, s1.name, t1.name)] # output.txt is copied from a t1 task to a current task before it # starts.
Getter: return the list of files Setter: assign the list of files Arguments: list of strings
-
copy_output_data
Copies data (filenames in a list) from a current task to another task (or data staging area) when a task is finished.
The following is an example
t.copy_output_data = [ 'results.txt > $SHARED/results.txt' ] # results.txt is copied to a data staging area `$SHARED` when a task is # finised.
Getter: return the list of files Setter: assign the list of files Arguments: list of strings
-
cpu_reqs
Purpose: The CPU requirements of the current Task.
The requirements are described in terms of the number of processes and threads to be run in this Task. The expected format is:
task.cpu_reqs = {'cpu_processes' : X, 'cpu_process_type' : None/MPI, 'cpu_threads' : Y, 'cpu_thread_type' : None/OpenMP}
This description means that the Task is going to spawn X processes and Y threads per each of these processes to run on CPUs. Hence, the total number of cpus required by the Task is X*Y for all the processes and threads to execute concurrently. The same assumption is made in implementation and X*Y cpus are requested for this Task.
The default value is:
task.cpu_reqs = {'cpu_processes' : 1, 'cpu_process_type' : None, 'cpu_threads' : 1, 'cpu_thread_type' : None}
This description requests 1 core and expected the executable to non-MPI and single threaded.
Getter: return the cpu requirement of the current Task Setter: assign the cpu requirement of the current Task Arguments: dict
-
download_output_data
Downloads data (filenames in a list) from a current task to a local client (e.g. laptop) when a task is finished.
The following is an example .. highlight:: python
t.download_output_data = [ 'results.txt' ] # results.txt is transferred to a local client (e.g. laptop) when a # current task finised.
Getter: return the list of files Setter: assign the list of files Arguments: list of strings
-
executable
A unix-based kernel to be executed
Getter: returns the executable of the current task Setter: assigns the executable for the current task Arguments: string
-
exit_code
Get the exit code for DONE tasks. 0 for successful, 1 for failed tasks.
Getter: return the exit code of the current task
-
from_dict
(d)[source] Create a Task from a dictionary. The change is in inplace.
Argument: python dictionary Returns: None
-
gpu_reqs
Purpose: The GPU requirements of the current Task.
The requirements are described in terms of the number of processes and threads to be run in this Task. The expected format is:
task.gpu_reqs = {'gpu_processes' : X, 'gpu_process_type' : None/MPI, 'gpu_threads' : Y, 'gpu_thread_type' : None/OpenMP/CUDA}
This description means that the Task is going to spawn X processes and Y threads per each of these processes to run on GPUs. Hence, the total number of gpus required by the Task is X*Y for all the processes and threads to execute concurrently. The same assumption is made in implementation and X*Y gpus are requested for this Task.
The default value is: .. highlight:: python .. code-block:: python
- task.gpu_reqs = {‘gpu_processes’ : 0,
- ‘gpu_process_type’ : None, ‘gpu_threads’ : 0, ‘gpu_thread_type’ : None}
This description requests 0 gpus as not all machines have GPUs.
Getter: return the gpu requirement of the current Task Setter: assign the gpu requirement of the current Task Arguments: dict
-
lfs_per_process
Set the amount of local file-storage space required by the task
-
link_input_data
Symlinks data (filenames in a list) from another task to a current task (or data staging area) before it starts.
Getter: return the list of files Setter: assign the list of files Arguments: list of strings
-
link_output_data
Symlins data (filenames in a list) from a current task to another task (or data staging area) when a task is finished.
Getter: return the list of files Setter: assign the list of files Arguments: list of strings
-
luid
Unique ID of the current task (fully qualified).
- example:
>>> task.luid pipe.0001.stage.0004.task.0234
Getter: Returns the fully qualified uid of the current task Type: String
-
move_input_data
Moves data (filenames in a list) from another task to a current task (or data staging area) before it starts.
Getter: return the list of files Setter: assign the list of files Arguments: list of strings
-
move_output_data
Moves data (filenames in a list) from a current task to another task (or data staging area) when a task is finished.
Getter: return the list of files Setter: assign the list of files Arguments: list of strings
-
name
Name of the task. Do not use a ‘,’ or ‘_’ in an object’s name.
Getter: Returns the name of the current task Setter: Assigns the name of the current task Type: String
-
parent_pipeline
Getter: Returns the pipeline this task belongs to Setter: Assigns the pipeline uid this task belongs to
-
parent_stage
Getter: Returns the stage this task belongs to Setter: Assigns the stage uid this task belongs to
-
path
Get the path of the task on the remote machine. Useful to reference files generated in the current task.
Getter: return the path of the current task
-
post_exec
List of commands to be executed post executable
Getter: return the list of commands Setter: assign the list of commands Arguments: list of strings
-
pre_exec
List of commands to be executed prior to the executable
Getter: return the list of commands Setter: assign the list of commands Arguments: list of strings
-
rts_uid
Unique RTS ID of the current task
Getter: Returns the RTS unique id of the current task Type: String
-
sandbox
Sandbox the task is running in
Getter: returns the sandbox name Setter: assigns the sandbox name Arguments: string
-
state
Current state of the task
Getter: Returns the state of the current task Type: String
-
state_history
Returns a list of the states obtained in temporal order
Returns: list
-
stderr
Name of the file to which stderr of task is to be written
Getter: return name of stderr file Setter: assign name of stderr file Arguments: str
-
stdout
Name of the file to which stdout of task is to be written
Getter: return name of stdout file Setter: assign name of stdout file Arguments: str
-
tag
WARNING: It will be deprecated.
-
tags
Set the tags for the task that can be used while scheduling by the RTS
Getter: return the tags of the current task
-
to_dict
()[source] Convert current Task into a dictionary
Returns: python dictionary
-
uid
Unique ID of the current task
Getter: Returns the unique id of the current task Type: String
-
upload_input_data
Transfers data (filenames in a list) from a local client (e.g. laptop) to the location of the current task (or data staging area) before it starts.
Getter: return the list of files Setter: assign the list of files Arguments: list of strings
-