Overview

It is important to know how to transfer data between login.osgconnect.net and the remote worker nodes, because a job's input and output files must be transferred to and from the worker machine that runs it. File transfers to the remote machine are accomplished via HTTP, HTCondor file transfer, or skeleton key.

Condor Transfer

HTCondor has a built-in mechanism to transfer binaries and files to and from compute nodes. If users have relatively small amounts of data and binaries to transfer (<100MB) or need to do ad-hoc job submissions, then this mechanism can be effective.

Preliminaries

Before getting started, users should log in to login01.osgconnect.net and get a copy of the tutorial files:

Setting tutorial up

Word Distribution Example

This example uses the HTCondor transfer mechanisms to transfer a binary (distribution) and a file with a list of words (random_words) to the compute nodes that run the jobs. The HTCondor submit file used is shown below:

transfer.submit

The key part of the submit file is the transfer_input_files parameter, which gives a comma-separated list of paths to the files that will be transferred. In addition, ShouldTransferFiles needs to be set to YES and when_to_transfer_output needs to be set to ON_EXIT to make sure that HTCondor returns the output.
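As an illustration of the three settings just described, the following is a minimal sketch that writes a submit file from the shell. It is not the tutorial's transfer.submit: the file name transfer_sketch.submit, the wrapper script run_distribution.sh, the logs layout, and the +ProjectName form of the project setting are all assumptions made for this example. HTCondor submit commands are case-insensitive, so should_transfer_files here is the same setting as ShouldTransferFiles.

```shell
# Write an illustrative submit file (a sketch, not the tutorial's transfer.submit).
# The quoted 'EOF' keeps the shell from expanding $(Cluster) and $(Process).
cat > transfer_sketch.submit <<'EOF'
# Hypothetical wrapper script that runs ./distribution on random_words
universe   = vanilla
executable = run_distribution.sh

# Comma-separated list of files to copy to the compute node
transfer_input_files = distribution, random_words

# Turn on HTCondor file transfer and return output when the job exits
should_transfer_files = YES
when_to_transfer_output = ON_EXIT

# Assumed log layout; the logs directory must exist before submitting
output = logs/job.out.$(Cluster).$(Process)
error  = logs/job.err.$(Cluster).$(Process)
log    = logs/job.log.$(Cluster)

# Assumed form of the project setting; replace PROJECT_NAME before submitting
+ProjectName = "PROJECT_NAME"

queue 1
EOF
```

Paths in transfer_input_files are resolved relative to the directory where condor_submit is run, which is why the files are listed by bare name here.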

Finally, edit the submit file, replacing PROJECT_NAME with the appropriate value, before submitting it:

path warning

You must run condor_submit in the same directory in which you created the files and directories. Otherwise, HTCondor will give you an error because it cannot find the distribution and random_words files.
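One way to guard against this mistake is a small pre-flight check that runs before condor_submit. The sketch below is only a suggestion; check_inputs is a hypothetical helper name, not part of HTCondor or the tutorial files.

```shell
# Hypothetical helper: succeed only if every named file exists
# relative to the current directory.
check_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage (run from the directory that holds the tutorial files):
#   check_inputs transfer.submit distribution random_words && condor_submit transfer.submit
```

Because condor_submit only runs when check_inputs succeeds, a wrong working directory is reported as a list of missing files instead of an HTCondor error.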

Submit job

When the jobs are completed, verify the output:

Job verification

 

HTTP

Preliminaries

Before getting started, users should log in to login01.osgconnect.net and get a copy of the tutorial files:

Set up tutorial files

Making data accessible over HTTP

All user accounts on OSG-Connect have a directory that is automatically web accessible, located at ~/data/public. To make a file or directory accessible, copy it to this directory or one of its subdirectories, and give files permissions of 644 and directories permissions of 755. For example:

Making file accessible on HTTP
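A minimal sketch of these steps, assuming the destination defaults to ~/data/public as described above. The function name publish_file and its optional second argument are hypothetical, introduced only for this example.

```shell
# Hypothetical helper: copy a file into the web-accessible directory and
# apply the permissions described above (644 for files, 755 for directories).
publish_file() {
    src="$1"
    dest_dir="${2:-$HOME/data/public}"   # optional override, useful for subdirectories
    mkdir -p "$dest_dir" || return 1
    chmod 755 "$dest_dir" || return 1
    cp "$src" "$dest_dir/" || return 1
    chmod 644 "$dest_dir/$(basename "$src")"
}

# Usage:
#   publish_file random_words
#   publish_file random_words "$HOME/data/public/tutorial"
```

Setting the modes explicitly avoids depending on the account's umask, which may otherwise leave the copy unreadable to the web server.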

Accessing data from stash over HTTP within jobs

The final part of this section shows how jobs running on OSG can access data in stash over HTTP. The primary component of this example is the shell script that runs on the compute node: it downloads the random_words data file and then generates a histogram of the most common words found in the file. Before running this example, app_script.sh needs to be edited to replace username with the user's username:

app_script.sh
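The histogram step can be sketched with standard text tools. This is not the tutorial's app_script.sh; word_histogram is a hypothetical function, and the URL in the usage comment is a placeholder rather than a real stash address.

```shell
# Hypothetical helper: print the most common words in a file, most frequent
# first, as "count word" lines. The second argument limits the number of rows.
word_histogram() {
    file="$1"
    top="${2:-10}"
    tr -s '[:space:]' '\n' < "$file" |   # one word per line
        grep -v '^$' |                   # drop any empty line
        sort |                           # group identical words together
        uniq -c |                        # count each group
        sort -k1,1nr -k2 |               # highest count first, ties by word
        head -n "$top"
}

# Usage inside a job script, after downloading the data file over HTTP
# (placeholder URL, to be replaced with the real location of the file):
#   wget -q "http://example.org/path/to/random_words"
#   word_histogram random_words 10
```

Downloading inside the job keeps the submit file small, since the data file is fetched by the worker node instead of being shipped by HTCondor.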

Next, edit the application/application.submit file and replace PROJECT_NAME with the appropriate project name:

application.submit

Once that change has been made, submit the file:

Running random words application

Once the jobs are completed, users can look at the output in the logs directory and verify that the job ran correctly:

Verifying job completion
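As a quick first check, it can help to list any standard-error files in the logs directory that are not empty. The sketch below assumes the error files have ".err" somewhere in their names; that naming, and the helper name failed_jobs, are assumptions and should be adjusted to match the submit file.

```shell
# Hypothetical helper: list non-empty stderr files in a log directory.
# The '*.err*' pattern is an assumption about how the error files are named.
failed_jobs() {
    log_dir="${1:-logs}"
    find "$log_dir" -type f -name '*.err*' -size +0c | sort
}

# Usage:
#   failed_jobs logs
```

An empty result only means that nothing was written to standard error; the output files should still be inspected to confirm that the histogram was produced.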

 

 
