
Overview

In this module we demonstrate job submission to the OSG Connect environment from your laptop with BOSCO. This lets you manage jobs running on OSG from your familiar environment. It does not have to be a laptop: any Linux or Mac host can be used, provided it runs RHEL 5 or RHEL 6 (including Scientific Linux distributions), Debian 6, or Mac OS X 10.5 or later.

Here is a diagram showing how the different resources are connected when you install BOSCO on your laptop:

Jobs running on OSG Connect must have a project name set (+ProjectName in the HTCondor submit file). The simplest way to do that is to add a file containing your default project name to your home directory: $HOME/.osg_default_project

For example, on login.osgconnect.net run: echo ConnectTrain > $HOME/.osg_default_project to use the ConnectTrain project. Once you have a project of your own, replace ConnectTrain with its name.
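
The project file step can be sketched as a small shell helper. This is only an illustration: the function name set_default_project and its optional second argument are hypothetical; only the file path and the ConnectTrain name come from the text above.

```shell
# Hypothetical helper: record a default project name.
# $1 = project name, $2 = target file (defaults to the path described above).
set_default_project() {
    project=$1
    file=${2:-$HOME/.osg_default_project}
    if [ -z "$project" ]; then
        echo "usage: set_default_project PROJECT [FILE]" >&2
        return 1
    fi
    # Overwrite the file with a single line holding the project name.
    printf '%s\n' "$project" > "$file"
}

# Example (uncomment to run on login.osgconnect.net):
# set_default_project ConnectTrain && cat "$HOME/.osg_default_project"
```

Passing an explicit file as the second argument makes it possible to try the helper without touching the real file in your home directory.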

In the examples below substitute user with your actual User ID.

In the rest of the tutorial we assume that you are working in a terminal on your laptop, or on whichever host you chose as the BOSCO submit host.

Install and configure BOSCO

The following example is on a host named laptop. On other machines your prompt will look more like [yourname@yourhost ~], showing your host name instead of [user@laptop ~]. The rest will be the same.

  • Download the BOSCO Quickstart Multi-Platform installer from this download page. If you prefer to work in a terminal window, you can instead copy the URL shown on the download page and use cURL:

    [user@laptop Downloads]$ curl -o bosco_quickstart.tar.gz ftp://ftp.cs.wisc.edu/GET_THE_URL_FROM_THE_PAGE/bosco_quickstart.tar.gz 

    If you do not have curl, you can use wget to download the file:

    wget -O ./bosco_quickstart.tar.gz ftp://ftp.cs.wisc.edu/GET_THE_URL_FROM_THE_PAGE/bosco_quickstart.tar.gz 
  • Untar the bosco_quickstart script in a terminal, working from the ~/Downloads folder or from whichever folder you saved the file in:
    [user@laptop Downloads]$ tar xvzf ./bosco_quickstart.tar.gz
    
  • Run the quickstart script and answer the questions.
    [user@laptop Downloads]$ ./bosco_quickstart
    
    • When prompted "Do you want to install Bosco? Select y/n and press [ENTER]:" press "y" and ENTER.
    • When prompted "Type the cluster name and press [ENTER]:" type login.osgconnect.net and press ENTER.
    • When prompted "Type your name at login.osgconnect.net (default YOUR_USER) and press [ENTER]:" enter your user name on OSG-Connect and press ENTER.
    • When prompted "Type the queue manager for login.osgconnect.net (pbs, condor, lsf, sge, slurm) and press [ENTER]:" enter condor and press ENTER.
    • Then when prompted "user@login.osgconnect.net's password:" enter your OSG-Connect user password.
  • After a successful installation, before changing directory, you can remove the installer and its log file:
    [user@laptop Downloads]$ rm bosco_quickstart* 

  • Set up the environment:
    [user@laptop ~]$ source ~/bosco/bosco_setenv 

  • BOSCO has been started for you, but in the future you may need to restart it with:
    [user@laptop ~]$ bosco_start
    BOSCO Started

    At this point you are ready to submit to login.osgconnect.net, which gives access to the full OSG-Connect environment. The BOSCO services will remain running even if you log out, unless they are explicitly shut down.
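
If the untar step above fails, a truncated or mislabeled download is one possible cause. The following is a minimal sketch of a check you could run before untarring; the helper name tarball_ok is hypothetical, and it relies only on the standard tar options t (list), z (gzip), and f (file).

```shell
# Hypothetical helper: succeed only if $1 exists and is a readable
# gzip-compressed tar archive (the archive is listed, not extracted).
tarball_ok() {
    [ -f "$1" ] && tar tzf "$1" >/dev/null 2>&1
}

# Example (uncomment to run in the folder holding the download):
# tarball_ok ./bosco_quickstart.tar.gz && tar xvzf ./bosco_quickstart.tar.gz
```

Listing the archive first leaves the folder untouched when the file is not a valid tarball.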

Set up the BOSCO environment each time

Each time you log in or start a new shell, set up the environment and invoke bosco_start (bosco_start is a no-op if the services are already running):

$ source ~/bosco/bosco_setenv
$ bosco_start
BOSCO Started
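
To avoid typing these commands in every new shell, the setup could be wrapped in a function placed in a shell startup file such as ~/.bashrc. This is a sketch under assumptions: the function name bosco_env and its optional directory argument are hypothetical, and ~/bosco is the install location used in this tutorial.

```shell
# Hypothetical helper for a shell startup file.
# $1 = BOSCO install directory (defaults to ~/bosco, as used above).
bosco_env() {
    setenv_file="${1:-$HOME/bosco}/bosco_setenv"
    if [ ! -f "$setenv_file" ]; then
        echo "BOSCO environment file not found: $setenv_file" >&2
        return 1
    fi
    # Source the file so its settings apply to the current shell.
    . "$setenv_file"
}

# Example (uncomment to run):
# bosco_env && bosco_start
```

The existence check keeps new shells from printing a sourcing error after BOSCO has been uninstalled.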

Create a tutorial directory

Create a new directory to run this tutorial and the log directory for the jobs:

$ mkdir -p tutorial-bosco/log
$ cd tutorial-bosco

Submit a job to OSG-Connect

Now run a simple job, like Job 1 of the Quickstart tutorial. The workload is the same, but the submit description file will be slightly different.

Create a workload

Inside the tutorial directory that you created previously, let's create a test script to execute as your job (remember to make the script executable!):

$ vi short.sh
$ chmod +x short.sh

Here is the content of short.sh:

file: short.sh
#!/bin/bash
# short.sh: a short discovery job

printf "Start time: "; /bin/date
printf "Job is running on node: "; /bin/hostname
printf "Job running as user: "; /usr/bin/id

echo "Environment:"
/bin/env | /bin/sort

echo "Dramatic pause..."
sleep ${1-15}    # Sleep 15 seconds, or however much we're told to sleep
echo "Et voila!"
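
The ${1-15} expansion in short.sh is worth a closer look, since it is what lets the submit file control the sleep time later on. The sketch below isolates it in a function; the name sleep_time is hypothetical and exists only for this demonstration.

```shell
# Print the sleep time short.sh would use for the given arguments.
sleep_time() {
    echo "${1-15}"
}

sleep_time        # no argument: prints 15
sleep_time 40     # prints 40
sleep_time ""     # empty argument: prints an empty line; ${1:-15} would print 15 here
```

The form without the colon substitutes the default only when the argument is unset, so an explicitly empty argument is passed through to sleep unchanged.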

Create a condor submit file:

The next step is to create a submission file for the job.

$ vi bosco01.sub

Here is the content of bosco01.sub, configured to use the general purpose ConnectTrain project name on login01.osgconnect.net. You are encouraged to use one of the projects that you are a member of instead; the osgconnect_show_projects command lists them. This is very nearly the minimal content of a submission file.
Note that, unlike the previous examples, the Universe of the job is now grid. This tells BOSCO to run the job on the resource added during the setup.

file: bosco01.sub
########################
# Submit description file for short test program
########################
Universe       = grid
Executable     = short.sh
Error   = log/job.err.$(Cluster)-$(Process)
Output  = log/job.out.$(Cluster)-$(Process)
Log     = log/job.log.$(Cluster)
+ProjectName="ConnectTrain"
Queue 1
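
The $(Cluster) and $(Process) macros in the submit file expand to the cluster ID and to the job's index within the cluster, counted from 0. As a rough sketch, the file names this produces can be listed with a small helper; the function name expected_files is hypothetical, and the cluster ID 2 is only an example value (the real one is reported by condor_submit).

```shell
# Hypothetical helper: list the files written for one cluster,
# following the Error/Output/Log patterns in bosco01.sub.
# $1 = cluster ID reported by condor_submit, $2 = number of queued jobs.
expected_files() {
    cluster=$1
    count=$2
    echo "log/job.log.$cluster"          # one shared log per cluster
    process=0
    while [ "$process" -lt "$count" ]; do
        echo "log/job.out.$cluster-$process"
        echo "log/job.err.$cluster-$process"
        process=$((process + 1))
    done
}

expected_files 2 1
```

For cluster 2 with a single job this lists log/job.log.2, log/job.out.2-0, and log/job.err.2-0.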

Submit the job using condor_submit.

$ condor_submit bosco01.sub
Submitting job(s).
1 job(s) submitted to cluster 2.

Note the "submitted to cluster 2" message: the number is the ID of the job group (cluster) you have just created, and it may differ on your installation. You'll use this ID for monitoring the status of your jobs.

Check job status

The condor_q command reports the status of the jobs currently in the queue. On a shared submit host you would generally limit it to your own jobs; with BOSCO that is not needed:

$ condor_q

-- Submitter: laptop : <127.0.0.1:11000?sock=44111_3112_3> : laptop
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
   2.0   user            8/20 16:12   0+00:00:00 I  0   0.0  short.sh

1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended

Note that condor_q lists only your jobs even without specifying your user ID. Only you can submit to your BOSCO installation, since it is your personal HTCondor installation.

If you notice that your jobs are being held after you submit them from BOSCO, double check ~/.osg_default_project to make sure that your project is listed in that file.
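
One way to notice held jobs quickly is to pull the held count out of the condor_q summary line. The sketch below is illustrative only: the function name held_count is hypothetical, and it assumes a summary line formatted like the sample output above, which may differ between HTCondor versions.

```shell
# Hypothetical helper: read condor_q output on stdin and print the
# number of held jobs found in the summary line.
held_count() {
    sed -n 's/.* \([0-9][0-9]*\) held.*/\1/p'
}

# Example using the summary line from the sample output:
echo "1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended" | held_count
```

With BOSCO running, condor_q | held_count would apply the same filter to live output; a non-zero result is a cue to check the project file.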

 

Submit more jobs to OSG-Connect

This example submits 20 jobs to OSG-Connect. To make the jobs easier to observe, we'll increase the sleep time to 40 seconds.

  • Edit the submit file bosco01.sub and add the line Arguments = 40 and change the last line to Queue 20:

    $ vi bosco01.sub 
  • Submit the set of 20 jobs:

    $ condor_submit ./bosco01.sub
    Submitting job(s)....................
    20 job(s) submitted to cluster 3.
    
  • Watch the jobs go through the queue by using watch -n2 condor_q -grid. The -grid option changes the format of condor_q and provides more information about where the jobs run.

    $ condor_q -grid
    
    
    -- Submitter: laptop.local : <127.0.0.1:11000?sock=52977_4003_3> : laptop.local
     ID      OWNER          STATUS     GRID->MANAGER    HOST       GRID_JOB_ID
       3.0   user           IDLE       batch-> user@login01.osgcon /804//
       3.1   user           IDLE       batch-> user@login01.osgcon laptop.local_1100
       3.2   user           IDLE       batch-> user@login01.osgcon /809//
       3.3   user           IDLE       batch-> user@login01.osgcon laptop.local_1100
       3.4   user           IDLE       batch-> user@login01.osgcon laptop.local_1100
       3.5   user           IDLE       batch-> user@login01.osgcon /806//
       3.6   user           IDLE       batch-> user@login01.osgcon laptop.local_1100
       3.7   user           IDLE       batch-> user@login01.osgcon /810//
       3.8   user           IDLE       batch-> user@login01.osgcon /802//
       3.9   user           IDLE       batch-> user@login01.osgcon laptop.local_1100
       3.10  user           IDLE       batch-> user@login01.osgcon /807//
       3.11  user           IDLE       batch-> user@login01.osgcon laptop.local_1100
       3.12  user           IDLE       batch-> user@login01.osgcon /811//
       3.13  user           IDLE       batch-> user@login01.osgcon /803//
       3.14  user           IDLE       batch-> user@login01.osgcon laptop.local_1100
       3.15  user           IDLE       batch-> user@login01.osgcon /808//
       3.16  user           IDLE       batch-> user@login01.osgcon laptop.local_1100
       3.17  user           IDLE       batch-> user@login01.osgcon laptop.local_1100
       3.18  user           IDLE       batch-> user@login01.osgcon /805//
       3.19  user           IDLE       batch-> user@login01.osgcon laptop.local_1100
    
Note that condor_q on your BOSCO installation lists only your jobs. There may be other jobs queued on OSG Connect, but to see them you'll have to log in to login01.osgconnect.net and issue condor_q there.
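
For reference, the edited bosco01.sub described in the first step of this section could be written in one go with a here-document instead of an editor. This is a sketch: the function name write_submit_file is hypothetical, and the position of the Arguments line is a choice (it only has to appear before the Queue line).

```shell
# Hypothetical helper: write the 20-job variant of the submit file to $1.
# The quoted 'EOF' keeps the shell from expanding $(Cluster) and $(Process).
write_submit_file() {
    cat > "$1" <<'EOF'
########################
# Submit description file for short test program
########################
Universe       = grid
Executable     = short.sh
Arguments      = 40
Error   = log/job.err.$(Cluster)-$(Process)
Output  = log/job.out.$(Cluster)-$(Process)
Log     = log/job.log.$(Cluster)
+ProjectName="ConnectTrain"
Queue 20
EOF
}

# Example (uncomment to overwrite bosco01.sub in the current directory):
# write_submit_file bosco01.sub
```

Compared with the first version of the file, only the Arguments line and the Queue count differ.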

Other BOSCO commands

You can check the resources connected to BOSCO:

$ bosco_cluster --list
user@login01.osgconnect.net/condor

You can stop and uninstall BOSCO:

$ source ~/bosco/bosco_setenv
$ bosco_stop
Sending off command to condor_master.
Sent "Kill-Daemon" command for "master" to local master
Stopped HTCondor
BOSCO is now off.
$ bosco_uninstall
Ensuring Condor is stopped...
BOSCO is now off.
Removing BOSCO installation under /home/mmb/bosco
Done

All the HTCondor commands work from BOSCO.  This document contains a detailed description of all the installation options and all the BOSCO commands.
