This section covers how to use the OASIS system to run a real application such as the R statistical package. For this example, we'll estimate the value of pi using a Monte Carlo method. We'll first run the program locally, then create a submit file, send it out to OSG Connect, and collate our results.
Some background is useful here. We define a unit circle inscribed in a square. Restricting attention to the first quadrant, we randomly sample points in the unit square and calculate the ratio of the points that fall inside the circle to the total number of points sampled. This ratio approaches pi/4, so multiplying it by 4 gives an estimate of pi.
This method converges extremely slowly, which makes it great for a CPU-intensive exercise (but bad for a real estimation!).
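To make the sampling idea concrete, here is a minimal sketch in shell and awk. It is not the R script used in this tutorial, just a stand-in that follows the same recipe; the function name estimate_pi and its two arguments (number of trials and random seed) are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of the tutorial's R code):
# estimate pi from uniformly random points in the unit square.
estimate_pi() {
    local trials="${1:-100000}"   # number of random points to sample
    local seed="${2:-1}"          # seed for awk's random number generator
    awk -v n="$trials" -v seed="$seed" 'BEGIN {
        srand(seed)
        inside = 0
        for (i = 0; i < n; i++) {
            x = rand(); y = rand()
            # Count the point if it lies inside the quarter circle.
            if (x * x + y * y <= 1) inside++
        }
        # inside / n approaches pi/4, so scale by 4.
        printf "%.6f\n", 4 * inside / n
    }'
}

estimate_pi 100000 42
```

Running it with different seeds gives slightly different estimates, which is exactly what lets us farm the work out as many independent jobs later on.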
First we'll need to create a working directory. You can either run tutorial R or type the following:
Since R is installed into OASIS, it's not available in the normal system paths. We'll need to set up those paths so we can access R correctly. To do that we'll do the following:
Once we have the path set up, we can try to run R. Don't worry if you aren't an R expert, I'm not either.
Great! R works. You can quit out with "q()".
Now that we can run R, let's try using my Pi estimation code:
R normally runs as an interactive shell, but it's easy to run in batch mode too.
This should take a few seconds to run. Now edit the file and increase the number of trials tenfold (to 10000000). It will take a little over a minute to run, but the estimate still isn't very good. Fortunately, this problem is pleasingly parallel since we're just sampling random points. So what do we need to do to run R on the campus grid?
The first thing we're going to need to do is create a wrapper for our R environment, based on the setup we did in previous sections.
Notice here that we're using Rscript (roughly equivalent to R --slave). It accepts the script as a command-line argument, makes R much less verbose, and produces output that is easier to parse later. If you run the wrapper at the command line, you should get output similar to the above. The wrapper lets us launch R on any generic worker node under Condor.
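As a rough illustration of what such a wrapper might look like, here is a minimal sketch. The function name run_r_script and the R_BIN_DIR variable are assumptions invented for this example; R_BIN_DIR stands in for whichever directory holds Rscript after the path setup described earlier, and mcpi.R is a placeholder name for the Monte Carlo script.

```shell
#!/bin/bash
# Hypothetical wrapper sketch: put R on the PATH, then run a script with Rscript.
# R_BIN_DIR is an assumed variable naming the directory that contains Rscript;
# it is not something OSG Connect defines for you.

run_r_script() {
    if [ -n "${R_BIN_DIR:-}" ]; then
        PATH="$R_BIN_DIR:$PATH"
        export PATH
    fi
    # Fail early with a clear message if R is still not reachable.
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript not found; check the R path setup" >&2
        return 127
    fi
    Rscript "$@"
}

# Run only when arguments are given, e.g. ./R-wrapper.sh mcpi.R
if [ "$#" -gt 0 ]; then
    run_r_script "$@"
fi
```

Checking for Rscript before calling it means that a worker node without the expected software reports a readable error in the job's error file instead of failing silently.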
Now that we've created a wrapper, let's build a Condor submit file around it.
Notice the requirements line? You'll need to put HAS_CVMFS =?= TRUE (or some variation such as what you see above) in the requirements any time you need software from /cvmfs. There's also one small gotcha here: make sure the "log" directory used in the submit file exists before you submit! Otherwise Condor will fail because it has nowhere to write the logs.
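Putting those pieces together, a submit file along these lines could be generated as follows. This is only a sketch under assumptions: the file names R-wrapper.sh, mcpi.R, and R.submit.sketch, and the layout of the output names, are placeholders chosen for illustration, so adapt them to your own files.

```shell
#!/bin/bash
# Sketch only: R-wrapper.sh, mcpi.R and R.submit.sketch are assumed names.

# Create the log directory first so HTCondor has somewhere to write.
mkdir -p log

# The quoted 'EOF' stops the shell from expanding $(Cluster) and $(Process);
# HTCondor substitutes them itself at submit time.
cat > R.submit.sketch <<'EOF'
universe = vanilla
executable = R-wrapper.sh
arguments = mcpi.R
transfer_input_files = mcpi.R
requirements = (HAS_CVMFS =?= TRUE)
output = log/mcpi.$(Cluster).$(Process).out
error = log/mcpi.$(Cluster).$(Process).err
log = log/mcpi.log
queue 100
EOF
```

The queue 100 line asks for 100 independent jobs, each of which produces its own estimate.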
Finally, submit the job to OSG Connect!
You can follow the status of your job cluster with the connect watch command, which shows condor_q output that refreshes every 5 seconds. Press Ctrl-C to stop watching.
Since our jobs just output their results to standard out, we can do the final analysis from the log files. Let's see what one looks like:
After job completion we have 100 Monte Carlo estimates of the value of pi. Taking an average across them all should give us a closer approximation.
We'll use a bit of awk magic to do the averaging:
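One way to do this, as a minimal sketch: it assumes each output file contains a line in R's default print format, such as "[1] 3.1412", and that the files match log/*.out. Both the function name average_estimates and that file pattern are assumptions, so adjust them to match your submit file.

```shell
#!/bin/bash
# Hypothetical helper: average the estimates found in the given output files.
# Assumes each estimate appears on a line of the form "[1] <number>".
average_estimates() {
    awk '$1 == "[1]" { sum += $2; n++ }
         END {
             if (n == 0) {
                 print "no estimates found" > "/dev/stderr"
                 exit 1
             }
             printf "%.6f\n", sum / n
         }' "$@"
}

# Example call (file pattern is an assumption):
# average_estimates log/*.out
```

Matching on the "[1]" prefix skips any blank or diagnostic lines a job may have written alongside its estimate.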
That's pretty close! With even more sample sets — that is, more Queue jobs in the cluster — we can statistically come even closer.
What to do next?
The R.submit file may have included a few lines that you are unfamiliar with. For example, $(Cluster) and $(Process) are variables that will be replaced with the job's cluster and process ID. This is useful when you have many jobs submitted from the same file, because each job's output and error files are kept separate.
Also, did you notice the transfer_input_files line? This tells HTCondor what files to transfer with the job to the worker node. You don't have to tell it to transfer the executable; HTCondor is smart enough to know that the job will need that. But any extra files, such as our MonteCarlo R file, need to be listed explicitly so they are transferred with the job. You can use transfer_input_files for input data to the job, as shown in Transferring data with HTCondor. If you have larger data requirements, you may want to look into Transferring your Stash'd data with HTCondor.