Mathematica down

Warning: the Mathematica license server is currently being reconfigured, so Mathematica won't work at the moment.

Overview

The second application that we'll run on OSG Connect is Mathematica. This example introduces HTCondor Directed Acyclic Graphs (DAGs) for managing job workflows. Specifically, we'll take the Mandelbrot set calculation, split it up into N jobs, and then recombine the final data to produce a nice picture.

Background

This exercise will calculate the Mandelbrot set, one of the most easily recognized fractals. You can read more about the Mandelbrot set here: http://en.wikipedia.org/wiki/Mandelbrot_set

Running Mathematica locally

We'll need to set up a working directory, so you can either type tutorial mathematica or type the following:

$ mkdir -p osg-mathematica/{csv,log}; cd osg-mathematica

As with R, we can get Mathematica from CVMFS. It's installed under /cvmfs/uc3.uchicago.edu/Wolfram, so we'll need to add its Executables directory to our PATH.

$ export PATH=$PATH:/cvmfs/uc3.uchicago.edu/Wolfram/Mathematica/8.0/Executables

Once we've got Mathematica in our PATH, let's try running it.

$ math
Mathematica 8.0 for Linux x86 (64-bit)
Copyright 1988-2011 Wolfram Research, Inc.

In[1]:=

Looks like it works. You can quit with Quit[]

In[1]:= Quit[]
$

I've written the Mandelbrot set code for you already. I'll spare you the gory details, but the important thing to note here is that it takes 2 variables as input: "PID" and "Jobs".

$ nano mandelbrot.m
file: mandelbrot.m
Scaling = 5
TotalCols = LCM[7/4, 2]*Jobs*Scaling
TotalRows = Ceiling[(TotalCols - 1)*4/7]
maxIterations = 120;

MandelbrotPixel =
  Compile[{{ColNum, _Integer}, {RowNum, _Integer}},
   Module[{x = 0., y = 0., xtemp, iterations = 0},
    While[x^2 + y^2 <= 4 && iterations < maxIterations,
     xtemp = x*x - y*y + (ColNum - 1)*3.5/TotalCols - 2.5;
     y = 2*x*y + (RowNum - 1)*2/TotalRows - 1;
     x = xtemp;
     iterations = iterations + 1;];
    iterations], CompilationTarget -> "C"];

MandelbrotData =
 Table[MandelbrotPixel[i, j], {j, 1,
   TotalRows}, {i, (PID - 1)*TotalCols/Jobs + 1, (TotalCols/Jobs)*
    PID}];

Export[StringJoin[
  "mandelbrot." <>IntegerString[PID, 10, IntegerLength[Jobs]]<> ".csv"], MandelbrotData, "CSV"]

There's also a scaling factor at the top of the code that you are free to modify. If you run 10 jobs with the default scaling, you can expect to produce a 700x400 pixel Mandelbrot.
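To see where the 700x400 figure comes from, here is a minimal shell sketch of the size arithmetic in mandelbrot.m. It assumes that LCM[7/4, 2] evaluates to 14 in Mathematica (which matches the 700 columns reported for 10 jobs at the default scaling); the function name chunk_range is made up for this illustration and is not part of the tutorial files.

```shell
#!/bin/bash
# Sketch of the image-size arithmetic in mandelbrot.m, using integer shell math.
# Assumption: LCM[7/4, 2] is 14, so TotalCols = 14 * Jobs * Scaling.
scaling=5
jobs=10

total_cols=$(( 14 * jobs * scaling ))
# Ceiling[(TotalCols - 1) * 4 / 7], written as integer ceiling division
total_rows=$(( ((total_cols - 1) * 4 + 6) / 7 ))
echo "image: ${total_cols}x${total_rows}"

# chunk_range PID: print the first and last column computed by that chunk.
# (Hypothetical helper; it mirrors the column bounds of the Table[] call.)
chunk_range() {
  local pid=$1
  echo "$(( (pid - 1) * total_cols / jobs + 1 )) $(( total_cols / jobs * pid ))"
}

for pid in $(seq 1 "$jobs"); do
  echo "PID=$pid columns: $(chunk_range "$pid")"
done
```

Each chunk covers TotalCols/Jobs adjacent columns, so raising Scaling makes every chunk wider without changing the number of jobs.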

Let's try running this from the command line. We'll pass in PID and Jobs as arguments. 'Jobs' determines how many times we want to cut up the calculation, and 'PID' specifies which chunk we want to evaluate.

$ math -run "PID=1;Jobs=10" < mandelbrot.m
Mathematica 8.0 for Linux x86 (64-bit)
Copyright 1988-2011 Wolfram Research, Inc.

In[1]:=
Out[1]= 700

In[2]:=
Out[2]= 400

In[3]:=
In[4]:=
In[4]:=
In[5]:=
In[5]:=
In[6]:=
In[6]:=
Out[6]= mandelbrot.01.csv

In[7]:=
$

There's a lot of stuff on standard out, but we see that "mandelbrot.01.csv" was created.

You can look at the data with cat, but it's not very interesting yet.

Let's go ahead and remove the file to avoid confusion later:

$ rm mandelbrot.01.csv

Creating an HTCondor job

How do we wrap this up into an HTCondor job? First we need to create a small script that sets up our environment variables and runs Mathematica in batch mode. Here's my script:

$ nano math.sh
file: math.sh
#!/bin/bash
# Usage: math.sh batch.m [Process] [Number of Jobs]
# With 1 argument, run the batch file with no variables set.
# With 2 or 3 arguments, also set PID (Process + 1) and, if given, Jobs.

export PATH=/usr/bin:/cvmfs/uc3.uchicago.edu/Wolfram/Mathematica/8.0/Executables

if [ $# -eq 1 ]; then
  math < "$1"
elif [ $# -eq 2 ]; then
  math -run "PID=$(( $2 + 1 ))" < "$1"
elif [ $# -eq 3 ]; then
  math -run "PID=$(( $2 + 1 ));Jobs=$3" < "$1"
else
  echo "Wrong number of arguments"
  echo "Usage: math.sh batch.m [Process] [Number of Jobs]"
  exit 1
fi

We're going to be creating another Mathematica batch file later, so this script has been made a bit generic. Let's create a Condor submit file to go along with it.

$ nano mandelbrot.submit
file: mandelbrot.submit
executable = math.sh
universe = vanilla
Log = ../log/log.$(Cluster).$(Process)
Output = ../log/out.$(Cluster).$(Process)
Error = ../log/err.$(Cluster).$(Process)

WhenToTransferOutput = ON_EXIT
should_transfer_files = YES
transfer_input_files = ../mandelbrot.m

initial_dir = csv

Arguments = mandelbrot.m $(Process) $(Jobs)

requirements = (HAS_MATH_LICENSE =?= True)

Jobs = 10
queue $(Jobs)

Notice the "HAS_MATH_LICENSE" requirement. This steers HTCondor jobs to resources that have a valid Mathematica license.

This HTCondor submit script is a bit more complex than others for a few reasons.

  • I don't like to pollute my submit directory with data and logs, so I split them off into directories called 'csv' and 'log' respectively.
  • I also want to make sure that my Mathematica script is aware of how many jobs are being made, so I define a macro called "Jobs" and pass it to arguments and queue.
  • Mathematica requires a license! The requirements line is a boolean expression that only matches workers where CVMFS and the license server are available.
  • Finally, $(Process) translates to our Mathematica script's PID variable.
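The mapping from $(Process) to PID and on to an output file name can be sketched in shell. This only mimics what math.sh and mandelbrot.m do with their arguments; the function name chunk_name is hypothetical, and the zero padding is my reading of IntegerString[PID, 10, IntegerLength[Jobs]].

```shell
#!/bin/bash
# chunk_name PROCESS JOBS: print the CSV name the job for $(Process) should write.
# (Hypothetical helper, not part of the tutorial files.)
chunk_name() {
  local process=$1 jobs=$2
  local pid=$(( process + 1 ))   # math.sh adds 1, so PID runs from 1 to Jobs
  local width=${#jobs}           # number of digits in Jobs, like IntegerLength[Jobs]
  printf "mandelbrot.%0${width}d.csv\n" "$pid"
}

jobs=10
# HTCondor numbers the queued processes 0 .. Jobs-1
for process in $(seq 0 $(( jobs - 1 ))); do
  echo "math.sh mandelbrot.m $process $jobs writes $(chunk_name "$process" "$jobs")"
done
```

Process 0 therefore produces mandelbrot.01.csv, the same file we created by hand with PID=1.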

Stitching the output together

The problem is that this code will create 10 separate CSVs for us to stitch together, when we actually just want one to make a nice picture. I've created a Mathematica script for handling this as well.

$ nano stitch.m
file: stitch.m
Export["mandelbrot.csv",
 Flatten[Table[
   Transpose[Import[FileNames["*.csv"][[i]]]], {i, 1,
    Length[FileNames["*.csv"]]}], 1]];
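The stitch script glues the chunks together in the order FileNames returns them, which I believe is sorted by name. The shell sketch below shows why the zero-padded PID in the file names matters under that assumption. It uses LC_ALL=C sort as a stand-in for Mathematica's ordering, and list_chunks is a made-up helper for this illustration.

```shell
#!/bin/bash
# list_chunks FORMAT: print ten chunk file names in sorted order.
# FORMAT is a printf integer format such as %d or %02d.
# (Hypothetical helper; C-locale sort stands in for FileNames' ordering.)
list_chunks() {
  local fmt=$1 pid
  for pid in $(seq 1 10); do
    printf "mandelbrot.${fmt}.csv\n" "$pid"
  done | LC_ALL=C sort
}

echo "unpadded names, sorted:"
list_chunks '%d'      # chunk 10 sorts right after chunk 1

echo "zero-padded names, sorted:"
list_chunks '%02d'    # chunks stay in numeric order
```

With unpadded names the tenth chunk would be stitched in next to the first one, scrambling the picture; the padded names keep the columns in order.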

As before, we create a submit file. This time we'll transfer the entire contents of "csv" with the job.

$ nano stitch.submit
file: stitch.submit
executable = math.sh
universe = vanilla
Log = log/log.$(Cluster).$(Process)
Output = log/out.$(Cluster).$(Process)
Error = log/err.$(Cluster).$(Process)

WhenToTransferOutput = ON_EXIT
should_transfer_files = YES
transfer_input_files = stitch.m, csv/

Arguments = stitch.m

requirements = (HAS_MATH_LICENSE =?= True)

queue 1

Creating an HTCondor DAG

We could submit these by hand, but why not let HTCondor's workflow manager take care of it? Enter DAGs, or Directed Acyclic Graphs. I won't explain them in detail, but they let you create job workflows and also have some nice features like automatic retry in the event of failure. You can read more about them here: http://research.cs.wisc.edu/htcondor/tutorials/intl-grid-school-3/simple_dag.html

Writing one for our jobs is easy since our workflow is pretty linear. This DAG will launch mandelbrot.submit, which creates our CSV files, and then, once that has finished, collate the output with stitch.submit.

$ nano mandelbrot.dag
file: mandelbrot.dag
JOB  A  mandelbrot.submit
JOB  B  stitch.submit
PARENT A CHILD B
Retry A 3

Submitting DAGs is just as easy as submitting HTCondor jobs:

$ condor_submit_dag mandelbrot.dag

-----------------------------------------------------------------------
File for submitting this DAG to Condor           : mandelbrot.dag.condor.sub
Log of DAGMan debugging messages                 : mandelbrot.dag.dagman.out
Log of Condor library output                     : mandelbrot.dag.lib.out
Log of Condor library error messages             : mandelbrot.dag.lib.err
Log of the life of condor_dagman itself          : mandelbrot.dag.dagman.log

Submitting job(s).
1 job(s) submitted to cluster 85959.
-----------------------------------------------------------------------

The DAG will fire up all of our math.sh scripts for us, as condor_q shows:

-- Submitter: login01.uchicago.edu : <10.1.3.94:9618?sock=25212_0c25_14> : login01.uchicago.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
85959.0   netid        5/18 13:43   0+00:00:14 R  0   0.3  condor_dagman
85960.0   netid        5/18 13:43   0+00:00:00 I  0   0.0  math.sh mandelbrot
85960.1   netid        5/18 13:43   0+00:00:00 I  0   0.0  math.sh mandelbrot
85960.2   netid        5/18 13:43   0+00:00:00 I  0   0.0  math.sh mandelbrot
85960.3   netid        5/18 13:43   0+00:00:00 I  0   0.0  math.sh mandelbrot
85960.4   netid        5/18 13:43   0+00:00:00 I  0   0.0  math.sh mandelbrot
85960.5   netid        5/18 13:43   0+00:00:00 I  0   0.0  math.sh mandelbrot
85960.6   netid        5/18 13:43   0+00:00:00 I  0   0.0  math.sh mandelbrot
85960.7   netid        5/18 13:43   0+00:00:00 I  0   0.0  math.sh mandelbrot
85960.8   netid        5/18 13:43   0+00:00:00 I  0   0.0  math.sh mandelbrot
85960.9   netid        5/18 13:43   0+00:00:00 I  0   0.0  math.sh mandelbrot

And once it's finished, you should see the completed "mandelbrot.csv" in the directory you submitted the DAG from!

Fractals!

Mathematica can't export graphics without the GUI running or the appropriate X11 libraries installed, so you'll need to 'scp' mandelbrot.csv to your laptop or another machine that has Mathematica installed.

Nevertheless, here's the (trivial) code for getting Mathematica to plot it:

Mandelbrot = Import["/Users/netid/mandelbrot/mandelbrot.csv"]
ArrayPlot[Transpose[Mandelbrot], ColorFunction -> "Rainbow"]

Here it is!
