## Overview

The second application that we'll run on OSG Connect is Mathematica. This example introduces HTCondor Directed Acyclic Graphs (DAGs) for managing job workflows. Specifically, we'll take the Mandelbrot set calculation, split it into N jobs, and then recombine the results to produce a nice picture.

## Background

This exercise will calculate the Mandelbrot set, one of the most easily recognized fractals. You can read more about the Mandelbrot set here: http://en.wikipedia.org/wiki/Mandelbrot_set

## Running Mathematica locally

We'll need to set up a working directory, so you can either type `tutorial mathematica`

or type the following:

$ mkdir -p osg-mathematica/{csv,log}; cd osg-mathematica

Like R, we can get Mathematica from CVMFS. It's installed at /cvmfs/uc3.uchicago.edu/Wolfram, so we'll need to export our PATH appropriately.

$ export PATH=$PATH:/cvmfs/uc3.uchicago.edu/Wolfram/Mathematica/8.0/Executables

Once we've got Mathematica in our PATH, let's try running it.

    $ math
    Mathematica 8.0 for Linux x86 (64-bit)
    Copyright 1988-2011 Wolfram Research, Inc.
    In[1]:=

Looks like it works. You can quit with `Quit[]`:

    In[1]:= Quit[]
    $

I've already written the Mandelbrot set code for you. I'll spare you the gory details, but the important thing to note is that it takes two variables as input: "PID" and "Jobs".

$ nano mandelbrot.m

**file: mandelbrot.m**

    Scaling = 5
    TotalCols = LCM[7/4, 2]*Jobs*Scaling
    TotalRows = Ceiling[(TotalCols - 1)*4/7]
    maxIterations = 120;
    MandelbrotPixel = Compile[{{ColNum, _Integer}, {RowNum, _Integer}},
       Module[{x = 0., y = 0., xtemp, iterations = 0},
        While[x^2 + y^2 <= 4 && iterations < maxIterations,
         xtemp = x*x - y*y + (ColNum - 1)*3.5/TotalCols - 2.5;
         y = 2*x*y + (RowNum - 1)*2/TotalRows - 1;
         x = xtemp;
         iterations = iterations + 1;];
        iterations],
       CompilationTarget -> "C"];
    MandelbrotData = Table[MandelbrotPixel[i, j], {j, 1, TotalRows},
       {i, (PID - 1)*TotalCols/Jobs + 1, (TotalCols/Jobs)*PID}];
    Export[StringJoin["mandelbrot." <> IntegerString[PID, 10, IntegerLength[Jobs]] <> ".csv"],
      MandelbrotData, "CSV"]

There's also a scaling factor at the top of the code that you are free to modify. If you run 10 jobs with the default scaling, you can expect to produce a 700x400 pixel Mandelbrot.
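To make the size arithmetic concrete, here is a small shell sketch that mirrors the first lines of mandelbrot.m. The function names `mandelbrot_dims` and `chunk_columns` are made up for this illustration and are not part of the tutorial files. It relies on `LCM[7/4, 2]` evaluating to 14, which is consistent with the 700x400 figure quoted above.

```shell
#!/bin/bash
# Sketch of the size arithmetic in mandelbrot.m (hypothetical helpers).
# Assumes LCM[7/4, 2] = 14, so TotalCols = 14 * Jobs * Scaling.

# Print "cols rows" for a given number of jobs and scaling factor.
mandelbrot_dims() {
    local jobs=$1 scaling=$2
    local cols=$(( 14 * jobs * scaling ))
    # Integer ceiling of (cols - 1) * 4 / 7, like Ceiling[] in the script
    local rows=$(( ((cols - 1) * 4 + 6) / 7 ))
    echo "$cols $rows"
}

# Print the first and last column (1-based, inclusive) computed by chunk PID.
chunk_columns() {
    local pid=$1 jobs=$2 scaling=$3
    local cols=$(( 14 * jobs * scaling ))
    local per_job=$(( cols / jobs ))
    echo "$(( (pid - 1) * per_job + 1 )) $(( pid * per_job ))"
}

mandelbrot_dims 10 5     # prints: 700 400
chunk_columns 1 10 5     # prints: 1 70
chunk_columns 10 10 5    # prints: 631 700
```

Each job therefore computes a vertical strip of 14 * Scaling columns, and raising Scaling increases the resolution without changing the number of jobs.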

Let's try running this from the command line. We'll pass in PID and Jobs as arguments. 'Jobs' determines how many times we want to cut up the calculation, and 'PID' specifies which chunk we want to evaluate.

    $ math -run "PID=1;Jobs=10" < mandelbrot.m
    Mathematica 8.0 for Linux x86 (64-bit)
    Copyright 1988-2011 Wolfram Research, Inc.
    In[1]:=
    Out[1]= 700
    In[2]:=
    Out[2]= 400
    In[3]:=
    In[4]:=
    In[4]:=
    In[5]:=
    In[5]:=
    In[6]:=
    In[6]:=
    Out[6]= mandelbrot.01.csv
    In[7]:=
    $

There's a lot of stuff on standard out, but we see that "mandelbrot.01.csv" was created.

Let's go ahead and remove the file to avoid confusion later:

$ rm mandelbrot.01.csv

## Creating an HTCondor job

How do we wrap this up into an HTCondor job? First we need to create a small script that sets up our environment variables and runs Mathematica in batch mode. Here's my script:

$ nano math.sh

**file: math.sh**

    #!/bin/bash
    # With 1 argument, the batch file is run with no extra variables.
    # With 2 or 3 arguments, the second argument plus 1 becomes PID
    # and the third argument becomes Jobs.
    export PATH=/usr/bin:/cvmfs/uc3.uchicago.edu/Wolfram/Mathematica/8.0/Executables
    if [ $# -eq 1 ]; then
        math -run < $1
    elif [ $# -eq 2 ]; then
        math -run "PID=`expr $2 + 1`" < $1
    elif [ $# -eq 3 ]; then
        math -run "PID=`expr $2 + 1`;Jobs=$3" < $1
    else
        echo "Wrong number of arguments"
        echo "Usage: math.sh batch.m [PID] [Number of Jobs]"
        exit 1
    fi

We're going to create another Mathematica batch file later, so this script has been made a bit generic. Let's create an HTCondor submit file to go along with it.

$ nano mandelbrot.submit

**file: mandelbrot.submit**

    executable = math.sh
    universe = vanilla
    Log = ../log/log.$(Cluster).$(Process)
    Output = ../log/out.$(Cluster).$(Process)
    Error = ../log/err.$(Cluster).$(Process)
    WhenToTransferOutput = ON_EXIT
    should_transfer_files = YES
    transfer_input_files = ../mandelbrot.m
    initial_dir = csv
    Arguments = mandelbrot.m $(Process) $(Jobs)
    requirements = (HAS_MATH_LICENSE =?= True)
    Jobs = 10
    queue $(Jobs)

This HTCondor submit file is a bit more complex than others for a few reasons.

- I don't like to pollute my submit directory with data and logs, so I split them off into directories called 'csv' and 'log' respectively.
- I also want to make sure that my Mathematica script is aware of how many jobs are being made, so I define a macro called "Jobs" and pass it to both `Arguments` and `queue`.
- Mathematica requires a license! We use a boolean expression in `requirements` to make sure that CVMFS and the license server are available to the worker.
- Finally, $(Process) translates to our Mathematica script's PID variable. Because $(Process) starts at 0, math.sh adds 1 to it so that PID runs from 1 to Jobs.
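The mapping from $(Process) to PID and then to an output file name can be sketched in shell as follows. `chunk_name` is a hypothetical helper written for this illustration. It imitates what math.sh and `IntegerString[PID, 10, IntegerLength[Jobs]]` do together, assuming the padding width equals the number of digits in Jobs.

```shell
#!/bin/bash
# Sketch: which CSV a given HTCondor process writes (hypothetical helper).
chunk_name() {
    local process=$1 jobs=$2
    local pid=$(( process + 1 ))   # math.sh does the same with expr
    local width=${#jobs}           # number of digits in Jobs
    printf 'mandelbrot.%0*d.csv\n' "$width" "$pid"
}

chunk_name 0 10    # prints: mandelbrot.01.csv
chunk_name 9 10    # prints: mandelbrot.10.csv
```

The zero padding matters later: it keeps the files in numeric order when they are listed alphabetically.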

## Stitching the output together

The problem is that this code will create 10 separate CSVs for us to stitch together, when we actually just want one to make a nice picture. I've created a Mathematica script for handling this as well.

$ nano stitch.m

**file: stitch.m**

    Export["mandelbrot.csv",
      Flatten[
        Table[Transpose[Import[FileNames["*.csv"][[i]]]],
          {i, 1, Length[FileNames["*.csv"]]}],
        1]];
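To see what stitch.m does to the data, here is a rough equivalent in shell and awk. It is only an illustration of the transpose-then-concatenate idea, not the method the tutorial uses, and the function names `transpose_csv` and `stitch_chunks` are invented for this sketch. It assumes plain comma-separated numbers with no quoted fields.

```shell
#!/bin/bash
# Sketch: transpose each chunk, then append the results in file-name order.

# Transpose one CSV file: input row r, column c becomes output row c, column r.
transpose_csv() {
    awk -F, '
        { for (i = 1; i <= NF; i++) cell[i, NR] = $i
          if (NF > ncol) ncol = NF }
        END {
            for (i = 1; i <= ncol; i++) {
                line = cell[i, 1]
                for (j = 2; j <= NR; j++) line = line "," cell[i, j]
                print line
            }
        }' "$1"
}

# Transpose every mandelbrot.*.csv chunk in a directory, in sorted order.
stitch_chunks() {
    local dir=$1 f
    for f in "$dir"/mandelbrot.*.csv; do
        [ -e "$f" ] || continue    # skip the literal pattern if nothing matched
        transpose_csv "$f"
    done
}

# Example (not run here): stitch_chunks csv > mandelbrot.csv
```

Each chunk holds every image row for a strip of columns, so transposing turns each image column into one output line. Concatenating the strips in order then yields one line per image column.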

As before, we create a submit file. This time we'll transfer the entire contents of "csv" with the job.

$ nano stitch.submit

**file: stitch.submit**

    executable = math.sh
    universe = vanilla
    Log = log/log.$(Cluster).$(Process)
    Output = log/out.$(Cluster).$(Process)
    Error = log/err.$(Cluster).$(Process)
    WhenToTransferOutput = ON_EXIT
    should_transfer_files = YES
    transfer_input_files = stitch.m, csv/
    Arguments = stitch.m
    requirements = (HAS_MATH_LICENSE =?= True)
    queue 1

## Creating an HTCondor DAG

We could submit these by hand, but why not let HTCondor's workflow manager take care of it? Enter DAGs, or Directed Acyclic Graphs. I won't explain them in detail, but they let you create job workflows and also have some nice features, such as automatic retry in the event of failure. You can read more about them here: http://research.cs.wisc.edu/htcondor/tutorials/intl-grid-school-3/simple_dag.html

Writing one for our jobs is easy since our workflow is pretty linear. This DAG will launch mandelbrot.submit, which creates our CSV files, and then, once that has finished, collate the output with stitch.submit.

$ nano mandelbrot.dag

**file: mandelbrot.dag**

    JOB A mandelbrot.submit
    JOB B stitch.submit
    PARENT A CHILD B
    Retry A 3

Submitting DAGs is just as easy as submitting HTCondor jobs:

    $ condor_submit_dag mandelbrot.dag
    -----------------------------------------------------------------------
    File for submitting this DAG to Condor           : mandelbrot.dag.condor.sub
    Log of DAGMan debugging messages                 : mandelbrot.dag.dagman.out
    Log of Condor library output                     : mandelbrot.dag.lib.out
    Log of Condor library error messages             : mandelbrot.dag.lib.err
    Log of the life of condor_dagman itself          : mandelbrot.dag.dagman.log
    Submitting job(s).
    1 job(s) submitted to cluster 85959.
    -----------------------------------------------------------------------

The DAG will fire up all of our math.sh scripts for us:

    -- Submitter: login01.uchicago.edu : <10.1.3.94:9618?sock=25212_0c25_14> : login01.uchicago.edu
     ID        OWNER    SUBMITTED     RUN_TIME   ST PRI SIZE CMD
    85959.0    netid    5/18 13:43   0+00:00:14  R  0   0.3  condor_dagman
    85960.0    netid    5/18 13:43   0+00:00:00  I  0   0.0  math.sh mandelbrot
    85960.1    netid    5/18 13:43   0+00:00:00  I  0   0.0  math.sh mandelbrot
    85960.2    netid    5/18 13:43   0+00:00:00  I  0   0.0  math.sh mandelbrot
    85960.3    netid    5/18 13:43   0+00:00:00  I  0   0.0  math.sh mandelbrot
    85960.4    netid    5/18 13:43   0+00:00:00  I  0   0.0  math.sh mandelbrot
    85960.5    netid    5/18 13:43   0+00:00:00  I  0   0.0  math.sh mandelbrot
    85960.6    netid    5/18 13:43   0+00:00:00  I  0   0.0  math.sh mandelbrot
    85960.7    netid    5/18 13:43   0+00:00:00  I  0   0.0  math.sh mandelbrot
    85960.8    netid    5/18 13:43   0+00:00:00  I  0   0.0  math.sh mandelbrot
    85960.9    netid    5/18 13:43   0+00:00:00  I  0   0.0  math.sh mandelbrot

And once it's finished, you should see the completed "mandelbrot.csv" in the directory you submitted the DAG from!
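As a quick sanity check on the stitched file, a sketch like the following counts its lines and the number of values on the first line. `csv_shape` is a hypothetical helper, not part of the tutorial. Since stitch.m transposes each chunk before concatenating, a 10-job run with the default scaling should give 700 lines of 400 values each.

```shell
#!/bin/bash
# Sketch: print "<lines> <values on the first line>" for a CSV file.
# Assumes plain comma-separated numbers with no quoted fields.
csv_shape() {
    awk -F, 'NR == 1 { cols = NF } END { print NR, cols + 0 }' "$1"
}

# Only inspect the file if the stitch job has already produced it.
if [ -f mandelbrot.csv ]; then
    csv_shape mandelbrot.csv
fi
```

If the line count is lower than expected, one or more chunk files were probably missing when the stitch job ran.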

## Fractals!

Mathematica can't export graphics without having the GUI running or the appropriate X11 libraries installed, so you'll need to `scp` the output to your laptop or another machine that has Mathematica installed.

Nevertheless, here's the (trivial) code for getting Mathematica to plot it:

    Mandelbrot = Import["/Users/netid/mandelbrot/mandelbrot.csv"]
    ArrayPlot[Transpose[Mandelbrot], ColorFunction -> "Rainbow"]

Here it is!