Computation Section

How To RUN your case in OCTOPUS500

(with 768 cores and 2 TB RAM)

Available Parallel Environments

The new parallel environment is mpiX, where you replace X with the number of CPUs (1 to 10) that you want to use on each compute node.

For example, if you select

-pe mpi5 25

it will launch a run using 5 compute nodes with 5 CPUs on each node. This means that the number of slots you request must be a multiple of X (the X in mpiX):

-pe mpi5 22 --> will not work

-pe mpi5 10 --> will work

-pe mpi2 4 --> will work

-pe mpi2 5 --> will not work
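The multiple-of-X rule can be checked before you submit. The following is a minimal sketch, not part of SGE or of the cluster setup: check_pe is a hypothetical helper name, and it only does the arithmetic described above.

```shell
# check_pe PE SLOTS
# Succeeds (exit status 0) when SLOTS is a positive multiple of the X in "mpiX".
# Hypothetical helper for illustration only; it does not talk to SGE.
check_pe() {
    pe=$1
    slots=$2
    x=${pe#mpi}                       # strip the "mpi" prefix, leaving X
    case $x in
        ''|0*|*[!0-9]*) echo "bad parallel environment: $pe" >&2; return 2 ;;
    esac
    case $slots in
        ''|0*|*[!0-9]*) echo "bad slot count: $slots" >&2; return 2 ;;
    esac
    if [ "$x" -gt 10 ]; then          # X goes from 1 to 10 on this cluster
        echo "X must be between 1 and 10, got $x" >&2
        return 2
    fi
    if [ $((slots % x)) -ne 0 ]; then
        echo "-pe $pe $slots will not work: $slots is not a multiple of $x" >&2
        return 1
    fi
    echo "-pe $pe $slots will work: $((slots / x)) node(s) with $x cpus each"
}

check_pe mpi5 25
```

Run it with the same values you intend to put after -pe in your job script.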

THE RULE: The maximum number of CPUs allowed per node is 10, so take care when submitting your job. It is always a good idea to check Ganglia (http://192.168.92.140/ganglia) first. Try to be considerate and efficient: if you want to submit a case with, say, 5 CPUs, and there is a node with 5 CPUs already occupied, choose that node with the option -l h=compute-0-x, where x is the number of the node you want.
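As a rough illustration of the node request, the sketch below only prints the qsub command line for a given CPU count and node number; it does not submit anything. node_request_cmd is a hypothetical helper, and node number 3 is a made-up example, so check Ganglia for the node you actually want.

```shell
# node_request_cmd CPUS NODE_NUMBER SCRIPT
# Prints (does not run) a qsub command asking for CPUS slots on compute-0-NODE_NUMBER.
# Hypothetical helper for illustration only.
node_request_cmd() {
    cpus=$1
    node=$2
    script=$3
    if [ "$cpus" -lt 1 ] || [ "$cpus" -gt 10 ]; then
        echo "a node holds at most 10 cpus, got $cpus" >&2
        return 1
    fi
    echo "qsub -pe mpi$cpus $cpus -l h=compute-0-$node $script"
}

node_request_cmd 5 3 RUN.sh
```

The same request can instead be written inside the job script as a line starting with #$ followed by the -l option.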

OpenFOAM SCRIPT

I have created a script that should work for all your cases. Just run:

qsub RUN.sh

————————————————————————————————-

#!/bin/bash

#$ -S /bin/bash

# Set the Parallel Environment and number of procs.

#$ -pe mpi5 10

# The job will run in the current working directory

#$ -cwd

# Define the name of the job (name that will be displayed)

#$ -N putWhatEverYouWant

# Set your job output file

#$ -o log

# Set your job error file

#$ -e error.err

# Set the priority, default value is 0 (no priority)

#$ -p 0

# Set the email to receive news from the job

#$ -M youremail@ubi.pt

#$ -m bea

# Put your Job commands here.

#————————————————

. $HOME/OpenFOAM/OpenFOAM-2.0.x/etc/bashrc #Put the correct one!

# Define the Open MPI parameters (don't change these, they should be OK)

ARGS="--mca btl ^openib --mca btl_tcp_if_include eth0"

# Solver (the solver that you will run)

SOLVER=clusterMHDFoam

mpirun -np $NSLOTS $ARGS $SOLVER -parallel

————————————————————————————————-

Note: To run this script you must be inside the case folder itself.
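A quick check before calling qsub can catch the wrong-folder mistake. This is a minimal sketch with hypothetical helper names (is_case_dir, has_decomposition). It assumes the usual OpenFOAM layout, where a case holds constant/ and system/controlDict, and where decomposePar has created processor0 before a parallel run.

```shell
# is_case_dir DIR
# Succeeds when DIR looks like an OpenFOAM case (constant/ and system/controlDict exist).
is_case_dir() {
    [ -d "$1/constant" ] && [ -f "$1/system/controlDict" ]
}

# has_decomposition DIR
# Succeeds when DIR contains processor0, i.e. the case has been decomposed.
has_decomposition() {
    [ -d "$1/processor0" ]
}

# Check the current directory and say whether it is safe to submit from here.
if is_case_dir . && has_decomposition .; then
    echo "ready: qsub RUN.sh"
else
    echo "not ready: run this from a decomposed case folder"
fi
```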

And remember to kill all your processes when you leave the cluster, and always check Ganglia to see the cluster load:

http://192.168.92.140/ganglia (Inside UBI network)

Have FUN!

Commonly-Used SGE Commands

This page lists some of the more frequently used Sun Grid Engine commands. It does not list all of the options for each command. The man command can be used to see the detailed description of any of these commands. For example, to see a detailed description of the qsub command, enter:

man qsub

List of Commands:

qsub Submit a Job

qstat Determine the Status of a Job

qhost Display Node Information

qdel Cancel a Job

qhold Place a hold on a queued job to prevent it from running

qrls Release a job held with qhold

The qsub command is used to submit jobs to SGE. The syntax of the qsub command is:

qsub [-cwd] [-v SOME_VAR] [-o path] [-e path] [-M mail_address] [-m mail_options] [-l resources] script

Where:

-cwd

Directs SGE to run the job in the same directory from which you submitted it. Alternatively, you can specify this flag in the SGE command file for the job.

-v SOME_VAR

Passes environment variable SOME_VAR to the job. Alternatively, you can specify this flag in the SGE command file for the job.

-o path

Redirects stdout from the SGE script. The default is your home directory. Specify /dev/null to discard SGE messages. Alternatively, you can specify this flag in the SGE command file for the job.

-e path

Redirects stderr from the SGE script. The default is your home directory. Specify /dev/null to discard SGE error messages. Alternatively, you can specify this flag in the SGE command file for the job.

-M mail_address

Where mail_address is the user’s email address. It is always login_id@mail on Hoffman2.

-m mail_options

Specifies the circumstances under which mail is sent to the job owner defined by the -M option. For example, the option “bea” means mail is sent at the beginning, at the end, and on abort (if it happens) of the job. The option “n” means no mail is sent.

-l resources

Specifies a list of resources required for your job, for example memory and time per core:

-l h_data=1024M,h_rt=24:00:00

script

Either the SGE command file or the script that starts up your job.

The qsub command line switches and options can also be used as active comments or embedded directives in an SGE command file that you submit with the qsub command. Advantages of this approach are: you have a record of what options were used to run your job; you can easily resubmit jobs; and you can use one command file as the basis for creating other similar command files. For example, if the file myjob.cmd contains:

#!/bin/csh

/path/to/executable

and the qsub command used to submit it is:

qsub -cwd -o path -M login_id@mail -m bea -l h_data=1024M,h_rt=24:00:00 myjob.cmd

then the same result could be achieved by adding the following lines to the myjob.cmd file before the /path/to/executable line:

#$ -cwd

#$ -o path

#$ -M login_id@mail

#$ -m bea

#$ -l h_data=1024M,h_rt=24:00:00

and submitting the myjob.cmd script with:

qsub myjob.cmd

After submitting a job with qsub, SGE will respond with something like:

Your job 624556 ("myjob.cmd") has been submitted

where 624556 is the job number assigned by SGE to your job.
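If you want to reuse the job number, for instance with qstat -j or qdel, you can pick it out of that confirmation line. The sketch below assumes the message has the form shown above; job_id_from_msg is a hypothetical helper name.

```shell
# job_id_from_msg MESSAGE
# Prints the third word of a "Your job <number> ..." line, which is the job number.
# Prints nothing if the message does not have that form.
job_id_from_msg() {
    echo "$1" | awk '$1 == "Your" && $2 == "job" { print $3 }'
}

msg='Your job 624556 ("myjob.cmd") has been submitted'
job_id_from_msg "$msg"      # prints 624556
```

In a real session you would capture the message with msg=$(qsub myjob.cmd) instead of the fixed string used here.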

The qstat command displays information about the jobs in the SGE queues, both running and waiting to run. The syntax of the qstat command is:

qstat [-f] [-j job_number] [-U login_id] [-u login_id]

where:

(qstat alone with no arguments) Displays a list of all running and waiting jobs.

-f Displays summary information on each queue as well as the job list.

-j job_number

Displays the status of the job whose job number is job_number

-U login_id

Displays a list of running and waiting jobs for those queues which login_id can access. Or use the groupjobs script for this information; enter groupjobs -help for usage information.

-u login_id

Displays a list of login_id's running and waiting jobs. Or use the myjobs script for this information for your own login_id.
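To get a quick count of how many jobs are running and how many are waiting, the qstat listing can be piped through a small filter. This sketch assumes the default qstat layout (two header lines, then one job per line with the state in the fifth column). The sample listing is invented for illustration and count_states is a hypothetical helper name.

```shell
# count_states
# Reads qstat output on stdin and prints "<state> <count>" for each job state found.
count_states() {
    awk 'NR > 2 && NF >= 5 { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}

# Invented sample in the default qstat layout (not real cluster output).
sample='job-ID  prior   name       user     state submit/start at     queue               slots
--------------------------------------------------------------------------------------
    101 0.55500 cavity     alice    r     01/15/2013 10:02:11 all.q@compute-0-1   10
    102 0.55500 pipe       alice    r     01/15/2013 10:05:40 all.q@compute-0-2   10
    103 0.00000 channel    alice    qw    01/15/2013 10:07:02                     5'

printf '%s\n' "$sample" | count_states
```

On the cluster you would replace the sample with something like qstat -u login_id | count_states.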

The qhost command displays information about compute nodes: their architectures, number of processors, load, etc. The syntax of the qhost command is:

qhost [-j] [-q]

where:

(qhost alone with no arguments)

Displays a table of information about the compute nodes.

-j Adds information about the specific jobs that are running on each compute node.

-q Shows the queues each compute node accepts.

The qdel command is used to cancel a job either while it is waiting to execute or while it is running. The syntax of the qdel command is:

qdel job_number

If a running job does not get cancelled right away, enter:

qdel -f job_number

to force it to be cancelled. Jobs in the “dr” state (marked for deletion while still running) cannot be cancelled by the job owner. They must be cancelled by a system administrator. Jobs in the “dr” state usually indicate a system hardware problem.
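When you leave the cluster and want to cancel everything you still have queued or running, you can turn your qstat listing into qdel commands and review them first. This is a sketch under the same assumption about the default qstat layout (job number in the first column). qdel_commands is a hypothetical helper name, and it only prints the commands.

```shell
# qdel_commands
# Reads qstat output on stdin and prints one "qdel <job_number>" line per job.
# Nothing is cancelled; the output is only a preview.
qdel_commands() {
    awk 'NR > 2 && $1 ~ /^[0-9]+$/ { print "qdel " $1 }'
}

# Invented sample in the default qstat layout (not real cluster output).
printf '%s\n' \
    'job-ID  prior   name    user   state submit/start at     queue              slots' \
    '---------------------------------------------------------------------------------' \
    '    101 0.55500 cavity  alice  r     01/15/2013 10:02:11 all.q@compute-0-1  10' \
    '    103 0.00000 channel alice  qw    01/15/2013 10:07:02                    5' |
    qdel_commands
```

On the cluster, qstat -u login_id | qdel_commands shows the commands; pipe that output to sh only once you are sure the list is right. The qdel man page also documents a -u option for cancelling by user, so check man qdel on this system before relying on either approach.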
