ssh -X kongull.hpc.ntnu.no
This might look like:
barn@barn-desktop:~$ ssh -X kongull.hpc.ntnu.no
barn@kongull.hpc.ntnu.no's password:
Last login: Mon Mar 8 14:55:42 2010 from ipt126.ipt.ntnu.no
Rocks 5.2 (Chimichanga)
Profile built 11:08 27-Jan-2010
Kickstarted 12:25 27-Jan-2010
[barn@kongull ~]$
module avail
[barn@kongull ~]$ module avail
------------------------------------------------------- /share/apps/modulefiles/ --------------------------------------------------------
4store/4.4.3(default)  gcc/3.3.6             intel/compilers/11.1.059  paraview/3.6.2          rasqal/0.9.19(default)
adf/adf2009.01         gcc/3.4.6             jdk/jdk1.6.0_18(default)  pcre/8.01(default)      vtk/5.4.2
cmake/2.8.0            gcc/4.4.3(default)    matlab/R2009b             pgi/8.0-3
fftw/3.2.2(default)    hdf5/1.6.10(default)  openfoam/1.6.100218       raptor/1.4.21(default)
[barn@kongull ~]$
To set up the user environment for a specific piece of software (in this case the Intel compiler), type:
module load intel/compilers
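In scripts it can help to check that the module command exists before calling it, since a non-interactive shell may not have it defined. The following is a minimal sketch under that assumption; the function name load_if_available is hypothetical and not part of the module system.

```shell
# Hypothetical helper: load a module only if the `module` command is defined
# in the current shell; otherwise print a hint and return a non-zero status.
load_if_available() {
    if command -v module >/dev/null 2>&1; then
        module load "$1"
    else
        echo "module command not found in this shell" >&2
        return 1
    fi
}

# Usage on the cluster (not run here):
#   load_if_available intel/compilers
```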
#---- SU
CWPROOT=/home/barn/sim/src/su
export CWPROOT
PATH=$CWPROOT/bin:$PATH
#--- MPI
PATH=/home/barn/sim/src/mpich-64/bin:$PATH
#--- SPL
SYS=x86_64
export SYS
SPL=/home/barn/sim/src/spl
PATH=/home/barn/sim/src/spl/bin/$SYS:$PATH
export SPL
#---- PBS
MANPATH=/opt/torque/man:$MANPATH
. .profile
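Because .profile may be sourced more than once (for example from the login shell and again from a job script), each PATH line above can add the same directory repeatedly. A minimal sketch of one way to avoid this follows; the helper name path_prepend is hypothetical and not part of the original setup.

```shell
# Hypothetical helper: prepend a directory to PATH only if it is not
# already one of its entries.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                      # already present, do nothing
        *) PATH="$1${PATH:+:$PATH}" ;;    # otherwise put it first
    esac
}

# Same directories as in the profile above.
path_prepend /home/barn/sim/src/su/bin
path_prepend /home/barn/sim/src/mpich-64/bin
export PATH
```

Sourcing the file a second time then leaves PATH unchanged.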
qsub < file.sh
#!/bin/sh
#PBS -N fdmod1
#PBS -l nodes=401
#PBS -q bigmem
#
#-- Mpi resources
#
#
np=401 # Number of MPI processes (compute nodes)
nodes=8 # Number of MPI processes per physical node
Wrkdir=$HOME/Project/multi-shot # Working directory
. $HOME/.profile
cd $Wrkdir
#-- Run modeling using mpi
mpirun -np $np -nodes $nodes -machinefile $PBS_NODEFILE $SPL/bin/$SYS/splfd2dmod \
v=1 \
logfile=log \
mp=1 \
dt=0.00025 \
lx=8000.0 \
time=4.0 \
smap=smap.m \
kmap=kmap.m \
max=1000,3000000,3000000 \
min=1,0,0
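The script's comments describe np as the total number of MPI processes and nodes as the number of processes per physical node. Under that reading, the number of physical nodes a run occupies is the ceiling of np divided by nodes. A small sketch of the arithmetic follows; the function name nodes_needed is hypothetical.

```shell
# Hypothetical helper: physical nodes needed for $1 MPI processes when
# $2 processes run on each physical node (integer ceiling division).
nodes_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

nodes_needed 401 8    # prints 51
```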
qstat
qstat -f
man pbs
cp -R $SPL/demos/splfd2dmod/fdmod1 .
./job.sh
*** This is splfd2dgeom of July 2007
*** Input parameters
--- nfldr : 400
--- nrec : 150
--- dsx : 25.00000
--- dgx : 25.00000
--- sx_pos : 3725.000
--- gx_pos : 0.0000000E+00
--- rectime : 4.000000
--- dt : 1.0000000E-03
--- scalco : -1
--- direction : 1
=== i, hwpos: 1 9
=== i, hwpos: 2 73
=== i, hwpos: 3 81
percentage completed: 1
percentage completed: 2
percentage completed: 3
percentage completed: 4
percentage completed: 5
percentage completed: 6
percentage completed: 7
percentage completed: 8
percentage completed: 9
percentage completed: 10
percentage completed: 11
percentage completed: 12
percentage completed: 13
percentage completed: 14
percentage completed: 15
percentage completed: 16
percentage completed: 17
percentage completed: 18
percentage completed: 19
percentage completed: 20
percentage completed: 21
percentage completed: 22
percentage completed: 23
percentage completed: 24
percentage completed: 25
percentage completed: 26
percentage completed: 27
percentage completed: 28
percentage completed: 29
percentage completed: 30
percentage completed: 31
percentage completed: 32
percentage completed: 33
percentage completed: 34
percentage completed: 35
percentage completed: 36
percentage completed: 37
percentage completed: 38
percentage completed: 39
percentage completed: 40
percentage completed: 41
percentage completed: 42
percentage completed: 43
percentage completed: 44
percentage completed: 45
percentage completed: 46
percentage completed: 47
percentage completed: 48
percentage completed: 49
percentage completed: 50
percentage completed: 51
percentage completed: 52
percentage completed: 53
percentage completed: 54
percentage completed: 55
percentage completed: 56
percentage completed: 57
percentage completed: 58
percentage completed: 59
percentage completed: 60
percentage completed: 61
percentage completed: 62
percentage completed: 63
percentage completed: 64
percentage completed: 65
percentage completed: 66
percentage completed: 67
percentage completed: 68
percentage completed: 69
percentage completed: 70
percentage completed: 71
percentage completed: 72
percentage completed: 73
percentage completed: 74
percentage completed: 75
percentage completed: 76
percentage completed: 77
percentage completed: 78
percentage completed: 79
percentage completed: 80
percentage completed: 81
percentage completed: 82
percentage completed: 83
percentage completed: 84
percentage completed: 85
percentage completed: 86
percentage completed: 87
percentage completed: 88
percentage completed: 89
percentage completed: 90
percentage completed: 91
percentage completed: 92
percentage completed: 93
percentage completed: 94
percentage completed: 95
percentage completed: 96
percentage completed: 97
percentage completed: 98
percentage completed: 99
percentage completed: 100
1483.kongull.hpc.ntnu.no
qstat
You should get the following output:
Job id Name User Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
1447.kongull cfbcycleb yuefa 55:18:06 R default
1448.kongull cfbcyclea yuefa 55:17:28 R optimist
1461.kongull job.sh arnemort 268:04:1 R optimist
1468.kongull job.sh arnemort 49:08:30 R optimist
1469.kongull job.sh arnemort 46:10:27 R optimist
1470.kongull job.sh arnemort 39:55:08 R optimist
1471.kongull job.sh arnemort 39:51:30 R optimist
1472.kongull job.sh arnemort 35:54:34 R optimist
1473.kongull job.sh arnemort 38:25:09 R optimist
1474.kongull job.sh arnemort 31:53:19 R optimist
1475.kongull job.sh arnemort 26:24:05 R optimist
1483.kongull fdmod1 barn 0 R bigmem
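On a busy cluster the listing can be long. A minimal sketch for picking out one user's jobs follows. It assumes the column layout of the sample listing above (job id in column 1, user in column 3, state in column 5); the function name jobs_for_user is hypothetical.

```shell
# Hypothetical helper: read a default qstat listing on standard input and
# print the job id and state of every job owned by user $1.
# Column positions are assumed from the sample listing.
jobs_for_user() {
    awk -v user="$1" '$3 == user { print $1, $5 }'
}

# Usage on the cluster (not run here):
#   qstat | jobs_for_user barn
```

For the sample listing this would print the single line for job 1483.kongull with state R.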
When the job is finished, two files will appear in the directory from which the job was started:
[barn@kongull fdmod1-sol]$ ls fdmod*
fdmod1.e1485 fdmod1.o1485
The first file contains what the program wrote to the Unix standard error stream, and the second contains what it wrote to standard output. In our case most of the output is captured in the log files, so these two files will contain useful information only after a crash or serious error.
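A quick way to see whether a finished job wrote anything to its error file is to test whether that file is non-empty. The sketch below assumes the naming pattern shown above (job name followed by .e and the job number); the function name check_job_errors is hypothetical.

```shell
# Hypothetical helper: report every non-empty error file for job name $1
# in the current directory. Returns 1 if at least one was found, else 0.
check_job_errors() {
    status=0
    for f in "$1".e*; do
        [ -e "$f" ] || continue          # pattern matched no file
        if [ -s "$f" ]; then             # -s is true for a non-empty file
            echo "$f is not empty, inspect it"
            status=1
        fi
    done
    return $status
}

# Usage in the job directory (not run here):
#   check_job_errors fdmod1
```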