faq > Error when running main process on grid
Feb 21, 2018  09:02 AM | Nobody
Error when running main process on grid
Hi! 

I'm trying to run the ISC toolbox 3.0 on a Linux cluster. The analysis works fine when I force local computing. When I try to run it on the grid, I enter:

--partition=HPC --mem=8096 --time=2-0

The program does submit a batch job, but the job stays in the queue for only a few seconds and then disappears. I've made sure that the destination directory is new.

Any suggestions would be much appreciated.

Best
Sigurd
Feb 21, 2018  07:02 PM | Juha Pajula - VTT Technical Research Centre of Finland Ltd / Tampere University of Technology
RE: Error when running main process on grid
Hi!

Can you tell us what kind of grid you are using, or what kind of grid selection you have done?
The currently supported grid engines are Slurm and the SGE variants. Torque support is experimental. Condor grids are not supported.

You can check the output of the processes in the scripts folder, which is in the root of your analysis folder.
Files ending with exxxxx are the error files, and they will probably show why the job fails.

-Juha Pajula

Feb 22, 2018  08:02 AM | Nobody
RE: Error when running main process on grid
Thank you very much for replying! 

We use just SLURM. The error file in the scripts folder reads:

/var/lib/slurm-llnl/slurmd/job177034/slurm_script: 2: /var/lib/slurm-llnl/slurmd/job177034/slurm_script: module: not found
slurmstepd: *** JOB 177034 ON small29 CANCELLED AT 2018-02-21T10:05:09 *** 


As I understand it, module is used to set up the environment (here, MATLAB), but MATLAB is already set up.

However, any help would really be appreciated. 

Best
Mar 9, 2018  06:03 PM | Juha Pajula - VTT Technical Research Centre of Finland Ltd / Tampere University of Technology
RE: Error when running main process on grid
You can fix this by commenting out the following line in the file gridParser.m:
dlmwrite(file_name, 'module load matlab','-append','delimiter',''); 

The run script will then no longer try to load the MATLAB module before it starts the MATLAB process.

-Juha Pajula
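
The error quoted in this thread shows that the shell running the job script has no `module` command defined. As a possible alternative to removing the line outright, a job script could check for `module` before calling it. The following is only a minimal sketch in plain POSIX shell, not code from the ISC toolbox: the function name `load_matlab_module` is hypothetical, and the module name `matlab` is taken from the line quoted above.

```shell
#!/bin/sh
# Sketch: call `module load matlab` only when a `module` command exists.
# Assumption: MATLAB is already on PATH when no module system is present.

load_matlab_module() {
    # `command -v` is POSIX; it succeeds for functions, builtins and executables.
    if command -v module >/dev/null 2>&1; then
        module load matlab
    else
        echo "module command not available; assuming matlab is already on PATH" >&2
    fi
}

load_matlab_module
```

On a cluster without a module system the function prints a note to stderr and the script carries on; where `module` is defined, it behaves like the original line.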