To use the CoE HPC cluster, you must have an engineering account. If you do not have an engineering account, you can create one here by selecting "Create a new account (Enable your Engineering resources)". After your CoE account has been created, your access to the HPC cluster will be enabled.
As a new HPCC user, you may wish to subscribe to the Cluster mailing list to receive news or status updates regarding the cluster. You may also check for news or the status of the cluster here.
If you are off campus, you must first connect to the OSU VPN before you can connect directly to the HPC cluster. Alternatively, you can first connect to the CoE gateway host via SSH. If you are using Linux or a Mac computer, then you can just launch a terminal window and use the ssh command, e.g.:
ssh myONID@access.engr.oregonstate.edu
where myONID is your ONID (OSU Network ID). If you are using Windows, you will need to download an SSH client such as MobaXterm or PuTTY, then open an SSH session to the gateway host listed above.
The CoE HPC cluster may be accessed via SSH to one of the login or submit nodes. If you are on campus, or are off campus and have connected to the OSU VPN or to a CoE gateway host as described above, then you can SSH to one of the three submit hosts as follows:
ssh myONID@submit-X.hpc.engr.oregonstate.edu
where X is a, b, or c. Note that the submit nodes are not for running long or large jobs or calculations; they serve as gateways to the rest of the cluster. From a submit node, you may request compute resources (e.g., CPUs, RAM, and GPUs) from available compute nodes, either via an interactive session or by submitting one or more batch jobs. See the Slurm section below for how to reserve resources and run jobs on the cluster.
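For example, an interactive session on a compute node can be requested with srun. This is only a minimal sketch: the account name (mygroup) and partition name (share) below are placeholders, and the accounts and partitions available to you depend on your department, class, or research group.

# request 2 CPU cores, 4 GB of RAM, and 1 hour on a compute node, then open a shell there
srun -A mygroup -p share --cpus-per-task=2 --mem=4G --time=1:00:00 --pty bash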
If you need to run a GUI-based application on the cluster, you need an X11 server application installed on your computer. If you are running Windows, this is already provided by MobaXterm; if you are using PuTTY as your SSH client, you will need to install Xming on your computer and configure PuTTY to enable X11 forwarding. If you are running macOS, you need to install XQuartz, then log in using the "-Y" option to enable X11 forwarding, e.g.:
ssh -Y myONID@submit-a.hpc.engr.oregonstate.edu
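Once logged in with X11 forwarding enabled, you can verify that it is working by launching a simple graphical application. The example below assumes a basic X client such as xclock is installed on the submit node.

# a clock window should appear on your local display if X11 forwarding is working
xclock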
Researchers sponsored by a PI may have an HPC global scratch share with a 1 TB quota, located in /nfs/hpc/share/myONID, that they can use to run jobs and store data. Be advised that this HPC share is NOT backed up and should not be considered long-term storage; it is a temporary place to store data generated by the HPC cluster. The share is subject to a 90-day purge policy: files older than 90 days may be purged.
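For example, to move into your scratch share and check how much of the 1 TB quota you are currently using (substituting your own ONID):

cd /nfs/hpc/share/myONID
du -sh .    # report the total size of everything under your scratch share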
Some software is already available to you through your default executable path. Other software is managed and made available using Lmod (Lua-based environment modules). Check out this link for more information on using Lmod to access available software on the CoE HPC cluster.
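A typical Lmod workflow looks like the following; the module name gcc is only a placeholder, and module avail will show what is actually installed on the cluster.

module avail          # list the software modules available on the cluster
module load gcc       # load a module into your environment (gcc is an example name)
module list           # show the modules currently loaded
module unload gcc     # remove a module when you are done with it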
Slurm Workload Manager is the batch-queue system used to reserve resources and run jobs on the CoE HPC cluster, including the Nvidia DGX systems and other compute nodes with GPUs. To use Slurm, you will need to be assigned to a Slurm account corresponding to your department, class, or research group. Check out this link for more information on using Slurm on the CoE HPC cluster.
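As a rough sketch of a batch submission (the account and partition names below are placeholders; consult the Slurm documentation linked above for the values appropriate to your group), a minimal job script named myjob.sh might look like this:

#!/bin/bash
#SBATCH -J myjob              # job name
#SBATCH -A mygroup            # Slurm account (placeholder)
#SBATCH -p share              # partition (placeholder)
#SBATCH -c 2                  # number of CPU cores
#SBATCH --mem=4G              # memory
#SBATCH -t 0-01:00:00         # time limit (1 hour)
echo "Running on $(hostname)"

The script can then be submitted and monitored from a submit node as follows:

sbatch myjob.sh               # submit the job to the queue
squeue -u myONID              # check the status of your queued and running jobs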