
Transferring Files

How to transfer files using SCP, SFTP, Fuse and S3

Using Open OnDemand to move files graphically from a browser

Some users prefer to move files through the web console: Duo authentication is requested only when you first log in, and you can then keep the window open and transfer files as needed. To learn more about this option, see the documentation for Open OnDemand.


Mounting via Windows or Mac

From campus or over the VPN, you can mount your home and group directories on your computer as network shares (SMB).

On a Mac, open the Finder, choose Go from the menu bar, and select "Connect to Server..."

From there you can choose to mount locations such as your home directory, group directory, or scratch.  Here are locations you can put in:

GPFS
smb://transfer.hpc.caltech.edu/groups/groupname
smb://transfer.hpc.caltech.edu/username
smb://transfer.hpc.caltech.edu/scratch

VAST
smb://fs.hpc.caltech.edu/groups/groupname
smb://fs.hpc.caltech.edu/scratch

Mounting via command line on Mac

GPFS
mkdir ~/cluster_group_dir
mount_smbfs //username@transfer.hpc.caltech.edu/groups/groupname ~/cluster_group_dir

VAST
mkdir ~/cluster_group_dir
mount_smbfs //username@fs.hpc.caltech.edu/groups/groupname ~/cluster_group_dir
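The GPFS and VAST mounts above differ only in the server name, so a small shell helper can pick the right one. The following is a minimal sketch, not part of the cluster's tooling: the function name smb_group_share and the gpfs/vast keywords are hypothetical, and username and groupname are placeholders. It only prints the share path and does not mount anything.

```shell
#!/bin/sh
# Hypothetical helper: print the SMB share path for a group directory.
# The two hostnames come from the locations listed above; the rest is a placeholder.
smb_group_share() {
  fs="$1"; user="$2"; group="$3"
  case "$fs" in
    gpfs) host="transfer.hpc.caltech.edu" ;;
    vast) host="fs.hpc.caltech.edu" ;;
    *) echo "unknown filesystem: $fs (expected gpfs or vast)" >&2; return 1 ;;
  esac
  printf '//%s@%s/groups/%s\n' "$user" "$host" "$group"
}

# Show the path that would be passed to mount_smbfs for each filesystem.
smb_group_share gpfs username groupname
smb_group_share vast username groupname
```

On a Mac, the result could be used as mount_smbfs "$(smb_group_share vast username groupname)" ~/cluster_group_dir, and the share released afterwards with umount ~/cluster_group_dir.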

On Windows it is similar except the locations are formatted differently:

GPFS
\\transfer.hpc.caltech.edu\groups\groupname
\\transfer.hpc.caltech.edu\username
\\transfer.hpc.caltech.edu\scratch

VAST
\\fs.hpc.caltech.edu\groups\groupname
\\fs.hpc.caltech.edu\scratch

You can mount from the command line in Windows using the net use command:

GPFS
net use Z: \\transfer.hpc.caltech.edu\groups\groupname

VAST
net use Z: \\fs.hpc.caltech.edu\groups\groupname

Using SSH/SCP

Why use SSH/SCP/SFTP for file transfer?

SCP and SFTP both run over SSH, so transfers are encrypted. Implementations are available for all common operating systems, including Linux, Windows, and Mac OS X.

Windows

GUI:
  • WinSCP
    • Host: login.hpc.caltech.edu
    • Enter your username and password.
  • FileZilla

Linux

  • Command Line:
    • Start Terminal (Applications->Accessories->Terminal)

      • To copy files from your computer to the central cluster

        • Type scp local_filename username@login.hpc.caltech.edu:~/remote_directory_name
        • Or type rsync -avPz -e ssh local_filename username@login.hpc.caltech.edu:~/remote_directory_name
      • To copy files from the central cluster to your computer
        • Type scp username@login.hpc.caltech.edu:~/remote_filename .
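To rehearse a transfer before running it, the scp and rsync commands above can be wrapped in small functions that print the command first. This is a minimal sketch under stated assumptions: the names CLUSTER_USER, CLUSTER_HOST, run, push, and pull are hypothetical, username is a placeholder, and nothing is executed unless RUN=1 is set in the environment.

```shell
#!/bin/sh
CLUSTER_USER="username"               # placeholder: your cluster account name
CLUSTER_HOST="login.hpc.caltech.edu"

# Print the command; execute it only when RUN=1 is set in the environment.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Copy a local file or directory to the cluster.
# -P keeps partially transferred files and shows progress.
push() {
  run rsync -avPz -e ssh "$1" "${CLUSTER_USER}@${CLUSTER_HOST}:$2"
}

# Copy a file or directory (-r) from the cluster to a local destination.
pull() {
  run scp -r "${CLUSTER_USER}@${CLUSTER_HOST}:$1" "$2"
}

# Remote paths are quoted so that the local shell does not expand the tilde.
push local_filename '~/remote_directory_name/'
pull '~/remote_filename' .
```

Once the printed commands look right, the same script can be rerun with RUN=1 to perform the transfers; each one will still prompt for your password and Duo authentication.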
Mac OS X

  • Command Line:
    • Start Terminal (Applications->Utilities->Terminal)
      • To copy files from your computer to the central cluster
        • Type scp local_filename username@login.hpc.caltech.edu:~/remote_directory_name/
      • To copy files from the central cluster to your computer
        • Type scp username@login.hpc.caltech.edu:~/remote_filename .

    SSHFS on Mac OS X

    If you prefer filesystem-like access, you may use FUSE for macOS together with SSHFS. This works over the SSH protocol and is therefore encrypted, as with standard SSH/SCP/SFTP, with the added benefit of drag-and-drop transfers.

    • Download and install FUSE and SSHFS here.
    • Make a local mount directory on your Mac: mkdir ~/Desktop/HPC-Mount
    • Run a command similar to the following, swapping out your username and directory name.
    •  sshfs -o allow_other,defer_permissions,auto_cache remote-username@login.hpc.caltech.edu:/home/remote-username ~/Desktop/HPC-Mount
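Mounting over a directory that already contains files hides those files for the duration of the mount, and some FUSE versions refuse to mount there at all, so it can help to check the mount point first. The following is a minimal sketch, not an official procedure: is_empty_dir and the MOUNT_POINT variable are hypothetical names, and the default path is the example directory from the steps above.

```shell
#!/bin/sh
# Return success if the directory exists and contains no entries
# (ls -A also counts hidden files, but not "." and "..").
is_empty_dir() {
  [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Example mount point from the steps above; override by exporting MOUNT_POINT.
MOUNT_POINT="${MOUNT_POINT:-$HOME/Desktop/HPC-Mount}"

if is_empty_dir "$MOUNT_POINT"; then
  echo "ready to mount at $MOUNT_POINT"
else
  echo "create or empty $MOUNT_POINT before running sshfs"
fi
```

When you are finished with the mount, it can be released with umount ~/Desktop/HPC-Mount.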

    GUI:


    Cyberduck
      • Download
      • Cyberduck can be made to work with two-factor authentication
        • Click on "Open Connection"
        • Choose "SFTP"
        • Enter your username and password, then click Connect
        • In the "Provide additional login credentials" box, enter 1 in the password field and press Enter if you are using the smartphone app.
        • You should be prompted on your cell phone to allow the connection.
        • If you are using a YubiKey, touch it when prompted to complete the login.
      • Note: By default, Cyberduck asks for multi-factor authentication on every file copy transaction. To avoid this, go to Preferences > General > Transfer Files and select 'Use Browser Connection', then connect to the cluster via SFTP.
    FileZilla
    • In FileZilla, under transfer settings, limit the number of simultaneous
      connections to "1". When transferring multiple files, FileZilla tries to
      open multiple connections and performs the interactive login for each
      new one. Limiting the connections to "1" should force FileZilla to use
      a single connection (and thus a single authentication) for the entire
      transfer.

    Using Amazon S3

    If your data is in Amazon S3, you may use the awscli tools, which are already installed as a module on the cluster.

    • Log into the cluster and run module load awscli/1.15.27
    • Type aws configure and enter your AWS access key ID and secret access key. (You generate these on the IAM credentials page in the AWS console.)
    • Run a command similar to the following to copy data from S3 to your cluster home directory. 
    • aws s3 cp --recursive s3://my-bucket-name/subfolder/ ~/destination-directory/
    • Run a command similar to the following to copy data from the cluster to a pre-existing S3 bucket.
    • aws s3 cp --recursive ~/source-directory/ s3://my-bucket-name/subfolder/
    • More s3 examples are available here. 
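Bucket and prefix strings are easy to mistype, and a recursive copy can move a lot of data, so it may be worth previewing a transfer first. The following is a minimal sketch under stated assumptions: s3_uri is a hypothetical helper, and the bucket, prefix, and destination are the placeholder names from the steps above. It prints a preview command rather than running it; the --dryrun flag of aws s3 cp lists the operations that would be performed without copying anything.

```shell
#!/bin/sh
# Hypothetical helper: build an s3:// URI ending in "/" from a bucket name
# and an optional key prefix.
s3_uri() {
  bucket="$1"
  prefix="${2:-}"
  prefix="${prefix#/}"   # drop one leading slash, if present
  prefix="${prefix%/}"   # drop one trailing slash, if present
  if [ -n "$prefix" ]; then
    printf 's3://%s/%s/\n' "$bucket" "$prefix"
  else
    printf 's3://%s/\n' "$bucket"
  fi
}

# Print (do not run) a preview command for the download direction.
SRC="$(s3_uri my-bucket-name subfolder)"
echo "aws s3 cp --recursive --dryrun $SRC ~/destination-directory/"
```

If the preview output looks right, running the same command without --dryrun performs the copy.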

    Using Google Cloud Storage

    If your data is in Google Cloud Storage, you may use gsutil, which is installed as a module on the cluster.

    • Log into the cluster and run module load python/2.7.15 gcloud/latest
    • Run gcloud auth login to configure the Google SDK for your GCP account if needed.

    • Run the following command to copy data from the cluster to Google Cloud Storage.

    • gsutil cp ~/kitten.png gs://my-awesome-bucket

    • More gsutil examples here.


    Globus on the Resnick HPC


    The Resnick HPC provides a Globus endpoint, which allows researchers to
    efficiently move data from both client machines and other institutions' endpoints to the cluster
    filesystem.

    For moving data between client machines and the Resnick cluster endpoint, we recommend
    installing Globus Connect Personal. https://www.globus.org/globus-connect-personal

    Once it is installed, search for "Resnick" in the File Transfer dialog box. This will
    bring up our endpoint "Caltech Resnick HPC Directories", which provides access
    to both /home and /group directories. We strongly recommend transferring files only to
    /group directories, as their storage capacity is much greater than that of /home.

    Name: Resnick-HPC-Cluster
    Direct Link: https://app.globus.org/file-manager/collections/5be0c110-3df6-4c24-ae62-7fd88da26e3b/overview
    ID: 5be0c110-3df6-4c24-ae62-7fd88da26e3b
    UUID: 9fc54b35-f66e-4ef0-a36a-49b20d684b99

    Reference:

    https://www.globus.org/what-we-do
    https://www.globus.org/globus-connect-personal