
Carme-Docu

Documentation Project for Carme

Carme User Documentation

Overview

What is Carme?

Carme is an open-source framework to manage resources for multiple users running interactive jobs (e.g. Jupyter notebooks) on a cluster of (GPU) compute nodes.

Key features for users

NOTES

Basic Usage

After logging into a Carme system, users see the basic home screen, which contains the following numbered elements:

  1. Status bar: shows user details and session time-out
  2. Log out
  3. System News: maintenance and feature announcements
  4. Cluster utilization: graph showing the current and past availability of resources
  5. Messages: Carme status messages (e.g. starting/stopping jobs)
  6. Start new job: select the resources for the job (the selection is restricted by the user profile)
    • number of nodes
    • number of GPUs per node
  7. Job image selection
  8. Job name (optional)
  9. Start job button
  10. Running jobs: list of queued and running jobs
    • NOTE: depending on the availability of resources and user profile quotas, jobs might not start right away
  11. Entry Points: list of services running in the job image; click an entry to start it
  12. Job Infos: system information on a running job
    • GPU / CPU / memory assigned to the job
    • GPU usage graph
    • GPU memory usage graph
  13. Stop Job: manual termination of the job

Additionally, the home screen shows a Carme-Tools section with links to:

FAQs

See the List of FAQs.

Entry Points

See the Entry Point Doc.

Multi-Node Jobs

See the Multi Node Doc.

Carme APIs

Python API

Carme provides a Python library that allows users to interact directly with the Carme system. To use it, simply import it:

import carme

...

See the auto-generated API documentation for details.
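Before relying on the library in a notebook or script, it can help to check that it is actually importable in the current environment. The following is a minimal sketch that uses only the Python standard library. It assumes the module is installed under the name `carme`, as in the import line above; the helper `carme_available` is a hypothetical name introduced here for illustration, and no functions of the Carme library itself are called.

```python
import importlib.util


def carme_available(module_name: str = "carme") -> bool:
    """Return True if a top-level module with this name is on the import path.

    Hypothetical helper for illustration; not part of the Carme library.
    """
    return importlib.util.find_spec(module_name) is not None


if carme_available():
    # Same import as shown above; nothing from the library is called here.
    import carme
    print("carme library found")
else:
    print("carme library not found in this environment")
```

If the check fails, the auto-generated API documentation is the place to confirm where and how the library is made available.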

Bash Scripts

Creating and managing images