jean-zay-doc's Introduction

Why this doc?

We are researchers and engineers in AI (very vague term but oh well ...) who have managed to get access to Jean Zay and think this can be a very useful cluster for your AI research.

At the time of writing (end of January 2020), the GPU part of Jean Zay is very much underused, and we think user-contributed documentation could help people navigate the access procedure and learn the few tips and tricks needed to be productive on such a cluster.

This is meant to be a collaborative doc: if you spot errors or things that could be improved, open an issue or, even better, a Pull Request (PR)!

We use Gitter for chat. Don't hesitate to get involved there and ask questions!

Content

In the medium term, more material could be added on tips and tricks, limitations, workarounds, etc. for Jean Zay. In particular, feel free to share tutorials, tools and scripts that help users make more productive use of the Jean Zay cluster, e.g.:

  • how to add checkpointing to your code so that long-running jobs can continue despite the 20-hour wall time limit;
  • how to make sure your code leverages the hardware optimally (e.g. with mixed precision and Tensor Cores);
  • how to make sure that your processing is not limited by suboptimal data access patterns on the disks or inefficient pre-processing on the CPUs;
  • how to do efficient hyper-parameter tuning at scale;
  • how to synchronize your code between your local computer and the cluster.
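
As an illustration of the first item, here is a minimal checkpointing sketch in plain Python. It is not taken from the Jean Zay documentation: the function names (`save_checkpoint`, `load_checkpoint`, `run`), the JSON state format and the `time_budget_s` parameter are all hypothetical, and the "work" is a placeholder sum. The idea is to save progress regularly, stop before the wall time limit is reached, and let the next job resume from the last saved state. In a real training job you would save model and optimizer state with your framework's own tools instead of JSON.

```python
import json
import os
import tempfile
import time


def save_checkpoint(path, state):
    """Write state atomically so a job killed mid-write cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # replaces the old checkpoint in one step


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0.0}


def run(path, n_steps, time_budget_s, checkpoint_every=10):
    """Resume from `path`, then work until done or until the budget is spent."""
    start = time.monotonic()
    state = load_checkpoint(path)
    while state["step"] < n_steps:
        if time.monotonic() - start > time_budget_s:
            break  # stop early; the next job picks up from the checkpoint
        state["total"] += state["step"]  # placeholder for one unit of real work
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)  # always leave an up-to-date checkpoint
    return state
```

Calling `run("ckpt.json", n_steps, time_budget_s)` again in a follow-up job continues from the saved step. The time budget would be set somewhat below the wall time limit to leave room for the final save.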

Useful links

Jean Zay doc for AI users (French only for now): http://www.idris.fr/ia

Hardware: http://www.idris.fr/eng/jean-zay/cpu/jean-zay-cpu-hw-eng.html

Doc: http://www.idris.fr/eng/jean-zay/

Doc in French (more accurate sometimes): http://www.idris.fr/jean-zay/

Email for Jean Zay user support: [email protected].

Generic advice

  • There is a big cultural gap between traditional HPC (High Performance Computing) users and AI users. For example, most traditional "serious" HPC clusters do not have access to the internet. Yes, you have read this correctly: people in traditional HPC do not need internet access to work on their problems.
  • So far every interaction we have had with Jean Zay user support has been very positive. Even if there may be some frustration (on both sides), try to be both pedagogical and constructive when you send an email to [email protected].

jean-zay-doc's People

Contributors

lesteve, mdiazmel, mypey, ogrisel, rstrudel, zaccharieramzi
