distributed-machine-learning-with-python's Introduction

Distributed Machine Learning with Python

This is the code repository for Distributed Machine Learning with Python, published by Packt.

Accelerating model training and serving with distributed systems

What is this book about?

Reducing the time cost of machine learning shortens the wait for model training and speeds up the model update cycle. Distributed machine learning enables practitioners to shorten model training and inference time by orders of magnitude.

This book covers the following exciting features:

  • Deploy distributed model training and serving pipelines
  • Get to grips with the advanced features in TensorFlow and PyTorch
  • Mitigate system bottlenecks during in-parallel model training and serving
  • Discover the latest techniques built on top of classical parallelism paradigms
  • Explore advanced features in Megatron-LM and Mesh-TensorFlow
  • Use state-of-the-art hardware such as NVLink, NVSwitch, and GPUs

If you feel this book is for you, get your copy today!

https://www.packtpub.com/

Instructions and Navigations

All of the code is organized into folders.

The code will look like the following:

from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

# Connect to the API through a subscription key and endpoint
subscription_key = "<your-subscription-key>"
endpoint = "https://<your-cognitive-service>.cognitiveservices.azure.com/"

# Authenticate and create the Text Analytics client
credential = AzureKeyCredential(subscription_key)
cog_client = TextAnalyticsClient(endpoint=endpoint, credential=credential)

Following is what you need for this book: This book is for data scientists, machine learning engineers, and ML practitioners in both academia and industry. A fundamental understanding of machine learning concepts and a working knowledge of Python programming are assumed. Prior experience implementing ML/DL models with TensorFlow or PyTorch will be beneficial. You'll find this book useful if you are interested in using distributed systems to boost machine learning model training and serving speed.

With the following software and hardware list, you can run all code files present in the book (Chapters 1-12).

Software and Hardware List

Chapter | Software required | OS required
1-12    | PyTorch           | Windows, Mac OS X, and Linux (Any)
1-12    | TensorFlow        | Windows, Mac OS X, and Linux (Any)
1-12    | Python            | Windows, Mac OS X, and Linux (Any)
        | CUDA/C            |
        | NVprofiler/Nsight |

We assume you have Linux/Ubuntu as your operating system. We assume you use NVIDIA GPUs and have installed the proper NVIDIA driver as well. We also assume you have basic knowledge about machine learning in general and are familiar with popular deep learning models.
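As a quick way to see whether a machine roughly matches the assumptions above, here is a minimal sketch that uses only the Python standard library. The function name `check_environment` and the report keys are hypothetical and not part of the book's code. Finding `nvidia-smi` on the PATH only hints that an NVIDIA driver is installed; it does not prove that a GPU is usable.

```python
import importlib.util
import platform
import shutil


def check_environment(packages=("torch", "tensorflow")):
    """Return a dict describing which prerequisites appear to be present.

    Hypothetical helper for illustration; not part of the book's repository.
    """
    report = {
        "os": platform.system(),
        "python": platform.python_version(),
        # nvidia-smi ships with the NVIDIA driver. Its presence on PATH is
        # only a hint that the driver is installed.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
    }
    for name in packages:
        # find_spec locates a top-level package without importing it.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `torch` or `tensorflow` is reported as `False`, install it with your usual package manager before running the chapter code. Confirming that the framework can actually see the GPU still requires the framework itself.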

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. Click here to download it.

Get to Know the Author

Guanhua Wang is a final-year computer science Ph.D. student in the RISELab at UC Berkeley, advised by Professor Ion Stoica. His research lies primarily in the machine learning systems area, including fast collective communication, efficient in-parallel model training, and real-time model serving. His research has gained considerable attention from both academia and industry. He has been invited to give talks at top-tier universities (MIT, Stanford, CMU, Princeton) and big tech companies (Facebook/Meta, Microsoft). He received his master's degree from HKUST and his bachelor's degree from Southeast University in China. He has also done research on wireless networks. He likes playing soccer and has run multiple half-marathons in the Bay Area of California.

Download a free PDF

If you have already purchased a print or Kindle version of this book, you can get a DRM-free PDF version at no cost.
Simply click on the link to claim your free PDF.

https://packt.link/free-ebook/9781801815697

distributed-machine-learning-with-python's Issues

Chapter02 model doesn't converge

Hi, I ran the code in main.py in Chapter 2 and found that the loss explodes during training, whereas in the book the loss appears to decrease. Is there a problem with the model settings?
