ray-project / ray-educational-materials
This is a suite of hands-on training materials that shows how to scale CV, NLP, and time-series forecasting workloads with Ray.
License: Apache License 2.0
As the number of notebooks grows, it becomes increasingly difficult to surface what users are interested in. Right now, the directories are named either after the relevant library (e.g. "Ray Core") or after the type of data (e.g. "Computer_vision_workloads").
At the very least, these conventions should be consistent and, ideally, centered on workflows that developers would relate to. In addition, the README should be improved to better describe this repository and to direct attention and traffic to the relevant modules more quickly.
Currently the list of use cases in https://github.com/ray-project/ray-educational-materials/blob/main/Introductory_modules/Overview_of_Ray.ipynb contains the following:
This is skewed toward advanced use cases, which I don't think accurately reflects the entire target audience of Ray. I think it would be productive to break this down into two categories:
Introduction to Ray AIR
The Serve code snippet is actually the Tune one and should be swapped out.
n/a
Low: Minor problem.
Should the labels in preprocess_function be the encoded output? It seems to use input_ids as the labels instead of the output.
predictions_dataset = predictor.predict(data=dataset, batch_size=1)
If I run this on a GPU server, this line raises a RayTaskError. It seems the returned segmentation_maps_postprocessed has to be converted to a CPU NumPy array, and `num_gpus_per_worker=1` has to be set. It took me a long time to realize that the example has this issue. For a newbie, even a minor issue may lead to confusion.
Thanks
The link to "Dask on Ray" takes you to a 404 page:
https://docs.ray.io/en/latest/data/dask-on-ray.html
It looks like it should be this page:
https://docs.ray.io/en/latest/ray-more-libs/dask-on-ray.html
N/A
Low: Minor problem.
Here is a list of small changes to make based on feedback from the "Overview of Ray" dry run:
n_estimators: start at 8 and then increment by 8 to achieve a more satisfying convergence.
Merge Datasets and BatchPredictor approaches into one: "Distributed batch inference with Ray AIR".
Datasets approach is more basic; BatchPredictor is more specialized, easy to use and feature rich as it also:
Note in this section that BatchPredictor calls dataset.map_batches() under the hood. From that perspective they are similar.
Description
Running the "try it out" Colab on the website fails with an import error.
AttributeError: 'NoneType' object has no attribute 'replace'
Using the latest version of xgboost-ray (0.1.18) fixes the problem.
ray==2.3.0 xgboost_ray==0.1.15
Low: Minor problem.
The first line of example 3 includes the following import: import tasks_helper_utils as t_utils. But tasks_helper_utils is not a real library.
ray, version 2.7.0, Python 3.11.5, MacOS Monterey 12.2.1
Low: Minor problem.
Computer_vision_workloads/Semantic_segmentation/Scaling_batch_inference.ipynb
Imports and other dependencies need to be fixed for checkpoint-related changes.
#from ray.air import Checkpoint
from ray.train import Checkpoint
Further, Checkpoint.from_dict() does not work:
AttributeError: The new ray.train.Checkpoint class does not support from_dict(). Instead, only directories are supported.
Ray 2.10.0
Python 3.10.13
Ubuntu
None
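For the Checkpoint.from_dict() report above, one possible workaround is to write the state into a directory and wrap that directory with Checkpoint.from_directory(). The following is a minimal sketch, not the notebook's actual fix: the helper names save_state_to_dir and load_state_from_dir, the file name state.json, and the example state are all hypothetical, and the sketch skips the Ray step if ray.train.Checkpoint cannot be imported.

```python
import json
import tempfile
from pathlib import Path

try:
    # Directory-based checkpoint class mentioned in the error message.
    from ray.train import Checkpoint
except ImportError:
    # Assumption: without a recent Ray, only the plain file round trip runs.
    Checkpoint = None


def save_state_to_dir(state: dict, directory: str) -> str:
    """Hypothetical helper: write a JSON-serializable dict into `directory`."""
    path = Path(directory)
    path.mkdir(parents=True, exist_ok=True)
    (path / "state.json").write_text(json.dumps(state))
    return str(path)


def load_state_from_dir(directory: str) -> dict:
    """Hypothetical helper: read back the dict written by save_state_to_dir."""
    return json.loads((Path(directory) / "state.json").read_text())


# Illustrative state; the notebook's real checkpoint contents will differ.
checkpoint_dir = tempfile.mkdtemp()
save_state_to_dir({"epoch": 3, "lr": 0.001}, checkpoint_dir)

if Checkpoint is not None:
    # Replaces the old Checkpoint.from_dict(...) call.
    checkpoint = Checkpoint.from_directory(checkpoint_dir)
    with checkpoint.as_directory() as local_dir:
        restored = load_state_from_dir(local_dir)
else:
    restored = load_state_from_dir(checkpoint_dir)
```

Model weights would normally be saved with the framework's own serializer rather than JSON; JSON is used here only to keep the sketch dependency-free.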
Add Part 3, which will consist of small coding exercises:
Work with the object store
ray.put()
ray.get() to access the value of the object.
Compute pi digits
Use this docs example to show a highly parallel computational job: compute pi digits.
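The pi exercise proposed above could look roughly like the sketch below. It assumes a Monte Carlo estimate (which approximates pi rather than producing exact digits) and is not taken from the referenced docs example. The names count_inside and estimate_pi, the task count, and the sample sizes are illustrative, and the serial fallback for environments without Ray is an assumption added so the sketch still runs.

```python
import random

try:
    import ray
except ImportError:
    # Assumption: fall back to a serial loop when Ray is not installed.
    ray = None


def count_inside(num_samples: int, seed: int) -> int:
    """Count random points in the unit square that land inside the quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return inside


def estimate_pi(num_tasks: int = 8, samples_per_task: int = 50_000) -> float:
    """Split the sampling across independent tasks and combine the counts."""
    if ray is not None:
        ray.init(ignore_reinit_error=True)
        remote_count = ray.remote(count_inside)
        # Each call returns an object reference immediately; tasks run in parallel.
        refs = [remote_count.remote(samples_per_task, seed) for seed in range(num_tasks)]
        counts = ray.get(refs)  # block until all tasks finish
    else:
        counts = [count_inside(samples_per_task, seed) for seed in range(num_tasks)]
    return 4.0 * sum(counts) / (num_tasks * samples_per_task)


pi_estimate = estimate_pi()
```

Because every task is independent and returns only a single integer, the job parallelizes with almost no communication overhead, which is what makes it a useful first exercise.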
When using Ray, you can pass objects as arguments to remote functions. Ray will automatically store these objects in the local object store (on the worker node where the function is running) using the ray.put() function. This makes the objects available to all local tasks. However, if the objects are large, this can be inefficient as the objects will need to be copied every time they are passed to a remote function.
To improve performance, you can explicitly store both the model and feature extractor in the object store by using ray.put(). This avoids the need to create multiple copies of the objects.
I am confused by the wording about ray.put(). Which sentence should I follow?
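One way to read the two quoted sentences together: when a large object is passed by value, Ray stores it implicitly on each call, whereas calling ray.put() once and passing the returned reference lets all tasks share a single stored copy. The sketch below illustrates the second pattern under stated assumptions: large_lookup is a small stand-in for the model and feature extractor, score is a hypothetical task, and the serial fallback is only there so the sketch runs without Ray.

```python
try:
    import ray
except ImportError:
    # Assumption: run serially when Ray is not installed.
    ray = None


def score(lookup: dict, key: str) -> int:
    """Hypothetical task body that reads from a shared, read-only object."""
    return lookup[key]


# Stand-in for a large model or feature extractor.
large_lookup = {f"item-{i}": i for i in range(10_000)}
keys = ["item-1", "item-42", "item-9999"]

if ray is not None:
    ray.init(ignore_reinit_error=True)
    remote_score = ray.remote(score)
    # Store the object once; every task receives the same reference.
    lookup_ref = ray.put(large_lookup)
    # Ray resolves the reference to the stored value before running each task.
    results = ray.get([remote_score.remote(lookup_ref, key) for key in keys])
else:
    results = [score(large_lookup, key) for key in keys]
```

Passing large_lookup directly to remote_score.remote(...) would also work, but it would be serialized and stored again for each call, which is the inefficiency the first quoted sentence describes.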
Help Ray users understand how they can estimate the number of Actors and the compute needed to achieve performant batch prediction. Mention the following:
Ray_Core_1_Remote_Functions.ipynb
Running this cell gives
UnidentifiedImageError: cannot identify image file '**/ray-educational-materials/Ray_Core/task_images/stennis.jpg'
Ray: 2.3.1
Python: 3.10.12
OS: Ubuntu 22.04
Minor
Example 3: How to use Ray distributed tasks for image transformation and computation
When I run "run_distribued", I get the following errors:
In my case I set the batch to 100, but even when I set it to 35 the errors were still raised.
I am new to Ray and cannot figure out what is going on. What resources are unavailable, and why does the system halt?
System: Centos 7
CPUs: 128
Ray: 2.3
python 3.9
None
Add more rows to the table:
Add more content to the table:
ML practitioner examples -> add scalable training and parallel training examples. Training many models in parallel
Merge Actors and ActorPool approaches into one.
As ActorPool is a utility, it can be presented as a convenience wrapper that is easy to work with. It provides load balancing and Actor management so that Ray users do not need to implement these themselves (as presented in the Actors section).
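To illustrate the convenience-wrapper framing, here is a minimal sketch of ActorPool distributing work across a fixed set of actors. The Doubler class is a hypothetical stand-in for a model-holding actor, the pool size of 2 is arbitrary, and the serial fallback is an assumption so the sketch runs without Ray.

```python
try:
    import ray
    from ray.util import ActorPool
except ImportError:
    # Assumption: run serially when Ray is not installed.
    ray = None


class Doubler:
    """Hypothetical stand-in for an actor that holds a model and serves predictions."""

    def predict(self, value: int) -> int:
        return 2 * value


inputs = [1, 2, 3, 4, 5]

if ray is not None:
    ray.init(ignore_reinit_error=True)
    RemoteDoubler = ray.remote(Doubler)
    # The pool hands each input to whichever actor is free next.
    pool = ActorPool([RemoteDoubler.remote() for _ in range(2)])
    # map() yields results in input order.
    outputs = list(pool.map(lambda actor, value: actor.predict.remote(value), inputs))
else:
    local = Doubler()
    outputs = [local.predict(value) for value in inputs]
```

Without the pool, the caller would have to track which actor is idle and route each input manually, which is the bookkeeping the plain Actors approach requires.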
LLM_finetuning_and_batch_inference.ipynb
Get the following errors while running the following cell
trainer = HuggingFaceTrainer(
    trainer_init_per_worker=trainer_init_per_worker,
    scaling_config=ScalingConfig(num_workers=num_workers, use_gpu=use_gpu),
    datasets={
        "train": train_dataset,
        "evaluation": validation_dataset,
    },
    run_config=RunConfig(
        checkpoint_config=CheckpointConfig(
            num_to_keep=1,
            checkpoint_score_attribute="eval_loss",
            checkpoint_score_order="min",
        ),
    ),
    preprocessor=batch_preprocessor,
)
ray 2.8 python3.9
High: It blocks me from completing my task.
The collection of diagrams for the Ray Serve use case under the section "Mutli-model composition for model serving" is illegible and cluttered. Replace this image with a more readable diagram whenever it becomes available.
Failed to execute the following Python code:
# Read Parquet file to Ray Dataset.
dataset = ray.data.read_parquet(
"s3://anyscale-training-data/intro-to-ray-air/nyc_taxi_2021.parquet"
)
### Environment info
Python version: 3.11.3
Ray version: 2.5.0
### Issue Severity
High: It blocks me from completing my task.