Skip to content

Commit

Permalink
Ray OBS support
Browse files Browse the repository at this point in the history
Signed-off-by: Sergei <[email protected]>
  • Loading branch information
sWX1071736 authored and imhy committed Apr 27, 2024
1 parent a8bac2d commit cbcc617
Show file tree
Hide file tree
Showing 3 changed files with 103 additions and 0 deletions.
103 changes: 103 additions & 0 deletions reps/2024-04-24-ray-obs-support.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
## Summary

### General Motivation

Ray users can use the submit_job Api to send jobs to the ray cluster.

At the same time, they can specify the location of the project's working directory in a remote file system, for example s3.

This document suggests expanding the list of supported file systems and adding support for OBS [(Object Storage Service)](https://support.huaweicloud.com/intl/en-us/productdesc-obs/en-us_topic_0045853681.html).

After the implementation of this improvement, the user will be able to run submit tasks specifying the location of the working directory in OBS.


The use case of submitting ray job via obs is shown in Figure 1:

- First, the user starts the remote ray cluster, and then uploads the codes to the obs;

- Second, the user submits ray job to the ray cluster using the submit_job APIs, as well as setting the obs path in the submit_job API

- Third, after submitting the ray job, the ray cluster automatically downloads, uncompresses, and executes the codes from the OBS

<img style="background-color:white" src="2024-04-24-ray-obs-support/fig1.png" alt="Fig 1">


### Should this change be within `ray` or outside?

main `ray` project. Changes are made to Ray core components.

## Stewardship

### Required Reviewers

- @jjyao
- @ericl


### Shepherd of the Proposal (should be a senior committer)

- @ericl


### Design and Architecture

To submit a Ray job through OBS, perform the following steps:

1. Compress the code into a zip or jar package and put it in OBS

2. You can use environment variables or configuration files to access the AK, SK, and Endpoint of OBS

3. When you submit a ray job, specify the path of the OBS service

```python
job_id = client.submit_job(
entrypoint="python script.py",
runtime_env={"working_dir": "obs://example_bucket/example_file.zip"}
)
```

4. The ray cluster automatically downloads the example_file.zip from the specified OBS bucket, decompresses it, and then runs the entry file script.py in the working dir path


## Design Insights

Modify the source code of Ray so that it can download and run the OBS code as follows:

1. Parse the OBS path, for example, "obs://example_bucket/example_file.zip";

2. Read the AK, SK, and endpoint of OBS through environment variables and configuration files to access the remote OBS path.

3. Download the example_file.zip from the specified OBS bucket, decompress it, and execute the user's code

We can extend the ray project to support obs protocol, enabling the ray cluster to parse the obs URI and download the codes from obs:

1. parse obs URI, such as "obs://example_bucket/example_file.zip";

2. config the obs AK, SK, and Endpoint via environment variables and config files;

3. the ray cluster automatically downloads and execute the codes from obs.

## Implementation Analysis

To extend the ray project for obs, we first need to figure out the workflow of parsing and accessing remote URIs, which is shown in Figure 2.

<img style="background-color:white" src="2024-04-24-ray-obs-support/fig2.png" alt="Fig 2">

After the user submitting remote ray jobs, the ray cluster calls download_and_unpack_package function to download and uncrompress the remote files, as shown in Figure 2. To extend the ray project for OBS, we should extend the download_and_unpack_package function to support the OBS scheme, which is implemented via the following two steps.

**Step 1**: Extend the **parse_uri** function to parse the OBS URIs, which is implemented in the file [ray/_private/runtime_env/packaging](https://github.com/ray-project/ray/blob/master/python/ray/_private/runtime_env/packaging.py).

**Step 2**: Extend the third-party library [smart_open](https://github.com/piskvorky/smart_open) to read OBS objects, which is suggested to implement 3 interfaces, i.e., parse uri, open_uri, and open, as shown in Extending smart_open.



The 3 interfaces in smart_open have their own intents:

**parse_uri** : parse the remote URI "obs://bucketId/keyId" to obtain the following info: obs (scheme), bucketId, keyId.

**open_uri** : using the parsed URI info to open the remote objects and call the open API to return an IO stream

**open** : access the remote object and open it as an IO stream


It is worth noting that we can extend the open API to access the OBS objects, which can be implemented to call the wrapper functions based on [obs SDK](https://pypi.org/project/esdk-obs-python/).
Binary file added reps/2024-04-24-ray-obs-support/fig1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added reps/2024-04-24-ray-obs-support/fig2.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.

0 comments on commit cbcc617

Please sign in to comment.