
mc2's People

Contributors

chester-leung, dependabot[bot], jochasinga, lvntky, mc2-bot, octaviansima, podcastinator, ryanleh, wzheng


mc2's Issues

mc2 run failed

Could you help with this? Thanks.

docker run --env HTTP_PROXY="XXXXXXXXXXXXX" --env HTTPS_PROXY="XXXXXXXXXXXX" -it -v ~/github/mc2-project/playground:/mc2/client/playground mc2project/mc2_img:v0.1.3

root@da42abf41b17:/mc2/client# cp -r quickstart/* playground
root@da42abf41b17:/mc2/client# mc2 configure $(pwd)/playground/config.yaml
2021-12-14 07:13:06 - INFO - Set configuration path to /mc2/client/playground/config.yaml
root@da42abf41b17:/mc2/client# mc2 init
2021-12-14 07:13:12 - WARNING - Skipping keypair generation - private key already exists at /mc2/client/playground/keys/user1.pem
2021-12-14 07:13:12 - WARNING - Skipping symmetric key generation - key already exists at /mc2/client/playground/keys/user1_sym.key
2021-12-14 07:13:12 - INFO - init finished successfully
root@da42abf41b17:/mc2/client# mc2 start
2021-12-14 07:13:15 - INFO - Running 'cd /mc2/opaque-sql; build/sbt run' locally
2021-12-14 07:13:15 - INFO - start finished successfully
root@da42abf41b17:/mc2/client# mc2 upload
2021-12-14 07:13:23 - HOST - info - Successfully initialized cryptography module.
2021-12-14 07:13:23 - INFO - Encrypted /mc2/client/playground/data/opaquesql.csv in sql format and outputted to /mc2/client/playground/data/opaquesql.csv.enc
2021-12-14 07:13:23 - INFO - Using local deployment. Copying /mc2/client/playground/data/opaquesql.csv.enc to /mc2/data/opaquesql.csv.enc
2021-12-14 07:13:23 - INFO - Using local deployment. Copying /mc2/client/playground/data/opaquesql.csv.enc to /mc2/data/opaquesql.csv.enc
2021-12-14 07:13:23 - INFO - upload finished successfully
root@da42abf41b17:/mc2/client# mc2 run
E1214 07:13:26.457284059 214 http_proxy.cc:81] 'https' scheme not supported in proxy URI
Traceback (most recent call last):
  File "/mc2/client/mc2.py", line 192, in <module>
    mc2.configure_job(config)
  File "/usr/local/lib/python3.6/dist-packages/mc2client-0.0.1-py3.6.egg/mc2client/core.py", line 1215, in configure_job
    _attest(head_address, simulation_mode, enclave_signer_pem)
  File "/usr/local/lib/python3.6/dist-packages/mc2client-0.0.1-py3.6.egg/mc2client/core.py", line 1271, in _attest
    response = stub.GetRemoteEvidence(attest_pb2.AttestationStatus(status=0))
  File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses"
debug_error_string = "{"created":"@1639466006.457566980","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3008,"referenced_errors":[{"created":"@1639466006.457565708","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":397,"grpc_status":14}]}"

No module named 'fxgb_pb2'

Hi, thanks for sharing.
I followed your instructions, but ran into the error below.
image
Could you help me figure this out? Thanks in advance.

Set tracker uri + port with RPC parameters

Currently, the tracker_uri and tracker_port variables in rabit are set by looking at environment variables. Whether XGBoost is run in a distributed manner is determined by the value of tracker_uri (whether or not it is null). Unfortunately, when running Federated XGBoost with RPC, setting the environment variables before individual RPC calls doesn't seem to modify the tracker_uri and tracker_port variables, meaning that XGBoost doesn't run in distributed mode.

A possible fix is to pass the tracker URI and the tracker port as arguments to the RPC call, then set the tracker_uri and tracker_port variables in the Rabit instance on each worker. The variables are currently set from environment variables in the AllreduceBase::SetParam function, which is called during initialization. We'll need to change how these variables are set, so that they are set on the workers immediately after the RPC call is made on the tracker.
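A rough Python sketch of the idea, not the actual Rabit or Federated XGBoost code. The names TrackerParams, WorkerRabitConfig, and handle_start_job are hypothetical, and the environment variable names DMLC_TRACKER_URI and DMLC_TRACKER_PORT are assumed here for illustration:

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrackerParams:
    """Hypothetical RPC payload carrying the tracker address."""
    tracker_uri: str
    tracker_port: int


class WorkerRabitConfig:
    """Stand-in for the Rabit instance on a worker (not the real class)."""

    def __init__(self) -> None:
        # Mirrors the current behavior: values are read from the
        # environment once, at initialization time.
        self.tracker_uri: Optional[str] = os.environ.get("DMLC_TRACKER_URI")
        port = os.environ.get("DMLC_TRACKER_PORT")
        self.tracker_port: Optional[int] = int(port) if port else None

    def set_tracker(self, params: TrackerParams) -> None:
        # Proposed behavior: overwrite the values from the RPC arguments.
        self.tracker_uri = params.tracker_uri
        self.tracker_port = params.tracker_port

    @property
    def is_distributed(self) -> bool:
        # Distributed mode is decided by whether tracker_uri is set.
        return self.tracker_uri is not None


def handle_start_job(config: WorkerRabitConfig, params: TrackerParams) -> bool:
    """Hypothetical RPC handler: apply tracker settings, report the mode."""
    config.set_tracker(params)
    return config.is_distributed
```

The point of the sketch is that the handler no longer depends on when the environment was read: whatever the tracker sends in the RPC wins.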

Add TLS

Add TLS to FederatedXGBoost to reduce network leakage

Add GitHub actions linter to enforce code style

To enforce code style in the repo, we should add a GitHub Actions linter. In particular, we should add black and flake8 for Python, and a Chromium-style clang-format checker.

We currently have the Super Linter commented out. We can likely reuse the Super Linter for Python black and flake8. We'll have to find another linter for clang-format, as Super Linter doesn't support C++.

Use C++ CIPHER_KEY_SIZE as key size in `generate_symmetric_key`

To prevent inconsistent key sizes across the MC2 ecosystem, we should not allow users to specify a key size in generate_symmetric_key(). Instead, we should retrieve the CIPHER_KEY_SIZE from C++, similar to this, and use the CIPHER_KEY_SIZE as the number of bytes for our generated key.

To do so, we'll need to add a function in src/c_api.cpp, similar to the cipher_iv_size() function, that gets the CIPHER_KEY_SIZE from C++.
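On the Python side this could look roughly like the sketch below. It is only an illustration: cipher_key_size() is a hypothetical wrapper for the proposed C API function, and the 32-byte value is a placeholder, not the confirmed CIPHER_KEY_SIZE.

```python
import secrets

# Assumption: in the real client this value would come from the C++ side
# through a C API call; 32 bytes is a placeholder, not the confirmed size.
CIPHER_KEY_SIZE = 32


def cipher_key_size() -> int:
    """Hypothetical Python wrapper around the proposed C API function."""
    return CIPHER_KEY_SIZE


def generate_symmetric_key() -> bytes:
    """Generate a random key; callers cannot choose the size."""
    return secrets.token_bytes(cipher_key_size())
```

Dropping the size parameter means every key produced by the client has the length the C++ code expects.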

Make mc2 configure optional

Should we give mc2 configure a default path, so that users only need to run mc2 configure $(path-to-config-file) when the config file lives somewhere else?
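One possible lookup order, sketched in Python. Everything here is an assumption for discussion: the MC2_CONFIG environment variable, the config.yaml default name, and the resolve_config_path helper are hypothetical and not part of the current client.

```python
import os
from pathlib import Path
from typing import Optional

# Hypothetical names: neither the environment variable nor the default
# file name is taken from the MC2 Client source.
CONFIG_ENV_VAR = "MC2_CONFIG"
DEFAULT_CONFIG_NAME = "config.yaml"


def resolve_config_path(explicit: Optional[str] = None,
                        cwd: Optional[Path] = None) -> Path:
    """Pick the config path: explicit argument, then env var, then default."""
    if explicit:
        return Path(explicit).expanduser()
    from_env = os.environ.get(CONFIG_ENV_VAR)
    if from_env:
        return Path(from_env).expanduser()
    return (cwd or Path.cwd()) / DEFAULT_CONFIG_NAME
```

With something like this, mc2 configure would only be needed to override the default location.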

Add Dockerfile

We currently offer a Docker image with necessary dependencies, but the image is quite large (~5GB) and takes a while to download. Consequently, we should also provide a Dockerfile as part of this repo to enable users to build the container much faster locally.

Make topology a parameter

Currently, the system always uses the star topology. We want to add a parameter somewhere that will also let us use the ring topology.
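To make the difference concrete, here is a small Python sketch that builds the neighbor lists for both layouts. The build_topology function and its topology parameter are hypothetical, and treating node 0 as the hub of the star is an assumption of the sketch.

```python
from typing import Dict, List


def build_topology(num_nodes: int, topology: str = "star") -> Dict[int, List[int]]:
    """Return each node's neighbors for the requested topology.

    Node 0 plays the hub in the star layout (an assumption for this sketch).
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    if topology == "star":
        links = {rank: [0] for rank in range(1, num_nodes)}
        links[0] = list(range(1, num_nodes))
        return links
    if topology == "ring":
        if num_nodes == 1:
            return {0: []}
        # Each node links to its predecessor and successor, wrapping around.
        return {
            rank: sorted({(rank - 1) % num_nodes, (rank + 1) % num_nodes})
            for rank in range(num_nodes)
        }
    raise ValueError(f"unknown topology: {topology!r}")
```

Keeping "star" as the default value would preserve today's behavior for anyone who does not set the parameter.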

Add support for uploading entire directories to Azure blob storage

Currently, MC2 Client doesn't support uploading entire directories to Azure blob storage. We'd like to add support for this, as data encrypted in sql format is always outputted as a directory with a data sub-directory and a schema sub-directory.

To do this, we'll have to investigate how to upload/download directories to/from Azure blob storage using the Azure Python SDK, and modify the upload_data() and download_data() functions.
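The directory walk itself is independent of the storage backend, so it can be sketched without the Azure SDK. In the Python sketch below, upload_directory and the upload_file callback are hypothetical names; in the client, the callback would wrap whatever single-file upload upload_data() already performs.

```python
import os
from typing import Callable, List


def upload_directory(local_dir: str, remote_prefix: str,
                     upload_file: Callable[[str, str], None]) -> List[str]:
    """Upload every file under local_dir, preserving relative paths.

    upload_file(local_path, blob_name) is a caller-supplied callback
    (hypothetical here) that performs the actual single-file upload.
    Returns the blob names that were uploaded, in sorted order.
    """
    prefix = remote_prefix.strip("/")
    uploaded = []
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            local_path = os.path.join(root, name)
            rel_parts = os.path.relpath(local_path, local_dir).split(os.sep)
            # Blob names use forward slashes regardless of the local OS.
            blob_name = "/".join(([prefix] if prefix else []) + rel_parts)
            upload_file(local_path, blob_name)
            uploaded.append(blob_name)
    return sorted(uploaded)
```

Downloading would be the mirror image: list the blobs under the prefix and recreate the data and schema sub-directories locally.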

Add config file

Add a YAML config file to specify the host IPs and the federated job parameters (worker memory, number of workers, etc., from the dmlc-submit command).
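Once the YAML is parsed into a dictionary, the values could be validated along these lines. This is a sketch only: the JobConfig class and the key names hosts, num_workers, and worker_memory are hypothetical, chosen to match the parameters listed above.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class JobConfig:
    """Hypothetical shape of the parsed YAML config."""
    hosts: List[str]
    num_workers: int
    worker_memory: str

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "JobConfig":
        required = ("hosts", "num_workers", "worker_memory")
        missing = [key for key in required if key not in raw]
        if missing:
            raise KeyError(f"missing config keys: {', '.join(missing)}")
        hosts = [str(host) for host in raw["hosts"]]
        if not hosts:
            raise ValueError("hosts must list at least one IP")
        return cls(hosts=hosts,
                   num_workers=int(raw["num_workers"]),
                   worker_memory=str(raw["worker_memory"]))
```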

Add robust exceptions to Python client

Currently, the Python client does not have robust, friendly exception classes. It throws a segmentation fault on badly formatted schema files, as well as the following error, which gives little hint about what went wrong:

terminate called after throwing an instance of 'std::runtime_error'
  what():  Not a number.
Aborted (core dumped)

We should add exception classes to the client that handle the lower-level errors bubbling up to the caller and report them with more information.
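As a starting point for discussion, a minimal Python sketch of what such classes might look like. The class names MC2ClientError and SchemaFormatError are hypothetical, and the 'name:index' schema line format is invented for the example; it is not the client's real schema format.

```python
class MC2ClientError(Exception):
    """Base class for errors raised by the client (hypothetical name)."""


class SchemaFormatError(MC2ClientError):
    """Raised when a schema file cannot be parsed."""

    def __init__(self, path: str, line_no: int, reason: str) -> None:
        super().__init__(f"{path}:{line_no}: {reason}")
        self.path = path
        self.line_no = line_no
        self.reason = reason


def parse_schema_line(path: str, line_no: int, line: str) -> tuple:
    """Parse a 'name:index' schema entry (an invented format for this sketch)."""
    name, sep, index = line.strip().partition(":")
    if not sep or not name:
        raise SchemaFormatError(path, line_no, "expected 'name:index'")
    try:
        return name, int(index)
    except ValueError as err:
        # Wrap the low-level error so the caller sees where it happened.
        raise SchemaFormatError(
            path, line_no, f"column index {index!r} is not a number"
        ) from err
```

Validating input in Python before it reaches the C++ layer is what would turn "Not a number." plus a core dump into a message that names the file and line.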

Make data on master optional

Currently, the system requires data to be present on all parties, including the master/tracker. We want to make data on the master/tracker optional.

Azure CLI

During the project launch process, when I call Azure CLI, I encounter an issue indicating "API deprecated starting 2.21.0". However, the current version of Azure CLI I have installed is 2.0.25, which is lower than the 2.21.0 mentioned in the prompt. How should I solve this problem?

gRPC

Replace SSH calls to start jobs with gRPC calls

Umbrella issue of #12

Host_IP and file_path

Hey guys, I followed your README to run federated XGBoost, but I ran into some problems.
Question 1: Should hosts.config be under the sample folder? Also, I assume I should run the ./start_job.sh script from the sample folder, since I see ../../ at the bottom of the script (it runs scripts in the dmlc-core folder). Is it also necessary to include start_job.sh and hosts.config under the federated-xgboost folder?
Question 2: As I understand it, each client's dataset is located on its own local node, and the datasets probably need to have the same file names (e.g. train.csv, test.csv). In my case, I launched three different servers in VirtualBox, so they cannot have the same file path as the tracker (I treat my local machine as the master). Hence, I am not sure whether I can use a relative path such as
fxgb.load_training_data('../../data/train.csv') instead of your fxgb.load_training_data('home/ubuntu/mc2/data/train.csv').
Question 3: As I mentioned before, I use VirtualBox to launch three servers as worker nodes. I encounter this error:
File "/Users/shi/Desktop/mc2/dmlc-core/tracker/dmlc_tracker/tracker.py", line 416, in get_host_ip
hostIP = socket.gethostbyname(socket.gethostname())
socket.gaierror: [Errno 8] nodename nor servname provided, or not known
Do you have any idea how to solve this issue? I included 192.168.56.101 and the other two host IPs in hosts.config.
Thanks, guys.
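For Question 3, the gaierror means the machine's own hostname does not resolve to an address. One common workaround, sketched below in Python, is to fall back to asking the OS which interface it would use for outgoing traffic. This is a generic sketch under that assumption, not the dmlc-core implementation, and it may not fit every VirtualBox network setup.

```python
import socket


def get_host_ip() -> str:
    """Best-effort lookup of this machine's IPv4 address."""
    try:
        ip = socket.gethostbyname(socket.gethostname())
        if not ip.startswith("127."):
            return ip
    except socket.gaierror:
        pass  # hostname does not resolve; fall through to the next method
    # Connecting a UDP socket sends no packets; it only makes the OS choose
    # the outgoing interface, whose address we then read back.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect(("10.255.255.255", 1))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"  # no usable route; last-resort loopback address
    finally:
        sock.close()
```

Adding an entry for the machine's hostname to its hosts file is another way to make the original lookup succeed.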

Add logging to Python

To improve usability, we'd like to add a logger to the Python code to provide more detail on what's actually being run.
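A minimal sketch using Python's standard logging module. The get_logger helper and the "mc2client" logger name are assumptions for illustration; the format string is chosen to produce "timestamp - LEVEL - message" lines.

```python
import logging
import sys


def get_logger(name: str = "mc2client", level: int = logging.INFO) -> logging.Logger:
    """Return a logger that prints 'timestamp - LEVEL - message' lines."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(logging.Formatter(
            fmt="%(asctime)s - %(levelname)s - %(message)s",
            datefmt="%Y-%m-%d %H:%M:%S",
        ))
        logger.addHandler(handler)
    return logger
```

Modules would then call get_logger() once and use logger.info(...) or logger.warning(...) around each step that runs a command or touches a file.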
