2/3/2024

Supertux version 0.3.1

Are there any restrictions or limitations on running Ray >= 1.10 with Python 3.6.9? For some reason, with Ray versions above 1.8 I'm getting the following error (with 1.6 it works fine):

10:15:48,405 ERROR proxier.py:379 – Timeout waiting for channel for ace70b4b753146babdea12418dbbb528
File "/usr/local/lib/python3.6/dist-packages/ray/util/client/server/proxier.py", line 375, in get_channel
File "/usr/local/lib/python3.6/dist-packages/grpc/_utilities.py", line 140, in result
File "/usr/local/lib/python3.6/dist-packages/grpc/_utilities.py", line 86, in _block
10:15:48,406 WARNING proxier.py:777 – Retrying Logstream connection.

Similar to the report above, I get gRPC timeout and Ray client/server errors when I run ray attach $config -p 10001 and use ray.init("ray://localhost:10001") in my test job script, but I'm using Python 3.8 and Ray 2.2.

For context, I'm hosting a Ray cluster on AWS EC2 instances in a VPC. The instances are only accessible through a jump host, so I have a user-defined SSH proxy command in my cluster config file. Additionally, all traffic in the AWS environment goes through a proxy. The EC2 instances are configured with proxy info when they're launched and, since I'm using Ray in Docker, the node Docker containers have proxy info configured through environment variables passed with Docker run options. My test job just trains a PPO agent with RLlib using a dummy environment defined in my script. I can ray rsync_up my test job script and run it from the head node, and everything works as expected.
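When the client hangs on ray.init("ray://localhost:10001"), one quick thing to rule out is whether the forwarded port accepts connections at all. The following is a minimal sketch using only the Python standard library; the helper name port_is_open is hypothetical, and the host and port are taken from the command quoted above. A successful TCP connect only shows that something is listening on the forwarded port. It does not prove that the Ray client server behind it is healthy or that gRPC traffic passes through the proxy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed values: the port forwarded by `ray attach $config -p 10001`.
    host, port = "localhost", 10001
    if port_is_open(host, port):
        print(f"{host}:{port} accepts TCP connections; the forward is listening")
    else:
        print(f"{host}:{port} is not reachable; check the port forward and SSH proxy command")
```

If the port is closed, the problem is likely in the SSH port forward rather than in Ray itself; if it is open and the timeout persists, the proxy environment variables inside the containers are a reasonable next suspect.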