"RuntimeError: torch.distributed is not yet initialized but process group is requested" when trying to run API
Created by: jminjie
❓ Questions and Help
After following the setup steps, I ran metaseq-api-local and got this output:
$ metaseq-api-local
Traceback (most recent call last):
File "/home/jliu/openpretrainedtransformer/metaseq/metaseq/service/constants.py", line 17, in <module>
from metaseq_internal.constants import LOCAL_SSD, MODEL_SHARED_FOLDER
ModuleNotFoundError: No module named 'metaseq_internal'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/jliu/miniconda3/envs/conda_env_opt/bin/metaseq-api-local", line 33, in <module>
sys.exit(load_entry_point('metaseq', 'console_scripts', 'metaseq-api-local')())
File "/home/jliu/miniconda3/envs/conda_env_opt/bin/metaseq-api-local", line 25, in importlib_load_entry_point
return next(matches).load()
File "/home/jliu/miniconda3/envs/conda_env_opt/lib/python3.9/importlib/metadata.py", line 86, in load
module = import_module(match.group('module'))
File "/home/jliu/miniconda3/envs/conda_env_opt/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "/home/jliu/openpretrainedtransformer/metaseq/metaseq_cli/interactive_hosted.py", line 31, in <module>
from metaseq.service.constants import (
File "/home/jliu/openpretrainedtransformer/metaseq/metaseq/service/constants.py", line 40, in <module>
raise RuntimeError(
RuntimeError: You must set the variables in metaseq.service.constants to launch the API.
Am I missing a step? I tried manually setting LOCAL_SSD and MODEL_SHARED_FOLDER to a new folder I created, but then other things failed.
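For context, the chained traceback above looks consistent with an import-with-fallback pattern: the import of metaseq_internal fails, and a RuntimeError is then raised while that failure is being handled. Below is a minimal sketch of that pattern as I understand it. The function name `resolve_constants` and its parameters are hypothetical and only for illustration; this is not the actual contents of metaseq/service/constants.py.

```python
def resolve_constants(model_shared_folder="", local_ssd=""):
    """Hypothetical sketch of an import-with-fallback for path constants."""
    try:
        # Internal-only module; a public checkout is assumed not to have it.
        from metaseq_internal.constants import LOCAL_SSD, MODEL_SHARED_FOLDER
    except ModuleNotFoundError:
        # Fall back to values supplied by hand (assumption: empty by default).
        MODEL_SHARED_FOLDER = model_shared_folder
        LOCAL_SSD = local_ssd
        if not MODEL_SHARED_FOLDER:
            # Raising inside the except block chains the two exceptions, which
            # produces the "During handling of the above exception" output.
            raise RuntimeError(
                "You must set the variables in metaseq.service.constants "
                "to launch the API."
            )
    return LOCAL_SSD, MODEL_SHARED_FOLDER
```

If this reading is right, the ModuleNotFoundError is expected outside the internal environment, and the RuntimeError only means the fallback values were left empty.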
- fairseq Version (e.g., 1.0 or master): followed setup.md
- PyTorch Version (e.g., 1.0): followed setup.md
- OS (e.g., Linux): Ubuntu
- How you installed fairseq (pip, source): source
- Build command you used (if compiling from source): followed setup.md
- Python version: 3.9.12
- CUDA/cuDNN version: 11.3
- GPU models and configuration: Quadro RTX 5000
- Any other relevant information: