I have written a script using Gradio, and sometimes (I emphasize: only sometimes) when I run it I get:
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
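For reference, the workaround the warning itself suggests can be applied at the very top of the script, before any import that might pull in huggingface/tokenizers; note this only silences the warning, it does not explain where it comes from:

```python
import os

# Must run before any library that uses huggingface/tokenizers is imported;
# this disables tokenizers' parallelism so the fork warning is not emitted.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```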
The strange thing is that, unlike the answers I found searching the internet, I am not using tokenizers or LLMs, accessing Hugging Face models, or forking anything.
My code is simply a Gradio script that uses dropdowns, buttons, and dataframes, and that, when a button is clicked, sends a request to a remote machine. That remote machine runs an API that actually calls a LLaVA model, but that is a process running on the remote machine.
The local machine, where the warning is displayed, just sends the request and receives the response.
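To make the setup concrete, the local side does essentially the following; the function name, payload shape, and URL are illustrative, not my actual code:

```python
import json
import urllib.request

API_URL = "http://remote-host:8000/infer"  # hypothetical endpoint


def query_remote(prompt: str) -> dict:
    """Send the prompt to the remote API and return its JSON reply.

    The LLaVA model itself runs behind this API on the remote machine;
    the local process only performs this HTTP round trip.
    """
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```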
The only other thing that occurs to me is that when the process finishes, I do:
import mlflow

with mlflow.start_run() as run:
    mlflow.log_param("some_param", something)
    # ... some more param, metric, and artifact logging
    print(
        f"Logged data to MLflow with run ID: {run.info.run_id} in experiment: {experiment_name}"
    )
The warning appears at exactly this moment, before the final print.
Any hint as to why this could be happening?