Please consider this code:
import argparse
import humanize as H
import logging
import logging.handlers
import neptune
import numpy as np
import os
import pandas as pd
from posixpath import join
import random
from sklearn.model_selection import train_test_split
import time
from tqdm import tqdm
from datasets import Dataset
import torch
from torch.utils.data import DataLoader
from transformers import T5Tokenizer, T5ForConditionalGeneration
import nltk
from rouge_score import rouge_scorer
import sacrebleu
nltk.download('wordnet')
def setup_logging():
logger = logging.getLogger("my-logger")
logger.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
# console
# ch = logging.StreamHandler()
# ch.setLevel(logging.DEBUG)
# ch.setFormatter(formatter)
# logger.addHandler(ch)
return logger
#< setup_logging
# __ M A I N __
if __name__ == "__main__":
global args
global logger
parser = argparse.ArgumentParser()
parser.add_argument("--log_file", type=str, default=None, help="log file (rotating)")
args = parser.parse_args()
logger = setup_logging()
print(logger.handlers)
logger.info("message")
input()
#< __main__
whose output, when run, is:
$-> python test_log.py
[nltk_data] Downloading package wordnet to /home/user/nltk_data...
[nltk_data] Package wordnet is already up-to-date!
[]
It works as expected. Since there are no handlers, there is no output to console. If I attach the StreamHandler
(commented out in previous code), I get this output:
[nltk_data] Downloading package wordnet to /home/user/nltk_data...
[nltk_data] Package wordnet is already up-to-date!
[<StreamHandler <stderr> (DEBUG)>]
2024-07-19 05:40:35,191 - INFO - message
I have a problem with a larger script: log messages are duplicated. The instructions run at the beginning of the script are exactly the same as in the MWE above. The imports are the same, but the logging output is different: I get output to the console even if I do not attach a handler.
In the larger class I have this method for setting up the logger:
def setup_logging():
print("in setup_logging")
input()
l = logging.getLogger("mylog")
l.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
# console
#ch = logging.StreamHandler()
#ch.setLevel(logging.DEBUG)
#ch.setFormatter(formatter)
#l.addHandler(ch)
return l
#< setup_logging
Every time it enters the function, it prints in setup_logging
and wait for user input. I can confirm it is run only once, so the duplicated messages are not due to multiple console handlers.
The scripts starts as in the MWE:
if __name__ == "__main__":
global args
global logger
.... cli arguments ...
args = parser.parse_args()
logger = setup_logging()
print(logger.handlers)
logger.info("message")
input()
but I get a different output:
in setup_logging
[]
INFO:mylog:message
It enters the setup_logging()
, wait for user input. The set of handlers is empty, BUT now it prints the message to the console even if no handlers are set. On the contrary, in the MWE, when no handlers were set, no output message was sent to the console.
If I attach the console handler (in setup_logging()
) and run the larger script, the output is:
[nltk_data] Downloading package wordnet to /home/user/nltk_data...
[nltk_data] Package wordnet is already up-to-date!
in setup_logging
[<StreamHandler <stderr> (DEBUG)>]
2024-07-19 06:00:16,736 - INFO - message
INFO:mylog:message
As you can see the message is duplicated. The first one comes from the console handler (notice the format), the second from … I do not know.
I am not able to debug this and understand what is happening.
Is there anyone who can help?
—
I have tried several solutions. The only one that seems to work is This logger has no handler, yet it prints. Why?
If I add l.propagate = False
in setup_logging()
, then there are no duplicated messages, but I still cannot understand the different behaviour in the two cases.
The documentation for propagate
can be found here: https://docs.python.org/3/library/logging.html
However, I do not understand why the log message is propagated in the “larger” script and not in the MWE.
1