In pandas.DataFrame.query(), you prefix a name with the '@' character in the query string to refer to a variable in the surrounding Python environment. However, this now seems to raise a KeyError when the '@' reference is to a class attribute accessed through the class itself. See the example below:
In [1]: import seaborn as sns
In [2]: tips = sns.load_dataset('tips')
In [7]: class i:
   ...:     j = 5
   ...:
In [8]: i.j
Out[8]: 5
In [9]: tips.query("size == @i.j")
Traceback (most recent call last):
  File /usr/local/lib/python3.10/site-packages/pandas/core/computation/scope.py:231 in resolve
    return self.resolvers[key]
  File /usr/local/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/collections/__init__.py:986 in __getitem__
    return self.__missing__(key) # support subclasses that define __missing__
  File /usr/local/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/collections/__init__.py:978 in __missing__
    raise KeyError(key)
KeyError: 'i'
# The same attribute accessed through an instance works as expected
In [19]: m = i()
In [20]: m.j
Out[20]: 5
In [21]: tips.query("size == @m.j")
Out[21]:
     total_bill   tip     sex smoker   day    time  size
142       41.19  5.00    Male     No  Thur   Lunch     5
155       29.85  5.14  Female     No   Sun  Dinner     5
185       20.69  5.00    Male     No   Sun  Dinner     5
187       30.46  2.00    Male    Yes   Sun  Dinner     5
216       28.15  3.00    Male    Yes   Sat  Dinner     5
Note 1: Referring to a class attribute this way used to work. It seems to have stopped working after I upgraded to pandas 2.2.0.
Note 2: I do know that I can use an f-string in the query to circumvent this problem. I’d like to avoid that if I can. It also doesn’t make sense for me to instantiate an object of the class just so that I can use it in the query string.
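For reference, here is a minimal sketch of the kind of workaround I mean, next to a variant that copies the class attribute into a plain local variable first. It uses a small toy DataFrame instead of the seaborn dataset, and the names `Config`, `party_size`, `filter_with_fstring` and `filter_with_local` are hypothetical, made up for this illustration only. It does not explain why `@i.j` fails; it only shows ways around it.

```python
import pandas as pd


class Config:
    """Hypothetical stand-in for the class `i` in the example above."""
    party_size = 5


def filter_with_fstring(df: pd.DataFrame, value: int) -> pd.DataFrame:
    # Interpolate the value into the query text, so no '@' lookup happens.
    return df.query(f"size == {value}")


def filter_with_local(df: pd.DataFrame, value: int) -> pd.DataFrame:
    # '@value' refers to a plain local variable rather than to the class.
    return df.query("size == @value")


# Toy data for illustration, not the seaborn 'tips' dataset.
toy = pd.DataFrame({"tip": [5.00, 3.25, 2.00], "size": [5, 2, 5]})

by_fstring = filter_with_fstring(toy, Config.party_size)
by_local = filter_with_local(toy, Config.party_size)
print(by_fstring)
print(by_local)
```

Both calls read the value from the class before the query string is evaluated, so the query itself never has to resolve the class name.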
Has something changed? If so, please point me to any documentation about it.
Thanks!
I’ve tried googling for information about this and couldn’t find any.