I have a pandas Series that I want to split into two: one with all the entries of the original Series whose index contains a certain word, and the other with all the remaining entries.
Getting a series of entries which do contain a certain word in their index is easy:
foo_series = original_series.filter(like = "foo")
But how do I get the rest?
You could drop those indices from the original Series:
foo_series = original_series.filter(like = "foo")
non_foo_series = original_series.drop(foo_series.index)
Or use boolean indexing:
m = original_series.index.str.contains('foo')
foo_series = original_series[m]
non_foo_series = original_series[~m]
Example:
# input
original_series = pd.Series([1, 2, 3, 4], index=['foo1', 'bar1', 'foo2', 'bar2'])
# outputs
# foo_series
foo1 1
foo2 3
dtype: int64
# non_foo_series
bar1 2
bar2 4
dtype: int64
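As a self-contained sketch of the boolean-indexing approach: note that `str.contains` treats its pattern as a regular expression by default, so if the word should be matched literally you can pass `regex=False`. The helper name `split_by_index_word` is hypothetical (not part of pandas), and the `astype(str)` call is an assumption in case some index labels are not strings.

```python
import pandas as pd


def split_by_index_word(series, word):
    """Return (matching, remaining) based on a literal substring of the index labels."""
    # astype(str) is a guard for non-string labels (an assumption, not needed for the sample data);
    # regex=False makes `word` a literal substring instead of a regex pattern.
    mask = series.index.astype(str).str.contains(word, regex=False)
    return series[mask], series[~mask]


original_series = pd.Series([1, 2, 3, 4], index=["foo1", "bar1", "foo2", "bar2"])
foo_series, non_foo_series = split_by_index_word(original_series, "foo")
```

Building the mask once and negating it with `~` guarantees the two results are complementary, so no entry can end up in both or in neither.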
You could pass a regex to pandas.Series.filter().
foo_series, nonfoo_series = (
original_series.filter(like="foo"),
original_series.filter(regex=r"^(?!.*foo)")
)
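The regex ^(?!.*foo) is a negative lookahead anchored at the start of the label: it matches only labels in which "foo" does not appear anywhere.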
Output (sample data from the answer by Mozway):
foo_series
foo1 1
foo2 3
dtype: int64
nonfoo_series
bar1 2
bar2 4
dtype: int64
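One caveat with this approach: `like=` does a literal substring match, while `regex=` interprets metacharacters, so the two filters can disagree if the word contains characters such as `.`. A minimal sketch that escapes the word with `re.escape` first; the word `"foo.1"` and the sample Series `s` are made up for illustration.

```python
import re

import pandas as pd

word = "foo.1"  # hypothetical search word containing a regex metacharacter
s = pd.Series([1, 2, 3], index=["foo.1", "fooX1", "bar"])

# Escape the word so the lookahead rejects only labels containing it literally.
pattern = rf"^(?!.*{re.escape(word)})"

matching = s.filter(like=word)        # literal substring match
remaining = s.filter(regex=pattern)   # everything that does not contain the word
```

Without the escaping, the label "fooX1" would be rejected by the lookahead (because `.` matches any character) yet not selected by `like=`, so it would be missing from both results.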