Tag Archive for python, pandas, dataframe, csv, 2-digit-year

How can I clean a year column with messy values?

I’m working on a project for a data analysis course, where we pick a data set and go through the steps of cleaning and exploring the data with a question in mind. I want to see how many records fall in each year, but right now the Year column in the data set has dtype object. Its values range from whole years like 1998, to just the last two digits like 87, to ranges or approximations of years (‘early 1990’s’, ‘89 or 90’, ‘2011- 2012’, ‘approx 2001’).
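One possible starting point, as a rough sketch rather than a definitive fix: pull the first plausible year out of each value with a regular expression. The function name extract_year and the pivot of 30 for two-digit years are assumptions made for illustration, not something taken from the data set. Keeping only the first year of a range such as ‘2011- 2012’ is also a simplification that may not suit every analysis.

```python
import re

# Assumption: two-digit years at or below this pivot are read as 20xx,
# anything above it as 19xx. Adjust to fit the data set.
PIVOT = 30


def extract_year(value, pivot=PIVOT):
    """Return the first plausible year found in a messy value, or None."""
    if value is None:
        return None
    text = str(value)

    # Prefer a standalone four-digit year (1800-2099), e.g. 1998 or the
    # 1990 in "early 1990's". For ranges, only the first year is kept.
    match = re.search(r"(?<!\d)(1[89]\d{2}|20\d{2})(?!\d)", text)
    if match:
        return int(match.group(1))

    # Fall back to a standalone two-digit year, e.g. 87 or the 89 in "89 or 90".
    match = re.search(r"(?<!\d)(\d{2})(?!\d)", text)
    if match:
        two_digits = int(match.group(1))
        return 2000 + two_digits if two_digits <= pivot else 1900 + two_digits

    # Nothing that looks like a year (e.g. "unknown" or a missing value).
    return None


if __name__ == "__main__":
    samples = ["1998", 87, "early 1990's", "89 or 90", "2011- 2012", "approx 2001"]
    for sample in samples:
        print(repr(sample), "->", extract_year(sample))
```

Assuming the data is in a DataFrame called df (a hypothetical name), something like df["Year"].map(extract_year) would give a cleaned column, and value_counts() on that result would then show how many rows fall in each year. Values where no year can be found come back as None, so they are easy to inspect by hand afterwards.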