I’m currently using the importhtml function in Google Sheets to scrape values from several different websites and then compare the scraped data with data from my own site. The values on my site are in column A and the scraped values (a list of numbers) get dumped into columns B, E, G, etc. I then use =IF (COUNTIF(A:A, BX/EX/GX), “O”, “X”) to place an X or an O in the column to the right of the scraped values and conditional formatting to turn Xs red so that I can quickly see if I already have the values on my site.
This is all working just fine and has saved me a bunch of time over the past few months (I’m monitoring quite a lot of different sites, so opening them all up to check manually isn’t really practical), but if possible, I’d like to further automate the system. Rather than me having to check the sheet multiple times a day as I do now, I’d like to receive an email whenever an X appears in one of several cells (so, effectively, if the sheet updates and C3, E3, or G3 =“X” – I’d receive an email – ideally, with the the value in the cell to the left of the cell that contains the X).
To further complicate matters, sometimes, the information on the sites that I’m scraping is incorrect, and so upon seeing an X on my sheet for the first time, I have to manually check that the value is accurate. If it is, I add it to my site and the X changes to an O. If it’s not, I just leave the X as is and ignore it until the other site corrects their mistake and it goes away on its own. As such, it’s possible that C3, E3, or G3 (these are the most recent scraped values from the sites that I’m monitoring, and the ones that I’m most concerned with) could have an X in them that remains there for a few days.
With this in mind, I’d need the automated script that sends the email to only do so when the value in the cell to the left of C3, E3, or G3 is first scraped (and only if C3, E3, or G3 is X, as nine times out of ten, I have the values on my site before the sites that I’m scraping do). I think this could potentially be achieved with the raw data (maybe something similar to the if/countif argument that I’m currently using, only instead of returning an X or an O, it sends an email or does nothing), though I’ve no idea how this would work and am keen to avoid a situation where I get an email every time the sheet updates because of a false flag.
Any thoughts or suggestions on how to overcome this problem would be greatly appreciated.