For spam and abuse prevention, we store IP addresses in a number of places. In many instances, it is not necessary to keep these addresses forever.
I'm putting some thoughts here since they apply to the effort as a whole rather than to any individual task.
First off, I assume that for support requests (AO3-5408) we are fine with simply not saving the IP address, since it isn't used for investigations? And for everything else (works, comments, abuse_reports), we do want to keep saving the IP address in the DB?
How big are those tables for works, comments, and abuse_reports? I'd imagine at least works and comments are enormous.
By purging IP addresses in place, we're going to end up with sparse columns: mostly NULL, except for recent rows.
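To make the in-place approach concrete, here is a rough sketch of what such a purge might look like. It is written in Python against SQLite purely for illustration, not as the app's actual code: the `created_at` column, the 90-day window, and the `purge_old_ips` helper are all assumptions on my part.

```python
import sqlite3
from datetime import datetime, timedelta

RETENTION_DAYS = 90  # hypothetical retention window ("X days")
PURGEABLE_TABLES = {"works", "comments", "abuse_reports"}


def purge_old_ips(conn, table, now, retention_days=RETENTION_DAYS):
    """Set ip_address to NULL on rows older than the retention window.

    Assumes each table has ip_address and created_at columns, with
    created_at stored as an ISO 8601 string (an assumption for this sketch).
    Returns the number of rows that were changed.
    """
    # Table names cannot be bound as parameters, so check against a fixed list.
    if table not in PURGEABLE_TABLES:
        raise ValueError(f"unexpected table: {table}")
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    cur = conn.execute(
        f"UPDATE {table} SET ip_address = NULL "
        "WHERE ip_address IS NOT NULL AND created_at < ?",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount
```

The rows themselves stay where they are, so the column ends up NULL everywhere except in the most recent window.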
Is it worth offloading these IP addresses into their own tables with a mapping of work_id (for example) to ip_address?
Then the scheduled tasks would just delete rows older than X days, and the tables would stay a more reasonable size.
We could also drop the ip_address columns from the main tables in that case.
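Here is a minimal sketch of the side-table idea for works, again in Python against SQLite just to show the shape. The `work_ip_addresses` table name, the `created_at` column, the index, the helper names, and the 90-day default are all hypothetical; the real schema and retention period would be whatever we agree on.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical mapping table: one row per work, holding only the IP address.
SCHEMA = """
CREATE TABLE work_ip_addresses (
    work_id    INTEGER PRIMARY KEY,
    ip_address TEXT NOT NULL,
    created_at TEXT NOT NULL
);
CREATE INDEX index_work_ip_addresses_on_created_at
    ON work_ip_addresses (created_at);
"""


def record_ip(conn, work_id, ip_address, now):
    """Store the IP address for a newly created work."""
    conn.execute(
        "INSERT INTO work_ip_addresses (work_id, ip_address, created_at) "
        "VALUES (?, ?, ?)",
        (work_id, ip_address, now.isoformat()),
    )
    conn.commit()


def delete_expired(conn, now, retention_days=90):
    """Scheduled task: delete mapping rows older than the retention window."""
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    cur = conn.execute(
        "DELETE FROM work_ip_addresses WHERE created_at < ?", (cutoff,)
    )
    conn.commit()
    return cur.rowcount


def ip_for_work(conn, work_id):
    """Look up the stored IP for a work, or None if it was never stored or has been purged."""
    row = conn.execute(
        "SELECT ip_address FROM work_ip_addresses WHERE work_id = ?",
        (work_id,),
    ).fetchone()
    return row[0] if row else None
```

With this layout the purge is a plain DELETE on a small table, and the main works table is never touched by the scheduled task.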
Full disclosure, I’m not a DBA (nor do I play one on TV) so this is just me thinking out loud.