Content auditing the flash card way, an aspiration


When I say “content auditing” I am referring to the process by which a piece of content is assessed, and then operated upon if required. But that’s about as far as I am going to get into that, because content auditing is the tool of the “SEO master”, and if you want to know how to keep your content evergreen and working for your bottom line, go jump in a river. The water and exercise will help your brain.

No, I want to talk about keeping docs up to date, and the existing systems for managing that. Because I hardly know of any that aren’t just a glorified spreadsheet.

Open call: what are the dope content auditing systems?

Okay, here’s what I want: a repetition-based check system, like flash cards that come back around on a schedule. I just don’t know how this works, data-wise.

Say all my content is text files in a git repo somewhere. I might be able to use metadata to mark when something has been updated, and when the next audit will happen. I think front matter like:

audited: 2018-10-16
nextaudit: 1y

Then I could just run a report (shortcode) and get a list of content to check. But there are a lot of built-in presumptions, and it doesn’t help with sites stored in a database.
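A minimal sketch of that report, in Python. It assumes the front matter sits between `---` lines at the top of each file, and that `nextaudit` is a number plus a unit. Only `1y` appears above, so the other units in `UNIT_DAYS`, and all the function names, are made up for illustration.

```python
import re
from datetime import date, timedelta

# Assumed unit table: only "1y" is in the example, the rest are guesses.
UNIT_DAYS = {"d": 1, "w": 7, "m": 30, "y": 365}


def parse_front_matter(text):
    """Return key/value pairs from a leading '---' delimited block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def next_audit_date(meta):
    """Compute the due date from 'audited' and 'nextaudit', or None."""
    match = re.fullmatch(r"(\d+)([dwmy])", meta.get("nextaudit", ""))
    if "audited" not in meta or match is None:
        return None
    audited = date.fromisoformat(meta["audited"])
    days = int(match.group(1)) * UNIT_DAYS[match.group(2)]
    return audited + timedelta(days=days)


def is_due(text, today):
    """True when the file carries audit metadata and its due date has passed."""
    due = next_audit_date(parse_front_matter(text))
    return due is not None and due <= today
```

Running `is_due` over every file in the repo gives the list of content to check. Files without the metadata are skipped here, which is one of those built-in presumptions: a real report would probably want to flag them instead.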

Of course such a site might have its own systems, and I know of at least one WordPress plugin that does something like this (though honestly, I’d probably just add the exact same metadata to a post type as in the front matter example).

This seems like a great sitemap.xml use case. Because then it doesn’t matter how the site is created, you are auditing the actual information available to humans, and that’s the important part.

It pulls in content, tracks it internally, and keeps a log for everyone doing maintenance.
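The "pull in content" step could start as small as this: a sketch that reads a sitemap and returns each URL with its optional `lastmod`. It uses the standard sitemaps.org namespace. The function name is hypothetical, and fetching the file (with `urllib.request`, say) is left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_text):
    """Return a list of (url, lastmod) tuples; lastmod is None when absent."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        if loc:
            entries.append((loc, lastmod.strip() if lastmod else None))
    return entries
```

This only handles a plain `urlset`. Sitemap index files, which point at other sitemaps, would need another pass.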


I could totally build that in WordPress, using the same method I mentioned above, because WordPress is actually really good at scraping data and processing it… in fact, that may be the only thing it is really good at these days…


I don’t know of such a system, but I picture it in two layers, one built on top of the other. I imagine the core being a simple command to print a list of nodes to audit. A lot of the operations will be implicit, since the database will update itself from the sitemap. (Yeah, I know taskwarrior and mutt and tootstream are affecting me; they should be affecting you, too!)
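Here is a rough sketch of that core, using SQLite from Python's standard library. The table layout, the function names, and the one-year default interval are all assumptions. `sync` takes `(url, lastmod)` pairs, however they were read out of the sitemap.

```python
import sqlite3
from datetime import date, timedelta


def open_db(path=":memory:"):
    """Create the (hypothetical) schema: tracked nodes plus an audit log."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS nodes ("
        "url TEXT PRIMARY KEY, lastmod TEXT, audited TEXT, due TEXT NOT NULL)"
    )
    db.execute("CREATE TABLE IF NOT EXISTS log (url TEXT, auditor TEXT, at TEXT)")
    return db


def sync(db, entries, today):
    """Add URLs not seen before (due right away) and refresh lastmod."""
    for url, lastmod in entries:
        db.execute(
            "INSERT OR IGNORE INTO nodes (url, due) VALUES (?, ?)",
            (url, today.isoformat()),
        )
        db.execute("UPDATE nodes SET lastmod = ? WHERE url = ?", (lastmod, url))
    db.commit()


def due_nodes(db, today):
    """The core command: URLs whose due date is today or earlier."""
    rows = db.execute(
        "SELECT url FROM nodes WHERE due <= ? ORDER BY due, url",
        (today.isoformat(),),
    )
    return [row[0] for row in rows]


def mark_audited(db, url, auditor, today, interval_days=365):
    """Record an audit in the log and push the due date out."""
    due = today + timedelta(days=interval_days)
    db.execute(
        "UPDATE nodes SET audited = ?, due = ? WHERE url = ?",
        (today.isoformat(), due.isoformat(), url),
    )
    db.execute("INSERT INTO log VALUES (?, ?, ?)", (url, auditor, today.isoformat()))
    db.commit()
```

Dates are stored as ISO strings so that plain string comparison sorts them correctly. Running `sync` again never resets a due date, which is what makes the update implicit: the only explicit commands are "what's due" and "I audited this".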

The UI we build on top is whatevs. Maybe something like those broken URL scanners (overlapping use case!), where it is just curling in the background, but listing results in a modal for the user.

The repetitions would be customizable, of course, but we could research the ways content lives, and create a set of conditions and checks that save humans from doing as much busy work.
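One possible flash-card-style rule, purely as an assumption about how the repetitions might adapt: if an audit finds nothing to change, the interval doubles up to a cap; if the content needed work, it drops back to a short base interval. The doubling, the 30-day base, and the 730-day cap are illustrative numbers, not researched ones.

```python
from datetime import date, timedelta


def next_interval(current_days, needed_changes, base_days=30, max_days=730):
    """Leitner-like rule: stable content is checked less often."""
    if needed_changes:
        return base_days
    return min(current_days * 2, max_days)


def reschedule(audited_on, current_days, needed_changes):
    """Return the new interval and the next due date after an audit."""
    days = next_interval(current_days, needed_changes)
    return days, audited_on + timedelta(days=days)
```

For example, a page on a 30-day interval that passes its audit on 2018-10-16 moves to a 60-day interval and comes due again on 2018-12-15.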