I’m generating what amount to micro-blog posts. They have individual URLs, but the site is a feedback campaign, so we won’t run out of IDs and I don’t have to future-proof.
Currently I’m creating UUIDs, but that makes for URLs like example.org/f/8152c556-0845-412b-81f7-e4a0721bafb7/, and I was thinking it might be a smidge easier to select and copy URLs if they looked more like example.org/f/bafb7.
PHP’s uniqid() isn’t guaranteed unique, but since it’s based on the current time in microseconds, it’s going to be pretty unique unless you’re calling it across multiple servers at the same microsecond.
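A quick sketch of what uniqid() hands back (the exact values depend on when you run it, so no specific output is shown):

```php
<?php
// uniqid()'s default 13-character id is derived from the current
// time in microseconds, so back-to-back calls share a long prefix
// and differ mostly in the trailing characters.
echo uniqid(), "\n";
echo uniqid(), "\n";
```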
As a safety measure you could check to make sure that the uniqid isn’t already in use, and just re-roll if it is. If you want it even shorter, there’s nothing stopping you from truncating the string, but of course that increases the chance that you’ll have to re-roll.
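The check-and-re-roll idea might look something like this; idExists() is a hypothetical lookup against however you store the posts (database, in-memory set, whatever):

```php
<?php
// Keep generating ids until we land on one that's free.
// idExists() is a stand-in for your own storage lookup.
function freshId(callable $idExists): string {
    do {
        $id = uniqid();
    } while ($idExists($id));   // collision? re-roll and try again
    return $id;
}
```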
Also, since they’re generated from microseconds, if you choose to truncate it’d probably be better to cut off the front rather than the back, since the back is the part that changes most frequently.
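In other words, keep the tail, not the head; a minimal sketch:

```php
<?php
// The leading characters of uniqid() encode the slow-moving seconds
// and barely differ between calls; the trailing characters encode the
// microseconds and change fastest. So when shortening, keep the tail.
$id    = uniqid();          // e.g. 13 characters
$short = substr($id, -5);   // keep the 5 fastest-changing characters
echo $short, "\n";
```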
Or if you really want to try hard to be less predictable, you could md5 your uniqids and then truncate them:
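Something along these lines; md5 scatters the mostly-sequential input across the whole digest, so the truncated slice no longer looks like a counter:

```php
<?php
// Hash the time-based id, then keep only the first 8 hex characters.
// The hash output bears no visible relation to the sequential input.
echo substr(md5(uniqid()), 0, 8), "\n";
```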
As you can see, that little change gets you ids that aren’t so sequence-y.
But depending on how quickly these posts are being generated, none of this might matter much compared to just using uniqid(). You can fiddle all day trying to get different flavors of pseudo-randomish strings.
Another option might be to do something like sha1() the content of the post and refer to it by the first couple of characters of the sha1 hash, sorta similar to how people refer to git commits by only the first few characters of their hashes:
php > echo sha1("What's your favorite food? Mine's pizza!");
a25a2317aa17433c7a5acd71ed5fc490d69e5a20
php > echo sha1("I'mma little teapot, short and thin 'cause I'm workin' out!");
fafd060dc465812ffc6b7106ae918a4e009b84be
php >
If these were the contents of each blog post you could refer to them by a25a2317 and fafd060d respectively. This also has the advantage of being based on content rather than just a timestamp, so if multiple people posted at the same microsecond they wouldn’t get the same hash. If it’s likely that posts with identical content might come up, just concatenate more metadata with the content before hashing, like user, publish timestamp, etc. It doesn’t matter how much you concat together, the sha1 hash always comes out to the same length:
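For instance, tacking on a (made-up) author and timestamp doesn’t change the digest length at all:

```php
<?php
// sha1() always returns 40 hex characters, no matter how long
// the input string is. $user and $published are illustrative.
$content   = "What's your favorite food? Mine's pizza!";
$user      = "alice";                    // hypothetical author
$published = "2013-06-01 12:34:56.789";  // hypothetical timestamp

echo strlen(sha1($content)), "\n";                        // 40
echo strlen(sha1($content . $user . $published)), "\n";   // 40
```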
All great ideas, thanks! I’m using WordPress, which has a built-in way to ensure unique URLs (by appending a number at the end, after checking $slug, $title, etc.), so no worries there (and I doubt multiple submissions per minute, let alone per microsecond).
Ahh, but it doesn’t have to be a constant stream of posts; it only takes one funny happenstance to put a program in a pickle. I’ve found that as soon as I dismiss something as unlikely, it happens.
Then, to quote one of my favorite lovable bears, I’ll have proven myself “Foolish, Deluded, and a Bear of No Brain at All.”