Journal by Chacham

On a permission-based mailing list, the URLs in each email carry tracking numbers. The question is what to use for them.

Generating a random number for each email is quick and easy, but guaranteeing uniqueness is not as quick. The uniqueness check takes increasingly long as the number space fills up, and adding more digits to compensate hurts URL length.
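To put a rough number on that slowdown: if a fraction f of the ID space is already taken, each random draw collides with probability f, so the expected number of draws per new ID is 1/(1 - f). The following is only a sketch of that reasoning. The names `expected_draws` and `new_random_id` are made up for illustration, and an in-memory set stands in for the uniqueness check against the table of sent emails.

```python
import secrets


def expected_draws(fill_fraction: float) -> float:
    """Expected number of random draws needed to hit an unused ID."""
    return 1.0 / (1.0 - fill_fraction)


def new_random_id(used: set, space: int) -> int:
    """Draw random IDs until one is unused, then record and return it.

    `used` is a stand-in for a lookup against the database (an assumption).
    """
    while True:
        candidate = secrets.randbelow(space)
        if candidate not in used:
            used.add(candidate)
            return candidate


if __name__ == "__main__":
    for fill in (0.1, 0.5, 0.9, 0.99):
        print(f"{fill:.0%} full: about {expected_draws(fill):.1f} draws per ID")
```

Keeping the space sparse keeps the retry count close to one, which is exactly the trade-off against URL length described above.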

Since every email sent is recorded, each already has a unique number. The problem is predictability.

Two solutions were proposed. The first is encrypting the unique number. The second is using the unique number together with a random number that is not checked for uniqueness.

Encryption is kind of kewl. But decryption takes time (especially when many links are hit at the same time) and, hard as it is to break, there is a single point of failure for all links that use the same encryption key. The neat upside is that the URL is shorter and nothing more needs to be stored.
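For illustration only, here is one way "encrypting the unique number" could look. Python's standard library has no block cipher, so this sketch swaps in a toy keyed permutation: a four-round Feistel network over 32-bit integers with HMAC-SHA-256 as the round function. The key, the names `encrypt_id` and `decrypt_id`, and the 32-bit width are all assumptions, and this is not a vetted cipher. It only shows the two properties mentioned above: the output is no longer than the input, and nothing extra is stored.

```python
import hashlib
import hmac

KEY = b"hypothetical-secret-key"  # assumption: one key shared by every link
ROUNDS = 4


def _round(half: int, round_no: int) -> int:
    """Keyed round function: HMAC of a 16-bit half, cut down to 16 bits."""
    msg = bytes([round_no]) + half.to_bytes(2, "big")
    digest = hmac.new(KEY, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "big")


def encrypt_id(n: int) -> int:
    """Map a 32-bit counter to another 32-bit number, reversibly."""
    left, right = n >> 16, n & 0xFFFF
    for i in range(ROUNDS):
        left, right = right, left ^ _round(right, i)
    return (left << 16) | right


def decrypt_id(c: int) -> int:
    """Undo encrypt_id by running the rounds in reverse order."""
    left, right = c >> 16, c & 0xFFFF
    for i in reversed(range(ROUNDS)):
        left, right = right ^ _round(left, i), left
    return (left << 16) | right


if __name__ == "__main__":
    link_id = encrypt_id(12345)
    print(link_id, decrypt_id(link_id))
```

Because the mapping is a permutation, distinct counters always give distinct link numbers. The single point of failure is visible too: everything depends on `KEY`.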

The unique number plus a random number seems odd. It exposes a counter from the table, it adds to what needs to be stored, and it takes time to generate the number.

What do you think?

• #### Hashing (Score:3, Informative)

on Sunday December 14, 2003 @12:20PM (#7717568)
Solution: use a counter, coupled with a hash of that counter with a secret prefix. Checking that the hash corresponds to that number is trivial; deriving the secret from the hash and the number is infeasible for a secure hash algorithm.
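A minimal sketch of that idea in current Python, under some assumptions: it uses the standard library's `hmac` module (a keyed hash, swapped in here for a bare secret-prefix hash), SHA-256 as the digest, and a tag truncated to 16 hex characters to keep the URL short. `SECRET`, `make_token`, `verify_token`, and the `counter-tag` layout are hypothetical names and choices, not something from the thread.

```python
import hashlib
import hmac
from typing import Optional

SECRET = b"replace-with-a-long-random-secret"  # hypothetical; keep it private
TAG_HEX_CHARS = 16  # assumption: shorter tag means shorter URL but easier guessing


def make_token(counter: int) -> str:
    """Return 'counter-tag', where tag is a keyed hash of the counter."""
    tag = hmac.new(SECRET, str(counter).encode(), hashlib.sha256).hexdigest()
    return f"{counter}-{tag[:TAG_HEX_CHARS]}"


def verify_token(token: str) -> Optional[int]:
    """Return the counter if the tag matches, otherwise None."""
    counter_text, _, _tag = token.partition("-")
    if not (counter_text.isascii() and counter_text.isdigit()):
        return None
    expected = make_token(int(counter_text))
    # Constant-time comparison, so timing does not leak how much matched.
    if hmac.compare_digest(expected.encode(), token.encode()):
        return int(counter_text)
    return None


if __name__ == "__main__":
    token = make_token(42)
    print(token, verify_token(token))
```

Verification needs no lookup: the server recomputes the tag from the counter in the URL and compares.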
• #### Re:Hashing (Score:2)

This is exactly how to do it :) Make sure your secret prefix (or suffix) isn't too short. An md5 hash is really hard to reverse unless you know exactly how it was made and have all but one of the pieces of the puzzle that were put through the digest in the first place.

These examples assume you have a db or other storage medium in which you store the countervariable, matched against the recipient.

In Python:
import md5

def onewayid(countervariable):
    mymd5 = md5.new(str(countervariable) + 'this is your secret string
• #### Re:Hashing (Score:2)

The only issue I'd have with a hash is the single point of failure. If someone somehow does figure out the key, he can then figure out all subsequent hashes.

• #### Re:Hashing (Score:2)

crypt is hashing, just a weaker version than what md5 uses.

When the user on the other end doesn't even know the length of the plaintext, even knowing what id was included with the plaintext is useless. It is _very_ hard to reverse an md5, as in very computationally intensive. In some ways, due to the unknown plaintext length issue, it's harder than (small key, 128) RSA encryption.

If you've read about issues with md5 hashes and, say, APOP authentication, then let me address that: APOP sends the prefix in
• #### Re:Hashing (Score:2)

That's not to say that people could figure it out. The issue is merely that there is a single point of failure. Sure, it's extremely unlikely to be figured out, but wouldn't it be better to just not have the single point at all?

If you are still worried, then store the hash with the required info to verify it - then you can use a new suffix/prefix each time. Adding the current timestamp to your plaintext should be fine - a unique id, a secret suffix/prefix, and a timestamp (semi-unique).

But, then (storing it), is that
• #### Re:Hashing (Score:2)

The only way to avoid a single point is to do authentication on top :(

Storing a hash is more secure than just passing a unique id and a random id, unless your random id is going to have a very significant range; otherwise an md5 is harder to guess.
• #### Re:Hashing (Score:2)

As long as it is the same size as an md5?
• #### Re:Hashing (Score:2)

Same size or larger - remember that an md5 is always 16 bytes (displayed as 32 characters in hex), whereas a random id is only as big as the column type you declare... in MySQL the largest numeric column type is a DOUBLE or BIGINT, each of which is limited to 8 bytes. NUMERIC and DECIMAL columns are limited to a double's precision.
• #### Re:Hashing (Score:2)

But the random id can be letters, or numbers converted to any base, and then stored in a VARCHAR.

And there is an attempt at keeping the number of characters to a minimum, to avoid wrapped links.
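To make the size trade-off concrete, here is a small sketch that takes 16 random bytes (the same size as an md5 digest) and prints how many characters they need in a few text encodings. It assumes Python's `secrets` and `base64` modules. Stripping the `=` padding is a choice made here to shorten the link, and the padding has to be restored before decoding.

```python
import base64
import secrets

raw = secrets.token_bytes(16)  # 128 random bits, same size as an md5 digest

as_hex = raw.hex()
as_base32 = base64.b32encode(raw).decode().rstrip("=")
as_base64url = base64.urlsafe_b64encode(raw).decode().rstrip("=")

# Characters needed for the same 16 bytes in each encoding.
print(len(as_hex), len(as_base32), len(as_base64url))  # 32 26 22
```

All three strings fit in a VARCHAR column. The base64url form is the shortest, while base32 avoids mixed case if links might be retyped by hand.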
• #### Re:Hashing (Score:2)

True, you can make it anything, and do base64, for example, on it to reduce display length while keeping it non-binary. What I'm trying to point out is several points about doing such a thing:

- Your random number will need to have a large range to be more effective than an md5, more than 16 bytes' worth. That's big.
- finding a random number generator that will provide you with such precision can be done, but it will be slow.
- random number generation and hashing have a lot in common - no random number gen
• #### Re:Hashing (Score:2)

I just read all that you said. Interesting. I'll have to give myself time to chew on it.

[Also, I just re-read what I wrote. Please accept my confidence as a belief in my positions (and trust in my intuition), not as arrogance.]

Though, a couple immediate points.

the random number generating algorithm will be the single point of failure at that point - just as a md5. Unlike a md5 however, most random number generators have a distinctive signature - in other words, given a sequence of generated numbers, one
• #### Re:Hashing (Score:2)

I disagree. If the random number generator is re-seeded each time with the current timestamp, the user would have to know the exact time the number was generated, what other jobs the machine was running (affecting the time at which the seed is taken), and the order; otherwise, knowing what time the job was running doesn't help.

I had originally thought you would be sending out many mails at the same instant - a newsletter of some sort, in which one seed would be used, hence my talk of a sequence from a single initial seed. Using a diff
• #### Re:Hashing (Score:1)

The next time you guys hold a class send me an invite! This was a very worthwhile thread although I'm still trying to process it all!
(visual learner here)

--Huck
• #### Re:Hashing (Score:2)

I had originally thought you would be sending out many mails at the same instant - a newsletter of some sort, in which one seed would be used,

Yes, a newsletter. However, a re-seed is more secure, and negligible timewise, especially since the generator would need to generate a new number anyway.

However, there are only so many algorithms to choose from - and knowing the approx time the message was sent, an attacker could work backwards to try different generators on each millisecond in a 5 second period, ra
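The search over millisecond seeds mentioned in this comment can be sketched as follows. It assumes the attacker knows the generator (here Python's `random.Random`, which is a Mersenne Twister), knows the seed is a millisecond timestamp, and has seen one generated ID. `token_from_seed`, `recover_seed`, and the example timestamp are hypothetical.

```python
import random
from typing import Optional


def token_from_seed(seed_ms: int) -> int:
    """ID produced by a generator freshly seeded with a millisecond timestamp."""
    return random.Random(seed_ms).getrandbits(64)


def recover_seed(token: int, approx_ms: int, window_ms: int = 5000) -> Optional[int]:
    """Try every millisecond within +/- window_ms of the approximate send time."""
    for seed in range(approx_ms - window_ms, approx_ms + window_ms + 1):
        if token_from_seed(seed) == token:
            return seed
    return None


if __name__ == "__main__":
    sent_at_ms = 1_071_404_400_000   # made-up send time known to the attacker
    real_seed = sent_at_ms + 1234    # actual seed, off by about a second
    leaked = token_from_seed(real_seed)
    print(recover_seed(leaked, sent_at_ms) == real_seed)
```

Roughly ten thousand candidate seeds is a trivial amount of work, which is the argument for drawing IDs from an unseeded source such as `secrets` or `os.urandom` instead.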
• #### Re:Hashing (Score:2)

ENTP here :) (on most days)

Oracle's random number generator is much better than most OS generators - if I remember correctly, only OpenBSD was significantly better.

The webpage idea was just an example :)

This sounds like an interesting system you're working on - is this day-job related or pet-project?
• #### Re:Hashing (Score:2)

ENTP here :) (on most days)

All the ENTPs I know are very happy people, and a joy to be around.

Oracle's random number generator is much better than most OS generators - if I remember correctly, only OpenBSD was significantly better.

Ah, good. Thanx.

The webpage idea was just an example :)

Yeah, I know. But there it is. Other than a timestamp, retrieving unique data takes a bit of time. Whereas the web data is more random, it takes that much longer to get.

This sounds like an interesting system you're
• #### timestamp through crypt (Score:1)

What's wrong with just using the timestamp run through crypt? Unique number gets encrypted producing a unique hash that's reasonably unpredictable...

I was about to explain a system I devised that did something like this but I just realized it is probably still copyrighted by my old boss so I can't tell you about it. A real shame too, I rambled on about it for three paragraphs and now this is all you get for a post. Sorry.
• #### Re:timestamp through crypt (Score:2)

What's wrong with just using the timestamp run through crypt? Unique number gets encrypted producing a unique hash that's reasonably unpredictable...

Why can't anyone else do the same thing? Besides, with many emails being sent every second, there would possibly be some duplication.
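Both objections can be shown in a few lines. This sketch assumes second-resolution Unix timestamps and uses an unkeyed SHA-256 digest as a stand-in for crypt. `timestamp_id` and the example times are made up for illustration.

```python
import hashlib


def timestamp_id(unix_seconds: int) -> str:
    """Unkeyed hash of a timestamp: anyone can recompute it."""
    return hashlib.sha256(str(unix_seconds).encode()).hexdigest()[:16]


# Two emails sent within the same second get the same ID.
first = timestamp_id(1071404400)
second = timestamp_id(1071404400)
print(first == second)  # True

# An outsider who knows the rough send time can enumerate every candidate.
hour_start = 1071403200
guesses = {timestamp_id(t) for t in range(hour_start, hour_start + 3600)}
print(first in guesses)  # True
```

Mixing in a secret addresses the second problem, and including the per-email unique number addresses the first.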

A real shame too, I rambled on about it for three paragraphs and now this is all you get for a post. Sorry.

Heh. I also had a much larger post first. Got two paragraphs described in detail, and then realized that they probabl
• #### Re:timestamp through crypt (Score:2)

Why can't anyone else do the same thing? Besides, with many emails being sent every second, there would possibly be some duplication.

Ah, I missed that point. So I guess I should ask: What exactly is the purpose of the scheme? Is it to prevent users from getting at things they are not supposed to? (use a login system) Is it to keep them from asking for data outside its approved time frame? (login with a timer?) What exactly is it you really need to do and why? Why can't you use a login? Why can't yo
• #### Re:timestamp through crypt (Score:2)

It's a link inside of an email.....
• #### Re:timestamp through crypt (Score:2)

It's a link inside of an email.....

Yeah, so? They can't use a browser?
• #### Re:timestamp through crypt (Score:2)

The link has to uniquely identify the email, in a pretty non-guessable manner.
• #### Re:timestamp through crypt (Score:2)

So, check-sum some part of the email that is different for each user.
• #### Re:timestamp through crypt (Score:2)

But how would one know what email was sent and when?

Also, if several emails are essentially the same, the difference per user would be minimal.

Finally, different people do have the same name.
• #### Re:timestamp through crypt (Score:2)

Checksum the header, since each TO address is different. The checksum for each header will be different. Keep the checksum as an ID for each e-mail. Each e-mail will have a different ID even if the differences are minimal. Checksum was designed to give radically different results for very similar inputs.
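As a rough illustration of the header-checksum idea, the sketch below uses CRC-32 from Python's `zlib` module. The post does not name a checksum, so CRC-32, the function name `header_checksum`, and the example addresses are assumptions.

```python
import zlib


def header_checksum(to_header: str) -> int:
    """CRC-32 of a To: header line, as an unsigned 32-bit integer."""
    return zlib.crc32(to_header.encode())


a = header_checksum("To: alice@example.com")
b = header_checksum("To: alicf@example.com")  # differs in a single character

print(a != b)                                      # True: a one-byte change alters the CRC
print(a == header_checksum("To: alice@example.com"))  # True: same input, same ID
```

Two things are worth keeping in mind with this approach. The ID has only 32 bits, and it is unkeyed, so anyone who knows the recipient's address can recompute it.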

You don't have a problem unless there are two people with the same name and the same e-mail address. If that happens I'd say that those people have a really big problem. I'd even say that it wasn't your
• #### Re:timestamp through crypt (Score:2)

Interesting.

One issue, though, is that the to address does not always come back. For example, if the to address is a forwarder, and the actual address bounced it, the to address in the bounce will be incorrect.

You don't have a problem unless there are two people with the same name and the same e-mail address. If that happens I'd say that those people have a really big problem. I'd even say that it wasn't your problem to worry about.

Agreed.
• #### Re:timestamp through crypt (Score:2)

One issue, though, is that the to address does not always come back. For example, if the to address is a forwarder, and the actual address bounced it, the to address in the bounce will be incorrect.

So if you can id your e-mails by this scheme, you can validate the bounce as an authentic bounce and use the checksum embedded in the e-mail as an identifier... message the original e-mail telling them they're getting removed... and remove the e-mail addy associated with the identifier.

If you want to hire me
• #### Re:timestamp through crypt (Score:2)

So if you can id your e-mails by this scheme

Why is that better than a random id?

you can validate the bounce as an authentic bounce and use the checksum embedded in the e-mail as an identifier

Actually, that scheme won't work. Some bounces won't even return the text.

Anyway, that isn't the issue. They have bounces taken care of pretty well. It was just an example of why checksumming the email address won't work.

If you want to hire me as a consultant I used to hire out at \$89 an hour... I'd probably go
• #### Re:timestamp through crypt (Score:2)

Why is that better than a random id?

Huh? Did I say it was better than a random id? I'm suggesting how you get your random id... at least that's what I thought I was suggesting last week.

If you do a crypt on something, you can challenge on that something and find out if they know it without having to know or reveal the secret. So you could do the crypt, checksum, md5 or whatever on some secret and then check the thing that comes back to see if they are the same.

There are three forms of security: Cha
• #### Re:timestamp through crypt (Score:2)

The main issue here is to uniquely identify links, and protect against them being guessed. Knowing the correct link is not too bad, but it may point where it shouldn't, or allow the user to access something not otherwise available to him.
• #### Re:timestamp through crypt (Score:2)

The main issue here is to uniquely identify links, and protect against them being guessed. Knowing the correct link is not too bad, but it may point where it shouldn't, or allow the user to access something not otherwise available to him.

I've gone back and re-read my posts and some of the others in this JE. Really, a "random" number is fine for the id of the link in a non-guessable way. How you generate that random number is very important since you want it to be really random and really unique. There is