Twitter's UID associated with each tweet will very soon exceed 2,147,483,647

drawkbox · on June 9, 2009

Always use a UUID or GUID (same thing) in today's web applications. When we get to terabytes of data as the norm for apps, int-based keys are going to seem so ancient.

Sure, it takes more room, but space is cheap, and it makes syncing and distributed databases so much easier (not tied to one machine or cluster).
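
A minimal Python sketch of that trade-off, assuming two nodes that mint keys without talking to each other. The `new_row` helper and the field names are hypothetical, made up for illustration only.

```python
import uuid

def new_row(text):
    # Hypothetical helper: each node mints its own key locally,
    # with no shared counter to contend on.
    return {"id": str(uuid.uuid4()), "text": text}

node_a = [new_row("hello from A") for _ in range(1000)]
node_b = [new_row("hello from B") for _ in range(1000)]

# Merging the two nodes' rows needs no renumbering: collisions between
# random 122-bit values are astronomically unlikely.
merged = {row["id"]: row for row in node_a + node_b}

# The extra room mentioned above: 16 bytes per key, versus 4 for a 32-bit int.
key_bytes = len(uuid.uuid4().bytes)
```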

dpcan · on June 9, 2009

I figure many Twitter apps are written with SQLite as their database (in AIR, etc.), so the INTEGER PRIMARY KEY can get up to 9223372036854775807.

http://www.sqlite.org/faq.html

What is the point of creating a website like this?

ivankirigin · on June 9, 2009

Maybe I'm just paranoid, but I index the tweet IDs in Tipjoy with a string: tweetId = models.CharField(max_length=50, db_index=True)

lallysingh · on June 9, 2009

Anyone notice the footnote on the estimated time of crossing?

"3 the average tweets per second and the Twitpocalypse date are a rough approximation, please do not schedule any vacation around that date"

Steve0 · on June 9, 2009

Meh, anyone who's coding for this platform must have put those IDs in an unsigned integer, wouldn't they?

patio11 · on June 9, 2009

I suspect for a lot of people coding against Twitter the notion of integers being signed or unsigned is about as foreign to their experience as pointer arithmetic. I do not consider that a bad thing.

irb > "2,147,483,647".gsub(",","").to_i + 1

2147483648

(Forgive the extra code necessary to strip out the commas. The Windows irb console leaves a little to be desired in terms of midline editing.)
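
For a client that does store the ID in a fixed-width field, the same addition wraps around. A small Python sketch, using `ctypes.c_int32` and `ctypes.c_uint32` as stand-ins for such fields (Python's own integers, like Ruby's, do not overflow):

```python
import ctypes

INT32_MAX = 2_147_483_647

# Python integers are arbitrary precision, like Ruby's, so this just works.
next_id = INT32_MAX + 1

# Forcing the value into a signed 32-bit slot keeps only the low 32 bits,
# which here reads back as a negative number.
wrapped = ctypes.c_int32(next_id).value

# An unsigned 32-bit slot still holds the value, but only until 4,294,967,295.
as_unsigned = ctypes.c_uint32(next_id).value
```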

there · on June 9, 2009

you know ruby supports _ as a comma separator, right?

irb(main):002:0> 2_147_483_647 + 1

=> 2147483648

tjogin · on June 9, 2009

It's such a low barrier to entry and an almost non-existent learning curve to code for Twitter, so I expect quite a lot of services to have made that mistake. Not the big ones, though.

cl3m · on June 9, 2009

More at http://blog.programmableweb.com/2009/06/09/the-twitpocalypse...

pierrefar · on June 9, 2009

It's actually sooner. They assume 83 tweets per second (reference 2) but the number is 200 tweets per second: http://friendfeed.com/scobleizer/50e673d8/some-stats-from-tw...

m_eiman · on June 9, 2009

From the link:

"Twitter is seeing about 200 tweets per second, during peak loads. - Robert Scoble"

Note "during peak loads", so 83 is probably not too far off.

pierrefar · on June 9, 2009

Fair enough. Do we have any solid references of the actual average rate of tweets?

m_eiman · on June 9, 2009

I haven't seen one, but it's easy enough to get it. Just check the UID of the latest tweet on the public timeline, wait a week and check again.
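
That measurement amounts to a subtraction and a division. A rough Python sketch: the two sample readings are made up for illustration, and it assumes IDs are handed out one per tweet with no gaps.

```python
INT32_MAX = 2_147_483_647

def tweets_per_second(id_then, id_now, elapsed_seconds):
    # Average rate, assuming every ID in between was used by a tweet.
    return (id_now - id_then) / elapsed_seconds

def seconds_until_overflow(id_now, rate):
    # Time left before IDs pass the signed 32-bit maximum.
    return (INT32_MAX - id_now) / rate

WEEK = 7 * 24 * 60 * 60

# Hypothetical readings taken one week apart.
rate = tweets_per_second(2_000_000_000, 2_060_480_000, WEEK)
remaining = seconds_until_overflow(2_060_480_000, rate)
```

If some IDs are skipped or never used, the real tweet rate is lower than this estimate, but the time until the IDs cross the limit is unaffected.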

jsdalton · on June 9, 2009

They must read HN. It now says:

2 at a rate of 151 tweets per seconds

IsaacL · on June 9, 2009

"Values updated every 5 minutes". Looking at the source (I don't actually know JavaScript, but it seems pretty simple) I'm assuming that means they get the current TwitID AND the current rate of increase.

DrJokepu · on June 9, 2009

In the age of multicore 64-bit CPUs, they're not using signed 32-bit integers as the primary key, are they?

thristian · on June 9, 2009

To be fair, they say "For some of your favorite third-party Twitter services not designed to handle such a case..." rather than accusing Twitter themselves of primary-key ineptitude.

domdefelice · on June 9, 2009

I think the problem is not in twitter itself but in the services using their APIs.

ks · on June 9, 2009

Twitter is using an unsigned 64-bit bigint according to this:

http://twitter.com/twitterapi/status/2048659057

charlesju · on June 9, 2009

I'm going to put money down that this will not affect anything.

sho · on June 9, 2009

That is quite an amazingly high number. Surely 100m people have not posted 2,000 "tweets" each.

I wonder what proportion of the total is human-typed, and how much is machine generated?

cperciva · on June 9, 2009

"Surely 100m people have not posted 2,000 'tweets' each."

Two billion tweets would be 1M people posting 2k tweets each, or 100M people posting 20 tweets each.

sho · on June 9, 2009

Oh. Oops. Of course you are right. That was pretty dumb of me : /

axod · on June 9, 2009

Amazingly high? Considering how long Twitter has been going? It's extremely small IMHO.

http://news.ycombinator.com/item?id=648725

"Just 10% of Twitter users generate more than 90% of the content, a Harvard study of 300,000 users found."

I'd expect most of it is machine generated.

hugothefrog · on June 9, 2009

I'd also expect most of those numbers not to be used. Their db writes are probably not contending on one monotonic counter. They're probably using some numbering scheme which guarantees uniqueness across multiple writable databases, but not global serialisation based on id.

bretthoerner · on June 9, 2009

I'm not sure how they'd be doing that and still giving each tweet a number so close to all the others. I have no idea where you see a "public timeline" as people mentioned above, so I just did a search for "i"... the following tweets were about a second apart.

2088840357 2088840362 (difference of 5, and that's without a real public timeline where you could easily prove each tweet takes the next number in a sequence)

So it isn't as if they're using UUIDs or something unique... they very much seem to be relying on a single counter (currently).

hugothefrog · on June 9, 2009

For a first guess, if there were 5 writing dbs, each could write every 5th id, starting at a different index (0,1,2,3,4).

You'd end up with globally unique ids, and as long as your writes were (approx.) evenly distributed, all 5 sequences would be (approx.) close to each other.

Who knows, they might have 100 writable dbs, with only 70% up at any point in time, meaning only 70% of all integers are actually used. Anyway, I don't know; I can just imagine them doing something like this if they wanted certain things from the system.
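
The interleaving idea can be sketched in a few lines of Python. This only illustrates the guess above and is not a description of what Twitter runs; the `InterleavedCounter` class is hypothetical.

```python
import itertools

class InterleavedCounter:
    """Hands out every `step`-th integer, starting at `offset`."""

    def __init__(self, offset, step):
        self._ids = itertools.count(offset, step)

    def next_id(self):
        return next(self._ids)

# Five writable databases, each owning one residue class modulo 5.
writers = [InterleavedCounter(offset, 5) for offset in range(5)]

# Spread 20 writes evenly across the writers (round-robin).
issued = [writers[i % 5].next_id() for i in range(20)]

# With only writers 0, 1 and 2 taking traffic, residues 3 and 4 go unused.
partial = [InterleavedCounter(offset, 5) for offset in range(3)]
sparse = sorted(partial[i % 3].next_id() for i in range(9))
```

With evenly spread writes the issued IDs are unique and stay close together; with some writers idle, the sequence shows gaps. MySQL exposes the same idea through its `auto_increment_increment` and `auto_increment_offset` settings.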

hboon · on June 9, 2009

No. Because their API exposes an explicit ordering of the IDs - there is a sinceID: parameter.

Timothee · on June 9, 2009

You can get the public timeline from here: http://twitter.com/public_timeline

Note that one can delete a message, which could explain some gaps.

jackowayed · on June 9, 2009

I think the gaps are mostly people posting protected tweets, which thus don't get into the public timeline.

jonknee · on June 9, 2009

They do something like that with user IDs. They used to be incrementing and now they have gaps.

TweedHeads · on June 9, 2009

ALTER TABLE twits MODIFY COLUMN twitid BIGINT UNSIGNED NOT NULL

There, fixed, send check for $20 to donations@redcross.org

Call me back when they reach 18446744073709551615

sanj · on June 9, 2009

Assuming that Rails is still interacting with the DB, and the ID in question is the primary key (i.e., the 'id' on the 'twits' table), it isn't quite that easy.

I had to do this a couple of years back when I realized I had a table that'd grow quickly:

http://snippets.dzone.com/posts/show/4422

Feel free to send the donation to the same place.

lpgauth · on June 9, 2009

Is there no easier way than this? Can't believe that wasn't integrated into Rails...

jhancock · on June 10, 2009

"Can't believe that wasn't integrated into rails..."

You just made my day. I haven't laughed so hard in ages ;)

joezydeco · on June 9, 2009

Hmm, didn't fix my iPhone client. Try again!

fleaflicker · on June 9, 2009

That command will take several hours to complete and write lock the table.

Andys · on June 10, 2009

Unless you use PostgreSQL.

But then you would have had a 64-bit ID by default anyway.

jsonscripter · on June 9, 2009

Why can't you just redirect the writes to a new temporary table and join the original and temporary on reads? (I'm not a database admin)
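
A toy Python sketch of what that could look like, where the "join" on reads is really a UNION behind a view. SQLite stands in for the real database here, so the column widths are only notional, and the `twits_old` and `twits_new` table names are made up. Whether this holds up on a large, busy production table is a separate question.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Existing table; imagine its id column is a 32-bit INT that is nearly full.
db.execute("CREATE TABLE twits_old (twitid INTEGER PRIMARY KEY, body TEXT)")
db.execute("INSERT INTO twits_old VALUES (2147483646, 'second to last')")
db.execute("INSERT INTO twits_old VALUES (2147483647, 'last narrow id')")

# New table with a wide id column; all new writes are redirected here.
db.execute("CREATE TABLE twits_new (twitid INTEGER PRIMARY KEY, body TEXT)")
db.execute("INSERT INTO twits_new VALUES (2147483648, 'first wide id')")

# Readers query a view that stitches both tables together.
db.execute(
    "CREATE VIEW twits AS "
    "SELECT twitid, body FROM twits_old "
    "UNION ALL "
    "SELECT twitid, body FROM twits_new"
)

rows = db.execute("SELECT twitid FROM twits ORDER BY twitid").fetchall()
ids = [row[0] for row in rows]
```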

karanbhangui · on June 9, 2009

I'm actually interested in this answer as well. I know of a company using a proprietary db and now that their growth has literally exploded, they're stuck with it.

jsonscripter · on June 9, 2009

I wonder what literally exploding growth looks like...

blhack · on June 9, 2009

http://dhawhee.blogs.com/d_hawhee/files/109_fake_tits_real_b...

TweedHeads · on June 9, 2009

Nothing the fail whale can't fix ;-)