So let me inform you all and the world that, after many months of work and negotiation, I have acquired 10 million expired threads from 4chan’s history. Roughly half a decade’s worth.
[…]
It’s going on archive.org over the next week. I’ll let you know when it’s done. It’s dozens of gigabytes, and I have it in XML, HTML and MYSQL formats, all of which show different parts of the data. (Conversion strips out some data that original formats might not have, and so on.)
— Jason Scott / Bump Not Sage: Saving 4Chan - OH. MY. GAWD.