Tumblr Architecture - 15 Billion Page Views A Month And Harder To Scale Than Twitter
Stats
- 500 million page views a day
- 15B+ page views a month
- ~20 engineers
- Peak rate of ~40k requests per second
- 1+ TB/day into Hadoop cluster
- Many TB/day into MySQL/HBase/Redis/Memcache
- Growing at 30% a month
- ~1000 hardware nodes in production
- Roughly 750 million page views per month per engineer (15B+ across ~20 engineers)
- Posts are about 50GB a day. Follower list updates are about 2.7TB a day.
- Dashboard runs at a million writes a second, 50K reads a second, and it is growing.
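As a rough sanity check, the figures above can be related to one another with simple arithmetic. The sketch below uses only numbers from the list, plus two assumptions of its own: a 30-day month and decimal units (1 TB = 1000 GB). The variable names are illustrative.

```python
import math

# Figures taken from the stats list above.
PAGE_VIEWS_PER_DAY = 500_000_000
MONTHLY_GROWTH = 0.30          # "Growing at 30% a month"
POSTS_GB_PER_DAY = 50          # "Posts are about 50GB a day"
FOLLOWER_GB_PER_DAY = 2_700    # "about 2.7TB a day", assuming 1 TB = 1000 GB

# Assumptions made for this sketch.
DAYS_PER_MONTH = 30
SECONDS_PER_DAY = 86_400

# 500 million a day over a 30-day month gives the 15 billion monthly figure.
monthly_page_views = PAGE_VIEWS_PER_DAY * DAYS_PER_MONTH

# Average page views per second, spread evenly over the day.
avg_page_views_per_second = PAGE_VIEWS_PER_DAY / SECONDS_PER_DAY

# Follower list updates outweigh the posts themselves by this factor.
fanout_ratio = FOLLOWER_GB_PER_DAY / POSTS_GB_PER_DAY

# 30% monthly growth compounds; these are what it implies if sustained.
yearly_growth_factor = (1 + MONTHLY_GROWTH) ** 12
doubling_months = math.log(2) / math.log(1 + MONTHLY_GROWTH)

print(f"monthly page views:        {monthly_page_views:,}")
print(f"average page views/second: {avg_page_views_per_second:,.0f}")
print(f"follower-to-post volume:   {fanout_ratio:.0f}x")
print(f"growth over 12 months:     {yearly_growth_factor:.1f}x")
print(f"months to double:          {doubling_months:.1f}")
```

The volume ratio is the telling number here: the data written for follower list updates is about 54 times the size of the posts themselves. The growth figures are extrapolations that hold only if the 30% monthly rate is sustained.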
Software
- OS X for development, Linux (CentOS, Scientific) in production
- Apache
- PHP, Scala, Ruby
- Redis, HBase, MySQL
- Varnish, HAProxy, nginx
- Memcache, Gearman, Kafka, Kestrel, Finagle
- Thrift, HTTP
- Func - a secure, scriptable remote control framework and API
- Git, Capistrano, Puppet, Jenkins
Hardware
- 500 web servers
- 200 database servers (many of these are part of a spare pool we pulled from for failures)
  - 47 pools
  - 30 shards
- 30 Memcache servers
- 22 Redis servers
- 15 Varnish servers
- 25 HAProxy nodes
- 8 nginx servers
- 14 job queue servers (Kestrel + Gearman)
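The per-role counts can be totalled and compared with the "~1000 hardware nodes in production" figure from the stats. This is a minimal sketch: the dictionary keys are illustrative labels, and the pool and shard counts are left out on the assumption that they describe database topology rather than machines.

```python
# Server counts per role, copied from the hardware list above.
# Pools (47) and shards (30) are excluded: they are assumed to describe
# how the database servers are organised, not additional machines.
listed_nodes = {
    "web": 500,
    "database": 200,
    "memcache": 30,
    "redis": 22,
    "varnish": 15,
    "haproxy": 25,
    "nginx": 8,
    "job queue": 14,
}

APPROX_PRODUCTION_NODES = 1000  # "~1000 hardware nodes in production"

total_listed = sum(listed_nodes.values())
not_itemised = APPROX_PRODUCTION_NODES - total_listed
web_share = listed_nodes["web"] / total_listed

print(f"nodes itemised by role: {total_listed}")
print(f"not itemised (approx):  {not_itemised}")
print(f"web servers' share:     {web_share:.0%}")
```

The itemised roles add up to 814 machines, so roughly 186 of the ~1000 production nodes are not broken out in this list. Web servers account for about 61% of the itemised machines.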