Scaling Python on the Web
First session of the day was on Scaling Python on the Web; rough notes which I may clean up later:
- How fast is fast enough?
- Don’t prematurely optimize
- Know where the bottlenecks are, and optimize those specifically
- Orders of magnitude: static (httpd), dynamic (python), db-queried
- Even 40 req/s is about 3.4M pages/day
- Hundreds to low thousands of dynamic page views is usually good enough
- Scaling isn’t about the language, it’s about:
- DRY: cache!
- share nothing
- built a sample photo-app, FlickrKillr, for demonstration purposes
- preloaded with hundreds of thousands of users, 10-20 photos each
- first iteration: CGI
- roughly 23 requests/second
- problems:
- loading Python interpreter for each request
- all resources initialized for each request (inc. db connection)
- possible remedies:
- run a Python web server (long-running process)
- make one db connection per thread instead of per request
- other remedies:
- fastcgi
- snakelets, twisted.web, RhubarbTart
- mod_python
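
The "one db connection per thread" remedy can be sketched roughly as below. This is a minimal illustration, not the code from the talk: `sqlite3` stands in for whatever database the demo used, and `get_connection` is a hypothetical helper name.

```python
import sqlite3
import threading

# Each thread sees its own attributes on this object.
_local = threading.local()


def get_connection(path=":memory:"):
    """Return this thread's connection, opening it on first use only.

    Assumption: sqlite3 is a stand-in for the real database driver.
    """
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(path)
        _local.conn = conn
    return conn


# Within one thread, repeated calls reuse the same connection object.
first = get_connection()
second = get_connection()
```

In a long-running server, each worker thread pays the connection cost once rather than on every request, which is the overhead the CGI version could not avoid.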
- second iteration: python app server (CherryPy used for this demo)
- roughly 139 requests per second
- problems
- global interpreter lock — can only utilize one core on a dual core machine
- sessions in the database — prefer an in-memory session store
- remedies:
- run multiple instances of CherryPy (to overcome the GIL)
- but then we need to balance with something like nginx
- other options:
- cherrypy in mod_python
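
As a toy illustration of what balancing across several app-server instances means, the sketch below hands out backends in round-robin order. It is not nginx configuration and not from the talk; the backend addresses and the `RoundRobin` class are made up for the example.

```python
from itertools import cycle

# Hypothetical addresses: one app-server process per core.
BACKENDS = ["127.0.0.1:8081", "127.0.0.1:8082"]


class RoundRobin:
    """Hand out backends in a fixed rotating order."""

    def __init__(self, backends):
        self._rotation = cycle(list(backends))

    def pick(self):
        # Each call returns the next backend, wrapping around at the end.
        return next(self._rotation)


balancer = RoundRobin(BACKENDS)
picks = [balancer.pick() for _ in range(4)]
```

Since each instance is a separate process with its own interpreter, the GIL in one does not block the other.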
- version 3: load balancing with nginx
- 217 requests/sec
- outstanding problems
- static files read from disk every time
- and they’re being read/written from python
- solutions:
- memcached
- combine memcached with nginx
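
The caching idea can be sketched as a simple "check the cache, otherwise render and store" helper. This is only an illustration under stated assumptions: `DictCache` is an in-process stand-in for memcached, not a real memcached client, and `cached_render` and the key format are hypothetical.

```python
class DictCache:
    """In-process stand-in for memcached (assumption: only get/set are needed)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def cached_render(cache, key, render):
    """Return the cached page for key, rendering and storing it on a miss."""
    page = cache.get(key)
    if page is None:
        page = render()
        cache.set(key, page)
    return page


render_calls = []


def render_user_page():
    # Stands in for the expensive part: database queries plus templating.
    render_calls.append(1)
    return "<html>photos for user 42</html>"


cache = DictCache()
page_one = cached_render(cache, "user:42", render_user_page)
page_two = cached_render(cache, "user:42", render_user_page)
```

The second call is served from the cache, so the expensive render runs only once.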
- version 4: caching
- 616 req/sec (benchmarking w/ homegrown tool)
- 1750 req/sec (benchmarking w/ ab)
- other notes:
- don’t forget to index
- without an index, the fourth iteration falls down to 28 requests/sec
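
To see why the missing index hurts so much, the sketch below asks SQLite how it plans the same query before and after an index is created. The `photos` schema and the index name are invented for illustration, and `sqlite3` is an assumption, since the notes do not say which database the demo used.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a query as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical schema, loosely modeled on a photo app.
conn.execute(
    "CREATE TABLE photos (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO photos (user_id, title) VALUES (?, ?)",
    [(user, f"photo {n}") for user in range(100) for n in range(10)],
)

sql = "SELECT title FROM photos WHERE user_id = ?"
plan_before = query_plan(conn, sql, (42,))  # full table scan expected

conn.execute("CREATE INDEX idx_photos_user ON photos (user_id)")
plan_after = query_plan(conn, sql, (42,))  # should now use the index
```

Without the index, every lookup by `user_id` has to walk the whole table, and that cost grows with the number of photos.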