I'm using the new code base and have a collection of newspapers that totals over 200,000 pages. I know the LOC uses caching software (Varnish) to speed up page loads, and we currently don't have that running. On our site, the newspaper list page (http://nyshistoricnewspapers.org/newspapers/) takes roughly 5 minutes to load, while the other pages load relatively quickly.
Is there something wrong in the code, or is that load time expected given the queries that Django is trying to execute, in which case I need to install a cache to make the page faster?
This is part of a bigger question. What general recommendations do any of you have for putting this software into production with multi-million-page collections? These could be server- or network-specific (RAM, virtualization, multi-server setups, etc.) or software-specific (caching, settings, etc.). We want to release this site to the public relatively soon and want it ready to handle the traffic from users.
Thanks so much for your suggestions and guidance.
Mike Beccaria
Systems Librarian
Head of Digital Initiative
Paul Smith's College
518.327.6376