In short: all of the above.
World of Logs currently runs on 5 physical servers: 2 frontends, 1 frontend+backend, 1 backend and 1 database. We used to run on just 2 frontends and 1 backend+database, but increased traffic, more features and larger combat log files forced us to scale up our capacity a few years ago. Since three of our machines are getting fairly dated, we're thinking of adding memory to our 2 newest servers and scaling back down to just those two again.
Our stack consists of three server applications: a custom-written, stateless Java/Jetty backend optimized for performance; a Python/Django frontend that generates page content and holds all stateful information; and a PostgreSQL server that stores persistent data. All client-facing communication is handled by the frontends, whereas the backends merely function as data providers. Client requests are divided among frontends by an nginx load balancer, while backend requests are balanced on a round-robin basis.
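The round-robin balancing of backend requests can be sketched in a few lines. This is a minimal illustration, not our actual code; the hostnames and port are made up:

```python
import itertools

# Hypothetical backend pool; hostnames are illustrative only.
BACKENDS = ["backend1.example.com:8080", "backend2.example.com:8080"]
_pool = itertools.cycle(BACKENDS)

def next_backend():
    """Return the next backend host, cycling through the pool."""
    return next(_pool)
```

Each frontend request that needs report data just grabs the next host from the cycle, so load spreads evenly across backends without any shared state.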
Our backend servers are stateless in that they do not keep any information on users, guilds or even combat logs. All of that information is kept in a PostgreSQL database that is used solely by the frontend servers. When a page view is requested from a frontend, it either handles the request itself (for non-report pages like rankings) or sends an HTTP request to a backend server for the required data. This request consists of a combat log filename and a specification of the data required. The backend server loads the report, processes it to gather the information, then passes that information back to the frontend in JSON format. The frontend interprets and formats this information, then presents the page content to the client.
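To make the frontend-to-backend exchange concrete, here's a sketch of what building and decoding such a request might look like. The field names (`file`, `query`) are illustrative, not our actual wire protocol:

```python
import json

def build_backend_request(log_filename, spec):
    """Serialize a backend request: which combat log to load,
    plus a specification of the data wanted. Field names are
    hypothetical, not the real WOL protocol."""
    return json.dumps({"file": log_filename, "query": spec})

def parse_backend_response(body):
    """Decode the backend's JSON reply into native Python objects
    for the frontend to format into page content."""
    return json.loads(body)
```

The frontend would POST the payload over HTTP to a backend, then hand the parsed response to its templating layer.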
To achieve the performance that we do, we use a number of techniques. As you already mentioned, we use a specialized serialization format that not only allows for efficient compression (down to ~5%), but also for very fast deserialization. This minimizes both the bandwidth required for uploading and our server-side storage requirements. When a combat log is first loaded by a backend server, some information (e.g. actor merging, pet association, shield tracking) is precalculated and stored in memory. Once that is done, the report is processed for the information requested by the frontend. That last step is executed for each combat log-specific page view, since WOL allows you to select any time range within the report. The combat log and precalculated information are kept in an LRU cache as long as there is memory available, which makes subsequent requests very fast, as you already noted.
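The LRU behaviour is the classic one: evict whatever was used least recently when memory runs out. A toy sketch (capacity counted in entries here for simplicity; the real cache is bounded by available memory):

```python
from collections import OrderedDict

class ReportCache:
    """Minimal LRU cache sketch for parsed combat logs.
    On a miss, load() parses the log and runs the precalculation;
    on a hit, the entry is bumped to most-recently-used."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, filename, load):
        if filename in self._entries:
            self._entries.move_to_end(filename)  # mark as recently used
            return self._entries[filename]
        report = load(filename)  # expensive: parse + precalculate
        self._entries[filename] = report
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return report
```

The expensive parse-and-precalculate step runs once per report; every subsequent time-range query against the same report hits the cache.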
As for graph data: most graph series are pre-calculated and stored on the filesystem by the Java backend when the report is first loaded. The frontend then deserializes and post-processes these series based on the page view requested and sends them to the client as part of the page content. The browser turns these into graphs using the excellent Flot.js library. There is one exception to this flow: the Analyze (Damage/Healing Done/Taken) pages actually use dynamically generated graphs. There are three reasons for this: 1) they were developed later on, when we had more processing power available; 2) they are available to subscribers only during peak hours, which limits the impact on total load (they're quite heavy requests); and 3) they allow you to select the graph's granularity, which means we can't easily make use of pre-generated data.
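Why does selectable granularity rule out pre-generated data? Because the bucket width is chosen per request, the events have to be re-aggregated each time. A sketch of that aggregation step (illustrative, not our actual implementation):

```python
def bucket_series(events, granularity):
    """Aggregate (timestamp_seconds, amount) events into fixed-width
    buckets of `granularity` seconds, returning sorted (bucket_start,
    total) pairs. Since the width varies per request, this cannot be
    served from a single pre-generated series."""
    buckets = {}
    for t, amount in events:
        key = int(t // granularity) * granularity
        buckets[key] = buckets.get(key, 0) + amount
    return sorted(buckets.items())
```

A 5-second and a 10-second view of the same events produce different series, so each granularity is a fresh pass over the combat log data.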
That's pretty much how things work around here. Hope you enjoyed it!