
Information Caching
Peter Danzig, University of Southern California
Donald Neal, University of Waikato, New Zealand
Peter Danzig speaks:
Harvest Object Caching
A study was done to determine how much traffic could be saved by caching
FTP data. The results indicated that caching could remove 35% of the
traffic.
Motivation for Web Caching
- Improve perceived response time
- Cut link bandwidth needs in half
- Offload remote servers to obtain resiliency
- Avoid congestion-collapsed international links
- Add robustness to the Web
- Route Web requests to closest, replicated server
AOL has 4 million users and processes several hundred thousand URLs every
hour.
Design
- Non-blocking, multi-threaded internal architecture
- DNS Cache (and negative caching)
- Negative Object Cache
- Order of magnitude faster than Netscape & CERN
- Runs on all Unix platforms
- Configurable, scalable resolution protocol
- Performs expensive operations in the background
- Serves 3M requests/day/CPU
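The DNS cache with negative caching listed above can be sketched roughly as follows. This is a minimal illustration and not Harvest's actual code: the class name `DnsCache`, the TTL values, and the injectable `resolver` and `clock` parameters are all assumptions made for the example.

```python
import socket
import time


class DnsCache:
    """Tiny DNS cache that also remembers failed lookups (negative caching)."""

    def __init__(self, resolver=socket.gethostbyname, clock=time.monotonic,
                 positive_ttl=3600.0, negative_ttl=300.0):
        # All defaults are illustrative assumptions, not Harvest settings.
        self._resolver = resolver
        self._clock = clock
        self._positive_ttl = positive_ttl
        self._negative_ttl = negative_ttl
        self._entries = {}  # hostname -> (expiry_time, address or None)

    def lookup(self, hostname):
        """Return an address string, or None if the name does not resolve."""
        now = self._clock()
        entry = self._entries.get(hostname)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh entry; None means a cached failure
        try:
            address = self._resolver(hostname)
            ttl = self._positive_ttl
        except OSError:  # socket.gaierror is a subclass of OSError
            address = None
            ttl = self._negative_ttl
        self._entries[hostname] = (now + ttl, address)
        return address
```

Remembering failures for a short time means a burst of requests for a bad hostname costs one slow lookup instead of one per request.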
Deployment at BIG ISP
- Set your browser proxy-settings
- Deploy access caches for your browser
- Deploy clusters of caches at access centers
- Configure resolution policy to maximize locality and resiliency
HTTPD Accelerator
- Adds a RAM cache to your web server
- Serve 3M hits/day from desk top workstations
- Support virtually-hosted httpd's
- Accelerator sites on port 80 (web server on 81)
Performance of the httpd accelerator is very impressive: about an order
of magnitude better than regular servers.
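The accelerator idea (a RAM cache answering in front of the real web server) might look something like the sketch below. It is not the Harvest implementation: the `Accelerator` class, the `fetch_from_origin` callable, and the least-recently-used eviction policy are assumptions chosen to keep the example small.

```python
from collections import OrderedDict


class Accelerator:
    """In-memory cache placed in front of a slower origin web server."""

    def __init__(self, fetch_from_origin, max_objects=1000):
        # fetch_from_origin is a hypothetical callable: path -> bytes.
        # In the setup described above it would query the server on port 81.
        self._fetch = fetch_from_origin
        self._max_objects = max_objects
        self._cache = OrderedDict()  # path -> body, oldest use first
        self.hits = 0
        self.misses = 0

    def get(self, path):
        """Serve from RAM when possible, otherwise ask the origin once."""
        if path in self._cache:
            self._cache.move_to_end(path)  # mark as most recently used
            self.hits += 1
            return self._cache[path]
        self.misses += 1
        body = self._fetch(path)
        self._cache[path] = body
        if len(self._cache) > self._max_objects:
            self._cache.popitem(last=False)  # evict least recently used
        return body
```

Popular objects are then served from memory without touching the origin server's disk or process pool, which is where the speedup would come from.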
Technical Challenges
- Transparency
- multiple browsers
- multilingual browsers
- different consistency models
- Routing
- cache working set management
- coordinate cache routing tables across thousands of autonomous systems
- cache-to-cache authentication
For more information:
Questions
How is the cache updated? The consistency policy:
- If there is an Expires header, believe it.
- If there isn't one, calculate something reasonable.
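The two-step policy above could be sketched as follows. The talk did not say what "something reasonable" means, so the fallback here is an assumption: a fraction (`lm_factor`) of the object's age since Last-Modified, with a fixed `default_ttl` when no dates are available. The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def _parse_http_date(value):
    """Parse an HTTP date; return an aware UTC datetime, or None if unusable."""
    if not value:
        return None
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def expiry_time(headers, fetched_at, lm_factor=0.2,
                default_ttl=timedelta(hours=1)):
    """Decide when a cached object should be treated as stale."""
    # Step 1: an explicit Expires header is believed as given.
    expires = _parse_http_date(headers.get("Expires"))
    if expires is not None:
        return expires
    # Step 2 (assumed heuristic): objects that have not changed for a long
    # time are unlikely to change soon, so keep them for a fraction of
    # their current age.
    last_modified = _parse_http_date(headers.get("Last-Modified"))
    if last_modified is not None and fetched_at > last_modified:
        return fetched_at + (fetched_at - last_modified) * lm_factor
    # No usable dates at all: fall back to a fixed lifetime (assumed value).
    return fetched_at + default_ttl
```

Here `headers` is a plain dict of response headers and `fetched_at` is a timezone-aware datetime recording when the object was retrieved.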
Matt Mathis suggests that if an ISP provided more caching services, it
could reduce the required size of its transit link.
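As a rough worked example of that sizing argument: if a cache absorbs a given fraction of bytes, the transit link only has to carry the remainder. The sketch below assumes the savings are measured in bytes and apply evenly at peak load; the function name and the 10 Mbit/s demand figure are hypothetical, while the 35% figure is the FTP study result quoted earlier.

```python
def transit_needed(demand_mbps, byte_hit_ratio):
    """Transit bandwidth still required after the cache absorbs its share."""
    if not 0.0 <= byte_hit_ratio <= 1.0:
        raise ValueError("byte_hit_ratio must be between 0 and 1")
    return demand_mbps * (1.0 - byte_hit_ratio)


# Hypothetical ISP with 10 Mbit/s of user demand and a 35% byte hit ratio.
remaining = transit_needed(10.0, 0.35)
print(f"{remaining:.1f} Mbit/s of transit still needed")
```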
What about usage stats? It is a problem. How about adding a single
non-cacheable object to each page so that hits can still be counted?
Donald Neal speaks:
New Zealand has been on the Internet since 1989, initially over a 9600 baud link. PACCOM
provided initial funding, but since then end-users in NZ have paid for all
costs. These costs are generally allocated on a usage basis.
Caching is provided as a paid service. Users get cheaper rates when they
use the cache than when they don't.
There is one national cache in NZ. It is connected to two NLANR caches in the
US and the Australia cache.
This is not strictly a hierarchy. However, it appears that the minimum
savings is about 12% and the maximum 32%. The average is 25.5%.