===========================================================================
CCR Review #43B
Updated Monday 29 Mar 2010 4:27:31pm EDT
---------------------------------------------------------------------------
Paper #43: Investigating the Impact of Service Provider NAT on Residential Broadband Users
---------------------------------------------------------------------------

Timeliness: 4. Hot topic with a considerable amount of active work
Novelty: 3. Distinct addition to the state of the art
Technical correctness: 3. One major or several minor technical errors
Clarity: 4. In good shape, with minor improvements necessary
Recommendation: 3. Reject

===== Summary of contribution =====

This paper is a first attempt at looking into potential SPNAT resource consumption in ISPs, and the dependence of flow expiry timeouts on NAT session table size. The authors use packet traces from the border of a real ISP. They find that table size can be very large, because of a large number of UDP sessions which lasted for a single packet transmission; payload analysis showed that this is almost always due to a single application (BitTorrent). The authors recommend using a small (few seconds) timeout for expiring UDP sessions.

===== Detailed comments =====

The authors have left almost all of the comments in this review unaddressed. It would help if the paper showed some evidence of NAT resource requirements, especially from stateful firewalls in today's ISPs (such devices are common). It would also be useful to know how session table churn affects the NAT, as well as the applications.

---------------------

This paper is a first attempt at looking into potential SPNAT resource consumption in ISPs, and the dependence of flow expiry timeouts on NAT session table size. The authors use packet traces from the border of a real ISP.
They find that table size can be very large, because of a large number of UDP sessions which lasted for a single packet transmission; payload analysis showed that this is almost always due to a single application (BitTorrent). The authors further explore the impact of changing the session expiry timeout for UDP on the table size, and find that bringing it down from 2 min to a couple of seconds significantly changes the distribution of the number of entries. As the authors point out, this can lead to an increase in "churn" in the session table, as well as premature expiry of session entries.

The paper is well written, and explores the SPNAT timeout configuration problem well. There are a few concerns:

- A high-level question concerns the business aspect. The ISP studied did not have an SPNAT implementation, and hence had a suitable public IP address pool for customers. Would this ISP have an incentive to shift to using a small number of NATs, and thus leave most of its IP address pool unused? Maybe a more practical approach for such ISPs involves using more NATs, and thus fewer users per NAT, after considering the cost trade-off?

- It is not uncommon to see transparent but stateful firewalls in ISPs (without NATs) today, deployed to avoid DoS attacks. These firewalls do a job similar to SPNATs. What is the state maintenance and processing overhead in these firewall deployments? Can we reuse any lessons from them?

- Traces: this may be confidential information, but it would help to know the scale of the ISP being studied. How many users? Geography? How representative is the ISP and its customer base? Does the ISP have a firewall (also see next point)?

- It is also mentioned that the ISP did not have an SPNAT implementation when the traces were collected. How significantly do you think the traffic behavioral patterns change when there is a NAT implementation? For example, some applications (such as Skype, BitTorrent, etc.) change their session-initiation behaviour (e.g.
relaying, hole punching) when they are behind a NAT.

- Sec. 5: You recommend that the short session expiry timeout be used only for the first packet in a UDP session. What is the distribution of the number of packets per UDP flow that you have observed in the traces (after considering 1 s and 2 min timeouts)?

- Can you comment on the processing (both CPU and memory) overhead and latency of having a significant "churn" in the session table? These are factors that need to be considered when using shorter timeout values, especially at rates as high as 0.9 transactions/s/subscriber.

Overall, the paper is good, but can be made stronger. A noticeable aspect of the analysis that is missing is a small accompanying study of overheads in a real SPNAT, as detailed above.
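
To make the timeout versus churn question concrete, here is a minimal sketch of the kind of replay I have in mind. It is not the authors' method: it assumes a simple idle-timeout model in which an entry is dropped once no packet has been seen for `idle_timeout` seconds, and the names `Packet`, `simulate_session_table` and `toy_trace`, as well as the synthetic trace itself, are hypothetical. A real SPNAT may expire entries differently.

```python
from collections import namedtuple

# Hypothetical record: time in seconds, flow is any hashable stand-in for a 5-tuple.
Packet = namedtuple("Packet", ["time", "flow"])


def simulate_session_table(packets, idle_timeout):
    """Replay packets through an idle-timeout session table (assumed model).

    Returns (peak_entries, creations): the largest number of concurrent
    entries and the number of entry creations, used here as a churn proxy.
    """
    last_seen = {}  # flow -> time of its most recent packet
    peak = 0
    creations = 0
    for pkt in sorted(packets, key=lambda p: p.time):
        # Drop every entry that has been idle for at least idle_timeout seconds.
        expired = [f for f, t in last_seen.items() if pkt.time - t >= idle_timeout]
        for f in expired:
            del last_seen[f]
        # A packet for an unknown flow forces a new entry to be created.
        if pkt.flow not in last_seen:
            creations += 1
        last_seen[pkt.flow] = pkt.time
        peak = max(peak, len(last_seen))
    return peak, creations


def toy_trace():
    """Synthetic trace (an assumption, not data from the paper):
    100 single-packet flows, one per second, plus one sparse long-lived
    flow that sends a packet every 30 seconds."""
    singles = [Packet(t, ("single", t)) for t in range(100)]
    sparse = [Packet(t, "sparse") for t in (0, 30, 60, 90)]
    return singles + sparse


if __name__ == "__main__":
    for timeout in (120, 2):
        peak, creations = simulate_session_table(toy_trace(), timeout)
        print(f"timeout={timeout}s peak_entries={peak} creations={creations}")
```

On this toy trace the short timeout shrinks the peak table size sharply, but the sparse flow is expired between its packets and has to be re-created each time. Running the same replay on the real traces, together with per-entry creation and deletion cost on an actual device, would answer the overhead question raised above.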