Sunday, December 30, 2007

Yikes, long times between postings

Well, the holidays have certainly made it difficult to keep things updated as regularly as I was hoping. Toss in a family emergency on top of a tight schedule and all bets are off.

A few quick notes for those that do peruse this:

- The INFOCOM 2008 program is now on-line and it is a monster. The inclusion of the mini-conference on the main list kept me busy noting relevant papers for my students to look at. Multicast and classical QoS (that isn't pure theory) outside of wireless are definitely minor niche research areas now. I'll certainly be out in Phoenix for the conference despite not having a paper in the main conference (sigh, yet again). Amusingly enough, we have a very good chance (nearly finalized) of getting Cisco funding for the work that was rejected, so I guess I'll take money over a publication.

- NSDI 08 results are also out. By and large the reviews were very thorough (a nice trait of the USENIX arena of conferences) but it was a near miss for us on our Lockdown work. Unfortunately, a single reviewer whose criticisms and ratings were out of sync with the rest of the reviews pretty much doomed the paper. We certainly made the discussion phase but with only 30 out of 175 papers making it in, a killer review like that is too hard to overcome. Eh, not much one can do in those cases but grin and bear it, and simply note to the students that it does indeed happen to everyone in the field.

Long story short, we targeted ease of use and intuitiveness (i.e. just make it work) and the reviewer wanted novelty and complexity (i.e. simplicity is bad). Certainly fair for a conference like Security and Privacy, but I thought it a bit too pointed for the systems-oriented nature of NSDI. Interestingly enough, a core argument in the paper is that a huge problem in security is that we keep building wonderfully complex systems that are hard to use, and they aren't being used because, well ... they are hard to use. It makes me wonder whether Ethernet, if proposed today (work with me here), would ever have a chance at a major conference or grant review panel. I certainly have some strong opinions on how the networking conference track is going the wrong direction (too much gatekeeping, not enough prospecting) but that is a post for a different day (post tenure, of course). Let's just say I no longer frown upon abstract-only submissions, which other areas of CS use heavily, the way I used to. If I have time, I'll post the reviews on-line along with comment/response notes like the ones we did for the HotNets paper, which I think was a good exercise.

Tuesday, November 27, 2007

Weekly papers - back again

Finally, back with the weekly papers segment after a rough beginning of November. Perhaps it was dodging reactions to the INFOCOM reviews and how things went for various folks. More on that later when I have time to do a lengthy post.

Diversity and multiplexing: a fundamental tradeoff in multiple-antenna channels

This paper came up in our weekly papers meeting two weeks ago, out of discussions about our INFOCOM reviews and the relevance of MIMO to our current work. Transactions on Information Theory is a bit out of our normal purview, so kudos to Dave for taking the time to digest the paper in its entirety.

The paper looked at the tradeoffs in a multi-antenna environment between reliability and capacity. The most relevant portion of the paper is the sharp dropoff along either dimension, i.e. if you try to get both, you will not end up with a solution that is strong in either one. Not exactly a shocking result, but the work in the paper is quite sound and a nice jumping-off point for why our current work on wireless reliability is interesting.
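
If memory serves (so treat this as my paraphrase rather than the paper's exact statement), the headline result is remarkably compact: for M transmit and N receive antennas, the best diversity gain d achievable at multiplexing gain r is

d*(r) = (M - r)(N - r) for integer 0 <= r <= min(M, N),

with the curve piecewise linear between the integer points. Push r toward min(M, N) to chase capacity and d collapses toward zero, which is exactly the dropoff mentioned above.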

Near as I can tell, industry has gone the route of capacity over reliability, meaning that our results regarding channel reliability are especially apt. In short, our most recent work has been looking at whether channel losses are correlated for nodes in close proximity. If loss is primarily from the medium, losses should be correlated in nearby spaces but not necessarily correlated across larger spaces. In contrast, if losses are not correlated in a tight area, it is likely an individual device going crazy, not the medium itself. Most of the works in the literature regarding burstiness and the like seem to trust that the device itself is good and that the packet got corrupted before arriving, not that the device itself may be a significant source of the packet errors.

While previous works such as SRD by Balakrishnan reached a similar conclusion, they did so for quite different reasons. Put simply, their physical sensors were widely scattered (APs 30+ feet apart), allowing multi-path effects to produce different loss probabilities. In contrast, we showed that losses tended to lack correlation even over short distances, regardless of orientation, separation, or heavy background traffic. Moreover, there were also "weird" periodicity aspects to some of the devices that bear further investigation (Intel Centrino chipset).
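
For the curious, the core of the correlation analysis is nothing exotic. A minimal sketch of the idea (the trace format and names here are hypothetical, not our actual tooling): line up per-packet outcomes from two co-located receivers listening to the same stream and correlate their loss indicators.

```python
# Sketch: correlate per-packet loss between two co-located receivers.
# Assumes each trace maps sequence number -> received (True/False) for the
# same broadcast stream; the format and names are illustrative only.

def loss_vector(trace, seq_range):
    """Return a 0/1 loss indicator (1 = lost) for each sequence number."""
    return [0 if trace.get(seq, False) else 1 for seq in seq_range]

def loss_correlation(trace_a, trace_b, seq_range):
    """Pearson correlation of the two loss indicator vectors."""
    a = loss_vector(trace_a, seq_range)
    b = loss_vector(trace_b, seq_range)
    n = len(seq_range)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    if var_a == 0 or var_b == 0:
        return 0.0  # one receiver lost nothing (or everything); correlation undefined
    return cov / (var_a ** 0.5 * var_b ** 0.5)
```

A correlation near 1 points the finger at the medium; a correlation near 0 between receivers sitting inches apart is what makes us suspect the devices themselves.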

Tuesday, October 30, 2007

Saturday, October 20, 2007

New hardware toys

Over the past month or so, I have been searching for low-cost prototype boards for exploring our RIPPS and ScaleBox work. I think I finally have a winner for the near term in the Atmel NGW 100 Network Gateway kit. It has native dual Ethernet support to let me do pass-through operations, with a customized version of Linux already running. It also has a host of other options that are less important for networking but very cool for tinkering (SD slot, USB, I2C, SPI, GPIO). Most important of all, the price is just about perfect at only $89. For a few more dollars, it would have been nice to see a power supply tossed in, but I already had a few 12V supplies lying around from my mass of external drive purchases in January.

The board is quite fascinating both as an implementation platform and potentially as a teaching platform. At sub-$100, it is close to tolerable to have students simply buy one for themselves. For teaching, a JTAG debugger (enabling remote GDB and recovery from hosing the flash) would be essential; at a $320 list price from Digikey it is a bit pricey, but one is not necessary for each and every board. I am a bit concerned about how well it could run our current RIPPS code with only a 133 MHz processor, but that gives some incentive to speed it up / streamline it anyway. Perhaps making a hardware version of the WANRay is finally in order without having to toss on a $700 Cisco PIX box.

Initial forays into the box are very promising. I was really surprised how much stuff is running. Telnet into the box worked right away with ssh and ftp already supported. The box is also running a web server as well as a DHCP server for eth1 (labeled LAN). I did not have a chance to test the serial operations as I could not dig up a serial cable amongst my bank of cables at work.

We'll be putting up a Wiki web fairly soon to collect the various information :)

Friday, October 19, 2007

Weekly Papers - Oct 17

SMACK, aka the Simplified Mandatory Access Control Kernel, by Casey Schaufler attempts to bring MAC (Mandatory Access Control, that is, not the network MAC) to the masses via an LSM in Linux. For those unfamiliar with MAC in the security context, think of everything being labeled, with explicit access control and stricter rules on changing access. The CIPSO network tagging is also interesting, as we had been considering how to convey local context as part of Lockdown during the TCP SYN phase.
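
To make the labeling idea concrete, here is a toy sketch in the SMACK spirit (my own simplification with made-up labels and rules, not Schaufler's actual semantics): a subject may access an object if their labels match or an explicit rule grants the access.

```python
# Toy mandatory access control check, loosely in the SMACK spirit.
# Labels and rules are illustrative; the real LSM has additional special
# labels and semantics not captured here.

# Explicit rules: (subject_label, object_label) -> set of permitted accesses
RULES = {
    ("web", "logs"): {"write", "append"},
    ("admin", "logs"): {"read", "write"},
}

def access_allowed(subject_label, object_label, requested):
    """Allow if labels match exactly; otherwise require an explicit rule."""
    if subject_label == object_label:
        return True
    return requested in RULES.get((subject_label, object_label), set())

print(access_allowed("web", "logs", "write"))    # True, a rule grants it
print(access_allowed("web", "secrets", "read"))  # False, labels differ, no rule
```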

Interesting also that the work is a real, live implementation.

Wednesday, October 10, 2007

Weekly Papers - Oct 10

Playing Devil’s Advocate: Inferring Sensitive Information from Anonymized Network Traces

The paper, by Coull, Wright, Monrose, Collins, and Reiter, appeared in NDSS 2007 and looks at how well state-of-the-art anonymization schemes actually hide identity. With a bit of data mining (clustering), DNS, and search engines, the work attempts to infer identity despite anonymization.

Very cool results demonstrating what I think those who have been skeptical of anonymization have suspected for quite some time.

Weekly Group Papers

An interesting dilemma facing a new assistant professor is how to manage their fledgling research group. During my graduate work, I came from a fairly small group (3 or 4 students maximum) where we primarily had only individual meetings. The meetings with my adviser were largely informal (just drop in) rather than a specific time. Other groups at Iowa State had specific schedules for meetings.

From my experiences as an assistant professor, I have hopped between multiple management styles: group meetings, group and individual meetings, individual-only meetings, seminar meetings, etc. Currently, we have a weekly group meeting, weekly status reports (via e-mail and on the wiki), and at least one meeting (scheduled or not) outside of the group meeting. Near as I can tell, this seems to work alright for the students who are fairly well organized. I have been mulling over making students include written summaries of individual meetings on the wiki but have held off on that. Cristina Nita-Rotaru of Purdue mentioned how she used that to help improve student writing skills.

One of the neat changes that I started in the spring was an outgrowth of the systems seminar. Each week, each student in the group must read and write a quick summary of a current research paper (in area or out) and then discuss that paper briefly in the group meeting. The summaries are posted on our Repository wiki on the NetScale server for full public consumption. Each paper summary should have the appropriate citation info, an abstract, and the DOI link if possible. The students supplement the abstract with commentary regarding the novelty of the work, future papers to follow up on, and discussion relating that work to our own. The specific paper topics are often left up to the student, with occasional suggestions tendered by myself.

Out of all of the various management decisions, this has certainly been one of the most successful. At a minimum, it forces the students to continually keep up on research and build the bibliography for their upcoming thesis or dissertation. The broader effect is that everyone in the group (especially myself) benefits from getting a quick summary of current work going on in the field. For myself, it can be especially challenging to find time to simply read papers outside of my normal review duties. With networking as diversified as it is across so many conferences, I do not doubt for a moment that I am missing insightful work happening outside of the top-tier conferences. I find it quite intellectually stimulating to poke and prod at various works to see how they might relate to ours or could be improved. In some sense, it resembles a conference setting but at a much more rapid pace (6 to 7 papers per week from a more diverse topic pool). Amusingly enough, and perhaps others would agree, I find myself the most productive in terms of new ideas when attending conferences, in part from new views imparted by the speakers but often from simply having time to think in largely uninterrupted blocks (no e-mail, no meetings, no visitors).

In keeping with the spirit of our group discussions, I will try to add in weekly posts regarding the most interesting papers discussed that week with a small bit of personal commentary. If one or two readers (most likely my entire blog reading base, ha) pick up on a more obscure paper and help give that paper a bit of prominence, I will consider my endeavor a success.

Monday, October 1, 2007

HotNets VI Review Results Out

Alas, no HotNets paper for our group this year. I'll be posting our submission on-line to our wiki shortly, as the paper was geared strictly towards HotNets, i.e. primarily opinion / philosophy versus raw technical substance. It was definitely a learning experience (euphemism for definite reject) as we do not usually dive into the philosophical domain with papers. Certainly a fun paper to write though, as we were quite a bit more casual with various bits of puffery throughout the paper. The use of words such as "scurrilous" and phrases such as "having your cake and eating it too" is certainly not typical academese fare.

The executive summary of the paper was fairly simple: sites would love to be centralized, as it makes a host of management / resource issues much simpler, but often do not have the scale to do so. In that context, we described our concept of ScaleBox, which represents the amalgamation of my NSF CAREER work on Transparent Bandwidth Conservation, bringing packet caching, TCP pre-fetching, tail synchronization, and stealth multicast into a single unified architecture. Unfortunately, fitting all of that into only six pages, coupled with various larger-scale musings which I thought were much more profound (does TCP apply when bandwidth conservation is involved, does it work with the current Internet, how should multicast economics really work), is a recipe for disaster. Couple that with thoroughly imbibing one's own Kool-Aid (I was knee deep in writing a DARPA proposal) and an incorrect assumption of reader rapport, and that spells R-E-J-E-C-T.
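
Since packet caching is the piece of ScaleBox easiest to explain in a few lines, here is a rough sketch of the underlying idea (chunking, token format, and names are purely illustrative, not our actual implementation): fingerprint chunks of payload and replace chunks the far side has already seen with short tokens.

```python
# Rough sketch of payload redundancy elimination (the packet caching idea).
# Chunking, token format, and cache policy here are illustrative only.
import hashlib

CHUNK = 64   # bytes per chunk (real systems pick boundaries more cleverly)
cache = {}   # fingerprint -> chunk, assumed synchronized on both ends

def encode(payload: bytes):
    """Replace chunks already in the cache with their short fingerprints."""
    out = []
    for i in range(0, len(payload), CHUNK):
        chunk = payload[i:i + CHUNK]
        fp = hashlib.sha1(chunk).digest()[:8]
        if fp in cache:
            out.append(("ref", fp))        # far side can reconstruct this chunk
        else:
            cache[fp] = chunk
            out.append(("raw", chunk))     # send literally; both sides cache it
    return out

def decode(tokens):
    """Reassemble the payload from raw chunks and cache references."""
    parts = []
    for kind, value in tokens:
        if kind == "raw":
            cache[hashlib.sha1(value).digest()[:8]] = value
            parts.append(value)
        else:
            parts.append(cache[value])
    return b"".join(parts)
```

The mechanism itself is simple; the musings in the paper are about what happens to TCP dynamics and multicast economics once that redundancy quietly disappears from the wire.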

With the wikifying of the submission, I'll also be taking the step of putting the raw reviews themselves on-line. An added bonus is that I get to do a point-by-point rebuttal :) I have long been intrigued by the review process with all of its nuance. The Global Internet Symposium approach of having all reviews signed, with fully public reviews for accepted papers, was quite interesting, with mixed results. Unfortunately, those who took the experiment seriously were not the ones with whom reviewer problems existed in the first place. I would have liked to have seen a bit of a post-mortem on GI via the TCCC mailing list, but perhaps it was discussed at the TCCC meeting at INFOCOM. Posting the raw reviews was a fantastic step that should be encouraged in the community to foster transparency.

Alternatively, the public reviews of SIGCOMM and CCR are a bit of a letdown in my opinion. While it is certainly wonderful as a more junior professor to have a well-established person writing the front article (having Jon Crowcroft write the public review for our edge-to-edge QoS paper, ERM, was a special treat), the public reviews, especially for a conference like SIGCOMM, often seemed to get watered down. Some had reasonable anecdotes from the TPC but most were fairly bland relative to the paper. Given the tight interweaving between accepted SIGCOMM papers and TPC members, I guess this is only natural. Coming at it as a relative outsider, the raw reviews give significantly more confidence in the thoroughness of the process than the rough equivalent of an NSF panel summary.

While I won't muse too much on where conferences in networking are going, as that is best left to the TCCC mailing list or other venues, it is interesting that the philosophy espoused by HotNets is actually the norm outside of systems / networking. Works in progress or abstract-only submissions drive conference programs, rather than conference papers representing completed works in and of themselves. A roommate of one of my graduate students was shocked to find out that conference submissions are actually rejected in our field, and even more shocked when he found out the average acceptance ratios.

It is my humble opinion that we are doing ourselves a disservice by focusing so much on completeness or practicality (perceived or actual) rather than on the potential discussion or growth points of a paper. Perhaps I am a bit more old school, but my perception of conferences was that they were a venue for unfinished work, with the on-site discussion and reviews serving as an incubation testbed for thought-provoking questions. Suffice it to say, I find it a bit troubling that there were more interesting works in terms of posing new questions / opening new research areas at BroadNets than at INFOCOM this past year. The average quality of the papers at INFOCOM was better, but the opportunities for future work seemed considerably fewer. SIGCOMM is a whole different entity that I'll leave for another day.

Thursday, September 27, 2007

Public vs. Private Firewall Policies

One of the mantras in the security world that has perplexed me is the notion that firewall policies should always be kept hidden from everyone else. The rationale, of course, is that if the rules are hidden, it will make it that much harder for the attacker to gain entry into the system.

However, I am not sure I completely buy that rationale. Is it really that much effort to probe for open ports on guarded systems? Does it really take that long, or is it just a few more minutes while one of the countless bots in the botnet does the remote scanning? Moreover, does the detection of said port scan do any good either, with the sheer volume of typical port scanning going on in a day-to-day sense?

From a distributed systems vantage point, the silent failure modes of firewalls can be painful to debug. Sure, if the firewall sent back an ICMP Port Unreachable, that would be great, but most simply sink traffic off into /dev/null, leaving the application to slowly time out, often with abysmal service properties. At what point does the benefit of faster debugging outweigh the "security" benefit of hidden firewall rules? I'm guessing the threshold is much lower than what is typically employed. Debugging a normal system is hard; make the system distributed and it turns quickly into a nightmare.
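
A quick sketch of what the debugging side looks like in practice (a scapy-based probe against a placeholder host; the classifications are the usual rough heuristics, and the labels are mine, not any particular tool's): a SYN+ACK means open, an RST means closed, an ICMP unreachable means a fast reject, and dead silence usually means a drop rule quietly eating the packet while the application waits.

```python
# Rough sketch: classify how a port "fails" so silent firewall drops stand out.
# Requires scapy and raw-socket privileges; host and port below are placeholders.
from scapy.all import IP, TCP, ICMP, sr1

def probe(host, port, timeout=3):
    reply = sr1(IP(dst=host) / TCP(dport=port, flags="S"),
                timeout=timeout, verbose=0)
    if reply is None:
        return "filtered (silent drop) - the slow-timeout case that hurts debugging"
    if reply.haslayer(TCP):
        flags = int(reply[TCP].flags)
        if (flags & 0x12) == 0x12:            # SYN+ACK
            return "open"
        if flags & 0x04:                      # RST
            return "closed"
    if reply.haslayer(ICMP):
        return "rejected (ICMP unreachable) - at least it fails fast"
    return "unclassified"

print(probe("192.0.2.10", 8080))  # placeholder address from the documentation range
```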

To be fair, there are cases where I think hiding rules is appropriate, specifically when access is constrained to a subset of hosts. The important information to hide is not necessarily what ports are exposed but to whom access is granted. One could argue that hiding this information does significantly impede the progress of the attack, as scanning from an arbitrary host gives imprecise information, lending a sort of Byzantine-esque (I use this term loosely, not precisely) quality to information gathering. Then again, there are likely levels in this case as well. A simple 'restrict to the local network' policy (aka the local subnet) really doesn't buy much time or defense, but a 'restrict to an obscure host or bank of admin hosts' policy would potentially improve defenses.

Perhaps there should be a notional ICMP Firewall Denied message to assist with debugging? Likely too problematic for security purposes (reflection DoS attacks) but interesting from a debugging standpoint. The increased tamping down of ICMP messaging (our campus blocks inbound) also likely makes this a non-starter. Perhaps something in TCP? A can of worms, but maybe a nominal TCP options field? Something truly crazy would be the ability to query any host for its firewall ruleset. Crazy indeed.

Monday, September 17, 2007

NC State, IEEE BroadNets

Back to the Midwest after my week long foray to North Carolina.

On Monday, I had a chance to visit Vince Freeh and a few folks at NC State in the Electrical/Computer Engineering and Computer Science departments. Their new building is simply amazing, I must say. I gave a seminar on our ScaleBox work which drew a wonderful attendance despite competing against a speaker on video games who had given a talk earlier that day. Fortunately, it is early in the semester, but I was quite impressed by the sheer number of students we packed into the seminar room. The slides should be posted to the NetScale wiki sometime tomorrow (Tuesday).

Tuesday through Thursday was the IEEE BroadNets conference, a nice smaller conference that evolved out of the former Opticomm, with three tracks on general Internet networking, wireless networking, and optical networking. I mixed quite a bit between the Internet symposium and the wireless networking symposium and unfortunately missed a few good talks here and there. There was a paper on Layer 3 Rogue Wireless Access Point detection that I missed, and I did not have a chance to catch the authors to get a bit more information.

Some of the highlights I thought were:
  • TCP acceleration for low-bandwidth links: The paper that received the best paper award for the Internet symposium focused on neat tricks for improving perceived performance for bandwidth-limited mobile devices, tricks that are certainly timely in today's Internet.
  • eMIST testbed: The testbed was a collection of Java test tools for profiling of Internet connectivity on cell phones out of Kevin Almeroth's group at UCSB. Very neat suite of tools that showed some of the pitfalls when trying to design real-time / delay-sensitive applications (primarily games) for mobile devices.
  • TCP Quick Start: A paper by Scharf analyzed the performance of the Experimental TCP Quick Start RFC. Interesting in that Quick Start seems to share some of the properties of the work in the recent DARPA Control Plane effort led by Tim Gibson as the DARPA PM. While the paper focused exclusively on performance, the core protocol is especially interesting in light of our accelerated admission control schemes. In short, Quick Start probes for capacity via IP Options and then ramps up CWND without slow-start probing (see the sketch after this list). Of course, the use of IP Options is a non-starter for any real deployment, but a gateway or hybrid setup could have quite a bit of promise.
  • PoMo: Interesting work by Griffeon and Calvert of the University of Kentucky and Bhatterjee (sp?) of Maryland. The work is sponsored by FIND and makes a good-faith effort at separating routing from addressing. Plenty remains to be done as the work is still in its infancy, but it is something to keep an eye on.
  • Ethernet vs. IP in the MAN/WAN: Thanks to my adviser, Arun Somani (the chair at ISU, my alma mater), and Gigi Karmous-Edwards, I got roped into serving on a panel on that topic alongside Adel Saleh (DARPA PM), K. K. Ramakrishnan (AT&T), and David Allen (Nortel). I learned quite a bit but unfortunately had to follow David Allen, who is certainly at the forefront of pushing Ethernet out farther. I highly encourage people to track down both David Allen's slides and K. K. Ramakrishnan's slides, as they had interesting perspectives from the ISP and vendor sides. The RFCs governing MAC-in-MAC and PBB-TE are on my list for late-night reading when I have a chance. Also neat is that IS-IS is the link-state protocol of choice due to its independence from addressing. Research-wise, Adel Saleh's slides are perhaps the most thought provoking, as they look at what different permutations might emerge and ask more questions than they answer. Needless to say, I was but a humble assistant professor in the group, but I certainly thank Dr. Karmous-Edwards for the wonderful opportunity to be up there with them. You can catch my slides, with my musings on what the last mile is bringing, on the NetScale wiki.
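
On the Quick Start bullet above, the window math is what makes the idea both tempting and awkward. Going from memory of the experimental RFC (so treat the exact form and constants as approximate), the sender converts the router-approved rate into an initial congestion window roughly as follows:

```python
# Back-of-envelope Quick Start window calculation. Illustrative only; the
# experimental RFC's exact encoding and constants are not reproduced here.

def quick_start_cwnd(approved_rate_bps, rtt_s, mss_bytes, header_bytes=40):
    """Bytes the sender may put in flight once routers approve the rate request."""
    segments = (approved_rate_bps / 8.0) * rtt_s / (mss_bytes + header_bytes)
    return max(1, int(segments)) * mss_bytes

# E.g. a 2 Mbps approved rate over a 200 ms RTT with 1460-byte segments:
print(quick_start_cwnd(2_000_000, 0.2, 1460))   # roughly 48 KB in flight right away,
                                                # versus a 3-4 segment slow start
```

That jump past slow start is the appeal; getting IP Options across today's Internet intact is the catch.
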
All in all, the conference was a very nice small setting where people had excellent opportunities to interact rather than being one among the horde of other attendees (thanks Dr. Rouskas). Next year's BroadNets will be in the UK which should up the European participation significantly and I am guessing will also likely follow the three track formula.

Tuesday, September 11, 2007

Google Talk to iChat

I'm out traveling to IEEE BroadNets for the week and attempting unsuccessfully to connect up to folks using iChat. I'm traveling with my Lenovo tablet as it makes working much easier on the plane (amen to rotating screens and the stylus for doing work).

The use of my tablet of course means that I'm stuck using Windows. While there is an iTunes version for Windows, there is, as of yet, no version of iChat for Windows. Hence, I go with option number two, Google Talk, which in the past has worked alright. During the winter, it became the stand-in for a Ph.D. defense at Western Michigan due to inclement weather. Not great but not too bad either.

Anyway, iChat has been positively wonderful on my Mac at work. I can add Google Talk users quite easily and chat without any major issues. Being naive, I assumed the reverse would be true with Google Talk. Perhaps someone can enlighten me as to why Google Talk, despite using the open Jabber standard, insists that I invite users to get a Google Talk account before I can chat with them? The core Jabber protocol isn't terribly difficult and one would think that Google Talk could intermesh nicely with it. The whole point of the IETF Jabber effort was to standardize these things and get rid of counter-intuitive interactions like this. Near as I could tell, the only way to chat is with the iChat user initiating the conversation, not the Google Talk user.

More later in the day (or tomorrow) regarding my visit to NC State and the first day of the conference...

Wednesday, September 5, 2007

Batch scheduling - grid computing

One of the recent problems we have been looking at is how to correctly schedule batches of tasks in grid computing where the batches contain multiple synchronization barriers. For instance, REM (Replica Exchange Management) from the field of bio-complexity sends out a set of N tasks that are synchronized X times (the replica exchange) over the course of the simulation. In short, at each synchronization point, all N replicas must finish their respective computations and a small subset of the data is then exchanged to help drive the next set of simulations. Loss tolerance in an m,k sense (m out of k must finish) varies depending upon the application, but our current bio-complexity group does not allow for it in their batches.

Currently, the state of the art seems to be employing a catchup cluster of extremely fast machines: lagging execution hosts are noticed and their sub-tasks migrated to the faster catchup cluster. At first glance, this appears to be a rather brute-force mechanism for improving performance. While it is hard to argue that it does not have a benefit (who wouldn't benefit from having an idle cluster of fast machines?), there is some interesting theory to be examined regarding what the optimal schedule would be and what sort of missed opportunity comes from dedicating the catchup cluster. Moreover, there are certainly practical tradeoffs with regards to job migration (network transfer time) that also matter for capturing the tradeoffs correctly. Toss in heterogeneous job execution lengths (due to parameters), heterogeneous processing capacity across the grid, and the potential for job failure (eviction, node crashing, etc.) and it all gets complicated fairly quickly.
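
To make the straggler effect concrete, here is a toy model of a single synchronization barrier (the task-time distribution, speedup, detection threshold, and migration cost are all made-up parameters for illustration, not measurements): the barrier finishes when the slowest replica does, so one laggard dominates, and shipping laggards to a faster catchup node buys back most of that tail.

```python
# Toy model of one replica-exchange barrier with and without a catchup cluster.
# Distributions, speeds, and thresholds are illustrative, not measured values.
import random

random.seed(1)
N = 64                                                            # replicas per barrier
base_times = [random.lognormvariate(3.0, 0.5) for _ in range(N)]  # minutes

def barrier_time(times):
    """A barrier completes only when the slowest replica finishes."""
    return max(times)

def with_catchup(times, threshold_factor=2.0, speedup=4.0, migrate_cost=1.0):
    """Once a replica runs past threshold_factor * median, migrate it to a node
    that is `speedup` times faster, paying a fixed migration cost (in minutes)."""
    median = sorted(times)[len(times) // 2]
    cutoff = threshold_factor * median
    adjusted = []
    for t in times:
        if t > cutoff:
            remaining = t - cutoff
            adjusted.append(cutoff + migrate_cost + remaining / speedup)
        else:
            adjusted.append(t)
    return barrier_time(adjusted)

print("no catchup:   %.1f min" % barrier_time(base_times))
print("with catchup: %.1f min" % with_catchup(base_times))
```

Even this crude model surfaces the questions worth formalizing: where to set the detection threshold, what migration really costs, and whether the dedicated fast nodes would have done more good as ordinary workers.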

A fascinating problem, as it has roots in both grid computing and real-time computing / scheduling (my initial graduate school work). Most interesting is that I think it can draw from some of the properties of the multi-processor EDF (Earliest Deadline First) bound estimation work and may or may not need to employ heuristic-based schemes a la the original Spring kernel scheduling approach. Comments are of course welcome if anyone has work of note in this particular area.

Thursday, August 30, 2007

Welcome

Welcome to my humble entry in the blogosphere. While I certainly won't claim to be quite as eloquent as others out here, my intention is to post various interesting tidbits on topics related to my areas of research, computer networking and computer security.

As to where the title comes from, it is a wordplay on my core research thrust on Transparent Bandwidth Conservation which seeks to eliminate redundancy from network traffic without modifications to the client. More information can be found on my research group webpage at the NetScale Laboratory.