The horror! Scientific code and how not to read your arguments…

Over the years I have seen many, many examples of poor programming practice, usually kludges and quick fixes, but today I saw the most horrible code for reading command-line arguments in a C program ever. I just had to share the horror…

   if ( (argc-1) < 5 ) {
	[ Usage error response code removed]

   /* read in command-line arguments */
   numFiles = (argc-1) - 6;
   sscanf( argv[ numFiles+1 ], "%s", insFileName );
   sscanf( argv[ numFiles+2 ], "%s", outFileName );
   sscanf( argv[ numFiles+3 ], "%d", &outType );
   sscanf( argv[ numFiles+4 ], "%hd", &windowStartTimeCodeword0 );
   sscanf( argv[ numFiles+5 ], "%d", &newStartLine );
   sscanf( argv[ numFiles+6 ], "%d", &newEndLine);

Now, where can I start with this? Erm, I’m a bit dumbfounded actually.

Not only does the argument-count check test for the wrong number (five, when six values are read), but the code then indexes back from the last argument to find the other values! Of course, this means that if the wrong number of arguments is given then the values are put into the wrong variables. Worse, they could be read from memory the process doesn’t own.

And there’s more… it blindly sscanf()s them into variables, never checking a return value or the size of the destination buffers.

Now, you may have seen that if one argument is left off the command line the input file becomes the executable itself and the output file is actually the input data file. This is how this came to my attention. While the program was being debugged for a student, it was found that it wasn’t reading the data correctly… and the data file was mysteriously emptied of its hundreds of megabytes of data each time the program was run. Oops!
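For contrast, here is a rough sketch of how the same arguments could be read more defensively. It is not the original program's fix, just one possible approach: it assumes the leading arguments really are a list of numFiles files followed by the six fixed values, and the `struct options`, `NAME_LEN`, `parse_long()`, `copy_name()` and `parse_args()` names are all made up for illustration. It is also stricter than sscanf(), as it rejects trailing junk such as "12abc".

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define NUM_FIXED_ARGS 6   /* the six values that follow the file list */
#define NAME_LEN 256       /* assumed buffer size; the original does not show it */

/* Hypothetical container for the values the original code reads. */
struct options {
    int   numFiles;
    char  insFileName[NAME_LEN];
    char  outFileName[NAME_LEN];
    int   outType;
    short windowStartTimeCodeword0;
    int   newStartLine;
    int   newEndLine;
};

/* Parse a whole decimal string into [lo, hi]. Returns 0 on success, -1 on error. */
static int parse_long(const char *text, long lo, long hi, long *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE || value < lo || value > hi)
        return -1;
    *out = value;
    return 0;
}

/* Copy a file name only if it fits, including the terminating NUL. */
static int copy_name(char *dst, size_t dstSize, const char *src)
{
    if (strlen(src) >= dstSize)
        return -1;
    strcpy(dst, src);
    return 0;
}

/* Returns 0 and fills *opt on success; returns -1 on any error, in which
   case the contents of *opt are unspecified and the caller prints usage. */
int parse_args(int argc, char **argv, struct options *opt)
{
    char **fixed;
    long outType, codeword, startLine, endLine;

    assert(argv != NULL && opt != NULL);

    /* Check against the number of values actually consumed below. */
    if (argc - 1 < NUM_FIXED_ARGS)
        return -1;

    opt->numFiles = (argc - 1) - NUM_FIXED_ARGS;   /* can no longer go negative */
    fixed = argv + 1 + opt->numFiles;              /* first of the six fixed values */

    if (copy_name(opt->insFileName, sizeof opt->insFileName, fixed[0]) != 0 ||
        copy_name(opt->outFileName, sizeof opt->outFileName, fixed[1]) != 0 ||
        parse_long(fixed[2], INT_MIN, INT_MAX, &outType) != 0 ||
        parse_long(fixed[3], SHRT_MIN, SHRT_MAX, &codeword) != 0 ||
        parse_long(fixed[4], INT_MIN, INT_MAX, &startLine) != 0 ||
        parse_long(fixed[5], INT_MIN, INT_MAX, &endLine) != 0)
        return -1;

    opt->outType = (int)outType;
    opt->windowStartTimeCodeword0 = (short)codeword;
    opt->newStartLine = (int)startLine;
    opt->newEndLine = (int)endLine;
    return 0;
}
```

A main() would call parse_args(argc, argv, &opt) first and bail out with the usage message on a non-zero return, before any file is opened. With one argument missing the call simply fails, rather than quietly treating the data file as the output.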

So, dear readers, have any of you ever seen a worse command line parsing code segment?

IPv4 addresses almost gone, IPv6 not finished yet. Oops!

As has been noted very widely, the last couple of large blocks of Internet Protocol version 4 addresses have been assigned to the local distributors and, rightly, a large number of people have been stating that we need to get ready for the transition to IP version 6.

However, there are a few niggly little problems, caused partly by IPv6’s design and partly by tardy implementation. Neither affects the general public and their edge networks, but both will affect the security and management of more corporate networks.

So, what are these two problems? Well, they’re both to do with network address assignment. One of them is a foolish design decision in the protocol itself, which has a whole host of unintended consequences.

The feature I’m talking about here is stateless address assignment, where a client machine will self-assign its address and self-discover the route out to the wider Internet. On the face of it, it seems like a brilliant idea which will liberate the normal user from worrying about setting up IP addresses and all that tedious and confusing networking stuff. It all “just works”. Brilliant! And, in a perfect world, where everyone is smiley, helpful and trustworthy, it would be. It’s a pity that the real world isn’t like that. Having said that, this doesn’t really affect personal networking within people’s homes, but it does greatly affect the security and policing of corporate networks.
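To make the self-assignment a bit more concrete, here is a minimal sketch of one well-known way a host can build its own address: take the 64-bit prefix advertised by the local router and append an interface identifier derived from the network card's MAC address (the "modified EUI-64" form). This is only one scheme, as many systems use randomised identifiers instead, and the function name `slaac_eui64()` is invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build a 128-bit IPv6 address from a 64-bit prefix and a 48-bit MAC
   address using the modified EUI-64 interface identifier.
   Nobody is asked for permission: the host works this out by itself. */
void slaac_eui64(const uint8_t prefix[8], const uint8_t mac[6], uint8_t addr[16])
{
    assert(prefix != NULL && mac != NULL && addr != NULL);

    memcpy(addr, prefix, 8);      /* upper 64 bits: prefix learned from the router */
    addr[8]  = mac[0] ^ 0x02;     /* flip the universal/local bit */
    addr[9]  = mac[1];
    addr[10] = mac[2];
    addr[11] = 0xFF;              /* 0xFFFE is inserted in the middle of the MAC */
    addr[12] = 0xFE;
    addr[13] = mac[3];
    addr[14] = mac[4];
    addr[15] = mac[5];
}
```

For example, with the documentation prefix 2001:db8:0:1::/64 and a MAC of 00:1a:2b:3c:4d:5e, the result is 2001:db8:0:1:21a:2bff:fe3c:4d5e. Note that no central server appears anywhere in the calculation, which is exactly the problem for anyone trying to police a network.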

At this point it’s probably best to describe how security and policy are implemented, with regard to network addresses and packet routing, in IPv4 networks, so that you can see the contrast with the self-assigned address world of IPv6 and the problems inherent in it. Currently a computer can either be manually assigned an address and network route, which then have to be configured directly on the computer in question, or it can be assigned them automatically from a centrally managed Dynamic Host Configuration Protocol (DHCP) server. In the latter case it’s not only the network address and route information which can be given to the computer but also other information, such as its host name and various other items which it can use to interact correctly with the rest of the network. The centrally managed DHCP server can also tell any computer it doesn’t know (or that the administrators don’t want on the network) to bog off, and hence it gets no network access. Using this very useful system, administrators can assign different outgoing network routes to different sets of client machines, which can help with load balancing and various other advantageous policies that only humans with an overview of the whole network can see.
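As a toy illustration of that central control (and nothing like a real DHCP server), the policy boils down to a table lookup: known machines get an address, a route and a host name chosen by the administrators, and unknown machines get nothing. The `struct lease` layout, the `known_clients` table and `lookup_lease()` are all hypothetical, and the addresses come from the 192.0.2.0/24 documentation range.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of what the central server hands to a known client. */
struct lease {
    const char *mac;       /* client hardware address, lower-case hex */
    const char *address;   /* address the administrators chose for it */
    const char *gateway;   /* outgoing route; can differ per client */
    const char *hostname;
};

/* Example policy table: two clients sent out via two different gateways. */
static const struct lease known_clients[] = {
    { "00:1a:2b:3c:4d:5e", "192.0.2.10", "192.0.2.1", "lab-pc-1" },
    { "00:1a:2b:3c:4d:5f", "192.0.2.11", "192.0.2.2", "lab-pc-2" },
};

/* Return the lease for a known client, or NULL if the client is unknown.
   The comparison is a plain string match, so case matters here. */
const struct lease *lookup_lease(const char *mac)
{
    size_t i;

    assert(mac != NULL);
    for (i = 0; i < sizeof known_clients / sizeof known_clients[0]; i++) {
        if (strcmp(known_clients[i].mac, mac) == 0)
            return &known_clients[i];
    }
    return NULL;   /* unknown client: told to bog off */
}
```

The point of the sketch is simply that every decision sits in one table that the administrators own.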

As you can see, IPv6’s self-assignment of addresses and self-discovery of network routes bypasses all this control. If you add to this certain client operating systems being “helpful”, offering IPv6 clients network tunnels out of the current network to the outside world and offering their services as routers, it becomes a security nightmare, as local outgoing firewall policies and protections are subverted.

Now, this problem has been foreseen, if belatedly, by a group who have, against the uproar of the IPv6 purists, defined an IPv6 version of DHCP. (Note: the purists hate it because it breaks their ideological tenet that all network peers should be equal and free to do as they wish.)

So, surely this means that IPv6 is ready? Erm, no. You see, DHCPv6 is currently only a paper exercise. The technical details have been hammered out and the specification documents (RFCs) have been posted, but there are no implementations out there. Oops!

So, what does this mean for the whole IPv4 to IPv6 transition? Well, it means that internal corporate networks will not be able to change to the new protocol and will be forced to live behind an IPv4 to IPv6 network address translation (NAT) gateway. (Note 2: IPv6 purists cringe even more about this technology. They see NAT as the spawn of the devil, as it stops all peers being equal and being able to talk directly with every other peer.)

I can foresee the transition from IPv4 to IPv6 being a long one. To start with, only the core Internet and those machines which live in the no-man’s-land where external services live will change over to IPv6, and everything else will be behind huge NAT gateways. Internet Service Providers (ISPs), whose customers don’t generally have fixed network addresses anyway, will sit all their customers in IPv4 bubbles and this state of affairs will ossify. All web sites will be forced to use IPv4-compatible addresses.

Eventually, after many years, all the tools and security issues with IPv6 will be sorted out and slowly, very slowly, the corporate world will change its networks one by one, but there will always be “legacy” IPv4 networks in there, well, at least for 20 years or so. For ISPs the transition will be quicker. They’ll probably have to begin with a separate product for IPv6 users, or merely provide IPv6 gateway routers to new customers (quite probably using an IPv4 NAT bubble for the home network to begin with, as quite a bit of embedded A/V equipment will not be IPv6 capable). I can foresee that even this transition will take a good decade. During this time all web servers will have to be on IPv4-mappable addresses.

It’s going to be a very long haul, so expect things to break horribly.