Friday, August 21, 2009

Maybe I was Wrong

Curses of James Gosling’s name were common utterances from my mouth back when I was in school and was forced to use Java. No pointers? No private inheritance? No passing by reference? I detected the evolution of programming was watering it down for the lowest common denominator, and it made me livid.

My attitude was, and to a certain extent still is: leave the dangerous language features in, and if people are stupid enough to use them, that is their problem. Most everybody agrees that multiple inheritance can lead to some major problems; however, it is immature to assume that it is always bad just because you have yet to find a good use for it. A do-while loop appears to be a useless feature at first sight, but once a programmer encounters a situation where he always needs to do something at least once, his tune may change. Thinking in terms of absolutes is dogmatic and ineffectual. My ego is not small; however, even I am not arrogant enough to think that I have iterated through every possible situation to be encountered when creating an application. I am very much against passing by reference; however, when a developer needs to switch object references inside a function and cannot, it is frustrating to say the least.

I always welcome new features to a language that have the potential to make my life easier. I welcomed the implicit typing feature (var) in C# 3 because I assumed that developers would know not to use it as shorthand in production code (as opposed to using it for runtime-created objects); I assumed wrongly. It was easy to sit back and criticize Gosling when I was sitting alone in my basement writing code for my own amusement; however, cursing him is becoming more difficult with each passing day.

A Change of Heart?

Since leaving school and starting work, I have learned one very important lesson: if a language feature exists, programmers will use it. This is because it is always easier to use a feature to work around a mistake than to reengineer the entire program. It is not hard to imagine some programmer 15 years back forgetting to implement a feature in a base object and then using multiple inheritance as a workaround; next, the other members of his team inheriting from his work are greeted with diamonds. Additionally, most programmers are big on conventions. This sometimes means a programmer will knowingly repeat a mistake, or reuse a language feature that was only appropriate in a particular instance. The sometimes irrational tendency of programmers to stick to conventions at all costs can exacerbate the use of a bad language feature, even spreading it across a team like a retrovirus.

My frustration at the removal of “dangerous” language features has not subsided; however, I now understand why it has been done. My realization of some of the more ineffectual habits that programmers can exhibit has perhaps changed my mind on the validity of removing features, but it does not make me like it any more.

Saturday, August 1, 2009

More than you ever wanted to know about void pointers

Void pointers in C and C++ are one of the most misconstrued features in programming. Although there are many pages on the web which explain the basics of void pointers, most are brief and a subcomponent of a more general tutorial. In this post, I hope to give a more in-depth explanation; however, I may lack some of the necessary experience with this “mature” language feature, so if any of the more experienced programmers would like to add some caveats or something else I forgot, please feel welcome to leave your input on the matter.

Long before I knew I wanted to program professionally, I was learning the ropes with some good old Basic. How I yearned for a “variable holder”; this was my term for the feature that would have greatly simplified my life at a time when I knew nothing of OO and functions were still fairly fuzzy. When I found out that “real programmers” don’t use VB I decided to take up teaching myself C# and C++ simultaneously. When I discovered pointers, the C implementation of my “variable holders”, it was love at first sight. They were even better than I had imagined; you could do pointer arithmetic! Not only were they incredibly useful, but they were generally considered dangerous by the general coding community. This danger gave “pointing” a sex appeal I just could not resist; after all, it is not acceptable to point in polite society. Now that you know more than you care to about why I at least imagine myself to be qualified to discuss void pointers, we can move on to the actual void pointing.

The Basics:
A void pointer is a “wildcard” or “rainbow” pointer. More precisely, its value is a memory location whose type is not explicitly known at that point of execution. More concisely, it is a raw memory address. More basically, it points to a location but you don’t know what’s in there. Along with integers and floats, void pointers could be considered one of the few basic data types which modern processors recognize.
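
As a rough illustration of the idea above, here is a minimal C sketch. The helper names (read_int, read_double, demo_basics) are made up for this example; the point is only that one void pointer can hold the address of any object, and that the code reading through it has to supply the type.

```c
#include <assert.h>

/* Hypothetical helper: reads an int through a void pointer.
   The caller must guarantee that p really holds the address of an int. */
int read_int(const void *p)
{
    const int *ip = p;      /* C converts void* to int* implicitly */
    return *ip;
}

/* Hypothetical helper: the same idea for a double. */
double read_double(const void *p)
{
    const double *dp = p;
    return *dp;
}

/* One void pointer can hold the address of any object, one at a time. */
int demo_basics(void)
{
    int    count = 42;
    double ratio = 0.5;

    void *p = &count;                /* no cast needed in this direction */
    assert(read_int(p) == 42);

    p = &ratio;                      /* same pointer, different kind of object */
    assert(read_double(p) == 0.5);

    /* Writing *p here would not compile: there is no type to read. */
    return 1;
}
```

Nothing in the pointer itself records which helper is the right one to call; that knowledge has to live somewhere else in the program.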



Explanation:
When many programmers are first introduced to the void pointer they are appalled by the concept… it seems just a bit too wild. This misconception may stem from people like me who referred to them as “wildcards” in an attempt to convey the concept. Really, this interpretation is inaccurate because in reality they are not wildcards, but rather unknowns. There is a subtle, but very important distinction here because wildcard implies that you can mix and match, which is certainly not the case.

I feel that many coders have images of wild-eyed, lazy, irresponsible amateurs gone geek for cash pop into their minds when void pointers are brought up: “I’m too lazy to declare a type; what’s the worst that could happen?” However, nothing is further from the truth. Using void pointers when they are appropriate is the responsible thing to do, while using a typed pointer when voids are called for is careless.

In programming, just as in life, one of the worst mistakes you can make is being ignorant while filling in the gaps with fairytales. When a person fills in gaps with make-believe they are not motivated to search for the true answers. In code, when you do this you open yourself up to a whole mess of errors. Do you know what happens when you treat a float like an integer? I don’t off the top of my head and I don’t have a compiler handy at the moment; however, I can safely assume I don’t want to find out in a production application. Even if your code works fine without void pointers, what about the person reading your code? What about you a month from now? Even though void pointers don’t actually solve any sort of problem at the machine level, they let us admit our ignorance. When we know that we don’t know, at least we can start looking for the answer. When we place an integer pointer where a void should be we are just telling ourselves dangerous fairytales.
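
For the curious, the float question can be checked with a few lines of C. This is only a sketch under one stated assumption: float is a 32-bit IEEE 754 value, which holds on common desktop platforms but is not promised by the language. The function name float_bits is invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copies the raw bytes of a float into a 32-bit unsigned integer.
   Assumption: float is 32 bits wide and stored in IEEE 754 format. */
uint32_t float_bits(float f)
{
    uint32_t bits = 0;

    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &f, sizeof bits);   /* memcpy takes void pointers */
    return bits;
}
```

Under that assumption, float_bits(1.0f) yields 0x3F800000, which reads as 1065353216 when taken as an integer: nowhere near 1, and exactly the sort of surprise a mistyped pointer can produce.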

Void Pointers in Use:
Here is a quick list of possible uses for void pointers; this list is by no means exhaustive:
  • Pointing to a space of memory whose type may be unknown
  • A pointer whose location is known to purposefully go beyond the bounds of a specified array
  • Pointing to uninitialized memory (malloc)
  • A pointer whose location is going beyond the bounds of the program’s allotted memory space
  • A pointer which is pointing to a memory space known to contain multiple types
  • Implementation of generics
  • And of course, pointing to void functions in the case of a void function pointer
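
To make the “implementation of generics” item a little more concrete, here is a minimal sketch of a type-agnostic swap. The names swap_any and demo_swap are invented for this example, and the sketch assumes the two objects do not overlap and really are size bytes each.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic swap: exchanges two objects of `size` bytes each.
   The function never learns the real type; it only moves bytes. */
void swap_any(void *a, void *b, size_t size)
{
    unsigned char *pa = a;
    unsigned char *pb = b;

    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < size; i++) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}

/* Usage: the same function swaps doubles here and could swap structs too. */
int demo_swap(void)
{
    double x = 1.5;
    double y = -2.0;

    swap_any(&x, &y, sizeof x);
    return x == -2.0 && y == 1.5;
}
```

The standard library’s qsort follows the same pattern: it takes a void pointer to the array plus an element size, and leaves the typing to the comparison function supplied by the caller.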

There are plenty of situations where a void pointer is appropriate. For example, reading a raw stream from a network connection, file, or peripheral device is a good candidate for void pointer usage. Even if you know the structure of the stream, a pointer which is pointed to the whole file should not be typed. The best practice is to have a stream pointed to by a void pointer. Many good programmers prefer a byte pointer when dealing with streams, as all data can be said to boil down to bytes; however, as bytes are treated as integers in C, this is not my preference.
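
A small sketch of the stream case follows. The wire format (one tag byte followed by a 4-byte little-endian length) and the names header, parse_header and demo_parse are all invented for illustration; nothing here comes from a real protocol. The stream arrives as a void pointer and is only given a byte-level view at the moment it is actually read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record header: 1 tag byte, then a 4-byte little-endian length. */
struct header {
    uint8_t  tag;
    uint32_t length;
};

/* Reads a header from the start of an untyped stream.
   Returns 1 on success, 0 if the stream is too short. */
int parse_header(const void *stream, size_t len, struct header *out)
{
    const unsigned char *bytes = stream;   /* byte view, used only for reading */

    assert(stream != NULL && out != NULL);
    if (len < 5)
        return 0;

    out->tag = bytes[0];
    out->length = (uint32_t)bytes[1]
                | (uint32_t)bytes[2] << 8
                | (uint32_t)bytes[3] << 16
                | (uint32_t)bytes[4] << 24;
    return 1;
}

/* Usage: a five-byte stream carrying tag 7 and length 16. */
uint32_t demo_parse(void)
{
    const unsigned char raw[] = { 0x07, 0x10, 0x00, 0x00, 0x00 };
    struct header h;

    return parse_header(raw, sizeof raw, &h) ? h.length : 0;
}
```

Standard functions such as fread and memcpy take void pointers for their buffers for much the same reason.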



Void Pointer Arithmetic:
Dereferencing a void pointer is not allowed. Also, void pointers cannot be operated upon directly with arithmetic like other pointer types. Although this may seem surprising at first, it makes sense when you consider the nature of void pointers. When incrementing or decrementing a pointer, the memory location of the pointer increases or decreases by the size of the pointer’s data type. If an integer is 32 bits, then an incremented integer pointer will increase by 4 memory locations; 32 ÷ 8 = 4. However, a void pointer does not have a type; therefore, there is no logical amount to increment it by. One could argue that it would be logical to increment the void pointer by 8 bits, or one memory position; however, this is no more justified than moving the pointer by the word size of the system. There are several options around this, however, if a programmer should find the need to increment a void pointer. A more unusual method would be to cast to a double void pointer; this will be discussed later. Another option would be to cast to an integer. The most conventional method is to convert the void pointer to either a char pointer or a byte pointer. As a programmer would expect, any type of object pointer can be converted implicitly to a void pointer; however, as it is not safe, converting from a void pointer to another type requires an explicit cast in C++ (C performs this conversion implicitly as well, though many programmers write the cast anyway).
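
The conventional approach can be sketched in a few lines of C. The helper names advance_bytes and demo_advance are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: advances a void pointer by n bytes.
   The arithmetic happens on an unsigned char pointer, and the result
   converts back to void* implicitly. */
void *advance_bytes(void *p, size_t n)
{
    assert(p != NULL);
    return (unsigned char *)p + n;
}

/* Usage: step over two ints, then read the third one. */
int demo_advance(void)
{
    int values[3] = { 10, 20, 30 };
    void *p = values;                        /* int* to void*: implicit */

    p = advance_bytes(p, 2 * sizeof(int));
    return *(int *)p;                        /* cast before dereferencing */
}
```

Some compilers, GCC among them, accept arithmetic directly on void pointers as an extension and treat the element size as 1, but that is not standard C, so the cast is the portable choice.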



Conversion to a char pointer or a byte pointer for void pointer arithmetic is common because on most systems chars and bytes are 8 bits, or one memory position. There is something to be said for convention and readability; still, it is worth noting that on some systems chars or bytes are not 8 bits in size. C and C++ guarantee only that sizeof(char) is 1 and that a char has at least 8 bits. Although this is unlikely to be a concern, it is still a potential problem with the char and byte approach if the program assumes 8-bit steps. If it is essential that a program increments a void pointer by one memory position, then integer casting may be the best method, though it comes at the expense of both convention and convenience.
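
For comparison, here is a sketch of the integer-casting route. It rests on two assumptions worth stating loudly: the platform provides uintptr_t (the type is optional in the C standard), and addresses are flat, so that adding 1 to the integer yields the next memory position. The names advance_by_address and demo_address are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: moves a pointer by n memory positions using
   integer arithmetic instead of pointer arithmetic.
   Assumes uintptr_t exists and the address space is flat. */
void *advance_by_address(void *p, uintptr_t n)
{
    uintptr_t address;

    assert(p != NULL);
    address = (uintptr_t)p;          /* pointer to integer */
    return (void *)(address + n);    /* integer back to pointer */
}

/* Usage: skip two characters of a small buffer. */
int demo_address(void)
{
    char text[] = "void";
    char *third = advance_by_address(text, 2);

    return *third == 'i';
}
```

On mainstream platforms this lands on the same address as the char pointer approach; the difference is mostly one of intent and readability.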

Void Double Pointers:
Even scarier than void pointers can be double pointers of the void variety. A void double pointer is a pointer to a void pointer. Although it may seem logical, it is not a pointer to an unknown pointer. Despite the potential unintuitiveness of this concept, a void double pointer is completely typed. A void pointer can point to an integer or a void double pointer (it’s a type too), but a properly behaved void double pointer can only point to a void pointer. Because the result of dereferencing a void double pointer is typed, pointer arithmetic can be performed on it just like any other pointer.
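
A short sketch may help here. The array name slots and the function demo_void_double_pointer are invented for the example; the point is that a void double pointer walks over void pointers the same way an int pointer walks over ints.

```c
#include <assert.h>

int demo_void_double_pointer(void)
{
    int    a = 1;
    double b = 2.0;
    char   c = 'x';

    void  *slots[3] = { &a, &b, &c };   /* three void pointers in a row */
    void **pp = slots;                  /* points at the first void pointer */
    double *second;

    pp++;                               /* legal: steps by sizeof(void *) */
    assert(pp == &slots[1]);

    second = *pp;                       /* dereferencing yields a void*, a known type */
    return *second == 2.0;
}
```

Note that only the outer level is typed. What each slot points to is still unknown to the compiler, so the final read of the double remains the programmer’s responsibility.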



Like any other double (or higher) pointer, a void double pointer is incremented or decremented by the size of a pointer, which on most systems is the word size. Technically, the prior statement is inaccurate; it will be incremented by the word size of the system targeted by the compiler. If a 64-bit system is running a program generated by a 32-bit compiler, a double pointer will be decremented by 32 bits. However, this caveat is scarcely worth mentioning because in most cases your program logic is only dependent on the target word size.



When a programmer sees some funny operation performed with a double void pointer, this may be a clue that the prior coder was attempting to obtain information about the compile target of a program which will be operating on different types of machines. An example of this would be finding the word size of the target system by taking the difference of two void double pointers after casting them to ints. The second double void pointer’s value is the increment of the first (prior to casting). This type of operation can be performed just as well with any type of double pointer; however, convention dictates it is done with double void pointers. The moral of the story is that a funky operation with a void double pointer is one of many potential clues that a program will be targeting multiple platforms, and thus the programmer should be warned accordingly.
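
The word-size trick described above might look something like the following sketch. One deliberate substitution: the pointers are cast to uintptr_t rather than int, because an int can be narrower than a pointer on 64-bit targets. The function name pointer_stride is invented for the example, and the code assumes uintptr_t is available and addresses are flat.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the number of memory positions a void double pointer moves
   when incremented once, measured by subtracting the two addresses
   as integers. Assumes uintptr_t exists and addresses are flat. */
size_t pointer_stride(void)
{
    void  *slots[2] = { NULL, NULL };
    void **first  = slots;
    void **second = first + 1;        /* the increment of the first */

    assert((uintptr_t)second > (uintptr_t)first);
    return (size_t)((uintptr_t)second - (uintptr_t)first);
}
```

On such platforms the result matches sizeof(void *), which is the more direct way to ask the same question in new code.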