ShaderOp.com

RAII by Example: Implementing GenerateSha1Hash

In the previous installment of the SHA-1 saga I didn’t provide an implementation for the function GenerateSha1Hash. So I thought it might be a good idea to milk the subject even more by coding it up and using the opportunity to demonstrate how the Resource Acquisition Is Initialization idiom (or RAII, not to be confused with Rai) leads to code that is more maintainable and fault-tolerant.

If I were doing something more involved or needed cross-platform portability, I would have been better off with the Crypto++ library or something similar. But instead I’ll be using the Cryptography functions from the Win32 API because they rely on manual acquisition and release of resources through handles, which is a better fit for demonstrating RAII in action.

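I ended up using the following functions: CryptAcquireContext, CryptCreateHash, CryptHashData, CryptGetHashParam, CryptDestroyHash, and CryptReleaseContext.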

This is my initial, RAII-less implementation:

#include "GenerateSha1Hash.h"

#include <windows.h>
#include <string.h> // strlen

int GenerateSha1Hash(const char* data, int8_t** hash, int hashSize)
{
    if (hashSize == 0)
        return 0;

    HCRYPTPROV hCryptProvider = NULL;
    HCRYPTHASH hHash = NULL;

    BOOL status = ::CryptAcquireContext
        ( &hCryptProvider
        , NULL
        , NULL
        , PROV_RSA_FULL
        , 0
        );
    if (status == false)
        return -1;
    status = ::CryptCreateHash(hCryptProvider, CALG_SHA1, 0, 0, &hHash);
    if (status == false)
    {
        ::CryptReleaseContext(hCryptProvider, 0);
        return -1;
    }

    status = ::CryptHashData
        ( hHash
        , reinterpret_cast<const BYTE*>(data)
        , strlen(data) + 1
        , 0
        );

    if (status == false)
    {
        ::CryptDestroyHash(hHash);
        ::CryptReleaseContext(hCryptProvider, 0);
        return -1;
    }

    DWORD hashBufferSize = 0;
    DWORD hashBufferSizeSize = sizeof(DWORD)/sizeof(BYTE);

    status = ::CryptGetHashParam
        ( hHash
        , HP_HASHSIZE
        , reinterpret_cast<BYTE*>(&hashBufferSize)
        , &hashBufferSizeSize
        , 0
        );
    if (status == false)
    {
        ::CryptDestroyHash(hHash);
        ::CryptReleaseContext(hCryptProvider, 0);
        return -1;
    }

    BYTE* hashBuffer = new BYTE[hashBufferSize];
    status = ::CryptGetHashParam
        ( hHash
        , HP_HASHVAL
        , hashBuffer
        , &hashBufferSize
        , 0
        );
    if (status == false)
    {
        delete[] hashBuffer; // allocated with new[], so release with delete[]
        ::CryptDestroyHash(hHash);
        ::CryptReleaseContext(hCryptProvider, 0);
        return -1;
    }
       
    int numberOfBytesToCopy = (static_cast<int>(hashBufferSize) < hashSize)?
                              hashBufferSize : hashSize;
    for (int i = 0; i < numberOfBytesToCopy; ++i)
        (*hash)[i] = hashBuffer[i];

    delete[] hashBuffer; // allocated with new[], so release with delete[]
    ::CryptDestroyHash(hHash);
    ::CryptReleaseContext(hCryptProvider, 0);

    return numberOfBytesToCopy;
}

The mechanics are not that relevant to our example, and anyone could arrive at a similar implementation by reading the MSDN documentation for the functions listed above. But here are the basic, broad strokes:

  1. Ask the Win32 API for a CryptContext.
  2. Ask the CryptContext for a CryptHash that can produce SHA-1 hashes.
  3. Pass the data to be hashed to the CryptHash.
  4. Ask the CryptHash for the number of bytes it will need to store the hash.
  5. Pass a buffer to the CryptHash and ask it to store the hash in the buffer.
  6. Release all resources.

Which is simple enough. What complicates the code is this refrain:

    if (status == false)
    {
        ::CryptReleaseContext(hCryptProvider, 0);
        return -1;
    }

Which later repeats and grows into this:

    if (status == false)
    {
        ::CryptDestroyHash(hHash);
        ::CryptReleaseContext(hCryptProvider, 0);
        return -1;
    }

Finally ending in this off-key crescendo:

    if (status == false)
    {
        delete[] hashBuffer;
        ::CryptDestroyHash(hHash);
        ::CryptReleaseContext(hCryptProvider, 0);
        return -1;
    }

I’m perfectly fine with the if statements being there. If one of those functions fails, then there’s nothing more to do and the code has to return. What I do have a problem with is the bookkeeping code that has to tag along at every step of the way, because it leads to two problems.

The first is maintainability. Repetition and maintainability don’t go well together.

The second and more important issue is this: What about errors I did not account for? What about this line for example:

BYTE* hashBuffer = new BYTE[hashBufferSize];

According to the standard, a call to new could fail by throwing a bad_alloc exception if memory can’t be allocated for one reason or another. Yes, I could surround it with a try-catch block and then repeat the clean up code yet again. But that’s not the point. The point is that exceptions can be thrown from seemingly innocent looking code, and accounting for all such cases is no trivial task.
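To make that failure mode concrete, here is a minimal, portable sketch. AcquireHandle, ReleaseHandle, and MightThrow are hypothetical stand-ins (not Win32 calls) that only keep a count of open handles, so the leak can be observed directly:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for an acquire/release pair such as
// CryptAcquireContext/CryptReleaseContext. They only track a counter.
static int g_openHandles = 0;

int AcquireHandle()
{
    return ++g_openHandles;
}

void ReleaseHandle(int /*handle*/)
{
    assert(g_openHandles > 0);
    --g_openHandles;
}

// Stand-in for any innocent-looking call that can throw, such as new[].
void MightThrow(bool fail)
{
    if (fail)
        throw std::runtime_error("simulated failure");
}

// Manual bookkeeping: the release call is skipped if MightThrow throws.
void ManualCleanup(bool fail)
{
    int handle = AcquireHandle();
    MightThrow(fail);       // an exception here jumps past the release
    ReleaseHandle(handle);
}

// Returns the number of handles still open after one call.
int LeakedHandlesAfterCall(bool fail)
{
    g_openHandles = 0;
    try
    {
        ManualCleanup(fail);
    }
    catch (const std::runtime_error&)
    {
        // The caller can catch the exception, but the handle is already lost.
    }
    return g_openHandles;
}
```

Calling LeakedHandlesAfterCall(false) leaves no handle open, while LeakedHandlesAfterCall(true) leaves one behind, even though the exception itself was caught.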

One might argue that worrying about releasing resources after bad_alloc exceptions is like agonizing over filing the change of address forms at the local post office while the apocalypse is going down in T-minus five minutes.

Similarly one could argue that the chances of any of those cryptography functions failing is next to nil. How dangerous can generating a digest actually be?

I don’t know. And that’s the problem. I can reason, I can theorize, I can guesstimate, but I simply don’t know the actual reliability of those functions, and I wouldn’t even know how to start testing it.

And that which you can’t test, you must guard against.

RAII in Longhand

The best scenario would be to centralize the clean up code somewhere and still maintain robustness against both expected and unexpected errors. That’s where RAII comes in.

Take for example the pair CryptAcquireContext and CryptReleaseContext. If CryptAcquireContext is called successfully then its counterpart CryptReleaseContext must also be called at some point, regardless of what happens in the rest of the code.

In RAII this is done by creating a class that has the sole purpose of calling the clean up code in its destructor. Here’s what such a class might look like in its simplest form:

class CryptContextHandle
{
public:
    explicit CryptContextHandle(HCRYPTPROV cryptContextHandle)
        : m_cryptContextHandle(cryptContextHandle)
    {
    }

    ~CryptContextHandle()
    {
        // Release only if a context was actually acquired.
        if (m_cryptContextHandle != NULL)
            ::CryptReleaseContext(m_cryptContextHandle, 0);
    }

    HCRYPTPROV Handle() const
    {
        return m_cryptContextHandle;
    }

private:
    // Not copyable: a copy would release the same handle twice.
    CryptContextHandle(const CryptContextHandle&);
    CryptContextHandle& operator=(const CryptContextHandle&);

    HCRYPTPROV m_cryptContextHandle;
};

And here’s how to use it:

    HCRYPTPROV hCryptProvider = NULL;
    HCRYPTHASH hHash = NULL;

    BOOL status = ::CryptAcquireContext
        ( &hCryptProvider
        , NULL
        , NULL
        , PROV_RSA_FULL
        , 0
        );
    if (status == false)
        return -1;
    CryptContextHandle cryptContextHandle(hCryptProvider);

Now CryptReleaseContext will be called once the function is left, whether it returns normally or an exception propagates out of it.

Notice that the purpose of the class CryptContextHandle isn’t to wrap the behavior of the Win32 Cryptography API in something more OO-friendly. It has a simpler and much narrower purpose, and the provided implementation satisfies that purpose succinctly.

A similar RAII class needs to be provided for CryptCreateHash and its counterpart CryptDestroyHash, and possibly another one to delete the hashBuffer. This is not a major burden, but these classes have limited utility since they’ll probably never be used elsewhere in the program.
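One way to cut down on the number of one-off classes is a single guard parameterized on the handle type and the release function. The following is only a sketch under assumptions: ScopedHandle, ReleaseContext, and DestroyHash are hypothetical names, and the two release functions are stand-ins that log their calls instead of touching the Win32 API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical generic guard: stores a handle plus the function that
// releases it, and calls that function exactly once in the destructor.
template <typename Handle>
class ScopedHandle
{
public:
    typedef void (*ReleaseFn)(Handle);

    ScopedHandle(Handle handle, ReleaseFn release)
        : m_handle(handle), m_release(release)
    {
    }

    ~ScopedHandle()
    {
        m_release(m_handle);
    }

    Handle Get() const { return m_handle; }

private:
    // Copying would release the same handle twice, so forbid it.
    ScopedHandle(const ScopedHandle&) = delete;
    ScopedHandle& operator=(const ScopedHandle&) = delete;

    Handle m_handle;
    ReleaseFn m_release;
};

// Stand-ins for CryptReleaseContext and CryptDestroyHash that only
// record the order in which they are called.
static std::vector<std::string> g_log;
void ReleaseContext(int /*handle*/) { g_log.push_back("context"); }
void DestroyHash(int /*handle*/) { g_log.push_back("hash"); }

// Acquires two pretend handles and reports the order of their release.
std::vector<std::string> ReleaseOrder()
{
    g_log.clear();
    {
        ScopedHandle<int> context(1, &ReleaseContext);
        ScopedHandle<int> hash(2, &DestroyHash);
        assert(hash.Get() == 2);
    }   // destructors run here, in reverse order of construction
    return g_log;
}
```

The hash is released before the context, which matches the order used in the manual clean up code. With the real API, CryptReleaseContext takes an extra flags argument, so a small adapter function would be needed to fit the single-argument release signature assumed here.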

Wouldn’t it be nice if there were a simpler way yet?

Fortunately, there is.

RAII in Shorthand

Deleting the dynamically allocated hashBuffer can be made trivial by using boost::scoped_array. For the rest of the clean up code I’ll use a bit of magic called BOOST_SCOPE_EXIT. Here’s the revised implementation:

#include "GenerateSha1Hash.h"

#include <windows.h>
#include <string.h> // strlen
#include <algorithm>
#include <boost/scoped_array.hpp>
#include <boost/scope_exit.hpp>

int GenerateSha1Hash(const char* data, int8_t** hash, int hashSize)
{
    if (hashSize <= 0)
        return 0;

    HCRYPTPROV hCryptProvider = NULL;
    HCRYPTHASH hHash = NULL;

    BOOL status = ::CryptAcquireContext
        ( &hCryptProvider
        , NULL
        , NULL
        , PROV_RSA_FULL
        , 0
        );
    if (status == false)
        return -1;
    BOOST_SCOPE_EXIT((&hCryptProvider))
    {
        if (hCryptProvider != NULL)
            ::CryptReleaseContext(hCryptProvider, 0);
    } BOOST_SCOPE_EXIT_END

    status = ::CryptCreateHash(hCryptProvider, CALG_SHA1, 0, 0, &hHash);
    if (status == false)
        return -1;
    BOOST_SCOPE_EXIT((&hHash))
    {
        if (hHash != NULL)
            ::CryptDestroyHash(hHash);
    } BOOST_SCOPE_EXIT_END
   
    status = ::CryptHashData
        (hHash
        , reinterpret_cast<const BYTE*>(data)
        , strlen(data) + 1
        , 0
        );
    if (status == false)
        return -1;

    DWORD hashBufferSize = 0;
    DWORD hashBufferSizeSize = sizeof(DWORD)/sizeof(BYTE);

    status = ::CryptGetHashParam
        ( hHash
        , HP_HASHSIZE
        , reinterpret_cast<BYTE*>(&hashBufferSize)
        , &hashBufferSizeSize
        , 0
        );
    if (status == false)
        return -1;

    boost::scoped_array<BYTE> hashBuffer(new BYTE[hashBufferSize]);
    status = ::CryptGetHashParam
        ( hHash
        , HP_HASHVAL
        , hashBuffer.get()
        , &hashBufferSize
        , 0
        );
    if (status == false)
        return -1;
   
    int numberOfBytesToCopy = (static_cast<int>(hashBufferSize) < hashSize)?
                              hashBufferSize : hashSize;	

    std::copy(hashBuffer.get(), hashBuffer.get() + numberOfBytesToCopy, *hash);

    return numberOfBytesToCopy;
}

This is much better and would have been even shorter if it weren’t for my funky argument formatting style.

Here’s a closer look at one of the juicier bits:

    BOOL status = ::CryptAcquireContext
        ( &hCryptProvider
        , NULL
        , NULL
        , PROV_RSA_FULL
        , 0
        );
    if (status == false)
        return -1;
    BOOST_SCOPE_EXIT((&hCryptProvider))
    {
        if (hCryptProvider)
            ::CryptReleaseContext(hCryptProvider, 0);
    } BOOST_SCOPE_EXIT_END

Here a resource is acquired and the code that releases it is declared right underneath it inside the BOOST_SCOPE_EXIT macro. That block runs upon exit from the scope in which it resides, whether the exit is a normal return or an exception.

The only hurdle is getting past the crusty ((&variableName)) syntax (which is explained in the Scope Exit tutorial). Otherwise, this is easier to maintain since the clean up code sits right next to the corresponding acquisition code and only has to be written once.
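For readers without Boost, a rough idea of what such a macro does can be sketched with a small guard object and C++11 lambdas. This is not how BOOST_SCOPE_EXIT is implemented; ScopeGuard, MakeScopeGuard, and HashWithGuards are hypothetical names, and the strings appended to trace stand in for the real acquire and release calls.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical minimal scope guard: runs the stored callable when the
// guard goes out of scope, much like the body of a BOOST_SCOPE_EXIT block.
template <typename F>
class ScopeGuard
{
public:
    explicit ScopeGuard(F onExit)
        : m_onExit(std::move(onExit)), m_active(true)
    {
    }

    ScopeGuard(ScopeGuard&& other)
        : m_onExit(std::move(other.m_onExit)), m_active(other.m_active)
    {
        other.m_active = false;   // the moved-from guard must not fire
    }

    ~ScopeGuard()
    {
        if (m_active)
            m_onExit();
    }

    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    F m_onExit;
    bool m_active;
};

template <typename F>
ScopeGuard<F> MakeScopeGuard(F onExit)
{
    return ScopeGuard<F>(std::move(onExit));
}

// Simulates a function with two acquisitions and an optional early return.
// Every step is appended to trace so the order can be inspected afterwards.
int HashWithGuards(bool failSecondStep, std::string& trace)
{
    trace += "acquire-context;";
    auto releaseContext = MakeScopeGuard([&trace] { trace += "release-context;"; });

    if (failSecondStep)
        return -1;   // only the context guard exists at this point

    trace += "create-hash;";
    auto destroyHash = MakeScopeGuard([&trace] { trace += "destroy-hash;"; });

    return 0;
}
```

On the success path the trace ends with destroy-hash followed by release-context; on the early return only release-context runs. In both cases the clean up is written once, next to the acquisition it belongs to.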

Summing Up

The RAII idiom rose out of necessity to compensate for missing features in C++ like automatic memory management. But I believe its usefulness extends well beyond that into areas like maintainability and fault tolerance. So much so that similar constructs are built into some languages that already have garbage collection, like C# with its using keyword and Python with its with statement.

Minimizing Header Bloat in C++: An Example

I claimed in my previous post that validating one’s assumptions before committing to a particular solution is a good habit, but I was being slightly hypocritical. The fact of the matter is that I did actually write a SHA-1-based ID generator before doing any tests.

The effort wasn’t a total waste. It reminded me of a subject that I had long forgotten after living in .NET land for so long, and that is the importance of minimizing header file dependencies in one’s code. The problem and the solution are explained nicely in the Google C++ Style Guide, in the section aptly titled “Header File Dependencies.” But the gist of it is to try to use as few #include directives as possible in your own header files.
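As a minimal sketch of what that advice looks like in practice, a header that only mentions a class by reference or pointer can forward-declare it instead of including its header. Widget and Describe are hypothetical names that have nothing to do with the SHA-1 code, and the three "files" are squeezed into one listing for brevity:

```cpp
#include <cassert>

// --- describe.h (hypothetical header) ----------------------------------
// A forward declaration is enough here because the header only mentions
// Widget by reference, so there is no need to #include "widget.h".
class Widget;
int Describe(const Widget& widget);

// --- widget.h (hypothetical header) ------------------------------------
class Widget
{
public:
    explicit Widget(int size) : m_size(size) {}
    int Size() const { return m_size; }

private:
    int m_size;
};

// --- describe.cpp (hypothetical source file) ---------------------------
// Only the implementation file needs the full definition of Widget.
int Describe(const Widget& widget)
{
    assert(widget.Size() >= 0);
    return widget.Size() * 2;
}
```

Code that includes only describe.h no longer has to be recompiled when the private details of Widget change.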

This is often easier said than done, and sometimes leads to interface design choices that might look unintuitive to someone coming from a programming language that has proper support for modules, like C# and Java for instance. But such choices are usually justified.

I’ll try to present such an example here from my own SHA-1 experiment.

I wanted to write a function that takes a string and returns its SHA-1 hash. SHA-1 hashes are 160 bits long, so the return type has to be some sort of array or container.

I could have written something like the following in the header file:

#ifndef GENERATESHA1HASH_H_
#define GENERATESHA1HASH_H_

#include <stdint.h>
#include <vector>
#include <string>

std::vector<int8_t> GenerateSha1Hash(const std::string data);

#endif // GENERATESHA1HASH_H_

But then any code that includes this header will also take a dependency on the headers <vector>, <string>, and <stdint.h>, which will lead to longer compilation times. And no one likes longer compilation times.

The argument can be changed to const char* without any loss of readability or ease of use:

#ifndef GENERATESHA1HASH_H_
#define GENERATESHA1HASH_H_

#include <stdint.h>
#include <vector>

std::vector<int8_t> GenerateSha1Hash(const char* data);

#endif // GENERATESHA1HASH_H_

That’s one less header file to worry about. What about <stdint.h>? Do I really need int8_t in there? Firstly, I don’t want to worry about compiler- and platform-dependent sizes of the native types. Secondly, while it’s possible to return a container of ints and document somewhere that only the lower 8 bits of each int will be used, I would rather have the intent clearly defined in code.

So <stdint.h> stays. I can take solace in the fact that it’s a smaller header file and won’t be too much of a hit on compilation times.

Next up is <vector>. I can opt to use a different container, but that will only mean switching one header for another. Using a shared_array has the same issue. Returning a naked dynamically allocated pointer to the data and expecting the calling code to free it is simply unacceptable and is just plain asking for trouble.

There’s another possibility: Make the calling code carry the burden of providing the storage for the hash:

#include <stdint.h>

void GenerateSha1Hash(const char* data, int8_t** hash, int hashSize);

With this version the caller is expected to pass in a pointer to an array of bytes in hash and the size of the array in hashSize. The two possible issues with this are that the caller has to allocate the memory, and also has to know that a SHA-1 hash will take 20 bytes of storage.

It can be made slightly better:

#include <stdint.h>

int GenerateSha1Hash(const char* data, int8_t** hash, int hashSize);

In this case the function will return -1 in case of errors, the value of hashSize if the output buffer is smaller than the size of the SHA-1 hash, or the actual size of the SHA-1 hash if the output buffer is larger.

This is slightly more resilient. I made the choice to be too forgiving and allow the call to succeed if the buffer is smaller than the size of the generated hash, but I could have chosen to fail if the buffer is too small.

All together now and with some Doxygen documentation thrown in:

#ifndef GENERATESHA1HASH_H_
#define GENERATESHA1HASH_H_

#include <stdint.h>

/// 
/// Generates a 20-byte SHA-1 hash for the provided string.
///
/// @param[in] data the string for which a hash will be generated.
/// @param[out] hash a pointer to the buffer that will receive the hash
/// @param[in] hashSize the size of the buffer
///
/// @return the actual number of bytes stored in <em>hash</em> on success,
///  or -1 on failure.
///
int GenerateSha1Hash(const char* data, int8_t** hash, int hashSize);

#endif // GENERATESHA1HASH_H_
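
From the caller’s side, the contract could be exercised roughly as follows. Since the real implementation isn’t written yet, this sketch uses FakeGenerateSha1Hash, a hypothetical stand-in with the same signature and return values that writes a fixed pattern instead of a real SHA-1 hash; BytesWritten is likewise just an illustrative helper.

```cpp
#include <cassert>
#include <stdint.h>

// Stand-in with the same signature and return contract as GenerateSha1Hash.
// It does NOT compute SHA-1; it writes a fixed 20-byte pattern so that the
// calling convention can be shown without the real implementation.
int FakeGenerateSha1Hash(const char* data, int8_t** hash, int hashSize)
{
    const int kSha1Size = 20;
    if (data == 0 || hash == 0 || *hash == 0)
        return -1;
    if (hashSize <= 0)
        return 0;

    int bytesToCopy = (hashSize < kSha1Size) ? hashSize : kSha1Size;
    for (int i = 0; i < bytesToCopy; ++i)
        (*hash)[i] = static_cast<int8_t>(i);
    return bytesToCopy;
}

// The caller owns the storage and passes the address of a pointer to it.
int BytesWritten(int bufferSize)
{
    int8_t storage[32] = {0};
    assert(bufferSize <= 32);
    int8_t* buffer = storage;
    return FakeGenerateSha1Hash("MyNamespace::MyType", &buffer, bufferSize);
}
```

A 32-byte buffer receives all 20 bytes, a 4-byte buffer receives only the first 4, and the return value tells the caller which of the two happened.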

That will do. Now all that is left is to actually implement the thing. But that’s for later.

Final thoughts

“Who in their right mind would want to worry about the cost of returning an array of bytes?” one might say. And it’s an entirely valid opinion in my view. But there are two important points to consider here:

Firstly, you don’t have to worry about it if you don’t want to. If this were a private function not meant for sharing among applications, and if my application were already using, for example, shared_array all over the place, I would certainly go ahead and just return a shared_array and be done with it.

Secondly, I would argue that having an eye for detail is always a good thing, regardless of programming language. For example, which would be better: A C# method that takes an input parameter of type List<T> or of type IList<T>? And wouldn’t an IEnumerable<T> be better yet?

Most developers these days don’t need to work with C++, and thank goodness for that. But tinkering with different programming languages is a lot like solving crossword puzzles: it keeps the mind active and (arguably) healthy. And C++ is, I think, The New York Times of crossword puzzles.

Bad Assumptions: Hashing Algorithms

I’m doing some prep work for a little personal project I’m working on. For reasons that will hopefully become clear in future posts, I want to do the following:

Given the fully-qualified name of a type (whether it’s a class, struct, or function), generate a unique identifier for it.

The identifier has to have the following properties:

  1. It should fit into 32 bits.
  2. It should be unique only within the scope of a given program.
  3. Speed isn’t a big factor since the results can be cached, but the faster it can be generated the better.

Since the scope of uniqueness I care for is limited, I theorized that the first four bytes of a SHA-1 hash of the fully qualified name of the type should be unique enough in a given program space. It’s an assumption that sounds reasonable, but I had to test it before basing any code on it.

And while I was at it, I decided to test MD5 as well because it’s supposedly faster. I also threw in CRC32 just for laughs, because I reasoned that while it would be the fastest of the three, it would fail miserably at generating unique IDs with a large enough set of types. Or so I thought.

Generating SHA-1 and MD5 using .NET is trivial thanks to the classes in the System.Security.Cryptography namespace, but I had to look elsewhere for CRC32. I finally found an excellent and robust implementation on David Anson’s blog.

To test all three I wrote a little C# console application that went through all the types in the System assembly, computed all three hashes, and took the first four bytes of the hash and stuffed them into a UInt32. The program ends by printing out the total number of types found and the total number of unique IDs generated.
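The core of such a test is small. The following is a C++ sketch of the CRC32 half only, not the C# program described above: Crc32 is a plain bitwise implementation of the common CRC-32 variant (reflected polynomial 0xEDB88320), and CountUniqueIds is a hypothetical helper that counts how many distinct 32-bit IDs a list of names produces.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdint.h>
#include <string>
#include <vector>

// Bitwise CRC-32 using the reflected polynomial 0xEDB88320.
// Slow but short; a table-driven version would be faster.
uint32_t Crc32(const std::string& text)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        crc ^= static_cast<unsigned char>(text[i]);
        for (int bit = 0; bit < 8; ++bit)
        {
            // If the low bit is set, shift and XOR in the polynomial.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Counts how many distinct 32-bit IDs the given names produce.
// If the result is smaller than the number of distinct names,
// at least two names collided.
std::size_t CountUniqueIds(const std::vector<std::string>& names)
{
    std::set<uint32_t> ids;
    for (std::size_t i = 0; i < names.size(); ++i)
        ids.insert(Crc32(names[i]));
    return ids.size();
}
```

Feeding it the de-duplicated list of type names and comparing the result with the size of the list gives the same kind of numbers as the output below.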

Here’s the first run with just the types in the System assembly:

Number of types         = 2779
Number of sha1 hashes   = 2779
Number of md5 hashes    = 2779
Number of crc32 hashes  = 2779

That looks reasonable enough. CRC32 was doing pretty well. But I was sure it wouldn’t take long to break it.

Same test again, but this time with System, System.Xml, and System.Xml.Linq:

Number of types         = 3740
Number of sha1 hashes   = 3740
Number of md5 hashes    = 3740
Number of crc32 hashes  = 3740

I kept adding assemblies and rerunning the test, and still all hashing algorithms managed to produce unique IDs. The final test I ran was this:

Number of types         = 10183
Number of sha1 hashes   = 10183
Number of md5 hashes    = 10183
Number of crc32 hashes  = 10183

10,000 types and still CRC32 was holding its ground. At that point I began to suspect that something was wrong with my code. More drastic testing was required.

So I downloaded Moby Dick, modified the program to run through the file and store every line in a list while removing duplicates and trimming whitespace, and then ran the hashing algorithms on each line in that list:

Number of unique lines  = 18847
Number of sha1 hashes   = 18847
Number of md5 hashes    = 18847
Number of crc32 hashes  = 18847

And again with War and Peace:

Number of unique lines  = 50605
Number of sha1 hashes   = 50604
Number of md5 hashes    = 50605
Number of crc32 hashes  = 50605

Finally one algorithm produced a duplicate, but it’s SHA-1. CRC32 is still chugging along happily.

I wasn’t going to give up until the others broke, so I tried again with The Bible:

Number of unique lines  = 98377
Number of sha1 hashes   = 98377
Number of md5 hashes    = 98376
Number of crc32 hashes  = 98377

The word of God did manage to shake MD5, but not the humble CRC32. Underestimating the meek is never a good idea.

All three books combined together now in one giant 10-megabyte file:

Number of unique lines  = 167309
Number of sha1 hashes   = 167307
Number of md5 hashes    = 167305
Number of crc32 hashes  = 167307

Finally CRC32 cracks, but not before finishing neck and neck with SHA-1.

Conclusion

I don’t know much about statistics and cryptography, so I don’t know what conclusions someone with better knowledge of the subject would draw from all of this. But in my case I’m now more than comfortable assuming that CRC32 will cover my needs quite adequately. I will probably still add some debug-only checks to my code to ensure that no duplicate IDs are generated in my program, but otherwise, and contrary to my initial assumption, CRC32 hashes are unique enough.
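
For what it’s worth, a back-of-the-envelope birthday estimate points the same way. Assuming each 32-bit ID behaves like a uniformly random value (an idealization, and the function names below are hypothetical), n inputs form n(n-1)/2 pairs and each pair collides with probability 1/2^32. That works out to roughly 3.26 expected colliding pairs for the 167,309 combined lines, the same ballpark as the two to four missing IDs observed above, and to only about a 1.2% chance of any collision among 10,183 type names.

```cpp
#include <cassert>
#include <cmath>

// Expected number of colliding pairs when n items are hashed uniformly
// into 2^bits buckets: n * (n - 1) / 2 pairs, each colliding with
// probability 1 / 2^bits.
double ExpectedCollidingPairs(double n, int bits)
{
    assert(n >= 0.0 && bits > 0);
    return n * (n - 1.0) / 2.0 / std::pow(2.0, bits);
}

// Approximate probability that at least one collision occurs,
// using the usual 1 - exp(-expected pairs) approximation.
double ProbabilityOfAnyCollision(double n, int bits)
{
    return 1.0 - std::exp(-ExpectedCollidingPairs(n, bits));
}
```

None of this says anything about how CRC32 behaves on adversarial or highly structured input; it only suggests that the results above are about what any reasonable 32-bit hash would produce at these sizes.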

To VB Programmers: An Apology and an Explanation

Dear Visual Basic Programmers,

I’m one of those developers who prefer their braces curled, which is to say that I’m partial to C-like languages. I started my career programming in C++ and then switched to C# later on.

And although I have worked with good old pre-.NET Visual Basic and VBScript on several projects, and although Visual Basic .NET is now nearly identical to C# in capabilities, I’m still reluctant to touch any VB code, .NET or otherwise, and I find myself more cautious when hiring or working with programmers who still call VB their primary programming language.

But I’m not alone. Eric S. Raymond, author of The Cathedral and the Bazaar, singles out Visual Basic in How to Become a Hacker:

Also, like other Basics, Visual Basic is a poorly-designed language that will teach you bad programming habits. No, don’t ask me to describe them in detail; that explanation would fill a book. Learn a well-designed language instead.

In a blog post about VB, Scott Hanselman had this to say:

Visual Basic programmers, historically, have tended to be a bit long suffering, patiently enduring the wrongs and difficulties of VB while being mocked by the C# folks. "VB’s a toy." "VB’s not performant." "VB programmers aren’t real programmers."

The above agrees with the general vibe that I’ve felt on the Internet and in the workplace over the years. Programmers, especially those working with C-like languages, look down on their VB brethren.

That attitude is often uncalled for. If I remember correctly, Jeff Atwood of Coding Horror and Stack Overflow started his career as a Visual Basic programmer. Rockford Lhotka, author of what I consider the best enterprise architecture book for .NET, also started out as a Visual Basic coder. And I would love to be another Jeff Atwood or Rockford Lhotka.

Clearly a good programmer is a good programmer regardless of which programming language they use. So why pick on Visual Basic?

Maybe part of it is lingering envy. Back in the dark ages when it took 100 lines of C to display an empty window, a VB programmer could get the same by just creating a new project and not writing a single line of code. When Microsoft’s main web development offering was ASP sans the “.NET” part, VBScript was the de facto language for that platform. The closest thing C++ developers had was ATL Server, which I doubt anyone actually used.

I think there’s more to it than just sour grapes, however. But before I go on, I would like to offer two extreme yet relevant anecdotes from my own experience.

Dim what now?

I once worked for a company where I could have sworn that my managers (all three of them) were taking bets on how soon they could break me. Almost every project that was once abandoned found its way to me to shape up and ship out, and a lot of them were horrendous hacks. The final straw came when I was given a utility built in Visual Basic that needed some “very minor modifications” as manager #2 said.

I opened up the project and it contained a single form that looked something like the following:

[Image: a cluttered program window with too many controls on it]

Imagine it in Battleship grey and much, much more cluttered. It was an insult to both “theology and geometry” as Ignatius J. Reilly would say. But let he who has not made an unusable mess of a UI cast the first LOL.

The real horror was those little aqua-colored rectangles next to the text boxes. Those turned out to be label controls that were invisible at runtime. The programmer—and I’m using that term generously—stored some data in the text fields of those labels.

Allow me to repeat: He used invisible label controls to store data.

The only reasonable conclusion is that this person did not understand the concept of variables.

I gave up right there and then, went to manager #3 (since he was on paper my direct manager), and told him that I can’t work on this. They ended up contracting the programmer who originally wrote it to do the required work.

You don’t need a textbox for that

In the same company mentioned above, I was tasked with interviewing candidates for programming positions. I finally settled on a simple programming question: For your programming language of choice, write a subroutine that would take a single string as an argument and would reverse it in place.

Most people I interviewed didn’t know where to even start; the question simply seemed to make no sense to them. The remaining few who provided an answer got it wrong. I can remember only two people who eventually got it right after a nudge in the right direction.

But the most memorable interview was with someone who was MCSD-certified in Visual Basic. Unprovoked, he went on a monologue describing in detail his endless accomplishments, his total awesomeness, and how unfair life was to him.

I had to interrupt him after fifteen minutes or so. I gave him a paper pad and a pen and asked him to write the subroutine described above.

Nothing came. And nothing continued to come for two minutes.

I offered to help. I wrote a skeleton of a Visual Basic subroutine for him and told him to fill in the rest.

Still nothing.

Finally, he began to write, and I quote:

TextBox =

“I don’t think you’ll need a textbox for this,” I said.

He crossed that out, wrote it again, and then we just sat there.

The way I had to end the interview was a sad affair that I will not recount here.

But here’s the thing: That person did not even comprehend that code could exist outside the scope of responding to UI events. Even the most basic structured programming concepts like functions and methods were alien to him.

Of Goats and Sheep

I believe that those two examples are just extreme cases of non-programming goats. Actually, I believe that they are so far up the goat end of the sheep-goat spectrum to qualify as members of the Arian-goat race.

Yet they and people like them who are simply incapable of programming have managed to coast by, jumping from ship to ship, taking away padded résumés and leaving unmaintainable rubble in their wake. The saddest thing is that many of them end up becoming beneficiaries of the Dilbert Principle, probably leading them to hire even more goats since they wouldn’t know any better, and the vicious cycle would continue.

Visual Basic and its ilk of RAD tools weren’t the only culprits that enabled the goats to masquerade as sheep, but they certainly helped.

But goats in sheep’s clothing aside, the damage was also extended to those who could actually program to save their lives. Most developers who stayed within the safety of the Visual Basic bubble were insulated from ever having to know about such trivia as the differences between the stack and the heap, how pointers work, the trade-offs between garbage collection and reference counting, or that reference counting even existed.

The problem is that these trivia do still matter:

Life just gets messier and messier down here in byte-land. Aren’t you glad you don’t have to write in C anymore? We have all these great languages like Perl and Java and VB and XSLT that never make you think of anything like this, they just deal with it, somehow. But occasionally, the plumbing infrastructure sticks up in the middle of the living room, and we have to think about whether to use a String class or a StringBuilder class, or some such distinction, because the compiler is still not smart enough to understand everything about what we’re trying to accomplish and is trying to help us not write inadvertent Shlemiel the Painter algorithms.

-Joel Spolsky, “Back to Basics”

Ignorance of byte-land isn’t something unique to Visual Basic, but there seems to be something about Visual Basic of old that encouraged this type of ignorance. Maybe it was because the abstraction was near leak-free, or maybe it was the IDE that tucked away the details too neatly, or maybe it was the poor design of the language itself, or maybe it was the tiering of developers into the component developers caste that did all the bit twiddling and the application developers caste that did all the dragging and dropping.

In Conclusion

Whatever the reasons were, and whatever the actual scope of this ignorance of byte-land was, it left a lasting impression among the curly-bracers that borders on being a full-blown stereotype: Visual Basic programmers don’t know the basics.

This stereotype is harsh, unfair, and is often too quickly applied to Visual Basic programmers in our community, and I would personally like to see much less of it. But maybe you can now better understand the reasoning behind it.

My Kindle DX Turns Into A Jackson Pollock

After less than three months of ownership, my Kindle DX decided to die, but at least it has done it artfully:

[Image: my Kindle DX showing a broken screen]

It isn’t quite dead. Rather it’s in a perpetual catatonic state. It will happily accept power and will turn its single, dimmed eye yellow until it has had its fill. And it will sometimes show a glimmer of hope in the form of a flickering pixel or two (equivalent to REM I guess).

I wish I could send it back, but assuming that the shipping costs aren’t prohibitive, I’m not sure if the process would work for me, considering that I obtained it through a mail forwarding service.

Anyways, my Kindle 2 still survives. That’s some comfort.

What was good

The only reason I bought the DX was to read technical, large-format books on it, which proved to be a mixed bag. The instant gratification of wireless delivery was the biggest plus for me, especially considering that I have to wait an average of ten days for physical book shipments to actually arrive.

The convenience of carrying around a small library was also a big win. Reading The Economist with the morning coffee, catching up on Pro ASP.NET MVC 2 during breaks, and finally ending the day with some Asimov after watching the semi-daily dose of The Daily Show: all that was nothing but pure reading bliss.

What was bad

But the main reason I bought the DX was for programming books, which looked ridiculously ill-formatted on the diminutive Kindle 2. The DX works better, but not by much. I’ve tried books from O’Reilly Media, Apress, and Manning. And while the content was readable, the formatting was lacking. Code blocks are always rendered as pictures, which often look blurry and break the flow of normal text in weird ways.

Using PDFs instead of the Kindle’s native AZW files fixes all formatting issues, but even on the DX the text ends up looking like fine print: readable, but not without effort. And many charts and diagrams don’t render at all.

Now what?

I would have been tempted to get another Kindle DX had my experience been more positive. For technical books, it’s barely adequate. For everything else, its smaller sibling is a better fit, both in one’s budget and one’s messenger bag.

Now I’m stuck trying to find an alternative. Reading on my main computer at work or at home is not an option. If The Shallows is to be believed (and it does offer some very compelling evidence), I fall into ADHD mode as soon as I’m seated in front of anything that has an email client and a browser.

My netbook is no good because I feel as if I’m viewing content through a peephole whenever I use it for more than half an hour.

All Tablet PCs that I have tried over the years—all three of them in fact—were power-challenged and carrying them in one hand for any extended amount of time required the stamina of a one-handed SAS operative trekking across the Falklands.

Which leaves the iPad. More flexible, probably less ADHD-inducing than a full PC, has half-decent battery life, not that expensive compared to the DX, and, according to at least one report, offers a better reading experience than the Kindle.

But then again, I would be hovering dangerously close to becoming something like this.

Risky.
