
The Perils of Bad C/C++ Schools

Joel Spolsky recently argued against teaching comp-sci courses in Java, saying schools do a great disservice to students and future employers by not teaching C pointers, recursion, and functional programming. I agree on most points, but I must argue that the average school teaching C/C++ isn't doing that great a job, either.

In my day, the Duke CS department taught C++. The data structures course covered what you'd expect, including pointers and recursion. Joel speaks of your program saying "segmentation fault" when you struggle with pointers, but here's the rub: I'm pretty sure most students dink with their program until it stops segfaulting and produces the right output, but they still don't understand pointers.

Pointers

Certainly I didn't master pointers my first year in school. It was during my second year when a master programmer, Ryan Martell (co-author of Bungie's Marathon games), gave me several hours of his time to explain how things really worked. He showed how various pieces of code compiled into assembly language, how the stack and heap work, and exactly how pointers and dereferencing work. It was a revelation -- finally, everything made sense.

Since then I've helped younger friends struggling through comp-sci classes. They'd send me C code that was a mess of asterisks and ampersands, and when I asked them what they were doing, it was clear they simply didn't know. They understood pointers as a concept on a whiteboard, but not how to use them in practice.

There's a certain subset of pointer skills that students just need to know cold. Consider this code:

int* foo()
{
    int buf1[10];
    int *buf2 = new int[10];

    for (int i = 0; i < 10; i++)
    {
        buf1[i] = i;
        buf2[i] = i;
    }

    // ...
}

What's the difference between these two buffers? Where is each allocated? Why could you safely return buf2 from a function but not buf1? Any CS student should be able to answer these offhand.
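For anyone who wants to check their answers, here is a minimal sketch of the distinction. The function names (make_heap_buffer, sum_local_buffer, use_buffers) are made up for illustration, and "stack" here describes how typical implementations store local arrays.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns a heap-allocated buffer of n ints.
// The storage outlives this call, so returning the pointer is safe.
// The caller owns the buffer and must release it with delete[].
int* make_heap_buffer(std::size_t n)
{
    int* buf = new int[n];
    for (std::size_t i = 0; i < n; i++)
    {
        buf[i] = static_cast<int>(i);
    }
    return buf;
}

// Hypothetical helper: uses a local array with automatic storage.
// The array typically lives in this function's stack frame and is gone
// once the function returns, so we return a value computed from it.
// A function that returned the array's address would hand back a
// dangling pointer.
int sum_local_buffer()
{
    int buf[10];
    int sum = 0;
    for (int i = 0; i < 10; i++)
    {
        buf[i] = i;
        sum += buf[i];
    }
    return sum;
}

// Hypothetical usage: exercise both helpers and clean up the heap buffer.
int use_buffers()
{
    int* heap = make_heap_buffer(10);
    assert(heap[9] == 9);
    int last = heap[9];
    delete[] heap; // every new[] needs a matching delete[]
    return last + sum_local_buffer();
}
```

The heap buffer survives the call that created it, which is exactly why somebody has to remember to free it.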

Recursion

When I learned recursion, it seemed that most of my peers "got" it, too. What they didn't get were the gotchas. Question: why can't you use recursion on a data set of arbitrarily large (or unknown) size? Students should know the answer offhand. [Update: see comments about tail call optimization below.]
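As a rough illustration of the gotcha, here is a sketch that computes the length of a linked list both ways. The Node type and function names are hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical singly linked list node.
struct Node
{
    int value;
    Node* next;
};

// Recursive length: unless the compiler optimizes the recursion away,
// each node costs one stack frame, so a long enough list can overflow
// the call stack.
std::size_t length_recursive(const Node* head)
{
    if (head == nullptr)
    {
        return 0;
    }
    return 1 + length_recursive(head->next);
}

// Iterative length: stack usage stays constant no matter how long
// the list is.
std::size_t length_iterative(const Node* head)
{
    std::size_t count = 0;
    for (const Node* n = head; n != nullptr; n = n->next)
    {
        count++;
    }
    return count;
}
```

Both functions return the same answer on small inputs. The difference only shows up when the list is large relative to the available stack, which is why it is easy to miss in a classroom assignment.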

Object-Oriented Programming

This is a tough one. I agree with Joel that perhaps OOP masters are born and not made. In industry I've seen junior programmers who take to OOP like a duck to water, and others with years of experience who still don't get it.

But schools must attempt to teach OOP. We need to teach it, however, with the perspective that you don't "know" or "don't know" how to design object-oriented software. It's a continual learning process, and doing it well takes both concentrated study and years of experience. Even those who understand basic concepts like is-a/has-a won't necessarily get finer points like the costs/benefits of inheritance versus object composition. I'd argue that you can't truly appreciate the finer points until you've worked on real projects for a while -- years, not days or months -- and seen how designs stand up (or fall apart) as a project evolves over many versions.

Debugging

It appears that many (most?) schools don't teach students how to use a debugger. It seems so obvious as to be absurd: the debugger is an invaluable tool for figuring out why, where, and how your program isn't behaving. I also consider it a valuable learning tool: you can use gdb as an interactive shell for exploring what's in your buffers and what pointers are pointing to.
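To make that concrete, here is a small sketch of the kind of exploration I mean. The function fill_squares and the file name squares.cpp are made up for this example; the gdb commands in the comment are standard ones, but treat the session as an outline rather than a transcript.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical function to step through in a debugger:
// fills buf[0..n-1] with the squares 0, 1, 4, 9, ...
void fill_squares(int* buf, std::size_t n)
{
    assert(buf != nullptr);
    for (std::size_t i = 0; i < n; i++)
    {
        buf[i] = static_cast<int>(i * i); // a good line for a breakpoint
    }
}

// A possible gdb session, assuming this lives in squares.cpp together
// with a main() that calls fill_squares, built with:
//   g++ -g -O0 squares.cpp -o squares
//
//   gdb ./squares
//   (gdb) break fill_squares    stop when the function is entered
//   (gdb) run
//   (gdb) print buf             the pointer itself (an address)
//   (gdb) print n
//   (gdb) next                  execute one source line
//   (gdb) print buf[0]@4        first four elements, shown as an array
//   (gdb) x/4dw buf             examine 4 words of memory as decimals
//   (gdb) backtrace             show the call stack
```

Stepping with next and re-running the print commands lets you watch the buffer fill in one element at a time.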

Style

This is perhaps a debatable point, but have you looked at the code written by most college students? It's an awful, just truly awful mess. I cringe when I look back at mine, too. Universities need to grade code by the same standards they apply to English papers. It's not enough that the compiler understands it -- the code should also be readable and consistently formatted to an exacting style guide. Life in industry would be so much better if junior engineers (and senior ones, too) would just write clean code.

Revision Control

Revision control is a topic my CS teachers never touched, yet it's absolutely essential to the day-to-day work of any competent software engineer. Please, please, will somebody start teaching students basic code management practices?

And So On...

Of course it's impossible to teach a student everything -- at some point they need to get into the real world and learn the rest as they go. But it does seem like the average comp-sci major (from a JavaSchool or C/C++ school) is left ill-equipped to work on projects of any consequence.

When it comes down to it, a well-versed engineer should be able to tackle anything thrown at them. If you don't fully understand pointers, how could you ever write a device driver? If you can't use a debugger, how much time do you need to waste before your employer starts to think you're incompetent?

If you're a student now, it's your responsibility to master at least the things Joel mentioned and get exposure to the topics I've added here. I'm sure I've missed some critical points, too -- please do us all a favor and add them to the comments below.

Comments

Q: Why can't you use recursion on a data set of arbitrarily large (or unknown) size?
A: While causing the call stack to overflow is always something to consider with recursion, some compilers can optimize carefully written recursive functions (such that the recursive call comes last) so that the call stack size remains constant.

djc, thanks for your comment, that's good to know. I should also note that other languages (e.g. LISP, Scheme) probably don't have the same kinds of recursion limitations since they're designed to be used that way. I mostly work on embedded systems in C/C++, though, and call stack limitations are a very real consideration.

Dear Josh,

I think you are confusing engineers with scientists. Computer scientists are not concerned with the presentation of code or things like version control. They are more concerned with the theoretical aspects of computation. Computer science is exactly that, a science. And as with most scientific study, it does not have a direct and immediate or obvious application in the real world. Engineers, on the other hand, are required to have more of a practical knowledge of how to get things done in industry.

Joel.

fwiw, gcc performs tail call optimisations.

Joel,

Thanks for noting the difference between pure science and engineering. The problem in this case is that most universities do not have a "software engineering" degree. Aspiring software engineers have to settle for a comp sci degree. As such, if a university is concerned with the future welfare of their students, should they not introduce them to basic practices expected of any working professional?

Josh,

If you haven't read it already, I am sure you'll enjoy the classic with the rather long title:

"Debunking the 'Expensive Procedure Call' Myth, or, Procedure Call Implementations Considered Harmful, or, Lambda: The Ultimate GOTO".

by Guy Lewis Steele, Jr.

http://repository.readscheme.org/ftp/papers/ai-lab-pubs/AIM-443.pdf

I finally got pointers when I did a real C plugin with real uses. Schools focus on difficulty, but not complexity. Throw students a complex, practical problem. It gives them a reason to learn the principles involved. If only CS professors bothered reading up on learning sciences research once in a while. Any basic book (like "How People Learn") teaches this. Search also on problem-based learning, anchored instruction, simulation before instruction, etc. The sad part is that the scientists and engineers are the ones our government pays to improve instruction, when they are the least qualified to know the errors in their instruction.

Thank God, I have a teacher who is really strict about code style ;)

If the "Anonymous" user comes back, I'm hoping he or she posts more info on those books regarding learning processes. Very nice and informative article.

Regarding the costs of recursion: some compilers heap-allocate what would ordinarily be called "stack frames", and this helps prevent stack overflow. But there is no magic in the design of Lisp and Scheme that prevents certain uses of recursion from blowing up in your face. If you write the naive recursive factorial function and try to find 50!, you're going to get 50 stack frames.

However, as others have pointed out, you can sometimes count on compilers and interpreters to optimize tail calls into gotos. In fact, this is sometimes used to express infinite loops, or operations on infinite data structures.
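A small sketch of the difference, with hypothetical function names. Note that C++ does not guarantee tail call optimization; whether the second form actually runs in constant stack space depends on the compiler and its optimization settings.

```cpp
#include <cassert>
#include <cstdint>

// Naive form: the multiplication happens after the recursive call
// returns, so every level needs its own stack frame.
// Results fit in 64 bits only up to 20!.
std::uint64_t factorial_naive(unsigned n)
{
    assert(n <= 20);
    if (n <= 1)
    {
        return 1;
    }
    return n * factorial_naive(n - 1);
}

// Accumulator form: the recursive call is the last thing the function
// does (a tail call), so an optimizing compiler may turn it into a loop.
std::uint64_t factorial_acc(unsigned n, std::uint64_t acc = 1)
{
    assert(n <= 20);
    if (n <= 1)
    {
        return acc;
    }
    return factorial_acc(n - 1, acc * n);
}
```

Both return the same values; only the shape of the recursion differs, and that shape is what gives the compiler the chance to reuse the stack frame.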

And this is all much more advanced stuff than any of Joel's "Java Schools" would teach their students. There's the problem, as I see it: it's not a problem with Java, it's a problem with lack of depth. Introductory programming classes spend so much time on the syntax of a switch-case statement that they don't have time to go into greater depth.

The superficial treatment of recursion can lead to an almost superstitious fear of the technique; I had a TA once who was surprised that I used a recursive algorithm for computing GCDs of two Java ints; he was concerned about speed and the possibility of stack overflow. But the depth of recursion for Euclid's Algorithm is very shallow; in this case, I estimate that it couldn't take more than about 50 recursive calls to produce a result, since the worst case is a pair of consecutive Fibonacci numbers. There was nothing to worry about -- yet this very smart TA worried.
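One way to check the depth claim is to count the calls. This is only a sketch: gcd_counted and its calls parameter are hypothetical, added purely to observe how deep the recursion goes, and C++ int stands in for the Java int mentioned above.

```cpp
#include <cassert>

// Recursive Euclid's Algorithm for non-negative inputs.
// `calls` is incremented once per invocation so the caller can see
// the recursion depth afterwards.
int gcd_counted(int a, int b, int& calls)
{
    assert(a >= 0 && b >= 0);
    calls++;
    if (b == 0)
    {
        return a;
    }
    return gcd_counted(b, a % b, calls);
}
```

Starting with calls set to zero, gcd_counted(48, 18, calls) returns 6 after 4 calls. Consecutive Fibonacci numbers are the classic worst case, and even the largest such pair that fits in a 32-bit signed int keeps the depth in the dozens.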

I say that introductory CS classes should use SICP and just water it down some and curve the grades up if they want more graduates.

There is one problem with the "computer science is not engineering" argument. Most students who get a CS degree then go out and get jobs which are in fact computer programming. Not computer "science", but "engineering".

And those students actually think they are computer programmers because they have a CS degree. They may have some background in computer science, but when it comes to programming they're still just hackers. They'll write up 1000 lines of code, and then hack at that code until it seems to work. Then they go on to write the next 1000 lines of code.

If these students were all going off to do computer science jobs, then we wouldn't care if they used revision control, or had any kind of coding style. We wouldn't care about their code at all, because no one else would be running it, or trying to modify it.

I originally went to school for Physics, but switched to computers because it was hard to find a good job that was actually in "Physics science", and not in "Physics engineering". The same is true in computing, but we proclaim some fantasy that there are hundreds of thousands of computer science jobs that people should be going to college for. But the vast majority of the available jobs are much closer to computer engineering than any kind of science.

"As such, if a university is concerned with the future welfare of their students, should they not introduce them to basic practices expected of any working professional?"

Nope... Universities are a business more interested in their bottom line. Not that they don't care about their students' welfare, they just care less about it compared to making money. It's up to the professors to get on board with what you are saying. The 2 Universities I attended required me to take classes well outside my core. Why? So they could make more money off me. Ask yourself this, why do some Universities require more credit hours to graduate with a BSEE or Comp. Sci. degree than others? Yeah, that's right, money.

my 2 cents

Better yet, why not teach the kiddies how to program in a real language? The convoluted, non-human-readable syntax of C, C++, Java and C# muddies the intellectual waters of large projects. Teach Pascal, Ada, Ada95 or the new Ada2005, and watch your probability of success increase!

@Tim: Actually, I think the language is not important. More important is to know how the machine works and that a programming language gives you a certain level of abstraction. I really learned pointers, like Josh, when I started coding in assembly. And I learned much more than only pointers: the possibilities and limitations of a computer. As a sidenote, assembly still has an abstraction level, which some people consider high. This knowledge has helped me since then to understand how languages work and how language-specific features are likely to be transformed. A language is then "just" a kind of design of an abstraction level, but there is no "real" language for every task or personal taste. Conclusion: The applied knowledge of how a computer works helped me in my day-to-day work regardless of the language. Right now this is Java, Smalltalk and C. (Sorry, I'm no native speaker)

Sorry, I meant not Tim but Phaedrus.

"Joel speaks of your program saying 'segmentation fault' when you struggle with pointers,"

Actually, that's not even half of it. When working with pointers, you're lucky if a mistake leads to a seg fault -- at least then you know right away you've done something wrong. The really tricky thing about pointers in my experience is that half of the mistakes you can make lead, not to a program crash, but to bizarre and unpredictable behavior that shows up downstream in totally unexpected and unrelated places. Debugging this kind of mistake leaves you wishing for an exorcism.

hai Josh,

i am a beginning programmer. before that i was a "salesman". i will share my "thing" here as it matches with "perils" & i found this "thing" many months before i read "perils":

i struggled a lot to become a beginning programmer. i was quite an average student in class, my graduate score is 54% only & later i was a simple, average salesman (yes i came from selling into programming, the day i watched "Hackers" at HBO).

it took me 1 year from a novice to beginner, beginner in the sense of "able to understand variables & functions only, not in the sense of doing real-life coding", i have never done this. anyway, i tried many languages along the way, Python, Ruby, Scheme, Java, Lisp etc & my understanding began to develop the day i picked up an online link of "Programming from the Ground Up" by Jonathan Bartlett. this book teaches assembly language on LIGNUX (my cure for Linux & GNU/Linux :-) using GAS. it was really very hard, i gave up after 25 days but those 25 days + 4 months of struggle with "Practical Common Lisp" built the foundation of me as a programmer, i got my "starting-gun (as Pink Floyd says)" & presently i am onto C++.

my 2 years of experience as a salesman + 3 years as a comp. app. student + now 1.5 years of experience doing programming has convinced me enough that real-life learning, maturity, exceptional & valuable skills, quality of making yourself a craft, all start with the day when you start working on *hard* things, when you start choosing *things* that demand your 100% skill attention, *hard* work that disturbs you mentally, that puts you in an uncomfortable situation for days & months, the work that sucks all of your important personal & family time, that takes away your happy moments... RISE happens then, it is painful BUT quite rewarding too, there is one more bad news ... you have to do this all of your life, at every waking moment. If you do this for 1 year, trust me, it will become a habit & after 3 years of *hardness* you will not feel any pain. Rather if someone tries to take away your *hard* things then you will feel pain, you will become attached to those *things*, i call them *problems*, they are good. In our profession, if you are not facing any problems related to programming, trust me, you are not doing any work, you are just a lazy, incompetent fucking moron. look for the programming that gives you *pain*.

personally i prefer Assembly over C in "perils", i hate both Java & Scheme, both look disgusting to me. right now i am learning C++ without doing C. this is just a personal choice, i feel C is more beautiful than C++ (yes, i did some 'C' from Steve Summit notes). I have created a list of things i will be working on for 1 year:

1.) C++
2.) Common Lisp
3.) LIGNUX Assembly
4.) OOA & D (Grady Booch + many others, it will be years long)
5.) Haskell
6.) real-life coding with GNU projects, of course, maybe HURD/L4, i don't know exactly.

added 2 more, as advised by you, Josh, thanks for them:

7.) Subversion
8.) GDB

thanks for your precious time.
