Why Java? Why Windows 10?

Why Java?

We are doing this autonomous car project during a 4-week summer day-camp, and we -- I -- have chosen Java for you to do it in. Why Java? There are a couple practical reasons, plus a compelling technical reason.

First and foremost, Java is the preferred first programming language taught in high school and college. There are interesting exceptions, but they are in the minority, and for good reason: of all the general-purpose programming languages in widespread use today, Java offers the best uniformity across operating systems and implementations.

Java also has a huge user base: every Android phone out there. That's not as good as being an international standard (like C/C++), but it does mean that the language is fairly stable and new updates are less likely to break existing code: what you wrote last year (and got working) still works the same way today. That is very important for large projects that extend over time -- although we are doing this in four weeks, we don't want to have to come back in January to fix the code before we can show the car off to potential donors or award committees -- and also when you bring in programmers from a variety of backgrounds: the Java they learned in school is the same Java we will be using, with no additional training required.

More important, Java offers the best error-checking of any language in serious use today. The vast majority of the development time in any sizeable programming project is debugging (finding and correcting programming errors), and the more help you can get from the compiler, the less time you will spend stumbling around in the dark looking for why your program doesn't do what you thought you told it to do. We have only four weeks to do an awesome project, and we do not have time to waste on errors the compiler can catch for us while we are typing the code in.

A large portion of the errors a good compiler can catch -- but popular languages like C and Python cannot -- are related to what we call "strong data types": the programmer must declare the data type of every variable and is forbidden from accidentally or maliciously changing the type of the data without also adjusting its internal representation to match. Java does this kind of enforcement better than C, and far better than Python possibly can, though still not as well as better languages now long dead. C++ is a superset of C -- that is, any programming error that is legal in C is also legal in C++, so the compiler cannot flag it as an error -- with the result that while you can write cleaner code in C++ than you can in C, the compiler does not help you do it. You can use a lint tool to do what the compiler should be doing, but almost nobody does, except in large programming shops where efficiency in the production process (also known as profitability) is more important than programmer egos.
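
To make that concrete, here is a minimal sketch of the kind of mistake javac refuses to let past (the class and variable names are just for illustration, not from our project). Python would run the equivalent assignment without complaint and silently change the variable's type; the Java compiler flags it before the program ever runs.

    // TypeCheckDemo.java -- a minimal sketch; names are illustrative only
    public class TypeCheckDemo {
        public static void main(String[] args) {
            int wheelCount = 4;        // declared as an integer, and it stays an integer
            double speed = 2.5;        // declared as floating-point

            // wheelCount = speed;     // javac rejects this line at compile time:
            //                         // "incompatible types: possible lossy conversion from double to int"

            wheelCount = (int) speed;  // the only way through is an explicit cast, so the
                                       // truncation is visible in the code, not an accident
            System.out.println(wheelCount);  // prints 2
        }
    }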

The bottom line is that programming in Java will get us to a working autonomous vehicle program faster than any other language available, and most of our participants already know it.

But what about performance? Java is thought to be slower than C. Yes, it takes longer to start up than languages (like C) that compile directly to native machine code -- because the Java development environment compiles to "byte code" that is then compiled to machine code by a JIT (Just In Time) compiler at run-time -- but the JIT compile happens once, in a few seconds at the beginning of the run, and then the program runs at true native code speed. I wrote some test code (see "Code Efficiency" and TimeTest) to verify the hypothesis, and the atomic operations really do happen at raw CPU clock speed. There are some things Java does that are really slow, but you don't need to be doing those things in your program: just do the same things you would need to do in C/C++ anyway (if you want your code to run fast), and then the really slow stuff like garbage collection won't happen.
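
If you want to see the warm-up effect for yourself, here is a minimal timing sketch in the same spirit as TimeTest (this is an illustration, not the actual TimeTest code): the first pass includes the interpreter and the JIT compile, and the later passes run as compiled machine code.

    // JitTimingSketch.java -- a minimal sketch in the spirit of TimeTest, not the real thing
    public class JitTimingSketch {
        public static void main(String[] args) {
            long sum = 0;  // accumulate something so the JIT cannot throw the loop away
            for (int pass = 0; pass < 5; pass++) {
                long start = System.nanoTime();
                for (int i = 0; i < 100_000_000; i++) {
                    sum += i;                      // a simple arithmetic operation, 100 million times
                }
                long ms = (System.nanoTime() - start) / 1_000_000;
                System.out.println("pass " + pass + ": " + ms + " ms");
            }
            System.out.println("checksum " + sum); // print the result so it really gets computed
        }
    }

On a typical machine the first pass is noticeably slower than the rest; once the per-pass times settle down and stop changing, the JIT is finished and you are watching native-speed execution.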

Java automatically does range-checking, so the most common security failure in C and C++ simply cannot happen in Java. That costs a percentage point or two at runtime, but almost nothing compared to the code the programmer is forced to write manually in C/C++ to get the same safety -- and you don't even need to remember to do it in Java. That greatly improves both your development time and the quality of your code when you are done. The Java designers foolishly chose to allow variable-sized rows in multi-dimensioned arrays (each row is a separate array object), so array access -- including bounds-checking -- is substantially slower in those cases than the more straightforward implementation in C. But you don't need to pay that price: you can pack your multi-dimensioned array into a single-dimensioned array, and then the access is as fast as the machine hardware can do it, with no performance loss at all. And if you don't go to that extra effort, the cost is still small, nothing like the cost of garbage collection, and far less than the cost of crashing in C/C++ from an array access out of bounds.
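
Here is a minimal sketch of what that packing looks like (the array names and image dimensions are illustrative, not taken from our project code):

    // FlatArraySketch.java -- a minimal sketch of packing a 2-D image into a 1-D array
    public class FlatArraySketch {
        public static void main(String[] args) {
            final int rows = 480, cols = 640;        // illustrative camera-image dimensions

            // Java's built-in 2-D form is an array of row arrays, each row a separate object,
            // so pixels2d[row][col] costs two bounds checks plus an extra memory indirection.
            int[][] pixels2d = new int[rows][cols];
            pixels2d[100][200] = 255;

            // Packed 1-D form: one contiguous block, one bounds check, simple arithmetic index.
            int[] pixels = new int[rows * cols];
            pixels[100 * cols + 200] = 255;          // same element as pixels2d[100][200]

            System.out.println(pixels2d[100][200] + " " + pixels[100 * cols + 200]);  // 255 255
        }
    }

The arithmetic index row*cols + col touches one contiguous array with a single bounds check, essentially the same address arithmetic a C compiler generates for a fixed-size 2-D array -- minus the bounds check, which Java keeps.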
 

Why Windows?

I also chose Win10 for this project. The biggest reason is that the hardware we were able to get ran with Win10 drivers. Some of the vendors said they also supported Linux, but (see "Why Not Linux?" below) it was often a lie. Or maybe just ignorance. Windows comes from the largest small-computer operating system vendor in the world, and their system runs on some 90% of the computers in that size range -- a very large number of computers. When something goes wrong -- and it does, often -- those users have a paid-for copy of the system, so they have a natural right to get it fixed (never mind what the End-User License "Agreement" disclaims), and a for-profit company cannot afford to let the bugs go unfixed. So while Win10 and Linux are both written in C, and therefore both start out with more bugs than a downtown walk-up flat, the Windows business model motivates Microsoft to fix them quickly, whereas the Linux business model motivates the developers not to bother. The result is that Windows is sturdier and more stable than any other operating system available for single-board computers. Yes, you hear about more Windows bugs than Linux bugs, but that is because Windows has an order of magnitude bigger market share, with several orders of magnitude more money getting operated on inside those computers, so the Bad Guys concentrate their efforts on Windows rather than Linux. The result is fewer actual bugs in the Win10 system we chose than in a comparable Linux system, which means much less time fighting the operating system and more of our cognitive energy left over for making this an outstanding demo at the end of four weeks.
 

Why Not Linux?

The biggest reason for avoiding Linux is that "you get what you pay for." Windows is a commercial product, so Microsoft spends a lot of its gross income on keeping it healthy and profitable. Linux is "free, as in free beer," so there is no money to be made on it except by consulting experts who are paid for their time; the business model (as in all so-called "Open Source" products) therefore favors making it as opaque and hard to use as possible, so that the inevitable problems result in lots of (hopefully paid-for) calls to Linux experts. Do you want to be constantly paying for that help? Steve doesn't; he hopes to get it for free. But the "Total Cost of Ownership" is much less for a high-volume paid product than for Open Source. Time is money, and we don't have time for that cost in our four weeks. Apple's OSX, being a commercial product, wouldn't have the handicap that Linux has in terms of support, but its market share is down there somewhere close to Linux, so there is no commercial incentive for Apple to make it available for fun projects like our autonomous cars.

It gets worse. There is only one Win10. The Bad Guys love that, because when they find a flaw, they can exploit it everywhere. There are a zillion Linux distributions, all incompatible. If Linux is ever found to be less vulnerable than Windows, it's because an exploit that works on one version of Linux probably won't work on the next version -- nor even on the next build of the same version, because the code is in different places -- so the Bad Guys must tailor their attack to the particular build of Linux the target is running. That's too much work. It's also too much work for the hardware vendors, so most of their drivers only work on a small number of Linux distributions. Which is why you can use their hardware on Windows and (probably) not on Linux, even when they said you could. Being a commercial product makes OSX better than Linux in this respect, but not by much.

Then there's basic raw speed. There are two kinds of speed we are concerned with: the wall-clock time it takes to get something done (remember, we have four weeks, start to finish), and the time it takes for a single run. We'll start with the wall-clock time.

Something over three decades ago -- after the Macintosh came out and everybody realized that computers could be easy to use, but before Windows or Linux existed, when the PCs all ran DOS (a command-line system in the same mold as unix, but not as big) -- one of the magazines did a careful timed study of doing the same tasks on the PC and the Mac. Some of the things they tested on the Mac couldn't even be done on a PC (now it's the other way around, but OSX is not a true Macintosh, rather another unix clone like Linux). For everything else, the users in their tests consistently reported that DOS was faster than the Mac, even though the actual time on a stopwatch showed the Mac generally about 10% faster. The authors of the article thought about that for a while and came up with the insight that the DOS users were constantly typing in complex commands on the command line, or else thinking about what they had to type next, whereas the Mac users just sat there waiting for the computer. Time goes by faster when you are busy than when you are waiting ("A watched pot never boils"). You see the same effect on southern California freeways during rush hour: people get off a freeway that is rolling along at 30mph to drive at 15 or 20mph on a back route that is 50% longer "because it's faster." They really do that. They are crazy down there. The command line is a lot more tedious and error-prone than mousing over and clicking, so it takes far longer -- numerous times I have stood behind somebody convinced he's "a fast typist" and watched him spend half of his time pounding on the backspace key and retyping most of the command, taking more time than it would have taken to lift his hand off the keyboard, mouse over to an icon, and 2-click it. That's not just in Linux; they do it on OSX and Windoze too, because they can. That -- together with the fact that Mac software "just worked" and needed no profitable (for the vendors) upgrades -- killed the MacOS and got it replaced with the older, buggier, and slower OSX (=unix). Not all of our participants last year took advantage of the performance improvement of mousing over the command line, but many of them did (probably mostly out of ignorance: I suspect that programmers are sometimes just as foolish as LA drivers ;-)

Because there is a command line, the system vendors have no motivation to build clickable software tools. Why should they? It's easier -- and (so they think) "faster" -- for them to use the command line. So every Linux user (and many OSX users, for the same reason) is stuck with a command line that is necessarily slower than a mouse click. You cannot type as fast as the computer can load a program and start it running. Do the math: if you type at a nominal 40wpm (about 4 keystrokes per second, less because there are so many special characters in a command line, and far less when you factor in the backing up and retyping), and if the average command is about 25 characters long, then you have taken at least six seconds -- more often 10-15 seconds -- to type a command, when you could have moused over to an icon and 2-clicked it in three seconds. Oh, by the way, the mouse is also faster than fingering a trackpad. Doesn't anybody measure this stuff?

Then there's the actual run-time loss induced by the command line. Since the vendors provide no clickable tools, the de rigueur way to get a complex job done is to type multiple command lines -- veeerry sloooowly (but it feels fast). The more clever programmers take the time to gather their multiple command lines into a shell script, but that is too much like programming for the average user (and also for most programmers, who tend to underestimate the number of times they will need to go through the process). The shell script is interpreted: the command line interpreter parses each line one character at a time, just as if the computer had nothing better to do than wait for the user to type it in, then goes and finds the tiny little snippet of program that constitutes the smarts for that command, loads it into memory and makes it runnable, actually runs it, and then returns to the command line interpreter -- which is probably still in memory, but may need to be swapped back in if other things bumped it out -- for the next line in the script. Unless you are running a quiet solid-state drive, listen to your computer the next time you run a shell script: do you hear the hard drive chatter? Each click in that rattle is one or two disk accesses (you are hearing the actuator moving the read head to a different place on the platter). You cannot hear the difference between the command-line processing and the access time, but it's there, and your sequence of command lines is being interpreted by the computer much more slowly than a single well-written monolithic program, compiled into machine language, would take to do the same job from a single mouse activation. But that well-written monolithic program doesn't exist, because the computer has command lines instead. That's what Linux does for you.

I won't even get into the problems that students experience if they come into this program at a lower comfort level on their computers. The command line, with its rigid insistence on absolute precision of typing -- and unix traditionally does not tell you what you did wrong when (not if) you make a mistake -- is intimidating and frustrating to novice users. Remember that well-written monolithic program? The one written by a professional programmer rather than a noob just getting started, a professional with a compiler that does error checking and tells him not only where his mistakes are, but also why. If ordinary Linux users rewrote their shell scripts as monolithic programs compiled into machine language, they would get better throughput on their computers (and so would everybody else). But that won't happen, because the computer has command lines.

As for older, experienced users like me, who know how to get a job done and just want to do it, Linux (like every other variant of unix) forces us to do it the slow, error-prone command-line way, because the unixies don't have the cojones to take out a stopwatch, measure the time it takes to use the command line, and then do the Right Thing, which is to throw it away and get a more powerful system. I learned computing on a command line, twenty years before a GUI existed. When the Mac arrived on my desk, my productivity took such a leap forward that I never wanted to go back. Linux is a giant leap backward from the (original) Macintosh, back to the 19th-century days of DOS. OSX is a slightly smaller leap backwards, and Windows only a (large) step backwards. They all have command lines, but Windows is powerful enough that most users don't even know it has one. Computer power is where the computer does the work and the users sit there and wait for it, not the other way around; in Linux the users do the heavy lifting and the computer sits around and waits for them. Linux makes its users feel powerful, the way a pick and shovel make a ditch-digger feel powerful -- but a Bobcat digs a bigger hole faster. We have four weeks, and Win10 is our Bobcat.

If you get a job where you are paid by the hour instead of by the results, you might foolishly be tempted to slow your rate of delivery down -- it doesn't work: your supervisor knows who the slackers are, and they will be the first to go when the company needs to "downsize" -- but for most of my career I was paid by the job, so the faster I produced, the better I got paid. That's true of everybody, just not as obviously when you are on a salary.

The bottom line is that programming the car in Win10 will get us to a working autonomous vehicle faster than any other operating system available, and most of our participants already know it.

Tom Pittman
Rev. 2019 May 29
 

Links:

Computer Power
Organic Food and Linux
C++ Considered Harmful
The Problem with 21st Century AI