First and foremost, Java is the preferred first programming language taught in high school and college. There are interesting exceptions, but they are in the minority, because of all the general-purpose programming languages in widespread use today, Java offers the best uniformity across operating systems and implementations.
Java also has a huge user base: every Android phone out there. That's not as good as being an international standard (like C/C++), but it does mean that the language is fairly stable and new updates are less likely to break existing code. What you wrote (and got working) last year still works the same. That's very important for large projects that extend over time -- although we are doing this in four weeks, we don't want to have to come back in January to fix it before we can show the car off to potential donors or award committees -- and also when you bring in programmers from a variety of backgrounds: the Java they learned in school is the same Java we will be using, no additional training required.
More important, Java offers the best error-checking of any language
in serious use today. The vast majority of the development time in any
sizeable programming project is debugging (finding and correcting programming
errors), and the more help you can get from the compiler, the less time
you will spend stumbling around in the dark looking for why your program
doesn't do what you thought you told it to do. We have only four weeks
to do an awesome project, and we do not have time to waste on errors the
compiler can catch for us while we are typing the code in.
A large portion of the errors a good compiler can catch -- but which popular languages like C and Python let through -- are related to what we call "strong data types," where the programmer must declare the data type of every variable and is forbidden from accidentally or maliciously changing the type of the data without also adjusting its internal representation to match. Java does this kind of enforcement better than C and far better than Python possibly can, though not nearly as well as some better languages now long dead. C++ is (very nearly) a superset of C -- that is, almost any programming error that is legal in C is also legal in C++, so the compiler cannot flag it as an error -- with the result that while you can write cleaner code in C++ than you can in C, the compiler does not help you do it. You can use Lint to do what the compiler should be doing, but almost nobody does, except in large programming shops where efficiency in the production process (also known as profitability) is more important than programmer egos.
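Here is a small sketch of what that enforcement looks like in practice. The class and method names (StrongTypes, truncate, sum) are made up for this illustration, and the two commented-out lines are the ones the Java compiler refuses to accept; everything else compiles and runs.

```java
import java.util.ArrayList;
import java.util.List;

public class StrongTypes {
    // Narrowing a double to an int must be spelled out with a cast;
    // a plain "return d;" here is rejected by the compiler.
    public static int truncate(double d) {
        return (int) d;
    }

    // A List<Integer> holds only Integer values, so the loop below
    // never has to wonder what kind of data it is adding up.
    public static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        double speed = 3.9;
        // int bad = speed;          // compile error: possible lossy conversion
        int whole = truncate(speed); // explicit cast: the representation changes too

        List<Integer> values = new ArrayList<>();
        values.add(1);
        values.add(2);
        // values.add("three");      // compile error: a String is not an Integer

        System.out.println(whole + " " + sum(values)); // prints "3 3"
    }
}
```

Uncomment either of the flagged lines and the error shows up at compile time, before the program ever runs.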
The bottom line is that programming in Java will get us to a working
autonomous vehicle program faster than any other language available, and
most of our participants already know it.
But what about performance? Java is thought to be slower than C. Yes, it takes longer to start up than languages (like C) that compile directly to native machine code -- because the Java development environment compiles to "byte code" that is then compiled to machine code by a JIT (Just In Time) compiler at run-time -- but the JIT compile happens once, within a few seconds at the beginning of the run, and then the program runs at true native code speed. I wrote some test code (see "Code Efficiency" and TimeTest) to verify the hypothesis, and the atomic operations really do happen at raw CPU clock speed. There are some things Java does that are really slow, but you don't need to be doing those things in your program; you just need to do the same things you would need to do in C/C++ anyway (if you want your code to run fast), and then the really slow stuff like garbage collection won't happen.
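If you want to see the start-up effect for yourself, something like the following rough sketch will do. This is not the TimeTest code mentioned above, just a hypothetical stand-in: the class name WarmupTiming, the loop body, and the pass count are all assumptions chosen for illustration. It times the same simple loop several times in one run; on a typical JVM the first pass or two tend to be slower than the later ones, but the actual numbers depend on your machine and Java version, so none are shown here.

```java
public class WarmupTiming {
    // A plain integer loop standing in for simple arithmetic operations.
    // It allocates no objects, so it gives the garbage collector nothing to do.
    public static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        final int n = 10_000_000;
        for (int pass = 1; pass <= 5; pass++) {
            long start = System.nanoTime();
            long result = sumOfSquares(n);
            long micros = (System.nanoTime() - start) / 1_000;
            // Printing the result keeps the JIT from discarding the loop as unused.
            System.out.println("pass " + pass + ": " + micros + " us, result " + result);
        }
    }
}
```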
Java automatically does range-checking so the most common security failure
in C and C++ simply cannot happen in Java. That costs a percentage point
or two at runtime, but almost nothing compared to the same code the programmer
is forced to write manually in C/C++ to get the same code safety, and you
don't even need to remember to do it in Java. That greatly improves both
your development time and the quality of your code when you are done. The
Java designers foolishly chose to allow variable-sized slices in multi-dimensioned
arrays, so their array access (including bounds-checking) in those cases
is substantially slower than a more straightforward implementation in
C, but you don't need to do that: you can pack your multiple-dimensioned
arrays into a single-dimensioned array, and then the access is as fast
as the machine hardware can do it, no performance loss at all. If you don't
go to that extra effort, the cost is small, nothing like the cost of garbage
collection, and far less than the cost of crashing in C/C++ from array
access out of bounds.
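A minimal sketch of that packing trick, assuming the usual row-major layout (row * cols + col); the class name FlatGrid and its methods are invented for this example. It also shows Java's automatic range check at work, along with one caveat of packing: the check covers the single packed array as a whole, not each dimension separately.

```java
public class FlatGrid {
    public final int rows;
    public final int cols;
    public final int[] cells; // rows * cols values, stored one row after another

    public FlatGrid(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.cells = new int[rows * cols];
    }

    // Row-major packing: one multiply and one add, then one array access.
    public int index(int row, int col) {
        return row * cols + col;
    }

    public int get(int row, int col) {
        return cells[index(row, col)];
    }

    public void set(int row, int col, int value) {
        cells[index(row, col)] = value;
    }

    // True if Java's built-in range check rejected the access.
    public static boolean rejectsOutOfRange(FlatGrid g, int row, int col) {
        try {
            g.get(row, col);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        FlatGrid g = new FlatGrid(3, 4);
        g.set(2, 1, 42);
        System.out.println(g.get(2, 1));                // prints 42 (stored at cells[9])
        System.out.println(rejectsOutOfRange(g, 3, 0)); // prints true: index 12 is past the end
        // Caveat: row 0, col 4 packs to index 4, which is inside the array,
        // so the automatic check does not catch a too-large column here.
        System.out.println(rejectsOutOfRange(g, 0, 4)); // prints false
    }
}
```

If you need each dimension checked separately, you have to add that test yourself, the same as you would in C.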
It gets worse. There is only one Win10. The Bad Guys love that, because
when they find a flaw, they can exploit it everywhere. There are a zillion
Linux distributions, all incompatible. If Linux is ever found to be less
vulnerable than Windows, it's because an attack on a flaw in one version of
Linux probably won't work in the next version -- nor even in the next build of
the same version, because the code is in different places -- so the Bad Guys
must make their attack specific to the particular build of Linux it is running
on. That's too
much work. It's also too much work for the hardware vendors, so most of
their drivers only work in a small number of Linux distributions. Which
is why you can use their hardware on Windows and (probably) not on Linux,
even when they said you can. Being a commercial product makes OSX
better than Linux, but not by much.
Then there's basic raw speed. There are two kinds of speed that we are concerned with: the wall-clock time it takes to get something done (remember, we have four weeks, start to finish), and the time it takes for a single run. We'll start with the wall-clock time.
Something over three decades ago -- after the Macintosh came out and everybody realized that computers could be easy to use, but before Windows or Linux existed and the PCs all ran DOS (a command-line system like Linux, but not as big) -- one of the magazines did a careful timed study of doing the same tasks on the PC and the Mac. Some of the things they tested on the Mac couldn't even be done on a PC (now it's the other way around, but OSX is not a true Macintosh, but rather another unix clone like Linux). For everything else, the users in their tests consistently reported that DOS was faster than the Mac, even though the actual time on a stopwatch showed the Mac generally about 10% faster. The authors of the article thought about that for a while, and came up with the insight that DOS users were constantly typing in complex commands on the command line, or else thinking about what they had to type next, whereas the Mac users just sat there waiting for the computer. Time goes by faster when you are busy than when you are waiting ("A watched pot never boils"). You see the same effect on southern California freeways during rush hour: people get off the freeway that is rolling along at 30mph to drive at 15 or 20mph on a back route that is 50% longer "because it's faster." They really do that. They are crazy down there. The command line is a lot more tedious and error-prone than mousing over and clicking, so it takes far longer -- numerous times I have stood behind somebody convinced he's "a fast typist" and watched him spend half of his time pounding on the backspace key and retyping most of the whole command, thus taking more time than it would have taken to lift his hand off the keyboard, mouse over to an icon, and 2-click it. That's not just in Linux; they do it on OSX and Windoze too, because they can.
That -- together with the fact that Mac software "just worked" and needed no profitable (for the vendors) upgrades -- killed the MacOS and got it replaced with the older, buggier, and slower OSX (=unix). Not all of our participants last year took advantage of the performance improvements of mousing versus a command line, but many of them did (probably mostly out of ignorance: I suspect that programmers are sometimes just as foolish as LA drivers ;-)
Because there is a command line, the system vendors have no motivation to build clickable software tools. Why should they? It's easier -- and (so they think) "faster" -- for them to use the command line. So every Linux user (and many an OSX user, for the same reason) is stuck with using the command line, which is necessarily slower than a mouse click. You cannot type as fast as the computer can load a program and start it running. Do the math: if you type at a nominal 40wpm (about 4 keystrokes per second, less because there are so many special characters in your command line, and far less when you factor in the backing up and retyping), and if the average command is about 25 characters long, then you have taken at least six seconds, but more often like 10-15 seconds, to type a command when you could have moused over to an icon and 2-clicked it in three seconds. Oh, by the way, the mouse is also faster than fingering a trackpad. Doesn't anybody measure this stuff?
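The arithmetic above is simple enough to check with a few lines of code. This is only a sketch: the class name TypingTime is invented, and the 2 keystrokes per second used for the "backing up and retyping" case is my own assumed figure, picked to land inside the 10-15 second range.

```java
public class TypingTime {
    // Seconds needed to type a command of the given length at the given rate.
    public static double secondsToType(int characters, double keystrokesPerSecond) {
        return characters / keystrokesPerSecond;
    }

    public static void main(String[] args) {
        int commandLength = 25; // average command length used in the text

        // Nominal rate from the text: about 4 keystrokes per second.
        System.out.println(secondsToType(commandLength, 4.0)); // prints 6.25

        // Assumed effective rate once special characters and retyping are counted.
        System.out.println(secondsToType(commandLength, 2.0)); // prints 12.5
    }
}
```

Either figure is more than the three seconds allowed above for the mouse.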
Then there's the actual run-time loss induced by the command line. With no clickable software tools to use instead, the de rigueur way to get a complex job done is to type multiple command lines -- veeerry sloooowly (but it feels fast). The more clever programmers take the time to gather their multiple command lines into a shell script, but that is too much like programming for the average user (and also for most programmers, who tend to underestimate the number of times they need to go through this process). The shell script is scripted, that is, the command line interpreter parses each line one character at a time, just as if the computer had nothing better to do than wait for the user to type it in, and then goes to find that tiny little snippet of program that constitutes the smarts for that command line, loads it into memory and makes it runnable, actually runs it, and then goes back to the command line interpreter -- which is probably still in memory, but may need to be swapped back in if other things bumped it out -- for the next line in the script. Unless you are running a quiet semiconductor drive, listen to your computer the next time you run a shell script: do you hear your hard drive chatter? Each click in that rattle is one or two hard disk accesses (you are hearing the force to move the read head to a different place on the platter). You cannot hear the difference between the command line processing and the access time, but it's there, and your sequence of command lines is being interpreted by the computer much more slowly than a single well-written monolithic program, compiled into machine language, would take to do the same job from a single mouse activation. But that well-written monolithic program doesn't exist, because the computer has command lines instead. That's what Linux does for you.
I won't even get into the problems that students experience if they come into this program at a lower comfort level on their computers. The command line, with its rigid insistence on absolute precision of typing -- and unix traditionally does not tell you what you did wrong when (not if) you made a mistake -- is intimidating and frustrating to novice users. Remember that well-written monolithic program? That's the one written by a professional programmer rather than a noob just getting started, a professional who has a compiler that does error checking to tell him not only where his mistakes are, but also why. If ordinary Linux users even just rewrote their shell scripts into monolithic programs compiled into machine language, they would get better throughput on their computers (and so would everybody else). But that won't happen, because the computer has command lines.
As for older, experienced users like me, when we know how to get a job done and just want to do it, Linux (like every other variant of unix) forces us to do it the slow, error-prone command-line way, because the unixies don't have the cojones to take out a stopwatch and measure the time it takes to use the command line -- and then do the Right Thing, which is to throw it away and get a more powerful system. I learned computing on a command line, 20 years before a GUI existed. When the Mac arrived on my desk, my productivity took such a leap forward, I never wanted to go back. Linux is a giant leap backward from the (original) Macintosh, back to the 19th-century days of DOS. OSX is a slightly smaller leap backwards, but Windows is only a (large) step backwards. They all have command lines, but Windows is powerful enough that most users don't even know it has one. Computer power is where the computer does the work and the users sit there and wait for it; in Linux the users do the heavy lifting and the computer sits around and waits for them. Linux makes the users feel powerful, the way a pick and shovel make a ditch-digger feel powerful -- but a BobCat digs a bigger hole in the ground faster. We have four weeks, and Win10 is our BobCat.
If you get a job where you are paid by the hour instead of by the results, maybe you might foolishly want to intentionally slow your rate of delivery down -- it doesn't work: your supervisor knows who the slackers are, and they will be the first to go when the company needs to "downsize" -- but most of my career I was paid by the job, so the faster I produced, the better I got paid. That's true of everybody, but not as obviously when you are on a salary.
The bottom line is that programming the car in Win10 will get us to a working autonomous vehicle faster than any other operating system available, and most of our participants already know it.
Rev. 2019 May 29