Learn Programming in Java


Lesson #7: Extras

Java was intended to be a serious programming language, not just a toy for learning on, so there were some things I hurried past without explaining. Today I make up for that fault with additional remarks I have titled "Failed," "Bad Things Happen," "Floating Point," "Fancy Operations," "Other Statements," "Array Bounds," and "Esoterica."

Failed

First of all, BlueJ is a toy environment. The real Java development environment was designed for 19th-century DOS-like text-driven systems such as those preferred by unixies. This is the 21st century, and there is at least one graphical "IDE" (Integrated Development Environment) that works with Java (besides BlueJ) called "Eclipse," and you might as well learn how to write programs there if you plan on using Java seriously. Or maybe you prefer typing and retyping and needless memorizing and working from a command line instead of seeing what you are doing and directly manipulating it so you can focus all of your finite cognitive energies on the task at hand instead of on fighting your tools. Whatever.

You probably didn't know that the word "eclipse" is derived from the Greek verb meaning to "fall out" or "fail". What we now know as the shadow of the earth covering the moon, or else the shadow of the moon blocking sunlight on earth, the ancient Greeks thought of as the moon or sun falling apart and failing in its main purpose, so they called it "failure" ('eclipsis'). Sometimes I think it's funny that we have taken the Greek word to mean the exact opposite, that "eclipse" in English refers to something better taking over what is thus deemed to be inferior. Me, I always think of "eclipse" in its original Greek sense, as a failure, and my first experience with the Eclipse IDE was consistent with this interpretation. But solar and lunar eclipses pass, and the IDE by that name also seems to have recovered somewhat (but not entirely) from its early failures.

If you are running Windows 10, this will be somewhat more difficult, so see separate instructions here.

I Googled "download eclipse IDE" and got several hits near the top. Pick one and download the package suitable for your computer. It's some kind of zip file, which will unpack when you double-click it. Most people have a place for programs and shortcuts, so put the resulting folder there. Inside that folder, find the Eclipse application icon (or shortcut) and double-click it. This opens a window with a half-dozen random icons over an artsy background. Hover over each one to pop up a label, then select the Tutorials. The first one is relatively easy, but if you make a mistake, you cannot get back to the beginning without the magic incantation. After you succeed at the first one, you might try the second, although we will be doing most of that next week. The others are probably over your head at this time. Eclipse is intended for people who already know what they are doing and what they want, not people like yourself just getting started. You have a steep climb ahead of you, but you can do it. If (or rather, when) you get stuck, get help. Help is free on the internet. Just Google your question, and if one of the top hits is in the "stackoverflow.com" domain, look there first; they are usually the best you can do -- at least for Java (they are kind of snotty about C questions).

Before I forget, the magic incantation to get back to the beginning is to find the "Workspace" folder you agreed to and delete it. If you checked the "Do not ask again" checkbox, it will re-create it the next time you start up, but if you want that as a do-over also, delete also the unpacked Eclipse folder and unpack it again. You must delete both folders to truly start over.

Bad Things Happen

When you dragged the "StartHere" folder onto the BlueJ icon, the window it opened had two yellow rectangles (classes) predefined, but we only worked with the "Hello" class. You probably looked inside the "Zystem" class (and that's OK) but there were some things in there I didn't tell you about. Now's the time.

In particular, we do not live in a perfect world, and the programs we write will not be perfect. Sometimes the data is bad, but more often our code is bad. Professional programmers prepare for Bad Things to Happen, and the approved Java way to do that is with exceptions. When some part of the program -- or more often, a library method -- cannot make sense of its data, it throws an exception, which some other part of the program can catch. It's like a return or break, but it can go a long way, and it can carry information about what went wrong (or at least what data didn't make sense and who thought so and how it got there).

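"Exception" is a predefined class, which you can use as-is, or you can subclass it (which we will learn about next week) to allow for more information. Any method that can throw a checked exception such as Exception itself (or calls a method that can throw one not caught in that method) must declare that fact in its header; only RuntimeException and its subclasses, like the null and array-index errors thrown by the runtime system, are exempt from this rule: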

public static void BadThingsHappen() throws Exception {

This puts the Java compiler on notice that whoever calls BadThingsHappen must be able to catch exceptions or else pass them through uncaught (by means of another "throws" clause). The compiler then forces you to think about those problems -- at least as long as it takes you to catch them or pass them through. Usually, as you consider each possible exception, there is one best place to catch the problem and do something intelligent about it. Java requires that to be in the calling chain of the method that detected the error, that is either its caller, or its caller's caller, or somewhere up the ladder, and the mechanism for doing that is the "try" clause with one or more attached "catch" clauses, one for each kind of exception you plan to catch there. Class "Exception" catches them all. For now we'll do that:

try { // these things can cause exceptions...
    /// do something dangerous here, like
    BadThingsHappen();
} // end of try
catch(Exception e) { // catch all exceptions, information in e
    /// do something to recover from the fact that BadThingsHappen didn't finish, or
    System.out.println("Got exception " + e);
} // end of catch

The catch clause is kind of like a method declaration with a single parameter of type Exception. You can give that parameter any reasonable name you want, whether "e" or "whatsThis" or "ImTryingToConfuseEverybody", just so that's the name you refer to it by inside the catch clause.
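Here is a minimal sketch of the two pieces working together, a method that declares "throws Exception" and a caller that catches it. The names ExceptionDemo, checkedHalf and safeHalf are made up for this illustration, and the default value of -1 is an arbitrary choice, not anything Java requires:

```java
class ExceptionDemo {
    // Hypothetical stand-in for BadThingsHappen(): it refuses negative input.
    static int checkedHalf(int value) throws Exception {
        if (value < 0) throw new Exception("negative input: " + value);
        return value / 2;
    }

    // The caller catches the exception and substitutes a default value.
    static int safeHalf(int value) {
        try {
            return checkedHalf(value);
        } catch (Exception e) { // information about the failure is in e
            System.out.println("Got exception " + e);
            return -1; // default chosen only for this sketch
        }
    }

    public static void main(String[] args) {
        System.out.println(safeHalf(10)); // normal path
        System.out.println(safeHalf(-4)); // exception path, default returned
    }
}
```

Because safeHalf catches everything checkedHalf can throw, it does not need a "throws" clause of its own.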

Take a look at Zystem. The method System.in.read() normally does not throw any exceptions, unless input has been redirected from a file or something like that, and the file does not exist. We are not doing redirection this week, so my catch clause does nothing at all other than to set a default value. A credible way to deal with exceptions is to ignore them and keep running, but you must do that on purpose. If you are doing a nuclear reactor program or a medical program -- the license you agreed to makes you promise not to do that, but suppose you were -- then if the core got too hot, your catch clause might want to shut down the reactor, or if the patient went into cardiac arrest, your program could sound an alarm at the nurse's station, depending on what the system designer deemed appropriate for that particular catastrophe. You can do that in Java, even if the vendors' lawyers don't want you to.

Floating Point

When written as constants, all Java numbers must start with a numerical digit (possibly preceded by a negative sign). The four arithmetic operators can apply to any two numbers, and the result will be a number. However, there are different kinds and sizes of numbers. The most common you will see as a programmer are integers (int), and integers come in different sizes starting with byte (8 bits), then short (16 bits), plain int (32 bits) and long (64 bits). Everything we do in this course will be in integers only. Well, almost everything: the Calculator uses floating point.

The earliest computers were used in calculating ballistic tables during WWII, so that gunners (especially on battleships) could know what angle to set the gun for to get a particular distance. That involved very tiny fractional numbers, so they invented a computer number type that resembled what is otherwise known as scientific notation: a fraction times some power of a base (usually a power of ten for people to look at, but a power of two in computers), where the exponent is basically how much to shift this fraction left or right to get its true value. This gives a dynamic range from something less than 10^-40 to more than 10^35.

To give you an idea of what that means, the smallest particle I could find the size of is a proton or neutron at 10^-15m, and the size of the visible known universe is estimated at about 10^27m. Floating point numbers in Java follow the IEEE-754 standard (I was draft editor on the committee that defined it some 40+ years ago), and like the standard they come in two sizes, single (float, the above range) and double, which is a lot more range and more than twice as many significant digits (bits) of precision. The fractional part (sometimes incorrectly called "mantissa" as if this were a logarithm, which it is not) of float is 23 bits, which is not quite seven decimal digits of precision. If you are trying to steer a rocket to Jupiter or calculate the interest on a 30-year million-dollar loan compounded daily, you need double; otherwise single-precision float is adequate (but probably not faster on modern 64-bit computers).
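To see where float runs out of precision, here is a small sketch that pushes an integer through float and through double and checks whether it survives the round trip. The class and method names are invented for this example:

```java
class FloatPrecisionDemo {
    // True if the int survives a round trip through single precision.
    static boolean floatKeeps(int n) {
        return (int) (float) n == n;
    }

    // True if the int survives a round trip through double precision.
    static boolean doubleKeeps(int n) {
        return (int) (double) n == n;
    }

    public static void main(String[] args) {
        int big = 16777217; // 2^24 + 1, one more than float can count to exactly
        System.out.println(floatKeeps(16777216)); // true
        System.out.println(floatKeeps(big));      // false, the low bit is lost
        System.out.println(doubleKeeps(big));     // true
    }
}
```

Every int fits exactly in a double, so the second method always returns true; the float version starts dropping low-order bits just past 2^24.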

When doing math or compares on them, all Java numbers are automatically promoted to the better representation of the two operands (if different), or you can explicitly cast a numeric value into some other numeric representation by using the type name inside parentheses, thus:

int anInt = (int) 3.14159;

Casts into a smaller (or less precise) representation (like the example above) can lose information, so you need to think about what you are doing.
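A short sketch of promotion and casting side by side may help. The method names here are hypothetical, picked only to label what each line demonstrates:

```java
class CastDemo {
    // Casting double to int drops the fraction (truncates toward zero).
    static int truncate(double d) {
        return (int) d;
    }

    // Casting int to byte keeps only the low 8 bits.
    static byte narrow(int i) {
        return (byte) i;
    }

    // Both operands are int, so this is integer division.
    static int intDivide(int a, int b) {
        return a / b;
    }

    // The int operand is promoted to double before dividing.
    static double mixedDivide(int a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        System.out.println(truncate(3.14159));   // 3
        System.out.println(truncate(-3.99));     // -3
        System.out.println(narrow(300));         // 44
        System.out.println(intDivide(7, 2));     // 3
        System.out.println(mixedDivide(7, 2.0)); // 3.5
    }
}
```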

Fancy Operations

There are 26 operators in Java -- not counting functions (methods that return a value, which work like operators but do not look the same in your code). You already know about the four arithmetic operators, and we previously mentioned the six comparison operators and five of the bitwise operators. I will here touch on the remaining eleven operators, but spend more time on the interesting things you can do with operators -- not the remaining 11, but the first 15, the important ones -- in Java or any other language.

Java has four unary operators -- they take a single operand -- and a really weird rule about how they combine with other operators in larger expressions. Operator precedence is defined for each language that has more than one operator, which operations get done first when there are several different operators in a single expression with no parentheses. In Java some of the precedence rules are bizarre (Google "Java operator precedence" if you care). You really only need to know that multiply and divide happen before add and subtract (same as you learned in grade school arithmetic class); I find it useful to know also that logical AND happens before OR (which is true in every language with any precedence at all). Most of the rest are goofy or just plain illogical. They are the same as C/C++ but different from other languages, so I refuse to depend on operator precedence (other than as mentioned) and liberally sprinkle parentheses around in all other cases. Parentheses are essentially free.

Anyway, you probably need to know that the unary operators don't work as you'd expect, they have the highest precedence of all operators (not counting parentheses, which are not operators). Unary negative "-a" is not the equivalent of "0-a" but more like "(0-a)" -- including the parentheses, so you can actually write goofy things like "b+-a" and Java will not complain. I always put the parentheses in.

Besides the obvious negative and plus, there is another numeric unary operator ("~") which inverts every bit so that "~a" can also be thought of as the same as either "a^(-1)" or "-1-a". There is a unary boolean NOT operator "!" which does the same for the single-bit boolean type and is useful inside an if-statement where you have a simple boolean value (variable or function result) and you want the condition to activate on the false result instead of true:
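You can check the three spellings of bit inversion against each other with a few lines like these. This is only a sketch, and the names are made up:

```java
class UnaryDemo {
    static int viaTilde(int a) { return ~a; }       // invert every bit
    static int viaXor(int a)   { return a ^ (-1); } // XOR with all ones
    static int viaMinus(int a) { return -1 - a; }   // two's complement identity

    // Unary minus binds tighter than binary plus, so this is b + (-a).
    static int plusNegative(int b, int a) { return b + -a; }

    public static void main(String[] args) {
        int[] samples = {0, 5, -7, Integer.MAX_VALUE};
        for (int a : samples) {
            System.out.println(viaTilde(a) + " " + viaXor(a) + " " + viaMinus(a));
        }
        System.out.println(plusNegative(10, 3)); // 7
    }
}
```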

if (!somevalue) DoSomething(); // do it only if (somevalue==false)

There is a third shift operator ">>>" that shifts right with zero fill instead of extending the sign. There is a fifth arithmetic operator, sort of a remainder, but its rather strange treatment of negative operands limits its usefulness to places where you know they are positive (which is most of the time). However, it costs an integer divide, which is rather expensive in time, so most of the time we try to use a power of two as the divisor, and (for non-negative a) "a%8" has exactly the same value as "a&7" but takes much longer to compute in most hardware. It only matters inside the inner loops.
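The following sketch puts numbers on those claims: the two right shifts applied to a negative value, the remainder of a negative operand, and the mask that replaces "%8". The class and method names are assumptions for illustration only:

```java
class ShiftRemainderDemo {
    static int signShift(int a, int n) { return a >> n; }  // copies the sign bit in
    static int zeroShift(int a, int n) { return a >>> n; } // fills with zeros
    static int remainder(int a, int b) { return a % b; }   // sign follows the dividend
    static int lowThreeBits(int a)     { return a & 7; }   // same as a%8 for a >= 0

    public static void main(String[] args) {
        System.out.println(signShift(-16, 2)); // -4
        System.out.println(zeroShift(-16, 2)); // 1073741820
        System.out.println(remainder(-7, 3));  // -1, not 2
        System.out.println(remainder(29, 8) + " " + lowThreeBits(29)); // 5 5
        System.out.println(remainder(-3, 8) + " " + lowThreeBits(-3)); // -3 5
    }
}
```

The last line shows why the mask is only a substitute when the operand is known to be non-negative.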

There are logical AND ("&&") and OR ("||") operators that apply only to boolean values and give a boolean result with the additional quality that expression evaluation is terminated as soon as the answer can be known (called short-circuit boolean). This is because "false&&anyboolean" is always false, and "true||anyboolean" is always true. You are expected to be able to depend on that in complex expressions, for example where you want to do something if an object whom is null (not defined), or if it is defined and some method (say theMethod) of whom's class returns true:

if ((whom==null) || whom.theMethod()) DoSomething();

which is less code but otherwise exactly the same as:

if (whom==null) DoSomething();
else if (whom.theMethod()) DoSomething();

If (whom==null) but you try to call whom.theMethod() anyway, the runtime system will throw a NullPointerException. Either of the above two ways to write it prevents that error.
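A runnable sketch of the same test follows. The nested class Thing is a hypothetical stand-in for whatever class whom belongs to, and shouldAct is just the condition pulled out into a method so you can try it:

```java
class ShortCircuitDemo {
    // Hypothetical class standing in for whom's class.
    static class Thing {
        final boolean answer;
        Thing(boolean answer) { this.answer = answer; }
        boolean theMethod() { return answer; }
    }

    // The right side of || is never evaluated when whom is null,
    // so no NullPointerException can be thrown here.
    static boolean shouldAct(Thing whom) {
        return (whom == null) || whom.theMethod();
    }

    public static void main(String[] args) {
        System.out.println(shouldAct(null));             // true
        System.out.println(shouldAct(new Thing(true)));  // true
        System.out.println(shouldAct(new Thing(false))); // false
    }
}
```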

One final operator is ternary, that is, it takes three operands in a short-circuit boolean sort of way. If all you wanted to do is assign to a variable theVar either TrueValue() or FalseValue() depending on whether TestMe is true or false, you could write:

if (TestMe) theVar = TrueValue();
else theVar = FalseValue();

If you wanted to use that value in a larger expression before assigning it, or if you wanted to use it (or the value of a larger expression containing it) in a conditional, you'd need a temporary variable, or else a conditional expression like this:

theVar = TestMe ? TrueValue() : FalseValue();

or for example in a conditional:

if ((TestMe ? TrueValue() : FalseValue())>0) DoSomething();

Note that TrueValue() and FalseValue() are never in the same execution path. If TestMe is true, then TrueValue() is evaluated (it could be any expression, possibly including operations like divide that might fail if TestMe evaluates to false) and not FalseValue(), and the other way around if TestMe is false. Me, I never use conditional expressions, it's just too much to remember (and get compile errors if I remembered wrong), and most compilers generate exactly the same code if you create and use a temporary variable. In other words, the compiler usually creates its own temporary variable. But I mention it because you will see it in the code of programmers with more dollars than sense. You want your code to be readable and admired because you did clever things, not because you used obscure operators which only get you dissed.
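To illustrate that only one branch is ever evaluated, here is a sketch where the unselected branch would fail if it ran. The method safeQuotient is an invented example, and returning zero for a zero divisor is an arbitrary choice:

```java
class TernaryDemo {
    // The division is evaluated only when den is non-zero,
    // so the divide-by-zero case never executes.
    static int safeQuotient(int num, int den) {
        return (den != 0) ? num / den : 0;
    }

    public static void main(String[] args) {
        System.out.println(safeQuotient(10, 2)); // 5
        System.out.println(safeQuotient(10, 0)); // 0, and no ArithmeticException
    }
}
```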

Other Statements

There is a second kind of conditional that is more compact (and probably faster) than a very long list of "else if"s but it only works well if every single one of your "else if"s is testing the same (integer or character) variable for a large number of different values relatively close together, like for example if you read some one-letter input command whom and you want to decide what to do with it. If you wrote (or thought about writing) something like this:

if (whom == 'N') {
    NewFile();
    OpenFile();
} // end of 'N'
else if (whom == 'O') {
    OpenFile();
} // end of 'O'
else if (whom == 'V' || whom == 'P') {
    Paste();
} // end of 'V'/'P'
else if (whom == 'X') {
    Copy();
    Delete();
} // end of 'X'
else if (whom == 'C') {
    Copy();
} // end of 'C'
else { // none of the above..
    System.out.println("Dunno what " + whom + " is");
} // end of default

The switch statement identifies a value to index the cases by, then lists the cases corresponding to the value of the control variable, like this:

switch (whom) {
    case 'N':
        NewFile();
        OpenFile();
        break; // end of 'N'
    case 'O':
        OpenFile();
        break; // end of 'O'
    case 'V':
    case 'P':
        Paste();
        break; // end of 'V'/'P'
    case 'X':
        Copy();
        Delete();
        break; // end of 'X'
    case 'C':
        Copy();
        break; // end of 'C'
    default:
        System.out.println("Dunno what " + whom + " is");
} // end of switch

The braces are required, and you can have any number of statements between case labels -- including none, if you want two or more cases to do the same thing. Java defines the break statements at the end of each case to be optional -- mostly so you can have several case labels do the same block of code -- but if you forget to put the break in, Bad Things Happen. The Java compiler won't complain, and it won't throw any exceptions, so you may not discover the problem right away. I think it's one of the C mistakes that Java didn't fix, but nobody cares what I think. My compiler calls it an error.
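Here is a small sketch of what a forgotten break does. Instead of calling Copy() and Delete(), the two hypothetical methods below just record the names of the operations in a string, so you can see exactly what ran:

```java
class FallThroughDemo {
    // Every case ends with break, as intended.
    static String withBreaks(char whom) {
        StringBuilder log = new StringBuilder();
        switch (whom) {
            case 'X': log.append("Copy Delete "); break;
            case 'C': log.append("Copy "); break;
            default:  log.append("Dunno ");
        }
        return log.toString().trim();
    }

    // Same switch, but the break after 'X' was forgotten,
    // so execution falls through into the 'C' case.
    static String missingBreak(char whom) {
        StringBuilder log = new StringBuilder();
        switch (whom) {
            case 'X': log.append("Copy Delete ");
            case 'C': log.append("Copy "); break;
            default:  log.append("Dunno ");
        }
        return log.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(withBreaks('X'));   // Copy Delete
        System.out.println(missingBreak('X')); // Copy Delete Copy
    }
}
```

Both versions compile without complaint under the default compiler settings, which is the whole problem.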

Hmm, I see I did another write-up on switch in Things You Need to Know in Java. You get my age, you forget things in a tutorial as long as this one. Better twice than never.

There is also another kind of loop, but I never use it because it's just as easy to write:

while (true) {whatever(); if (!again) break;}

as it is to write:

do {whatever();} while (again);

and I don't need to remember so much. But it's there in the language. Take it or leave it.
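If you want to convince yourself the two forms behave alike, a sketch like this counts the passes through each loop. The names are invented, and the countdown is only a stand-in for whatever() and again:

```java
class DoWhileDemo {
    // The do-while form: the body always runs at least once.
    static int withDoWhile(int n) {
        int passes = 0;
        do {
            passes++;
            n--;
        } while (n > 0);
        return passes;
    }

    // The while(true) form with the test and break at the bottom.
    static int withWhileTrue(int n) {
        int passes = 0;
        while (true) {
            passes++;
            n--;
            if (!(n > 0)) break;
        }
        return passes;
    }

    public static void main(String[] args) {
        System.out.println(withDoWhile(3) + " " + withWhileTrue(3)); // 3 3
        System.out.println(withDoWhile(0) + " " + withWhileTrue(0)); // 1 1
    }
}
```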

The different conditionals and loops are called control structures, and programs using them used to be called "structured programs" before there was OOPS to brag about. The theory is that each control structure has exactly one entry and one exit, but obviously the break and continue statements sort of put the lie to that idea. Maybe if you consider the opening and closing braces to be those entry and exit points, but then there are exceptions. A continue from within a switch statement jumps out of the switch and out of any containing if-else's to the front of the nearest containing loop. Even a break leaps out of any containing if-else's to the end of the nearest containing loop or switch statement. The try-catch combination makes a giant jump out of any number of nested structures -- including any deeply nested subroutine calls -- from where the exception was detected out to whatever catch clause catches it. It's not very fast (you don't want to be doing it all the time), but it's extremely useful. Maybe that's why we never hear about the virtues of structured programming any more, because nobody wants to give up exceptions. All reductionisms are wrong -- including this one. The real world is far too complicated to fit into our nice reductionistic boxes.
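The difference between break and continue inside a switch that sits inside a loop is easy to get wrong, so here is a sketch. The method scan and its marker character '|' are made up for the demonstration:

```java
class SwitchInLoopDemo {
    // Appends '|' after every character that reaches the bottom of the loop body.
    static String scan(String commands) {
        StringBuilder seen = new StringBuilder();
        for (int i = 0; i < commands.length(); i++) {
            char c = commands.charAt(i);
            switch (c) {
                case ' ':
                    continue; // leaves the switch AND skips the rest of the loop body
                case '.':
                    break;    // leaves only the switch
                default:
                    seen.append(c);
                    break;
            }
            seen.append('|'); // reached after break, never after continue
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(scan("a b.")); // a|b||
    }
}
```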

Array Bounds

If you plan to write computer programs professionally, you might also want to know this (optional) information:

Better languages than C do not allow uncaught array access errors, but one of the problems with C is that the compiler/runtime has no way to know if you are staying within the bounds of your array. The result is a whole bunch of "security errors" that software vendors are constantly patching.

Checking every array access costs something at runtime, but not much more than a well-written C program that does its own checking. It's lots faster than crashing or (as in C) destroying other data or letting Bad Guys steal secrets. My compiler notices if you did your own checking, and omits its own if so, but I don't think standard Java does that. So if you look at my example code, or especially the runtime code for my Game Engine, you will see all these checks for null (array not yet allocated) and index bounds. It's a good habit, and as compilers get smarter, they will remove their own (now superfluous) checks and your code will run faster and safer. Right now, Java is just safer (which is not a bad thing). In the examples above, null checks are not needed, because the allocation obviously precedes the access. Range checks are not needed because the for-loop keeps the index within bounds, except for the access to "nums[i%11]" where the modulus operator ensures the computed index cannot exceed the array bounds. Compilers can recognize this, and mine does.
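As a sketch of both styles, the first method below does its own null and range checks before touching the array, and the second lets Java do the checking and catches the result. The names and the -1 fallback value are assumptions made for this example:

```java
class BoundsDemo {
    // Do-it-yourself checking: null (not yet allocated) and index range.
    static int fetch(int[] nums, int index) {
        if (nums == null) return -1;
        if (index < 0 || index >= nums.length) return -1;
        return nums[index];
    }

    // Let the runtime check, and report whether it objected.
    static boolean javaCatchesIt(int[] nums, int index) {
        try {
            int unused = nums[index];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] nums = new int[11];
        nums[3] = 42;
        System.out.println(fetch(nums, 3));          // 42
        System.out.println(fetch(nums, 11));         // -1
        System.out.println(fetch(null, 0));          // -1
        System.out.println(javaCatchesIt(nums, 11)); // true
    }
}
```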

Most programming languages require you to specify all the dimensions when you declare a multidimensional array. Two-dimensional Java arrays are defined to be arrays of arrays, and Java arrays are given a dimension (the number of elements) when memory is allocated, which could happen several times for the same array in the course of a program, and even for different sizes, so there's no requirement for the second dimension to be uniform, nor even for all the elements to exist. This imposes a small performance penalty, about the same as accessing the first dimension of an array, but larger than in languages where multidimensionality is fixed at declaration time. This is not a problem, but you need to be aware of the differences when you move to another programming language. Me, I'm always pushing at the performance limits of the computer, so I look for opportunities to make my program run faster.
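Here is a sketch of an array of arrays where the rows have different lengths and one row does not exist at all. The method triangle is a hypothetical example, not something from the course code:

```java
class JaggedDemo {
    // Row r gets r+1 elements, so the second dimension is not uniform.
    static int[][] triangle(int rows) {
        int[][] grid = new int[rows][]; // only the first dimension is allocated here
        for (int r = 0; r < rows; r++) {
            grid[r] = new int[r + 1];
        }
        return grid;
    }

    public static void main(String[] args) {
        int[][] grid = triangle(3);
        System.out.println(grid[0].length + " " + grid[2].length); // 1 3

        int[][] partial = new int[2][];
        partial[0] = new int[5];
        System.out.println(partial[1] == null); // true, that row was never allocated
    }
}
```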

Esoterica

The rest of today I will tell you about some things you can do in Java (or most any other programming language), that you won't hear about anywhere else. But they are useful things to do, and I do these things all the time. You don't have to do these things yourself, you can write perfectly good programs and never do a single one, but if you do, and people see your code, they might tell you "You can't do that!" But you can. And then you need to explain it to them, and watch the lights come on.

No-Loop While

The first of these relates to using the break statement to unstructure a program. When writing a method for abstracting some operation, it is often the case that there are a variety of separate ways to accomplish the desired purpose. You test some combination of conditions, and if true you do the desired task (or calculate the desired result value) and then return. If the first set of conditions did not prevail, you try a second set of conditions, and if it succeeds you do the job and return. It's like a string of "else if"s but you don't need the "else" because the preceding return statement always exits the method.

But you don't need a subroutine to achieve the effect of testing possibly complex nested conditions, then getting out as soon as one of them is satisfied, you can do the same thing with break within a "while (true)" loop that never actually loops:

while (true) { // once through (never repeat), exit early when it succeeds
if (someTest) { // try this combination
    DoSomething(); // to prepare additional tests...
    if (test2) { // condition met, do it
      /// do whatever you want done on this condition
      break;}} // end of 1st test+whatever
if (anotherTest) { // it doesn't need to be complicated
    /// do whatever you want done on this condition
    break;} // end of 2nd test+whatever
if (moreTest) { // ... and so on
    /// do whatever you want done on this condition
    break;} // end of 3rd test+whatever
/// default, if nothing succeeds
break;} // end of while loop

Notice in particular the DoSomething() within the first test, which cannot be done with a conventional if-else structure, because the else only applies to the most recent if (before the brace pair), not both of them -- including the second test, which cannot be tested until after DoSomething() happens, and that cannot be done outside the protection of the first "if(someTest)". You can still do this kind of thing apart from the while-break structure, but it involves tearing apart the contiguous code, or else an accessory boolean variable that gets repeatedly tested -- and the program fails if you missed one of them:

boolean didit = false;
if (someTest) { // try this combination
    DoSomething(); // to prepare additional tests...
    if (test2) { // condition met, do it
      /// do whatever you want done on this condition
      didit = true;}} // end of 1st test+whatever
if (!didit) if (anotherTest) { // it doesn't need to be complicated
    /// do whatever you want done on this condition
    didit = true;} // end of 2nd test+whatever
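if (!didit) if (moreTest) { // ... and so on
    /// do whatever you want done on this condition
    didit = true;} // end of 3rd test+whatever
if (!didit) { // default, if nothing succeeded
    /// do whatever you want done by default
    } // end of default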
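For something you can actually run, here is a sketch of the no-loop while applied to a made-up task, sorting a string into one of four categories. The method classify and its categories are invented for this example; only the while-break shape comes from the discussion above:

```java
class NoLoopWhileDemo {
    static String classify(String text) {
        String result;
        while (true) { // once through (never repeats), exit early on success
            if (text != null) { // first test guards the preparation step
                String trimmed = text.trim(); // preparation for the inner test
                if (trimmed.isEmpty()) {
                    result = "blank";
                    break;
                }
            }
            if (text == null) {
                result = "missing";
                break;
            }
            if (text.length() > 10) {
                result = "long";
                break;
            }
            result = "short"; // default, if nothing succeeds
            break;
        } // end of while loop
        return result;
    }

    public static void main(String[] args) {
        System.out.println(classify("   "));           // blank
        System.out.println(classify(null));            // missing
        System.out.println(classify("hello, world!")); // long
        System.out.println(classify("hello"));         // short
    }
}
```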

Exchange

This one I didn't invent, and it's not very useful in modern huge processors, but if you have limited memory (like in the Goode Olde Dayes) and you need to exchange the values in variable A with variable B, the conventional solution is to use a temporary variable X:

X = A;
A = B;
B = X;

But what if you cannot allocate a temporary variable? You can do the same thing using the exclusive-OR operator:

A = A^B;
B = A^B;
A = A^B;

The exclusive-OR operator is a little hard to understand, so this is really obscure, but if you understand this, you can use the operator anywhere. Let's go through it slowly, starting with all possible results of the exclusive-OR operator applied to two bits:

^ | 0 1
--+-----
0 | 0 1
1 | 1 0

Notice first that the operation is symmetrical, that is, A^B == B^A. Notice also that A^A == 0, regardless of what is in A. You can try this in Java, in your test program, running the debugger: I try to use initial values 3 and 5 for testing bitwise operations, because all four possible combinations are represented in four bits:

 0011  3
^0101  5
-----  -
 0110  6

After the first line (back there) executes we have the combination of both variables in A. The second line starts with that combination in A, but toggles B back off, that is, A^B^B == A^0 == A, leaving the original contents of A now in variable B. The third line starts with that same combination in A, but toggles A back off, that is, A^B^A == B^0 == B, leaving the original contents of B now in variable A. Step through this in the debugger and convince yourself that it works.
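If you'd rather run it than step through it, this sketch wraps the three lines in a method. Java passes ints by value, so the hypothetical method swap returns the exchanged pair in a small array:

```java
class XorSwapDemo {
    // Exchanges a and b with three exclusive-ORs and no temporary variable.
    static int[] swap(int a, int b) {
        a = a ^ b; // a now holds the combination of both
        b = a ^ b; // toggles b back off, leaving the original a in b
        a = a ^ b; // toggles the original a back off, leaving the original b in a
        return new int[] {a, b};
    }

    public static void main(String[] args) {
        int[] result = swap(3, 5);
        System.out.println(result[0] + " " + result[1]); // 5 3
    }
}
```

One caution: this only works when A and B are two separate variables. If both names refer to the same storage (for example the same array element), the first line zeroes it and the value is gone.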

Your mission, should you choose to accept it, is to do the same thing in the same number of steps, but using plus and/or minus instead of exclusive-OR. If you really understand how this works, you can do it. But don't get stuck, it's cute and fun, but not overly useful -- except as a way to hone your control over the language. If you are good at puzzles like this, you will be a good programmer. See my solution after you did it (or if you decide you can't).

Get Next Bit

This one is a lot more useful, and I never saw it anywhere, so you are now getting the inside scoop. My first try at the Tic-Tac-Toe game packed all nine board positions into the bits of one integer. It was way too obscure to explain to beginners, so I abandoned it. When I wanted to loop through all the squares on the board, I started with a one in Bit0, then sequentially shifted that bit left, one bit at a time. While it's not particularly useful in TTT, there are times when you want the first (or an arbitrary) non-zero bit, not all of them. There is a very simple and fast way to get the rightmost non-zero bit out of an integer someBits:

abit = someBits&-someBits;

To understand how this works you need to understand how negative numbers work in the two's complement number system, which is used in all modern computers. Basically it's this: -X = ~X+1, where ~X as you should know, is all the bits flipped 0 to 1 and 1 to 0. You know how to do binary arithmetic, right? The sum of two single bits in the same bit position is the same as the exclusive-OR of those two bits, with a carry out when both bits coming in are 1. So let's see how different numbers go negative in a four-bit integer (where we discard the final carry out):

 0:    0000     1: 0001     2: 0010     4: 0100     6: 0110     8: 1000
 ~     1111     ~  1110     ~  1101     ~  1011     ~  1001     ~  0111
+1  (1)0000    +1  1111    +1  1110    +1  1100    +1  1010    +1  1000

I think of it this way: the rightmost run of zeros in the number, and the rightmost one, are all preserved in its negative; everything to the left of the rightmost one is complemented. Convince yourself this is true. Then we remember that X&~X==0 for all values of X. Recall the operation tables of AND and OR for all possible single-bit values:

& | 0 1        | | 0 1
--+-----       --+-----
0 | 0 0        0 | 0 1
1 | 0 1        1 | 1 1

Notice that if either input bit is zero, the AND result bit is zero, so if you toggle all the bits to their opposite state, then either the original bit is zero or the toggled bit is zero, so the result is necessarily zero. Now applying that to the negative, all the bits to the left of the rightmost one are toggled, so the AND of those bits is all zero. Both the rightmost one and all the zeros to its right are preserved, and the AND of the zeros is still zero, but the AND of that rightmost one with its negative (still a one in that bit position) is one, the single bit left standing after everything else has been zeroed.
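A quick way to check the formula is to compare it with the Java library method Integer.lowestOneBit, which computes the same value. The wrapper method lowestBit below is a name made up for this sketch:

```java
class LowestBitDemo {
    // The rightmost non-zero bit of someBits, or zero if there is none.
    static int lowestBit(int someBits) {
        return someBits & -someBits;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 6, 8, 0x0510F7E0, -1, Integer.MIN_VALUE};
        for (int s : samples) {
            // Print the formula's answer next to the library's answer.
            System.out.println(Integer.toHexString(lowestBit(s)) + " "
                    + Integer.toHexString(Integer.lowestOneBit(s)));
        }
    }
}
```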

When you are working with pixels in an image, it is often useful to find runs of the same pixel value. This is easy with black and white images, where each pixel occupies one bit only, 32 (or 64) pixels in each integer. Assuming the pixels are packed "Little-Endian" (the little end of the number is bit 0 = pixel #0), then the first black pixel can be found by the formula above. Rather than step through each successive pixel in turn, we can get a mask of the whole first run of black pixels like this:

abit = somePix&-somePix; // 1st black pixel
nxtz = (somePix+abit)&~somePix; // next white pixel
run = nxtz-abit; // the whole run of black pixels
somePix = somePix&-nxtz; // remove that run from somePix
/// process the run...

You already understand getting the first black pixel. The second line adds that single bit to all the pixels in the original integer; we know that first bit is a one, so adding one to it produces a zero in that bit position and a carry out to be added to the next pixel position. If that one is originally a zero, then the sum leaves a one there with no carry out and all the rest of the pixels remain unchanged in the sum, but if it's also black (one) then it gets flipped to zero and the carry propagates to the next pixel, until the run of black ends and you are left with a single one in the position of that next white pixel (plus all the following pixels unchanged). ANDing the complement of the original pixels zeroes out all those remaining pixels, but keeps that final carry out of the sum, which replaced a zero in the original, the complement of which is one. Confusing? Yes, but a good programmer can focus on those itty BITty details and squeeze awesome results out of your computer. You can do that, if you want to.

But you don't need to do this today, unless you want to. For the first couple years of your career as a programmer, you probably will never need anything like this. Bookmark this link and come back to it when you are ready. Then write up a small program with those four lines in it, and maybe a test pattern of pixels like somePix=0x0510F7E0. Put it in a loop and step through in the debugger, watching what happens to the bits. The first run for that test pattern is six pixels 0x07E0, and the second run is four pixels 0xF000, then the remaining three runs are only one pixel each. Do you see how it works? Go thou and do likewise. Invent something useful that nobody else ever thought of.
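One possible shape for that small program is sketched here. The four lines are the ones given earlier; the surrounding class, the method name runs, and the choice to collect the run masks in a list are my own scaffolding for the example:

```java
import java.util.ArrayList;
import java.util.List;

class PixelRunsDemo {
    // Returns one mask per run of black (one) pixels, lowest run first.
    static List<Integer> runs(int somePix) {
        List<Integer> found = new ArrayList<>();
        while (somePix != 0) {
            int abit = somePix & -somePix;          // 1st black pixel
            int nxtz = (somePix + abit) & ~somePix; // next white pixel
            int run = nxtz - abit;                  // the whole run of black pixels
            somePix = somePix & -nxtz;              // remove that run from somePix
            found.add(run);                         // "process the run"
        }
        return found;
    }

    public static void main(String[] args) {
        for (int run : runs(0x0510F7E0)) {
            System.out.println("0x" + Integer.toHexString(run)
                    + " (" + Integer.bitCount(run) + " pixels)");
        }
    }
}
```

For the test pattern this prints five runs: 0x7e0 (6 pixels), 0xf000 (4 pixels), then 0x100000, 0x1000000 and 0x4000000 at one pixel each.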

Revised: 2021 May 8