Turk/2 Compiler Usage Instructions


This is my current version of the Turkish Demitasse (Turk/2 or T2) language as used in all my new programming projects. The language is specified by the T2C grammar, which, when given to my TAG compiler (TAGC.exe), generates T2 code, which compiles in the second compiler (T2C.exe) to produce C++ acceptable to the Microsoft VisualStudio C++ compiler (and possibly other Win32-compatible compilers, such as gcc, but I have not tested them). This is the tool set I use for compiling BibleTrans. Sorry, no Linux nor unix versions of the full Framework code are available, because there is no standard graphical user interface (GUI) for those systems, but Wine probably works with these Win32 versions (I didn't try these latest versions, and you still need to find a "Windows.h" header set from Microsoft to compile them).

Another version of the same T2 grammar, Turk68, compiles to native 68000 machine code, but doesn't work on anything that doesn't support MacOS/68K code. That's what I use for my local software development, because for a while it ran faster in emulation on my 400MHz PowerPC G3 Mac than the compiled C++ code ran native on a 2GHz PC. It must have been bugs in my translation, because now the PC code runs about 4x faster than the Mac. But the user interface is so much nicer on the Mac than either Windoze or eunuchs (including OSX).

Both versions of the Turk/2 compiler expect to build and read their own version of a Libry.BTL file for MOS system code.

The C version of the TAG compiler, with its supporting files, is in a ZIP file. It contains these two ready-to-run programs:

TAGC.exe -- The TAG compiler, to Turk/2, and
T2C.exe -- The T2 compiler, to Win32 C++
and these source files:
TAG2Turk.tag -- source TAG for TAGC
T2Cpp.tag -- source TAG for T2C
Nothing.t2 -- for creating the base library
SysLibs.t2 -- source code (in T2) for MOS API library, requires also Cstuff.t2 to run
TagLibs.t2 -- source code (in T2) for TAG compiler library code, to be compiled after SysLibs.t2
Cstuff.t2 -- source code (in T2) to interface to the Win32 glue TurkFrame.h and SysNames.h
TurkFrame.h -- glue code (in C++) to interface to Win32
SysNames.h -- header code (in C++) to interface to Win32
MakeFile.txt -- (see below) for building the whole system
Libry.BTL -- an empty library file to start with

The Win32 glue code seems to be working now. I built this on a PC running VisualStudio 2003 in WinXP, but it probably also works in later (desktop) systems. The previous revision of everything except the (Microsoft) C compiler was also tested in Win95 -- except the long runs took exceedingly long on my memory-challenged Win95 running in emulation on a memory-challenged PowerPC.

(2014 March 11) The TAG compiler seems to have a problem: it runs fine inside VisualStudio, but crashes when running stand-alone. I added some diagnostic code to try to track that down, and it does not crash when that is enabled. So if it crashes for you, turn on caps-lock and hold the shift key down (both at the same time) while starting it up. It will make a huge log file, but (at least for me) runs to completion. [If it still fails, run it again immediately with shift+lock, and send me the second log file. It might not tell me anything, but it's more than I have.] Or else make your own build (C sources here) and run it in the debugger environment. If you ever find out what the problem is, please tell me!

Note that most of this code is nominally still under development. If you have a folder "C:\AllDocs\" it will write a huge log file with information I use to identify runtime errors. If the caps-lock key is locked, then that file gets monstrously bigger, and even if it's not writing the file, it goes through the motions and takes a lot longer to compile. Once it's working better, I will turn off the logging switches.

Give the two ".tag" grammars to TAGC to produce ".t2" files, which can then be compiled in T2C to build the C++ files. But first you need to build the libraries in T2C. Compiling Nothing.t2 when no library file exists creates a file "Libry.BTL" containing a null library. Subsequent compiles appear to work without this step, but fail later. The first time you run T2C, it asks where the library file is, and fails if you can't give it a file (a properly named empty file works); it saves the file location in the Registry for subsequent runs. Then compile SysLibs.t2, then TagLibs.t2. You can save a copy of the updated Libry.BTL file for subsequent compiler compiles. Save a copy before compiling the TagLibs.t2 file for faster compiles of things other than compilers. You cannot compile a package using a library file already containing that package; ugly things happen. It used to work, but then I changed the library format and I have not gotten around to making it work again.

Finally, compile Cstuff.t2 to C. You can use the base library for this, but it sometimes also works (slightly slower) if you just compile it after the others. My C compiler on the Mac is a limited "student" version with a 32-file limit, so T2C builds a composite ".cp" file for each source file containing all the (compiled) packages from the corresponding ".t2" file except its main, as well as separate ".cpp" and ".h" files for each named package; you can use either the ".cp" or the ".cpp" files (but not both) for libraries like SysLibs.t2, but you should use only the one ".cpp" file for main programs (because it already contains all its component packages). A complete build of T2T2 in a C compiler would take these source files:

TAG2T2m14.cpp // the main program
TagLibs.cp
SysLibs.cp
Cstuff.cp
For T2C, replace the main program with T2Cpp.cpp. The handwritten C glue files TurkFrame.h and SysNames.h will normally come in by #include lines already there. Programs with multiple processes (like BibleTrans) need a main program for each process in the same C++ build. There is no conflict, because the actual "main()" program is in TurkFrame.h. However, when there are multiple source files, it is important that each separate source file have at least one line defining a different value for the C name "_BASE_" (see supplied source files), because the first digit of that number disambiguates the string constant names, which would otherwise produce link errors. C is such an ugly language. Maybe after I get everything working, I'll do a native x86 code compiler and eliminate the C step entirely. I did that a couple years ago for the Mac, and it's so much easier.

The Microsoft compiler does not recognize ".cp" files, so you must change their names to ".cxx" for it to work. The Microsoft compiler (like all their products) is very hard to use (I call it "job security" for their professional customers, because it keeps out the amateurs) so you are on your own getting a project set up -- but there's lots of help on the internet. I tried it and did not succeed, but compiling the C++ files to Win32 programs should be possible on any compatible compiler, for example gcc. Let me know if you succeed, and what you did, and I'll add it to the documentation here.
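
If you would rather script the ".cp" to ".cxx" step than rename files by hand, a minimal Python sketch might look like the following. It is not part of the T2 tool set; the function name copy_cp_to_cxx is made up for illustration, and it assumes all the generated files sit in one folder. It copies rather than renames, so the original ".cp" files stay in place.

```python
import shutil
from pathlib import Path


def copy_cp_to_cxx(folder):
    """Copy every '.cp' file in folder to a sibling with the '.cxx' suffix.

    Hypothetical helper, not part of T2C. Returns the new '.cxx' paths,
    sorted by name. Files ending in '.cpp' or '.h' are left alone.
    """
    created = []
    for cp_file in sorted(Path(folder).glob("*.cp")):
        cxx_file = cp_file.with_suffix(".cxx")
        shutil.copyfile(cp_file, cxx_file)  # the original .cp is kept
        created.append(cxx_file)
    return created
```

Call it with the folder T2C wrote its output to, for example copy_cp_to_cxx(r"C:\AllDocs\T2C\Turk2C") if that is where your files are, then add the resulting ".cxx" files to the C++ project in place of the ".cp" files.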
 

MakeFiles

A complete rebuild can take a while (about an hour on my PC), so T2C.exe can also be run in batch mode. Create any text file with "$FILES$" in the first line, then the full path of each file to compile, one per line, followed by a blank line to mark the end of the list. You can (should) designate a folder for the generated .cpp and .h files to be written to by giving the full path of a file that already exists in that folder with a caret+space "^ " at the front of its line; otherwise the files go into the C drive root directory. The build will fail if it cannot open a file.

A dollar "$ " at the front of the file path makes a copy of a ".cp" file (if it exists), giving it the ".cxx" suffix. Other magic characters at the front of a line (followed by a space) are the right-pointing arrow "> ", designating a file to copy the current library to, and the left-pointing arrow "< ", designating a file to copy back into (overwriting) the current library.
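
To make the line format concrete, here is a small Python sketch that reads the text of such a batch file and reports what each line asks for. It only restates the rules described above and is not how T2C itself parses the file; the action labels ("output-folder", "save-library" and so on) and the function names are invented for this example.

```python
# Prefix characters described above; each is followed by one space.
# The action labels are hypothetical names chosen for this sketch.
PREFIXES = {
    "^": "output-folder",    # folder of this existing file gets the .cpp/.h
    "$": "copy-cp-to-cxx",   # copy the .cp file, giving it the .cxx suffix
    ">": "save-library",     # copy the current library to this file
    "<": "restore-library",  # copy this file back over the current library
}


def classify(line):
    """Return (action, path) for one non-blank line of a batch file."""
    if len(line) > 2 and line[1] == " " and line[0] in PREFIXES:
        return PREFIXES[line[0]], line[2:]
    return "compile", line  # no prefix: a source file to compile


def parse_batch(text):
    """Parse batch-file text; raise ValueError if '$FILES$' is missing."""
    lines = text.splitlines()
    if not lines or "$FILES$" not in lines[0]:
        raise ValueError('first line must contain "$FILES$"')
    steps = []
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the list
        steps.append(classify(line.strip()))
    return steps
```

Running parse_batch over a draft file before handing it to T2C is a cheap way to spot a missing space after a prefix character, since such a line comes back labeled "compile".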

Assuming that these folders exist on your system, and assuming you start with an empty library file, the following list should recompile the whole system when you give it as a text file to T2C:

$FILES$
^ C:\AllDocs\T2C\Turk2C\Nothing.t2
C:\AllDocs\T2C\Turk2C\Nothing.t2
> C:\AllDocs\T2C\Turk2C\Nothing.BTL
C:\AllDocs\T2C\Turk2C\SysLibs.t2
$ C:\AllDocs\T2C\Turk2C\SysLibs.cxx
C:\AllDocs\T2C\Turk2C\TagLibs.t2
$ C:\AllDocs\T2C\Turk2C\TagLibs.cxx
C:\AllDocs\T2C\Turk2C\TAG2T2m14.t2
C:\AllDocs\T2C\Turk2C\T2Cpp.t2
< C:\AllDocs\T2C\Turk2C\Nothing.BTL
C:\AllDocs\T2C\Turk2C\Cstuff.t2
$ C:\AllDocs\T2C\Turk2C\Cstuff.cxx
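
If your files live in some other folder, the same list can be generated rather than edited by hand. The Python sketch below is only an illustration (make_batch_text is a made-up name, not part of the T2 tool set); it assumes every source file is in a single folder and that the generated files should go there too, and it reproduces the list above line for line.

```python
def make_batch_text(folder):
    """Return the text of a full-rebuild batch file for one folder.

    Hypothetical helper: assumes all sources are in 'folder' and that
    the generated .cpp and .h files should be written there as well.
    """
    def full(name):
        return folder.rstrip("\\") + "\\" + name

    lines = [
        "$FILES$",
        "^ " + full("Nothing.t2"),   # generated files go into this folder
        full("Nothing.t2"),          # build the empty base library
        "> " + full("Nothing.BTL"),  # save a copy of the base library
        full("SysLibs.t2"),
        "$ " + full("SysLibs.cxx"),
        full("TagLibs.t2"),
        "$ " + full("TagLibs.cxx"),
        full("TAG2T2m14.t2"),
        full("T2Cpp.t2"),
        "< " + full("Nothing.BTL"),  # restore the saved base library
        full("Cstuff.t2"),
        "$ " + full("Cstuff.cxx"),
        "",                          # blank line marks the end of the list
    ]
    return "\n".join(lines) + "\n"
```

Write the returned text to any text file and give that file to T2C. Whether T2C cares about Windows line endings is not stated here, so if the plain newlines cause trouble, save the file from a Windows text editor instead.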


Most of these same source files are part of the Turk68 build, which makes classic MacOS application programs directly. There is no standard unix way to do a user interface, and I hate command lines, so there's no unix build. Sorry about that, but not very. I'd say "Get a Mac," but they've stopped making the real thing. Maybe they stopped making usable Windows systems too, I don't know. Most of this stuff seems to run in Wine, but I have not checked everything, nor recently.

You are encouraged to experiment with writing your own grammars, and/or making modifications to these two. See "How To Write a Transformational Attribute Grammar" for help getting started doing that.
 

Embedded C

Because T2 is a systems programming language, you need to be able to do things that the "safe" vanilla language does not allow. In the T68 version that takes the form of PEEK and POKE and C0DE functions imported from package "Dangerous". C natively lets you do some of those things -- which is why it's not a safe programming language (see "C++ Considered Harmful") -- so the obvious way to access them from T2 is to allow embedded C the way most C compilers allow embedded assembler. I did that in four ways, which are enabled by importing "Dangerous".

The Java keyword "protected" used in declaring variables or calling functions generates a C++ namespace prefix. Mostly you don't need to use namespaces, but I did some of that in the Win32 glue to prevent name collision in the linker. That seems to happen a lot in C.

C0DE functions are allowed one additional quoted item, a string of C code that is emitted instead of all the rest of the items; T68 ignores this string. The C0DE function has whatever value it is given in the source code.

RAW_CODE("quoted") (one underscore) is a valueless function call that can be used inside a function or method to generate a line of C code from a quoted string. The compiler makes no checks nor assumptions about the copied string, but hey, it's C!

RAW__C0DE (two underscores) before or between declarations, followed by a quoted string, emits that string (without the quotes). If the quoted string is empty, then a second quoted string gives the name for a namespace, or if also empty, ends it. This is particularly useful, because a hyphen at the front of the (quoted) line forces the rest of the line into the .cpp file, while a dollar in the same position forces it into the header file. Normally T2C generates these lines automatically, and then a postprocess step extracts the header information for a separate file. When you know what you are doing, you can do all kinds of dangerous but useful things this way, as you can see in my source code.
 

Known Problems

String literals translate as calls to a constructor function to turn them into pointers to an integer array structure containing the implementation of my "String" data type. Part of the C-fixup code captures all these calls and replaces them with pointers to fixed data objects, one set for each compiled file, but it only works properly if the file has one or more "package"s and a main() that "import"s them. So while you can compile individual "package"s, they probably won't work properly. Don't expect a fix any time soon.
 

Links:

The latest copy of the TAG compiler, with notes on how to use it.

A discussion of the Turkish Demitasse language design.

The current version of T2Cpp.tag with its compiled T2C.exe and supporting files, in a ZIP file.


Tom Pittman
2014 March 14