This is an assembly language translator, which assists in the conversion of
assembly language from one CPU to another. Given an assembly language source
file, AT will attempt to translate the instructions and operands from this
file into those suitable for another CPU.


WHAT IT DOES:
------------
AT performs instruction lookups, and pattern matching on the operands, and
uses a lookup table to convert each instruction/operand-class combination
into an instruction sequence for the target processor. This is an automatic
process which will perform much of the mundane work involved in moving code
between CPUs.


WHAT IT DOES **NOT** DO:
-----------------------
AT does NOT bridge major differences in CPU architecture. The translation
is performed on an instruction-by-instruction basis, and does not alter
logical sequences to accommodate or take advantage of differences between
the source and target processors.

This means that AT works best if you can define a reasonable architectural
mapping between the source and target processors. Registers of the source CPU
must map directly to registers or memory locations on the target CPU, and
these must support similar operations to those available on the source.

Multiple instruction sequences and/or calls to "runtime functions" can be
used to fill in minor gaps; however, major differences in architecture may
require these to excess, resulting in a lot of manual work and a final
result which is large and inefficient.


SOURCE FILE FORMAT:
------------------
The source file to be converted must follow the general syntax of my (DDS)
XASM cross assemblers, with some extensions and limitations. Specifically:

- Labels must begin in Column#1. They may have a ':' at the end, but
  this is not required.

- Instructions must NOT begin in column 1, i.e. they must be indented by
  at least one space or TAB character. If there is a label on a line
  with an instruction, at least 1 space or TAB must exist between the
  label and the instruction... EVEN IF THE LABEL ENDS WITH ':'.

- The operand field must be separated from the instruction by at least
  one space or TAB. Unlike the XASM assemblers, spaces in the operand
  field ARE permitted.

- If there is a comment, it must begin with a comment delimiter. This
  defaults to ';', but can be set to any character with a command line
  option. (With XASM the comment delimiter is optional; AT, however,
  REQUIRES it.)

- To recognize and extract variable "expressions" from the source operands,
  AT uses these (XASM) rules:

  A variable "expression" in the source operand is defined as one or more
  "value elements". A value element is defined as:
    Any constant: n $n @n %n nH nO nQ nB nT nD 'c' * $
    Any symbol, containing the characters: A-Z 0-9 _ ? !
    Any unary operator, followed by another value element: - ~ =
    Any expression contained within '(' and ')'.

  Multiple value elements can make up one expression, provided that
  they are separated by:
    Any binary operator: '+ - * / \ & | ^ < >'
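
The rules above can be modelled with a small recursive recognizer. The
following is only a sketch in Python, not AT's actual code: the function
names are hypothetical, symbols are limited to exactly the characters listed
above (upper case only), and it assumes no spaces inside an expression.

```python
import re

# Characters allowed in symbols; plain numbers (n, nH, nB ...) fit the same class.
WORD = re.compile(r"[A-Z0-9_?!]+")
UNARY = set("-~=")
BINARY = set("+-*/\\&|^<>")

def element_end(s, i):
    """Return the index just past one value element at s[i], or -1 if none."""
    if i >= len(s):
        return -1
    c = s[i]
    if c in UNARY:                      # unary operator + another element
        return element_end(s, i + 1)
    if c == "(":                        # parenthesised expression
        j = expression_end(s, i + 1)
        return j + 1 if j != -1 and j < len(s) and s[j] == ")" else -1
    if c == "'":                        # character constant 'c'
        return i + 3 if s[i + 2:i + 3] == "'" else -1
    if c in "$@%":                      # $n @n %n, or a lone $
        m = WORD.match(s, i + 1)
        if m:
            return m.end()
        return i + 1 if c == "$" else -1
    if c == "*":                        # lone * constant
        return i + 1
    m = WORD.match(s, i)                # number or symbol
    return m.end() if m else -1

def expression_end(s, i=0):
    """Return the index just past the longest expression at s[i], or -1."""
    j = element_end(s, i)
    if j == -1:
        return -1
    while j < len(s) and s[j] in BINARY:
        k = element_end(s, j + 1)
        if k == -1:
            break
        j = k
    return j
```

For the operand "-(BASE+2)*4,X" this sketch stops at the ',', so the
expression it recognizes is "-(BASE+2)*4".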

If your source files do not meet these requirements, you will need to
modify them accordingly. I have included a couple of handy utilities
(XASM2INT and INT2XASM) which assist in conversion of assembly sources
from one form to another.
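
As an illustration of the line layout rules above, here is a minimal Python
sketch that splits one source line into label, instruction, operand and
comment. It is not taken from AT: the function name is hypothetical, and
for simplicity it treats the first comment delimiter on the line as the
start of the comment, even if that character appears inside a character
constant.

```python
def split_source_line(line, comment_char=";"):
    """Return (label, instruction, operand, comment) for one source line."""
    code, found, comment = line.rstrip("\n").partition(comment_char)
    comment = comment.strip() if found else ""

    label = ""
    if code and not code[0].isspace():
        # Text starting in column 1 is a label; a trailing ':' is optional.
        parts = code.split(None, 1)
        label = parts[0][:-1] if parts[0].endswith(":") else parts[0]
        code = parts[1] if len(parts) > 1 else ""

    # The instruction is the first field; the rest (spaces allowed) is the operand.
    fields = code.split(None, 1)
    instruction = fields[0] if fields else ""
    operand = fields[1].strip() if len(fields) > 1 else ""
    return label, instruction, operand, comment
```

For example, "LOOP:  LDA  COUNT + 1 ; next item" splits into the label
"LOOP", the instruction "LDA", the operand "COUNT + 1" and the comment
"next item".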


HOW TO CONFIGURE AT:
-------------------
There are so many processors available that it makes little sense to
try and provide extensive predefined tables for AT. Most code conversion
projects result from outgrowing a dated device, and moving to one of the
"latest and greatest" devices. The source/target combinations are endless,
and within a particular CPU instruction set, every assembler has its own
particular quirks and "features". This makes it very unlikely that you
would find a predefined table for exactly your requirements.

In light of the above, my approach is to make AT easily configurable
to the source->target combination being used. This is accomplished by making
it entirely table driven. All the user has to do is write the tables which
define the instruction conversions for the selected source->target CPU
combination.

AT's internal tables are optimized for speed and compact storage, and are
horribly difficult to code by hand. For this reason, I have also provided
ATC, the AT-Compiler, which accepts an easy to understand source->target
definition file, and generates the internal tables for you.

To configure a working AT, you must follow these steps:

  - Write source->target translation source (.ATC) file
      Use your favorite editor to create: trantbl.ATC

  - Compile .ATC file to .C file with ATC
      Command: ATC trantbl

  - Compile .C file to .OBJ file with DDS Micro-C/PC *1
      Command: CC trantbl -FM

  - If expression translations are required, modify ATT.C
      Use your favorite editor again!

  - Compile ATT.C to ATT.OBJ with DDS Micro-C/PC *1
      Command: CC ATT -FMOP

  - Link AT.OBJ, ATT.OBJ and translation table.OBJ into AT.EXE
      Command: LC -s AT ATT trantbl

*1: AT is compiled with DDS Micro-C for the PC. Micro-C/PC is available
    at no charge from my web site: http://www.dunfield.com

I have provided a "batch" file (MAKEAT.BAT) which will automatically
perform the above steps. Just enter: MAKEAT <translation source file>


THE TRANSLATION SOURCE FILE:
---------------------------
The translation source file defines the source->output conversions that
AT will perform. This file has the following format:

  ; text    <= Lines beginning with ';' are ignored (comments)

  ~n Text   <= Define error message n(0-9) for later use

  sinst soperand|oinst ooperand[|oinst ooperand ...]
    sinst    = Source instruction
    soperand = Source operand     
    oinst    = Output instruction
    ooperand = Output operand
    (output instruction and operand may be repeated)

The source operand may include '$'s, which define variable "expression"
strings. These strings are parsed and internally stored as the operand
is classified by AT.

The rules for recognizing expressions are based on the syntax of my
XASM cross assemblers. If your source code uses different operators,
value items or symbol rules, you may need to preprocess your sources
first to make the expressions recognizable by AT.

AT recognizes an expression as consisting of one or more "value elements".
Please refer to the section "SOURCE FILE FORMAT" for a description of
variable "expressions" and "value elements".

Up to five (5) '$'s may be included in each source operand definition.
When classifying the operand, AT will record up to 5 variable expressions
for later retrieval.

The output operands may include '$n', where n = 0-4, and represents one of
the scanned variable expressions. When AT encounters the '$n' sequence, it
will insert the nth variable expression recorded from the classified source
operand at this point in the output sequence.
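
To make the '$' capture and '$n' insertion concrete, here is a simplified
Python model. It is a sketch under stated assumptions, not AT's algorithm:
each '$' in the source operand is matched as a plain wildcard rather than by
the XASM expression rules, the function names are hypothetical, and the rule
lines used with it are invented examples rather than entries from a real
table.

```python
import re

def _split_pair(text):
    """Split 'inst operand' on the first run of whitespace."""
    fields = text.strip().split(None, 1)
    return fields[0], (fields[1].strip() if len(fields) > 1 else "")

def parse_rule(line):
    """Split 'sinst soperand|oinst ooperand|...' into its parts."""
    source, *outputs = line.split("|")
    sinst, soperand = _split_pair(source)
    return sinst, soperand, [_split_pair(item) for item in outputs]

def match_operand(pattern, operand):
    """Return the text captured by each '$' in pattern, or None if no match."""
    parts = [re.escape(p) for p in pattern.split("$")]
    m = re.fullmatch("(.+?)".join(parts), operand)
    return list(m.groups()) if m else None

def expand(ooperand, exprs):
    """Replace each '$n' (n = 0-4) with the nth captured expression."""
    return re.sub(r"\$([0-4])", lambda m: exprs[int(m.group(1))], ooperand)

def translate_line(rule, instruction, operand):
    """Apply one rule; return the output lines, or None if it does not apply."""
    sinst, soperand, pairs = parse_rule(rule)
    if instruction != sinst:
        return None
    exprs = match_operand(soperand, operand)
    if exprs is None:
        return None
    return [f"{oinst} {expand(ooperand, exprs)}".rstrip()
            for oinst, ooperand in pairs]
```

With the invented rule "MOV $,$|LD $1,$0|NOP", the source line
"MOV R1,TABLE" produces the two output lines "LD TABLE,R1" and "NOP".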

Before inserting a variable expression in the output sequence, AT will first
call the user defined function "translate()", passing it the variable string,
the instruction/operand identifier, and the value of 'n', providing the
ability to translate the expressions. See the comments in ATT.C for more
information.
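
The real translate() is a C function in ATT.C, and its exact signature is
described there, not here. Purely to illustrate the idea of a per-expression
hook, the following Python model passes each expression through a
hypothetical hook before it is inserted. The parameter names and the
hex-prefix rewrite are assumptions chosen for the example.

```python
import re

def translate(expr, ident, n):
    """Hypothetical hook: rewrite '$hex' constants as '0xhex'.

    expr  = the variable expression string
    ident = the instruction/operand identifier (unused in this example)
    n     = which '$n' is being inserted (unused in this example)
    """
    return re.sub(r"\$([0-9A-Fa-f]+)", r"0x\1", expr)

def substitute(ooperand, exprs, ident, hook=translate):
    """Insert each '$n' expression after passing it through the hook."""
    def insert(m):
        n = int(m.group(1))
        return hook(exprs[n], ident, n)
    return re.sub(r"\$([0-4])", insert, ooperand)
```

A hook of this kind is where differences in constant syntax between the
source and target assemblers would be handled.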

At any point in the output operands, you may also include '~n', where
n = 0-9. This will cause the translator to output a predefined diagnostic
message, allowing you to emit warnings for specific translations.

The error messages are defined with a line beginning with ~n. Error
message definitions must occur before their first use.
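
The '~n' mechanism can be pictured as a small message table plus a scan of
the output operand. This Python sketch is an illustration only: the function
names are hypothetical, and it assumes that the '~n' marker itself is
removed from the operand text once its message has been issued.

```python
import re

def load_messages(lines):
    """Collect '~n Text' definitions into a {n: text} table."""
    messages = {}
    for line in lines:
        m = re.match(r"~(\d)\s+(.*)", line)
        if m:
            messages[int(m.group(1))] = m.group(2).strip()
    return messages

def expand_diagnostics(ooperand, messages):
    """Return (operand without '~n' markers, list of messages to issue)."""
    issued = []

    def note(m):
        issued.append(messages[int(m.group(1))])
        return ""  # assumption: the marker does not appear in the output

    return re.sub(r"~(\d)", note, ooperand), issued
```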


USING THE TRANSLATOR:
--------------------
Once you have set up the translator, you may run it simply by entering
the command: AT <source_file_name>

This will read the source file, and write the output to the standard
output (console). The output will include error messages and warnings
from the translator. You may redirect this output using the command
line '>' operator.

Following the file name, you may specify any of these command line
options:

-DB      - enable DeBug messages

  This option is very useful when you are trying to figure out why a
  translation is not occurring as expected. As each source line is read
  and processed, AT will display the parsed sections (Label, Instruction,
  Operand and Comment), the instruction index, and operand classification
  that it has determined, as well as the translation index that is being
  used. (Refer to the comments in the ATC output file for the meaning
  of these).

-MCn     - Define how multi-instruction comments are handled

  These options control how comments are handled when the translator
  generates more than one output line for one source line:
  -MC/-MC0 = The extra instructions get no comment at all.
  -MC1     = The extra instructions get a "" comment only.
  -MC2     = Full source comment is repeated for each instruction.

-RC      - Remove comments from output

  This option causes AT to remove all comments from the output file.
  It is most useful in conjunction with -SC (below) to avoid the clutter
  of the source comments being duplicated.

-SC      - Include Source as comments

  This option causes AT to write out each source line as a comment to
  the output file before performing each translation. This allows you
  to see the original code as well as the translated code.

CS=[c]   - set Source comment delimiter character

  This option specifies the comment delimiter character recognized in
  the source file. AT will use this character to identify the location
  of source file comments. The default value is ';'.

CO=[c]   - set Output comment delimiter character

  This option specifies the comment delimiter character that is used
  in the output file. AT will write this character at the beginning
  of any comment. The default value is ';'.

LD=[c]   - set Label Delimiter character

  Specifies a character to be appended to any label written to the
  output file (normally ':' if used). Default value is no delimiter.

IC=n     - set Instruction Column

  Specifies the output column for instructions. AT will output all
  instructions beginning in this column. If a label is long enough
  that it would prevent the instruction from occurring in this column,
  AT will place the label on a preceding line by itself. The default
  instruction column is 8.

OC=n     - set Operand Column

  Specifies the output column for operands. AT will output all operands
  beginning in this column. If the instruction is long enough to prevent
  the operand from starting in this column, AT will output a single
  space. The default operand column is 16.

CC=n     - set Comment Column

  Specifies the output column for comments. AT will output all comments
  which occur with a label and/or instruction beginning at this column.
  If the instruction is long enough to prevent the comment from starting
  in this column, AT will output a single space. The default comment column
  is 40.

TW=n     - Set output TAB width

  Specifies the tab stop settings for the output file. AT will use TABs
  in the output file to position to the Instruction, Operand and Comment
  columns, and will assume that they will "stop" at this interval. If the
  desired column lies between tab stops, AT will pad with spaces. If TW=
  is set to 0 (zero), AT will use spaces only to position the output file.
  The default tab stop value is 8.
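
The column and tab settings above interact as follows, shown here as a
Python sketch rather than AT's own code. The function name is hypothetical,
and columns are assumed to be counted from 0, so that the defaults 8, 16
and 40 fall on tab stops when TW=8.

```python
def pad_to(current, target, tab_width=8):
    """Return the TABs/spaces needed to move from column current to target."""
    if current >= target:
        # Field already past its column: operands and comments get one space.
        return " "
    if tab_width == 0:
        return " " * (target - current)   # TW=0: spaces only
    padding = ""
    while True:
        next_stop = (current // tab_width + 1) * tab_width
        if next_stop > target:
            break
        padding += "\t"
        current = next_stop
    # Target lies between tab stops: finish with spaces.
    return padding + " " * (target - current)
```

For example, moving from column 3 to column 10 with TW=8 takes one TAB
(to column 8) followed by two spaces.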

OF=file  - set Output File

  Specifies the name of a file to write with the translated instructions.
  AT will write the translated code to this file instead of the standard
  output (console) device. Error/Warning messages will be unaffected.

EF=file  - set Error/warning File

  Specifies the name of a file to write with the error/warning messages.
  AT will write errors and warnings to this file instead of the standard
  output (console) device. The translation output will be unaffected.


MISC NOTES, THINGS TO WATCH OUT FOR, AND GENERAL HINTS:
------------------------------------------------------
Usually, the most difficult part of a code translation is handling of
the processor condition flags. Many processors handle the flags quite
differently from others. For example, Motorola processors tend to set
the condition flags on every load or store from/to memory, while Intel
processors generally do not.

This means that conditional jumps may require additional tests on one
processor while they do not on another. Alternatively, code for one
CPU may rely on the flags having not been altered by a memory reference,
and this may not be the case on the new target. Generating an extra
test in the first case, or saving/restoring the flags around every
memory reference in the latter case is never practical, as this results
in a huge code overhead. In both cases, you will find that most of the
time, the extra action is unnecessary and inefficient.

Outputting a warning message every time there is a potential flag
discrepancy is also not practical, as you will quickly get overrun
with excessive warning messages.

This is a case where there is simply no substitute for good old fashioned
"manual labor"... You will have to take the time to scan the translated
code, and make whatever modifications may be necessary to ensure that the
condition flags are handled correctly in the cases where they are used
(mainly conditional jumps). Although this seems tedious, I assure you that
scanning the output of the translator beats rewriting and re-entering all
of the code hands down.


On a related note, you should not try to achieve a 100% translation. For
all but the most trivial programs, this simply is not a reasonable goal.
You should instead use the translator to automate as much of the mundane
and tedious work as possible. You must still plan on a fair amount of
manual effort and "tweaking".


One feature of AT is that if it cannot perform a translation on any given
source line, it will issue a diagnostic, then write that source line to the
output file unchanged, and proceed with the translation. This can be very
handy, as it allows you to leave certain things for manual translation.
Some instructions simply do not have a direct replacement sequence on the
target processor. Some use hardware or other system features which will not
be available on the target. Some are just "too dangerous" to let a translator
handle for you (stack manipulations, interrupt processing etc.).
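
The pass-through behaviour described above amounts to the following loop,
given here as a Python sketch with hypothetical names. The per-line
translator is supplied by the caller and returns None when it has no
translation, and the diagnostic wording is invented for the example.

```python
def translate_all(lines, translate_one):
    """Translate each line; copy untranslatable lines through unchanged."""
    output, diagnostics = [], []
    for number, line in enumerate(lines, start=1):
        result = translate_one(line)
        if result is None:
            diagnostics.append(f"line {number}: no translation, copied unchanged")
            output.append(line)       # leave it for manual translation
        else:
            output.extend(result)     # one source line may yield several
    return output, diagnostics
```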


When performing manual translation, I find it best to use the -SC option
to include the original source code as comments. As you scan and fix up the
code, just delete these comments as you go. Some people prefer to work with
two files, one containing the "final" code which you will edit as you work
on it, and one containing the original output with source comments. If you
choose this approach, it is usually best to use the -SC and -RC options
together when creating this reference file. That way you will see the
original code with comments, and the translator output without comments.
