HP-41 assembler in C language

The purpose of the asm41 program is to assemble source files written in KHP-41 assembly language to listing files and, optionally, to object files.

This document contains the instructions to compile and use the asm41 program, and the KHP-41 assembly language reference guide.

Copyright © 2015 Kari Pasanen

Revision history

Source code

The source code is in the following modules:

asm41.c asm41.h

Compiling

The program has been tested only when compiled with gcc (the GNU Compiler Collection).

Use the following command to compile the program in the directory where you have placed the source code modules:

    gcc -o asm41 asm41.c

The example assumes you are using Linux. Modify it in the appropriate way to fit your actual shell and operating system.

Usage

Invoke the assembler by a command of the following type:

    asm41 sourcefile >listingfile 2>errorfile

The example assumes you are using the bash shell under Linux. Modify it in the appropriate way to fit your actual shell and operating system.

Redirecting the standard output stream (stdout) to listingfile and the standard error stream (stderr) to errorfile is optional, but you will generally want to do it. The sourcefile has no default extension; specify its full name in the invocation, along with the full names of listingfile and errorfile.

The asm41 program reads the source in two passes. During the first pass the program constructs the symbol table while it prints the listing to stderr (forward references of symbols show false values). If there is an error in the source, it is reported and the program stops at the first error encountered. During the second pass the program prints the final listing to stdout.

Object code is optionally generated on the second pass. The object file contains the object code that belongs to the ROM page defined in the source, and the calculated checksum of the page is the last word of the file. Code may be assembled without an object page defined, or at locations outside of it. It is also possible to generate object files for multiple pages.

File types

The file types used and generated by the asm41 program, with their recommended filename extensions are:

Object file format

The object file consists of 4096 16-bit words in big-endian byte order. This is the .rom file format that can be directly downloaded into a SY-41CL.
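As a minimal sketch of the byte order described above, the following C function serializes one page of words with the high byte of each word first. The names PAGE_WORDS and page_to_big_endian are hypothetical and are not taken from asm41.c.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_WORDS 4096  /* words per object file, as stated above */

/* Serialize a page of 16-bit words into bytes, high byte first.
   out must have room for 2 * PAGE_WORDS bytes. */
void page_to_big_endian(const uint16_t *page, uint8_t *out)
{
    for (size_t i = 0; i < PAGE_WORDS; i++) {
        out[2 * i]     = (uint8_t)(page[i] >> 8);    /* most significant byte */
        out[2 * i + 1] = (uint8_t)(page[i] & 0xFF);  /* least significant byte */
    }
}
```

Writing the resulting 8192 bytes to a file in binary mode would then give a file of the size implied by the format.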

Assembly language

The assembly language recognized by the asm41 program is KHP-41. This section is a brief reference guide to the language. It assumes familiarity with some other HP-41 assembly dialect, that is, with the mnemonics and their meaning in the operation of the processor.

The author does not expect instruction mnemonics to be a real problem. If you nevertheless have trouble recognizing one, i.e., translating it into the HP-41 assembly dialect you are already familiar with, you can assemble a short program that uses the mnemonic and analyze the resulting instruction codes.

Listing lines output by the assembler are not pure assembly language. They are documented here after the source lines because the documentation of source lines defines some terms needed to understand what kinds of lines there are in the listing.

Character set

The character set assumed for the assembly source is ASCII. Technically, all eight-bit characters except newlines ('\n') are allowed in comments and in parameters enclosed in braces. However, null characters should not be used in the source because their effect is unpredictable.

Source lines

The assembly source consists of lines and line separators. The line separator is a newline ('\n').

Characters classified as whitespace characters are space (' ') and horizontal tab ('\t').

A line may begin with any number of whitespace characters. When the leading whitespace is stripped, the next character, if there is any before the newline, classifies the line in one of the following categories:

  1. Empty line.
  2. A whole-line comment. Begins with a period ('.').
  3. A directive line. Begins with a percent sign ('%').
  4. An instruction line. Begins with any other character.
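The classification above can be sketched in C as follows. The enum and function names are hypothetical; the sketch only illustrates the rule and is not the code used in asm41.c.

```c
/* Line categories, in the order listed above. */
enum line_kind { LINE_EMPTY, LINE_COMMENT, LINE_DIRECTIVE, LINE_INSTRUCTION };

/* Classify one source line by its first character after leading whitespace. */
enum line_kind classify_line(const char *line)
{
    while (*line == ' ' || *line == '\t')   /* strip leading whitespace */
        line++;
    switch (*line) {
    case '\0':
    case '\n': return LINE_EMPTY;
    case '.':  return LINE_COMMENT;
    case '%':  return LINE_DIRECTIVE;
    default:   return LINE_INSTRUCTION;
    }
}
```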

Listing lines

An empty line in the source is output as an empty line in the listing.

A whole-line comment in the source is output as a whole-line comment in the listing starting at column 1.

Other source lines and lines added by the assembler based on the source are output in the listing in the following way. For lines corresponding to directive lines in the source:

And for lines corresponding to instruction lines in the source:

The assembler adds comments to lines based on jump instruction lines ("=nnnn" form of parameter to an absolute jump, "$+nn" or "$-nn" to a short relative jump, if the parameter in the source is in another form) and to lines generated by macro expansions (display characters).

If you wish to convert a listing file to a source file, follow these steps:

  1. Preserve lines that begin with '.' (whole-line comments).
  2. Remove lines that don't begin with '.' and where there is a '+' in column 10 (lines added by the assembler).
  3. Remove columns 1–10 from lines that don't begin with '.' (columns added by the assembler).
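The three steps above can be sketched as a C function that handles one listing line at a time. The function name is hypothetical, and the sketch assumes that column numbers are 1-based, so column 10 is the character at index 9.

```c
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the source text recovered from one listing line,
   or NULL when the line was added by the assembler and must be dropped. */
const char *listing_to_source(const char *line)
{
    size_t len = strcspn(line, "\n");   /* line length without the newline */

    if (line[0] == '.')
        return line;                    /* step 1: keep whole-line comments */
    if (len >= 10 && line[9] == '+')
        return NULL;                    /* step 2: drop assembler-added lines */
    if (len <= 10)
        return line + len;              /* nothing left after columns 1-10 */
    return line + 10;                   /* step 3: strip columns 1-10 */
}
```

A caller would loop over the listing file, write every non-NULL result to the new source file and skip the NULL results.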

Syntax notation

Elements in italics are replaceable with appropriate content. Whitespace is not allowed within them if not otherwise stated. At least one whitespace character is needed between the elements but more is allowed.

Elements enclosed in brackets ([]) are optional in the general syntax but either mandatory or forbidden in the context of an actual directive or instruction.

Common syntactic elements

comment

The comment may contain any characters.

hexconstant

The hexconstant is a non-empty string of hexadecimal digits, i.e., characters '0' through '9' and 'A' through 'F', with maximum length of four.
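A minimal sketch of a parser for this element is shown below. The function name is hypothetical. The sketch follows the text literally and rejects lowercase digits; whether the actual program accepts them is not stated here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Parse a hexconstant: one to four digits from 0-9 and A-F.
   On success stores the value in *value and returns true. */
bool parse_hexconstant(const char *s, unsigned *value)
{
    unsigned v = 0;
    size_t len = 0;

    for (; s[len] != '\0'; len++) {
        char c = s[len];
        unsigned digit;

        if (c >= '0' && c <= '9')
            digit = (unsigned)(c - '0');
        else if (c >= 'A' && c <= 'F')
            digit = (unsigned)(c - 'A') + 10;
        else
            return false;               /* not a hexadecimal digit */
        if (len == 4)
            return false;               /* a fifth digit is too many */
        v = v * 16 + digit;
    }
    if (len == 0)
        return false;                   /* empty string */
    *value = v;
    return true;
}
```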

pagenum

The pagenum is a hexconstant of exactly one character in length.

symbol

The symbol is a non-empty string of characters with maximum length of MAX_SYMBOL_LENGTH. The character set is not restricted otherwise but the first character cannot be '$' or '=' and there cannot be any commas (',') or braces ('{' or '}') in a symbol.
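The restrictions above can be sketched as a validity check in C. The function name is hypothetical, and MAX_SYMBOL_LENGTH is given its default value of 10. The sketch does not check for whitespace or periods, because those are permitted only in the braced form of a directive and the check is therefore context dependent.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOL_LENGTH 10   /* default value of the macro in asm41.c */

/* Check the symbol rules: length, first character, forbidden characters. */
bool is_valid_symbol(const char *s)
{
    size_t len = strlen(s);

    if (len < 1 || len > MAX_SYMBOL_LENGTH)
        return false;
    if (s[0] == '$' || s[0] == '=')
        return false;                   /* forbidden first characters */
    return strpbrk(s, ",{}") == NULL;   /* no commas or braces anywhere */
}
```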

The difference between local and global symbols is that only global symbols and their values are output by the EXPORT directive.

org-relative-reference

The org-relative-reference is in one of the following forms:

symconstant

The symconstant is a parameter which can appear in one of the following alternative forms:

displaystring

The displaystring is a non-empty string of characters that represent HP-41 display characters. The character set is restricted to the convertible characters whose ASCII values are 32 (' ') through 111 ('o').
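The range restriction can be sketched as follows. The function name is hypothetical, and the sketch checks only the allowed ASCII range; it does not perform the conversion to display character values.

```c
#include <stdbool.h>

/* True when s is non-empty and every character is in ASCII 32..111. */
bool is_displaystring(const char *s)
{
    if (*s == '\0')
        return false;                   /* must be non-empty */
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (c < 32 || c > 111)
            return false;               /* outside ' ' through 'o' */
    }
    return true;
}
```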

The mapping between an ASCII character and an HP-41 display character is obvious except for the following characters:

filename

The filename is a non-empty string of characters. It may contain any characters except braces.

type_N_parameter

The type_N_parameter is a parameter of some class 0 instructions. Its valid values are single hexadecimal digits, i.e. characters '0' through '9' and 'A' through 'F'.

type_D_parameter

The type_D_parameter is a parameter of some class 0 instructions. Its valid values are decimal numbers 0 through 15. The binary representations of the instructions shuffle these values, but the parameter value used in the assembly language has a logical meaning that refers to a specific flag, pointer value or rotation count.

type_TEF_parameter

The type_TEF_parameter is a parameter of class 2 instructions. The meaning of the parameter is the register field the instruction operates on. Its valid values are PT, X, WPT, ALL, PQ, XS, M, S.
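A lookup of the field name could be sketched as below. The array and function names are hypothetical (the program's own table is TEF_param in asm41.h). The returned index is simply the position in the list above; whether it equals the value encoded in the instruction word is an assumption that this document does not confirm.

```c
#include <string.h>

/* Field names in the order given above (hypothetical copy of the list). */
static const char *const TEF_NAMES[8] = {
    "PT", "X", "WPT", "ALL", "PQ", "XS", "M", "S"
};

/* Returns the position of name in the list (0..7), or -1 if it is invalid. */
int tef_index(const char *name)
{
    for (int i = 0; i < 8; i++)
        if (strcmp(name, TEF_NAMES[i]) == 0)
            return i;
    return -1;
}
```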

Directives

General syntax of a directive line

%directive [parameter] [.comment]

or

%directive [{parameter}] [.comment]

The directive is one of ORG, LABEL, GLABEL, EQU, GEQU, NAME, MESSAGE, DMESSAGE, INCLUDE, EXPORT, QUIET_ON, QUIET_OFF, INITPAGE, OBJDUMP, TURBODIS or TURBOEN.

The actual parameter can be enclosed in braces ('{}'). In the braced form a period ('.') or whitespace characters are allowed within the parameter but braces are not allowed.
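The two parameter forms can be sketched with a small C helper. The function name is hypothetical. The sketch assumes that an unbraced parameter ends at the first whitespace character or period, which is inferred from the statement that these characters are allowed only in the braced form; the actual program may behave differently.

```c
#include <stddef.h>
#include <string.h>

/* Copies the parameter that starts at s into out (capacity outsz).
   Returns a pointer just past the parameter, or NULL on a syntax error. */
const char *extract_parameter(const char *s, char *out, size_t outsz)
{
    size_t len;
    const char *end;

    if (*s == '{') {
        s++;                              /* skip the opening brace */
        len = strcspn(s, "{}\n");
        if (s[len] != '}')
            return NULL;                  /* unterminated or nested brace */
        end = s + len + 1;                /* continue after the closing brace */
    } else {
        len = strcspn(s, " \t.\n");       /* assumption: these end the parameter */
        end = s + len;
    }
    if (len == 0 || len >= outsz)
        return NULL;                      /* empty, or too long for out */
    memcpy(out, s, len);
    out[len] = '\0';
    return end;
}
```

The returned pointer is where a caller would continue looking for an optional comment.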

Directive ORG

Syntax:

%ORG hexconstant

or

%ORG {hexconstant}

This directive assigns the value given as the parameter hexconstant to the variable org, which means the assumed location of the object code (the page number may be unknown if the source code is page-relative; then use any convenient page number).

An ORG directive is mandatory before the directives LABEL, GLABEL, NAME, MESSAGE or DMESSAGE and before any instruction.

Directive LABEL

Syntax:

%LABEL symbol

or

%LABEL {symbol}

A period ('.') or whitespace characters are allowed in the symbol in the braced form of this directive.

On the first pass this directive assigns the current value of org to the symbol. The symbol and its value are stored in the symbol table as a local symbol. On the second pass this directive does nothing.

Directive GLABEL

Syntax:

%GLABEL symbol

or

%GLABEL {symbol}

A period ('.') or whitespace characters are allowed in the symbol in the braced form of this directive.

On the first pass this directive assigns the current value of org to the symbol. The symbol and its value are stored in the symbol table as a global symbol. On the second pass this directive does nothing.

Directive EQU

Syntax:

%EQU symbol,hexconstant

or

%EQU {symbol,hexconstant}

A period ('.') or whitespace characters are allowed in the symbol in the braced form of this directive.

On the first pass this directive assigns the value hexconstant to the symbol. The symbol and its value are stored in the symbol table as a local symbol. On the second pass this directive does nothing.

Directive GEQU

Syntax:

%GEQU symbol,hexconstant

or

%GEQU {symbol,hexconstant}

A period ('.') or whitespace characters are allowed in the symbol in the braced form of this directive.

On the first pass this directive assigns the value hexconstant to the symbol. The symbol and its value are stored in the symbol table as a global symbol. On the second pass this directive does nothing.

Directive NAME

Syntax:

%NAME displaystring

or

%NAME {displaystring}

This directive acts as a macro that expands the displaystring to CON instructions with their parameters being converted character by character in reverse order from the restricted ASCII character set of the parameter to HP-41 display character values. The constant 0x80 is added to the first value (at lowest org but at end of string).
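The ordering and the added constant can be illustrated with the following sketch. The function name is hypothetical, and the input is assumed to hold display character values that have already been converted from ASCII, so the conversion itself is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* disp holds n converted display character values in string order.
   out receives n words in memory order (lowest org first). */
void expand_name(const uint16_t *disp, size_t n, uint16_t *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = disp[n - 1 - i];     /* reverse order: last character first */
    if (n > 0)
        out[0] += 0x80;               /* mark the word at the lowest org */
}
```

For a three-character name with converted values 1, 2, 3, the words in memory order would be 0x83, 2, 1.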

In addition, on the first pass this directive assigns the current value of org after the macro expansion to a symbol which is named after the first MAX_SYMBOL_LENGTH characters of displaystring. The symbol and its value are stored in the symbol table as a local symbol.

This directive is used to expand a function name as display character constants and to save the beginning address of the function's executable code with its name (cut to MAX_SYMBOL_LENGTH characters) to the symbol table.

The maximum length of displaystring in this directive is twelve because the string must fit in the display.

Directive MESSAGE

Syntax:

%MESSAGE displaystring

or

%MESSAGE {displaystring}

This directive acts as a macro that expands the displaystring to CON instructions with their parameters being converted character by character in forward order from the restricted ASCII character set of the parameter to HP-41 display character values. The constant 0x100 is added to the first value.

Directive DMESSAGE

Syntax:

%DMESSAGE displaystring

or

%DMESSAGE {displaystring}

This directive acts as a macro that expands the displaystring to CON instructions with their parameters being converted character by character in forward order from the restricted ASCII character set of the parameter to HP-41 display character values. The constant 0x200 is added to the last value.
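The difference between the MESSAGE and DMESSAGE expansions can be sketched as follows. The function names are hypothetical, and the input is assumed to hold display character values already converted from ASCII.

```c
#include <stddef.h>
#include <stdint.h>

/* MESSAGE: forward order, 0x100 added to the first value. */
void expand_message(const uint16_t *disp, size_t n, uint16_t *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = disp[i];
    if (n > 0)
        out[0] += 0x100;
}

/* DMESSAGE: forward order, 0x200 added to the last value. */
void expand_dmessage(const uint16_t *disp, size_t n, uint16_t *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = disp[i];
    if (n > 0)
        out[n - 1] += 0x200;
}
```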

Directive INCLUDE

Syntax:

%INCLUDE filename

or

%INCLUDE {filename}

This directive includes the filename into the assembly source. File inclusions may be nested. When the included file ends, reading of the original file continues at the next line after the INCLUDE directive.

There are no specific restrictions on the contents of the included file.

Directive EXPORT

Syntax:

%EXPORT filename

or

%EXPORT {filename}

This directive outputs the global symbols and their values from the symbol table as a list of EQU directives to the file named filename. The resulting file can be used in other source files as an include file.
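The filtering of global symbols can be sketched as below. The struct and function names are hypothetical and do not reflect the symbol table layout in asm41.c. The exact text of the exported lines (braced form, four hexadecimal digits) is also an assumption, chosen so that symbols containing periods or whitespace would remain valid when the file is included.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical symbol table entry. */
struct symbol {
    const char *name;
    unsigned value;
    bool global;
};

/* Formats one EQU directive per global symbol into buf.
   Returns the number of symbols exported, or -1 if buf is too small. */
int export_symbols(char *buf, size_t size, const struct symbol *tab, size_t count)
{
    size_t used = 0;
    int exported = 0;

    if (size == 0)
        return -1;
    buf[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        int n;

        if (!tab[i].global)
            continue;                    /* local symbols are not exported */
        n = snprintf(buf + used, size - used, "%%EQU {%s,%04X}\n",
                     tab[i].name, tab[i].value & 0xFFFFu);
        if (n < 0 || (size_t)n >= size - used)
            return -1;                   /* output would be truncated */
        used += (size_t)n;
        exported++;
    }
    return exported;
}
```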

Directives QUIET_ON and QUIET_OFF

Syntax:

%QUIET_ON
%QUIET_OFF

These directives control the output of the program by setting and resetting the quiet mode. No output is made in quiet mode except in the case of an error. Non-quiet is the default mode when the asm41 program starts.

It may be convenient to set quiet mode before the inclusion of files containing EQU directives that define symbols of the HP-41 operating system and other libraries and to reset quiet mode after the inclusion.

Directive INITPAGE

Syntax:

%INITPAGE pagenum

This directive defines the object page. On the second pass of source code the directive zeros the array preallocated for object code storage.

The programmer must keep the ORG and INITPAGE directives consistent with each other in order to get the intended object file output. It is wise to place this directive before any instructions belonging to the object page.
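The relation between org and the object page can be sketched as follows. The function names are hypothetical. The sketch assumes 16-bit addresses in which the highest hexadecimal digit is the page number and the remaining three digits are the offset within the page, which is consistent with 4096-word pages and a one-digit pagenum.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when org lies in the page selected by INITPAGE. */
bool in_object_page(uint16_t org, unsigned pagenum)
{
    return (unsigned)(org >> 12) == pagenum;   /* top hex digit is the page */
}

/* Offset of org within its page, 0..4095. */
unsigned page_offset(uint16_t org)
{
    return org & 0x0FFFu;
}
```

Under this assumption, code assembled after %ORG F000 would land in the object page only if the page was defined with %INITPAGE F.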

Directive OBJDUMP

Syntax:

%OBJDUMP filename

or

%OBJDUMP {filename}

This directive sets the value of the variable org to the last word of the current object page. On the second pass the directive calculates the checksum of the object page and outputs it as a macro expansion line, then dumps the current object page to the object file named filename. A 32-bit CRC of the object page is also calculated and printed on a comment line as a hexadecimal number.

Directives TURBODIS and TURBOEN

Syntax:

%TURBODIS
%TURBOEN

These directives control the setting and resetting of the turbo mode bits (the two highest bits of the word) in the object code. The turbo mode is a feature of the SY-41CL.

The turbo mode bits need to be set in certain timing loops. Place a TURBODIS directive before the first instruction that needs to be run at normal speed and a TURBOEN directive after the last such instruction. The default at the beginning of assembly is turbo mode enabled.

Instructions

General syntax of an instruction line

instruction [parameter] [.comment]

or

instruction [{parameter}] [.comment]

The instruction is one of the instruction mnemonics known to the program on the basis of the instruction descriptors in asm41.h.

In some instructions the actual parameter can be enclosed in braces ('{}'). In the braced form a period ('.') or whitespace characters are allowed within the parameter but braces are not allowed.

Classes of instructions

There are five classes of instructions: CON, 0, 1, 2 and 3.

The instructions in the numbered classes belong to their class according to the two lowest bits of the first word of the instruction code.
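As a short illustration of this rule, the class number can be read from an instruction word with a bit mask. The function name is hypothetical.

```c
#include <stdint.h>

/* Class number (0..3) from the two lowest bits of the first word. */
unsigned instruction_class(uint16_t first_word)
{
    return first_word & 0x3u;
}
```

Because only the two lowest bits are examined, the result does not depend on the higher bits of the word.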

Class CON instructions

Class CON instructions have a single parameter. They are not actually meant to be executed. Their meaning is constant data, one or two words in memory.

The class CON instructions with their parameters are:

Class 0 instructions

Class 0 instructions may be parameterless or have a parameter of type type_N_parameter, type_D_parameter or symconstant. Except LDI, which occupies two words, they occupy one word in memory.

The class 0 instructions with their parameters are:

Notice: A '$' in the symconstant parameter of an LDI instruction refers to the assumed location of the constant data word.

Class 1 instructions

Class 1 instructions mean absolute jumps to the address given as symconstant. They occupy two words in memory.

The class 1 instructions with their parameters are:

Class 2 instructions

Class 2 instructions, the arithmetic instructions, have a parameter of type type_TEF_parameter. They occupy one word in memory.

The class 2 instructions with their parameters are:

Class 3 instructions

Class 3 instructions mean short relative jumps to the address given as symconstant. They occupy one word in memory.

The class 3 instructions with their parameters are:

Errors

The program stops when it encounters an error. This error list shows the return value and the text printed into the standard error stream (stderr).

I/O errors

Constraints violations

Syntax errors

Second pass errors

Duplicate symbol error

Object page errors

Customization

The program can be customized in two ways: by changing the values of macros present near the beginning of asm41.c, or by editing the instruction descriptors in asm41.h.

Customization by macros

The default values of the macros are:

#define MAX_LINE_LEN 79
#define COMMENT_COLUMN 34
#define MAX_SYMBOL_LENGTH 10
#define NUM_SYMBOLS 2000
#define NUM_SYMBOL_USES 4000

A directive or instruction line with a comment is printed so that the period ('.') marking the beginning of the comment, irrespective of how it is placed on the source line, begins at column COMMENT_COLUMN.

For the meanings of the other macros refer to the error list.

Customization by instruction descriptors

You can rename the assembly language instruction mnemonics to whatever you like by modifying the descriptors in asm41.h. Change only the mnemonic in the third column. Leave the other columns alone unless you know exactly what the effect of your change will be.

The maximum length of the mnemonic is by default 7. If you use longer mnemonics, change the storage allocation of the member mnem (third member) of struct descriptor from 8 to your maximum length + 1.

You could also rename the parameters of type TEF in the array TEF_param at the end of asm41.h. Keep the right order.

Supporting material

HP-41 mainframe entry points as an include file:

sysrom.i41

License

This program is free software: you can redistribute it or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

The full GNU General Public License text is available at http://www.gnu.org/licenses/gpl.html.

The attribution of this program to its original author must be preserved in any direct copy or modified version of this program. This is a supplement to the terms of the GNU General Public License version 3 allowed by them in section 7. Additional Terms, point b).

Author

The original author of this program is
Kari Pasanen, Jyväskylä, Finland,
Kari Pasanen <kari.pasanen@iki.fi>.