duncanamps / xa80

XA80 Multi-Platform Cross Assembler for x80 processors (8080, 8085, Z80, Z180)

Home Page: https://github.com/duncanamps/xa80

License: GNU General Public License v3.0

Languages: Batchfile 0.08%, Pascal 57.38%, Assembly 42.16%, Shell 0.04%, PowerShell 0.33%, Makefile 0.02%

Topics: 8080, 8085, assembler, cross-platform, crossplatform, linux, macos, macro, windows, z180, z80, cross, fpc, lazarus, pascal, xa80

xa80's Introduction

xa80 V1.0(DEV)

xa80 - X-Assembler for x80 processors

This is V1.0 (Development), which is still being worked on and is therefore incomplete. For the latest stable release, please see Release Package V0.3.1.

Synopsis

xa80 is a command line tool that allows the cross assembly of source files aimed at x80 processors (8080, 8085, Z80, Z180). It takes an input file (e.g. myfile.z80 or test.asm) and creates the following output files, some of which are optional:

  • .hex file containing output information in the industry standard Intel .hex format
  • .lst file containing a listing of the assembler output
  • .log file containing errors encountered during the assembly
  • .map file containing the symbol information
  • .com file containing the actual machine code which can be executed on, for example, CP/M machines
  • .obj80 file containing object code (fixed/relocatable segments, import/export info, fixup tables)

Key features

Here are some of the key features of xa80:

  • Open source
  • Two pass assembler
  • Supports mnemonics from different processors (8080, 8085, Z80, Z180), built in as standard
  • Ability to add additional opcode maps as external files
  • Opcode compiler so you can add your own secret/hidden instructions and extend to other processor variants in the "family"
  • Macro capability with nested expansion of macros allowed
  • Conditional assembly with IF / IFDEF / IFNDEF statements
  • Repetition through REPEAT and WHILE statements
  • Full expression evaluator with many functions and string handling capability
  • Segmented model with fixed and relocatable segments
  • Rich set of command line parameters
  • XA80 environment variable for commonly used parameters
  • Runs on any hardware supported by Lazarus/FPC (Windows, macOS, Linux, etc.)
  • Fast - will assemble the CP/M BDOS22.ASM (3,289 lines) and CCP22.ASM files (1,325 lines) with map file and listing outputs (total 105 pages) in approx 0.15 seconds using a Core i7 laptop, Acer Aspire 5 A515-56

Development Status

This is very much experimental and was developed by the author as a learning tool for how assemblers, lexical analysers and parsers work in general. Please don't use this for anything serious that you would object to losing. Whilst it has been extensively tested and comes with working examples, there is no guarantee that it will work correctly with all input files.

Development Environment

To modify and compile this software, you will need Lazarus 2.1.0 or later. It has been tested on Windows and Linux. As it is only a simple text and file based application, it should be relatively easy to recompile on other hosts which are supported by the Lazarus ecosystem in 32 and 64 bit flavours, including:

  • Android
  • FreeBSD
  • iOS
  • Linux
  • macOS
  • Raspberry Pi
  • WinCE
  • Windows

Tip: For most people, it won't be necessary to alter or recompile the software. Just use the pre-compiled binaries available from this repository if they are sufficient for your needs. All you will need is xa80 or xa80.exe depending on your operating system. There are packages available containing the binary and manuals.

Dependencies

Modifying the grammar for the opcode compiler, or for xa80 itself, requires a tool called LaCoGen (Lazarus Compiler Generator). LaCoGen is available from this GitHub. The grammar to deal with operands is contained in the .lac file and for the most part can be left alone. You only need to change the grammar file and recompile with LaCoGen if you want to add new functions or operators; 99.9% of people won't need to look at this.

Documentation

The docs/ folder contains a user guide explaining how the assembler is used, and also the opcode compiler if you want to get into the detail.

Folder Structure

Folders are organised as follows:

  • root - The Lazarus project files, licence and .gitignore
    • binaries/ - Precompiled binaries for various systems
    • docs/ - Documentation (user manual, technical notes)
    • lac/ - The LaCoGen operand grammar for xa80. The xa80oper.lac file is compiled into xa80oper.lacobj which is loaded into the assembler as a resource file. If you don't need to change the basic grammar for operands, then this can be left alone
    • lexer_parser/ - A lightweight lexical analyser which is used to split or pre-parse the input into labels, commands, instructions, operands and comments
    • opcodes/ - The folder containing the opcode compiler oc_comp (see readme.txt in the folder)
      • opcodes/lac/ - Grammar for the opcode compiler, opcode_compiler.lac compiles into opcode_compiler.lacobj
      • opcodes/source/ - The source files describing the different combinations of instructions and operands
    • test_files/ - A set of test files to check that things work, and also includes some deliberate fails to check the assembler response to warning and error conditions
    • units/ - The bulk of the source code resides in here

Known Issues

  • No major issues identified at this time

Development roadmap

  • V0.3.2 - Current Stable Release
  • V1.0 - Current Development: Introduce segmented architecture, object files, debug info generation
  • V2.0 - FUTURE: Introduce 24 bit capability

Author

Duncan Munro

xa80's Issues

Implement CODE / DATA / UDATA

Implement the CODE / DATA / UDATA segmentation. This will be of most use once the object format has been defined.

Add a PROCESSOR() function

Add a PROCESSOR() function so you can do things like the following:

    mult_available = (PROCESSOR() == "Z180")
    :
    :
    IF mult_available ...

.d80 and .o80 filetypes

Change the filetypes for debug and object files to .dbg80 and .obj80 respectively.

These are not currently implemented; tickets #4 and #5 are open to implement these features.

Macro expansion of mnemonics

It should be possible to call a macro with a mnemonic as one of the parameters and have it expanded when the macro is invoked. For example:

TESTINST	MACRO		inst
TEST{inst}	PUSH		BC		// Copy BC register
		POP		AF		// into AF
		{inst}				// Perform shift / rotate
		PUSH		AF		// Save flags straight away
		CP		A,D		// Check if A the same
		JP		NZ, FAIL	// No, mismatch, flag an error
		POP		BC		// Get flags back into C
		LD		A,$13		// Mask for Half carry, subtract, and carry flags
		AND		A,C		// Leave just the ones we want
		CP		A,E		// Expected?
		JP		NZ, FAIL	// No, flag as an error
		RET				// Return; all was good
		ENDM

		TESTINST	RLA

However, invoking the macro above just produces the following code which is not what's required:

2012: 17                   53 |             TESTINST    RLA

Implement CPU directive

Some old source code contains the CPU directive, for example:

        TITLE 'Sample program'

        CPU   8080
        ORG   0200H

Allow this command, but give a warning and direct the user to the command line --version switch.

Remove WORD directive

Remove the WORD directive as there are source code examples which use it as a label.

IFDEF not acting correctly on --define command line

It's possible to set a command line define, e.g.

XA80 myfile.asm --define=ORIGIN=0xf800 --processor=8080 --com --listing

When executing the following code, the IFDEF directive fails:

    IFDEF ORIGIN
    :

Need to investigate why it's not being picked up correctly.

Alternative filetypes for command line parameters

When assembling, it's possible to get filetypes other than the default by specifying the full filename on the command line:

xa80 cpm22.asm --com=cpm22.bin

However, when assembling multiple files, it's only possible to use the wildcard to get the default extension:

xa80 *.asm --com

The above produces a .com file for each assembly rather than .bin. It would be useful to have something in one of the following two formats so the assembler can figure out that a wildcard is good, but a different extension is required:

xa80 *.asm --com=.bin --errorlog=.prn
xa80 *.asm --com=*.bin --errorlog=*.prn

Not high priority, can wait until V0.3
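
One possible reading of the two proposed forms is that both .bin and *.bin mean "keep the source file's base name and swap the extension". The Python sketch below illustrates that reading only. resolve_output and its rules are hypothetical and are not part of xa80.

```python
from pathlib import Path


def resolve_output(source: str, option_value: str, default_ext: str) -> str:
    """Work out an output filename from a --com / --errorlog style value.

    Assumed rules (hypothetical, for illustration):
      ""       -> source name with the default extension
      ".bin"   -> source name with the given extension
      "*.bin"  -> same as ".bin"
      other    -> treated as a complete filename and used as-is
    """
    src = Path(source)
    if option_value == "":
        return str(src.with_suffix(default_ext))
    if option_value.startswith("*.") and len(option_value) > 2:
        return str(src.with_suffix(option_value[1:]))
    is_bare_ext = (
        option_value.startswith(".")
        and len(option_value) > 1
        and "/" not in option_value
        and "\\" not in option_value
    )
    if is_bare_ext:
        return str(src.with_suffix(option_value))
    return option_value


for name in ["bdos22.asm", "ccp22.asm"]:
    print(name, "->", resolve_output(name, ".bin", ".com"))
```

With a rule like this, each file matched by the wildcard gets its own output name, so the wildcard form and the explicit single-file form can coexist.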

Implement TITLE directive

Need a TITLE directive, for example:

        TITLE 'Disk formatter'

        ORG 200H
START:  LD A,'0'

Results of this should show up in the .MAP and .LST files.

Remove indirection for 8080/8085

By default, XA80 takes enclosing brackets and turns them internally into square brackets to represent indirection. For example:

    LD  A,(base+(offset*7))    // Source line where base=200, offset=5
    LD  A,[235]                // Gets turned into this

However, some legacy 8080 code uses this to signify a calculation (the 8080/8085 don't use brackets of any sort for indirection). For example:

    lxi h,(nxtrec-reccnt)	;HL=.fcb(nxtrec)

The above shouldn't automatically get turned into indirection.

Erroneous W1003 Symbol redefined message

When assembling cpm22.asm, there is a warning about a label being redefined; however, there is only one instance of the label being defined:

[  0.199] Warning W1003 Symbol CCPSTACK has been redefined
          at line 1216 col 17 in C:\Users\Duncan Munro\Dropbox\dev\lazarus\computing\z80\box80\g_searle\source\cpm22.asm
          CCPSTACK .EQU   $   ;end of ccp stack area.

Investigate and resolve

Add other processors

Currently only targeted at Z80, needs to include 8080, 8085 and Z180 as a minimum.

Add DC/DEFC command

Add the DC / DEFC command, this is for string based input which leaves the high bit of the last character set. So, for example:

KEYWORDS   DEFC   "FOR", "GOSUB", "GOTO", "NEXT", "DIM", ....

The "FOR", instead of coding to 46 4F 52 would code to 46 4F D2.
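
The encoding described above can be sketched in a few lines of Python. This is only an illustration of the byte pattern, not the assembler's implementation, and encode_defc is a made-up name.

```python
def encode_defc(*strings: str) -> bytes:
    """Encode each string as ASCII with bit 7 set on its final character."""
    out = bytearray()
    for text in strings:
        data = bytearray(text.encode("ascii"))
        if data:
            data[-1] |= 0x80  # mark the end of this keyword
        out += data
    return bytes(out)


print(encode_defc("FOR").hex(" ").upper())  # 46 4F D2
```

A routine scanning the table can then find the end of each keyword by testing bit 7, with no separate terminator or length byte needed.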

Round brackets for indirection operands

Currently, XA80 uses square brackets for indirection operands, for example:

LD A, [HL]
LD [0x200],A
LD A,[IX+3]

For increased source code compatibility, it would be preferred to have the ability to use round brackets also. In the current implementation this is not possible, as the grammar rules show that an expression can reduce to an operand, and "(" expression ")" can reduce to expression.

So the parser doesn't know which one of the following is an expression, and which is an indirection:

LD A, (2+3)
LD A, (5)

Hence the use of square brackets currently... There is a way to work it out with preparsing to turn (HL) into [HL] etc. This involves bracket counting to check that a single matched pair of outer brackets encloses the whole operand. Examples:

(HL)   Indirection
(3+7*1)   Indirection
(3+7)*(1+5)   Not indirection
5*(1+3)   Not indirection
(3 + ASC(")*(") + 9)   Indirection
(3*(7+5)+8)   Indirection

As can be seen, quoted strings including their escape characters need to be taken into account.
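
A minimal Python sketch of the bracket counting idea follows. It is an illustration rather than the preparser itself: is_indirection is a hypothetical name, and the handling of quotes (single or double, with backslash as the escape character) is an assumption about the string syntax.

```python
def is_indirection(operand: str) -> bool:
    """True if one matched pair of outer brackets encloses the whole operand."""
    text = operand.strip()
    if len(text) < 2 or text[0] != "(" or text[-1] != ")":
        return False
    depth = 0
    quote = None  # the quote character we are inside, if any
    i = 0
    while i < len(text):
        ch = text[i]
        if quote is not None:
            if ch == "\\":      # assumed escape character: skip the next char
                i += 2
                continue
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            # The first bracket closed before the end: not an outer pair
            if depth == 0 and i != len(text) - 1:
                return False
        i += 1
    return depth == 0 and quote is None


for example in ["(HL)", "(3+7)*(1+5)", '(3 + ASC(")*(") + 9)']:
    print(example, is_indirection(example))
```

The key test is whether the bracket opened at the first character is the one closed at the last character. Brackets inside quoted strings are skipped so they do not disturb the count.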

Single quoted strings not stripping correctly

For example:

msg:   DB   'Hello there',13,10,0

The example above is encoding the 0x27 single quote characters when it should be stripping them. Fix the problem, and update the test suite to ensure it has some single quoted items in it.

Make SET directive available if the opcode table doesn't use it

Some 8080 code uses the SET directive, which is equivalent to the = directive in XA80:

BUFSIZE SET 128 ; Old school 8080 assembler
BUFSIZE = 128   ; New XA80 style

However, the Z80 uses SET as an opcode:

    SET 4,(IX+offset)   // Set bit 4 of the status register

Currently, this is resolved by defining the SET directive as allowed on 8080/8085 as there is no SET opcode on these processors. Make this automatic by scanning the opcode table for a SET opcode, and if not present, enable the SET directive.
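
The scan could be as simple as the following Python sketch. The mnemonic lists are tiny illustrative subsets, not the real opcode tables, and set_directive_enabled is a hypothetical name.

```python
def set_directive_enabled(opcode_mnemonics) -> bool:
    """SET may act as a directive only if no opcode claims the name."""
    return "SET" not in {m.upper() for m in opcode_mnemonics}


# Illustrative subsets only
I8080_SUBSET = ["MOV", "MVI", "LXI", "ADD"]
Z80_SUBSET = ["LD", "ADD", "SET", "RES", "BIT"]

print(set_directive_enabled(I8080_SUBSET))  # True
print(set_directive_enabled(Z80_SUBSET))    # False
```

Doing the check when the opcode table is loaded means an external opcode file for a new processor gets the right behaviour without a code change.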

Opcode compiler show version number

Show the version number when the opcode compiler runs. This should just be a format version number for the binary file, for example:

2

This would represent the newer 8 character mnemonic format used in V0.3+ of the assembler.

Add boolean constants

Predefine a few boolean constants in the grammar:

  • On (1)
  • Off (0)
  • True (1)
  • False (0)

Reduce keywords in grammar file depending on processor

Currently, the grammar contains all mnemonics for 8080 / 8085 / Z80 / Z180. This causes a problem with the following 8080 source code:

INC: DB 27

Because INC is a Z80 mnemonic (and therefore defined as a keyword in the grammar), the parser gets confused. The grammar needs to be replicated four times, once per processor, with each copy containing only the opcodes for that processor.

Allow keywords to be used as labels

Due to the variety of different assemblers and source code, it's possible to find keywords being used as labels, for example:

WORD    EQU     2    ; Number of bytes in a word

However, in some versions of V0.2, WORD was used as a synonym for DW (Define Word).

Due to the number of synonyms and the introduction of . commands (e.g. .WORD) it should be possible to allow some of the directives and/or opcodes to be used as labels subject to the following rules:

  1. If a directive is used as normal, it must be flagged as such and cannot later be used as a label
  2. If a directive is used as a label, it must be flagged so it cannot later be used as a directive
  3. A warning should be given when a directive is used as a label

These rules will prevent a situation where a directive is used normally, then as a label. This is fine for pass 1, but when pass 2 is executed the directive is now a label and everything will fall apart.
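
As a rough illustration, the three rules could be enforced by recording the first role each keyword is seen in. The Python sketch below is hypothetical: the class name, the error type and the warning text are all made up for the example.

```python
class KeywordUsage:
    """Track whether each keyword has been used as a directive or a label."""

    def __init__(self, keywords):
        self.keywords = {k.upper() for k in keywords}
        self.role = {}      # keyword -> "directive" or "label"
        self.warnings = []

    def use_as_directive(self, name: str) -> None:
        key = name.upper()
        if self.role.get(key) == "label":
            raise ValueError(f"{key} was already used as a label")   # rule 2
        self.role[key] = "directive"

    def use_as_label(self, name: str) -> None:
        key = name.upper()
        if key not in self.keywords:
            return  # an ordinary label, nothing to track
        if self.role.get(key) == "directive":
            raise ValueError(f"{key} was already used as a directive")  # rule 1
        self.warnings.append(f"directive {key} used as a label")       # rule 3
        self.role[key] = "label"


usage = KeywordUsage(["WORD", "DW"])
usage.use_as_label("WORD")      # WORD EQU 2
print(usage.warnings)
```

Because the role is fixed on first use, pass 1 and pass 2 see the same interpretation of each keyword.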

Add IFDEF / IFNDEF

Add the IFDEF and IFNDEF commands as an alternative to IF DEFINED(x) and IF !DEFINED(x)

Character values as operands

Need to be able to have character values converted to numeric operands, for example:

LD A, 'Z'+1

Currently 'Z' will be treated as a string and will cause an error.

Allow (IX) instead of (IX+0)

The only (IX) or (IY) command without a displacement is JP (IX)/(IY). Amend the opcode table to permit (IX) but code as (IX+0).

Check for non-existent parameter expansions

Consider the following code

MyMac   MACRO   source, dest
        LD      HL,{ssource}
        LD      DE,{dest}
        CALL    CPYMEM
        ENDM
        :
        :
        MyMac OUTPUT, MESSAGE

Note the spelling mistake in LD HL,{ssource}. This is not picked up by the assembler and needs to be. It should check on macro substitutions for any { or } characters left over - this means they weren't expanded correctly.
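
The leftover check could look something like this Python sketch. It is not xa80's macro processor, and expand_macro_line is a hypothetical helper. It also ignores braces that legitimately appear inside quoted strings, which a real check would have to allow for.

```python
import re


def expand_macro_line(line: str, args: dict) -> str:
    """Substitute {name} placeholders, then reject any brace left behind."""
    expanded = line
    for name, value in args.items():
        expanded = expanded.replace("{" + name + "}", value)
    if re.search(r"[{}]", expanded):
        raise ValueError(f"unexpanded macro parameter in: {expanded.strip()}")
    return expanded


args = {"source": "OUTPUT", "dest": "MESSAGE"}
print(expand_macro_line("        LD      DE,{dest}", args))
try:
    expand_macro_line("        LD      HL,{ssource}", args)
except ValueError as err:
    print("error:", err)
```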

Negative operand being rejected

The following code from basic.asm is rejected:

[  0.051] ERROR E2023 Byte must be in range -127..255
          at line 1443 col 19 in C:\Users\Duncan Munro\Dropbox\dev\lazarus\computing\z80\box80\g_searle\source\basic.asm
                  LD      D,-1            ; Flag "GOSUB" search

The error reports that the byte value is not in the range -127..255; however, -1 is within that range, so it is unclear why the error is raised.

Add external .opcode.bin provision

Current status is that the assembler covers 8080/8085/Z80/Z180, these processors are baked in. Users wanting to experiment with other processors, undocumented instructions, etc., may not want or have the capability to recompile all the software.

Need to add an external .opcode.bin provision so if the user executes something like the following:

xa80 myfile.asm --com --processor=HD64180

As the processor is not baked in, XA80 should search for an external file either in the directory where xa80 was executed from, or failing that the current working directory. The file searched for should be HD64180.opcode.bin

It should also be possible to add a path, in which case it will just check this path and nowhere else, something like this should work:

xa80 myfile.asm --com --processor=/home/duncan/xa80/extras/HD64180

Also, this should be covered off by the XA80 environment variable if required.
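
The search order described above might be sketched as follows in Python. find_opcode_file is a hypothetical name, and the executable directory and working directory are passed in explicitly to keep the example self-contained.

```python
from pathlib import Path


def find_opcode_file(processor: str, exe_dir: Path, cwd: Path):
    """Return the Path of <processor>.opcode.bin, or None if not found."""
    target = Path(processor + ".opcode.bin")
    if len(target.parts) > 1:
        # An explicit path was given: check there and nowhere else
        return target if target.is_file() else None
    # Bare processor name: executable's directory first, then the cwd
    for folder in (exe_dir, cwd):
        candidate = folder / target.name
        if candidate.is_file():
            return candidate
    return None


print(find_opcode_file("HD64180", Path("/opt/xa80"), Path.cwd()))
```

Returning None rather than raising leaves it to the caller to decide how to report an unknown processor.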

Add versioning to the .opcode.bin files

Opcode binary files are evolving. For example, the maximum length of a mnemonic has now gone from 5 characters to 8 characters, making older files incompatible.

Need to put a major/minor version marker in the files and have the software check it to ensure the .opcode.bin file is from the current or newer version of the software.
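
A version check along these lines could be sketched in Python as below. The header layout (a 4-byte signature followed by one byte each for major and minor), the signature value and the version numbers are all assumptions made for the example. The actual .opcode.bin layout is not described here.

```python
import struct

MAGIC = b"XA80"                    # hypothetical file signature
HEADER = struct.Struct("<4sBB")    # signature, major, minor (assumed layout)
CURRENT_VERSION = (2, 0)           # hypothetical format version of the software


def make_header(major: int, minor: int) -> bytes:
    """Build the header that an opcode compiler would write."""
    return HEADER.pack(MAGIC, major, minor)


def check_header(data: bytes) -> tuple:
    """Return (major, minor), or raise ValueError if the file is unusable."""
    if len(data) < HEADER.size:
        raise ValueError("file too short to contain a header")
    magic, major, minor = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise ValueError("not an opcode file")
    if (major, minor) < CURRENT_VERSION:
        raise ValueError(f"opcode file version {major}.{minor} is too old")
    return major, minor


print(check_header(make_header(2, 0) + b"...table data..."))  # (2, 0)
```

Comparing (major, minor) tuples keeps the "current or newer" rule to a single line.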

Optional colons on labels

Currently, colons are mandatory to separate labels and instructions, for example:

label: LD A,[HL]

Ensure that the new preprocessor does away with this to make the colons optional, for example it should be able to figure out the following:

label: db 'X','Y','Z'
xor a,a
ld b,3
ld hl,buffer
loop ld [hl],a
inc hl
djnz loop
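
One way to figure this out is to treat the first word on a line as a label unless it is a known mnemonic or directive. The Python sketch below shows the idea only. split_label and the keyword set are hypothetical, and comment handling is left out.

```python
def split_label(line: str, keywords: set):
    """Return (label, rest); label is None when the line has no label."""
    text = line.strip()
    if not text:
        return None, ""
    parts = text.split(None, 1)
    first = parts[0]
    if ":" in first:
        # An explicit colon always ends the label, with or without a space
        label, rest = text.split(":", 1)
        return label, rest.strip()
    if first.upper() in keywords:
        return None, text
    rest = parts[1] if len(parts) > 1 else ""
    return first, rest.strip()


KEYWORDS = {"DB", "XOR", "LD", "INC", "DJNZ"}  # illustrative subset only

for src in ["label: db 'X','Y','Z'", "xor a,a", "loop ld [hl],a"]:
    print(split_label(src, KEYWORDS))
```

This approach depends on labels not clashing with keywords, so it interacts with the separate question of keywords being used as labels.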

Clean up code around label requirements

There's a new command flag cfLabel which is currently unimplemented and can be applied to directives to ensure a label is present; but this is too black and white - it will either be label, or no label, nothing in between. There needs to be something with three options: Yes label, no label, or don't care, because...

Label      Directive
Mandatory  EQU/=, MACRO
Optional   DB/DEFB, DS/DEFS, DW/DEFW
Never      CODE, CPU, DATA, ELSE, END, ENDIF, ENDM, ENDR, EXTERN, GLOBAL, IF, LISTOFF, LISTON, MSGERROR, MSGINFO, MSGWARNING, ORG, REPEAT, TITLE, UDATA, WARNOFF, WARNON, WHILE, ENDW

The purpose of this is to simplify the code: every single CmdXxx routine currently does these checks itself, whereas putting them in the command dispatcher means the code only needs to be written and debugged once.
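
A three-way rule checked once in the dispatcher might look like the Python sketch below. The names are hypothetical, only a few directives are listed, and treating unlisted directives as "optional" is an assumption.

```python
from enum import Enum


class LabelRule(Enum):
    MANDATORY = "mandatory"
    OPTIONAL = "optional"
    NEVER = "never"


# A few entries for illustration
LABEL_RULES = {
    "EQU": LabelRule.MANDATORY,
    "MACRO": LabelRule.MANDATORY,
    "DB": LabelRule.OPTIONAL,
    "DW": LabelRule.OPTIONAL,
    "ORG": LabelRule.NEVER,
    "TITLE": LabelRule.NEVER,
}


def check_label(directive: str, label):
    """Return an error message, or None if the label usage is acceptable."""
    rule = LABEL_RULES.get(directive.upper(), LabelRule.OPTIONAL)
    if rule is LabelRule.MANDATORY and not label:
        return f"{directive} requires a label"
    if rule is LabelRule.NEVER and label:
        return f"{directive} must not have a label"
    return None


print(check_label("EQU", None))      # EQU requires a label
print(check_label("ORG", "START"))   # ORG must not have a label
print(check_label("DB", None))       # None
```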

Using # for command prefix

Currently as of V0.3.0 there are basic directives, and another set of the same prefixed with the period. So we have:

    INCLUDE  "myfile.inc"
    .INCLUDE  "myfile.inc"

Both do the same thing. Investigate if a # sign can be used for a command prefix also.

24 bit code

Add 24 bit coding capabilities to suit the EZ80 processor.

DW ORG no longer works

There is fresh code to check that there are not multiple directives on the same line, for example:

    DW  $     // Allowed
    DW  ORG   // Not allowed
    DW  DW    // Not allowed

Either take ORG out, or make it ORG(), not really important as most people would use $ anyway.

Allow label with no space after colon

Colons are optional, but some code does not put a space between the label's colon and the mnemonic or directive, e.g.

MYLABEL:XOR A,A

This should be accepted as the : is used as the delimiter in this case.

Object file format

Whereas .com and .hex files are currently implemented, object files are not.

Indirection feature - make automatic

Indirection is where the assembler "sees" something which should be the contents of an address pointed to by the operand. Internally, it turns valid outer parentheses into square brackets:

LD A,(HL) -> LD A,[HL]
LD A,(base+offset) -> LD A,[base+offset]
LD A,(base+1)*(offset+3) -> LD A,(base+1)*(offset+3) ; Not changed as not an indirection!

Currently, to support legacy code, this feature is switched OFF for 8080/8085 and ON for other processors.

The opcode table defines a list of the operands used and as 8080/8085 don't use (HL), (NNNN) etc., it should be possible to know at the time the opcode table is loaded if indirection should be on or off.

Add the appropriate code to make this happen and remove the manual kludges from the code put in by ticket #34.
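
As a rough sketch of the idea, the decision could be made by looking for any bracketed operand pattern while the table loads. How operands are actually stored in the opcode table is not shown here, so the plain list of pattern strings below is an assumption, as is the function name.

```python
def indirection_enabled(operand_patterns) -> bool:
    """True if any operand pattern is wrapped in brackets, e.g. (HL)."""
    return any(
        p.startswith(("(", "[")) and p.endswith((")", "]"))
        for p in operand_patterns
    )


# Illustrative operand lists only
Z80_OPERANDS = ["A", "B", "HL", "(HL)", "(NNNN)", "(IX+D)"]
I8080_OPERANDS = ["A", "B", "M", "NNNN"]

print(indirection_enabled(Z80_OPERANDS))    # True
print(indirection_enabled(I8080_OPERANDS))  # False
```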

Add SET as synonym for =

Add the SET directive when using 8080/8085 processors as legacy code uses this to set symbol table values. It can only be used for 8080/8085 as the Zilog instruction set uses SET as a mnemonic for setting bit values.

Add dot commands

Some source code uses .BYTE, .ORG etc. instead of the undotted variants. Allow both.

Debug file format

Debug file format is not implemented yet. Use .dbg80 as the filetype when it's done.

Need to consider:

  • What information is going to be stored in the debug file
  • What piece of software is going to make use of it
  • Is there something out there already
  • What is the environment? CP/M? or...?
  • Is there even a need for this?

Warn of unresolved labels

Currently there's no warning if a label is never resolved. It should either give an error on pass 2, or be reported at the end of the assembly.
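
The end-of-assembly report could be produced by comparing the symbols referenced against those defined, as in this Python sketch. unresolved_labels is a hypothetical helper, and matching names case-insensitively is an assumption.

```python
def unresolved_labels(defined, referenced):
    """Return referenced symbols that were never defined, in first-seen order."""
    known = {name.upper() for name in defined}
    missing = []
    for name in referenced:
        key = name.upper()
        if key not in known and key not in missing:
            missing.append(key)
    return missing


defined = ["START", "LOOP"]
referenced = ["loop", "FAIL", "fail", "START"]
for name in unresolved_labels(defined, referenced):
    print(f"Label {name} was never resolved")
```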
