duncanamps / xa80
XA80 Multi-Platform Cross Assembler for x80 processors (8080, 8085, Z80, Z180)
Home Page: https://github.com/duncanamps/xa80
License: GNU General Public License v3.0
Need a TITLE directive, for example:
TITLE 'Disk formatter'
ORG 200H
START: LD A,'0'
Results of this should show up in the .MAP and .LST files.
Due to the variety of different assemblers and source code, it's possible to find keywords being used as labels, for example:
WORD EQU 2 ; Number of bytes in a word
However, in some versions of V0.2, WORD was used as a synonym for DW (Define Word).
Due to the number of synonyms and the introduction of . commands (e.g. .WORD) it should be possible to allow some of the directives and/or opcodes to be used as labels subject to the following rules:
These rules will prevent a situation where a directive is used normally, then as a label. This is fine for pass 1, but when pass 2 is executed the directive is now a label and everything will fall apart.
Some old source code contains the CPU directive, for example:
TITLE 'Sample program'
CPU 8080
ORG 0200H
Allow this command, but give a warning and direct the user to the command line --version switch.
Add the IFDEF and IFNDEF commands as an alternative to IF DEFINED(x) and IF !DEFINED(x)
The colon on the end of a symbol is sometimes not being stripped. Ensure there's no trace of it when the label is added to the symbol table.
It's possible to set a command line define, e.g.
XA80 myfile.asm --define=ORIGIN=0xf800 --processor=8080 --com --listing
When executing the following code, the IFDEF directive fails:
IFDEF ORIGIN
:
Need to investigate why it's not being picked up correctly.
Remove the WORD directive as there are source code examples which use it as a label.
Show the version number when the opcode compiler runs. This should just be a format version number for the binary file, for example:
2
This would represent the newer 8 character mnemonic format used in V0.3+ of the assembler.
Currently as of V0.3.0 there are basic directives, and another set of the same prefixed with the period. So we have:
INCLUDE "myfile.inc"
.INCLUDE "myfile.inc"
Both do the same thing. Investigate if a # sign can be used for a command prefix also.
Currently only targeted at Z80, needs to include 8080, 8085 and Z180 as a minimum.
Some 8080 code uses the SET directive which is an equivalent to the = directive in XA80:
BUFSIZE SET 128 ; Old school 8080 assembler
BUFSIZE = 128 ; New XA80 style
However, the Z80 uses SET as an opcode:
SET 4,(IX+offset) // Set bit 4 of the status register
Currently, this is resolved by defining the SET directive as allowed on 8080/8085 as there is no SET opcode on these processors. Make this automatic by scanning the opcode table for a SET opcode, and if not present, enable the SET directive.
Consider the following code
MyMac MACRO source, dest
LD HL,{ssource}
LD DE,{dest}
CALL CPYMEM
ENDM
:
:
MyMac OUTPUT, MESSAGE
Note the spelling mistake in LD HL,{ssource}. This is not picked up by the assembler and needs to be. It should check on macro substitutions for any { or } characters left over - this means they weren't expanded correctly.
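The check described above could be sketched roughly as follows. This is Python rather than the assembler's own source, and the function names `expand_macro_line` and `leftover_placeholders` are made up for illustration. It assumes parameters are substituted by plain text replacement and, as a simplification, does not skip braces that appear inside quoted strings or comments.

```python
import re

def expand_macro_line(line, params):
    """Replace each {name} placeholder with its argument (hypothetical helper)."""
    for name, value in params.items():
        line = line.replace("{" + name + "}", value)
    return line

def leftover_placeholders(line):
    """Return any {…} groups or stray braces still present after expansion."""
    return re.findall(r"\{[^{}]*\}|[{}]", line)

params = {"source": "OUTPUT", "dest": "MESSAGE"}
for text in ("LD HL,{ssource}", "LD DE,{dest}"):
    expanded = expand_macro_line(text, params)
    problems = leftover_placeholders(expanded)
    if problems:
        print("Unexpanded macro parameter(s):", ", ".join(problems))
```

A non-empty result from `leftover_placeholders` is the condition that should raise an error.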
Add a PROCESSOR() function so you can do things like the following:
mult_available = (PROCESSOR() == "Z180")
:
:
IF mult_available ...
Debug file format is not implemented yet, use .dbg80 as the filetype when it's done.
Need to consider:
By default, XA80 takes enclosing brackets and turns them internally into square brackets to represent indirection. For example:
LD A,(base+(offset*7)) // Source line where base=200, offset=5
LD A,[235] // Gets turned into this
However, some legacy 8080 code uses this to signify a calculation (the 8080/8085 don't use brackets of any sort for indirection). For example:
lxi h,(nxtrec-reccnt) ;HL=.fcb(nxtrec)
The above shouldn't automatically get turned into indirection.
As per the heading, default should be forced to upper case but allow a command line option to have mixed case.
There are a number of new aliases for where directives can be preceded with a period symbol.
Need to update the user manual text and index to reflect this.
Make it so the END directive flags the end of the assembly and any further [code generating] input is treated as an error.
Including file a.inc that has IF statements to see if it's been included or not.
Including a second time, the macro parameter definitions error out even though the IF statement is "off" and not producing code.
When assembling cpm22.asm, there is a warning about a label being redefined, however there is only one instance of the label being defined:
[ 0.199] Warning W1003 Symbol CCPSTACK has been redefined
at line 1216 col 17 in C:\Users\Duncan Munro\Dropbox\dev\lazarus\computing\z80\box80\g_searle\source\cpm22.asm
CCPSTACK .EQU $ ;end of ccp stack area.
Investigate and resolve
There's a new command flag cfLabel which is currently unimplemented and can be applied to directives to ensure a label is present; but this is too black and white - it will either be label, or no label, nothing in between. There needs to be something with three options: Yes label, no label, or don't care, because...
Label | Directive |
---|---|
Mandatory | EQU/=, MACRO |
Optional | DB/DEFB, DS/DEFS, DW/DEFW |
Never | CODE, CPU, DATA, ELSE, END, ENDIF, ENDM, ENDR, EXTERN, GLOBAL, IF, LISTOFF, LISTON, MSGERROR, MSGINFO, MSGWARNING, ORG, REPEAT, TITLE, UDATA, WARNOFF, WARNON, WHILE, ENDW |
The purpose of this is to simplify the code: every single CmdXxx routine currently contains code to do these checks, and moving it into the command dispatcher means it only needs to be written and debugged once.
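A rough sketch of the three-way check done once in the dispatcher, written in Python for illustration only. The names `LabelRule`, `LABEL_RULES` and `check_label` are hypothetical, and treating unlisted directives as "optional" is an assumption, not something decided in this ticket.

```python
from enum import Enum
from typing import Optional

class LabelRule(Enum):
    MANDATORY = 1   # a label must be present
    OPTIONAL = 2    # don't care
    NEVER = 3       # a label must not be present

# Build the lookup from the table in this ticket.
LABEL_RULES = {}
for names, rule in (
    ("EQU = MACRO", LabelRule.MANDATORY),
    ("DB DEFB DS DEFS DW DEFW", LabelRule.OPTIONAL),
    ("CODE CPU DATA ELSE END ENDIF ENDM ENDR EXTERN GLOBAL IF LISTOFF "
     "LISTON MSGERROR MSGINFO MSGWARNING ORG REPEAT TITLE UDATA WARNOFF "
     "WARNON WHILE ENDW", LabelRule.NEVER),
):
    for name in names.split():
        LABEL_RULES[name] = rule

def check_label(directive: str, label: Optional[str]) -> Optional[str]:
    """Return an error message, or None if the label usage is acceptable."""
    # Assumption: directives missing from the table are treated as OPTIONAL.
    rule = LABEL_RULES.get(directive.upper(), LabelRule.OPTIONAL)
    if rule is LabelRule.MANDATORY and not label:
        return f"{directive} requires a label"
    if rule is LabelRule.NEVER and label:
        return f"{directive} cannot have a label"
    return None
```

The dispatcher would call `check_label` before handing over to the individual CmdXxx routine, so the routines themselves no longer need to test for a label.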
The only (IX) or (IY) command without a displacement is JP (IX)/(IY). Amend the opcode table to permit (IX) but code as (IX+0).
Add the SET directive when using 8080/8085 processors as legacy code uses this to set symbol table values. It can only be used for 8080/8085 as the Zilog instruction set uses SET as a mnemonic for setting bit values.
Following code from basic.asm:
[ 0.051] ERROR E2023 Byte must be in range -127..255
at line 1443 col 19 in C:\Users\Duncan Munro\Dropbox\dev\lazarus\computing\z80\box80\g_searle\source\basic.asm
LD D,-1 ; Flag "GOSUB" search
The error reports that the byte value is not in the range -127..255; however, -1 is within that range, so it's not clear why the error is raised.
Currently, the grammar contains all mnemonics for 8080 / 8085 / Z80 / Z180. This causes a problem with the following 8080 source code:
INC: DB 27
Because INC is a Z80 mnemonic (and therefore defined as a keyword in the grammar), the parser gets confused. The grammar needs to be replicated x4 for the different processors and only contain the opcodes for those processors.
It should be possible to have the following code:
label EQU 'Duncan'
and refer to it later as:
table: DB 27,label,'$'
This should expand to:
table: DB 27,'Duncan','$'
Currently it doesn't...
Create an opcode file z80x.opcode to host the undocumented codes in addition to the normal Z80 codes. Can be used for testing #22.
When a string value has been defined, the map file currently doesn't show the value.
Implement the CODE / DATA / UDATA segmentation, this will be of most use once the object format has been defined.
Implement EXTERN and GLOBAL commands to support linked object files and libraries.
Currently, colons are mandatory to separate labels and instructions, for example:
label: LD A,[HL]
Ensure that the new preprocessor does away with this to make the colons optional, for example it should be able to figure out the following:
label: db 'X','Y','Z'
xor a,a
ld b,3
ld hl,buffer
loop ld [hl],a
inc hl
djnz loop
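One way the preprocessor might separate labels from instructions, sketched in Python. The helper `split_label` and the `keywords` set are hypothetical. The sketch assumes the first token on a line is a label if it contains a colon or if it is not a known mnemonic or directive; macro names would have to be in `keywords` too, otherwise a macro call would be mistaken for a label.

```python
def split_label(line, keywords):
    """Split a source line into (label or None, remaining statement)."""
    stripped = line.strip()
    if not stripped:
        return None, ""
    parts = stripped.split(None, 1)
    first = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    if ":" in first:
        # The colon acts as the delimiter, with or without a following space.
        label, _, tail = first.partition(":")
        return label, (tail + " " + rest).strip()
    if first.upper() not in keywords:
        # Not a mnemonic or directive, so treat it as a label without a colon.
        return first, rest
    return None, stripped

keywords = {"DB", "XOR", "LD", "INC", "DJNZ"}   # would come from the opcode table
print(split_label("loop ld [hl],a", keywords))
print(split_label("xor a,a", keywords))
```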
It should be possible to call a macro with a mnemonic as one of the parameters and have it expanded at runtime. For example:
TESTINST MACRO inst
TEST{inst} PUSH BC // Copy BC register
POP AF // into AF
{inst} // Perform shift / rotate
PUSH AF // Save flags straight away
CP A,D // Check if A the same
JP NZ, FAIL // No, mismatch, flag an error
POP BC // Get flags back into C
LD A,$13 // Mask for Half carry, subtract, and carry flags
AND A,C // Leave just the ones we want
CP A,E // Expected?
JP NZ, FAIL // No, flag as an error
RET // Return; all was good
ENDM
TESTINST RLA
However, invoking the macro above just produces the following code which is not what's required:
2012: 17 53 | TESTINST RLA
Current status is that the assembler covers 8080/8085/Z80/Z180, these processors are baked in. Users wanting to experiment with other processors, undocumented instructions, etc., may not want or have the capability to recompile all the software.
Need to add an external .opcode.bin provision so if the user executes something like the following:
xa80 myfile.asm --com --processor=HD64180
As the processor is not baked in, XA80 should search for an external file either in the directory where xa80 was executed from, or failing that the current working directory. The file searched for should be HD64180.opcode.bin
It should also be possible to add a path, in which case it will just check this path and nowhere else, something like this should work:
xa80 myfile.asm --com --processor=/home/duncan/xa80/extras/HD64180
Also, this should be covered off by the XA80 environment variable if required.
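The search order described above might look something like this in outline (Python, for illustration only). The function names are hypothetical, and the XA80 environment variable is deliberately left out because its behaviour isn't specified here.

```python
from pathlib import Path

def opcode_candidates(processor, exe_dir, cwd):
    """List the files to try, in order, for an external processor definition."""
    requested = Path(processor)
    filename = requested.name + ".opcode.bin"
    if requested.parent != Path("."):
        # A path was given: check that location and nowhere else.
        return [requested.parent / filename]
    # No path: try the directory xa80 was run from, then the working directory.
    return [Path(exe_dir) / filename, Path(cwd) / filename]

def find_opcode_file(processor, exe_dir, cwd):
    """Return the first candidate that exists, or None if there is none."""
    for candidate in opcode_candidates(processor, exe_dir, cwd):
        if candidate.is_file():
            return candidate
    return None
```

For `--processor=HD64180` this tries `HD64180.opcode.bin` in the executable's directory and then in the current working directory.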
Need to be able to have character values converted to numeric operands, for example:
LD A, 'Z'+1
Currently 'Z' will be treated as a string and will cause an error.
Whereas .com and .hex files are currently implemented, object files are not.
Add 24 bit coding capabilities to suit the EZ80 processor.
There is fresh code to check that there are not multiple directives on the same line, for example:
DW $ // Allowed
DW ORG // Not allowed
DW DW // Not allowed
Either take ORG out, or make it ORG(), not really important as most people would use $ anyway.
Currently, XA80 uses square brackets for indirection operands, for example:
LD A, [HL]
LD [0x200],A
LD A,[IX+3]
For increased source code compatibility, it would be preferable to be able to use round brackets as well. In the current implementation this is not possible, as the grammar rules show that an expression can reduce to an operand, and "(" expression ")" can reduce to expression.
So the parser doesn't know which one of the following is an expression, and which is an indirection:
LD A, (2+3)
LD A, (5)
Hence the use of square brackets currently... There is a way to work it out with preparsing to turn (HL) into [HL] etc., this involves bracket counting to ensure there is a defined set of outer brackets. Examples:
(HL) Indirection
(3+7*1) Indirection
(3+7)*(1+5) Not indirection
5*(1+3) Not indirection
(3 + ASC(")*(") + 9) Indirection
(3*(7+5)+8) Indirection
As can be seen, quoted strings including their escape characters need to be taken into account.
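The bracket counting could be sketched along these lines (Python, for illustration; `is_indirection` is a hypothetical name). It assumes strings may use single or double quotes and that a backslash escapes the next character inside a string, which may not match the escape syntax XA80 actually uses.

```python
def is_indirection(operand):
    """True if one matching pair of outer brackets encloses the whole operand."""
    text = operand.strip()
    if len(text) < 2 or text[0] != "(" or text[-1] != ")":
        return False
    depth = 0
    quote = None            # the quote character while inside a string
    i = 0
    while i < len(text):
        ch = text[i]
        if quote:
            if ch == "\\":
                i += 1      # skip the escaped character (assumed syntax)
            elif ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0 and i != len(text) - 1:
                return False    # the first bracket closed before the end
        i += 1
    return depth == 0 and quote is None

for example in ("(HL)", "(3+7)*(1+5)", '(3 + ASC(")*(") + 9)'):
    print(example, is_indirection(example))
```

When the function returns True, the preparser would swap the outer `(` and `)` for `[` and `]` before the line reaches the grammar.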
For example:
msg: DB 'Hello there',13,10,0
The example above is encoding the 0x27 single quote characters when it should be stripping them. Fix the problem, and update the test suite to ensure it has some single quoted items in it.
Currently there isn't one...
Predefine a few boolean constants in the grammar:
Opcode binary files are evolving, for example maximum length of a mnemonic has now gone from 5 characters to 8 characters making the files incompatible.
Need to put a major/minor version marker in the files and have the software check it to ensure the .opcode.bin file is from the current or newer version of the software.
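As a sketch of what the check might look like (Python, for illustration), assume a purely hypothetical header of a 4-byte tag followed by one byte each for major and minor version. The real `.opcode.bin` layout is not defined in this ticket, so `MAGIC`, `HEADER` and both function names are made up.

```python
import struct

MAGIC = b"XA80"                      # hypothetical file tag
HEADER = struct.Struct("<4sBB")      # hypothetical layout: tag, major, minor

def read_version(data):
    """Return (major, minor) from the start of an opcode file image."""
    magic, major, minor = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise ValueError("not an opcode file")
    return major, minor

def is_compatible(data, minimum):
    """True if the file's format version is at least the required minimum."""
    return read_version(data) >= minimum

image = HEADER.pack(MAGIC, 2, 0) + b"...opcode data..."
print(is_compatible(image, (2, 0)))
```

Comparing `(major, minor)` tuples keeps the test to a single expression; a file written in the older 5 character mnemonic format would carry a lower version and be rejected.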
Colons are optional, but some code does not put a space between the label and the mnemonic or directive, e.g.
MYLABEL:XOR A,A
This should be accepted as the : is used as the delimiter in this case.
Some source code uses .BYTE, .ORG etc. instead of the undotted variants. Allow both.
When assembling, it's possible to get filetypes other than the default by specifying the full filename on the command line:
xa80 cpm22.asm --com=cpm22.bin
However, when assembling multiple files, it's only possible to use the wildcard to get the default extension:
xa80 *.asm --com
The above produces a .com file for each assembly rather than .bin. It would be useful to have something in one of the following two formats so the assembler can figure out that a wildcard is good, but a different extension is required:
xa80 *.asm --com=.bin --errorlog=.prn
xa80 *.asm --com=*.bin --errorlog=*.prn
Not high priority, can wait until V0.3
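Either proposed format could be resolved per source file with something like the following (Python, for illustration; `output_name` is a hypothetical helper). It assumes a value starting with `.` or `*.` means "same base name, different extension", and that anything else is a full filename as today.

```python
from pathlib import PurePath

def output_name(source, option, default_ext):
    """Work out the output filename for one source file."""
    stem = str(PurePath(source).with_suffix(""))
    if not option:
        return stem + default_ext       # plain --com
    if option.startswith("*."):
        return stem + option[1:]        # --com=*.bin
    if option.startswith("."):
        return stem + option            # --com=.bin
    return option                       # --com=cpm22.bin

for src in ("cpm22.asm", "basic.asm"):
    print(output_name(src, ".bin", ".com"))
```

One consequence of this rule is that an output file literally named `.bin` could no longer be requested, which seems an acceptable trade.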
Add the DC / DEFC command, this is for string based input which leaves the high bit of the last character set. So, for example:
KEYWORDS DEFC "FOR", "GOSUB", "GOTO", "NEXT", "DIM", ....
The "FOR", instead of coding to 46 4F 52 would code to 46 4F D2.
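The encoding can be illustrated with a short Python sketch. The name `encode_defc` is made up, and skipping empty strings is an assumption about a case this ticket doesn't cover.

```python
def encode_defc(*strings):
    """Encode each string as ASCII with bit 7 set on its last character."""
    out = bytearray()
    for text in strings:
        data = text.encode("ascii")
        if not data:
            continue                    # assumption: empty strings emit nothing
        out += data[:-1]
        out.append(data[-1] | 0x80)     # mark the final character
    return bytes(out)

print(encode_defc("FOR").hex(" ").upper())   # 46 4F D2
```

Setting the high bit on the last character lets a keyword table be scanned without separate length bytes or terminators.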
Indirection is where the assembler "sees" something which should be the contents of an address pointed to by the operand, internally it turns valid outer parentheses into square brackets:
LD A,(HL) -> LD A,[HL]
LD A,(base+offset) -> LD A,[base+offset]
LD A,(base+1)*(offset+3) -> LD A,(base+1)*(offset+3) ; Not changed as not an indirection!
Currently, to support legacy code, this feature is switched OFF for 8080/8085 and ON for other processors.
The opcode table defines a list of the operands used and as 8080/8085 don't use (HL), (NNNN) etc., it should be possible to know at the time the opcode table is loaded if indirection should be on or off.
Add the appropriate code to make this happen and remove the manual kludges from the code put in by ticket #34.
Currently there's no warning if a label is never resolved. It should either give an error on pass 2, or be reported at the end of the assembly.
-d/--define doesn't appear to be implemented yet, needs to be fixed.