Giter Club home page Giter Club logo

dasm's Introduction

DASM

DASM is a multiarchitecture portable assembler. You can port your experimental architectures to it, with a simple description file. Right now the syntax of the description files is kinda ugly but it will improve with time. Currently it has no main architecture backend, but im working on a backend for MIPS1.

This is part of a new toolchain that im creating, with similar ideology. C Compiler with single file backends.

Goals

  • Better description file <- In Development
  • Relocations
  • RISCV Backend <- In Development
  • C Like Structures, with nesting
  • x86 Backend
  • Moxie Backend
  • Support for GNU AS syntax
  • RISCV Backend
  • Preprocessor Macros and Includes <- In Development
  • Object files <- In Development

Syntax

//C like Comments
# Comment
/*
Multiline Comments
*/

//C Like struct
struct identifier
{
	byte 4, apple;
	byte 4, orange;
	byte 8, lime; 
}

//Access Struct
// struct_identifier.member


//Labels identifier:
main:
     mnemonic operand, operand;

Error Messages

Compiling

The assembler needs the following tools: gcc, flex, bison, make

make -B ARCH="Your_Arch_Prefix"

The default architecture is moxie. So to compile the default architecture do: make -B. Then you can test it by doing ./build/bin/dasm test/moxie_test.asm

After compiling the binary file will be located at build/bin/

The Backend

Each architecture backend requires its own foler at arch/arch_name, for example arch/mips1. On the architecture folder, you need to create a file named the same as your architecture, for example mips.id. After creating the instruction description file you can start describing your architecture.

Description File

Note: This description file syntax will be replaced with a much better one, check riscv backend, which contains new unimplemented syntax

Any assembler syntax includes the mnemonic and the operands. As shown in the next table.

Mnemonic OP0 OP1
ld R0 0xff

How we describe such instruction? Well, we first need to let the assembler know our operands. We do so by defining our arguments or operands.

In the following example we define our registers and their address.

//Registers
arg registers
{
	(R0){0x00},
	(R1){0x01}
}

The definition can be broken into parts.

arg This defines an Argument type definition. registers Is just an identifier to identify the set of arguments. { Opens the body of the argument defenition. (R1){0x01} This defines a link. This links R1 to constant 0x01. We can also declare multiple link by separating them with a comma. } At the end we close the definition.

At this point we have just declared what R0 means, but we havent declarent ld(The mnemonic) or 0xff a value. We dont need to declare what constants are because the assembler already provides an argument for it, which is the type numeric. At this point we declared the operands, we now need to declare the order of those operands and the instruction mnemonic; We do so with instruction description templates.

This is an example of a instruction descriprion tenplate (IDT).

def <Instruction Mnemonic> {

    arg_template{
      //Argument Links
    }

    //Constraints on how big can constants be. (Not implemented Yet)
    max {
      //Links to define the constraints
    }
    
    encode {
    "
    //C code for the Instruction Encoding.
    "
    }

    mnemonic {
    "Instruction Mnemonic"
    }
}

Then the ld R0, 0xff IDT would be like this:

def ld {

    arg_template{
      //Argument Links
      (op1){registers}, //Defines OP1 as a Register
      (op2){numeric} //Defines OP2 as a Constant
    }

    //Constraints on how big can constants be. (Not implemented Yet)
    max {
      //Links to define the constraints
    }
    
    encode {
    "
    //C code for the Instruction Encoding.
    "
    }

    mnemonic {
    "ld"
    }
}

This is one of the most important parts of an IDT, the op1 and op2. Dont actually define the order of the operands, they define a way to link multiple argument names into a single operand. The part that defines the order of the operands is actually the order on how they are declared. For example in the next example, the register operand will go first and the numeric operand go second.

    arg_template{
      //Argument Links
      (op1){registers}, //Defines OP1 as a Register
      (op2){numeric} //Defines OP2 as a Constant
    }

Enconding

One of the internal components of an IDT is the encode scope encode {""}, after the assembler matches the IDTs by using the arg_template (Argument Template) declaration on each IDT. It then will execute the C code on the encode scope. The encode scope is a way for the programmer to have more flexibility on how the instruction will be encoded and on how it will be stored.

    encode {
    "
    //C code for the Instruction Encoding.
    "
    }

The backend generator will generate a similar function as the next example.

void addu_encode_function(BIN_BUFFER* bin_buffer, ARG_TABLE* arg_table, int op)

The code inside of the encode scope will be allocated on a function with the same arguments. The bin_buffer is the output buffer for the assembler. Here is were all the assembled instruction will be stored into. The arg_table argument is an struct that contains all the operand values. And the op is like an unique ID for each instruction. Accesing each structure to assemble the instruction can get complicated, so there are macros to help access such structures, defined in arg_access.h

Now we know that some instructions use the same encoding squeme, and to make it easier to do instruction encoding we use MACROS. Defining macros can be done somewhere on the instruction desciprion file. By doing:

macros 
{"
  //These are C macros
    #define I_ADDU	            0b100001
    #define I_ADD	              0b100000
    #define I_ADDIU	            0b001001
"}

By using macros the encode scope can look something like this:

encode 
{
"
  //We refer to this macros as "Arg Access Macros" defined in arg_access.h
  APPEND_BYTE(GET_SIZE_32_SEC_1(ENCODE_R(I_ADDU, GET_OP1_VAL, GET_OP2_VAL, GET_OP3_VAL, 0x00, 0x00)));
  APPEND_BYTE(GET_SIZE_32_SEC_2(ENCODE_R(I_ADDU, GET_OP1_VAL, GET_OP2_VAL, GET_OP3_VAL, 0x00, 0x00)));
  APPEND_BYTE(GET_SIZE_32_SEC_3(ENCODE_R(I_ADDU, GET_OP1_VAL, GET_OP2_VAL, GET_OP3_VAL, 0x00, 0x00)));
  APPEND_BYTE(GET_SIZE_32_SEC_4(ENCODE_R(I_ADDU, GET_OP1_VAL, GET_OP2_VAL, GET_OP3_VAL, 0x00, 0x00)));
"
}

Types of Arg Access Macros

Bin Buffer Access

#define APPEND_BYTE(byte) append_byte(bin_buffer, byte)

Operand Access

//Access The operand
#define GET_OP1 arg_table->data[0]
//Access Value from operand
#define GET_OP1_VAL arg_table->data[0].value
//Acess Name from operand
#define GET_OP1_NAME arg_table->data[0].name
//Access Domain from operand
#define GET_OP1_DOMAIN arg_table->data[0].domain

Argument Domains

#define REGISTER "register"
#define NUMERIC "numeric"
#define ADDRESS "address"

Misc

To enable debuging the macro _DEBUG_ has to be defined in the file debug.h

dasm's People

Contributors

bitglitcher avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.