CCom language

Compiler for the CCom (Conditional Comments) language.

Note: This language is part of the Compose Generator project, but it can also be used independently.

Introduction

CCom is a language for pre-processing source files. Its primary purpose is to evaluate conditional sections in formats like YAML or XML, but it can also be used for a variety of programming languages that support comments.

Documentation

Please visit the documentation at ccom.compose-generator.com.

Supported data formats

Format      Line comment iden  Block comment iden open  Block comment iden close
Assembly    ;                  -                        -
C           //                 /*                       */
C++         //                 /*                       */
Dart        //                 /*                       */
Dockerfile  #                  -                        -
Elixir      #                  """                      """
Go          //                 /*                       */
Groovy      //                 /*                       */
Haskell     --                 -{                       -}
HTML        -                  <!--                     -->
Java        //                 /*                       */
JavaScript  //                 /*                       */
Julia       #                  #=                       =#
Kotlin      //                 /*                       */
Lua         --                 --[[                     ]]
Pascal      -                  (*                       *)
Perl        #                  =item                    =cut
PHP         //                 /*                       */
PowerShell  #                  <#                       #>
Python      #                  """                      """
R           #                  -                        -
Ruby        #                  =begin                   =end
Rust        //                 /*                       */
Spice       //                 /*                       */
SQL         --                 -                        -
Swift       //                 /*                       */
TypeScript  //                 /*                       */
XML         -                  <!--                     -->
YAML        #                  -                        -

Note: Formats without comment support, such as JSON, also work with CCom. However, the input file is then not valid in its own format until it has been pre-processed by CCom.
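The table can be read as a lookup from format name to comment identifiers. Here is a minimal C++ sketch of such a lookup, covering only a few rows. The names `CommentIdens` and `lookupCommentIdens` are assumptions made for illustration and are not taken from the CCom sources.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical container for the three identifiers of one table row.
// An empty string stands for "-" (not available) in the table.
struct CommentIdens {
    std::string line;
    std::string blockOpen;
    std::string blockClose;
};

// Returns the identifiers for a few formats from the table,
// or std::nullopt if the format is not part of this excerpt.
std::optional<CommentIdens> lookupCommentIdens(const std::string& lang) {
    static const std::map<std::string, CommentIdens> table = {
        {"yaml", {"#", "", ""}},
        {"java", {"//", "/*", "*/"}},
        {"html", {"", "<!--", "-->"}},
        {"lua", {"--", "--[[", "]]"}},
    };
    const auto it = table.find(lang);
    if (it == table.end()) return std::nullopt;
    return it->second;
}
```

Returning an empty optional for unknown formats leaves it to the caller to fall back to explicitly passed identifiers.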

Install

Install on Debian / Ubuntu

$ sudo apt-get install ca-certificates
$ curl -fsSL https://server.chillibits.com/files/repo/gpg | sudo apt-key add -
$ sudo add-apt-repository "deb https://repo.chillibits.com/$(lsb_release -is | awk '{print tolower($0)}')-$(lsb_release -cs) $(lsb_release -cs) main"
$ sudo apt-get update
$ sudo apt-get install ccom

Install on Fedora

$ sudo dnf -y install dnf-plugins-core
$ sudo dnf config-manager --add-repo https://server.chillibits.com/files/repo/fedora.repo
$ sudo dnf install ccom

Install on CentOS

$ sudo yum install -y yum-utils
$ sudo yum-config-manager --add-repo https://server.chillibits.com/files/repo/centos.repo
$ sudo yum install ccom

Install on Raspbian

$ sudo apt-get install ca-certificates
$ curl -fsSL https://server.chillibits.com/files/repo/gpg | sudo apt-key add -
$ echo "deb [arch=armhf] https://repo.chillibits.com/$(lsb_release -is | awk '{print tolower($0)}')-$(lsb_release -cs) $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/chillibits.list
$ sudo apt-get update
$ sudo apt-get install ccom

Install on Windows

CCom is distributed for Windows via winget, the new Windows package manager. In the future, winget will be available for download in the Microsoft Store. Currently, the easiest way to install winget is to download it manually from GitHub. See the installation instructions from Microsoft.
As soon as the Windows package manager is installed on your Windows machine, you can open PowerShell and execute this installation command:

$ winget install ChilliBits.CCom

Use with Docker

Linux:

$ docker run --rm -it -v $(pwd):/ccom/out chillibits/ccom

Windows:

$ docker run --rm -it -v ${pwd}:/ccom/out chillibits/ccom

Note: This command does not work with Windows CMD command line. Please use Windows PowerShell instead.

Usage

In general, you call the CLI like so:
ccom [options] <input>

CLI options

Option                               Shortcut        Default  Description
--benchmark <number>                 -b <number>     0        Execute compiler benchmarks. <number> is the number of benchmark runs
--compiler <name>                    -c <name>       "cpp"    (Temporarily unavailable) Can be used to switch the compiler backend. Valid inputs are cpp and java
--data <data>                        -d <data>       {}       JSON string or path to JSON file, which holds the evaluation work data
--lang <lang>                        -l <lang>       "auto"*  File format / programming language (e.g. yaml, java, html, ...)
--mode-single                        -m              -        Set input mode to single statement list
--out-file <path>                    -o <path>       -        Path to output file. If you omit this flag, the output will be printed to the console
--silent                             -s              -        Only print raw compiler output and no debug output
--force                              -f              -        Ignore safety checks. Warning: This could cause damage
--line-comment-iden <string>         -lci <string>   "#"      Specifies the line comment char(s) of your data format
--block-comment-iden-open <string>   -bcio <string>  ""       Specifies the opening block comment char(s) of your data format
--block-comment-iden-close <string>  -bcic <string>  ""       Specifies the closing block comment char(s) of your data format

*) Lang "auto" determines the language based on the file extension of the input file.

Work process

The first thing CCom does is analyze the input and determine whether it is a single condition or a source file. This depends on how you call the CCom CLI. To switch to single statement mode, call it with the flag --mode-single.

Source file mode

CCom takes the whole file and feeds it into the interpreter, starting with the CONTENT grammar node.

Here is an example YAML file, which can be evaluated by CCom:

...
property1: "value1"
#? if has service.angular | has service.mysql {
# property2:
#   property3: true
#? }
...
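To make the idea of a conditional section more concrete, here is a minimal C++ sketch of a line-based pass over input that uses line comments only. It is not the CCom implementation, which works on a grammar as described further down. The function name `processLineSections` and the `evalCondition` callback are assumptions, the opening brace is expected on the same line as the `if`, and stripping exactly one space after the payload comment identifier is a simplification chosen for this example.

```cpp
#include <functional>
#include <sstream>
#include <string>

// Copies `input` to the result, keeping the payload of a conditional section only
// if `evalCondition` (hypothetical callback) returns true for the text between
// `if` and `{`. Every emitted line keeps its own line break.
std::string processLineSections(const std::string& input, const std::string& lineIden,
                                const std::function<bool(const std::string&)>& evalCondition) {
    const std::string sectionIden = lineIden + "?";
    std::istringstream lines(input);
    std::string output, line;
    bool inSection = false; // currently between `if ... {` and `}`
    bool keep = false;      // result of the current section's condition

    while (std::getline(lines, line)) {
        const std::size_t first = line.find_first_not_of(" \t");
        const std::string indent = line.substr(0, first == std::string::npos ? line.size() : first);
        const std::string rest = line.substr(indent.size());

        if (rest.compare(0, sectionIden.size(), sectionIden) == 0) {
            const std::size_t ifPos = rest.find("if ");
            const std::size_t bracePos = rest.rfind('{');
            if (!inSection && ifPos != std::string::npos &&
                bracePos != std::string::npos && bracePos > ifPos) {
                inSection = true;
                keep = evalCondition(rest.substr(ifPos + 3, bracePos - ifPos - 3));
            } else if (inSection && rest.find('}') != std::string::npos) {
                inSection = false;
            }
            continue; // lines carrying the section identifier are never emitted
        }

        if (!inSection) {
            output += line + "\n";
        } else if (keep) {
            std::string payload = rest;
            if (payload.compare(0, lineIden.size(), lineIden) == 0) {
                payload.erase(0, lineIden.size());
                if (!payload.empty() && payload[0] == ' ') payload.erase(0, 1);
            }
            output += indent + payload + "\n";
        }
    }
    return output;
}
```

Called with `"#"` as the line comment identifier and a callback that returns true, the section in the YAML example is replaced by its uncommented payload. With a callback that returns false, the whole section is dropped.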

Another example, this time for Java with line comments:

public class Example {
	public static void main(String[] args) {
		//? if has property_name | has property
		//? {
		// if () {
		//    System.out.println("True");
		// }
		//? }
		//? if not has property_name {
		// System.out.println("False");
		//? }
	}
}

Another Java example with block comments:

public class Example {
	public static void main(String[] args) {
		/*? if has property_name {
			if () {
				System.out.println("True");
			}
		}*/
		/*? if has property_name {
			if () {
				System.out.println("True");
			}
		}*/
	}
}

Single condition mode (--mode-single)

CCom feeds the condition into the interpreter, starting with the STMT_LST grammar node. The result is either true or false.

Here is an example input:

has service.angular | has service.frontend[1] | has backend
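The statements are combined with a logical or, so the result is true as soon as one statement is true. Here is a small C++ sketch of that evaluation, restricted to `has` and `not has` statements. It assumes that the existing keys have been flattened into a set of strings, which is a simplification, since CCom works on a JSON tree. The name `evalHasList` is hypothetical.

```cpp
#include <set>
#include <sstream>
#include <string>

// Evaluates a statement list that consists only of `has KEY` / `not has KEY`
// statements joined by `|`. `keys` is a hypothetical flattened view of the data.
bool evalHasList(const std::string& input, const std::set<std::string>& keys) {
    std::istringstream statements(input);
    std::string stmt;
    while (std::getline(statements, stmt, '|')) {
        std::istringstream words(stmt);
        std::string word, key;
        bool negate = false;
        words >> word;
        if (word == "not") {
            negate = true;
            words >> word;
        }
        // Statements that this sketch does not cover are skipped.
        if (word != "has" || !(words >> key)) continue;
        const bool exists = keys.count(key) > 0;
        if (exists != negate) return true; // logical or: one true statement is enough
    }
    return false;
}
```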

Data file

CCom needs a JSON data tree to work with. To pass a file or a JSON string to CCom, please use the --data flag.
An example file looks like this:

{
    "version": "0.7.0",
    "service": {
        "frontend": [
            {
                "label": "Spring Maven",
                "name": "spring-maven",
                "dir": "./spring-maven"
            },
            {
                "label": "Spring Gradle",
                "preselected": false
            }
        ],
        "backend": []
    },
    "var": [
        {
            "name": "SPRING_MAVEN_SOURCE_DIRECTORY"
        },
        {
            "name": "SPRING_MAVEN_PACKAGE_NAME",
            "value": "com.chillibits.test-app"
        }
    ]
}

To access 0.7.0, you can use the key version. To access ./spring-maven, you can use the key service.frontend[0].dir.
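A key is a dot-separated path in which each step can carry an array index. The following C++ sketch splits a key into such steps. `KeySegment` and `splitKey` are names made up for this example, and the function assumes well-formed input without reporting errors.

```cpp
#include <optional>
#include <string>
#include <vector>

// One step of a key: an identifier plus an optional array index.
struct KeySegment {
    std::string identifier;
    std::optional<unsigned> index;
};

// Splits a key such as "service.frontend[0].dir" into its segments.
std::vector<KeySegment> splitKey(const std::string& key) {
    std::vector<KeySegment> segments;
    std::size_t start = 0;
    while (start <= key.size()) {
        std::size_t end = key.find('.', start);
        if (end == std::string::npos) end = key.size();
        const std::string part = key.substr(start, end - start);

        KeySegment segment;
        const std::size_t bracket = part.find('[');
        if (bracket == std::string::npos) {
            segment.identifier = part;
        } else {
            segment.identifier = part.substr(0, bracket);
            // std::stoul stops parsing at the closing bracket.
            segment.index = static_cast<unsigned>(std::stoul(part.substr(bracket + 1)));
        }
        segments.push_back(segment);
        start = end + 1;
    }
    return segments;
}
```

For `service.frontend[0].dir` this yields three segments: `service`, `frontend` with index 0, and `dir`.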

Grammar

Note: The grammar depends on the line comment and block comment identifiers. In this particular case, the line comment identifier is //, the opening block comment identifier is /* and the closing block comment identifier is */.

Start symbol: CONTENT.

CONTENT               --> CHARS (SECTION CHARS)*
SECTION               --> COM_LINE_BLOCK | COM_BLOCK_BLOCK
COM_LINE_BLOCK        --> COM_LINE_IDEN if STMT_LST COM_LINE_IDEN? { PAYLOAD COM_LINE_IDEN }
COM_BLOCK_BLOCK       --> COM_BLOCK_IDEN_OPEN IF_BLOCK COM_BLOCK_IDEN_CLOSE
IF_BLOCK              --> if STMT_LST { PAYLOAD }
COM_LINE_IDEN         --> //?
COM_IDEN_PAYLOAD      --> //
COM_BLOCK_IDEN_OPEN   --> /*?
COM_BLOCK_IDEN_CLOSE  --> */
PAYLOAD               --> (COM_IDEN_PAYLOAD CHARS)+
STMT_LST              --> STMT (`|` STMT)*
STMT                  --> HAS_STMT | COMP_STMT | CONTAINS_STMT
HAS_STMT              --> not? has KEY
COMP_STMT             --> KEY COMP_OP VALUE
CONTAINS_STMT         --> KEY not? contains KEY COMP_OP VALUE
KEY                   --> IDENTIFIER INDEX? (.IDENTIFIER INDEX?)*
INDEX                 --> [NUMBER]
IDENTIFIER            --> (LETTER|UNDERSCORE) (LETTER* DIGIT* UNDERSCORE*)*
VALUE                 --> STRING | NUMBER | BOOLEAN
STRING                --> "CHARS_LIMITED"
NUMBER                --> DIGIT+
BOOLEAN               --> true | false
COMP_OP               --> == | != | < | > | <= | >=
CHARS                 --> ({UNICODE}\{COM_LINE_IDEN, COM_BLOCK_IDEN_OPEN})*
CHARS_LIMITED         --> (LETTER* DIGIT* SCHAR* UNDERSCORE*)*
LETTER                --> a|b|...|y|z|A|B|...|Y|Z
DIGIT                 --> 0|1|2|3|4|5|6|7|8|9
SCHAR                 --> -|.|[|]|{|}|/|\|'
UNDERSCORE            --> _
UNICODE               --> Any unicode character

Keywords

  • if
  • has
  • not
  • contains

Special characters

  • ?
  • |
  • .
  • "
  • { / }

AST Classes

Class diagram

Contribute to the project

If you want to contribute to this project, please ensure you comply with the contribution guidelines.

© Marc Auberer 2021-2022

ccom's Issues

CLI bug when no language was specified & Dockerfile was used

This bug was found when developing Compose Generator. It appears when not specifying a language and feeding CCom with a Dockerfile.

After CCom finishes executing, the contents of the Dockerfile look like this:

panic: runtime error: slice bounds out of range [1:0]

goroutine 1 [running]:
main.getCommentIdenFromLang(0x0, 0x0, 0x7ffc4edd2c14, 0x59, 0xc0000575d0, 0x76b020, 0x76b020, 0x0, 0x41311d, 0xc000016cb8)
	/home/runner/work/ccom/ccom/cli/preprocessor.go:157 +0x2a5
main.analyze(0xc0000e7c20, 0xc0000e7c68, 0xc0000e7c30, 0x0, 0x0, 0xc0000e7c90, 0xc0000e7c48, 0xc0000e7c58, 0x580000c000083500)
	/home/runner/work/ccom/ccom/cli/preprocessor.go:103 +0x24e
main.processInput(0x7ffc4edd2c14, 0x59, 0x600ad3, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7ffc4edd249d, ...)
	/home/runner/work/ccom/ccom/cli/preprocessor.go:32 +0xb7
main.main.func1(0xc00007ea80, 0x7ffc4edd2c14, 0x59)
	/home/runner/work/ccom/ccom/cli/ccom.go:97 +0x438
github.com/urfave/cli/v2.(*App).RunContext(0xc000080d00, 0x64a798, 0xc0000160c8, 0xc0000100a0, 0x5, 0x5, 0x0, 0x0)
	/home/runner/go/pkg/mod/github.com/urfave/cli/[email protected]/app.go:322 +0x6fe
github.com/urfave/cli/v2.(*App).Run(...)
	/home/runner/go/pkg/mod/github.com/urfave/cli/[email protected]/app.go:224
main.main()
	/home/runner/work/ccom/ccom/cli/ccom.go:116 +0xbdd
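
The trace ends in getCommentIdenFromLang, and a Dockerfile has no file extension, so the panic is consistent with slicing the leading dot off an empty extension. This is an assumption, as the Go source is not shown here. Below is a C++ sketch of a guard for that case. The helper name `langFromFileName` and the fallback to the lower-cased file name are illustrative choices, not CCom's actual behavior.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: derives a language name from a file path.
// Falls back to the lower-cased file name when there is no extension,
// so "Dockerfile" yields "dockerfile" instead of an out-of-range access.
std::string langFromFileName(const std::string& path) {
    // Strip directories; both separators are handled.
    const std::size_t slash = path.find_last_of("/\\");
    const std::string name = (slash == std::string::npos) ? path : path.substr(slash + 1);

    const std::size_t dot = name.find_last_of('.');
    // No dot, a leading dot only, or a trailing dot: use the whole name.
    const bool noExtension = dot == std::string::npos || dot == 0 || dot + 1 == name.size();
    std::string lang = noExtension ? name : name.substr(dot + 1);

    std::transform(lang.begin(), lang.end(), lang.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return lang;
}
```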

Move token types from lexer to token class

@marcauberer I think it'd be better to move the token types to the token header file.

enum TokenType {
    // Unknown token type
    TOK_UNKNOWN,
    // End of file
    TOK_EOF,
    // Keywords
    TOK_IF, // if
    TOK_HAS, // has
    TOK_NOT, // not
    // Boolean values
    TOK_TRUE, // true
    TOK_FALSE, // false
    // Operators
    TOK_OR, // |
    TOK_EQUALS, // ==
    TOK_NOT_EQUALS, // !=
    // Misc
    TOK_IDENTIFIER, // e.g. test
    TOK_NUMBER, // e.g. 123
    TOK_STRING, // "test"
    TOK_DOT, // .
    TOK_BRACE_OPEN, // {
    TOK_BRACE_CLOSE, // }
    TOK_INDEX, // [123]
    TOK_COM_IDEN_PAYLOAD, // //
    TOK_COM_LINE_IDEN, // //?
    TOK_COM_BLOCK_IDEN_OPEN, // /*?
    TOK_COM_BLOCK_IDEN_CLOSE, // */
    TOK_ARBITRARY // e.g. asd'!?fowen7a_=sdfkh%"
};

Error message on single statements without specifying lang

Currently, the CLI throws the following error when calling it with a single statement and without specifying a language via the --lang flag:

Please use lang 'auto' only in combination of valid file paths as file input

As a single statement does not have comment identifiers, no language should be needed to call CCom.

Missing support for underscores in identifier names

I wonder why we did not notice this earlier, but CCom currently does not support underscores in identifiers, which is definitely not the intended behavior.
The problem is that isalnum() does not return true when the input is an underscore.
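
A possible fix is to treat the underscore as an identifier character in addition to what `isalnum()` accepts. Here is a sketch of such a check that follows the IDENTIFIER rule of the grammar. The helper names `isIdentifierChar` and `isIdentifier` are assumptions and do not come from the CCom sources.

```cpp
#include <cctype>
#include <string>

// True for characters allowed inside an identifier: letters, digits and underscore.
// std::isalnum alone rejects '_', which is the gap described in this issue.
bool isIdentifierChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// True if `text` starts with a letter or underscore and continues with
// letters, digits or underscores only.
bool isIdentifier(const std::string& text) {
    if (text.empty()) return false;
    if (!std::isalpha(static_cast<unsigned char>(text[0])) && text[0] != '_') return false;
    for (const char c : text) {
        if (!isIdentifierChar(c)) return false;
    }
    return true;
}
```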

Line break bug in C++ compiler

The following C++ compiler bug was found while testing for Compose Generator.

Example:

...
#? if var.QUESTDB_ENABLE_PG_WIRE_ENDPOINT == "yes" {
#  - 8812:8812
#? }
#? if var.QUESTDB_ENABLE_INFLUX_LINE_ENDPOINT == "yes" {
#  - 9009:9009
#? }
#? if var.QUESTDB_ENABLE_HEALTH_ENDPOINT == "yes" {
#  - 9003:9003
#? }
...

It produces the following compile output if all conditions evaluate to true:

- 8812:8812 - 9009:9009 - 9003:9003

A workaround is to specify the conditional sections with a blank line between them:

...
#? if var.QUESTDB_ENABLE_PG_WIRE_ENDPOINT == "yes" {
#  - 8812:8812
#? }

#? if var.QUESTDB_ENABLE_INFLUX_LINE_ENDPOINT == "yes" {
#  - 9009:9009
#? }

#? if var.QUESTDB_ENABLE_HEALTH_ENDPOINT == "yes" {
#  - 9003:9003
#? }
...

Don't increase position on EOF

We shouldn't increase the position after having reached EOF, since this will otherwise screw up the std::min(...) calculation in getLookahead().

int advance() {
    InputStringPos++;
    if (InputStringPos > FileInput.length() - 1) {
        CurrentChar = EOF; // Return EOF, when string ends
    } else {
        CurrentChar = (unsigned char) FileInput[InputStringPos];
        ColNum++;
    }
    return CurrentChar;
}

std::string getLookahead() {
    int length = std::min((int) FileInput.length() - InputStringPos, (int) MaxLookahead);
    return FileInput.substr(InputStringPos, length);
}

In Java, I will implement it like this:

    /**
     * Consumes current char and advances to next character.
     */
    private void advance() {
        // Check for EOF
        if (position == file.length - 1) {
            next = 0; // 0 stands for EOF
            return;
        }

        // Increase head position (if not EOF)
        position++;

        // Read next char
        next = file[position];
        lineCol++;
    }
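
The same idea could look like this in C++. This is only a sketch: the `Reader` struct and its member names are made up for the example, the position starts at the first character, and it stops at the input length rather than at the last index as in the Java version.

```cpp
#include <algorithm>
#include <cstdio>
#include <string>
#include <utility>

// Hypothetical minimal reader that never moves its position past the end of input.
struct Reader {
    std::string input;
    std::size_t pos = 0;
    std::size_t maxLookahead = 4;
    int currentChar = EOF;

    explicit Reader(std::string text) : input(std::move(text)) {
        currentChar = input.empty() ? EOF : static_cast<unsigned char>(input[0]);
    }

    // Moves to the next character. Once EOF is reached, pos stays at input.size().
    int advance() {
        if (pos < input.size()) pos++;
        currentChar = pos < input.size() ? static_cast<unsigned char>(input[pos]) : EOF;
        return currentChar;
    }

    // pos <= input.size() always holds, so the subtraction cannot underflow.
    std::string lookahead() const {
        const std::size_t length = std::min(input.size() - pos, maxLookahead);
        return input.substr(pos, length);
    }
};
```

Repeated calls to `advance()` at the end of the input keep returning EOF without changing the position, so the lookahead length stays at zero.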

Replace method with constructor

Why is this not a constructor?

void initLexer(bool isSingleStatement, const std::string& inputFileInput, const std::string& inputLineCommentChars,
               const std::string& inputBlockCommentCharsOpen, const std::string& inputBlockCommentCharsClose) {
    fileInput = inputFileInput;
    // Build conditional comment chars, based on comment chars input
    lineCommentChars = inputLineCommentChars.empty() ? "" : inputLineCommentChars + "?";
    blockCommentCharsOpen = inputBlockCommentCharsOpen.empty() ? "" : inputBlockCommentCharsOpen + "?";
    blockCommentCharsClose = inputBlockCommentCharsClose;
    payloadCommentChars = inputLineCommentChars;
    maxLookahead = std::max({lineCommentChars.length(), blockCommentCharsOpen.length(),
                             blockCommentCharsClose.length(), payloadCommentChars.length()}) + 1;
    currentContext = isSingleStatement ? SECTION : ARBITRARY;
    // Load first char into the buffer
    advance();
}
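
As a constructor, the same initialization could be written with a member initializer list. The sketch below keeps the member names from the snippet but trims the class to the comment identifier members. The context selection and the initial `advance()` call are left out, and the getters are hypothetical additions for demonstration.

```cpp
#include <algorithm>
#include <string>

// Trimmed-down lexer that is initialized in its constructor.
class Lexer {
public:
    Lexer(const std::string& inputFileInput, const std::string& inputLineCommentChars,
          const std::string& inputBlockCommentCharsOpen, const std::string& inputBlockCommentCharsClose)
        : fileInput(inputFileInput),
          // Build conditional comment chars, based on comment chars input
          lineCommentChars(inputLineCommentChars.empty() ? "" : inputLineCommentChars + "?"),
          blockCommentCharsOpen(inputBlockCommentCharsOpen.empty() ? "" : inputBlockCommentCharsOpen + "?"),
          blockCommentCharsClose(inputBlockCommentCharsClose),
          payloadCommentChars(inputLineCommentChars),
          maxLookahead(std::max({lineCommentChars.length(), blockCommentCharsOpen.length(),
                                 blockCommentCharsClose.length(), payloadCommentChars.length()}) + 1) {}

    const std::string& getLineCommentChars() const { return lineCommentChars; }
    std::size_t getMaxLookahead() const { return maxLookahead; }

private:
    // Declaration order matters: maxLookahead is computed from the members before it.
    std::string fileInput;
    std::string lineCommentChars;
    std::string blockCommentCharsOpen;
    std::string blockCommentCharsClose;
    std::string payloadCommentChars;
    std::size_t maxLookahead;
};
```

A constructor rules out using a lexer object before it has been initialized, which a separate init method cannot guarantee.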

Operator == false is working, but != true is not

Randomly stumbled over this bug while testing:

For example: someIdentifier == false returns true, but the inverse operation someIdentifier != true returns false.

The problem might be the evaluation of comparison statements in the interpreter.
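
The root cause in the interpreter is not known here. One way to rule out this class of bug is to define `!=` strictly as the negation of `==`, so that both operators cannot disagree. Here is a sketch for boolean operands, where the function name `compareBool` is an assumption:

```cpp
#include <stdexcept>
#include <string>

// Evaluates `actual OP expected` for boolean operands.
// `!=` is defined as the negation of `==`, so the two can never contradict each other.
bool compareBool(bool actual, const std::string& op, bool expected) {
    if (op == "==") return actual == expected;
    if (op == "!=") return !compareBool(actual, "==", expected);
    throw std::invalid_argument("unsupported operator for booleans: " + op);
}
```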

Don't throw an error when contains stmt or comp stmt are applied on a non-existent field

For Compose Generator we have the following single statement:

services.database contains name == "elasticsearch"

This is problematic, because currently CCom terminates with an error if the key services.database does not exist. Same thing for compStmts.
As we currently only support a logical or (|) and no logical and, we cannot check if the key services.database exists before executing the statement above.

A solution for that would be to simply evaluate comp and contain statements to false if the key does not even exist.
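
The proposed behavior can be sketched as follows, assuming a flattened data tree that maps full keys to string values. The `Data` alias and the function `evalEquals` are made up for this example.

```cpp
#include <map>
#include <string>

// Hypothetical flattened data tree: full key -> string value.
using Data = std::map<std::string, std::string>;

// Evaluates `KEY == VALUE`. A missing key makes the statement false
// instead of terminating with an error.
bool evalEquals(const Data& data, const std::string& key, const std::string& expected) {
    const auto it = data.find(key);
    if (it == data.end()) return false; // key does not exist
    return it->second == expected;
}
```

With this rule, every comparison on a missing key would evaluate to false, which also applies to `!=`.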

Contains Stmt

Discussed in #37

Originally posted by marcauberer June 1, 2021
Compose Generator needs a feature that can search an input data array for an element whose attribute <x> has value <y>.

I had the idea to introduce a new statement type. Here is an example for the suggestion:
service.frontend contains name == "spring-maven"
or
vars contains number >= 5

As you can see, this would give us the possibility to not only check if attribute <x> is equal to value <y>. We could also check if an array contains any element with attribute <x> having a value not equal to <y>, or use more comparison operators like >, <, >=, <=.
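
For the equality case, the suggested semantics can be sketched in C++ like this. Array elements are modeled as string maps, which is a simplification of the JSON data, and `Element` and `containsEquals` are hypothetical names.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Simplified array element: attribute name -> string value.
using Element = std::map<std::string, std::string>;

// True if any element of `array` has attribute `attr` with a value equal to `value`.
// Elements that lack the attribute do not match.
bool containsEquals(const std::vector<Element>& array, const std::string& attr,
                    const std::string& value) {
    return std::any_of(array.begin(), array.end(), [&](const Element& element) {
        const auto it = element.find(attr);
        return it != element.end() && it->second == value;
    });
}
```

Other comparison operators would replace the equality test inside the lambda.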

Line breaks in conditional sections being ignored

When having line breaks within one conditional section, these are removed after compiling. That causes potential parsing errors later for CG.

Example:

image: adguard/adguardhome:v${{ADGUARD_VERSION}}
container_name: ${{PROJECT_NAME_CONTAINER}}-frontend-adguard
restart: always
volumes:
  - ${{VOLUME_ADGUARD_DATA}}:/opt/adguardhome/work
  - ${{VOLUME_ADGUARD_CONFIG}}:/opt/adguardhome/conf
ports:
  - 53:53/tcp
  - 53:53/udp
#? if var.ADGUARD_ENABLE_DHCP == "true" {
#   - 67:67/udp
#   - 68:68/tcp
#   - 68:68/udp
#? }
#? if var.ADGUARD_ENABLE_UI == "true" {
#   - 80:80/tcp
#   - 443:443/tcp
#   - 443:443/udp
#   - 3000:3000/tcp
#? }
#? if var.ADGUARD_ENABLE_DOT == "true" {
#   - 853:853/tcp
#? }
#? if var.ADGUARD_ENABLE_DOQ == "true" {
#   - 784:784/udp
#   - 853:853/udp
#   - 8853:8853/udp
#? }
#? if var.ADGUARD_ENABLE_DNS_CRYPT == "true" {
#   - 5443:5443/tcp
#   - 5443:5443/udp
#? }

The log message from CG is:

ERROR: 2021/11/01 11:20:59.593752 load.go:170: Unable to load 'frontend-adguard-home': 1 error(s) decoding:

* error decoding 'Ports': Invalid ip address 53:53/udp
- 80:80/tcp - 443:443/tcp - 443: address 53:53/udp
- 80:80/tcp - 443:443/tcp - 443:: too many colons in address

Get rid of `TOK_UNKNOWN`

What is the purpose of having a token where we can't be sure about what it is exactly? Could you just throw an error here?

// Otherwise, just return the character as its ascii value.
Token result = Token(TOK_UNKNOWN, std::string(1, (char) curChar), lineNum, colNum);
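
Throwing could look roughly like the sketch below. The helper names `unexpectedCharMessage` and `failOnUnexpectedChar`, the exception type and the message format are assumptions for illustration.

```cpp
#include <stdexcept>
#include <string>

// Builds the error message for a character the lexer cannot classify.
std::string unexpectedCharMessage(char c, unsigned lineNum, unsigned colNum) {
    return "Unexpected character '" + std::string(1, c) + "' at " +
           std::to_string(lineNum) + ":" + std::to_string(colNum);
}

// Throws instead of producing a token of unknown type.
[[noreturn]] void failOnUnexpectedChar(char c, unsigned lineNum, unsigned colNum) {
    throw std::runtime_error(unexpectedCharMessage(c, lineNum, colNum));
}
```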
