linguist-unknown's Introduction

Linguist Unknown

Linguist Unknown is a web browser extension for GitHub.com that detects and highlights unknown, lost or new programming languages. You can also override the syntax highlighting of known languages such as C, JavaScript and many others!

See CONTRIBUTING.md before creating a pull request.

Why should you download it?

There are numerous cool languages out there whose syntax is not highlighted on GitHub. That happens because the Linguist project targets only the most widely used programming languages.

As a result, reading source code written in a new or unknown language on GitHub is often frustrating. Linguist Unknown helps new, lost or unknown languages get highlighted on GitHub, just as they are in your favorite text editor.

We believe that all languages should be highlighted on GitHub, the way it should always have been. :) There is an ocean of programming languages out there, and by downloading Linguist Unknown you're making every drop of this ocean count!

How can I download and use it?

Two Simple Steps:

On/Off

How to highlight my language(s)?

  1. Download and install Linguist Unknown.
  2. Add a file named .linguist.yml to the root of your GitHub repository to tell Linguist Unknown about your language grammar(s).
  3. Write your grammar rules. The example below tells Linguist Unknown that you have a programming language called Foo whose extensions are .foo and .bar. It also says that Foo's single-line comments start with //, whereas its multiline comments are delimited by /* and */. Last but not least, it defines the colors of your tokens (e.g. identifier.color, number.color) and the color groups for your grammar's keywords, operators and custom regexes:
Foo:
  extensions:
    - ".foo"
    - ".bar"

  default:
    color: "#808A9F"

  identifier:
    color: "#333333"

  number:
    color: "#FF6600"

  string:
    color: "#333300"

  comment:
    color: "#CCF5AC"
    single_line: "//"
    begin_multiline: "/*"
    end_multiline: "*/"

  group:
    - color: "#72EEBB"
      operators:
        - "==="
        - ">="
      keywords:
        - "int"
        - "float"
      regexes:
        - regex: "&(amp;)\/[^\/]*\/([\\S]?)*"
          modifier: ""

    - color: "#FF00FF"
      keywords:
        - "if"
        - "else"
        - "switch"
        - "let"

    - color: "#000000"
      multiline:
        - begin: "\"\"\""
          end:   "\"\"\""
  4. Test it. Go to https://github.com/your/repository/path/to/file.foo or https://github.com/your/repository/path/to/file.bar and check that it is highlighted! Simple as that!

// Note: Make sure you refresh your browser's cached data.

Examples

Brain Language
Brain:
  extensions:
    - ".br"
    - ".brain"

  default:
    color: "#969896"

  group:
    - color: "#a71d5d"
      operators:
        - ">"
        - "<"
        - "^"
        - "&lt;"
        - "&gt;"

    - color: "#333333"
      operators:
        - "["
        - "]"
        - "{"
        - "}"
        - "?"
        - ":"
        - ";"
        - "!"

    - color: "#0086b3"
      operators:
        - "+"
        - "-"
        - "*"
        - "/"
        - "%"
        - "_"

    - color: "#795da3"
      operators:
        - "."
        - ","
        - "$"
        - "#"
Output

Brain

Test
Test:
  extensions:
    - ".test"

  default:
    color: "#FF8272"

  identifier:
    color: "#FF99FF"

  number:
    color: "#FF6600"

  string:
    color: "#333300"
  
  comment:
    color: "#969896"
    single_line: "//"
    begin_multiline: "/*"
    end_multiline: "*/"

  group:
    - color: "#72EEBB"
      keywords:
        - "for"
        - "while"
      regexes:
        - regex: "&(amp;)\/[^\/]*\/([\\S]?)*"
          modifier: ""

    - color: "#FF00FF"
      keywords:
        - "if"
        - "else"
        - "switch"
        - "let"
Output

Test

How do I know if it works?

After downloading and installing it, visit one (or all) of the cool languages we have gathered in this repository:

| Language | GitHub (or info) Repository | URL to test | Test file written by |
| --- | --- | --- | --- |
| AdvPL | AdvPL repo | ./examples/AdvPL/JSONTest.prw | haskellcamargo |
| Brain | Brain repo | ./examples/Brain/human_jump.brain | luizperes |
| Brainfuck | Brainfuck (Wikipedia) | ./examples/Brainfuck/hellbox.bf | Robert de Bath |
| BrazukaScript | BrazukaScript repo | ./examples/BrazukaScript/wesley_safadao.bra | luizperes |
| C | C (Wikipedia) | ./examples/C/io.c | luizperes |
| Capybara | Capybara repo | ./examples/Capybara/helloworld.capy | haskellcamargo |
| Headache | Headache repo | ./examples/Headache/func.ha | LucasMW |
| Monga | Monga repo | ./examples/Monga/bf.monga | LucasMW |
| Moon | Moon repo | ./examples/Moon/examples.moon | MaiaVictor |
| Quack | Quack repo | ./examples/Quack/fn_stmt.qk | luizperes |
| Siren | Siren repo | ./examples/Siren/100-doors.siren | robotlolita |
| Test | -- | ./examples/Test/test.test | luizperes |

If they're highlighted, you're good to go!

How do I know my YAML is valid?

Please read the documentation and check if your YAML is valid here

Documentation

Multiple languages in same repo

It's simple, in your .linguist.yml:

Foo:
  extensions:
    - ".foo"
  # ... other rules

Bar:
  extensions:
    - ".bar"
  # ... other rules
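
As a rough illustration of how several languages in one .linguist.yml could be told apart, the sketch below picks the language whose extensions match a file path. It assumes the YAML has already been parsed into a plain object; `findLanguageForPath` is a hypothetical helper, not a function from the Linguist Unknown source.

```javascript
// Hypothetical helper (not from the extension's source): given the parsed
// .linguist.yml object, return the first language whose extension list
// matches the end of the file path, or null when nothing matches.
function findLanguageForPath(config, path) {
  for (const [name, rules] of Object.entries(config)) {
    const extensions = rules.extensions || [];
    if (extensions.some((ext) => path.endsWith(ext))) {
      return { name, rules };
    }
  }
  return null;
}

// Parsed form of the two-language example above.
const config = {
  Foo: { extensions: [".foo"] },
  Bar: { extensions: [".bar"] },
};

const match = findLanguageForPath(config, "path/to/file.bar");
console.log(match ? match.name : "no language matched");
```

Returning null for unmatched paths lets the caller leave GitHub's own rendering untouched.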
extensions

List of extensions for your language.

extensions:
  - ".ext1"
  - ".ext2"
default

The default configurations for your language.

default.color

All tokens with undefined color will have this color. If this color is not defined, it will use GitHub's default color: #24292e.

default:
  color: "#F00BAF"
identifier

The rules for identifiers in your language. // Note: Right now we only have the color property, but we may add other properties later, such as custom identifiers.

identifier.color

The color for your language's identifiers. If this color is undefined, it will use the property default.color instead.

identifier:
  color: "#F00BAF"
number

The rules for numbers in your language. // Note: Right now we only have the color property, but we may add other properties later, such as custom numbers.

number.color

The color for your language's numbers. If this color is undefined, it will use the property default.color instead.

number:
  color: "#F00BAF"
string

The rules for strings in your language. // Note: Right now we only have the color property, but we may add other properties later, such as custom strings.

string.color

The color for your language's strings. If this color is undefined, it will use the property default.color instead.

string:
  color: "#F00BAF"
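
The fallback described for identifier.color, number.color and string.color can be sketched as a small lookup: use the token's own color, else default.color, else GitHub's default #24292e. This is only an illustration under the assumption that the language rules are available as a parsed object; `resolveColor` and `GITHUB_DEFAULT_COLOR` are hypothetical names, not part of the extension's code.

```javascript
// GitHub's default text color, as stated in the default.color section.
const GITHUB_DEFAULT_COLOR = "#24292e";

// Hypothetical helper: resolve the color for a token type ("identifier",
// "number", "string", ...) using the documented fallback order.
function resolveColor(language, tokenType) {
  const rule = language[tokenType];
  if (rule && rule.color) {
    return rule.color; // the token type defines its own color
  }
  if (language.default && language.default.color) {
    return language.default.color; // fall back to default.color
  }
  return GITHUB_DEFAULT_COLOR; // nothing defined at all
}

const foo = {
  default: { color: "#808A9F" },
  identifier: { color: "#333333" },
};

console.log(resolveColor(foo, "identifier")); // own color
console.log(resolveColor(foo, "number"));     // default.color
console.log(resolveColor({}, "string"));      // GitHub default
```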

comment

Group of lexemes related to your comment tokens.

comment.color

The color for your language's comments. If this color is undefined, it will use the property default.color instead.

comment:
  color: "#F00BAF"
  ...
single_line

The lexeme for your single line comments, such as //, # and others

comment:
  single_line: "//"
  # ... other rules
begin_multiline

The lexeme for the beginning of your multiline comments, such as /*, { and others

comment:
  begin_multiline: "/*"
  # ... other rules
end_multiline

The lexeme for the end of your multiline comments, such as */, } and others

comment:
  end_multiline: "*/"
group

Represents a list of color rules for your keywords, operators and others. Example:

group:
  - color: "#F00BAF"
    keywords:
      - "if"
      - "while"
      - "for"
  - color: "#333333"
    keywords:
      - "int"
      - "float"
    operators:
      - "==="
      - "!=="
      - "=="
    multiline:
      - begin: "<begin>"
        end:   "</end>"
  - color: "#FF0000"
    regexes:
      - regex: "&(amp;)\/[^\/]*\/([\\S]?)*"
        modifier: ""
      - regex: "^#(?:[0-9a-fA-F]{3}){1,2}"
        modifier: "i"
group.color

Defines the color of a group of keywords, operators and others (such as regexes). If undefined, it will use the property default.color instead.

group:
  - color: "#F00BAF"

  # ... other rules
group.keywords

Defines a list of keywords for a color group.

group:
  - color: "#F00BAF"
    keywords:
      - "if"
      - "else"

  # ... other rules
group.operators

Defines a list of operators for a color group.

group:
  - color: "#F00BAF"
    operators:
      - "=="
      - "!="
      - ">"

  # ... other rules
group.regexes

Defines a list of regexes for a color group. Use regexes to match custom lexemes that Linguist Unknown does not recognize by default. For example, if #FFFFFF is a valid lexeme in your language, you could highlight it in red like this:

group:
  - color: "#FF0000"
    regexes:
      - regex: "^#(?:[0-9a-fA-F]{3}){1,2}"
        modifier: ""

  # ... other rules
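
To make the regex and modifier pair more concrete, here is a minimal sketch that turns each entry into a JavaScript RegExp, treating the modifier string as the RegExp flags. That mapping is an assumption made for illustration, and `compileGroupRegexes` is a hypothetical helper rather than the extension's actual code.

```javascript
// Hypothetical helper: compile every { regex, modifier } entry of a color
// group into a RegExp. Assumes "modifier" holds RegExp flags such as "i".
function compileGroupRegexes(group) {
  return (group.regexes || []).map(
    (entry) => new RegExp(entry.regex, entry.modifier || "")
  );
}

// Parsed form of the red hex-color rule above.
const redGroup = {
  color: "#FF0000",
  regexes: [{ regex: "^#(?:[0-9a-fA-F]{3}){1,2}", modifier: "" }],
};

const [hexColor] = compileGroupRegexes(redGroup);
console.log(hexColor.test("#FFFFFF")); // matches: "#" followed by six hex digits
console.log(hexColor.test("FFFFFF"));  // does not match: the leading "#" is missing
```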
group.multiline

Defines a list of multiline lexemes for a color group. It is very useful when you have a lexeme that takes multiple lines (not intended to be used for comments).

group:
  - color: "#FF00FF"
    multiline:
      - begin: "<table>"
        end:   "</table>"

  # ... other rules
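
One way to picture a multiline rule is as a scan from the begin lexeme to the first end lexeme that follows it. The sketch below does exactly that; `readMultiline` is a hypothetical helper, and letting an unterminated lexeme run to the end of the input is an assumption, not documented behavior.

```javascript
// Hypothetical helper: if a multiline lexeme starts at `index`, return the
// full lexeme text (begin and end included); otherwise return null.
function readMultiline(input, index, rule) {
  if (!input.startsWith(rule.begin, index)) {
    return null;
  }
  const endAt = input.indexOf(rule.end, index + rule.begin.length);
  if (endAt === -1) {
    // Assumption: an unterminated lexeme extends to the end of the input.
    return input.slice(index);
  }
  return input.slice(index, endAt + rule.end.length);
}

const tripleQuote = { begin: '"""', end: '"""' };
const source = 'x = """a\nb""" + 1';

console.log(JSON.stringify(readMultiline(source, 4, tripleQuote)));
```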

Contributing

Feel free to send your pull requests. Read our CONTRIBUTING.md file :)

LICENSE

This project is licensed under the GNU GPL v3, so be aware of its terms regarding copying, modifying and (re)distributing.

linguist-unknown's Issues

Implement Test Files

DownloadHelper

  • YAML should be downloaded and parsed properly.

Utilities

  • Method tryMatchUrlExtension should callback a function with a single Language object.
  • Check if method getPossibleFilepath is returning an appropriate url.
  • Check if method refresh is drawing properly (if possible)

Token

  • Check if tokens are being created properly // it will be checked on the lexer

Highlighter

  • Check functions: openSpan, closeSpan and getSpan
  • Check function isId
  • Check function isNumber
  • Check function isLiteralString
  • Check function startsWith
  • Check function matchRegex
  • Check function getId
  • Check function getNumber
  • Check function getLiteralString
  • Check function getMultilineComment
  • Check if function lexer is producing the tokens according to the Language object. // Check the 'Token' item once it is done
  • Check if function paint is producing the right span colors, according to the Token list
  • Check function draw, if possible

Bootstrap

  • Test the chrome variables, if possible

.linguist.yml

  • Create tests with many languages testing all properties of the Language object, if possible

Note:

I am not sure whether all of these tests can be implemented, which is why some bullet points say "if possible": we will assess their viability first. If a test turns out not to be feasible, we will simply check off the bullet point without implementing it.

Highlighter generating extra spans

The highlighter is sometimes generating the following code:

<span style="color:#969896;">/* comment */</span>
<span style="color:#FF8272;">  ==== </span>
<span style="color:#FF00FF;">if</span>
<span style="color:#FF8272;"> </span>
<span style="color:#FF99FF;">huhdufd</span>
<span style="color:#FF8272;"> </span>
<span style="color:#FF00FF;">let</span>
<span style="color:#FF8272;"> </span>
<span style="color:#FF99FF;">fdjijfdf</span>
<span style="color:#FF8272;"> == !!!!! </span>
<span style="color:#72EEBB;">while</span>
<span style="color:#FF8272;"> </span>
<span style="color:#72EEBB;">for</span>
<span style="color:#FF8272;"> </span>
<span style="color:#FF00FF;">let</span>
<span style="color:#FF8272;"> </span>
<span style="color:#72EEBB;">for</span>
<span style="color:#FF8272;"> </span>
<span style="color:#72EEBB;">for</span>
<span style="color:#FF8272;"> </span>
<span style="color:#72EEBB;">for</span>
<span style="color:#FF8272;"> </span>
<span style="color:#FF99FF;">rfrereer</span>
<span style="color:#FF8272;"> ===</span>

Looking at it again, you'll find the code <span style="color:#FF8272;"> </span> repeated many times. This is a bug: whitespace should be merged into the nearest span (left or right) instead of getting a span of its own.
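
A possible shape for the fix, sketched under the assumption that the highlighter works on a list of tokens with `color` and `text` fields: whitespace-only tokens are appended to the token on their left, or prepended to the first real token when nothing precedes them. `mergeWhitespaceTokens` is a hypothetical name, not a function in the current source.

```javascript
// Hypothetical helper: fold whitespace-only tokens into a neighbouring token
// so that they no longer produce spans of their own.
function mergeWhitespaceTokens(tokens) {
  const merged = [];
  let leading = ""; // whitespace seen before the first real token
  for (const token of tokens) {
    const isBlank = token.text.trim() === "";
    if (!isBlank) {
      merged.push({ color: token.color, text: leading + token.text });
      leading = "";
    } else if (merged.length > 0) {
      merged[merged.length - 1].text += token.text; // attach to the left
    } else {
      leading += token.text; // nothing on the left yet, attach to the right
    }
  }
  // Input made only of whitespace: keep it as a single uncolored token.
  if (leading !== "") {
    merged.push({ color: null, text: leading });
  }
  return merged;
}

const tokens = [
  { color: "#FF00FF", text: "if" },
  { color: "#FF8272", text: " " },
  { color: "#FF99FF", text: "huhdufd" },
];

console.log(mergeWhitespaceTokens(tokens));
```

With the three tokens above, the result has two entries: "if " and "huhdufd".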

.linguist.yml needs some refactoring

I believe we should do some refactoring for the .linguist.yml file.

We can keep the keyword grammar and organize it more or less like:
linguist

We could also implement something like operator overloading with a pattern keyword, for example:

pattern:
  id: regex
  number: regex

getNumber function is not working properly

We will need to reimplement the getNumber function inside the Highlighter. There is a bug at https://github.com/github-aux/linguist-unknown/blob/development/src/scripts/ling-highlighter.js#L150

The tests fail when we try to parse a number at index 6 of the input ******1_1_1_BLAH111******. It should return 1, but it returns 1_1.

It also seems that the isNumber function does not recognize numbers that start with + or -, such as -9 or +1. Update: the parser will not recognize this kind of number, in order to reduce ambiguity. Users will need to create their own rules for that.
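
The expected behavior can be sketched with an anchored regular expression that reads digits starting at a given index and stops at the first character that is not part of the number. This is not the project's getNumber; `readNumber` is a hypothetical stand-in, and accepting an optional fractional part is an assumption. Signs are deliberately not consumed, in line with the update above.

```javascript
// Hypothetical stand-in for getNumber: return the number lexeme that starts
// at `index`, or null if no digit starts there. Assumes digits with an
// optional fractional part; "_" and leading "+" or "-" are not part of it.
function readNumber(input, index) {
  const match = /^\d+(?:\.\d+)?/.exec(input.slice(index));
  return match ? match[0] : null;
}

console.log(readNumber("******1_1_1_BLAH111******", 6)); // the case from the failing test
console.log(readNumber("-9", 0)); // signed numbers are left to user-defined rules
```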

Travis error: https://travis-ci.org/github-aux/linguist-unknown/jobs/241995488

Advertise package v1.0

  • Search where to advertise
  • Make a proper plan, such as creating a GitHub page or writing an article about it. Find out the best days to post on social media as well.
  • Take one day to post on all of the social media websites we found. Make sure not to post at different hours, since we want to get as many stars as possible to make it onto GitHub Trending.

Try-catch won't suppress server error

Despite the try-catch at https://github.com/github-aux/linguist-unknown/blob/chrome/src/scripts/ling-loader.js#L13, a console error is always printed when the status of an XMLHttpRequest is 404.
error_linguist
According to several sources (here, here and here), "Unfortunately, our browser vendors don't allow us to suppress network errors in 2016." This still seems to be the case in 2017. However, there is a workaround: putting a proxy between the script and the server.
For now, though, I will leave this open for future discussion. Should it be considered a known bug or expected behavior?

Create package v1.0

Chrome

  • Generate .min.js files on chrome branch
  • Create tag v1.0-chrome on chrome branch
  • Upload chrome extension for tag v1.0-chrome

Firefox

  • Generate .min.js files on firefox branch
  • Create tag v1.0-firefox on firefox branch
  • Upload firefox add-on for tag v1.0-firefox

Support for multiline token

Today we only support multiline comments. It would be good to add multiline tokens to the rules, so that languages with multiline tokens can be supported.

see image below:
siren
