
gyp's People

Contributors

gazunder, guspascual, metthal, plusvic, targodan, wayrick, wxsbsd, zohiartze

gyp's Issues

Inconsistency with official libyara grammar

While attempting to parse a valid YARA ruleset with the latest version of master, gyp.Parse() returned the following error:
syntax error: unexpected _HEX_NUMBER_, expecting _NUMBER_

The relevant snippet of the line in the ruleset where the error occurred is as follows:
xor(0x01-0xff)

I've verified that v0.9.0 returns no error, so I believe the bug was introduced in the latest commit: the new _HEX_NUMBER_ and _OCT_NUMBER_ tokens introduced an inconsistency with the official libyara grammar, and the definition of the xor string modifier was not updated.
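
To illustrate the behavior the reporter expects, here is a minimal standalone sketch that accepts both decimal and hexadecimal bounds in an xor range. It is not gyp's grammar; the function name parseXorRange and the 0-255 bounds check are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseXorRange parses the argument of an xor modifier such as "0x01-0xff"
// or "1-255". Base 0 lets strconv accept decimal, 0x-prefixed hex and
// 0o-prefixed octal literals alike. (Hypothetical helper, not part of gyp.)
func parseXorRange(arg string) (lo, hi int64, err error) {
	a, b, found := strings.Cut(arg, "-")
	if !found {
		b = a // a single value, e.g. xor(0x10)
	}
	if lo, err = strconv.ParseInt(strings.TrimSpace(a), 0, 64); err != nil {
		return 0, 0, err
	}
	if hi, err = strconv.ParseInt(strings.TrimSpace(b), 0, 64); err != nil {
		return 0, 0, err
	}
	// Assumed constraint: xor keys are single bytes and the range is ordered.
	if lo < 0 || hi > 255 || lo > hi {
		return 0, 0, fmt.Errorf("invalid xor range %d-%d", lo, hi)
	}
	return lo, hi, nil
}

func main() {
	lo, hi, err := parseXorRange("0x01-0xff")
	fmt.Println(lo, hi, err) // 1 255 <nil>
}
```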

panic on protobuf marshalling when condition contains filesize or entrypoint

Using y2j to generate JSON from a YARA file and then j2y to convert the JSON back to a YARA file triggers a panic.
This happens if a rule contains either filesize or entrypoint in the condition. It seems to be a problem in the protobuf (un)marshalling of the condition.

panic: unexpected node "*pb.Expression_Keyword"

goroutine 1 [running]:
github.com/VirusTotal/gyp/ast.expressionFromProto(0xc000108dc0, 0x2, 0x2)
	/home/wayrick/code/external/VirusTotal/gyp/ast/serialization.go:475 +0x11cb
github.com/VirusTotal/gyp/ast.expressionsFromProto(0xc000163bd8, 0x2, 0x2, 0x400, 0x7f4b75086e00, 0x20300000000000)
	/home/wayrick/code/external/VirusTotal/gyp/ast/serialization.go:381 +0x75
github.com/VirusTotal/gyp/ast.createOperationExpression(0x6b9cce, 0x1, 0xc000163bd8, 0x2, 0x2, 0xc000000180, 0xc000163ba8)
	/home/wayrick/code/external/VirusTotal/gyp/ast/serialization.go:113 +0x4c
github.com/VirusTotal/gyp/ast.binaryExpressionFromProto(0xc000108c80, 0xc0000662a0, 0x90d7c0)
	/home/wayrick/code/external/VirusTotal/gyp/ast/serialization.go:374 +0x4b4
github.com/VirusTotal/gyp/ast.expressionFromProto(0xc000108c00, 0x0, 0x0)
	/home/wayrick/code/external/VirusTotal/gyp/ast/serialization.go:469 +0xe7b
github.com/VirusTotal/gyp/ast.RuleFromProto(0xc0001983f0, 0x1)
	/home/wayrick/code/external/VirusTotal/gyp/ast/serialization.go:44 +0x25d
github.com/VirusTotal/gyp/ast.RuleSetFromProto(0xc000140c40, 0xd)
	/home/wayrick/code/external/VirusTotal/gyp/ast/serialization.go:16 +0x94
main.main()
	/home/wayrick/code/external/VirusTotal/gyp/cmd/j2y/main.go:49 +0x251

Either of these two minimal rules reproduces the error:

rule test_entrypoint {
condition:
    entrypoint > 0
}

rule test_filesize {
condition:
    filesize > 0
}

It seems to only be tied to filesize and entrypoint.

Incorrect parsing of multi-line comments inside hex strings

The parser is handling multi-line comments inside hex strings incorrectly. Let's use the following rule as an example:

rule TEST {
    strings:
      $ = {
            01 [5]    /* comment 1 */
            02        /* comment 2 */
      }
    condition:
      all of them
}

The rule is correct, but the parser is returning the following error:

unexpected RBRACE, expecting BYTE or MASKED_BYTE or LBRACKET or LPARENS

This is because once the parser finds the opening /* of the first comment, it greedily consumes all characters up to the closing */ of the second comment. As a result the 02 byte falls inside the comment, so the parser actually sees the string as { 01 [5] }, which is syntactically invalid.
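
The difference between greedy and non-greedy comment matching can be reproduced outside the parser. The sketch below uses Go regular expressions as a stand-in; it is not gyp's actual lexer rule, and the variable and function names are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Greedy: ".*" runs to the LAST "*/" in the input.
	greedyComment = regexp.MustCompile(`(?s)/\*.*\*/`)
	// Non-greedy: ".*?" stops at the FIRST "*/" after each "/*".
	lazyComment = regexp.MustCompile(`(?s)/\*.*?\*/`)
)

// stripComments removes every match of re from the hex string body.
func stripComments(re *regexp.Regexp, src string) string {
	return re.ReplaceAllString(src, "")
}

func main() {
	src := "{ 01 [5] /* comment 1 */\n 02 /* comment 2 */ }"
	// The greedy pattern swallows the 02 byte along with both comments.
	fmt.Printf("%q\n", stripComments(greedyComment, src))
	// The non-greedy pattern removes each comment separately and keeps 02.
	fmt.Printf("%q\n", stripComments(lazyComment, src))
}
```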

Writing Source of Imports is erroneous

Heya, parsing and then writing YARA rules with multiple imports is broken. No newline is emitted at the end of an import, leading to output such as this:

import "pe"import "elf"

Pull request is incoming. :)

Can you manipulate a rule's condition when represented as an AST?

After parsing a YARA ruleset into an ast.RuleSet, is there a way to manipulate a rule's condition?

Take this sample ruleset:

rule rule_1 {
    strings:
        $header_v2 = {50 02}
        $header_v3 = {50 03}
        $header_v4 = {50 04}
        $header_v5 = {50 05}

    condition: (
        for any of them: ($ at 0)
    )
}

rule rule_2 {
    strings:
        $pattern_v4_4 = {8c 05 65 23 61 6c [0-32] 90}
        $pattern_v4_5 = {8c 06 65 52 65 63 [0-32] 90}

    condition:
        any of them
}

I would like to change rule_2's condition to "rule_1 and any of them", but I can't find an ast.Condition or any similar struct to do this. Here is the code I have so far for getting the rule's condition. Now that I have it, is there a way to modify it?

r2 := r.Rules[1]
for _, cond := range r2.Condition.Children() {
    fmt.Println("Condition", cond)
}
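
The kind of rewrite being asked for amounts to wrapping the existing condition in a new "and" node. The sketch below shows the idea with stand-in types only; Expr, Ident, Raw, And, Rule and requireRule are all hypothetical and are not gyp's ast package.

```go
package main

import "fmt"

// Expr is a stand-in for an AST expression node; it is NOT gyp's type.
type Expr interface{ String() string }

// Ident references another rule by name.
type Ident struct{ Name string }

// Raw holds an already-rendered sub-expression.
type Raw struct{ Text string }

// And joins two operands with the "and" operator.
type And struct{ Left, Right Expr }

func (i Ident) String() string { return i.Name }
func (r Raw) String() string   { return r.Text }
func (a And) String() string   { return a.Left.String() + " and " + a.Right.String() }

// Rule is a stand-in for a parsed rule with a replaceable condition.
type Rule struct {
	Name      string
	Condition Expr
}

// requireRule rewrites r's condition to "<dep> and <old condition>".
func requireRule(r *Rule, dep string) {
	r.Condition = And{Left: Ident{Name: dep}, Right: r.Condition}
}

func main() {
	r2 := &Rule{Name: "rule_2", Condition: Raw{Text: "any of them"}}
	requireRule(r2, "rule_1")
	fmt.Println(r2.Condition) // rule_1 and any of them
}
```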

Rule with nil condition.

When this rule is parsed with gyp, it returns a rule with a nil condition. No syntax error is returned, although one should be.

rule foo {
   strings:
       $a = "foo"
   condition:
       for all i in (0..(filesize - 10) : ($a at i))
}

TextString escaped characters not properly handled

Something is wrong with escape sequences. When transforming a yara rule via gyp (i.e. test.yara -> y2j -> test.json -> j2y -> test_out.yara) a TextString containing escape sequences is changed and can become broken.

Most basic example:

$ cat test.yara
rule test {
strings:
    $s = "\""
condition:
    $s
}
$ ./y2j -o test.json test.yara 
$ ./j2y -o test_out.yara test.json
$ cat test_out.yara 

rule test {
  strings:
    $s = "\"
  condition:
    $s
}%

In the output YARA rule the string is modified and broken (the closing quotation mark is missing).
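
A minimal sketch of the re-escaping step that seems to be missing when the string is written back. It assumes the parsed string is stored unescaped; quoteYARAString is a hypothetical helper, not gyp's serializer, and it handles only four escape sequences.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteYARAString renders raw string contents as a double-quoted YARA text
// string, re-escaping the characters that would otherwise end or corrupt
// the literal. Other bytes are written unchanged.
func quoteYARAString(raw string) string {
	var b strings.Builder
	b.WriteByte('"')
	for i := 0; i < len(raw); i++ {
		switch c := raw[i]; c {
		case '"':
			b.WriteString(`\"`)
		case '\\':
			b.WriteString(`\\`)
		case '\n':
			b.WriteString(`\n`)
		case '\t':
			b.WriteString(`\t`)
		default:
			b.WriteByte(c)
		}
	}
	b.WriteByte('"')
	return b.String()
}

func main() {
	fmt.Println(quoteYARAString(`"`)) // "\""
}
```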

YARA Dependency Chain & YARA Ruleset Diff

Over the past year I’ve been using GYP fairly heavily to help manage a large YARA ruleset. Two things I needed to do were:

  • Build a dependency chain for
    • a rule
    • a list of rules
  • Diff two rulesets

So I wrote code to accomplish both these tasks. I believe this code can be useful to others (I’ve been asked about it by multiple people) so I’d like to open source it. I think it would make a good addition to GYP, but I’d like to get feedback and see if the community thinks it should be added to GYP. Is this something that makes sense to move into GYP?

At a high level my implementation plan (very open to feedback) is to create a new dir at the root of GYP called utils. Within this dir I’d add yara_diff.go and yara_dependency_walker.go. I’d also write two simple CLI go programs to serve as examples of how to use them (maybe throw them in an examples dir).

Dependency Chain

The purpose of this code is to return a list of dependent rules for a given rule. It takes a ruleset and a list of rule identifiers to get dependencies for. For each identifier, it recursively collects the rules that identifier depends on.
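
The recursive walk described above can be sketched as follows. This is not the author's implementation: the function name dependencies and the map-based input are assumptions, and extracting the references from each rule's condition is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// dependencies returns every rule that the given roots depend on, directly or
// transitively. deps maps a rule identifier to the rules its condition
// references; how that map is built from a ruleset is not shown here.
func dependencies(deps map[string][]string, roots []string) []string {
	seen := make(map[string]bool)
	var visit func(name string)
	visit = func(name string) {
		for _, d := range deps[name] {
			if !seen[d] {
				seen[d] = true // mark before recursing so cycles terminate
				visit(d)
			}
		}
	}
	for _, r := range roots {
		visit(r)
	}
	out := make([]string, 0, len(seen))
	for name := range seen {
		out = append(out, name)
	}
	sort.Strings(out) // sorted for stable output
	return out
}

func main() {
	deps := map[string][]string{
		"c": {"b"},
		"b": {"a"},
		"a": nil,
	}
	fmt.Println(dependencies(deps, []string{"c"})) // [a b]
}
```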

Yara Ruleset Diff

This code will take in two YARA rulesets and return a data structure that shows if a rule was modified, deleted, added, or is the same.
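
One possible shape for such a diff is sketched below. It is not the author's code: the Status type, the function diffRulesets, and the choice to compare serialized rule text are all assumptions of this example.

```go
package main

import "fmt"

// Status describes what happened to a rule between two rulesets.
type Status string

const (
	Added     Status = "added"
	Deleted   Status = "deleted"
	Modified  Status = "modified"
	Unchanged Status = "unchanged"
)

// diffRulesets compares two rulesets given as identifier -> serialized rule
// text. Comparing serialized text is an assumption of this sketch; comparing
// ASTs instead would ignore formatting-only changes.
func diffRulesets(before, after map[string]string) map[string]Status {
	out := make(map[string]Status)
	for name, oldSrc := range before {
		newSrc, ok := after[name]
		switch {
		case !ok:
			out[name] = Deleted
		case newSrc != oldSrc:
			out[name] = Modified
		default:
			out[name] = Unchanged
		}
	}
	for name := range after {
		if _, ok := before[name]; !ok {
			out[name] = Added
		}
	}
	return out
}

func main() {
	before := map[string]string{
		"a": "rule a { condition: true }",
		"b": "rule b { condition: true }",
	}
	after := map[string]string{
		"a": "rule a { condition: false }",
		"c": "rule c { condition: true }",
	}
	fmt.Println(diffRulesets(before, after)) // map[a:modified b:deleted c:added]
}
```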

Primary expression groupings do not serialize and then deserialize to the same rule

I had this same issue with yara-parser, in which serialization to JSON and then deserialization back to YARA produces a different ruleset.

Consider this rule:

rule test {
  condition:
    true or (false and true)
}

It serializes to this:

{
  "rules": [
    {
      "modifiers": {

      },
      "identifier": "test",
      "tags": [
      ],
      "meta": [
      ],
      "strings": [
      ],
      "condition": {
        "orExpression": {
          "terms": [
            {
              "boolValue": true
            },
            {
              "andExpression": {
                "terms": [
                  {
                    "boolValue": false
                  },
                  {
                    "boolValue": true
                  }
                ]
              }
            }
          ]
        }
      }
    }
  ]
}

It reserializes back to this:

rule test {
  condition:
    true or false and true
}

The groupings on the primary expressions seem to get lost. The easiest way to handle all of it seemed to be to throw parentheses around everything, but that makes the formatting kind of ugly.

YaraSerializer modifies condition section

While working with your library I found that when you parse a ruleset from a string and later serialize one of the parsed rules back to a string, numbers in the condition section are converted to decimal form, e.g.:

rule := ruleset.Rules[0]
newRuleset := &pb.RuleSet{
	Rules: []*pb.Rule{rule.AsProto()},
}

buf := &bytes.Buffer{}
serializer := gyp.NewSerializer(buf)
serializer.Serialize(newRuleset)

As a result, for a condition like uint16 ( 0 ) == 0x5a4d and filesize < 40KB, I got uint16(0) == 23117 and filesize < 40960.

Is it possible to add some kind of flag or something to save condition section as string rather than as Expression? Or maybe there's any other way to get parsed rule as a string?
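
One way such a feature could work is for the AST to keep the original spelling of each number next to its parsed value. The sketch below is purely illustrative: IntLiteral and parseIntLiteral are hypothetical names, not gyp types, and size suffixes such as KB are not handled.

```go
package main

import (
	"fmt"
	"strconv"
)

// IntLiteral is a hypothetical AST node that keeps the source spelling next
// to the parsed value so a serializer can reproduce it.
type IntLiteral struct {
	Value int64
	Raw   string // original text, e.g. "0x5a4d"; empty if synthesized
}

// parseIntLiteral parses the text with base auto-detection (decimal and
// 0x-prefixed hex, among the prefixes Go accepts) and remembers the text.
func parseIntLiteral(text string) (IntLiteral, error) {
	v, err := strconv.ParseInt(text, 0, 64)
	if err != nil {
		return IntLiteral{}, err
	}
	return IntLiteral{Value: v, Raw: text}, nil
}

// String prefers the original spelling and falls back to decimal.
func (l IntLiteral) String() string {
	if l.Raw != "" {
		return l.Raw
	}
	return strconv.FormatInt(l.Value, 10)
}

func main() {
	lit, err := parseIntLiteral("0x5a4d")
	if err != nil {
		panic(err)
	}
	fmt.Println(lit.Value, lit) // 23117 0x5a4d
}
```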

Thanks!
