Comments (14)

hillu commented on June 11, 2024

@N0body007 what @ozanh said is correct. Go's internal filename representation is UTF-8, which works just fine for your Chinese characters. However, the filename is then passed as a C string through yr_scanner_scan_file, yr_filemap_map, and yr_filemap_map_ex, where it is finally passed as-is to the CreateFileA API. The A stands for ANSI, meaning the name is interpreted in the system's active code page rather than as UTF-8, so the UTF-8 multibyte representations of your non-ASCII characters will likely be misinterpreted … the file is simply not found.

I suggest the following change:

s.SetCallback(&m)
f, err := os.Open(targetFile)
if err != nil {
    fmt.Println("Open: err: ", err)
    return
}
defer f.Close()
if err := s.ScanFileDescriptor(f.Fd()); err != nil {
    fmt.Println("ScanFileDescriptor: err: ", err)
    return
} else {
    fmt.Printf("Matches: %+v", m)
}
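
To make the encoding mismatch concrete, here is a minimal sketch that is independent of go-yara. It prints the UTF-8 bytes that a C API would receive for the non-ASCII part of the filename. The helper name utf8Hex is hypothetical and exists only for this illustration.

```go
package main

import "fmt"

// utf8Hex returns the UTF-8 bytes of s as space-separated hex pairs.
// (Hypothetical helper for illustration; not part of go-yara or YARA.)
func utf8Hex(s string) string {
	return fmt.Sprintf("% x", []byte(s))
}

func main() {
	// Each of the two Chinese characters occupies three bytes in UTF-8.
	// An API that reads the name in a legacy code page sees six unrelated
	// byte values instead of two characters.
	fmt.Println(utf8Hex("测试")) // prints "e6 b5 8b e8 af 95"
}
```

Opening the file with os.Open avoids the problem because, on Windows, Go converts the name to UTF-16 itself before calling into the operating system, so libyara only ever sees the already-open handle.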

from go-yara.

hillu commented on June 11, 2024

Sure. Could you please provide a small Go program that fails to scan your file using a dummy ruleset? I am interested to see what exact function calls you are using.

N0body007 commented on June 11, 2024

Oh, sorry, I forgot to provide the demonstration program. When I change the target file name to a Chinese filename, it goes wrong. The content of the target file (123.txt) is "abc".

package main

import (
	"fmt"

	yara "github.com/hillu/go-yara"
)

func main() {
	rule := "rule test : tag1 { strings: $a = \"abc\"  condition: $a }"
	compiler, err := yara.NewCompiler()
	if compiler == nil || err != nil {
		return
	}
	if err = compiler.AddString(rule, ""); err != nil {
		return
	}
	rules, err := compiler.GetRules()
	if err != nil {
		return
	}
	s, err := yara.NewScanner(rules)
	if err != nil {
		return
	}
	targetFile := "123.txt"
	// targetFile := "123测试.txt"
	var m yara.MatchRules
	if err := s.SetCallback(&m).ScanFile(targetFile); err != nil {
		fmt.Println("ScanFile err :", err)
		return
	} else {
		fmt.Printf("Matches: %+v", m)
	}
}

hillu commented on June 11, 2024

Great. What error are you getting when scanning the file that contains Chinese characters?

N0body007 commented on June 11, 2024

The error is “could not open file”, which comes from the YARA source code. You can try the code above. When the filename is 123.txt, the result is normal, but when the filename contains Chinese characters, the scan returns this error.

ozanh commented on June 11, 2024

AFAIK, YARA 4.1 does not support Unicode file names. You can open the file with Go’s os.Open and pass the file descriptor to go-yara to scan. I did it this way before and it solved the problem.

N0body007 commented on June 11, 2024

> AFAIK, YARA 4.1 does not support Unicode file names. You can open the file with Go’s os.Open and pass the file descriptor to go-yara to scan. I did it this way before and it solved the problem.

Thank you. I think I now understand what the cause is.

N0body007 commented on June 11, 2024

> @N0body007 what @ozanh said is correct. Go's internal filename representation is UTF-8, which works just fine for your Chinese characters. However, the filename is then passed as a C string through yr_scanner_scan_file, yr_filemap_map, and yr_filemap_map_ex, where it is finally passed as-is to the CreateFileA API. The A stands for ANSI, meaning the name is interpreted in the system's active code page rather than as UTF-8, so the UTF-8 multibyte representations of your non-ASCII characters will likely be misinterpreted … the file is simply not found.
>
> I suggest the following change:
>
> s.SetCallback(&m)
> f, err := os.Open(targetFile)
> if err != nil {
>     fmt.Println("Open: err: ", err)
>     return
> }
> defer f.Close()
> if err := s.ScanFileDescriptor(f.Fd()); err != nil {
>     fmt.Println("ScanFileDescriptor: err: ", err)
>     return
> } else {
>     fmt.Printf("Matches: %+v", m)
> }

Thank you. It really works. But I want to solve this problem by changing the source code of YARA. Do you have any good suggestions?

hillu commented on June 11, 2024

@N0body007 There is an issue open in YARA: VirusTotal/yara#1487.

A workaround similar to what I described above has been implemented in yara.c (the command line program), but there hasn't been any feedback on whether it works. Note that currently, only builds done with Visual Studio would be affected by this fix.

Or do you want to fix this in the libyara API?

N0body007 commented on June 11, 2024

Yeah, I want to fix this in the libyara API.

hillu commented on June 11, 2024

> Yeah, I want to fix this in the libyara API.

OK, in that case we ought to continue the conversation in a new issue within the YARA project. Would you like to open an issue there?

N0body007 commented on June 11, 2024

Hmm, I'll try to solve it myself first, and if I can't, I'll open a new issue within the YARA project. Thank you.

hillu commented on June 11, 2024

> Hmm, I'll try to solve it myself first, and if I can't, I'll open a new issue within the YARA project. Thank you.

My advice: The issue, at its core, is simple to fix: In yr_filemap_map_ex, replace the CreateFileA with CreateFileW. Assume that the string that got passed in is encoded as UTF-8; convert that to UTF-16 so it can be used by CreateFileW. I am pretty sure that a pull request that implements this change would be accepted.

But since this should really be discussed as a YARA issue (or pull request), I am closing this issue.
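
The conversion step in that advice can be sketched in Go, purely as an illustration of the encoding change. An actual libyara patch would be written in C, most likely using the Win32 MultiByteToWideChar function with CP_UTF8, and that is an assumption on my part rather than something confirmed in this thread. The helper name toUTF16Z is hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// toUTF16Z converts a UTF-8 Go string into a NUL-terminated slice of
// UTF-16 code units, the form that wide-character Windows APIs such as
// CreateFileW expect. (Hypothetical helper; not part of YARA or go-yara.)
func toUTF16Z(s string) []uint16 {
	return append(utf16.Encode([]rune(s)), 0)
}

func main() {
	wide := toUTF16Z("123测试.txt")
	// Every character in this name lies in the Basic Multilingual Plane,
	// so each one becomes exactly one 16-bit unit, plus the terminator.
	fmt.Printf("%d code units: %#04x\n", len(wide), wide)
}
```

The point of the sketch is that the wide-character form is unambiguous: the result does not depend on the system's active code page, which is why switching to CreateFileW fixes the lookup.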

N0body007 commented on June 11, 2024

I get it. Thank you.
