advutils's People

Contributors

amironoff, dmit25, zhongkaifu

advutils's Issues

Inconsistent numerical format problem when using this on different cultures

Hello, I am using this library as part of RNNSharp and CRFSharp, and other very good libraries built by you, and I ran into several problems.

My culture is "es-AR". As in every Spanish culture, the separators are the opposite of those in "en-US" and the other English cultures: the comma is the decimal separator and the period is the group separator. So any time the library writes or reads a file containing a double or single-precision floating-point number, the data gets silently corrupted: numbers are written with a comma instead of a decimal point, and on reading, decimal points are taken as group separators and ignored.

This breaks many tests and makes the libraries and training sets non-portable. Worse, it happens under the hood, so if you are unaware of it you may run into trouble, as I personally did!
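A minimal sketch of the failure described above, written as C# top-level statements. The `commaDecimal` format is a hand-built, hypothetical stand-in for a culture such as "es-AR" (comma as decimal separator, period as group separator), so the example does not depend on the culture data installed on a given machine. It is not code from AdvUtils.

```csharp
using System;
using System.Globalization;

// Hypothetical stand-in for a culture such as "es-AR": comma as the decimal
// separator, period as the group separator. Clone() returns a writable copy
// of the read-only invariant format.
var commaDecimal = (NumberFormatInfo)NumberFormatInfo.InvariantInfo.Clone();
commaDecimal.NumberGroupSeparator = ".";
commaDecimal.NumberDecimalSeparator = ",";

// Writing with this format emits a comma instead of a decimal point.
string written = 1.5.ToString(commaDecimal);

// Reading an invariant-style string with this format treats '.' as a group
// separator, so the decimal point is silently dropped and no exception is thrown.
double misread = double.Parse("1.5", commaDecimal);

// Passing InvariantCulture explicitly gives the same result on every machine.
double correct = double.Parse("1.5", CultureInfo.InvariantCulture);

Console.WriteLine(written);                                       // 1,5
Console.WriteLine(misread.ToString(CultureInfo.InvariantCulture)); // 15
Console.WriteLine(correct.ToString(CultureInfo.InvariantCulture)); // 1.5
```

The parse succeeds and returns a value ten times too large, which is why the problem goes unnoticed until results are compared across machines.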

My recommendation is to mention this for the existing releases, and to make both the number formatting and the parsers culture-independent, using the invariant culture provided by the .NET library.

I changed this manually in the source files, and the result is somewhat "better", because many training sets don't need very precise numbers. Often the values are treated internally as single (7 digits) but written out as double, unnecessarily bloating the training/model files.

My suggestion is to use the .NET "G5" numeric format (5 significant digits) for training/dictionary data, and "G10" for numeric values that occur seldom. Internally I use 3-4 digits, and this did not alter any results after the model was trained and dumped to a file. I also suggest using the conversion:

double.Parse(string, System.Globalization.CultureInfo.InvariantCulture)

as the default parsing, to avoid such problems!

Also, this formatting does not cause any performance degradation.
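The two suggestions above ("G5" output and invariant-culture parsing) could be combined roughly as follows. This is a sketch in C# top-level statements; `FormatWeight` and `ParseWeight` are hypothetical helper names for illustration, not functions that exist in AdvUtils.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: write a value with 5 significant digits, always with
// '.' as the decimal separator regardless of the current culture.
static string FormatWeight(double value) =>
    value.ToString("G5", CultureInfo.InvariantCulture);

// Hypothetical helper: read a value back with the invariant culture.
// NumberStyles.Float does not allow group separators, so a string such as
// "1,5" raises a FormatException instead of being silently misread.
static double ParseWeight(string text) =>
    double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);

string line = FormatWeight(0.123456789);
double back = ParseWeight(line);

Console.WriteLine(line); // 0.12346
Console.WriteLine(back.ToString("R", CultureInfo.InvariantCulture));
```

Using `NumberStyles.Float` rather than the default style is a design choice of this sketch: it makes a file written under a comma-decimal culture fail loudly on load.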
