Comments (11)
@Anipik Ah, sorry, I got delayed. I'm currently at www.arctic15.com and rather tied up. See the linked PR for further discussion and the last comment. I should be able to get onto it next weekend, knock on wood. Naturally, if you need to do something in the meantime, feel free to do so. :)
from machinelearning.
@veikkoeeva can you please paste the error output?
@Anipik Hmm, indeed... It failed on the assert, but there might be something written to stdout before that (a refactoring idea, might be good to assert the terms directly). I'm not on that code currently, but I'll check in about 14 hours.
@veikkoeeva generally it mentions the line and the file name where the matching of the outputs failed.
@Anipik I took a look and tried to fix a bit, but I could use a hand. Do you happen to know where the results to LogisticRegression-bin-norm-CV-breast-cancer-rp.txt are written? This eludes me somehow. The result file looks like this
LogisticRegression
AUC Accuracy Positive precision Positive recall Negative precision Negative recall Log-loss Log-loss reduction F1 Score AUPRC /l2 /ot /nt Learner Name Train Dataset Test Dataset Results File Run Time Physical Memory Virtual Memory Command Line Settings
0,9945 0,969373 0,959559 0,952772 0,975316 0,977394 0,134393 85,57494 0,956039 0,988987 0,1 0,001 1 LogisticRegression %Data% %Output% 99 0 0 maml.exe CV tr=LogisticRegression{l1=1.0 l2=0.1 ot=1e-3 nt=1} threads=- dout=%Output% data=%Data% seed=1 xf=BinNormalizer{col=Features numBins=5} /l2:0,1;/ot:0,001;/nt:1
and I suppose the problem is visible. In the PR I have gone through the places the code uses one by one, and the results in some other files, such as LogisticRegression-bin-norm-CV-breast-cancer-out.txt, look consistent with the assertion data currently (though the runner still reports it as a failure, I'll check that later).
Ah, that's interesting @veikkoeeva, thanks for bringing this up.
The rp files are written by the so-called ResultProcessor, the code for which lies in the src/Microsoft.ML.ResultProcessor project... so this LogisticRegression-bin-norm-CV-breast-cancer-rp.txt file, I would expect, is the result of running ResultProcessor on top of the LogisticRegression-bin-norm-CV-breast-cancer-out.txt I'd expect to see alongside it.
Yeah, the problem is that the decimal separator in your language pack is a comma instead of a period, and we just match the rp files using string matching.
The matching is being done here: https://github.com/Anipik/machinelearning/blob/master/test/Microsoft.ML.TestFramework/BaseTestPredictorsMaml.cs#L211
The fix could be here: https://github.com/Anipik/machinelearning/blob/master/test/Microsoft.ML.TestFramework/BaseTestBaseline.cs#L618
Instead of directly using . as the decimal separator, you can obtain the decimal separator dynamically.
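To make the comparison-side idea concrete, here is a minimal sketch in Python rather than the C# of the test framework. The function names `normalize_decimals` and `lines_match` are hypothetical and do not exist in the repo; the sample values are taken from the rp output pasted earlier in the thread. It assumes the separator only needs rewriting when it sits between two digits.

```python
import re


def normalize_decimals(line: str, decimal_sep: str = ",") -> str:
    """Rewrite a locale-specific decimal separator between digits as '.'.

    Hypothetical helper: only a separator with a digit on both sides is
    touched, so punctuation elsewhere in the line is left alone.
    """
    if decimal_sep == ".":
        return line
    pattern = r"(?<=\d)" + re.escape(decimal_sep) + r"(?=\d)"
    return re.sub(pattern, ".", line)


def lines_match(expected: str, actual: str, decimal_sep: str = ",") -> bool:
    """Compare a baseline line with produced output, ignoring the separator."""
    return expected == normalize_decimals(actual, decimal_sep)


# Baseline written with '.', output produced under a comma-decimal locale.
baseline = "0.9945 0.969373 85.57494 /l2:0.1;/ot:0.001;/nt:1"
produced = "0,9945 0,969373 85,57494 /l2:0,1;/ot:0,001;/nt:1"

print(produced == baseline)             # plain string matching fails
print(lines_match(baseline, produced))  # separator-aware matching passes
```

One caveat with this approach: a comma-separated list of integers such as `1,2,3` would also be rewritten, so it only works where commas between digits are known to be decimal separators.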
@danmosemsft can you add a non-English queue for this repo too?
@TomFinley Thanks, I'll see if I can get the rest fixed today (it's 19:00 here).
@Anipik Hmm, good to know. As you can see, I've tried to fix all instances that print numbers in a non-invariant way. It occurred to me that I could fix the comparison instead, but then one would have files that aren't easy to diff, say, when asking for help here or comparing them oneself. If this approach is OK, I think I should add a note to the commit about this.
My locale is fi-FI, by the way.
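For readers unfamiliar with the distinction, a small sketch of what "printing in an invariant way" means, written in Python purely for illustration (the PR itself changes C# code, and the helper names here are hypothetical). It assumes the `fi_FI.UTF-8` locale may or may not be installed on the machine, so switching to it is attempted and allowed to fail.

```python
import locale


def try_set_locale(name: str) -> bool:
    """Try to switch LC_NUMERIC to the named locale; report whether it worked."""
    try:
        locale.setlocale(locale.LC_NUMERIC, name)
        return True
    except locale.Error:
        return False


def format_localized(value: float, digits: int = 6) -> str:
    """Locale-aware: uses the decimal point of the active LC_NUMERIC locale."""
    return locale.format_string(f"%.{digits}f", value)


def format_invariant(value: float, digits: int = 6) -> str:
    """Locale-independent: fixed-point formatting always uses '.'."""
    return f"{value:.{digits}f}"


switched = try_set_locale("fi_FI.UTF-8")  # may be unavailable on this system
print("fi-FI active:", switched)
print("localized:", format_localized(0.969373))
print("invariant:", format_invariant(0.969373))
```

If the locale switch succeeds, the localized line is printed with a comma while the invariant line keeps the period, which is why files written the invariant way stay diffable across machines.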
"@danmosemsft can you add a non-English queue for this repo too?"
@Anipik feel free to open an issue and make the addition, if it's analogous to corefx's.
@veikkoeeva is this issue resolved?