unicode-org / conformance
Unicode & CLDR Data Driven Testing
Home Page: https://unicode-org.github.io/conformance/
License: Other
Logging progress such as "10 of 1000" is not necessary in non-interactive runs; suppressing it would save time and output.
Add this to the testing and reporting to support more flexible testing. For example:
ICU4C release 74.2 with icu69 test data on all test types
ICU4C 74.2 with icu74 test data on all test types
NodeJS 18 with icu72 test data
NodeJS 20.1.0 with icu73 test data
NodeJS 21.6.0 with icu72 test data
NodeJS 21.6.0 with icu73 test data
NodeJS 21.6.0 with icu74 test data
Started in PR#94: #94
The Rust executor is getting an error when trying to execute sendOneLine, and it does so for every batch of 10,000 tests that it sends.
Ex:
Testing ../executors/rust/target/release/executor / coll_shift_short. 190,000 of 192,707
Testing ../executors/rust/target/release/executor / coll_shift_short. 191,000 of 192,707
Testing ../executors/rust/target/release/executor / coll_shift_short. 192,000 of 192,707
!!! sendOneLine fails: input => {"label": "0190000", "string1": "\u2eb6!", "string2": "\u2eb6?", "test_type": "coll_shift_short"}
{"label": "0190001", "string1": "\u2eb6?", "string2": "\u2eb7!", "test_type": "coll_shift_short"}
...
#EXIT<. Err = [Errno 2] No such file or directory: '../executors/rust/target/release/executor'
!!!!!! processBatchOfTests: "platform error": "None"
Issues:
Bonus points: in the future, we can use a logging library so that we can more easily control the behavior differently on our local machines vs. on CI.
For the executables that we run (test data generator, test executor), we should validate the inputs to the executable against the schema within the executable, right before we use them.
So if step A generates output a, which goes into step B, which generates b, and so on, then we want step B to validate the values in a right before it processes them. That protects us against data inconsistency from stale data.
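A minimal sketch of what that could look like in a Python executor, assuming the schemas are available as JSON files; the schema path and record shape here are illustrative, not the project's actual layout:

```python
import json

from jsonschema import validate, ValidationError  # third-party: jsonschema

# Hypothetical: step B loads the agreed schema once at startup.
with open('schemas/test_case.schema.json') as f:
    TEST_CASE_SCHEMA = json.load(f)

def process_record(record: dict) -> None:
    try:
        # Validate the input produced by step A right before using it.
        validate(instance=record, schema=TEST_CASE_SCHEMA)
    except ValidationError as err:
        # Stale or inconsistent data from step A is caught here, before use.
        raise SystemExit(f'Input failed schema validation: {err.message}')
    # ... actual processing of the (now validated) record ...
```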
Use compare_type in collation to reduce test failures. Consider other options, too, e.g., strength.
ICU4C uses "_" to separate components of the locale string. However, test data, Dart, ICU4X, and Node all use "-".
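A minimal sketch of normalizing the separator before comparing locale strings across executors; this handles only the separator, not full BCP 47 canonicalization, and the helper name is hypothetical:

```python
def normalize_locale(tag: str) -> str:
    """Normalize ICU4C-style 'en_US' to the 'en-US' form the other executors use."""
    return tag.replace('_', '-')

assert normalize_locale('zh_Hant_TW') == 'zh-Hant-TW'
```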
Some of these issues are a part of the test framework (ex: schema definition), some might be related to the ICU4J executor, some might be for the ICU4J NumberFormatter APIs.
Values of groupingStrategy are not homogeneous. This caused a problem for the ICU4J executor when handling the parsed value, requiring a workaround to stringify the parsed value before converting it to an enum.
Values of groupingStrategy don't match the enum names for NumberFormatter.GroupingStrategy.
halfCeil or halfFloor? It is not a part of java.math.RoundingMode.
notation should be an enum instead of an open-ended string.
currencyDisplay in ICU4J NumberFormatter?
roundingMode = exact
Also, the existing collation tests implicitly default to the root locale, which is und. Updating these tests to have a specified locale means that we set the locale to und.
All output warnings and errors should become available to later stages of the processing. At this time, they are merely output as logging to a terminal.
The code has been using "Rust" instead of "ICU4X". We should rename accordingly.
Since the thing under test is an i18n library, we should rename our code according to the library name under test. The version number of the language runtime needed for the library version is a separate thing, and may not correspond 1:1 anyway (ex: ICU4X 1.0 and ICU4X 1.1 were developed against Rust 1.61, ICU4X 1.2 was developed against Rust 1.68.2).
Configure logging to have a single global settings file/config.
Also, make the logging level in CI be high enough to not show test execution progress.
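A minimal sketch of such a single global setup, assuming the level is passed via a hypothetical LOG_LEVEL environment variable that CI sets to WARNING:

```python
import logging
import os

# Hypothetical single global logging config: CI sets LOG_LEVEL=WARNING so
# per-test progress lines (logged at INFO/DEBUG) are suppressed there,
# while local runs default to INFO.
logging.basicConfig(
    level=os.environ.get('LOG_LEVEL', 'INFO'),
    format='%(levelname)s %(name)s: %(message)s',
)

logging.getLogger(__name__).info('Testing 10 of 1000 ...')  # hidden in CI
```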
Handling the characters in escaping and converting to UnicodeString may be part of the problem here.
Some types of collation are missing verification data, giving runtime errors with no explanation, e.g., labels "00000", "00002", ...
Fix these so the data and verifications are correct.
In the summary page and also in the detail page, the platform version is shown but not the ICU4X version, e.g.,
"platform: {'cldrVersion': '43.1.0', 'icuVersion': 'icu4x/2023-05-02/73.x', 'platform': 'rust', 'platformVersion': '1.73.0'}"
This should show the ICU4X version, e.g., 1.3 or 1.4, not "1.73".
The current test generator doesn't create tests for collation data when either of the test strings contains an incomplete surrogate. These are recorded in the logging files, but they are not stored in any data or mentioned in any dashboards.
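A hedged sketch of how such strings might be detected so the skipped cases can at least be counted and surfaced; the helper name is hypothetical:

```python
import re

# Python str can hold lone surrogates (e.g., from json.loads of escaped
# test data), so a regex over code points can find incomplete pairs.
_HIGH = '[\ud800-\udbff]'
_LOW = '[\udc00-\udfff]'
_LONE_SURROGATE = re.compile(
    f'{_HIGH}(?!{_LOW})'    # high surrogate not followed by a low surrogate
    f'|(?<!{_HIGH}){_LOW}'  # low surrogate not preceded by a high surrogate
)

def has_incomplete_surrogate(s: str) -> bool:
    return _LONE_SURROGATE.search(s) is not None

assert not has_incomplete_surrogate('\U0001f600')  # valid pair is fine
assert has_incomplete_surrogate('\ud800!')         # lone high surrogate
```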
Is the schema of generated test cases affected by the version of ECMA-402 being used? If so, then include that version, too.
A follow-on task (or subtask) of #43.
From a fresh checkout of main, when running sh generateDataAndRun.sh, I get the following:
#EXIT<. Err = [Errno 2] No such file or directory: '../executors/rust/target/release/executor'
!!!!!! processBatchOfTests: "platform error": "None"
Traceback (most recent call last):
File "/usr/local/google/home/elango/oss/conformance/testdriver/testdriver.py", line 111, in <module>
main(sys.argv)
File "/usr/local/google/home/elango/oss/conformance/testdriver/testdriver.py", line 101, in main
driver.runPlans()
File "/usr/local/google/home/elango/oss/conformance/testdriver/testdriver.py", line 91, in runPlans
plan.runPlan()
File "/usr/local/google/home/elango/oss/conformance/testdriver/testplan.py", line 86, in runPlan
self.runOneTestMode()
File "/usr/local/google/home/elango/oss/conformance/testdriver/testplan.py", line 219, in runOneTestMode
numErrors = self.runAllSingleTests(per_execution)
File "/usr/local/google/home/elango/oss/conformance/testdriver/testplan.py", line 279, in runAllSingleTests
allTestResults.extend(self.processBatchOfTests(testLines))
TypeError: 'NoneType' object is not iterable
1
Verifier starting on 9 verify cases
Verifying test coll_shift_short on rust executor
Cannot load ../TEMP_DATA/testResults/rust/coll_test_shift.json result data: Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
File "/usr/local/google/home/elango/oss/conformance/verifier/verifier.py", line 500, in <module>
main(sys.argv)
File "/usr/local/google/home/elango/oss/conformance/verifier/verifier.py", line 491, in main
verifier.verifyDataResults()
File "/usr/local/google/home/elango/oss/conformance/verifier/verifier.py", line 189, in verifyDataResults
self.compareTestToExpected()
File "/usr/local/google/home/elango/oss/conformance/verifier/verifier.py", line 267, in compareTestToExpected
self.report.platform_info = self.resultData['platform']
AttributeError: 'Verifier' object has no attribute 'resultData'. Did you mean: 'result_path'?
1
CLDR is adding test data for likely subtags:
We have support for this in both Intl and ICU4X. It would be a good test to add.
When there are many test failures or errors, there are too many instances to report each one individually. Many of the test cases might look the same, but there is no subgrouping.
It might be helpful to implement some simple unsupervised clustering of the input values (say, taking the top 10 most frequent values per input struct key) and report the top 10 counts.
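A minimal sketch of that per-key frequency report, assuming each failure is a dict of input-struct keys; the function name is hypothetical:

```python
from collections import Counter, defaultdict

def summarize_failures(failures: list[dict], top_n: int = 10) -> dict:
    """Count the most frequent values per input-struct key across failures."""
    counts: defaultdict[str, Counter] = defaultdict(Counter)
    for case in failures:
        for key, value in case.items():
            counts[key][str(value)] += 1
    # Report the top N value counts for each key.
    return {key: counter.most_common(top_n) for key, counter in counts.items()}

# Example: summarize_failures([{'locale': 'und', 'strength': '3'}, ...])
```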
The testdriver code assumes that the --icu_version parameter for the test driver is defined and that it refers to existing data. However, the value may be missing or may not be one of the defined test sets.
Proposed solution: check all defined testdata directories. If icu_version is not defined or a bad value is given, use the highest-numbered ICU version; e.g., a value of "xyz" will look at subdirectory names and pick the one that sorts highest.
For example, if the directories are [icu73, icu72, and icu71], a missing or incorrect value for icu_version will select icu73 data for testing.
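A minimal sketch of the proposed fallback, assuming test data lives in icuNN subdirectories under one root; the names here are illustrative:

```python
import re
from pathlib import Path

def resolve_icu_version(testdata_root: str, icu_version: str | None) -> str:
    """Return icu_version if it names an existing testdata directory,
    otherwise fall back to the highest-numbered icuNN directory."""
    dirs = [p.name for p in Path(testdata_root).iterdir()
            if p.is_dir() and re.fullmatch(r'icu\d+', p.name)]
    if icu_version in dirs:
        return icu_version
    # Sort numerically so e.g. icu101 beats icu73 despite lexical order.
    return max(dirs, key=lambda name: int(name[3:]))

# resolve_icu_version('testdata', 'xyz') -> 'icu73' given icu71, icu72, icu73
```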
The path for the dart_web executor isn't correct, and some parameters need updating.
See PR#84 for a fix.
testdata_gen.py hardcodes the source of data using a GitHub URL for a file from a specific version of ICU: https://github.com/unicode-org/conformance/blob/main/testgen/testdata_gen.py#L334
Instead, we should:
We should include SpecialCasing.txt when we get around to writing a casemap adapter: https://unicode.org/Public/UNIDATA/SpecialCasing.txt
Simply to remove unneeded detail. Fix to include any failures.
The ignorePunctuation option doesn't have an effect on the test results for coll_shift_short data. This may be a problem in NodeJS.
We can either use:
nvm (Node Version Manager)
Created from comment at #67 (comment)
+1 from me on this. Doing so should be win-win for everyone. It will probably feel like using jQuery.
It seems like the best way to do this in Python is using the Beautiful Soup library (docs). I've used JSoup in Java before, and that was really nice (powerful and easy). Beautiful Soup and JSoup seem to be comparable.
Using a regular HTML file as the input for HTML templating, rather than some special syntax that requires some special engine to interpret, is a simpler way to go. (Examples of special syntax HTML templating that are all-too-common still: ex1, ex2). The simplicity is that you keep code in Python along with the caller to the library, and you keep markup in HTML, and you don't mix the two. Not having to deal with yet another syntax is a follow on benefit.
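A minimal sketch of that approach with Beautiful Soup, where the template stays plain HTML and Python only fills in values; the file name and element ids are hypothetical:

```python
from bs4 import BeautifulSoup  # third-party: beautifulsoup4

# Load a regular HTML file as the template; no special template syntax.
with open('report_template.html') as f:
    soup = BeautifulSoup(f, 'html.parser')

# Fill in report values by element id, keeping markup in HTML
# and logic in Python.
soup.find(id='test-name').string = 'coll_shift_short'
soup.find(id='failure-count').string = '42'

with open('report.html', 'w') as f:
    f.write(str(soup))
```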
For the test driver and test data generator in Python, we should use logging instead of just printing to the console.
At the least, it's equivalent. But the potential benefits are:
Log levels (logging.debug(), logging.error()) allow us to indicate what severity a statement is.
Revisit #145 (comment), where an executor encounters an error in processing a test case. Instead of returning the test case input line as-is in the error response, the error handling code is transforming the input line before including it in the error response. This transformation seems unintended, unless there is a good reason.
For test reports, add pagination to speed review of test failures / errors / unimplemented options. This could use JSON data loaded directly rather than creating tables in the Python code.
Current code uses settings of the verifier object in function compareTestToExpected. It should use the data in the vplan object.
This will more accurately represent the dependent relationship between the codebases/data.
Now that we have schemas for test input and output, we should enable runtime validation of those test inputs & outputs across the board.
Doing so will enable the realization of a large chunk of the value proposition for having the schemas. It would ensure that all test cases passed to executors, and all data received from executors, adhere to the contracts defined by the schemas.
Some options for defining a schema:
JSON Schema is a natural first choice. Also, it would take more effort to deal with Protobuf (perhaps too prohibitive in statically typed languages, even if possible in dynamic ones). We only need a single tool for JSON Schema, since the purpose is to validate the JSON test case data generated by the test generation tool.
The DDT_DATA directory is obsolete at this point; it seems to be just a copy of a portion of the TEMP_DATA directory that gets created locally to store intermediate files.
We should remove the DDT_DATA directory. At this point, all scripts referencing that directory are obsolete, too.
Do not remove any Python code references to ddt_data. That identifier is the alias used for datasets.py when importing that Python file/module.
The test driver with dart_native gives the following in a Linux environment. This needs to be fixed to run dart_native tests.
----> STDOUT= ><
!!!!!! !!!! ERROR IN EXECUTION: 255. STDERR = Unhandled exception:
UnimplementedError: Insert diplomat bindings here
#0 Collation4X.compareImpl (package:intl4x/src/collation/collation_4x.dart:16)
#1 Collation.compare (package:intl4x/src/collation/collation.dart:28)
#2 testCollator (file:///usr/local/google/home/ccornelius/ICU_conformance/conformance/executors/dart_native/bin/executor.dart:74)
#3 main. (file:///usr/local/google/home/ccornelius/ICU_conformance/conformance/executors/dart_native/bin/executor.dart:49)
#4 _RootZone.runUnaryGuarded (dart:async/zone.dart:1594)
#5 _BufferingStreamSubscription._sendData (dart:async/stream_impl.dart:339)
#6 _BufferingStreamSubscription._add (dart:async/stream_impl.dart:271)
#7 _SyncStreamControllerDispatch._sendData (dart:async/stream_controller.dart:776)
#8 _StreamController._add (dart:async/stream_controller.dart:650)
#9 _StreamController.add (dart:async/stream_controller.dart:598)
#10 _Socket._onData (dart:io-patch/socket_patch.dart:2381)
#11 _RootZone.runUnaryGuarded (dart:async/zone.dart:1594)
#12 _BufferingStreamSubscription._sendData (dart:async/stream_impl.dart:339)
#13 _BufferingStreamSubscription._add (dart:async/stream_impl.dart:271)
#14 _SyncStreamControllerDispatch._sendData (dart:async/stream_controller.dart:776)
#15 _StreamController._add (dart:async/stream_controller.dart:650)
#16 _StreamController.add (dart:async/stream_controller.dart:598)
#17 new _RawSocket. (dart:io-patch/socket_patch.dart:1899)
#18 _NativeSocket.issueReadEvent.issue (dart:io-patch/socket_patch.dart:1356)
#19 _microtaskLoop (dart:async/schedule_microtask.dart:40)
#20 _startMicrotaskLoop (dart:async/schedule_microtask.dart:49)
#21 _runPendingImmediateCallback (dart:isolate-patch/isolate_patch.dart:123)
#22 _RawReceivePort._handleMessage (dart:isolate-patch/isolate_patch.dart:190)
WARNING:root:!!!!!! process_batch_of_tests: "platform error": "!!!! ERROR IN EXECUTION: 255. STDERR = Unhandled exception:
UnimplementedError: Insert diplomat bindings here
...
"
In many of the test failures for number format, the reason is that "furlong" is not a recognized unit. I think that the test data is incorrect, however. Perhaps the unit is not correctly set for many of the test cases.
We can speed up our end-to-end CI in different ways:
The code in characterize_failures_by_options in verifier/testreport.py can be improved a lot by using collections.defaultdict.
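A hedged before/after sketch of the defaultdict pattern; the grouping key ('compare_type') and function names are illustrative, not the actual code in verifier/testreport.py:

```python
from collections import defaultdict

# Before: the manual key-existence check the current code roughly follows.
def group_failures(failures):
    groups = {}
    for case in failures:
        option = case.get('compare_type')
        if option not in groups:
            groups[option] = []
        groups[option].append(case)
    return groups

# After: defaultdict removes the existence check entirely.
def group_failures_dd(failures):
    groups = defaultdict(list)
    for case in failures:
        groups[case.get('compare_type')].append(case)
    return dict(groups)
```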
See the comments in this issue to fix #124.
ICU4X in conformance testing shows more than 20% of the tests failing, seen here:
ICU4X/icu73
The actual collator options are seen in the test failure detail, with a few examples here. The inputs are s1 and s2, and the actual options used are given.
We need some help debugging this!
It's not clear why the Rust executor fails to build in PR #59. In executors/rust/Cargo.toml, all the versions of dependencies are fixed to a specific version (except for rust_version_runtime, which moved to version 2.x years ago).
I started PR #60 to fix (or at least diagnose) the error. It gives similar error output.
@sffc Any thoughts?
Lots of debug lines are printed by schema checking. It's unnecessary!