patch / cldr-number-pm5

Localized number formatters using the Unicode CLDR

Home Page: https://metacpan.org/pod/CLDR::Number

License: Other

Perl 100.00%

unicode cldr i18n perl5

cldr-number-pm5's People

Contributors

Stargazers

Watchers

Forkers

redhotpenguin syspete d-e-f-e-a-t

cldr-number-pm5's Issues

load locale data for each locale from a different module

Suggested by @aarondcohen:

[13:43] Aaron Cohen: as an added benefit, you could break CLDR::Number::Data::* up by locale
[13:43] Aaron Cohen: so people would load less into memory if they aren't using the other locales

Although the number-related locale data is relatively small per locale, the aggregate grows with each CLDR release. Another idea is to remove from each locale any data that is identical to what it would already inherit.
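The pruning idea can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the module's Perl code; the function name `prune_inherited` and the sample data are hypothetical and typed by hand for the example.

```python
# Hypothetical sketch: drop entries from a locale's data that are identical
# to what the locale would inherit anyway. Names and data are illustrative.
_MISSING = object()

def prune_inherited(child, inherited):
    """Return only the entries of `child` that differ from `inherited`."""
    return {
        key: value
        for key, value in child.items()
        if inherited.get(key, _MISSING) != value
    }

# Illustrative data only (not generated from CLDR).
root = {"decimal": ".", "group": ",", "percent": "%"}
de = {"decimal": ",", "group": ".", "percent": "%"}

pruned_de = prune_inherited(de, root)
print(pruned_de)
```

At lookup time, a value missing from the pruned data would be read from the parent instead, so the resolved result stays the same while less data is stored per locale.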

create package for common constants

So we can define $N, $P, $C, $M, and $Q all in one place.

add FAQ about fallback for nonexistent locales

Users occasionally report that the wrong formatting is used for several nonexistent locales, including Mexican English (en-MX) and Brazilian Spanish (es-BR). We should document that since these locales don’t exist, they fall back to English (en) and Spanish (es), respectively.

I also plan to raise with the CLDR Technical Committee the idea that es-XX, where XX is any country within Latin America (419), should fall back to es-419 even if es-XX is not a valid locale. This would, however, require adding a new structure to the LDML spec, unless a locale were created for each combination of es with each remaining country within 419.

superscripting exponent format for scientific notation

Issue imported from the TODO:
https://github.com/perl-cldr/cldr-number-perl5/blob/master/lib/CLDR/Number/TODO.pod

format lengths (full, long, medium, short, narrow)

CLDR::Number::Role::Base already has the length attribute, which is not currently used. Valid lengths are full, long, medium, short, and narrow.

The desired functionality is described in UTS #35:

We should create a new test file: t/length.t

upgrade to CLDR v29

The cldr29 branch was generated with CLDR v29-beta1:
https://github.com/patch/cldr-number-pm5/compare/cldr29

Everything looks good and no tests were broken. When CLDR v29 is officially released, we can regenerate, document the upgrade in Changes, and release to CPAN.

Moo::Role-related bug in Perl 5.8.1 through 5.8.3

Most releases of CLDR::Number have had inconsistent but common Moo::Role-related test failures in Perl v5.8.1 through v5.8.3. The oldest version of Perl that has not been known to have this problem is v5.8.4, although there are very few reports on that version.

We should either figure out the problem and fix it, or raise the minimum version of Perl from v5.8.1 (September 2003) to v5.8.4 (April 2004), which I would not be against.

Test reports:
http://matrix.cpantesters.org/?dist=CLDR-Number+0.19

Typical output:

Use of uninitialized value in method lookup at /home/njh/perl5/perlbrew/perls/perl-5.8.1/lib/site_perl/5.8.1/Moo/Role.pm line 138.
Use of uninitialized value in method lookup at /home/njh/perl5/perlbrew/perls/perl-5.8.1/lib/site_perl/5.8.1/Moo/Role.pm line 138.
Can't locate object method "is_role" via package "Moo::Role" at /home/njh/perl5/perlbrew/perls/perl-5.8.1/lib/site_perl/5.8.1/Moo/Role.pm line 138.
BEGIN failed--compilation aborted at /home/njh/.cpan/build/CLDR-Number-0.19-M6Ajmp/blib/lib/CLDR/Number/Role/Format.pm line 13.
Compilation failed in require at /home/njh/perl5/perlbrew/perls/perl-5.8.1/lib/site_perl/5.8.1/Module/Runtime.pm line 313.
Compilation failed in require at /home/njh/.cpan/build/CLDR-Number-0.19-M6Ajmp/blib/lib/CLDR/Number.pm line 32.

fix failing tests from CPAN Testers reports

We've been getting a lot of reports like this with 3 failing tests since the first CPAN upload this morning:
http://www.cpantesters.org/cpan/report/f66b08b0-6612-11e3-8a8a-6b1ebd322218

In fact, they all seem to be failing:
http://matrix.cpantesters.org/?dist=CLDR-Number+0.00_02

default numbering systems

Issue imported from the TODO:
https://github.com/perl-cldr/cldr-number-perl5/blob/master/lib/CLDR/Number/TODO.pod

support different rounding modes

As per the CLDR spec (see below), default rounding is half-even. There is currently no way to change the rounding mode. We use Math::BigFloat, which supports the following modes: even, odd, +inf, -inf, zero, trunc, common. Let's add a rounding_mode attribute and decide whether we should use the same modes and names as Math::BigFloat.

An implementation may allow the specification of a rounding mode to determine how values are rounded. In the absence of such choices, the default is to round "half-even", as described in IEEE arithmetic. That is, it rounds towards the "nearest neighbor" unless both neighbors are equidistant, in which case, it rounds towards the even neighbor. Behaves as for round "half-up" if the digit to the left of the discarded fraction is odd; behaves as for round "half-down" if it's even. Note that this is the rounding mode that minimizes cumulative error when applied repeatedly over a sequence of calculations.
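To make the difference between half-even and other modes concrete, here is a minimal sketch using Python's `decimal` module rather than Math::BigFloat. The helper name `round_fraction` is hypothetical and is not the proposed rounding_mode API.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def round_fraction(value, digits, mode=ROUND_HALF_EVEN):
    """Round a decimal string to `digits` fraction digits using `mode`."""
    quantum = Decimal(1).scaleb(-digits)  # digits=2 gives Decimal('0.01')
    return Decimal(value).quantize(quantum, rounding=mode)

# Ties go to the even neighbor under half-even ...
print(round_fraction("2.5", 0), round_fraction("3.5", 0))
print(round_fraction("0.125", 2), round_fraction("0.135", 2))
# ... but always away from zero under half-up.
print(round_fraction("2.5", 0, ROUND_HALF_UP))
```

Values are passed as strings so that ties such as 0.125 are represented exactly instead of going through binary floating point.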

other numbering systems (native, traditional, finance)

Issue imported from the TODO:
https://github.com/perl-cldr/cldr-number-perl5/blob/master/lib/CLDR/Number/TODO.pod

test upgrade using preliminary CLDR v28 data

From @JCEmmons:

To: cldr-users
Subject: Preliminary JSON available for release 28
From: John Emmons
Date: Tue, 1 Sep 2015 00:09:28 -0500

A preliminary version of the JSON for the upcoming CLDR release 28 is now
available on github for testing. Please see
https://github.com/unicode-cldr/cldr-json for details. Any errors or
omissions should be reported via CLDR trac by filing a new ticket at
http://unicode.org/cldr/trac/newticket

use Math::BigFloat as much as possible

We’re already using Math::BigFloat in most situations for rounding using the round_mode and ffround methods. Let’s continue to use it for any functionality we can, replacing existing code in CLDR::Number: is_nan, is_inf, is_pos, is_neg, etc.

We need plurals data from CLDR.

handle undef as method argument

Handle undef by warning and returning undef like core Perl functions.

locales should inherit from defined parent locales when available

Right now, the inheritance works like zh-Hant-MO → zh-Hant → zh → root, but Part 1 Core §4.1.1 Parent Locales defines exceptions in the LDML for different parents.

For example:

 <parentLocale parent="zh_Hant_HK" locales="zh-Hant-MO"/>

This would modify the inheritance to zh-Hant-MO → zh-Hant-HK → zh-Hant → zh → root.

Others are defined with a parent of root to skip normal steps altogether. The most notable problem with the current inheritance is that es-US (US Spanish), es-MX (Mexican Spanish), es-CR (Costa Rican Spanish), etc., inherit directly from es (European Spanish) instead of es-419 (Latin American Spanish).
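The lookup order described above can be sketched as a small function. This is a language-neutral illustration in Python, not the module's implementation; `PARENT_LOCALES` is a hypothetical hand-typed excerpt covering only the locales named in this issue, where a real implementation would generate the table from the CLDR parentLocale data.

```python
# Hypothetical excerpt of explicit parent overrides (hand-typed for the example).
PARENT_LOCALES = {
    "zh-Hant-MO": "zh-Hant-HK",
    "es-US": "es-419",
    "es-MX": "es-419",
    "es-CR": "es-419",
}

def inheritance_chain(locale):
    """Return the lookup order for `locale`, ending at root."""
    chain = [locale]
    while locale != "root":
        if locale in PARENT_LOCALES:
            locale = PARENT_LOCALES[locale]      # explicit parent wins
        elif "-" in locale:
            locale = locale.rsplit("-", 1)[0]    # otherwise truncate last subtag
        else:
            locale = "root"
        chain.append(locale)
    return chain

print(" → ".join(inheritance_chain("zh-Hant-MO")))
print(" → ".join(inheritance_chain("es-MX")))
```

A locale without an explicit parent, such as en-MX, still resolves by plain truncation to en and then root.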

currency symbol lengths

Issue imported from the TODO:
https://github.com/perl-cldr/cldr-number-perl5/blob/master/lib/CLDR/Number/TODO.pod

possible Moo v1.000006 / v1.000007 bug

Three CPAN Testers’ reports show massive test failures that may be related to Moo v1.000006 and v1.000007. More investigation is needed.

Here are the related CPAN Testers’ reports:

improve docs for a broader audience

tl;dr: Let’s improve the docs! Please add doc requests or suggestions in the comments here.

The first goal of this project was to implement the standardized Unicode CLDR–based localized number formatting defined in UTS #35, Part 3: Numbers. Much of the CLDR::Number documentation, however, does not explain the functionality in enough detail for developers who are not already familiar with the CLDR. This project shouldn’t require external knowledge in order to use it. One problem is that the module allows for a lot of advanced customization that most developers will never need to use or know about, because they can instead depend on the defaults provided for the requested locale (and currency for prices).

Perhaps the docs should be split into a fully self-contained introductory part with more examples and an advanced part with all the gritty options and external references. These days I write much more documentation for developers than actual code, and while I have less time for maintaining my CPAN modules, I’d like to commit some time to improving these docs.

Thanks to @Ovid for bringing this to my attention:

write FAQ

Started this in commit fdc3e57.

integrate repo with Coveralls

https://coveralls.io/r/perl-cldr/cldr-number-pm5

'accounting' currency format in addition to default 'standard'

Issue imported from the TODO:
https://github.com/perl-cldr/cldr-number-perl5/blob/master/lib/CLDR/Number/TODO.pod

maximum integer digits

Implement the functionality supplied by the maximum_integer_digits attribute, which already exists as a stub. There doesn’t appear to be a symbol associated with this.

UTS #35, Part 3, §3.3:

If the number of actual integer digits exceeds the maximum integer digits, then only the least significant digits are shown. For example, 1997 is formatted as 97 if the maximum integer digits is set to 2.
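The quoted rule amounts to keeping only the rightmost digits. A minimal sketch in Python, with the hypothetical helper name `limit_integer_digits` and the assumption that an unset maximum means no limit:

```python
def limit_integer_digits(integer_digits, maximum=None):
    """Keep only the least significant `maximum` digits of a digit string.

    Assumption for this sketch: maximum=None means "no limit".
    """
    if maximum is None or len(integer_digits) <= maximum:
        return integer_digits
    return integer_digits[len(integer_digits) - maximum:]

print(limit_integer_digits("1997", 2))
```

The function works on the digit string of the integer part, so it would run before grouping separators are inserted.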

parsed pattern caching

Issue imported from the TODO:
https://github.com/perl-cldr/cldr-number-perl5/blob/master/lib/CLDR/Number/TODO.pod

non-Latin (latn) numbering systems (thai, geor, hant, etc.)

May be easier to start out with numbering systems that have a @type of numeric as well as a value for @digits.

Issue imported from the TODO:
https://github.com/perl-cldr/cldr-number-perl5/blob/master/lib/CLDR/Number/TODO.pod

passing prebuilt locales to formatters from CLDR::Number

Issue imported from the TODO:
https://github.com/perl-cldr/cldr-number-perl5/blob/master/lib/CLDR/Number/TODO.pod

locale subtag attributes for use without locale attribute parsing

Issue imported from the TODO:
https://github.com/perl-cldr/cldr-number-perl5/blob/master/lib/CLDR/Number/TODO.pod

move tests to Test::Class

currency spacing rules

Issue imported from the TODO:
https://github.com/perl-cldr/cldr-number-perl5/blob/master/lib/CLDR/Number/TODO.pod

remove Math::BigFloat for Inf/NaN checking

We started using Math::BigFloat in CLDR::Number v0.14 [issue #45] to check for infinity, NaN, and negatives, but this addition has created many failing test reports:

http://matrix.cpantesters.org/?dist=CLDR-Number+0.14

It turns out that Perl 5.22 overhauled infinity and NaN values to be more consistent across platforms and operations, including stringifying to Inf and NaN instead of the previous inf and nan; however, Math::BigFloat doesn’t understand those titlecased values and treats them both as NaN. We’re better off performing the checks ourselves for now, as well as submitting an issue for the Math::BigInt project.

add algorithmic (non-decimal) numbering systems

We now support non-Latin (latn) numbering systems, but only decimal systems, not algorithmic systems like hant (Traditional Chinese Numerals), hebr (Hebrew Numerals), roman (Roman Numerals), etc.

Using Locale::CLDR corrupts CLDR::Number

I have been using CLDR::Number in a project for quite some time to format numbers for the right locale.

Now I want to use Locale::CLDR to get country names in the correct language. However, as soon as I use Locale::CLDR, formatting an integer number via CLDR::Number fails with the message:
Can't locate object method "ffround" via package "Math::BigInt" at <path_to>/perllib/CLDR/Number/Role/Format.pm line 260

I can easily reproduce this using the following script:

#!/usr/bin/perl

use strict;

use CLDR::Number;
use Locale::CLDR;

my $cldr = CLDR::Number->new(locale => 'en');
my $formatter = $cldr->decimal_formatter(minimum_fraction_digits => 2, maximum_fraction_digits => 2);
print 'Success: ', $formatter->format(15.23), "\n";
print 'Fail: ', $formatter->format(42.0), "\n";

Here, the formatting of the number 42 will fail with the indicated message. As soon as I remove the line use Locale::CLDR, the formatting works as expected.

Do you know why using Locale::CLDR causes CLDR::Number to break? I know that the latter is a somewhat older module, but I do not want to let go of it. If there is a more up-to-date module with an interface similar to CLDR::Number, then I will definitely check it out.

deprecate mutable locales

The locale attribute being mutable has caused additional code, complexity, and bugs. The problem is that it is a rw attribute that sets a dozen or so other rw attributes. It's difficult to maintain these inherited attributes that should be lazy, publicly writable, and change based on changes to locale. The solution is to change locale from rw to ro. This is backward-incompatible, but there are no known real-world uses of a mutable locale other than convenience in unit tests and examples.

1. Publicly announce upcoming deprecation of the locale method used as a setter and request feedback.
2. Document the deprecation in the next release of CLDR::Number.
3. Warn when mutating the locale in a further release.
4. Finally, change the locale from rw to ro and remove related code.

Comments and suggestions highly appreciated!

round half-even with rounding increment

By default we use Math::BigFloat for rounding and round in half-even mode. If a rounding increment greater than 1 is provided in the pattern or via the rounding_increment attribute, we instead use Math::Round::nearest because it supports rounding increments; however, it does not support half-even rounding, which I believe we should be performing along with rounding increments. We need to investigate alternatives and possibly ask for clarification on the CLDR mailing list.
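One possible way to combine the two requirements is to divide by the increment, round the quotient half-even to an integer, and multiply back. The sketch below shows this in Python with the `decimal` module; it is one candidate approach under that assumption, not a statement of what the CLDR spec requires, and `round_to_increment` is a hypothetical name.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_to_increment(value, increment):
    """Round `value` to the nearest multiple of `increment`, ties to even."""
    inc = Decimal(increment)
    # Number of increments, with ties going to the even multiple.
    steps = (Decimal(value) / inc).quantize(Decimal(1), rounding=ROUND_HALF_EVEN)
    return steps * inc

print(round_to_increment("1.225", "0.05"))  # tie between 1.20 and 1.25
print(round_to_increment("1.275", "0.05"))  # tie between 1.25 and 1.30
print(round_to_increment("12.5", "5"))      # tie between 10 and 15
```

Here "even" refers to the multiple count (24 increments rather than 25), which is the open question the issue proposes to clarify on the CLDR mailing list.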

number parsers

Add number parsers under CLDR::Number::Parse as described in UTS #35, Part 3, §7.

scientific notation formatter

Issue imported from the TODO:
https://github.com/perl-cldr/cldr-number-perl5/blob/master/lib/CLDR/Number/TODO.pod

infinity and NaN are not supported by all perls

Some older perls on some systems don’t support inf and nan. Here are a few failing test reports from CLDR::Number v0.12.

I think we should just test for support in the test file t/inf-nan.t and skip with a diag warning when not supported, as well as documenting that the feature depends on perl’s support for the given system.

quiet down expected warnings in tests

Use Test::Warnings so we don't actually warn to STDERR while running tests.

Here's the only current problem:

t/inheritance.t ....... ok
default_locale 'xx' is unknown at (eval 36) line 44.

escaped quoting bug in Perl v5.8.8

Escaped quotes are being returned in formats as \xF7\xB0\x80\x84 (utf8-encoded \x{1F0000}) instead of the proper '. This is happening in both CPAN Testers’ reports for Perl v5.8.8 and in no other versions. The other reports from v5.8.x are for v5.8.5 and v5.8.9, which do not have this problem.

Here are the related CPAN Testers’ reports:

Here are the three failing tests, which are the same in both reports:

#   Failed test 'single quote itself'
#   at t/from_uts35.t line 57.
#          got: '1 oÃ·Â°Â€Â„clock'
#     expected: '1 o'clock'
# Looks like you failed 1 test of 41.
t/from_uts35.t ........ 
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/41 subtests 

#   Failed test at t/quoting.t line 16.
#          got: 'Ã·Â°Â€Â„123Ã·Â°Â€Â„'
#     expected: ''123''

#   Failed test at t/quoting.t line 17.
#          got: '#Ã·Â°Â€Â„#'
#     expected: '#'#'
# Looks like you failed 2 tests of 7.
t/quoting.t ........... 
Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/7 subtests

change internal placeholder non-Unicode codepoints to PUA

Change non-Unicode codepoints to Private Use Area codepoints. These are internally used as placeholders. We're currently using U+1F0000, U+1F0001, U+1F0002, U+1F0003, and U+1F0004, but this caused bug #20, which required a hacky workaround.

Tests fail (with latest Moo?)

There are new test failures — see http://www.cpantesters.org/cpan/report/487b0514-e060-11e5-a971-eac272d7c31d for a sample.

Statistical analysis from test failures generated on my machine suggests that the problem is caused by the latest Moo (negative theta is bad):

****************************************************************
Regression 'mod:Moo'
****************************************************************
Name                   Theta          StdErr     T-stat
[0='const']           1.0000          0.0000    30849180474401392.00
[1='eq_1.007000']             0.0000          0.0000       0.00
[2='eq_2.000001']             0.0000          0.0000       1.98
[3='eq_2.000002']             0.0000          0.0000       3.36
[4='eq_2.001000']            -1.0000          0.0000    -28977759259709780.00

R^2= 1.000, N= 74, K= 5
****************************************************************

preparsed patterns for predefined locales

Issue imported from the TODO:
https://github.com/perl-cldr/cldr-number-perl5/blob/master/lib/CLDR/Number/TODO.pod

significant digits

Add the significant_digits attribute, the @ symbol in patterns, and associated functionality described in UTS #35, Part 3, §3.5.

format inf, -inf, and nan

Perl treats inf, -inf, and nan as numbers; CLDR has formats for infinity, nan, and the negative sign; so let's format them appropriately.

integrate repo with Travis CI

https://travis-ci.org/perl-cldr/cldr-number-pm5

support spelled-out currencies

Add support for spelled-out currencies using the unitPattern and displayName with a count attribute. For example, 5000 JPY (Japanese Yen) in ja (Japanese) would be 5,000 円 (as opposed to ￥5,000), which uses the unitPattern {0} {1} and displayName 円 with the count other. See UTS #35, Part 3, §4: Currencies for details.
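The substitution step in the example above can be sketched as follows, in Python for illustration. The function name `format_spelled_out` is hypothetical, the number is assumed to be already formatted for the locale, and selecting the plural count (other in the Japanese example) is left out.

```python
def format_spelled_out(formatted_number, display_name, unit_pattern="{0} {1}"):
    """Fill a CLDR-style unitPattern: {0} is the number, {1} the currency name."""
    return unit_pattern.replace("{0}", formatted_number).replace("{1}", display_name)

# Values taken from the Japanese example in this issue.
print(format_spelled_out("5,000", "円"))
```

Plain string replacement is used instead of `str.format` so that other braces or quoting in a pattern are left untouched.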

Review the ICU API for this feature and determine what attribute should be used to enable it. Also consider how to best store and load the data because it will take much more memory than the other currency data.

This feature has been requested by users.

validate method arguments

Consider using Params::Validate. See also #22 for handling undef.

support CLDR v27

CLDR v25 was released today:
http://unicode-inc.blogspot.com/2014/03/cldr-version-25-released.html

The changes are primarily structural in nature and very few of these changes affect numbers, while none of these structural changes affect the implemented portions of CLDR::Number.

Here are the locale data changes that affect us:

new locales fy (West Frisian), fy-NL, ug (Uyghur), ug-Arab, ug-Arab-CN, prg (Prussian)
data improvements for official languages
number symbol fixes

Additionally there is "Better locale matching, with better fallbacks; likely subtags for regions; added scripts for various languages" but our locale matching and fallbacks were already rather minimal. We should obviously use the new version when implementing matching/fallback improvements.

add minimum grouping digits

Minimum grouping digits were added to the spec in CLDR v26 (#33). LDML stores the related value as minimumGroupingDigits and we should add the minimum_grouping_digits attribute.

http://www.unicode.org/reports/tr35/tr35-numbers.html#Number_Elements

The minimumGroupingDigits can be used to suppress groupings below a certain value. This is used for languages such as Polish, where one would only write the grouping separator for values above 9999. The minimumGroupingDigits contains the default for the locale.

http://cldr.unicode.org/translation/numbering-systems

In some languages, the grouping separator is suppressed in certain cases. For example, see china-auf-wachstumskurs.gif, where there is a grouping separator in 12 080 but not in 4720. The minimumGroupingDigits determines what the default for a locale is.
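A minimal sketch of how the attribute could affect grouping, in Python for illustration. It assumes one reading of the spec text above: grouping is applied only when at least minimumGroupingDigits digits would sit to the left of the first separator. The function name, the plain-space separator, and the single group size are simplifications for the example.

```python
def group_integer(digits, separator=",", group_size=3, minimum_grouping_digits=1):
    """Insert grouping separators unless the number is too short to group."""
    # Assumed rule: need at least `minimum_grouping_digits` digits in front
    # of the first separator, otherwise leave the digits ungrouped.
    if len(digits) < group_size + minimum_grouping_digits:
        return digits
    head = len(digits) % group_size or group_size
    groups = [digits[:head]]
    for start in range(head, len(digits), group_size):
        groups.append(digits[start:start + group_size])
    return separator.join(groups)

# With a minimum of 2, as in the examples quoted above.
print(group_integer("4720", " ", minimum_grouping_digits=2))
print(group_integer("12080", " ", minimum_grouping_digits=2))
```

With the default minimum of 1 the same function groups 4720 as usual, so existing locales would be unaffected.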
