Comments (8)
Can we make a basic character substitution map so that special or accented characters match plaintext, or is this too complicated with Unicode?
https://www.npmjs.com/package/normalize-strings
from periodo-reconciler.
Well, there we go.
Another way to do this is to normalize input to Unicode NFD (decomposed) and then use a regex to strip all accent class characters (in Ruby this is \p{M}).
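A minimal sketch of that approach in JavaScript, assuming a runtime that supports Unicode property escapes in regular expressions. The function name stripDiacritics is hypothetical and this is not necessarily what the fixing commit does:

```javascript
// Hypothetical helper illustrating the NFD + \p{M} idea from this comment.
function stripDiacritics(text) {
  // NFD splits a precomposed character such as "é" into the base letter "e"
  // followed by a combining mark (U+0301). \p{M} matches every character in
  // the Unicode Mark category, so removing those leaves the base letters.
  return text.normalize('NFD').replace(/\p{M}/gu, '');
}

console.log(stripDiacritics('Néolithique')); // Neolithique
```

Letters that have no canonical decomposition, such as "ø" or "ß", pass through unchanged, so a small substitution map may still be needed for those.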
Yep, that's probably a better approach than relying on a whitelist of characters.
Fixed by 48d5efd
One issue with replacing Roman numerals with their Arabic equivalents is that the same processing is applied to both period labels and queries. Doing that replacement on a period label is no problem, since it's very unlikely that a stray "I", "V", or "X" token in a period label would be anything other than a Roman numeral. However, that could happen in a query, for example if someone wanted all the periods that started with a "V". That's not a real case, but it's not far from one.
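To make the concern concrete, here is a rough sketch of token-level Roman numeral replacement. It is an illustration only, not the code from this repository: the names ROMAN_TOKEN, romanToArabic, and replaceRomanNumerals are hypothetical, and it assumes uppercase numerals separated by whitespace.

```javascript
// Values of the individual Roman numeral symbols.
const ROMAN_VALUES = { I: 1, V: 5, X: 10, L: 50, C: 100, D: 500, M: 1000 };

// Matches a whole token that is a well-formed Roman numeral (1 to 3999).
// The lookahead rejects the empty string, which the rest would accept.
const ROMAN_TOKEN =
  /^(?=[IVXLCDM])M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$/;

function romanToArabic(token) {
  let total = 0;
  for (let i = 0; i < token.length; i++) {
    const value = ROMAN_VALUES[token[i]];
    const next = ROMAN_VALUES[token[i + 1]] || 0;
    // A smaller symbol before a larger one is subtracted (IV = 4).
    total += value < next ? -value : value;
  }
  return total;
}

// Applied identically to period labels and to queries (the situation
// described above), so it cannot tell a numeral from a stray letter.
function replaceRomanNumerals(text) {
  return text
    .split(/\s+/)
    .map((token) => (ROMAN_TOKEN.test(token) ? String(romanToArabic(token)) : token))
    .join(' ');
}

console.log(replaceRomanNumerals('Late Helladic III')); // Late Helladic 3
console.log(replaceRomanNumerals('V'));                 // 5
```

The second call shows the edge case: a query consisting of the single letter "V" is rewritten to "5" even if the user meant the letter.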
But this comment is more about free-text searching than reconciliation, right? I would assume that one wouldn't have a dataset for reconciliation that just had letters like 'V' that weren't numerals.
Yes, you're right. Implemented in #7 if @rybesh is fine with it!