teamcohen / secondstring
A bunch of fancy soft string matching routines, with some accompanying datasets
License: Other
It would be nice to have a quickstart section on the README page that shows how to get going from scratch.
A link to the Javadoc would also be good.
And maybe a reference to this paper
And maybe the original SourceForge page?
I used this program in 2007, so I am now giving back with contributions.
We are facing an issue with the class below in a Hadoop job. Are there any limitations on using this class to calculate similarity? Please advise.
Class: com.wcohen.ss.Levenstein
Method invocation:
new Levenstein().score(str1, str2)
Error:
ERROR [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.StackOverflowError
at com.wcohen.ss.MemoMatrix.get(MemoMatrix.java:40)
at com.wcohen.ss.NeedlemanWunsch$MyMatrix.compute(NeedlemanWunsch.java:41)
at com.wcohen.ss.MemoMatrix.get(MemoMatrix.java:40)
at com.wcohen.ss.NeedlemanWunsch$MyMatrix.compute(NeedlemanWunsch.java:41)
at com.wcohen.ss.MemoMatrix.get(MemoMatrix.java:40)
at com.wcohen.ss.NeedlemanWunsch$MyMatrix.compute(NeedlemanWunsch.java:41)
at com.wcohen.ss.MemoMatrix.get(MemoMatrix.java:40)
at com.wcohen.ss.NeedlemanWunsch$MyMatrix.compute(NeedlemanWunsch.java:41)
at com.wcohen.ss.MemoMatrix.get(MemoMatrix.java:40)
at com.wcohen.ss.NeedlemanWunsch$MyMatrix.compute(NeedlemanWunsch.java:41)
at com.wcohen.ss.MemoMatrix.get(MemoMatrix.java:40)
at com.wcohen.ss.NeedlemanWunsch$MyMatrix.compute(NeedlemanWunsch.java:41)
at com.wcohen.ss.MemoMatrix.get(MemoMatrix.java:40)
at com.wcohen.ss.NeedlemanWunsch$MyMatrix.compute(NeedlemanWunsch.java:41)
at com.wcohen.ss.MemoMatrix.get(MemoMatrix.java:40)
at com.wcohen.ss.NeedlemanWunsch$MyMatrix.compute(NeedlemanWunsch.java:41)
at com.wcohen.ss.MemoMatrix.get(MemoMatrix.java:40)
at com.wcohen.ss.NeedlemanWunsch$MyMatrix.compute(NeedlemanWunsch.java:41)
at com.wcohen.ss.MemoMatrix.get(MemoMatrix.java:40)
at com.wcohen.ss.NeedlemanWunsch$MyMatrix.compute(NeedlemanWunsch.java:41)
at com.wcohen.ss.MemoMatrix.get(MemoMatrix.java:40)
at com.wcohen.ss.NeedlemanWunsch$MyMatrix.compute(NeedlemanWunsch.java:41)
at com.wcohen.ss.MemoMatrix.get(MemoMatrix.java:40)
at com.wcohen.ss.NeedlemanWunsch$MyMatrix.compute(NeedlemanWunsch.java:41)
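The alternating MemoMatrix.get and NeedlemanWunsch$MyMatrix.compute frames suggest that the score is computed by memoized recursion, in which case the call depth would grow with the length of the input strings and long inputs could exhaust the thread stack. One possible workaround is to compute the edit distance iteratively. The following is a minimal sketch under that assumption; the class IterativeEditDistance and its distance method are hypothetical names, not part of SecondString, and the value returned is a plain non-negative edit count that may differ in sign or scale from what Levenstein.score returns.

```java
// Hypothetical helper, not part of SecondString: a unit-cost edit distance
// computed with two rolling rows, so no recursion is involved and the
// memory used is proportional to the length of the second string.
class IterativeEditDistance {

    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // delete all i characters of a's prefix
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int substitute = prev[j - 1] + cost;
                int delete = prev[j] + 1;
                int insert = curr[j - 1] + 1;
                curr[j] = Math.min(substitute, Math.min(delete, insert));
            }
            // Swap the rows so that prev holds the row just computed.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // prints 3
    }
}
```

Because the loops never recurse, the stack depth stays constant regardless of input length. Whether this matches the library's scoring closely enough for a given job would need to be checked against known inputs.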
Hello
I am currently packaging secondstring for the Debian project because it is one of the dependencies of the OpenRefine tool.
The files
src/com/wcohen/ss/TagLink.java
src/com/wcohen/ss/tokens/TagLinkToken.java
mention Horacio Camacho as the author, but he does not seem to be a member of the original "TeamCohen". Could you clarify whether Horacio Camacho also licensed his work under the University-of-Illinois-NCSA-Open-Source-License? Thanks in advance.