Use-case: Searching for a best match
Given a string Q and a list of strings L, I want to find the distance between Q and every element of L, and then pick the element with the shortest distance.
Current solution
The current solution is to call a distance function with two String parameters. This function assumes nothing about its arguments, so some information is recomputed on every call (for example, the profile of each string). In the use-case above, the profile of Q is rebuilt once for every element of L.
Proposal
Two additional APIs can be provided to improve performance:
Profile getProfile(String)
double distance(Profile p1, Profile p2)
For convenience, a third API would also be useful:
double distance(Profile p, String s) {
    return distance(p, getProfile(s));
}
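These could be used as follows (assuming list is a List<String>):
String query = "alex";
QGram qg = new QGram(2);
// Compute the query's profile once, outside the loop.
Profile queryProfile = qg.getProfile(query);
for (String element : list) {
    System.out.println(qg.distance(queryProfile, element));
}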
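To make the idea concrete, here is a minimal, self-contained sketch of the best-match search. It does not use the library's classes. The profile is assumed to be a plain Map of q-gram counts, and ProfileSearchSketch, getProfile, distance and bestMatch are hypothetical names standing in for the proposed APIs. The distance used is the sum of absolute differences between q-gram counts, which is an assumption about what a q-gram distance computes, not a copy of the existing implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names exist in the library.
final class ProfileSearchSketch {

    // Assumed profile representation: q-gram -> number of occurrences.
    static Map<String, Integer> getProfile(String s, int q) {
        Map<String, Integer> profile = new HashMap<>();
        for (int i = 0; i + q <= s.length(); i++) {
            profile.merge(s.substring(i, i + q), 1, Integer::sum);
        }
        return profile;
    }

    // Sum of absolute count differences over the q-grams of both profiles.
    static double distance(Map<String, Integer> p1, Map<String, Integer> p2) {
        double d = 0;
        for (Map.Entry<String, Integer> e : p1.entrySet()) {
            d += Math.abs(e.getValue() - p2.getOrDefault(e.getKey(), 0));
        }
        for (Map.Entry<String, Integer> e : p2.entrySet()) {
            if (!p1.containsKey(e.getKey())) {
                d += e.getValue();
            }
        }
        return d;
    }

    // The query profile is computed once and reused for every candidate.
    static String bestMatch(String query, List<String> candidates, int q) {
        Map<String, Integer> queryProfile = getProfile(query, q);
        String best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (String candidate : candidates) {
            double d = distance(queryProfile, getProfile(candidate, q));
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("bob", "alexa", "carol");
        System.out.println(bestMatch("alex", list, 2)); // prints: alexa
    }
}
```

In this sketch the candidates' profiles are still built inside the loop. Only the query's profile is hoisted out, which is the saving the convenience overload would give.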
Further, if the list is persistent, the profile of each element could be serialized into the persistent store alongside the element itself. Then only the query's profile has to be computed, once per search, and the profiles of the list elements never need to be recomputed.
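As a rough illustration of the persistence idea, the sketch below round-trips a profile through a byte array using standard Java serialization. It assumes the profile is a HashMap of q-gram counts, which is my assumption and not the library's actual Profile type. ProfileStoreSketch, serialize and deserialize are hypothetical names, and a real store might prefer a more compact or versioned format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;

// Hypothetical sketch: the profile is assumed to be a HashMap of q-gram counts.
final class ProfileStoreSketch {

    // Turn a profile into bytes that can be written to any persistent store.
    static byte[] serialize(HashMap<String, Integer> profile) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(profile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuild the profile from stored bytes instead of recomputing it.
    @SuppressWarnings("unchecked")
    static HashMap<String, Integer> deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (HashMap<String, Integer>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The stored bytes would be written once per list element. At query time each profile is read back and passed straight to the profile-based distance function.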
If the overall idea sounds good to you, I will write more about how the Profile type could be made type safe across the different implementations of StringSimilarityInterface.