Giter Club home page Giter Club logo

nicknames's Introduction

CI PyPI version

Nicknames

A hand-curated CSV file containing English given names (first names) and their associated nicknames.

There are Python, SQL, Java, Perl, and R parsers provided for convenience.

This is a relatively large list with roughly 1100 canonical names. Any help from people to clean this list up and add to it is greatly appreciated. The first name in a line is the canonical name, and the rest are nicknames for that name.

This lookup file was initially created by mining this genealogy page from the Center for African American Research, Inc. Because the lookup originates from a dataset used for genealogy purposes there are old names that aren't commonly used these days, but there are recent ones as well. Examples are "gregory", "greg", or "geoffrey", "geoff". There was also a significant effort to make it machine readable, i.e. separate it with commas, remove human conventions like "rickie(y)" would need to be made into two different names "rickie", and "ricky". Due to the source of the original data, the dataset is heavily biased towards traditionally African American names. Names from other groups may or may not be present.

This project was created by Old Dominion University - Web Science and Digital Libraries Research Group. More information about the creation of this lookup can be found on this blog post about the creation of this library

Python API

The Python parser is available on PyPI from

pip install nicknames

and then you can do:

from nicknames import NickNamer

nn = NickNamer()

# Get the nicknames for a given name as a set of strings
nicks = nn.nicknames_of("Alexander")
assert isinstance(nicks, set)
assert "al" in nicks
assert "alex" in nicks

# Note that the relationship isn't symmetric: al is a nickname for alexander,
# but alexander is not a nickname for al.
assert "alexander" not in nn.nicknames_of("al")

# Capitalization is ignored and leading and trailing whitespace is ignored
assert nn.nicknames_of("alexander") == nn.nicknames_of(" ALEXANDER ")

# Queries that aren't found return an empty set
assert nn.nicknames_of("not a name") == set()

# The other useful thing is to go the other way, nickname to canonical:
# It acts very similarly to nicknames_of.
can = nn.canonicals_of("al")
assert isinstance(can, set)
assert "alexander" in can
assert "alex" in can

assert "al" not in nn.canonicals_of("alexander")

# You can combine these to see if two names are interchangeable:
union = nn.nicknames_of("al") | nn.canonicals_of("al")
are_interchangeable = "alexander" in union

For more advanced usage, such as loading your own data, read the source code.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.