rust-petname's People

Contributors

allenap, dependabot-preview[bot], hardliner66, scsibug, tranzystorekk, vadixidav

rust-petname's Issues

Switch to GitHub Actions

Currently using Travis CI. It works, but I find it confusing. This would also be an opportunity to learn GitHub Actions.

Cheap, non-allocating generator

Currently, initializing a generator parses and allocates all of the words. That makes it too expensive for on-the-fly use, e.g. in a web server where names are generated on client request. The generator therefore has to be initialized once and kept around, but then it holds on to memory.

Even if you only need one name, you have to process all the words, when all you really need is the total word count and a mapping from a word index to its reference.

There could be a generator that uses a static slice of words: no need for runtime parsing nor allocation, and the wordlist is only in static program memory, not in the heap (÷2 memory usage).

Edit: Also space could be saved if words were not stored as a static newline-separated string but as something like (&'static [[u8; 1]], &'static [[u8; 2]], &'static [[u8; 3]], ...) i.e. lists of words indexed by length (even more efficient than [&'static str] because we don't need one reference per word). I hope such a list can be built at compile-time from the newline-separated files.
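A minimal sketch of the static-slice idea, not the crate's actual API: the word lists, the `XorShift64` helper, and the function names are all hypothetical, and the tiny generator is there only to keep the example free of external crates.

```rust
/// Hypothetical stand-ins for the compiled-in word lists; the real lists
/// are far longer.
static ADJECTIVES: &[&str] = &["keen", "kind", "known"];
static NAMES: &[&str] = &["kite", "kiwi", "koala"];

/// Tiny xorshift64 generator (an assumption of this sketch, not part of
/// petname). The seed must be non-zero.
pub struct XorShift64(pub u64);

impl XorShift64 {
    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Pick one word from a static slice: no parsing, no heap allocation.
fn pick(rng: &mut XorShift64, words: &'static [&'static str]) -> &'static str {
    // The modulo introduces a slight bias, which is acceptable for a sketch.
    words[(rng.next_u64() % words.len() as u64) as usize]
}

/// Return the two halves of a name; the caller decides how to join them,
/// so this function itself never allocates.
pub fn generate(rng: &mut XorShift64) -> (&'static str, &'static str) {
    (pick(rng, ADJECTIVES), pick(rng, NAMES))
}
```

Calling `generate(&mut XorShift64(1))` returns a pair such as an adjective and a name borrowed directly from static program memory; joining them into a `String` is left to the caller.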

Upstream petname has -u / --ubuntu option

This generates Ubuntu-style names: alliteration of first character of each word.

I like this, but I'm not sure I want to copy this UX exactly. Maybe an --alliterate flag?
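As a rough illustration of what alliteration needs from the word lists, the sketch below finds the initial letters that every list shares and filters a list down to one letter. Both function names are hypothetical and are not taken from the crate.

```rust
use std::collections::BTreeSet;

/// Initial letters for which every list has at least one word.
pub fn common_initials(lists: &[&[&str]]) -> BTreeSet<char> {
    let mut sets = lists.iter().map(|list| {
        list.iter()
            .filter_map(|word| word.chars().next())
            .collect::<BTreeSet<char>>()
    });
    // Start from the first list's initials, then intersect with the rest.
    let first = sets.next().unwrap_or_default();
    sets.fold(first, |acc, set| acc.intersection(&set).copied().collect())
}

/// Keep only the words that start with `letter`, preserving their order.
pub fn starting_with<'a>(words: &[&'a str], letter: char) -> Vec<&'a str> {
    words
        .iter()
        .copied()
        .filter(|word| word.starts_with(letter))
        .collect()
}
```

An alliterating generator could pick one letter from `common_initials` and then choose each word from the corresponding `starting_with` subset.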

Improve README.md

  • Move the "Features & no_std support" section down – but call it out in the introduction – because I think it could be off-putting for someone who just wants a CLI or a simple library.
  • In the introduction, list the ways the crate can be used: CLI, library (as-is, stripped-down, or fully no_std).
  • Give examples of situations where petname is useful?
  • Move the code examples into the generated docs, i.e. https://docs.rs/petname/.

Make `Names` non-public, hide behind `impl Trait`

There's no need for Names to be public. It has a cardinality method that is convenient, but it just passes through to Petnames::cardinality, so we could remove it. A kind-of replacement would be to implement Iterator::size_hint.
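A sketch of the general pattern, assuming a simplified stand-in for Names: the real iterator yields random names, whereas this hypothetical one just walks a slice once so that `size_hint` has something meaningful to report.

```rust
/// Private iterator type; callers never get to name it.
struct Names<'a> {
    words: &'a [&'a str],
    position: usize,
}

impl<'a> Iterator for Names<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<Self::Item> {
        let word = *self.words.get(self.position)?;
        self.position += 1;
        Some(word)
    }

    /// Reports exactly how many items remain, in place of a separate
    /// `cardinality` method on the iterator.
    fn size_hint(&self) -> (usize, Option<usize>) {
        let remaining = self.words.len() - self.position;
        (remaining, Some(remaining))
    }
}

/// The only public entry point: the concrete type stays hidden behind
/// `impl Iterator`.
pub fn names<'a>(words: &'a [&'a str]) -> impl Iterator<Item = &'a str> + 'a {
    Names { words, position: 0 }
}
```

Because the return type is opaque, the struct can later be renamed or replaced without breaking callers.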

Ensure that default word lists are unique

The following snippet will find duplicates:

find words -name '*.txt' |
  while read filename; do
    printf '=== %q\n' "$filename" && sort "$filename" | uniq -d
  done

In CI we want to ensure that list is empty.
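The same check could also live in the Rust test suite. The helper below is a hypothetical sketch that reports every word appearing more than once in a newline-separated list.

```rust
use std::collections::BTreeSet;

/// Return every word that appears more than once in a newline-separated
/// word list, in sorted order.
pub fn duplicates(list: &str) -> BTreeSet<&str> {
    let mut seen = BTreeSet::new();
    list.lines()
        // `insert` returns false when the word was already present.
        .filter(|word| !seen.insert(*word))
        .collect()
}
```

A test could call `duplicates` on each word list (for example via `include_str!` on the list files, whose exact paths are an assumption here) and assert that the result is empty.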

Upstream petname has -d / --dir option

This allows loading alternative word lists. The target should contain adverbs.txt, adjectives.txt, and names.txt. The default upstream is /usr/share/petname. The default in rust-petname is to use the compiled-in word lists.

Optimise word lists for common operations

This follows on from ideas in #76:

... space could be saved if words were not stored as a static newline-separated string but as something like (&'static [[u8; 1]], &'static [[u8; 2]], &'static [[u8; 3]], ...), i.e. lists of words indexed by length (even more efficient than [&'static str] because we don't need one reference per word). I hope such a list can be built at compile-time from the newline-separated files.

There are at least a couple of common things that the petname command-line tool lets you do that might benefit from preprocessing the word lists before they're compiled in, i.e. alliteration, and word length limits:

    -a, --alliterate                  Generate names where each word begins with the same letter
    -A, --alliterate-with <LETTER>    Generate names where each word begins with the given letter

    -l, --letters <LETTERS>           Maximum number of letters in each word; 0 for unlimited [default: 0]

Separately, I am thinking about changing the -l, --letters <LETTERS> option to take a range, e.g. 3-8. That might have a bearing on how to preprocess the default word lists.
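To make the length-indexed layout concrete, here is a small sketch under stated assumptions: the three buckets are hand-written rather than generated at compile time, the words are ASCII, and all names (`LEN3`, `count_in`, `nth_in`, and so on) are hypothetical. It shows how a length range such as 3-8 could be served without scanning any words.

```rust
use std::ops::RangeInclusive;

/// One bucket per word length; each word is a fixed-size byte array, so no
/// per-word reference is stored.
static LEN3: &[[u8; 3]] = &[*b"bee", *b"jay", *b"ray"];
static LEN4: &[[u8; 4]] = &[*b"colt", *b"fawn", *b"wren"];
static LEN5: &[[u8; 5]] = &[*b"bunny", *b"coral"];

fn get<const N: usize>(bucket: &'static [[u8; N]], index: usize) -> Option<&'static str> {
    bucket.get(index).and_then(|bytes| std::str::from_utf8(bytes).ok())
}

/// Number of words of exactly `len` letters.
pub fn bucket_len(len: usize) -> usize {
    match len {
        3 => LEN3.len(),
        4 => LEN4.len(),
        5 => LEN5.len(),
        _ => 0,
    }
}

/// The word at `index` within the bucket for `len`.
pub fn word(len: usize, index: usize) -> Option<&'static str> {
    match len {
        3 => get(LEN3, index),
        4 => get(LEN4, index),
        5 => get(LEN5, index),
        _ => None,
    }
}

/// How many words have a length within `range` (expected to be a small range).
pub fn count_in(range: RangeInclusive<usize>) -> usize {
    range.map(bucket_len).sum()
}

/// The `n`th word among those whose length falls within `range`.
pub fn nth_in(range: RangeInclusive<usize>, mut n: usize) -> Option<&'static str> {
    for len in range {
        let size = bucket_len(len);
        if n < size {
            return word(len, n);
        }
        n -= size;
    }
    None
}
```

Choosing a random word with 3 to 4 letters then amounts to drawing `n` below `count_in(3..=4)` and calling `nth_in(3..=4, n)`.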

Non-repeating iterator

Neither Petnames::generate nor Names guarantees that a name will not be repeated. It should be possible to create a non-repeating iterator by shuffling the word lists and then yielding their Cartesian product. This is obviously more expensive than choosing words on demand, though I assume not prohibitively so when the guarantee of non-repetition is needed. Maybe there's a less expensive way? Unless this can be done at near-zero cost, the default should stay as it is now.
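The shuffle-then-product idea can be sketched as follows. This is an illustration only: the `XorShift64` generator and the `non_repeating` function are hypothetical and use two word lists for brevity.

```rust
/// Tiny xorshift64 generator, used only to avoid external crates in this
/// sketch. The seed must be non-zero.
pub struct XorShift64(pub u64);

impl XorShift64 {
    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Fisher-Yates shuffle, in place.
fn shuffle<T>(items: &mut [T], rng: &mut XorShift64) {
    for i in (1..items.len()).rev() {
        let j = (rng.next_u64() % (i as u64 + 1)) as usize;
        items.swap(i, j);
    }
}

/// Shuffle each list once, then lazily yield the Cartesian product, so no
/// name can appear twice (given lists without duplicate words).
pub fn non_repeating<'a>(
    adjectives: &[&'a str],
    names: &[&'a str],
    rng: &mut XorShift64,
) -> impl Iterator<Item = String> + 'a {
    let mut adjectives = adjectives.to_vec();
    let mut names = names.to_vec();
    shuffle(&mut adjectives, rng);
    shuffle(&mut names, rng);
    let width = adjectives.len();
    // Index i maps to (i % width, i / width), covering every pair once.
    (0..width * names.len())
        .map(move |i| format!("{}-{}", adjectives[i % width], names[i / width]))
}
```

The cost is one copy and shuffle of each list up front; after that each name is produced on demand.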

Template strings

Ideas:

  • petname %v-%j-%n would yield "$adverb-$adjective-$noun", e.g. "fully-select-airedale" (same as petname -w3).
  • petname 'My %J %N' would yield "My $Adjective $Noun", e.g. "My Keen Toucan" (literal text & capitalisation).
  • ...
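A minimal sketch of how such templates might be expanded, assuming %j, %v, and %n stand for adjective, adverb, and noun, and that the upper-case forms capitalise the word. The `expand` function and the `%%` escape are assumptions of this sketch, not part of the proposal above.

```rust
/// Capitalise the first character of a word.
fn capitalise(word: &str) -> String {
    let mut chars = word.chars();
    match chars.next() {
        Some(first) => first.to_uppercase().chain(chars).collect(),
        None => String::new(),
    }
}

/// Expand a template; everything that is not a placeholder is copied as-is.
pub fn expand(template: &str, adjective: &str, adverb: &str, noun: &str) -> String {
    let mut out = String::new();
    let mut chars = template.chars();
    while let Some(c) = chars.next() {
        if c != '%' {
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('j') => out.push_str(adjective),
            Some('v') => out.push_str(adverb),
            Some('n') => out.push_str(noun),
            Some('J') => out.push_str(&capitalise(adjective)),
            Some('V') => out.push_str(&capitalise(adverb)),
            Some('N') => out.push_str(&capitalise(noun)),
            // Assumption: "%%" produces a literal percent sign.
            Some('%') => out.push('%'),
            // Unknown placeholders are passed through unchanged.
            Some(other) => {
                out.push('%');
                out.push(other);
            }
            None => out.push('%'),
        }
    }
    out
}
```

With this sketch, `expand("My %J %N", "keen", "fully", "toucan")` produces "My Keen Toucan".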

Macro to compile in custom word lists

With the default-words feature, one can compile in the word lists that ship with this crate. There's a certain amount of machinery built around that, but it's not possible to easily compile in custom word lists – one has to reproduce much of that same machinery. It would be cool if anyone could make use of that via, say, a petnames! macro.

Panics when pipe is closed in streaming mode (--count=0)

For example:

$ petname --count=0 | grep ^e | head -n1
enough-calf
thread 'main' panicked at 'failed printing to stdout: Broken pipe (os error 32)', src/libstd/io/stdio.rs:805:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
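The panic comes from println!, which panics when writing to stdout fails. One possible way to avoid it, sketched here with a hypothetical `write_names` helper, is to write with writeln! and treat a broken pipe as a normal end of output.

```rust
use std::fmt::Display;
use std::io::{self, ErrorKind, Write};

/// Write one name per line, stopping quietly once the reader has gone away.
pub fn write_names<W, I>(out: &mut W, names: I) -> io::Result<()>
where
    W: Write,
    I: IntoIterator,
    I::Item: Display,
{
    for name in names {
        match writeln!(out, "{name}") {
            Ok(()) => {}
            // The consumer (e.g. `head -n1`) closed the pipe: not an error.
            Err(error) if error.kind() == ErrorKind::BrokenPipe => return Ok(()),
            Err(error) => return Err(error),
        }
    }
    Ok(())
}
```

Passing a locked `std::io::stdout()` as `out` would let a streaming run end cleanly when piped into `head`; any other I/O error is still reported to the caller.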

Overlap between word lists

For example, the following words are common to both words/medium/names.txt and words/large/names.txt:

bee, buck, bunny, colt, coral, dane, drake, fawn, fisher, jay, joey, kit, ling, loris, mara, marlin, martin, merlin, molly, phoebe, phoenix, raven, ray, rhea, robin, shad, wren, zander

This is a problem because Petnames::default concatenates word lists of all sizes without deduplication, so these common words are more likely to be chosen than others.
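One way to remove the bias, shown here as a sketch with a hypothetical `merge_unique` helper, would be to deduplicate when the lists are concatenated.

```rust
/// Merge several newline-separated word lists into one sorted list in which
/// every word appears exactly once.
pub fn merge_unique<'a>(lists: &[&'a str]) -> Vec<&'a str> {
    let mut words: Vec<&'a str> = lists
        .iter()
        .copied()
        .flat_map(|list| list.lines())
        .collect();
    words.sort_unstable();
    // After sorting, duplicates are adjacent, so `dedup` removes them all.
    words.dedup();
    words
}
```

After merging, each word has the same chance of being chosen regardless of how many source lists contained it.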

This is not a problem upstream because only one word list (small, medium, or large) is used at any time. This is selected by the --complexity option at the command-line, so fixing #6 may eliminate this issue.

`petname` integration tests did not spot buffering issue

I think at some point during the v2 alpha/beta period I wrapped stdout with a buffered writer. It was never being explicitly flushed, but its Drop impl does that so it should have worked. However, I was creating the buffer in main and keeping it in scope until the end of the function – and since I was calling std::process::exit directly in main, the buffer's Drop impl was not called before the process exited.

The bug is: I didn't notice this, and released 2.0.0 with this issue!

The integration tests currently in this project call run directly, i.e. they missed the logic in main that caused the bug. An integration test that invokes petname from the outside could have prevented this.
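Independently of the testing gap, the bug itself can be avoided by keeping the buffered writer inside the function that uses it and flushing explicitly. The sketch below assumes a simplified, hypothetical `run` signature; the real one differs.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical stand-in for the CLI's `run`: the buffer lives and dies
/// inside this function, so nothing is left unflushed when it returns.
pub fn run<W: Write>(out: W) -> io::Result<i32> {
    let mut out = BufWriter::new(out);
    writeln!(out, "enough-calf")?;
    // Flush explicitly instead of relying on Drop, and surface any error.
    out.flush()?;
    Ok(0)
}

// A `main` built on top of this might look like the following:
//
// fn main() {
//     let code = run(std::io::stdout().lock()).unwrap_or(1);
//     // No buffer is in scope here, so exiting directly loses no output.
//     std::process::exit(code);
// }
```

This does not replace an integration test that runs the binary from the outside; it only removes the dependence on Drop running before std::process::exit.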

Upstream petname has -c / --complexity option

The value can be one of [0, 1, 2]; 0 = easy words, 1 = standard words, 2 = complex words.

This switches between the small (0), medium (1), and large (2) word sets. Values other than 0, 1, or 2 use the default word set, which is small.

Not sure I want to copy this UX exactly. Maybe it's more interesting to have --small, --medium, --large flags, or a --word-set={small,medium,large} option.

`petname --non-repeating` applies randomness "unevenly"

When running petname --non-repeating, for simplicity of implementation the word lists are shuffled only once at the start, then iterated through like a counter. For example, supposing we have 2 word lists:

a, b, c
x, y, z

The output of petname --non-repeating --stream might be:

c-y
a-y
b-y
c-z
a-z
b-z
c-x
a-x
b-x

Note that for the second word, we see all the ys, then all the zs, then the xs. They could come in any order, but they'll always be grouped. Note also that for the first word, the order in which we see them is repeated.

It's easy to observe this:

$ petname --non-repeating --alliterate-with k --count 30
known-koala
key-koala
keen-koala
kind-koala
knowing-koala
known-killdeer
key-killdeer
keen-killdeer
kind-killdeer
knowing-killdeer
known-kid
key-kid
keen-kid
kind-kid
knowing-kid
known-kite
key-kite
keen-kite
kind-kite
knowing-kite
known-kiwi
key-kiwi
keen-kiwi
kind-kiwi
knowing-kiwi
known-kitten
key-kitten
keen-kitten
kind-kitten
knowing-kitten

My expectation is that there would be no obvious patterns like this.
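One cheap way to break up the grouping, offered only as a sketch and not as the project's chosen fix, is to walk the index space of all combinations with a stride that is coprime to its size. Every combination is still visited exactly once, but consecutive names no longer share a word as often. The stride is fixed, so the order is scattered rather than truly random. The `scattered` function and its seed handling are hypothetical.

```rust
fn gcd(a: u64, b: u64) -> u64 {
    if b == 0 { a } else { gcd(b, a % b) }
}

/// Yield every adjective-name pair exactly once, visiting combination
/// indices in the order (step * i + offset) mod total.
pub fn scattered<'a>(
    adjectives: &'a [&'a str],
    names: &'a [&'a str],
    seed: u64,
) -> impl Iterator<Item = String> + 'a {
    let width = adjectives.len() as u64;
    let total = width * names.len() as u64;
    let modulus = total.max(1);
    let offset = seed % modulus;
    // The mapping is a bijection only if the step is coprime to the modulus.
    let mut step = (seed / modulus) % modulus;
    while gcd(step, modulus) != 1 {
        step += 1;
    }
    (0..total).map(move |i| {
        // Use u128 for the product so that it cannot overflow.
        let k = ((step as u128 * i as u128 + offset as u128) % modulus as u128) as u64;
        let adjective = adjectives[(k % width) as usize];
        let name = names[(k / width) as usize];
        format!("{adjective}-{name}")
    })
}
```

This needs no extra memory beyond the word lists themselves, at the price of a regular (if less visible) pattern in the output.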

Support for platforms that getrandom does not support

Currently, it isn't possible to use this library on a platform I am using. The reason is that petname seems to depend on getrandom which has unsupported targets (https://docs.rs/getrandom/0.2.1/getrandom/#unsupported-targets).

I DO have an implementation of Rng from the rand crate on my platform, which petname supports, but because petname also depends on getrandom, I am unable to use it without forking. Would you be willing to accept a PR that introduces features to petname so that default-features = false does not pull in getrandom?

Relevant error:

error: target is not supported, for more information see: https://docs.rs/getrandom/#unsupported-targets
   --> /home/worleyg/.cargo/registry/src/github.com-1ecc6299db9ec823/getrandom-0.2.1/src/lib.rs:214:9
    |
214 | /         compile_error!("target is not supported, for more information see: \
215 | |                         https://docs.rs/getrandom/#unsupported-targets");
    | |_________________________________________________________________________^
