itertools2's Introduction

itertools2

The R package itertools2 is a port of Python's excellent itertools module to R for efficient looping and is a replacement for the existing itertools R package.

Installation

You can install the stable version on CRAN:

install.packages('itertools2', dependencies=TRUE)

If you prefer to download the latest version, instead run the following after installing devtools:

devtools::install_github('ramhiser/itertools2')

License

The itertools2 R package is licensed under the MIT License. However, this package depends on the iterators R package, which is licensed under the Apache License, Version 2.0. Both packages are freely available for commercial and non-commercial usage. Please consult the licensing terms for more details.

itertools2's People

Contributors

ramhiser

itertools2's Issues

Satisfy BDR for CRAN Submission

Welp. Got Ripley'd over last night's CRAN submission. First, rather than ignoring the extended description in cran-comments.md, which remained the same, Ripley said:

None of this is relevant to an update of an existing package!

Next, because itertools2 is in the package title in DESCRIPTION, Ripley said:

Please do not repeat the name, and use title case as per 'Writing R Extensions'.

In case I didn't understand fully, Ripley closed the email with:

Do improve the title before re-submission, and reduce our reading load to relevant material.

Helper functions for consuming iterators

Writing unlist(as.list(it)) gets kind of old. Plus, it can get weird when the iterator returns something more than a single numeric or character value. With this in mind, helper functions are a must. Something along the lines of:

  • to_list
  • to_vector
  • to_dataframe

Port groupby() from Python

While this is an interesting function, dplyr handles this case well. Also, groupby may not translate easily to R because of the lack of dictionaries.
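
For reference, here is a minimal sketch of what Python's itertools.groupby does. It only groups runs of consecutive items that share a key, so the input is usually sorted by that key first. The sample rows are made up for illustration, and collecting the groups into a dictionary is just one convenient way to inspect them.

```python
from itertools import groupby

# Hypothetical sample data: (species, petal length) pairs, already sorted by species.
rows = [("setosa", 1.4), ("setosa", 1.3), ("virginica", 6.0), ("virginica", 5.1)]

# groupby yields (key, group-iterator) pairs for runs of consecutive equal keys.
grouped = {
    key: [length for _, length in group]
    for key, group in groupby(rows, key=lambda row: row[0])
}
print(grouped)
```

An R port would need some stand-in for the key-to-group pairing, for example a named list, which is presumably the difficulty mentioned above.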

itee does not work properly when an iterator is passed

When an iterator is passed to itee, the function does not behave as expected.

When the object passed is simply a vector, then the behavior is fine. Example:

 > iter_list <- itee(1:4, n=2)
 > lapply(iter_list, iterators::nextElem)
 [[1]]
 [1] 1

 [[2]]
 [1] 1

 > lapply(iter_list, iterators::nextElem)
 [[1]]
 [1] 2

 [[2]]
 [1] 2

 > lapply(iter_list, iterators::nextElem)
 [[1]]
 [1] 3

 [[2]]
 [1] 3

 > lapply(iter_list, iterators::nextElem)
 [[1]]
 [1] 4

 [[2]]
 [1] 4

 > lapply(iter_list, iterators::nextElem)
 Error: StopIteration

Now, consider the same case where the vector has first been passed to iterators::iter.

 > iter_list <- itee(iterators::iter(1:4), n=2)
 > lapply(iter_list, iterators::nextElem)
 [[1]]
 [1] 1

 [[2]]
 [1] 2

 > lapply(iter_list, iterators::nextElem)
 [[1]]
 [1] 3

 [[2]]
 [1] 4

 > lapply(iter_list, iterators::nextElem)
 Error: StopIteration

The individual elements are not independent of each other as they should be. This has to do with how itee behaves currently. The passed object is replicated via base::replicate.
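
For comparison, this is the behavior of Python's itertools.tee, which itee is modeled on: each returned iterator sees the full sequence even though they all draw from a single underlying iterator. This is only a sketch of the target behavior, not a statement about how itee is or should be implemented.

```python
from itertools import tee

source = iter([1, 2, 3, 4])
first, second = tee(source, 2)

# Each copy yields every element, regardless of how far the other has advanced.
first_values = list(first)
second_values = list(second)
print(first_values, second_values)
```

Python gets this independence by buffering items that one copy has consumed and the other has not, rather than by duplicating the source iterator.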

Write package vignette

A vignette is needed with several examples to demonstrate:

  1. How to use itertools2
  2. Why itertools2 is useful
  3. Why itertools2 is necessary

Better handling of iterators by ichain

When ichain is applied to objects of class iterator, the results are a bit confusing and are quite different from Python's itertools.

R version:

 > it <- ichain(islice('ABCDEFG', 2, 4), islice('ABCDEFG', 1, 3))
 > nextElem(it)
 [1] "ABCDEFG"
 > nextElem(it)
 Error: StopIteration

Compare with the Python version:

>>> from itertools import chain, islice
>>> it = chain(islice('ABCDEFG', 2, 4), islice('ABCDEFG', 0, 2))
>>> it.next()
'C'
>>> it.next()
'D'
>>> it.next()
'A'
>>> it.next()
'B'
>>> it.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

iseq_along not found

Example code

library(itertools2)
it3 <- iseq_along(iris)
Error: could not find function "iseq_along"

Random selection from an iterator/iterable object

In many cases, it makes a lot of sense to randomly select from some object (or iterator) by sampling the indices of the object. Example:

set.seed(42)
n <- nrow(iris)
indices <- seq_len(n)
train_idx <- sample(indices, 2/3 * n)

train_data <- iris[train_idx, ]
test_data <- iris[-train_idx, ]

If n is extremely large, the indices vector becomes extremely large. To avoid this overhead, it makes sense to have some interface like:

it <- isample(iseq_len(n), 50)
as.list(it) # list of length 50
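
One possible way to back such an interface is reservoir sampling (Algorithm R), which draws k items uniformly from an iterator in a single pass without holding all indices in memory. The Python sketch below is an assumption about how isample could work, not an existing implementation; the name reservoir_sample and its arguments are hypothetical.

```python
import random
from itertools import islice


def reservoir_sample(iterable, k, rng=random):
    """Draw k items uniformly without replacement from an iterable (Algorithm R)."""
    it = iter(iterable)
    # Fill the reservoir with the first k items.
    reservoir = list(islice(it, k))
    # `seen` counts items consumed so far, including the current one.
    for seen, item in enumerate(it, start=k + 1):
        j = rng.randrange(seen)  # uniform integer in [0, seen)
        if j < k:
            reservoir[j] = item  # keep the new item with probability k / seen
    return reservoir


rng = random.Random(42)
sample = reservoir_sample(range(1, 1_000_001), 50, rng)
print(len(sample))
```

If the input has fewer than k items, this sketch simply returns all of them. Note that the result comes back in reservoir order rather than a random order.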

Integrate with magrittr

Iterators and itertools2 will be more flexible and useful if they can be used with magrittr pipes (and ultimately with dplyr). It turns out that pipes do, in fact, work in some cases.

Example (related to #41):

as_vector <- function(x) {
  unlist(as.list(x))
}

set.seed(42)
irep(function(x) rnorm(1), times=10) %>% as_vector
# [1]  1.37095845 -0.56469817  0.36312841  0.63286260  0.40426832 -0.10612452  1.51152200 -0.09465904
# [9]  2.01842371 -0.06271410

Next Steps

  • Construct lots of use cases where pipes are used.
  • Lots of unit tests based on these examples.

Retool to use purrr as backend?

As discussed in #46, having the iterators package as a backend for itertools2 yields slow performance in a few cases. On the other hand, @hadley is on a functional-programming kick (a good thing). Besides, the purrr package is purrrdy sweet and fits the more modern R workflow.

Explore retooling itertools2 to use purrr as a backend.

Initial release to CRAN

  • Add @kschaef as package coauthor
  • Double-check README
  • Double-check DESCRIPTION
  • Double-check NEWS
  • Pass R CMD CHECK
  • Builds on Windoze
  • Tag git repo with version 0.1
  • Push package to CRAN

ichunk

I evaluated both itertools::ichunk and itertools2::ichunk, and the first (at least for me) looks more logical when handling objects whose length is not divisible by chunk_size. For example, I have a vector v=1:5 and want to obtain list(c(1,2), c(3,4), c(5)). Is this possible with itertools2::ichunk?
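
The requested behavior, where the last chunk is simply shorter, can be sketched in Python as follows. The function name chunked is hypothetical and this says nothing about how either R package implements ichunk.

```python
from itertools import islice


def chunked(iterable, chunk_size):
    """Yield lists of up to chunk_size items; the final list may be shorter."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


chunks = list(chunked(range(1, 6), 2))
print(chunks)  # [[1, 2], [3, 4], [5]]
```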

`icount` masks the `icount` function from the `iterators` package, but does not include `iterators::icount`'s functionality

The iterators version of icount generates integer sequences with a fixed step size of 1, starting from 1, either forever (with no argument) or until a specified stopping point is reached. The itertools2 version generates infinite integer sequences with a specified step size.

There are two problems with this:

  1. icount is part of the iterators package and not itertools, so I don't feel it's appropriate to mask.
  2. Loading itertools2 will drastically alter the behavior of the function, necessitating defensive ::s everywhere.

I understand that your icount function is closer to Python's than itertools::icount. Your thoughts?

And by the way, a brief comparison of itertools and itertools2 would be a nice addition to the readme.
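
To make the two icount behaviors described in this issue concrete, here is a Python sketch. The first part uses Python's itertools.count, which is infinite and takes a start and a step. The second part, icount_like, is a hypothetical stand-in for the iterators-style behavior described above (1, 2, ..., n, or unbounded when no argument is given).

```python
from itertools import count, islice

# Python-style: infinite sequence with a chosen start and step.
evens = list(islice(count(start=0, step=2), 5))


def icount_like(n=None):
    """Hypothetical analogue of the iterators-style icount: 1..n, or unbounded."""
    return count(1) if n is None else iter(range(1, n + 1))


first_three = list(icount_like(3))
unbounded_head = list(islice(icount_like(), 4))
print(evens, first_three, unbounded_head)
```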

hasNext

Do you have plans to implement a hasNext method (similar to itertools::hasNext)?
EDIT: Also, itertools::ihasNext() is very useful.
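
One common way to add a has-next check to an iterator is to wrap it and buffer a single element ahead. The Python sketch below illustrates that idea; the class name Peekable and its has_next method are hypothetical, and this is not a description of how itertools::ihasNext works internally.

```python
class Peekable:
    """Hypothetical wrapper that adds has_next() by buffering one item ahead."""

    _SENTINEL = object()

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = self._SENTINEL

    def has_next(self):
        # Pull one item into the buffer if nothing is buffered yet.
        if self._buffer is self._SENTINEL:
            self._buffer = next(self._it, self._SENTINEL)
        return self._buffer is not self._SENTINEL

    def __iter__(self):
        return self

    def __next__(self):
        if not self.has_next():
            raise StopIteration
        item, self._buffer = self._buffer, self._SENTINEL
        return item


it = Peekable([1, 2])
seen = []
while it.has_next():
    seen.append(next(it))
print(seen)
```

The cost of this approach is that checking has_next may consume an element from the underlying iterator, so the wrapped iterator should not be read from directly afterwards.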

itee() doesn't function with pre-existing iterators

Example code

library(itertools2)
iterator <- iter(1:9)
iter_list <- itee(iterator)
iter_list <- itee(iterator, 2)
nextElem(iter_list[[1]])
[1] 1
nextElem(iter_list[[2]])
[1] 2

Likely we won't be able to rely on replicate for iterators.

Combinatoric generators should accept a single integer

Similar to utils::combn, iff a single integer is passed to icombinations or ipermutations, the iterator should return the combinations/permutations of the sequence. For example, the following should be equivalent:

  • icombinations(5, 2)
  • icombinations(1:5, 2)

Currently, as.list(icombinations(5, 2)) returns list().
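
The requested equivalence can be sketched in Python as below. The helper combinations_of is hypothetical; it expands a single integer n to the sequence 1..n before delegating to itertools.combinations, mirroring the utils::combn convention mentioned above.

```python
from itertools import combinations


def combinations_of(x, m):
    """Hypothetical helper: treat a single integer n as the sequence 1..n."""
    items = range(1, x + 1) if isinstance(x, int) else x
    return combinations(items, m)


from_int = list(combinations_of(5, 2))
from_seq = list(combinations_of([1, 2, 3, 4, 5], 2))
print(from_int == from_seq)
```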

Better handling of iterators by izip

When izip is applied to objects of class iterator, the results are a bit confusing and are quite different from Python's itertools.

R version:

> it <- izip(islice('ABCDEFG', 2, 4), islice('ABCDEFG', 1, 3))
> nextElem(it)
 Error: StopIteration

Compare with the Python version:

>>> from itertools import izip, islice
>>> it = izip(islice('ABCDEFG', 2, 4), islice('ABCDEFG', 0, 2))
>>> it.next()
('C', 'A')
>>> it.next()
('D', 'B')
>>> it.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
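
Note for readers on Python 3: itertools.izip no longer exists there, because the built-in zip is already lazy. The same comparison can be written as:

```python
from itertools import islice

# Python 3: the built-in zip returns a lazy iterator, replacing itertools.izip.
it = zip(islice('ABCDEFG', 2, 4), islice('ABCDEFG', 0, 2))
pairs = list(it)
print(pairs)  # [('C', 'A'), ('D', 'B')]
```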

Implement an 'in' operator or something similar

Currently, we are using the iterators package, which requires the nextElem function to be called to get the next element from the iterator. This is annoying.

We should be able to do something like (similar to Python):

for (i in iter(1:3)) {
  print(i)
}

... but nope.

Something like %in% may work. See this SO post.

Implement enumerate

Port Python's built-in enumerate function. See, for example, the implementation in the Kmisc package.
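
For reference, a short sketch of the Python behavior to be ported. Python's enumerate starts at 0 by default; start=1 is used here only because it lines up with R's 1-based indexing, which is an assumption about what an R port would prefer.

```python
letters = ['a', 'b', 'c']

# enumerate pairs each element with a running index.
indexed = list(enumerate(letters, start=1))
print(indexed)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```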

irep should match base::rep() when both times and each args are given

itertools2::irep() needs to be modified so that it matches base::rep() when both the times and each args are provided.

Example:

 > rep(1:4, times=2, each=3)
  [1] 1 1 1 2 2 2 3 3 3 4 4 4 1 1 1 2 2 2 3 3 3 4 4 4
 > unlist(as.list(irep(1:4, times=2, each=3)))
  [1] 1 1 1 2 2 2 3 3 3 4 4 4

IIRC, if each is specified in irep, the times argument is ignored.
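
The target semantics can be sketched lazily in Python: repeat each element `each` times, then repeat that whole pass `times` times. The function name rep is hypothetical, and the sketch covers only scalar times and each, not the vector forms that base::rep() also accepts.

```python
from itertools import chain, repeat


def rep(values, times=1, each=1):
    """Hypothetical lazy analogue of base::rep for scalar times and each."""
    values = list(values)

    def one_pass():
        # Every element repeated `each` times, in order.
        return chain.from_iterable(repeat(v, each) for v in values)

    # The whole pass repeated `times` times.
    return chain.from_iterable(one_pass() for _ in range(times))


result = list(rep([1, 2, 3, 4], times=2, each=3))
print(result)
```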
