
Comments (7)

NivekT commented on August 18, 2024

From a purely naming perspective, I think FileOpener is more accurate but FileLoader isn't wildly misleading either.

I had a look at numpy.load, json.load, pickle.load, these seem to either read or parse through files that have already been opened and return some structured data. In that sense, FileLoader definitely behaves differently relative to other modules. Renaming it is likely better.
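To make the contrast concrete, here is a minimal sketch using only the standard library. `json.load` and `pickle.load` take a file object that is already open and return parsed data, while the behavior under discussion only opens files and hands back streams. The `open_files` helper is a hypothetical stand-in written for illustration; it is not the actual datapipe implementation.

```python
import io
import json
import os
import pickle
import tempfile

# json.load and pickle.load receive an already-opened file object
# and return structured data parsed from it.
text_stream = io.StringIO('{"a": 1}')
parsed_json = json.load(text_stream)

binary_stream = io.BytesIO(pickle.dumps([1, 2, 3]))
parsed_pickle = pickle.load(binary_stream)


def open_files(paths, mode="rb"):
    """Hypothetical helper: only opens each path and yields (path, stream).

    No parsing happens here, which is why "opener" describes it better
    than "loader".
    """
    for path in paths:
        yield path, open(path, mode)


# Parsing stays a separate step applied to the opened stream.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.json")
    with open(sample_path, "w") as handle:
        handle.write('{"a": 1}')

    for path, stream in open_files([sample_path], mode="r"):
        with stream:
            parsed_from_opened = json.load(stream)
```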

Perhaps we can rename it, but still leave FileLoader functional with a deprecation warning? I would imagine most people who started using IterDataPipe in PyTorch Core would be using FileLoader and it would be a BC-breaking change.
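A rough sketch of what that could look like, assuming the old name is kept as a thin subclass that warns on construction. Both classes below are simplified, hypothetical stand-ins and not the real datapipe code; only the `warnings` usage is standard-library behavior.

```python
import warnings


class FileOpener:
    """Hypothetical stand-in for the renamed datapipe: yields (path, stream)."""

    def __init__(self, paths, mode="rb"):
        self.paths = list(paths)
        self.mode = mode

    def __iter__(self):
        for path in self.paths:
            yield path, open(path, self.mode)


class FileLoader(FileOpener):
    """Old name kept functional; it warns and then defers to FileOpener."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "FileLoader is deprecated; use FileOpener instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
        super().__init__(*args, **kwargs)
```

Existing code that constructs `FileLoader` would keep working and see a `DeprecationWarning`, at the cost of maintaining the alias until it is removed.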

from data.

ejguan commented on August 18, 2024

I had a look at numpy.load, json.load, pickle.load, these seem to either read or parse through files that have already been opened and return some structured data. In that sense, FileLoader definitely behaves differently than other modules. Renaming it is likely better.

Thanks for digging into it. You are right. The most concerning part is the functional API, since we currently use load_file_... for each FileLoader. I don't want users to complain about inconsistent behavior across libraries.

Perhaps we can rename it, but still leave FileLoader functional with deprecation warning? I would imagine most people who started using IterDataPipe in PyTorch Core would be using FileLoader and it would be a BC-breaking change.

Thank you for pointing that out. If we plan to make the move, we should definitely add a deprecation warning, at least before the official release.


ejguan commented on August 18, 2024

I also want to gather some insights from the domain libraries, since this is going to be BC-breaking.
cc: @pmeier @Nayef211


Nayef211 commented on August 18, 2024

Also want to gather some insights from domains since this is going to be BC breaking. cc: @pmeier @Nayef211

I agree that FileLoader sounds misleading and that FileOpener would be a better name for what the datapipe is actually doing. I also think adding a deprecation warning is a good idea, so that users have time to migrate to the new datapipe.


pmeier commented on August 18, 2024

+1 for FileOpener.

Not sure about the deprecation warning though. I mean torchdata is not even "released" yet and is also clearly labeled as prototype. Of course you can go for it, but that also ups the maintenance burden. For torchvision it is fine to ping me on PRs that break BC so I can land a fix quickly.


ejguan commented on August 18, 2024

I agree with @pmeier, mainly because we didn't mention these DataPipes and functionalities in the PyTorch release. Based on the prototyping policy, we should be able to switch the name directly and keep users from continuing to use a deprecated feature or name during the prototyping phase.

Another way, as I mentioned, is to add a deprecation warning before our official release and then clean it up during our branch cut. That would add to our burden of maintaining the repo.

I would prefer option 1 (switching the name directly).


NivekT commented on August 18, 2024

I agree that based on the prototyping policy we should be able to rename as we wish. I think the policy is very clear for things that exist in TorchData. Do you think that policy is clear to users for things that are in PyTorch Core? If so, then we can rename it without deprecation.

