Comments (4)
There are two questions here:
Why do some data modules pass batch size to the data loader while others pass batch size to RandomNCrop?
This is by design. Many datasets, such as GID-15, contain very large image scenes (7k x 7k px). These are too large to pass directly to a model and must first be cropped. If the user wants a batch size of 64 and a patch size of 64, then instead of loading 64 of these images (which may exceed memory) and cropping each to a single small patch (64 x 64 px), we load a single image and take 64 random crops from it. This was done during The Great Data Module Refactor of 2023 (#992). I don't particularly love this solution, but the alternative was for us to add more parameters/complexity.
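To make the idea concrete, here is a minimal NumPy sketch of "load one scene, take N random crops". It is not torchgeo's actual implementation: the function name `random_n_crop`, the (H, W, C) layout, and the 3-band uint8 scene are all assumptions made for illustration.

```python
import numpy as np


def random_n_crop(
    scene: np.ndarray, size: int, n: int, rng: np.random.Generator
) -> np.ndarray:
    """Take n random (size x size) crops from a single (H, W, C) scene.

    Hypothetical helper for illustration only, not a torchgeo API.
    """
    height, width = scene.shape[:2]
    if size > height or size > width:
        raise ValueError("crop size exceeds scene size")
    # Sample the top-left corner of each crop independently.
    # The upper bound is exclusive, so +1 lets a crop touch the far edge.
    tops = rng.integers(0, height - size + 1, size=n)
    lefts = rng.integers(0, width - size + 1, size=n)
    crops = [scene[t : t + size, l : l + size] for t, l in zip(tops, lefts)]
    return np.stack(crops)  # shape: (n, size, size, C)


rng = np.random.default_rng(0)
# Small stand-in for a large scene; a real one would be around 7k x 7k px.
scene = rng.integers(0, 256, size=(1024, 1024, 3), dtype=np.uint8)
batch = random_n_crop(scene, size=64, n=64, rng=rng)
print(batch.shape)

# Rough memory arithmetic, assuming 3-band uint8 scenes of 7000 x 7000 px.
scene_bytes = 7000 * 7000 * 3
print(64 * scene_bytes / 1e9)  # GB needed to hold 64 full scenes at once
print(batch.nbytes / 1e6)  # MB needed for the 64 crops the model sees
```

Under those assumptions, holding 64 full scenes would take roughly 9.4 GB, while the 64 crops that actually reach the model take under 1 MB. That is why the data loader batch size effectively becomes 1 scene and the crop transform produces the batch.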
Why do OSCD and LEVIR-CD+ specifically do this?
This I'm not sure about. Honestly, these images aren't that big, so we could probably use the normal scheme (batch size passed to the data loader). I would be open to reviewing which datasets use RandomNCrop and trying to phase it out or replace it with something better. This is also something we plan to upstream to Kornia eventually.
from torchgeo.
Can we close this issue, or do we want to take action on these data modules for small-ish images?
Thanks for taking the time to explain the reasoning - this will be useful for anyone else digging into these two datasets.
I think we can close this and, as you suggest, create an issue to review the use of RandomNCrop. I half remember that a more systematic approach is now possible?
I'm unaware of a more systematic approach.