

PavlosMelissinos commented on July 17, 2024

First of all, thanks for the interest and, yup, all that sounds great.

I wasn't sure about the license, to be honest I didn't really expect to see any activity. :P
It should be ok now, I just changed it to MIT.

Anyway, I think the project needs some work before it is merged with keras-contrib. For instance:

  1. The data loader (and therefore training in general) is CPU-bound because it runs on a single thread/process.
  2. The dataset definition, e.g. mscoco.py, should be a class, not a bunch of functions like it is now.
  3. Right now the autoencoder is pretty weak; I'm getting about 1.5% mAP@0.5:0.95 on the MS-COCO validation set, so it definitely needs some engineering before it's decent.
  4. The aspect ratio is distorted because I naively resize the images without preserving it (and the default cv2.resize interpolation doesn't work for label masks, so I had to use "nearest neighbor" mode, which results in pixelated masks).
  5. I generally don't like that I'm using OpenCV. It's probably overkill in this case and the BGR thing causes a lot of confusion.
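As a rough illustration of item 4, here is a minimal numpy sketch (function names hypothetical, not from the repo) of an aspect-preserving resize that pads instead of stretching, using nearest-neighbor lookup so label masks keep valid class ids:

```python
import numpy as np

def resize_nearest(mask, out_h, out_w):
    # Nearest-neighbor lookup: class ids are copied, never interpolated,
    # so no invalid "in-between" labels appear (the cv2.resize issue above).
    in_h, in_w = mask.shape[:2]
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return mask[rows[:, None], cols]

def pad_to_square(mask, size, fill=0):
    # Scale to fit inside a size x size canvas, then pad with `fill`,
    # so the aspect ratio is preserved instead of distorted.
    in_h, in_w = mask.shape[:2]
    scale = min(size / in_h, size / in_w)
    new_h, new_w = max(1, int(in_h * scale)), max(1, int(in_w * scale))
    resized = resize_nearest(mask, new_h, new_w)
    canvas = np.full((size, size), fill, dtype=mask.dtype)
    top, left = (size - new_h) // 2, (size - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas
```

The same idea works for images by using a proper interpolation for the pixel values while keeping nearest-neighbor for the masks.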

Maybe it would be more appropriate if the actual model definition (which should be ready) were split from the data part and the PRs were submitted separately?

from enet-keras.

PavlosMelissinos commented on July 17, 2024

I really didn't expect this, so please don't work on the project before the original authors confirm it's ok to do so (I should have asked them much earlier). The main problem is I'm not sure whether this is considered derivative work, and the original code is under a Creative Commons license (non-commercial).

EDIT: I've removed the LICENSE file temporarily until this clears up. Sorry for the confusion.


ahundt commented on July 17, 2024

Fair enough. I'm actually looking both at this model and at training others on COCO. It looks like the upstream repository you link doesn't train on COCO, so hopefully that bit is okay?

1. & 4. Keras-FCN addresses the threading in item 1 and the distortion in item 4. I have a fork at https://github.com/ahundt/Keras-FCN/tree/densenet_atrous, and again there is relevant discussion in keras-team/keras-contrib#63. However, some things there would need to be done a bit differently to address my post in #2.

2. isn't that bad but yeah perhaps a class would be useful, though none of the keras datasets seem to use one. If you're making changes anyway you may want to consider simply integrating directly with Keras-FCN.

3. may be in part due to what I posted in #2, where categories write over each other since a single pixel can have multiple categories in coco. I'm dealing with similar mAP issues in keras-team/keras-contrib#63, but on the pascal voc dataset. It looks like you've started on a solution in yield_image, right?

5. Instead of OpenCV, I've generally seen Pillow as the easy-to-use alternative, which is already imported by keras. Another alternative I've seen in some cases is https://github.com/scikit-image/scikit-image. I agree OpenCV can really make life difficult, especially when it comes to installation and BGR ordering.
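As a tiny illustration of the BGR pitfall (pure numpy; the `bgr_to_rgb` helper is hypothetical, not from either repo):

```python
import numpy as np

def bgr_to_rgb(img):
    # Reverse the channel axis of an HxWx3 array; works in both directions.
    return img[..., ::-1]

# OpenCV's imread returns BGR while Pillow/matplotlib/keras expect RGB,
# so forgetting the flip silently swaps red and blue.
# Pillow sidesteps the issue entirely, e.g.:
#   from PIL import Image
#   rgb = np.asarray(Image.open('img.jpg').convert('RGB'))
```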


PavlosMelissinos commented on July 17, 2024

Yes, the only thing that's slightly dangerous is the model directory (in particular encoder.py and decoder.py), so we can safely discuss and modify everything else.

1 & 4. I will take a look at your links then. I'm aware of other multithreaded data loaders as well (one I saw on Twitter, and PyTorch has a really nice one, but I'm not sure it's easy to isolate from the rest of the codebase), so I might open a new issue and explore the alternatives.
2. Indeed, encapsulation keeps public code clean whereas inheritance allows for better code reuse. This part needs some polishing either way.
3. Check #2 .
5. BGR, the inverted shape convention ((w, h) vs numpy's (h, w)), and installation issues, especially with conda. OpenCV is just too quirky... You're right about the alternatives, and I will gradually phase OpenCV out as soon as I make sure most of the necessary features are supported by either Pillow or skimage.

Thanks for your contribution, I really appreciate it :)


PavlosMelissinos commented on July 17, 2024

I pushed some more work and also decided to add the MIT license after all. You can check it out if you want.


ahundt commented on July 17, 2024

Cool, looks nice! I think your generator mechanism is much better than what I had been doing. I also saw your contribution to Keras-FCN, which was great!

Oh did you happen to see my dataset pull requests for keras-contrib?

Maybe there is a good way to incorporate your most recent changes there. I think my download code is still a bit more automated than what you've got, but your COCO generators are definitely a huge step up; my version that writes to .npy files takes up a couple TB and is thus super slow! Your version looks like it is all in memory, so it probably runs 100x faster, haha.

There was also a new paper just published that's "realtime" like enet but much more accurate. https://arxiv.org/abs/1704.08545v1


ahundt commented on July 17, 2024

No data augmentation at the moment, I assume? Sometimes it isn't necessary if the dataset is sufficiently large anyway; perhaps COCO is big enough. With this update, have you been able to match the ENet paper's results?


PavlosMelissinos commented on July 17, 2024


ahundt commented on July 17, 2024

Good points! I also have an update: François Chollet, Keras' author, said he is interested in directly incorporating dense prediction/FCN into the Keras API, so I'm seeking suggestions/feedback at keras-team/keras#6538


PavlosMelissinos commented on July 17, 2024

Great news, keras needs that. Segmentation and captioning are not shown enough love. I think some parts need to be cleared up, but I will have to take a better look at your dataset and preprocessing standardization proposals to see if and how we can combine our work. That issue is a good start, I believe.


ahundt commented on July 17, 2024

I looked through your updates and it is looking great!

It would be cool to create a pull request to incorporate some of your tools that are already set up well into Keras when it is ready. I could also do the pull request legwork by reorganizing a simple copy/paste commit you make into a keras branch, for credit purposes.

I think most of utils.py could be in a pull request into keras/preprocessing/image.py.

data_loader.py looks like a more reusable alternative to ImageDataGenerator/SegDataGenerator. Is it in progress, and were you planning on additional augmentation options?

What do you think about a more general generate_samples_from_disk, which is a modified collect_image_files_from_disk?

Here is my idea for the generate_samples_from_disk API:

def generate_samples_from_disk(sample_sets, callbacks=load_image, sample_size=None, data_dirs=None):
    """Generate numpy arrays from files on disk in groups, such as single images or pairs of images.
    :param sample_sets: A list of lists, each containing the data's filenames, such as [['img1.jpg', 'img2.jpg'], ['label1.png', 'label2.png']].
        Also supports a list of txt files, each containing the list of filenames in each set, such as ['images.txt', 'labels.txt'].
        If None, all images in the folders specified in data_dirs are loaded in lexicographic order.
    :param callbacks: A callback that loads data from the specified file path into a numpy array, `load_image` by default.
        Either a single callback should be specified, or one callback must be provided per sample set (same length as sample_sets).
    :param data_dirs: Directory or list of directories to load.
        Default None means each entry in sample_sets contains the full path to each file.
        Specifying a single directory means the filenames in sample_sets can be found in that directory.
        Specifying a list of directories means each sample set is in its own separate directory (same length as sample_sets).
    :param sample_size: 
    :return: 
    """

To do that, I believe the Python unpack mechanism would be the thing to use; otherwise the implementation shouldn't be too complicated.
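A minimal sketch of that unpacking idea (the function name and the toy `str.upper`/`str.lower` "loaders" are stand-ins, not the real `load_image`):

```python
def generate_samples_from_sets(sample_sets, callbacks):
    # Sketch only: iterate the sample sets in lockstep and apply one
    # loader callback per set. zip(*sample_sets) is the Python "unpack"
    # mechanism mentioned above.
    if callable(callbacks):
        callbacks = [callbacks] * len(sample_sets)  # broadcast one callback
    for group in zip(*sample_sets):  # e.g. (image_path, label_path)
        yield tuple(cb(path) for cb, path in zip(callbacks, group))

# Toy usage with stand-in "loaders" instead of real image files:
pairs = list(generate_samples_from_sets(
    [['img1', 'img2'], ['lbl1', 'lbl2']],
    [str.upper, str.lower]))
# pairs == [('IMG1', 'lbl1'), ('IMG2', 'lbl2')]
```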


PavlosMelissinos commented on July 17, 2024

data_loader.py looks like a more reusable alternative to ImageDataGenerator/SegDataGenerator. Is it in progress ...?

I don't plan to add extra features but there are pieces here and there that need to be fixed, some gaps that I need to cover.

For instance, one problem with data_loader.py is that loading from disk is incomplete (might not work as expected currently). I haven't worked on it recently but the plan is to turn it into a separate dataset class, like MSCOCO currently is ('json' mode works fine), so that they have the same functionality.

But first I need to clean up datasets.py a bit and decide which methods should be exposed. Any feedback is welcome of course! For instance, I don't like the way the conversion between id/cid/category is done nor that I had to create a new dataset in order to use a subset of MSCOCO; MSCOCOReduced has a bug which crashes the program during training.

were you planning on additional augmentation options?

That's not in my immediate plans, I'm afraid. I'm currently trying out evaluation and fixing stuff around prediction, which will take some time, and then I'll get to making everything else robust. I liked your suggestion on the other thread though, about using a functional-like API for transformations. As soon as the samples are in the correct form (e.g. a numpy array), a chain of transformations would be very handy to have. However, parallel computation requires that the generator be thread-safe, which means it should be wrapped in something like this:

import threading

class threadsafe_iter:
    """Takes an iterator/generator and makes it thread-safe by
    serializing calls to the `next` method of the given iterator/generator.
    """
    def __init__(self, it):
        self.it = it
        self.lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may draw from the underlying iterator.
        with self.lock:
            return next(self.it)

    next = __next__  # Python 2 compatibility

taken from here
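The same pattern is often packaged as a decorator so that every generator a function returns comes out already locked; a self-contained sketch (names hypothetical):

```python
import threading

def threadsafe_generator(gen_func):
    # Decorator variant of the wrapper above: next() calls on the shared
    # generator are serialized with a lock, so several worker threads can
    # safely draw from it (e.g. fit_generator with multiple workers).
    def wrapped(*args, **kwargs):
        it = gen_func(*args, **kwargs)
        lock = threading.Lock()

        def locked():
            while True:
                with lock:  # only the draw from `it` is serialized
                    try:
                        value = next(it)
                    except StopIteration:
                        return
                yield value
        return locked()
    return wrapped

@threadsafe_generator
def count_batches(n):
    # Stand-in for a real data generator.
    for i in range(n):
        yield i
```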

What do you think about a more general generate_samples_from_disk, which is a modified collect_image_files_from_disk?

Here is my idea for the generate_samples_from_disk API:

LGTM (I always wanted to say that 😛).


All in all, I have no unit tests in place, so I'm really not confident about code correctness. I've been changing stuff all the time within functions until recently, and sometimes it was getting really difficult to debug. I know unit testing will pay off in the long term, but it seems so boring that I prefer to procrastinate 😞... Anyway, that's the main reason I'm not submitting any pull requests to other projects at the moment.

I have cleaned up cropping a little bit by making label an optional parameter. That could maybe be a PR to Keras-FCN.

Furthermore, this repository is getting too big compared to what it was initially created for; it's not just about enet anymore. Do you think I should a) split it up, b) rename the repo to something more general like keras-segmentation or c) keep it as it is and just move a couple of the modules that are the most ready to a more complete, third-party repository? I'm in favor of b personally (and then build towards c if it's possible). The goal is always to submit to keras-contrib (and keras in the longer term) but are you aware of other projects that are bigger than this repo and could make use of parts of the code now?

Thanks again for the interest, it's a great motivator for me against slacking off. I really didn't expect it to be so effective!


ahundt commented on July 17, 2024

Here is my idea for the generate_samples_from_disk API:
LGTM (I always wanted to say that 😛).

Cool I'll post that API proposal on the keras dense prediction issue.

But first I need to clean up datasets.py a bit and decide which methods should be exposed. Any feedback is welcome of course!

I recall some of the function names weren't very clear like load_data and load_dataset, and it is a bit tough to figure out the order of things and how they would be chained. Looking through the code helped a bit.

The same is kind of true for the MSCOCO dataset, it is hard to tell the order you would call the functions, and why you would call each one. I guess that's always one of the difficulties with generator/async/functional style APIs.

  • What's the difference between raw and combined sample generators?
  • What if the top level version was renamed COCO_JSON?
  • A good test of the MSCOCO dataset API for extensibility would be loading cocostuff.
  • Download and loading may also be made optional in the call to __init__ with a COCOJSON(dataset='COCO'), like keras models.
  • Instead of manually creating a reduced class, what if you just pass a category list to init if you want a subset? default would be full set.
  • What kind of image rate are you getting now? 5-10 per second or something?

Do you think I should a) split it up, b) rename the repo to something more general like keras-segmentation or c) keep it as it is and just move a couple of the modules that are the most ready to a more complete, third-party repository? I'm in favor of b personally (and then build towards c if it's possible).

I'd say C, that's the decision I made when I started contributing directly to Keras-FCN. Could do Keras-FCN or a pull request for upstream keras. Keras-FCN is now pretty simple for that purpose, I'm now a collaborator. Probably wouldn't be too hard to add the enet model either! B is okay as a backup. :-)


PavlosMelissinos commented on July 17, 2024

I recall some of the function names weren't very clear like load_data and load_dataset, and it is a bit tough to figure out the order of things and how they would be chained. Looking through the code helped a bit.

The same is kind of true for the MSCOCO dataset, it is hard to tell the order you would call the functions, and why you would call each one. I guess that's always one of the difficulties with generator/async/functional style APIs.

You're right.

  • What's the difference between raw and combined sample generators?

Raw sample generator yields the full image containing a single annotation (no merging). I'm tempted to keep it but it really is of no use right now. I thought it would be cleaner to delegate merging to a higher level at a later step but I'm not so sure now.

Combined sample generator merges all annotations in an image and yields a single image/label pair per image, it's the whole scene. Needs some refactoring so as to reuse the lower level generators.

Haha, naming is not my strong suit :/
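For illustration, the merging step of the combined generator might look roughly like this (hypothetical function, plain numpy; note that later annotations overwrite earlier ones where they overlap, which is the pixel-overlap issue from #2):

```python
import numpy as np

def combine_annotation_masks(masks, category_ids):
    # Merge per-annotation binary masks (as yielded one-by-one by the raw
    # generator) into a single label image for the whole scene.
    combined = np.zeros(masks[0].shape, dtype=np.int32)  # 0 = background
    for mask, cid in zip(masks, category_ids):
        combined[mask > 0] = cid  # later annotations win on overlap
    return combined
```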

  • What if the top level version was renamed COCO_JSON?

As an answer to what problem?

I was thinking about splitting each dataset into a static dataset config (e.g. a json/dictionary) and a dataset loader. MS-COCO still has the same classes and static properties no matter how annotations are stored: image files, the original json, numpy arrays, pandas dataframes or a database.
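A rough sketch of that split (all names and values here are hypothetical stand-ins, except that MS-COCO has 80 categories plus background):

```python
# Static facts about the dataset live in a plain config...
MSCOCO_CONFIG = {
    'name': 'mscoco',
    'num_classes': 81,   # 80 categories + background
    'palette': None,     # could hold a color map, category ids, etc.
}

# ...and a loader is built from the config plus a storage backend, so
# json/numpy/dataframe/database loaders can all share one config.
class DatasetLoader:
    def __init__(self, config, backend='json'):
        self.config = dict(config)  # static properties, never mutated
        self.backend = backend      # how annotations are actually stored

    @property
    def num_classes(self):
        return self.config['num_classes']
```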

  • Download and loading may also be made optional in the call to __init__ with a COCOJSON(dataset='COCO'), like keras models.
  • A good test of the MSCOCO dataset API for extensibility would be loading cocostuff.

Good ideas.

  • Instead of manually creating a reduced class, what if you just pass a category list to __init__ if you want a subset? The default would be the full set.

I agree. I had to modify the constructor, and I chose to play it safe by not merging it into the proper MS-COCO class and risking messing it up.
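A sketch of that constructor change (toy category list and class body, not the repo's actual MSCOCO class):

```python
class MSCOCO:
    # Toy stand-in for the real category list.
    ALL_CATEGORIES = ('person', 'car', 'dog', 'cat')

    def __init__(self, categories=None):
        # None means the full set, so MSCOCO() keeps the current behaviour
        # and MSCOCOReduced becomes unnecessary.
        self.categories = tuple(categories) if categories else self.ALL_CATEGORIES
        unknown = set(self.categories) - set(self.ALL_CATEGORIES)
        if unknown:
            raise ValueError('unknown categories: %s' % sorted(unknown))
```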

  • What kind of image rate are you getting now? 5-10 per second or something?

Yes, something like that: about a second per 4x512x512xdim batch, or slightly faster at 8x256x256xdim, which sucks because ENet is supposed to process more than 40 samples per second at this spatial size.

I'd say C, that's the decision I made when I started contributing directly to Keras-FCN. Could do Keras-FCN or a pull request for upstream keras. Keras-FCN is now pretty simple for that purpose, I'm now a collaborator. Probably wouldn't be too hard to add the enet model either! B is okay as a backup. :-)

That's a good plan. Does Keras-FCN have plans to merge with keras-contrib and/or keras?

Would you like to submit some of these suggestions as issues? Otherwise I can do it myself tomorrow.


ahundt commented on July 17, 2024

What kind of image rate are you getting now? 5-10 per second or something?
Yes, something like that, about a second at 4x512x512xdim or slightly faster at 8x256x256xdim. Which sucks because ENet is supposed to get more than 40 samples per second at this spatial size.

Threaded data loading would probably make a big difference. I've also noticed keras often seems much slower than tensorflow. If it is a very high priority for you, see the following:

Otherwise giving it time may let some aspects of the problem resolve themselves as the underlying tools mature. :-)

What if the top level version was renamed COCO_JSON?
As an answer to what problem?

Distinguishing between:

  1. COCO aka MSCOCO the dataset
  2. COCO JSON the annotation format loaded by pycocotools the API

COCO JSON can be used to load a bunch of datasets, so it would make sense for either 'COCO' or 'MSCOCO' to be that new constructor param I mentioned.

That's a good plan. Does Keras-FCN have plans to merge with keras-contrib and/or keras?

I'm planning to do that. :-)
Original author said it is okay to do so + it is MIT licensed.

Would you like to submit some of these suggestions as issues? Otherwise I can do it myself tomorrow.

Yes, but should I create the issues in Keras-FCN or enet-keras? Up to you.


PavlosMelissinos commented on July 17, 2024

Otherwise giving it time may let some aspects of the problem resolve themselves as the underlying tools mature. :-)

Haha, indeed, gives a new meaning to lazy programming. :p Jokes aside, the FLOSS community is awesome!

Distinguishing between:

  1. COCO aka MSCOCO the dataset
  2. COCO JSON the annotation format loaded by pycocotools the API

COCO JSON can be used to load a bunch of datasets, so it would make sense for either 'COCO' or 'MSCOCO' to be that new constructor param I mentioned.

Ah I see. Yes, I've been considering that and seems like a good idea.

I'm planning to do that. :-)
Original author said it is okay to do so + it is MIT licensed.

Nice!

Yes, but should I create the issues in Keras-FCN or enet-keras? Up to you.

OK, Keras-FCN is too different from this repository in some aspects, and both need to change, so I'd rather not start working on a merge and then find out that it has to change again. When we decide on a preprocessing API design for keras, I will make a PR to Keras-FCN with my parts so that we can merge the features and then keep working from there. Sound good?

I believe suggestions that explicitly have to do with the enet-keras implementation (e.g. dataset related) should stay here for now.


PavlosMelissinos commented on July 17, 2024

Also:

Threaded data loading would probably make a big difference

I'm not sure about that; GPU utilization (on a single GTX 1070) is usually around 80% and rarely falls below 50% according to nvidia-smi. It would be great for multi-GPU/distributed environments, however.

