ymy-k / hi-sam



[arXiv preprint] Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation

License: Apache License 2.0

Python 100.00%

hierarchical-text-segmentation segment-anything high-quality-text-stroke-segmentation sam segment-anything-model

hi-sam's People

binshi-bing mattcintron targgget gogiants1 hajungong007 dlml tomoyukun ali-fayzi dlove1204 jeonghosuh

hi-sam's Issues

Polygon3 error

I am getting this error while building the Polygon3 wheel. I am running Python 3.8 on Colab. Can someone help?

Unable to export the model to ONNX / torch trace format

[Question] Line Segmentation Training on Custom Dataset

Hi,

I'm really excited that the training code is now released. What would be the best way to train Hi-SAM for line segmentation on custom datasets? My first intuition would be to convert the custom dataset into the format of the datasets provided for line segmentation. However, the line segmentation model was trained on CTW1500, which is not covered in the data preparation guide. Any ideas on how to train Hi-SAM for line segmentation on a custom dataset?

Thanks in advance.

When will the training code be released?

Thanks for your excellent work! When will the training code be released?

Cool Work!

How is output token in Fig3 obtained?

Thanks for your splendid work, but I didn't see any description of how to obtain the output token before the S-Decoder and the output tokens before the H-Decoder. The paper says "Let $t_{s\_out} \in \mathbb{R}^{1 \times 256}$ denote the inherited output token, which is the first slice of SAM's output tokens," but this is still a little ambiguous.
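To make the question concrete, here is one possible reading of the quoted sentence, sketched in plain Python. It assumes SAM's output tokens can be viewed as an N x 256 array and that "first slice" means the token at index 0 with its leading axis kept. The token count, the values, and the names `output_tokens` and `first_slice` are hypothetical and not taken from the Hi-SAM code.

```python
# Hypothetical stand-in for SAM's output tokens: NUM_TOKENS rows of width DIM.
# NUM_TOKENS = 4 is an assumption for illustration, not a value from the paper.
NUM_TOKENS, DIM = 4, 256
output_tokens = [[float(i)] * DIM for i in range(NUM_TOKENS)]


def first_slice(tokens):
    """Keep only the first token while preserving the leading axis (1 x DIM)."""
    return tokens[0:1]


# Under this reading, the inherited token is the row at index 0.
t_s_out = first_slice(output_tokens)
shape = (len(t_s_out), len(t_s_out[0]))
```

Whether the paper means index 0 of the concatenated tokens or the first mask token specifically is exactly the ambiguity being asked about.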

File "vit_h_maskdecoder.pth" is missing

I tried to run the Hi-SAM Visualization Demo of hierarchical segmentation using the following command

python demo_hisam.py --checkpoint pretrained_checkpoint/hi_sam_l.pth --model-type vit_l --input demo/img293.jpg --output demo/ --hier_det

but got the error
FileNotFoundError: [Errno 2] No such file or directory: 'pretrained_checkpoint/vit_h_maskdecoder.pth'

I did not find any reference to this file in the README. Do you provide this file for download anywhere?
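For anyone hitting the same error, a small pre-flight check can list which checkpoint files are absent before the demo is launched. This is only a sketch: the two paths are taken from the command and the error message above, the helper name `missing_checkpoints` is hypothetical, and the demo may require other files as well.

```python
from pathlib import Path


def missing_checkpoints(paths):
    """Return the paths that do not exist as regular files, in input order."""
    return [str(p) for p in map(Path, paths) if not p.is_file()]


if __name__ == "__main__":
    # Paths copied from the command and the FileNotFoundError above;
    # the full set of files the demo needs is an assumption.
    required = [
        "pretrained_checkpoint/hi_sam_l.pth",
        "pretrained_checkpoint/vit_h_maskdecoder.pth",
    ]
    for path in missing_checkpoints(required):
        print(f"missing checkpoint: {path}")
```

Running it from the repository root prints one line per missing file and nothing if all files are present.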

Poor Paragraph segmentation quality

Hi, thanks for sharing your work.

I found that your demo_hisam.py produces poor-quality results.
I ran demo_hisam.py following your instructions:

python demo_hisam.py --checkpoint pretrained_checkpoint/hi_sam_l.pth --model-type vit_l --input demo/2e0cb33320757201.jpg --output demo/ --hier_det

This is how I ran your code, and I got the result below.
(This is even with the pretrained weights trained on HierText, which contains the input image.)

Could you tell me whether I made a mistake in using the demo?

What about the performance on Chinese text?

Thanks for your excellent work. Is the model effective for Chinese text?

Poor textline detection quality.

Hi, first of all, nice job.
I found that the quality of the text line detection model is poor. To be more precise, many lines are simply not segmented.

To reproduce:
python demo_text_detection.py --checkpoint pretrained_checkpoint/line_detection_ctw1500.pth --model-type vit_h --input demo/1.jpg --output demo/ --dataset ctw1500
