
comfyui_segment_anything's Introduction

ComfyUI Segment Anything

This project is a ComfyUI version of https://github.com/continue-revolution/sd-webui-segment-anything. At present, only the core functionalities have been implemented. I would like to express my gratitude to continue-revolution for the preceding work on which this is based.

[example image]

I have ensured consistency with sd-webui-segment-anything in terms of output when given the same input.

Requirements

Please ensure that you have installed Python dependencies using the following command:

pip3 install -r requirements.txt

Models

The models will be automatically downloaded when used. You can also manually download them according to the table below. If the automatic download is slow, you can set the HTTP_PROXY and HTTPS_PROXY environment variables to use a proxy.
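If you prefer setting the proxy from Python (for example at the top of a small launcher script) rather than exporting it in the shell, a minimal sketch, with a placeholder proxy address:

  import os

  # placeholder address; must be set before any download is triggered
  os.environ["HTTP_PROXY"] = "http://127.0.0.1:7890"
  os.environ["HTTPS_PROXY"] = "http://127.0.0.1:7890"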

bert-base-uncased

You can download the model from https://huggingface.co/bert-base-uncased/tree/main into the models/bert-base-uncased folder located in the root directory of ComfyUI, like this:

ComfyUI
    models
        bert-base-uncased
            config.json
            model.safetensors
            tokenizer_config.json
            tokenizer.json
            vocab.txt

You can also skip this step. During the inference process, bert-base-uncased will be automatically downloaded through the transformers library, and its directory is typically ~/.cache/huggingface/hub/models--bert-base-uncased.
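To verify that the local copy is the one being picked up, a minimal sketch using the transformers API (the relative path assumes you run it from the ComfyUI root directory):

  from transformers import AutoTokenizer

  # local_files_only makes transformers fail loudly instead of silently re-downloading
  tokenizer = AutoTokenizer.from_pretrained("models/bert-base-uncased", local_files_only=True)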

GroundingDino

Please directly download the models and configuration files to the models/grounding-dino directory under the ComfyUI root directory, without modifying the file names.

name                      size    config file      model file
GroundingDINO_SwinT_OGC   694MB   download link    download link
GroundingDINO_SwinB       938MB   download link    download link

SAM

Please directly download the model files to the models/sams directory under the ComfyUI root directory, without modifying the file names.

name           size     model file
sam_vit_h      2.56GB   download link
sam_vit_l      1.25GB   download link
sam_vit_b      375MB    download link
sam_hq_vit_h   2.57GB   download link
sam_hq_vit_l   1.25GB   download link
sam_hq_vit_b   379MB    download link
mobile_sam     39MB     download link

Contribution

Thank you for considering helping out with the source code! We welcome contributions from anyone on the internet and are grateful for even the smallest of fixes!

If you'd like to contribute to this project, please fork, fix, commit and send a pull request for me to review and merge into the main code base.

comfyui_segment_anything's People

Contributors

allinws, anson2048, antoinedelplace, dnl13, frantic, guilhermep, storyicon


comfyui_segment_anything's Issues

ModuleNotFoundError: No module named 'timm'

Traceback (most recent call last):
File "D:\ComfyUI\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI\nodes.py", line 1798, in load_custom_node
module_spec.loader.exec_module(module)
File "", line 883, in exec_module
File "", line 241, in call_with_frames_removed
File "D:\ComfyUI\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI\custom_nodes\comfyui_segment_anything_init
.py", line 1, in
from .node import *
File "D:\ComfyUI\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 17, in
from sam_hq.build_sam_hq import sam_model_registry
File "D:\ComfyUI\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\build_sam_hq.py", line 13, in
from .modeling.tiny_vit import TinyViT
File "D:\ComfyUI\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\modeling\tiny_vit.py", line 15, in
from timm.models.layers import DropPath as TimmDropPath,
ModuleNotFoundError: No module named 'timm'

Cannot import D:\ComfyUI\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI\custom_nodes\comfyui_segment_anything module for custom nodes: No module named 'timm'

GroundingDinoModelLoader (segment anything) ERROR

ERROR:root:!!! Exception during processing !!!
ERROR:root:Traceback (most recent call last):
File "C:\ComfyUI_windows_portable\python_embeded\lib\site-packages\urllib3\util\retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /bert-base-uncased/resolve/main/tf_model.h5 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x0000013742FF3340>, 'Connection to huggingface.co timed out. (connect timeout=10)'))
I downloaded tf_model.h5 manually, but I don't know where to put it.

[Feature Request] - GroundingDINOSamSegment node output different masks/segs for each identified object

I'm not even sure if this is possible, but perhaps this is something @dnl13 might be interested in implementing in his fork.

I would love to see a variant of this node with multiple mask outputs, each one spitting out a different identified object.

For example, let's say that, given the image of a face as input, I ask DINO to identify "mouth", "eyes", and "nose". Rather than having a single mask output that unifies these three otherwise disjoint elements, the node would have three outputs, one dedicated to the mouth mask, one to the eyes mask, and one to the nose mask.

This would allow further manipulation of the three objects in completely different ways. For example, I might decide that I want to re-render the nose as a painting, while the mouth is a photograph, and the eyes are a line art.

Eventually, these three masks would have to recombine in a single image. Not sure about that part, yet, but having the three masks is a starting point.

With this approach we could create multi-layered images, achieving a level of creativity that is not easy to obtain today.

Now, I realise that ComfyUI doesn't allow the creation of dynamic outputs, so this hypothetical node couldn't have as many outputs as defined by the user prompt. But I would be happy to see 4 mask outputs to give users a degree of flexibility. If the user specifies 5 objects, the 5th object could be ignored or placed in an "everything else" mask, perhaps?

I understand that it's an ambitious idea. I thought it would be interesting to discuss.
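To illustrate, a rough sketch of the per-object split outside the node, assuming the per-box masks (N, H, W) and the matching GroundingDINO phrases are already available (all names here are hypothetical):

  import torch

  def split_masks_by_label(masks, phrases, wanted):
      # masks: (N, H, W) tensor, one mask per detected box
      # phrases: GroundingDINO's label for each box; wanted: labels requested by the user
      out = {}
      for label in wanted:
          picked = [m for m, p in zip(masks, phrases) if label in p]
          # union of all masks matching this label; empty mask if none matched
          out[label] = torch.stack(picked).amax(dim=0) if picked else torch.zeros_like(masks[0])
      return out

  # e.g. split_masks_by_label(masks, phrases, ["mouth", "eyes", "nose"])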

Cannot import /root/ComfyUI/custom_nodes/comfyui_segment_anything module for custom nodes: A Message class can only inherit from Message

Traceback (most recent call last):
File "/root/ComfyUI/nodes.py", line 1735, in load_custom_node
module_spec.loader.exec_module(module)
File "", line 883, in exec_module
File "", line 241, in _call_with_frames_removed
File "/root/ComfyUI/custom_nodes/comfyui_segment_anything/init.py", line 1, in
from .node import *
File "/root/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 17, in
from sam_hq.build_sam_hq import sam_model_registry
File "/root/ComfyUI/custom_nodes/comfyui_segment_anything/sam_hq/build_sam_hq.py", line 13, in
from .modeling.tiny_vit import TinyViT
File "/root/ComfyUI/custom_nodes/comfyui_segment_anything/sam_hq/modeling/tiny_vit.py", line 15, in
from timm.models.layers import DropPath as TimmDropPath,
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/init.py", line 3, in
from .models import create_model, list_models, list_pretrained, is_model, list_modules, model_entrypoint,
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/models/init.py", line 1, in
from .beit import *
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/models/beit.py", line 49, in
from timm.data import IMAGENET_DEFAULT_MEAN, IMAGENET_DEFAULT_STD
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/data/init.py", line 5, in
from .dataset import ImageDataset, IterableImageDataset, AugMixDataset
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/data/dataset.py", line 13, in
from .readers import create_reader
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/data/readers/init.py", line 1, in
from .reader_factory import create_reader
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/data/readers/reader_factory.py", line 3, in
from .reader_image_folder import ReaderImageFolder
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/data/readers/reader_image_folder.py", line 11, in
from timm.utils.misc import natural_key
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/utils/init.py", line 15, in
from .summary import update_summary, get_outdir
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/utils/summary.py", line 9, in
import wandb
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/init.py", line 27, in
from wandb import sdk as wandb_sdk
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/sdk/init.py", line 4, in
from .artifacts.artifact import Artifact # noqa: F401
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/sdk/artifacts/artifact.py", line 36, in
from wandb.apis.normalize import normalize_exceptions
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/apis/init.py", line 43, in
from .internal import Api as InternalApi # noqa
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/apis/internal.py", line 3, in
from wandb.sdk.internal.internal_api import Api as InternalApi
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/sdk/internal/internal_api.py", line 48, in
from ..lib import retry
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/sdk/lib/retry.py", line 17, in
from .mailbox import ContextCancelledError
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/sdk/lib/mailbox.py", line 10, in
from wandb.proto import wandb_internal_pb2 as pb
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/proto/wandb_internal_pb2.py", line 6, in
from wandb.proto.v3.wandb_internal_pb2 import *
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/proto/v3/wandb_internal_pb2.py", line 15, in
from google.protobuf import timestamp_pb2 as google_dot_protobuf_dot_timestamp__pb2
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/google/protobuf/timestamp_pb2.py", line 19, in
_builder.BuildTopDescriptorsAndMessages(DESCRIPTOR, 'google.protobuf.timestamp_pb2', globals())
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/google/protobuf/internal/builder.py", line 108, in BuildTopDescriptorsAndMessages
module[name] = BuildMessage(msg_des)
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/google/protobuf/internal/builder.py", line 85, in BuildMessage
message_class = _reflection.GeneratedProtocolMessageType(
TypeError: A Message class can only inherit from Message

Cannot import /root/ComfyUI/custom_nodes/comfyui_segment_anything module for custom nodes: A Message class can only inherit from Message
Searge-SDXL v4.0 in /root/ComfyUI/custom_nodes/SeargeSDXL

This issue has also been reported for other nodes, but no solution has been found.

Masking with GroundingDinoSAMSegment not working properly on Mac

I've set up a workflow to get the mask for the floor in a room and given GroundingDino the prompt "floor", but it doesn't seem to do the masking correctly at all. I have tried this on two separate Macs, both M1. I've confirmed that this doesn't happen on PC.

[screenshot]

I've also tried changing the threshold, and if I run it on the default "0.3" I get the following error:

Error occurred when executing GroundingDinoSAMSegment (segment anything):

cannot unpack non-iterable NoneType object

File "/Users/Alexander/Documents/AI/ComfyUI3/ComfyUI/execution.py", line 153, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Alexander/Documents/AI/ComfyUI3/ComfyUI/execution.py", line 83, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Alexander/Documents/AI/ComfyUI3/ComfyUI/execution.py", line 76, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Alexander/Documents/AI/ComfyUI3/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 305, in main
(images, masks) = sam_segment(

Please let me know how to fix this; I'm highly dependent on it working locally.

"Expected all tensors to be on the same device"

Hi! After updating to the latest version I get this error:

ERROR:root:Traceback (most recent call last):
  File "e:\Stablediffusion\ComfyUI_windows_portable\ComfyUI\execution.py", line 153, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "e:\Stablediffusion\ComfyUI_windows_portable\ComfyUI\execution.py", line 83, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "e:\Stablediffusion\ComfyUI_windows_portable\ComfyUI\execution.py", line 76, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "E:\StableDiffusion\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 305, in main
    (images, masks) = sam_segment(
  File "E:\StableDiffusion\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 229, in sam_segment
    masks, _, _ = predictor.predict_torch(
  File "e:\Stablediffusion\ComfyUI_windows_portable\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "E:\StableDiffusion\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\predictor.py", line 114, in predict_torch
    sparse_embeddings, dense_embeddings = self.model.prompt_encoder(
  File "e:\Stablediffusion\ComfyUI_windows_portable\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "e:\Stablediffusion\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 158, in forward
    box_embeddings = self._embed_boxes(boxes)
  File "e:\Stablediffusion\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 97, in _embed_boxes
    corner_embedding = self.pe_layer.forward_with_coords(coords, self.input_image_size)
  File "e:\Stablediffusion\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 214, in forward_with_coords
    return self._pe_encoding(coords.to(torch.float))  # B x N x C
  File "e:\Stablediffusion\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 189, in _pe_encoding
    coords = coords @ self.positional_encoding_gaussian_matrix
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)

This happens regardless of which models I select. What went wrong?

Kind regards
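For reference, this class of error usually means the prompt boxes stayed on the CPU while the SAM weights moved to cuda:0. A minimal sketch of the usual workaround, with assumed variable names (sam_model, predictor, boxes), not the node's actual code:

  # move the boxes to the model's device before calling predict_torch
  device = next(sam_model.parameters()).device
  masks, _, _ = predictor.predict_torch(
      point_coords=None,
      point_labels=None,
      boxes=boxes.to(device),
      multimask_output=False,
  )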

Mac arm64 MPS support?

platform: mac arm64
device: mps

Loads SAM model: /Users/liangbinsi/Documents/ComfyUI/models/sams/sam_vit_h_4b8939.pth (device:AUTO)
!!! Exception during processing!!! Tensor for argument #2 'mat2' is on CPU, but expected it to be on GPU (while checking arguments for mm)
Traceback (most recent call last):
  File "/Users/liangbinsi/Documents/ComfyUI/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/Users/liangbinsi/Documents/ComfyUI/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/Users/liangbinsi/Documents/ComfyUI/execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "/Users/liangbinsi/Documents/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 325, in main
    (images, masks) = sam_segment(
  File "/Users/liangbinsi/Documents/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 247, in sam_segment
    masks, _, _ = predictor.predict_torch(
  File "/Users/liangbinsi/.pyenv/versions/3.10.0/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/liangbinsi/Documents/ComfyUI/custom_nodes/ComfyUI-dnl13-seg/libs/sam_hq/predictor.py", line 114, in predict_torch
    sparse_embeddings, dense_embeddings = self.model.prompt_encoder(
  File "/Users/liangbinsi/.pyenv/versions/3.10.0/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/liangbinsi/.pyenv/versions/3.10.0/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/liangbinsi/.pyenv/versions/3.10.0/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 158, in forward
    box_embeddings = self._embed_boxes(boxes)
  File "/Users/liangbinsi/.pyenv/versions/3.10.0/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 97, in _embed_boxes
    corner_embedding = self.pe_layer.forward_with_coords(coords, self.input_image_size)
  File "/Users/liangbinsi/.pyenv/versions/3.10.0/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 214, in forward_with_coords
    return self._pe_encoding(coords.to(torch.float))  # B x N x C
  File "/Users/liangbinsi/.pyenv/versions/3.10.0/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 189, in _pe_encoding
    coords = coords @ self.positional_encoding_gaussian_matrix
RuntimeError: Tensor for argument #2 'mat2' is on CPU, but expected it to be on GPU (while checking arguments for mm)

Error occurred when executing GroundingDinoSAMSegment (segment anything):

[screenshot]

torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.

File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\execution.py", line 155, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\execution.py", line 85, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\execution.py", line 78, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 317, in main
boxes = groundingdino_predict(
^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 182, in groundingdino_predict
boxes_filt = get_grounding_output(
^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 170, in get_grounding_output
outputs = model(image[None], captions=[caption])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\models\GroundingDINO\groundingdino.py", line 279, in forward
features, poss = self.backbone(samples)
^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\models\GroundingDINO\backbone\backbone.py", line 151, in forward
xs = self[0](tensor_list)
^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\models\GroundingDINO\backbone\swin_transformer.py", line 732, in forward
x_out, H, W, x, Wh, Ww = layer(x, Wh, Ww)
^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\models\GroundingDINO\backbone\swin_transformer.py", line 448, in forward
x = checkpoint.checkpoint(blk, x, attn_mask)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch_compile.py", line 24, in inner
return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch_dynamo\eval_frame.py", line 417, in _fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch_dynamo\external_utils.py", line 25, in inner
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\utils\checkpoint.py", line 460, in checkpoint
raise ValueError(
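The ValueError raised here is PyTorch enforcing the warning quoted above. A minimal sketch of the change it asks for, at the checkpoint call site in swin_transformer.py (line 448 per the traceback):

  from torch.utils import checkpoint

  # pass the flag explicitly; use_reentrant=False is the variant the warning recommends
  x = checkpoint.checkpoint(blk, x, attn_mask, use_reentrant=False)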

Change model: bert-base-chinese does not work; when can Chinese input be used?

The default model is bert-base-uncased. I want to change to the Chinese model bert-base-chinese. I downloaded it into /ComfyUI/models/bert-base-chinese and changed the model name to bert-base-chinese, but bert-base-uncased is still loaded by default.

It would be great if I could type prompts in Chinese; after all, Chinese users would love it!
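For reference, GroundingDINO builds its tokenizer via AutoTokenizer.from_pretrained(text_encoder_type) (see get_tokenlizer.py in the traceback of the "about bert-base-uncased" issue below), so a local Chinese encoder would have to be wired in through that value rather than by renaming files. A hedged sketch of the loading step alone:

  from transformers import AutoTokenizer

  # assumption: a complete bert-base-chinese download lives at this local path
  tokenizer = AutoTokenizer.from_pretrained("models/bert-base-chinese", local_files_only=True)

Note that the released GroundingDINO weights were trained against the English bert-base-uncased, so loading a Chinese tokenizer alone does not guarantee usable detections from Chinese prompts.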

ModuleNotFoundError: No module named 'segment_anything'

I'm not having any luck getting this to load. Reinstalling didn't work either. It's the only extension I'm having issues with.

Traceback:

Traceback (most recent call last):
  File "C:\Users\user\ComfyUI_windows_portable\ComfyUI\nodes.py", line 1734, in load_custom_node
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "C:\Users\user\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\__init__.py", line 1, in <module>
    from .node import *
  File "C:\Users\user\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 16, in <module>
    from sam_hq.predictor import SamPredictorHQ
  File "C:\Users\user\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\predictor.py", line 4, in <module>
    from segment_anything import SamPredictor
ModuleNotFoundError: No module named 'segment_anything'

Cannot import C:\Users\user\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything module for custom nodes: No module named 'segment_anything'

Error occurred when executing GroundingDinoSAMSegment (segment anything):

Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)

File "/root/autodl-tmp/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/execution.py", line 153, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "/root/autodl-tmp/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/execution.py", line 83, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "/root/autodl-tmp/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/execution.py", line 76, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "/root/autodl-tmp/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 314, in main
(images, masks) = sam_segment(
File "/root/autodl-tmp/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 236, in sam_segment
masks, _, _ = predictor.predict_torch(
File "/root/miniconda3/envs/xl_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/autodl-tmp/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/custom_nodes/comfyui_segment_anything/sam_hq/predictor.py", line 114, in predict_torch
sparse_embeddings, dense_embeddings = self.model.prompt_encoder(
File "/root/miniconda3/envs/xl_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/miniconda3/envs/xl_env/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 158, in forward
box_embeddings = self._embed_boxes(boxes)
File "/root/miniconda3/envs/xl_env/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 97, in _embed_boxes
corner_embedding = self.pe_layer.forward_with_coords(coords, self.input_image_size)
File "/root/miniconda3/envs/xl_env/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 214, in forward_with_coords
return self._pe_encoding(coords.to(torch.float)) # B x N x C
File "/root/miniconda3/envs/xl_env/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 189, in _pe_encoding
coords = coords @ self.positional_encoding_gaussian_matrix

Suggestion: Extract labels with positioning

It would be so cool if I could input an image and get back an image with labeled boxes drawn on it, telling what each thing is. So perhaps return the label plus the coordinates of its box, and then allow inputting a box's coordinates to mask that specific box.
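A rough sketch of the requested annotation step, assuming boxes (pixel xyxy, shape (N, 4)) and phrases already came back from a GroundingDINO prediction (both names, and the float image tensor, are assumptions):

  import torch
  from torchvision.utils import draw_bounding_boxes

  # image: (3, H, W) float tensor in [0, 1]; draw_bounding_boxes expects uint8
  annotated = draw_bounding_boxes(
      (image * 255).to(torch.uint8),
      boxes,
      labels=phrases,
      width=3,
  )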

SAMModelLoader Failure on Apple Silicon due to CUDA Deserialization Error


Issue Description

When attempting to run SAMModelLoader for the segment anything functionality on an Apple Silicon Mac, an error is encountered indicating a problem with attempting to deserialize an object on a CUDA device, even though torch.cuda.is_available() returns False.

Environment

  • Operating System: macOS Sonoma 14.4.1 (M1 Max, Apple Silicon)
  • Python Version: 3.11
  • PyTorch Version: torch==2.1.2 ; torchvision==0.16.2

Error Message

Error occurred when executing SAMModelLoader (segment anything):

Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

Expected Behavior

The model loader should detect the absence of CUDA and automatically adjust to use CPU for model deserialization and execution, allowing the functionality to proceed without error, or ideally switch to the GPU Apple Silicon provides.

Actual Behavior

The process fails with an error message indicating an attempt to deserialize a CUDA object on a system where CUDA is not available, due to torch.cuda.is_available() returning False.
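The error message itself points at the standard workaround. A minimal sketch, with the checkpoint path taken from the SAM table above:

  import torch

  # force the checkpoint's tensors onto the CPU during deserialization
  state_dict = torch.load("models/sams/sam_vit_h_4b8939.pth", map_location=torch.device("cpu"))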


TypeError: cannot unpack non-iterable NoneType object

Hello
I'm getting this error sometimes (not always)

ERROR:root:!!! Exception during processing !!!
ERROR:root:Traceback (most recent call last):
  File "[...]\ComfyUI\execution.py", line 153, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "[...]\ComfyUI\execution.py", line 83, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "[...]\ComfyUI\execution.py", line 76, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "[...]\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 305, in main
    (images, masks) = sam_segment(
TypeError: cannot unpack non-iterable NoneType object

on line 305

I think it happens when there is no detection box?

In that case the sam_segment function returns a single None

Perhaps that needs to be return (None, None) as line 305 expects sam_segment to return (images, masks) for destructuring

I don't know python, but perhaps my comment helps point in the right direction 🤷
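For what it's worth, a minimal sketch of the guard being suggested, on the caller's side (names mirror the traceback, but the call itself is hypothetical):

  # sam_segment may return None when GroundingDINO finds no boxes
  result = sam_segment(sam_model, image, boxes)  # hypothetical call
  if result is None:
      images, masks = [], []  # or raise an explicit "no detection" error
  else:
      images, masks = result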

Thanks for reading! Great Node! Love it!

Fails to load

ComfyUI output during loading:

Traceback (most recent call last):
  File "S:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\nodes.py", line 1887, in load_custom_node
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "...\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\__init__.py", line 1, in <module>
    from .node import *
  File "...\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 17, in <module>
    from sam_hq.build_sam_hq import sam_model_registry
  File "...\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\build_sam_hq.py", line 13, in <module>
    from .modeling.tiny_vit import TinyViT
  File "...\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\modeling\tiny_vit.py", line 15, in <module>
    from timm.models.layers import DropPath as TimmDropPath,\
ModuleNotFoundError: No module named 'timm'

Cannot import ...\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything module for custom nodes: No module named 'timm'

RuntimeError: Expected all tensors to be on the same device but found at least two devices, cuda:0 and cpu!

The full error output is as follows:

Loads SAM model: E:\DEV\ComfyUI_windows_portable\ComfyUI\models\sams\sam_vit_h_4b8939.pth (device:AUTO)
final text_encoder_type: bert-base-uncased
!!! Exception during processing !!!
Traceback (most recent call last):
  File "E:\DEV\ComfyUI_windows_portable\ComfyUI\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "E:\DEV\ComfyUI_windows_portable\ComfyUI\execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "E:\DEV\ComfyUI_windows_portable\ComfyUI\execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "E:\DEV\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 325, in main
    (images, masks) = sam_segment(
  File "E:\DEV\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 247, in sam_segment
    masks, _, _ = predictor.predict_torch(
  File "E:\DEV\ComfyUI_windows_portable\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "E:\DEV\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\predictor.py", line 114, in predict_torch
    sparse_embeddings, dense_embeddings = self.model.prompt_encoder(
  File "E:\DEV\ComfyUI_windows_portable\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\DEV\ComfyUI_windows_portable\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\DEV\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 158, in forward
    box_embeddings = self._embed_boxes(boxes)
  File "E:\DEV\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 97, in _embed_boxes
    corner_embedding = self.pe_layer.forward_with_coords(coords, self.input_image_size)
  File "E:\DEV\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 214, in forward_with_coords
    return self._pe_encoding(coords.to(torch.float))  # B x N x C
  File "E:\DEV\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 189, in _pe_encoding
    coords = coords @ self.positional_encoding_gaussian_matrix
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)

Any suggestion for this?

GroundingDinoSAMSegment MASK Output issue with Video/ImageBatches

Hey,
first of all, thank you for these very nice nodes!
But there is an issue with the MASK output of the GroundingDinoSAMSegment node.
When I feed it a video input, the IMAGE output looks correct, but the MASK output is just one large image with all frames stacked vertically instead of being an image batch like the ImagePreview shows.
I am guessing something is "wrong" in mask_decoder_hq.py at lines 135-152?
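If the frames really are being concatenated along the height axis, a hedged sketch of the reshape that would restore a batch (a guess at the symptom, not a verified fix):

  import torch

  def unstack_vertical(masks: torch.Tensor, num_frames: int) -> torch.Tensor:
      # (1, N*H, W) -> (N, H, W): undo vertical stacking of N per-frame masks
      _, nh, w = masks.shape
      return masks.reshape(num_frames, nh // num_frames, w)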


upper bound and larger bound inconsistent with step sign

I'm getting the following error and was wondering if any of you have encountered it.

Error occurred when executing GroundingDinoSAMSegment (segment anything): upper bound and larger bound inconsistent with step sign

terminal shows me this:
groundingdino/models/GroundingDINO/bertwarper.py", line 252, in generate_masks_with_special_tokens_and_transfer_map
position_ids[row, previous_col + 1 : col + 1] = torch.arange(
^^^^^^^^^^^^^
RuntimeError: upper bound and larger bound inconsistent with step sign

I'm trying to use the Reposer Plus workflow nerdyrodent posted. I'm running ComfyUI with --force-fp16 on an Intel Mac Pro (2019) with an AMD Radeon Pro 580X 8 GB and 48 GB of RAM.

Crash and close, anyone with this error?

Loads SAM model: C:\Users\WarMachineV10SSD3\Pictures\SD\ComfyPortable\ComfyUI_windows_portable\ComfyUI\models\sams\sam_vit_b_01ec64.pth (device:CPU)
final text_encoder_type: bert-base-uncased
C:\Users\WarMachineV10SSD3\Pictures\SD\ComfyPortable\ComfyUI_windows_portable\ComfyUI\venv\lib\site-packages\transformers\modeling_utils.py:907: FutureWarning: The device argument is deprecated and will be removed in v5 of Transformers.
warnings.warn(
[F D:\a_work\1\s\pytorch-directml-plugin\torch_directml\csrc\engine\dml_tensor_desc.cc:135] Check failed: !is_dim_broadcast || non_broadcast_dim_size == 1

Requirements checking at every ComfyUI startup

Requirement already satisfied: segment_anything in c:\comfyui\python_embeded\lib\site-packages (from -r requirements.txt (line 1)) (1.0)
Requirement already satisfied: timm in c:\comfyui\python_embeded\lib\site-packages (from -r requirements.txt (line 2)) (0.6.13)
Requirement already satisfied: addict in c:\comfyui\python_embeded\lib\site-packages (from -r requirements.txt (line 3)) (2.4.0)
Requirement already satisfied: yapf in c:\comfyui\python_embeded\lib\site-packages (from -r requirements.txt (line 4)) (0.40.2)
Requirement already satisfied: torch>=1.7 in c:\comfyui\python_embeded\lib\site-packages (from timm->-r requirements.txt (line 2)) (2.2.1+cu121)
Requirement already satisfied: torchvision in c:\comfyui\python_embeded\lib\site-packages (from timm->-r requirements.txt (line 2)) (0.17.1+cu121)
Requirement already satisfied: pyyaml in c:\comfyui\python_embeded\lib\site-packages (from timm->-r requirements.txt (line 2)) (6.0.1)
Requirement already satisfied: huggingface-hub in c:\comfyui\python_embeded\lib\site-packages (from timm->-r requirements.txt (line 2)) (0.19.4)
Requirement already satisfied: importlib-metadata>=6.6.0 in c:\comfyui\python_embeded\lib\site-packages (from yapf->-r requirements.txt (line 4)) (7.0.0)
Requirement already satisfied: platformdirs>=3.5.1 in c:\comfyui\python_embeded\lib\site-packages (from yapf->-r requirements.txt (line 4)) (4.1.0)
Requirement already satisfied: tomli>=2.0.1 in c:\comfyui\python_embeded\lib\site-packages (from yapf->-r requirements.txt (line 4)) (2.0.1)
Requirement already satisfied: zipp>=0.5 in c:\comfyui\python_embeded\lib\site-packages (from importlib-metadata>=6.6.0->yapf->-r requirements.txt (line 4)) (3.17.0)
Requirement already satisfied: filelock in c:\comfyui\python_embeded\lib\site-packages (from torch>=1.7->timm->-r requirements.txt (line 2)) (3.13.1)
Requirement already satisfied: typing-extensions>=4.8.0 in c:\comfyui\python_embeded\lib\site-packages (from torch>=1.7->timm->-r requirements.txt (line 2)) (4.8.0)
Requirement already satisfied: sympy in c:\comfyui\python_embeded\lib\site-packages (from torch>=1.7->timm->-r requirements.txt (line 2)) (1.12)
Requirement already satisfied: networkx in c:\comfyui\python_embeded\lib\site-packages (from torch>=1.7->timm->-r requirements.txt (line 2)) (3.2.1)
Requirement already satisfied: jinja2 in c:\comfyui\python_embeded\lib\site-packages (from torch>=1.7->timm->-r requirements.txt (line 2)) (3.1.2)
Requirement already satisfied: fsspec in c:\comfyui\python_embeded\lib\site-packages (from torch>=1.7->timm->-r requirements.txt (line 2)) (2023.10.0)
Requirement already satisfied: requests in c:\comfyui\python_embeded\lib\site-packages (from huggingface-hub->timm->-r requirements.txt (line 2)) (2.31.0)
Requirement already satisfied: tqdm>=4.42.1 in c:\comfyui\python_embeded\lib\site-packages (from huggingface-hub->timm->-r requirements.txt (line 2)) (4.66.2)
Requirement already satisfied: packaging>=20.9 in c:\comfyui\python_embeded\lib\site-packages (from huggingface-hub->timm->-r requirements.txt (line 2)) (23.2)
Requirement already satisfied: numpy in c:\comfyui\python_embeded\lib\site-packages (from torchvision->timm->-r requirements.txt (line 2)) (1.24.4)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in c:\comfyui\python_embeded\lib\site-packages (from torchvision->timm->-r requirements.txt (line 2)) (10.2.0)
Requirement already satisfied: colorama in c:\comfyui\python_embeded\lib\site-packages (from tqdm>=4.42.1->huggingface-hub->timm->-r requirements.txt (line 2)) (0.4.6)
Requirement already satisfied: MarkupSafe>=2.0 in c:\comfyui\python_embeded\lib\site-packages (from jinja2->torch>=1.7->timm->-r requirements.txt (line 2)) (2.1.3)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\comfyui\python_embeded\lib\site-packages (from requests->huggingface-hub->timm->-r requirements.txt (line 2)) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in c:\comfyui\python_embeded\lib\site-packages (from requests->huggingface-hub->timm->-r requirements.txt (line 2)) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\comfyui\python_embeded\lib\site-packages (from requests->huggingface-hub->timm->-r requirements.txt (line 2)) (1.26.18)
Requirement already satisfied: certifi>=2017.4.17 in c:\comfyui\python_embeded\lib\site-packages (from requests->huggingface-hub->timm->-r requirements.txt (line 2)) (2023.11.17)
Requirement already satisfied: mpmath>=0.19 in c:\comfyui\python_embeded\lib\site-packages (from sympy->torch>=1.7->timm->-r requirements.txt (line 2)) (1.3.0)

Is it possible to remove the requirements checking at every ComfyUI startup?

Run on CPU

Is it possible to add an option to run the processing on the CPU rather than GPU?
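Until such an option exists, a minimal sketch of what it would do under the hood, using the public segment_anything API (model type and checkpoint path are placeholders):

  import torch
  from segment_anything import sam_model_registry, SamPredictor

  # build the model and pin it to the CPU explicitly
  sam = sam_model_registry["vit_b"](checkpoint="models/sams/sam_vit_b_01ec64.pth")
  sam.to(torch.device("cpu"))
  predictor = SamPredictor(sam)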

Cannot Import - for field conv_cfg is not allowed: use default_factory

Hey there!
I've encountered a weird error I am unable to fix by myself; requesting backup!

Traceback (most recent call last):
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\nodes.py", line 1872, in load_custom_node
module_spec.loader.exec_module(module)
File "", line 940, in exec_module
File "", line 241, in call_with_frames_removed
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything_init
.py", line 1, in
from .node import *
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 17, in
from sam_hq.build_sam_hq import sam_model_registry
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\build_sam_hq.py", line 13, in
from .modeling.tiny_vit import TinyViT
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\modeling\tiny_vit.py", line 15, in
from timm.models.layers import DropPath as TimmDropPath,
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\timm_init_.py", line 2, in
from .models import create_model, list_models, is_model, list_modules, model_entrypoint,
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\timm\models_init_.py", line 28, in
from .maxxvit import *
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\timm\models\maxxvit.py", line 225, in
@DataClass
^^^^^^^^^
File "dataclasses.py", line 1230, in dataclass
File "dataclasses.py", line 1220, in wrap
File "dataclasses.py", line 958, in _process_class
File "dataclasses.py", line 815, in _get_field
ValueError: mutable default <class 'timm.models.maxxvit.MaxxVitConvCfg'> for field conv_cfg is not allowed: use default_factory

Cannot import C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything module for custom nodes: mutable default <class 'timm.models.maxxvit.MaxxVitConvCfg'> for field conv_cfg is not allowed: use default_factory

Getting a "Ran out of input" error from one of the dependencies when comfyui starts with segment anything

OS: Linux Mint 21.2 x86_64, kernel 5.15.0-92-generic, Python 3.10, latest ComfyUI.

Traceback (most recent call last):
File "/opt/LLM/jttw/components/comfyui/nodes.py", line 1872, in load_custom_node
module_spec.loader.exec_module(module)
File "", line 883, in exec_module
File "", line 241, in _call_with_frames_removed
File "/opt/LLM/jttw/components/comfyui/custom_nodes/comfyui_segment_anything/init.py", line 1, in
from .node import *
File "/opt/LLM/jttw/components/comfyui/custom_nodes/comfyui_segment_anything/node.py", line 19, in
from local_groundingdino.util.utils import clean_state_dict as local_groundingdino_clean_state_dict
File "/opt/LLM/jttw/components/comfyui/custom_nodes/comfyui_segment_anything/local_groundingdino/util/utils.py", line 12, in
from local_groundingdino.util.slconfig import SLConfig
File "/opt/LLM/jttw/components/comfyui/custom_nodes/comfyui_segment_anything/local_groundingdino/util/slconfig.py", line 14, in
from yapf.yapflib.yapf_api import FormatCode
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf/init.py", line 41, in
from yapf.yapflib import yapf_api
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf/yapflib/yapf_api.py", line 38, in
from yapf.pyparser import pyparser
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf/pyparser/pyparser.py", line 44, in
from yapf.yapflib import format_token
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf/yapflib/format_token.py", line 23, in
from yapf.pytree import pytree_utils
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf/pytree/pytree_utils.py", line 30, in
from yapf_third_party._ylib2to3 import pygram
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf_third_party/_ylib2to3/pygram.py", line 29, in
python_grammar = driver.load_grammar(_GRAMMAR_FILE)
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf_third_party/_ylib2to3/pgen2/driver.py", line 252, in load_grammar
g.load(gp)
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf_third_party/_ylib2to3/pgen2/grammar.py", line 96, in load
d = pickle.load(f)
EOFError: Ran out of input

Request for Configurable Default Addresses in extra_model_paths.yaml

Can the default addresses in groundingdino_model_dir and sam_model_dir be made configurable in extra_model_paths.yaml?

Also, I encountered an issue where even though I have already downloaded the model to the specified location, it still tries to call the Hugging Face API. Is AutoTokenizer only capable of remote calls?


'SAMWrapper' object has no attribute 'image_encoder'

I updated to the latest ComfyUI and SAM + Grounding Dino seems to have broken; I'm getting the following error in the UI:

Error occurred when executing GroundingDinoSAMSegment (segment anything):

'SAMWrapper' object has no attribute 'image_encoder'

File "/home/sid/ComfyUI/execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/ComfyUI/execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/ComfyUI/execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 325, in main
(images, masks) = sam_segment(
^^^^^^^^^^^^
File "/home/sid/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 240, in sam_segment
predictor = SamPredictorHQ(sam_model, sam_is_hq)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/ComfyUI/custom_nodes/comfyui_segment_anything/sam_hq/predictor.py", line 22, in __init__
super().__init__(sam_model=sam_model)
File "/home/sid/miniconda3/envs/comfyui/lib/python3.11/site-packages/segment_anything/predictor.py", line 31, in __init__
self.transform = ResizeLongestSide(sam_model.image_encoder.img_size)
^^^^^^^^^^^^^^^^^^^^^^^

And the following in the terminal:

Traceback (most recent call last):
  File "/home/sid/ComfyUI/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/ComfyUI/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/ComfyUI/execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 325, in main
    (images, masks) = sam_segment(
                      ^^^^^^^^^^^^
  File "/home/sid/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 240, in sam_segment
    predictor = SamPredictorHQ(sam_model, sam_is_hq)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/ComfyUI/custom_nodes/comfyui_segment_anything/sam_hq/predictor.py", line 22, in __init__
    super().__init__(sam_model=sam_model)
  File "/home/sid/miniconda3/envs/comfyui/lib/python3.11/site-packages/segment_anything/predictor.py", line 31, in __init__
    self.transform = ResizeLongestSide(sam_model.image_encoder.img_size)
                                       ^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'SAMWrapper' object has no attribute 'image_encoder'

Prompt executed in 14.11 seconds

Please suggest what I can try to fix/troubleshoot; the same workflow was working fine until yesterday...

about bert-base-uncased

Error occurred when executing GroundingDinoModelLoader (segment anything):

We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like bert-base-uncased is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\execution.py", line 153, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\execution.py", line 83, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\execution.py", line 76, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 268, in main
dino_model = load_groundingdino_model(model_name)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 108, in load_groundingdino_model
dino = local_groundingdino_build_model(dino_model_args)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\models_init_.py", line 17, in build_model
model = build_func(args)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\models\GroundingDINO\groundingdino.py", line 362, in build_groundingdino
model = GroundingDINO(
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\models\GroundingDINO\groundingdino.py", line 97, in init
self.tokenizer = get_tokenlizer.get_tokenlizer(text_encoder_type)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\util\get_tokenlizer.py", line 19, in get_tokenlizer
tokenizer = AutoTokenizer.from_pretrained(text_encoder_type)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\python_embeded\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 733, in from_pretrained
config = AutoConfig.from_pretrained(
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\python_embeded\lib\site-packages\transformers\models\auto\configuration_auto.py", line 1048, in from_pretrained
config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\python_embeded\lib\site-packages\transformers\configuration_utils.py", line 622, in get_config_dict
config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\python_embeded\lib\site-packages\transformers\configuration_utils.py", line 677, in _get_config_dict
resolved_config_file = cached_file(
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\python_embeded\lib\site-packages\transformers\utils\hub.py", line 470, in cached_file
raise EnvironmentError(
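Per the offline-mode documentation that the error message links to, a minimal sketch of forcing transformers to use only local files (set before the node loads; the variable names are the documented ones):

  import os

  # documented offline switches; no network calls will be attempted
  os.environ["TRANSFORMERS_OFFLINE"] = "1"
  os.environ["HF_HUB_OFFLINE"] = "1"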

I can't seem to get it to output more than 6 images. How is this used in video workflows?

When I input a video set to e.g. a 200-frame cap, I only get 6 masked images out. I often get crashes too:
[rawvideo @ 0x6067e5fab440] Invalid buffer size, packet size 648000 < expected frame_size 1944000
[vist#0:0/rawvideo @ 0x6067e5fab2c0] Error submitting packet to decoder: Invalid argument
[vist#0:0/rawvideo @ 0x6067e5fab2c0] Decode error rate 1 exceeds maximum 0.666667

Memory requirements?

What's the VRAM requirement to run? I'm hitting out of memory issues on a 10GB 3080.

(IMPORT FAILED) segment anything

I wanted to document an issue with installing SAM in ComfyUI. Using the node manager, the import fails. I attempted the basic restarts, refreshes, etc. I attempted an update of ComfyUI as well, still no dice. Is the issue related to running on CPU?

M1 Max
32 GB of memory
Ventura

TypeError: cannot unpack non-iterable NoneType object

!!! Exception during processing !!!
Traceback (most recent call last):
  File "D:\tools\ComfyUI_windows_portable\ComfyUI\execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "D:\tools\ComfyUI_windows_portable\ComfyUI\execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "D:\tools\ComfyUI_windows_portable\ComfyUI\execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "D:\tools\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 305, in main
    (images, masks) = sam_segment(
TypeError: cannot unpack non-iterable NoneType object
