
comfyui_segment_anything's Introduction

ComfyUI Segment Anything

This project is a ComfyUI version of https://github.com/continue-revolution/sd-webui-segment-anything. At present, only the core functionalities have been implemented. I would like to express my gratitude to continue-revolution for the preceding work on which this is based.

[example image]

I have ensured consistency with sd-webui-segment-anything in terms of output when given the same input.

Requirements

Please ensure that you have installed Python dependencies using the following command:

pip3 install -r requirements.txt

Models

The models will be automatically downloaded when used. You can also manually download them according to the table below. If the automatic download is slow, you can set the HTTP_PROXY and HTTPS_PROXY environment variables to use a proxy.
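If you prefer setting the proxy from Python (for example at the top of a small launcher script) rather than exporting it in the shell, a minimal sketch, with a placeholder proxy address:

  import os

  # placeholder address; must be set before any download is triggered
  os.environ["HTTP_PROXY"] = "http://127.0.0.1:7890"
  os.environ["HTTPS_PROXY"] = "http://127.0.0.1:7890"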

bert-base-uncased

You can download the model from https://huggingface.co/bert-base-uncased/tree/main into the models/bert-base-uncased folder located in the root directory of ComfyUI, like this:

ComfyUI
    models
        bert-base-uncased
            config.json
            model.safetensors
            tokenizer_config.json
            tokenizer.json
            vocab.txt

You can also skip this step. During the inference process, bert-base-uncased will be automatically downloaded through the transformers library, and its directory is typically ~/.cache/huggingface/hub/models--bert-base-uncased.
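To verify that the local copy is the one being picked up, a minimal sketch using the transformers API (the relative path assumes you run it from the ComfyUI root directory):

  from transformers import AutoTokenizer

  # local_files_only makes transformers fail loudly instead of silently re-downloading
  tokenizer = AutoTokenizer.from_pretrained("models/bert-base-uncased", local_files_only=True)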

GroundingDino

Please directly download the models and configuration files to the models/grounding-dino directory under the ComfyUI root directory, without modifying the file names.

name                      size    config file      model file
GroundingDINO_SwinT_OGC   694MB   download link    download link
GroundingDINO_SwinB       938MB   download link    download link

SAM

Please directly download the model files to the models/sams directory under the ComfyUI root directory, without modifying the file names.

name           size     model file
sam_vit_h      2.56GB   download link
sam_vit_l      1.25GB   download link
sam_vit_b      375MB    download link
sam_hq_vit_h   2.57GB   download link
sam_hq_vit_l   1.25GB   download link
sam_hq_vit_b   379MB    download link
mobile_sam     39MB     download link

Contribution

Thank you for considering helping out with the source code! We welcome contributions from anyone on the internet and are grateful for even the smallest of fixes!

If you'd like to contribute to this project, please fork, fix, commit and send a pull request for me to review and merge into the main code base.

comfyui_segment_anything's People

Contributors

allinws, anson2048, antoinedelplace, dnl13, frantic, guilhermep, storyicon


comfyui_segment_anything's Issues

ModuleNotFoundError: No module named 'timm'

Traceback (most recent call last):
File "D:\ComfyUI\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI\nodes.py", line 1798, in load_custom_node
module_spec.loader.exec_module(module)
File "", line 883, in exec_module
File "", line 241, in call_with_frames_removed
File "D:\ComfyUI\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI\custom_nodes\comfyui_segment_anything_init
.py", line 1, in
from .node import *
File "D:\ComfyUI\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 17, in
from sam_hq.build_sam_hq import sam_model_registry
File "D:\ComfyUI\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\build_sam_hq.py", line 13, in
from .modeling.tiny_vit import TinyViT
File "D:\ComfyUI\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\modeling\tiny_vit.py", line 15, in
from timm.models.layers import DropPath as TimmDropPath,
ModuleNotFoundError: No module named 'timm'

Cannot import D:\ComfyUI\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI\custom_nodes\comfyui_segment_anything module for custom nodes: No module named 'timm'

GroundingDinoModelLoader (segment anything) ERROR

ERROR:root:!!! Exception during processing !!!
ERROR:root:Traceback (most recent call last):
File "C:\ComfyUI_windows_portable\python_embeded\lib\site-packages\urllib3\util\retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /bert-base-uncased/resolve/main/tf_model.h5 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x0000013742FF3340>, 'Connection to huggingface.co timed out. (connect timeout=10)'))
I downloaded tf_model.h5 manually, but I don't know where to put it.

[Feature Request] - GroundingDINOSamSegment node output different masks/segs for each identified object

I'm not even sure if this is possible, but perhaps this is something @dnl13 might be interested in implementing in his fork.

I would love to see a variant of this node with multiple mask outputs, each one spitting out a different identified object.

For example, let's say that, given the image of a face as input, I ask DINO to identify "mouth", "eyes", and "nose". Rather than having a single mask output that unifies these three otherwise disjoint elements, the node would have three outputs, one dedicated to the mouth mask, one to the eyes mask, and one to the nose mask.

This would allow further manipulation of the three objects in completely different ways. For example, I might decide that I want to re-render the nose as a painting, while the mouth is a photograph, and the eyes are a line art.

Eventually, these three masks would have to recombine in a single image. Not sure about that part, yet, but having the three masks is a starting point.

With this approach we could create multi-layered images, achieving a level of creativity that is not easy to obtain today.

Now, I realise that ComfyUI doesn't allow the creation of dynamic outputs, so this hypothetical node couldn't have as many outputs as defined by the user prompt. But I would be happy to see 4 mask outputs to give users a degree of flexibility. If the user specifies 5 objects, the 5th object could be ignored or placed in an "everything else" mask, perhaps?

I understand that it's an ambitious idea. I thought it would be interesting to discuss.
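To illustrate, a rough sketch of the per-object split outside the node, assuming the per-box masks (N, H, W) and the matching GroundingDINO phrases are already available (all names here are hypothetical):

  import torch

  def split_masks_by_label(masks, phrases, wanted):
      # masks: (N, H, W) tensor, one mask per detected box
      # phrases: GroundingDINO's label for each box; wanted: labels requested by the user
      out = {}
      for label in wanted:
          picked = [m for m, p in zip(masks, phrases) if label in p]
          # union of all masks matching this label; empty mask if none matched
          out[label] = torch.stack(picked).amax(dim=0) if picked else torch.zeros_like(masks[0])
      return out

  # e.g. split_masks_by_label(masks, phrases, ["mouth", "eyes", "nose"])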

Cannot import /root/ComfyUI/custom_nodes/comfyui_segment_anything module for custom nodes: A Message class can only inherit from Message

Traceback (most recent call last):
File "/root/ComfyUI/nodes.py", line 1735, in load_custom_node
module_spec.loader.exec_module(module)
File "", line 883, in exec_module
File "", line 241, in _call_with_frames_removed
File "/root/ComfyUI/custom_nodes/comfyui_segment_anything/init.py", line 1, in
from .node import *
File "/root/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 17, in
from sam_hq.build_sam_hq import sam_model_registry
File "/root/ComfyUI/custom_nodes/comfyui_segment_anything/sam_hq/build_sam_hq.py", line 13, in
from .modeling.tiny_vit import TinyViT
File "/root/ComfyUI/custom_nodes/comfyui_segment_anything/sam_hq/modeling/tiny_vit.py", line 15, in
from timm.models.layers import DropPath as TimmDropPath,
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/init.py", line 3, in
from .models import create_model, list_models, list_pretrained, is_model, list_modules, model_entrypoint,
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/models/init.py", line 1, in
from .beit import *
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/models/beit.py", line 49, in
from timm.data import IMAGENET_DEFAULT_MEAN, IMAGENET_DEFAULT_STD
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/data/init.py", line 5, in
from .dataset import ImageDataset, IterableImageDataset, AugMixDataset
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/data/dataset.py", line 13, in
from .readers import create_reader
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/data/readers/init.py", line 1, in
from .reader_factory import create_reader
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/data/readers/reader_factory.py", line 3, in
from .reader_image_folder import ReaderImageFolder
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/data/readers/reader_image_folder.py", line 11, in
from timm.utils.misc import natural_key
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/utils/init.py", line 15, in
from .summary import update_summary, get_outdir
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/timm/utils/summary.py", line 9, in
import wandb
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/init.py", line 27, in
from wandb import sdk as wandb_sdk
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/sdk/init.py", line 4, in
from .artifacts.artifact import Artifact # noqa: F401
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/sdk/artifacts/artifact.py", line 36, in
from wandb.apis.normalize import normalize_exceptions
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/apis/init.py", line 43, in
from .internal import Api as InternalApi # noqa
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/apis/internal.py", line 3, in
from wandb.sdk.internal.internal_api import Api as InternalApi
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/sdk/internal/internal_api.py", line 48, in
from ..lib import retry
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/sdk/lib/retry.py", line 17, in
from .mailbox import ContextCancelledError
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/sdk/lib/mailbox.py", line 10, in
from wandb.proto import wandb_internal_pb2 as pb
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/proto/wandb_internal_pb2.py", line 6, in
from wandb.proto.v3.wandb_internal_pb2 import *
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/wandb/proto/v3/wandb_internal_pb2.py", line 15, in
from google.protobuf import timestamp_pb2 as google_dot_protobuf_dot_timestamp__pb2
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/google/protobuf/timestamp_pb2.py", line 19, in
_builder.BuildTopDescriptorsAndMessages(DESCRIPTOR, 'google.protobuf.timestamp_pb2', globals())
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/google/protobuf/internal/builder.py", line 108, in BuildTopDescriptorsAndMessages
module[name] = BuildMessage(msg_des)
File "/root/.local/conda/envs/com/lib/python3.10/site-packages/google/protobuf/internal/builder.py", line 85, in BuildMessage
message_class = _reflection.GeneratedProtocolMessageType(
TypeError: A Message class can only inherit from Message

Cannot import /root/ComfyUI/custom_nodes/comfyui_segment_anything module for custom nodes: A Message class can only inherit from Message
Searge-SDXL v4.0 in /root/ComfyUI/custom_nodes/SeargeSDXL

This issue has also been reported for other nodes, but no solution has been found.

Masking with GroundingDinoSAMSegment not working properly on Mac

I've set up a workflow to get the mask for the floor in a room and given GroundingDino the prompt "floor", but it doesn't seem to do the masking correctly at all. I have tried this on two separate Macs, both M1. I've confirmed that this doesn't happen on PC.

[screenshot]

I've also tried changing the threshold, and if I run it on the default "0.3" I get the following error:

Error occurred when executing GroundingDinoSAMSegment (segment anything):

cannot unpack non-iterable NoneType object

File "/Users/Alexander/Documents/AI/ComfyUI3/ComfyUI/execution.py", line 153, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Alexander/Documents/AI/ComfyUI3/ComfyUI/execution.py", line 83, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Alexander/Documents/AI/ComfyUI3/ComfyUI/execution.py", line 76, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Alexander/Documents/AI/ComfyUI3/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 305, in main
(images, masks) = sam_segment(

Please let me know how to fix this; I'm highly dependent on it working locally.

"Expected all tensors to be on the same device"

Hi! After updating to the latest version I get this error:

ERROR:root:Traceback (most recent call last):
  File "e:\Stablediffusion\ComfyUI_windows_portable\ComfyUI\execution.py", line 153, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "e:\Stablediffusion\ComfyUI_windows_portable\ComfyUI\execution.py", line 83, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "e:\Stablediffusion\ComfyUI_windows_portable\ComfyUI\execution.py", line 76, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "E:\StableDiffusion\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 305, in main
    (images, masks) = sam_segment(
  File "E:\StableDiffusion\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 229, in sam_segment
    masks, _, _ = predictor.predict_torch(
  File "e:\Stablediffusion\ComfyUI_windows_portable\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "E:\StableDiffusion\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\predictor.py", line 114, in predict_torch
    sparse_embeddings, dense_embeddings = self.model.prompt_encoder(
  File "e:\Stablediffusion\ComfyUI_windows_portable\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "e:\Stablediffusion\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 158, in forward
    box_embeddings = self._embed_boxes(boxes)
  File "e:\Stablediffusion\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 97, in _embed_boxes
    corner_embedding = self.pe_layer.forward_with_coords(coords, self.input_image_size)
  File "e:\Stablediffusion\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 214, in forward_with_coords
    return self._pe_encoding(coords.to(torch.float))  # B x N x C
  File "e:\Stablediffusion\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 189, in _pe_encoding
    coords = coords @ self.positional_encoding_gaussian_matrix
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)

This happens regardless of which models I select. What went wrong?

Kind regards
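For reference, this class of error usually means the prompt boxes stayed on the CPU while the SAM weights moved to cuda:0. A minimal sketch of the usual workaround, with assumed variable names (sam_model, predictor, boxes), not the node's actual code:

  # move the boxes to the model's device before calling predict_torch
  device = next(sam_model.parameters()).device
  masks, _, _ = predictor.predict_torch(
      point_coords=None,
      point_labels=None,
      boxes=boxes.to(device),
      multimask_output=False,
  )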

Mac arm64 MPS support?

platform: mac arm64
device: mps

Loads SAM model: /Users/liangbinsi/Documents/ComfyUI/models/sams/sam_vit_h_4b8939.pth (device:AUTO)
!!! Exception during processing!!! Tensor for argument #2 'mat2' is on CPU, but expected it to be on GPU (while checking arguments for mm)
Traceback (most recent call last):
  File "/Users/liangbinsi/Documents/ComfyUI/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/Users/liangbinsi/Documents/ComfyUI/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/Users/liangbinsi/Documents/ComfyUI/execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "/Users/liangbinsi/Documents/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 325, in main
    (images, masks) = sam_segment(
  File "/Users/liangbinsi/Documents/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 247, in sam_segment
    masks, _, _ = predictor.predict_torch(
  File "/Users/liangbinsi/.pyenv/versions/3.10.0/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/liangbinsi/Documents/ComfyUI/custom_nodes/ComfyUI-dnl13-seg/libs/sam_hq/predictor.py", line 114, in predict_torch
    sparse_embeddings, dense_embeddings = self.model.prompt_encoder(
  File "/Users/liangbinsi/.pyenv/versions/3.10.0/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/liangbinsi/.pyenv/versions/3.10.0/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/liangbinsi/.pyenv/versions/3.10.0/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 158, in forward
    box_embeddings = self._embed_boxes(boxes)
  File "/Users/liangbinsi/.pyenv/versions/3.10.0/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 97, in _embed_boxes
    corner_embedding = self.pe_layer.forward_with_coords(coords, self.input_image_size)
  File "/Users/liangbinsi/.pyenv/versions/3.10.0/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 214, in forward_with_coords
    return self._pe_encoding(coords.to(torch.float))  # B x N x C
  File "/Users/liangbinsi/.pyenv/versions/3.10.0/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 189, in _pe_encoding
    coords = coords @ self.positional_encoding_gaussian_matrix
RuntimeError: Tensor for argument #2 'mat2' is on CPU, but expected it to be on GPU (while checking arguments for mm)

Error occurred when executing GroundingDinoSAMSegment (segment anything):

[screenshot]

torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.

File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\execution.py", line 155, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\execution.py", line 85, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\execution.py", line 78, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 317, in main
boxes = groundingdino_predict(
^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 182, in groundingdino_predict
boxes_filt = get_grounding_output(
^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 170, in get_grounding_output
outputs = model(image[None], captions=[caption])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\models\GroundingDINO\groundingdino.py", line 279, in forward
features, poss = self.backbone(samples)
^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\models\GroundingDINO\backbone\backbone.py", line 151, in forward
xs = self[0](tensor_list)
^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\models\GroundingDINO\backbone\swin_transformer.py", line 732, in forward
x_out, H, W, x, Wh, Ww = layer(x, Wh, Ww)
^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\models\GroundingDINO\backbone\swin_transformer.py", line 448, in forward
x = checkpoint.checkpoint(blk, x, attn_mask)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch_compile.py", line 24, in inner
return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch_dynamo\eval_frame.py", line 417, in _fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch_dynamo\external_utils.py", line 25, in inner
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "D:\Stable_Diffusion\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\utils\checkpoint.py", line 460, in checkpoint
raise ValueError(
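The ValueError raised here is PyTorch enforcing the warning quoted above. A minimal sketch of the change it asks for, at the checkpoint call site in swin_transformer.py (line 448 per the traceback):

  from torch.utils import checkpoint

  # pass the flag explicitly; use_reentrant=False is the variant the warning recommends
  x = checkpoint.checkpoint(blk, x, attn_mask, use_reentrant=False)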

Change model: bert-base-chinese does not work; when can Chinese input be used?

The default model is bert-base-uncased. I want to change to the Chinese model bert-base-chinese. I downloaded it into /ComfyUI/models/bert-base-chinese and changed the model name to bert-base-chinese, but bert-base-uncased is still loaded by default.

It would be great if I could type prompts in Chinese; after all, Chinese users would love it!
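For reference, GroundingDINO builds its tokenizer via AutoTokenizer.from_pretrained(text_encoder_type) (see get_tokenlizer.py in the traceback of the "about bert-base-uncased" issue below), so a local Chinese encoder would have to be wired in through that value rather than by renaming files. A hedged sketch of the loading step alone:

  from transformers import AutoTokenizer

  # assumption: a complete bert-base-chinese download lives at this local path
  tokenizer = AutoTokenizer.from_pretrained("models/bert-base-chinese", local_files_only=True)

Note that the released GroundingDINO weights were trained against the English bert-base-uncased, so loading a Chinese tokenizer alone does not guarantee usable detections from Chinese prompts.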

ModuleNotFoundError: No module named 'segment_anything'

I'm not having any luck getting this to load. Reinstalling didn't work either. It's the only extension I'm having issues with.

Traceback:

Traceback (most recent call last):
  File "C:\Users\user\ComfyUI_windows_portable\ComfyUI\nodes.py", line 1734, in load_custom_node
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "C:\Users\user\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\__init__.py", line 1, in <module>
    from .node import *
  File "C:\Users\user\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 16, in <module>
    from sam_hq.predictor import SamPredictorHQ
  File "C:\Users\user\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\predictor.py", line 4, in <module>
    from segment_anything import SamPredictor
ModuleNotFoundError: No module named 'segment_anything'

Cannot import C:\Users\user\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything module for custom nodes: No module named 'segment_anything'

Error occurred when executing GroundingDinoSAMSegment (segment anything):

Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)

File "/root/autodl-tmp/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/execution.py", line 153, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "/root/autodl-tmp/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/execution.py", line 83, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "/root/autodl-tmp/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/execution.py", line 76, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "/root/autodl-tmp/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 314, in main
(images, masks) = sam_segment(
File "/root/autodl-tmp/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 236, in sam_segment
masks, _, _ = predictor.predict_torch(
File "/root/miniconda3/envs/xl_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/autodl-tmp/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/custom_nodes/comfyui_segment_anything/sam_hq/predictor.py", line 114, in predict_torch
sparse_embeddings, dense_embeddings = self.model.prompt_encoder(
File "/root/miniconda3/envs/xl_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/miniconda3/envs/xl_env/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 158, in forward
box_embeddings = self._embed_boxes(boxes)
File "/root/miniconda3/envs/xl_env/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 97, in _embed_boxes
corner_embedding = self.pe_layer.forward_with_coords(coords, self.input_image_size)
File "/root/miniconda3/envs/xl_env/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 214, in forward_with_coords
return self._pe_encoding(coords.to(torch.float)) # B x N x C
File "/root/miniconda3/envs/xl_env/lib/python3.10/site-packages/segment_anything/modeling/prompt_encoder.py", line 189, in _pe_encoding
coords = coords @ self.positional_encoding_gaussian_matrix

Suggestion: Extract labels with positioning

It would be so cool if I could input an image and get back an image with labeled boxes drawn on it, telling what each thing is. So perhaps return the label plus the coordinates of its box, and then allow inputting a box's coordinates to mask that specific box.
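A rough sketch of the requested annotation step, assuming boxes (pixel xyxy, shape (N, 4)) and phrases already came back from a GroundingDINO prediction (both names, and the float image tensor, are assumptions):

  import torch
  from torchvision.utils import draw_bounding_boxes

  # image: (3, H, W) float tensor in [0, 1]; draw_bounding_boxes expects uint8
  annotated = draw_bounding_boxes(
      (image * 255).to(torch.uint8),
      boxes,
      labels=phrases,
      width=3,
  )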

SAMModelLoader Failure on Apple Silicon due to CUDA Deserialization Error


Issue Description

When attempting to run SAMModelLoader for the segment anything functionality on an Apple Silicon Mac, an error is encountered indicating a problem with attempting to deserialize an object on a CUDA device, even though torch.cuda.is_available() returns False.

Environment

  • Operating System: macOS Sonoma 14.4.1 (M1 Max, Apple Silicon)
  • Python Version: 3.11
  • PyTorch Version: torch==2.1.2 ; torchvision==0.16.2

Error Message

Error occurred when executing SAMModelLoader (segment anything):

Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

Expected Behavior

The model loader should detect the absence of CUDA and automatically adjust to use CPU for model deserialization and execution, allowing the functionality to proceed without error, or ideally switch to the GPU Apple Silicon provides.

Actual Behavior

The process fails with an error message indicating an attempt to deserialize a CUDA object on a system where CUDA is not available, due to torch.cuda.is_available() returning False.
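The error message itself points at the standard workaround. A minimal sketch, with the checkpoint path taken from the SAM table above:

  import torch

  # force the checkpoint's tensors onto the CPU during deserialization
  state_dict = torch.load("models/sams/sam_vit_h_4b8939.pth", map_location=torch.device("cpu"))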


TypeError: cannot unpack non-iterable NoneType object

Hello
I'm getting this error sometimes (not always)

ERROR:root:!!! Exception during processing !!!
ERROR:root:Traceback (most recent call last):
  File "[...]\ComfyUI\execution.py", line 153, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "[...]\ComfyUI\execution.py", line 83, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "[...]\ComfyUI\execution.py", line 76, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "[...]\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 305, in main
    (images, masks) = sam_segment(
TypeError: cannot unpack non-iterable NoneType object

on line 305

I think it happens when there is no detection box?

In that case the sam_segment function returns a single None

Perhaps that needs to be return (None, None) as line 305 expects sam_segment to return (images, masks) for destructuring

I don't know python, but perhaps my comment helps point in the right direction 🤷
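For what it's worth, a minimal sketch of the guard being suggested, on the caller's side (names mirror the traceback, but the call itself is hypothetical):

  # sam_segment may return None when GroundingDINO finds no boxes
  result = sam_segment(sam_model, image, boxes)  # hypothetical call
  if result is None:
      images, masks = [], []  # or raise an explicit "no detection" error
  else:
      images, masks = result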

Thanks for reading! Great Node! Love it!

Fails to load

ComfyUI output during loading:

Traceback (most recent call last):
  File "S:\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\nodes.py", line 1887, in load_custom_node
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "...\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\__init__.py", line 1, in <module>
    from .node import *
  File "...\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 17, in <module>
    from sam_hq.build_sam_hq import sam_model_registry
  File "...\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\build_sam_hq.py", line 13, in <module>
    from .modeling.tiny_vit import TinyViT
  File "...\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\modeling\tiny_vit.py", line 15, in <module>
    from timm.models.layers import DropPath as TimmDropPath,\
ModuleNotFoundError: No module named 'timm'

Cannot import ...\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything module for custom nodes: No module named 'timm'

RuntimeError: Expected all tensors to be on the same device but found at least two devices, cuda:0 and cpu!

The full error output is as follows:

Loads SAM model: E:\DEV\ComfyUI_windows_portable\ComfyUI\models\sams\sam_vit_h_4b8939.pth (device:AUTO)
final text_encoder_type: bert-base-uncased
!!! Exception during processing !!!
Traceback (most recent call last):
  File "E:\DEV\ComfyUI_windows_portable\ComfyUI\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "E:\DEV\ComfyUI_windows_portable\ComfyUI\execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "E:\DEV\ComfyUI_windows_portable\ComfyUI\execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "E:\DEV\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 325, in main
    (images, masks) = sam_segment(
  File "E:\DEV\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 247, in sam_segment
    masks, _, _ = predictor.predict_torch(
  File "E:\DEV\ComfyUI_windows_portable\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "E:\DEV\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\predictor.py", line 114, in predict_torch
    sparse_embeddings, dense_embeddings = self.model.prompt_encoder(
  File "E:\DEV\ComfyUI_windows_portable\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\DEV\ComfyUI_windows_portable\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\DEV\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 158, in forward
    box_embeddings = self._embed_boxes(boxes)
  File "E:\DEV\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 97, in _embed_boxes
    corner_embedding = self.pe_layer.forward_with_coords(coords, self.input_image_size)
  File "E:\DEV\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 214, in forward_with_coords
    return self._pe_encoding(coords.to(torch.float))  # B x N x C
  File "E:\DEV\ComfyUI_windows_portable\python_embeded\lib\site-packages\segment_anything\modeling\prompt_encoder.py", line 189, in _pe_encoding
    coords = coords @ self.positional_encoding_gaussian_matrix
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)

Any suggestion for this?

GroundingDinoSAMSegment MASK Output issue with Video/ImageBatches

Hey,
first of all, thank you for these very nice nodes!
But there is an issue with the MASK output of the GroundingDinoSAMSegment node.
When I feed it a video input, the IMAGE output looks correct, but the MASK output is just one large image with all frames stacked vertically instead of being an image batch like the ImagePreview shows.
I am guessing something is "wrong" in mask_decoder_hq.py at lines 135-152?
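If the frames really are being concatenated along the height axis, a hedged sketch of the reshape that would restore a batch (a guess at the symptom, not a verified fix):

  import torch

  def unstack_vertical(masks: torch.Tensor, num_frames: int) -> torch.Tensor:
      # (1, N*H, W) -> (N, H, W): undo vertical stacking of N per-frame masks
      _, nh, w = masks.shape
      return masks.reshape(num_frames, nh // num_frames, w)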


upper bound and larger bound inconsistent with step sign

I'm getting the following error and was wondering if any of you have encountered it.

Error occurred when executing GroundingDinoSAMSegment (segment anything): upper bound and larger bound inconsistent with step sign

terminal shows me this:
groundingdino/models/GroundingDINO/bertwarper.py", line 252, in generate_masks_with_special_tokens_and_transfer_map
position_ids[row, previous_col + 1 : col + 1] = torch.arange(
^^^^^^^^^^^^^
RuntimeError: upper bound and larger bound inconsistent with step sign

I'm trying to use the Reposer Plus workflow nerdyrodent posted. I'm running ComfyUI with --force-fp16 on an Intel Mac Pro (2019) with an AMD Radeon Pro 580X 8 GB and 48 GB of RAM.

Crash and close, anyone with this error?

Loads SAM model: C:\Users\WarMachineV10SSD3\Pictures\SD\ComfyPortable\ComfyUI_windows_portable\ComfyUI\models\sams\sam_vit_b_01ec64.pth (device:CPU)
final text_encoder_type: bert-base-uncased
C:\Users\WarMachineV10SSD3\Pictures\SD\ComfyPortable\ComfyUI_windows_portable\ComfyUI\venv\lib\site-packages\transformers\modeling_utils.py:907: FutureWarning: The device argument is deprecated and will be removed in v5 of Transformers.
warnings.warn(
[F D:\a_work\1\s\pytorch-directml-plugin\torch_directml\csrc\engine\dml_tensor_desc.cc:135] Check failed: !is_dim_broadcast || non_broadcast_dim_size == 1

Requirements checking at every ComfyUI startup

Requirement already satisfied: segment_anything in c:\comfyui\python_embeded\lib\site-packages (from -r requirements.txt (line 1)) (1.0)
Requirement already satisfied: timm in c:\comfyui\python_embeded\lib\site-packages (from -r requirements.txt (line 2)) (0.6.13)
Requirement already satisfied: addict in c:\comfyui\python_embeded\lib\site-packages (from -r requirements.txt (line 3)) (2.4.0)
Requirement already satisfied: yapf in c:\comfyui\python_embeded\lib\site-packages (from -r requirements.txt (line 4)) (0.40.2)
Requirement already satisfied: torch>=1.7 in c:\comfyui\python_embeded\lib\site-packages (from timm->-r requirements.txt (line 2)) (2.2.1+cu121)
Requirement already satisfied: torchvision in c:\comfyui\python_embeded\lib\site-packages (from timm->-r requirements.txt (line 2)) (0.17.1+cu121)
Requirement already satisfied: pyyaml in c:\comfyui\python_embeded\lib\site-packages (from timm->-r requirements.txt (line 2)) (6.0.1)
Requirement already satisfied: huggingface-hub in c:\comfyui\python_embeded\lib\site-packages (from timm->-r requirements.txt (line 2)) (0.19.4)
Requirement already satisfied: importlib-metadata>=6.6.0 in c:\comfyui\python_embeded\lib\site-packages (from yapf->-r requirements.txt (line 4)) (7.0.0)
Requirement already satisfied: platformdirs>=3.5.1 in c:\comfyui\python_embeded\lib\site-packages (from yapf->-r requirements.txt (line 4)) (4.1.0)
Requirement already satisfied: tomli>=2.0.1 in c:\comfyui\python_embeded\lib\site-packages (from yapf->-r requirements.txt (line 4)) (2.0.1)
Requirement already satisfied: zipp>=0.5 in c:\comfyui\python_embeded\lib\site-packages (from importlib-metadata>=6.6.0->yapf->-r requirements.txt (line 4)) (3.17.0)
Requirement already satisfied: filelock in c:\comfyui\python_embeded\lib\site-packages (from torch>=1.7->timm->-r requirements.txt (line 2)) (3.13.1)
Requirement already satisfied: typing-extensions>=4.8.0 in c:\comfyui\python_embeded\lib\site-packages (from torch>=1.7->timm->-r requirements.txt (line 2)) (4.8.0)
Requirement already satisfied: sympy in c:\comfyui\python_embeded\lib\site-packages (from torch>=1.7->timm->-r requirements.txt (line 2)) (1.12)
Requirement already satisfied: networkx in c:\comfyui\python_embeded\lib\site-packages (from torch>=1.7->timm->-r requirements.txt (line 2)) (3.2.1)
Requirement already satisfied: jinja2 in c:\comfyui\python_embeded\lib\site-packages (from torch>=1.7->timm->-r requirements.txt (line 2)) (3.1.2)
Requirement already satisfied: fsspec in c:\comfyui\python_embeded\lib\site-packages (from torch>=1.7->timm->-r requirements.txt (line 2)) (2023.10.0)
Requirement already satisfied: requests in c:\comfyui\python_embeded\lib\site-packages (from huggingface-hub->timm->-r requirements.txt (line 2)) (2.31.0)
Requirement already satisfied: tqdm>=4.42.1 in c:\comfyui\python_embeded\lib\site-packages (from huggingface-hub->timm->-r requirements.txt (line 2)) (4.66.2)
Requirement already satisfied: packaging>=20.9 in c:\comfyui\python_embeded\lib\site-packages (from huggingface-hub->timm->-r requirements.txt (line 2)) (23.2)
Requirement already satisfied: numpy in c:\comfyui\python_embeded\lib\site-packages (from torchvision->timm->-r requirements.txt (line 2)) (1.24.4)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in c:\comfyui\python_embeded\lib\site-packages (from torchvision->timm->-r requirements.txt (line 2)) (10.2.0)
Requirement already satisfied: colorama in c:\comfyui\python_embeded\lib\site-packages (from tqdm>=4.42.1->huggingface-hub->timm->-r requirements.txt (line 2)) (0.4.6)
Requirement already satisfied: MarkupSafe>=2.0 in c:\comfyui\python_embeded\lib\site-packages (from jinja2->torch>=1.7->timm->-r requirements.txt (line 2)) (2.1.3)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\comfyui\python_embeded\lib\site-packages (from requests->huggingface-hub->timm->-r requirements.txt (line 2)) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in c:\comfyui\python_embeded\lib\site-packages (from requests->huggingface-hub->timm->-r requirements.txt (line 2)) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\comfyui\python_embeded\lib\site-packages (from requests->huggingface-hub->timm->-r requirements.txt (line 2)) (1.26.18)
Requirement already satisfied: certifi>=2017.4.17 in c:\comfyui\python_embeded\lib\site-packages (from requests->huggingface-hub->timm->-r requirements.txt (line 2)) (2023.11.17)
Requirement already satisfied: mpmath>=0.19 in c:\comfyui\python_embeded\lib\site-packages (from sympy->torch>=1.7->timm->-r requirements.txt (line 2)) (1.3.0)

Is it possible to remove the requirements checking at every ComfyUI startup?

Run on CPU

Is it possible to add an option to run the processing on the CPU rather than GPU?
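Until such an option exists, a minimal sketch of what it would do under the hood, using the public segment_anything API (model type and checkpoint path are placeholders):

  import torch
  from segment_anything import sam_model_registry, SamPredictor

  # build the model and pin it to the CPU explicitly
  sam = sam_model_registry["vit_b"](checkpoint="models/sams/sam_vit_b_01ec64.pth")
  sam.to(torch.device("cpu"))
  predictor = SamPredictor(sam)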

Cannot Import - for field conv_cfg is not allowed: use default_factory

Hey there!
I've encountered a weird error I am unable to fix by myself; requesting backup!

Traceback (most recent call last):
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\nodes.py", line 1872, in load_custom_node
module_spec.loader.exec_module(module)
File "", line 940, in exec_module
File "", line 241, in call_with_frames_removed
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything_init
.py", line 1, in
from .node import *
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 17, in
from sam_hq.build_sam_hq import sam_model_registry
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\build_sam_hq.py", line 13, in
from .modeling.tiny_vit import TinyViT
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\sam_hq\modeling\tiny_vit.py", line 15, in
from timm.models.layers import DropPath as TimmDropPath,
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\timm_init_.py", line 2, in
from .models import create_model, list_models, is_model, list_modules, model_entrypoint,
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\timm\models_init_.py", line 28, in
from .maxxvit import *
File "C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\timm\models\maxxvit.py", line 225, in
@DataClass
^^^^^^^^^
File "dataclasses.py", line 1230, in dataclass
File "dataclasses.py", line 1220, in wrap
File "dataclasses.py", line 958, in _process_class
File "dataclasses.py", line 815, in _get_field
ValueError: mutable default <class 'timm.models.maxxvit.MaxxVitConvCfg'> for field conv_cfg is not allowed: use default_factory

Cannot import C:\ComfyUI_BLYAT\ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything module for custom nodes: mutable default <class 'timm.models.maxxvit.MaxxVitConvCfg'> for field conv_cfg is not allowed: use default_factory

Getting a "Ran out of input" error from one of the dependencies when comfyui starts with segment anything

OS: Linux Mint 21.2 x86_64, kernel 5.15.0-92-generic, Python 3.10, latest ComfyUI.

Traceback (most recent call last):
File "/opt/LLM/jttw/components/comfyui/nodes.py", line 1872, in load_custom_node
module_spec.loader.exec_module(module)
File "", line 883, in exec_module
File "", line 241, in _call_with_frames_removed
File "/opt/LLM/jttw/components/comfyui/custom_nodes/comfyui_segment_anything/init.py", line 1, in
from .node import *
File "/opt/LLM/jttw/components/comfyui/custom_nodes/comfyui_segment_anything/node.py", line 19, in
from local_groundingdino.util.utils import clean_state_dict as local_groundingdino_clean_state_dict
File "/opt/LLM/jttw/components/comfyui/custom_nodes/comfyui_segment_anything/local_groundingdino/util/utils.py", line 12, in
from local_groundingdino.util.slconfig import SLConfig
File "/opt/LLM/jttw/components/comfyui/custom_nodes/comfyui_segment_anything/local_groundingdino/util/slconfig.py", line 14, in
from yapf.yapflib.yapf_api import FormatCode
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf/init.py", line 41, in
from yapf.yapflib import yapf_api
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf/yapflib/yapf_api.py", line 38, in
from yapf.pyparser import pyparser
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf/pyparser/pyparser.py", line 44, in
from yapf.yapflib import format_token
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf/yapflib/format_token.py", line 23, in
from yapf.pytree import pytree_utils
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf/pytree/pytree_utils.py", line 30, in
from yapf_third_party._ylib2to3 import pygram
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf_third_party/_ylib2to3/pygram.py", line 29, in
python_grammar = driver.load_grammar(_GRAMMAR_FILE)
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf_third_party/_ylib2to3/pgen2/driver.py", line 252, in load_grammar
g.load(gp)
File "/opt/LLM/jttw/components/comfyui/comfyui_venv/lib/python3.10/site-packages/yapf_third_party/_ylib2to3/pgen2/grammar.py", line 96, in load
d = pickle.load(f)
EOFError: Ran out of input

Request for Configurable Default Addresses in extra_model_paths.yaml

Can the default addresses in groundingdino_model_dir and sam_model_dir be made configurable in extra_model_paths.yaml?

Also, I encountered an issue where even though I have already downloaded the model to the specified location, it still tries to call the Hugging Face API. Is AutoTokenizer only capable of remote calls?


'SAMWrapper' object has no attribute 'image_encoder'

I updated to the latest ComfyUI and SAM + Grounding Dino seems to have broken; I'm getting the following error in the UI:

Error occurred when executing GroundingDinoSAMSegment (segment anything):

'SAMWrapper' object has no attribute 'image_encoder'

File "/home/sid/ComfyUI/execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/ComfyUI/execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/ComfyUI/execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 325, in main
(images, masks) = sam_segment(
^^^^^^^^^^^^
File "/home/sid/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 240, in sam_segment
predictor = SamPredictorHQ(sam_model, sam_is_hq)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/ComfyUI/custom_nodes/comfyui_segment_anything/sam_hq/predictor.py", line 22, in __init__
super().__init__(sam_model=sam_model)
File "/home/sid/miniconda3/envs/comfyui/lib/python3.11/site-packages/segment_anything/predictor.py", line 31, in __init__
self.transform = ResizeLongestSide(sam_model.image_encoder.img_size)
^^^^^^^^^^^^^^^^^^^^^^^

And the following in the terminal:

Traceback (most recent call last):
  File "/home/sid/ComfyUI/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/ComfyUI/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/ComfyUI/execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 325, in main
    (images, masks) = sam_segment(
                      ^^^^^^^^^^^^
  File "/home/sid/ComfyUI/custom_nodes/comfyui_segment_anything/node.py", line 240, in sam_segment
    predictor = SamPredictorHQ(sam_model, sam_is_hq)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/ComfyUI/custom_nodes/comfyui_segment_anything/sam_hq/predictor.py", line 22, in __init__
    super().__init__(sam_model=sam_model)
  File "/home/sid/miniconda3/envs/comfyui/lib/python3.11/site-packages/segment_anything/predictor.py", line 31, in __init__
    self.transform = ResizeLongestSide(sam_model.image_encoder.img_size)
                                       ^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'SAMWrapper' object has no attribute 'image_encoder'

Prompt executed in 14.11 seconds

Please suggest what I can try to fix/troubleshoot; the same workflow was working fine until yesterday...

about bert-base-uncased

Error occurred when executing GroundingDinoModelLoader (segment anything):

We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like bert-base-uncased is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\execution.py", line 153, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\execution.py", line 83, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\execution.py", line 76, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 268, in main
dino_model = load_groundingdino_model(model_name)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 108, in load_groundingdino_model
dino = local_groundingdino_build_model(dino_model_args)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\models_init_.py", line 17, in build_model
model = build_func(args)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\models\GroundingDINO\groundingdino.py", line 362, in build_groundingdino
model = GroundingDINO(
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\models\GroundingDINO\groundingdino.py", line 97, in init
self.tokenizer = get_tokenlizer.get_tokenlizer(text_encoder_type)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\local_groundingdino\util\get_tokenlizer.py", line 19, in get_tokenlizer
tokenizer = AutoTokenizer.from_pretrained(text_encoder_type)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\python_embeded\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 733, in from_pretrained
config = AutoConfig.from_pretrained(
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\python_embeded\lib\site-packages\transformers\models\auto\configuration_auto.py", line 1048, in from_pretrained
config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\python_embeded\lib\site-packages\transformers\configuration_utils.py", line 622, in get_config_dict
config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\python_embeded\lib\site-packages\transformers\configuration_utils.py", line 677, in _get_config_dict
resolved_config_file = cached_file(
File "D:\Zho_Chinese_ComfyUI_windows_portable_light\Zho_Chinese_ComfyUI_windows_portable\python_embeded\lib\site-packages\transformers\utils\hub.py", line 470, in cached_file
raise EnvironmentError(
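Per the offline-mode documentation that the error message links to, a minimal sketch of forcing transformers to use only local files (set before the node loads; the variable names are the documented ones):

  import os

  # documented offline switches; no network calls will be attempted
  os.environ["TRANSFORMERS_OFFLINE"] = "1"
  os.environ["HF_HUB_OFFLINE"] = "1"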

I can't seem to get it to output more than 6 images. How is this used in video workflows?

When I input a video set to e.g. a 200-frame cap, I only get 6 masked images out. I often get crashes too:
[rawvideo @ 0x6067e5fab440] Invalid buffer size, packet size 648000 < expected frame_size 1944000
[vist#0:0/rawvideo @ 0x6067e5fab2c0] Error submitting packet to decoder: Invalid argument
[vist#0:0/rawvideo @ 0x6067e5fab2c0] Decode error rate 1 exceeds maximum 0.666667

Memory requirements?

What's the VRAM requirement to run? I'm hitting out of memory issues on a 10GB 3080.

(IMPORT FAILED) segment anything

I wanted to document an issue with installing SAM in ComfyUI. Using the node manager, the import fails. I attempted the basic restarts, refreshes, etc. I attempted an update of ComfyUI as well, still no dice. Is the issue related to running on CPU?

M1 Max
32 GB of memory
Ventura

TypeError: cannot unpack non-iterable NoneType object

!!! Exception during processing !!!
Traceback (most recent call last):
  File "D:\tools\ComfyUI_windows_portable\ComfyUI\execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "D:\tools\ComfyUI_windows_portable\ComfyUI\execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "D:\tools\ComfyUI_windows_portable\ComfyUI\execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "D:\tools\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_segment_anything\node.py", line 305, in main
    (images, masks) = sam_segment(
TypeError: cannot unpack non-iterable NoneType object
