
mediapipe-samples's People

Contributors

chanwooleeme, dakyz, dependabot[bot], duy-maimanh, ewwwgiddings, haruiz, hheydary, jenperson, joezoug, khanhlvg, kinarr, ktonthat, kuaashish, linchenn, markmcd, mohammad3id, morganchen12, neilblaze, nutsiepully, paultr, priankakariatyml, sator-imaging, satoren, schmidt-sebastian, st-tuanmai, thatfiredev, unixxxx, vis-wa, woodyhoko, yuedev

mediapipe-samples's Issues

Mediapipe Pose livestream segmentation causes Python to quit

I am trying to run MediaPipe Pose in live-stream mode:

import cv2
import mediapipe as mp

BaseOptions = mp.tasks.BaseOptions
PoseLandmarker = mp.tasks.vision.PoseLandmarker
PoseLandmarkerOptions = mp.tasks.vision.PoseLandmarkerOptions
PoseLandmarkerResult = mp.tasks.vision.PoseLandmarkerResult
VisionRunningMode = mp.tasks.vision.RunningMode

options = PoseLandmarkerOptions(
    base_options=BaseOptions(model_asset_path='pose_landmarker_full.task'),
    running_mode=VisionRunningMode.LIVE_STREAM,
    output_segmentation_masks=True,
    result_callback=print_result)  # print_result is defined below

with PoseLandmarker.create_from_options(options) as pose:
    cap = cv2.VideoCapture(0)
    i = 0
    while True:
        success, image = cap.read()
        if not success:
            break
        img_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=img_rgb)
        pose.detect_async(mp_image, i)  # frame index used as the ms timestamp
        i += 1

However, if I try to output the segmentation mask, the Python interpreter quits without any error message.
I have narrowed it down to this line:

def print_result(result: PoseLandmarkerResult, output_image: mp.Image, timestamp_ms: int):
    alpha = result.segmentation_masks[0].numpy_view()

If that line is just alpha = result.segmentation_masks, then MediaPipe runs and alpha is a list whose elements are mediapipe.python._framework_bindings.image.Image objects, but as soon as I ask for the pixel data with .numpy_view(), Python quits without any message.
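
A minimal way to narrow this down further (a sketch, not a confirmed fix) is to run the same model in IMAGE mode on a single captured frame and call numpy_view() there; if that works, the problem is specific to the LIVE_STREAM callback path rather than to the mask output itself. The option and class names below match the Tasks API used above; the single-frame capture is only for illustration.

import cv2
import mediapipe as mp

BaseOptions = mp.tasks.BaseOptions
PoseLandmarker = mp.tasks.vision.PoseLandmarker
PoseLandmarkerOptions = mp.tasks.vision.PoseLandmarkerOptions
VisionRunningMode = mp.tasks.vision.RunningMode

# IMAGE-mode variant of the options above, with segmentation masks still enabled.
image_options = PoseLandmarkerOptions(
    base_options=BaseOptions(model_asset_path='pose_landmarker_full.task'),
    running_mode=VisionRunningMode.IMAGE,
    output_segmentation_masks=True)

with PoseLandmarker.create_from_options(image_options) as landmarker:
    cap = cv2.VideoCapture(0)
    ok, frame = cap.read()
    cap.release()
    if ok:
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        result = landmarker.detect(mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb))
        if result.segmentation_masks:
            mask = result.segmentation_masks[0].numpy_view()  # does this succeed outside the callback?
            print(mask.shape, mask.dtype)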

Low framerate in new segmentation solution

The new segmentation solution for the webcam has much lower quality than the legacy selfie segmentation; at the very least, the framerate is noticeably lower. The old solution ran very smoothly. Is there anything that can be tuned to get comparable results?

Using NNAPI with MediaPipe

I want to use the Android NNAPI for hand landmark detection.
How can I do this?
Thank you.

Is there any difference between this framework and https://github.com/google/mediapipe on the Android platform?

Is there any difference between this framework and https://github.com/google/mediapipe on the Android platform? Or are they compatible?

I found that the Maven dependencies are different:
com.google.mediapipe:solution-core:latest.release
implementation 'com.google.mediapipe:solution-core:latest.release'

com.google.mediapipe:tasks-vision:0.1.0-alpha-5
implementation 'com.google.mediapipe:tasks-vision:0.1.0-alpha-5'

Can one be migrated to the other? And are the .tflite model files they use compatible?

FLAME indices correspondence

Hello.

I am using your FaceLandmarker found here: https://github.com/googlesamples/mediapipe/blob/main/examples/face_landmarker/python/%5BMediaPipe_Python_Tasks%5D_Face_Landmarker.ipynb

Currently I am trying to use the dense landmarks while training a FLAME-based 3D model. Is there a correspondence between the dense landmarks you produce (478 points) and the FLAME vertices? This repo (https://github.com/Zielon/metrical-tracker/tree/master/flame/mediapipe) seems to have found a correspondence for 105 points, but not for the entire 478 points.

GPU accelerated Whisper inference in Mediapipe?

Hi there,
I really like the idea of low-code ML dev tools.
Especially the GPU-accelerated inference on Android devices!

@st-duymai & @PaulTR:

  1. Is there a Whisper (audio-to-text) demo in scope, with streamed or recorded audio transcribed to text in real time? I believe it could be done in real time with GPU support, given that there are binaries that run close to real time on CPU.
  2. If not, could one use .tflite models (like this one) to achieve the above with your framework?

Thank you in advance for your time and reply!

Optimized MediaPipe HandGestureRecognizer (JS) example

I came across the CodePen link of the MediaPipe HandGestureRecognizer (JS) example from here, and I'm curious whether the following changes (or rather updates) can be applied to it to optimize it and improve code readability.


  1. Arrow Function has been implemented to increase conciseness and readability while adhering to ES6 standards.
async function runDemo() {
  const vision = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm"
  );
  gestureRecognizer = await GestureRecognizer.createFromOptions(vision, {
    baseOptions: {
      modelAssetPath:
        "https://storage.googleapis.com/mediapipe-tasks/gesture_recognizer/gesture_recognizer.task"
    },
    runningMode: runningMode
  });
  demosSection.classList.remove("invisible");
}
runDemo();

⬇️ (Updated Code)

const runDemo = async () => {
  const vision = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm"
  );
  gestureRecognizer = await GestureRecognizer.createFromOptions(vision, {
    baseOptions: {
      modelAssetPath:
        "https://storage.googleapis.com/mediapipe-tasks/gesture_recognizer/gesture_recognizer.task"
    },
    runningMode: runningMode
  });
  demosSection.classList.remove("invisible");
};
runDemo();

  2. Avoid the unnecessary assignment of the runningMode variable, which reduces the risk of introducing bugs (if it is reused). We pass the string directly to the setOptions() method instead of assigning "IMAGE" to the runningMode variable, which simplifies the code and makes it more concise.
  if (runningMode === "VIDEO") {
    runningMode = "IMAGE";
    await gestureRecognizer.setOptions({ runningMode: runningMode });
  }

⬇️ (Updated Code)

  if (runningMode === "VIDEO") {
    await gestureRecognizer.setOptions({ runningMode: "IMAGE" });
  }

  3. Use let instead of var in the loop (adhering to ES6 standards), making it more concise and easier to read.
  const allCanvas = event.target.parentNode.getElementsByClassName("canvas");
  for (var i = allCanvas.length - 1; i >= 0; i--) {
    const n = allCanvas[i];
    n.parentNode.removeChild(n);
  }

⬇️ (Updated Code)

  const allCanvas = event.target.parentNode.getElementsByClassName("canvas");
  for (let i = allCanvas.length - 1; i >= 0; i--) {
    allCanvas[i].parentNode.removeChild(allCanvas[i]);
  }

  4. Leveraging the optional chaining operator in an arrow function makes it more concise and readable and avoids the need for additional checks.
function hasGetUserMedia() {
  return !!(navigator.mediaDevices && navigator.mediaDevices.getUserMedia);
}

⬇️ (Updated Code)

const hasGetUserMedia = () => !!(navigator.mediaDevices?.getUserMedia);

  5. Optimizing using a ternary operator, which avoids unnecessary repetition of code.
  if (webcamRunning === true) {
    webcamRunning = false;
    enableWebcamButton.innerText = "ENABLE PREDICTIONS";
  } else {
    webcamRunning = true;
    enableWebcamButton.innerText = "DISABLE PREDICITONS";
  }

⬇️ (Updated Code)

  webcamRunning = !webcamRunning;
  enableWebcamButton.innerText = webcamRunning ? "DISABLE PREDICTIONS" : "ENABLE PREDICTIONS";

  6. Optimized using an arrow function and optional chaining, reducing verbose code. Moreover, instead of comparing a boolean to true with === true (which is unnecessary), the boolean itself can be used as the condition.
async function predictWebcam() {
  const webcamElement = document.getElementById("webcam");
  // Now let's start detecting the stream.
  if (runningMode === "IMAGE") {
    runningMode = "VIDEO";
    await gestureRecognizer.setOptions({ runningMode: runningMode });
  }
  let nowInMs = Date.now();
  const results = gestureRecognizer.recognizeForVideo(video, nowInMs);

  canvasCtx.save();
  canvasCtx.clearRect(0, 0, canvasElement.width, canvasElement.height);

  canvasElement.style.height = videoHeight;
  webcamElement.style.height = videoHeight;
  canvasElement.style.width = videoWidth;
  webcamElement.style.width = videoWidth;
  if (results.landmarks) {
    for (const landmarks of results.landmarks) {
      drawConnectors(canvasCtx, landmarks, HAND_CONNECTIONS, {
        color: "#00FF00",
        lineWidth: 5
      });
      drawLandmarks(canvasCtx, landmarks, { color: "#FF0000", lineWidth: 2 });
    }
  }
  canvasCtx.restore();
  if (results.gestures.length > 0) {
    gestureOutput.style.display = "block";
    gestureOutput.style.width = videoWidth;
    gestureOutput.innerText =
      "GestureRecognizer: " +
      results.gestures[0][0].categoryName +
      "\n Confidence: " +
      Math.round(parseFloat(results.gestures[0][0].score) * 100) +
      "%";
  } else {
    gestureOutput.style.display = "none";
  }
  // Call this function again to keep predicting when the browser is ready.
  if (webcamRunning === true) {
    window.requestAnimationFrame(predictWebcam);
  }
}

⬇️ (Updated Code)

const predictWebcam = async () => {
  const webcamElement = document.getElementById("webcam");
  // Now let's start detecting the stream.
  if (runningMode === "IMAGE") {
    runningMode = "VIDEO";
    await gestureRecognizer.setOptions({ runningMode: runningMode });
  }
  let nowInMs = Date.now();
  const results = gestureRecognizer.recognizeForVideo(video, nowInMs);

  canvasCtx.save();
  canvasCtx.clearRect(0, 0, canvasElement.width, canvasElement.height);

  canvasElement.style.height = videoHeight;
  webcamElement.style.height = videoHeight;
  canvasElement.style.width = videoWidth;
  webcamElement.style.width = videoWidth;
  results.landmarks?.forEach((landmarks) => {
    drawConnectors(canvasCtx, landmarks, HAND_CONNECTIONS, {
      color: "#00FF00",
      lineWidth: 5
    });
    drawLandmarks(canvasCtx, landmarks, { color: "#FF0000", lineWidth: 2 });
  });

  canvasCtx.restore();
  if (results.gestures.length > 0) {
    gestureOutput.style.display = "block";
    gestureOutput.style.width = videoWidth;
    const categoryName = results.gestures[0][0].categoryName;
    const score = Math.round(parseFloat(results.gestures[0][0].score) * 100);
    gestureOutput.innerText = `GestureRecognizer: ${categoryName}\n Confidence: ${score}%`;
  } else {
    gestureOutput.style.display = "none";
  }
  // Call this function again to keep predicting when the browser is ready.
  if (webcamRunning) {
    window.requestAnimationFrame(predictWebcam);
  }
}

💡 The updated CodePen example can be accessed here.
cc: @jenperson

OS : Ubuntu (22.04 LTS, x64)   /   Windows 10 Pro (x64)
Browser :

  • Google Chrome — Version 111.0.5563.147 (Official Build) (64-bit)
  • Microsoft Edge [Version 112.0.1722.39 (Official build) (64-bit)]

Fail to initialize gesture recognizer

Starting the gesture recognizer app on an arm64-v8a device works fine, but starting it on an armeabi-v7a device (same Android version, 11) runs into the following problems:

E/tflite: The supplied buffer is not 4-bytes aligned
E/tflite: The model allocation is null/empty
E/native: E20221114 20:29:51.589087  3371 graph.cc:472] Could not build model from the provided pre-loaded flatbuffer: The model allocation is null/empty
W/System.err: com.google.mediapipe.framework.MediaPipeException: unknown: Could not build model from the provided pre-loaded flatbuffer: The model allocation is null/empty
W/System.err:     at com.google.mediapipe.framework.Graph.nativeStartRunningGraph(Native Method)
W/System.err:     at com.google.mediapipe.framework.Graph.startRunningGraph(Graph.java:336)
W/System.err:     at com.google.mediapipe.tasks.core.TaskRunner.create(TaskRunner.java:71)
W/System.err:     at com.google.mediapipe.tasks.vision.gesturerecognizer.GestureRecognizer.createFromOptions(GestureRecognizer.java:194)
W/System.err:     at com.google.mediapipe.examples.gesturerecognizer.GestureRecognizerHelper.setupGestureRecognizer(GestureRecognizerHelper.kt:95)
W/System.err:     at com.google.mediapipe.examples.gesturerecognizer.GestureRecognizerHelper.<init>(GestureRecognizerHelper.kt:50)
W/System.err:     at com.google.mediapipe.examples.gesturerecognizer.fragment.CameraFragment.onViewCreated$lambda-4(CameraFragment.kt:147)
W/System.err:     at com.google.mediapipe.examples.gesturerecognizer.fragment.CameraFragment.$r8$lambda$xiZI6LDAjMBw-J7vyrjSe_CLWo0(Unknown Source:0)
W/System.err:     at com.google.mediapipe.examples.gesturerecognizer.fragment.CameraFragment$$ExternalSyntheticLambda13.run(Unknown Source:2)
W/System.err:     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
W/System.err:     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
W/System.err:     at java.lang.Thread.run(Thread.java:923)
E/GestureRecognizerHelper 56784282: MP Task Vision failed to load the task with error: unknown: Could not build model from the provided pre-loaded flatbuffer: The model allocation is null/empty

I guess the error depends on the device architecture. What can you advise in this situation?

audio_classification_live_stream - Python

Hi,

I got the following error:

Traceback (most recent call last):  
 File "...\mediapipe\examples\audio_classifier\python\audio_classification_live_stream\classify.py", line 134, in <module>
    main()  
 File "...\mediapipe\examples\audio_classifier\python\audio_classification_live_stream\classify.py", line 129, in main
    run(args.model, int(args.maxResults), float(args.scoreThreshold),  
File "...\mediapipe\examples\audio_classifier\python\audio_classification_live_stream\classify.py", line 57, in run
    classifier = audio.AudioClassifier.create_from_options(options)
 File "...\mediapipe\examples\audio_classifier\python\audio_classification_live_stream\venv\lib\site-packages\mediapipe\tasks\python\audio\audio_classifier.py", line 204, in create_from_options
    return cls(  
 File "...\mediapipe\examples\audio_classifier\python\audio_classification_live_stream\venv\lib\site-packages\mediapipe\tasks\python\audio\core\base_audio_task_api.py", line 64, in __init__
    self._runner = _TaskRunner.create(graph_config, packet_callback)

RuntimeError: ValidatedGraphConfig Initialization failed.
No registered object with name: mediapipe::tasks::audio::audio_classifier::AudioClassifierGraph; Unable to find Calculator "mediapipe.tasks.audio.audio_classifier.AudioClassifierGraph"

Process finished with exit code 1

I use PyCharm on Windows 10.
I tried Python 3.9 and 3.10.
I tried the provided code, and I also tried downloaded models:

base_options = mp.tasks.BaseOptions(model_asset_path=model)
base_options = mp.tasks.BaseOptions(model_asset_path='lite-model_yamnet_classification_tflite_1.tflite')
base_options = mp.tasks.BaseOptions(model_asset_path='yamnet_audio_classifier_with_metadata.tflite')
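
One cheap first check (an assumption about a common cause, not a confirmed diagnosis): "No registered object ... AudioClassifierGraph" errors are often associated with an outdated mediapipe wheel that predates the audio Tasks API, so it is worth confirming which version the PyCharm venv actually resolved.

# Hedged diagnostic: print the mediapipe version the interpreter/venv is using.
# The audio Tasks API generally requires a reasonably recent release; if the
# version looks old, upgrading the wheel (pip install --upgrade mediapipe) is
# a sensible next step. This is an assumption to verify, not a guaranteed fix.
import mediapipe as mp
print(mp.__version__)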

kotlin.UninitializedPropertyAccessException: lateinit property gestureRecognizerHelper has not been initialized

Hi: using Android Studio Electric Eel and the defaults, but the package com.google.mediapipe.examples.gesturerecognizer.fragment throws

E/AndroidRuntime: FATAL EXCEPTION: main
Process: com.google.mediapipe.examples.gesturerecognizer, PID: 6318
kotlin.UninitializedPropertyAccessException: lateinit property gestureRecognizerHelper has not been initialized
at com.google.mediapipe.examples.gesturerecognizer.fragment.CameraFragment$initBottomSheetControls$7.onItemSelected(CameraFragment.kt:235)
at android.widget.AdapterView.fireOnSelected(AdapterView.java:957)
at android.widget.AdapterView.dispatchOnItemSelected(AdapterView.java:946)
at android.widget.AdapterView.-$$Nest$mdispatchOnItemSelected(Unknown Source:0)
at android.widget.AdapterView$SelectionNotifier.run(AdapterView.java:910)
at android.os.Handler.handleCallback(Handler.java:942)
at android.os.Handler.dispatchMessage(Handler.java:99)
at android.os.Looper.loopOnce(Looper.java:201)
at android.os.Looper.loop(Looper.java:288)
at android.app.ActivityThread.main(ActivityThread.java:7872)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:936)
I/tflite: Initialized TensorFlow Lite runtime.
W/libc: Access denied finding property "ro.mediatek.platform"
W/libc: Access denied finding property "ro.chipname"
W/libc: Access denied finding property "ro.hardware.chipname"
I/tflite: Created TensorFlow Lite XNNPACK delegate for CPU.
I/CameraManagerGlobal: Connecting to camera service
D/CameraRepository: Added camera: 0
I/Process: Sending signal. PID: 6318 SIG: 9

on a Pixel 7.

Presence and visibility for individual landmarks

The PoseLandmarkerResult only returns x, y, and z for the normalized and world landmarks. Is there any way to get the presence and visibility scores? According to the guide, we should have access to these attributes. Is the source code available anywhere?

Thanks a lot for the new solution - any help would be much appreciated!
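
For reference, recent Python releases of the Tasks API appear to expose visibility and presence fields on the NormalizedLandmark container; whether they are actually populated may depend on the model and version, so the sketch below is an assumption to verify rather than documented behavior.

# Hedged sketch: print per-landmark visibility/presence, assuming the installed
# mediapipe version exposes these fields on its landmark containers.
def dump_landmark_scores(result):  # result: a PoseLandmarkerResult
    for person in result.pose_landmarks:
        for idx, lm in enumerate(person):
            # visibility/presence may be None if the model/version does not fill them in
            print(idx, lm.x, lm.y, lm.z,
                  getattr(lm, 'visibility', None), getattr(lm, 'presence', None))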

some problems with the rear camera

When I switch the phone camera to the back camera (CameraSelector.LENS_FACING_BACK) and it recognizes a hand, the nodes and line segments refreshed in real time do not land accurately on the hand but are slightly offset. This does not happen after switching to the front camera. How can this problem be handled or optimized?

Audio embedding model for MediaPipe Tasks

I tried searching for a pre-trained audio embedding model for the MediaPipe Tasks API but was unable to find one. Does a model exist for this job?

I have raised PR #66 to fill in the missing AudioEmbedder example and am waiting for the .tflite model.

How to run on darwin/iOS?

Hey all 👋

The previous version of MediaPipe had first-class C++ support, with the ability to run on iOS and documentation for it.

Is this dropped from solutions like FaceMesh V2? I'm only seeing Android, Python and Web guides.

How to run python example on GPU?

I followed the Python example in the Colab notebook (installed mediapipe via pip), but the GPU did not seem to be used. Is there a specific mediapipe version or installation option needed to use the GPU?
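
For what it's worth, recent mediapipe pip releases appear to accept a delegate setting on BaseOptions; whether GPU inference is actually available depends on the platform and the installed build, so treat the sketch below as an option to try rather than a guarantee (the model path is a placeholder).

from mediapipe.tasks import python
from mediapipe.tasks.python import vision

# Hedged sketch: request the GPU delegate if the installed release supports it.
# BaseOptions.Delegate is assumed to exist in recent versions; fall back to the
# default CPU delegate if it does not, or if the platform has no GPU path.
options = vision.PoseLandmarkerOptions(
    base_options=python.BaseOptions(
        model_asset_path='pose_landmarker_full.task',   # placeholder model path
        delegate=python.BaseOptions.Delegate.GPU),
    running_mode=vision.RunningMode.IMAGE)
landmarker = vision.PoseLandmarker.create_from_options(options)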

Hand tracking is slower than the old Mediapipe for Android

In the old MediaPipe sample app my test phone could track hands in real time; in this version it lags behind. What could be the problem? The first thought that comes to mind is that the CameraX analysis use case is slower. The old MediaPipe sample used an older CameraX version and handled the SurfaceView manually, if my memory serves me right. Is there any way I can work around this and get the same speed in this version? Forgive me if I am asking anything obvious, as I am still rather new to Android development.

How to run two models at the same time on android?

Hello, I really need help.
I need to collect the coordinates of the points of both hands and the face from a single frame, so I have combined the code for detecting face points and hand points.
Now I have two files in my project, FaceLandmarkerHelper and HandLandmarkerHelper. Nothing has changed in them except the names of the inherited LandmarkerListener functions onError, onResults, onEmpty, which are now onErrorFace/onErrorHand, onResultsFace/onResultsHand, onEmptyFace/onEmptyHand.
In the CameraFragment, I similarly split the variables into hand and face versions: imageAnalyzer and backgroundExecutor. I also added handLandmarkerHelper and faceLandmarkerHelper variables. The class itself implements HandLandmarkerHelper.LandmarkerListener and FaceLandmarkerHelper.LandmarkerListener.
All of the split variables have duplicated code.

Declaring variables:

private lateinit var handLandmarkerHelper: HandLandmarkerHelper
private lateinit var faceLandmarkerHelper: FaceLandmarkerHelper

private val viewModel: MainViewModel by activityViewModels()
private var preview: Preview? = null 

private var handImageAnalyzer: ImageAnalysis? = null
private var faceImageAnalyzer: ImageAnalysis? = null

private var camera: Camera? = null 
private var cameraProvider: ProcessCameraProvider? = null 
private var cameraFacing = CameraSelector.LENS_FACING_FRONT

private lateinit var handBackgroundExecutor: ExecutorService
private lateinit var faceBackgroundExecutor: ExecutorService 

In the function onViewCreated:

    super.onViewCreated(view, savedInstanceState)

    // Initialize our background executor
    handBackgroundExecutor = Executors.newSingleThreadExecutor()
    faceBackgroundExecutor = Executors.newSingleThreadExecutor()

    // Wait for the views to be properly laid out
    fragmentCameraBinding.viewFinder.post {
        // Set up the camera and its use cases
        setUpCamera() // the setUpCamera method of the current class
    }

    // Create the HandLandmarkerHelper that will handle the inference
    handBackgroundExecutor.execute {
        handLandmarkerHelper = HandLandmarkerHelper(
            context = requireContext(),
            runningMode = RunningMode.LIVE_STREAM,
            minHandDetectionConfidence = viewModel.currentMinHandDetectionConfidence,
            minHandTrackingConfidence = viewModel.currentMinHandTrackingConfidence,
            minHandPresenceConfidence = viewModel.currentMinHandPresenceConfidence,
            maxNumHands = viewModel.currentMaxHands,
            currentDelegate = viewModel.currentDelegate,
            handLandmarkerHelperListener = this
        )
    }
    faceBackgroundExecutor.execute {
        faceLandmarkerHelper = FaceLandmarkerHelper(
            context = requireContext(),
            runningMode = RunningMode.LIVE_STREAM,
            minFaceDetectionConfidence = viewModel.currentMinFaceDetectionConfidence,
            minFaceTrackingConfidence = viewModel.currentMinFaceTrackingConfidence,
            minFacePresenceConfidence = viewModel.currentMinFacePresenceConfidence,
            maxNumFaces = viewModel.currentMaxFaces,
            currentDelegate = viewModel.currentDelegate,
            faceLandmarkerHelperListener = this
        )
    }

Part of the function bindCameraUseCases:

    // ImageAnalysis. Using RGBA 8888 to match how our models work
    handImageAnalyzer =
        ImageAnalysis.Builder().setTargetAspectRatio(AspectRatio.RATIO_4_3)
            .setTargetRotation(fragmentCameraBinding.viewFinder.display.rotation)
            .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
            .setOutputImageFormat(ImageAnalysis.OUTPUT_IMAGE_FORMAT_RGBA_8888)
            .build()
            // The analyzer can then be assigned to the instance
            .also {
                it.setAnalyzer(handBackgroundExecutor) { image ->
                    detectHand(image) 
                }
            }

    faceImageAnalyzer =
        ImageAnalysis.Builder().setTargetAspectRatio(AspectRatio.RATIO_4_3)
            .setTargetRotation(fragmentCameraBinding.viewFinder.display.rotation)
            .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
            .setOutputImageFormat(ImageAnalysis.OUTPUT_IMAGE_FORMAT_RGBA_8888)
            .build()
            // The analyzer can then be assigned to the instance
            .also {
                it.setAnalyzer(faceBackgroundExecutor) { image ->
                    detectFace(image)
                }
            }


    // Must unbind the use-cases before rebinding them
    cameraProvider.unbindAll()

Is it possible to pass two analyzers to bindToLifecycle? Continuation of the function bindCameraUseCases:

    try {
        // A variable number of use-cases can be passed here -
        // camera provides access to CameraControl & CameraInfo
        camera = cameraProvider.bindToLifecycle(
            this, cameraSelector, preview, handImageAnalyzer (or faceImageAnalyzer)
        )

        // Attach the viewfinder's surface provider to preview use case
        preview?.setSurfaceProvider(fragmentCameraBinding.viewFinder.surfaceProvider)
    } catch (exc: Exception) {
        Log.e(TAG, "Use case binding failed", exc)
    }

Or maybe somehow combine both analyzers in one?
imageAnalyzer = handImageAnalyzer + faceImageAnalyzer

If you create one imageAnalyzer and one backgroundExecutor, then in this code:

    imageAnalyzer =
        ImageAnalysis.Builder().setTargetAspectRatio(AspectRatio.RATIO_4_3)
            .setTargetRotation(fragmentCameraBinding.viewFinder.display.rotation)
            .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
            .setOutputImageFormat(ImageAnalysis.OUTPUT_IMAGE_FORMAT_RGBA_8888)
            .build()
            // The analyzer can then be assigned to the instance
            .also {
                it.setAnalyzer(backgroundExecutor) { image ->
                    detectFace(image)
                }
                it.setAnalyzer(backgroundExecutor) { image ->
                    detectHand(image)
                }
            }

the latter setAnalyzer call will prevail (here, detectHand).

Issue with importing mediapipe library using ES6 modules in Workers

I am writing to report an issue I have encountered while importing the mediapipe library using ES6 modules in Workers. When attempting to import mediapipe, I consistently receive the following error: "TypeError: Failed to execute 'importScripts' on 'WorkerGlobalScope': Module scripts don't support importScripts()." This error occurs when I set the 'type' attribute to 'module' when connecting the Worker, as shown in the code snippet below:

new Worker(this.workerScriptUrl, { type: 'module' });

After investigating the issue, it has become apparent that the mediapipe library relies on importScripts to import WebAssembly (wasm) files. Unfortunately, due to the limitation of module scripts in Workers, importScripts is not supported, resulting in the aforementioned error. Consequently, I am unable to utilize the mediapipe library within Workers when using ES6 modules.

I kindly request your assistance in resolving this matter. I propose the implementation of an optional workaround to address this issue. Specifically, it would be greatly appreciated if you could introduce an exception that allows an alternative method, such as using fetch or any other suitable approach, to import the wasm files instead of relying on importScripts. This adjustment would facilitate the seamless integration of the mediapipe library with ES6 modules in Workers, enabling developers to effectively leverage its functionalities.

Thank you for your attention to this matter. I eagerly await your response and any guidance you can provide to help resolve this issue.

image segmentation canvas live-streaming mode seems not showing full Image

Hello, I was trying to modify the image segmentation example. In video live-stream mode, I try to draw points onto the image, but they do not appear centered. When I debug the image, it seems that the video within the canvas is cropped.
Code (screenshot omitted):
I call this code within setResults:
drawExpectedLocation(mask_with_center.nativeObjAddr)
and I convert mask_with_center into a bitmap and display it:

val mask_with_center = Mat(outputHeight, outputWidth, CvType.CV_8UC4, Scalar.all(0.0))
drawExpectedLocation(mask_with_center.nativeObjAddr)
val image = Bitmap.createBitmap(
    mask_dif_largest.cols(),
    mask_dif_largest.rows(),
    Bitmap.Config.ARGB_8888
)
Utils.matToBitmap(mask_with_center, image)
val scaleFactor = when (runningMode) {
    RunningMode.IMAGE,
    RunningMode.VIDEO -> {
        min(width * 1f / outputWidth, height * 1f / outputHeight)
    }
    RunningMode.LIVE_STREAM -> {
        // PreviewView is in FILL_START mode. So we need to scale up the
        // landmarks to match with the size that the captured images will be
        // displayed.
        max(width * 1f / outputWidth, height * 1f / outputHeight)
    }
}
val scaleWidth = (outputWidth * scaleFactor).toInt()
val scaleHeight = (outputHeight * scaleFactor).toInt()
scaleBitmap = Bitmap.createScaledBitmap(
    image, scaleWidth, scaleHeight, false
)

and I got a result like this (screenshot omitted):
the two circles are not centered... I think it's because the image within the canvas is cropped. Please help me.

All CodePen examples for `vision` are failing

Description

The following CodePen demos (examples) are failing because of dependency incompatibility for MediaPipe Tasks Vision package:

  1. Background Segmenter (web)
  2. MediaPipe GestureRecognizer (web)
  3. MediaPipe HandLandmarker (web)
  4. MediaPipe Image Classifier (web)
  5. MediaPipe Image Embedder (web)
  6. MediaPipe Image Segmentation (web)
  7. MediaPipe Interactive Image Segmentation (web)
  8. MediaPipe Object Detection (web)

I initially assumed that it was an issue with CodePen, but I can confirm the problem is real since I've tested most of them locally too.

cc: @jenperson, @PaulTR

Screenshot 🖼️

The error log shows an issue with building the "@mediapipe/tasks-vision" package [@mediapipe/[email protected]]

(screenshot omitted)

Solution ❔

💡 0.1.0-alpha-10 was released just 6 hours ago, while 0.1.0-alpha-9 was released 2 days ago. Release versions can be explored over here.

The easy fix is to update the CDN → https://cdn.skypack.dev/@mediapipe/tasks-vision@latest with either https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision (stable) or better "https://cdn.skypack.dev/@mediapipe/[email protected]" (latest stable) & you're good to go! 🚀


P.S.: Please consider adding an Issue Template as well as Stale bot for this repository or better import the same one from here & set the config value accordingly. If assigned, I'd love to raise a PR for the same! 😄


OS : Ubuntu (22.04 LTS, x64)   /   Windows 10 Pro (x64)
Browser :

  • Google Chrome — Version 111.0.5563.147 (Official Build) (64-bit)
  • Microsoft Edge [Version 112.0.1722.39 (Official build) (64-bit)]

Exception in HandLandmarker sample: lateinit property handLandmarkerHelper has not been initialized

E/AndroidRuntime: FATAL EXCEPTION: main
Process: com.google.mediapipe.examples.handlandmarker, PID: 22940
kotlin.UninitializedPropertyAccessException: lateinit property handLandmarkerHelper has not been initialized
at com.google.mediapipe.examples.handlandmarker.fragment.CameraFragment$initBottomSheetControls$9.onItemSelected(CameraFragment.kt:249)
at android.widget.AdapterView.fireOnSelected(AdapterView.java:979)
at android.widget.AdapterView.dispatchOnItemSelected(AdapterView.java:968)
at android.widget.AdapterView.-$$Nest$mdispatchOnItemSelected(Unknown Source:0)
at android.widget.AdapterView$SelectionNotifier.run(AdapterView.java:932)
at android.os.Handler.handleCallback(Handler.java:942)
at android.os.Handler.dispatchMessage(Handler.java:99)
at android.os.Looper.loopOnce(Looper.java:240)
at android.os.Looper.loop(Looper.java:351)

I got this error on my OnePlus 9R device. I can see that handLandmarkerHelper is initialized in onViewCreated() via a concurrent executor; I guess that's the reason for the crash. Is it possible to initialize it on the main thread?

NormalizedLandmark depth (z-axis) scale/denormalization

I am trying to use the x, y, and z axes of the landmarks output from your latest FaceLandmarker model. It is mentioned here that the z magnitude follows the scale of the x-axis. I tried de-normalizing the z-axis based on the x-axis but got inaccurate results.

Can you please clarify how to correctly de-normalize the depth (z-axis) values?
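
For what it's worth, a common reading of "the z magnitude follows the scale of the x-axis" is that both are normalized by the image width, so a rough pixel-space conversion (an assumption to validate against your data, not an official formula) would be:

# Hedged sketch: convert a normalized face landmark to rough pixel-space values,
# assuming z shares the x-axis scale, i.e. both are normalized by image width.
def landmark_to_pixels(lm, image_width, image_height):
    x_px = lm.x * image_width
    y_px = lm.y * image_height
    z_px = lm.z * image_width   # assumption: z uses the same scale as x
    return x_px, y_px, z_px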

Dense face landmark definition

Hello, is there a definition of the location of each landmark in the dense model? If so, how does it correspond to the vertices of the FLAME, GHUM, or PhoMoH models?

Thanks!

Regarding GSoC 2023

Hi,
I am Devarsh Mavani, a third-year Information Technology student at Vishwakarma Government Engineering College. I am looking forward to participating in GSoC under TensorFlow this year. I am interested in the project "Interactive Web Demos using the MediaPipe Machine Learning Library". To give you a bit of my background, I have done the ML Specialization and the Deep Learning course on Coursera. I have also done a few deep-learning-related projects, one of them being sign language recognition on Android (using TFLite). I participated in GSoC last year under MIT App Inventor and have over a year of experience in open-source development.
I find MediaPipe really interesting and want to contribute by making demos for it. I have a couple of demos in mind that we could build, such as a sign language recognition app and a virtual paint app. I wanted to discuss what kind of demo web applications the mentors expect from me as part of GSoC. Will it be up to the students, or will it be discussed with us beforehand? I am working on a proposal for this project, so I wanted to clarify some of my doubts. It would be really helpful if I could connect with a potential mentor.

Thank You,
Devarsh Mavani

image segmentation on ipad safari

I'm trying to troubleshoot why, only on my iPad in Safari, I get a full white mask back every time. The segmentation works fine on desktop, but I'm not entirely sure how else to test or debug this. I do get a base64 image from the segmenter, but it's always just a white image on iPad Safari.

I just updated my computer and iPad to the latest iOS/Safari.

useEffect(() => {
  const loadModel = async () => {
    const audio = await FilesetResolver.forVisionTasks(
      "https://cdn.jsdelivr.net/npm/@mediapipe/[email protected]/wasm"
    );
    const options = {
      baseOptions: {
        modelAssetPath: "/models/selfie_multiclass.tflite",
        delegate: "GPU",
      },
      runningMode: "IMAGE",
      outputCategoryMask: true,
      outputConfidenceMasks: false,
      displayNamesLocale: "en",
    };
    const segmenter = await ImageSegmenter.createFromOptions(audio, options);
    setImageSegmenter(segmenter);
  };

  loadModel();
}, []);

const initializeCamera = async () => {
  try {
    const constraints = {
      audio: false,
      video: { facingMode: "user", width: 512, height: 768 },
    };
    const stream = await navigator.mediaDevices.getUserMedia(constraints);
    const videoElement = videoRef.current;
    videoElement.srcObject = stream;
    videoElement.onloadedmetadata = () => {
      videoElement.play();
      setIsCameraInitialized(true);
    };
  } catch (error) {
    console.error("Error accessing webcam:", error);
    console.error(error.message);
  }
};

const takeSelfie = async () => {
  const videoElement = videoRef.current;
  const canvasElement = canvasRef.current;
  const canvasCtx = canvasElement.getContext("2d");

  const { videoWidth, videoHeight } = videoElement;
  canvasElement.width = 512;
  canvasElement.height = 768;

  canvasCtx.drawImage(videoElement, 0, 0, 512, 768);

  const imageData = canvasCtx.getImageData(0, 0, 512, 768);
  const imageUrl = canvasElement.toDataURL();
  setImageURL(imageUrl);
  await imageSegmenter.segment(imageData, callback);
};

const callback = (result) => {
  const canvasElement = canvasRef.current;
  const canvasCtx = canvasElement.getContext("2d");

  const { width, height } = canvasElement;
  const categoryMask = result.categoryMask.getAsUint8Array();

  const imageData = canvasCtx.getImageData(0, 0, 512, 768);
  const data = imageData.data;
  const headColor = [0, 0, 0, 255]; // Black color for head and hair
  const whiteColor = [255, 255, 255, 255]; // White color for other parts

  for (let i = 0; i < categoryMask.length; i++) {
    const categoryIndex = categoryMask[i];

    if (
      categoryIndex === 1 ||
      categoryIndex === 2 ||
      categoryIndex === 3 ||
      categoryIndex === 5
    ) {
      // Head and hair category indices
      data[i * 4] = headColor[0];
      data[i * 4 + 1] = headColor[1];
      data[i * 4 + 2] = headColor[2];
      data[i * 4 + 3] = headColor[3];
    } else {
      data[i * 4] = whiteColor[0];
      data[i * 4 + 1] = whiteColor[1];
      data[i * 4 + 2] = whiteColor[2];
      data[i * 4 + 3] = whiteColor[3];
    }
  }

  canvasCtx.putImageData(imageData, 0, 0);

  const segmentedImageURL = canvasElement.toDataURL();
  setSegmentedImageURL(segmentedImageURL);
};

UninitializedPropertyAccessException

Getting the following error when I start the app

E/AndroidRuntime: FATAL EXCEPTION: main
Process: com.google.mediapipe.examples.poselandmarker, PID: 23887
kotlin.UninitializedPropertyAccessException: lateinit property poseLandmarkerHelper has not been initialized
at com.google.mediapipe.examples.poselandmarker.fragment.CameraFragment$initBottomSheetControls$8.onItemSelected(CameraFragment.kt:254)
at android.widget.AdapterView.fireOnSelected(AdapterView.java:957)
at android.widget.AdapterView.dispatchOnItemSelected(AdapterView.java:946)
at android.widget.AdapterView.-$$Nest$mdispatchOnItemSelected(Unknown Source:0)
at android.widget.AdapterView$SelectionNotifier.run(AdapterView.java:910)
at android.os.Handler.handleCallback(Handler.java:942)
at android.os.Handler.dispatchMessage(Handler.java:99)
at android.os.Looper.loopOnce(Looper.java:201)
at android.os.Looper.loop(Looper.java:288)
at android.app.ActivityThread.main(ActivityThread.java:7884)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:936)

AAR Unity porting error

While converting the gesturerecognizer example into an AAR and inserting it into Unity to call an activity and get a value back, I get an error like the log below, and it says that initialization failed.

native  com.DefaultCompany.SampleAPI  I  I20230329 11:05:49.779655 14280 resource_util_android.cc:77] Successfully loaded: gesture_recognizer.task

QCamera  [email protected]  I  <HAL><I> openCamera: 794: [KPI Perf]: X PROFILE_OPEN_CAMERA camera id 0, rc: 0

 LGCameraPerf-8996ac_OOS [email protected]  E  powerHintInternal_LGE: 353: mEnable = 0, enable = 1 PowerHint::CAMERA_STREAMING = 12

LGCameraPerf-8996ac_OOS [email protected]  E  powerHintInternal_LGE: 376: powerHint = 11, enable = 1,

QCamera  [email protected]  I  <HAL><I> initialize: 1097: E :mCameraId = 0 mState = 1

 sensors_ha...otionAccel [email protected]  D  processInd: LP2: X: 0.296753 Y: 1.117096 Z: 9.717972 SAM TS: 2602640251 HAL TS:79423187446215 elapsedRealtimeNano:79423270247980

sensors_hal_Ctx         [email protected]  D  poll:polldata:1, sensor:54, type:499898101, x:0.296753 y:1.117096 z:9.717972

sensors_hal_Util        [email protected]  D  waitForResponse: timeout=0

BluetoothRemoteDevices  com.android.bluetooth    D  Property type: 1

 6519-6749  BluetoothRemoteDevices  com.android.bluetooth   W  Skip name update for C0:F0:FB:27:E3:C2

 QCOM PowerHAL  [email protected]   I  Preview power hint start

BluetoothRemoteDevices  com.android.bluetooth  D  Property type: 4

 BluetoothRemoteDevices  com.android.bluetooth    W  Skip class update for C0:F0:FB:27:E3:C2

native                  com.DefaultCompany.SampleAPI         I  I20230329 11:05:49.785691 14280 hand_gesture_recognizer_graph.cc:250] Custom gesture classifier is not defined.

QCamera   [email protected]  I  <HAL><I> initialize: 1130: X

 tflite   com.DefaultCompany.SampleAPI         E  The supplied buffer is not 4-bytes aligned

tflite    com.DefaultCompany.SampleAPI         E  The model allocation is null/empty

 native     com.DefaultCompany.SampleAPI         E  E20230329 11:05:49.786113 14280 graph.cc:472] Could not build model from the provided pre-loaded flatbuffer: The model allocation is null/empty

GestureRec...r 41116847 com.DefaultCompany.SampleAPI         E  MP Task Vision failed to load the task with error: unknown: Could not build model from the provided pre-loaded flatbuffer: The model allocation is null/empty

Why does this happen? :( I don't know, because I don't have much knowledge of TensorFlow.
Note: I confirmed that it works well when the AAR is used through Android Studio.

How could I use VIDEO mode on gesture recognizer?

Hello,
I've tried to read video frames into a NumPy array.
Did I miss something when constructing the input for the recognizer?

import random
import ctypes 
from PIL import Image
with vision.GestureRecognizer.create_from_options(options) as recognizer:
  cap = cv2.VideoCapture('TRAIN_300.mp4')
  print("==== Video Info. ===== ")
  #print(cv2.CAP_PROP_FRAME_WIDTH) 
  #print(cv2.CAP_PROP_FRAME_HEIGHT)
  fps = cv2.CAP_PROP_FPS
  #print(fps)
  timestamps = [cv2.CAP_PROP_POS_MSEC]
  calc_timestamps = [0.0]
  timearray = []
  
  frameCount = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
  frameWidth = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
  frameHeight = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

  buf = np.empty((frameCount, frameHeight, frameWidth, 3), np.dtype('uint8'))

  fc = 0
  ret = True

  while (fc < frameCount  and ret):
    ret, buf[fc] = cap.read()
    fc += 1 
    timestamps.append(cap.get(cv2.CAP_PROP_POS_MSEC))
    ts = cap.get(cv2.CAP_PROP_POS_MSEC)
    cts = calc_timestamps[-1] + 1000/fps
    timearray.append(abs(ts - cts))
  cap.release()
 
  frame_timestamp_ms = timearray[9]
  print(type(buf[9]))
  mp_image = mp.Image(format=ImageFormat.SRGB, data=np.stack(buf[9]))
  
  gesture_recognition_result = recognizer.recognize_for_video(mp_image,frame_timestamp_ms)
 

  #numpy_frame_from_opencv = np.stack(frames, axis=0) # dimensions (T, H, W, C)
  
  #print(len(numpy_frame_from_opencv))
   
  cv2.destroyAllWindows()


==== Video Info. =====
<class 'numpy.ndarray'>
W20230204 14:13:15.370810 88347 gesture_recognizer_graph.cc:122] Hand Gesture Recognizer contains CPU only ops. Sets HandGestureRecognizerGraph acceleartion to Xnnpack.
I20230204 14:13:15.374961 88347 hand_gesture_recognizer_graph.cc:250] Custom gesture classifier is not defined.

TypeError Traceback (most recent call last)
Cell In[10], line 35
33 frame_timestamp_ms = timearray[9]
34 print(type(buf[9]))
---> 35 mp_image = mp.Image(format=ImageFormat.SRGB, data=np.stack(buf[9]))
37 gesture_recognition_result = recognizer.recognize_for_video(mp_image,frame_timestamp_ms)
40 #numpy_frame_from_opencv = np.stack(frames, axis=0) # dimensions (T, H, W, C)
41
42 #print(len(numpy_frame_from_opencv))

TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
1. mediapipe.python._framework_bindings.image.Image(image_format: mediapipe::ImageFormat_Format, data: numpy.ndarray[numpy.uint8])
2. mediapipe.python._framework_bindings.image.Image(image_format: mediapipe::ImageFormat_Format, data: numpy.ndarray[numpy.uint16])
3. mediapipe.python._framework_bindings.image.Image(image_format: mediapipe::ImageFormat_Format, data: numpy.ndarray[numpy.float32])

Invoked with: kwargs: format=<ImageFormat.SRGB: 1>, data=array([[[113, 123, 106],
[113, 123, 106],
[113, 123, 106],
...,
[149, 162, 144],
[149, 162, 144],
[147, 160, 142]],

   [[114, 124, 107],
    [114, 124, 107],
    [114, 124, 107],
    ...,
    [149, 162, 144],
    [147, 160, 142],
    [147, 160, 142]],

   [[114, 124, 107],
    [114, 124, 107],
    [114, 124, 107],
    ...,
    [147, 160, 142],
    [146, 159, 141],
    [146, 159, 141]],

   ...,

   [[ 38,  43,  41],
    [ 52,  57,  55],
    [ 68,  74,  69],
    ...,
    [ 19,  24,  22],
    [ 20,  25,  23],
    [ 20,  25,  23]],

   [[ 68,  73,  71],
    [ 92,  97,  95],
    [104, 110, 105],
    ...,
    [ 18,  23,  21],
    [ 19,  24,  22],
    [ 19,  24,  22]],

   [[ 49,  54,  52],
    [ 46,  51,  49],
    [ 62,  68,  63],
    ...,
    [ 18,  23,  21],
    [ 19,  24,  22],
    [ 20,  25,  23]]], dtype=uint8)
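
For what it's worth, the error above lists the supported constructor signatures, and they take image_format rather than format. A corrected sketch, continuing the variable names from the snippet above (buf, timearray, recognizer) and making no other claims about the rest of the pipeline:

import cv2
import mediapipe as mp

# Hedged sketch: build the Image with the keyword the bindings expect
# (image_format, per the supported signatures in the error above). OpenCV frames
# are BGR, so a BGR->RGB conversion is likely wanted as well.
frame_rgb = cv2.cvtColor(buf[9], cv2.COLOR_BGR2RGB)
mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=frame_rgb)
# recognize_for_video expects an integer ms timestamp that increases across calls.
frame_timestamp_ms = int(timearray[9])
result = recognizer.recognize_for_video(mp_image, frame_timestamp_ms)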

Hi Paul, this is regarding Google Summer of Code 2023

Dear Paul ,

I hope this message finds you well. I am interested in contributing to your Android MediaPipe machine learning app development project listed in GSoC '23, have also sent you a proposal regarding it, and would like to know whether there is a repository available on GitHub.
If there is already a repository available, could you please share the link with me? Alternatively, if there is no repository available, would you consider creating one so that contributors like myself can easily contribute to the project?
Thank you for considering my request. I look forward to hearing from you soon.
Hope you remember me well!
Thanking you in advance and sorry for creating an issue like this on Github.
Best regards,
Aakash

outputType : CONFIDENCE_MASK seems weird in Android

Hi all, I don't know whether this repo is the right place to ask this, but since I am using the example from this repo, I will ask anyway.

So, I am trying to modify
https://github.com/googlesamples/mediapipe/tree/main/examples/image_segmentation/android
by changing
.setOutputType(ImageSegmenter.ImageSegmenterOptions.OutputType.CATEGORY_MASK)
into
.setOutputType(ImageSegmenter.ImageSegmenterOptions.OutputType.CONFIDENCE_MASK)
This is the code (screenshot omitted).

I read on https://developers.google.com/mediapipe/solutions/vision/image_segmenter/android (screenshot omitted) that the confidence mask values should be probabilities, so I print the result (screenshots omitted), but the values do not look like probabilities (screenshot omitted).
Note that I am using hair_segmentation.tflite, which has only 2 output categories, background and hair.

Please help; I cannot find another resource regarding this.

READMEs

We will need the top-level and app-level READMEs updated.

Tests needed

We currently need tests for the image classifier and object detection examples.
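
As a starting point, here is a hedged sketch of what a minimal smoke test for the Python image classifier example could look like (the model and image paths are placeholders, and pytest is an assumed test runner, not something the repo currently prescribes):

# Hedged sketch of a smoke test for the image classifier example.
# efficientnet_lite0.tflite and test_image.jpg are placeholder assets.
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision


def test_image_classifier_returns_categories():
    options = vision.ImageClassifierOptions(
        base_options=python.BaseOptions(model_asset_path='efficientnet_lite0.tflite'),
        max_results=3)
    with vision.ImageClassifier.create_from_options(options) as classifier:
        image = mp.Image.create_from_file('test_image.jpg')
        result = classifier.classify(image)
        # At least one classification head with at least one category is expected.
        assert result.classifications
        assert result.classifications[0].categories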
