
babitmf / bmf


Cross-platform, customizable multimedia/video processing framework. With strong GPU acceleration, a heterogeneous design, multi-language support, ease of use, multi-framework compatibility, and high performance, the framework is ideal for transcoding, AI inference, algorithm integration, live video streaming, and more.

Home Page: https://babitmf.github.io/

License: Apache License 2.0

CMake 5.19% Python 24.90% C++ 56.84% C 2.95% Makefile 0.07% Java 1.64% Shell 1.54% Objective-C 1.14% Objective-C++ 3.27% Cuda 1.28% Go 1.14% Dockerfile 0.04%
bmf bytedance cpp cross-platform python ai arm cuda gpu heterogeneous

bmf's Introduction

BMF - Cross-platform, multi-language, customizable video processing framework with strong GPU acceleration

BMF (Babit Multimedia Framework) is a cross-platform, multi-language, customizable multimedia processing framework developed by ByteDance. With over 4 years of testing and improvements, BMF has been tailored to adeptly tackle challenges in our real-world production environments. It is currently widely used in ByteDance's video streaming, live transcoding, cloud editing and mobile pre/post processing scenarios. More than 2 billion videos are processed by the framework every day.

Here are some key features of BMF:

  • Cross-Platform Support: Native compatibility with Linux, Windows, and macOS, with optimization for both x86 and ARM CPUs.

  • Easy to use: BMF provides Python, Go, and C++ APIs, allowing developers the flexibility to code in their favourite languages.

  • Customizability: Developers can independently extend the framework with their own modules, thanks to BMF's decoupled architecture.

  • High performance: BMF has a powerful scheduler and strong support for heterogeneous acceleration hardware. Moreover, NVIDIA has been cooperating with us to develop a highly optimized GPU pipeline for video transcoding and AI inference.

  • Efficient data conversion: BMF offers seamless data format conversion across popular frameworks (FFmpeg/NumPy/PyTorch/OpenCV/TensorRT), conversion between hardware devices (CPU/GPU), and color space and pixel format conversion (a sketch follows this list).
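
To make the conversion concrete, here is a minimal sketch of decoding a video and viewing each frame as an RGB24 NumPy array; it mirrors the generator-mode snippets quoted in the issues further down this page, and 'input.mp4' is a placeholder path.

    import bmf
    import bmf.hml.hmp as mp

    # Decode a local file and iterate over the video packets (generator mode).
    pkts = (
        bmf.graph()
        .decode({'input_path': 'input.mp4'})['video']
        .start()  # this will return a packet generator
    )

    for i, pkt in enumerate(pkts):
        if pkt.is_(bmf.VideoFrame):
            vf = pkt.get(bmf.VideoFrame)
            # Convert to RGB24, then view plane 0 of the frame as a NumPy array.
            rgb = mp.PixelInfo(mp.kPF_RGB24)
            np_vf = vf.reformat(rgb).frame().plane(0).numpy()
            print('frame', i, 'shape', np_vf.shape)
        else:
            break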

Dive deeper into BMF's capabilities on our website for more details.

Quick Experience

This section showcases the capabilities of the BMF framework across five areas: Transcode, Edit, Meeting/Broadcaster, GPU acceleration, and AI Inference. For all the demos below, corresponding implementations and documentation are available on Google Colab, so you can experience them hands-on.

Transcode

This demo walks through, step by step, how to use BMF to develop a transcoding program, covering video, audio, and image transcoding. Along the way, you can familiarize yourself with how to use BMF and how to use FFmpeg-compatible options to achieve the capabilities you need.
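
As a taste of the API, here is a minimal transcode sketch; the paths are placeholders, and the option names mirror those used in the issue reports further down this page.

    import bmf

    # Decode the input, then re-encode video and audio with FFmpeg-compatible options.
    video = bmf.graph().decode({'input_path': 'input.mp4'})

    bmf.encode(
        video['video'],
        video['audio'],
        {
            'output_path': 'output.mp4',
            'video_params': {
                'codec': 'h264',
                'crf': '23',
                'preset': 'veryfast',
            },
        }
    ).run()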

If you want to have a quick experiment, you can try it on Google Colab.

Edit

The Edit demo shows how to implement a high-complexity audio and video editing pipeline with the BMF framework. We implemented two Python modules, video_concat and video_overlay, and combined various atomic capabilities to construct a complex BMF graph.
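
As a rough sketch of the pipeline's shape (paths and options are placeholders; video_overlay is one of the demo's Python modules, invoked here through the generic stream module call that appears elsewhere on this page):

    import bmf

    graph = bmf.graph()
    video = graph.decode({'input_path': 'main.mp4'})

    # Apply the demo's custom Python module by name; the empty option dict is a placeholder.
    overlaid = video['video'].module('video_overlay', {})

    bmf.encode(overlaid, video['audio'], {'output_path': 'edited.mp4'}).run()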

If you want to have a quick experiment, you can try it on Google Colab.

Meeting/Broadcaster

This demo uses the BMF framework to construct a simple broadcast service. The service provides an API that enables dynamic video source pulling, video layout control, audio mixing, and ultimately streaming the output to an RTMP server. It showcases BMF's modularity, multi-language development, and the ability to adjust the pipeline dynamically.

Below is a screen recording demonstrating the broadcaster in operation:

GPU acceleration

GPU Video Frame Extraction

The video frame extraction acceleration demo shows:

  1. BMF's flexibility:

    • Multi-language programming: modules written in different languages work together in the demo
    • Easy extensibility: new C++ and Python modules are added with little effort
    • Full FFmpeg compatibility
  2. Quick enablement of hardware acceleration, with CPU/GPU pipeline support (a sketch follows this list)

    • Heterogeneous pipelines are supported in BMF, e.g. processing split between CPU and GPU
    • Handy hardware color space conversion in BMF
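
A minimal sketch of enabling GPU decoding, using the 'hwaccel' option that appears in the GPU transcoding issue below; the input path is a placeholder.

    import bmf

    # Decode on the GPU; downstream modules receive frames resident in GPU memory.
    pkts = bmf.graph().decode({
        'input_path': 'input.mp4',
        'video_params': {'hwaccel': 'cuda'},
    })['video'].start()

    for pkt in pkts:
        if not pkt.is_(bmf.VideoFrame):
            break
        vf = pkt.get(bmf.VideoFrame)  # the frame data lives on the GPU here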

If you want to have a quick experiment, you can try it on Google Colab.

GPU Video Transcoding and Filtering

The GPU transcoding and filter module demo shows:

  1. Common video/image filters in BMF accelerated by GPU
  2. How to write GPU modules in BMF

The demo builds a transcoding pipeline which fully runs on GPU:

decode->scale->flip->rotate->crop->blur->encode
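
A sketch of what such a pipeline can look like. Only blur_gpu is named on this page (in an issue below); the other module names are hypothetical placeholders, while the decode/encode options come from the GPU transcoding issue.

    import bmf

    graph = bmf.graph()
    video = graph.decode({
        'input_path': 'input.mp4',
        'video_params': {'hwaccel': 'cuda'},
    })['video']

    # Chain the GPU stages; module names other than blur_gpu are illustrative.
    for name in ['scale_gpu', 'flip_gpu', 'rotate_gpu', 'crop_gpu', 'blur_gpu']:
        video = video.module(name, {})

    bmf.encode(video, None, {
        'output_path': 'output.mp4',
        'video_params': {
            'codec': 'h264_nvenc',
            'pix_fmt': 'cuda',  # keep frames on the GPU all the way into the encoder
        },
    }).run()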

If you want to have a quick experiment, you can try it on Google Colab.

AI inference

DeOldify

This demo shows how to integrate state-of-the-art AI algorithms into a BMF video processing pipeline: the well-known open-source colorization algorithm DeOldify is wrapped as a BMF Python module in less than 100 lines of code. The final effect is illustrated below, with the original video on the left and the colorized video on the right.
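
For a sense of what such a wrapper looks like, here is a skeleton of a BMF Python module, modeled on the text_module example quoted in an issue below; the class name is illustrative, and the DeOldify model call itself is elided.

    from bmf import Module, Packet, ProcessResult, Timestamp, VideoFrame

    class colorize_module(Module):
        def __init__(self, node, option=None):
            self.node_ = node
            # load the colorization model here

        def process(self, task):
            for (input_id, input_queue) in task.get_inputs().items():
                output_queue = task.get_outputs()[input_id]
                while not input_queue.empty():
                    pkt = input_queue.get()
                    if pkt.timestamp == Timestamp.EOF:
                        # forward EOF and mark the task as done
                        output_queue.put(Packet.generate_eof_packet())
                        task.set_timestamp(Timestamp.DONE)
                    elif pkt.is_(VideoFrame):
                        vf = pkt.get(VideoFrame)
                        # ... run colorization on vf here ...
                        out_pkt = Packet(vf)
                        out_pkt.timestamp = pkt.timestamp
                        output_queue.put(out_pkt)
            return ProcessResult.OK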

If you want to have a quick experiment, you can try it on Google Colab.

Super Resolution

This demo implements the super-resolution inference process of Real-ESRGAN as a BMF module, showcasing a BMF pipeline that combines decoding, super-resolution inference and encoding.
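
The shape of such a pipeline, sketched with placeholder paths and a placeholder module name for the Real-ESRGAN wrapper:

    import bmf

    graph = bmf.graph()
    video = graph.decode({'input_path': 'input.mp4'})

    # 'sr_module' stands in for the demo's Real-ESRGAN inference module.
    sr = video['video'].module('sr_module', {})

    bmf.encode(sr, video['audio'], {'output_path': 'sr_output.mp4'}).run()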

If you want to have a quick experiment, you can try it on Google Colab.

Video Quality Score

This demo shows how to invoke our aesthetic assessment model with BMF. Our deep learning model Aesmode achieves a binary classification accuracy of 83.8% on the AVA dataset, on par with the academic state of the art, and it can be used directly to evaluate the aesthetic quality of videos by sampling frames.
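
Frame sampling of this kind can be sketched with the fps filter shown in an issue below; one frame per second is an assumed rate, and the path is a placeholder.

    import bmf

    # Sample one frame per second; each sampled frame would then be scored by the model.
    frames = (
        bmf.graph()
        .decode({'input_path': 'input.mp4'})['video']
        .fps(1)
        .start()
    )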

If you want to have a quick experiment, you can try it on Google Colab.

Face Detect With TensorRT

This demo shows an end-to-end face detection pipeline accelerated by TensorRT. Internally it uses a TensorRT-accelerated ONNX model to process the input video and applies the NMS algorithm to filter duplicate candidate boxes, so the pipeline can handle a face detection task efficiently.

If you want to have a quick experiment, you can try it on Google Colab.


License

The project has an Apache 2.0 License. Third party components and dependencies remain under their own licenses.

Contributing

Contributions are welcome. Please follow the guidelines.

We use GitHub issues to track and resolve problems. If you have any questions, please feel free to join the discussion and work with us to find a solution.

Acknowledgment

The decoder, encoder, and filter reference the FFmpeg command-line tool; they are wrapped as BMF's built-in modules under the LGPL license.

The project also draws inspiration from other popular frameworks, such as ffmpeg-python and mediapipe. Our website is built with Docsy on top of Hugo.

Here, we'd like to express our sincerest thanks to the developers of the above projects!

bmf's People

Contributors

chutiantian0923, frankfengw519, huheng, jie-fang, mmdzzh, mpr0xy, sbraveyoung, sfeiwong, taoboyang, tongyuantongyu, xiaoweiw-nv, zhitianwu


bmf's Issues

require sm at /home/dan/zs/cuda118/bmf/bmf/hml/src/core/stream.cpp:130, Stream on device type 1 is not supported

Python Stack ignored

Stack trace (most recent call last):
#5 Object "/usr/bin/python3.8", at 0x5d6065, in _PyObject_MakeTpCall
#4 Object "/usr/bin/python3.8", at 0x5d5498, in PyCFunction_Call
#3 Object "/home/dan/zs/cuda118/bmf/output/bmf/lib/_hmp.cpython-38-x86_64-linux-gnu.so", at 0x7fe59311d0f4, in PyInit__hmp
#2 Object "/home/dan/zs/cuda118/bmf/output/bmf/lib/hmp.cpython-38-x86_64-linux-gnu.so", at 0x7fe593113266, in
#1 Object "/home/dan/zs/cuda118/bmf/output/bmf/lib/libhmp.so.1", at 0x7fe592ef2948, in hmp::current_stream(hmp::Device::Type)
#0 Object "/home/dan/zs/cuda118/bmf/output/bmf/lib/libhmp.so.1", at 0x7fe592eec1b9, in hmp::logging::dump_stack_trace(int)
Traceback (most recent call last):
File "detect_trt_sample.py", line 41, in
main()
File "detect_trt_sample.py", line 13, in main
trt_face_detect = bmf.create_module(
File "/home/dan/zs/cuda118/bmf/output/bmf/builder/bmf.py", line 28, in create_module
return engine.Module(module_info, json.dumps(option), "", "", "")
File "/home/dan/zs/cuda118/bmf/output/demo/face_detect/trt_face_detect.py", line 90, in init
self.stream_ = mp.current_stream(mp.kCUDA)
RuntimeError: require sm at /home/dan/zs/cuda118/bmf/bmf/hml/src/core/stream.cpp:130, Stream on device type 1 is not supported

Built-in resources and reusable Modules

1. While reading some of the test code, I found that some resource files cannot be found. Where can these resources be obtained?
For example, the dynamic_add test in test_graph.cpp references "../files/dynamic_add.json":

TEST(graph, dynamic_add) {
    BMFLOG_SET_LEVEL(BMF_INFO);

    time_t time1 = clock();
    std::string config_file = "../files/graph_dyn.json";
    std::string dyn_config_file = "../files/dynamic_add.json";
    GraphConfig graph_config(config_file);
    GraphConfig dyn_config(dyn_config_file);
    std::map<int, std::shared_ptr<Module>> pre_modules;
    std::map<int, std::shared_ptr<ModuleCallbackLayer>> callback_bindings;
    std::shared_ptr<Graph> graph =
        std::make_shared<Graph>(graph_config, pre_modules, callback_bindings);
    std::cout << "init graph success" << std::endl;

    graph->start();
    usleep(400000);

    std::cout << "graph dynamic add nodes" << std::endl;
    graph->update(dyn_config);

    graph->close();
    time_t time2 = clock();
    std::cout << "time:" << time2 - time1 << std::endl;
}

2. The number of built-in Modules is currently small. Are there reusable Modules available? If so, where can they be obtained?

Running a demo raises an error on macOS 13.4.1 (22F82)

Hi, I ran the demo on my Mac but got the error below. How can I fix it?

demo % python broadcaster/broadcaster.py
Traceback (most recent call last):
File "/Users/weiliang/Develop/bmf/bmf/demo/broadcaster/broadcaster.py", line 7, in
import bmf
File "/Users/weiliang/.pyenv/versions/3.9.18/lib/python3.9/site-packages/bmf/init.py", line 3, in
from bmf.python_sdk.module_functor import make_sync_func
File "/Users/weiliang/.pyenv/versions/3.9.18/lib/python3.9/site-packages/bmf/python_sdk/init.py", line 1, in
from .module_functor import make_sync_func, ProcessDone
File "/Users/weiliang/.pyenv/versions/3.9.18/lib/python3.9/site-packages/bmf/python_sdk/module_functor.py", line 1, in
import bmf.lib._hmp
ImportError: dlopen(/Users/weiliang/.pyenv/versions/3.9.18/lib/python3.9/site-packages/bmf/lib/_hmp.cpython-39-darwin.so, 0x0002): Library not loaded: @executable_path/../../../../Python
Referenced from: /Users/weiliang/.pyenv/versions/3.9.18/lib/python3.9/site-packages/bmf/lib/_hmp.cpython-39-darwin.so
Reason: tried: '/Users/weiliang/Python' (no such file), '/usr/local/lib/Python' (no such file), '/usr/lib/Python' (no such file, not in dyld cache)

AttributeError: 'bmf.lib._bmf.sdk.Packet' object has no attribute 'get_data'

/usr/lib/python3.7/site-packages/bmf/modules/null_sink.py in process(self, task)
21 elif pkt.get_timestamp() != Timestamp.UNSET:
22 Log.log_node(LogLevel.DEBUG, task.get_node(),
---> 23 "process data", pkt.get_data(), 'time',
24 pkt.get_timestamp())
25 return ProcessResult.OK

AttributeError: 'bmf.lib._bmf.sdk.Packet' object has no attribute 'get_data'

fill_task_input doesn't account for incomplete data; is it implied that a module must cache and splice partial inputs locally when handling a Task?

If a module (such as OVERLAY) requires multiple input streams to function normally, and only some of those input streams have data, is it implied that the module must account for the incomplete data and perform local caching and splicing while handling the Task? Otherwise the input data will be incomplete, leading to processing failure.

bool ImmediateInputStreamManager::fill_task_input(Task &task) {
    bool task_filled = false;
    for (auto &input_stream : input_streams_) {
        if (input_stream.second->is_empty()) {
            continue;
        }
        // one task can contain multiple pkts; NEED to add a max-pkts control?
        while (not input_stream.second->is_empty()) {
            Packet pkt = input_stream.second->pop_next_packet(false);
            if (pkt.timestamp() == BMF_EOF) {
                if (input_stream.second->probed_) {
                    BMFLOG(BMF_INFO) << "immediate sync got EOF from dynamical update";
                    pkt.set_timestamp(DYN_EOS);
                    input_stream.second->probed_ = false;
                } else
                    stream_done_[input_stream.first] = 1;
            }
            // READ: move the packet into the task's corresponding input queue
            task.fill_input_packet(input_stream.second->get_id(), pkt);
            task_filled = true;
        }
    }
    return task_filled;
}

RuntimeError: [json.exception.type_error.302] type must be string, but is array

pkts = (
    bmf.graph().decode({
        'input_path': stream,
        "loglevel": "quiet",
    })['video']
    .start()  # this will return a packet generator
)

for i, pkt in enumerate(pkts):
    # convert frame to a nd array
    if pkt.is_(bmf.VideoFrame):
        vf = pkt.get(bmf.VideoFrame)
        rgb = mp.PixelInfo(mp.kPF_RGB24)
        np_vf = vf.reformat(rgb).frame().plane(0).numpy()
        # we can add some more processing here, e.g. predicting
        print("frame", i, "shape", np_vf.shape)
    else:
        break

When I used the above code to read a network stream, an error occurred. When I switched to a local video file, the error disappeared. I don't know where the problem is; the stream itself is correct, since I can read it with ffmpeg and save it as an MP4.
The error message was attached as a screenshot.

How can the BMF framework support PCM audio stream input?

Details: the audio data is not pulled from a streaming-media source; it arrives as a continuous series of packets over the network, and I want to encode it in real time (saving it to a local file and then reading that file back is not an option). How can this be implemented?

CFFFilter Demo

The parameters for CFFFilter seem quite complex; could you provide a Python example using CFFFilter?

cpp copy module can't work with gpu transcoding

Module: test/c_module

def test():
    input_video_path = xxx
    output_path = xxxx
    video = bmf.graph().decode({
            "input_path": input_video_path,
            "video_params": {
                "hwaccel": "cuda",
            }
        })["video"]

    video2 = video.c_module('cpp_copy_module',
                            "../../test/c_module/libcopy_module.so", # use your path
                            "copy_module:CopyModule")
        
    (bmf.encode(
        video2,
        video["audio"],
        {
            "output_path": output_path,
            "video_params": {
                "codec": "h264_nvenc",
                "pix_fmt": "cuda",
            }
        }).run())

The output video isn't encoded normally; there are green and red areas in the picture.

But with CPU decoding and GPU encoding, the results are good.

def test():
    input_video_path = xxx
    output_path = xxxx
    video = bmf.graph().decode({
            "input_path": input_video_path,
        })["video"]

    video2 = video.c_module('cpp_copy_module',
                            "../../test/c_module/libcopy_module.so", # use your path
                            "copy_module:CopyModule")
        
    (bmf.encode(
        video2,
        video["audio"],
        {
            "output_path": output_path,
            "video_params": {
                "codec": "h264_nvenc",
            }
        }).run())

Encoding and decoding are both hardware-accelerated on GPUs; does BMF copy the GPU data back to host memory?

case:
ffmpeg -vsync 0 -hwaccel cuda -hwaccel_output_format cuda -hwaccel_device 0 -c:v h264_cuvid -i input.h264 -c:v nvenc_h264 output.h264

Q:
Encoding and decoding are both hardware-accelerated: decoding completes on the GPU and encoding completes on the GPU, without any memory copy from GPU to host.
In this case, how does BMF's handling of the decoding process differ from CPU-mode decoding? Do both use an AVFrame to receive the decoded result, with the address in the AVFrame being on the GPU in one case and on the CPU in the other?
Also, in this scenario, does BMF copy the GPU data back to host memory, wrap it as a Task, and place it in the scheduling queue? Wouldn't that defeat the original goal of reducing memory copies?

RuntimeError: BMF(0.0.7) /root/bmf/bmf/c_modules/src/ffmpeg_decoder.cpp:736: error: (-224:BMF Transcode Error) avformat_open_input failed: Protocol not found in function 'init_input'

When I use BMF to process RTSP video, the following problems will occur:

RuntimeError: BMF(0.0.7) /root/bmf/bmf/c_modules/src/ffmpeg_decoder.cpp:736: error: (-224:BMF Transcode Error) avformat_open_input failed: Protocol not found in function 'init_input'

Based on the example test_generator.py, I replaced 'input_path': "../../files/big_bunny_10s_30fps.mp4" in the code with the following:

frames = (
    bmf.graph()
    .decode({'input_path': "https://*****:1101/rtp/0615746E.live.flv"})['video']
    .fps(1)
    # .ff_filter('scale', 299, 299)  # or you can use '.scale(299, 299)'
    .start()  # this will return a packet generator
)

docker image: babitmf/bmf_runtime:latest

Test pushing raw frame data got an error

I used h264 as the encoding codec and got the error below; when I changed it to mpeg4, it worked well.
My OS is Ubuntu 20.04, Python is 3.9, and BMF was installed with pip install BabitMF.

[2023-09-25 03:08:33.274] [error] node id:1 Codec 'libx264' not found
[2023-09-25 03:08:33.274] [error] node id:1 init codec error
[2023-09-25 03:08:33.274] [error] node id:1 catch exception: BMF(0.0.8) /project/bmf/engine/c_engine/src/node.cpp:352: error: (-5:Bad argument) [Node_1_c_ffmpeg_encoder] Process result != 0.
in function 'process_node'

[2023-09-25 03:08:33.274] [error] node id:1 Process node failed, will exit.

import io

import numpy as np
import bmf
from bmf import GraphMode, Module, Log, LogLevel, InputType, ProcessResult, Packet, Timestamp, scale_av_pts, av_time_base, BmfCallBackType, VideoFrame, AudioFrame, BMFAVPacket
from PIL import Image

def init_push_graph(output):
    graph = bmf.graph({"dump_graph": 1, "loglevel": "debug"})
    video_stream = graph.input_stream("video_stream")
    # audio_stream = graph.input_stream("wav_stream")
    decode_stream = video_stream.decode({
        "loglevel": "trace",
        's': '720:1280',
        'pix_fmt': 'rgb24',
        "push_raw_stream": 1,
        "video_codec": "bmp",
        "video_time_base": "1,30000"
        })

    bmf.encode(
            decode_stream,
            None,
            {
                "video_params": {
                    "codec": "h264",
                    "width": 720,
                    "height": 1280,
                    "max_fr": 30,
                    "crf": "23",
                    "preset": "veryfast"
                },
                # "audio_params": {"sample_rate": 44100, "codec": "aac"},
                "loglevel": "trace",
                "output_path": output
            },
        )
    graph.run_wo_block(mode=GraphMode.PUSHDATA)
    return graph

graph = init_push_graph('./test1.mp4')

pts = 0
timestamp = 0
for _ in range(100):
    frame = np.zeros((1280, 720, 3), dtype=np.uint8)
    image = Image.fromarray(frame, mode="RGB")
    byte_stream = io.BytesIO()
    image.save(byte_stream, format='BMP')
    image_bytes = byte_stream.getvalue()
    pkt = BMFAVPacket(len(image_bytes))
    memview = pkt.data.numpy()
    memview[:] = np.frombuffer(image_bytes, dtype=np.uint8)
    pkt.pts = pts
    packet = Packet(pkt)
    packet.timestamp = timestamp
    pts += 1001
    timestamp += 1
    graph.fill_packet("video_stream", packet)

graph.fill_packet("video_stream", Packet.generate_eof_packet())
graph.close()

In the demo, the file on Google Drive cannot be downloaded

!gdown --fuzzy https://drive.google.com/file/d/1l8bDSrWn6643aDhyaocVStXdoUbVC3o2/view?usp=sharing -O big_bunny_10s_30fps.mp4
Traceback (most recent call last):
File "/usr/local/bin/gdown", line 8, in
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/gdown/cli.py", line 151, in main
filename = download(
File "/usr/local/lib/python3.10/dist-packages/gdown/download.py", line 203, in download
filename_from_url = m.groups()[0]
AttributeError: 'NoneType' object has no attribute 'groups'

Running the demo reports: [swscaler @ 0x7f5cbb7f1200] No accelerated colorspace conversion found from yuv420p to rgb24.

Running the official image:
docker pull babitmf/bmf_runtime:latest
docker run --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all -it babitmf/bmf_runtime:latest bash
export CMAKE_ARGS="-DBMF_ENABLE_CUDA=ON"
./build.sh
After building, I run the demo:
python3 ~/bmf/output/demo/video_enhance/enhance_demo.py
The code raises no error and a video is produced, but during execution it keeps reporting: No accelerated colorspace conversion found from yuv420p to rgb24.
Problem:
When played back, the generated output.mp4 shows no recognizable content; the picture is completely garbled.

[rtsp @ 0x7f7d7a7f8980] max delay reached. need to consume packet [rtsp @ 0x7f7d7a7f8980] RTP: missed 6 packets

[rtsp @ 0x7f7d7a7f8980] max delay reached. need to consume packet
[rtsp @ 0x7f7d7a7f8980] RTP: missed 2 packets
[rtsp @ 0x7f7d7a7f8980] max delay reached. need to consume packet
[rtsp @ 0x7f7d7a7f8980] RTP: missed 2 packets
[rtsp @ 0x7f7d7a7f8980] max delay reached. need to consume packet
[rtsp @ 0x7f7d7a7f8980] RTP: missed 6 packets
[rtsp @ 0x7f7d7a7f8980] max delay reached. need to consume packet
[rtsp @ 0x7f7d7a7f8980] RTP: missed 2 packets

The input_path is an RTSP stream output by a camera, and the output_path is an RTMP stream. During execution there are many packet-loss warnings like the above, which leave the pulled RTMP stream full of mosaic artifacts and stuttering.

PyCUDA ERROR: The context stack was not empty upon module cleanup

When the input_path given to graph.decode is a live stream and the stream suddenly disconnects, BMF core-dumps:

[2024-03-23 04:12:10.668] [info] node:c_ffmpeg_encoder 2 scheduler 1
[2024-03-23 04:14:31.322] [info] node id:0 decode flushing
[2024-03-23 04:14:31.322] [info] node id:0 Process node end
[2024-03-23 04:14:31.364] [info] node id:0 close node
[2024-03-23 04:14:31.364] [info] node 0 close report, closed count: 1
[2024-03-23 04:14:31.364] [info] node id:1 eof received
[2024-03-23 04:14:31.364] [info] node id:1 eof processed, remove node from scheduler
[2024-03-23 04:14:31.365] [info] node id:1 process eof, add node to scheduler
[2024-03-23 04:14:31.373] [info] node id:1 Process node end
[2024-03-23 04:14:31.373] [info] node id:1 close node
[2024-03-23 04:14:31.373] [info] node 1 close report, closed count: 2
[2024-03-23 04:14:31.373] [info] node id:2 eof received
[2024-03-23 04:14:31.373] [info] node id:2 eof processed, remove node from scheduler
[2024-03-23 04:14:31.374] [info] node id:2 process eof, add node to scheduler
[2024-03-23 04:14:31.374] [info] node id:2 Process node end
[2024-03-23 04:14:31.374] [info] node id:2 close node
[2024-03-23 04:14:31.374] [info] node 2 close report, closed count:3
[2024-03-23 04:14:31.374] [info] schedule queue 0 start to join thread
[2024-03-23 04:14:31.374] [info] schedule queue 0 thread quit
[2024-03-23 04:14:31.375] [info] schedule queue 0 closed
[2024-03-23 04:14:31.375] [info] schedule queue 1 start to join thread
[2024-03-23 04:14:31.375] [info] schedule queue 1 thread quit
[2024-03-23 04:14:31.375] [info] schedule queue 1 closed
[2024-03-23 04:14:31.375] [info] all scheduling threads were joint

PyCUDA ERROR: The context stack was not empty upon module cleanup.

A context was still active when the context stack was being
cleaned up. At this point in our execution, CUDA may already
have been deinitialized, so there is no way we can finish
cleanly. The program will be aborted now.
Use Context.pop() to avoid this problem.

Running the blur_gpu module in Docker reports an error

1.docker pull babitmf/bmf_runtime:latest;

2.nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.84 Driver Version: 460.84 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla P40 Off | 00000000:88:00.0 Off | 0 |
| N/A 35C P0 50W / 250W | 16435MiB / 22919MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 Tesla P40 Off | 00000000:8D:00.0 Off | 0 |
| N/A 41C P0 51W / 250W | 18213MiB / 22919MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 Tesla P40 Off | 00000000:B3:00.0 Off | 0 |
| N/A 32C P0 49W / 250W | 15643MiB / 22919MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 3 Tesla P40 Off | 00000000:B6:00.0 Off | 0 |
| N/A 34C P0 50W / 250W | 11013MiB / 22919MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+

3.nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

4. cvcuda.gaussian_into reports an error:
Line 563: '' failed: no kernel image is available for execution on the device

Does the Docker environment need any additional configuration?

RuntimeError: require false at /root/bmf/bmf/hml/src/imgproc/imgproc.cpp:154, Unsupport PixelInfo

There was a problem when I used bmf/test/generator/test_generator.py for stream-reading testing.

# bmf/test/generator/test_generator.py
for i, pkt in enumerate(pkts):
    # convert frame to a nd array
    if pkt.is_(bmf.VideoFrame):
        vf = pkt.get(bmf.VideoFrame)
        rgb = mp.PixelInfo(mp.kPF_RGB24)
        np_vf = vf.reformat(rgb).frame().plane(0).numpy()  # <------ RuntimeError: require false at /root/bmf/bmf/hml/src/imgproc/imgproc.cpp:154, Unsupport PixelInfo
        # we can add some more processing here, e.g. predicting
        print("frame", i, "shape", np_vf.shape)
    else:
        break

I also tried the method in the document, but it seems to be incorrect.

# https://babitmf.github.io/docs/bmf/multiple_features/graph_mode/generatemode/
for i, frame in enumerate(frames):
     # convert frame to a nd array
     if frame is not None:
         np_frame = frame.to_ndarray(format='rgb24')    # <------ AttributeError: 'bmf.lib._bmf.sdk.Packet' object has no attribute 'to_ndarray'

         # we can add some more processing here, e.g. predicting
         print('frame', i, 'shape', np_frame.shape)
     else:
         break

What does "Unsupport PixelInfo" mean?
Or is there another way to process the stream into video frames that can be read iteratively, the way OpenCV does?

Pass non-image data between modules using Packet

When building the controlnet demo, I am trying to build a pipeline that looks like this:

image decoder ---> controlnet inference module ---> image encoder
                               ^
                               |
prompt reader -----------------+

The prompt is read from files and passed to the controlnet inference module through bmf.Packet. The prompt's type is a Python dict, and bmf/python/py_module_sdk.cpp shows that bmf.Packet supports all Python types. But when I run the pipeline, I get the following error:

[2023-10-11 02:09:40.995] [error] node id:2 catch exception: BMF(0.0.8) /home/scratch.xiaoweiw_sw/bytedance/babitmf/bmf/engine/c_engine/src/node.cpp:352: error: (-5:Bad argument) [Node_2_c_ffmpeg_filter] Process result != 0.
 in function 'process_node'

[2023-10-11 02:09:40.996] [error] node id:2 Process node failed, will exit.
[2023-10-11 02:09:40.996] [info] node 2 got exception, close directly
[2023-10-11 02:09:40.996] [info] node id:2 process eof, add node to scheduler
[2023-10-11 02:09:40.996] [info] schedule queue 0 start to join thread
[ipp1-2035:1225 :0:1315] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x1e0)
==== backtrace (tid:   1315) ====
 0 0x0000000000042520 __sigaction()  ???:0
 1 0x000000000015856d CFFFilter::init_filtergraph()  /home/scratch.xiaoweiw_sw/bytedance/babitmf/bmf/c_modules/src/ffmpeg_filter.cpp:242
 2 0x0000000000159b9c CFFFilter::process_filter_graph()  /home/scratch.xiaoweiw_sw/bytedance/babitmf/bmf/c_modules/src/ffmpeg_filter.cpp:393
 3 0x000000000015ae0e CFFFilter::process()  /home/scratch.xiaoweiw_sw/bytedance/babitmf/bmf/c_modules/src/ffmpeg_filter.cpp:578
 4 0x0000000000372a9e bmf_engine::Node::process_node()  /home/scratch.xiaoweiw_sw/bytedance/babitmf/bmf/engine/c_engine/src/node.cpp:348
 5 0x00000000003a5a7a bmf_engine::SchedulerQueue::exec()  /home/scratch.xiaoweiw_sw/bytedance/babitmf/bmf/engine/c_engine/src/scheduler_queue.cpp:153
 6 0x00000000003a5678 bmf_engine::SchedulerQueue::exec_loop()  /home/scratch.xiaoweiw_sw/bytedance/babitmf/bmf/engine/c_engine/src/scheduler_queue.cpp:111
 7 0x00000000003a8b1e std::__invoke_impl<int, int (bmf_engine::SchedulerQueue::*)(), bmf_engine::SchedulerQueue*>()  /usr/include/c++/11/bits/invoke.h:74
 8 0x00000000003a8a72 std::__invoke<int (bmf_engine::SchedulerQueue::*)(), bmf_engine::SchedulerQueue*>()  /usr/include/c++/11/bits/invoke.h:96
 9 0x00000000003a89d3 std::thread::_Invoker<std::tuple<int (bmf_engine::SchedulerQueue::*)(), bmf_engine::SchedulerQueue*> >::_M_invoke<0ul, 1ul>()  /usr/include/c++/11/bits/std_thread.h:259
10 0x00000000003a898a std::thread::_Invoker<std::tuple<int (bmf_engine::SchedulerQueue::*)(), bmf_engine::SchedulerQueue*> >::operator()()  /usr/include/c++/11/bits/std_thread.h:266
11 0x00000000003a896a std::thread::_State_impl<std::thread::_Invoker<std::tuple<int (bmf_engine::SchedulerQueue::*)(), bmf_engine::SchedulerQueue*> > >::_M_run()  /usr/include/c++/11/bits/std_thread.h:211
12 0x00000000000dc253 std::error_code::default_error_condition()  ???:0
13 0x0000000000094b43 pthread_condattr_setpshared()  ???:0
14 0x0000000000126a00 __xmknodat()  ???:0
=================================
Segmentation fault (core dumped)

I haven't used an ffmpeg filter module in the graph, yet it seems that BMF inserts ffmpeg filter modules into the graph on its own. Test code as follows.

test_controlnet.py:

import sys

sys.path.append("../../")
import bmf

sys.path.pop()

def test():
    input_video_path = "./ControlNet/test_imgs/bird.png"
    input_prompt_path = "./prompt.txt"
    output_path = "./output.jpg"

    graph = bmf.graph()

    video = graph.decode({'input_path': input_video_path})
    prompt = graph.module('text_module', {'path': input_prompt_path})
    concat = bmf.concat(video['video'], prompt)
    concat.module('controlnet_module', {}).run()

if __name__ == '__main__':
    test()

text_module.py:

import sys
import random
from typing import List, Optional
import pdb

from bmf import *
import bmf.hml.hmp as mp

class text_module(Module):
    def __init__(self, node, option=None):
        self.node_ = node
        self.eof_received_ = False
        self.prompt_path = './prompt.txt'
        if 'path' in option.keys():
            self.prompt_path = option['path']

    def process(self, task):
        pdb.set_trace()
        output_queue = task.get_outputs()[0]

        if self.eof_received_:
            output_queue.put(Packet.generate_eof_packet())
            Log.log_node(LogLevel.DEBUG, self.node_, 'output text stream', 'done')
            task.set_timestamp(Timestamp.DONE)
            return ProcessResult.OK

        prompt_dict = dict()
        with open(self.prompt_path) as f:
            for line in f:
                pk, pt = line.partition(":")[::2]
                prompt_dict[pk] = pt

        out_pkt = Packet(prompt_dict)
        out_pkt.timestamp = 0
        output_queue.put(out_pkt)
        self.eof_received_ = True

        return ProcessResult.OK

def register_inpaint_module_info(info):
    info.module_description = "Text file IO module"

Core dumped during frame extraction

Hi, when I run the code below on a 4-CPU machine, "Aborted (core dumped)" happens with an error rate of 6/10 (run 10 times, error occurs 6 times). On a 16-CPU machine it doesn't happen. I observed that on the 4-CPU machine the CPU usage is almost 100%, which may be the reason. Apart from adding more CPUs, is there any way to avoid this problem?

import bmf
import time
from multiprocessing.pool import ThreadPool
import glob
import numpy as np

def generator_mode(input_list):
    input_path,threads = input_list
    start = time.time()
    graph = bmf.graph()
    video =  graph.decode({
                    'input_path': input_path,
                    "log_level":"quiet",
                    "dec_params": {"threads": threads},
                })['video'].start() # this will return a packet generator
    for pkt in video:
        # convert frame to a nd array
        if pkt.is_(bmf.VideoFrame):
            vf = pkt.get(bmf.VideoFrame)
            v_frame = vf.frame().plane(2).numpy()
        else:
            break
    use = time.time() - start
    return use



if __name__ == '__main__':
    # serial
    # print(time.time())
    test_threads = [0,2,4,6,8]
    video_paths = glob.glob("/root/ori/*.mp4")

    for threads in test_threads:
        for infilename in video_paths:
            extract_u_frame_time = []
            run_path = []
            for i in range(20):
                run_path.append([infilename, str(threads)])
            with ThreadPool(2) as p:
                extract_u_frame_time.extend(p.map(generator_mode, run_path))

The environment versions are below:

python=3.7.12
ffmpeg version 4.1.11-0+deb10u1
numpy==1.21.6
BabitMF==0.0.8

stdout of error:

terminate called without an active exception
Aborted (core dumped)

Running command

nohup python3 generator_mode.py 
