
flare's Introduction

Flare Backend Service Framework


license NewBSD C++ Code Style Platform

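Tencent Ads is one of Tencent's most important businesses, and its backend is largely developed in C++.

Flare is a modern backend service development framework that we built by drawing on our earlier service frameworks, open-source projects from the industry, and recent research. It aims to provide easy-to-use, high-performance, and stable service development capabilities on today's mainstream hardware and software environments.

The Flare project started in 2019. It is now widely used in many of Tencent Ads' backend services, with tens of thousands of running instances, and has been thoroughly tested in production systems.

In May 2021, in the spirit of giving back to the community and sharing technology, it was officially open-sourced.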

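Features

  • Modern C++ design style, making extensive use of new C++11/14/17/2a language features and standard library facilities
  • Fiber, a lightweight-thread implementation with an M:N threading model, which lets developers write high-performance asynchronous code using convenient synchronous call syntax
  • Message-based streaming RPC support
  • Besides RPC, a set of convenient base libraries, such as string, date and time, encoding, compression, encryption and decryption, configuration, and an HTTP client, to help get business code started quickly
  • A flexible extension mechanism that makes it easy to support multiple protocols, service discovery, load balancing, monitoring and alerting, and call tracing
  • Extensive optimizations for modern architectures, such as NUMA-aware scheduling groups, object pools, and zero-copy buffers
  • High-quality code: strictly follows the Google C++ Style Guide, with test coverage of 80%
  • Thorough documentation, examples, and debugging support for getting started quickly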

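System Requirements

  • Linux kernel 3.10 or later; other operating systems are not supported yet
  • x86-64 processors; aarch64 and ppc64le are also supported but have not been used in production
  • GCC 8 or later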

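Getting Started

Flare works out of the box. It bundles the third-party libraries it needs, so there is usually no need to install additional dependencies. Just pull the code on Linux and it is ready to use.

The archives under thirdparty/ are stored with Git LFS, so make sure git-lfs is correctly installed before pulling the code.

Build

We use blade for day-to-day development.

  • Build: ./blade build ...
  • Test: ./blade test ...

Flare also supports bazel as a build system; see bazel support.

After that, you can follow the getting-started guide to set up a simple RPC service.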

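Debugging

We believe the debugging experience is an important part of development and maintenance, so we have added the following support for it:

Testing

To improve the experience of writing unit tests, we provide some tools for writing them.

These include, but are not limited to:

Examples

We provide some usage examples for reference. Below is a simple relay service (it uses both the RPC client and the RPC server).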

#include "gflags/gflags.h"

#include "flare/example/rpc/echo_service.flare.pb.h"
#include "flare/example/rpc/relay_service.flare.pb.h"
#include "flare/fiber/this_fiber.h"
#include "flare/init.h"
#include "flare/rpc/rpc_channel.h"
#include "flare/rpc/rpc_client_controller.h"
#include "flare/rpc/rpc_server_controller.h"
#include "flare/rpc/server.h"

using namespace std::literals;

DEFINE_string(ip, "127.0.0.1", "IP address to listen on.");
DEFINE_int32(port, 5569, "Port to listen on.");
DEFINE_string(forward_to, "flare://127.0.0.1:5567",
              "Target IP to forward requests to.");

namespace example {

class RelayServiceImpl : public SyncRelayService {
 public:
  void Relay(const RelayRequest& request, RelayResponse* response,
             flare::RpcServerController* ctlr) override {
    flare::RpcClientController our_ctlr;
    EchoRequest echo_req;
    echo_req.set_body(request.body());
    if (auto result = stub_.Echo(echo_req, &our_ctlr)) {
      response->set_body(result->body());
    } else {
      ctlr->SetFailed(result.error().code(), result.error().message());
    }
  }

 private:
  EchoService_SyncStub stub_{FLAGS_forward_to};
};

int Entry(int argc, char** argv) {
  flare::Server server{flare::Server::Options{.service_name = "relay_server"}};

  server.AddProtocol("flare");
  server.AddService(std::make_unique<RelayServiceImpl>());
  server.ListenOn(flare::EndpointFromIpv4(FLAGS_ip, FLAGS_port));
  FLARE_CHECK(server.Start());

  flare::WaitForQuitSignal();
  return 0;
}

}  // namespace example

int main(int argc, char** argv) {
  return flare::Start(argc, argv, example::Entry);
}

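Flare is built on M:N user-space threads internally, so issuing synchronous requests to external services through Flare, or using the synchronous interfaces of Flare's built-in clients, does not cause performance problems. For more complex concurrency or asynchronous needs, see our documentation.

In addition, the *.flare.pb.h files in the example are generated by our Protocol Buffers plugin. The interfaces generated this way are easier to use than the cc_generic_services generated by Protocol Buffers.

A More Complex Example

In practice, you often need to send requests to several backends concurrently. The following example shows how to do this in Flare: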

// For illustration purpose only. Normally you wouldn't want to declare them as
// global variables.
flare::HttpClient http_client;
flare::CosClient cos_client;
EchoService_SyncStub echo_stub(FLAGS_echo_server_addr);

void FancyServiceImpl::FancyJob(const FancyJobRequest& request,
                                FancyJobResponse* response,
                                flare::RpcServerController* ctlr) {
  // Calling different services concurrently.
  auto async_http_body = http_client.AsyncGet(request.uri());
  auto async_cos_data =
      cos_client.AsyncExecute(flare::CosGetObjectRequest{.key = request.key()});
  EchoRequest echo_req;
  flare::RpcClientController echo_ctlr;
  echo_req.set_body(request.echo());
  auto async_rpc_resp = echo_stub.AsyncEcho(echo_req, &echo_ctlr);

  // Now wait for all of them to complete.
  auto&& [http, cos, rpc] = flare::fiber::BlockingGet(
      flare::WhenAll(&async_http_body, &async_cos_data, &async_rpc_resp));

  if (!http || !cos || !rpc) {
    FLARE_LOG_WARNING("Failed.");
  } else {
    // All succeeded.
    FLARE_LOG_INFO("Got: {}, {}, {}", *http->body(),
                   flare::FlattenSlow(cos->bytes), rpc->body());
  }

  // Now fill `response` accordingly.
  response->set_body("Great success.");
}

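In this example, we:

  • issue three asynchronous requests through three different clients (HTTP, Tencent Cloud COS, and RPC);
  • wait synchronously for all of them to complete with flare::fiber::BlockingGet. Only the user-space thread is blocked here, so there is no performance problem;
  • log the response from each service.

For demonstration purposes, we request three heterogeneous services here. If needed, the same approach also works for homogeneous services, or for a mix of homogeneous and heterogeneous ones.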

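Contributing

Contributions are very welcome. For developers who want to learn more about Flare's internal design, or who need to build on top of Flare, more technical documentation is available under flare/doc/.

For details, see CONTRIBUTING.md.

Performance

Because of the nature of our business requirements, our design favors low, stable latency and jitter over throughput, while still trying to keep performance as high as possible under that constraint.

For simple comparison purposes, we provide preliminary performance data.

Acknowledgements

  • Our low-level implementation draws heavily on the design of brpc.
  • For the RPC part, grpc gave us a lot of inspiration.
  • We depend on many third-party libraries from the open-source community. Standing on the shoulders of giants lets us develop this project faster and better, which is also why we actively give back to the open-source community.

We would like to thank all of the projects above.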

flare's People

Contributors

0x804d8000, 4kangjc, chen3feng, guoxiangcn, liuyuhui, maplefu, sf-zhou, yinghaoyu, zjyuan

flare's Issues

About unboxed_type_t

template <class... Ts>
using unboxed_type_t = typename std::conditional_t<
sizeof...(Ts) == 0, std::common_type<void>,
std::conditional_t<sizeof...(Ts) == 1, std::common_type<Ts...>,
std::common_type<std::tuple<Ts...>>>>::type;

Would this be better and easier to understand? common_type is a bit hard to follow.

template <class T0 = void, class... Ts>
struct unboxed_type {
  using type = std::conditional_t<sizeof...(Ts) == 0, T0, std::tuple<T0, Ts...>>;
};

template <class... Ts>
using unboxed_type_t = typename unboxed_type<Ts...>::type;
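
The two definitions can be compared directly with static_assert. The sketch below copies both from this issue; the names unboxed_type_alt, unboxed_type_alt_t and kSameResult are introduced here only so that both versions can coexist. It suggests that the results match for plain value types, but differ for a single reference or cv-qualified type, because std::common_type decays its result.

```cpp
#include <tuple>
#include <type_traits>

// Original definition, as quoted in this issue.
template <class... Ts>
using unboxed_type_t = typename std::conditional_t<
    sizeof...(Ts) == 0, std::common_type<void>,
    std::conditional_t<sizeof...(Ts) == 1, std::common_type<Ts...>,
                       std::common_type<std::tuple<Ts...>>>>::type;

// Proposed alternative, renamed (hypothetical name) so both can coexist.
template <class T0 = void, class... Ts>
struct unboxed_type_alt {
  using type =
      std::conditional_t<sizeof...(Ts) == 0, T0, std::tuple<T0, Ts...>>;
};

template <class... Ts>
using unboxed_type_alt_t = typename unboxed_type_alt<Ts...>::type;

// True when both definitions yield the same type for `Ts...`.
template <class... Ts>
constexpr bool kSameResult =
    std::is_same_v<unboxed_type_t<Ts...>, unboxed_type_alt_t<Ts...>>;

static_assert(kSameResult<>);           // both give void
static_assert(kSameResult<int>);        // both give int
static_assert(kSameResult<int, char>);  // both give std::tuple<int, char>

// std::common_type decays a single type; the alternative keeps it as written.
static_assert(std::is_same_v<unboxed_type_t<const int&>, int>);
static_assert(std::is_same_v<unboxed_type_alt_t<const int&>, const int&>);
```

Whether the decay matters depends on whether reference or cv-qualified types are ever passed here, which this issue does not say.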

How could I use flare in my own project?

Flare is an awesome RPC framework and flare/base has many useful function wrappers, and I want to use them in my own project.
There is no CMake file or configure script, so I do not know how to build it as an independent library.
So I tried to use BLADE too. My project has the following directory tree.

|__code
    |__BLADE_ROOT: my own BLADE_ROOT
    |__thirdparty
        |__flare: a git submodule pointing to Tencent/flare

I even wrote a tool (https://gist.github.com/ericuni/627cacb9978c9db58ae7360c4d31de6b) to process all BUILD files and BLADE_ROOT in flare.
But it still failed.

build glog/press failed

I'm trying to build in a Docker environment, and I have installed libunwind8 and libunwind-dev.

glog(master ✗) pwd
/root/git/flare/thirdparty/glog

glog(master ✗) bb :press
Blade: Entering directory `/root/git/flare'
Blade(info): Loading config file "/root/opt/blade-build/blade.conf"
Blade(info): Loading config file "/root/git/flare/BLADE_ROOT"
Blade(info): Loading BUILD files...
Blade(info): Loading done.
Blade(info): Analyzing dependency graph...
Blade(info): Analyzing done.
Blade(info): Generating backend build code...
Blade(info): Generating done.
Blade(info): Building...
Blade(info): Adjust build jobs number(-j N) to be 8
[1/1] LINK BINARY build64_release/thirdparty/glog/press
FAILED: build64_release/thirdparty/glog/press
g++ -o build64_release/thirdparty/glog/press  -Lthirdparty/jdk/lib  @build64_release/thirdparty/glog/press.rsp -lrt -lz -lpthread
/usr/bin/ld: build64_release/thirdparty/glog/lib/libglog.a(libglog_la-utilities.o): in function `google::GetStackTrace(void**, int, int)':
/root/git/flare/build64_release/thirdparty/glog/glog-0.4.0/src/stacktrace_libunwind-inl.h:65: undefined reference to `_Ux86_64_getcontext'
/usr/bin/ld: /root/git/flare/build64_release/thirdparty/glog/glog-0.4.0/src/stacktrace_libunwind-inl.h:66: undefined reference to `_ULx86_64_init_local'
/usr/bin/ld: /root/git/flare/build64_release/thirdparty/glog/glog-0.4.0/src/stacktrace_libunwind-inl.h:78: undefined reference to `_ULx86_64_step'
/usr/bin/ld: /root/git/flare/build64_release/thirdparty/glog/glog-0.4.0/src/stacktrace_libunwind-inl.h:70: undefined reference to `_ULx86_64_get_reg'
/usr/bin/ld: /root/git/flare/build64_release/thirdparty/glog/glog-0.4.0/src/stacktrace_libunwind-inl.h:78: undefined reference to `_ULx86_64_step'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
Blade(error): Build failure.
Blade(info): Cost time 0.267s
Blade(error): Failure

A question about a very strange test case

  1. https://github.com/Tencent/flare/tree/master/flare/base/compression/compression_test.cc
    Running ./compression_test gzip:
  std::string Compress(
      std::function<bool(Compressor*, CompressionOutputStream* out)> f) {
    if (!with_test_output_stream_) {
      NoncontiguousBufferBuilder builder;
      NoncontiguousBufferCompressionOutputStream out(&builder);
      auto&& c = MakeCompressor(method_);  // When debugging, this always returns nullptr.
      EXPECT_TRUE(c);
  2. https://github.com/Tencent/flare/blob/master/flare/base/compression_test.cc
    This test case runs fine.

gzip has already been registered in the .cpp file with FLARE_COMPRESSION_REGISTER_COMPRESSOR("gzip", GzipCompressor);, so what is going on here?

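Build takes too long

bash -c "./blade test ..."

On WSL2 with Ubuntu 20.04 the build takes more than half an hour. Sometimes it takes this long even when I have changed only a single .h file. Is there any solution?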

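Question about the scheduling parameters

Is there a rationale behind the calculations for the different profiles here (on what principle were the parameters chosen)?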

SchedulingParameters GetSchedulingParameters(SchedulingProfile profile,
                                             std::size_t numa_domains,
                                             std::size_t available_processors,
                                             std::size_t desired_concurrency) {
  if (profile == SchedulingProfile::ComputeHeavy) {
    return GetSchedulingParametersForComputeHeavy(desired_concurrency);
  } else if (profile == SchedulingProfile::Compute) {
    return GetSchedulingParametersForCompute(numa_domains, available_processors,
                                             desired_concurrency);
  } else if (profile == SchedulingProfile::Neutral) {
    // @sa: `SchedulingProfile` for the constants below.
    return GetSchedulingParametersOfGroupSize(numa_domains, desired_concurrency,
                                              16, 32);
  } else if (profile == SchedulingProfile::Io) {
    return GetSchedulingParametersOfGroupSize(numa_domains, desired_concurrency,
                                              12, 24);
  } else if (profile == SchedulingProfile::IoHeavy) {
    return GetSchedulingParametersOfGroupSize(numa_domains, desired_concurrency,
                                              8, 16);
  }
  FLARE_UNREACHABLE("Unexpected scheduling profile [{}].",
                    underlying_value(profile));
}

Confused about the value returned by GetFreeCount

std::size_t GetFreeCount(std::size_t upto) {
  return std::min(upto, std::max(upto / 2, kMinimumFreePerWash));
}
  1. upto > 2 * kMinimumFreePerWash: returns upto / 2, which is clearly greater than kMinimumFreePerWash
  2. 2 * kMinimumFreePerWash >= upto >= kMinimumFreePerWash: returns kMinimumFreePerWash
  3. kMinimumFreePerWash > upto: returns upto, which is clearly less than kMinimumFreePerWash

What is the point of choosing the value this way?
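
The three cases can be checked numerically. In the sketch below the value 32 for kMinimumFreePerWash is an assumption made purely for illustration; the real constant is not shown in this issue. Read literally, the expression says: free half of the candidates, but at least kMinimumFreePerWash of them, and never more than there are.

```cpp
#include <algorithm>
#include <cstddef>

// Assumed value, for illustration only. The real constant lives elsewhere in
// Flare and is not quoted in this issue.
constexpr std::size_t kMinimumFreePerWash = 32;

// Same expression as in the snippet above.
constexpr std::size_t GetFreeCount(std::size_t upto) {
  return std::min(upto, std::max(upto / 2, kMinimumFreePerWash));
}

// Case 1: upto > 2 * minimum, so half of `upto` is freed.
static_assert(GetFreeCount(100) == 50);
// Case 2: minimum <= upto <= 2 * minimum, so the minimum is freed.
static_assert(GetFreeCount(64) == 32);
static_assert(GetFreeCount(40) == 32);
// Case 3: upto < minimum, so everything available is freed.
static_assert(GetFreeCount(10) == 10);
```

Under this reading, small pools are emptied in one pass while large pools shrink gradually. Whether that is the intended rationale is for the maintainers to confirm.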

Http header data structure

Why doesn't owning_strs_2_ use std::vector<std::string>? Judging from the code, only push_back and clear are used.
Also, why not use internal::CaseInsensitiveHashMap<std::string_view, std::string_view> directly instead of internal::CaseInsensitiveHashMap<std::string_view, std::size_t>? Mapping to a size_t and then adding a separate using NonowningFields = std::vector<std::pair<std::string_view, std::string_view>>; seems unnecessary.

std::deque<std::string> owning_strs_2_;
// For better lookup (which is likely to be done frequently if we're
// parsing, instead of building, headers) performance, we build this map for
// that purpose. Values are indices into `fields_`.
internal::CaseInsensitiveHashMap<std::string_view, std::size_t> header_idx_;
// Referencing either `buffer_` or `owning_strs_`.
NonowningFields fields_;
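
One property that separates std::deque from std::vector here is reference stability: push_back on a deque never moves existing elements, so std::string_view objects pointing into earlier strings remain valid. With a vector, a reallocation moves the strings, and views into short (small-string-optimized) strings would dangle. The sketch below only demonstrates the deque guarantee; the function name is hypothetical, and whether this is the authors' actual reason is an assumption.

```cpp
#include <deque>
#include <string>
#include <string_view>
#include <vector>

// Returns true if views taken from a std::deque<std::string> still refer to
// the same characters after many further push_back calls.
bool DequeViewsStayValid() {
  std::deque<std::string> owning;
  std::vector<std::string_view> views;
  for (int i = 0; i < 1000; ++i) {
    owning.push_back("h" + std::to_string(i));  // Short strings on purpose.
    views.push_back(owning.back());             // Non-owning view into it.
  }
  for (int i = 0; i < 1000; ++i) {
    // The string objects never moved, so each view still aliases its owner.
    if (views[i].data() != owning[i].data()) return false;
    if (views[i] != "h" + std::to_string(i)) return false;
  }
  return true;
}
```

Doing the same with std::vector<std::string> would be undefined behavior as soon as the vector reallocates, which is why it is not shown.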

compile fail

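Build environment: gcc version 11.2.0 (GCC)

flare2

A header is missing here; adding the optional header fixes it.

flare

For this one, I temporarily removed __glibc_has_attribute (const) from /usr/include/sys/cdefs.h. I hope there is a better solution.

flare3

Add '--without-librtmp' and '--without-libpsl' to configure_options in thirdparty/curl/BUILD.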

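What is the rationale for using intrusive_ptr?

base implements a number of intrusive facilities, such as an intrusive doubly linked list and an intrusive smart pointer (class RefPtr). Intrusive designs are relatively uncommon, so what was the reasoning? Is it to save one memory allocation for better performance? Also, the intrusive smart pointer does not seem to have the circular-reference problem. Has this been considered?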
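
For readers unfamiliar with the idea, here is a minimal sketch of an intrusive reference-counted pointer. It is not Flare's RefPtr; RefCounted, IntrusivePtr and Connection are hypothetical names. It illustrates two general properties of the technique: the count lives inside the object, so there is a single allocation, and an owning pointer can be recreated from a raw pointer.

```cpp
#include <atomic>
#include <cstddef>
#include <utility>

// Hypothetical base class: the reference count is stored in the object itself.
class RefCounted {
 public:
  void Ref() noexcept { count_.fetch_add(1, std::memory_order_relaxed); }
  // Returns true when the last reference has just been dropped.
  bool Deref() noexcept {
    return count_.fetch_sub(1, std::memory_order_acq_rel) == 1;
  }
  std::size_t UseCount() const noexcept {
    return count_.load(std::memory_order_relaxed);
  }

 protected:
  RefCounted() = default;
  ~RefCounted() = default;

 private:
  std::atomic<std::size_t> count_{1};  // The creator holds the first reference.
};

template <class T>
class IntrusivePtr {
 public:
  // Adopts a reference that already exists (for example the one from `new`).
  explicit IntrusivePtr(T* adopted) noexcept : ptr_(adopted) {}
  IntrusivePtr(const IntrusivePtr& other) noexcept : ptr_(other.ptr_) {
    if (ptr_) ptr_->Ref();
  }
  IntrusivePtr(IntrusivePtr&& other) noexcept
      : ptr_(std::exchange(other.ptr_, nullptr)) {}
  ~IntrusivePtr() {
    if (ptr_ && ptr_->Deref()) delete ptr_;
  }
  T* Get() const noexcept { return ptr_; }
  T* operator->() const noexcept { return ptr_; }

 private:
  T* ptr_ = nullptr;
};

struct Connection : RefCounted {
  int fd = -1;
};

// Builds three owners of one object and reports the count seen by the last.
std::size_t DemoUseCount() {
  IntrusivePtr<Connection> a(new Connection);  // One allocation, count == 1.
  IntrusivePtr<Connection> b = a;              // count == 2.
  Connection* raw = a.Get();
  raw->Ref();                                  // count == 3.
  IntrusivePtr<Connection> c(raw);             // Owner rebuilt from raw pointer.
  return c->UseCount();
}
```

Note that plain intrusive counting does not remove reference cycles: two objects holding IntrusivePtr to each other leak in the same way as with std::shared_ptr.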

Question about looping for read_events_ and write_events_

Could anyone help to explain why to use a loop for read_events_ here.

if (read_events_.fetch_add(1, std::memory_order_acquire) == 0) {
      RefPtr self_ref(ref_ptr, this);

      do {
        auto rc = OnReadable();
        if (FLARE_LIKELY(rc == EventAction::Ready)) {
          continue;
        } else if (FLARE_UNLIKELY(rc == EventAction::Leaving)) {
          FLARE_CHECK(read_mostly_.seldomly_used->cleanup_reason.load(
                          std::memory_order_relaxed) != CleanupReason::None,
                      "Did you forget to call `Kill()`?");
          GetEventLoop()->AddTask([this] {
            read_events_.store(0, std::memory_order_relaxed);
            QueueCleanupCallbackCheck();
          });
          break;
        } else if (rc == EventAction::Suppress) {
          SuppressReadAndClearReadEventCount();
          break;
        }
      } while (read_events_.fetch_sub(1, std::memory_order_release) !=
               1);
      QueueCleanupCallbackCheck();
    });

We can only get read_events_ == 1 after the if condition, so is there any special case that looping to decrease read_events_ is necessary?
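
One way to read the loop: the counter is 1 immediately after the `if`, but other threads may call fetch_add while the handler is running. They see a non-zero previous value and return without handling anything, so the thread already inside the loop has to keep going until it has consumed every increment. The sketch below reproduces that pattern in isolation. EventGate and CountHandledEvents are hypothetical names, and the "concurrent" events are simulated by re-entrant calls so that the example stays deterministic.

```cpp
#include <atomic>
#include <functional>

class EventGate {
 public:
  // Called once per incoming event, possibly from several threads.
  void Notify(const std::function<void()>& handler) {
    if (pending_.fetch_add(1, std::memory_order_acquire) == 0) {
      // We moved the counter from 0 to 1, so we are the only handler.
      do {
        handler();
        // Anyone who arrived meanwhile only bumped the counter; keep looping
        // until our decrement brings it back to 0.
      } while (pending_.fetch_sub(1, std::memory_order_release) != 1);
    }
    // Otherwise a handler is already running and will pick up our event.
  }

 private:
  std::atomic<int> pending_{0};
};

// Delivers one event, and injects two more while the handler is running.
// Returns how many times the handler ran.
int CountHandledEvents() {
  EventGate gate;
  int handled = 0;
  int to_inject = 2;
  std::function<void()> handler = [&] {
    ++handled;
    if (to_inject > 0) {
      --to_inject;
      gate.Notify(handler);  // Simulates an event arriving mid-handling.
    }
  };
  gate.Notify(handler);
  return handled;
}
```

Without the loop, the two injected events would be counted but never handled. This is an interpretation of the pattern, not a statement from the maintainers.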

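What are the design trade-offs behind DetachTimer?

What scenario is DetachTimer() meant for? The comment says it can be helpful in fire-and-forget cases. My understanding is that this means a one-shot timer (not a repeating one), but a corresponding CreateTimer() interface already exists, so I do not quite understand the design thinking behind DetachTimer or what exactly it does.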

Cannot build the project: it reports that Git LFS is missing, but I am sure it is installed

uddf@tuf_test_server:~/flare
$ git lfs install
Updated git hooks.
Git LFS initialized.

uddf@tuf_test_server:~/flare
$ ./blade build
Blade(error): This repository need GIT LFS, see README.md
Blade(error): Blade will exit...
uddf@tuf_test_server:~/flare
$ ./blade build flare
Blade(error): This repository need GIT LFS, see README.md
Blade(error): Blade will exit...
uddf@tuf_test_server:~/flare
$ ./blade build flare/BUILD
Blade(error): This repository need GIT LFS, see README.md
Blade(error): Blade will exit...
uddf@tuf_test_server:~/flare

Fail to build / run on Ubuntu 20.04

  • An undefined reference to pthread_once (even though we have already linked against pthread);
  • link_all_symbols does not work correctly with shared linking.

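[yadcc ARM] spinlock cannot find the symbol __aarch64_swp1_acq when building on ARM

srcs = 'spinlock.cc',

Building yadcc on ARM reports this error:

gcc-10.3.0/include/c++/10.3.0/bits/atomic_base.h:443: undefined reference to `__aarch64_swp1_acq'

On investigation, this symbol exists only in libatomic.a. The relationship between gcc and atomic seems a bit complicated. With the temporary workaround below, the build proceeds: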

diff --git a/flare/base/thread/BUILD b/flare/base/thread/BUILD
index 60e4805..9a2bed7 100644
--- a/flare/base/thread/BUILD
+++ b/flare/base/thread/BUILD
@@ -163,6 +163,9 @@ cc_library(
   deps = [
     '//flare/base:likely',
   ],
+  linkflags = [
+    '-Wl,-Bstatic -latomic -Wl,-Bdynamic',
+  ],
   visibility = 'PUBLIC',
 )

logging

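  • Does FLARE_LOG_INFO in flare do anything beyond the fmt::format optimization? Here is the problem: suppose the log level is set to WARN. The Hello World log below will then definitely not be printed, but the log message is still generated, which means the EXPECT fails.
  FLARE_LOG_INFO("{}", []() -> const char* {
    EXPECT_FALSE(true);
    return "Hello World!";
  }());
  • To change the log level in source code, is glog's FLAGS_minloglevel the only option? Why doesn't flare provide its own? To use FLAGS_minloglevel in logging_test I also have to link against glog, which seems to be a blade issue. In addition, the binary built directly by blade cannot be run directly (I wanted to change the log level with -minloglevel). Both points differ from bazel, which I had not noticed before.
    ./blade-bin/flare/base/internal/logging_test: error while loading shared libraries: libfmt.so.7: cannot open shared object file: No such file or directory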
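
On the first point, a common way to avoid evaluating log arguments for suppressed levels is to place them inside a level check within the macro. The sketch below is not Flare's or glog's implementation; DEMO_LOG, g_min_level and the helper functions are hypothetical names used only to show the pattern.

```cpp
#include <cstdio>
#include <string>

enum class Level { kInfo = 0, kWarning = 1 };

inline Level g_min_level = Level::kInfo;  // Hypothetical global threshold.
inline int g_side_effects = 0;            // Counts message constructions.

inline void WriteLog(const std::string& msg) {
  std::fprintf(stderr, "%s\n", msg.c_str());
}

// `expr` appears only inside the `if` body, so it is evaluated only when the
// level check passes.
#define DEMO_LOG(level, expr)     \
  do {                            \
    if ((level) >= g_min_level) { \
      WriteLog(expr);             \
    }                             \
  } while (0)

inline std::string ExpensiveMessage() {
  ++g_side_effects;
  return "Hello World!";
}

// Logs at INFO while the threshold is WARNING; returns how often the message
// was built.
inline int SideEffectsWhenSuppressed() {
  g_min_level = Level::kWarning;
  g_side_effects = 0;
  DEMO_LOG(Level::kInfo, ExpensiveMessage());
  return g_side_effects;
}

// Same call with the threshold lowered to INFO.
inline int SideEffectsWhenEnabled() {
  g_min_level = Level::kInfo;
  g_side_effects = 0;
  DEMO_LOG(Level::kInfo, ExpensiveMessage());
  return g_side_effects;
}
```

The trade-off is that callers must not rely on side effects in log arguments, since those disappear whenever the level is filtered out.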

Why does the intrusive linked list use composition rather than inheritance?

struct DoublyLinkedListEntry {
  DoublyLinkedListEntry* prev = this;
  DoublyLinkedListEntry* next = this;
};

template <class T>
class DoublyLinkedList {
public:
  static_assert(std::is_base_of_v<DoublyLinkedListEntry, T>);

private:
  static constexpr T* object_cast(DoublyLinkedListEntry* entry) noexcept {
    return reinterpret_cast<T*>(entry);
  }
  static constexpr DoublyLinkedListEntry* node_cast(T* ptr) noexcept {
    return reinterpret_cast<DoublyLinkedListEntry*>(ptr);
  }

  size_t size_{};
  DoublyLinkedListEntry head_;
};
  // Composition
  struct C {
    DoublyLinkedListEntry chain;
    int x;
  };
  DoublyLinkedList<C, &C::chain> list;
  // Inheritance
  struct C : public DoublyLinkedListEntry {
    int x;
  };
  DoublyLinkedList<C> list;

Question about Future and Async: will it block until the result has been produced?

brpc does not have a future type, and std::async behaves as follows:

If the std::future obtained from std::async is not moved from or bound to a reference, the destructor of the std::future will block at the end of the full expression until the asynchronous operation completes, essentially making code such as the following synchronous.

folly::Future, by contrast, detaches in its destructor.

I have gone through the documents:

  1. https://github.com/Tencent/flare/blob/master/flare/doc/fiber.md
  2. https://github.com/Tencent/flare/blob/master/flare/doc/future.md

I could not find anything that says whether it blocks or detaches on destruction. I also went through the code and found that it shares a std::shared_ptr<Core>, so it seems to behave as if detached. Did I get that right?

coredump when http overloaded

return std::make_unique<ProtoMessage>(std::move(meta), nullptr);

Shouldn't the line above be return std::make_unique<ProtoMessage>(std::move(meta), std::monostate()); instead?

After HTTP is overloaded, NormalConnectionHandler::WriteOverloaded is called:

auto msg = factory->Create(MessageFactory::Type::Overloaded,
corresponding_req.GetCorrelationId(), stream);
if (msg) { // Note that `MessageFactory::Create` may return `nullptr`.
WriteMessage(*msg, protocol, controller, kFastCallReservedContextId);
}

Execution then continues until, in the code below, msg.msg_or_buffer.index() == 1 holds but the pb obtained is null,

} else if (msg.msg_or_buffer.index() == 1) {
auto&& pb = *std::get<1>(msg.msg_or_buffer);
std::string rc;
if (content_type_ == ContentType::kApplicationJson) {
FLARE_CHECK(ProtoMessageToJson(pb, &rc, nullptr));

which causes ProtoMessageToJson to crash with a core dump at:
if (!message.IsInitialized()) {
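
The behavior described here matches how std::variant chooses an alternative: passing nullptr does not produce the empty std::monostate state, it selects a pointer-like alternative holding a null pointer. The sketch below uses a hypothetical two-alternative variant (MsgOrNothing, with Message standing in for a protobuf message) and is not Flare's actual msg_or_buffer type.

```cpp
#include <cstddef>
#include <memory>
#include <variant>

struct Message {  // Hypothetical stand-in for a protobuf message.
  int code = 0;
};

using MsgOrNothing = std::variant<std::monostate, std::unique_ptr<Message>>;

// nullptr converts to std::unique_ptr<Message>, so alternative 1 is selected.
std::size_t IndexFromNullptr() {
  MsgOrNothing v(nullptr);
  return v.index();
}

// std::monostate selects alternative 0, the "holds nothing" state.
std::size_t IndexFromMonostate() {
  MsgOrNothing v(std::monostate{});
  return v.index();
}

// The selected alternative exists, but the pointer inside it is null, so
// dereferencing it without a check would crash.
bool NullptrAlternativeHoldsNull() {
  MsgOrNothing v(nullptr);
  return std::get<1>(v) == nullptr;
}
```

Code that branches on index() == 1 and then dereferences therefore needs either a null check or a producer that passes std::monostate, as suggested above.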

bazel build ... fails

checking run-time libs availability... failed
configure: error: one or more libs available at link-time are not available run-time. Libs used at link-time: -lnghttp2   -lssl -lcrypto -lz 
_____ END BUILD LOGS _____
rules_foreign_cc: Build wrapper script location: bazel-out/k8-fastbuild/bin/external/com_github_curl_curl/curl_foreign_cc/wrapper_build_script.sh
rules_foreign_cc: Build script location: bazel-out/k8-fastbuild/bin/external/com_github_curl_curl/curl_foreign_cc/build_script.sh
rules_foreign_cc: Build log location: bazel-out/k8-fastbuild/bin/external/com_github_curl_curl/curl_foreign_cc/Configure.log

Does the server example in flare support starting with 1 thread?

./build64_release/flare/example/rpc/server --port=5510 --logtostderr --flare_concurrency_hint=1 --flare_enable_watchdog=true

I1020 15:56:15.628629 27869 flare/init.cc:114] Flare started.
I1020 15:56:15.629772 27869 flare/fiber/runtime.cc:425] Using fiber scheduling profile [io-heavy].
I1020 15:56:15.629823 27869 flare/fiber/runtime.cc:220] Starting 1 worker threads per group, for a total of 1 groups. The system is treated as UMA.
I1020 15:56:15.635567 27869 flare/init.cc:122] Flare runtime initialized.

The server is unable to start successfully.

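Is there a recommended order for reading the source code?

The code quality is high, but reading it is still hard going even with the docs. I am currently stuck in the flare/base directory: many things are interrelated and I do not know where to start. I hope the docs can include diagrams in the future.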

About scheduler_lock

// This lock is held when the fiber is in state-transition (e.g., from running
// to suspended). This is required since it's inherent racy when we add
// ourselves into some wait-chain (and eventually woken up by someone else)
// and go to sleep. The one who wake us up can be running in a different
// pthread, and therefore might wake us up even before we actually went sleep.
// So we always grab this lock before transiting the fiber's state, to ensure
// that nobody else can change the fiber's state concurrently.
//
// For waking up a fiber, this lock is grabbed by whoever the waker;
// For a fiber to go to sleep, this lock is grabbed by the fiber itself and
// released by *`SchedulingGroup`* (by the time we're sleeping, we cannot
// release the lock ourselves.).
//
// This lock also protects us from being woken up by several pthread
// concurrently (in case we waited on several waitables and have not removed
// us from all of them before more than one of then has fired.).
Spinlock scheduler_lock;

// Argument `context` (i.e., `this`) is only used the first time the context
// is jumped to (in `FiberProc`).
jump_context(&caller->state_save_area, state_save_area, this);

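Perhaps this `context` argument could be put to use: switch to the target fiber first, and then change the caller fiber's state from there, so that the lock is no longer needed for the state transition?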

inline void FiberEntity::Resume() noexcept {
  SetCurrentFiberEntity(this);
  state = FiberState::Running;
  // Argument `context`  set caller
  auto caller_ = jump_context(&caller->state_save_area, state_save_area, caller);
  // caller_ set nullptr when fiber return
  if (caller_) {
    static_cast<FiberEntity*>(caller_)->state = FiberState::Waiting;
  }
  ...
}

static void FiberProc(void* context) {
  auto caller = reinterpret_cast<FiberEntity*>(context);
  caller->state = FiberState::Waiting;
  //....
  current_fiber->state = FiberState::Dead;
  GetMasterFiberEntity()->M_return([](){...});
}

void FiberEntity::M_return(Function<void()>&& cb) noexcept {
  // set `resume_proc` ....
  SetCurrentFiberEntity(this);
  state = FiberState::Running;
  // set Argument `context`  nullptr
  jump_context(&caller->state_save_area, state_save_area, nullptr);
}

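Question about the Function class

The relevant file is flare/base/function.h. flare does not use std::function and instead uses its own Function implementation. What is the purpose or advantage of this?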
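
One commonly cited limitation of std::function is that it requires the stored callable to be copy-constructible, so a lambda that captures a std::unique_ptr cannot be stored in it. The sketch below shows a minimal move-only wrapper that lifts this restriction. MoveOnlyTask is a hypothetical class written for this example, not Flare's Function, and whether this is the motivation behind Function is an assumption.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Minimal move-only, type-erased wrapper for callables returning int.
class MoveOnlyTask {
 public:
  template <class F>
  explicit MoveOnlyTask(F f)
      : impl_(std::make_unique<Model<F>>(std::move(f))) {}

  MoveOnlyTask(MoveOnlyTask&&) noexcept = default;
  MoveOnlyTask& operator=(MoveOnlyTask&&) noexcept = default;

  int operator()() { return impl_->Call(); }

 private:
  struct Concept {
    virtual ~Concept() = default;
    virtual int Call() = 0;
  };
  template <class F>
  struct Model final : Concept {
    explicit Model(F f) : fn(std::move(f)) {}
    int Call() override { return fn(); }
    F fn;
  };
  std::unique_ptr<Concept> impl_;
};

// Stores a lambda that owns a std::unique_ptr, moves the wrapper, calls it.
int RunMoveOnlyLambda() {
  auto owned = std::make_unique<int>(42);
  auto lambda = [p = std::move(owned)] { return *p; };
  // This closure cannot be copied, so std::function<int()> would reject it.
  static_assert(!std::is_copy_constructible_v<decltype(lambda)>);
  MoveOnlyTask task(std::move(lambda));
  MoveOnlyTask moved = std::move(task);
  return moved();
}
```

Custom wrappers are also often used to control small-object storage and call overhead, but this sketch does not attempt either.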

Why use std::common_type in TryParseTraits

Could anyone help explain why std::common_type is needed here instead of just using T()?

template <class T, class>
struct TryParseTraits {
  // The default implementation delegates calls to `TryParse` found by ADL.
  template <class... Args>
  static std::optional<T> TryParse(std::string_view s, const Args&... args) {
    return TryParse(std::common_type<T>(), s, args...);
  }
};
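
A general observation that may help: std::common_type<T> used this way is just an empty tag object. Creating it never constructs a T, so T does not need to be default-constructible, and because T is a template argument of the tag, argument-dependent lookup still searches T's namespace. The sketch below illustrates both points with hypothetical names (lib::Parse, app::Port, TryParseImpl); it is not Flare's code.

```cpp
#include <optional>
#include <string_view>
#include <type_traits>

namespace lib {

// Generic entry point. The overload is selected by an empty tag and found via
// ADL at instantiation time; no `T` object is created here.
template <class T>
std::optional<T> Parse(std::string_view s) {
  return TryParseImpl(std::common_type<T>(), s);
}

}  // namespace lib

namespace app {

// Deliberately not default-constructible: `Port()` would not compile.
struct Port {
  explicit Port(int v) : value(v) {}
  int value;
};

// Found through ADL because `app` is an associated namespace of
// `std::common_type<app::Port>`.
inline std::optional<Port> TryParseImpl(std::common_type<Port>,
                                        std::string_view s) {
  if (s.empty() || s.size() > 5) return std::nullopt;
  int v = 0;
  for (char c : s) {
    if (c < '0' || c > '9') return std::nullopt;
    v = v * 10 + (c - '0');
  }
  if (v > 65535) return std::nullopt;
  return Port(v);
}

}  // namespace app
```

With T() as the tag instead, the call in lib::Parse would fail to compile for app::Port, and for other types it would construct a value only to select an overload.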

How to build flare using blade?

Hi there,
I have installed blade and ninja but don't know how to build flare or examples in flare project. I have no experience of blade and ninja. I think I installed blade and ninja correctly. Below is some log:

`luis@ubuntu:/mnt/hgfs/E/flare/flare/example$ blade -h
usage: blade [-h] [--version] {build,run,test,clean,query,dump} ...

blade [options...] [targets...]

positional arguments:
{build,run,test,clean,query,dump}
Available subcommands
build Build specified targets
run Build and runs a single target
test Build the specified targets and runs tests
clean Remove all blade-created output
query Execute a dependency graph query
dump Dump specified internal information

optional arguments:
-h, --help show this help message and exit
--version show program's version number and exit
luis@ubuntu:/mnt/hgfs/E/flare/flare/example$ ninja --version
1.10.0
luis@ubuntu:/mnt/hgfs/E/flare/flare/example$
`

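Has anyone managed to build it?

./blade build ... fails.

image

I then changed into the flare/flare directory and ran ../blade build

The error above no longer appears, but the build hangs:

[2/788] CMAKE BUILD //thirdparty/opentracing-cpp:opentracing-cpp_build

The progress does not advance.

git lfs is already installed.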

Unable to link, ubuntu 20.04

blade build ... works, no errors, however when running blade test, many .so cannot be correctly linked.
Latest head of flare, latest head of blade-build

Details

blade test flare/base:endian_test
Blade(info): Loading BUILD files...
Blade(info): Loading done.
Blade(info): Analyzing dependency graph...
Blade(info): Analyzing done.
Blade(info): Generating backend build code...
Blade(info): Generating done.
Blade(info): Building...
Blade(info): Adjust build jobs number(-j N) to be 20
ninja: warning: bad deps log signature or version; starting over
[46/46] LINK BINARY build64_release/flare/base/endian_test
Blade(info): Build success.
Blade(info): Adjust test jobs number(-j N) to be 10
Blade(notice): 1 tests to run
Blade(info): Spawn 1 worker thread(s) to run concurrent tests
repo/flare/build64_release/flare/base/endian_test: error while loading shared libraries: libgtest_main.so: cannot open shared object file: No such file or directory
Blade(info): [1/0/1] Test //flare/base:endian_test finished : FAILED:127

ldd result:

~/repo/flare$ ldd blade-bin/flare/base/endian_test
linux-vdso.so.1 (0x00007ffc89d05000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007feb7aab2000)
libgtest_main.so => not found
libgtest.so => not found
libtcmalloc.so.4 => not found
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007feb7a8d1000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007feb7a8b6000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007feb7a6c2000)
/lib64/ld-linux-x86-64.so.2 (0x00007feb7aafe000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007feb7a573000)

I wonder how linking works in blade?

clang-format

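Here is the thing: I noticed that the formatting is off in some places in the code.
@0x804d8000 Which clang-format version do you use?
Regarding the Google-style header ordering issue, could the clang-format file be modified so that formatting handles it?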

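FLARE_CHECK and similar names are too long and could be shortened

First of all, the code quality is quite good and the functionality is quite complete; it could well serve as a base library.

A small suggestion: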

FLARE_LOG_INFO
FLARE_LOG_WARNING
FLARE_CHECK
FLARE_LIKELY
FLARE_DCHECK
FLARE_LOG_WARNING_EVERY_SECOND

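These macro names are too verbose. They could simply be changed to FLOG_INFO, FCHECK, and so on.

namespace flare will probably be used a lot; if there is a more concise scheme, it should be adopted.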

clang-17 asan/tsan fiber coredump

environment

docker ubuntu22.04 clang 17 bazel

.bazelrc add new configs

build:tsan --config=san-common --copt=-fsanitize=thread --linkopt=-fsanitize=thread
build:tsan_llvm --config=llvm --config=tsan
build:asan_llvm --config=llvm --config=asan

future_test tsan coredump

build command : bazel build --config tsan_llvm //flare/fiber:*
log:

# ./bazel-bin/flare/fiber/future_test
Running main() from gmock_main.cc
[==========] Running 3 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 3 tests from Future
[ RUN      ] Future.BlockingGet
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1102 03:38:13.659870 1061357 runtime.cc:425] Using fiber scheduling profile [neutral].
I1102 03:38:13.660069 1061357 runtime.cc:224] Starting 18 worker threads per group, for a total of 2 groups. The system is treated as UMA.
ThreadSanitizer:DEADLYSIGNAL
==1061357==ERROR: ThreadSanitizer: SEGV on unknown address 0x7f82f653bff8 (pc 0x558d1af98b66 bp 0x7f830f331df0 sp 0x7f830f331be8 T1061423)
==1061357==The signal is caused by a WRITE memory access.
ThreadSanitizer:DEADLYSIGNAL
ThreadSanitizer: nested bug in the same thread, aborting.

future_test asan coredump

build command : bazel build --config asan_llvm //flare/fiber:*
log:

./bazel-bin/flare/fiber/future_test
Running main() from gmock_main.cc
[==========] Running 3 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 3 tests from Future
[ RUN      ] Future.BlockingGet
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1102 03:45:45.175843 1084756 runtime.cc:425] Using fiber scheduling profile [neutral].
I1102 03:45:45.176086 1084756 runtime.cc:224] Starting 18 worker threads per group, for a total of 2 groups. The system is treated as UMA.
AddressSanitizer:DEADLYSIGNAL
.......
=================================================================
==1084756==ERROR: AddressSanitizer: SEGV on unknown address 0x7f52e0199a00 (pc 0x55e2ca67c61d bp 0x7f52e05be6d0 sp 0x7f52e05be600 T42)
==1084756==The signal is caused by a WRITE memory access.
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
    #0 0x55e2ca67c61d in flare::internal::asan::StartSwitchFiber(void**, void const*, unsigned long) /proc/self/cwd/./flare/base/internal/annotation.h:161:1
    #1 0x55e2ca67af9c in flare::fiber::detail::FiberEntity::Resume() /proc/self/cwd/./flare/fiber/detail/fiber_entity.h:343:3
    #2 0x55e2ca684aef in flare::fiber::detail::FiberEntity::ResumeOn(flare::Function<void ()>&&) /proc/self/cwd/flare/fiber/detail/fiber_entity.cc:179:3
    #3 0x55e2ca6872eb in flare::fiber::detail::FiberProc(void*) /proc/self/cwd/flare/fiber/detail/fiber_entity.cc:133:29

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /proc/self/cwd/./flare/base/internal/annotation.h:161:1 in flare::internal::asan::StartSwitchFiber(void**, void const*, unsigned long)
Thread T42 created by T0 here:
    #0 0x55e2ca4ade7d in pthread_create (/root/.cache/bazel/_bazel_root/c5f028faa5e19a5e19c31dee93eeaf11/execroot/__main__/bazel-out/k8-dbg/bin/flare/fiber/future_test+0x111e7d) (BuildId: 40d8dc05e8ca6ce1fa4f444d1ddbaa8b42ab93e8)
    #1 0x7f5313b16328 in std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State>>, void (*)()) (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc328) (BuildId: e37fe1a879783838de78cbc8c80621fa685d58a2)
    #2 0x55e2ca6783a0 in flare::fiber::detail::FiberWorker::Start(bool) /proc/self/cwd/flare/fiber/detail/fiber_worker.cc:40:13
    #3 0x55e2ca575750 in flare::fiber::(anonymous namespace)::FullyFledgedSchedulingGroup::Start(bool) /proc/self/cwd/flare/fiber/runtime.cc:118:10
    #4 0x55e2ca57277a in flare::fiber::StartRuntime() /proc/self/cwd/flare/fiber/runtime.cc:490:11
    #5 0x55e2ca505034 in void flare::fiber::testing::RunAsFiber<flare::Future_BlockingGet_Test::TestBody()::$_0>(flare::Future_BlockingGet_Test::TestBody()::$_0&&) /proc/self/cwd/./flare/fiber/detail/testing.h:41:3
    #6 0x55e2ca504e9e in flare::Future_BlockingGet_Test::TestBody() /proc/self/cwd/flare/fiber/future_test.cc:33:3
    #7 0x55e2cab8f1ff in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2599:10
    #8 0x55e2cab552b6 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2635:14
    #9 0x55e2cab11bcd in testing::Test::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2674:5
    #10 0x55e2cab138f4 in testing::TestInfo::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2853:11
    #11 0x55e2cab14fc0 in testing::TestSuite::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:3012:30
    #12 0x55e2cab3937c in testing::internal::UnitTestImpl::RunAllTests() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:5870:44
    #13 0x55e2cab95daf in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2599:10
    #14 0x55e2cab5b86d in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2635:14
    #15 0x55e2cab38655 in testing::UnitTest::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:5444:10
    #16 0x55e2caaf1ca0 in RUN_ALL_TESTS() /proc/self/cwd/external/com_google_googletest/googletest/include/gtest/gtest.h:2293:73
    #17 0x55e2caaf1bfe in main /proc/self/cwd/external/com_google_googletest/googlemock/src/gmock_main.cc:70:10
    #18 0x7f531371ed8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16

==1084756==ABORTING

Timer

std::uint64_t SetTimer(std::chrono::steady_clock::time_point at,
                       std::chrono::nanoseconds interval,
                       Function<void(std::uint64_t)>&& cb) {
  // This is ugly. But since we have to start a fiber each time user's `cb` is
  // called, we must share it.
  //
  // We also take measures not to call user's callback before the previous call
  // has returned. Otherwise we'll likely crash user's (presumably poor) code.
  struct UserCallback {
    void Run(std::uint64_t tid) {
      if (!running.exchange(true, std::memory_order_acq_rel)) {
        cb(tid);
      }
      running.store(false, std::memory_order_relaxed);
      // Otherwise this call is lost. This can happen if user's code runs too
      // slowly. For the moment we left the behavior as unspecified.
    }
    Function<void(std::uint64_t)> cb;
    std::atomic<bool> running{};
  };
  auto ucb = std::make_shared<UserCallback>();
  ucb->cb = std::move(cb);
  auto sg = detail::NearestSchedulingGroup();
  auto timer_id = sg->CreateTimer(at, interval, [ucb](auto tid) mutable {
    internal::StartFiberDetached([ucb, tid] { ucb->cb(tid); });
  });
  sg->EnableTimer(timer_id);
  return timer_id;
}

I don't understand the purpose of UserCallback. The Run function is never called anywhere, which makes running pointless as well.
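For reference, here is a minimal standalone sketch of what the comment in the quoted code seems to intend: an overlap guard that drops a timer tick while the previous callback is still running. This is my reading, not a confirmed fix. `GuardedCallback` is a hypothetical name, and `std::function` stands in for flare's `Function` so the snippet compiles on its own.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical stand-in for `UserCallback`; not part of flare.
struct GuardedCallback {
  explicit GuardedCallback(std::function<void(std::uint64_t)> f)
      : cb(std::move(f)) {
    assert(cb != nullptr);  // An empty callback would throw when invoked.
  }

  // Returns true if `cb` ran, false if the call was dropped because a
  // previous invocation has not returned yet.
  bool Run(std::uint64_t tid) {
    if (running.exchange(true, std::memory_order_acq_rel)) {
      return false;  // Someone else holds the flag: this tick is lost.
    }
    cb(tid);
    // Only the caller that set the flag clears it.
    running.store(false, std::memory_order_release);
    return true;
  }

  std::function<void(std::uint64_t)> cb;
  std::atomic<bool> running{false};
};
```

Under this reading, the fiber started by the timer would call `ucb->Run(tid)` rather than `ucb->cb(tid)`. Note that the sketch clears `running` only on the path that set it. In the quoted code the `store(false)` sits outside the `if`, so a dropped call would also clear the flag while the first call is still in progress.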

Fail to build

Running ./blade build ... gave me the following errors:

ERROR:root:code for hash sha512 was not found.
Traceback (most recent call last):
  File "/home/server_dev/.blade_boost/lib/python2.7/hashlib.py", line 139, in <module>
    globals()[__func_name] = __get_hash(__func_name)
  File "/home/server_dev/.blade_boost/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor
    raise ValueError('unsupported hash type %s' % name)
ValueError: unsupported hash type sha512
...
... (similar error for other hashes)
...
Traceback (most recent call last):
  File "/home/server_dev/.blade_boost/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/server_dev/.blade_boost/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/server_dev/repo/flare/thirdparty/blade/blade.zip/__main__.py", line 14, in <module>
  File "/home/server_dev/repo/flare/thirdparty/blade/blade.zip/blade/__init__.py", line 6, in <module>
  File "/home/server_dev/repo/flare/thirdparty/blade/blade.zip/blade/config.py", line 422, in <module>
  File "/home/server_dev/repo/flare/thirdparty/blade/blade.zip/blade/config.py", line 44, in __init__
AttributeError: 'module' object has no attribute 'md5'

I thought my OpenSSL library was missing, so I reinstalled it, but still no luck.
