datavane / datasophon

A next-generation cloud-native big data management tool that aims to help users rapidly build stable, efficient, and scalable cloud-native platforms for big data.

Home Page: https://datasophon.github.io/datasophon-website/

License: Apache License 2.0

Java 65.83% Shell 10.46% FreeMarker 1.92% Vue 21.04% Stylus 0.01% Less 0.71% Dockerfile 0.02%
doris kubernetes spark yarn cloudnative easy-to-use

datasophon's Introduction

DataSophon

Makes it easy to manage and monitor clusters

Official Website | Chinese

If you like it, star it, fork it, and join us.

Vision

DataSophon aims to quickly deploy, manage, and monitor Big Data service components and nodes and to automate their operation and maintenance, helping you quickly build stable, efficient Big Data cluster services.

What is DataSophon?

The Three-Body Problem, winner of the Hugo Award, one of the highest honors in science fiction, is known for its stunning "hard science fiction" style, and its author Liu Cixin is credited with "single-handedly raising Chinese science fiction to a world-class level".

The Sophon plays a very important role for the Trisolarans. It is a nine-dimensional proton unfolded into two dimensions, turned into a supercomputer through circuit etching, and then returned to the microscopic eleventh dimension. It monitors every human movement and uses quantum entanglement to report instantaneously to the Trisolaran civilization four light years away. Put bluntly, the Sophon is an AI-based real-time remote monitoring and management platform deployed on Earth by the Trisolaran civilization.

DataSophon is a similar management platform. Unlike the Sophon, which aims to restrict humanity's basic science and hinder its technological development, DataSophon is dedicated to the automated monitoring, operation, and management of Big Data infrastructure components and nodes, helping you quickly build a stable, efficient Big Data cluster service.

Key Features

  • Easy to deploy: can quickly complete the deployment of a big data cluster of about 300 nodes.
  • Compatible with domestic platforms: supports ARM servers and common domestically developed operating systems.
  • Comprehensive monitoring metrics, drawn from production practice, that show what users care about most.
  • Flexible and convenient alarm service with user-defined alarm groups and alarm metrics.
  • Highly extensible: users can integrate or upgrade any component through configuration.

Product Architecture

[Figure: product architecture diagram]

Architecture

[Figure: architecture diagram]

Questions

For questions, bug reports, and support requests, please open an issue; we will reply promptly.

Communication and Contribution

You are welcome to join the community to communicate and share, and to contribute to it. Thank you very much for your support. WeChat community group (recommended): scan the QR code to add the WeChat contact and include the note "Datasophon" in your request; you will then be invited to the group.

[Image: WeChat QR code]

datasophon's People

Contributors

88fantasy, a19920714liou, alldatafounder, ceohui, chenss-1, datasophon, gaozhenfeng1, gmady520, green241, gtk96, haitaodesign, hitozhu, hzluting, javaht, lipeng186, lishiyucn, liu-hai, liugddx, liuxin319, lnnlab, luoyajun10, luyangyang12138, thomasg19930417, wujieren, zhangdw123, zhangkeyu008, zhegemingzimeibanquan, zhzhenqin, zq0757, zzm0809

datasophon's Issues

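Will the manager and the front end be open-sourced?

Which chapter do you think needs improvement?

Will the manager and the front end be open-sourced?

What information do you think needs to be added?

Will the manager and the front end be open-sourced?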

[Bug]: Add Service Error

What happened?

image

Additional Information

When I added the HDFS service but abandoned the process at the last step, an error occurred.
Two HDFS services now appear in the list, and they cannot be deleted.

[Elasticsearch]: package is not found

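The dashboard offers 'Elasticsearch' as a service to add, but its installation package is missing from the packages directory.

After deploying the framework according to the guide, Elasticsearch appears on the page, but there is no installation package for it in packages.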

[Feature Request]: Retry button

Tell us what feature you want?

1. During installation, could there be a retry button?
2. Installation often gets stuck at some step and stops progressing.

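[Feature Request]: Could taking over big data clusters already deployed with Apache releases be supported?

Tell us what feature you want?

Background

Many small companies have no budget to buy EMR, and HDP and CDH are so tightly integrated that they may not offer a set of component versions that meets business requirements. For example, in HDP 3, Hadoop is 3.x but Spark is 2.x.

So some small companies choose to integrate the Apache releases themselves. We did: the integration took me a great deal of time, and afterwards the cluster was still hard to manage. We could only write automation scripts to simplify daily operations, with very little visualization.

Wants

It would be even better if there were plans to support taking over clusters that are already running in production. If not, we may have to build this ourselves, using operations tools such as salt to implement something like a Supervisor.

Please consider the feasibility of implementing this in datasophon.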

[Bug]: NullPointerException when dispatch agent

What happened?

image

When the process reaches this step, the runtime log shows the following exception:

[INFO] 2022-12-04 14:15:45 com.datasophon.api.master.handler.host.StartWorkerHandler:[66] - end dispatcher host agent :ctyun9
[INFO] 2022-12-04 14:15:51 com.datasophon.api.master.alert.HostCheckActor:[27] - start to check host info
[INFO] 2022-12-04 14:15:56 com.datasophon.api.master.WorkerStartActor:[39] - receive message when worker first start :ctyun9
[INFO] 2022-12-04 14:15:56 com.datasophon.api.master.WorkerStartActor:[55] - host install set to 100%
[INFO] 2022-12-04 14:15:56 com.datasophon.api.master.WorkerStartActor:[69] - host install save to database
[ERROR] [12/04/2022 14:15:56.773] [datasophon-akka.actor.default-dispatcher-156] [akka://datasophon/user/master/prometheusActor] null
java.lang.NullPointerException
at com.datasophon.api.master.PrometheusActor.onReceive(PrometheusActor.java:144)
at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
at akka.actor.ActorCell.invoke(ActorCell.scala:495)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)

Additional Information

No response

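[Feature Request]: Customizable installation path and HDFS nameservice name

Tell us what feature you want?

1. The installation directory currently cannot be customized. The installation path /opt/datasophon is hard-coded in many places in the code, so installation fails after the install.path property is changed in common.properties. We hope a custom installation path can be truly supported.
2. After the HDFS nameservice is changed, several configuration items that depend on the nameservice value, such as dfs.ha.namenodes.nameservice1 and dfs.namenode.rpc-address.nameservice1.nn1, are not updated to the configured nameservice. We hope this can be improved.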
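
The nameservice-dependent keys named in this request can be derived from a single nameservice value instead of being hard-coded. The following is a minimal Java sketch of that idea, not DataSophon's actual code: the class name, method name, nameservice `mycluster`, and host addresses are all illustrative assumptions. Only the property name patterns (`dfs.nameservices`, `dfs.ha.namenodes.<nameservice>`, `dfs.namenode.rpc-address.<nameservice>.<namenode>`) come from HDFS HA configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NameserviceKeys {

    /**
     * Builds the HA-related HDFS properties for one nameservice.
     * rpcAddresses maps a NameNode ID (e.g. "nn1") to its host:port.
     */
    static Map<String, String> haProperties(String nameservice,
                                            Map<String, String> rpcAddresses) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("dfs.nameservices", nameservice);
        // The key name itself embeds the nameservice, so it must be rebuilt on rename.
        props.put("dfs.ha.namenodes." + nameservice,
                String.join(",", rpcAddresses.keySet()));
        for (Map.Entry<String, String> e : rpcAddresses.entrySet()) {
            props.put("dfs.namenode.rpc-address." + nameservice + "." + e.getKey(),
                    e.getValue());
        }
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical NameNode IDs and addresses, for illustration only.
        Map<String, String> nameNodes = new LinkedHashMap<>();
        nameNodes.put("nn1", "host1:8020");
        nameNodes.put("nn2", "host2:8020");
        haProperties("mycluster", nameNodes)
                .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Generating every dependent key from one input means a renamed nameservice cannot leave stale `nameservice1` keys behind.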

[Bug]: Optimize state synchronization

What happened?

A bug happened!

Additional Information

When 113 DataNodes are added in one batch, only 23 are shown as successful on the page. The rest are shown as unsuccessful at first, and their status slowly changes to successful afterwards.
We hope status synchronization can be optimized and an intermediate status (such as a progress bar) displayed, to avoid misleading users.
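
One way to picture the requested intermediate status is to report hosts that have not finished as "in progress" rather than as failed. The sketch below is only an illustration under that assumption: the `State` enum, class name, and summary format are hypothetical and are not taken from DataSophon.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class InstallProgress {

    // Hypothetical per-host states; the real project may model this differently.
    enum State { PENDING, INSTALLING, SUCCESS, FAILED }

    /** Summarizes a batch so unfinished hosts are not reported as failures. */
    static String summarize(List<State> states) {
        long ok = states.stream().filter(s -> s == State.SUCCESS).count();
        long failed = states.stream().filter(s -> s == State.FAILED).count();
        long inProgress = states.size() - ok - failed;
        long percent = states.isEmpty() ? 0 : ok * 100 / states.size();
        return String.format("%d/%d succeeded (%d%%), %d in progress, %d failed",
                ok, states.size(), percent, inProgress, failed);
    }

    public static void main(String[] args) {
        // The situation from the report: 23 of 113 DataNodes finished so far.
        List<State> states = new ArrayList<>(Collections.nCopies(23, State.SUCCESS));
        states.addAll(Collections.nCopies(90, State.INSTALLING));
        System.out.println(summarize(states));
        // prints "23/113 succeeded (20%), 90 in progress, 0 failed"
    }
}
```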

[Bug]: flink/spark on yarn: submitted by user root application rejected by placement rules.

/opt/datasophon/flink-1.15.2/bin/flink run -t yarn-per-job /opt/datasophon/flink-1.15.2/examples/batch/WordCount.jar --input /test/input/word.txt --output /test/output/fwordcount/

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/datasophon/flink-1.15.2/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/datasophon/hadoop-3.3.3/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2022-12-09 17:38:31,895 WARN org.apache.flink.yarn.configuration.YarnLogConfigUtil [] - The configuration directory ('/opt/datasophon/flink-1.15.2/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2022-12-09 17:38:32,108 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2022-12-09 17:38:32,120 WARN org.apache.flink.yarn.YarnClusterDescriptor [] - Job Clusters are deprecated since Flink 1.15. Please use an Application Cluster/Application Mode instead.
2022-12-09 17:38:32,243 INFO org.apache.hadoop.conf.Configuration [] - resource-types.xml not found
2022-12-09 17:38:32,243 INFO org.apache.hadoop.yarn.util.resource.ResourceUtils [] - Unable to find 'resource-types.xml'.
2022-12-09 17:38:32,286 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - The configured JobManager memory is 1600 MB. YARN will allocate 2048 MB to make up an integer multiple of its minimum allocation memory (1024 MB, configured via 'yarn.scheduler.minimum-allocation-mb'). The extra 448 MB may not be used by Flink.
2022-12-09 17:38:32,286 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - The configured TaskManager memory is 1728 MB. YARN will allocate 2048 MB to make up an integer multiple of its minimum allocation memory (1024 MB, configured via 'yarn.scheduler.minimum-allocation-mb'). The extra 320 MB may not be used by Flink.
2022-12-09 17:38:32,286 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Cluster specification: ClusterSpecification{masterMemoryMB=1600, taskManagerMemoryMB=1728, slotsPerTaskManager=1}
2022-12-09 17:38:33,769 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Removing 'localhost' Key: 'jobmanager.bind-host' , default: null (fallback keys: []) setting from effective configuration; using '0.0.0.0' instead.
2022-12-09 17:38:33,770 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Removing 'localhost' Key: 'taskmanager.bind-host' , default: null (fallback keys: []) setting from effective configuration; using '0.0.0.0' instead.
2022-12-09 17:38:33,799 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Submitting application master application_1670577058759_0008


The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: Could not deploy Yarn job cluster.
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222)
at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:836)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:247)
at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1078)
at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1156)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1156)
Caused by: org.apache.flink.client.deployment.ClusterDeploymentException: Could not deploy Yarn job cluster.
at org.apache.flink.yarn.YarnClusterDescriptor.deployJobCluster(YarnClusterDescriptor.java:491)
at org.apache.flink.client.deployment.executors.AbstractJobClusterExecutor.execute(AbstractJobClusterExecutor.java:82)
at org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1053)
at org.apache.flink.client.program.ContextEnvironment.executeAsync(ContextEnvironment.java:132)
at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:70)
at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:93)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355)
... 11 more
Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1670577058759_0008 to YARN : Reject application application_1670577058759_0008 submitted by user root application rejected by placement rules.
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:336)
at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:1240)
at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:616)
at org.apache.flink.yarn.YarnClusterDescriptor.deployJobCluster(YarnClusterDescriptor.java:484)
... 21 more
2022-12-09 17:38:33,842 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Cancelling deployment from Deployment Failure Hook
2022-12-09 17:38:33,843 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Killing YARN application
2022-12-09 17:38:33,848 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl [] - Killed application application_1670577058759_0008
2022-12-09 17:38:33,849 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Deleting files in hdfs://nameservice1/user/root/.flink/application_1670577058759_0008.
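
The memory figures in the log above (1600 MB and 1728 MB each becoming 2048 MB) come from rounding every request up to a multiple of yarn.scheduler.minimum-allocation-mb. The Java sketch below reproduces only that arithmetic, assuming plain round-up-to-multiple behavior as the log message describes. The class and method names are illustrative and are not part of YARN or Flink. The rounding is informational and separate from the placement-rule rejection reported in this issue.

```java
public class YarnMemoryRounding {

    /** Rounds a memory request up to the next multiple of the minimum allocation. */
    static int roundUp(int requestedMb, int minAllocationMb) {
        return ((requestedMb + minAllocationMb - 1) / minAllocationMb) * minAllocationMb;
    }

    public static void main(String[] args) {
        int minAllocationMb = 1024; // value reported in the log
        for (int requested : new int[] {1600, 1728}) {
            int allocated = roundUp(requested, minAllocationMb);
            System.out.println(requested + " MB -> " + allocated
                    + " MB (extra " + (allocated - requested) + " MB)");
        }
        // prints:
        // 1600 MB -> 2048 MB (extra 448 MB)
        // 1728 MB -> 2048 MB (extra 320 MB)
    }
}
```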

Who is using DataSophon?

We sincerely thank everyone who keeps using and supporting DataSophon. We will do our best to make DataSophon better and to help the community and ecosystem prosper.

The original intention of this issue

  • We'd like to listen to the community to make DataSophon better.
  • We'd like to learn more about practical usage scenarios of DataSophon to guide the next stage of planning.

What we expect from you

Please submit a comment in this issue that includes the following information:

  • logo: your company/school/organization logo
  • name: your company/school/organization name
  • website: your company/school/organization website
  • contact: contact info, e.g. blog, email, Twitter (at least one)
  • usage scenario: the business scenario in which you use DataSophon

[Feature][server] Add Apache Kyuubi

Apache Kyuubi is a distributed and multi-tenant gateway that provides serverless SQL on Data Warehouses and Lakehouses.[1]

Kyuubi provides a docker compose based playground[2] and an online "try me" environment[3] (based on arm64).

Kyuubi has been adopted by AliCloud EMR[4] and TencentCloud EMR[5].

Kyuubi's minimal dependencies are Zookeeper 3.4+ and Spark 3.1+ (assuming you want to use Kyuubi as the Spark Thrift Gateway).

[1] https://mp.weixin.qq.com/s/5Sj12_qbQTnCOcZVZUSNhg
[2] https://github.com/apache/incubator-kyuubi/tree/master/docker/playground
[3] https://try.kyuubi.cloud/
[4] https://help.aliyun.com/document_detail/439451.html
[5] https://cloud.tencent.com/document/product/589/72001

[Feature Request]: remove jsch

Tell us what feature you want?

Because jsch behaves abnormally on some operating systems, we need to remove it and use the native Linux ssh command to implement its functions.
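
A rough Java sketch of what calling the system ssh client could look like, using the JDK's `ProcessBuilder` in place of a Java SSH library. This is an assumption about one possible approach, not the project's implementation: the class and method names are hypothetical, and the user, host, and remote command in `main` are placeholders. `-p` and `-o BatchMode=yes` are standard OpenSSH client options.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class SshCommand {

    /** Builds the argument list for the system ssh client. */
    static List<String> build(String user, String host, int port, String remoteCommand) {
        List<String> cmd = new ArrayList<>();
        cmd.add("ssh");
        cmd.add("-p");
        cmd.add(String.valueOf(port));
        // BatchMode disables password prompts, so a missing key fails fast
        // instead of blocking the calling process.
        cmd.add("-o");
        cmd.add("BatchMode=yes");
        cmd.add(user + "@" + host);
        cmd.add(remoteCommand);
        return cmd;
    }

    /** Runs the command, forwarding its output, and returns the exit code. */
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Only prints the command; call run(...) to actually execute it.
        System.out.println(String.join(" ", build("root", "ctyun9", 22, "hostname")));
        // prints "ssh -p 22 -o BatchMode=yes root@ctyun9 hostname"
    }
}
```

Passing the arguments as a list avoids going through a local shell. The remote command is still interpreted by the remote shell, so it should not be built from untrusted input.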
