
81981266 / bigdataproject


This is my big data class project.

Shell 2.43% HTML 89.68% Python 3.65% Jupyter Notebook 4.24%

bigdataproject's Introduction

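Big Data Storage and Analysis

Preface

The goal of this project is to build a fully distributed system on Alibaba Cloud and run Hadoop3 + Spark2 + MongoDB + something else to do some fun things; the specifics are still being planned and explored. This is the author's first hands-on experience with a big data platform, so in the spirit of "study with output", the whole process is recorded, from downloading the software to getting the code running, with an effort to understand why each step works and not only how. As a result, the notes are fairly detailed and suitable for beginners.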

Approach

Learn and take notes along the way.

Contents

Since environment setup needs only a single document, the setup instructions are kept together in the Documentations folder.

Environment Setup and Hello-World DEMO

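Dataset Overview

Public Datasets

These datasets can be downloaded from public sources.

Crawled Datasets

These datasets were collected by the author with web crawlers.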

Getting Started
