babeldb's Introduction

BabelDB

“The library will endure; it is the universe. As for us, everything has not been written; we are not turning into phantoms. We walk the corridors, searching the shelves and rearranging them, looking for lines of meaning amid leagues of cacophony and incoherence, reading the history of the past and our future, collecting our thoughts and collecting the thoughts of others, and every so often glimpsing mirrors, in which we may recognize creatures of the information.” ― Jorge Luis Borges, The Library of Babel

⚠️ BabelDB is an ongoing "Sci-Fi" experimentation project.

BabelDB is an in-memory website database. It combines a programmatic data extraction engine with scheduling and data clustering, and it offers a lightweight, standard SQL syntax alongside a DSL for querying, searching, and information retrieval. BabelDB continuously ingests data from any pre-defined seed web source and lets you query that data with standard SQL. It also provides its own query language, BabelQL, built on top of the engine to provide search capabilities such as full-text search, term and phrase matching, regex, and more.
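To make the idea concrete, here is a minimal Python sketch of querying extracted page data held in memory with plain SQL, plus a regex search. It does not use BabelDB or BabelQL: the `pages` table, the sample rows, and the helper names `build_db`, `term_search`, and `regex_search` are all assumptions made for illustration, and SQLite from the Python standard library stands in for the engine.

```python
import re
import sqlite3

# Hypothetical sample rows standing in for pages ingested from seed sources.
PAGES = [
    ("https://example.com/a", "Climate report", "Emissions fell by 3% this year."),
    ("https://example.com/b", "Sports", "The final ended 2-1."),
    ("https://example.org/c", "Energy news", "Solar capacity grew; emissions targets remain."),
]


def _regexp(pattern, value):
    # SQLite evaluates `X REGEXP Y` by calling regexp(Y, X).
    if value is None:
        return 0
    return int(re.search(pattern, value) is not None)


def build_db(pages):
    """Load page rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (url TEXT PRIMARY KEY, title TEXT, body TEXT)")
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?)", pages)
    conn.create_function("regexp", 2, _regexp)
    return conn


def term_search(conn, term):
    """Return URLs whose body contains the term (LIKE is case-insensitive for ASCII)."""
    rows = conn.execute(
        "SELECT url FROM pages WHERE body LIKE ? ORDER BY url", (f"%{term}%",)
    )
    return [row[0] for row in rows]


def regex_search(conn, pattern):
    """Return URLs whose body matches the regular expression."""
    rows = conn.execute(
        "SELECT url FROM pages WHERE body REGEXP ? ORDER BY url", (pattern,)
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_db(PAGES)
    print(term_search(conn, "emissions"))
    print(regex_search(conn, r"\d+%"))
```

A real engine would refresh the table on a schedule instead of loading fixed rows; the sketch only shows the query side.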

Traditionally, the building blocks of a database are its storage resources (e.g. disk, memory), how that storage is organized, and how data is distributed. For BabelDB, storage and distribution are already solved by the internet itself: interconnected computer networks that store and distribute data around the globe. BabelDB attempts to make common database features accessible to everyone, at any time, on any device.

Features

Motivation

From Wikipedia:

...a database is an organized collection of data stored and accessed electronically...

Can the internet as a whole be considered a database by itself?

The internet is a vast space of information. Most of that information is free (which does not mean true) and accessible through browsers, search engines, and dedicated tooling. Crawler and scraper bots are popular means of automated data collection and indexing. Crawling is essentially what search engines do, while scraping is an automated way of extracting specific datasets. But when it comes to more specific use cases, or to non-technical users, this is sometimes not enough.

For example:

I want to collect all news articles automatically and compare the climate change narrative between site X and site Y.
I want to know what site X looked like 24 hours ago and retrieve only the updates.
I want to keep track of companies that are environmentally friendly or have sustainability programs.
I want to discover linked web resources that match some pattern.
I want to subscribe and be notified when certain semantics show up on site X.

OK, technically speaking, this is not too complex with the tooling available today. But suppose I want a marketing analyst who knows SQL to be able to do it.

BabelDB is the experimental attempt to solve that! 😀

margostino / babeldb
