tiamat

Pymongo tutorial with NOAA metadata records

In this lab, we will use PyMongo to store NOAA metadata records in a MongoDB database.

[Image: Tiamat. By Internet Archive Book Images [No restrictions], via Wikimedia Commons]

Before proceeding, please verify that you have installed everything listed in the Install file.

Install MongoDB

First, install MongoDB.

OSX

$ brew tap mongodb/brew
$ brew install mongodb-community

Linux

Windows

Install Pymongo

Next, make sure you have pymongo installed.

Type this into your terminal:

$ python -m pip install pymongo

If you're using Anaconda Prompt:

$ conda install -c anaconda pymongo
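
To confirm the install worked before moving on, a quick check from Python can help. The following is a minimal sketch: the helper name `pymongo_available` is made up for illustration, and it only checks that the package can be found by the current interpreter, not that a MongoDB server is running.

```python
import importlib.util


def pymongo_available():
    """Return True if the pymongo package can be found by this interpreter."""
    return importlib.util.find_spec("pymongo") is not None


if pymongo_available():
    import pymongo
    print("pymongo version:", pymongo.version)
else:
    print("pymongo is not installed for this interpreter")
```

If the second message appears inside Jupyter but not in your terminal, the notebook is probably running a different Python environment from the one you installed into.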

Clone this repo

$ git clone https://github.com/navarretedaniel/tiamat.git
$ cd tiamat

Install Jupyter if you don't already have it

In terminal:

$ pip install jupyter

In Anaconda Prompt:

$ conda install jupyter

Launch the Jupyter Notebook

$ jupyter notebook MongoDBTutorial.ipynb

Basic Commands

import json
import pymongo
from pprint import pprint

conn = pymongo.MongoClient()  # with no arguments, connects to localhost:27017
db = conn.earthwindfire
records = db.records

with open("data_sample.json") as data_file:
    noaa = json.load(data_file)

def insert(metadata):
    for dataset in metadata:
        data = {}
        data["title"] = dataset["title"]
        data["description"] = dataset["description"]
        data["keywords"] = dataset["keyword"]
        data["accessLevel"] = dataset["accessLevel"]
        data["lang"] = dataset["language"]
        # choose your own
        # choose your own
        # choose your own
        # choose your own

        records.insert_one(data)

insert(noaa)
# Check to make sure they're all in there
# (Collection.count() was removed in PyMongo 4; count_documents replaces it)
records.count_documents({})
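
The insert step above copies a fixed set of fields one record at a time. As a hedged variation, the sketch below shapes each record with `dict.get` so that a missing field does not raise `KeyError`, and sends the whole batch with PyMongo's `insert_many`. The helper names `shape_record` and `insert_all`, and the choice to default missing fields to `None`, are assumptions for illustration and not part of the original lab.

```python
# Field mapping taken from the insert() function: target name -> source name
FIELDS = {
    "title": "title",
    "description": "description",
    "keywords": "keyword",
    "accessLevel": "accessLevel",
    "lang": "language",
}


def shape_record(dataset):
    """Copy the selected fields; missing ones become None (an assumption)."""
    return {target: dataset.get(source) for target, source in FIELDS.items()}


def insert_all(collection, metadata):
    """Insert all shaped records in one call; return how many were sent."""
    docs = [shape_record(dataset) for dataset in metadata]
    if docs:  # insert_many raises an error when given an empty list
        collection.insert_many(docs)
    return len(docs)


# Made-up sample record, only to show the shaping step
sample = {"title": "Sea ice", "keyword": ["NESDIS"], "language": ["en-US"]}
print(shape_record(sample))
```

With the objects defined earlier, this would be called as `insert_all(records, noaa)`.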

# Find
records.find_one()

for rec in records.find()[:2]:
    pprint(rec)

records.count_documents({"keywords": "NESDIS"})

# Match records tagged with both keywords. Repeating the "keywords" key in
# one dict would keep only the last value, so use $all instead.
query = {"keywords": {"$all": ["NESDIS", "Russia"]}, "accessLevel": "public"}
records.count_documents(query)

for r in records.find(query):
    pprint(r)
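
When a query needs to match several keywords at once, keep in mind that a Python dict literal keeps only the last value for a repeated key, so the keywords belong in a single `$all` condition. The sketch below shows both points; `keywords_filter` is a hypothetical helper name, not something PyMongo provides.

```python
def keywords_filter(keywords, access_level=None):
    """Build a filter matching records that carry every keyword given."""
    query = {"keywords": {"$all": list(keywords)}}
    if access_level is not None:
        query["accessLevel"] = access_level
    return query


# A repeated key in a dict literal silently keeps only the last value
repeated = {"keywords": "NESDIS", "keywords": "Russia"}
print(repeated)  # {'keywords': 'Russia'}

print(keywords_filter(["NESDIS", "Russia"], "public"))
```

The resulting dict can be passed straight to `records.find(...)` or `records.count_documents(...)`.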

# Limit
cursor = db.records.find({"$where": "this.keywords.length > 100"}).limit(2)
for rec in cursor:
    pprint(rec)

# Full text search
db.records.create_index([('description', 'text')])

cursor = db.records.find({'$text': {'$search': 'precipitation'}})
for rec in cursor:
    print(rec)

db.records.count_documents({'$text': {'$search': 'fire'}})

# Drop text index to create a new one
db.records.drop_index("description_text")

# Create a wildcard index
db.records.create_index([("$**","text")])

cursor = db.records.find({'$text': {'$search': "Russia"}})
for rec in cursor:
    pprint(rec)

# Projections
cursor = db.records.find({'$text': {'$search': "Russia"}}, {"title": 1,"_id":0 })
for rec in cursor:
    print(rec)

# Limit
cursor = db.records.find({'$text': {'$search': "Russia"}}, {"title": 1,"_id":0 }).limit(2)
for rec in cursor:
    print(rec)
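
Text search results are not sorted by relevance by default. MongoDB can report a relevance score for each match through the `textScore` metadata, and the sketch below projects that score and sorts by it. The helper name `search_by_relevance` and the `score` field name are illustrative choices, and the sketch assumes a text index already exists on the collection.

```python
def search_by_relevance(collection, term, limit=5):
    """Return titles matching `term`, most relevant first (needs a text index)."""
    score = {"$meta": "textScore"}
    cursor = (
        collection.find(
            {"$text": {"$search": term}},
            {"title": 1, "_id": 0, "score": score},
        )
        .sort([("score", score)])
        .limit(limit)
    )
    return list(cursor)
```

With the collection from this lab, it could be used as `for rec in search_by_relevance(db.records, "Russia"): pprint(rec)`.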

# Aggregate
cursor = db.records.aggregate(
    [
        {"$group": {"_id": "$lang", "count": {"$sum": 1}}}
    ]
)
for document in cursor:
    pprint(document)

cursor = db.records.aggregate(
    [
        {"$match": {'$text': {'$search': "Russia"}, "accessLevel": "public"}},
        {"$group": {"_id": "$title"}}
    ]
)

for document in cursor:
    pprint(document)

# Remove data
# collection_names() and database_names() were removed in PyMongo 4
conn.earthwindfire.list_collection_names()
conn.earthwindfire.drop_collection("records")
conn.earthwindfire.list_collection_names()

conn.list_database_names()
conn.drop_database("earthwindfire")
conn.list_database_names()

If you already know SQL...

The following table provides an overview of common SQL aggregation terms, functions, and concepts and the corresponding MongoDB aggregation operators:

SQL Terms, Functions, and Concepts    MongoDB Aggregation Operators
WHERE                                 $match
GROUP BY                              $group
HAVING                                $match
SELECT                                $project
ORDER BY                              $sort
LIMIT                                 $limit
SUM()                                 $sum
COUNT()                               $sum
join                                  $lookup
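
As a worked illustration of the mapping above, the sketch below writes one SQL query next to a pipeline that expresses the same idea. The SQL statement is a hypothetical example (the lab has no SQL table), and `languages_pipeline` is an illustrative helper name; the field names follow the records inserted earlier.

```python
def languages_pipeline(access_level="public", min_count=1, limit=5):
    """Pipeline mirroring this hypothetical SQL query:

    SELECT lang, COUNT(*) AS count FROM records
    WHERE accessLevel = :access_level
    GROUP BY lang HAVING count >= :min_count
    ORDER BY count DESC LIMIT :limit
    """
    return [
        {"$match": {"accessLevel": access_level}},           # WHERE
        {"$group": {"_id": "$lang", "count": {"$sum": 1}}},  # GROUP BY, COUNT()
        {"$match": {"count": {"$gte": min_count}}},          # HAVING
        {"$sort": {"count": -1}},                            # ORDER BY ... DESC
        {"$limit": limit},                                   # LIMIT
    ]


for stage in languages_pipeline():
    print(stage)
```

The list can be passed to `db.records.aggregate(languages_pipeline())`. Note that `$match` appears twice: before `$group` it plays the role of WHERE, and after `$group` it plays the role of HAVING.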
