
gender_art's Introduction

Gender Art Project

Add some description here...

Configuration

$ cp config.example.py config.py
$ vi config.py

Requirements

$ pip install -r requirements.txt
$ npm i

Instructions

First download the data as JSON files:

$ python download_all_files.py

Make sure you have a MongoDB database running on http://localhost:27017. You can use Docker for that:

$docker-compose up

Alternatively, run MongoDB directly with these two steps:

$mongod --dbpath "your\path\to\db"
$mongo

Then insert the data into MongoDB:

$node dataToMongo-splitfiles.js

Scripts Descriptions

download_all_files.py

Connects to the API and downloads the files page by page. Files that have already been downloaded are cached; set clear to 'True' to empty the cache. Requires config.py with login information. Also performs an initial cleaning of the JSON documents.

dataToMongo-splitfiles.js

Creates a MongoDB database with three collections: Author, Artwork and Media. Requires the JSON data files.


gender_art's Issues

Series Artworks in Unique Artworks

unique_artworks.py excludes series of artworks when creating the dataset.
The code in lines 37-42:
separable_artworks_groups = db.Artwork.aggregate([
{"$match": {"type":'separable'}},
{'$group': dict( [('_id','ensemble_id')] + [(k,{'$first':'$%s'%k}) for k,v in project.items()])},
{"$project":project}
])
should create a list of unique artworks from series, but it keeps only the first individual artwork of the first series. It also sets '_id' to the literal string 'ensemble_id' instead of the numerical id of the series.

As a result, all other series of artworks are excluded from the dataset.
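
A possible fix, sketched here and not tested against the real database: MongoDB treats 'ensemble_id' as a constant, so every matched document lands in a single group, while '$ensemble_id' refers to the field and gives one group per series. The helper name build_series_pipeline and the example projection are hypothetical; the real project dict lives in unique_artworks.py.

```python
def build_series_pipeline(project):
    """Return an aggregation pipeline with one output document per series."""
    # "$ensemble_id" is a field reference; the bare string "ensemble_id"
    # would be a constant and collapse all series into one group.
    group_stage = {"_id": "$ensemble_id"}
    for field in project:
        if field != "_id":  # do not overwrite the grouping key
            group_stage[field] = {"$first": "$%s" % field}
    return [
        {"$match": {"type": "separable"}},
        {"$group": group_stage},
        {"$project": project},
    ]


# Hypothetical projection, for illustration only.
example_project = {"title": 1, "acquisition_year": 1}
pipeline = build_series_pipeline(example_project)
```

The resulting list could then be passed to db.Artwork.aggregate(pipeline) in place of the inline pipeline.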

Adapt download data script to Python 3

A Python 3 script which:

  • retrieves personal account info from config
  • generates a token and stores it in a file (see the documentation)
  • crafts the URL with pagination
  • sets up a cache system to avoid restarting from scratch if the script is interrupted
  • sets up a retry mechanism (see the bash script)
  • stores the data in JSON files, as the bash script was doing
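
The pagination, cache and retry steps above could look roughly like the sketch below. It is only an outline under assumptions: fetch_page is a hypothetical caller-supplied function standing in for the real API request, token handling is left out, and the file naming and retry defaults are made up for illustration.

```python
import json
import time
from pathlib import Path


def download_pages(fetch_page, cache_dir, max_retries=3, wait_seconds=1.0):
    """Fetch pages until an empty one is returned, caching each as JSON.

    fetch_page(page_number) is assumed to return a list of records,
    or an empty list when there is nothing left to download.
    Returns the list of page numbers fetched during this run.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    downloaded = []
    page = 1
    while True:
        target = cache_dir / ("page_%05d.json" % page)
        if target.exists():
            # Cached by a previous run: skip the request entirely.
            page += 1
            continue
        for attempt in range(1, max_retries + 1):
            try:
                records = fetch_page(page)
                break
            except Exception:
                if attempt == max_retries:
                    raise  # give up after the last attempt
                time.sleep(wait_seconds)
        if not records:
            return downloaded
        target.write_text(json.dumps(records), encoding="utf-8")
        downloaded.append(page)
        page += 1
```

If the script is interrupted, running it again skips every page already on disk and resumes at the first missing one.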

Simple gender counts

  • calculate gender ratios on the total database
  • same ratio through time, by acquisition date: number of artworks by women acquired per year / number of artworks by men acquired per year
  • same ratio by institution (through time)
  • same ratio by artwork type (through time)
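
As a starting point, the per-year ratio could be computed along these lines. The field names acquisition_year and gender and the values "female" and "male" are assumptions for illustration; the real documents may use different keys and codes.

```python
from collections import Counter, defaultdict


def gender_counts_by_year(artworks):
    """Count artworks per acquisition year and artist gender."""
    counts = defaultdict(Counter)
    for artwork in artworks:
        year = artwork.get("acquisition_year")
        gender = artwork.get("gender")
        if year is None or gender is None:
            continue  # skip records with missing data
        counts[year][gender] += 1
    return counts


def female_to_male_ratio(counter):
    """Return women/men for one year, or None when no men were counted."""
    men = counter.get("male", 0)
    if men == 0:
        return None
    return counter.get("female", 0) / men


# Made-up sample records, for illustration only.
sample = [
    {"acquisition_year": 1990, "gender": "female"},
    {"acquisition_year": 1990, "gender": "male"},
    {"acquisition_year": 1990, "gender": "male"},
    {"acquisition_year": 1991, "gender": "female"},
]
by_year = gender_counts_by_year(sample)
ratios = {year: female_to_male_ratio(c) for year, c in by_year.items()}
```

The same grouping would work by institution or artwork type by adding that field to the dictionary key.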

Control artists' data field completion

  • compute an average date for each artist, which is the average of birth date, death date and creation dates of all artworks
  • draw a bar chart of the number of artists by average year (distribution control)
  • for each important data field, draw a bar chart of the number of artists who do not have this field available, by average year
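
A minimal sketch of the first and third points, before any plotting. The keys birth_year, death_year and creation_years are hypothetical names, and the average year is rounded to the nearest integer so that it can serve as a bar-chart bin.

```python
from collections import Counter
from statistics import mean


def average_year(artist):
    """Average of the known birth, death and artwork creation years."""
    years = [artist.get("birth_year"), artist.get("death_year")]
    years.extend(artist.get("creation_years", []))
    known = [y for y in years if y is not None]
    return mean(known) if known else None


def missing_field_counts(artists, field):
    """Count, per rounded average year, the artists lacking `field`."""
    counts = Counter()
    for artist in artists:
        year = average_year(artist)
        if year is not None and artist.get(field) is None:
            counts[round(year)] += 1
    return counts


# Made-up sample records, for illustration only.
artists = [
    {"birth_year": 1900, "death_year": 1980,
     "creation_years": [1930, 1950], "gender": None},
    {"birth_year": 1960, "death_year": None,
     "creation_years": [2000], "gender": "female"},
]
missing_gender = missing_field_counts(artists, "gender")
```

The resulting counter maps each average year to a bar height, so it can be handed to any plotting library.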

Gender in artist groups?

  • do we have the group composition?
  • if yes, how should groups be counted gender-wise? Should a group count as female if at least one woman is present, if women are the majority, or only if the group is composed entirely of female artists?
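
To compare the three candidate rules on the same data, a small helper like the one below could be used. It assumes the group composition is available as a list of gender strings with the value "female"; the rule names are made up for this sketch.

```python
def group_is_female(member_genders, rule):
    """Classify a group as female under one of three candidate rules."""
    total = len(member_genders)
    if total == 0:
        return False  # unknown composition: do not count the group
    women = sum(1 for g in member_genders if g == "female")
    if rule == "at_least_one":
        return women >= 1
    if rule == "majority":
        return women > total / 2
    if rule == "all":
        return women == total
    raise ValueError("unknown rule: %r" % rule)


# Made-up group, for illustration only.
group = ["female", "male", "male"]
results = {rule: group_is_female(group, rule)
           for rule in ("at_least_one", "majority", "all")}
```

Running all three rules over the database would show how much the choice changes the final ratios.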
