Easy access to official spatial data sets of Brazil in R and Python

Home Page: https://ipeagit.github.io/geobr/

geobr: Download Official Spatial Data Sets of Brazil

geobr is a computational package to download official spatial data sets of Brazil. The package includes a wide range of geospatial data in geopackage format (like shapefiles but better), available at various geographic scales and for various years with harmonized attributes, projection and topology (see detailed list of available data sets below).

The package is currently available in R and Python.

Installation R

# From CRAN
install.packages("geobr")
library(geobr)

# or use the development version with latest features
utils::remove.packages('geobr')
devtools::install_github("ipeaGIT/geobr", subdir = "r-package")
library(geobr)

Note: If you use Linux, you need to install a couple of dependencies before installing the sf and geobr libraries. More info here.

Installation Python

pip install geobr

Windows users:

conda create -n geo_env
conda activate geo_env  
conda config --env --add channels conda-forge  
conda config --env --set channel_priority strict  
conda install python=3 geopandas  
pip install geobr

Basic Usage

All geobr functions share the same syntax, so any data set can be downloaded with a single line of code. Like this:

R, reading the data as an `sf` object

library(geobr)

# Read specific municipality at a given year
mun <- read_municipality(code_muni=1200179, year=2017)

# Read all municipalities of given state at a given year
mun <- read_municipality(code_muni=33, year=2010) # or
mun <- read_municipality(code_muni="RJ", year=2010)

# Read all municipalities in the country at a given year
mun <- read_municipality(code_muni="all", year=2018)

More examples in the intro Vignette

Python, reading the data as a `geopandas` object

from geobr import read_municipality

# Read specific municipality at a given year
mun = read_municipality(code_muni=1200179, year=2017)

# Read all municipalities of given state at a given year
mun = read_municipality(code_muni=33, year=2010) # or
mun = read_municipality(code_muni="RJ", year=2010)

# Read all municipalities in the country at a given year
mun = read_municipality(code_muni="all", year=2018)

More examples here
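
The `code_muni` argument in the calls above accepts four kinds of value: a 7-digit municipality code, a 2-digit state code, a two-letter state abbreviation, or "all". The sketch below illustrates that dispatch logic only. `classify_code_muni` is a hypothetical helper, not part of geobr, and the package's own validation may differ.

```python
def classify_code_muni(code_muni):
    """Return which kind of selection a code_muni value represents.

    Hypothetical helper for illustration; not a geobr function.
    """
    if isinstance(code_muni, str):
        text = code_muni.strip()
        if text.lower() == "all":
            return "country"
        if len(text) == 2 and text.isalpha():
            return "state_abbreviation"
        if not text.isdigit():
            raise ValueError(f"Unrecognized code_muni: {code_muni!r}")
        code_muni = int(text)

    if isinstance(code_muni, int):
        digits = len(str(code_muni))
        if digits == 2:
            return "state_code"
        if digits == 7:
            return "municipality"

    raise ValueError(f"Unrecognized code_muni: {code_muni!r}")


# The four call styles shown in the usage examples
kinds = [classify_code_muni(value) for value in (1200179, 33, "RJ", "all")]
```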

Available datasets:

👉 All datasets use geodetic reference system "SIRGAS2000", CRS(4674).

Function	Geographies available	Years available	Source
`read_country`	Country	1872, 1900, 1911, 1920, 1933, 1940, 1950, 1960, 1970, 1980, 1991, 2000, 2001, 2010, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020	IBGE
`read_region`	Region	2000, 2001, 2010, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020	IBGE
`read_state`	States	1872, 1900, 1911, 1920, 1933, 1940, 1950, 1960, 1970, 1980, 1991, 2000, 2001, 2010, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020	IBGE
`read_meso_region`	Meso region	2000, 2001, 2010, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020	IBGE
`read_micro_region`	Micro region	2000, 2001, 2010, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020	IBGE
`read_intermediate_region`	Intermediate region	2017, 2019, 2020	IBGE
`read_immediate_region`	Immediate region	2017, 2019, 2020	IBGE
`read_municipality`	Municipality	1872, 1900, 1911, 1920, 1933, 1940, 1950, 1960, 1970, 1980, 1991, 2000, 2001, 2005, 2007, 2010, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022	IBGE
`read_municipal_seat`	Municipality seats (sedes municipais)	1872, 1900, 1911, 1920, 1933, 1940, 1950, 1960, 1970, 1980, 1991, 2010	IBGE
`read_weighting_area`	Census weighting area (área de ponderação)	2010	IBGE
`read_census_tract`	Census tract (setor censitário)	2000, 2010, 2017, 2019, 2020	IBGE
`read_statistical_grid`	Statistical Grid of 200 x 200 meters	2010	IBGE
`read_metro_area`	Metropolitan areas	1970, 2001, 2002, 2003, 2005, 2010, 2013, 2014, 2015, 2016, 2017, 2018	IBGE
`read_urban_area`	Urban footprints	2005, 2015	IBGE
`read_amazon`	Brazil's Legal Amazon	2012	MMA
`read_biomes`	Biomes	2004, 2019	IBGE
`read_conservation_units`	Environmental Conservation Units	201909	MMA
`read_disaster_risk_area`	Disaster risk areas	2010	CEMADEN and IBGE
`read_indigenous_land`	Indigenous lands	201907, 202103	FUNAI
`read_semiarid`	Semi Arid region	2005, 2017	IBGE
`read_health_facilities`	Health facilities	201505, 202303	CNES, DataSUS
`read_health_region`	Health regions and macro regions	1991, 1994, 1997, 2001, 2005, 2013	DataSUS
`read_neighborhood`	Neighborhood limits	2010	IBGE
`read_schools`	Schools	2020, 2023	INEP
`read_comparable_areas`	Historically comparable municipalities, aka Áreas mínimas comparáveis (AMCs)	1872, 1900, 1911, 1920, 1933, 1940, 1950, 1960, 1970, 1980, 1991, 2000, 2010	IBGE
`read_urban_concentrations`	Urban concentration areas (concentrações urbanas)	2015	IBGE
`read_pop_arrangements`	Population arrangements (arranjos populacionais)	2015	IBGE

Other functions:

Function	Action
`list_geobr`	List all datasets available in the geobr package
`lookup_muni`	Look up municipality codes by their name, or the other way around
`grid_state_correspondence_table`	Loads a correspondence table indicating what quadrants of IBGE's statistical grid intersect with each state
`cep_to_state`	Determine the state of a given CEP postal code
...	...

Note 1. Data sets and Functions marked with "dev" are only available in the development version of geobr.

Note 2. Most data sets are available at scale 1:250,000 (see documentation for details).

Coming soon:

Geography	Years available	Source
`read_census_tract`	2007	IBGE
Longitudinal Database* of micro regions	various years	IBGE
Longitudinal Database* of Census tracts	various years	IBGE
...	...	...

'*' Longitudinal Database refers to áreas mínimas comparáveis (AMCs)

Geographic reference framework for the production, analysis and dissemination of statistics
Other files and boundary sets are available at ftp://geoftp.ibge.gov.br/.

Contributing to geobr

If you would like to contribute to geobr and add new functions or data sets, please check this guide to propose your contribution.

Related projects

As of today, there is another R package with similar functionalities: simplefeaturesbr. The geobr package has a few advantages when compared to simplefeaturesbr, including for example:

The same syntax structure across all functions, making the package easy and intuitive to use
Access to a wider range of official spatial data sets, such as states and municipalities, but also macro-, meso- and micro-regions, weighting areas, census tracts, urbanized areas, etc
Access to shapefiles with updated geometries for various years
Harmonized attributes and geographic projections across geographies and years
Option to download geometries with simplified borders for fast rendering
Stable version published on CRAN for R users, and on PyPI for Python users

Similar packages for other countries/continents

Africa: afrimapr
Argentina: geoAr
Brazil: geobr
Canada: cancensus
Chile: chilemapas
Czech Republic: RCzechia
Finland: geofi
Peru: mapsPERU
Spain: mapSpain
UK: geographr
Uruguay: geouy
USA: tigris
Global (political administrative boundaries): rgeoboundaries

Credits

Original shapefiles are created by official government institutions. The geobr package is developed by a team at the Institute for Applied Economic Research (Ipea), Brazil. If you want to cite this package, you can cite it as:

Pereira, R.H.M.; Gonçalves, C.N.; et al. (2019) geobr: Loads Shapefiles of Official Spatial Data Sets of Brazil. GitHub repository - https://github.com/ipeaGIT/geobr.

geobr's Issues

add function to read municipalities in the semiarid region

link to list of municipalities in the semiarid region

add function to read municipalities by biome

The user could select a given biome (Amazonia, Caatinga, Cerrado, Mata Atlantica, Pampa, Pantanal) and get all municipalities that belong to it, in a way similar to #38 and #39.

Include IBGE's `faces de quadra` (block faces) data

data: ftp://geoftp.ibge.gov.br/recortes_para_fins_estatisticos/malha_de_setores_censitarios/censo_2010/base_de_faces_de_logradouros/

info:
ftp://geoftp.ibge.gov.br/recortes_para_fins_estatisticos/malha_de_setores_censitarios/censo_2010/base_de_faces_de_logradouros/1_Leia_me/Base%20de%20Faces%20de%20Logradouros%20do%20CD%202010.pdf

Add data: Human Development Units (UDHs)

Source: Ipea's Atlas do Desenvolvimento

Suggestion: move this repo to https://github.com/ipea

@rafapereirabr , I imagine that you created ipeaGIT because the ipea (https://github.com/ipea) group name was taken.
But it is actually ours, and I just added you as owner.
Consider migrating the project there if you want.

add Terras Indígenas

for future consideration: add a read_terras_indigenas function that grabs polygons from FUNAI

Create vignette 2

Igor, I suggest moving vignette 2 (Georeferencing-gain) to another branch of the repository and bringing it into the main branch once the article is at a more advanced stage. That way it stays out of the version 1.0 submission of the package to CRAN. What do you think?

Suggestion: make code_muni="all" the default (and also for code_uf)

personal opinion here: I don't like to have to tell the function which subset of the data I want. The natural expectation is that you will get all the data. Making code_muni="all" the default would avoid this.

I think having this default at least for the small datasets (mun, uf, micro, macro) makes sense.

For large datasets such as setor censitario, faces, etc., you could then force the user to define *_muni="all".

Urbanized areas

Create script prep_urban_areas to download and clean IBGE data on urbanized areas (years 2005 and 2015)
Create geobr function read_urban_area() to download the data

obs. data available at ftp://geoftp.ibge.gov.br/organizacao_do_territorio/tipologias_do_territorio/areas_urbanizadas_do_brasil

Include disaster risk area data and function

Hi Guilherme, please use those IBGE risk-area shapefiles that we used in the other SDG project. OK? Cheers

Add function to read grade_estatistica 2010

read_grade <- function(cod_uf, cod_muni, year = 2010) {

  # read the municipality as an sf object
  temp_muni <- read_muni(cod_muni = cod_muni, year = year)

  # read the bounding boxes of Brazil's grid quadrants (STILL NEEDS TO BE CREATED)

  # overlay the municipality and grid bounding boxes

  # identify the grid ID

  # download the .ZIP file for the municipality's grid ID

  # unzip the grid file

  # read the grid file

  # crop the grid to the municipality sf object

}

include function `read_setorcens`

Include scripts used to process the raw data

Move the R scripts that were used to process the raw data to the ./geobr/prep_data folder. This is for documentation purposes.

add UF, or name UF to municipality dataset

Hey @rafapereirabr! It's coming along nicely!!

Minor suggestion: add UF name to municipalities dataset.

Sometimes the person will only have UF names and municipality names (as in my case now with the Brasil Mais Produtivo data). And municipality names are unique only within a state, not across the country.

Add bank branches database

Brazilian Central bank updates on a monthly basis a database with addresses of all bank branches in Brazil.

There is a database for branches, PAE, PAB (these two are smaller branches, usually inside large companies) and consórcio administrator.

Link is below

https://www.bcb.gov.br/estabilidadefinanceira/agenciasconsorcio

Fix column names in the weighting area dataset

Paulo, please change the column names to follow the package standard. This needs to be changed both in the function script and in the dataset.

current column names: cod_areapond, cod_mum, cod_uf

what they should be: code_weighting_area, code_muni, code_state
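
A minimal sketch of the rename requested above, using a plain mapping from the current names to the package-standard names. The mapping comes from this issue; the `rename_columns` helper and the sample record are hypothetical and only illustrate the idea (the real change would be made in the R function and in the stored data).

```python
# Current name -> package-standard name, as listed in this issue
COLUMN_RENAMES = {
    "cod_areapond": "code_weighting_area",
    "cod_mum": "code_muni",
    "cod_uf": "code_state",
}


def rename_columns(record, mapping=COLUMN_RENAMES):
    """Return a copy of record with keys renamed; unknown keys are kept as-is."""
    return {mapping.get(name, name): value for name, value in record.items()}


# Placeholder values, for illustration only
old_record = {"cod_areapond": "A", "cod_mum": "B", "cod_uf": "C", "geom": None}
new_record = rename_columns(old_record)
```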

single progress bar for read functions?

Function calls such as:

states <- read_state(year=2010, code_state = "all")

create one progress bar each time a state is downloaded, adding up to 27 progress bars in this case. A single progress bar that grows as each state is downloaded would probably be more useful for the user.
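
A rough sketch of the single-bar idea, in Python for illustration. `download_with_single_bar` and its `fetch` callback are hypothetical names, not geobr functions; the point is only that one bar is redrawn in place as each item finishes, instead of one bar per download.

```python
import sys


def download_with_single_bar(items, fetch, stream=sys.stderr, width=30):
    """Call fetch(item) for every item while redrawing one shared progress bar."""
    results = []
    total = len(items)
    for done, item in enumerate(items, start=1):
        results.append(fetch(item))
        filled = width * done // total
        bar = "#" * filled + "-" * (width - filled)
        # "\r" returns to the start of the line so the same bar is overwritten
        stream.write(f"\r[{bar}] {done}/{total}")
        stream.flush()
    stream.write("\n")
    return results
```

Here `items` would be the 27 state codes and `fetch` whatever function downloads a single state.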

Harmonizing data columns across years: Sampling areas

Give all weighting area datasets the same file structure.

columns cod_areapond, cod_mun and geom

misspelling in README.md

There's a typo in the README.md file. It should be "Coming soon", not "Comming soon". :)

Add shapes of Electoral districts

read_uf by state abbreviation

Include an option to read the state by its abbreviation, for example m <- read_uf(cod_uf="SP", ano=2010)

microregion codes are not retained fully by read_micro_region

When I choose a single UF, like cod_micro = 24, the read_micro_region function returns the right geometries, but it does not retain the full microregion codes. Instead they are all "24".

That will be a serious problem if a user later wants to merge microregion data (employment rates, % rural population, ...) from some other source onto this dataframe.
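
To make the merge problem concrete, here is a small sketch with plain Python dictionaries. The codes and values are placeholders invented for illustration, and `join_on_code` is a hypothetical helper: with full codes every row finds its match, while rows that all carry "24" find none.

```python
# Placeholder external table keyed by full microregion code (made-up values)
external_rates = {24001: 0.31, 24002: 0.12}

rows_with_full_codes = [{"code_micro": 24001}, {"code_micro": 24002}]
rows_with_truncated_codes = [{"code_micro": 24}, {"code_micro": 24}]


def join_on_code(rows, table, key="code_micro"):
    """Left-join rows to table by key; unmatched rows get value=None."""
    return [dict(row, value=table.get(row[key])) for row in rows]


joined_ok = join_on_code(rows_with_full_codes, external_rates)
joined_bad = join_on_code(rows_with_truncated_codes, external_rates)
```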

Add function `read_country`

Add a read_country function. This involves:

# 1 load the state data
read_state(cod_state="all")
# Merge the polygons, using one of the functions below:
st_union
st_combine

read_uf or read_state

The manual.pdf says read_uf, but in the package the function that performs this operation appears to be read_state.

Municipalities in the border of regions

It would be interesting if the function (or functions) implemented to solve #38, #39, and #45 allows the user to choose what to do with municipalities that are not fully within the region. Some possibilities:

Include all municipalities as long as they have some overlap with the region (maybe the default)
Remove the municipalities in the border
Cut the polygons of municipalities on the border so that the returned area matches the region exactly
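
The three options above can be sketched with axis-aligned boxes standing in for real polygons. Everything here (`Box`, `intersection`, `select_municipalities`, the mode names) is hypothetical and only illustrates the selection logic; an actual implementation would presumably rely on the spatial predicates of sf or geopandas rather than this toy geometry.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersection(a: Box, b: Box) -> Optional[Box]:
    """Return the overlapping box, or None if the overlap has no area."""
    xmin, ymin = max(a.xmin, b.xmin), max(a.ymin, b.ymin)
    xmax, ymax = min(a.xmax, b.xmax), min(a.ymax, b.ymax)
    if xmin >= xmax or ymin >= ymax:
        return None
    return Box(xmin, ymin, xmax, ymax)


def select_municipalities(munis, region, mode="overlap"):
    """Select municipalities relative to region.

    overlap: keep any municipality that overlaps the region (whole shape)
    within:  keep only municipalities fully inside the region
    clip:    keep overlapping municipalities, cut to the region
    """
    if mode not in ("overlap", "within", "clip"):
        raise ValueError(f"Unknown mode: {mode!r}")
    selected = {}
    for name, box in munis.items():
        part = intersection(box, region)
        if part is None:
            continue
        if mode == "overlap":
            selected[name] = box
        elif mode == "within" and part == box:
            selected[name] = box
        elif mode == "clip":
            selected[name] = part
    return selected
```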

Package takes a lot of space: 95mb

Do you really need the files at: /geobr/data/* ?

It seems this makes the package much larger than necessary (I assume because brazil_2010 is a geometry, right?).

Use the local dataset in read_municipality when year=2010

To load the dataset, just run data("brazil_2010") and then return(brazil_2010)

add function to read municipalities in the 'Legal Amazon'

Link to the list of municipalities in the Legal Amazon

Use the syntax package::function()

Folks, please make sure we are using the package::function() syntax for all functions in the package.

Create report with download stats of data set

Suggestion from @pedro-andrade-inpe

Fix column names in the muni, micro, meso and state datasets

use the names:

[1] "cod_muni"     "name_muni"    "cod_micro"    "name_micro"   "cod_meso"     "name_meso"    "cod_state"   
 [8] "name_state"   "abbrev_state" "cod_region"   "name_region"  "geometry"  

add dataset Metropolitan Regions

Add documentation for the brazil_2010 dataset

Include 2010 data in `./data`

Include table with metadata of all geometries ./data. Something like:

municipality_name	municipality_code	state_name	state_code	region_name	state_initials	region_code	geom
...	...	...	...	...	...	...	...

In the geom column, provide the data for the municipalities

Create vignettes

Create one or two vignettes demonstrating the package functions

holes in read_country() with no arguments

read_country() with no arguments returns the geometry of Brazil in 2010 with several holes. The data for 2014 and 2015 also have some holes. Perhaps removing the holes manually after computing the union solves the problem as shown in the code below.

(I saw in another issue the discussion that st_union is very slow. unionSpatialPolygons from maptools is much faster)

require(geobr)
require(dplyr)
require(sp)
require(sf)
require(maptools)

sp_states <- read_state(year=2010, code_state = "all") %>% as("Spatial")

result <- unionSpatialPolygons(sp_states, rep(TRUE, 27))

outerRings = Filter(function(f){f@ringDir==1},result@polygons[[1]]@Polygons)
outerBounds = SpatialPolygons(list(Polygons(outerRings,ID=1)))
plot(outerBounds)

m <- st_as_sf(outerBounds)

write_sf(m, "brazil.shp")

create function with dictionary of all codes and names?

data available here: https://concla.ibge.gov.br/classificacoes/por-tema/codigo-de-areas/codigo-de-areas

Add historical data - municipios_1872_1991

historical data on municipios_1872_1991 can be found at this address ftp://geoftp.ibge.gov.br/organizacao_do_territorio/malhas_territoriais

Use the local dataset when year=2010 in the read_micro, meso and state functions

To load the dataset, just run data("brazil_2010").

The dataset then needs to be dissolved according to the geography of each function

read_municipio: include `cod_uf` argument

Include cod_uf argument in the read_municipio function

Example:
read_municipio(cod_uf="SP", ano=2010) returns all municipalities of UF 35

Harmonizing data columns across years: Mesoregion

Give all mesoregion datasets the same file structure. Example:

          nome_meso cod_meso  Geometry
1 Leste Rondoniense     1102  POLYGON ((-62.22055 -8.5908...
2   Madeira-Guaporé     1101  POLYGON ((-63.32721 -7.9767...
...

Python Version

Is there any work/planning to build a python version?

If not, can I start one? There is no license on the project, so I am not sure whether you are OK with other users building on your .rds files.

read_weighting_area returning "Error in parse_url"

test <- read_weighting_area(code_weighting = 35, year=2010)
test <- read_weighting_area(code_weighting = "SP", year=2010)

The code above returns "Error in parse_url(url) : length(url) == 1 is not TRUE".

Am I entering something wrong, or does the function have a problem?

Create function of AMC for municipalities

Assess whether it is possible to reuse the Stata code by Philipp Ehrl (professor at Univ. Catolica de Brasilia and fellow at DIRUR). He published a note in Estudos Economicos about the code, and the code is available here.

Harmonizing data columns across years: Municipalities

Give all municipality datasets the same file structure. Example:

         nome_mun   cod_mun      geometry
1      Acrelândia   1200013      POLYGON ((-67.13424 -9.6762...
2    Assis Brasil   1200054      POLYGON ((-69.5814 -10.3806...
...

names with accents seem to cause problems.

Geographic names that contain accents seem to cause warnings when I use the read_ function in a US English environment. See attached pdf for an example.

Adding a line

options(encoding='UTF-8')

did not change anything.
geobr-test.pdf

read_biomes() and default year

A call to read_biomes() without any argument does not work because it requires a year. It could work without any argument by using 2004 as the default value. I don't know whether the other read functions have a default year, but I think they could always return the latest data available when the year argument is not used.
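
A minimal sketch of the "latest year by default" behaviour suggested here. `resolve_year` is a hypothetical helper, not a geobr function; the years 2004 and 2019 are the ones listed for `read_biomes` in the datasets table of the README.

```python
def resolve_year(year=None, available_years=(2004, 2019)):
    """Return the year to read, falling back to the latest one available."""
    if year is None:
        return max(available_years)
    if year not in available_years:
        raise ValueError(
            f"Year {year} is not available; choose one of {sorted(available_years)}"
        )
    return year


default_year = resolve_year()      # no argument: latest available year
explicit_year = resolve_year(2004)  # explicit argument is respected
```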

Add 'region' column to UF data sets

Add the following two columns to UF data sets

"region_name", "region_code"
North, 1
South, 4
..., ...

Create function to generate/update the metadata table

Create a function that scans all files in the ./geobr/data folder and updates the table with the package metadata. Example metadata table below

geo	year	code	download_path
municipio	2001	33	http://www.ipea.gov.br/geobr/municipio/2001/33MU.rds
micro_regiao	2016	17	http://www.ipea.gov.br/geobr/micro_regiao/2016/17MI.rds
...	...	...	...
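
One possible shape for such a function, sketched in Python for illustration. It assumes the files are laid out as `<geo>/<year>/<code><suffix>.rds` under the data folder, mirroring the download paths in the example table; that layout, the `build_metadata` name and its parameters are assumptions, not the package's actual structure.

```python
from pathlib import Path


def build_metadata(data_dir, base_url):
    """Scan data_dir for <geo>/<year>/<file>.rds and return one row per file."""
    rows = []
    for path in sorted(Path(data_dir).glob("*/*/*.rds")):
        geo, year = path.parts[-3], path.parts[-2]
        # Leading digits of the file name are taken as the area code (assumption)
        code = "".join(ch for ch in path.stem if ch.isdigit())
        rows.append(
            {
                "geo": geo,
                "year": int(year),
                "code": code,
                "download_path": f"{base_url.rstrip('/')}/{geo}/{year}/{path.name}",
            }
        )
    return rows
```

Calling `build_metadata("./geobr/data", "http://www.ipea.gov.br/geobr")` would then return rows with the same four columns as the table above.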
