datapartnership / lebanon-economic-monitor

Understanding Lebanon's Economy through Alternative Data

Home Page: https://datapartnership.org/lebanon-economic-monitor/

License: Mozilla Public License 2.0

Makefile 0.01% Python 0.01% R 0.04% Jupyter Notebook 99.95%

economic-monitor lebanon data-for-good meta-population datapartnership

lebanon-economic-monitor's Introduction

Understanding Lebanon's Economy through Alternative Data

The MENA MTI team is leading two analytical projects for Lebanon: the Lebanon Country Economic Memorandum (CEM) (P178305) and Revisiting the Syrian Conflict's Impact on Lebanon (P178302). Data constraints are a major concern for both projects. This repository includes an exploration of alternative data sources that could support analysis of Lebanon's economic growth, conflict, demographics, and refugee dynamics.

Data Availability Statement

Restrictions may apply to the data that support the findings of this study. Data received from the private sector through the Development Data Partnership are subject to the terms and conditions of the data license agreement and the "Official Use Only" data classification. These data are available upon request through the Development Data Partnership. Licensing and access information for all other datasets is included in the documentation.

License

This project is licensed under the Mozilla Public License - see the LICENSE file for details.

lebanon-economic-monitor's People

Contributors

Watchers

Forkers

bennyistanto jagbakpe

lebanon-economic-monitor's Issues

Issue on page /notebooks/ais-analysis/README.html

Unable to sign in to SharePoint.

The error message is below:

If that doesn't help, contact your support team and include these technical details:
Correlation ID: 45e316a1-4016-5000-16e5-d26067ee4baf
Date and Time: 3/20/2024 6:22:57 AM
URL: https://worldbankgroup.sharepoint.com/teams/DevelopmentDataPartnershipCommunity-WBGroup/Shared Documents/Forms/AllItems.aspx?cid=fccdf23e-94d5-48bf-b75d-0af291138bde&csf=1&e=Yvwh8r&FolderCTID=0x012000CFAB9FF0F938A64EBB297E7E16BDFCFD&id=%2Fteams%2FDevelopmentDataPartnershipCommunity-WBGroup%2FShared+Documents%2FProjects%2FData+Lab%2FLebanon+Economic+Analytics%2FData%2Ftrade&viewid=80cdadb3-8bb3-47ae-8b18-c1dd89c373c5&web=1
User: [email protected]
Issue Type: User not in directory.

Review dependencies and reproducibility

Currently, the package does not include dependencies adequately.

Issue on page /notebooks/aviation-trends/aviation.html

Is there code for EDA? I only see code for visualizations. Is there a separate codebook or notebook?

If possible, could written instructions be provided on how to navigate the websites where the datasets are hosted, i.e. detailed steps to reach the actual data? So far I only have access to the NO2 dataset, which I downloaded from SharePoint (which is no longer working). I have not been able to access the other datasets. I have yet to contact the sources of the datasets, but I will try that too and see whether it helps. Thank you.

Are there data/datasets to quantify population migration within the country?

Issue on page /notebooks/air-pollution/air-pollution.html

DriverError: ../../data/shapefiles/lbn_adm_cdr_20200810/lbn_admbnda_adm2_cdr_20200810.shp: No such file or directory

Where can I access the shapefile, please?
Code reference:
import geopandas as gpd

shapefile_path = '../../data/shapefiles/lbn_adm_cdr_20200810/lbn_admbnda_adm2_cdr_20200810.shp'
lebanon_adm2 = gpd.read_file(shapefile_path)
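
Before requesting the file, it can help to confirm what is actually missing on disk. The following is a minimal sketch, not part of the notebook: the helper name `missing_shapefile_parts` is hypothetical, and it assumes the usual convention that a `.shp` file is accompanied by `.shx` and `.dbf` files with the same base name.

```python
from pathlib import Path

# Sidecar files that normally accompany a .shp file (assumed convention).
REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shapefile_path):
    """Return the expected shapefile components that are absent on disk."""
    base = Path(shapefile_path)
    return [
        base.with_suffix(suffix)
        for suffix in REQUIRED_SUFFIXES
        if not base.with_suffix(suffix).exists()
    ]


if __name__ == "__main__":
    # Same relative path as in the notebook; it is resolved against the
    # current working directory, so running from another folder changes it.
    path = "../../data/shapefiles/lbn_adm_cdr_20200810/lbn_admbnda_adm2_cdr_20200810.shp"
    missing = missing_shapefile_parts(path)
    if missing:
        print("Resolved against:", Path.cwd())
        for part in missing:
            print("Missing:", part)
    else:
        print("All expected components found.")
```

If every component is reported missing, the data folder has probably not been downloaded, or the notebook is being run from a different working directory than the relative path assumes.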

Release package properly

As of 0c48116, the Python package is not released properly. I'll open a PR to address this shortly adopting YY.MM.MICRO CalVer.
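
For reference, a YY.MM.MICRO CalVer string could be built as in the sketch below. This only illustrates the numbering scheme and is not the project's release tooling; the function name `calver` and the example date are assumptions. In CalVer, YY is the year minus 2000 and MM is the month without zero padding.

```python
from datetime import date


def calver(release_date, micro):
    """Format a YY.MM.MICRO version string.

    YY is the year minus 2000, MM is the unpadded month, and MICRO is a
    counter for releases within the same month.
    """
    if micro < 0:
        raise ValueError("micro must be non-negative")
    return f"{release_date.year - 2000}.{release_date.month}.{micro}"


# Illustrative date only; prints 24.3.0
print(calver(date(2024, 3, 20), 0))
```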

Issue on page /notebooks/ais-analysis/README.html

"The difference in arrival vs. departure draught is a key calculation to estimate trade. However, the difference between outgoing and incoming draft is only identified in a subset of port calls. For the remaining routes, we apply the back-propagation technique, searching for the arrival draft at the next port visited."

Is "draft" supposed to be "draught" here?
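
"Draft" and "draught" are the American and British spellings of the same term, so the quoted passage mixes the two. The back-propagation step it describes can be sketched as below. This is an illustration under stated assumptions, not the repository's implementation: the record layout (`arrival_draught`, `departure_draught`), the function names, and the sample values are all hypothetical.

```python
from typing import List, Optional, TypedDict


class PortCall(TypedDict):
    port: str
    arrival_draught: Optional[float]    # metres on arrival
    departure_draught: Optional[float]  # metres on departure; None if unreported


def fill_departure_draught(calls: List[PortCall]) -> List[PortCall]:
    """Fill a missing departure draught with the arrival draught at the next port.

    `calls` must hold one vessel's port calls in chronological order.
    The input list is not modified.
    """
    filled = [dict(call) for call in calls]
    for current, following in zip(filled, filled[1:]):
        if current["departure_draught"] is None:
            current["departure_draught"] = following["arrival_draught"]
    return filled


def draught_change(call: PortCall) -> Optional[float]:
    """Departure minus arrival draught; None when either value is unknown."""
    if call["arrival_draught"] is None or call["departure_draught"] is None:
        return None
    return call["departure_draught"] - call["arrival_draught"]


# Made-up values for illustration only.
calls = [
    {"port": "Beirut", "arrival_draught": 9.0, "departure_draught": None},
    {"port": "Tripoli", "arrival_draught": 7.5, "departure_draught": 8.0},
]
for call in fill_departure_draught(calls):
    print(call["port"], draught_change(call))
```

The last call of a voyage has no following port, so its departure draught stays missing and `draught_change` returns `None` for it.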

Issue on page /notebooks/air-pollution/air-pollution.html

A few typos:
This notebook 'uses' analyses the NO2 data in Lebanon and presents the insights from the analysis. Two datasets inform the analysis - NO2 'dta' obtained from Google Earth Engine and boundary files obtained from HDX.

Issue on page /notebooks/air-pollution/air-pollution.html

"This notebook uses analyses the NO2 data in Lebanon and presents the insights from the analysis."
"uses" should be deleted.

Issue on page /notebooks/air-pollution/air-pollution.html

Cannot access SharePoint.

Conflict Data

Obtain the full ACLED database on Lebanon. At a later stage, match the regional conflict data with the regional NTL and population data. In addition to ACLED, are there any other conflict or social tension databases available for Lebanon?

Issue on page /notebooks/ntl-analysis/README.html

I followed the link provided to get the dataset, opened the data archive hyperlink, and saw a series of datasets with no clear description. Could the actual dataset name from the archive be provided so that I can find it?

Is there a way to estimate the share of refugees in population by region?

Incorrect table of contents

Revert 73da1d0.

Population and Demographics Assessment

To overcome gaps in official demographic statistics, the team has turned to private sector data sources. In their book, “The Narrow Corridor,” Daron Acemoglu and James Robinson suggested that Lebanon has not conducted a population census since 1932. The task is to obtain and plot national and regional (as disaggregated as possible) population data, either monthly or annually.

Proposal

https://github.com/datapartnership/operations/issues/643

Tasks

- Are there any alternative data/datasets that can help gauge Lebanon’s population trends? #21 (label: population)
- Is there a way to estimate the share of refugees in population by region? #23 (label: population)
- Are there data/datasets to quantify population migration within the country? #24 (labels: population, wontfix)
- Add Population Density and Displacement Maps #25 (label: population)

Issue on page /notebooks/air-pollution/air-pollution.html

I received the following error

Issue on page /notebooks/crop-growing-status/README.html

Apart from monthly temperature and rainfall data, were any other data used? The data section states that a number of datasets were used.

Issue on page /notebooks/air-pollution/air-pollution.html

Suggested comments

output_notebook() # Sets up Bokeh to display plots in Jupyter Notebook

bokeh.core.validation.silence(EMPTY_LAYOUT, True)  # Suppress certain Bokeh warnings for a cleaner output
bokeh.core.validation.silence(MISSING_RENDERERS, True)  # Suppress certain Bokeh warnings for a cleaner output

Groups the data by administrative region and date, then calculates the mean NO2 column number density

df = monthly_no2_adm1.groupby(['admin1Name', 'date']).mean('NO2_column_number_density').reset_index()

Creating a line chart using a custom function get_line_chart

Parameters:

df: DataFrame containing the data

category: Column name for the categorical variable (administrative region in this case)

title: Title for the plot

source: Source of the data (e.g., 'Copernicus')

measure: The measure being plotted (NO2 column number density in this case)

line_chart = get_line_chart(df, category='admin1Name', title='Monthly Air Pollution in Lebanon by Admin 1', source='Copernicus', measure='NO2_column_number_density')

Display the line chart

show(line_chart)

Are there any alternative data/datasets that can help gauge Lebanon’s population trends?

Alternative datasets as proxies for population growth and density in Lebanon

Introduction

This notebook is a proof of concept for using alternative datasets as proxies for population growth and density in Lebanon. The datasets used are:

Facebook population density
Kontur population density
World Bank Data
- Infrastructure
- Population
- Urban Development
- Gender


pacman::p_load(tidyverse, janitor, tmap, tmaptools, sf, mapview, GADMTools,
               hrbrthemes)

Importing and reading the data:

pop <- st_read("kontur_population_LB_20220630.gpkg") %>% st_transform(4326)

gender_lbn <- read_csv("gender_lbn.csv") %>% janitor::clean_names() %>% slice(-1)

urban <- read_csv("urban-development_lbn.csv") %>% janitor::clean_names() %>% slice(-1)

villa <- st_read("Lebanon_Villages.shp") %>% st_transform(4326)

births <- readxl::read_excel("birthsAndPregnanciesLBN.xlsx") %>% janitor::clean_names()

adm <- st_read("gadm41_LBN_2.json/gadm41_LBN_2.json")

Using tmap to visualize the population density in Lebanon:

 pop %>% tm_shape() + tm_polygons(col = "population", palette = "viridis", 
                                 style = "quantile", n = 5, 
                                 title = "Population", 
                                 legend.hist = TRUE, border.col = NULL,
                                 legend.is.portrait = FALSE
                                 )

Using geolocated data, we can visualize the villages in Lebanon:

tm2 <- tm_shape(villa) + tm_dots(
                              title = "Villages", col = "slateblue4",
                              legend.hist = TRUE, border.col = NULL,
                              legend.is.portrait = FALSE
                              ) +
  tm_layout(main.title = "Villages")

urban %>% filter(indicator_name %in% c("Urban population", 
                                       "Population in largest city")) %>% 
  mutate(value = as.numeric(value),
         year = as.numeric(year)) %>%
  ggplot(aes(x = year, y = value, group = indicator_name, color = indicator_name)) +
  geom_line() +
  geom_point() +
  labs(title = "Key urban indicators", x = "Year", y = "Population",
       caption = "Source: World Bank",
       color = NULL) +
  theme_minimal() +
  scale_y_continuous(labels = scales::comma) +
  theme_ipsum_rc() +
  scale_color_ipsum()  +
  theme(legend.position = "bottom")

biryears <- births %>% pivot_longer(!c(adm2_code, adm2_name, urban_or_rural)) %>% 
  filter(grepl("adjusted_births", name)) %>% 
  mutate(name = parse_number(name)) %>% 
  fuzzyjoin::stringdist_inner_join(adm, by = c("adm2_name" = "NAME_2"), max_dist = 2) %>%
  st_as_sf()


biryears %>% 
  group_by(urban_or_rural, name) %>%
  top_n(4, value) %>%
  ggplot(aes(x = name, y = value, color = adm2_name)) +
  geom_line() +
  geom_point() +
  labs(title = "Births", x = "Year", y = "Births",
       caption = "Source: World Bank",
       color = NULL) +
  theme_minimal() +
  scale_y_continuous(labels = scales::comma) +
  theme_ipsum_rc() +
  scale_color_ipsum()  +
  facet_wrap(~urban_or_rural) +
  theme(legend.position = "bottom")

refug <- jsonlite::fromJSON("https://data.unhcr.org/population/get/timeseries?widget_id=411213&geo_id=71&sv_id=4&population_collection=22&frequency=day&fromDate=2012-01-01")

refug$data$timeseries %>% 
  mutate(data_date = as.Date(data_date),
         individuals = as.numeric(individuals)) %>%
  ggplot(aes(x = data_date, y = individuals)) +
  geom_line() +
  geom_point() +
  labs(title = "Refugees", x = "Year", y = "Refugees",
       caption = "Source: UNHCR",
       color = NULL) +
  theme_minimal() +
  scale_y_continuous(labels = scales::comma) +
  theme_ipsum_rc() +
  scale_color_ipsum()  +
  theme(legend.position = "bottom")

Issue on page /notebooks/aviation-trends/aviation.html

Suggested comments

Creates a figure with 1 row and 2 columns of subplots, setting the figure size

fig, axs = plt.subplots(1, 2, figsize=(15, 5))

Sets font family for the entire plot

plt.rcParams["font.family"] = "Georgia"

Plots the total ASK (Available Seat Kilometres) on the first subplot

axs[0].bar(x=inbound_flights_mena_monthly_ask['date'], # X values (dates)
height=inbound_flights_mena_monthly_ask['total_ask'], # Heights of bars (total ask)
width=20) # Width of bars

Plots the total payload (total weight of cargo) on the second subplot

axs[1].bar(x=inbound_flights_mena_monthly_ask['date'], # X values (dates)
height=inbound_flights_mena_monthly_ask['total_payload'], # Heights of bars (total payload)
width=20) # Width of bars

Loops through each subplot

for ax in axs:
    # Hide the right and top spines (axis lines)
    ax.spines['right'].set_visible(False)
    ax.spines['top'].set_visible(False)

    # Only shows ticks on the left and bottom spines
    ax.yaxis.set_ticks_position('left')
    ax.xaxis.set_ticks_position('bottom')

    # Adds gridlines to both major and minor ticks
    ax.grid(which='both', linestyle='--', linewidth=0.5, color='gray', alpha=0.7)

    # Sets titles for each subplot
    axs[0].set_title('Average payload on Inbound Flights to Lebanon', font='Georgia', fontsize=12)
    axs[1].set_title('Average Available Seat Kilometres on Inbound Flights to Lebanon', font='Georgia', fontsize=12)

    # Sets x-label for each subplot
    ax.set_xlabel('Month')

    # Sets y-label for each subplot
    axs[0].set_ylabel('Average payload (kg)')
    axs[1].set_ylabel('Average Available Seat Kilometres')

Adds a subtitle to the first subplot

subtitle = 'Source: Global Aviation Dashboard (World Bank)'
axs[0].text(0, -0.15, subtitle, ha='left', va='center', transform=axs[0].transAxes,
fontsize=10, color='black', weight='normal')

Creates a figure with 2 rows and 1 column of subplots, setting the figure size

fig, axs = plt.subplots(2, 1, figsize=(10, 10))

Sets font family for the entire plot

plt.rcParams["font.family"] = "Georgia"

Plots the total number of seats on inbound flights monthly on the first subplot

axs[0].bar(x=inbound_flights_mena_monthly_all['date'], # X values (dates)
height=inbound_flights_mena_monthly_all['total_seats'], # Heights of bars (total seats)
width=20) # Width of bars

Plots the total number of seats on inbound flights yearly on the second subplot

axs[1].bar(x=inbound_flights_mena_yearly_all['date'], # X values (years)
height=inbound_flights_mena_yearly_all['total_seats'], # Heights of bars (total seats)
width=30) # Width of bars

Loops through each subplot

for ax in axs:
    # Hides the right and top spines (axis lines)
    ax.spines['right'].set_visible(False)
    ax.spines['top'].set_visible(False)

    # Only shows ticks on the left and bottom spines
    ax.yaxis.set_ticks_position('left')
    ax.xaxis.set_ticks_position('bottom')

    # Adds gridlines to both major and minor ticks
    ax.grid(which='both', linestyle='--', linewidth=0.5, color='gray', alpha=0.7)

    # Sets title for the first subplot
    axs[0].set_title('Number of Seats on Inbound Flights to Lebanon (Monthly)', font='Georgia', fontsize=12)
    # Sets title for the second subplot
    axs[1].set_title('Number of Seats on Inbound Flights to Lebanon (Yearly)', font='Georgia', fontsize=12)

    # Sets x-label for each subplot
    ax.set_xlabel('Month' if ax == axs[0] else 'Year')

    # Sets y-label for each subplot
    ax.set_ylabel('Number of seats (in thousands)')

Adds a subtitle to the second subplot

subtitle = 'Source: Global Aviation Dashboard (World Bank)'
axs[1].text(0, -0.15, subtitle, ha='left', va='center', transform=axs[1].transAxes,
fontsize=10, color='black', weight='normal')

Imports necessary libraries

import numpy as np
import matplotlib.patches as mpatches

df: DataFrame containing data about top categories and seats

sorted_df: DataFrame sorted by date and total seats in descending order

df = top_categories_seats
sorted_df = df.sort_values(by=['date', 'total_seats'], ascending=[True, False])

Gets unique years

years = sorted_df['date'].unique()

Creates a figure and axis object

fig, ax = plt.subplots(figsize=(8, 6))
bar_width = 0.5 # Width of the bars
colors = ['#e6194b', '#3cb44b', '#ffe119', '#4363d8', '#f58231',
'#911eb4', '#42d4f4', '#f032e6', '#bfef45', '#fabed4'] # List of colors for the bars

Plots each category for each year

for i, year in enumerate(years):
    bottom = np.zeros(len(years)) # Initializes the bottom values for stacked bars
    sorted_df = df[df['date'] == year].sort_values(by='total_seats', ascending=False) # Sorts the data for the current year

    # Gets unique categories for the current year
    categories = sorted_df['origin_country'].unique()

    # Plots each category for the current year
    for j, category in enumerate(categories):
        # Gets the total seats for the current category and year
        value = sorted_df[(sorted_df['date'] == year) & (sorted_df['origin_country'] == category)]['total_seats'].sum()
        if value > 0:
            # Plots a bar for the current category and year, with stacking
            ax.bar(year, value, bar_width, bottom=bottom[i], color=country_color_map[category], label=category if i == 0 else "")
            bottom[i] += value  # Updates the bottom value for stacking

Sets axis labels

ax.set_xlabel('Year')
ax.set_ylabel('Number of seats (in thousands)')

Creates legend patches for each country

legend_patches = [mpatches.Patch(color=color, label=country) for country, color in country_color_map.items()]

Adds legend to the plot

ax.legend(handles=legend_patches, loc='upper center', frameon=False, bbox_to_anchor=(1.15, 1))

Customizes plot aesthetics

ax.spines['right'].set_visible(False) # Hides the right spine
ax.spines['top'].set_visible(False) # Hides the top spine
ax.yaxis.set_ticks_position('left') # Shows ticks only on the left spine
ax.xaxis.set_ticks_position('bottom') # Shows ticks only on the bottom spine
ax.grid(which='both', linestyle='--', linewidth=0.5, color='gray', alpha=0.7) # Add gridlines

Sets plot title and rotation for x-axis ticks

ax.set_title('Top 5 countries with number of inbound flight seats')
plt.xticks(rotation=0)

Adds a subtitle to the plot

subtitle = 'Source: Global Aviation Dashboard (World Bank)'
ax.text(0, -0.15, subtitle, ha='left', va='center', transform=ax.transAxes,
fontsize=10, color='black', weight='normal')

code error reference:
----> 5 df = monthly_no2_adm1.groupby(['admin1Name', 'date']).mean('NO2_column_number_density').reset_index()

KeyError: 'admin1Name'
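
A `KeyError` at this line means the DataFrame has no column named `admin1Name`. One possible cause, stated here as an assumption, is that the boundary file in use names its admin-1 attribute differently. The sketch below checks for the expected columns before grouping and lists what is available. The helper name `mean_by_region` and the toy values are hypothetical; the column names come from the notebook line above.

```python
import pandas as pd


def mean_by_region(df, region_col="admin1Name", date_col="date",
                   measure="NO2_column_number_density"):
    """Average `measure` per region and date, failing early with a readable message."""
    missing = [c for c in (region_col, date_col, measure) if c not in df.columns]
    if missing:
        raise KeyError(
            f"Missing columns {missing}; available columns: {list(df.columns)}"
        )
    # Select the measure explicitly so only that column is averaged.
    return df.groupby([region_col, date_col], as_index=False)[measure].mean()


# Toy data for illustration only.
toy = pd.DataFrame({
    "admin1Name": ["Beirut", "Beirut", "Bekaa"],
    "date": ["2023-01", "2023-01", "2023-01"],
    "NO2_column_number_density": [1.0, 3.0, 5.0],
})
print(mean_by_region(toy))
```

If the region column turns out to have another name, passing it as `region_col` (or renaming it with `df.rename(columns=...)`) avoids editing the rest of the notebook.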
