saqgetr's Issues

Conflict with Java

Hi, your package is great for retrieving EEA data, but I have encountered a mysterious bug: there seems to be some sort of Java conflict when using the package. This took me a while to track down, but I have now identified the place in the code that triggers the error. With a simple call like this:
allsites <- get_saq_sites()
and then, later:
library(tmaptools)  # for read_osm() and bb()

lons <- seq(-39.5, 39.5, 1)
lats <- seq(45.5, 64.5, 1)
bg <- read_osm(bb(c(min(lons), min(lats), max(lons), max(lats))), type = "osm")
I get a long error message:
Error in .jcall("java/lang/Class", "Ljava/lang/Class;", "forName", cl, :
  Unable to start conversion to UTF-16
Error in h(simpleError(msg, call)) :
  error in evaluating the argument 'Class' in selecting a method for function 'new': class not found
Error in .tryJava() :
  Java classes could not be loaded. Most likely because Java is not set up with your R installation.
  Here are some trouble shooting tips:

  1. Install Java (for mac consider installing java 1.6 from https://support.apple.com/kb/DL1572?locale=en_US )
  2. Run
     R CMD javareconf
     in the terminal. If you are using Mac OS X >= 10.7 you may want to try
     R CMD javareconf JAVA_CPPFLAGS=-I/System/Library/Frameworks/JavaVM.framework/Headers
     instead.

If I leave out the call to get_saq_sites(), I get no error message.

Earlier, I got a similar error message in other programs and managed to solve it by following the recipe on this web page: https://www.geeksforgeeks.org/how-to-set-java-path-in-windows-and-linux/

I have done the same now, but it seems that just a basic call to the saqgetr package leads to the Java error, so some conflicting Java setup appears to be triggered by your package. I really hope you can find a fix for this, as I'm a bit stuck.
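
A diagnostic sketch (assuming rJava is installed): initialise the JVM before loading saqgetr, then repeat the calls, to test whether load order is what triggers the conflict.

library(rJava)
.jinit()
J("java.lang.System")$getProperty("java.version")

library(saqgetr)
allsites <- get_saq_sites()
J("java.lang.System")$getProperty("java.version")  # does this still work after the saqgetr call?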

Updated data from get_saq_observations()?

Hi Stuart,
first of all, thanks for an excellent package for extracting EEA data! Very efficient and useful. According to the manual there is a lag of about one month for the newest data, and indeed data_processes() and filtered_processes() show this. However, when extracting data for the sites with get_saq_observations(), I actually get almost up-to-date data (a lag of three days from now). Is there some mismatch between these routines, or are the data being updated just now, so that I happened to hit the period of the monthly update? According to data_processes(), all data end at 2021-01-15 or 2021-01-16, but when extracted with get_saq_observations() they end at 2021-02-15. (Today is 2021-02-18.)
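
A minimal way to reproduce the mismatch (a sketch; the site code is just an example and the date_end column name is assumed from the helper tables):

library(saqgetr)
library(dplyr)

processes <- get_saq_processes()
processes %>%
  filter(site == "gb0036r") %>%
  summarise(metadata_end = max(date_end, na.rm = TRUE))

obs <- get_saq_observations(site = "gb0036r", start = 2021)
max(obs$date, na.rm = TRUE)  # latest observation actually returned
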
Thanks.

Update of validated (E1a) data uncovered a lack of data delivery for the UK

To the users of saqgetr: this month, the observations since 2019 have been updated with validated data (the E1a data flow in the AQER nomenclature). However, there are some issues with missing validated data for some countries, notably the UK. I have reinserted the near-real-time observations (from the E2a data flow) for the UK, which I think has resolved the missing data issue, but in-depth testing of other countries has not been done. If users encounter systematically missing data for a year across a number of monitoring sites in a country, please let me know and I will see what I can do. Many thanks!

Data updated only up to 12-06-2020 using get_saq_observations()

Hi Stuart,
I am doing air pollution research and want to collect UK hourly air pollutant data that is as recent as possible. However, when using get_saq_observations() as the example shows, the data only run until 12 June 2020.

I wonder, could the data be updated to the end of June or early July?
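
A quick check of how current the GB data are, via the sites helper table (a sketch):

library(saqgetr)
library(dplyr)

get_saq_sites() %>%
  filter(grepl("^gb", site)) %>%
  summarise(latest = max(date_end, na.rm = TRUE))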

Yours Sincerely,
Ada

No data after 2023-08-12

Hello,

It seems that there are no data available after 2023-08-12.

library(saqgetr)
library(magrittr)  # for the pipe

data_sites <- get_saq_sites()
data_sites$date_end %>% max(na.rm = TRUE)

outputs:
[1] "2023-08-12 23:00:00 UTC"

Thanks,
Best wishes

closeAllConnections() causes issue when knitting

Hi Stuart,

I'm trying to put together an RMarkdown document using this package. It works fine when I run the code interactively in the global environment, but when I try to knit to HTML, get_saq_observations() throws an error relating to connections.

I think this is due to the use of closeAllConnections() in read_saq_observations(). See this related SO question: https://stackoverflow.com/a/11165899/4227151

Can we safely delete closeAllConnections()? If not, I'm not sure how hard it would be to close only the specific connection.
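
A sketch of the targeted alternative (the function and its name are illustrative, not the package's actual internals): open the connection explicitly and close only that one, even on error.

read_remote_lines <- function(file_url) {
  con <- url(file_url, open = "r")
  on.exit(close(con), add = TRUE)  # closes just this connection, even on error
  readLines(con)
}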

Cheers,
Hao

4 values returned for one observation

I was downloading some data for 2012 and noticed there were four values for every hour, each of them different.

dat <- get_saq_observations(site = 'gr0027a', start = '2012-07-01', end = '2012-07-15', variable = 'o3')

There is a similar issue for another site: for no2, two values are returned. I was expecting only one value per hour for both species.

dat_2 <- get_saq_observations(site = 'gb0002r', start = '2012-07-01', end = '2012-07-15', variable = 'no2')
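
A sketch to see where the duplicates come from (assuming the returned tibble carries the summary and process columns, which usually distinguish parallel series such as different instruments or averaging periods):

library(dplyr)
dat %>% count(summary, process)
dat_2 %>% count(summary, process)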

Can't download BC data from 2008 to 2012, 2019 and part of 2018 from 31 BC monitoring sites.

Hi Dr. Grange,

Using the get_saq_observations() function, I managed to download BC data from 31 BC monitoring sites (including closed ones) between 2013 and 2017. However, when I tried to download BC data for 2008 to 2012 and for 2018 to 2019, only part of 2018 could be downloaded. Please see the following code:

library(saqgetr)
library(dplyr)

BC_Data <- get_saq_observations(
  site = c("gb1044a", "gb1028a", "gb0048r", "gb1067a", "gb1097a", "gb1055r", "gb0620a",
           "gb0682a", "gb0886a", "gb0567a", "gb1023a", "gb0723a", "gb0934a", "gb0580a",
           "gb0960a", "gb0851a", "gb0105a", "gb0146a", "gb0991a", "gb0839a", "gb0931a",
           "gb0641a", "gb0036r", "gb0706a", "gb0613a", "gb0995a", "gb1059a", "gb0182a",
           "gb0658a", "gb0234a", "gb0135a"),
  start = 2011,
  end = 2012,
  variable = "bc",
  verbose = TRUE
) %>%
  saq_clean_observations(summary = "hour", valid_only = TRUE, spread = TRUE) %>%
  arrange(site)

Is there anything I have done wrong?
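
One way to narrow this down (a sketch; column names assumed from the processes helper table) is to check what the metadata report for bc at these sites before downloading:

processes <- get_saq_processes()
processes %>%
  filter(site %in% c("gb1044a", "gb1028a"), variable == "bc") %>%
  select(site, variable, date_start, date_end)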

Thank you so much in advance.

Sam

Use of get_saq_observations() for parallel data + use of 'valid_only'

Hi Stuart,
two questions to the get_saq_observations():

  1. I encountered an issue when extracting PM10 data for a station (gb0036r in 2015) that has both daily and hourly resolved data. With get_saq_observations() I get a tibble of 9101 observations: the first 365 (actually 364, one is missing) are the daily observations and the rest are the hourly ones, all packed together. They can be separated with saq_clean_observations() and spread = TRUE, but I found it strange at first that they were packed into one variable (see the sketch after this list for one way to split them directly). It would be useful to be able to specify the temporal resolution (the 'period' as given by EEA) inside get_saq_observations(), e.g. a parameter like period = "hour". Just a wish.
  2. My second question concerns 'valid_only'. It is not completely clear to me what this does. If I set it to FALSE, do I risk getting actual data values that should not be used, or do I just get NAs for the times with invalid data? This is important to know. It seems to me that the latter is correct, but it would be good to have it confirmed (I can live with the NAs, but of course I don't want erroneous observational values).
    Thanks, Sverre
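
A sketch for question 1 (summary codes vary between series, so inspect them rather than assuming particular integers): split the combined daily and hourly series by the summary variable directly.

library(saqgetr)
library(dplyr)

dat <- get_saq_observations(site = "gb0036r", variable = "pm10", start = 2015, end = 2015)

count(dat, summary)                   # which averaging periods are present?
dat_split <- split(dat, dat$summary)  # one tibble per averaging period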

saqgetr and lml discrepancy

I noticed some discrepancies between the data from saqgetr and the Dutch observation network (LML, operated by RIVM). The LML data are dated 07/2023, so potentially this is simply a case of Airbase not yet being updated with the adjusted data from RIVM, but the Airbase data say they have been validated.

How often is Airbase updated?
Are there multiple validation steps?

RIVM info on validation (not much detail) https://www.luchtmeetnet.nl/informatie/overige/validatie-data

Reprex below:

library(saqgetr)
library(dplyr)
library(lubridate)
library(threadr)
library(openair)
library(reshape2)

## import all netherlands sites
saq_sites_nl <- get_saq_sites() %>% 
  filter(grepl("nl0", site))

## get valid observations for 2022
saq_nl <- get_saq_observations(site = saq_sites_nl$site, variable = "pm2.5", valid_only = TRUE, tz = "UTC", start = "2022", end = "2022") %>% 
  select(date, site, saqgetr = value) 

## import csv, doesn't like header so go for row above
lml_dat <- read.table("https://data.rivm.nl/data/luchtmeetnet/Vastgesteld-jaar/2022/2022_PM25.csv", skip = 9, sep = ';')
## use first row
names(lml_dat) <- lml_dat[1,]

## LML is in CET winter time, convert to UTC
lml_nl <- lml_dat[-1,] %>% 
  mutate(date = ymd_hm(` Begindatumtijd`, tz = "UTC")-3600)

lml_nl_down <- lml_nl[,-c(1,2,3,4,5)]  %>% 
  melt('date') %>% 
  mutate(variable = gsub("NL01", "nl00", variable),
         variable = gsub("NL10", "nl00", variable),
         variable = gsub("NL49", "nl00", variable)) %>% 
  transmute(date, site = variable, lml = as.numeric(value))

## left join with saq first as it has fewer dates with data
saq_lml_nl <- left_join(saq_nl, lml_nl_down, by = c('date', 'site'))

Summarising the two datasets for each site:


## summary stats
statz <- aqStats(saq_lml_nl, c('saqgetr', 'lml'), type = "site")

## calculate daily means and number of days above 15
saq_lml_24h_exceed <- saq_lml_nl %>% 
  timeAverage("day", type = "site") %>% 
  group_by(site) %>% 
  summarise(saq_gt_15 = sum(saqgetr >= 15, na.rm = TRUE),
            lml_gt_15 = sum(lml >= 15, na.rm = TRUE)) %>% 
  left_join(saq_sites_nl, by = "site") %>% ## get site info
  select(site, site_type, site_area, saq_gt_15, lml_gt_15) %>%
  arrange(site_type, site_area) ## arrange by site type then site area

An example below for the site Vredepeel-Vredeweg (NL00131): saqgetr is 1 ug/m3 higher than LML from 2022-01-01 to 2022-11-24 16:00, after which the two series are identical.

## Example of one site

## import background site Vredepeel-Vredeweg
saq_nl00131 <- get_saq_observations(site = "nl00131", variable = "pm2.5", valid_only = TRUE, tz = "UTC", start = "2022", end = "2022") %>% 
  select(date, saqgetr = value) 

## import csv, doesn't like header so go for row above
lml_dat <- read.table("https://data.rivm.nl/data/luchtmeetnet/Vastgesteld-jaar/2022/2022_PM25.csv", skip = 9, sep = ';')
## use first row
names(lml_dat) <- lml_dat[1,]

## convert to UTC
lml_nl00131 <- lml_dat[-1,] %>% 
  transmute(date = ymd_hm(` Begindatumtijd`, tz = "UTC")-3600,
            lml = as.numeric(NL10131))

## join them together
nl00131 <- left_join(saq_nl00131, lml_nl00131, by = 'date')

## plot full time series
threadr::time_dygraph(nl00131, c('saqgetr', 'lml'))

## plot summary
openair::timeVariation(nl00131, c('saqgetr', 'lml'))

[Figure: example period of difference]

get_saq_sites(): connection cannot be established

Hello,

I'm running into the error:
Error in open.connection(con, "rb") : HTTP error 403.
when trying to import the sites information with get_saq_sites().

This error has been occurring for two weeks now.
Maybe you can help me with this issue?
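
A diagnostic sketch: reproduce the HTTP status outside saqgetr to confirm that the 403 comes from the remote file server (the URL below is a placeholder; substitute the file the package actually requests).

library(httr)
resp <- HEAD("https://example.com/path/to/sites_table.csv.gz")  # placeholder URL
status_code(resp)  # 403 here would confirm a server-side block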

Thank you!
Greetings,
Lea
