trackmatcher.jl's Introduction

TrackMatcher

Overview

TrackMatcher is a Julia package to find intersections between two sets of trajectories. The current version can match aircraft- or cloud-track (primary) data with CALIPSO satellite (secondary) ground tracks and store relevant data in the vicinity of the intersection.

Installation

TrackMatcher is an unregistered Julia package, but can be installed with the package manager. TrackMatcher uses the PCHIP.jl package developed within the TrackMatcher framework. It is a pure Julia implementation of the Piecewise Cubic Hermite Interpolating Polynomial. PCHIP has to be installed prior to TrackMatcher as Julia currently only acknowledges registered dependencies.

julia> ]
pkg> add https://github.com/LIM-AeroCloud/PCHIP.jl.git
pkg> add https://github.com/LIM-AeroCloud/TrackMatcher.jl.git

Usage

In essence, 3 TrackMatcher structs are needed to load essential primary and secondary data, and find intersections in the stored track data. A full overview is given in the WIKI and this README is only meant as a quick reminder of the most important functions. Additionally, Julia's help function can be used by pressing ? and the respective function name or type constructor.

Loading primary data

FlightData or FlightTrack of individual flights are loaded into a FlightSet with vectors of FlightData for different database types, currently holding

  1. VOLPE AEDT inventory (field/keyword volpe)
  2. FlightAware archived data (field/keyword flightaware)
  3. flightaware.com online data (field/keyword webdata)

A convenience constructor for FlightSet exists, where data can be loaded by passing a String or Vector{String} with directories of the input data files for each source type to the individual fields of FlightSet using the keyword arguments listed above. Those folders are searched recursively for the respective data files. More than one folder path can be listed for all the database types in a Vector{String}.

FlightSet{T}(;
  volpe::Union{String,Vector{String}}=String[],
  flightaware::Union{String,Vector{String}}=String[],
  webdata::Union{String,Vector{String}}=String[],
  altmin::Real=5000,
  odelim::Union{Nothing,Char,String}=nothing,
  savedir::Union{String,Bool}="abs",
  remarks=nothing
) where T

A similar convenience constructor exists for CloudTracks. As only one database type exists, only the directories are needed as vararg rather than kwarg:

CloudSet{T}(
  folders::String...;
  savedir::Union{String,Bool}="abs",
  structname::String="cloud",
  remarks=nothing
) where T

For details on the kwargs, see the WIKI or use the help function for the individual constructors.

Loading CALIOP data from the CALIPSO satellite

Initially, only CALIPSO positions (lat/lon) and overpass times are stored as SatData in a SatSet for performance reasons. Each granule is stored as SatData or SatTrack, which are combined in the granules field of SatSet. Only one of the types cloud profile (CPro) or cloud layer (CLay) data can be used to construct a SatSet.

In the vicinity of intersections, additional observations can be stored as CPro or CLay structs. This feature is only available if the file/folder structure does not change between loading the data and calculating intersections. It can be controlled with the savedir keyword argument.

The SatSet metadata includes information about the granules, the type of the satellite data, the date range of the data, the time the database was created, the loadtime, and any remarks as additional data or comments.

A SatSet can be constructed by giving any number of folder strings and any remarks using the keyword remarks. The folders are scanned recursively for any hdf file and the type of the satellite data is determined by keywords CLay or CPro in the folder/file names. If both types exist in the folders, the data type is determined by the majority in the first 50 file names. Alternatively, the sat data type can be forced with the keyword type set to a Symbol :CPro or :CLay.

SatSet{T}(
  folders::String...;
  type::Symbol=:undef,
  savedir::Union{String,Bool}="abs",
  remarks=nothing
) where T
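
As a minimal sketch of the type detection described above (not TrackMatcher's actual implementation), the following collects hdf files recursively and picks the data type by majority vote over the first 50 file names. The helper names findhdf and guesstype, the demo file names, and the tie-breaking rule in favour of CLay are assumptions made for illustration.

```julia
# Hypothetical helper: collect all .hdf files below the given folders (recursive).
function findhdf(folders::String...)
    files = String[]
    for folder in folders
        for (root, _, names) in walkdir(folder)
            for name in names
                endswith(name, ".hdf") && push!(files, joinpath(root, name))
            end
        end
    end
    return sort(files)
end

# Hypothetical helper: majority vote on the keywords CLay/CPro in the first nfiles paths.
function guesstype(files::Vector{String}; nfiles::Int=50)
    sample = first(files, nfiles)              # at most nfiles entries
    nclay = count(f -> occursin("CLay", f), sample)
    ncpro = count(f -> occursin("CPro", f), sample)
    nclay == ncpro == 0 && throw(ArgumentError("no CLay or CPro keyword found"))
    return nclay ≥ ncpro ? :CLay : :CPro       # assumption: ties resolve to CLay
end

# Demo with empty placeholder files in a temporary folder tree
demo = mktempdir()
mkpath(joinpath(demo, "2019", "01"))
for name in ("05kmCPro_a.hdf", "05kmCPro_b.hdf", "05kmCLay_a.hdf", "notes.txt")
    touch(joinpath(demo, "2019", "01", name))
end
files = findhdf(demo)
sattype = guesstype(files)
```

Passing the keyword type to SatSet bypasses this kind of guess entirely, which is the safer choice when both file types are mixed in one folder tree.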

ℹ️ NOTE

SatSet is designed to use CALIPSO data provided by the AERIS/ICARE Data and Services Centre. For the best performance, you should use the same file/folder format as used by ICARE. In particular, cloud layer files must include the keyword CLay in the file name and cloud profile data files the keyword CPro.


Finding intersections in the trajectories of the flight and satellite data

Intersections and corresponding accuracies and observation data in the vicinity of the intersection are stored in the XData struct. Alternatively, an Intersection constructor can be used for the construction of XData.

A convenience constructor exists for automatic calculation of the intersections from the FlightSet and SatSet with parameters controlling these calculations. Additionally, the optional argument savesecondsattype specifies whether both satellite data types CLay and CPro should be stored in SatSet or only the specified main type. For this feature to work, folder and file names of CLay/CPro data must be identical except for the keywords CLay/CPro swapped.

Find intersections by instantiating the Intersection struct with:

Intersection{T}(
  tracks::PrimarySet,
  sat::SatSet,
  savesecondsattype::Bool=false;
  maxtimediff::Int=30,
  primspan::Int=0,
  secspan::Int=15,
  lidarrange::Tuple{Real,Real}=(15_000,-Inf),
  stepwidth::Real=0.01,
  Xradius::Real=20_000,
  expdist::Real=Inf,
  atol::Real=0.1,
  savedir::Union{String,Bool}="abs",
  remarks=nothing
) where T

kwargs

  • maxtimediff::Int=30: maximum delay at intersection between aircraft/satellite overpass
  • primspan::Int=0: number of primary (flight or cloud) data points saved before and after the closest measurement to the intersection
  • secspan::Int=15: number of secondary (satellite) data points saved before and after the closest measurement to the intersection
  • lidarrange::Tuple{Real,Real}=(15_000,-Inf): top/bottom bounds of the lidar column data, between which data is stored; use (Inf, -Inf) to store the whole column
  • stepwidth::Real=0.01: stepwidth in degrees (at the equator) used for the interpolation of flight and satellite tracks
  • Xradius::Real=20_000: Radius in meters, in which multiple intersection finds are assumed to correspond to the same intersection and only the intersection with the minimum delay between flight and sat overpass is saved
  • expdist::Real=Inf: maximum threshold for the distance of a found intersection to the nearest measured track point
  • atol::Real=0.1: tolerance to increase the bounding box around primary tracks by atol degrees increasing the search radius in secondary tracks preventing the omission of intersections due to rounding errors
  • savedir::Union{String,Bool}="abs": options to save absolute ("abs") or relative ("rel") folder paths in the metadata of FlightData or SatSet. When savedir is set to an empty string ("") or false, folder strings are saved as given in the constructor. When set to an empty string or false in the Intersection constructor, no observations are saved in Intersection.
  • remarks=nothing: any data or comments that can be attached to the metadata of Intersection
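
How Xradius and maxtimediff interact can be illustrated with a minimal sketch. It is not the package's algorithm: the struct Find, the function filterfinds, the haversine distance on a spherical Earth, and the demo values are all assumptions. Finds exceeding maxtimediff are dropped, and within Xradius only the find with the smallest delay is kept.

```julia
const R_EARTH = 6_371_000.0  # mean Earth radius in meters (spherical approximation)

# Great-circle distance in meters between two lat/lon points given in degrees
function haversine(lat1, lon1, lat2, lon2)
    φ1, φ2 = deg2rad(lat1), deg2rad(lat2)
    Δφ, Δλ = φ2 - φ1, deg2rad(lon2 - lon1)
    a = sin(Δφ / 2)^2 + cos(φ1) * cos(φ2) * sin(Δλ / 2)^2
    return 2 * R_EARTH * asin(min(1.0, sqrt(a)))
end

# Hypothetical record of one intersection find; tdiff uses the same unit as maxtimediff
struct Find
    lat::Float64
    lon::Float64
    tdiff::Float64
end

function filterfinds(finds::Vector{Find}; Xradius::Real=20_000, maxtimediff::Real=30)
    kept = Find[]
    # Visit finds by increasing |delay|, so the first find of each cluster is the best one
    for f in sort(finds, by = x -> abs(x.tdiff))
        abs(f.tdiff) ≤ maxtimediff || continue
        # Keep f only if no better find lies within Xradius
        any(k -> haversine(f.lat, f.lon, k.lat, k.lon) ≤ Xradius, kept) || push!(kept, f)
    end
    return kept
end

finds = [Find(50.0, 10.0, 12.0), Find(50.05, 10.0, 5.0),
         Find(52.0, 10.0, 20.0), Find(40.0, 0.0, 45.0)]
kept = filterfinds(finds)
```

In this demo the first two finds are about 5.6 km apart and collapse into the one with the smaller delay, while the last find is rejected for exceeding maxtimediff.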

trackmatcher.jl's People

Contributors

pb866

trackmatcher.jl's Issues

Save non-essential CPro data

In addition to the data needed to calculate the intersections and monitor the features at the intersection, save the following cloud profile data in the vicinity of intersections to struct CPro:

  • Particulate_Depolarization_Ratio_Profile_532
  • Day_Night_Flag
  • Tropopause_Height
  • Temperature
  • Pressure
  • Relative_Humidity
  • Ice_Water_Content_Profile
  • CAD_Score

Use open source software for HDF4 file reading

MATLAB is currently only needed to read the CALIPSO HDF4 files. This can also be done with freely available software such as Python or Fortran. Resolving this issue would make TrackMatcher a completely open source tool and remove the dependence on a MATLAB licence.

Filter duplicate intersection finds at inflection points

At inflection points, where flight tracks are cut into different segments for interpolation, the same intersection can be found on each side of the inflection point if it is close enough to that point to be within the set accuracy.

In those cases, filter the intersections and use only the one with the minimal distance to the sat track.

  • Decide on default Xradius

Improve cloud data

This issue tracks a (continuously updated and incomplete) list of issues and missing features related to cloud track data.

  • add progress bar during cloud track loading
  • make expdist work for cloud tracks
  • monitor distance to closest cloud track point with flightcoord (see #37 for name issue)
  • save CLAAS2 data near intersections

Use Mercator projection for the calculation of intersections

The Geodesy package should be used to transform the lat/lon (LLA) values into the Mercator projection (UTM coordinates). This way, flight tracks are straight lines except for points of course changes and can be interpolated much faster with linear interpolations rather than using MATLAB's pchip algorithm. Moreover, inaccuracies in the data points' position can be overcome with a least square fit of the flight segments between course changes.
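
The idea can be sketched without any package, assuming a simple spherical Mercator projection written by hand instead of the Geodesy transformations proposed above. The function names mercator, invmercator and segintersect and the demo coordinates are hypothetical. Once both tracks are projected, the intersection of two straight segments has a closed-form solution.

```julia
# Spherical Mercator projection (assumption: sphere of radius R, lat/lon in degrees)
mercator(lat, lon; R=6_371_000.0) =
    (R * deg2rad(lon), R * log(tan(π / 4 + deg2rad(lat) / 2)))

# Inverse projection back to (lat, lon) in degrees
invmercator(x, y; R=6_371_000.0) =
    (rad2deg(2 * atan(exp(y / R)) - π / 2), rad2deg(x / R))

# Intersection of segments p1→p2 and q1→q2 in the plane, or nothing if they do not cross
function segintersect(p1, p2, q1, q2)
    r = (p2[1] - p1[1], p2[2] - p1[2])
    s = (q2[1] - q1[1], q2[2] - q1[2])
    denom = r[1] * s[2] - r[2] * s[1]          # cross product r × s
    denom == 0 && return nothing               # parallel or degenerate segments
    d = (q1[1] - p1[1], q1[2] - p1[2])
    t = (d[1] * s[2] - d[2] * s[1]) / denom    # position along p1→p2
    u = (d[1] * r[2] - d[2] * r[1]) / denom    # position along q1→q2
    (0 ≤ t ≤ 1 && 0 ≤ u ≤ 1) || return nothing
    return (p1[1] + t * r[1], p1[2] + t * r[2])
end

# Demo: eastbound flight segment along 50°N and a northbound sat segment along 10°E
p1, p2 = mercator(50.0, 8.0), mercator(50.0, 12.0)
q1, q2 = mercator(48.0, 10.0), mercator(52.0, 10.0)
X = segintersect(p1, p2, q1, q2)
Xlat, Xlon = invmercator(X...)
```

Note that straight lines on a Mercator map are lines of constant heading, so how well a real flight segment is represented depends on how closely it follows a fixed course between course changes.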

Allow setting the SatData type explicitly with a kwarg

Instead of determining the satellite data type from the file names of the first 50 files, allow setting it explicitly with a kwarg type. This way, a main folder can be passed to the SatData constructor and all folders are searched for the sat files of the desired type, even if files of the other type are listed first.

Generalise variable names

Try to avoid variable/field names using flight, cloud or sat or find more suiting variable names:

List of fields/variables to be renamed:

  • Intersection.tracked.flight to Intersection.tracked.primary or Intersection.track.primary
  • Make Intersection.tracked.primary a Vector{<:PrimaryTrack}
  • Intersection.accuracy.flightcoord to Intersection.accuracy.primdist
  • Intersection.tracked to Intersection.observed, Intersection.observation(s), Intersection.measured or Intersection.measurement(s)
  • Intersection.accuracy.satcoord to Intersection.accuracy.secdist
  • #36 Rename feature
  • #35 Unify data struct names
  • Xf to Xp (for primary intersection)
  • Change inventory and archive to VOLPE and FlightAware

Add aerosol data

Use aerosol data besides cloud data from CALIOP.

  • Process APro and ALay similar to CPro and CLay
  • revise savesecondsattype to store any or no data of the above datasets

Introduce metadata to `Intersection` data

Currently, the accuracy field of Intersection stores the distance between the two closest points in the interpolated satellite and flight data, determined within a given precision. The maximum precision value should be saved as well so that the calculations can be reproduced.

A new struct IntersectionDB (or something alike) could be created with the Vector{Intersection} as data and the flag settings from function intersection as metadata. Currently, metadata would need to hold:

  • precision
  • deltat

In addition, a created field with the time of creation similar to FlightDB and SatDB should be created as well as a field remarks, where any additional data/comments can be attached. satspan and flightspan are not needed in the metadata as those are obvious from the Intersection data.

Alternatively, Intersection could be restructured to the proposed IntersectionDB type, where Intersection would have a field data holding a DataFrame with the current fields of Intersection as columns for all Intersection objects in the current intersection vector. Furthermore a metadata field could be introduced with the settings as described above.
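
A rough sketch of the first proposal could look as follows. The type names XMeta and IntersectionDB, the field types, and the use of NamedTuples as stand-ins for Intersection objects are assumptions for illustration only, not an agreed design.

```julia
using Dates

# Hypothetical metadata container; field names follow the list above
struct XMeta
    precision::Float64   # maximum precision used in the intersection search
    deltat::Int          # maximum allowed delay between both overpasses
    created::DateTime    # time of creation
    remarks              # any additional data or comments
end

# Hypothetical database type bundling the finds with the settings that produced them
struct IntersectionDB{T}
    data::Vector{T}
    metadata::XMeta
end

# Convenience constructor that stamps the creation time automatically
IntersectionDB(data::Vector{T}; precision::Real, deltat::Integer, remarks=nothing) where T =
    IntersectionDB{T}(data, XMeta(precision, deltat, now(), remarks))

# Demo with NamedTuples standing in for Intersection objects
finds = [(lat=50.0, lon=10.0), (lat=52.1, lon=9.3)]
db = IntersectionDB(finds; precision=0.01, deltat=30, remarks="test run")
```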

Complete feature availability for FlightAware data

Currently, only inventory data by VOLPE gets full features in the test version.
The following features need to be added to FlightAware online and archive data before completion of v0.1:

  • minimum altitude threshold (defaulting to ≤ 15000)

Simplify intersection finding with IntervalArithmetic/IntervalRootFinding

Roots of the PCHIP polynomial can be directly found with the following sample code:

using IntervalArithmetic, IntervalRootFinding
import PyPlot; const plt = PyPlot
import PCHIP

# Two test functions whose intersections are to be found
f(x) = x.^2 .- 0.6
g(x) = 0.8x .- 0.2

# Coarse sample points (x) and fine grid used for the interpolation (xi)
x, xi = -5:5, collect(-5:0.1:5)

# PCHIP interpolants of both functions and their difference
fp = PCHIP.pchip(xi, f(xi))
gp = PCHIP.pchip(xi, g(xi))
d(x) = PCHIP.interpolate(fp, x) - PCHIP.interpolate(gp, x)

# Roots of the difference are the intersections of f and g
rts = IntervalRootFinding.roots(d, -5 .. 5)
xr = mid.(interval.(rts))

plt.clf()
plt.scatter(x, f(x), label="f", marker = "s", color=:darkred)
plt.scatter(x, g(x), label="g", marker = "s", color=:g)
plt.plot(xi, PCHIP.interpolate(fp, xi), color=:darkred)
plt.plot(xi, PCHIP.interpolate(gp, xi), color=:g)
plt.scatter(xr, g.(xr), label = "intersections", marker = "s", color=:gold)
plt.grid(ls=":"); plt.legend()
plt.gcf()

This will simplify the algorithm drastically.

Document adding timezone support for online data

Online data is read in and UTC is calculated based on the column name. Currently, only CET and CEST are supported, but new timezones can easily be added in function loadOnlineData. Document in the wiki how and where to add support for different timezones in TrackMatcher.
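
For the wiki entry, the mechanism could be illustrated with a small sketch like the one below. It does not reproduce loadOnlineData: the lookup table UTCOFFSET, the helpers zonename and toUTC, and the column header format "Time (CEST)" are assumptions. Supporting a further zone then means adding one entry to the table.

```julia
using Dates

# Fixed UTC offsets of the supported zones; new zones would be added here
const UTCOFFSET = Dict("CET" => Hour(1), "CEST" => Hour(2))

# Extract the zone abbreviation from a column name such as "Time (CEST)" (assumed format)
function zonename(header::AbstractString)
    m = match(r"\(([A-Z]+)\)", header)
    m === nothing && throw(ArgumentError("no time zone in column name: $header"))
    return String(m.captures[1])
end

# Convert a local time stamp to UTC by subtracting the zone offset
function toUTC(localtime::DateTime, zone::AbstractString)
    haskey(UTCOFFSET, zone) || throw(ArgumentError("unsupported time zone: $zone"))
    return localtime - UTCOFFSET[zone]
end

summer = toUTC(DateTime(2019, 7, 1, 14, 30), zonename("Time (CEST)"))
winter = toUTC(DateTime(2019, 1, 1, 0, 30), "CET")
```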

Additional concrete type constructors within new type tree

Introduce new concrete types/constructors to the newly introduced type tree (#38).

  • new concrete type Data{T} <: DataSet{T} with fields for FlightDB, SatTrack, and Intersection
  • DataSet constructor for Data to store relevant data and compute intersections in one go
  • new concrete type MeasuredData{T} <: MeasuredSet{T} with fields for FlightDB and SatTrack
  • MeasuredSet constructor for MeasuredData to load all experimental data in one go

Allow more than one intersection find per flight/sat segment

Currently, an intersection is found, by finding the minimum distance between flight and sat track points and calculating an analytic solution by linear interpolation, if the minimum distance is below a threshold. This should yield the correct results for flights with a prevailing eastern or western direction, but is likely to miss intersections for flights with prevailing northern or southern directions. These flights can have more than one intersection per flight segment and it is by pure chance (and depending on the stepwidth parameter), which intersection has the minimum distance.

These intersections might get filtered because of the delay time between the aircraft and satellite passage at the intersection, while correct intersections are not considered. This leads to different results with different stepwidth settings and misses correct intersections.

Therefore, multiple intersection finds must be allowed within a flight or sat segment. However, not more than one intersection should be allowed within Xradius, which is important for flights parallel to a satellite track.

Introduce type tree

Introduce a type tree for self-defined types:

  • abstract type DataSet{T<:AbstractFloat} with concrete type Data{T} <: DataSet{T}
  • abstract type MeasuredSet{T} <: DataSet{T} with concrete type MeasuredData{T} <: MeasuredSet{T}
  • abstract type PrimarySet{T} <: MeasuredSet{T}
  • concrete type DBMetadata{T} <: PrimarySet{T}
  • abstract type FlightSet{T} <: PrimarySet{T} with concrete type FlightDB{T} <: FlightSet{T}
  • abstract type FlightTrack{T} <: FlightSet{T} with concrete type FlightData{T} <: FlightTrack{T}
  • concrete type FlightMetadata{T} <: FlightTrack{T}
  • abstract type CloudSet{T} <: PrimarySet{T} with concrete type CloudDB{T} <: CloudSet{T}
  • abstract type CloudTrack{T} <: CloudSet{T} with concrete type CloudData{T} <: CloudTrack{T}
  • concrete type CloudMetadata{T} <: CloudTrack{T}
  • abstract type SecondarySet{T} <: MeasuredSet{T}
  • abstract type SatTrack{T} <: SecondarySet{T} with concrete type SatData{T} <: SatTrack{T}
  • concrete type SatMetadata{T} <: SatTrack{T}
  • abstract type ComputedSet{T} <: DataSet{T}
  • abstract type Intersection{T} <: ComputedSet{T} with concrete type XData{T} <: Intersection{T}
  • concrete type XMetadata{T} <: Intersection{T}
  • promote Float in intersection to Float of primary and secondary dataset

Closes #35.
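
A reduced sketch of how part of this tree could be declared in Julia is given below. Only a few branches are shown, the struct fields are placeholders, ComputedSet is assumed to be the supertype of Intersection, and the helper xtype is a hypothetical name for the float promotion mentioned in the last item. The bound T<:AbstractFloat is repeated on every subtype so that each declaration satisfies the constraint of its parent.

```julia
abstract type DataSet{T<:AbstractFloat} end
abstract type MeasuredSet{T<:AbstractFloat} <: DataSet{T} end
abstract type PrimarySet{T<:AbstractFloat} <: MeasuredSet{T} end
abstract type FlightSet{T<:AbstractFloat} <: PrimarySet{T} end
abstract type FlightTrack{T<:AbstractFloat} <: FlightSet{T} end
abstract type SecondarySet{T<:AbstractFloat} <: MeasuredSet{T} end
abstract type SatTrack{T<:AbstractFloat} <: SecondarySet{T} end
abstract type ComputedSet{T<:AbstractFloat} <: DataSet{T} end
abstract type Intersection{T<:AbstractFloat} <: ComputedSet{T} end

# Concrete leaf types with placeholder fields (the real structs hold more data)
struct FlightData{T<:AbstractFloat} <: FlightTrack{T}
    lat::Vector{T}
    lon::Vector{T}
end

struct SatData{T<:AbstractFloat} <: SatTrack{T}
    lat::Vector{T}
    lon::Vector{T}
end

struct XData{T<:AbstractFloat} <: Intersection{T}
    lat::T
    lon::T
end

# Hypothetical helper: float type of an intersection promoted from both input datasets
xtype(::PrimarySet{T1}, ::SecondarySet{T2}) where {T1<:AbstractFloat,T2<:AbstractFloat} =
    promote_type(T1, T2)

flight = FlightData(Float32[50.0, 50.1], Float32[10.0, 10.2])
sat = SatData([49.9, 50.2], [10.1, 10.1])
T = xtype(flight, sat)
X = XData{T}(50.05, 10.1)
```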

Unify units

Unify all units to SI. Currently these different units exist and should be changed to meter:

  • flight altitude in feet (change minimum to 5000)
  • lidar column levels in km
  • distance between coordinates for precision in degrees (see also #7)
  • save data in FlightDB/SatData/Intersection in SI units, e.g. FlightData.data.alt
  • Temperature in °C or K, e.g. for layer top temperature?
  • maxtimediff in minutes or seconds?
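
The length conversions involved are small enough to sketch directly. The helper names ft2m, km2m and deg2m, the spherical Earth radius, and the demo values are assumptions, not existing TrackMatcher functions or defaults.

```julia
const FT_TO_M = 0.3048        # international foot in meters (exact by definition)
const R_EARTH = 6_371_000.0   # mean Earth radius in meters (assumed spherical)

ft2m(ft) = ft * FT_TO_M                 # flight altitude: feet to meters
km2m(km) = km * 1000                    # lidar column levels: km to meters
deg2m(deg) = deg2rad(deg) * R_EARTH     # arc length along the equator: degrees to meters

alt_m = ft2m(15_000)       # an altitude of 15000 ft
level_m = km2m(15)         # a lidar level at 15 km
step_m = deg2m(0.01)       # a step of 0.01 degrees at the equator
```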

Add new option to save no auxiliary satellite data in the vicinity of intersection

Currently, data must be loaded on the same system where the intersections are calculated, because additional satellite data is loaded from the hdf files in the vicinity of intersections. To get the additional data, absolute folder paths are stored in a dictionary, which would need to be identical on a different system.

To allow data loading and the intersection calculation on different systems, allow using only SatData and switch off the saving of additional satellite data by introducing a new keyword to Intersection. Possibly combine the new option with savesecondsattype.

Additionally, add a flag to use relative or absolute folder paths to make loading and calculating intersections on different systems possible.

Clean up satellite database

Currently, CLay and CPro data are stored in SatDB and one or the other in Intersection. Handling which data is processed is tedious. Overhaul the routines to save both datasets in the output and make it easier to specify which dataset (CLay or CPro) is used for data processing.

In particular:

  • save SatDB as sat field of Intersection, not Union{CLay,CPro}
  • use a kwarg, which dataset to use for data processing, i.e. finding intersections
  • save ±15 data points from intersection of CLay and CPro data stored in SatDB as field of Intersection (solves #6)
  • document changes in manual

Rename feature

We should move away from the CALIOP nomenclature for feature. The kwarg and DataFrame header feature should be renamed to something more memorable with a name connected to the atmospheric condition. Possible names are (list is open for new suggestions):

  • condition
  • atmos
  • atmoscond
  • atmoscondition
  • atmosstate
  • atmos_state
  • meteo
  • meteorology
  • meteocond
  • meteocondition

Give unique database ID to online data

Consecutive integers are assigned to dbID, which changes the dbID of flights depending on how much data is read in and in which order. Therefore, a more persistent ID should be chosen, such as ORIG-DEST-<start-time> or FlightNo-<start-time>.
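
A minimal sketch of such an ID, assuming origin, destination and start time are available per flight: the function name flightID, the time format and the demo flights are hypothetical. The point is that the ID depends only on the flight itself, not on the loading order.

```julia
using Dates

# Build an ID of the form ORIG-DEST-<start-time> (time format is an assumption)
flightID(orig::AbstractString, dest::AbstractString, start::DateTime) =
    join((orig, dest, Dates.format(start, dateformat"yyyymmddHHMM")), "-")

flights = [("FRA", "JFK", DateTime(2019, 3, 5, 7, 45)),
           ("MUC", "LHR", DateTime(2019, 3, 5, 9, 10))]

ids = [flightID(f...) for f in flights]
# Loading the same flights in reverse order yields the same set of IDs
ids_reversed = [flightID(f...) for f in reverse(flights)]
```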

Add source to MetaData of Intersection

In the MetaData of Intersection add a field source that specifies whether the data is taken from the VOLPE inventory, the FlightAware archive or the FlightAware online data.

  • Append MetaData by field source
  • Append error message, when Intersection fails to load by source

Don't save interpolated satellite data at Intersection

Currently struct Intersection holds the 'exact', i.e. interpolated, lat/lon values of the intersection between satellite and flight tracks as well as the overpass time of the aircraft and the satellite.

For the satellite, no interpolated times should be used as interest focuses on the satellite recordings which happened at specific locations and times. Therefore the times of the recordings in the vicinity of the intersection should be recorded rather than the interpolated time at the exact intersection.

A decision needs to be made whether lat/lon values of the satellite recordings need to be saved in struct Intersection as well or whether time only is sufficient. Furthermore, it should be decided, whether times and/or positions of all satellite recordings in the inner flight path (±3 recordings) and outer flight path (±7 recordings) should be saved as described in the 2016 Nature Communications paper by Tesche et al.

Store only time, lat/lon, and file name in CPro/CLay

To reduce struct sizes and computation time, only store time, lat/lon, and the filename of the CALIOP hdf file in structs CLay and CPro. Read in further relevant data of the respective files in column filename only if an intersection between the flight and sat track was found.
From the additional data, use the Atmospheric Volume Description / Feature Classification Flags to determine the presence of pre-existing clouds.

Unify data struct names

Use a similar naming scheme for all track-related data structs, one or the other:

  • FlightData, CloudData, SatData
  • FlightTrack, CloudTrack, SatTrack

Possibly the first option to introduce no breaking changes on master unless there is a good reason, why Track would be a much better name.

Refine flight data handling

The below flight trajectory demonstrates current limitations of the flight data handling:

  • The flight track should be split in different parts with the outer left and right parts using latitude as x values for interpolation and the middle part using longitude as x value for interpolation.
  • Values outside the latitude interval [-80:80] should be ignored as they don't get monitored by CALIPSO.

f23946_overview.pdf

Check for correct database input

When loading FlightData in FlightSet with the keywords inventory, archive or onlineData, check that the data passed to each keyword is of the corresponding type. Otherwise, throw an error before running into other errors that users cannot relate to the cause.

Save non-essential data to CLay

Save the following additional data in struct CLay:

  • cloud top/base altitude
  • layer top temperature
  • ice water path
  • tropopause height
  • feature optical depth 532nm
  • feature classification flag
  • day/night flag
  • profile number (?)
  • horizontal averaging

If only 5km horizontal averaging is used, does this variable need to be stored? Or should it be made optional, which maximum horizontal averaging is allowed?

Check, whether CPro can use the same fields (names) for the profile data and save data as SatData (#13).

Document errors/warnings in wiki

In the wiki, make a section listing all warnings/errors and possible solutions how to overcome them.

In particular, document the following issues:

  • Solutions to current warnings/errors
  • Note that Intersection should use the same precision for Float as the highest precision in the flight or sat data

Correct time interpolation

Currently, time is interpolated by using the same x data as used for y-data interpolation, i.e. lat for satellite data and lat or lon for flight data depending on the prevailing flight direction. However, time depends on lat/lon pairs and cannot be interpolated just against one coordinate.
For accurate time interpolation, either determine the time at the intersection from the next earliest measured coordinate pair and the speed of the aircraft or by linear interpolation from the two closest points and the distance to them.
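
The second option, linear interpolation from the two closest points weighted by distance, could be sketched as follows. The function timeat, the NamedTuple layout of the track points, and the haversine distance on a spherical Earth are assumptions made for illustration.

```julia
using Dates

const R_EARTH = 6_371_000.0  # mean Earth radius in meters (spherical approximation)

# Great-circle distance in meters between two lat/lon points given in degrees
function haversine(lat1, lon1, lat2, lon2)
    φ1, φ2 = deg2rad(lat1), deg2rad(lat2)
    Δφ, Δλ = φ2 - φ1, deg2rad(lon2 - lon1)
    a = sin(Δφ / 2)^2 + cos(φ1) * cos(φ2) * sin(Δλ / 2)^2
    return 2 * R_EARTH * asin(min(1.0, sqrt(a)))
end

# Time at (lat, lon), interpolated between the measured points p1 and p2
# according to the distance of the intersection to each of them
function timeat(lat, lon, p1, p2)
    d1 = haversine(lat, lon, p1.lat, p1.lon)
    d2 = haversine(lat, lon, p2.lat, p2.lon)
    d1 + d2 == 0 && return p1.time            # both points coincide with the intersection
    frac = d1 / (d1 + d2)                     # 0 at p1, 1 at p2
    Δt = Dates.value(p2.time - p1.time)       # time difference in milliseconds
    return p1.time + Millisecond(round(Int, frac * Δt))
end

# Demo: two track points on the 10°E meridian, 20 minutes apart
p1 = (lat=48.0, lon=10.0, time=DateTime(2019, 1, 1, 12, 0))
p2 = (lat=52.0, lon=10.0, time=DateTime(2019, 1, 1, 12, 20))
tX = timeat(49.0, 10.0, p1, p2)
```

The intersection in the demo lies a quarter of the way from p1 to p2, so the interpolated time falls about five minutes after the first measurement.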
