hove-io / transit_model Goto Github PK

Managing transit data with Rust

License: GNU Affero General Public License v3.0

Rust 99.81% Makefile 0.19%

navitia ntfs gtfs transit rust gtfs-conversion netex transit-model gtfs2netexfr netexfr

transit_model's Introduction

transit_model

transit_model is a Rust crate to manage, convert and enrich transit data.
This is done by implementing the NTFS model (used in navitia).

This repository regroups crates that offer enabler-libraries and binaries to convert and enrich transit data.

Additionally, transit_model is itself a library providing various functionalities. Please refer to the code, examples and documentation to discover them.

Please check documentation attached to each crate:

binary gtfs2netexfr converts GTFS data format into NeTEx-France data format.
binary gtfs2ntfs converts GTFS data format into NTFS data format.
binary ntfs2gtfs converts NTFS data format into GTFS data format.
binary ntfs2netexfr converts NTFS data format into NeTEx-France data format.
binary ntfs2ntfs checks and cleans a NTFS dataset.
binary restrict-validity-period restricts the validity period of a NTFS dataset and purges out-of-date data.

Usage with Docker

For all the binaries mentioned above, it is also possible to use them with Docker. All the binaries are part of the image navitia/transit_model which is tagged alongside the crate transit_model. Let's use gtfs2ntfs as an example.

mkdir output-ntfs/
docker run \
	--volume "${PWD}/tests/fixtures/gtfs:/gtfs" \
	--volume "${PWD}/output-ntfs:/ntfs" \
	navitia/transit_model \
	gtfs2ntfs \
	--input /gtfs \
	--output /ntfs

Setup Rust environment

transit_model is developed in Rust.

If you want to contribute or install binaries, you need to install a Rust environment: see rustup.rs

PROJ dependency

Based on PROJ, the proj crate allows the transformation of localization coordinates.

Some transit_model's crates (see each documentation) use PROJ.
So it must be installed on the system to compile and use those crates.

PROJ for binaries

The proj crate requires PROJ.

If your system has pkg-config and a sufficiently new version of PROJ installed, it will be used. Otherwise, the crate falls back to building PROJ from source, which requires some build time dependencies.

To install PROJ build time dependencies, you can execute the following command (On Debian systems):

make install_proj_deps

You can also install the required PROJ version system-wide to avoid full rebuild (ex: cargo clean):

make install_proj

PROJ installation instructions may help, too.

Using PROJ and transit_model as a developer

proj crate is a binding to the C library.

PROJ is configured as a feature of the transit_model crate.
So to use it for coding, the proj feature must be activated (cargo build --features=proj).
Then specific code should be conditionally enabled with #[cfg(feature="proj")].

NTFS Level of Support

transit_model is supporting most of NTFS format.
From the standard, some of the functionalities are not fully supported:

No support for Line Groups (files line_groups.txt and line_group_links.txt).
The field trip_short_name_at_stop in stop_times.txt introduced in version v0.10.0 (see NTFS changelog in French) is not supported.

Contributing

Please see CONTRIBUTING to know more about the code or how to test, contribute, report issues.

License

Licensed under GNU Affero General Public License v3.0

transit_model's People

Contributors

Stargazers

Watchers

transit_model's Issues

Include the build of Proj.4 inside the Cargo build

With Cargo, it is possible to execute some tasks before the compilation of the project thanks to the build.rs file (see documentation). It would be possible to compile and install Proj.4 (maybe asking for validation of the user before executing the task?) before the compilation of transit_model. Since Proj.4 is a AutoTools project, this Rust helper for AutoTools might be useful to compile it.

Proj4 v6.1.0 breaks the build

Between version v6.0.0 and v6.1.0 of Proj4, the conversion from EPSG:28992 to EPSG:4326 (aka WGS84) used for KV1 doesn't provide exactly the same results (millionth precision). We should define what version of Proj4 is officially supported in transit_model.

companies.txt: contributor_id attribute in data but not in spec

Some (all?) of companies.txt seems to have a contributor_id attribute not specified in the spec
https://github.com/CanalTP/navitia/blob/dev/documentation/ntfs/ntfs_fr.md#companiestxt-requis

Add the ability to read a zip

Hi here!

It would be quite nice to be able to directly read a .zip since it's usually the packaging of a GTFS.
In the same spirit, it might be helpful to read directly an url.

Would you be interested in such capabilities ?

It might need a bit of a refactor in the reader, I think we'll need to change all the utilities that takes a AsRef<path::Path> by feeding them a std::io::Read (a bit like it's done in gtfs-structure.

As a side effect, would you be interested by a mean to have real time discussion ? It might be easier to exchange. We could set up a gitter channel or use IRC.

See you,
Antoine

FareZone on StopPoints

Actually, FareZone in navitia_models and the NTFS are available on both StopPoints and StopArea.
This issue is a reminder to remove the FareZone on the StopArea :

in navitia_models
in the NTFS specs

Use cargo worskpace

I propose a new file structure based on cargo workspaces

.
├── Cargo.lock
├── Cargo.toml
├── README.md
├── examples
├── LICENSE
├── libs
│   ├── collection
│   ├── model-builder
│   ├── relations
│   └── transit_model_procmacro
├── src
└── tests

Cargo.toml

[package]
....


[workspace]
members = [
    ".",
    "./libs/collection",
    "./libs/relations",
    "./libs/transit_model_procmacro",
    "./libs/model-builder"
]

[dependencies]
...
transit_model_procmacro = { version = "0.1.0", path = "./libs/transit_model_procmacro" }
transit_model_collection = { path = "./libs/collection" }
transit_model_relations = { path = "./libs/relations" }

...

This will allow to:

keep libraries in a separate folder
run all tests with cargo test --all

Refactor adding prefix in GTFS read

That's a bit invasive as is. The idea is to do everything without prefix, and before creating PtObjects, adding the prefix everywhere, with something like fn add_prefix<T: AddPrefix>(&mut CollectionId<T>, &str)

contributors.txt and datasets.txt are required by NTM, but that's not what the spec say

gtfs2ntfs : new way to generate prefixes ?

Hello,

since v0.25.0, trip's prefixes are not handled the same way as before :

v0.24.0

gtfs2ntfs -i input/ -o output -c config.json -p cfd

T2C11821953316814874|13934463:T100|13:56:00 -> cfd:T2C11821953316814874|13934463:T100|13:56:00

v0.25.0

gtfs2ntfs -i input/ -o output -c config.json -p cfd

T2C11821953316814874|13934463:T100|13:56:00 -> cfd:clermo:T2C11821953316814874|13934463:T100|13:56:00

gtfs2ntfs adds 6 characters to prefix for trips. It seems inconsistent since stop ids are prefixed only with the prefix given as before.

Seems to be related to this change

Coordonnées WGS

La solution exporte tous les coordonnées dans un projection Lambert https://epsg.io/2154
Ca ne fonctionne pas au dehors de la France.
Il serait mieux d'utiliser les coordonnées WGS (au moins en addition)

Maintenant:

<gml:pos srsName="EPSG:2154">0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003026405013 0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007725596546446257</gml:pos>

Exigé:

				<Centroid>
					<Location>
					<Longitude>3.026405013</Longitude>
					<Latitude>77.25596546446257</Latitude>
					</Location>
				</Centroid>

Ou:

3.026405013
77.25596546446257
<gml:pos srsName="EPSG:2154">0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003026405013 0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007725596546446257</gml:pos>

[comment.txt] wrong fields interrupt bina abruptly

Concerned file of NTFS:
https://github.com/CanalTP/navitia/blob/dev/documentation/ntfs/ntfs_fr.md#commentstxt-optionnel

2 problems:

comment_type management
comment_value/comment_name conflict

comment_type

The value of this field is standard on at least one of our NTFS (stif dev). And the bina fails abruptly in mimir.
The field is required in nav_model, while optional in the doc, in navitia2 (don't know about fusio)

I don't know what should be the behavior:

enforce it
ignore the field
ignore the whole line
ignore the whole file ( )
update NTFS doc
patch fusio

comment_value / comment_name conflict

Probably an outdated documentation problem.
The field is named comment_name in navitia, and it's probably the same in fusio as we get it in NTFS files.
The field is named comment_value in the doc (and navitia_model sticked to that).

The simple correction would be to update doc and navitia_model.
Although the name is not super clear, so we might think of changing it in navitia and fusio. But as it is required, it would break retrocompat of files...

comments.txt reading

Anyway, on all that, my mind is that rejecting the whole bina for this comment file is too abrupt.

gtfs - write of stop_code

The GTFS stop_code property should be read as an alternative object_code for the stop with the
object_system set to gtfs_stop_code.

When writing the model as a GTFS, the gtfs_stop_code should be written in the stops.txt as a stop_code.
If there are several (due to GTFS merge for instance), juste take the first one for the moment :)

FR: prefixes dans les id

Pour l'utilisation au dehors de la France. Il devrait être possible t'assigner un prefix, qui on choisi.

example:
maintenant:
exigé:

merge-ntfs

if i have multiple NTFS/coverage, what should i do ?
i can't send multiple NTFS to tyr-api, because it clean database, and then insert.
thanks

Do gtfs2netexfr handle the pathways and levels GTFS extensions ?

From what I have seen, you have at least partially added the support for the pathways and levels GTFS extension (#398 and #678)
I am wondering if the gtfs2netexfr crate is able to generate a netex file containing accessibility informations if such an information is present in the GTFS input ?

Thank you for your help !

GTFS read - no warning if a trip refers to an inexisting service_id

If the trips.txt file contains a trip referencing a service_id that doesn't exist in calendar_dates.txt or calendar.txt, there should be a warning returned.

[tech] Improve ignored doc-tests

After #782 (comment)
Some doc-tests are ignored, which is a shame.
The main point is that we want to avoid a heavy (compile-time) dependency for the gain (reqwest was removed for that, and deps on async frameworks are heavy).

We have 4 options on that :

No http-client deps
a. let it as-is (ignored)
b. use file read + comment that it could be reqwest
We can check https://blog.logrocket.com/the-state-of-rust-http-clients/ or https://users.rust-lang.org/t/lightweight-alternative-for-reqwest/33601/2 for lighter http clients (ureq, http_req).
Or add reqwest (with some internal TLS support to avoid system deps?) back to check if it's OK for compile times.
a. add dev-deps and remove ignore in tests
b. add complete deps (and real url-read fn for convenience as it's the most common use case)

I will probably have a try on option 2.a (maybe not short-term), but if anyone wants to try some of the options listed or others, please do!

Also, any feedback from users outside Kisio Digital (ex-CanalTP) is more than welcome on the usefulness of this 🙏

Update Readme

We could have a better Readme

remove references to binaries once #316 is merged
complete the list of features (maybe add some examples in ./examples)

`Collection[WithId]::new()` could take any kind of iterable instead of a `Vec`

Today, Collection::new() and CollectionWithId::new() takes a Vec<T: Id<T>> as an input argument. But technically, nothing should prevent us to construct a Collection or a CollectionWithId from a HashSet or a HashMap::values().

This would mean change the CollectionWithId::new() method from (and similar for Collection)

pub fn new(v: Vec<T>) -> Result<Self>;

to something like

pub fn new(v: I) -> Result<Self>
where
	I: IntoIterator<Item = T>;

Allow reading GTFS csv files with a flexible option for columns number

Hi,

I am Francis Chabouis, working as a developper for transport.data.gouv.fr.
We are using gtfs2netexfr to generate NeTEx files from the GTFS available on our plateform.

For the moment I didn't see an option to read GTFS files with a flexible option, meaning a way to convert GTFS files containing CSV with a non-constant number of columns.

It is unfortunately possible that provided CSV files have missing columns on some rows, and we would like gtfs2netexfr to be able to handle such cases.

We have an issue on our project, with a an example of such a GTFS.

Thank you and let me know if you need further explanations on this request.

Rust edition 2018

Is there something blocking the move to rust 2018 ?

I think it would be great to move to this.

To reduce the chance to have conflict, it might be nice to wait until there are no on going PR.

I'm willing to do it if it's ok for you.

Add more flexibility to read inputs

previously there was a method gtfs::read_from_url that was quite handy to read a gtfs from the network (and is used by https://github.com/etalab/transpo-rt/) but the dependency to reqwest was slowing down the compilation time of transit_model without being directly used by it.

It makes however the upgrade to the newest version of transit_model impossible for transpo-rt (because FileHandler and gtfs::read are not public).

I see several way to make this possible

A: put back the reqwest dependency

👍 easiest solution
👍 can be used by kisio too
👎 build time

B: put back the reqwest dependency behind a feature

👍 easy solution
👍 can be used by kisio too
👎 👎 build time for ci (if ci is run for both feature, else 👍 )
👍 build time for other crates using transit_model without feature (like tartare-tools)

C: expose a way to do the `reqwest` call externally

I think we could expose a gtfs::from_ziped_read (not fan of the name if you have suggestion)

mod gtfs {

pub fn from_ziped_read<P: AsRef<Path>>(
    reader: R,
    source_name: &str,  // it's a bother but `source_name` is useful here to have nicer error msg
)-> Result<Model> 
where
    R: std::io::Seek + std::io::Read {
    let mut file_handle = read_utils::ZipHandler::new(reader, &source_name)?;
    read(&mut file_handle, config_path, prefix)
}
}

and in another crate (like transpo-rt)

fn read_url(url: &str) -> Result<std::io::Cursor<Vec<u8>>> {
    let mut res = reqwest::get(url)?;
    let mut body = Vec::new();
    res.read_to_end(&mut body)?;
    Ok(std::io::Cursor::new(body))
}

pub fn main() -> Result<()> {
    let url = "http://pouet.zip";
    let r = read_url(&url)?;
    let model = gtfs::from_ziped_read(r, &url)?;
}

(it could be added to transit_model doc to make it easier)
👎 more code but not too complex
👍 can be used for other stuff (like a POC for WASM done by @datanel )
👍 build time

what do you think about this? I'd personally vote for B + C 😜

Add type for date

as specified in https://github.com/CanalTP/navitia/blob/dev/documentation/ntfs/ntfs_fr.md#format-des-donn%C3%A9es

we should deserialize dates (format YYYYMMDD) and time (format HH:MM:SS) to an appropriate type

[gtfs2netexfr] Help needed to pinpoint cause of "unused lines" pruning

Hello!

I am a member of transport.data.gouv.fr, reaching out because we are investigating on the result of a GTFS to Netex conversion (etalab/transport-site#1864) for which I'd need a bit of help if possible, and there is a maintenance contract if I understand well (although I do not know the exact terms).

When running the converter against the 2 GTFS resources available here, the resulting Netex contains a single "line", whereas the producer includes many in their GTFS.

As documented at the bottom of etalab/transport-site#1864, I have dived into the Rust source (with logs/breakpoint debugging etc) and I have seen that the converter optimises the output to remove lines which are not referenced via routes, themselves removed if no service/calendar refers to them.

I have not yet been able to fully trace if the GTFS is faulty, or if the converter has excessive pruning for some reason (the former is more likely than the latter I believe).

Before I try to mount the data into a database and cross-check, I wondered if there are tricks in the converter to help pinpoint the issue, if there are known caveats that could explain an excessive pruning maybe, or if you could recommend tools to make the analysis faster/easier?

Thanks in advance!

gtfs - use read_config from read_utils

PR #94 introduce a new read_config function.
signature is different, and implies a little refacto of the gtfs read module to use it.

First remarks on code

@TeXitoi To be discussed, as there's no PR to discuss on.

https://github.com/CanalTP/navitia_model/blob/9327364e29645194cb79866609131e657dff2be8/src/lib.rs#L79
I'd name it Context more than PtObjects
https://github.com/CanalTP/navitia_model/blob/9327364e29645194cb79866609131e657dff2be8/src/lib.rs#L10
:) not able to read or maintain that... We have to take a course on that! Maybe clearer (longer) macro param names (what is ty for?)?
https://github.com/CanalTP/navitia_model/blob/09d7b0d4913b8df7469977c7fa0d4e8baf5b60e5/get-corresponding-derive/src/lib.rs#L12
Again the naming is not super helping, or missing comments (also the case for get-corresponding-derive/src/lib.rs in general)?

Create helpers to ease unit testing

This crate is well tested (good job 🥇 💖 !) but mainly with integration tests, not unit tests.

For our crate https://github.com/etalab/transpo-rt we also started with integration tests, but I plan to add unit tests too.

To ease unit tests creation it would be nice to have a helper to create datasets (a bit like https://github.com/CanalTP/navitia/blob/dev/source/ed/build_helper.h#L216)

@Tristramg implemented one in rust for his CSA crate.

Would you be interested by such a feature ?

If so, since it's not a trivial problem, I think it would be nice to design at first the end use of such a builder in this issue.
Do you have an idea on how you want to write the tests ?

Note: I can't spend too much time on this, we should try to have a minimal design if you can't spend time on implementing it too.

SiteConnection id not unique

The ID of SiteConnection is built from the two relevant StopPlaces. However, it is possible to have multiple Connections between two StopPlaces (as it seems). This may be a problem in the GTFS, but it results in an id conflict in the resulting. Or some information was lost. In any case in the resulting correspondances.xml file, there should only be one SiteConnection element with a given id.

				<SiteConnection id="FR:SiteConnection:8100042_8100042:" version="any">
					<WalkTransferDuration>
						<DefaultDuration>PT1800S</DefaultDuration>
					</WalkTransferDuration>
					<From>
						<StopPlaceRef ref="FR:StopPlace:Navitia_8100042:">
						</StopPlaceRef>
						<QuayRef ref="FR:Quay:8100042:">
						</QuayRef>
					</From>
					<To>
						<StopPlaceRef ref="FR:StopPlace:Navitia_8100042:">
						</StopPlaceRef>
						<QuayRef ref="FR:Quay:8100042:">
						</QuayRef>
					</To>
				</SiteConnection>
				<SiteConnection id="FR:SiteConnection:8100042_8100042:" version="any">
					<WalkTransferDuration>
						<DefaultDuration>PT1800S</DefaultDuration>
					</WalkTransferDuration>
					<From>
						<StopPlaceRef ref="FR:StopPlace:Navitia_8100042:">
						</StopPlaceRef>
						<QuayRef ref="FR:Quay:8100042:">
						</QuayRef>
					</From>
					<To>
						<StopPlaceRef ref="FR:StopPlace:Navitia_8100042:">
						</StopPlaceRef>
						<QuayRef ref="FR:Quay:8100042:">
						</QuayRef>
					</To>
				</SiteConnection>
				<SiteConnection id="FR:SiteConnection:8100042_8100042:" version="any">
					<WalkTransferDuration>
						<DefaultDuration>PT1800S</DefaultDuration>
					</WalkTransferDuration>
					<From>
						<StopPlaceRef ref="FR:StopPlace:Navitia_8100042:">
						</StopPlaceRef>
						<QuayRef ref="FR:Quay:8100042:">
						</QuayRef>
					</From>
					<To>
						<StopPlaceRef ref="FR:StopPlace:Navitia_8100042:">
						</StopPlaceRef>
						<QuayRef ref="FR:Quay:8100042:">
						</QuayRef>
					</To>
				</SiteConnection>
				<SiteConnection id="FR:SiteConnection:8100042_8100042:" version="any">
					<WalkTransferDuration>
						<DefaultDuration>PT1800S</DefaultDuration>
					</WalkTransferDuration>
					<From>
						<StopPlaceRef ref="FR:StopPlace:Navitia_8100042:">
						</StopPlaceRef>
						<QuayRef ref="FR:Quay:8100042:">
						</QuayRef>
					</From>
					<To>
						<StopPlaceRef ref="FR:StopPlace:Navitia_8100042:">
						</StopPlaceRef>
						<QuayRef ref="FR:Quay:8100042:">
						</QuayRef>
					</To>
				</SiteConnection>

We used https://data.oebb.at/oebb?dataset=uddi:cd36722f-1b9a-11e8-8087-b71b4f81793a for the transform

gtfs2netexfr installation issue

Hi,

I'm a great newb with rust and cargo
I followed the install procedure by install PROJ and then clang and libssl-dev => that's OK

I tried to install gtfs2netexfr and get this when I lauched the command

~/home/gtfs2netexfr$ cargo install gtfs2netexfr
    Updating crates.io index
error: could not find `gtfs2netexfr` in registry `https://github.com/rust-lang/crates.io-index`

how can I solve this ?
Thanks

Problem of `StopTime`'s attributes when updating `Collections::vehicle_journey`

Context

In Collections, and because the amount of StopTime is usually important, we do not store stop_time_id, stop_time_headsign or stop_time_comment inside the StopTime object. Instead, we have HashMap in Collections to do the link (Collections::stop_time_ids, Collections::stop_time_headsigns, and Collections::stop_time_comments). The key of these HashMaps (identifying uniquely the StopTime) is a tuple (Idx<VehicleJourney>, u32) and the value is a String (id, headsign or comment).

Problem

The problem I see is the key of the HashMap. Today, the key (to link to a StopTime) is done with a Idx<VehicleJourney> and the sequence number of the StopTime. And this key is indeed unique to identify a StopTime. However, the problem is on Idx<VehicleJourney>: it needs to be updated each time we modify Collections::vehicle_journey (reordering or suppression of a VehicleJourney).

Could we think of a solution where an update of Collections::vehicle_journey would automatically update these HashMap links? Because the problem today is that we are forgetting to do these updates which results in bugs.

Possible solution

One naive option I see would be to use a &StopTime has the key of these HashMaps... but this can probably transform into a nightmare because of lifetimes...

gtfs2ntfs - Can not generate zipped ntfs as mentioned in doc

gtfs2ntfs - Can not generate zipped ntfs (for output) as mentioned in doc
https://github.com/CanalTP/navitia_model/blob/master/src/documentation/gtfs2ntfs.md

better object duplicate error

Hi,
When using gtfs2netexfr we have some errors like "identifier 8593027 already exists" (as the last log of a netex generation).
To help the diagnostic, it would be helpful to know the collection having duplicates and maybe explain at a meta level (in the main ?) that the problem blocks the netex generation.

[NTFS] Handling of stop.location_type > 1

currently stops with a location_type equals to 2 or 3 will be ignored.
Reading of the whole ntfs will be aborted if there is a stop_time that use this stop.

This also open the door to the whole ODT thing...

Error reading "./ntfs/feed_infos.txt"

Hello,

I am using the gtfs2ntfs tool for the first time with the latest GTFS 'fr-nw', but unfortunately I get this error message :

Oct 26 08:25:34.264 INFO Writing feed_infos.txt
Error reading "./ntfs/feed_infos.txt"
No such file or directory (os error 2)

My command line is :

gtfs2ntfs --input ./gtfs/ --output ./ntfs

Do you know how I can resolve this problem ?

[refactor] Splitting big modules into multiple files

This thought came from the file src/objects.rs, it's close from 1500 lines and it will grow in the next few weeks. What is the general sentiment about refactoring into multiple files (the extreme being one file by struct and keep the same API by flattening in the mod.rs). An intermediate solution might also be possible if we think of homogeneous groups of objects.
Looking quickly on the web, I couldn't find a lot of Best Practices about that topic (see this discussion on Reddit for example).

Release binaries

When publishing a new release (don't know what is the process currently), it would be nice to publish also the corresponding binaries, at least for Debian.

This is probably doable using Travis which would publish it in GitHub releases, and this would ease the external use (for example as a hand-made preprocess for navitia).

Missing documentation about some APIs

Some APIs of the transit_model are not documented (at least, not in src/documentation, the code itself might be partially documented). Here a (possibly non-exhaustive) list:

transit_model_collections::Collections::try_merge()
transit_model_collections::Collections::restrict_period()
transit_model::apply_rules::apply_rules()
transit_model::hellogo_fares::enrich_with_hellogo_fares()
transit_model::ntfs::filter::filter()
transit_model::merge_stop_areas::merge_stop_areas()
transit_model::transfers::generates_transfers()

All of the above APIs have been implemented with a lot of business rules that are not documented. It would be nice to know and explain why and how they are implemented and what decisions have been made.

Add License

Have realistic tests on GTFS and NTFS format

Considering that most of these devs are based on NTFS spec from navitia,
Considering that NTFS is not perfectly up-to-date,

We should then test with complete/quite-big files produced by fusio and usually consumed by navitia, to check that there are no conflicts.

This may be done directly in CI.

Manage errors

Remove failure

Should we consider removing failure and having custom errors implementing https://doc.rust-lang.org/std/error/trait.Error.html ?

failure wraps errors implementing https://docs.rs/failure/0.1.6/failure/trait.Fail.html not https://doc.rust-lang.org/std/error/trait.Error.html

what we might do

create erros with struct/enum
implement https://doc.rust-lang.org/std/error/trait.Error.html, we might use https://github.com/dtolnay/thiserror
use https://github.com/dtolnay/anyhow

[tech] Easier hello world

The current API to load a model (from a GTFS or NTFS) is honestly quite straightforward and easy. I think we can still improve it a bit and I think a very neat library example can ease adoption.

For the moment to load a Model from a GTFS (and I think it can be applied to the other format) you need to do:

let model = gtfs::read_from_path(input_dir, gtfs::Configuration::default())?;

I think here the gtfs::Configuration::default() could be hidden.
Moreover in most uses of gtfs (like all transit_model's executables) the model is initialized like:

    let model = if opt.input.is_file() {
        transit_model::gtfs::read_from_zip(opt.input, configuration)?
    } else if opt.input.is_dir() {
        transit_model::gtfs::read_from_path(opt.input, configuration)?
    } else {
        bail!("Invalid input data: must be an existing directory or a ZIP archive");
    };

since it's a common use, it could be done for the users.

To ease the canonical way to create a model I think we can:

add a magic function that can read a zip or path (and urls in the futur ? 😻 )
add a builder pattern to hide the configuration

Pseudo code for this

First the usage:

The easiest

let model = gtfs::read("./pouet")?;
// or
let model = gtfs::read("./pouet.zip")?;
// or later
let model = gtfs::read("http://pouet.zip")?; :heart_eyes_cat:

or directly

let model = gtfs::from_zip("./pouet.zip")?;

and with a custom config

let model = gtfs::Reader::default()
  .prefix("IDFM")
  .on_demand_transport(true)
  .read("./pouet")?;

And the gtfs module

mod gtfs {
  pub fn from_zip(p: impl AsRef<Path>) -> Result<Model> {
     gtfs::Reader::default().from_zip(p)
  }
  pub fn from_path(path: impl AsRef<Path>) -> Result<Model> {…}
  pub fn read(path: impl AsRef<Path>) -> Result<Model> {…} // read from a zip or a path
  pub fn from_reader(path: impl AsRef<Path>) -> Result<Model> {…} // see https://github.com/CanalTP/transit_model/issues/737
  
  pub struct Reader {
    configuration: transit_model::gtfs::Configuration
  }
  impl Reader {  
    // The 'builder' functions that 'eats' the `Reader`
    pub fn from(self, path: impl AsRef<Path>) -> Result<Model> {…} // not convinced by the name, would read from a zip or a path
    pub fn from_zip(self, path: impl AsRef<Path>) -> Result<Model> {…}
    pub fn from_path(self, path: impl AsRef<Path>) -> Result<Model> {…}
    pub fn from_reader(self, path: impl AsRef<Path>) -> Result<Model> {…} // see https://github.com/CanalTP/transit_model/issues/737
 
    // all needed functions to configure the transit_model::gtfs::Configuration
    pub fn prefix(self, p: &str) -> Self {…}
    pub fn contributor(self, c: &str) -> Self {…}
    // …
  }
}

What do you think about this? Do you think it is worth it? Any better ideas on the naming?

Improve error message when input file doesn't exist in 'gtfs2ntfs'

Bonjour,

J'ai vu qu'il y avait eu un PR (PR 187) en février ajoutant ce message d'erreur mais je n'arrive pas à récupérer plus d'informations...

Du coup je lance ma commande :
target/release/gtfs2ntfs -i /gtfs/gtfs-test.zip -o /ntfs/ntfs-test.zip

Et j'ai directement un retour :
calendar_dates.txt or calendar.txt not found

Or mon GTFS contient bien un calendar.txt (j'ai testé sur plusieurs GTFS avec tous une origine différente) et ce sont des GTFS qui s'intègrent bien dans mon navitia...

Auriez-vous une idée du soucis ? Pour information je suis sur Proj4 v6.1.0

tests fail without feature

warning: unused import: `super::*`
  --> src/validity_period.rs:68:9
   |
68 |     use super::*;
   |         ^^^^^^^^
   |
   = note: `#[warn(unused_imports)]` on by default

error[E0432]: unresolved import `transit_model::netex_france`
  --> tests/write_netex_france.rs:17:41fs2ntfs(test), transfers(test), apply_r   |(test), filter_ntfs(test), merge_stop_areas(test), gtfs_reader(example)   17 | use transit_model::{self, model::Model, netex_france, test_utils::*};
   |                                         ^^^^^^^^^^^^ no `netex_france` in the root

error: aborting due to previous error

For more information about this error, try `rustc --explain E0432`.
error: could not compile `transit_model`.
warning: build failed, waiting for other jobs to finish...
error: build failed

I think you should run test with and without features on travis.

Tighter typed Ids

The Collectionand Idx are really great to have lots of statically checked stuff.

The transit model has a lot of links between the objects, and for the moment all those links are Strings.
I think navitia_model could be even better if those String ids were also typed.

For example the stop_area_id in the StopPoint could be a Identifier<StopArea> instead of a String.

We could then also remove the get_idx/get method in CollectionWithId<T> to take an &Identifier<T> instead of a &str.

We would also need to find a nice name for this identifier. I don't think we can use the obvious Id because it's already a Trait. Identifier ? Link ? TypedId ? TId ? any other thing ? another option would be to rename the Id trait (maybe to HasId ?)

What do you think ?

Remove `test_utils` from the main build

This project has a nice utility for testing files called test_utils.rs. However, this utility is test only so it should be compiled only for tests... but it's not. Doesn't seem simple to fix the problem, see below.

Unit tests

For Unit tests, we could fix the problem by adding #[cfg(test)] above the line mod test_utils in lib.rs. However, if we do that, it won't be usable in the integration tests which is using a clean build of the library (see documentation).

Note: It seems that tartare-tools is also using test_utils so this solution would also break tartare-tools.

Integration tests

We could move src/test_utils.rs into tests/test_utils.rs as advised in the documentation and use it in integration tests with mod test_utils;. But in this case, the unit tests wouldn't work anymore.

Possible solution

A possible solution would be to extract the test utilities inside an external crate.

hove-io / transit_model Goto Github PK

transit_model's Introduction

transit_model

Usage with Docker

Setup Rust environment

PROJ dependency

PROJ for binaries

Using PROJ and transit_model as a developer

NTFS Level of Support

Contributing

License

transit_model's People

Contributors

Stargazers

Watchers

Forkers

transit_model's Issues

v0.24.0

v0.25.0

A: put back the reqwest dependency

B: put back the reqwest dependency behind a feature

C: expose a way to do the reqwest call externally

Context

Problem

Possible solution

Pseudo code for this

First the usage:

And the gtfs module

Unit tests

Integration tests

Possible solution

Recommend Projects

Recommend Topics

Recommend Org

C: expose a way to do the `reqwest` call externally