
rust-ipfs's Introduction


Rust IPFS

The Interplanetary File System (IPFS), implemented in Rust


Description

This repository is a fork of rust-ipfs, which contains the crates for the IPFS core implementation, including a blockstore and a libp2p integration with DHT content discovery and pubsub support. Our goal is to leverage the unique properties of Rust to create powerful, performant software that works even in resource-constrained environments, while also maximizing interoperability with the other "flavors" of IPFS, namely JavaScript and Go.

Project Status - Alpha

This project is a WIP and everything is subject to change.

For more information about IPFS see: https://docs.ipfs.io/introduction/overview/

Getting started

We recommend browsing the examples and tests in order to see how to use Rust-IPFS in different scenarios.

Note: Tests are a WIP

Running the tests

For information on running tests, please see the archived README. That information may be outdated; this section will be updated in the future.

Contributing

See the contributing docs for more info.

If you have any questions on the use of the library or other inquiries, you are welcome to submit an issue.

Roadmap

Completed API Work

  • Pubsub

For previous completed work, please see the archived readme.

Maintainers

Rust IPFS was originally authored by @dvc94ch and was maintained by @koivunej and @aphelionz, but is now maintained by @dariusc93.

For previous maintainers, please see the archived README.

Alternatives and other cool, related projects

It’s been noted that the Rust-IPFS name and popularity may give its organization a "first-mover" advantage. However, alternatives with different philosophies do exist, and we believe that supporting a diverse IPFS community is important and will ultimately help produce the best solution possible.

If you know of another implementation or another cool project adjacent to these efforts, let us know!

Contributors

For previous/original contributors, please see the archived readme.

License

Dual licensed under MIT or Apache License (Version 2.0). See LICENSE-MIT and LICENSE-APACHE for more details.

Trademarks

The Rust logo and wordmark are trademarks owned and protected by the Rust Foundation. The Rust and Cargo logos (bitmap and vector) are owned by Rust Foundation and distributed under the terms of the Creative Commons Attribution license (CC-BY).


rust-ipfs's Issues

Implement port mapping

The only current method of hole punching is via a relay (with dcutr attempted first before falling back to the relay). However, on many devices it may be possible to use port mapping as the primary choice, if available, before falling back to a relay. This should make direct connections possible without a relay, but some things should be decided before implementing:

  1. Should IGD or NAT-PMP be used? We could use both and fall back to the former if the latter fails.
  2. What libraries should be used? Should we implement them here, use a crate like igd, or use an external library (e.g., libnatpmp)?
  3. Should there be an interface sitting on top of whichever library we use so we can swap implementations in and out to reduce breakage?
  4. What impact would it have on libp2p? During past testing with igd, everything worked fine and peers were able to connect; however, this was done with a fixed address, without catching swarm events on address changes. This needs testing to be sure we can add or remove addresses during those events (also catching loopback addresses so we can ignore them, along with any p2p-circuit addresses).
  5. Should relays also be used alongside port mapping in case port mapping fails for whatever reason (e.g., moving from a network that supports it to one that doesn't)?
  6. (Long term) What impact would this have on QUIC (once rust-libp2p supports the transport)? (May do separate testing.)
  7. Should this possibly be done upstream in libp2p as a separate implementation?

Note:
We should check whether we are behind a NAT prior to attempting port mapping.
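As a sketch of question 3 above, a small interface could sit on top of whichever backend is chosen so implementations can be swapped without breakage. All names below are hypothetical, not part of rust-ipfs:

```rust
/// Hypothetical port-mapping backend interface (illustrative only).
trait PortMapper {
    fn name(&self) -> &'static str;
    /// Attempt to map `internal` to an external port on the gateway.
    fn map_port(&mut self, internal: u16) -> Result<u16, String>;
}

struct NatPmp; // stand-in for a NAT-PMP backend
struct Igd;    // stand-in for an IGD/UPnP backend

impl PortMapper for NatPmp {
    fn name(&self) -> &'static str { "nat-pmp" }
    fn map_port(&mut self, _internal: u16) -> Result<u16, String> {
        Err("gateway does not speak NAT-PMP".into()) // simulated failure
    }
}

impl PortMapper for Igd {
    fn name(&self) -> &'static str { "igd" }
    fn map_port(&mut self, internal: u16) -> Result<u16, String> {
        Ok(internal) // simulated success: same external port
    }
}

/// Try each backend in order; `None` means "fall back to a relay".
fn map_with_fallback(
    mappers: &mut [Box<dyn PortMapper>],
    internal: u16,
) -> Option<(&'static str, u16)> {
    for m in mappers.iter_mut() {
        if let Ok(external) = m.map_port(internal) {
            return Some((m.name(), external));
        }
    }
    None
}

fn main() {
    let mut mappers: Vec<Box<dyn PortMapper>> = vec![Box::new(NatPmp), Box::new(Igd)];
    assert_eq!(map_with_fallback(&mut mappers, 4001), Some(("igd", 4001)));
}
```

A trait like this would let question 1's "use both and fall back" policy live outside any particular backend crate.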

Reference:

Build failed with latest nightly Rust

Hello there!

Could you please consider adding support for the latest nightly version of Rust in your crate? This would greatly benefit developers who rely on nightly features for their projects.

Thank you for your attention to this matter.

To Reproduce

#23 839.6    Compiling rust-ipfs v0.10.4 (https://github.com/dariusc93/rust-ipfs.git?branch=libp2p-next#ced0237b)
#23 845.7 error[E0277]: `core::fmt::rt::Opaque` cannot be shared between threads safely
#23 845.7    --> /usr/local/cargo/git/checkouts/rust-ipfs-8623051e16f56fb9/ced0237/src/task.rs:277:13
#23 845.7     |
#23 845.7 277 | /             tokio::task::spawn(async move {
#23 845.7 278 | |                 debug!("stopping session {}", ctx);
#23 845.7 279 | |                 if let Some(workers) = workers {
#23 845.7 280 | |                     debug!("stopping workers {} for session {}", workers.len(), ctx);
#23 845.7 ...   |
#23 845.7 295 | |                 debug!("session {} stopped", ctx);
#23 845.7 296 | |             });
#23 845.7     | |______________^ `core::fmt::rt::Opaque` cannot be shared between threads safely
#23 845.7     |
#23 845.7     = help: the trait `Sync` is not implemented for `core::fmt::rt::Opaque`, which is required by `{async block@/usr/local/cargo/git/checkouts/rust-ipfs-8623051e16f56fb9/ced0237/src/task.rs:277:32: 296:14}: std::marker::Send`
#23 845.7     = note: required for `&core::fmt::rt::Opaque` to implement `std::marker::Send`
#23 845.7 note: required because it appears within the type `core::fmt::rt::Argument<'_>`
#23 845.7    --> /rustc/f4b771bf1fb836392e1c510a625cdc81be09c952/library/core/src/fmt/rt.rs:75:12
#23 845.7     = note: required because it captures the following types: `&beetle_bitswap_next::client::session_manager::SessionManager`, `u64`, `&{closure@beetle_bitswap_next::client::session_manager::SessionManager::remove_session::{closure#0}::{closure#0}}`, `tracing::field::Iter`, `&FieldSet`, `&Field`, `Field`, `&[&str]`, `core::fmt::rt::Argument<'_>`, `impl futures::Future<Output = tokio::sync::RwLockReadGuard<'_, ahash::hash_map::AHashMap<u64, Session>>>`, `tracing::log::Level`, `tracing::log::Metadata<'_>`, `&tracing::Metadata<'_>`, `&dyn tracing::log::Log`, `tracing::log::Metadata<'_>`, `tracing::field::Iter`, `&FieldSet`, `&Field`, `Field`, `&[&str]`, `core::fmt::rt::Argument<'_>`, `impl futures::Future<Output = tokio::sync::RwLockReadGuard<'_, ahash::hash_map::AHashMap<u64, Session>>>`, `Vec<CidGeneric<64>>`, `impl futures::Future<Output = Vec<CidGeneric<64>>>`, `impl futures::Future<Output = ()>`, `impl futures::Future<Output = tokio::sync::RwLockWriteGuard<'_, ahash::hash_map::AHashMap<u64, Session>>>`
#23 845.7 note: required because it's used within this `async` fn body
#23 845.7    --> /usr/local/cargo/git/checkouts/rust-ipfs-8623051e16f56fb9/ced0237/packages/beetle-bitswap-next/src/client/session_manager.rs:175:71
#23 845.7     |
#23 845.7 175 |       pub async fn remove_session(&self, session_id: u64) -> Result<()> {
#23 845.7     |  _______________________________________________________________________^
#23 845.7 176 | |         debug!(
#23 845.7 177 | |             "stopping session {} ({} sessions)",
#23 845.7 178 | |             session_id,
#23 845.7 ...   |
#23 845.7 189 | |         Ok(())
#23 845.7 190 | |     }
#23 845.7     | |_____^
#23 845.7     = note: required because it captures the following types: `Session`, `usize`, `impl futures::Future<Output = Result<(), anyhow::Error>>`, `session::Inner`, `tokio::task::JoinHandle<()>`
#23 845.7 note: required because it's used within this `async` fn body
#23 845.7    --> /usr/local/cargo/git/checkouts/rust-ipfs-8623051e16f56fb9/ced0237/packages/beetle-bitswap-next/src/client/session.rs:182:43
#23 845.7     |
#23 845.7 182 |       pub async fn stop(self) -> Result<()> {
#23 845.7     |  ___________________________________________^
#23 845.7 183 | |         let count = Arc::strong_count(&self.inner);
#23 845.7 184 | |         info!("stopping session {} ({})", self.inner.id, count,);
#23 845.7 185 | |         ensure!(
#23 845.7 ...   |
#23 845.7 207 | |         Ok(())
#23 845.7 208 | |     }
#23 845.7     | |_____^
#23 845.7     = note: required because it captures the following types: `std::option::Option<Session>`, `impl futures::Future<Output = std::option::Option<Session>>`, `impl futures::Future<Output = Result<(), anyhow::Error>>`
#23 845.7 note: required because it's used within this `async` fn body
#23 845.7    --> /usr/local/cargo/git/checkouts/rust-ipfs-8623051e16f56fb9/ced0237/packages/beetle-bitswap-next/src/client.rs:317:69
#23 845.7     |
#23 845.7 317 |       pub async fn stop_session(&self, session_id: u64) -> Result<()> {
#23 845.7     |  _____________________________________________________________________^
#23 845.7 318 | |         if let Some(session) = self.session_manager.get_session(session_id).await {
#23 845.7 319 | |             session.stop().await?;
#23 845.7 320 | |         }
#23 845.7 321 | |
#23 845.7 322 | |         Ok(())
#23 845.7 323 | |     }
#23 845.7     | |_____^
#23 845.7     = note: required because it captures the following types: `Vec<(futures::futures_channel::oneshot::Sender<()>, tokio::task::JoinHandle<()>)>`, `std::vec::IntoIter<(futures::futures_channel::oneshot::Sender<()>, tokio::task::JoinHandle<()>)>`, `std::option::Option<(futures::futures_channel::oneshot::Sender<()>, tokio::task::JoinHandle<()>)>`, `tokio::task::JoinHandle<()>`, `impl futures::Future<Output = Result<(), anyhow::Error>>`
#23 845.7 note: required because it's used within this `async` block
#23 845.7    --> /usr/local/cargo/git/checkouts/rust-ipfs-8623051e16f56fb9/ced0237/src/task.rs:277:32
#23 845.7     |
#23 845.7 277 |               tokio::task::spawn(async move {
#23 845.7     |  ________________________________^
#23 845.7 278 | |                 debug!("stopping session {}", ctx);
#23 845.7 279 | |                 if let Some(workers) = workers {
#23 845.7 280 | |                     debug!("stopping workers {} for session {}", workers.len(), ctx);
#23 845.7 ...   |
#23 845.7 295 | |                 debug!("session {} stopped", ctx);
#23 845.7 296 | |             });
#23 845.7     | |_____________^
#23 845.7 note: required by a bound in `tokio::spawn`
#23 845.7    --> /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/task/spawn.rs:166:21
#23 845.7     |
#23 845.7 164 |     pub fn spawn<F>(future: F) -> JoinHandle<F::Output>
#23 845.7     |            ----- required by a bound in this function
#23 845.7 165 |     where
#23 845.7 166 |         F: Future + Send + 'static,
#23 845.7     |                     ^^^^ required by this bound in `spawn`
#23 845.7 
#23 845.8 For more information about this error, try `rustc --explain E0277`.
#23 845.8 error: could not compile `rust-ipfs` (lib) due to 1 previous error

Environment (please complete the following information)

  • Operating system, kernel version where applicable: Windows 10 / Debian 12
  • Rust version: 1.79.0-nightly, nightly-x86_64-pc-windows-msvc

Return stream for Ipfs::get_providers

rust-libp2p merged changes in libp2p/rust-libp2p#2712 that would allow get_providers to return in real time when polling the swarm, so what will likely need to be done is to create a stream when calling Ipfs::get_providers and allow one to poll the stream for the peers found providing the key.

Question:

  1. After testing, it looks like I can only call get_providers in libp2p once to receive everything in real time, but only up to a specific point (though I will test again, since I was checking for a specific key in the swarm). Should we continue to only call it once when this is implemented, or should we call it on an interval internally?

The event is emitted until the query finishes early or until GetProvidersOk::FinishedWithNoAdditionalRecord is emitted with ProgressStep::last set to true. In that case, we will stop the stream once the swarm provides this event.
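The consumer side could look roughly like the sketch below, with plain channels and illustrative types standing in for the libp2p swarm events:

```rust
use std::collections::HashSet;
use std::sync::mpsc;

/// Illustrative stand-ins for the libp2p query events described above.
enum ProviderEvent {
    FoundProviders(Vec<String>),
    /// e.g. FinishedWithNoAdditionalRecord with ProgressStep::last == true
    Finished,
}

/// Drain provider events as they arrive, ending the "stream" on the
/// terminal event rather than waiting for the whole query up front.
fn collect_providers(rx: mpsc::Receiver<ProviderEvent>) -> HashSet<String> {
    let mut providers = HashSet::new();
    for event in rx {
        match event {
            ProviderEvent::FoundProviders(peers) => providers.extend(peers),
            ProviderEvent::Finished => break,
        }
    }
    providers
}

fn main() {
    let (tx, rx) = mpsc::channel();
    tx.send(ProviderEvent::FoundProviders(vec!["peer-a".into()])).unwrap();
    tx.send(ProviderEvent::FoundProviders(vec!["peer-b".into(), "peer-a".into()])).unwrap();
    tx.send(ProviderEvent::Finished).unwrap();
    assert_eq!(collect_providers(rx).len(), 2);
}
```

A real Ipfs::get_providers would hand back a futures Stream instead of an mpsc receiver, but the shape of the consumption loop would be the same.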

pubsub: Discover pubsub peers

Currently, for nodes to communicate through a pubsub topic, they have to be connected to a peer who is subscribed to the same topic; however, discovering peers who are part of the topic usually requires manual input on the node. We could instead add a feature that enables discovery of pubsub peers for a given topic (or even namespace). Through this, we could allow discovery through the DHT or some rendezvous point (either the libp2p protocol or maybe a third-party protocol), and possibly automate it so discovery is enabled/disabled when the number of peers for the topic falls below or goes beyond a specific threshold, etc.

Question:

  • Do we wish to make it compatible with kubo?
  • Should we pass a configuration option to Ipfs::pubsub_subscribe or return a struct that allows configuring before awaiting the stream?
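The enable/disable automation mentioned above could be sketched as a small threshold state machine. The names are illustrative, not the rust-ipfs API:

```rust
/// Hypothetical per-topic discovery config: discovery starts when the peer
/// count drops below a floor and stops once it passes a ceiling.
struct TopicDiscovery {
    min_peers: usize,
    max_peers: usize,
    active: bool,
}

impl TopicDiscovery {
    /// Returns whether discovery should be running after observing `peers`.
    fn update(&mut self, peers: usize) -> bool {
        if peers < self.min_peers {
            self.active = true;
        } else if peers > self.max_peers {
            self.active = false;
        }
        // Between the two thresholds the current state is kept (hysteresis),
        // so discovery does not flap on small peer-count changes.
        self.active
    }
}

fn main() {
    let mut d = TopicDiscovery { min_peers: 3, max_peers: 10, active: false };
    assert!(d.update(1));   // below floor: enable
    assert!(d.update(5));   // within band: unchanged
    assert!(!d.update(12)); // above ceiling: disable
}
```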

Issues with Storing and Accessing Large Data (>12-13 Bytes) via Public Gateway

Description

I'm integrating rust-ipfs into a Substrate blockchain to enable decentralized storage capabilities for our nodes. The integration involves using offchain workers to interact with an IPFS node, managed by rust-ipfs, for storing and retrieving data. While testing this setup, I've encountered an issue where I'm unable to access data larger than approximately 12 to 13 bytes through a public IPFS gateway. Smaller data sizes work as expected and are accessible without issues.

Steps to Reproduce

  • Initialize the IPFS node using rust-ipfs with this configuration.
  • Store data on IPFS using the Substrate offchain worker, which interacts with the rust-ipfs instance.
  • Attempt to access the stored data through a public IPFS gateway (e.g., https://ipfs.io/ipfs/).

Expected Behavior

Data of any size, when stored on IPFS using rust-ipfs through our Substrate blockchain integration, should be retrievable via public IPFS gateways.

Actual Behavior

When attempting to access data larger than 12 to 13 bytes through a public gateway, the request fails (504: Gateway Timeout Error). Smaller data sizes are retrievable without any issues.

Additional Information

Rust-IPFS version: forked rust-ipfs
Substrate version: polkadot-v0.9.43

I suspect this might be related to how rust-ipfs handles data chunking or broadcasting of CID announcements to the IPFS network, particularly for larger data sizes. However, I am not entirely sure if the issue lies within the configuration of the rust-ipfs node, the data storage process, or the retrieval/query mechanism.

Request for Assistance

Could you provide insights or recommendations on how to address this issue? Specifically, I am looking for:

  • Confirmation if this is a known issue with rust-ipfs or if it might be related to my integration approach.
  • Any configuration changes or optimizations that could help in successfully storing and accessing larger data sizes via public IPFS gateways.
  • Best practices for debugging and resolving such issues when integrating rust-ipfs with Substrate blockchains.

Thank you for your support and looking forward to your guidance on resolving this challenge.

Request: Open a discussion page

Hello!

Could you open up a discussion page on this repository? Now that the Iroh project has evolved into its own protocol, several people interested in a Rust implementation of IPFS are watching this repository.
It's my understanding that you maintain this repository for your own interests, and I completely respect that; my hope is that, if a discussion page were to be opened, others would be able to assist you with this project.

Kind regards,
Emil

feat: Support both noise and tls

Currently, we only support noise; however, rust-libp2p does have support for tls, but there does not seem to be an easy way of using both at the same time. IntoSecurityUpgrade, which is used as part of SwarmBuilder, is not exposed publicly, which means we would need to either:

  1. Write our own impl that accepts both noise and tls (or supports one, the other, or both)
  2. Rewrite the logic to make use of SwarmBuilder
  3. Port components of IntoSecurityUpgrade over into the transport module.

For 1, we could mimic what IntoSecurityUpgrade does, which would allow us to add a configuration option to support either one (but panic if neither is used, unless we decide to support plaintext). If we go for 2, we would have to rethink the whole flow: how we wish to configure TLS and noise, and whether we wish to move those options out of TransportConfig into a separate configuration.

Implement Connection Management

Currently, connection management is mostly connection limits, implemented in PeerBook using NetworkBehaviour (based on libp2p/rust-libp2p#3386). This behaviour does support whitelisting peers (more of an "allow list") to bypass the limits, but it is basic (and would likely move to a separate behaviour).

Down the road, we will need a realistic connection manager that closes the least useful connections upon reaching a configured limit. It would close connections to peers that are not guarded or protected by a protocol or behaviour such as gossipsub or kad, or protected in PeerBook. We may need to determine the best way of checking a peer's status in behaviours outside of what we manage (e.g., whether the peer is actively subscribed to a topic in gossipsub, part of the routing table in kad, used as a relay, exchanging data via bitswap, etc.). A scoring system would likely need to be implemented in some way.

TODO:

  • Protect peers manually added to PeerBook
    • This is done already, but such peers should not count towards the connection limit; in the current implementation they do.
  • Make sure tagged peers don't count towards the connection limit
  • Tag peers used in specific protocols
    • Peers subscribed to the same topic in gossipsub, untagging when unsubscribed
    • Peers exchanging blocks with bitswap, untagging when the exchange is done or has timed out
    • Peers used as a relay
  • Automatically remove tagged peers that disconnect (though protected peers will remain in place)
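The interaction between protection, tags, and the limit could be sketched like this, with hypothetical types: protected peers are never closed, and the rest are closed least-useful-first (fewest tags):

```rust
#[derive(Clone)]
struct PeerInfo {
    id: &'static str,
    protected: bool, // e.g. manually added to PeerBook; never closed
    tags: usize,     // gossipsub topic, bitswap exchange, relay, ...
}

/// Pick connections to close once over `limit`. Protected peers are filtered
/// out entirely; remaining peers are sorted so the least tagged go first.
fn connections_to_close(peers: &[PeerInfo], limit: usize) -> Vec<&'static str> {
    if peers.len() <= limit {
        return Vec::new();
    }
    let excess = peers.len() - limit;
    let mut candidates: Vec<&PeerInfo> =
        peers.iter().filter(|p| !p.protected).collect();
    candidates.sort_by_key(|p| p.tags); // fewest tags == least useful
    candidates.into_iter().take(excess).map(|p| p.id).collect()
}

fn main() {
    let peers = [
        PeerInfo { id: "a", protected: true,  tags: 0 },
        PeerInfo { id: "b", protected: false, tags: 2 },
        PeerInfo { id: "c", protected: false, tags: 0 },
    ];
    // Over the limit by one: "c" (unprotected, untagged) is closed first.
    assert_eq!(connections_to_close(&peers, 2), vec!["c"]);
}
```

A real scoring system would replace the single `tags` count with weighted per-protocol scores, but the selection shape stays the same.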

Increase files limit

A lot of file descriptors are created when many connections are being made. As a result, this usually produces an error about too many open files. The solution on the system side is to increase the soft limit. This should be done internally within rust-ipfs, probably as a configuration option, so the limit can be increased to a specific amount or to the max hard limit.

Note: There is a crate called fdlimit; however, it does not have a target gate for Android, so it could be used temporarily but would likely need changes for Android. We would also need to investigate Windows calls to increase the file handle limit there.

rlimit - https://crates.io/crates/rlimit
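The "specific amount or max hard limit" decision could be captured in a small helper; actually applying the result would still go through setrlimit (via rlimit/fdlimit) per platform:

```rust
/// Decide the new soft limit for open files: raise to `requested` when given,
/// otherwise to the hard limit; never exceed the hard limit and never lower
/// the current soft limit. Applying it is platform-specific (setrlimit etc.).
fn target_soft_limit(soft: u64, hard: u64, requested: Option<u64>) -> u64 {
    requested.unwrap_or(hard).min(hard).max(soft)
}

fn main() {
    // No request: raise soft limit all the way to the hard limit.
    assert_eq!(target_soft_limit(1024, 1_048_576, None), 1_048_576);
    // Request above the hard limit: clamp to the hard limit.
    assert_eq!(target_soft_limit(1024, 4096, Some(8192)), 4096);
    // Request below the current soft limit: keep the current soft limit.
    assert_eq!(target_soft_limit(4096, 8192, Some(1024)), 4096);
}
```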

Remove MultiaddrWithPeerId et al

It's understood that MultiaddrWithPeerId and similar types were used as wrappers for when the peer id is or is not part of the multiaddr; however, it may be better to stick with just Multiaddr and perform direct checks when needed.
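A direct check on a textual multiaddr might look like the sketch below, splitting off a trailing /p2p/ component when present (a real implementation would match on Multiaddr protocol components rather than strings):

```rust
/// Split a textual multiaddr into (address, trailing peer id), if any.
/// The last `/p2p/` component is used, so a relayed address like
/// `/p2p/QmRelay/p2p-circuit/p2p/QmTarget` yields the target peer.
fn split_peer_id(addr: &str) -> (&str, Option<&str>) {
    match addr.rfind("/p2p/") {
        Some(i) => (&addr[..i], Some(&addr[i + "/p2p/".len()..])),
        None => (addr, None),
    }
}

fn main() {
    let (addr, peer) = split_peer_id("/ip4/127.0.0.1/tcp/4001/p2p/QmFoo");
    assert_eq!(addr, "/ip4/127.0.0.1/tcp/4001");
    assert_eq!(peer, Some("QmFoo"));
    assert_eq!(
        split_peer_id("/ip4/127.0.0.1/tcp/4001"),
        ("/ip4/127.0.0.1/tcp/4001", None)
    );
}
```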

doc: Improve documentations

Currently, documentation within rust-ipfs may be a bit dated or could use improvement to provide clarity on functionality. This includes the README.md, which does not reflect the true state of this library.

Reject peers with/without specific protocols

There may be cases where one would want to reject peers that connect with or without a specific protocol. This may be to whitelist specific protocols (e.g., to disconnect peers with additional or abnormal protocols), or to disconnect peers that don't support the Identify behaviour, etc.

Because rust-libp2p does not provide this information directly, we would have to rely on Identify and the Info injected into PeerBook to cross-check protocols.

Auto relay connection

We should add an option to listen on a relay automatically while behind a NAT. This would allow us to walk the DHT, find any peers that support the relay v2 protocol, and then listen on them automatically. A couple of things might need to be done:

  1. Build up a map to keep track of active peers with the relay protocol
  2. Determine the max number of relays we can and should listen on (e.g., is there any benefit to listening on multiple relays, as can be done manually?)
  3. Only use relays when behind a NAT or firewall; otherwise don't use relays unless the connection is being blocked in some way
  4. Track the ListenerId after calling Swarm::listen_on and use it to remove the listener when there is an issue related to it

A draft should first be implemented in this crate; once it is figured out, it can be split out into its own crate (or implemented directly in libp2p?).

Passthrough gossipsub configuration

Currently, pubsub (whether internal or external) doesn't allow a configuration to be supplied, which restricts what we can and cannot do down the road. IpfsOptions should have a field that is passed to gossipsub, with the pubsub implementation accepting an argument for the configuration, and with external use building gossipsub with the configuration and passing it over using GossipsubStream::from.

Use RemoteProtocolsChange to detect bitswap protocol

In the current implementation, we inject the protocol into the bitswap behaviour to determine which protocol the peer advertises, if any; however, with recent changes to libp2p (0.52, once released), we can use ConnectionEvent::RemoteProtocolsChange in the handler to determine if the peer supports the protocol.

Implement a connection idle

New connections that are established without any activity close almost immediately. E.g., if the node isn't subscribed to the same topic as another peer, or if the DHT is not performing queries, the connection may close right away after being established. In actual use, this is observed more when connecting to another peer over a relay. While this isn't a problem, it might be better to implement a behaviour with a handler that keeps the initial connection alive for a specific, configurable amount of time before allowing the swarm to close the connection handler.

Note:

  • This is a workaround until libp2p/rust-libp2p#4121 is resolved.
  • This will deprecate UninitializedIpfs::enable_keepalive
  • This implementation does not refresh KeepAlive at this time, so once the timer is up, it may close if no other behaviour is keeping the connection open.
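The grace-period logic itself can be as simple as the sketch below; matching the note above, the timer is not refreshed by activity:

```rust
use std::time::{Duration, Instant};

/// Illustrative grace period for fresh connections: the swarm may only close
/// the connection once the configurable deadline has passed.
struct IdleGrace {
    deadline: Instant,
}

impl IdleGrace {
    fn new(grace: Duration) -> Self {
        IdleGrace { deadline: Instant::now() + grace }
    }

    /// True once the grace period has elapsed; the timer is never refreshed,
    /// so afterwards only other behaviours can keep the connection open.
    fn may_close(&self) -> bool {
        Instant::now() >= self.deadline
    }
}

fn main() {
    assert!(!IdleGrace::new(Duration::from_secs(30)).may_close());
    assert!(IdleGrace::new(Duration::from_secs(0)).may_close());
}
```

In a real ConnectionHandler this would drive the KeepAlive value reported to the swarm.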

Implement IPNS

The base of IPNS was removed (see commits df8a12e, af7d88b, etc.), leaving only the dnslink portion intact. Since libp2p has come a long way since then, we should be able to make use of the DHT to put IPNS records.

  • Import newer IPNS proto file
  • Create functions to create, sign (both sigv1 and sigv2), encode and decode the IPNS Record (see #88)
  • Implement functions to publish IPNS to DHT (see #88)
  • Test for go-ipfs and js-ipfs interop
  • Implement IPNS over Pubsub(?) - Low priority and research would be needed

Note: This will be done in a separate crate that would be imported into rust-ipfs
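On the record-handling side, the IPNS spec orders records by sequence number first, and by validity (end-of-life) on a tie. A minimal sketch with an illustrative struct:

```rust
/// Minimal stand-in for an IPNS record; real records also carry the value,
/// validity type, TTL, and the V1/V2 signatures mentioned above.
#[derive(Debug, PartialEq)]
struct IpnsRecord {
    sequence: u64,
    eol_unix_secs: u64, // "validity": end-of-life timestamp
}

/// Record selection per the IPNS spec: the higher sequence number wins;
/// on a tie, the record with the later end-of-life wins.
fn select<'a>(a: &'a IpnsRecord, b: &'a IpnsRecord) -> &'a IpnsRecord {
    if a.sequence != b.sequence {
        if a.sequence > b.sequence { a } else { b }
    } else if a.eol_unix_secs >= b.eol_unix_secs {
        a
    } else {
        b
    }
}

fn main() {
    let old = IpnsRecord { sequence: 1, eol_unix_secs: 2_000 };
    let newer = IpnsRecord { sequence: 2, eol_unix_secs: 1_000 };
    assert_eq!(select(&old, &newer), &newer); // sequence beats validity
}
```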

Make rust-ipfs more platform agnostic

As it stands, rust-ipfs is not exactly platform agnostic, in the sense that it cannot run on every platform without issues (mostly around wasm). rust-libp2p is platform agnostic; although testing may be needed around wasm, wasi, etc., there have not been many reports of issues on those platforms. rust-ipfs, being directly tied to tokio, which does not fully support wasm, would likely have trouble running on such platforms.

Platforms that rust-ipfs can run on:

  • PC (Works on Windows, Linux, Mac. For FreeBSD, OpenBSD and NetBSD, testing coming soon)
  • Android [1]
  • WASI (Not tested)
  • WASM (tested, bug in relay protocol, see libp2p/rust-libp2p#5305)
  • iOS (Not tested)

What should be done in the future:

  • Make rust-ipfs less dependent on tokio to give the user the option to choose between different libraries (e.g., smol, async-std, etc.)
  • Gate any IO operations behind features so as not to conflict with platforms that cannot perform direct IO (e.g., WASM)
  • Test official protocols against WASM in rust-ipfs to determine what does and doesn't work
  • Test interop with js-ipfs and go-ipfs

Notes:

  1. Under the default configuration, DNS would not operate on Android due to its attempt to locate /etc/resolv.conf. As a result, we have to check for the target OS and use the following lines to use the default DNS (which I believe would be Google DNS)
  2. We can use idb or gluesql for the data and block store when targeting wasm to make it easier on us.
  3. We may want to utilize send_wrapper when handling wasm32 libraries, as many that interact with bindings will not support Send or Sync.

bitswap: Bundle multiple requests/blocks into a single message

In the internal implementation of bitswap, wants and haves are split into their own respective sessions on purpose, to allow handling each request on its own and putting peers into those sessions based on the requests made for those blocks; however, this results in multiple messages being sent when multiple blocks are requested. We could optimize this by bundling multiple requests into a single message, assuming we can stay below the max message size (which is 2MB). If a message begins to reach that limit, we could split it up into multiple messages as needed.
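The bundling could be sketched as a simple size-capped chunker over already-encoded request entries (illustrative, not the bitswap wire format):

```rust
/// Bundle per-block request entries (already-encoded bytes) into as few
/// messages as possible without letting any message exceed `max` bytes.
/// For bitswap, `max` would be the 2MB message cap mentioned above.
fn bundle(entries: Vec<Vec<u8>>, max: usize) -> Vec<Vec<Vec<u8>>> {
    let mut messages = Vec::new();
    let mut current: Vec<Vec<u8>> = Vec::new();
    let mut size = 0usize;
    for entry in entries {
        // Start a new message when the next entry would exceed the cap.
        if !current.is_empty() && size + entry.len() > max {
            messages.push(std::mem::take(&mut current));
            size = 0;
        }
        size += entry.len();
        current.push(entry);
    }
    if !current.is_empty() {
        messages.push(current);
    }
    messages
}

fn main() {
    // Three 600-byte entries with a 1300-byte cap: two fit, the third splits.
    let entries = vec![vec![0u8; 600], vec![0u8; 600], vec![0u8; 600]];
    let messages = bundle(entries, 1300);
    assert_eq!(messages.len(), 2);
    assert_eq!(messages[0].len(), 2);
}
```

An entry larger than the cap still gets its own (oversized) message here; a real implementation would have to reject or split such entries.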

Implement graphsync

We currently use bitswap, but extending out to graphsync may help in better exchanging blocks between peers, and graphsync may eventually replace bitswap altogether (possibly with bitswap built on top of graphsync for backwards compatibility).

Use kad client mode instead of optional protocol

Because kademlia does not have a client mode implemented, we cannot stop queries to the node from other peers when connected unless the protocol itself is disabled, which may not always be desirable, since the node may want to query the DHT for information but not be queried itself. Once libp2p/rust-libp2p#2521 (or any related PR, pending libp2p/rust-libp2p#2680) has been merged, remove the Toggle around the protocol and allow it to be enabled completely, with an option to toggle between client and server mode.

relay_manager.random_select() blocking forever

Describe the bug
Under some conditions, the loop in random_select never exits. In my case, there appears to be only one connection for a given peer, and it turns out to be in the blacklist. Since this function is called from my event loop, it essentially kills the app.

To Reproduce
I can't reliably reproduce it, but I believe the issue happens when connections to a relay node drop (temporarily). I'm assuming that I am not correctly pushing in the right events and/or am calling the method at the wrong time.

Expected behavior
I would expect the function to not block, even when used potentially incorrectly.

Environment (please complete the following information)

  • Operating system, kernel version where applicable: MacOS
  • Rust version: 1.77.2
  • Crate version: 0.2.2

Quic State Mismatch Error

After running a local node for a while, QUIC returns the following error:

2023-02-25T13:45:49.595210Z ERROR libp2p_quic::endpoint: State mismatch: event for closed connection

This error references https://github.com/libp2p/rust-libp2p/blob/master/transports/quic/src/endpoint.rs#L493 and https://github.com/libp2p/rust-libp2p/blob/master/transports/quic/src/endpoint.rs#L546. So far there is no strange behaviour or disruption of service, but the actual cause should be investigated by checking the logs.

refactor: Switch to ipld-core

It looks like ipld-core will be the successor to libipld (or libipld-core), and it might be beneficial to migrate from libipld to ipld-core in the near future once more information is gathered.

Refactor IpfsOption

We should provide extended options for the protocols, the ability to disable or enable protocols deemed optional (e.g., dcutr, etc.), and options to configure the swarm, connections, preload values into the protocols, etc.

Make adding data easier

Right now, adding data is rather hard with the API of Ipfs.

We have add_file_unixfs, but it returns a Stream which one has to process, and even then, finding the cid::Cid that refers to the data that has been added is non-trivial.

I would like to have an API where I can add a stream of bytes (not Result<u8>, but actual u8) and get a cid::Cid that refers to the bytes that were just added. For example, something like this:

let cid: cid::Cid = ipfs.add_from_stream(my_bytestream).await?;

Implement garbage collection

The current implementation doesn't have a means of clearing up unpinned blocks. While not exactly a priority, we should have a basic GC implemented to clean up blocks in a separate task or thread, with a lock held to prevent runs from overlapping, especially if there is a manual GC call.

Question:

  1. Should this run automatically? If so, at what interval? go-ipfs runs GC every hour with its flag (not sure if that has changed since I last checked), so should we do the same, with a configuration option or a function to enable/disable it?
  2. Should we have a configuration option to define a max size (e.g., 4GB), with GC automatically running in an attempt to reduce below it? Note, this would only apply to unpinned blocks; pinned blocks would be ignored and would not count towards the size.
  3. Should GC run during graceful shutdown or startup, or should there be an option to allow that?
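A minimal sweep respecting the pinned-blocks rule from question 2 might look like this sketch (string CIDs and in-memory blocks for illustration):

```rust
use std::collections::{HashMap, HashSet};

/// Sweep unpinned blocks until they fit under `max_unpinned` bytes.
/// Pinned blocks are never removed and do not count towards the size.
/// A real implementation would hold the repo lock while sweeping.
fn gc(blocks: &mut HashMap<String, Vec<u8>>, pinned: &HashSet<String>, max_unpinned: usize) {
    let mut unpinned: usize = blocks
        .iter()
        .filter(|(cid, _)| !pinned.contains(*cid))
        .map(|(_, data)| data.len())
        .sum();
    let victims: Vec<String> = blocks
        .keys()
        .filter(|cid| !pinned.contains(*cid))
        .cloned()
        .collect();
    for cid in victims {
        if unpinned <= max_unpinned {
            break;
        }
        if let Some(data) = blocks.remove(&cid) {
            unpinned -= data.len();
        }
    }
}

fn main() {
    let mut blocks = HashMap::new();
    blocks.insert("pinned".to_string(), vec![0u8; 100]);
    blocks.insert("unpinned".to_string(), vec![0u8; 100]);
    let pinned: HashSet<String> = ["pinned".to_string()].into_iter().collect();
    gc(&mut blocks, &pinned, 0);
    assert!(blocks.contains_key("pinned"));
    assert!(!blocks.contains_key("unpinned"));
}
```

Calling it with `max_unpinned = 0` gives the "clear all unpinned blocks" behavior of a manual GC call; a nonzero value gives the threshold-triggered behavior from question 2.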

Support multiple and custom executors

Currently, rust-ipfs only makes use of tokio; however, this makes it hard to switch to another executor one might wish to use, like async-std, etc. This would include spawning our own tasks.

Thoughts:

  1. Expose Executor from rust-libp2p and pass it to Swarm; however, this may limit our ability to spawn tasks on that executor.
  2. Provide a custom executor used as the Executor for Swarm as well as while we are polling the swarm.
  3. Support executors behind a feature. This might give us more control, but may limit allowing custom executors.
  4. We could do both 2 and 3 while leaving room for any external runtime without the futures.

Questions:

  • Do we want to have some form of handle to be returned?
  • Should we require custom transports if a custom executor is set outside of any preset function?

Notes:

  • Would resolve part of #6, but not all of it
  • Would have to work with rust-libp2p.
  • May have to support local spawning as well as spawning blocking tasks (eg for IO operations)
  • For the library to be completely free of a tokio dependency, the bitswap library itself would need to be independent of tokio. This could possibly be done by passing an executor to the behaviour for the task, although tokio-context would need to be removed in that case, or forked to support multiple runtimes. The other option would be to put the runtimes behind a feature.
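A rough illustration of the executor abstraction in thoughts 1–2: a hypothetical `Executor` trait plus a toy blocking implementation. This is only a sketch; a real executor would park on wake rather than busy-poll, and rust-libp2p's actual `Executor` trait differs in detail:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

/// Hypothetical abstraction: rust-ipfs could accept any implementation of
/// this trait instead of hard-coding tokio::spawn.
trait Executor: Send + Sync {
    fn spawn(&self, fut: Pin<Box<dyn Future<Output = ()> + Send>>);
}

/// Toy executor that drives the future to completion on the current thread.
struct BlockingExecutor;

/// Waker that does nothing; sufficient for the busy-poll loop below.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

impl Executor for BlockingExecutor {
    fn spawn(&self, mut fut: Pin<Box<dyn Future<Output = ()> + Send>>) {
        let waker = Waker::from(Arc::new(NoopWaker));
        let mut cx = Context::from_waker(&waker);
        // Busy-poll until ready; a real executor would park on wake.
        while fut.as_mut().poll(&mut cx).is_pending() {}
    }
}

fn main() {
    let exec: Box<dyn Executor> = Box::new(BlockingExecutor);
    exec.spawn(Box::pin(async {
        println!("spawned on custom executor");
    }));
}
```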

bitswap: Reset request upon relay circuit closing and reestablishing

libp2p-relay keeps track of the number of bytes that have been sent through the circuit, and once the amount sent reaches or exceeds the maximum the relay allows for the peer, it closes the circuit. See here and here.

This presents a problem when performing bitswap over a relay: even where the circuit allows a higher number of bytes than the default, it caps out and the circuit is closed, interrupting the bitswap exchange. The circuit is later reestablished (assuming a direct connection has not been established via dcutr); however, if the connection is reestablished by the same peer, the exchange does not continue (whether through the relay, dcutr, or directly), resulting in a timeout in the repo. What we should do instead is, upon the circuit or connection being reestablished, resend the ledger to the peer so that the block exchange can continue.

Note:

  • If the circuit is complete, it would close as well.

Refactor SwarmApi

In a previous discussion, the then-maintainer of rust-ipfs mentioned that SwarmApi was used as a workaround since, at the time, it wasn't clear how welcoming libp2p would be to a CLI use case. A lot of that functionality is no longer needed (and adds unnecessary complexity), with most of it having been pushed into the libp2p Swarm. SwarmApi should therefore be refactored, and likely renamed, so that it only holds information about connections and peers.

Thoughts:

  • Refactor to only store peers' IdentifyInfo, ping RTT, and other details in the behaviour, and possibly clean up, after a set duration, any peers that disconnect or have such a high ping that it would be unreasonable to remain connected
  • Remove SwarmApi completely, move the logic into IpfsFuture, and store/remove entries based on events from the swarm
  • Similar to the above, but split into different behaviours (eg peerbook, autorelay (which was initially going to be part of this behaviour but would be best moved into its own), a basic connection manager, etc)

bitswap: Timeout peers who have not responded

Currently, we send a message with sendDontHave set to true; however, not all bitswap implementations act on this field, and some may ignore it, causing us to wait as a result. What we should do instead is attach a timeout to each peer when we send a request. If a message does not arrive before the timeout, we should pop the peer from the session, or move them into a separate removal queue so that when the session is next polled we can send a cancel request as we remove them; if they respond before this happens, we could move them back into the session.
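The timeout bookkeeping could look roughly like this sketch, where `PeerId` and the `Session` type are simplified placeholders for the real libp2p/bitswap types:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Placeholder for the real libp2p PeerId.
type PeerId = &'static str;

/// Tracks when each peer in a session was sent a request, and evicts
/// those that have not answered within the deadline.
struct Session {
    pending: HashMap<PeerId, Instant>,
    timeout: Duration,
}

impl Session {
    fn sent_request(&mut self, peer: PeerId) {
        self.pending.insert(peer, Instant::now());
    }

    fn received_response(&mut self, peer: PeerId) {
        self.pending.remove(peer);
    }

    /// Peers past the deadline are popped so a cancel can be sent on the
    /// next poll; a late response could re-add them to the session.
    fn expired(&mut self, now: Instant) -> Vec<PeerId> {
        let timeout = self.timeout;
        let expired: Vec<_> = self
            .pending
            .iter()
            .filter(|(_, sent)| now.duration_since(**sent) >= timeout)
            .map(|(peer, _)| *peer)
            .collect();
        for peer in &expired {
            self.pending.remove(peer);
        }
        expired
    }
}

fn main() {
    let mut session = Session {
        pending: HashMap::new(),
        timeout: Duration::from_millis(10),
    };
    session.sent_request("peer-a");
    session.sent_request("peer-b");
    session.received_response("peer-a");
    // Simulate a poll happening after the deadline.
    let later = Instant::now() + Duration::from_millis(20);
    assert_eq!(session.expired(later), vec!["peer-b"]);
    println!("timed out: peer-b");
}
```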

bitswap: Investigate performance speed

Bitswap (or beetle-bitswap) speed is rather low compared to kubo (and possibly js-ipfs/helia) when exchanging blocks between two rust-ipfs nodes; however, the speed is much faster when exchanging blocks with a kubo node. Originally, this could have been attributed to how we process blocks in IpfsTask, but since kubo's bitswap performance is much faster between rust-ipfs<->kubo, this may be more related to how blocks are sent out in the current implementation, or how they are requested from kubo.

Measurements (based on a 256 MB file on a local network):

  • rust-ipfs<->rust-ipfs: ~103s
  • kubo<->rust-ipfs: ~9s
  • rust-ipfs<->kubo: ~38s

Merge commits from archived repo and document status

The /rs-ipfs/rust-ipfs repo is archived, but there are still two commits that could be merged here.

And I see in rs-ipfs#512, which reports the maintenance status, that you are doing "passive maintenance on [your] own fork". Wouldn't it be better to unarchive the original repo and keep maintaining it, while clearly indicating that the maintenance "team" changed because most of its members moved to iroh?

If we were to do this, I would be happy to document project status on the README.

Implement Keystore

The current implementation only allows passing in a single keypair, which is useful for starting the node; however, it does not support a keystore that would allow the creation, storage, and usage of generated keys. This would be helpful for implementing #1, for a node that wishes to select a specific keypair when starting, or possibly for other uses (eg implementing signing or encryption for DAGs, though such specs are not finalized or complete).
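A minimal sketch of what such a keystore could look like, assuming named keys mapped to raw keypair bytes; the method names and the zeroed "key" below are purely illustrative, and a real implementation would store libp2p keypairs, likely encrypted at rest:

```rust
use std::collections::HashMap;

/// Hypothetical keystore: named keys mapped to raw keypair bytes.
#[derive(Default)]
struct Keystore {
    keys: HashMap<String, Vec<u8>>,
}

impl Keystore {
    /// "Generate" a key under a name; a placeholder for real keypair
    /// generation (here just 32 zero bytes).
    fn generate(&mut self, name: &str) -> &[u8] {
        self.keys
            .entry(name.to_string())
            .or_insert_with(|| vec![0u8; 32])
    }

    /// Look up a stored key by name.
    fn get(&self, name: &str) -> Option<&[u8]> {
        self.keys.get(name).map(|k| k.as_slice())
    }

    /// List the names of all stored keys.
    fn list(&self) -> Vec<&str> {
        self.keys.keys().map(|k| k.as_str()).collect()
    }
}

fn main() {
    let mut store = Keystore::default();
    store.generate("node-identity");
    assert!(store.get("node-identity").is_some());
    assert_eq!(store.list(), vec!["node-identity"]);
    println!("keys: {:?}", store.list());
}
```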

repo: Move logic into its own task

Currently, repo operations within flatfs and memory rely on different synchronization methods. MemBlockStore relies on a mutex, while FsBlockStore relies on a broadcast in a map that sits behind a mutex. While these work to an extent, they create some concerns due to the async nature of the blockstore API, specifically around flatfs.

As a result, I would propose moving the logic into its own single task that is communicated with over channels. This would allow for a FIFO approach, which should reduce some of the complexity.

Note:

  • Depending on the results, this could eventually extend out to the datastore
  • This is meant for internal use. External implementations are not impacted, but may be suggested to follow a similar implementation.
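The proposed single-task approach could be sketched with standard channels, here using a thread and `std::sync::mpsc` in place of an async task; the `Command` enum and string CIDs are illustrative only:

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Commands sent to the single repo task; processed strictly FIFO.
enum Command {
    Put(String, Vec<u8>),
    Get(String, mpsc::Sender<Option<Vec<u8>>>),
}

/// Spawn the repo task and return its command channel.
fn spawn_repo_task() -> mpsc::Sender<Command> {
    let (tx, rx) = mpsc::channel::<Command>();
    thread::spawn(move || {
        // The task is the single owner of the store: no mutex or
        // broadcast map is needed.
        let mut store: HashMap<String, Vec<u8>> = HashMap::new();
        for cmd in rx {
            match cmd {
                Command::Put(cid, block) => {
                    store.insert(cid, block);
                }
                Command::Get(cid, reply) => {
                    let _ = reply.send(store.get(&cid).cloned());
                }
            }
        }
    });
    tx
}

fn main() {
    let repo = spawn_repo_task();
    repo.send(Command::Put("cid-1".into(), vec![1, 2, 3])).unwrap();

    // The Get is queued after the Put, so FIFO ordering guarantees
    // the block is visible by the time the reply arrives.
    let (reply_tx, reply_rx) = mpsc::channel();
    repo.send(Command::Get("cid-1".into(), reply_tx)).unwrap();
    assert_eq!(reply_rx.recv().unwrap(), Some(vec![1, 2, 3]));
    println!("block retrieved via repo task");
}
```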

Custom Behaviour

The current implementation only supports specific protocols, without any ability to extend with behaviours defined outside of rust-ipfs. There should be a way to inject custom behaviours that can be polled and used alongside the current ones.

MultiAddr Ext

To replace MultiAddrWithPeerId, etc., there should be a trait that extends MultiAddr. This would allow us to roll back the use of the wrappers while maintaining some form of compatibility.

Related #11

relay: Panic when using RelayConfig::unbounded

Using RelayConfig::unbounded results in a panic due to an overflow:

thread 'tokio-runtime-worker' panicked at library/std/src/time.rs:601:31:
overflow when adding duration to instant

pub fn unbounded() -> Self {
    Self {
        max_circuits: usize::MAX,
        max_circuit_bytes: u64::MAX,
        max_circuit_duration: Duration::MAX,
        max_circuits_per_peer: usize::MAX,
        max_reservations: usize::MAX,
        reservation_duration: Duration::MAX,
        max_reservations_per_peer: usize::MAX,
        reservation_rate_limiters: vec![],
        circuit_src_rate_limiters: vec![],
    }
}

This is likely due to using Duration::MAX, which causes an overflow when added to an Instant. This could be solved by prechecking the Duration: add it to an Instant via checked_add and, if that returns None, either subtract or repeatedly halve the Duration until it can be added to the Instant, and use that value before converting it into the configuration used by libp2p-relay. We could also fix this upstream instead, as it may be considered a bug for anyone who uses Duration::MAX.
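The suggested precheck could be sketched as below, repeatedly halving the Duration until `Instant::checked_add` succeeds; `clamp_duration` is a hypothetical helper, not part of libp2p-relay:

```rust
use std::time::{Duration, Instant};

/// Clamp a duration so that `Instant::now() + dur` cannot overflow.
/// Halving is used here as the simplest strategy; subtracting a fixed
/// margin would work as well.
fn clamp_duration(mut dur: Duration) -> Duration {
    let now = Instant::now();
    while now.checked_add(dur).is_none() {
        dur /= 2; // halve until it fits
    }
    dur
}

fn main() {
    let safe = clamp_duration(Duration::MAX);
    // The clamped value can now be added to an Instant without panicking.
    assert!(Instant::now().checked_add(safe).is_some());
    println!("clamped duration: {:?}", safe);
}
```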

feat: Connect to bootstrap peers

When initializing the ipfs node, one is able to add bootstrap peers via UninitializedIpfs::add_bootstrap, which adds them to the Kademlia routing table; however, if the protocol is disabled, nothing happens. What we should do instead is connect to the peers if the protocol is disabled (or perhaps ensure the connection is made regardless of protocol status), and maybe make the connection persistent, since manual entry would indicate some form of trust. The persistent connection would likely need to be done through its own behaviour with the connection handler kept alive, though the bootstrap peers may or may not have the same conditions for keeping the connection alive.

Implement Errors

Currently, errors are handled by anyhow, but it may be better to return specific error types using thiserror, and reserve anyhow for errors that do not yield a specific type (eg the gossipsub builder returns &'static str as an error).
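For illustration, a hand-written version of what a dedicated error type might look like; a real implementation would likely derive these impls with `thiserror`, and the variant names here are invented:

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type sketch for rust-ipfs.
#[derive(Debug)]
enum IpfsError {
    /// A specific, matchable failure.
    BlockNotFound(String),
    /// Fallback for sources that only yield strings,
    /// e.g. the gossipsub builder's `&'static str` error.
    Other(String),
}

impl fmt::Display for IpfsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            IpfsError::BlockNotFound(cid) => write!(f, "block {cid} not found"),
            IpfsError::Other(msg) => write!(f, "{msg}"),
        }
    }
}

impl Error for IpfsError {}

fn main() {
    let err = IpfsError::BlockNotFound("bafy...".to_string());
    assert_eq!(err.to_string(), "block bafy... not found");
    println!("{err}");
}
```

With `thiserror`, the `Display` and `Error` impls above collapse into derive attributes, while callers gain the ability to match on specific variants instead of downcasting an anyhow::Error.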
