Giter Club home page Giter Club logo

Comments (15)

hlship avatar hlship commented on September 26, 2024 1

One could imaging that the prefix would map to a seq of routes, not a single one, ordered by specificity, and attempt to match each of them.

That being said, I think there's a fundamental problem with having routes that quasi-conflict, like /user/:id and /user/logout because what if the user's id actually is logout? Or something more likely and more subtle.

I'm leaving this with better documentation, and a clear test case that demonstrates the difference between prefix tree and linear routing.

from pedestal.

michalwilk123 avatar michalwilk123 commented on September 26, 2024

I had the same issue. Changing the router parameter in service map to :linear-search fixed this. There is a bug in the :prefix-tree router implementation

from pedestal.

michalwilk123 avatar michalwilk123 commented on September 26, 2024

related #726

from pedestal.

hlship avatar hlship commented on September 26, 2024

This does seem like a bug; at first I was going to point out that wildcards routes always "win" but that's not the case here.

from pedestal.

mtsbarbosa avatar mtsbarbosa commented on September 26, 2024

I was able to reproduce it on repl if anyone interested:

(require '[io.pedestal.http.route.prefix-tree :as p-tree])
(def ptree (-> nil
               (p-tree/insert "/users/:id" 1)
               (p-tree/insert "/users/login" 2)))

(p-tree/lookup ptree "/users/123")
=> {:path-params {:id "123"}, :payload #io.pedestal.http.route.prefix_tree.Payload{:routes [1]}}
(p-tree/lookup ptree "/users/login")
=> {:path-params {:id "login"}, :payload #io.pedestal.http.route.prefix_tree.Payload{:routes [1]}}

reason for it is here:

I guess the algorithm was designed to look for the first child
being a wild card * or :something, then to go for the others as @hlship mentioned

(let [c (:children node)]
    (or (get c ":")
        (get c "*")
        (get c (char-key path segment-size))))

I am not sure if changing that we are sort of slowing down the algorithm or even moving towards making a linear-search-like out of that. So, a bit unclear to me if that's a bug, because of the algorithm's design.

from pedestal.

terjesb avatar terjesb commented on September 26, 2024

@mtsbarbosa I think in your example, both of your routes are GETs? If that's the case, this is documented and by design for the prefix-router, ref http://pedestal.io/reference/routing-quick-reference:

Wildcard routes always win over explicit paths in the same subtree. E.g., /path/:wild will always match, even if /path/user is defined

In the original example in this issue, the problem seems to be that this unexpectedly also seems to apply across different verbs. The expectation is that POST /path/user should win over DELETE or GET /path/:wild, also for the prefix-tree router.

That being said, the doc above does say "always", which could be interpreted as independent of verb. But @hlship seems to confirm it's a bug.

from pedestal.

hlship avatar hlship commented on September 26, 2024

I'm looking this over, but I believe that routing is correct because, yes wildcards win out. I think you might be able to leverage constraints to accomplish both, I'm going to test that.

from pedestal.

hlship avatar hlship commented on September 26, 2024

Here's some findings:

(defn- attempt-route
  ([routes request]
   (attempt-route routes :prefix-tree request))
  ([routes kind request]
   (let [ctor (get route/router-implementations kind)
         router (ctor (expand-routes routes))
         request' (merge {:request-method :get} request)]
     (router/find-route router request'))))

(deftest wildcard-trumps-static-under-prefix-tree
  (let [routes #{["/users/:id" :get [`view-user] :route-name ::view-user :constraints {:id #"\d+"}]
                 ["/users/logout" :get [`logout] :route-name ::logout]}]
    (is (= nil
           (attempt-route routes {:path-info "/users/abc"})))

    ;; This still holds
    (is (match? {:route-name ::view-user
                 :path-params {:id "123"}}
                (attempt-route routes {:path-info "/users/123"})))

    ;; This is the cause of pain, as one would think that a constraint failure on the wildcard match would
    ;; drop down to match the static path.
    (is (= nil
           (attempt-route routes {:path-info "/users/logout"})))
    
    ;; Have to use :linear-search to get the desired behavior:

    (is (match? {:route-name ::view-user
                 :path-params {:id "123"}}
                (attempt-route routes :linear-search {:path-info "/users/123"})))

    (is (match? {:route-name ::logout}
           (attempt-route routes :linear-search {:path-info "/users/logout"})))))

So, yes, a switch to :linear-search fixes things. I don't necessarily agree that prefix-tree is doing the right thing here, but I believe the code is working as documented (I'll try and improve documentation more), and a change to the code would potentially cause incorrect routing on existing applications.

In a later Pedestal release, perhaps we can create an additional router that works as :prefix-tree, but falls back on constraint failures to other static (or even dynamic) matches as seems intuitive.

from pedestal.

ohpauleez avatar ohpauleez commented on September 26, 2024

The tree router is a prefix tree based on the route. Once a route matches, it uses a hash-map to see if there's a verb match. Given this setup, perhaps it's easier to see why "wildcards always win" -- they are greedy in their matching within the prefix tree.

When the prefix-router was created, we identified other matching strategies that better balance expectations here, but their time and space complexity are difficult to reason about. For example, you could keep all static routes in a hash-map or prefix-tree, and all wildcard routes in their own prefix tree -- on request you first try to match a static route and fallback to trying to match a wildcard route (I believe this is what retit does internally, for example). This also matches the process the linear router follows (go through the sequence of static routes before going through a different sequence of wildcard routes)

Using constraints within routes

It has always been a bad idea to put constraints in at the route level -- process constraints within your endpoint handler.

from pedestal.

aredington avatar aredington commented on September 26, 2024

Agree with @ohpauleez on the problems of applying constraints in route checking. The root conflict here is only possible because verbs are not considered at the time that the wildcard prefix stem conflicts with the privileged token 'logout' in the route. We must maintain an NxM matches for N routes and M verbs. I'd like some assistance in reasoning through the consequences of inverting the priority of path and verb: apply the HTTP verb (which is always present, and has a narrow branching factor) first, and then apply a prefix-tree against the path (which is always present, and likely has a higher branching factor) for that verb posterior to verb matching.

I agree with @hlship that the use case related to the bug is treacherous, but the dangers of the example don't invalidate the problem of a route that should match the request not matching due to an implementation detail.

I believe that inverting verb and path should lead to an ability to generate prefix trees which require fewer distinct stems and intermediate nodes because only the prefixes which cluster by verb need to be differentiated in the tree, which seems like a win. But eager to hear of any other less beneficial consequences I'm ignoring.

from pedestal.

ohpauleez avatar ohpauleez commented on September 26, 2024

Thanks everyone for throwing together some ideas to consider and helping illustrate the challenges in the design space (ala "what if there was a user named 'logout'").

A great bit of background here is the article, "40 years of Suffix Trees" from ACM Communications; March 2016

The utility of using a prefix-tree for routing is that redundant path segments get "compressed down" giving us a log-n match. In one sense, the more overlap we have in the routes, the better off we'll do compared to something linear. If you split on verb first, you don't have a prefix tree, you have M trees (one for each verb) and you have duplicate path information across those different (and potentially sparse) trees (which is less than ideal).

@aredington's suggestion of leaning in to common tree-walk solutions (aggressively prune the search space) is a good one, but I think that's a different router altogether potentially and not one necessarily based on a prefix-tree. There is tension here that a solution needs to do two different matches, one on routes and one of verbs -- I don't know of an encoding where verbs+routes get matched together, but I'm sure one exists.

Generally speaking, you want to match by the biggest discriminating characteristic first (to reduce the search space), which is another reason why the prefix-router does path then verb.

from pedestal.

hlship avatar hlship commented on September 26, 2024

Again, I think changing the current prefix tree router now is extremely risky due to the high probability of unforeseen consequences affecting existing applications. Still, Alex's idea is that we have one prefix tree per HTTP verb, and choosing the correct prefix tree is very cheap compared to navigating such a tree (which uses some amount of string comparisons, etc.).

from pedestal.

aredington avatar aredington commented on September 26, 2024

Trying to get at problems by describing qualities of these circumstances:

  • The current prefix tree finds one verb dispatch table per request.
  • The prefix tree prioritizes wildcards ahead of static routes (documented)
  • The information gain of dispatching by path (two and three possible matches for /users and /user/:token respectively) is inferior to the information gain of dispatching by verb (two, two, and one possible matches for :get, :post, and :delete).

Bullet 2 is documented and has provided Pedestal users structure that they are most likely leaning on right now, and altering it is most likely a breaking change.

Bullet 1 is likely to recharacterize performance of the prefix-tree if it is changed.

Bullet 3 is interesting but should probably be a different router to opt into rather than a change to the prefix-tree.

from pedestal.

ohpauleez avatar ohpauleez commented on September 26, 2024

Edit: Alex's response was spot on, many thanks!


Original message:

Two clarifying bits:

  • I don't think anyone is suggesting we change the prefix-tree (I'm certainly not)
  • I hope it was clear that there is a significant trade-off between 1 and 4 prefix trees, and if you were going to pursue using a weak discriminating attribute first, there are likely better solutions that factor that in.

I don't want people to lose sight of how a prefix or suffix tree works and what it is good at -- keeping that in mind makes reasoning about the prefix tree pretty straightforward.

I think a more meaningful conversation might be focusing on how an optimal router would behave, and then doing the encoding and data structure work to figure out how best to make that happen.

I'll now provide two completely unrelated rants :)

  • In all honesty, most applications are better off using a static or hash/map router, and pushing all parameters to query-string args or post bodies (which ever is more appropriate). I haven't used a path parameter in a long, long time. As a bonus, in systems like Pedestal2 there is zero-copy access to query-string args and post bodies.
  • I think the Pedestal project has managed evolution responsibly. Things are mostly stable, breaking changes were made when necessary, and when a breaking change was made a migration path was provided. Time moves forward, the surrounding context/world changes -- it's ok to responsibly and purposefully evolve a system when you have a goal/destination in mind and the evidence to support the decision. I don't think the idea of "stability above all else" is healthy, productive, or realistic.

from pedestal.

hlship avatar hlship commented on September 26, 2024

Looking over the above, I don't see anything actionable beyond, perhaps, ongoing documentation improvements, or a new router. In the meantime, I'm closing this issue.

from pedestal.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.