
Comments (7)

martijnwalraven commented on May 18, 2024

@xiekevin: We definitely want to support caching as part of the framework, so it would be great to hear more about what you are currently doing, and how we can make that work in a more structured way. Design input is very welcome!

My idea is roughly that the resolver function passed to GraphQLResultReader when reading from a network response should also be responsible for keeping normalized records, with the right cache key based on the query path or field value (like id). These records are then published to a store, and we would define a Cache protocol to plug in different caching implementations (more or less key-value, so dealing with records and not whole responses).

The GraphQLResultReader abstraction also makes it easy to read from a store (this may be what you're already doing), so that allows you to reload queries from the cache when other responses return new values. We eventually may want to keep track of dependencies, so we only update queries when their associated data changes, but I think it may be ok to reload all queries at first.
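The record-and-store design sketched above could look something like this in Swift. All names here (`Record`, `Cache`, `InMemoryCache`, `loadRecord`, `merge`) are illustrative assumptions, not the actual Apollo iOS API:

```swift
import Foundation

// Hypothetical sketch: a record is a normalized key-value snapshot of one object.
typealias Record = [String: Any]

// A minimal cache protocol dealing in records rather than whole responses.
protocol Cache {
    func loadRecord(forKey key: String) -> Record?
    func merge(records: [String: Record])
}

// A trivial in-memory implementation for illustration.
final class InMemoryCache: Cache {
    private var records: [String: Record] = [:]

    func loadRecord(forKey key: String) -> Record? {
        return records[key]
    }

    func merge(records newRecords: [String: Record]) {
        for (key, fields) in newRecords {
            // Merge field-by-field so partial results don't clobber existing data.
            var existing = records[key] ?? [:]
            for (field, value) in fields {
                existing[field] = value
            }
            records[key] = existing
        }
    }
}
```

Merging `["123": ["property_a": 1]]` and later `["123": ["property_b": 2]]` would leave both fields stored under the key "123", which is the field-level merge behavior a normalized cache needs.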

from apollo-ios.

xiekevin commented on May 18, 2024

Hi Martijn, thanks for the quick response. What you've described is conceptually in line with what I'm doing, but since I was working under the assumption that Apollo iOS would be a thin data-fetching layer, I built most of the caching / data-sync outside of the framework. That's probably why the structure is a bit messier; I hadn't thought of integrating with the framework itself.

I like the idea of having the resolver update the cache after reading the network response, and I think that's probably where it makes the most sense. I am currently using the field value id as the cache key as well (since it's unique per node object under a Relay-based GraphQL schema).

Since I've done it outside the framework, I update the cache when the network returns .Data: I get the JSON via currentObject and sync it into the cache at that point. With this approach, I recursively form wrappers around all JSONObject values that have an id field (these are assumed to be object / node values), generating a hierarchy of wrapper objects around the root object, and store each wrapper individually in the Cache.
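The recursive wrapping described above amounts to normalizing a JSON tree into a flat store keyed by id. A rough sketch, with hypothetical names (`normalize` and the store shape are illustrative, not code from either implementation):

```swift
import Foundation

typealias JSONObject = [String: Any]

// Walk a JSON response and store every nested object that carries an "id"
// field under that id, flattening the hierarchy into the store.
func normalize(_ object: JSONObject, into store: inout [String: JSONObject]) {
    for (_, value) in object {
        if let child = value as? JSONObject {
            normalize(child, into: &store)
        } else if let children = value as? [JSONObject] {
            for child in children { normalize(child, into: &store) }
        }
    }
    if let id = object["id"] as? String {
        store[id] = object
    }
}
```

Normalizing a response like `{"id": "root", "object": [{"id": "1", "property_a": 1}]}` would produce one store entry for "root" and one for "1", so each node can be updated and looked up independently.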

All data sync happens through a pub/sub implementation that wraps NotificationCenter and listens based on the id key. The cache keeps a mapping of dependencies so that only the objects whose dependencies have changed are refreshed. I hadn't thought about syncing at the query level, so right now all my caching is done at the object / model level.
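An id-keyed pub/sub wrapper around NotificationCenter, as described above, could be sketched like this (the `ObjectStore` name and the per-id notification name format are assumptions for illustration):

```swift
import Foundation

// Sketch: each object id gets its own Notification.Name, so subscribers are
// only woken for the objects they actually care about.
final class ObjectStore {
    private let center = NotificationCenter.default

    private func name(for id: String) -> Notification.Name {
        return Notification.Name("object.changed.\(id)")
    }

    // Returns an observer token; the caller keeps it alive for the
    // lifetime of the subscription.
    func subscribe(id: String, handler: @escaping () -> Void) -> NSObjectProtocol {
        return center.addObserver(forName: name(for: id), object: nil, queue: nil) { _ in
            handler()
        }
    }

    func publishChange(id: String) {
        center.post(name: name(for: id), object: nil)
    }
}
```

With `queue: nil`, NotificationCenter delivers the block synchronously on the posting thread, which keeps the update ordering simple.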

I use GraphQLResultReader to recreate the auto-generated structs using the init(reader:) method after updating objectStack on the reader to be the JSONObject that is generated via the wrappers obtained through the Cache.

One major issue I've encountered is when arrays are merged together and some objects in the result do not have the full data needed for deserialization. In this case I would like some version of GraphQLResultReader to return nil instead of throwing, and continue deserializing the rest of the array. Here's an example of how this could happen:

query AQuery {
    root(id: "ID123456789") {
        object(first: 5) {
            id
            property_a
        }
    }
}

query BQuery {
    root(id: "ID123456789") {
        object(first: 10) {
            id
            property_b
        }
    }
}

In this case, when the objects are merged, the first 5 will have both property_a and property_b while the remaining 5 will only have property_b. I think it makes sense here for the deserialization process to skip over the incomplete entities while deserializing the 5 positions that have both. This allows views that rely on AQuery and BQuery to successfully deserialize.
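The "skip incomplete entities" behavior proposed above can be sketched with a failable initializer plus compactMap. The `Item` type and field names mirror the example queries and are purely illustrative:

```swift
import Foundation

typealias JSONObject = [String: Any]

// An item needs both properties from the example queries to be usable.
struct Item {
    let id: String
    let propertyA: Int
    let propertyB: Int

    // Failable init: returns nil instead of throwing when fields are missing.
    init?(json: JSONObject) {
        guard let id = json["id"] as? String,
              let a = json["property_a"] as? Int,
              let b = json["property_b"] as? Int else { return nil }
        self.id = id
        self.propertyA = a
        self.propertyB = b
    }
}

// compactMap drops the nils produced by incomplete entries, so one partial
// object doesn't fail the whole list.
func deserializeLeniently(_ list: [JSONObject]) -> [Item] {
    return list.compactMap(Item.init(json:))
}
```

A merged list of 10 objects where only the first 5 carry property_a would deserialize to 5 items rather than throwing.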

I think what you're proposing sounds stellar. I have a couple of questions as someone new to the field of software engineering, and especially to design architecture.

  • Does JSON merging make sense as a data synchronization strategy?
  • Would the main benefit of caching at the query level be that there is just one top-level object that needs to be updated?
  • Do you have an estimate for when work might start or complete?

Thank you for working on apollo-ios, it's definitely a great tool and has been fun to work with. I'm excited to see it grow more in the future :)


martijnwalraven commented on May 18, 2024

Thanks for the detailed response! This is exactly the kind of design discussion I was hoping for.

What you're describing is pretty much what I had in mind. Your NotificationCenter wrapper is what I've been calling the Store. So GraphQLResultReader keeps a map of records; these get filled as we parse objects, and after parsing an entire response the record set is published to the store.

The store is then responsible for merging the new record set with what's already there, and notifying subscribers. It would be great to keep track of dependencies and make this as fine grained as possible, so we don't notify subscribers if their data hasn't changed.
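The fine-grained change tracking described above could be sketched as a store whose merge reports only the keys that actually changed, so unaffected subscribers are never notified. All names are hypothetical, and `Int` field values stand in for general JSON values to keep equality checks simple:

```swift
// Sketch: the store merges an incoming record set and returns the set of
// record keys whose contents actually changed in this merge.
final class RecordStore {
    private var records: [String: [String: Int]] = [:]

    func merge(_ incoming: [String: [String: Int]]) -> Set<String> {
        var changed: Set<String> = []
        for (key, fields) in incoming {
            var record = records[key] ?? [:]
            // Only fields whose value differs count as a change.
            for (field, value) in fields where record[field] != value {
                record[field] = value
                changed.insert(key)
            }
            records[key] = record
        }
        return changed
    }
}
```

Re-merging an identical record set would return an empty change set, which is exactly the case where subscribers should not be woken up.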

I'm not sure what you mean by:

    I hadn't thought about syncing at the query level, so right now all my caching is done at the object / model level.

We definitely want to cache at the object level, so maybe you're misunderstanding what I said before. But the idea is that subscribers register an interest in a query, and get notified when any data the query depends on changes. At that point, the query reloads based on the data in the store, through a GraphQLResultReader that reads records from the store and follows references when resolving fields.

I'm also not sure about these questions. Could you elaborate on what you mean by 'JSON merging' and 'caching at the query level'?

  • Does JSON merging make sense as a data synchronization strategy?
  • Would the main benefit of caching at the query level be that there is just one top-level object that needs to be updated?

Caching lists is definitely one of the trickiest parts! I don't think skipping objects that are missing fields is something you want as a default policy, but a lot of this is application-specific, so we want to make sure we expose the right hooks.

An argument like first has no special meaning in GraphQL, so strictly speaking, the results of object(first: 5) and object(first: 10) are unrelated and would be cached separately. So we'd have a list of 5 object references stored under one cache key, and a list of 10 object references stored under another. These are references, so the objects they point to are still shared (if they have an 'id'), but the lists themselves are separate.

So I think the default caching policy wouldn't actually suffer from the problem you're describing. But of course, in many cases we actually want to capture the fact that these lists are part of the same (conceptual) list. If we've already asked for object(first: 10), we would like to avoid hitting the server for object(first: 5), for example. And things become even trickier when we take mutations into account, where we might, for instance, want to add something to the end of a list.

Apollo Client (for JavaScript) gives you a lot of freedom to configure application-specific behavior, and exposes functions like fetchMore and updateQueries for this.

I would also highly recommend this talk by Joe Savona, who works on Relay at Facebook.

He presents a really nice summary of the lessons they have learned at Facebook, and that includes a conceptual model for how they do caching and deal with paginated lists. I like the ConnectionController model he describes, and it seems that could be a good fit for Apollo iOS.

I'm planning on working on this over the next few weeks, and would like to have a basic implementation of caching done by the end of the year. So let's definitely continue this conversation, and maybe there is an opportunity to work together on this, or even adapt some of the code you've been working on.


xiekevin commented on May 18, 2024

Thanks for the thoughtful follow-up; I'm learning a lot by discussing this thoroughly with you. The general implementation strategies we've each come up with are identical in most aspects, and I think that's a great sign.

As for the discussion around syncing at the query level, I had previously been syncing objects themselves and reloading individual objects using init(reader: GraphQLResultReader) rather than the top-level Query object. Caching was still done at the object level, but your suggestion to reload the query rather than reload objects was a good one and I hadn't thought of that at the time of writing.

To give some context around JSON merging: I've used JSON as the basis for merging and updating records. I wasn't sure how else to do this with object types in Swift, since I couldn't figure out a way to set properties dynamically without bridging to Obj-C's setValue(_:forKey:). I've essentially stored all Records as wrappers around JSONObjects in my Store, keyed by id. The main purpose of the wrappers is to hold object references.
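JSON-based record merging as described above could be sketched as a recursive deep merge over dictionaries, sidestepping dynamic property setting entirely (the `merge` function and its semantics here are an illustrative assumption: newer fields win, nested objects merge recursively, everything else is replaced wholesale):

```swift
import Foundation

typealias JSONObject = [String: Any]

// Deep-merge two JSON objects: fields from `new` override fields in `old`,
// and nested objects are merged recursively rather than replaced.
func merge(_ old: JSONObject, with new: JSONObject) -> JSONObject {
    var result = old
    for (key, value) in new {
        if let oldChild = result[key] as? JSONObject,
           let newChild = value as? JSONObject {
            result[key] = merge(oldChild, with: newChild)
        } else {
            result[key] = value
        }
    }
    return result
}
```

Merging `{"id": "1", "property_a": 1}` with `{"id": "1", "property_b": 2}` yields a record containing all three fields, which is the behavior needed when AQuery and BQuery results arrive at different times.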

I watched the talk by Joe Savona; it raises a lot of good points I hadn't considered and is a solid overview of GraphQL best practices, so I'll be sure to forward it to my team. You also brought up many good points about caching lists; I agree, and see now that a lot of it is platform- or product-specific. As you mentioned earlier about exposing the right hooks, it would definitely be great to be able to fine-tune how the updates occur.

With that said, I still have a couple of uncertainties. I now understand that there's no special meaning attached to (first: 10); theoretically you could've asked for (last: 10) and there would be no way for the client to know the difference. Since lists don't have an id property, there must be another way to identify them.

  • How would keys be generated so that they're consistent, yet distinct between object(first: 5) and object(first: 10), when both are the same property of the same object?
  • In the example above, if we're able to generate distinct keys, how does the query know, when reloading the list, whether to use the cache key for object(first: 10) or the one for object(first: 5), when both are the same property of the same root object (with id ID123456789)? I'm assuming that only one entry for an object with id ID123456789 exists in the Store.

If you have the time it would be great to keep in touch, and it's nice to hear that you'll be starting work on this so soon. My implementation is very rough in general, so it probably wouldn't be helpful, but I'd be happy to share it if you'd like to take a look!


martijnwalraven commented on May 18, 2024

@xiekevin: Sorry, I've been preoccupied for a few days, but getting back to this now. I hope to have some code to share soon.

To answer your question, fields are not cached based on the response name (that could be an alias), but on a combination of the field name and arguments. The idea here is that fields are basically functions, and different arguments may give different results, so they should be cached separately. In the Star Wars schema for example, you don't want to confuse height(unit: FOOT) and height(unit: METER).

This means object(first: 5) and object(first: 10) are considered different results and will be cached separately by default. (Of course, the point here is that there is a relationship between these results, and you want a mechanism that allows you to take advantage of it. But that is not the default behavior.)
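Keying a field by its name plus arguments, as described above, could be sketched like this. The key format is an assumption for illustration, not Apollo's actual scheme; the important property is that arguments are serialized in a stable (sorted) order so equivalent calls always produce the same key:

```swift
// Build a cache key from a field name and its arguments. Arguments are
// sorted so the key is deterministic regardless of source ordering.
func cacheKey(field: String, arguments: [String: String]) -> String {
    if arguments.isEmpty { return field }
    let args = arguments.sorted { $0.key < $1.key }
        .map { "\($0.key):\($0.value)" }
        .joined(separator: ",")
    return "\(field)(\(args))"
}
```

Under this scheme, height(unit: FOOT) and height(unit: METER) get distinct keys, as do object(first: 5) and object(first: 10), while the response alias plays no role at all.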

In general, using id as the cache key for an object is just one possibility, and it is often useful to have a fallback strategy. In Apollo Client, the default strategy is actually to use the query path, but you can configure your own dataIdFromObject function. See this blog post for a good description of this model.
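The fallback strategy described above, loosely modeled on Apollo Client's dataIdFromObject concept, could be sketched as follows (the function names and the dot-joined path format are assumptions for illustration):

```swift
import Foundation

typealias JSONObject = [String: Any]

// Compute a record key for an object: prefer a configurable
// dataIdFromObject-style function, and fall back to the query path
// when the object has no usable id.
func recordKey(for object: JSONObject,
               queryPath: [String],
               dataIdFromObject: (JSONObject) -> String? = { $0["id"] as? String }) -> String {
    return dataIdFromObject(object) ?? queryPath.joined(separator: ".")
}
```

An object with an id normalizes under that id and is shared across queries; an object without one still gets a stable, query-path-based key, so nothing falls out of the cache.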


xiekevin commented on May 18, 2024

Ah, I get it now, that was a great explanation. Thanks for taking the time to help me understand this caching / synchronization strategy. I'll wait for when you have some code to share to give any further thoughts. Please keep me updated when you have new developments to share!


martijnwalraven commented on May 18, 2024

@xiekevin: Hey, I just wanted to mention normalized caching has been merged! Still working on updating the documentation and writing a blog post, but it would be great to get some feedback and see if this covers your use cases.

