Some comments and recommendations from @oliverwehn to make the Newscoop API more usable with recent web app frameworks
Thoughts on the Newscoop API (v.1.?)
General
I’m currently trying to build a REST adapter for Ember.js to communicate with the Newscoop API. The problem is that the API endpoints and the returned results aren’t structured coherently enough to allow a single abstracted API adapter to communicate with the CMS based on simple conventions. Therefore I’ve written down my idea of the API’s structure and the responses I’d expect to get from its endpoints.
Issues
There are a few important resources in Newscoop that you probably want to get your hands on through their endpoints. While trying to get to them, I stumbled upon some issues and inconsistencies.
Endpoints
The endpoint architecture is very specific, depending on what you want to query from the API. For example, if you want to get articles, you request them from GET /articles, but if you want the articles for a certain topic, you have to request them from GET /topics/{topicId}/{language}/articles. It seems you can’t filter articles by topic, or by any other property of the article data set, on the main article endpoint. If you could, there would be no need for multiple endpoints for one kind of resource (in this case 'article'). It also seems inconsistent that you don’t need a language code for the main article endpoint, but you do for the topic’s articles endpoint (and a few others).
Response format
There are some aspects of the JSON responses that I’m not really happy with. Requesting the first page of topics gives me
{
"items":[
{
"id":395,
"title":"Caspar Baader"
},
{
"id":394,
"title":"Thomas de Courten"
},
…
],
"pagination":{
"itemsPerPage":10,
"currentPage":1,
"itemsCount":805,
"nextPageLink":"http:\/\/www.tageswoche.ch\/content-api\/topics?page=2&items_per_page=10"
}
}
The returned array is named "items", which seems a bit generic to me, but it would be OK if "item(s)" were used consistently as the wrapper for the requested data sets throughout the API’s JSON responses. That’s not the case.
If you now request the articles of a certain topic with
GET /topics/{id}/{language}/articles
you’ll get
{
"id":600,
"title":"Wochenendlich",
"items":[
{
"language":"de",
"fields":{
…
},
…
},
{
…
},
…
]
}
The request for the articles of topic {id} is answered with a JSON object containing the articles in "items" again. That’s fine. But why is this also the only way to get the data set of the topic itself (with its properties, like "id" and "title", loosely added to the response), instead of providing it via its own endpoint? I’d expect to get the topic data from a dedicated endpoint for a single topic, but requesting one just gives you a 404 error. That seems especially strange, as you can easily get a single article through its own endpoint, giving you:
{
"language":"de",
…
"webcode":"k08x7",
"reads":"0"
}
As mentioned before, responses with multiple data sets are returned as a JSON array wrapped in an "items" property: { "items": [ … ] }. Therefore I’d expect a request for a single data set to give me the result as an object wrapped in an "item" property: { "item": { … } }.
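To illustrate what an adapter currently has to do on the client side, here is a minimal Python sketch that splits a collection response into its records, its pagination object and whatever parent properties were loosely attached to it. The function name split_collection and the returned keys are my own invention, not part of Newscoop, and the sample payloads are trimmed versions of the responses shown above.

```python
from typing import Any


def split_collection(payload: dict[str, Any]) -> dict[str, Any]:
    """Separate records, pagination and loosely attached parent fields.

    Hypothetical helper: assumes collections always live under "items".
    """
    records = payload.get("items", [])
    pagination = payload.get("pagination")
    # Everything else (e.g. "id" and "title" of a topic) belongs to the parent.
    parent = {
        key: value
        for key, value in payload.items()
        if key not in ("items", "pagination")
    }
    return {"records": records, "pagination": pagination, "parent": parent or None}


topics_page = {
    "items": [{"id": 395, "title": "Caspar Baader"}],
    "pagination": {"itemsPerPage": 10, "currentPage": 1},
}
topic_articles = {"id": 600, "title": "Wochenendlich", "items": [{"language": "de"}]}

print(split_collection(topics_page)["parent"])     # None
print(split_collection(topic_articles)["parent"])  # {'id': 600, 'title': 'Wochenendlich'}
```

The point is not the helper itself, but that the adapter has to guess which top-level keys are data, which are metadata and which describe a parent resource.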
Identifiers
Topics, articles-lists, authors, etc. have an "id" property, while articles have a "number" property instead. Why use differently named identifiers for resources?
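As a small sketch of the workaround this forces on a client (in Python, with a hypothetical helper name and a made-up article number), every place that needs an identifier has to know about both spellings:

```python
def record_id(record: dict) -> object:
    """Return a record's identifier, whichever key it is stored under.

    Hypothetical helper: checks "id" first, then the article-only "number".
    """
    for key in ("id", "number"):
        if key in record:
            return record[key]
    raise KeyError("record has neither 'id' nor 'number'")


topic = {"id": 395, "title": "Caspar Baader"}
article = {"number": 123, "language": "de"}  # 123 is an invented example value

print(record_id(topic))    # 395
print(record_id(article))  # 123
```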
Recommendations
In my opinion, the goal should be a consistent and conventional API where the request URIs for all resources and the structure of the responses are built on a reliable architecture. Therefore this isn’t just a (really short) list of recommendations, but rather a wish list from someone who wants to build nice apps without spending too much time on abstracting API communications.
Each resource should have a single, unique endpoint
Like:
GET /articles /* page through articles data sets { "articles": [ … ], "pagination": { … } } */
GET /articles/{id} /* get data set of article with ID id { "article": { … } } */
GET /articles-lists /* page through articles-lists data sets { "articles-lists": [ … ], "pagination": { … } } */
GET /articles-lists/{id} /* get data set of articles list with ID id { "articles-list": { … } } */
GET /topics /* page through topics data sets { "topics": [ … ], "pagination": { … } } */
GET /topics/{id} /* get data set of topic with ID id { "topic": { … } } */
GET /authors /* page through authors { "authors": [ … ], "pagination": { … } } */
GET /authors/{id} /* get data set of author with ID id { "author": { … } } */
…
This allows the application to determine the endpoint for each of its data models by convention. So including resource links in the response objects like
{
"language": "de",
…,
"authors": [
{
"name":"Karen N. Gerig",
"link":"http:\/\/www.tageswoche.ch\/content-api\/author\/179"
}
],
…
}
would become superfluous.
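With endpoints laid out like this, deriving the URI for a model becomes almost trivial. The following Python sketch assumes the naive rule "endpoint = model name + s", which happens to fit the resources listed above; endpoint_for is a hypothetical name, not something Newscoop or Ember provides.

```python
from typing import Optional


def endpoint_for(model: str, record_id: Optional[int] = None) -> str:
    """Build the endpoint path for a model, optionally for a single record.

    Assumption: the collection path is the model name with an "s" appended.
    """
    path = f"/{model}s"
    return path if record_id is None else f"{path}/{record_id}"


print(endpoint_for("article"))           # /articles
print(endpoint_for("topic", 600))        # /topics/600
print(endpoint_for("articles-list", 7))  # /articles-lists/7
```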
Filtering and requesting data by relation should be done via data sent with the request. So e.g. the request for articles of a topic shouldn’t look like
GET /topics/600/de/articles
but more like
GET /articles?topics=[600]&language=de /* with language as optional parameter */
This would be much more consistent, as you stay with the one API endpoint of the requested resource type.
Respond according to what was requested
A request for a list of articles should get you
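On the client side, a filtered request like the one above could then be built generically. This is only a sketch in Python: the bracketed list syntax is taken from the example URI, the function name filtered_endpoint is hypothetical, and I deliberately leave the brackets and commas unescaped so the result matches the URI as written here (a real client might prefer to percent-encode them).

```python
from urllib.parse import urlencode


def filtered_endpoint(resource: str, **filters) -> str:
    """Build a collection URI with optional filter parameters.

    Lists become "[a,b]"; filters set to None are skipped.
    """
    params = {}
    for name, value in filters.items():
        if value is None:
            continue
        if isinstance(value, (list, tuple)):
            value = "[" + ",".join(str(item) for item in value) + "]"
        params[name] = value
    # Keep "[", "]" and "," readable instead of percent-encoding them.
    query = urlencode(params, safe="[],")
    return f"/{resource}?{query}" if query else f"/{resource}"


print(filtered_endpoint("articles", topics=[600], language="de"))
# /articles?topics=[600]&language=de
```

Because language is just another optional filter, the same function serves the unfiltered GET /articles request as well.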
{ "articles":
[
{
"id": 753433,
"title": "Wochenendlich in Warschau",
"topics": [
{
"id": 600,
"name": "Wochenendlich",
…
},
{
"id": 432,
"name": "Basel",
…
}
],
…
},
{
"id": 734274,
"title": "Die neue App ist da",
"topics": [
{
"id": 453,
"name": "TagesWoche",
…
}
],
…
},
…
],
"pagination":
{
"currentPage": 1,
"itemsPerPage": 10,
…
}
}
and a request for a single article
should get you
{ "article":
{
"id": 753433,
"title": "Wochenendlich in Warschau",
"topics": [
{
"id": 600,
"name": "Wochenendlich",
…
},
{
"id": 432,
"name": "Basel",
…
}
],
…
}
}
with the returned resources named according to their type and kept separate from, for example, the pagination object.
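Named wrappers like these let a client unwrap any response knowing nothing but the model name. A rough Python sketch, assuming the singular key for single records and the "model name + s" key for collections (unwrap is a hypothetical helper):

```python
def unwrap(payload: dict, model: str):
    """Return (data, pagination) from a response for the given model.

    Single records sit under the singular key, collections under the plural.
    """
    plural = f"{model}s"
    if model in payload:
        return payload[model], None
    if plural in payload:
        return payload[plural], payload.get("pagination")
    raise KeyError(f"response contains neither '{model}' nor '{plural}'")


single = {"article": {"id": 753433, "title": "Wochenendlich in Warschau"}}
page = {
    "articles": [{"id": 753433}, {"id": 734274}],
    "pagination": {"currentPage": 1, "itemsPerPage": 10},
}

print(unwrap(single, "article")[0]["title"])  # Wochenendlich in Warschau
print(len(unwrap(page, "article")[0]))        # 2
```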
Allow loading related resources asynchronously
To optimize the amount of transferred data and the number of HTTP requests according to the needs of the app, the API should provide the possibility to switch between synchronous and asynchronous loading of related resources (like the topics of an article). Providing a GET parameter like async=1 should make the API deliver just the IDs of the related data sets. The fields representing the relations should be named after the related resource: singular ('topic') for single "belongs to" relations, pluralized ('topics') for multiple "belongs to" or "has many" relations.
Asynchronous
GET /articles/753433?async=1
should get you
{ "article":
{
"id": 753433,
"title": "Wochenendlich in Warschau",
"topics": [ 600, 432 ],
…
}
}
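Given such a response, the client can work out by itself which follow-up requests to make, and when. A minimal Python sketch, assuming the naming rule described above (singular field holds one ID, plural field holds a list) and the "model name + s" endpoints; related_endpoints is a hypothetical helper:

```python
def related_endpoints(record: dict, relations: list[str]) -> list[str]:
    """List the endpoints to request for a record's related resources."""
    urls = []
    for name in relations:
        value = record.get(name)
        if value is None:
            continue
        if isinstance(value, list):
            # Plural field such as "topics": the name already is the endpoint.
            urls.extend(f"/{name}/{related_id}" for related_id in value)
        else:
            # Singular field such as "topic": pluralize to get the endpoint.
            urls.append(f"/{name}s/{value}")
    return urls


article = {"id": 753433, "title": "Wochenendlich in Warschau", "topics": [600, 432]}
print(related_endpoints(article, ["topics", "authors"]))
# ['/topics/600', '/topics/432']
```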
Synchronous
When loading relations synchronously, it could be more efficient to provide related data sets for side loading instead of listing them redundantly within each returned data set. For example, when a list of 'articles' is requested, chances are high that some of them share one or more 'topics' or 'authors'. This could lead to:
{ "articles":
[
{
"id": 753433,
"title": "Wochenendlich in Warschau",
"topics": [ 432, 600 ],
"authors": [ 45 ],
…
},
{
"id": 734274,
"title": "Die neue App ist da",
"topics": [ 432, 453 ],
"authors": [ 45, 74 ],
…
},
…
],
"topics":
[
{
"id": 432,
"name": "Basel",
…
},
{
"id": 453,
"name": "TagesWoche",
…
},
{
"id": 600,
"name": "Wochenendlich",
…
}
],
"authors":
[
{
"id": 45,
"name": "David Bauer",
…
},
{
"id": 74,
"name": "Catherine Binser",
…
}
]
}
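To show how little work a side-loaded response needs on the client, here is a rough Python sketch that replaces the ID lists with the matching side-loaded records. The hydrate function is a hypothetical helper, and the payload is a trimmed copy of the example above with the elided fields left out.

```python
def hydrate(payload: dict, primary: str, relations: list[str]) -> list[dict]:
    """Replace related IDs in the primary records with side-loaded records."""
    # Index every side-loaded collection by ID for constant-time lookups.
    index = {
        name: {record["id"]: record for record in payload.get(name, [])}
        for name in relations
    }
    hydrated = []
    for record in payload[primary]:
        copy = dict(record)
        for name in relations:
            copy[name] = [
                index[name][related_id]
                for related_id in record.get(name, [])
                if related_id in index[name]
            ]
        hydrated.append(copy)
    return hydrated


payload = {
    "articles": [
        {"id": 753433, "topics": [432, 600], "authors": [45]},
        {"id": 734274, "topics": [432, 453], "authors": [45, 74]},
    ],
    "topics": [
        {"id": 432, "name": "Basel"},
        {"id": 453, "name": "TagesWoche"},
        {"id": 600, "name": "Wochenendlich"},
    ],
    "authors": [
        {"id": 45, "name": "David Bauer"},
        {"id": 74, "name": "Catherine Binser"},
    ],
}

articles = hydrate(payload, "articles", ["topics", "authors"])
print([topic["name"] for topic in articles[0]["topics"]])
# ['Basel', 'Wochenendlich']
```

Each shared topic or author is transferred once, yet every article ends up with its full related records.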