wordpress-documentcloud's Introduction

This is the repository for the legacy DocumentCloud site. Please see the current repository here:

https://github.com/muckrock/documentcloud

______                                      _   _____ _                 _
|  _  \                                    | | /  __ \ |               | |
| | | |___   ___ _   _ _ __ ___   ___ _ __ | |_| /  \/ | ___  _   _  __| |
| | | / _ \ / __| | | | '_ ` _ \ / _ \ '_ \| __| |   | |/ _ \| | | |/ _` |
| |/ / (_) | (__| |_| | | | | | |  __/ | | | |_| \__/\ | (_) | |_| | (_| |
|___/ \___/ \___|\__,_|_| |_| |_|\___|_| |_|\__|\____/_|\___/ \__,_|\__,_|

DocumentCloud is a catalog of primary source documents and a tool for annotating, organizing and publishing them on the web. Documents are contributed by journalists, researchers and archivists.

This codebase contains the entirety of DocumentCloud.org, and pulls together the rest of our open-source projects: Docsplit is used to extract data from incoming documents; that work is parallelized across CloudCrowd; data on the client-side is modeled by Backbone.js, which depends on Underscore.js for all of its abilities; Jammit concatenates and compresses the dozens of CSS and JS files into a single asset package; the NYTimes' Document Viewer displays the documents, while Pixel Ping records the traffic.

If you find a security issue while browsing the source, please email [email protected] to inform us of the problem.

Code contributed to this project is provided under the MIT license (see the LICENSE file). Some components of the project are subject to their own licenses as indicated (see /vendor and /public/javascripts/vendor directories).

wordpress-documentcloud's People

Contributors

aschweigert, bcampeau, dannydb, eyeseast, kant, knowtheory, reefdog

wordpress-documentcloud's Issues

Inserting shortcode with widget triggers "tinyMCE.getInstanceById is not a function" error

  1. Install and activate the plugin (natch)
  2. Go to a post
  3. Flip to Visual mode
  4. Click the DocumentCloud toolbar icon
  5. Add a URL and whatever
  6. Click "Insert"

What should happen: The shortcode composes and inserts into the content.

What happens: Widget seems to reload, console shows error: "Uncaught TypeError: tinyMCE.getInstanceById is not a function"

Unsure when this behavior started, but a clean installation of WP 3.9.1 and the plugin at 1b84674 still threw the error.

Aspect ratio argument for embeds

I would love to be able to tell a document to maintain an aspect ratio as width changes. For example, keep the embed at 8.5 x 12 (allowing for chrome).

This might be something better implemented in DocumentCloud itself, but it may be possible within the shortcode, using some added JS. (cc: @knowtheory)
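A rough sketch of the "added JS" idea, assuming the embed's container width can be measured on the page. The helper name `heightForWidth` and the `chromePx` allowance are hypothetical and not part of the plugin or of DocumentCloud:

```javascript
// Hypothetical helper: compute the height that keeps a fixed aspect ratio
// as the embed's width changes. `chromePx` is an assumed allowance for the
// viewer's own toolbar and footer, not a value taken from DocumentCloud.
function heightForWidth(width, ratioWidth, ratioHeight, chromePx = 0) {
  if (width <= 0 || ratioWidth <= 0 || ratioHeight <= 0) {
    throw new RangeError("width and ratio terms must be positive");
  }
  return Math.round((width * ratioHeight) / ratioWidth) + chromePx;
}

// Example: an 8.5 x 12 ratio at 425px wide, with no chrome allowance.
const embedHeight = heightForWidth(425, 8.5, 12);
```

In a browser this would presumably run from a resize handler that reads the container's current width and writes the result back as the embed's height.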

Bare URLs don't get default settings

We have a handful of default parameters we send to the oEmbed endpoint, as well as global defaults you can set in the Settings screen.

These work when you use the shortcode like [documentcloud url="…"], but not when you just paste a resource URL on its own line.

Only store metadata when post contains an embed

The meta variable wide_assets is added on all posts (even those without documentcloud embeds). Some of my users find this irritating. Could you store this information elsewhere or hide it? Or is it meant to be edited manually?

Alternatively, could you make sure wide_assets is only added to posts actually containing documentcloud embeds?
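One way to approach the second request is to check the post content for an embed before writing any metadata. The sketch below is illustrative only: the function name and both patterns are assumptions, not the plugin's own detection logic.

```javascript
// Sketch only: decide whether post content contains a DocumentCloud embed,
// so metadata such as wide_assets could be skipped otherwise.
// Both patterns are illustrative assumptions, not the plugin's own.
const SHORTCODE_PATTERN = /\[documentcloud\b[^\]]*\]/i;
const BARE_URL_PATTERN =
  /^\s*https?:\/\/www\.documentcloud\.org\/documents\/\S+\s*$/im;

function containsDocumentCloudEmbed(content) {
  // A shortcode anywhere, or a resource URL alone on its own line.
  return SHORTCODE_PATTERN.test(content) || BARE_URL_PATTERN.test(content);
}
```

A save hook could then write wide_assets only when this returns true.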

Revisiting default embed sizes

While adapting the plugin for oEmbed, I wanted to maintain 100% backwards compatibility with existing defaults and shortcode attributes. Here are the current size defaults:

array(
    'height' => get_option('documentcloud_default_height', 600),
    'width' => get_option('documentcloud_default_width', 620),
    …
);

(Quick aside: oEmbed services, including ours, expect maxwidth/maxheight, but WordPress standardized on height/width for its embed shortcodes, so it does a dance to map height/width to maxwidth/maxheight right before the oEmbed provider is called. I'll support both, with a priority preference for height/width out of deference to WordPress.)

The existing plugin has this priority order (lower number = higher priority):

  1. Shortcode attribute ([documentcloud width="…"])
  2. User-defined default setting (Settings > DocumentCloud > "Default embed width (px)")
  3. Plugin default (Second param in get_option() above)
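The priority order above can be sketched as a simple fall-through lookup. This is a JavaScript illustration rather than the plugin's PHP; the function and argument names are hypothetical, and the 620/600 values are the plugin defaults quoted earlier in this issue.

```javascript
// Sketch of the priority order listed above; the first source that has a
// value wins. Function and argument names are hypothetical.
function resolveDimension(name, shortcodeAttrs, userDefaults, pluginDefaults) {
  const sources = [shortcodeAttrs, userDefaults, pluginDefaults];
  for (const source of sources) {
    const value = source ? source[name] : undefined;
    if (value !== undefined && value !== null && value !== "") {
      return Number(value);
    }
  }
  return undefined; // nothing set: the caller could defer to the theme
}

// Map WordPress-style width/height onto the oEmbed maxwidth/maxheight names.
function toOEmbedSize(shortcodeAttrs, userDefaults, pluginDefaults) {
  return {
    maxwidth: resolveDimension("width", shortcodeAttrs, userDefaults, pluginDefaults),
    maxheight: resolveDimension("height", shortcodeAttrs, userDefaults, pluginDefaults),
  };
}

// Shortcode sets width, the user default sets height, plugin defaults are unused.
const size = toOEmbedSize(
  { width: "400" },
  { height: 700 },
  { width: 620, height: 600 }
);
```

Passing an empty object as the plugin defaults models the proposal further down: unresolved dimensions come back as `undefined`, leaving WordPress free to apply the theme's sizes.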

Normally, in the absence of user input, WP applies the theme's own content width to the embed's width (and applies a standard multiplier to set a height). You can track this back from WP_oEmbed->fetch() to wp_embed_defaults() to the $content_width in a theme's functions.php file. But we never fall through to that case because of the plugin defaults.

I'd like to recommend we remove the plugin defaults and, absent an explicit shortcode or user-defined default, let WordPress push through its theme-specific default embed sizes.

Impact: any existing plugin user who hadn't set widths explicitly (either in a shortcode or in their default settings) and then upgraded the plugin might be surprised to have their existing embeds "upgraded" to the theme's idea of a proper embed width/height. (Though presumably they'd have to open and save the post to have the shortcodes reprocessed.)

@eyeseast / @aschweigert, y'all know this landscape far better than me. What do you think?

Rename main class

class Navis_DocumentCloud should become class WP_DocumentCloud or just class DocumentCloud.

Any reason not to do this?

Some shortcode parameters are not working for some documents

It seems that some of the shortcode parameters (for example height, width, zoom, page, and a few others) are not consistently working on some documents.

Steps to reproduce:

  1. Add the following shortcode to a post:
[documentcloud url="https://www.documentcloud.org/documents/2746909-ABCC-List.html" responsive="false" width="400" height="400" page="3" zoom="false" /]
  2. The plugin will make an API call like so:
https://www.documentcloud.org/api/oembed.json?maxwidth=400&maxheight=400&url=https://www.documentcloud.org/documents/2746909-ABCC-List.html?page=3&zoom=false&responsive=false
  3. The API response will be:
{
    "type": "rich",
    "version": "1.0",
    "provider_name": "DocumentCloud",
    "provider_url": "https://www.documentcloud.org",
    "cache_age": 300,
    "height": 400,
    "width": 400,
    "html": "<div class=\"DC-embed DC-embed-document DV-container\"> <div style=\"position:relative;padding-bottom:141.2857142857142%;height:0;overflow:hidden;max-width:100%;\"> <iframe src=\"//www.documentcloud.org/documents/2746909-ABCC-List.html?embed=true&amp;notes=false&amp;page=3&amp;pdf=false&amp;responsive=false&amp;search=false&amp;sidebar=false&amp;text=false&amp;zoom=false\" title=\"ABCC-List (Hosted by DocumentCloud)\" sandbox=\"allow-scripts allow-same-origin allow-popups\" frameborder=\"0\" style=\"position:absolute;top:0;left:0;width:100%;height:100%;border:1px solid #aaa;border-bottom:0;box-sizing:border-box;\"></iframe> </div> </div>"
}
  4. Note a few things about the html value:
  • The inline style attribute is set to 100% width and 100% height (even though the JSON response shows 400 x 400).
  • The iframe src value properly renders zoom=false; however, the Zoom slider control still shows up.
  • The iframe src value properly renders page=3; however, the document does not default to page 3.
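For reference, here is a sketch of how a request like the one in the steps above could be composed with the resource URL percent-encoded as a single `url` parameter. The endpoint and option names are taken from this report; the function name and the decision to encode are assumptions, and the sketch only illustrates request composition. It is not claimed to fix the behavior described.

```javascript
// Sketch: compose the oEmbed request so the resource URL (with its own query
// string) travels as one percent-encoded `url` parameter.
// Endpoint and option names come from the report above; the function name
// and the choice to encode are assumptions.
function buildOEmbedRequest(resourceUrl, viewerOptions, size) {
  const resource = new URL(resourceUrl);
  for (const [key, value] of Object.entries(viewerOptions)) {
    resource.searchParams.set(key, String(value));
  }

  const request = new URL("https://www.documentcloud.org/api/oembed.json");
  request.searchParams.set("maxwidth", String(size.width));
  request.searchParams.set("maxheight", String(size.height));
  request.searchParams.set("url", resource.toString());
  return request.toString();
}

const requestUrl = buildOEmbedRequest(
  "https://www.documentcloud.org/documents/2746909-ABCC-List.html",
  { page: 3, zoom: false, responsive: false },
  { width: 400, height: 400 }
);
```

With this composition, `zoom` and `responsive` stay inside the encoded resource URL instead of being readable as top-level parameters of the oEmbed call.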

Support embedding a single note

Here's the embed code:

<div id="DC-note-138916" class="DC-note-container"></div>
<script src="//s3.amazonaws.com/s3.documentcloud.org/notes/loader.js"></script>
<script>
  dc.embed.loadNote('http://www.documentcloud.org/documents/1001227/annotations/138916.js');
</script>

Add to WordPress VIP shared plugins repository

One of our clients is interested in using your plugin on WordPress VIP. I see some minor code-formatting and other small issues that would need resolving before it could pass their code review process. Would you be open to a pull request for this before we invest the effort? We can also help get this into the review queue for WordPress VIP since we're a partner.

Your plugin would then also be available to all WordPress VIP sites, which include some of the largest publishers in the world: https://vip.wordpress.com/plugins/

Let me know your thoughts on this. Thanks.

Shortcode renders the latest embed code

Here's an example, direct from DC:

<div id="DV-viewer-1011044-redwood-charging-docs" class="DV-container"></div>
<script src="//s3.amazonaws.com/s3.documentcloud.org/viewer/loader.js"></script>
<script>
  DV.load("//www.documentcloud.org/documents/1011044-redwood-charging-docs.js", {
    width: 500,
    height: 600,
    sidebar: false,
    container: "#DV-viewer-1011044-redwood-charging-docs"
  });
</script>
<noscript>
  <a href="http://s3.documentcloud.org/documents/1011044/redwood-charging-docs.pdf">Redwood Charging Docs (PDF)</a>
  <br />
  <a href="http://s3.documentcloud.org/documents/1011044/redwood-charging-docs.txt">Redwood Charging Docs (Text)</a>
</noscript>

This needs to set width: '100%' like the current version does, to keep it semi-responsive.

Investigate WP 4.4 oEmbed changes

As discovered when helping @JoeGermuska debug StoryMap, and via this comment in the WP oEmbed class:

Since WordPress 4.4, oEmbed discovery is enabled for all users and allows embedding of sanitized iframes. The providers in this list are whitelisted, meaning they are trusted and allowed to embed any content, such as iframes, videos, JavaScript, and arbitrary HTML.

Here's what seems to happen.

  1. Since WP 4.4, all URLs entered on their own lines are fetched looking for oEmbed endpoint discoverability tags, and those oEmbed endpoints are then fetched.
  2. If the endpoint returns an iframe, it is sanitized (security="restricted" sandbox="allow-scripts" added) and used.
  3. If not an iframe, then the response is discarded, unless the resource is whitelisted or you've registered the provider with a plugin.

Need to investigate and confirm the above, and then decide what changes (if any) to make, both here and on the platform. Questions:

  • Is the above description true?
  • Does the response need to be a bare iframe, or will WP pluck out an iframe nested in other HTML?
  • What strictures does security="restricted" sandbox="allow-scripts" put on us?

Allow default setting to show/hide sidebar

The plugin allows defaults for size, but the sidebar is another frequently used setting. (I always turn it off in posts, for example.)

Any other knobs and dials that should be default-able?

Add an embed widget/wizard

Original version of the plugin had a button (in visual mode) that spawned a wizardy form to let people compose an embed (choosing sidebar yes/no, set size, etc.). It was TinyMCE-based but broke at some point because of upstream TinyMCE changes. It's now been removed entirely.

At some point we should add something like that back in. Here's what @eyeseast did for another plugin: https://github.com/sunlightlabs/navis-openstates/blob/master/js/tinymce/legislators-tinymce.js

Research cache age of oEmbed response

WP can cache the oEmbed response; to enable it, we just need to flip WP_DocumentCloud::CACHING_ENABLED to true. But does it obey our cache_age or have its own internal age? Research.

Decommission old Navis-DocumentCloud plugin

The old one is going to rank high in Google search for a while -- we should see about getting it deleted or pointed to the new one. This was mentioned in another issue that's now closed so getting it on the radar.

Contextual page/note URLs only embeddable with shortcodes

Pages/notes can only be embedded with a shortcode or with their canonical URLs:

Check this out:

https://www.documentcloud.org/documents/282753-lefler-thesis/pages/57.html

Since canonical URLs aren't surfaced, we need to also support single-line embedding with contextual URLs:

Check this out:

https://www.documentcloud.org/documents/282753-lefler-thesis.html#document/p57

TinyMCE preview missing for shortcodes

If you paste http://www.documentcloud.org/documents/282753-lefler-thesis.html into the TinyMCE editor and then flip to visual mode, the document will be loaded via oEmbed.

If you paste in [documentcloud url="http://www.documentcloud.org/documents/282753-lefler-thesis.html" width=100 height=100 sidebar=false] (to shut off the sidebar) and then flip to visual mode, the shortcode is not converted to an embed in the editor.

Code style cleanup

I see at least two commenting styles (# and //). We should decide on a capitalization convention, too.

Slugs with uppercase letters throw off URL-cleaner

The pattern to recognize a URL as a DocumentCloud oEmbedable URL is very permissive.

Since many of our resources (pages, notes) have multiple URL patterns, including with page anchors, we have a clean_dc_url() function that recomposes them into the single canonical (and oEmbed-safe) versions. E.g., https://www.documentcloud.org/documents/282753-lefler-thesis.html#document/p57/a42282 is recomposed to https://www.documentcloud.org/documents/282753-lefler-thesis/annotations/42282.html.

Our base document slug pattern, however, has a bug. It only recognizes lowercase alphanumeric slugs, not uppercase. Because of the permissive pattern pointed to above, those URLs still get passed to the oEmbed endpoint, but they don't get cleaned and recomposed, so anchored-variant pages/notes get the document viewer returned instead.
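To make the recomposition concrete, here is a minimal JavaScript sketch in the spirit of clean_dc_url(). It is not the plugin's code: the pattern is an assumption built from the URL shapes quoted in these issues, and its slug class includes A-Z so that mixed-case slugs are recognized.

```javascript
// Sketch of a URL cleaner in the spirit of clean_dc_url(); not the plugin's code.
// The slug class includes A-Z so mixed-case slugs are recognized.
const CONTEXTUAL_URL =
  /^(https?:\/\/www\.documentcloud\.org\/documents\/[0-9]+-[0-9a-zA-Z-]+)\.html#document\/p([0-9]+)(?:\/a([0-9]+))?$/;

function cleanDcUrl(url) {
  const match = CONTEXTUAL_URL.exec(url);
  if (!match) {
    return url; // already canonical, or not a shape this sketch knows
  }
  const [, base, page, note] = match;
  // A note anchor takes precedence over the page it sits on.
  return note ? `${base}/annotations/${note}.html` : `${base}/pages/${page}.html`;
}
```

Dropping `A-Z` from the character class makes a URL such as `.../2746909-ABCC-List.html#document/p3` fall through unchanged, which mirrors the failure described above.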

Fix defaulting to a page or note

The viewer can default to a specific page or note with the page and note options. These should really be named default_page and default_note to be more clear. We began this transition way back when, but didn't complete it, and the halfway state is causing problems. For instance, the shortcode generator on the platform outputs page instead of default_page, but this plugin expects default_page.

Until we complete the lift and ensure the viewer works with both options (for backwards compatibility), we should purge the default_ versions.
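A small sketch of what purging the default_ versions might look like at the option-handling layer: accept either spelling on input and emit only page and note. The function name, and the rule that the unprefixed spelling wins when both are present, are assumptions.

```javascript
// Sketch: accept either spelling and emit only `page` / `note`.
// When both spellings are present, the unprefixed one wins (an assumption).
function normalizeDefaultOptions(options) {
  const normalized = { ...options };
  for (const name of ["page", "note"]) {
    const prefixed = `default_${name}`;
    if (normalized[name] === undefined && normalized[prefixed] !== undefined) {
      normalized[name] = normalized[prefixed];
    }
    delete normalized[prefixed];
  }
  return normalized;
}
```

For example, `normalizeDefaultOptions({ default_page: 3, sidebar: false })` yields an object with `page: 3` and `sidebar: false`, and no `default_page` key.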

Recognize full Unicode range document slugs

Currently, the platform can create document slugs with Unicode characters, but our pattern-recognizer still only recognizes Latin alphanumerics.

Need to change 0-9a-zA-Z- to \p{L}\p{N}%-.

However! We're blocked by the platform's oEmbed endpoint, which currently chokes on them.
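The difference between the two character classes can be checked in isolation. This JavaScript sketch wraps each class in anchors for testing; the function name and the sample slugs are made up for illustration.

```javascript
// The two character classes from the note above, wrapped in anchors for testing.
// The `u` flag is required for \p{...} property escapes in JavaScript.
const LATIN_SLUG = /^[0-9a-zA-Z-]+$/;
const UNICODE_SLUG = /^[\p{L}\p{N}%-]+$/u;

function isRecognizedSlug(slug, pattern = UNICODE_SLUG) {
  return pattern.test(slug);
}

// A made-up slug containing Cyrillic letters.
const latinResult = isRecognizedSlug("123-документ", LATIN_SLUG);
const unicodeResult = isRecognizedSlug("123-документ");
```

The `%` in the wider class also lets percent-encoded slugs through, which matters if URLs arrive already encoded.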

Use oEmbed

We're adding an oEmbed service to DocumentCloud, so I'll be adapting this plugin to take advantage of it. The syntax of the shortcodes won't change, so it'll be backwards-compatible.

Goals:

  1. Fetch the embed code from our oEmbed service, natch
  2. Support existing [documentcloud] shortcodes for documents, but prefer [dc-document] going forward to distinguish among target resources; option syntax remains the same
  3. Add new shortcodes support for other resources: notes (#4), collections/searches (#3), and (eventually) pages
  4. Add support for recognizing resource URLs on their own line, as described here; if #15 were accomplished, then this would only be enabled for versions of WP prior to that
  5. Use this implementation as a meta guide for future plugin authors

Edited: Abandoned design of separate shortcodes for different resources. Going to instead parse URL and determine resource type from there.
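A minimal sketch of determining the resource type from the URL, written in JavaScript for illustration. It covers only the canonical URL shapes that appear elsewhere in these issues (documents, pages, notes); the function name and return values are hypothetical.

```javascript
// Sketch: infer the resource type from a DocumentCloud URL path.
// Only the URL shapes that appear in these issues are covered.
function resourceTypeFromUrl(url) {
  const { pathname } = new URL(url);
  if (/^\/documents\/[^/]+\/annotations\/[0-9]+\.html$/.test(pathname)) {
    return "note";
  }
  if (/^\/documents\/[^/]+\/pages\/[0-9]+\.html$/.test(pathname)) {
    return "page";
  }
  if (/^\/documents\/[^/]+\.html$/.test(pathname)) {
    return "document";
  }
  return null; // unknown shape; the caller decides what to do
}
```

Because the check uses only the path, a contextual URL with a `#document/p57` anchor is classified as a document unless it is recomposed into its canonical form first.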

PHP wrapper for Document Cloud API

Hi Everyone,

I work at The Lens in New Orleans. We want to call the Document Cloud API from PHP (we're on Largo), so I'm working on a wrapper for the API, like this one: http://python-documentcloud.readthedocs.org/en/latest/ but for PHP.

My plan is to start with http://phphttpclient.com/ -- then build an adapter on top of it that will form the interface for calls to the Document Cloud API. From there, I'll build some PHP classes that represent objects in DocumentCloud (ex. Document, Project, etc.).

I'd like to start with unit test integration right off the bat. Do you all use a particular testing framework? I'm familiar with PHPUnit so I'd like to use that one -- but if largo/INN uses a different one I will certainly use that.

Changes to `wide_assets` and `documents` post metadata

When a post is saved, we store a couple pieces of metadata:

  • documentcloud documents stores all the shortcode attributes. I don't know what it's used for.
  • wide_assets is set to true|false to hint to the template that the post contains, well, wide assets. Several StateImpact sites respond to this hint, according to #16 (comment) and a Hangout conversation with @eyeseast

We have to make at least one change: both pieces of metadata are stored as hashes keyed by document slug. But now that we support notes – which are children of documents and thus whose URLs contain the document slug – we can have two embedded resources on the same post with the same hash key.

My recommendation:

  1. Determine if documentcloud documents post meta is necessary (@eyeseast?) and if not, remove it
  2. If resource is a note, include note ID as part of hash key to distinguish from documents.
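The second recommendation could look something like the sketch below. The key format, the function name, and the shape of the resource object are all assumptions made for illustration.

```javascript
// Sketch: build a post-meta hash key that stays unique when a document and
// one of its notes are embedded in the same post. The key format is assumed.
function metaKeyFor(resource) {
  if (resource.type === "note") {
    return `${resource.documentSlug}-note-${resource.noteId}`;
  }
  return resource.documentSlug;
}

// A document and one of its notes, embedded in the same post.
const keys = [
  { type: "document", documentSlug: "282753-lefler-thesis" },
  { type: "note", documentSlug: "282753-lefler-thesis", noteId: 42282 },
].map(metaKeyFor);
```

Keeping the plain slug as the key for documents means entries already stored for existing posts would still resolve.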

Support embedding collections/searches

Here's what the embed code looks like:

<div id="DC-search-group-homicide-watch" class="DC-search-container"></div>
<script src="//s3.amazonaws.com/s3.documentcloud.org/embed/loader.js"></script>
<script>
  dc.embed.load('http://www.documentcloud.org/search/embed/', {
    q: "group: homicide-watch",
    container: "#DC-search-group-homicide-watch",
    title: "Homicide Watch DC",
    order: "title",
    per_page: 12,
    search_bar: true,
    organization: 170
  });
</script>
