docstore's People

Contributors

cdzombak, steiza

docstore's Issues

Make webserver script executable

I've looked at OS X and a few Linux distros, especially ones that support having both Python 2.x and 3.x installed. The python2.7 executable seems to exist on all of them.

On my server I made webserver.py executable and added the following to the top:

#!/usr/bin/env python2.7

That looks up python2.7 on the PATH instead of using a hard-coded path. Maybe give it a try on your dev box and see if it causes issues. It seems to work wherever I put it.

This allows it to run as just ./webserver.py, which helps in writing an init.d script to control it. I need to polish up the init.d script, and then I'll share it in a support scripts folder along with the nginx config.

500 error when not logged in

[Screenshot: Screen Shot 2021-09-21 at 11 35 38 PM]

When I hit a2docs.org/review as a logged-out user, I get this 500 error. I'm trying to review a submission.

File import: confirm that import includes all files for multi-file imports

Related to #3 -

The current a2docs has a number of docs where a single document id is associated with multiple files, e.g.

http://a2docs.org/doc/292/ "Ann Arbor Fire Department response times"

which is different from

https://a2docs.aadl.org/view/292 "Ann Arbor Golf Proposal for Huron Hills"

I'm not sure where the ID skew is coming from, but the goal is to preserve the old URLs so that Arborwiki doesn't require a bunch of updates.

Comma in filename

If you upload a file that has a comma in the name, it goes boom. System is Chrome, running against localhost.

The localhost page isn’t working

localhost sent an invalid response.
ERR_RESPONSE_HEADERS_MULTIPLE_CONTENT_DISPOSITION

_AAATA Board Packet November 19, 2015_Revised.pdf is the filename.

The Stack Overflow answer to this is here:

http://stackoverflow.com/questions/8588818/chrome-pdf-display-duplicate-headers-received-from-the-server

and that issue has something to do with the comma (",") character in the filename.
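A minimal sketch of one way to avoid the duplicate-header error, assuming the handler currently emits the filename unquoted. The helper name content_disposition is hypothetical and is not taken from the docstore code; it wraps the filename in a quoted string so that a comma is not read as a header separator, and adds an RFC 5987 filename* parameter only when the name contains non-ASCII characters.

```python
from urllib.parse import quote


def content_disposition(filename, disposition="attachment"):
    """Build a Content-Disposition header value with a quoted filename.

    Hypothetical helper for illustration; not part of docstore.
    """
    # Fall back to '?' for characters that cannot be sent in a plain
    # quoted-string, then escape backslashes and double quotes.
    ascii_name = filename.encode("ascii", "replace").decode("ascii")
    quoted = ascii_name.replace("\\", "\\\\").replace('"', '\\"')
    header = f'{disposition}; filename="{quoted}"'
    if ascii_name != filename:
        # Percent-encoded UTF-8 form for clients that understand filename*.
        header += f"; filename*=UTF-8''{quote(filename, safe='')}"
    return header


if __name__ == "__main__":
    print(content_disposition("_AAATA Board Packet November 19, 2015_Revised.pdf"))
```

In a Tornado handler the result would be passed to self.set_header("Content-Disposition", ...), assuming that is where the header is set today.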

consider setting content-type on attached files

It would be nice to have the Content-type response header set for attached files, which might make reading on e.g. Chrome, iOS webview, etc. more convenient. I'm not sure if setting content-disposition: attachment prevents the webview from displaying the document in the native PDF viewer, but I can experiment with that if needed.

$ curl -v https://a2docs.org/file/570/2760+Stanton+-+FOIA+Final.pdf
> GET /file/570/2760+Stanton+-+FOIA+Final.pdf HTTP/2
> Host: a2docs.org
> User-Agent: curl/7.64.1
> Accept: */*
>
< HTTP/2 200
< date: Fri, 11 Dec 2020 16:44:08 GMT
< content-type: application/octet-stream
< content-length: 167169
< server: TornadoServer/6.0.3
< content-disposition: attachment; filename="2760 Stanton - FOIA Final.pdf"
< etag: "770df252e24b5b9c39539ec2a8a459da19a45e1e"
< strict-transport-security: max-age=15768000

A link that does display inline correctly:

$ curl -v https://cdn.ballotpedia.org/images/c/cf/2020_Hawaii_sample_ballot_%28Hawaii_County%29.pdf
> Host: cdn.ballotpedia.org
> User-Agent: curl/7.64.1
> Accept: */*
>
< HTTP/2 200
< content-type: application/pdf
< content-length: 648057
< date: Fri, 11 Dec 2020 16:45:16 GMT
< last-modified: Tue, 20 Oct 2020 16:35:33 GMT
< etag: "bd9648313b96686eb357f26a728f7914"
< accept-ranges: bytes
< server: AmazonS3
< x-cache: Miss from cloudfront
< via: 1.1 63b9a4cda82206b6b34aab8f3e958cbe.cloudfront.net (CloudFront)
< x-amz-cf-pop: ORD52-C1
< x-amz-cf-id: l2t0ZfreqWrmhldPoyPu70kdH7JORaGyjK_ZIRpP_U6cOaLV2gyJTQ==
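A rough sketch of how the content type could be derived from the filename, using the standard library's mimetypes module. The function name guess_content_type is made up for illustration, and how docstore actually serves files is an assumption here; the fallback matches the application/octet-stream value seen in the first response above.

```python
import mimetypes


def guess_content_type(filename, default="application/octet-stream"):
    """Return a Content-Type for filename, or default if it is unknown.

    Hypothetical helper for illustration; not part of docstore.
    """
    # guess_type returns (type, encoding); only the type is used here.
    content_type, _encoding = mimetypes.guess_type(filename)
    return content_type or default


if __name__ == "__main__":
    print(guess_content_type("2760 Stanton - FOIA Final.pdf"))
```

Whether the viewer then opens the PDF inline may also depend on content-disposition, which still needs the experiment mentioned above.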

HTTPS support for a2docs.org

Some notes on a transition:

We have an old URL (a2docs.org) and a new URL (a2docs.aadl.org). It would be good to have a plan to consolidate the two, and I think that the surviving URL is the .aadl.org domain.

I suspect that the long-term answer is to move handling of the a2docs.org domain to an nginx configuration that does whatever domain mapping is necessary.

The main reason for wanting this is to ensure that all of the old links to a2docs.org that are in Arborwiki still work. An alternative plan is to identify all of those pages that have those links one by one and fix them, and then retire the old a2docs.org name entirely.
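To make the mapping idea concrete, here is a small sketch of the rewrite an nginx rule (or a script that fixes Arborwiki links) would need to perform. It assumes old links look like http://a2docs.org/doc/<id>/ and new ones like https://a2docs.aadl.org/view/<id>, which are the two shapes seen in the import issue. The id_map argument is hypothetical: it stands in for whatever table ends up correcting the ID skew between the two sites.

```python
import re
from urllib.parse import urlsplit

OLD_HOST = "a2docs.org"
NEW_BASE = "https://a2docs.aadl.org"
_DOC_PATH = re.compile(r"^/doc/(\d+)/?$")


def redirect_target(old_url, id_map=None):
    """Map an old a2docs.org document URL to the new site, or return None.

    id_map is a hypothetical {old_id: new_id} dict; ids that are not
    listed are assumed to be unchanged.
    """
    parts = urlsplit(old_url)
    if parts.hostname != OLD_HOST:
        return None
    match = _DOC_PATH.match(parts.path)
    if not match:
        return None
    old_id = int(match.group(1))
    new_id = (id_map or {}).get(old_id, old_id)
    return f"{NEW_BASE}/view/{new_id}"


if __name__ == "__main__":
    print(redirect_target("http://a2docs.org/doc/292/"))
```

The same function could drive a one-off pass over Arborwiki pages if retiring the old name turns out to be the preferred plan.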

Several templates' title blocks have "A2" hardcoded in them

Looking at index.html, org.html, search.html and probably others, the title typically includes "A2":

{% block title %}
Search A2 Government Document Repository
{% end %}

I note that base.html uses {{region}}. I'm too tired to fix this now and verify it actually works on my machine (and I am not familiar with Tornado's templating so I'll need to test this locally), so filing this as a note for later.

User Management and FOIA Request Tracking

This is just here for discussion and is likely a long-term change. Auth helps keep things from getting deleted and cuts down on spam, and the admin user is a good fit for that.

From glancing at some of the uploads and the fields, it seems that one use case is tracking requests that have been placed: putting in a stub record of what was requested and the date requested, then coming back later and uploading the doc when it is received. Correct me if I'm wrong @vielmetti

If that is the case, then it might be worth discussing what user management might look like, along with views for managing requests.

We could probably do something external, like basic web server auth, where the app simply associates the file with the login name. That would avoid the need for user admin interfaces.

Narrative / Description field should have more detailed prompts for info and be larger

Feature request from @vielmetti

Original a2docs.org has the text

Add any relevant details about the documents. What are the documents about? Were there any problems or revelations? If your request was denied, what reason was given? What is the larger issue?

Could use this text as the alt or discuss a different form of the text.

The styling of the box should probably be more fluid for browser size.

Feature: multiple "request tracking numbers"

This is a feature not in the current system, and needs a little thought.

Any given request might have multiple tracking numbers; e.g. the tracking number assigned by the reader to their own request, the tracking number assigned by the institution for internal use, and the tracking number kept by a third party like a2civictech or seeclickfix for external review.

Sometimes these tracking numbers have URLs too.

I don't know how to represent this.
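One possible representation, offered only as a starting point for discussion: each request holds a list of tracking numbers, and each tracking number records who assigned it and an optional URL. The class and field names below are invented for illustration and do not reflect the current docstore schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TrackingNumber:
    assigned_by: str           # e.g. "requester", "institution", "a2civictech"
    number: str                # kept as text; formats vary by source
    url: Optional[str] = None  # some tracking numbers have a URL, some do not


@dataclass
class Request:
    title: str
    tracking_numbers: List[TrackingNumber] = field(default_factory=list)

    def numbers_from(self, assigned_by):
        """Return every tracking number assigned by the given party."""
        return [t.number for t in self.tracking_numbers
                if t.assigned_by == assigned_by]


if __name__ == "__main__":
    req = Request("Example request")
    req.tracking_numbers.append(TrackingNumber("requester", "my-001"))
    req.tracking_numbers.append(
        TrackingNumber("institution", "FOIA-17", url="https://example.org/foia/17"))
    print(req.numbers_from("institution"))
```

In a database this would likely become a separate table keyed by the document or request id, so that a request can carry any number of rows.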

Review queued docs due to earlier server error

In #35 it was noted that there was a server error (now fixed) when uploading to a2docs.

There are a couple of documents stuck in the queue as a result. Review them, and when they are reviewed, close this.

Database cleanup tools

As I was doing an upload this a.m. I noticed that there were two semi-identical names for agencies that came up in the popup - "Ann Arbor Area Transportation Authority" and "Ann Arbor Area Transit Authority". Only one of those is correct.

Ideally there would be some administrative way to remedy this; I'm not sure of the best approach yet.
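As a first step toward such a tool, a sketch that flags agency names that are suspiciously similar so an admin can decide which one to keep. It uses difflib from the standard library; the function name and the 0.8 threshold are arbitrary choices for illustration, and the list of names is assumed to come from whatever query backs the popup.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicates(names, threshold=0.8):
    """Return (name_a, name_b, score) for pairs of names that look alike.

    Hypothetical helper for illustration; the threshold is a guess and
    would need tuning against the real agency list.
    """
    pairs = []
    for a, b in combinations(sorted(set(names)), 2):
        score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    return pairs


if __name__ == "__main__":
    agencies = [
        "Ann Arbor Area Transportation Authority",
        "Ann Arbor Area Transit Authority",
        "Washtenaw County",
    ]
    for a, b, score in near_duplicates(agencies):
        print(f"{score}: {a!r} vs {b!r}")
```

This only reports candidates; merging the records and repointing documents to the surviving agency would still need a separate, reviewed step.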

Sample Support Scripts

This is here for my tracking. I need to create a directory (support-scripts ??) and provide the following sample files:

  • Nginx Config File
  • Apache Config File
  • Init.d startup script

I should probably also do a systemd unit, but I will have to spin up a VM to test it.

Auth Broken in Python 3

I haven't had time to dig, but I'm guessing this is a Python 3 issue. It could also be that nginx needs specific config for that path, but from other posts it sounds like the behaviour changed in 3.x and the value has to be encoded manually.

Traceback (most recent call last):
  File "/usr/lib/python3.8/base64.py", line 510, in _input_type_check
    m = memoryview(s)
TypeError: memoryview: a bytes-like object is required, not 'str'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/tornado/web.py", line 1702, in _execute
    result = method(*self.path_args, **self.path_kwargs)
  File "/var/www/a2docs/docstore", line 490, in get
    auth_decoded = base64.decodestring(auth_header[6:])
  File "/usr/lib/python3.8/base64.py", line 554, in decodestring
    return decodebytes(s)
  File "/usr/lib/python3.8/base64.py", line 545, in decodebytes
    _input_type_check(s)
  File "/usr/lib/python3.8/base64.py", line 513, in _input_type_check
    raise TypeError(msg) from err
TypeError: expected bytes-like object, not str
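The traceback shows base64.decodestring being handed a str, which Python 3 rejects. A minimal sketch of a Python 3 friendly replacement, assuming auth_header holds the raw Authorization header value as in the line from the traceback. The function name parse_basic_auth is hypothetical; base64.b64decode accepts an ASCII str and returns bytes, which are then decoded to text.

```python
import base64
import binascii


def parse_basic_auth(auth_header):
    """Return (username, password) from a Basic auth header, or None.

    Hypothetical helper for illustration; not part of docstore.
    """
    if not auth_header or not auth_header.startswith("Basic "):
        return None
    try:
        # b64decode takes str or bytes on Python 3 and returns bytes.
        decoded = base64.b64decode(auth_header[6:]).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        return None
    username, sep, password = decoded.partition(":")
    if not sep:
        return None
    return username, password


if __name__ == "__main__":
    header = "Basic " + base64.b64encode(b"admin:s3cret").decode("ascii")
    print(parse_basic_auth(header))
```

Note that base64.decodestring was a deprecated alias in Python 3.8 and is gone in 3.9 and later, so switching to b64decode (or decodebytes with bytes input) is needed in any case.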

500: Internal Server Error on upload

I just tried to upload the CARD presentation on the 1,4 Dioxane plume and got an internal server error. The time was approximately 0815 on 2/29.

500: Internal Server Error

[Screenshot: screen shot 2016-02-29 at 8 16 17 am]

Deploy "autocomplete" version of code to a2docs.aadl.org

The current version of the code has autocomplete, but the aadl version doesn't have that yet.

Zach raised the question of whether his import script properly imported the entries that hold multiple documents, so a redeploy will need to track that issue too.
