statedecoded / statedecoded
Legal codes, for humans.
Home Page: https://www.statedecoded.com/
License: Other
The entirety of http://vacode.org/21-427/ is repeated in the history section.
For both its documentation and the clarity of thought in its construction, it's an API to emulate. It's here.
Something is not working correctly with jQuery Color.
Having made the cross reference matching quite a bit looser (to avoid under-matches), we now need, at the end of the load-in process, to sweep through and remove any cross reference that doesn't match an actual, valid section.
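The sweep's core logic could look something like the following sketch. The row structure and the `target_section` key are hypothetical; in practice this would run against the imported references and the list of valid section numbers in the laws table.

```php
<?php
// Sketch of the post-load sweep: keep only cross references that point to a
// real, valid section. The array shapes here are illustrative, not the
// actual schema.
function filter_valid_references(array $references, array $valid_sections)
{
    $valid = array_flip($valid_sections);
    return array_values(array_filter(
        $references,
        function ($ref) use ($valid) {
            return isset($valid[$ref['target_section']]);
        }
    ));
}
```

Flipping the valid-section list into keys makes each lookup O(1), which matters when sweeping an entire code's worth of references.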
Review all HTML link relations and add any that would be advantageous. For instance, rel=up, rel=first, rel=last, rel=prefetch (for the first search result, if stats bear that out), rel=tag, rel=search, among others.
Some catch lines contain text like "Effective October 1, 2011" in them. This is not useful as of October 1, 2011, yet the titles remain even in subsequent years' codes. Figure out how to remove this without eliminating useful, actionable information.
On the other hand, some catch lines contain text like "Effective until January 1, 2014." This is really very useful.
This is basically a case of storing metadata in titles that properly belongs elsewhere. Perhaps a "notes" field in laws_meta is the correct place to store this material?
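One way to make the distinction mechanical: strip a parenthetical "Effective [date]" only once that date has passed, while leaving "Effective until [date]" alone, since that remains actionable. This is a sketch; the exact catch-line punctuation varies, and the stripped text should probably be moved into a notes field rather than discarded.

```php
<?php
// Remove a stale "(Effective [date])" note from a catch line, but keep
// "(Effective until [date])", which is still useful. $now is injectable
// for testing; defaults to the current time.
function strip_stale_effective_date($catch_line, $now = null)
{
    $now = $now ?: time();
    return preg_replace_callback(
        '/\s*\(Effective (?!until)([^)]+)\)/i',
        function ($matches) use ($now) {
            $date = strtotime($matches[1]);
            // Only strip the note if the date parsed and is in the past.
            return ($date !== false && $date <= $now) ? '' : $matches[0];
        },
        $catch_line
    );
}
```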
A search with thousands of results will have hundreds of page numbers displayed.
Have a screenshot of a marked-up bit of code, with mouseovers to illustrate the niceties available on each page. Of course, this can't be done until the design is finished.
Disqus provides great functionality to allow comments to be stored locally and synchronized periodically. Do so, and include those comments in the indexing process. Some great keywords and phrases are liable to be found in those comments.
Obviously, a lot more than putting this in the settings will be necessary—it'll need to be employed throughout the code.
The reason for this is that it's faster in PHP to provide absolute paths than to have to look them up over and over again.
There might be some sense in setting this as a constant, automatically, which could be stored in APC. That provides no benefit for installations that lack APC (other than that it doesn't require a user to set it), but it would be both speedy and zero-configuration for others.
Using the specified API for the state, list notable laws that are facing modification or that have recently had bills pass to amend them.
The replace_definitions() function has to attempt all reasonable lettercase scenarios in an effort to track down a definition for a word. While this works (mostly), it's a lousy approach. Rethink it.
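One possible rethink: rather than enumerating lettercase variants, match the term case-insensitively and reuse the matched text verbatim, so the original capitalization survives in the output. This is a sketch only; the real replace_definitions() also has to respect scope and avoid already-linked text, and the span markup shown is illustrative.

```php
<?php
// Wrap every occurrence of $term in a definition span, matching
// case-insensitively but preserving the original capitalization by
// echoing back the matched text ($matches[0]) rather than $term.
function link_term($text, $term, $slug)
{
    $pattern = '/\b' . preg_quote($term, '/') . '\b/i';
    return preg_replace_callback(
        $pattern,
        function ($matches) use ($slug) {
            return '<span class="definition" data-term="' . $slug . '">'
                . $matches[0] . '</span>';
        },
        $text
    );
}
```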
Let people maintain a sort of a shopping cart of laws (and chapters, titles, etc?).
When somebody goes to a section of the code via a search, on that section, display a link to return to the search results. Yes, people have back buttons, but some people aren't that smart.
Per Vivian P.
From § 15.2-2201, “Zoning” is stored, but not “to zone,” which occurs in the same definition. Both of them should be matched, but they’re not.
Four months after filing this ticket, I can't find any place where "to zone" actually appears. I'm sure it does somewhere, but I haven't turned it up yet.
It's a quick modification. It's found in functions.inc.php. Right now this query is selecting from court_decision_laws based on law_section. That's because, as of its creation, the laws table hadn't been populated. But it's a much less efficient query than selecting based on law_id.
When there is a legislative changelog that can be linked to, do so. (Per Jane C's suggestion.)
Right now the definition parser stores the scope (global, title, chapter, section, etc.) of a definition, requiring repeated determinations of where to apply a definition. Instead, store the ID of the thing (title, chapter, section, etc.) that is the scope of the definition. That will be more efficient, and more flexible, given the various structures of codes.
We'll have duplicate IDs between sections and containers (titles, chapters, articles, parts, etc.), so definitions will need an indicator as to whether they apply to just one section or to something broader, and then the applicability ID can be understood to refer to one of two tables.
Right now it's only updated manually.
That's an Ajax lazy-load function to replace the CSS-based approach.
https://developers.google.com/webfonts/docs/webfont_loader#Example
For example, § 21-112.22:
Whenever the words “circuit court” are used in this chapter, they shall also be construed to mean “circuit or corporation court” of a city; whenever the word “county” appears in this chapter, it shall also be construed to mean “city,” and whenever the words “governing body of a county” shall appear, they shall also be construed to mean “city council.”
The sentences probably can be broken up by semicolon. Look at the example: there are three definitions (separated first by a semicolon and then by a comma), but there's a clear pattern of saying "whenever," then some words in quotation marks, then some other words in quotation marks, and then the next comma, period, or semicolon. We can break up the text on that basis.
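A first pass at that pattern could be a single regex: "whenever the word(s)", a quoted term, some unquoted connective text, then a quoted equivalent. The sketch below handles the curly quotes used in the actual code text and trims trailing punctuation that falls inside the closing quote; it is tuned to the example above, not a general parser.

```php
<?php
// Extract [term => equivalent] pairs from a construction clause of the
// form: whenever the word(s) "X" ... "Y" ... The quote classes accept
// both straight and curly quotes; /u enables UTF-8 matching.
function parse_construction_clause($text)
{
    $open  = '["\x{201C}]';
    $close = '["\x{201D}]';
    $inner = '[^"\x{201D}]+';   // quoted content
    $gap   = '[^"\x{201C}]+';   // unquoted connective text
    $pattern = '/whenever\s+the\s+words?\s+' . $open . '(' . $inner . ')'
        . $close . $gap . $open . '(' . $inner . ')' . $close . '/iu';
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);
    $pairs = [];
    foreach ($matches as $m) {
        // Commas and periods can sit inside the closing quote; trim them.
        $pairs[$m[1]] = rtrim($m[2], '.,');
    }
    return $pairs;
}
```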
There are two promising methods of providing EPUB files: ePubExport and "EPub". The former is a MediaWiki extension; the latter, a PHP class. Elmer Masters took the "EPub" class and improved upon it for Free Law Reporter, specifically found here.
We’re going to need a custom function for each state (which means an entire custom function file for each state) that will return the URL for that section on the state's official website. For some states it’s easy, but for others (Florida), it’s tricky.
The Supreme Court of Virginia has long provided only PDF-based opinions, and now it looks like the text-based opinions of the Court of Appeals have been abandoned, without any updates for 14 months, while the PDF-based opinions are current.
The trick here is that those decisions are PDFs, and the text needs to be scraped out.
See Juriscraper, CALI's Free Law Reporter, pdfminer, and the especially promising pdfextract.
(Strictly speaking, this is a feature needed on Open Virginia, not The State Decoded, but having a solution in place for this will be helpful to the many other states who will want to implement this.)
Stripe uses HTTP Auth. What are the merits of this approach?
One of the problems is liable to be that it's an additional obstacle to testing. If we're to make it simple for people to try these things out, this could be problematic.
Most people will not know what a "Class 6 felony" is. The punishment should be listed, upon mouseover.
Because they're not citations to individual sections, they're not being linked right now.
Do not define a term if it is already contained within definition tags. This will necessitate modifying $definition_word_list within section.php, specifically the PCRE, to get it to ignore any text within <span class="definition"> tags.
We're using a PCRE for this, but it's not working. The reason it's not working is because it's looking for an individual word that's within tags. We're not going to encounter those, because we apply definitions from longest to shortest. So if we first apply the definition for "natural-born person," then "person" will be matched again, because "person" is followed by a tag, but it is not preceded immediately by one.
The solution to this is going to require more thought. The fix is not to apply definitions from the shortest to the longest, because the longer the term, the more specific it tends to be towards the section, chapter, and title.
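One approach that keeps the longest-first ordering: before applying each term, swap every existing <span class="definition">…</span> out for an opaque placeholder, run the replacement, then restore the placeholders. Text inside an already-applied longer term can then never be matched again. This is a sketch, and it assumes the stored spans are not themselves nested.

```php
<?php
// Apply one term's replacement while protecting text that is already
// inside a definition span. \x01 is used as a placeholder delimiter on
// the assumption that it never appears in law text.
function apply_term_protected($html, $term, $replacement_html)
{
    $stash = [];
    $protected = preg_replace_callback(
        '/<span class="definition"[^>]*>.*?<\/span>/s',
        function ($m) use (&$stash) {
            $token = "\x01" . count($stash) . "\x01";
            $stash[$token] = $m[0];
            return $token;
        },
        $html
    );
    $pattern = '/\b' . preg_quote($term, '/') . '\b/i';
    $protected = preg_replace($pattern, $replacement_html, $protected);
    // Restore the protected spans.
    return strtr($protected, $stash);
}
```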
For instance, “supervisors” in 15.2-2000. In the database, it’s recorded as “supervisor.” I notice the same problem with “funds”/“fund.” Determine what the cause of this problem is, and fix it.
For instance, 2.2-315 says that 37.2-100 shall apply "mutatis mutandis" to the terms used in that article. WTF?
http://www.courts.state.va.us/ has a glossary, and searching the site for the word turns up PDFs of both criminal and civil glossaries, intended for clerks of court.
http://www.uscourts.gov/Common/Glossary.aspx
http://www.nycourts.gov/lawlibraries/glossary.shtml
Consider using the 1910 edition of Black's Law Dictionary, since it's in the public domain.
Nolo has such a guide. Perhaps there are some terms under which they'd be willing to license it?
There's also http://definitions.uslegal.com/, but the terms of (re)use are totally unclear.
Also, Wiktionary.
http://en.wiktionary.org/wiki/Appendix:Legal_terms
http://en.wiktionary.org/wiki/Category:en:Law
In Virginia, for instance, we know (via Richmond Sunlight) whether the bill passed or failed, we know what year it took place, and we know about the legislator. All of that data should be displayed, some in tooltip form.
The title of 38.2, chapter 13 is too long; it ends with an “N”. I speculate that the title is longer than the SGML field can contain. The solution is probably to check whether the title is at that maximum length and the last character is an “N” and, if so, replace it with an ellipsis.
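The proposed check is small enough to sketch. The maximum length here is a guess and should be measured against the real SGML data; the trailing-"N" test mirrors the speculation above.

```php
<?php
// If a title sits at the assumed field limit and ends in "N", treat it as
// truncated and append an ellipsis. $max_length is a placeholder value.
function repair_truncated_title($title, $max_length = 100)
{
    if (mb_strlen($title) >= $max_length && substr($title, -1) === 'N') {
        return rtrim(substr($title, 0, -1)) . '…';
    }
    return $title;
}
```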
Calculate the MD5 hash of every definition (use a trigger, on insert or update?). List all of the places each term is defined, and every unique definition, with some sort of an indicator of where in the code that each is used.
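The analysis side of this could work as follows: group stored definitions by the MD5 of their text, yielding each term's distinct definitions and the sections where each appears. The row shape below is hypothetical; in production the hash would live in a column maintained by the insert/update trigger.

```php
<?php
// Group definition rows by term, then by MD5 of the definition text, so
// identical definitions collapse together and each carries the list of
// sections that use it.
function group_definitions(array $rows)
{
    $grouped = [];
    foreach ($rows as $row) {
        $hash = md5($row['definition']);
        $grouped[$row['term']][$hash]['definition'] = $row['definition'];
        $grouped[$row['term']][$hash]['sections'][] = $row['section'];
    }
    return $grouped;
}
```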
For any content that changes (court cases, tags, comments), it'll be necessary to run a delta reindexing in Solr periodically. Figure out a reasonable schedule and then figure out a way to schedule that. (Cron is clearly the simplest solution, though it complicates installation slightly.)
The parser currently handles only sections entitled "Definitions." This prevents the parsing of definitions within the scope of a single section. Broaden the definition parser to examine all sections.
API URLs are currently a mess—exposed .php extensions, a false "1.0" in the URL structure (do we need the version number in there at all?), etc. Get things cleaner.
It's important to preserve the line breaks in tables, but the text unwrapping functionality breaks that. Prevent carriage returns from being stripped out of tables.
Take the history data, currently stored as a single field, and break it down to its smallest units. For Virginia, it would be enough to store the year and the Acts of the General Assembly identifier, but for Florida we can see that it's rather more complicated. So the table storing this will need to be flexible.
Doing this will make it easier to perform bulk analysis—which laws were enacted in a given year, for instance.
Currently there is a title table and a chapter table. This obviously will not work for codes that use different structures. Flatten these into a single table with a parent/child relationship. Also, think through how to label each level in that hierarchy. A second table? One table, and actually insert the correct label ("part," "chapter," etc.) into the table for each entry? Keep it in an array in the config file?
Also, section.php should be renamed law.php, and chapter/title.php should be replaced with a single file to handle both (and more) layers of functionality.
It may be helpful to establish a view in the database that has one row for each structural endpoint, storing its entire pedigree. For instance, in the Virginia code there would be one row for each chapter, listing also its title.
XML, JSON, and PHP all seem like sensible response formats to provide.
Folks who read to the bottom shouldn't have to scroll up to proceed.
If a section was passed and then updated just once, it says "updated in and 1995." This is a problem resulting from section.php. This, of course, should say only "updated in 1995." Fix this pluralization problem, which is a result of using foreach to iterate through an object.
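A sketch of the fix: build the year list explicitly instead of concatenating "and" inside the foreach, so a single year never picks up a stray conjunction.

```php
<?php
// Join a list of years as prose: one year stands alone, two or more get
// "and" before the last. Avoids the "updated in and 1995" bug.
function join_years(array $years)
{
    if (count($years) <= 1) {
        return implode('', $years);
    }
    $last = array_pop($years);
    return implode(', ', $years) . ' and ' . $last;
}
```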
Some sections, under circumstances not yet understood, have no titles. The use of DT/DD tags on title/chapter/etc. listings results in DDs flowing up, resulting in all subsequent items in the list being mismatched. While it's necessary to fix the underlying problem of missing data, it would also be wise to have the HTML accommodate missing section names.
Provide example requests and example responses. For the responses, straight up display the JSON (or whatever)—don't sugarcoat it. API documentation is for big boys & girls. (See issue #12 for guidance.)
404s need to be handled properly, returning a header and the contents of a dedicated 404 page.
Some defined terms, in the pop-up, are not being uppercased. See “circuit court” in http://vacode.org/21-116/.
PHP, Python, and Ruby all stand out as obvious choices. I think we'd want functions to retrieve a given law, a definition, or a given structural unit, and convert JSON into the language's native object. It should be pretty easy.
Embedded definitions are sometimes simply too long. They need to be abbreviated. Although a good stopgap solution would be to truncate them after a particular character or word count, it would be best to come up with something more intelligent. For instance, to pick some random numbers, it might be determined that 50 words is a good limit, but 70 could be acceptable. That would allow long definitions to be truncated intelligently to break cleanly at the end of sentences.
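That idea can be sketched as a soft limit and a hard limit: aim for the soft word count, but keep going up to the hard count if doing so lets the text end at a sentence boundary. The 50/70 figures from the ticket are placeholders here.

```php
<?php
// Truncate a definition at roughly $target words, extending to at most
// $max words if a sentence ends in that window; otherwise hard-truncate
// at $target and append an ellipsis.
function truncate_definition($text, $target = 50, $max = 70)
{
    $words = preg_split('/\s+/', trim($text));
    if (count($words) <= $target) {
        return $text;
    }
    // Look for a word that ends a sentence between the two limits.
    for ($i = $target; $i < min($max, count($words)); $i++) {
        if (preg_match('/[.!?]$/', $words[$i - 1])) {
            return implode(' ', array_slice($words, 0, $i));
        }
    }
    // No clean break found: hard-truncate at the soft limit.
    return implode(' ', array_slice($words, 0, $target)) . '…';
}
```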