lukevanin / ocrai
Optical Character Recognition Artificial Intelligence iOS app for Udacity nanodegree
License: MIT License
Support Microsoft computer vision API for extracting text from images.
https://www.microsoft.com/cognitive-services/en-us/computer-vision-api
Hello, I'm getting an issue with this code.
The GoogleVisionApi key has expired. What can I do?
Show document as slide out detail view.
Current: Raw data from data detector is stored as fragments with annotations demarcating the detected data. The user views and edits fragments directly.
Problem: Fragment data does not correspond directly to the user's needs. E.g. changing a field to a different type, inserting a new field, or removing a field, causes the data to no longer correspond to the scanned data.
Goal: Decouple scanned data from user data. Data should be modelled to better fit the intent of user modification. Original data should be retained if needed separately from user modification.
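One way to picture this goal is the minimal sketch below. All type and property names (`ScannedFragment`, `UserField`, `Document`, `sourceFragmentID`) are hypothetical and do not come from the project. The idea is that scan results are kept as an immutable record, while the user edits a separate list of fields that is only seeded from them.

```swift
// Hypothetical model, for illustration only.

/// Immutable record of what the scanner detected.
struct ScannedFragment {
    let id: Int
    let text: String
    let detectedType: String   // e.g. "phone", "email"
}

/// A field the user owns and may retype, edit, add, or remove.
struct UserField {
    var type: String
    var value: String
    /// Optional link back to the fragment the field was derived from.
    let sourceFragmentID: Int?
}

struct Document {
    /// Original scan results; never mutated after import.
    let fragments: [ScannedFragment]
    /// User-facing data, seeded from the fragments.
    var fields: [UserField]

    init(fragments: [ScannedFragment]) {
        self.fragments = fragments
        self.fields = fragments.map {
            UserField(type: $0.detectedType, value: $0.text, sourceFragmentID: $0.id)
        }
    }
}

var document = Document(fragments: [
    ScannedFragment(id: 1, text: "555 0100", detectedType: "phone")
])

// The user changes a field type and inserts a new field.
document.fields[0].type = "fax"
document.fields.append(
    UserField(type: "note", value: "Met at conference", sourceFragmentID: nil)
)
```

With this split, user edits never touch `fragments`, so the original scan stays available for comparison or re-import.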
Support Microsoft entity linking API for entity extraction.
https://www.microsoft.com/cognitive-services/en-us/entity-linking-intelligence-service
Role describes the position a person fills at an organisation.
Identify and parse machine readable codes using CoreImage.
Support face identification and cropping (Google, Microsoft, etc)
Expected:
Possible solutions:
Reproduce:
Expected:
Empty field should not be saved.
Coloured dots are shown next to each field. The dots are intended to indicate the field type. The colour is ambiguous without context.
Goal: Add a legend to indicate the field type, or add icons instead of dots, or remove indicators entirely and rely on section headers.
Add capability using existing data detector.
After scanning, data is stored as key-value pairs. It would be beneficial to store certain kinds of data, such as addresses, in specialised data structures.
Structured data
Addresses consist of multiple components, and can be used to derive additional data, such as geographical coordinates. The current key-value storage schema prevents this.
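As a rough sketch of what a specialised structure for addresses might look like, consider the following. The component names and the `Coordinate` type are assumptions made for illustration; the issue does not specify which components to store.

```swift
// Hypothetical address model; component names are assumptions.
struct Coordinate {
    let latitude: Double
    let longitude: Double
}

struct Address {
    var street: String?
    var city: String?
    var postalCode: String?
    var country: String?
    /// Derived data, e.g. filled in later by a geocoding step.
    var coordinate: Coordinate?

    /// Single-line representation built from whichever components are present.
    var formatted: String {
        [street, city, postalCode, country]
            .compactMap { $0 }
            .filter { !$0.isEmpty }
            .joined(separator: ", ")
    }
}

let address = Address(
    street: "1 Example Road",
    city: "Cape Town",
    postalCode: nil,
    country: "South Africa",
    coordinate: nil
)
let line = address.formatted
```

Keeping the components separate is what makes derived values such as coordinates possible; a flat key-value string would have to be re-parsed each time.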
Unstructured data
Unstructured data, such as names and untagged text, may be stored as key-value pairs. The data may be tagged to indicate its intent, e.g. name (first and last if possible), organisation, department, salutation.
Semi-structured data
Semi-structured data, such as phone numbers, URLs, email addresses, and social media names, may also be stored as plain text. These values may be labelled (e.g. home, work, fax) to indicate their role. It would be beneficial to provide UI functions specific to the type of data, e.g. call a phone number, send a message to a phone number or email address, or open a web page. All of these can be shared. This kind of data should be validated for conformance to accepted protocols. When the user edits information it should be checked for conformance. If the data does not conform, it should still be saved, with a warning shown.
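The "save anyway, but warn" behaviour could be sketched as follows. The names `FieldKind`, `SaveResult`, `conforms`, and `save` are hypothetical. The checks are deliberately loose stand-ins and not real protocol validation, and only two kinds of data are shown.

```swift
// Hypothetical validation sketch; the rules are simplified assumptions.
enum FieldKind {
    case phone
    case email
}

struct SaveResult {
    let savedValue: String
    /// Non-nil when the value was saved but does not conform.
    let warning: String?
}

func conforms(_ value: String, to kind: FieldKind) -> Bool {
    switch kind {
    case .phone:
        // Loose check: at least one digit, and only common phone characters.
        let allowed = "+0123456789 -()"
        return value.contains { $0.isNumber } && value.allSatisfy { allowed.contains($0) }
    case .email:
        // Loose check: exactly one "@", with a dot somewhere in the domain part.
        let parts = value.split(separator: "@", omittingEmptySubsequences: false)
        return parts.count == 2 && !parts[0].isEmpty && parts[1].contains(".")
    }
}

/// Always keeps the user's value; attaches a warning if it does not conform.
func save(_ value: String, as kind: FieldKind) -> SaveResult {
    let warning = conforms(value, to: kind)
        ? nil
        : "Value does not look like a valid \(kind)"
    return SaveResult(savedValue: value, warning: warning)
}

let good = save("+27 21 555 0100", as: .phone)
let odd = save("call me", as: .phone)
```

Returning the warning alongside the saved value lets the UI decide how to present it, without the validation step ever blocking the save.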
Actions which can be performed on any field:
Define abstract interface to be implemented by model objects. Interface should define the actions which the object can perform.
Define abstract interface for actions. Actions do not have state. An action is simply an interface to a task which can be executed. Actions may need to be aware of the view hierarchy (i.e. view controller) to present UI. Do actions need to notify the application on completion? An action may be shown as a table view action (delete), or as an activity. Actions may need to define a presentation intent.
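A minimal sketch of the two interfaces described above might look like this. Every name here (`FieldAction`, `ActionProviding`, `PresentationIntent`, `CopyAction`, `DeleteAction`, `PhoneField`) is hypothetical. The view controller parameter is left out to keep the example free of UIKit, and the completion callback is only one possible answer to the open question about notifying the application.

```swift
// Hypothetical interfaces; not taken from the project.

/// How an action would like to be shown.
enum PresentationIntent {
    case rowAction   // e.g. a table view swipe action such as delete
    case activity    // e.g. an entry in a share or activity sheet
}

/// A stateless task that can be executed against a value.
protocol FieldAction {
    var title: String { get }
    var presentationIntent: PresentationIntent { get }
    func perform(on value: String, completion: (Bool) -> Void)
}

/// Implemented by model objects to declare the actions they support.
protocol ActionProviding {
    var actions: [FieldAction] { get }
}

struct CopyAction: FieldAction {
    let title = "Copy"
    let presentationIntent = PresentationIntent.activity
    func perform(on value: String, completion: (Bool) -> Void) {
        // A real action would write to the pasteboard; this one only reports success.
        completion(true)
    }
}

struct DeleteAction: FieldAction {
    let title = "Delete"
    let presentationIntent = PresentationIntent.rowAction
    func perform(on value: String, completion: (Bool) -> Void) {
        // A real action would remove the field from the data store.
        completion(true)
    }
}

struct PhoneField: ActionProviding {
    let number: String
    var actions: [FieldAction] { [CopyAction(), DeleteAction()] }
}

let field = PhoneField(number: "555 0100")
var finished = false
field.actions[0].perform(on: field.number) { success in
    finished = success
}
```

Because actions hold no state, the same `CopyAction` value can be offered by any model object that provides it.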
Image orientation metadata is not used when rendering annotation overlays. Either the image should be re-rendered with the orientation applied, so that the metadata is no longer needed, or the annotations should be rendered using the same orientation.
Current: Scanner process works atomically. Document is scanned in full, then imported into the data store.
Problem: User must wait for the entire scanning process to complete before seeing results.
Goal: Scanner should update data store incrementally as soon as data becomes available.
Implementation: Create a builder interface for composing a document. Scanners send detected data to the builder. The builder updates the data store. The view controller observes the data store and updates the view when the data store changes.
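The flow described in the implementation note can be sketched roughly as below. `DetectedField`, `DataStore`, `DocumentBuilder`, `StoreBackedBuilder`, and `runTextScanner` are hypothetical names, and a plain closure stands in for the view controller that would observe the store.

```swift
// Hypothetical sketch of incremental scanning via a builder.
struct DetectedField {
    let type: String
    let value: String
}

/// In-memory stand-in for the data store; notifies observers on every change.
final class DataStore {
    private(set) var fields: [DetectedField] = []
    private var observers: [([DetectedField]) -> Void] = []

    func observe(_ observer: @escaping ([DetectedField]) -> Void) {
        observers.append(observer)
    }

    func insert(_ field: DetectedField) {
        fields.append(field)
        observers.forEach { $0(fields) }
    }
}

/// The only interface a scanner needs to know about.
protocol DocumentBuilder {
    func add(_ field: DetectedField)
}

/// Builder that forwards each detected field to the store as it arrives.
struct StoreBackedBuilder: DocumentBuilder {
    let store: DataStore
    func add(_ field: DetectedField) {
        store.insert(field)
    }
}

/// Simulated scanner that reports results one at a time.
func runTextScanner(into builder: DocumentBuilder) {
    builder.add(DetectedField(type: "email", value: "a@example.com"))
    builder.add(DetectedField(type: "phone", value: "555 0100"))
}

let store = DataStore()
var updateCount = 0
store.observe { _ in
    // A view controller would reload its view here.
    updateCount += 1
}
runTextScanner(into: StoreBackedBuilder(store: store))
```

The observer is called once per detected field rather than once per document, which is the incremental behaviour the goal asks for.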
Reproduce:
Occurs when the scanned text data contains recognisable address data interleaved with other data. The app does not recognise that the two parts of data are related.
The related parts should be merged into a single address entity. Distinct addresses should remain separate.
Possible solutions:
This may be resolved using Microsoft Vision API which groups information differently.
Alternatively, allow user to select addresses to merge. Use case:
Currently this uses a table view which spans the width of the screen.
Current: Fields are grouped by type. Fields are edited inline. Field type is changed by dragging to a different section.
Problem: Editing controls (edit, add, move) make the view feel busy and crowded, which impedes usability. Dragging fields is problematic: the target section may be off screen, which requires scrolling while dragging and is hard to do reliably, and the user may not know in which direction to drag a field.
Goal: Tap on a field to show an edit screen for that field. Show a picker with field types. Customise the view to accommodate the data being edited (allow multiple lines for addresses, disallow them for phone numbers and email addresses).
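The per-type customisation could be driven by the field type itself, as in the sketch below. `FieldType` and `EditScreenConfiguration` are hypothetical names, and only the three types mentioned in the goal are modelled.

```swift
// Hypothetical sketch; type names and cases are assumptions.
enum FieldType: String, CaseIterable {
    case address
    case phone
    case email

    /// Addresses may span several lines; phone numbers and email addresses may not.
    var allowsMultipleLines: Bool {
        switch self {
        case .address:
            return true
        case .phone, .email:
            return false
        }
    }
}

/// Describes how the edit screen should be set up for one field.
struct EditScreenConfiguration {
    /// Options offered by the field type picker.
    let pickerOptions: [FieldType]
    let allowsMultipleLines: Bool

    init(editing type: FieldType) {
        pickerOptions = FieldType.allCases
        allowsMultipleLines = type.allowsMultipleLines
    }
}

let addressScreen = EditScreenConfiguration(editing: .address)
let phoneScreen = EditScreenConfiguration(editing: .phone)
```

Changing the type in the picker would then simply mean building a new configuration for the selected type, with no dragging between sections.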
Reproduce:
Types of data:
Search documents from listing screen
Expected:
Message or activity indicator should appear to show that scanning is in progress.