omeka-s-modules / csvimport
License: GNU General Public License v3.0
For the second run through CSVImporter, first uninstall the module, then update the master branch, then install it again.
There's now a user import feature to look at. The CSV file takes columns for name, email address, and role. New users should get an email when they are created.
The files that didn't work first time around should now either work, or give more useful error messages.
There's also a variety of changes to the interface.
Right now you just end up on a blank version of the mapping page
Opened the Importer module, selected a users file, selected Users from the dropdown (of Users/Items). When I click "Next" I get the following error:
Notice: Undefined variable: serviceLocator in /websites/omekadev/home/www/omeka-s/modules/CSVImport/src/Form/MappingForm.php on line 80
Catchable fatal error: Argument 1 passed to Omeka\Form\Element\ResourceSelect::__construct() must implement interface Zend\ServiceManager\ServiceLocatorInterface, null given, called in /websites/omekadev/home/www/omeka-s/modules/CSVImport/src/Form/MappingForm.php on line 80 and defined in /websites/omekadev/home/www/omeka-s/application/src/Form/Element/ResourceSelect.php on line 19
On the shared install (dev/omeka-s), importing a file that looks like this:
email,role,display name
[email one],Editor,Megan1
[email two],Editor,Megan2
Either check the user's role, or wait to see if a specific permission to check gets built into core.
On the shared install (dev.omeka.org/omeka-s) imports of files which are borked for some reason don't return a log, or even end with "error" - they get stuck in "in_progress"
I think one of the issues might have been importing a file with item set ids from a second install (my own test install).
See job 289 http://dev.omeka.org/omeka-s/admin/job/289
File attached
must be pointing to the wrong partial?
Either for entire CSV import, or mapped in a column by id or, more usefully, email address
Fallback is that the user doing the import is the owner. That raises the permissions question of who can import.
From #2
ownership??? (restricted by role, specified by email?)
If headings are qnames / terms in Omeka, add the ability to automatically map them.
So, if a CSV has headings like dcterms:title | dcterms:description etc., automatically figure out the mappings that make sense.
Will call for an intermediary step when the CSV is first read. Either the module just tries to automatically guess, or the user makes an explicit assertion that it should try to map. Explicit assertion could happen either at first read or on the mapping page.
Automatic guessing would just parse the headings around a : and see if the first part matches any prefixes known to S.
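A rough sketch of the guessing step, in case it helps the discussion. The function name guessMappings() and the $knownPrefixes list are made up for illustration; a real implementation would presumably pull the prefixes from the vocabularies installed in S rather than take them as an argument.

```php
<?php
// Hypothetical helper: guess a term for each CSV heading.
// $knownPrefixes is an assumed, hand-supplied list of vocabulary prefixes.
function guessMappings(array $headings, array $knownPrefixes): array
{
    $mappings = [];
    foreach ($headings as $index => $heading) {
        $heading = trim($heading);
        // Split around the first ':' only, so "dcterms:title" gives
        // prefix "dcterms" and local name "title".
        $parts = explode(':', $heading, 2);
        if (count($parts) === 2
            && $parts[1] !== ''
            && in_array($parts[0], $knownPrefixes, true)
        ) {
            $mappings[$index] = $heading;
        } else {
            // No recognizable prefix: leave the column for manual mapping.
            $mappings[$index] = null;
        }
    }
    return $mappings;
}

$headings = ['dcterms:title', 'dcterms:description', 'notes', 'http://example.org'];
$result = guessMappings($headings, ['dcterms', 'foaf', 'bibo']);
```

Columns that come back as null would still go through the normal mapping page, so the guess only pre-fills what it can recognize.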
The basic prototype for a module that would integrate with this would be Mapping.
Currently some just build the HTML line by line, which is hard to follow and edit, and misses variables in the view.
On the develop branch, after selecting a csv file and clicking the 'Next' button, I received the following runtime error:
There was an error.
exception 'RuntimeException' with message 'Cannot read from file /private/var/tmp/omekamtNK3T' in /Users/kim/Sites/omeka-s/modules/CSVImport/src/CsvFile.php:22
Stack trace:
#0 /Users/kim/Sites/omeka-s/modules/CSVImport/src/CsvFile.php(22): SplFileObject->fgets()
#1 /Users/kim/Sites/omeka-s/modules/CSVImport/src/Controller/IndexController.php(50): CSVImport\CsvFile->isUtf8()
#2 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-mvc/src/Controller/AbstractActionController.php(82): CSVImport\Controller\IndexController->mapAction()
#3 [internal function]: Zend\Mvc\Controller\AbstractActionController->onDispatch(Object(Zend\Mvc\MvcEvent))
#4 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-eventmanager/src/EventManager.php(444): call_user_func(Array, Object(Zend\Mvc\MvcEvent))
#5 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-eventmanager/src/EventManager.php(205): Zend\EventManager\EventManager->triggerListeners('dispatch', Object(Zend\Mvc\MvcEvent), Object(Closure))
#6 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-mvc/src/Controller/AbstractController.php(118): Zend\EventManager\EventManager->trigger('dispatch', Object(Zend\Mvc\MvcEvent), Object(Closure))
#7 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-mvc/src/DispatchListener.php(93): Zend\Mvc\Controller\AbstractController->dispatch(Object(Zend\Http\PhpEnvironment\Request), Object(Zend\Http\PhpEnvironment\Response))
#8 [internal function]: Zend\Mvc\DispatchListener->onDispatch(Object(Zend\Mvc\MvcEvent))
#9 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-eventmanager/src/EventManager.php(444): call_user_func(Array, Object(Zend\Mvc\MvcEvent))
#10 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-eventmanager/src/EventManager.php(205): Zend\EventManager\EventManager->triggerListeners('dispatch', Object(Zend\Mvc\MvcEvent), Object(Closure))
#11 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-mvc/src/Application.php(314): Zend\EventManager\EventManager->trigger('dispatch', Object(Zend\Mvc\MvcEvent), Object(Closure))
#12 /Users/kim/Sites/omeka-s/index.php(17): Zend\Mvc\Application->run()
#13 {main}
In moving to extending ItemForm for use in MappingForm, I lost the multivalue separator
CSV Importer is ready to have a look. Master branch of CSVImport, and latest develop branch for Omeka S.
CSV file(s) should have a lot of different columns, including:
On the import page, choose the csv file, hit next. Try all the different mapping options and combinations you can think of. The 'Basic Import Settings' should apply to all the items imported, unless they are overridden / added to in the mapping options.
In the mapping options, the 'URL' checkbox means that the data type for the values will be a URL, and the 'multiple values' checkbox means that there are multiple values, separated by whatever you put in the CSV file and set in the basic settings.
Click the buttons for the different types of mappings available -- metadata, media, or item data (like the owner email or item set id). Sidebars for each type will let you pick what kind of mapping to use for each column.
Lots of variations to look at, so a couple different CSV files might be good.
I'll also be curious about how it handles CSV files that are broken in different ways. Separate new issues for each thing that comes up will probably be much more manageable than comments here.
Let me know what questions come up.
Thanks
Can no longer extend ItemForm
Dropdown to select owner defaults to user 1, not the current user.
To clarify that a job runs to completion but has issues with some rows, add a message to the comment pointing people to the job's log when there's something interesting logged.
Either by RC id and/or by a QName
Difference is setting it in the basic import options vs. pulling from the CSV itself. Pulling from CSV adds a mapping interface
From scoping issue #2
resource class (by ID?, QName?, some weird combo of vocab label and class label)
1st step, just run through a simple file for metadata, ignoring multiple values and media
Instead of the getSidebar() function returning the <div><legend>... html at the top, build that behind the scenes with getName() and getLabel().
Also add an options array to add classes in addition to 'sidebar'.
Job is reported as Completed, despite the exception below. Happens both doing create and batchCreate.
Entirely possible that this issue is also present in all the importers.
2016-03-21T21:33:28+00:00 ERR (3): exception 'Omeka\Api\Exception\ValidationException' in /var/www/omekas/application/src/Api/Adapter/AbstractEntityAdapter.php:428
Stack trace:
#0 /var/www/omekas/application/src/Api/Adapter/AbstractEntityAdapter.php(288): Omeka\Api\Adapter\AbstractEntityAdapter->hydrateEntity(Object(Omeka\Api\Request), Object(Omeka\Entity\Item), Object(Omeka\Stdlib\ErrorStore))
#1 /var/www/omekas/application/src/Api/Manager.php(325): Omeka\Api\Adapter\AbstractEntityAdapter->batchCreate(Object(Omeka\Api\Request))
#2 /var/www/omekas/application/src/Api/Manager.php(197): Omeka\Api\Manager->executeBatchCreate(Object(Omeka\Api\Request), Object(Omeka\Api\Adapter\ItemAdapter))
#3 /var/www/omekas/application/src/Api/Manager.php(69): Omeka\Api\Manager->execute(Object(Omeka\Api\Request))
#4 /var/www/omekas/modules/CSVImport/src/Job/Import.php(109): Omeka\Api\Manager->batchCreate('items', Array, Array, true)
#5 /var/www/omekas/modules/CSVImport/src/Job/Import.php(63): CSVImport\Job\Import->createItems(Array)
#6 /var/www/omekas/application/src/Job/Strategy/SynchronousStrategy.php(26): CSVImport\Job\Import->perform()
#7 /var/www/omekas/application/src/Job/Dispatcher.php(85): Omeka\Job\Strategy\SynchronousStrategy->send(Object(Omeka\Entity\Job))
#8 /var/www/omekas/data/scripts/perform-job.php(43): Omeka\Job\Dispatcher->send(Object(Omeka\Entity\Job), Object(Omeka\Job\Strategy\SynchronousStrategy))
#9 {main}
Example offending JSON for batchCreate. First two import fine.
2016-03-21T21:33:23+00:00 DEBUG (7): Array
(
    [0] => Array
    (
        [o:item_set] => Array
        (
        )
        [0] => Array
        (
            [0] => Array
            (
                [@value] => Walden
                [property_id] => 1
                [type] => literal
            )
        )
        [o:media] => Array
        (
            [0] => Array
            (
                [o:ingester] => url
                [o:source] => http://upload.wikimedia.org/wikipedia/commons/2/25/Walden_Thoreau.jpg
                [ingest_url] => http://upload.wikimedia.org/wikipedia/commons/2/25/Walden_Thoreau.jpg
            )
        )
    )
    [1] => Array
    (
        [o:item_set] => Array
        (
        )
        [0] => Array
        (
            [0] => Array
            (
                [@value] => The Count of Monte Cristo
                [property_id] => 1
                [type] => literal
            )
        )
        [o:media] => Array
        (
            [0] => Array
            (
                [o:ingester] => url
                [o:source] => http://upload.wikimedia.org/wikipedia/commons/c/c3/Edmond_Dant%C3%A8s.JPG
                [ingest_url] => http://upload.wikimedia.org/wikipedia/commons/c/c3/Edmond_Dant%C3%A8s.JPG
            )
        )
    )
    [2] => Array
    (
        [o:item_set] => Array
        (
        )
        [0] => Array
        (
            [0] => Array
            (
                [@value] => Narrative of the Life of Frederick Douglass
                [property_id] => 1
                [type] => literal
            )
        )
        [o:media] => Array
        (
            [0] => Array
            (
                [o:ingester] => url
                [o:source] => http://upload.wikimedia.org/wikipedia/commons/f/f5/Sketchofdouglass.jpg
                [ingest_url] => http://upload.wikimedia.org/wikipedia/commons/f/f5/Sketchofdouglass.jpg
            )
            [1] => Array
            (
                [o:ingester] => url
                [o:source] => http://upload.wikimedia.org/wikipedia/commons/b/ba/Henry_David_Thoreau.jpg
                [ingest_url] => http://upload.wikimedia.org/wikipedia/commons/b/ba/Henry_David_Thoreau.jpg
            )
        )
    )
)
When doing mappings, the type of data (e.g., properties, item data, etc) isn't highlighted to show where you are in the steps. As more and more mapping options become available, that guide matching the mapping view with the sidebar will become more important.
The existing structure is (mostly) column-by-column. Sense I get from #2 is to have an additional mapping for row-by-row. That is, in addition to mapping title to dcterms:title, map e.g. item_set to the Item set for the entire item represented by the row.
Based on the current 'new-mapping' branch, that would likely become a new row/item-based sidebar?
Error message in job log is Fatal error: Call to a member function label() on a non-object in /websites/omekadev/home/www/omeka-s/modules/CSVImport/src/Mapping/ItemMapping.php on line 123
(hooray error logging working!)
File is attached as a zip. Process is as follows:
The job should hit an error and stop before importing a single item.
For previous attempts of this CSV see jobs: 270, 273, 275 on the shared install.
CSV maker - Dickensia.csv.zip
At the very least, $args always needs to get passed in, and probably $serviceLocator, too
for the csrf check, if nothing else
For the items mapping, sometimes the URI and Multivalue checkboxes make sense (e.g., when mapping properties). Other times, they make no sense (e.g., most item and map data).
Not sure where/how to get the info. Seems like it'd need a pile of annoying javascript from each mapping class.
Seems like this should be named 'CsvImporter' to follow pattern of other module. The cases on letters might help follow conventions used throughout S
Add a basic import setting to make everything in the import get the same template.
Alternatively, read it / map it from the CSV row data.
From #2
resource template (by label, or ID)?
I've tried with two different DC class terms (Text and Moving Image) in separate CSVs (one class per CSV). In both instances, all items were successfully created but all were assigned the class "Agent".
Jobs 248 and 251 on the shared install.
MapClass.zip
Currently, it only handles files from url. Needs to handle all the media types.
Calls for reworking the data posted to be generalized, so that I don't have to hard code all the possible types in advance.
Data needs: column index, ingester, source. JS just needs to expand the branch around how it deals with media.
Pass that along to the job. In buildMediaJson, dig up that data and do a switch accordingly.
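For discussion, a sketch of what a generalized buildMediaJson might look like. The signature and the $mediaMap shape (column index => ingester name) are assumptions, not the module's current code. The 'url' case copies the o:ingester / o:source / ingest_url shape seen in the job's debug output; what other ingesters need is left open on purpose.

```php
<?php
// Hypothetical sketch, not the module's actual method.
// $row: one CSV row as a numerically indexed array.
// $mediaMap: column index => ingester name, as posted from the mapping form.
function buildMediaJson(array $row, array $mediaMap): array
{
    $media = [];
    foreach ($mediaMap as $columnIndex => $ingester) {
        if (!isset($row[$columnIndex]) || trim($row[$columnIndex]) === '') {
            continue; // nothing to ingest from this column
        }
        $source = trim($row[$columnIndex]);
        $entry = ['o:ingester' => $ingester, 'o:source' => $source];
        switch ($ingester) {
            case 'url':
                // Same shape as the URL media entries in the debug log.
                $entry['ingest_url'] = $source;
                break;
            default:
                // Assumption: other ingesters need their own extra keys,
                // which are not known here and so are not filled in.
                break;
        }
        $media[] = $entry;
    }
    return $media;
}

$row = ['Walden', 'http://example.org/a.jpg', ''];
$media = buildMediaJson($row, [1 => 'url', 2 => 'url']);
```

Keeping the per-ingester differences inside the switch means the posted data only ever needs the three pieces listed above: column index, ingester, and source.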
Not sure why, but flickr succeeds and vimeo fails.
Both csvs are very simple, the flickr one adds all three items and the vimeo one did not create a single item, even when it was only the header row and a row with title and url for media.
It's being "camel" capitalized in the INI, the nav entry, and the heading "Csv Column" on the mapping screen at least, but fully capitalized in some other headings and places.
I can select a csv, go through and map the properties, and hit "Import," but then the module goes to http://dev.omeka.org/omeka-s/admin/csvimport/map and it's just a blank screen. No job is started, nothing gets imported.
When omeka/omeka-s#633 is done, implement it in the mapping form
Include an identifier for the file, to be used for managing updating records
Probably another checkbox to indicate that a column has multiple values.
And, in the 'global' settings for the import, a place to specify the value separator
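A minimal sketch of how the checkbox and the separator setting could combine when reading a cell. splitValues() is a hypothetical name, and trimming whitespace and dropping empty pieces are assumptions about the desired behavior rather than anything decided here.

```php
<?php
// Hypothetical helper: turn one CSV cell into a list of values.
// $isMultivalue reflects the per-column checkbox; $separator comes from
// the import-wide settings.
function splitValues(string $cell, string $separator, bool $isMultivalue): array
{
    if (!$isMultivalue || $separator === '') {
        // explode() rejects an empty separator, so treat it as single-valued.
        $values = [$cell];
    } else {
        $values = explode($separator, $cell);
    }
    $values = array_map('trim', $values);
    // Drop empty pieces such as those produced by "a;;b".
    $values = array_filter($values, static function (string $v): bool {
        return $v !== '';
    });
    return array_values($values);
}

$multi = splitValues('a; b;;c', ';', true);
$single = splitValues('a; b', ';', false);
```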
Sidebar columns with class flag should not allow multiple flags on the same column.
It'd be nice to catch the common character encoding problems and send a message when the file is initially uploaded.
Might just be a matter of doing:
mb_detect_encoding($string, 'UTF-8', true);
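A sketch of how that check might be applied to the whole upload so the message can point at specific lines. findNonUtf8Lines() and its $maxReported parameter are hypothetical. It uses mb_check_encoding() instead of mb_detect_encoding(), since only a yes/no answer per line is needed, and it assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper: return the 1-based numbers of lines that are not
// valid UTF-8, stopping after $maxReported hits.
function findNonUtf8Lines(string $path, int $maxReported = 5): array
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open file $path");
    }
    $bad = [];
    $lineNumber = 0;
    while (($line = fgets($handle)) !== false) {
        $lineNumber++;
        // Swapped in for mb_detect_encoding(): a plain validity check.
        if (!mb_check_encoding($line, 'UTF-8')) {
            $bad[] = $lineNumber;
            if (count($bad) >= $maxReported) {
                break;
            }
        }
    }
    fclose($handle);
    return $bad;
}

// Usage: the second line holds a Latin-1 byte that is not valid UTF-8.
$tmp = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($tmp, "title\ncaf\xE9\nok\n");
$badLines = findNonUtf8Lines($tmp);
unlink($tmp);
```

An empty result would mean the file passed the check; anything else could be reported back on the upload page before the mapping step.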