Comments (14)
This means the date string you're supplying in the source document is not auto-recognized by ES. Please see http://www.elasticsearch.org/guide/reference/mapping/date-format.html and supply the correct format string in the mapping for your date field.
FYI, issues are not meant to be opened for general ES usage questions. Please post to the mailing list if you're still having trouble.
from elasticsearch-mapper-attachments.
So people who work on the elasticsearch-mapper-attachment plugin expect me to take an email pulled directly out of Outlook and CHANGE the dates inside of it? I assure you I didn't send any malformed dates into ElasticSearch. This was an email that Outlook created that I stored as an attachment.
from elasticsearch-mapper-attachments.
Gosh, I totally overlooked that this was for the mapper plugin and not core ES. These GitHub notifications come from all over the place!
This is either an issue with Tika or our integration with it. I'll reopen so we can take a look.
from elasticsearch-mapper-attachments.
Thanks!
from elasticsearch-mapper-attachments.
Same problem when storing web pages.
For example, I hit this error when trying to index http://www.unm.edu/
Is there any workaround to get this working?
from elasticsearch-mapper-attachments.
@scstarkey @tpatris
Can you provide some sample data that fails when you index it? My assumption is that Tika extracts the date from your document but stores it in the wrong format.
You might be able to change the date format inside the attachment mapping like this (just a wild guess, but worth a try):
{
  "person" : {
    "properties" : {
      "file" : {
        "type" : "attachment",
        "fields" : {
          "date" : {"format" : "yyyy/MM/dd||date_optional_time||date_time"}
        }
      }
    }
  }
}
Note: The format above needs to be adapted to your data, according to http://www.elasticsearch.org/guide/reference/mapping/date-format/
I hope this helps. In any case, please post your samples here so we can make sure we are not chasing a different bug.
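As a rough illustration of the `||` syntax in the mapping above: Elasticsearch tries each listed format in turn until one matches. The Python sketch below mimics that fallback idea only. The function name `parse_with_fallback` and the pattern list are hypothetical, and the patterns use Python `strptime` syntax rather than Elasticsearch format names.

```python
from datetime import datetime
from typing import Optional, Sequence

# Hypothetical candidate patterns in Python strptime syntax
# (these are NOT Elasticsearch format names).
CANDIDATE_FORMATS = (
    "%Y/%m/%d",             # roughly comparable to yyyy/MM/dd
    "%Y-%m-%dT%H:%M:%S%z",  # ISO-like date-time with a UTC offset
    "%Y-%m-%d",             # ISO-like date only
)


def parse_with_fallback(
    value: str, formats: Sequence[str] = CANDIDATE_FORMATS
) -> Optional[datetime]:
    """Return the first successful parse, or None if no pattern matches."""
    for fmt in formats:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue  # this pattern did not match, try the next one
    return None


print(parse_with_fallback("2013/05/14"))
print(parse_with_fallback("Tue, 14 May 2013 08:000:11 -0440"))
```

A value that matches none of the listed patterns, such as the date from the error reported in this thread, falls through every candidate, which is why the format list has to cover whatever the extracted metadata actually contains.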
from elasticsearch-mapper-attachments.
I cannot paste all of the HTML that I want to index here, but you can view it by pressing Ctrl+U in your browser on the page http://www.unm.edu/.
My error is:
{"error"=>"MapperParsingException[Failed to parse [content.date]]; nested: MapperParsingException[failed to parse date field [Tue, 14 May 2013 08:000:11 -0440], tried both date format [dateOptionalTime], and timestamp number]; nested: IllegalArgumentException[Invalid format: \"Tue, 14 May 2013 08:000:11 -0440\"]; ", "status"=>400}
My mapping is:
mappings: {
  weblink: {
    properties: {
      tags: {
        store: yes
        analyzer: keyword
        boost: 2
        type: string
      }
      id: {
        type: integer
      }
      content: {
        path: full
        type: attachment
        fields: {
          content: {
            store: yes
            term_vector: with_positions_offsets
            type: string
          }
          author: {
            store: yes
            type: string
          }
          title: {
            store: yes
            type: string
          }
          keywords: {
            store: yes
            type: string
          }
          name: {
            store: yes
            type: string
          }
          date: {
            format: dateOptionalTime
            type: date
          }
          content_type: {
            store: yes
            type: string
          }
        }
      }
      library_id: {
        type: long
      }
      created_at: {
        store: yes
        format: dateOptionalTime
        type: date
      }
      user_id: {
        type: integer
      }
      type: {
        type: string
      }
      url: {
        index: not_analyzed
        omit_norms: true
        index_options: docs
        type: string
      }
    }
  }
}
from elasticsearch-mapper-attachments.
Hey,
Looking at the HTML source, this line in particular
<meta content="Thu, 16 May 2013 01:000:12 -0440" name="date" />
shows a custom date format, which needs to be configured explicitly, as mentioned in my last post. See http://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html for possible options.
However, I am a bit unsure about that format: the 000 in the minutes position looks wrong. Reloading the page gives me a different date, so there is some caching involved and you should be able to get it working.
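To make the concern about the `000` concrete, here is a small Python sketch that checks the value from the `<meta>` tag against an RFC 2822 style pattern. This is only an approximation: it uses Python's `strptime`, not the Java date parsing that Elasticsearch relies on, and `RFC_2822_LIKE`, `try_parse` and the `well_formed` value are illustrative assumptions.

```python
from datetime import datetime

# RFC 2822 style pattern in Python strptime syntax (an assumption for
# illustration, not the pattern Elasticsearch or Tika uses internally).
RFC_2822_LIKE = "%a, %d %b %Y %H:%M:%S %z"


def try_parse(value):
    """Return a datetime if the value fits the pattern, otherwise None."""
    try:
        return datetime.strptime(value, RFC_2822_LIKE)
    except ValueError:
        return None


page_value = "Thu, 16 May 2013 01:000:12 -0440"   # as served by the page
well_formed = "Thu, 16 May 2013 01:00:12 -0440"   # hypothetical corrected value

print(try_parse(page_value))
print(try_parse(well_formed))
```

The three-digit minutes field cannot match a two-digit minutes pattern, so the value as served is rejected while the hypothetical corrected value parses. A custom format string alone would therefore not be enough for this particular page unless it also accounts for the extra digit.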
from elasticsearch-mapper-attachments.
Hello Alexander,
First, thanks for your answer, and sorry for the delay since my last message.
Unfortunately, your answer doesn't address our problem. Let me explain exactly what it is:
We are building a bookmarking tool. When a user bookmarks a URL, we index the full page. This means we don't know which date formats will be used in the page content.
So my question is: how can we get rid of this error without doing anything specific about date formats?
Thanks
from elasticsearch-mapper-attachments.
Yes, is there a simple way to ignore or override the date if one doesn't care about the formats? I mean, can't we just store it as a string?
I tried the following mapping:
{
  "files-type": {
    "properties": {
      "content": {
        "type": "attachment",
        "fields": {
          "content": {
            "store": "yes",
            "include_in_all": true,
            "term_vector": "with_positions_offsets"
          },
          "date" : { "type": "string" }
        }
      }
    }
  }
}
But the attachment type seems to override the type I give explicitly and changes it back to:
{
  "properties": {
    "content": {
      "fields": {
        "author": {
          "type": "string"
        },
        "content": {
          "include_in_all": true,
          "store": "yes",
          "term_vector": "with_positions_offsets",
          "type": "string"
        },
        "content_type": {
          "type": "string"
        },
        "date": {
          "format": "dateOptionalTime",
          "type": "date"
        },
        "keywords": {
          "type": "string"
        },
        "name": {
          "type": "string"
        },
        "title": {
          "type": "string"
        }
      },
      "path": "full",
      "type": "attachment"
    }
  }
}
from elasticsearch-mapper-attachments.
Heya,
Jumping into this thread. In the upcoming 1.9.0 version, the mapper attachment plugin will ignore metadata fields in case of error, unless you explicitly ask it to fail. See #38.
About the mapping, I will look at it. I just fixed something similar in #37 about using multifield.
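The "ignore metadata fields in case of error" behavior described above can be pictured with the Python sketch below. It is a conceptual illustration only, not the plugin's implementation: `extract_metadata`, `FIELD_PARSERS` and the `ignore_errors` parameter are hypothetical names, and the ISO-only date parser stands in for whatever parsing the real mapping performs.

```python
from datetime import datetime


def parse_iso_date(value):
    """Parse an ISO 8601 date or date-time; raises ValueError otherwise."""
    return datetime.fromisoformat(value)


# Hypothetical per-field parsers; fields without an entry are kept as strings.
FIELD_PARSERS = {"date": parse_iso_date}


def extract_metadata(raw, ignore_errors=True):
    """Parse metadata fields, dropping any field that fails to parse.

    With ignore_errors=False the first failure is re-raised instead,
    which rejects the whole document.
    """
    parsed = {}
    for name, value in raw.items():
        parser = FIELD_PARSERS.get(name, str)
        try:
            parsed[name] = parser(value)
        except ValueError:
            if not ignore_errors:
                raise
            # Lenient mode: skip the bad field, keep the rest of the document.
    return parsed


raw = {"title": "Example page", "date": "Tue, 14 May 2013 08:000:11 -0440"}
print(extract_metadata(raw))
```

In lenient mode the document is still indexed with its remaining fields, which fits the bookmarking use case raised earlier in the thread, where the date formats of arbitrary web pages cannot be known in advance.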
from elasticsearch-mapper-attachments.
Has anyone tested mapper 1.9? Closing this issue, but feel free to reopen it if the error still occurs.
from elasticsearch-mapper-attachments.
Same question.
My mapping is:
"starttime": {
  "type": "date",
  "format":"yyyy/MM"
}
and my data is:
"starttime":"2015/01"
and the exception is:
MapperParsingException[failed to parse date field [1997/09], tried both date format[dateOptionalTime], and timestamp number with locale []]; nested: IllegalArgumentException[Invalid format: "1997/09" is malformed at "/09"];
What should I do?
from elasticsearch-mapper-attachments.
Sorry, but how is this related to the mapper attachment plugin? I mean, starttime is not generated by the mapper plugin, right?
That said, I'm pretty sure your mapping has not been applied, as the date parser is still using the default format.
I'd open a thread on the mailing list and provide a full script that shows exactly what you are doing, so we can help you there.
If you think it's definitely related to the mapper plugin, you can open a new issue and provide the same details I just mentioned.
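The reasoning that the mapping was probably not applied can be illustrated with a short Python sketch. The exception names `dateOptionalTime`, the default format, rather than the `yyyy/MM` format from the mapping. The patterns below are Python `strptime` stand-ins chosen for illustration (`%Y/%m` for `yyyy/MM`, an ISO-like date for the default), and `matches` is a hypothetical helper.

```python
from datetime import datetime


def matches(value, fmt):
    """Return True if the value parses with the given strptime pattern."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


value = "1997/09"  # the value quoted in the exception

# Stand-in for the custom yyyy/MM format from the mapping.
print(matches(value, "%Y/%m"))

# Stand-in for an ISO-like default format, which rejects the slash.
print(matches(value, "%Y-%m-%d"))
```

If the custom format were in effect, a value like `1997/09` would be accepted, so an error that mentions only the default format suggests the field was created with the default date mapping before the custom mapping was put in place, or that the custom mapping never reached the index.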
from elasticsearch-mapper-attachments.