wmorgan / heliotrope
A personal, threaded, search-centric email server.
Having such a predicate would make it easier to search and label messages from lists.
A bonus would be word-based matching. For example, results for list:sup-devel.rubyforge.org would also be displayed by list:sup and list:''.
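A minimal sketch of the word-based matching idea, assuming the list identifier comes from a List-Id style header. The helper names list_id_tokens and list_matches? are hypothetical and not part of heliotrope:

```ruby
# Hypothetical helpers, not heliotrope API.

# Extract the identifier inside <...> if present, lowercased.
def list_id_value(list_id)
  (list_id[/<([^>]+)>/, 1] || list_id).downcase
end

# Split the identifier into word tokens on any non-alphanumeric character.
def list_id_tokens(list_id)
  list_id_value(list_id).split(/[^a-z0-9]+/).reject(&:empty?)
end

# A term matches if it equals the full identifier or any single token.
def list_matches?(list_id, term)
  t = term.downcase
  list_id_value(list_id) == t || list_id_tokens(list_id).include?(t)
end
```

With this, both list:sup and list:sup-devel.rubyforge.org would match a message whose list identifier is sup-devel.rubyforge.org.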
GMail labels are UTF-7 encoded, so they are unreadable when they are in languages other than English. Before storing the labels in the heliotrope index, they should be converted to UTF-8, which would make them both readable and searchable. The Ruby Net::IMAP.decode_utf7 method can be used to accomplish this.
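A short sketch of the conversion, assuming the label arrives as the raw mailbox name returned by the IMAP server. The wrapper name decode_label is hypothetical; Net::IMAP.decode_utf7 is the standard library call mentioned above:

```ruby
require "net/imap"

# Convert an IMAP label from modified UTF-7 to a UTF-8 string.
# Plain ASCII labels pass through unchanged.
def decode_label(raw)
  Net::IMAP.decode_utf7(raw)
end
```

For instance, decode_label("Entw&APw-rfe") should give the German label "Entwürfe".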
I would like to label/unlabel a whole set of threads at once by specifying a query and a set of signed (+/-) labels.
Currently the loop is done on the client, which is really inefficient. Moreover, with turnsole it's not really possible to select all the threads of a search result.
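One way the signed-label part of such a request could be parsed, as a rough sketch. Both function names are hypothetical and the "+label -label" syntax is an assumption based on the description above:

```ruby
require "set"

# Hypothetical helper: turn "+todo -inbox" into [labels_to_add, labels_to_remove].
def parse_label_ops(spec)
  add, remove = Set.new, Set.new
  spec.split.each do |token|
    sign, label = token[0], token[1..-1]
    unless %w[+ -].include?(sign) && !label.empty?
      raise ArgumentError, "label needs a +/- sign: #{token}"
    end
    (sign == "+" ? add : remove) << label
  end
  [add, remove]
end

# Apply the operations to one thread's current label set.
def apply_label_ops(labels, add, remove)
  (labels | add) - remove
end
```

A server-side endpoint could then run the query once and call apply_label_ops for every matching thread, instead of the client looping.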
If a message has a date with month first and day second, then if the day is greater than 12, heliotrope crashes. The message in question was generated by staples.co.uk after I ordered some goods from their website. So sup should not barf on it, but forging a date header would be fine by me. Though "now" might be better than date zero, which I think is what sup normally does otherwise ...
A possible patch:
lib/heliotrope/maildir-walker.rb | 14 +++++++++++++-
1 files changed, 13 insertions(+), 1 deletions(-)
diff --git a/lib/heliotrope/maildir-walker.rb b/lib/heliotrope/maildir-walker.rb
index db9b9d4..47cb08b 100644
--- a/lib/heliotrope/maildir-walker.rb
+++ b/lib/heliotrope/maildir-walker.rb
@@ -43,7 +43,19 @@ private
while(l = f.gets)
if l =~ /^Date:\s+(.+\S)\s*$/
date = $1
- pdate = Time.parse($1)
+ begin
+ pdate = Time.parse($1)
+ rescue ArgumentError
+ # flip the day and month around and try again
+ if date =~ %r|(\d\d?)([-./])(\d\d?)([-./])(\d{2,4})|
+ date = $3 + $2 + $1 + $4 + $5
+ pdate = Time.parse(date)
+ else
+ puts "Error while parsing time in file #{fn}"
+ puts "Matched date text was #{date}"
+ pdate = Time.at 0
+ end
+ end
return pdate
end
end
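The same idea as a standalone sketch, with two differences from the patch that are my own assumptions: the second parse attempt is also guarded, and the fallback time is a parameter so the caller can choose between "now" and date zero. The name parse_date_header is hypothetical:

```ruby
require "time"

# Parse a Date: header value. If Time.parse rejects it, swap the first two
# numeric fields (day and month) and retry; if that fails too, return `fallback`.
def parse_date_header(text, fallback = Time.at(0))
  Time.parse(text)
rescue ArgumentError
  swapped = text.sub(%r{(\d\d?)([-./])(\d\d?)([-./])(\d{2,4})}) do
    "#{$3}#{$2}#{$1}#{$4}#{$5}"
  end
  begin
    # If nothing was swapped there is no point in parsing the same text again.
    swapped == text ? fallback : Time.parse(swapped)
  rescue ArgumentError
    fallback
  end
end
```

Passing Time.now as the second argument would give the "now" behaviour suggested above.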
I got about half way through importing my ~25000 messages from gmail and then get:
/home/mish/dev/sup/heliotrope/lib/heliotrope/index.rb:484:in `build_thread_structure_from': undefined method `-@' for nil:NilClass (NoMethodError)
from /home/mish/dev/sup/heliotrope/lib/heliotrope/index.rb:470:in `block in build_thread_structure_from'
from /usr/lib/ruby/1.9.1/set.rb:221:in `block in each'
from /usr/lib/ruby/1.9.1/set.rb:221:in `each_key'
from /usr/lib/ruby/1.9.1/set.rb:221:in `each'
from /home/mish/dev/sup/heliotrope/lib/heliotrope/index.rb:470:in `map'
from /home/mish/dev/sup/heliotrope/lib/heliotrope/index.rb:470:in `build_thread_structure_from'
from /home/mish/dev/sup/heliotrope/lib/heliotrope/index.rb:427:in `thread_message!'
from /home/mish/dev/sup/heliotrope/lib/heliotrope/index.rb:89:in `add_message'
from bin/heliotrope-add:147:in `<main>'
Could there be an option added to spit out the message to a file? Or the first message read in after restart. Otherwise it is a bit hard to work out which message it is ...
I just had another play with my test heliotrope setup following a git pull:
$ ruby1.9.1 -I lib bin/heliotrope-server -d ~/.heliotrope/ -H 1.2.3.4
Version mismatch error: index is version nil but I am expecting "0.1". Try running bin/heliotrope-upgrade-index.
$ ruby1.9.1 -I lib bin/heliotrope-upgrade-index -d ~/.heliotrope
"upgrade__to_0_1"
Trying to upgrade from nil to "0.1".
Sorry! To upgrade to index version 0.1, you must reindex everything.
$ ruby1.9.1 -I lib bin/heliotrope-reindex -d ~/.heliotrope
Reindexing...
bin/heliotrope-reindex:128:in `add_entry': entry must be a Whistlepig::Entry object (TypeError)
from bin/heliotrope-reindex:128:in `block in <main>'
from bin/heliotrope-reindex:63:in `each_message_regular'
from bin/heliotrope-reindex:48:in `each_message'
from bin/heliotrope-reindex:126:in `<main>'
Am I caught in a catch-22? Should I check out an earlier version of heliotrope and do the reindexing at that point? Or is upgrading the index not really supported at this point?
It seems uid_fetch will return nil in some cases, leading to errors of the following form:
/tmp/heliotrope/lib/heliotrope/imap-dumper.rb:191:in `next_message': undefined method `size' for nil:NilClass (NoMethodError)
from bin/heliotrope-add:127:in `<main>'
I changed
imapdata = begin
@imap.uid_fetch query, imap_query_columns
to
imapdata = begin
(@imap.uid_fetch query, imap_query_columns) or []
But I don't know if this is the right fix to apply (I did not read the documentation of net/imap yet).
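A slightly more verbose variant of that workaround, as a sketch only. It treats nil as "no messages" and logs a warning, since another report in this list suggests a nil response can be a sign of rate limiting. The function name fetch_messages and the log parameter are hypothetical; imap is assumed to respond to uid_fetch the way Net::IMAP does:

```ruby
# Treat a nil result from uid_fetch as an empty batch and say so.
def fetch_messages(imap, query, columns, log = $stderr)
  data = imap.uid_fetch(query, columns)
  if data.nil?
    log.puts "; warning: uid_fetch returned nil for #{query.inspect}"
    return []
  end
  data
end
```

Callers can then keep using size and each on the result without a nil check.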
...both in web-interface and turnsole.
The search term is present in utf-8 and windows-1251 encodings in different mails.
My system encoding is utf-8.
Ruby 1.9
Please inform if this is not an issue with ruby 1.8 or any other environment
My mail archive contained a Spam message with a pathological ill-formed Message-ID line, like this one:
Message-ID: <somethinglikeanid
heliotrope-add crashes when trying to grok this mail:
# ruby -Ilib ./bin/heliotrope-add -d test0 -m /tmp/bad.mbox
; loading mail...
end offset is 262
/home/gregor/GIT/heliotrope/lib/heliotrope/message.rb:155:in `digest': can't convert nil into String (TypeError)
from /home/gregor/GIT/heliotrope/lib/heliotrope/message.rb:155:in `hexdigest'
from /home/gregor/GIT/heliotrope/lib/heliotrope/message.rb:155:in `munge_msgid'
from /home/gregor/GIT/heliotrope/lib/heliotrope/message.rb:20:in `parse!'
from ./bin/heliotrope-add:138:in `<main>'
find_msgids returns nil, which is indeed a problem for Digest::MD5.hexdigest ;-)
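A defensive sketch of the digest step, assuming the goal is simply to never hand nil to Digest::MD5.hexdigest. The function safe_msgid_digest is a hypothetical stand-in and does not reproduce heliotrope's real munge_msgid or find_msgids logic:

```ruby
require "digest/md5"

# Extract a well-formed <...> id if there is one; otherwise fall back to the
# raw header text, so a broken header like "<somethinglikeanid" still hashes.
def safe_msgid_digest(header_value)
  raw = header_value.to_s.strip
  msgid = raw[/<([^<>\s]+)>/, 1] || raw
  return nil if msgid.empty? # no id at all: let the caller decide what to do
  Digest::MD5.hexdigest(msgid)
end
```

Whether a malformed id should be hashed as-is or the message skipped as bad is a design choice for the maintainer.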
Including default tokenization rules, case-folding, field, labels, etc.
For me, all labels seem to be set correctly after the import - except not a single thread has the "sent" label.
It's not super important - but useful from time to time...
IMAP username: lusername
IMAP password:
/Projects/heliotrope/lib/heliotrope/imap-dumper.rb:117:in `block in initialize': need ssl (ArgumentError)
from /Projects/heliotrope/lib/heliotrope/imap-dumper.rb:115:in `each'
from /Projects/heliotrope/lib/heliotrope/imap-dumper.rb:115:in `initialize'
from bin/heliotrope-add:94:in `new'
from bin/heliotrope-add:94:in `<main>'
I get an error for self-signed SSL when I connect using SSL and couldn't figure out how to tell heliotrope to trust my untrustworthy self-signed certificate.
Upon attempting to use non-SSL'ed IMAP (which I have enabled server-side, using dovecot and enabling plain-text authentication) I get the above error.
I'm not sure that I'm not doing something dumb.
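For reference, Ruby's Net::IMAP.new accepts an :ssl option that may be true, false, or a hash of OpenSSL context parameters. The sketch below only builds that option; the helper name imap_ssl_option, the host name, and the certificate path are all hypothetical, and heliotrope itself may not expose any of this:

```ruby
require "openssl"

# Build a value for the :ssl option of Net::IMAP.new.
def imap_ssl_option(use_ssl, ca_file = nil, insecure = false)
  return false unless use_ssl
  # Skips certificate verification entirely; only sensible for testing.
  return { verify_mode: OpenSSL::SSL::VERIFY_NONE } if insecure
  # Trust the given certificate or CA bundle (e.g. a self-signed server cert).
  return { ca_file: ca_file } if ca_file
  true # use the system's default trust store
end

# Usage (not run here, needs a reachable server):
#   require "net/imap"
#   imap = Net::IMAP.new("mail.example.org", port: 993,
#                        ssl: imap_ssl_option(true, "/path/to/server-cert.pem"))
```

Pointing ca_file at the server's own certificate keeps verification on, which is generally preferable to VERIFY_NONE.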
Hello again
Just been re-importing my gmail and hit the limits again with errors thrown - though this time from well inside the imap libraries. But the errors could still be caught and nice error messages given to users - definitely in the second case below. Might want to retry in the first case. Might have a go at this myself sometime.
First while indexing I got
/usr/lib/ruby/1.9.1/openssl/buffering.rb:235:in `syswrite': closed stream (IOError)
from /usr/lib/ruby/1.9.1/openssl/buffering.rb:235:in `do_write'
from /usr/lib/ruby/1.9.1/openssl/buffering.rb:318:in `print'
from /usr/lib/ruby/1.9.1/net/imap.rb:1168:in `put_string'
from /usr/lib/ruby/1.9.1/net/imap.rb:1140:in `block in send_command'
from /usr/lib/ruby/1.9.1/monitor.rb:201:in `mon_synchronize'
from /usr/lib/ruby/1.9.1/net/imap.rb:1135:in `send_command'
from /usr/lib/ruby/1.9.1/net/imap.rb:667:in `close'
from /home/hamish/dev/heliotrope/lib/heliotrope/imap-dumper.rb:246:in `finish!'
from /home/hamish/dev/heliotrope/bin/heliotrope-import:178:in `ensure in <main>'
from /home/hamish/dev/heliotrope/bin/heliotrope-import:178:in `<main>'
Then when trying to restart I got
/usr/lib/ruby/1.9.1/net/imap.rb:1099:in `get_tagged_response': Account exceeded command or bandwidth limits. (Failure) (Net::IMAP::NoResponseError)
from /usr/lib/ruby/1.9.1/net/imap.rb:1153:in `block in send_command'
from /usr/lib/ruby/1.9.1/monitor.rb:201:in `mon_synchronize'
from /usr/lib/ruby/1.9.1/net/imap.rb:1135:in `send_command'
from /usr/lib/ruby/1.9.1/net/imap.rb:419:in `login'
from /home/hamish/dev/heliotrope/lib/heliotrope/imap-dumper.rb:148:in `load!'
from /home/hamish/dev/heliotrope/bin/heliotrope-import:127:in `<main>'
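A generic retry-with-backoff wrapper is one way to handle the first case and to give a readable message in the second. This is a sketch under my own assumptions: the name with_retries, the default attempt count, and the delays are hypothetical, and the sleeper argument exists only so the helper can be exercised without actually waiting:

```ruby
# Retry the block with exponential backoff when it raises one of `errors`.
# After `attempts` failures the last error is re-raised to the caller.
def with_retries(errors, attempts = 5, base_delay = 1.0, sleeper = Kernel.method(:sleep))
  tries = 0
  begin
    yield
  rescue *errors => e
    tries += 1
    raise if tries >= attempts
    delay = base_delay * (2**(tries - 1)) # 1x, 2x, 4x, ... the base delay
    warn "; #{e.class}: #{e.message}; retrying in #{delay}s"
    sleeper.call(delay)
    retry
  end
end
```

It could wrap a call such as with_retries([Net::IMAP::NoResponseError, IOError]) { ... }. Since a wrong password also surfaces as Net::IMAP::NoResponseError on login, the attempt limit matters.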
Just tried to import my gmail mailstore, and got this error:
; scanned 561, indexed 561, skipped 0 bad and 0 seen messages in 44.1s = 12.7 m/s
; requesting messages 779..798 from imap server
; got 20 messages
* wrote broken message to bad-message.txt
/Users/asf/Hacks/heliotrope/lib/heliotrope/meta-index.rb:447:in `block in add_labels_to_labellist!': -:< is an invalid label (Heliotrope::MetaIndex::InvalidLabelError)
from /Users/asf/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/set.rb:222:in `block in each'
from /Users/asf/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/set.rb:222:in `each_key'
from /Users/asf/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/set.rb:222:in `each'
from /Users/asf/Hacks/heliotrope/lib/heliotrope/meta-index.rb:447:in `add_labels_to_labellist!'
from /Users/asf/Hacks/heliotrope/lib/heliotrope/meta-index.rb:147:in `add_message'
from bin/heliotrope-import:133:in `block in <main>'
from /Users/asf/Hacks/heliotrope/lib/heliotrope/message-adder.rb:57:in `block in each_message'
from /Users/asf/Hacks/heliotrope/lib/heliotrope/imap-dumper.rb:149:in `each_message'
from /Users/asf/Hacks/heliotrope/lib/heliotrope/message-adder.rb:42:in `each_message'
from bin/heliotrope-import:109:in `<main>'
I'm kiiind of attached to this label, any way to support it? Thanks!
If I try running heliotrope-add while heliotrope-server is running I get an error:
IMAP password (displayed!): /var/lib/gems/1.9.1/gems/leveldb-ruby-0.7/lib/leveldb.rb:11:in `make': IO error: lock /home/hamish/.heliotrope/store/LOCK: Resource temporarily unavailable (LevelDB::Error)
from /var/lib/gems/1.9.1/gems/leveldb-ruby-0.7/lib/leveldb.rb:11:in `new'
from /home/hamish/dev/heliotrope/lib/heliotrope/index.rb:45:in `initialize'
from bin/heliotrope-add:113:in `new'
from bin/heliotrope-add:113:in `<main>'
Is this expected to work? If not how do I add new messages to the message store while heliotrope-server is running?
apparently gmail rate limits. we may want to print a message when we get nil responses from imap too, since this seems to be a sign we've hit the rate limit.
I have an email from "Olivier", but I can’t find it using \Olivier. I need to use \olivier (lower-case o). This seems logical, given that the terms are downcased when indexed:
entry.add_string "from", indexable_text_for(message.from).downcase
(lib/heliotrope/meta-index.rb:617)
I’m not sure whether this bug is in heliotrope or in whistlepig itself.
Hi,
Just stumbled on heliotrope, love the idea (had similar thoughts recently) and thought I'd give it a go. Discovered that one email in particular from my local Maildir caused html2text to hang during import. A timeout on the html2text run would probably be useful, so that the email is skipped as bad when the limit is exceeded.
If that makes sense?
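A sketch of what such a timeout could look like, assuming the converter is run as an external command that reads stdin and writes stdout. How heliotrope actually invokes html2text is not shown here, and run_with_timeout is a hypothetical name:

```ruby
require "open3"
require "timeout"

# Run `cmd` (an array, e.g. ["html2text"]) with `input` on stdin.
# Returns the command's stdout, or nil if it did not finish within `seconds`.
def run_with_timeout(cmd, input, seconds)
  Open3.popen2(*cmd) do |stdin, stdout, wait_thr|
    begin
      Timeout.timeout(seconds) do
        stdin.write(input)
        stdin.close
        stdout.read
      end
    rescue Timeout::Error
      # Kill the hung child so that popen2 can reap it when the block exits.
      Process.kill("KILL", wait_thr.pid) rescue nil
      nil
    end
  end
end
```

A nil result would then be the signal to write the message to bad-message.txt and move on.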
heliotrope-add chokes on invalid UTF-8. The following patch stops the error so the import will continue, but I don't know if it's the right thing to do.
diff --git a/lib/heliotrope/maildir-walker.rb b/lib/heliotrope/maildir-walker.rb
index 319b8cd..b03674d 100644
--- a/lib/heliotrope/maildir-walker.rb
+++ b/lib/heliotrope/maildir-walker.rb
@@ -47,11 +47,22 @@ private
def get_date_in_file fn
File.open(fn) do |f|
while(l = f.gets)
+ error_count = 0
+ begin
if l =~ /^Date:\s+(.+\S)\s*$/
date = $1
pdate = Time.parse($1)
return pdate
end
+ rescue => e
+ unless error_count > 1
+ l.encode!('utf-8', 'utf-8', :invalid => :replace)
+ error_count += 1
+ retry
+ else
+ puts "; cannot fix: #{e}: #{l}"
+ end
+ end
end
end
puts "; warning: no date in #{fn}"
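On newer Rubies (2.1 and later, so newer than the 1.9 used in these reports) String#scrub offers a more direct way to neutralise invalid bytes before matching. A minimal sketch, with clean_line as a hypothetical helper name:

```ruby
# Return a copy of `line` tagged as UTF-8 with every invalid byte sequence
# replaced, so that regexp matching on it cannot raise an encoding error.
def clean_line(line, replacement = "?")
  line.dup.force_encoding(Encoding::UTF_8).scrub(replacement)
end
```

The Date: regexp could then be applied to clean_line(l) instead of l, without the retry loop.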
From: William Morgan [email protected]
Newsgroups: gmane.mail.sup.devel
Subject: [Heliotrope/Turnsole] How to use IMAP?
Date: Mon, 09 Jan 2012 16:40:56 -0800
Reply-To: Sup developer discussion [email protected]
Excerpts from Michael Stapelberg's message of 2012-01-08 14:48:18 -0800:
What is the correct way to do this?
I need to write some code to do this. For IMAP and GMail, heliotrope-add will
keep a pointer to the last message imported, by default. For mbox there is a
trick you can use. But there's nothing for maildir right now. Please file
an issue so that I don't forget this!
Sometimes these just hang forever.
Navigating to a message in the web interface the message is shown correctly but the From: and To: fields do not show the corresponding email addresses. Instead they show the ruby objects like shown below:
From: #Heliotrope::Person:0x000000030426f0
To: #Heliotrope::Person:0x00000003032278
Maybe a call to to_s would fix this.
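To illustrate why to_s would help: Ruby falls back to the default #<Class:0x...> representation whenever an object without its own to_s is interpolated into a string. The class below is a hypothetical stand-in; the real Heliotrope::Person may have different fields:

```ruby
# Hypothetical stand-in for Heliotrope::Person.
class Person
  attr_reader :name, :email

  def initialize(name, email)
    @name, @email = name, email
  end

  # Used by string interpolation and by template output.
  def to_s
    name.nil? || name.empty? ? email : "#{name} <#{email}>"
  end
end
```

With this in place, "From: #{person}" renders the address rather than the object id.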
IMAP password (displayed!): ; loading mail...
; connecting to imap.gmail.com:993 (ssl: true)...
; login as [email protected] ...
; found 43628 new messages...
; found 43628 messages to import
; requesting messages 167251..167300 from imap server
; got 23 messages
decode_mime_parts': undefined method
multipart?' for nil:NilClass (NoMethodError)decode_mime_parts' from ./lib/heliotrope/message.rb:178:in
decode_mime_parts'map' from ./lib/heliotrope/message.rb:177:in
decode_mime_parts'mime_parts' from ./lib/heliotrope/message.rb:133:in
has_attachment?'
Patched it for now with https://gist.github.com/1074432 (let me know if you want a copy of the bad-message.txt).
Perhaps I just couldn't follow your thinking, but I found the last part of the README a bit confusing, where it talks about heliotrope-add: "To add messages from existing GMail, IMAP, mbox or maildir sources..."
It mentions using heliotrope-add with --state-file to add messages at a regular interval, but then the while loop uses heliotrope-import and no --state-file.
I noticed that heliotrope-add bangs out on message [email protected] from the debian-vote list: http://lists.debian.org/debian-vote/2008/03/msg00130.html
# ruby1.9.1 -Ilib ./bin/heliotrope-add -m debian-vote-2008 -d test
/home/test/GIT/heliotrope/lib/heliotrope/index.rb:482:in `block in index!':
undefined method `indexable_text' for nil:NilClass (NoMethodError)
from /home/test/GIT/heliotrope/lib/heliotrope/index.rb:482:in `map'
from /home/test/GIT/heliotrope/lib/heliotrope/index.rb:482:in `index!'
from /home/test/GIT/heliotrope/lib/heliotrope/index.rb:80:in `add_message'
from ./bin/heliotrope-add:113:in `<main>'
As you can see even in the HTML representation of the message linked above:
To: , [email protected]
This code in line 482 of lib/heliotrope/index.rb will fail if any recipient is empty:
message.recipients.map { |x| x.indexable_text }.join(" ").downcase
I'm lacking the Ruby skills to make heliotrope cope with such pathological messages. In Python, I would fix it like this:
[x.indexable_text for x in message.recipients if x]
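The Ruby counterpart of that list comprehension is Array#compact, which drops nil entries before mapping. A sketch, where the function name indexable_recipients and the Recipient struct are hypothetical stand-ins for heliotrope's own objects:

```ruby
# Skip nil recipients before building the indexable string.
def indexable_recipients(recipients)
  recipients.compact.map { |x| x.indexable_text }.join(" ").downcase
end

# Stand-in for whatever object heliotrope uses for a recipient.
Recipient = Struct.new(:indexable_text)
recipients = [nil, Recipient.new("Alice"), Recipient.new("Bob")]
puts indexable_recipients(recipients)
```

This only guards against nil entries; a recipient object that exists but has no text would need its own check.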
Just tried importing my old mbox with this command:
ruby -Ilib bin/heliotrope-add -m ~/tmp/mailbox -d data
and got this crash at some point:
* wrote broken message to bad-message.txt
end offset is 29492938
/home/ash/src/heliotrope/lib/heliotrope/decoder.rb:50:in `force_encoding': can't convert nil into String (TypeError)
from /home/ash/src/heliotrope/lib/heliotrope/decoder.rb:50:in `transcode'
from /home/ash/src/heliotrope/lib/heliotrope/message.rb:265:in `mime_content_for'
from /home/ash/src/heliotrope/lib/heliotrope/message.rb:187:in `decode_mime_parts'
from /home/ash/src/heliotrope/lib/heliotrope/message.rb:181:in `block in decode_mime_parts'
from /home/ash/src/heliotrope/lib/heliotrope/message.rb:181:in `map'
from /home/ash/src/heliotrope/lib/heliotrope/message.rb:181:in `decode_mime_parts'
from /home/ash/src/heliotrope/lib/heliotrope/message.rb:147:in `mime_parts'
from /home/ash/src/heliotrope/lib/heliotrope/message.rb:133:in `has_attachment?'
from /home/ash/src/heliotrope/lib/heliotrope/meta-index.rb:72:in `add_message'
from bin/heliotrope-add:149:in `<main>'
Ruby version:
$ type ruby
ruby is hashed (/home/ash/.rvm/rubies/ruby-1.9.2-p290/bin/ruby)
$ ruby -v
ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-linux]
Do you have any plans to support syncing with GMail/IMAP?