heliotrope's People

Contributors

heylu avatar hyperbolist avatar jboyens avatar np avatar rburchell avatar stapelberg avatar twilliam avatar wmorgan avatar

heliotrope's Issues

Adding a list:foo predicate to search queries

Such a predicate would make it easier to search for and label messages from mailing lists.
A bonus would be word-based matching. For example, results from list:sup-devel.rubyforge.org would also be displayed by list:sup and list:''.

Gmail dumper does not support non-English labels

GMail labels are UTF-7 encoded, so they are unreadable when they are in languages other than English. Before the labels are stored in the heliotrope index they should be converted to UTF-8, which would make them both readable and searchable. The Ruby Net::IMAP.decode_utf7 method can be used to accomplish this.
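
A minimal sketch of the conversion suggested above, using Net::IMAP.decode_utf7 from the standard net/imap library. The helper name decode_label is hypothetical and not part of heliotrope; where exactly the dumper would call it is an assumption.

```ruby
require "net/imap"

# Convert an IMAP modified-UTF-7 label name into a UTF-8 string.
# (decode_label is a hypothetical helper name, used for illustration only.)
def decode_label(raw)
  Net::IMAP.decode_utf7(raw)
end

# Round trip: encode a Cyrillic label the way an IMAP server would send it,
# then decode it back before indexing.
wire_name = Net::IMAP.encode_utf7("Работа")
puts wire_name
puts decode_label(wire_name)

# Plain ASCII labels pass through unchanged.
puts decode_label("Work")
```

Decoding once, at the point where labels enter the index, would keep the stored labels searchable with ordinary UTF-8 queries.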

Batch labelling based on search result

I would like to label/unlabel a whole set of threads at once by specifying a query and a set of signed (+/-) labels.
Currently the loop is done on the client, which is really inefficient. Moreover, with turnsole it's not really possible to select all the threads of a search result.

Heliotrope crashes if the date is badly formatted

If a message has a date with month first and day second, then if the day is greater than 12, heliotrope crashes. The message in question was generated by staples.co.uk after I ordered some goods from their website. So sup should not barf on it, but forging a date header would be fine by me. Though "now" might be better than date zero, which I think is what sup normally does otherwise ...

A possible patch:

lib/heliotrope/maildir-walker.rb |   14 +++++++++++++-
 1 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/lib/heliotrope/maildir-walker.rb b/lib/heliotrope/maildir-walker.rb
index db9b9d4..47cb08b 100644
--- a/lib/heliotrope/maildir-walker.rb
+++ b/lib/heliotrope/maildir-walker.rb
@@ -43,7 +43,19 @@ private
      while(l = f.gets)
        if l =~ /^Date:\s+(.+\S)\s*$/
          date = $1
-          pdate = Time.parse($1)
+          begin
+            pdate = Time.parse($1)
+          rescue ArgumentError
+            # flip the day and month around and try again
+            if date =~ %r|(\d\d?)([-./])(\d\d?)([-./])(\d{2,4})|
+              date = $3 + $2 + $1 + $4 + $5
+              pdate = Time.parse(date)
+            else
+              puts "Error while parsing time in file #{fn}"
+              puts "Matched date text was #{date}"
+              pdate = Time.at 0
+            end
+          end
          return pdate
        end
      end
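
The same idea as the patch, as a standalone sketch that also guards the second parse attempt (in the patch, the retry with the flipped date can still raise). The method name parse_header_date is hypothetical, and falling back to the epoch rather than "now" is only one of the two options discussed above.

```ruby
require "time"

# Parse the value of a Date: header. If Time.parse rejects it, retry with
# the first two numeric fields swapped; if that also fails, fall back to
# the epoch. (parse_header_date is a hypothetical name, not heliotrope API.)
def parse_header_date(text)
  Time.parse(text)
rescue ArgumentError
  if text =~ %r{(\d\d?)([-./])(\d\d?)([-./])(\d{2,4})}
    swapped = "#{$3}#{$2}#{$1}#{$4}#{$5}"
    begin
      Time.parse(swapped)
    rescue ArgumentError
      Time.at(0)
    end
  else
    Time.at(0)
  end
end

puts parse_header_date("12/25/2011")
puts parse_header_date("???")
```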

Uncaught exception while importing

I got about half way through importing my ~25000 messages from gmail and then get:

/home/mish/dev/sup/heliotrope/lib/heliotrope/index.rb:484:in `build_thread_structure_from': undefined method `-@' for nil:NilClass (NoMethodError)
    from /home/mish/dev/sup/heliotrope/lib/heliotrope/index.rb:470:in `block in build_thread_structure_from'
    from /usr/lib/ruby/1.9.1/set.rb:221:in `block in each'
    from /usr/lib/ruby/1.9.1/set.rb:221:in `each_key'
    from /usr/lib/ruby/1.9.1/set.rb:221:in `each'
    from /home/mish/dev/sup/heliotrope/lib/heliotrope/index.rb:470:in `map'
    from /home/mish/dev/sup/heliotrope/lib/heliotrope/index.rb:470:in `build_thread_structure_from'
    from /home/mish/dev/sup/heliotrope/lib/heliotrope/index.rb:427:in `thread_message!'
    from /home/mish/dev/sup/heliotrope/lib/heliotrope/index.rb:89:in `add_message'
    from bin/heliotrope-add:147:in `<main>'

Could an option be added to write the offending message to a file, or to report the first message read in after a restart? Otherwise it is a bit hard to work out which message it is ...

Failed to upgrade index

I just had another play with my test heliotrope setup following a git pull:

$ ruby1.9.1 -I lib bin/heliotrope-server  -d ~/.heliotrope/ -H 1.2.3.4
Version mismatch error: index is version nil but I am expecting "0.1".
Try running bin/heliotrope-upgrade-index.

$ ruby1.9.1 -I lib bin/heliotrope-upgrade-index -d ~/.heliotrope
"upgrade__to_0_1"
Trying to upgrade from nil to "0.1".
Sorry! To upgrade to index version 0.1, you must reindex everything.

$ ruby1.9.1 -I lib bin/heliotrope-reindex -d ~/.heliotrope
Reindexing...
bin/heliotrope-reindex:128:in `add_entry': entry must be a Whistlepig::Entry object (TypeError)
        from bin/heliotrope-reindex:128:in `block in <main>'
        from bin/heliotrope-reindex:63:in `each_message_regular'
        from bin/heliotrope-reindex:48:in `each_message'
        from bin/heliotrope-reindex:126:in `<main>'

Am I caught in a catch-22? Should I check out an earlier version of heliotrope and do the reindexing at that point? Or is upgrading the index not really supported yet?

Check return value of @imap.uid_fetch

It seems uid_fetch will return nil in some cases leading to errors of the following form:

/tmp/heliotrope/lib/heliotrope/imap-dumper.rb:191:in `next_message': undefined method `size' for nil:NilClass (NoMethodError)
from bin/heliotrope-add:127:in `<main>'

I changed

imapdata = begin
  @imap.uid_fetch query, imap_query_columns

to

imapdata = begin
    (@imap.uid_fetch query, imap_query_columns) or []

But I don't know if this is the right fix to apply (I did not read the documentation of net/imap yet).
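
A self-contained sketch of the guard described above. EmptyImap is a hypothetical stand-in for a connection whose uid_fetch returns nil, and fetch_messages is an illustrative name; whether an empty array is the right value for the rest of imap-dumper.rb is an assumption.

```ruby
# Normalise the result of a fetch that may return nil instead of an array.
# `imap` is any object responding to uid_fetch (Net::IMAP in practice).
def fetch_messages(imap, query, columns)
  imap.uid_fetch(query, columns) || []
end

# Hypothetical stand-in for a server that has nothing to return.
class EmptyImap
  def uid_fetch(_query, _columns)
    nil
  end
end

imapdata = fetch_messages(EmptyImap.new, 1..20, ["UID"])
puts imapdata.size
```

With the guard in place, callers such as next_message can call size on the result without checking for nil first.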

Search for cyrillic terms does not work

...both in web-interface and turnsole.

The search term is present in utf-8 and windows-1251 encodings in different mails.
My system encoding is utf-8.

Ruby 1.9

Please let me know if this is not an issue with Ruby 1.8 or any other environment.

Crash with ill-formed Message-ID lines

My mail archive contained a Spam message with a pathological ill-formed Message-ID line, like this one:

Message-ID: <somethinglikeanid

heliotrope-add crashes when trying to grok this mail:

# ruby -Ilib ./bin/heliotrope-add -d test0 -m /tmp/bad.mbox 
; loading mail...
end offset is 262
/home/gregor/GIT/heliotrope/lib/heliotrope/message.rb:155:in `digest': can't convert nil into String (TypeError)
    from /home/gregor/GIT/heliotrope/lib/heliotrope/message.rb:155:in `hexdigest'
    from /home/gregor/GIT/heliotrope/lib/heliotrope/message.rb:155:in `munge_msgid'
    from /home/gregor/GIT/heliotrope/lib/heliotrope/message.rb:20:in `parse!'
    from ./bin/heliotrope-add:138:in `<main>'

find_msgids returns nil, which is indeed a problem for Digest::MD5.hexdigest ;-)
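
One possible way to cope with such a header, as a sketch only. The names find_msgid and safe_msgid_digest are hypothetical stand-ins, not the actual heliotrope methods, and hashing the raw header text as a fallback is an assumption about what a reasonable substitute id would be.

```ruby
require "digest/md5"

# Extract the first well-formed <...> token from a Message-ID header value,
# or return nil when there is none (for example "<somethinglikeanid").
def find_msgid(header)
  header.to_s[/<([^<>\s]+)>/, 1]
end

# Digest the id; when no well-formed id exists, digest the raw header text
# instead so that Digest::MD5.hexdigest never receives nil.
def safe_msgid_digest(header)
  Digest::MD5.hexdigest(find_msgid(header) || header.to_s)
end

puts safe_msgid_digest("<somethinglikeanid")
puts safe_msgid_digest("<valid-id@example.org>")
```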

imap-dumper.rb needs ssl (can't --dont-use-ssl)

IMAP username: lusername
IMAP password: 
/Projects/heliotrope/lib/heliotrope/imap-dumper.rb:117:in `block in initialize': need ssl (ArgumentError)
        from /Projects/heliotrope/lib/heliotrope/imap-dumper.rb:115:in `each'
        from /Projects/heliotrope/lib/heliotrope/imap-dumper.rb:115:in `initialize'
        from bin/heliotrope-add:94:in `new'
        from bin/heliotrope-add:94:in `<main>'

I get an error for self-signed SSL when I connect using SSL and couldn't figure out how to tell heliotrope to trust my untrustworthy self-signed certificate.

Upon attempting to use non-SSL'ed IMAP (which I have enabled server-side, using dovecot and enabling plain-text authentication) I get the above error.

I'm not sure that I'm not doing something dumb.

New exception on hitting gmail limits

Hello again

Just been re-importing my gmail and hit the limits again with errors thrown - though this time from well inside the imap libraries. But the errors could still be caught and nice error messages given to users - definitely in the second case below. Might want to retry in the first case. Might have a go at this myself sometime.

First while indexing I got

/usr/lib/ruby/1.9.1/openssl/buffering.rb:235:in `syswrite': closed stream (IOError)
        from /usr/lib/ruby/1.9.1/openssl/buffering.rb:235:in `do_write'
        from /usr/lib/ruby/1.9.1/openssl/buffering.rb:318:in `print'
        from /usr/lib/ruby/1.9.1/net/imap.rb:1168:in `put_string'
        from /usr/lib/ruby/1.9.1/net/imap.rb:1140:in `block in send_command'
        from /usr/lib/ruby/1.9.1/monitor.rb:201:in `mon_synchronize'
        from /usr/lib/ruby/1.9.1/net/imap.rb:1135:in `send_command'
        from /usr/lib/ruby/1.9.1/net/imap.rb:667:in `close'
        from /home/hamish/dev/heliotrope/lib/heliotrope/imap-dumper.rb:246:in `finish!'
        from /home/hamish/dev/heliotrope/bin/heliotrope-import:178:in `ensure in <main>'
        from /home/hamish/dev/heliotrope/bin/heliotrope-import:178:in `<main>'

Then when trying to restart I got

/usr/lib/ruby/1.9.1/net/imap.rb:1099:in `get_tagged_response':  Account exceeded command or bandwidth limits. (Failure) (Net::IMAP::NoResponseError)
        from /usr/lib/ruby/1.9.1/net/imap.rb:1153:in `block in send_command'
        from /usr/lib/ruby/1.9.1/monitor.rb:201:in `mon_synchronize'
        from /usr/lib/ruby/1.9.1/net/imap.rb:1135:in `send_command'
        from /usr/lib/ruby/1.9.1/net/imap.rb:419:in `login'
        from /home/hamish/dev/heliotrope/lib/heliotrope/imap-dumper.rb:148:in `load!'
        from /home/hamish/dev/heliotrope/bin/heliotrope-import:127:in `<main>'
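
A rough sketch of the "catch and retry" idea for the first case. The helper with_retries, its parameters and the doubling delay are all assumptions for illustration; in the importer the error list would presumably be something like [IOError, Net::IMAP::NoResponseError], the two classes seen in the traces above.

```ruby
# Run the block, retrying when one of the listed error classes is raised.
# The delay doubles after each failure; the last failure is re-raised.
# (with_retries and its keyword arguments are hypothetical names.)
def with_retries(errors, attempts: 3, base_delay: 1.0, sleeper: ->(s) { sleep(s) })
  tries = 0
  begin
    yield
  rescue *errors
    tries += 1
    raise if tries >= attempts
    sleeper.call(base_delay * (2**(tries - 1)))
    retry
  end
end

# Example: a block that fails twice with IOError and then succeeds.
calls = 0
result = with_retries([IOError], attempts: 5, base_delay: 0.01) do
  calls += 1
  raise IOError, "closed stream" if calls < 3
  :imported
end
puts "#{result} after #{calls} calls"
```

For the second case (login refused because the account is over its limits), retrying immediately is unlikely to help, so rescuing the error once and printing a short explanation is probably the better behaviour.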

Raises for a gmail message labeled with "=(-:<"

Just tried to import my gmail mailstore, and got this error:

; scanned 561, indexed 561, skipped 0 bad and 0 seen messages in 44.1s = 12.7 m/s
; requesting messages 779..798 from imap server
; got 20 messages
* wrote broken message to bad-message.txt
/Users/asf/Hacks/heliotrope/lib/heliotrope/meta-index.rb:447:in `block in add_labels_to_labellist!': -:< is an invalid label (Heliotrope::MetaIndex::InvalidLabelError)
        from /Users/asf/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/set.rb:222:in `block in each'
        from /Users/asf/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/set.rb:222:in `each_key'
        from /Users/asf/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/set.rb:222:in `each'
        from /Users/asf/Hacks/heliotrope/lib/heliotrope/meta-index.rb:447:in `add_labels_to_labellist!'
        from /Users/asf/Hacks/heliotrope/lib/heliotrope/meta-index.rb:147:in `add_message'
        from bin/heliotrope-import:133:in `block in <main>'
        from /Users/asf/Hacks/heliotrope/lib/heliotrope/message-adder.rb:57:in `block in each_message'
        from /Users/asf/Hacks/heliotrope/lib/heliotrope/imap-dumper.rb:149:in `each_message'
        from /Users/asf/Hacks/heliotrope/lib/heliotrope/message-adder.rb:42:in `each_message'
        from bin/heliotrope-import:109:in `<main>'

I'm kiiind of attached to this label, any way to support it? Thanks!

heliotrope-add doesn't work while heliotrope-server is running

If I try running heliotrope-add while heliotrope-server is running I get an error:

IMAP password (displayed!): /var/lib/gems/1.9.1/gems/leveldb-ruby-0.7/lib/leveldb.rb:11:in `make': IO error: lock      /home/hamish/.heliotrope/store/LOCK: Resource temporarily unavailable (LevelDB::Error)
        from /var/lib/gems/1.9.1/gems/leveldb-ruby-0.7/lib/leveldb.rb:11:in `new'
        from /home/hamish/dev/heliotrope/lib/heliotrope/index.rb:45:in `initialize'
        from bin/heliotrope-add:113:in `new'
        from bin/heliotrope-add:113:in `<main>'

Is this expected to work? If not how do I add new messages to the message store while heliotrope-server is running?

Searching for sender with upper-case search terms doesn’t work

I have an email from "Olivier", but I can’t find it using \Olivier. I need to use \olivier (lower-case o). This seems logical, given that the terms are downcased when indexed:

entry.add_string "from", indexable_text_for(message.from).downcase

(lib/heliotrope/meta-index.rb:617)

I’m not sure whether this bug is in heliotrope or in whistlepig itself.
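
To illustrate the mismatch, here is a toy index (not whistlepig) that downcases terms when indexing, as the line quoted above does. The class TinyIndex and the normalize flag are hypothetical; the point is only that the query has to be normalised the same way as the indexed text.

```ruby
# A toy inverted index that lowercases terms at index time.
class TinyIndex
  def initialize
    @postings = Hash.new { |hash, term| hash[term] = [] }
  end

  # Index the "from" text of a message under its lowercased terms.
  def add(doc_id, from)
    from.downcase.split(/\W+/).reject(&:empty?).uniq.each do |term|
      @postings[term] << doc_id
    end
  end

  # Look up a single term, optionally lowercasing it first.
  def search(term, normalize: true)
    term = term.downcase if normalize
    @postings.fetch(term, [])
  end
end

index = TinyIndex.new
index.add(1, "Olivier <olivier@example.com>")
p index.search("Olivier", normalize: false)
p index.search("Olivier")
```

If the query side were downcased before being handed to the index, upper-case search terms would match regardless of which layer currently owns the behaviour.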

html2text timeout

Hi,

Just stumbled on heliotrope, love the idea (had similar thoughts recently) and thought I'd give it a go. Discovered that one email in particular from my local Maildir caused html2text to hang during import. A timeout on how long html2text is allowed to run would probably be useful, causing the email to be skipped as bad.

If that makes sense?
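
Something along these lines, perhaps. This is a sketch using Ruby's standard Timeout module and IO.popen; run_with_timeout is a hypothetical helper, and the demo uses the Ruby interpreter itself as the child process because html2text may not be installed. It assumes a Unix-like system where the child can be killed with SIGKILL.

```ruby
require "timeout"
require "rbconfig"

# Feed `input` to an external command and return its output, or nil if the
# command does not finish within `seconds` (the child is then killed).
# (run_with_timeout is a hypothetical name, not part of heliotrope.)
def run_with_timeout(argv, input, seconds)
  io = IO.popen(argv, "r+")
  begin
    Timeout.timeout(seconds) do
      io.write(input)
      io.close_write
      io.read
    end
  rescue Timeout::Error
    Process.kill("KILL", io.pid)
    nil
  ensure
    io.close unless io.closed?
  end
end

# Stand-in child processes: one that answers promptly, one that hangs.
quick = [RbConfig.ruby, "-e", "print STDIN.read.upcase"]
stuck = [RbConfig.ruby, "-e", "sleep 30"]

p run_with_timeout(quick, "<p>hello</p>", 10)
p run_with_timeout(stuck, "<p>hello</p>", 1)
```

A nil result could then be treated like any other bad message and skipped, with the command presumably being html2text in the real importer.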

invalid byte sequence in UTF-8

heliotrope-add chokes on invalid UTF-8. The following patch stops the error so that the import continues, but I don't know if it's the right thing to do.

diff --git a/lib/heliotrope/maildir-walker.rb b/lib/heliotrope/maildir-walker.rb
index 319b8cd..b03674d 100644
--- a/lib/heliotrope/maildir-walker.rb
+++ b/lib/heliotrope/maildir-walker.rb
@@ -47,11 +47,22 @@ private
   def get_date_in_file fn
     File.open(fn) do |f|
       while(l = f.gets)
+        error_count = 0
+        begin
         if l =~ /^Date:\s+(.+\S)\s*$/
           date = $1
           pdate = Time.parse($1)
           return pdate
         end
+        rescue => e
+          unless error_count > 1
+            l.encode!('utf-8', 'utf-8', :invalid => :replace)
+            error_count += 1
+            retry
+          else
+            puts "; cannot fix: #{e}: #{l}"
+          end
+        end
       end
     end
     puts "; warning: no date in #{fn}"
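
An alternative sketch, assuming a Ruby new enough to have String#scrub (2.1 or later), which replaces invalid byte sequences without a retry loop. The helper name utf8_safe and the "?" replacement character are arbitrary choices for illustration.

```ruby
# Return a copy of the line that can safely be matched against a regexp:
# the bytes are tagged as UTF-8 and any invalid sequences become "?".
# (utf8_safe is a hypothetical helper name.)
def utf8_safe(line)
  line = line.dup.force_encoding(Encoding::UTF_8)
  line.valid_encoding? ? line : line.scrub("?")
end

# A header line containing a stray Latin-1 byte (0xE9).
raw = "Date: Mon, 09 Jan 2012 16:40:56 -0800 caf\xE9\n".b
clean = utf8_safe(raw)
puts clean.valid_encoding?
puts clean[/^Date:\s+(.+\S)\s*$/, 1]
```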

heliotrope-add and maildir

From: William Morgan [email protected]
Newsgroups: gmane.mail.sup.devel
Subject: [Heliotrope/Turnsole] How to use IMAP?
Date: Mon, 09 Jan 2012 16:40:56 -0800
Reply-To: Sup developer discussion [email protected]

Excerpts from Michael Stapelberg's message of 2012-01-08 14:48:18 -0800:

What is the correct way to do this?

I need to write some code to do this. For IMAP and GMail, heliotrope-add will
keep a pointer to the last message imported, by default. For mbox there is a
trick you can use. But there's nothing for maildir right now. Please file
an issue so that I don't forget this!

Error while importing from Gmail.

IMAP password (displayed!): ; loading mail...
; connecting to imap.gmail.com:993 (ssl: true)...
; login as [email protected] ...
; found 43628 new messages...
; found 43628 messages to import
; requesting messages 167251..167300 from imap server
; got 23 messages

* wrote broken message to bad-message.txt
./lib/heliotrope/message.rb:172:in `decode_mime_parts': undefined method `multipart?' for nil:NilClass (NoMethodError)
    from ./lib/heliotrope/message.rb:175:in `decode_mime_parts'
    from ./lib/heliotrope/message.rb:178:in `decode_mime_parts'
    from ./lib/heliotrope/message.rb:177:in `map'
    from ./lib/heliotrope/message.rb:177:in `decode_mime_parts'
    from ./lib/heliotrope/message.rb:147:in `mime_parts'
    from ./lib/heliotrope/message.rb:133:in `has_attachment?'
    from ./lib/heliotrope/index.rb:77:in `add_message'
    from bin/heliotrope-add:147

Patched it for now with https://gist.github.com/1074432 (let me know if you want a copy of bad-message.txt).

readme text

Perhaps I just couldn't follow your thinking, but I found the last part of the README a bit confusing, where it talks about heliotrope-add: "To add messages from existing GMail, IMAP, mbox or maildir sources..."

It mentions using heliotrope-add with --state-file to add messages at a regular interval, but the while loop then uses heliotrope-import and no --state-file.

Crash with empty message.recipients

I noticed that heliotrope-add bangs out on message [email protected] from the debian-vote list: http://lists.debian.org/debian-vote/2008/03/msg00130.html

# ruby1.9.1 -Ilib ./bin/heliotrope-add -m debian-vote-2008 -d test
/home/test/GIT/heliotrope/lib/heliotrope/index.rb:482:in `block in index!':
undefined method `indexable_text' for nil:NilClass (NoMethodError)
    from /home/test/GIT/heliotrope/lib/heliotrope/index.rb:482:in `map'
    from /home/test/GIT/heliotrope/lib/heliotrope/index.rb:482:in `index!'
    from /home/test/GIT/heliotrope/lib/heliotrope/index.rb:80:in `add_message'
    from ./bin/heliotrope-add:113:in `<main>'

As you can see even in the HTML representation of the message linked above:

This code in line 482 of lib/heliotrope/index.rb will fail if any recipient is empty:

message.recipients.map { |x| x.indexable_text }.join(" ").downcase

I'm lacking the Ruby skills to make heliotrope cope with such pathological messages. In Python, I would fix it like this:

[x.indexable_text for x in message.recipients if x]
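
For what it's worth, a Ruby equivalent of that comprehension could look like the sketch below. Recipient is a hypothetical stand-in for whatever object heliotrope stores per recipient, and recipients_text is an illustrative name; treating a nil recipients list as empty is an extra assumption.

```ruby
# Hypothetical stand-in for heliotrope's recipient objects.
Recipient = Struct.new(:indexable_text)

# Join the indexable text of all recipients, skipping nil entries.
def recipients_text(recipients)
  (recipients || []).compact.map(&:indexable_text).join(" ").downcase
end

list = [Recipient.new("Alice <alice@example.org>"), nil, Recipient.new("Bob <bob@example.org>")]
puts recipients_text(list)
```

Array#compact drops the nil entries, which plays the role of the `if x` filter in the Python version.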

heliotrope-add -m mbox crashes in transcode

Just tried importing my old mbox with this command:

ruby -Ilib bin/heliotrope-add -m ~/tmp/mailbox -d data

and got this crash at some point:

* wrote broken message to bad-message.txt
end offset is 29492938
/home/ash/src/heliotrope/lib/heliotrope/decoder.rb:50:in `force_encoding': can't convert nil into String (TypeError)
from /home/ash/src/heliotrope/lib/heliotrope/decoder.rb:50:in `transcode'
from /home/ash/src/heliotrope/lib/heliotrope/message.rb:265:in `mime_content_for'
from /home/ash/src/heliotrope/lib/heliotrope/message.rb:187:in `decode_mime_parts'
from /home/ash/src/heliotrope/lib/heliotrope/message.rb:181:in `block in decode_mime_parts'
from /home/ash/src/heliotrope/lib/heliotrope/message.rb:181:in `map'
from /home/ash/src/heliotrope/lib/heliotrope/message.rb:181:in `decode_mime_parts'
from /home/ash/src/heliotrope/lib/heliotrope/message.rb:147:in `mime_parts'
from /home/ash/src/heliotrope/lib/heliotrope/message.rb:133:in `has_attachment?'
from /home/ash/src/heliotrope/lib/heliotrope/meta-index.rb:72:in `add_message'
from bin/heliotrope-add:149:in `<main>'

Ruby version:

$ type ruby
ruby is hashed (/home/ash/.rvm/rubies/ruby-1.9.2-p290/bin/ruby)
$ ruby -v
ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-linux]
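
Judging by the trace, force_encoding is being handed nil, possibly because the MIME part declares no charset; that reading is an assumption. Under that assumption, a defensive transcode might look like this sketch (safe_transcode is a hypothetical name, and treating unknown or missing charsets as binary is one choice among several):

```ruby
# Convert `text` from the declared charset to UTF-8. A missing or unknown
# charset is treated as binary, so non-ASCII bytes become replacement
# characters instead of raising.
def safe_transcode(text, charset)
  source = begin
    Encoding.find(charset.to_s)
  rescue ArgumentError
    Encoding::BINARY
  end
  text.dup.force_encoding(source).encode("UTF-8", invalid: :replace, undef: :replace)
end

puts safe_transcode("caf\xE9".b, "ISO-8859-1")
puts safe_transcode("plain text", nil)
puts safe_transcode("plain text", "no-such-charset")
```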
