Giter Club home page Giter Club logo

Comments (7)

JocelynDelalande avatar JocelynDelalande commented on June 2, 2024

Long time I did not used Mailpile, but I did have the exact same symptoms. Can remember for sure if the messages got deleted from imap, but I don't think so.

I cannot remember if I was able to recover them with a full rescan, but I think that no, and anyway, the bug was too frequent and the rescan too long for that use.

from mailpile.

JackDca avatar JackDca commented on June 2, 2024

Thanks for the comments @JocelynDelalande.

I can't figure out what could cause this problem. But, I've added some code to search.py which will do a traceback.print_stack() when an entry is written to INDEX that has a snippet but no file pointer. I'm redirecting all the stderr output to a file. If the problem happens again I'll have a lot more to work with than I do now. Could take months though!

from mailpile.

JackDca avatar JackDca commented on June 2, 2024

Bingo! The problem occurred again (actually on 2021-03-23 but I've only now had time to look at the results). I had added code to do a traceback when a metadata entry was written with a blank location but a non-trivial "snippet'. This produced about 20000 tracebacks which all look like this:

  14040   File "/usr/lib/python2.7/threading.py", line 774, in __bootstrap
  14041     self.__bootstrap_inner()
  14042   File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
  14043     self.run()
  14044   File "/home/jack/Projects/MPJack/Code/Mailpile/mailpile/mail_source/__init__.py", line 1094, in run
  14045     self.sync_mail()
  14046   File "/home/jack/Projects/MPJack/Code/Mailpile/mailpile/mail_source/__init__.py", line 319, in sync_mail
  14047     stop_after=this_batch)
  14048   File "/home/jack/Projects/MPJack/Code/Mailpile/mailpile/mail_source/__init__.py", line 1010, in rescan_mailbox
  14049     **scan_mailbox_args)
  14050   File "/home/jack/Projects/MPJack/Code/Mailpile/mailpile/search.py", line 565, in scan_mailbox
  14051     self._remove_location(session, msg_ptr)
  14052   File "/home/jack/Projects/MPJack/Code/Mailpile/mailpile/search.py", line 386, in _remove_location
  14053     self.set_msg_at_idx_pos(msg_idx_pos, msg_info)
  14054   File "/home/jack/Projects/MPJack/Code/Mailpile/mailpile/search.py", line 1717, in set_msg_at_idx_pos
  14055     traceback.print_stack()

The decision to delete the filepointer (location) is made in mailpile.search.scan_mailbox() in this code:

553            # Figure out which messages exist at all, and remove stale pointers.
554            # This happens last and in a locked section, because other threads may
555            # have added messages while we were busy with other things.
556            with self._lock:
557                messages = sorted(mbox.keys())  # Update this, in case it changed
558                existing_ptrs = set()
559                for ui in range(0, len(messages)):
560                    msg_ptr = mbox.get_msg_ptr(mailbox_idx, messages[ui])
561                    existing_ptrs.add(msg_ptr)
562                for msg_ptr in self.PTRS.keys():
563                    if (msg_ptr[:MBX_ID_LEN] == mailbox_idx and
564                            msg_ptr not in existing_ptrs):
565                        self._remove_location(session, msg_ptr)
566                        updated += 1

I have never worked on this part of the code so I don't know why it is causing the problem.

Best option is to repeat the process by adding some code to print some variables at this point and wait for the problem to happen again.

from mailpile.

JackDca avatar JackDca commented on June 2, 2024

Update: This problem occurred again on 2021-04-16. There were about 16554 metadata index lines having a snippet but no message location. Unfortunately, due to an error in my traceback code, I got no useful information from this occurrence.

A rescan all appeared to recover the data, but I have not verified that with certainty. I fixed my error (I hope) and am awaiting the next occurrence. As of the date of this post, there has been none. That's 25 days so far.

Of course, the Heisenberg Uncertainty Principle may come into play here. Is my traceback code affecting the program in a way which prevents the problem from being manifested?

Patience!!

from mailpile.

JackDca avatar JackDca commented on June 2, 2024

Update: The problem occurred again on 2021-06-03. Once again I shut down, saved the homedir, restarted and ran rescan all which appeared to recover all the emails (though have not proved that).

This time debug code produced some useful information. Debug information was printed from search.py function scan_mailbox whenever Mailpile deleted a message location pointer for a message that had MSG_PTRS (pointer to the file containing the message) with length < 20 (i.e. 1 or 0 message location pointers ) and MSG_BODY (the "snippet") with length > 3. Clearly the lone message pointer to a non-empty message should not be deleted. But this occurred 21293 times. In all cases the message was from the same mailbox, which was the INBOX of one of my three email accounts. That mailbox now has about 21500 emails so it looks like most or all of the emails in that INBOX were affected.. The code block below shows where the debug output was placed. The debug output showed that messages had zero length. The list messages is set twice at line 476 and 556 in this function with the statement messages = sorted(mbox.keys()). Debug code also captured the length of messages before the line 556 update. In all cases the length was 0 before and after the line 556 update.

The next step is to put debug code in mailbox_opener() and mbox.update_toc() to try to determine why one or the other produces the zero-length mbox.

These details are provided in the hope that they may be useful to someone with a better understanding of the code than mine.

Location of the debug code:

            # Figure out which messages exist at all, and remove stale pointers.
            # This happens last and in a locked section, because other threads may
            # have added messages while we were busy with other things.
            with self._lock:
                existing_ptrs = set()
                for ui in range(0, len(messages)):
                    msg_ptr = mbox.get_msg_ptr(mailbox_idx, messages[ui])
                    existing_ptrs.add(msg_ptr)
                for msg_ptr in self.PTRS.keys():
                    if (msg_ptr[:MBX_ID_LEN] == mailbox_idx and
                            msg_ptr not in existing_ptrs):
                        # ********* Debug output placed here **********
                        self._remove_location(session, msg_ptr)
                        updated += 1

The traceback (occurred 21293 times)

File "/usr/lib/python2.7/threading.py", line 774, in __bootstrap
  self.__bootstrap_inner()
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
  self.run()
File "/home/jack/Projects/MPJack/Code/Mailpile/mailpile/mail_source/__init__.py", line 1094, in run
  self.sync_mail()
File "/home/jack/Projects/MPJack/Code/Mailpile/mailpile/mail_source/__init__.py", line 319, in sync_mail
  stop_after=this_batch)
File "/home/jack/Projects/MPJack/Code/Mailpile/mailpile/mail_source/__init__.py", line 1010, in rescan_mailbox
  **scan_mailbox_args)
File "/home/jack/Projects/MPJack/Code/Mailpile/mailpile/search.py", line 590, in scan_mailbox
  traceback.print_stack()

from mailpile.

JackDca avatar JackDca commented on June 2, 2024

Update:

I inserted a statement at line 557 of search.py so that the code that deletes message location pointers is skipped if the mbox is zero length (which can't be correct). I've run Mailpile for one month and the "Message not found" problem has not occurred in that time so the kludgy fix is likely working (though I've gone for more than a month between "Message not found" occurrences before). I've also logged a lot (60 MB) of debug printout in an effort to find out why the mbox is zero length. Next step is to find some time to analyze the debug output! Not sure when I will get to it.

        if not lazy:
            # Figure out which messages exist at all, and remove stale pointers.
            # This happens last and in a locked section, because other threads may
            # have added messages while we were busy with other things.
            with self._lock:
                messages = sorted(mbox.keys())  # Update this, in case it changed
                if(len(messages) > 0):                    # *** DEBUG kludge!
                    existing_ptrs = set()


from mailpile.

JackDca avatar JackDca commented on June 2, 2024

(Almost a year later.) The above mentioned patch appears to prevent the wholesale deletion of message pointers so that emails are not lost. But it also appears to prevent messages from being deleted.

It appears that the problem may be caused by concurrent access to messages by multiple threads. In an attempt to find a workaround and further narrow down the problem I added code to prevent some threads from being launched, specifically those which do tag automation. After logging several months of debug data, it appears that the len(messages) == 0 condition occurs just as frequently even with those threads disabled. So I am no wiser and I am out of ideas.

from mailpile.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.