Comments (9)
Got you @emanuel030, thanks for reporting. I will try to add support for this within this month. In the meantime, if you need it ASAP, you should be able to define your own regex for your headers using the arguments regex and regex_alert in the WhatsAppChat class.
We are currently discussing how to provide extensive support to all formats, since it seems there are endless header formats.
from whatstk.
Another type would be (German version of the app):
[DD.MM.YY, hh:mm:ss] username:
@emanuel030 is this the message header or a copy and paste? For example, this is an export from my phone:
6/28/16, 15:53 - +55 00 0000-0000: [10/6 11:31] +55 11 1111-1111: Vixe, consegui com nosso colega Luiz, uma cartilha do MEC q não autoriza diárias para estudante.
[10/6 11:32] +55 22 2222-2222: Aqui viaja.. internacional também... nas só passagem, sem diárias. Recebem uma ajuda de custo pelo órgão
[10/6 11:51] +55 33 3333-3333: ALUNO = AJUDA DE CUSTO, independente da origem do recurso
[10/6 11:56] +55 44 4444-4444: Aluno vai pelo auxílio viagem
Here the user +55 00 0000-0000 copied and pasted messages from +55 11 1111-1111, +55 22 2222-2222, +55 33 3333-3333 and +55 44 4444-4444. That's why treating [] brackets as part of the header is not a good choice in my case: in the example above, it's all part of the same message, which is a copy and paste.
I hope I have made myself understood.
This would be a copy from the chat itself:
[20.09.17, 16:54:32] Veronica MBS: Hey guys!
[20.09.17, 16:54:43] Veronica MBS: I just created
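For reference, a header of this shape can be matched with a plain regular expression. The following is only a minimal sketch using Python's standard re module, not whatstk's own parser: the names HEADER_DE and parse_line are hypothetical, and it assumes the date is always DD.MM.YY and that usernames contain no colon.

```python
import re
from datetime import datetime

# Hypothetical pattern for headers like "[20.09.17, 16:54:32] Veronica MBS: ".
# Assumption: usernames never contain a colon.
HEADER_DE = re.compile(
    r"^\[(?P<date>\d{2}\.\d{2}\.\d{2}), (?P<time>\d{2}:\d{2}:\d{2})\] (?P<user>[^:]+): "
)


def parse_line(line):
    """Return (timestamp, username, text), or None if the line has no header."""
    match = HEADER_DE.match(line)
    if match is None:
        return None
    # Assumption: day.month.two-digit-year, 24-hour clock.
    timestamp = datetime.strptime(
        f"{match['date']} {match['time']}", "%d.%m.%y %H:%M:%S"
    )
    return timestamp, match["user"], line[match.end():]


print(parse_line("[20.09.17, 16:54:32] Veronica MBS: Hey guys!"))
```

A line without such a header makes parse_line return None, which is one way to recognise continuation lines of a multi-line message.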
@emanuel030 Can you do a copy and paste in that chat? For example, select these two messages, paste them into that chat, and export it? =)
@kafran The header does not change in my case if you copy text, and it does not change if you reply to one of the previous chats either. It is still:
[17.04.18, 21:06:52] Jack: Sorry mlad but im home and hosed
[20.04.18, 16:17:52] emanuel: Sorry mlad but im home and hosed
I would like to propose that you join this group about whatstk, so we can generate the same chat, export it from multiple devices, languages and timezones, and understand how WhatsApp constructs these files, as it seems to be a mystery: https://chat.whatsapp.com/LBR8ZMZWc57JZ1RKTNRKVa
What do you think? @lucasrodes @albertaparicio @emanuel030
I think the expressions are already too general and are detecting fragments other than the header itself. The problem is with copied and pasted messages, which also contain a "header". This is an example exported from my phone:
6/28/16, 15:49 - +55 11 1111-1111: Karol, alguém postou algo sobre auxílio financeiro p alunos. Vou dar uma procurada.
6/28/16, 15:53 - +55 11 1111-1111: [10/6 11:31] +55 22 2222-2222: Vixe, consegui com nosso colega Luiz ,uma cartilha q não autoriza diárias para estudante.
[10/6 11:32] +55 33 3333-3333: Aqui viaja.. internacional também... nas só passagem, sem diárias. Recebem uma ajuda de custo pelo órgão
[10/6 11:51] +55 44 4444-4444: ALUNO = AJUDA DE CUSTO, independente da origem do recurso
[10/6 11:56] +55 55 5555-5555: Aluno vai pelo auxílio viagem
6/28/16, 15:53 - +55 11 1111-1111: Essa tal dessa cartilha. Vou procurar.
Here is the correct regex for it: https://goo.gl/U8B94g
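To illustrate the idea (this is a sketch of the approach, not necessarily the expression behind the link): a pattern anchored to the start of a line and tied to the exact "M/D/YY, HH:MM - user: " shape will not fire on pasted fragments such as "[10/6 11:31] ...". The names HEADER and split_messages are hypothetical, the sample text is shortened from the export above, and the sketch assumes each message starts on its own line in the exported file.

```python
import re

# Anchored header: "6/28/16, 15:53 - +55 11 1111-1111: ".
# Assumption: usernames or phone numbers contain no colon.
HEADER = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{2}), (\d{1,2}:\d{2}) - ([^:]+): ",
    re.MULTILINE,
)


def split_messages(text):
    """Split an export into (date, time, user, body) tuples."""
    matches = list(HEADER.finditer(text))
    messages = []
    for i, m in enumerate(matches):
        # The body runs until the next real header, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[m.end():end].rstrip("\n")
        messages.append((m.group(1), m.group(2), m.group(3), body))
    return messages


# Shortened sample: the second message contains pasted "headers".
export = (
    "6/28/16, 15:49 - +55 11 1111-1111: Vou dar uma procurada.\n"
    "6/28/16, 15:53 - +55 11 1111-1111: [10/6 11:31] +55 22 2222-2222: Vixe\n"
    "[10/6 11:32] +55 33 3333-3333: Aqui viaja\n"
    "6/28/16, 15:53 - +55 11 1111-1111: Vou procurar.\n"
)
messages = split_messages(export)
print(len(messages))  # 3
```

The pasted lines stay inside the body of the second message because they begin with "[" rather than a date, so the anchored pattern never matches them.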
Issue closed because the header format can now be set manually (see the WhatsAppChat arguments).