Giter Club home page Giter Club logo

Comments (25)

vrajat avatar vrajat commented on July 19, 2024

I've released a new version v0.6.2. One major change is that I had to split sqlite to keep the command line clean. Refer below. I'll work on revamping docs. Please follow #38

usage: piicatcher sqlite [-h] -s PATH [-f {ascii_table,json,db}]
                         [-c {deep,shallow}] [-o OUTPUT] [--list-all]

optional arguments:
  -h, --help            show this help message and exit
  -s PATH, --path PATH  File path to SQLite database
  -f {ascii_table,json,db}, --output-format {ascii_table,json,db}
                        Choose output format type
  -c {deep,shallow}, --scan-type {deep,shallow}
                        Choose deep(scan data) or shallow(scan column names
                        only)
  -o OUTPUT, --output OUTPUT
                        File path for report. If not specified, then report is
                        printed to sys.stdout
  --list-all            List all columns. By default only columns with PII
                        information is listed

from piicatcher.

jayeshagwan1 avatar jayeshagwan1 commented on July 19, 2024

@vrajat Getting this error after v0.6.2. files also not working now

image

from piicatcher.

vrajat avatar vrajat commented on July 19, 2024

Can you check after running pip install pymssql ? The package is part of requirements and it should have been involved. I am not sure why it got missed. I'll see if I can reproduce it in the background.

from piicatcher.

jayeshagwan1 avatar jayeshagwan1 commented on July 19, 2024

Tried installing pymssql

image

from piicatcher.

vrajat avatar vrajat commented on July 19, 2024

More info: #29 (comment)
Moore's law. I'll release a new version with the right version.

from piicatcher.

jayeshagwan1 avatar jayeshagwan1 commented on July 19, 2024

For temporary fix I ran pip install "pymssql<3.0". Now Files are running fine. For db not able to get data:

image

from piicatcher.

vrajat avatar vrajat commented on July 19, 2024

OK. By default it only lists columns that has PII. You can list all columns using --list-all.

Shallow checks names of columns only. It looks for column names like name, email etc using regular expressions from https://github.com/madisonmay/CommonRegex. Do you have columns with these names ?

Another option is to also check data using -c deep

from piicatcher.

jayeshagwan1 avatar jayeshagwan1 commented on July 19, 2024

Yes I have such columns with names:

image

--list-all with -c deep

image

from piicatcher.

vrajat avatar vrajat commented on July 19, 2024

I recreated the table and it works for me:
Screenshot from 2019-12-19 15-45-10

Can you check if piicatcher can read all the tables in the sqlite database ? list all should not have returned an empty list. That points that it cannot find any tables and columns.

What are the data types of these columns ? piicatcher only checks text, varchar and char columns.

sqlite does not have user/roles/grant. So it cant be that.

Can you check if the following query returns any rows in sqlite3 ?

sqlite3 db.data.sqlite

sqlite>SELECT 
            "" as schema_name,
            m.name as table_name, 
            p.name as column_name,
            p.type as data_type
        FROM 
            sqlite_master AS m
        JOIN 
            pragma_table_info(m.name) AS p
        WHERE
            p.type like 'text' or p.type like 'varchar%' or p.type like 'char%'
        ORDER BY 
            m.name, 
            p.name

from piicatcher.

jayeshagwan1 avatar jayeshagwan1 commented on July 19, 2024

May be I am missing something
This is my sqlite

image

Anything you can see is not correct ?

from piicatcher.

vrajat avatar vrajat commented on July 19, 2024

Piicatcher should definitely detect these columns. I dont see anything wrong. Can you run this SQL query:

SELECT 
        "" as schema_name,
        m.name as table_name, 
        p.name as column_name,
        p.type as data_type
    FROM 
        sqlite_master AS m
    JOIN 
        pragma_table_info(m.name) AS p
    ORDER BY 
        m.name, 
        p.name

from piicatcher.

jayeshagwan1 avatar jayeshagwan1 commented on July 19, 2024

image

Any issues ?

from piicatcher.

vrajat avatar vrajat commented on July 19, 2024

Yes. data_type is null or empty. That is unexpected. I get

|test|address|varchar
|test|city|varchar
|test|company_name|varchar
|test|county|varchar
|test|email|varchar web
|test|first_name|varchar
|test|last_name|varchar
|test|phone1|varchar
|test|phone2|varchar
|test|state|varchar
|test|zip|varchar

What was your CREATE TABLE statement ? My guess is that you did not specify a type. I can reproduce with:

create table no_type(a,b);
# run select query
|no_type|a|
|no_type|b|

Can you recreate the table with data types ? For example:

create table test(first_name varchar, last_name varchar, company_name varchar, address varchar, city varchar, county varchar, state varchar, zip varchar, phone1 varchar, phone2 varchar, email varchar web);

from piicatcher.

jayeshagwan1 avatar jayeshagwan1 commented on July 19, 2024

I have script like this which has datatypes:

CREATE TABLE IF NOT EXISTS [userdata] (
[first_name] VARCHAR NULL,
[last_name] VARCHAR NULL,
[company_name] VARCHAR NULL,
[address] VARCHAR NULL,
[city] VARCHAR NULL,
[county] VARCHAR NULL,
[state] VARCHAR NULL,
[zip] INT NULL,
[phone1] VARCHAR NULL,
[phone2] VARCHAR NULL,
[email] VARCHAR NULL,
[web] VARCHAR NULL
);

from piicatcher.

vrajat avatar vrajat commented on July 19, 2024

I get the data type for that table.

|userdata|address|VARCHAR
|userdata|city|VARCHAR
|userdata|company_name|VARCHAR
|userdata|county|VARCHAR
|userdata|email|VARCHAR
|userdata|first_name|VARCHAR
|userdata|last_name|VARCHAR
|userdata|phone1|VARCHAR
|userdata|phone2|VARCHAR
|userdata|state|VARCHAR
|userdata|web|VARCHAR
|userdata|zip|INT

Which version of sqlite do you have ?

I have:

SQLite 3.29.0 2019-07-10 17:32:03 fc82b73eaac8b36950e527f12c4b5dc1e147e6f4ad2217ae43ad82882a88alt1
zlib version 1.2.11
gcc-9.2.1 20191008

from piicatcher.

jayeshagwan1 avatar jayeshagwan1 commented on July 19, 2024

V 3.30.1.0

from piicatcher.

vrajat avatar vrajat commented on July 19, 2024

OK. Then I am at a loss why data type is not returned for you.

from piicatcher.

jayeshagwan1 avatar jayeshagwan1 commented on July 19, 2024

I altered script and removed null and now I am getting the desired result:

image

But piicatcher sqlite still not returning data

image

from piicatcher.

jayeshagwan1 avatar jayeshagwan1 commented on July 19, 2024

My bad.... Now its working.. Thanks a lot...
image

from piicatcher.

vrajat avatar vrajat commented on July 19, 2024

from piicatcher.

jayeshagwan1 avatar jayeshagwan1 commented on July 19, 2024

I have tested piicatcher on different file formats like json, plaintext, xml, csv but does not work with xlsx, xlx

from piicatcher.

vrajat avatar vrajat commented on July 19, 2024

It does not support binary formats like xlsx, xls and PDF yet. Can you file a new feature request ?

from piicatcher.

jayeshagwan1 avatar jayeshagwan1 commented on July 19, 2024

Sure ...

from piicatcher.

jayeshagwan1 avatar jayeshagwan1 commented on July 19, 2024

Getting error while having deep scan mysql db

image

from piicatcher.

vrajat avatar vrajat commented on July 19, 2024

I've released a new version. Thanks for all the bug reports! Please open a new issue for future problems.

from piicatcher.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.