Datafaker is a tool for generating large-scale test data and streaming test data. It fakes data and inserts it into a variety of data sources.
D:\datafaker>datafaker rdb mysqlclient://root:root@localhost:3600/test?charset=utf8 stu 10 --outprint --meta meta.txt --outspliter ',,'
Traceback (most recent call last):
File "D:\python\lib\site-packages\datafaker\cli.py", line 77, in main
db = load_db_class(args.dbtype)(args)
File "D:\python\lib\site-packages\datafaker\dbs\basedb.py", line 18, in __init__
self.schema = self.parse_schema()
File "D:\python\lib\site-packages\datafaker\dbs\basedb.py", line 127, in parse_schema
schema = self.parse_meta_schema()
File "D:\python\lib\site-packages\datafaker\dbs\basedb.py", line 137, in parse_meta_schema
rows = self.construct_meta_rows()
File "D:\python\lib\site-packages\datafaker\dbs\basedb.py", line 201, in construct_meta_rows
lines = read_file_lines(filepath)
File "D:\python\lib\site-packages\datafaker\utils.py", line 84, in read_file_lines
lines = fp.read().splitlines()
File "D:\python\lib\codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd7 in position 9: invalid continuation byte
'utf-8' codec can't decode byte 0xd7 in position 9: invalid continuation byte
Exception ignored in: <bound method RdbDB.__del__ of <datafaker.dbs.rdbdb.RdbDB object at 0x000002171485B6D8>>
Traceback (most recent call last):
File "D:\python\lib\site-packages\datafaker\dbs\rdbdb.py", line 14, in __del__
self.session.close()
AttributeError: 'RdbDB' object has no attribute 'session'
@gangly Could you please take a look at what the problem is?
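The traceback above fails while decoding the meta file as UTF-8, which commonly happens when the file was saved in GBK on Windows. A minimal sketch of a tolerant reader, assuming the file is either UTF-8 or GBK; the helper name `read_lines_with_fallback` is hypothetical and is not part of datafaker:

```python
import os
import tempfile


def read_lines_with_fallback(path, encodings=("utf-8", "gbk")):
    """Return the lines of a text file, trying each encoding in turn."""
    with open(path, "rb") as fp:
        raw = fp.read()
    for enc in encodings:
        try:
            return raw.decode(enc).splitlines()
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
    raise ValueError("could not decode {} with any of {}".format(path, encodings))


# Demo: write a GBK-encoded meta line, then read it back.
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as tmp:
    tmp.write("name||varchar(20)||学生名字\n".encode("gbk"))
lines = read_lines_with_fallback(tmp.name)
os.remove(tmp.name)
print(lines)
```

Re-saving the meta file as UTF-8 in an editor is the simpler route when changing code is not an option.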
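Administrative regions are mixed up in generated addresses. For example: 海南省银川市梁平广州路z座 (Yinchuan is not a city in Hainan Province).
Writing data to a cluster with Kerberos and access control enabled fails with an error: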
sh-4.2$ datafaker es 101.12.67.77:9200 mytest01/_doc 10 --meta meta.txt
Process Process-4:
Traceback (most recent call last):
File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/site-packages/datafaker/dbs/basedb.py", line 122, in save
self.save_data(lines)
File "/usr/lib/python2.7/site-packages/datafaker/dbs/esdb.py", line 38, in save_data
success, _ = bulk(self.es, actions, index=self.args.table, raise_on_error=True)
File "/usr/lib/python2.7/site-packages/elasticsearch/helpers/actions.py", line 310, in bulk
for ok, item in streaming_bulk(client, actions, *args, **kwargs):
File "/usr/lib/python2.7/site-packages/elasticsearch/helpers/actions.py", line 240, in streaming_bulk
**kwargs
File "/usr/lib/python2.7/site-packages/elasticsearch/helpers/actions.py", line 126, in _process_bulk_chunk
raise e
AuthenticationException: AuthenticationException(401, u'security_exception', u'missing authentication credentials for REST request [/mytest01%2F_doc/_bulk]')
time used: 0.301 s
Data rules:
sh-4.2$ cat meta.txt
id||int||自增id[:inc(id,1)]
name||varchar(20)||学生名字
sh-4.2$
Index:
curl -XPUT --negotiate -u : 'http://101.12.67.77:9200/mytest01/?pretty=true' -H 'Content-Type:application/json' -d '{"mappings":{"properties":{"id":{"type":"long"},"name":{"type":"text"}}}}'
For example, the index com1 already contains 100 documents and I want to append more data to it. What command and metadata should I use?
% datafaker file . out.txt 10 --meta mg_mock_meta.txt
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/datafaker-0.7.4-py3.8.egg/datafaker/cli.py", line 89, in main
db = load_db_class(args.dbtype)(args)
File "/usr/local/lib/python3.8/site-packages/datafaker-0.7.4-py3.8.egg/datafaker/cli.py", line 79, in load_db_class
module = __import__(pkgname, fromlist=(classname))
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/usr/local/lib/python3.8/site-packages/datafaker-0.7.4-py3.8.egg/datafaker/dbs/filedb.py", line 5, in <module>
from datafaker.compat import safe_encode
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/usr/local/lib/python3.8/site-packages/datafaker-0.7.4-py3.8.egg/datafaker/compat.py", line 46, in <module>
List = multiprocessing.Manager().list
File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 57, in Manager
m.start()
File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/managers.py", line 579, in start
self._process.start()
File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 42, in _launch
prep_data = spawn.get_preparation_data(process_obj._name)
File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/spawn.py", line 183, in get_preparation_data
main_mod_name = getattr(main_module.__spec__, "name", None)
AttributeError: module '__main__' has no attribute '__spec__'
module '__main__' has no attribute '__spec__'
As the title says: how do I specify a database password that contains @? For example:
datafaker rdb mysql+mysqldb://user:pass@2020@localhost:3306/mydb?charset=utf8 stu 10 --meta meta.txt --batch 10
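One common approach with SQLAlchemy-style connection URLs is to percent-encode special characters in the password, so that `@` becomes `%40`. A minimal sketch, assuming datafaker hands the URL to SQLAlchemy unchanged; the helper name `build_db_url` is hypothetical:

```python
from urllib.parse import quote_plus


def build_db_url(user, password, host, port, database, driver="mysql+mysqldb"):
    """Build a connection URL with the password percent-encoded."""
    # quote_plus turns '@' into '%40' so it is not mistaken for the host separator
    return "{}://{}:{}@{}:{}/{}?charset=utf8".format(
        driver, user, quote_plus(password), host, port, database
    )


url = build_db_url("user", "pass@2020", "localhost", 3306, "mydb")
print(url)  # mysql+mysqldb://user:pass%402020@localhost:3306/mydb?charset=utf8
```

The resulting string could then be passed as the connection argument in place of the raw URL.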
As the title says, the following error occurs occasionally:
F:\Python\test_data>datafaker rdb mysql+mysqldb://root:@localhost:3306/datafaker?charset=utf8 stu 10 --meta meta.txt --batch 1 --workers 2
insert 1 records
insert 2 records
insert 3 records
insert 4 records
insert 5 records
insert 6 records
insert 7 records
insert 8 records
insert 9 records
insert 10 records
Exception in thread Thread-2:
Traceback (most recent call last):
File "F:\Python\Python-2.7.16\lib\threading.py", line 801, in __bootstrap_inner
self.run()
File "F:\Python\Python-2.7.16\lib\threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "build\bdist.win-amd64\egg\datafaker\dbs\basedb.py", line 122, in save
self.save_data(lines)
File "build\bdist.win-amd64\egg\datafaker\dbs\rdbdb.py", line 26, in save_data
self.save_other_rdb(lines, names_format, column_names)
File "build\bdist.win-amd64\egg\datafaker\dbs\rdbdb.py", line 42, in save_other_rdb
self.session.execute(sql)
File "build\bdist.win-amd64\egg\sqlalchemy\orm\session.py", line 1269, in execute
clause, params or {}
File "build\bdist.win-amd64\egg\sqlalchemy\engine\base.py", line 988, in execute
return meth(self, multiparams, params)
File "build\bdist.win-amd64\egg\sqlalchemy\sql\elements.py", line 287, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "build\bdist.win-amd64\egg\sqlalchemy\engine\base.py", line 1107, in _execute_clauseelement
distilled_params,
File "build\bdist.win-amd64\egg\sqlalchemy\engine\base.py", line 1253, in _execute_context
e, statement, parameters, cursor, context
File "build\bdist.win-amd64\egg\sqlalchemy\engine\base.py", line 1473, in _handle_dbapi_exception
util.raise_from_cause(sqlalchemy_exception, exc_info)
File "build\bdist.win-amd64\egg\sqlalchemy\util\compat.py", line 398, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "build\bdist.win-amd64\egg\sqlalchemy\engine\base.py", line 1249, in _execute_context
cursor, statement, parameters, context
File "build\bdist.win-amd64\egg\sqlalchemy\engine\default.py", line 552, in do_execute
cursor.execute(statement, parameters)
File "F:\Python\Python-2.7.16\lib\site-packages\MySQLdb\cursors.py", line 174, in execute
self.errorhandler(self, exc, value)
File "F:\Python\Python-2.7.16\lib\site-packages\MySQLdb\connections.py", line 36, in defaulterrorhandler
raise errorclass, errorvalue
ProgrammingError: (_mysql_exceptions.ProgrammingError) (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1")
[SQL: insert into stu (name,school,nickname,age,class_num,score,phone,email,ip,address) values ]
(Background on this error at: http://sqlalche.me/e/f405)
time used: 4.324 s
Using the provided demo, with meta.txt encoded as UTF-8 and the Windows cmd window using GBK, running datafaker file out.txt hello 10 --meta meta.txt
raises this error.
UUID||VARCHAR(32)||自增id[:inc(id,1)]
CREATE_USER||VARCHAR(32)||[:enum(file://account_uuid.txt)]
CREATE_TIME||VARCHAR(19)||[:enum(2020-04-05)]
CREATE_ORG||VARCHAR(32)||[:enum(组织1, 组织2)]
CREATE_DEP||VARCHAR(32)||[:enum(部门1, 部门2)]
CHANGE_USER||VARCHAR(32)||[:enum(file://account_uuid.txt)]
ENABLED||CHAR(1)||[:enum(Y)]
REMOVED||CHAR(1)||[:enum(N)]
PRIORITY||DECIMAL(10,0)||顺序号[:decimal(4,2,1)]
REMARK||VARCHAR(32)||
BUSINESS_STATUS||VARCHAR(8)||业务状态[:enum(状态1, 状态2, 状态3, 状态4)]
ID_BUSINESS||VARCHAR(32)||所属业务活动[:enum(春查, 秋查, 安评)]
ID_ITEM||VARCHAR(32)||所属指标(针对安评)
ID_WORK_TYPE||VARCHAR(32)||问题类型(即业务类型)
FIND_DATE||VARCHAR(19)||发现日期
ID_FINDER||VARCHAR(32)||发现人的ID[:enum(file://account_uuid.txt)]
ID_DEPT_FINDER||VARCHAR(32)||发现人所属部门
PHE_CONTENT||VARCHAR(500)||现象描述
ID_BUSINESS_OBJ||VARCHAR(32)||具体业务对象id
ID_BUSINESS_OBJTYPE||VARCHAR(32)||业务对象类型
ID_DEPT_RES||VARCHAR(32)||处理负责部门
ID_PERSON_RES||VARCHAR(32)||处理负责人(审核人)
ID_DEAL||VARCHAR(32)||处理单id
HAZARD_ANALYSIS||VARCHAR(500)||危害分析
PROBLEM_CODE||VARCHAR(32)||编码
ID_WORK_ORDER||VARCHAR(32)||工单ID
ASK_DATE||VARCHAR(19)||要求处理完毕日期[:enum(2020-04-05, 2020-08-08)]
No file association found for the .py extension
Exception in thread Thread-2:
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\sqlalchemy\engine\base.py", line 1248, in _execute_context
cursor, statement, parameters, context
File "C:\ProgramData\Anaconda3\lib\site-packages\sqlalchemy\engine\default.py", line 590, in do_execute
cursor.execute(statement, parameters)
File "C:\ProgramData\Anaconda3\lib\site-packages\MySQLdb\cursors.py", line 206, in execute
res = self._query(query)
File "C:\ProgramData\Anaconda3\lib\site-packages\MySQLdb\cursors.py", line 319, in _query
db.query(q)
File "C:\ProgramData\Anaconda3\lib\site-packages\MySQLdb\connections.py", line 259, in query
_mysql.connection.query(self, query)
MySQLdb._exceptions.OperationalError: (1054, "Unknown column 'cf12c42e3c844c84b9d4900628893509' in 'field list'")
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\threading.py", line 926, in _bootstrap_inner
self.run()
File "C:\ProgramData\Anaconda3\lib\threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\datafaker-0.7.4-py3.7.egg\datafaker\dbs\basedb.py", line 122, in save
self.save_data(lines)
File "C:\ProgramData\Anaconda3\lib\site-packages\datafaker-0.7.4-py3.7.egg\datafaker\dbs\rdbdb.py", line 26, in save_data
self.save_other_rdb(lines, names_format, column_names)
File "C:\ProgramData\Anaconda3\lib\site-packages\datafaker-0.7.4-py3.7.egg\datafaker\dbs\rdbdb.py", line 42, in save_other_rdb
self.session.execute(sql)
File "C:\ProgramData\Anaconda3\lib\site-packages\sqlalchemy\orm\session.py", line 1278, in execute
clause, params or {}
File "C:\ProgramData\Anaconda3\lib\site-packages\sqlalchemy\engine\base.py", line 984, in execute
return meth(self, multiparams, params)
File "C:\ProgramData\Anaconda3\lib\site-packages\sqlalchemy\sql\elements.py", line 293, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "C:\ProgramData\Anaconda3\lib\site-packages\sqlalchemy\engine\base.py", line 1103, in _execute_clauseelement
distilled_params,
File "C:\ProgramData\Anaconda3\lib\site-packages\sqlalchemy\engine\base.py", line 1288, in _execute_context
e, statement, parameters, cursor, context
File "C:\ProgramData\Anaconda3\lib\site-packages\sqlalchemy\engine\base.py", line 1482, in _handle_dbapi_exception
sqlalchemy_exception, with_traceback=exc_info[2], from_=e
File "C:\ProgramData\Anaconda3\lib\site-packages\sqlalchemy\util\compat.py", line 178, in raise_
raise exception
File "C:\ProgramData\Anaconda3\lib\site-packages\sqlalchemy\engine\base.py", line 1248, in _execute_context
cursor, statement, parameters, context
File "C:\ProgramData\Anaconda3\lib\site-packages\sqlalchemy\engine\default.py", line 590, in do_execute
cursor.execute(statement, parameters)
File "C:\ProgramData\Anaconda3\lib\site-packages\MySQLdb\cursors.py", line 206, in execute
res = self._query(query)
File "C:\ProgramData\Anaconda3\lib\site-packages\MySQLdb\cursors.py", line 319, in _query
db.query(q)
File "C:\ProgramData\Anaconda3\lib\site-packages\MySQLdb\connections.py", line 259, in query
_mysql.connection.query(self, query)
sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (1054, "Unknown column 'cf12c42e3c844c84b9d4900628893509' in 'field list'")
[SQL: insert into wo_comm_problem (UUID,CREATE_USER,CREATE_TIME,CREATE_ORG,CREATE_DEP,CHANGE_USER,ENABLED,REMOVED,PRIORITY,REMARK,BUSINESS_STATUS,ID_BUSINESS,ID_ITEM,ID_WORK_TYPE,FIND_DATE,ID_FINDER,ID_DEPT_FINDER,PHE_CONTENT,ID_BUSINESS_OBJ,ID_BUSINESS_OBJTYPE,ID_DEPT_RES,ID_PERSON_RES,ID_DEAL,HAZARD_ANALYSIS,PROBLEM_CODE,ID_WORK_ORDER,ASK_DATE) values (1,cf12c42e3c844c84b9d4900628893509,2020-04-05,组织2,部门1,3ddf1219a6c44802bcb073ebd941d8ee,Y,N,49.28,None,状态4,安评,None,None,None,cf12c42e3c844c84b9d4900628893509,None,None,None,None,None,None,None,None,None,None,2020-08-08),(2,0ccfebc5962d46dd916257c0a33201ce,2020-04-05,组织1,部门2,39d9aa650f864e7d9c7e49abc1f521be,Y,N,41.81,None,状态1,安评,None,None,None,0ccfebc5962d46dd916257c0a33201ce,None,None,None,None,None,None,None,None,None,None,2020-08-08),(3,cf4f0c886c4e4fbc955a652487b0ae0e,2020-04-05,组织1,部门2,0ccfebc5962d46dd916257c0a33201ce,Y,N,17.12,None,状态4,秋查,None,None,None,cf12c42e3c844c84b9d4900628893509,None,None,None,None,None,None,None,None,None,None,2020-08-08),(4,0ccfebc5962d46dd916257c0a33201ce,2020-04-05,组织1,部门1,0a5f6438cd9f47ed8d8fa26cfe931672,Y,N,27.3,None,状态1,春查,None,None,None,0ccfebc5962d46dd916257c0a33201ce,None,None,None,None,None,None,None,None,None,None,2020-08-08),(5,cf4f0c886c4e4fbc955a652487b0ae0e,2020-04-05,组织2,部门2,354dac55ffc342c8a21665569754a928,Y,N,55.91,None,状态2,春查,None,None,None,cf4f0c886c4e4fbc955a652487b0ae0e,None,None,None,None,None,None,None,None,None,None,2020-04-05),(6,6cb7d9cfced04873b0827af619e9510b,2020-04-05,组织2,部门1,cf12c42e3c844c84b9d4900628893509,Y,N,75.18,None,状态1,秋查,None,None,None,39d9aa650f864e7d9c7e49abc1f521be,None,None,None,None,None,None,None,None,None,None,2020-04-05),(7,be047b8c3d6048c7820f8ee69ca1002e,2020-04-05,组织1,部门2,be047b8c3d6048c7820f8ee69ca1002e,Y,N,35.61,None,状态4,秋查,None,None,None,ae1eb1fb17e54334b2cb71b0d74ea702,None,None,None,None,None,None,None,None,None,None,2020-08-08),(8,be047b8c3d6048c7820f8ee69ca1002e,2020-04-05,组织1,部门2,6cb7d9cfced04873b0827af619e9510b,Y,N,50.52,None,状态
1,安评,None,None,None,cf4f0c886c4e4fbc955a652487b0ae0e,None,None,None,None,None,None,None,None,None,None,2020-04-05),(9,354dac55ffc342c8a21665569754a928,2020-04-05,组织2,部门1,39d9aa650f864e7d9c7e49abc1f521be,Y,N,22.86,None,状态1,秋查,None,None,None,39d9aa650f864e7d9c7e49abc1f521be,None,None,None,None,None,None,None,None,None,None,2020-08-08),(10,cf4f0c886c4e4fbc955a652487b0ae0e,2020-04-05,组织2,部门2,39d9aa650f864e7d9c7e49abc1f521be,Y,N,78.3,None,状态3,秋查,None,None,None,ae1eb1fb17e54334b2cb71b0d74ea702,None,None,None,None,None,None,None,None,None,None,2020-04-05)]
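@gangly As the title says: "Simulate related data across multiple tables by declaring certain fields as enum types (values chosen at random from a specified list), so that multi-table joins still match and return rows even with large data volumes."
How should this sentence be understood?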
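One way to read that sentence: if the join column in both tables is drawn from the same fixed list of values, every generated row is guaranteed to find a match. A minimal sketch of the idea in plain Python; the table and column names are illustrative assumptions, not taken from datafaker:

```python
import random

random.seed(0)  # fixed seed only to make the sketch reproducible

class_ids = ["c01", "c02", "c03"]  # the shared enum list used by both tables

# "classes" table: one row per enum value
classes = [{"class_id": cid, "teacher": "teacher_" + cid} for cid in class_ids]

# "students" table: the join column is picked at random from the same list
students = [
    {"student_id": i, "class_id": random.choice(class_ids)} for i in range(1, 11)
]

# Emulates: SELECT ... FROM students JOIN classes USING (class_id)
by_id = {c["class_id"]: c for c in classes}
joined = [
    (s["student_id"], by_id[s["class_id"]]["teacher"])
    for s in students
    if s["class_id"] in by_id
]
print(len(joined), "of", len(students), "student rows joined")
```

Had `class_id` been generated as a free random string in each table instead, almost no rows would match.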
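Complex logic can indeed be handled by exporting SQL output to a file, but native support in the program would be even better.
Ideally, simple database data could be produced directly by the program.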
"uid" : "2",
"kdmc" : "考点名称",
"kdjc" : "考点简称",
"kdbsm" : "考点标识码",
"sfbzhkd" : true,
"kdjcsj" : "2018-02-01",
"csxx" : {
"kwbgsdh" : "考务办公室电话",
"sjbgsdh" : "试卷保管(保密)室电话",
"spjksdh" : "视频监考室电话",
"sjbgsdhsxjsl" : 1,
"sjffssxjsl" : 1,
"kwbgssxjsl" : 1,
"spjkssxjsl" : 1,
"yybfssxjsl" : 1,
"sjlzhtdsxjsl" : 1
},
Can a nested field like csxx be written? Thank you for open-sourcing this tool.
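The printed result is None, even though I copied the input from the example fake_datetime_between('2019-04-14 00:00:00', '2019-04-15 00:00:00').
For example: [:date_between(1966-01-01, 2000-12-12)]
Likewise, date, date_time_between, and the other range functions do not work.
Following the demo, with everything encoded as UTF-8,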
datafaker rdb oracle://:@ip:port/sid stu 10 --meta meta2.txt
The import fails with:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 115-118: ordinal not in range(128)
datafaker rdb oracle+cx_Oracle://:@ip:port/sid stu 10 --meta meta2.txt
The import fails with:
NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:oracle.cx_Oracle
Can't load plugin: sqlalchemy.dialects:oracle.cx_Oracle
Exception AttributeError: "'RdbDB' object has no attribute 'session'" in <bound method RdbDB.__del__ of <datafaker.dbs.rdbdb.RdbDB object at 0x7f03636a9310>> ignored
[root@localhost ~]# datafaker mysql mysql+mysqldb://root:123456@localhost:3306/ant_3.0_qm t_account_detail 10 --meta tad.txt
Process Process-4:
Traceback (most recent call last):
File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/site-packages/datafaker/dbs/basedb.py", line 122, in save
self.save_data(lines)
File "/usr/lib/python2.7/site-packages/datafaker/dbs/rdbdb.py", line 26, in save_data
self.save_other_rdb(lines, names_format, column_names)
File "/usr/lib/python2.7/site-packages/datafaker/dbs/rdbdb.py", line 42, in save_other_rdb
self.session.execute(sql)
File "/usr/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1269, in execute
clause, params or {}
File "/usr/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 988, in execute
return meth(self, multiparams, params)
File "/usr/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement
distilled_params,
File "/usr/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1253, in _execute_context
e, statement, parameters, cursor, context
File "/usr/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1473, in _handle_dbapi_exception
util.raise_from_cause(sqlalchemy_exception, exc_info)
File "/usr/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/usr/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1249, in _execute_context
cursor, statement, parameters, context
File "/usr/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 552, in do_execute
cursor.execute(statement, parameters)
File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 163, in execute
result = self._query(query)
File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 321, in _query
conn.query(q)
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 505, in query
self._affected_rows = self._read_query_result(unbuffered=unbuffered)
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 724, in _read_query_result
result.read()
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1069, in read
first_packet = self.connection._read_packet()
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 676, in _read_packet
packet.raise_for_error()
File "/usr/lib/python2.7/site-packages/pymysql/protocol.py", line 223, in raise_for_error
err.raise_mysql_exception(self._data)
File "/usr/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
raise errorclass(errno, errval)
ProgrammingError: (pymysql.err.ProgrammingError) (1064, u"You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '19:53:46,1384292845,0,569193),(15,18,2,'\u5546\u54c1\u8d2d\u4e70','65628319757403516',655645' at line 1")
[SQL: insert into t_account_detail (account_detail_id,shop_id,detail_type,description,biz_no,detail_amount,created_at,deleted_at,flag,avialable_amount) values (14,17,1,'\u5546\u54c1\u51fa\u552e','34250527876847164',691106,2020-08-27 19:53:46,1384292845,0,569193),(15,18,2,'\u5546\u54c1\u8d2d\u4e70','65628319757403516',655645,2020-08-27 19:53:46,350582394,0,110706),(16,19,1,'\u5546\u54c1\u51fa\u552e','19636796119968741',808004,2020-08-27 19:53:46,1256861491,0,466446),(17,20,2,'\u5546\u54c1\u8d2d\u4e70','46234490332455651',659574,2020-08-27 19:53:46,486638023,0,178019),(18,21,1,'\u5546\u54c1\u51fa\u552e','49266513315780232',743029,2020-08-27 19:53:46,717553851,0,246508),(19,22,2,'\u5546\u54c1\u8d2d\u4e70','90561178200881606',197382,2020-08-27 19:53:46,809470599,0,821661),(20,23,1,'\u5546\u54c1\u51fa\u552e','47997281259358174',232968,2020-08-27 19:53:46,1356754672,0,883010),(21,24,2,'\u5546\u54c1\u8d2d\u4e70','77079434011691921',162385,2020-08-27 19:53:46,1260205987,0,787554),(22,25,1,'\u5546\u54c1\u51fa\u552e','32080318843531665',429457,2020-08-27 19:53:46,42826607,0,557156),(23,26,2,'\u5546\u54c1\u8d2d\u4e70','15619744594723433',865789,2020-08-27 19:53:46,334998537,0,119933)]
(Background on this error at: http://sqlalche.me/e/f405)
Python version: 3.7.6
Contents of the meta file:
person_no||int||auto increament person_no[:inc(person_no,20201000)]
school_code||varchar(20)||school_code[:enum(12440104455354382L)]
name||varchar(20)||name
sex||varchar(5)||sex[:enum(男,女)]
person_role||int||person_role[:enum(1,2)]
grade_no||int||grade_no size[:enum(1,2,3,4,5,6)]
class_no||int||class_no[:enum(1,2,3,4,5,6,7,8,9)]
face_id||varchar(20)||face_id [:inc(700044010402000027202005080000000002,1)]
Command: datafaker mysql mysql+mysqldb://root:root@localhost:3306/medical_door tb_person 10 --outprint --meta person_meta.txt --outspliter _
sqlalchemy.exc.ProgrammingError: (MySQLdb._exceptions.ProgrammingError) (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '11:28:15,0),(2,'dingjie',56.58,'玉华市',40383614455,'姜雷',2019-12-11 11:28' at line 1")
The cause seems to be that the date values are not quoted.
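Quoting problems of this kind usually disappear when values are passed as bound parameters instead of being concatenated into the SQL text. A minimal sketch using the standard-library `sqlite3` module as a stand-in database; this illustrates the technique and is not datafaker's implementation. Note that MySQLdb uses `%s` placeholders where sqlite3 uses `?`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER, name TEXT, created_at TEXT)")

rows = [
    (1, "dingjie", "2019-12-11 11:28:15"),
    (2, "jianglei", "2019-12-11 11:28:16"),
]

# Placeholders let the driver quote every value, including the datetime strings
conn.executemany("INSERT INTO stu (id, name, created_at) VALUES (?, ?, ?)", rows)

stored = conn.execute("SELECT created_at FROM stu ORDER BY id").fetchall()
print(stored)
conn.close()
```

Bound parameters also avoid breakage when a generated string happens to contain a quote character.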
win10
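After the command runs, the message "No file association found for the .py extension" is shown, even though the data has already been inserted into the database.
Hello, I have a question.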
datafaker hive hive://yarn@localhost:10000/test stu 1000 --meta data/hive_meta.txt
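What does yarn refer to here? I pulled a Hive image with Docker (https://www.huangyunkun.com/2018/06/05/docker-compose-hive/).
My interface shows:
hive2://localhost:10000
Can the yarn in your command be replaced with hive2 in my case?
Also, do I need to change __init__.py in the datafaker directory? I modified that file to: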
from datafaker.cli import main
import pymysql
pymysql.install_as_MySQLdb()
import pyhive
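MySQL now works. When running against Hive, however, I keep hitting the following error:
No module named sasl
Thanks :)
Installation succeeded, but generating random data from the command line fails with an error saying the module hcmd.xxxxxxx cannot be found. Looking at the multithreading code in compact.py, it uses hcmd.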
C:\Users\87293\Desktop\datafaker
λ datafaker file ./ t_date 5 --meta t_date.txt --outprint
sequence item 0: expected str instance, datetime.date found
sequence item 0: expected str instance, datetime.date found
sequence item 0: expected str instance, datetime.date found
sequence item 0: expected str instance, datetime.date found
sequence item 0: expected str instance, datetime.date found
time used: 0.130 s
C:\Users\87293\Desktop\datafaker
λ cat t_date.txt
c_date || date || 当前月份[:date_this_month]
C:\Users\87293\Desktop\datafaker
λ cat t_date
2020-06-03
2020-06-08
2020-06-13
2020-06-06
2020-06-05
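The message "expected str instance, datetime.date found" is what `str.join` raises when an item is not a string. A minimal sketch of the usual remedy, converting each value with `str()` before joining; the variable names are illustrative and this is not datafaker's code:

```python
import datetime

row = [datetime.date(2020, 6, 3)]

# ",".join(row) would raise TypeError because the item is a date, not a str
line = ",".join(str(value) for value in row)
print(line)  # 2020-06-03
```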
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
time used: 0.110 s
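First of all, many thanks for this tool; it is very easy to use. Here is a small suggestion.
For example, when using enum on a file a.txt that contains: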
aaa
bbb
ccc
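When generating 3 or fewer rows, aaa, bbb, and ccc should each appear only once.
Beyond 3 rows, values repeat.
Alternatively, when a file-based enum is declared non-repeating, the number of generated rows must not exceed the number of lines in the file.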
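The requested behaviour amounts to sampling without replacement. A minimal sketch of what such an option could look like, assuming the file's lines are already loaded into a list; `sample_unique` is a hypothetical helper, not an existing datafaker feature:

```python
import random


def sample_unique(values, count):
    """Pick `count` distinct values, refusing requests larger than the pool."""
    if count > len(values):
        raise ValueError(
            "requested {} rows but only {} distinct values exist".format(
                count, len(values)
            )
        )
    return random.sample(values, count)


values = ["aaa", "bbb", "ccc"]  # stands in for the lines of a.txt
picked = sample_unique(values, 3)
print(picked)  # each value appears exactly once, in random order
```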
For a customer identifier such as customer001, I want codes like C00001, C00002, C00003. How can such codes be generated?
C:\Users\yzft1>datafaker mysql mysql+mysqldb://root:[email protected]:3307/population prop_knowledge_patent 20 --meta prop_knowledge_patent2.txt
Traceback (most recent call last):
File "C:\Users\yzft1\AppData\Local\Programs\Python\Python36\lib\site-packages\datafaker\cli.py", line 89, in main
db = load_db_class(args.dbtype)(args)
File "C:\Users\yzft1\AppData\Local\Programs\Python\Python36\lib\site-packages\datafaker\dbs\basedb.py", line 28, in __init__
self.queue = compat.Queue(maxsize=MAX_QUEUE_SIZE)
AttributeError: module 'datafaker.compat' has no attribute 'Queue'
module 'datafaker.compat' has no attribute 'Queue'
Environment: Windows 10, Python 3.6.6, mysqlclient
meta.txt
id||int||[:inc(id,1)]
name||varchar(20)||[:name]
age||int||[:age]
cmd command:
datafaker mysql mysql+mysqldb://root:root@localhost:3306/sqoop student 10 --meta D:\meta.txt
Error:
Exception in thread Thread-2:
Traceback (most recent call last):
File "D:\python\lib\threading.py", line 916, in _bootstrap_inner
self.run()
File "D:\python\lib\threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "D:\python\lib\site-packages\datafaker\dbs\basedb.py", line 122, in save
self.save_data(lines)
File "D:\python\lib\site-packages\datafaker\dbs\rdbdb.py", line 26, in save_data
self.save_other_rdb(lines, names_format, column_names)
File "D:\python\lib\site-packages\datafaker\dbs\rdbdb.py", line 42, in save_other_rdb
self.session.execute(sql)
File "D:\python\lib\site-packages\sqlalchemy\orm\session.py", line 1269, in execute
clause, params or {}
File "D:\python\lib\site-packages\sqlalchemy\engine\base.py", line 988, in execute
return meth(self, multiparams, params)
File "D:\python\lib\site-packages\sqlalchemy\sql\elements.py", line 287, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "D:\python\lib\site-packages\sqlalchemy\engine\base.py", line 1107, in _execute_clauseelement
distilled_params,
File "D:\python\lib\site-packages\sqlalchemy\engine\base.py", line 1253, in _execute_context
e, statement, parameters, cursor, context
File "D:\python\lib\site-packages\sqlalchemy\engine\base.py", line 1475, in _handle_dbapi_exception
util.reraise(*exc_info)
File "D:\python\lib\site-packages\sqlalchemy\util\compat.py", line 153, in reraise
raise value
File "D:\python\lib\site-packages\sqlalchemy\engine\base.py", line 1249, in _execute_context
cursor, statement, parameters, context
File "D:\python\lib\site-packages\sqlalchemy\engine\default.py", line 552, in do_execute
cursor.execute(statement, parameters)
File "D:\python\lib\site-packages\MySQLdb\cursors.py", line 191, in execute
query = query.encode(db.encoding)
File "D:\python\lib\encodings\cp1252.py", line 12, in encode
return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 45-47: character maps to <undefined>
1. Environment
Operating system: Mac
Python version 3.7.3
faker version 4.0.2
2. Problem description
class FackData(object):
def __init__(self, locale):
self.faker = Faker(locale)
self.faker_funcs = dir(self.faker)
When FackData is initialized, dir(Faker(locale)) does not list the fake-data methods. A test print returns the following:
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_factories', '_factory_map', '_locales', '_map_provider_method', '_select_factory', '_weights', 'cache_pattern', 'factories', 'generator_attrs', 'items', 'locales', 'random', 'seed', 'seed_instance', 'seed_locale', 'weights']
if keyword in self.faker_funcs:
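The method named in the meta file cannot be found, so the generated data is None, which is not the expected behavior.
3. Solution
Change the code as follows: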
self.faker = Factory().create(locale)
With dir(Factory().create(locale)), a test print returns the following:
['_Generator__config', '_Generator__format_token', '_Generator__random', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'add_provider', 'address', 'am_pm', 'android_platform_token', 'ascii_company_email', 'ascii_email', 'ascii_free_email', 'ascii_safe_email', 'bank_country', 'bban', 'binary', 'boolean', 'bothify', 'bs', 'building_number', 'catch_phrase', 'century', 'chrome', 'city', 'city_name', 'city_suffix', 'color', 'color_name', 'company', 'company_email', 'company_prefix', 'company_suffix', 'coordinate', 'country', 'country_calling_code', 'country_code', 'credit_card_expire', 'credit_card_full', 'credit_card_number', 'credit_card_provider', 'credit_card_security_code', 'cryptocurrency', 'cryptocurrency_code', 'cryptocurrency_name', 'csv', 'currency', 'currency_code', 'currency_name', 'currency_symbol', 'date', 'date_between', 'date_between_dates', 'date_object', 'date_of_birth', 'date_this_century', 'date_this_decade', 'date_this_month', 'date_this_year', 'date_time', 'date_time_ad', 'date_time_between', 'date_time_between_dates', 'date_time_this_century', 'date_time_this_decade', 'date_time_this_month', 'date_time_this_year', 'day_of_month', 'day_of_week', 'district', 'domain_name', 'domain_word', 'dsv', 'ean', 'ean13', 'ean8', 'email', 'file_extension', 'file_name', 'file_path', 'firefox', 'first_name', 'first_name_female', 'first_name_male', 'first_romanized_name', 'format', 'free_email', 'free_email_domain', 'future_date', 'future_datetime', 'get_formatter', 'get_providers', 'hex_color', 'hexify', 'hostname', 'http_method', 'iban', 'image_url', 'internet_explorer', 'ios_platform_token', 'ipv4', 'ipv4_network_class', 'ipv4_private', 'ipv4_public', 'ipv6', 'isbn10', 
'isbn13', 'iso8601', 'job', 'language_code', 'last_name', 'last_name_female', 'last_name_male', 'last_romanized_name', 'latitude', 'latlng', 'lexify', 'license_plate', 'linux_platform_token', 'linux_processor', 'local_latlng', 'locale', 'location_on_land', 'longitude', 'mac_address', 'mac_platform_token', 'mac_processor', 'md5', 'mime_type', 'month', 'month_name', 'msisdn', 'name', 'name_female', 'name_male', 'null_boolean', 'numerify', 'opera', 'paragraph', 'paragraphs', 'parse', 'password', 'past_date', 'past_datetime', 'phone_number', 'phonenumber_prefix', 'port_number', 'postcode', 'prefix', 'prefix_female', 'prefix_male', 'profile', 'provider', 'providers', 'province', 'psv', 'pybool', 'pydecimal', 'pydict', 'pyfloat', 'pyint', 'pyiterable', 'pylist', 'pyset', 'pystr', 'pystr_format', 'pystruct', 'pytuple', 'random', 'random_choices', 'random_digit', 'random_digit_not_null', 'random_digit_not_null_or_empty', 'random_digit_or_empty', 'random_element', 'random_elements', 'random_int', 'random_letter', 'random_letters', 'random_lowercase_letter', 'random_number', 'random_sample', 'random_uppercase_letter', 'randomize_nb_elements', 'rgb_color', 'rgb_css_color', 'romanized_name', 'safari', 'safe_color_name', 'safe_email', 'safe_hex_color', 'seed', 'seed_instance', 'sentence', 'sentences', 'set_formatter', 'sha1', 'sha256', 'simple_profile', 'slug', 'ssn', 'street_address', 'street_name', 'street_suffix', 'suffix', 'suffix_female', 'suffix_male', 'tar', 'text', 'texts', 'time', 'time_delta', 'time_object', 'time_series', 'timezone', 'tld', 'tsv', 'unix_device', 'unix_partition', 'unix_time', 'upc_a', 'upc_e', 'uri', 'uri_extension', 'uri_page', 'uri_path', 'url', 'user_agent', 'user_name', 'uuid4', 'windows_platform_token', 'word', 'words', 'year', 'zip']
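The first `dir()` listing above contains `__getattr__`, which suggests the object forwards unknown attributes to an inner generator. If so, a lookup with `getattr` could be an alternative to the membership test against `dir()`. A minimal sketch with stand-in classes; `Generator` and `Proxy` here are hypothetical and only imitate that forwarding pattern, they are not the faker classes:

```python
class Generator:
    """Stand-in for the object that actually owns the fake-data methods."""

    def name(self):
        return "example name"


class Proxy:
    """Stand-in for a proxy that forwards unknown attributes via __getattr__."""

    def __init__(self):
        self._generator = Generator()

    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails
        return getattr(self._generator, attr)


proxy = Proxy()
in_dir = "name" in dir(proxy)          # dir() does not see forwarded names
method = getattr(proxy, "name", None)  # resolved through __getattr__
found = callable(method)
print(in_dir, found)
```

With this pattern the check `if keyword in self.faker_funcs` would become a `getattr(self.faker, keyword, None)` call followed by a `callable` test.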
File "D:\Python\Python36\lib\codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd7 in position 9: invalid continuation byte
'utf-8' codec can't decode byte 0xd7 in position 9: invalid continuation byte
Exception ignored in: <bound method RdbDB.__del__ of <datafaker.dbs.rdbdb.RdbDB object at 0x0000018E8E5F0518>>
Traceback (most recent call last):
File "D:\Python\Python36\lib\site-packages\datafaker\dbs\rdbdb.py", line 14, in __del__
self.session.close()
AttributeError: 'RdbDB' object has no attribute 'session'
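Hello author, enum columns can only pick values at random. Could you add sequential selection, similar to parameterization in LoadRunner (lr) and JMeter?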
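One way the sequential selection requested here could behave, sketched with the standard library only. `sequential_enum` is a hypothetical helper and not part of datafaker; how it would be wired into the meta file syntax is left open.

```python
import itertools


def sequential_enum(values):
    """Return a zero-argument callable that yields `values` in order, wrapping around.

    Hypothetical helper for illustration; datafaker does not expose this.
    """
    iterator = itertools.cycle(values)
    return lambda: next(iterator)


next_value = sequential_enum(["a", "b", "c"])
rows = [next_value() for _ in range(5)]
print(rows)
```

If rows are produced by several workers at once, the shared iterator would need coordinating across them to keep a strict order.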
C:\Users\leile\Desktop>datafaker mysql mysql+mysqldb://root:Hd@123456@localhost:3306/test stu 10 --meta meta.txt
Traceback (most recent call last):
File "build\bdist.win-amd64\egg\datafaker\cli.py", line 89, in main
db = load_db_class(args.dbtype)(args)
File "build\bdist.win-amd64\egg\datafaker\cli.py", line 79, in load_db_class
module = __import__(pkgname, fromlist=(classname))
File "build\bdist.win-amd64\egg\datafaker\dbs\mysqldb.py", line 3, in <module>
File "build\bdist.win-amd64\egg\datafaker\dbs\basedb.py", line 7, in <module>
File "build\bdist.win-amd64\egg\datafaker\compat.py", line 41, in <module>
File "build\bdist.win-amd64\egg\datafaker\multithreading.py", line 6, in <module>
ImportError: No module named queue
No module named queue
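%d,3
1=001, 10=010, 233=233
Something like this: pad with leading zeros automatically, but never beyond 3 digits.
Or:
ABC-%d
ABC-1, ABC-2, ABC-3
ABC-%d,3
ABC-001, ABC-099, ABC-233
Something similar to .format("ABC-{0}",[:inc(id)]).
After the data has been randomly generated, it would help to have a formatting step: leading-zero padding, a fixed prefix or suffix, or truncation.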
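The formatting step asked for here can be sketched in a few lines of Python. `format_value` and its `template`/`width` parameters are hypothetical names chosen to mirror the `ABC-%d,3` notation in the request; they are not an existing datafaker feature.

```python
def format_value(value, template="%d", width=0):
    """Render an integer into `template`, zero-padding it to at least `width` digits.

    `template` is expected to contain one `%d` placeholder; `width=0` means no padding.
    Hypothetical helper for illustration only.
    """
    digits = str(value).zfill(width)
    return template.replace("%d", digits, 1)


print(format_value(1, "%d", 3))
print(format_value(99, "ABC-%d", 3))
print(format_value(2, "ABC-%d"))
```

Values that already have `width` digits or more are left unchanged, which matches the "never beyond 3 digits" case for 233.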
(venv366-64bit-mysql) D:\012_python3\datafaker-master>datafaker mysql mysql+mysqldb://api_test:[email protected]:3306/mailserver virtual_users 2
Exception in thread Thread-1:
Traceback (most recent call last):
File "c:\python366-64bit\Lib\threading.py", line 916, in _bootstrap_inner
self.run()
File "c:\python366-64bit\Lib\threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "D:\evn\venv366-64bit-mysql\lib\site-packages\datafaker\dbs\basedb.py", line 42, in fake_data
columns = self.fake_column()
File "D:\evn\venv366-64bit-mysql\lib\site-packages\datafaker\dbs\basedb.py", line 53, in fake_column
columns.append(self.fakedata.do_fake(item['cmd'], item['args']))
File "D:\evn\venv366-64bit-mysql\lib\site-packages\datafaker\fakedata.py", line 215, in do_fake
return method(*args)
File "D:\evn\venv366-64bit-mysql\lib\site-packages\datafaker\fakedata.py", line 40, in fake_int
return self.faker.random_int(min, max)
File "D:\evn\venv366-64bit-mysql\lib\site-packages\faker\providers\__init__.py", line 106, in random_int
return self.generator.random.randrange(min, max + 1, step)
File "D:\evn\venv366-64bit-mysql\lib\random.py", line 199, in randrange
raise ValueError("empty range for randrange() (%d,%d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (11,1, -10)
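The last frame shows `randrange` being called with a start of 11 and a stop of 1, so the lower bound passed to `random_int` was larger than the upper bound. The traceback does not show where those bounds came from. A minimal standard-library sketch of the failure and of one possible guard follows; `safe_random_int` is a hypothetical helper, and whether swapping reversed bounds is the right behavior (rather than rejecting them) is a judgment call.

```python
import random


def safe_random_int(low, high):
    """Return a random integer in [low, high], swapping the bounds if they are reversed.

    Hypothetical guard for illustration; not part of datafaker or Faker.
    """
    if low > high:
        low, high = high, low
    return random.randrange(low, high + 1)


# Reversed bounds make randrange raise ValueError, as in the traceback.
try:
    random.randrange(11, 0 + 1)
    raised = False
except ValueError:
    raised = True

value = safe_random_int(11, 0)
print(raised, value)
```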
[root@node03 bin]# datafaker kafka node01:6667 test4 1 --meta meta.txt --outprint
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/datafaker/cli.py", line 78, in main
db.do_fake()
File "/usr/local/lib/python2.7/site-packages/datafaker/utils.py", line 72, in wrapper
ret = func(*args, **kwargs)
File "/usr/local/lib/python2.7/site-packages/datafaker/dbs/kafkadb.py", line 26, in do_fake
lines = self.fake_column()
TypeError: fake_column() takes exactly 2 arguments (1 given)
fake_column() takes exactly 2 arguments (1 given)
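On Windows 7 with Python 3.5, the command cannot be used after installation. Also, cx_Oracle does not support the multi-row insert syntax insert into table_name(column) values (),(),(). I hope the author can help with this. Thanks!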
Version 0.7.1
Help output:
optional arguments:
-h, --help show this help message and exit
--auth [AUTH] user and password
--meta [META] meta file path
--interval INTERVAL the interval to make stream data
--batch BATCH the interval to make stream data
--workers WORKERS the interval to make stream data
--version print the version number and exit
--outprint print fake date to screen
--outspliter OUTSPLITER
print data, to split columns
--locale LOCALE locale language
--format FORMAT outprint and outfile format: json, text (default:
text)
--withheader print data or write data to file with column header
Hello author, when generating text output, a space followed by an integer appears at the end of every line.
In your example:
1,,鲍红,,人和中心,,高小王子,,3,,81,,55.6,,13197453222,,[email protected],,192.100.224.255,,江苏省西宁市梁平朱路I座 944204
The trailing 944204 does not correspond to anything in the meta file.
My question now is:
1. How can I get rid of this part of the data?
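A possible workaround on the consuming side, assuming the trailing number is something the address generator appends for this locale (for example a postcode). That is an assumption; the report does not confirm where the number comes from. `strip_trailing_number` is a hypothetical helper that post-processes the generated string.

```python
import re

# Whitespace followed by digits at the very end of the string.
_TRAILING_NUMBER = re.compile(r"\s+\d+$")


def strip_trailing_number(address):
    """Remove a trailing ' <digits>' suffix from an address string, if present."""
    return _TRAILING_NUMBER.sub("", address)


cleaned = strip_trailing_number("江苏省西宁市梁平朱路I座 944204")
print(cleaned)
```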
def fake_date_between(self, start_date=None, end_date=None, format='%Y-%m-%d'):
    # Drop the time of day; otherwise the day-difference calculation below goes wrong
    today = datetime.datetime.strftime(datetime.datetime.today(), "%Y-%m-%d")
    today = datetime.datetime.strptime(today, '%Y-%m-%d')
    if start_date is None:
        start_diff = 'today'
    else:
        start_date = datetime.datetime.strptime(start_date, '%Y-%m-%d')
        diff = (start_date - today).days
        start_diff = '%dd' % diff if diff != 0 else 'today'
    if end_date is None:
        end_diff = today
    else:
        end_date = datetime.datetime.strptime(end_date, '%Y-%m-%d')
        diff = (end_date - today).days
        end_diff = '%sd' % diff if diff != 0 else 'today'
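The offset calculation in the snippet above can be isolated into a small function for checking by hand. This is a sketch only: `day_offset` is a hypothetical name, and it takes `today` as a parameter (as a `datetime.date`, which carries no time of day) so that the result does not depend on when it runs.

```python
import datetime


def day_offset(date_str, today):
    """Return 'today', or the whole-day difference from `today` as a string like '-3d' or '5d'."""
    target = datetime.datetime.strptime(date_str, "%Y-%m-%d").date()
    diff = (target - today).days
    return "%dd" % diff if diff != 0 else "today"


reference = datetime.date(2020, 1, 10)
offsets = [day_offset(d, reference) for d in ("2020-01-07", "2020-01-15", "2020-01-10")]
print(offsets)
```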
As the title says.
@gangly
Could you add support for SQL Server? It feels like this is the only one missing.