Comments (2)
I have been experiencing the same thing. For me, it seems to duplicate the JSON file's contents at the end of the file (basically as if you copied everything and then pasted it again at the end of the JSON file).
from tinydb.
The problem is that you don't use locks while reading and writing. Because of this, one thread or process can do a 'seek' operation just before another thread or process wants to read.
For example, below is the code for a read using the JSONStorage.
    def read(self) -> Optional[Dict[str, Dict[str, Any]]]:
        # Get the file size by moving the cursor to the file end and reading
        # its location
        self._handle.seek(0, os.SEEK_END)
        size = self._handle.tell()

        if not size:
            # File is empty, so we return ``None`` so TinyDB can properly
            # initialize the database
            return None
        else:
            # Return the cursor to the beginning of the file
            self._handle.seek(0)

            # Load the JSON contents of the file
            return json.load(self._handle)
1. Process one starts a read.
1.1 Process one does self._handle.seek(0, os.SEEK_END), so the cursor is now at the end of the file.
1.2 Process one does self._handle.seek(0), so the cursor is now at the beginning of the file.
2. Process two starts a read.
2.1 Process two does self._handle.seek(0, os.SEEK_END), so the cursor is now at the end of the file.
3. Process one does json.load(self._handle) -> it reads from the end of the file -> the file seems empty.
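The interleaving above can be replayed deterministically on a single handle. This is only a sketch: an in-memory io.StringIO stands in for the shared file handle, replay_interleaving is a hypothetical name, and the JSON content is a made-up placeholder. The seeks are issued in the order the two readers would issue them in the bad case.

```python
import io
import json
import os


def replay_interleaving() -> str:
    """Replay the bad ordering of seeks on one shared handle."""
    # Stand-in for the shared file handle (assumption: in-memory, not a real file).
    handle = io.StringIO('{"_default": {}}')

    handle.seek(0)               # step 1.2: reader one rewinds to the start
    handle.seek(0, os.SEEK_END)  # step 2.1: reader two jumps to the end
    # Reader one's final json.load would now be handed this text.
    return handle.read()


text = replay_interleaving()
print(repr(text))  # nothing is left to read after the second seek

try:
    json.loads(text)
except json.JSONDecodeError as exc:
    # Parsing the empty remainder fails instead of returning the data.
    print("decode failed:", exc)
```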
Process one set the cursor to the beginning, but process two moved it to the end just before process one read, so process one got an empty string. This is one way things can go wrong, but you can imagine many others. As @SpiralAPI saw, a process might do a self._handle.seek(0, os.SEEK_END) just before a write, which appends all the data instead of overwriting it.
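One way to rule out these interleavings between threads is to hold a lock for the whole seek-then-read or seek-then-write sequence. The class below is a hypothetical helper written for illustration, not TinyDB's API and not its JSONStorage; it assumes a single process with several threads sharing one handle. A threading.Lock does nothing across processes; that case would need an OS-level file lock instead.

```python
import json
import os
import threading
from typing import Any, Dict, Optional


class LockedJSONFile:
    """Hypothetical helper (not part of TinyDB): one handle, one lock."""

    def __init__(self, path: str) -> None:
        # Make sure the file exists, then open it for reading and writing.
        open(path, "a").close()
        self._handle = open(path, "r+")
        self._lock = threading.Lock()

    def read(self) -> Optional[Dict[str, Any]]:
        # The lock covers every seek and the load, so no other thread
        # can move the cursor in between.
        with self._lock:
            self._handle.seek(0, os.SEEK_END)
            if not self._handle.tell():
                return None  # empty file
            self._handle.seek(0)
            return json.load(self._handle)

    def write(self, data: Dict[str, Any]) -> None:
        with self._lock:
            # Overwrite from the start, then cut off any stale tail.
            self._handle.seek(0)
            self._handle.write(json.dumps(data))
            self._handle.flush()
            self._handle.truncate()

    def close(self) -> None:
        self._handle.close()
```

Every read and write now runs one at a time, which is exactly the serialization cost that comes with sharing a single file.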
It's not very useful to search a single file using multiple processes or threads. Since you would need to take a lock every time you read or write, you have basically turned it into a synchronous operation.
If you need to do a CPU-intensive task, it's better to read everything ahead of time (or at least in chunks) and then pass the data to different processes or threads.
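A minimal sketch of that approach, under some assumptions: the file is a plain JSON array of objects (not TinyDB's own file layout), and load_records, chunked, count_matches and parallel_count are hypothetical names. The file is read once by a single reader, and only the in-memory chunks are handed to workers, so no worker ever touches the file handle.

```python
import json
from concurrent.futures import Executor
from typing import Any, Dict, List


def load_records(path: str) -> List[Dict[str, Any]]:
    # Single reader: the file is opened and parsed exactly once.
    with open(path, "r") as handle:
        return json.load(handle)


def chunked(items: List[Any], size: int) -> List[List[Any]]:
    # Split the list into consecutive slices of at most `size` items.
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_matches(chunk: List[Dict[str, Any]]) -> int:
    # Placeholder for the CPU-intensive work (assumption: count even values).
    return sum(1 for record in chunk if record.get("value", 0) % 2 == 0)


def parallel_count(records: List[Dict[str, Any]],
                   executor: Executor,
                   chunk_size: int = 1000) -> int:
    # Each worker receives its own chunk of already-loaded data.
    return sum(executor.map(count_matches, chunked(records, chunk_size)))
```

The executor is passed in so the caller can choose: a concurrent.futures.ThreadPoolExecutor is enough for a quick check, while a ProcessPoolExecutor is the usual choice for genuinely CPU-bound work in Python.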