Comments (7)
Hey @zym1010 ,
DataJoint is not necessarily coupled to Matlab. We also provide a Python version.
We agree that storing large files in the database usually doesn't make sense. Normally, the data after the first preprocessing step is stored in the database. For example, you could spike sort and extract spike times and store those in the database. The DataJoint schemas of type `Imported` are what you would use in that case.
DataJoint does not help you copy the data via scp.
I have no idea how `mym` deals with fancier Matlab objects (I mainly work with the Python version). Generally we don't recommend storing that kind of object as a blob in the database. The reason is that you lose the power of relational queries on those attributes. What I mean is the following: say you have a table with experimental data. Then you can quickly retrieve values for a particular animal and session like this
mytable & 'animal_id=8623' & 'session=2'
With blob fields you cannot do this because the comparisons are actually done on the MySQL side, where a blob is just a sequence of bytes.
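To make the difference concrete, here is a minimal sketch in Python. It uses the standard-library `sqlite3` module as a stand-in for MySQL (an assumption made only so the example runs without a server), and the table layout, the `spike_count` column, and the `depth_um` field inside the blob are all hypothetical.

```python
import json
import sqlite3

# In-memory SQLite database standing in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    " animal_id INTEGER, session INTEGER, spike_count INTEGER, meta BLOB,"
    " PRIMARY KEY (animal_id, session))"
)

# The blob column holds serialized bytes that the server cannot look into.
rows = [
    (8623, 1, 120, json.dumps({"depth_um": 300}).encode()),
    (8623, 2, 95, json.dumps({"depth_um": 450}).encode()),
    (9001, 2, 88, json.dumps({"depth_um": 450}).encode()),
]
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)

# Plain attributes: the database does the comparison and returns only matches.
restricted = conn.execute(
    "SELECT animal_id, session, spike_count FROM mytable"
    " WHERE animal_id = ? AND session = ?",
    (8623, 2),
).fetchall()

# Blob attribute: every row has to be fetched and decoded on the client
# before it can be filtered.
deep = [
    (animal_id, session)
    for animal_id, session, meta in conn.execute(
        "SELECT animal_id, session, meta FROM mytable ORDER BY animal_id, session"
    )
    if json.loads(meta)["depth_um"] == 450
]
```

The first query is the SQL analogue of the restriction above. The second shows what you are left with once the interesting value lives inside a blob: a full fetch followed by client-side filtering.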
I have never worked with Matlab tables, but it seems that you could easily turn them into proper database tables (via DataJoint). Writing a conversion function is probably not hard. That way you would have more flexible access to your data.
Hope that helps.
from datajoint-matlab.
@fabiansinz Thanks for your quick reply!
> DataJoint is not necessarily coupled to Matlab. We also provide a Python version.
By tightly coupled I mean many blobs are saved in a Matlab-specific way (or, more correctly, I guess the `mym` way).
> We agree that storing large files in the database usually doesn't make sense. Normally, the data after the first preprocessing step is stored in the database. For example, you could spike sort and extract spike times and store those in the database. The datajoint schemas of type Imported is what you would use in that case.
From what I see in the codebase of datajoint, in Import, you can read (probably also write) some large files from some local folders, which are probably mapped from some remote servers. Is that right?
You are correct. We store data how `mym` stores it. You can load the same arrays with Python though.
I am not sure which code base you are referring to (DataJoint or the example schemas). `Import` is just a particular way of designing a table. It requires you to have a `makeTuples` function that is called by `populate`. It is "just" semantics to indicate that this table imports data. The table name in the database takes that into account to make backup schedules a bit easier.
We personally use network drives with the experimental data on the machines that import/process the data and store it into the database.
> I am not sure which code base you are referring to (DataJoint or the example schemas). Import is just a particular way of designing a table. It requires you to have a makeTuples function that is called by populate. It is "just" semantics to indicate that this table imports data. The table name in the database takes that into account to make backup schedules a bit easier.
Thanks for the clarification. Actually I don't know the difference between `computed` and `import`, since they both support populate. So that's just a semantic difference?
Yes, the difference is semantic as well and mainly translates into different naming conventions for the database tables. The idea is that Computed tables are less important than Imported tables, which are less important than Manual tables. The former two can basically always be recomputed, while Manual data, once lost, is gone. Therefore, backups should focus on Manual tables (e.g. daily) and less on Computed ones (e.g. weekly). The naming schemes make it easy to automatically detect important tables with a script in the database.
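Such a script could look roughly like the Python sketch below. It assumes the usual DataJoint prefixes (`__` for Computed, `_` for Imported, no prefix for Manual), so please check them against your own schema. The function name, the example table names, and the weekly interval for Imported tables are made up for illustration.

```python
def table_tier(table_name):
    """Guess the tier of a table from its name in the database.

    Assumed prefixes: '__' for Computed, '_' for Imported, none for Manual.
    Anything else falls through to 'manual', the conservative choice for backups.
    """
    if table_name.startswith("__"):  # check the longer prefix first
        return "computed"
    if table_name.startswith("_"):
        return "imported"
    return "manual"


# Backup interval in days per tier. Daily for Manual and weekly for Computed
# follow the comment above; the value for Imported is an assumption.
BACKUP_INTERVAL_DAYS = {"manual": 1, "imported": 7, "computed": 7}


def backup_plan(table_names):
    """Map each table name to its backup interval in days."""
    return {name: BACKUP_INTERVAL_DAYS[table_tier(name)] for name in table_names}


plan = backup_plan(["session", "_spike_times", "__spike_stats"])
```

Here `plan` maps the hypothetical `session` table to a daily backup and the other two to weekly ones.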
Imported tables require access to data in the database and also in external files while populating.
Computed tables only require access to data already in the database.
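As a toy illustration of that distinction, the sketch below mimics the populate pattern in plain Python, with dictionaries standing in for database tables. This is not the DataJoint API: `populate`, `import_spikes`, the file naming, and the table contents are all hypothetical.

```python
import tempfile
from pathlib import Path


def populate(parent_keys, table, make_tuples):
    """Call make_tuples once for every parent key that is not in the table yet."""
    for key in parent_keys:
        if key not in table:
            table[key] = make_tuples(key)


with tempfile.TemporaryDirectory() as raw_dir:
    # A raw data file outside the "database", e.g. on a network drive.
    Path(raw_dir, "8623_2.txt").write_text("0.1 0.4 0.9")
    sessions = [(8623, 2)]  # keys of a manual table: (animal_id, session)

    # "Imported": filling the table needs the key AND the external file.
    spike_times = {}

    def import_spikes(key):
        animal_id, session = key
        text = Path(raw_dir, f"{animal_id}_{session}.txt").read_text()
        return [float(t) for t in text.split()]

    populate(sessions, spike_times, import_spikes)

# "Computed": filling the table only needs data that is already stored,
# so it still works after the raw files are gone.
spike_counts = {}
populate(spike_times.keys(), spike_counts, lambda key: len(spike_times[key]))
```

Calling `populate` a second time does nothing, because keys that already exist are skipped.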
FYI: a few labs are adopting DataJoint, so we will make a series of video tutorials and put them on YouTube. Watch this space.
BTW: You can store any fancy MATLAB object in the database including class objects. However, more complex objects may not be loaded in Python. We generally stick with simpler things like n-dimensional arrays.