Comments (16)
Yes, this puzzles me too. It seems to depend on the size of information_schema tables. In other words, the more tables and schemas the server hosts, the longer the queries take. It may have to do with the fact that these are not real tables but schema information generated into a table each time. I will try to construct a faster query.
from datajoint-matlab.
Apparently setting innodb_stats_on_metadata=0 will speed up queries to the information schema a lot. It reduced the time to about half on my local database (from 2s to 1s). There don't seem to be any drawbacks to disabling the stats: http://www.mysqlperformanceblog.com/2011/12/23/solving-information_schema-slowness/
from datajoint-matlab.
I am also wondering if the underscore in the table names have any effect. Underscores are a mysql wildcard and might trigger some kind of matching to be performed.
I have to research some weird deleting behavior anyway, I might as well create that test setup with a different tierPrefixes and different CamelCase parser.
from datajoint-matlab.
The hypothesis that underscores are to blame may have some merit to it. Let's test it. DataJoint always encloses table names in backquotes to escape wildcard matching and other anomalous behavior, but I agree that underscores could trigger something internally when querying the information schema.
from datajoint-matlab.
Setting innodb_stats_oin_metadata requires super privileges, so it should probably be done in the configuration script.
from datajoint-matlab.
The sluggishness does appear to be related to the use of the underscore. Regular expressions seem to be affected even though they should not interpret underscores as wildcards. I am trying to find a workaround but it may come to renaming all the computed and imported tables.
from datajoint-matlab.
So far, I have not found a way to escape underscores in REGEXP. It's definitely a bug on MySQL's side. The only solution I see is to change the naming convention. I will provide a function that converts a schema from the old naming convention to the new naming convention seamlessly (it's easy). But first I will look around a bit more if there is a way to escape underscores.
Are we willing to change the naming convention for tables at this point?
from datajoint-matlab.
I was able to fix this without requiring a change in the naming convention. This was a MySQL bug wherein underscores were interpreted as wildcards within REGEXP even when escaped. I replaced REGEXP with IN (table1,table2,...)
from datajoint-matlab.
the current solution does not cross schema boundaries, still slow sometimes...
from datajoint-matlab.
This business with underscores in REGEXP must have been an epiphenomenon. The fix from last night made things faster only under some circumstances.
The real delay seems to be caused by querying the referenced_table_schema
field.
There does not seem to be a way around it. I am fixing this by delaying the loading of dependencies until they are actually needed (ERD or cascading deletes or drops).
from datajoint-matlab.
although datajoint now postpones the loading of dependencies until they are actually needed, loading them still takes some time. It is some MySQL issue and I have not been able to figure out why exactly this happens. Does MariaDB have the same issues?
from datajoint-matlab.
Above you say the current solution does not cross schema boundaries. That would be an issue for me since I use several schemas together. Is this fixed in the current implementation?
from datajoint-matlab.
It works correctly in v2.6. Please test various use cases.
Sorry for the confusion. I tried several solutions and some seemed to work for sometime. This is difficult to diagnose because sometimes queries execute faster and it is easy to misattribute the speedup to something you did. I ended up reverting them. The comment about the lack of cross-schema cascading behaviors referred to one such solution.
The solution I ended up with did not accelerate the loading of dependencies but only made it less frequent. Dependencies are loaded right before they are needed: to plot ERDs, to cascade deletes or table drops, or to reverse engineer a table definition.
from datajoint-matlab.
I really doubt that MariaDB is immune. Furthermore, I recently encountered problems with the 5.5 branch of MariaDB. The release contained bugs in the optimizer that caused duplicate tuples to be returned in certain queries. I was able to trace this back to the incorrect use of "LooseScan". This forced me to switch back to MySQL 5.6.
On Sep 14, 2013, at 10:18 PM, Dimitri [email protected] wrote:
although datajoint now postpones the loading of dependencies until they are actually needed, loading them still takes some time. It is some MySQL issue and I have not been able to figure out why exactly this happens. Does MariaDB have the same issues?
—
Reply to this email directly or view it on GitHub.
from datajoint-matlab.
indeed setting innodb_stats_on_meta = 0 globally solves this problem and there does not appear to be another way around this otherwise.
Sorry for the confusion. I am closing this again.
from datajoint-matlab.
Oh I see! I had assumed from your previous descriptions that innodb_stats_on_meta
had been set to 0 the whole time. I am going to add this to our server configuration files in the wiki.
from datajoint-matlab.
Related Issues (20)
- Markdown not properly rendered in FEX Toolbox description
- fetchn(tableA * tableB, 'field') does not work in 3.4.2
- del deletes some incorrect entries from downstream tables with renamed foreign keys HOT 1
- table classes on path was not recognized when deleting or populating
- error with fetchn DataJoint version 3.4.2 HOT 1
- `{}` is not supported in attribute comments HOT 1
- bool and boolean is not supported in 3.4.1 HOT 1
- Parallelize file transfer for external blobs HOT 1
- bug with del() HOT 6
- Occasional fwrite errors using file-based stores
- fetching 'KEY' error HOT 2
- discovering existing tables across python/matlab HOT 5
- Empty matrices are converted to NULL HOT 2
- syncDef missing [nullable] & comment properties HOT 2
- Officially use the `+` operator for unions
- SSL Connection error HOT 3
- Schema name must start with lower case HOT 2
- Allow strings as well as char HOT 2
- MATLAB hangs and ramps up CPU usage if user closes the password Diagloue Box
- Allow pasting when logging into a database
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from datajoint-matlab.