Giter Club home page Giter Club logo

Comments (16)

dimitri-yatsenko avatar dimitri-yatsenko commented on September 26, 2024

Yes, this puzzles me too. It seems to depend on the size of information_schema tables. In other words, the more tables and schemas the server hosts, the longer the queries take. It may have to do with the fact that these are not real tables but schema information generated into a table each time. I will try to construct a faster query.

from datajoint-matlab.

aecker avatar aecker commented on September 26, 2024

Apparently setting innodb_stats_on_metadata=0 will speed up queries to the information schema a lot. It reduced the time to about half on my local database (from 2s to 1s). There don't seem to be any drawbacks to disabling the stats: http://www.mysqlperformanceblog.com/2011/12/23/solving-information_schema-slowness/

from datajoint-matlab.

FlorianFranzen avatar FlorianFranzen commented on September 26, 2024

I am also wondering if the underscore in the table names have any effect. Underscores are a mysql wildcard and might trigger some kind of matching to be performed.

I have to research some weird deleting behavior anyway, I might as well create that test setup with a different tierPrefixes and different CamelCase parser.

from datajoint-matlab.

dimitri-yatsenko avatar dimitri-yatsenko commented on September 26, 2024

The hypothesis that underscores are to blame may have some merit to it. Let's test it. DataJoint always encloses table names in backquotes to escape wildcard matching and other anomalous behavior, but I agree that underscores could trigger something internally when querying the information schema.

from datajoint-matlab.

dimitri-yatsenko avatar dimitri-yatsenko commented on September 26, 2024

Setting innodb_stats_oin_metadata requires super privileges, so it should probably be done in the configuration script.

from datajoint-matlab.

dimitri-yatsenko avatar dimitri-yatsenko commented on September 26, 2024

The sluggishness does appear to be related to the use of the underscore. Regular expressions seem to be affected even though they should not interpret underscores as wildcards. I am trying to find a workaround but it may come to renaming all the computed and imported tables.

from datajoint-matlab.

dimitri-yatsenko avatar dimitri-yatsenko commented on September 26, 2024

So far, I have not found a way to escape underscores in REGEXP. It's definitely a bug on MySQL's side. The only solution I see is to change the naming convention. I will provide a function that converts a schema from the old naming convention to the new naming convention seamlessly (it's easy). But first I will look around a bit more if there is a way to escape underscores.

Are we willing to change the naming convention for tables at this point?

from datajoint-matlab.

dimitri-yatsenko avatar dimitri-yatsenko commented on September 26, 2024

I was able to fix this without requiring a change in the naming convention. This was a MySQL bug wherein underscores were interpreted as wildcards within REGEXP even when escaped. I replaced REGEXP with IN (table1,table2,...)

from datajoint-matlab.

dimitri-yatsenko avatar dimitri-yatsenko commented on September 26, 2024

the current solution does not cross schema boundaries, still slow sometimes...

from datajoint-matlab.

dimitri-yatsenko avatar dimitri-yatsenko commented on September 26, 2024

This business with underscores in REGEXP must have been an epiphenomenon. The fix from last night made things faster only under some circumstances.

The real delay seems to be caused by querying the referenced_table_schema field.

There does not seem to be a way around it. I am fixing this by delaying the loading of dependencies until they are actually needed (ERD or cascading deletes or drops).

from datajoint-matlab.

dimitri-yatsenko avatar dimitri-yatsenko commented on September 26, 2024

although datajoint now postpones the loading of dependencies until they are actually needed, loading them still takes some time. It is some MySQL issue and I have not been able to figure out why exactly this happens. Does MariaDB have the same issues?

from datajoint-matlab.

aecker avatar aecker commented on September 26, 2024

Above you say the current solution does not cross schema boundaries. That would be an issue for me since I use several schemas together. Is this fixed in the current implementation?

from datajoint-matlab.

dimitri-yatsenko avatar dimitri-yatsenko commented on September 26, 2024

It works correctly in v2.6. Please test various use cases.

Sorry for the confusion. I tried several solutions and some seemed to work for sometime. This is difficult to diagnose because sometimes queries execute faster and it is easy to misattribute the speedup to something you did. I ended up reverting them. The comment about the lack of cross-schema cascading behaviors referred to one such solution.

The solution I ended up with did not accelerate the loading of dependencies but only made it less frequent. Dependencies are loaded right before they are needed: to plot ERDs, to cascade deletes or table drops, or to reverse engineer a table definition.

from datajoint-matlab.

ahoenselaar avatar ahoenselaar commented on September 26, 2024

I really doubt that MariaDB is immune. Furthermore, I recently encountered problems with the 5.5 branch of MariaDB. The release contained bugs in the optimizer that caused duplicate tuples to be returned in certain queries. I was able to trace this back to the incorrect use of "LooseScan". This forced me to switch back to MySQL 5.6.

On Sep 14, 2013, at 10:18 PM, Dimitri [email protected] wrote:

although datajoint now postpones the loading of dependencies until they are actually needed, loading them still takes some time. It is some MySQL issue and I have not been able to figure out why exactly this happens. Does MariaDB have the same issues?


Reply to this email directly or view it on GitHub.

from datajoint-matlab.

dimitri-yatsenko avatar dimitri-yatsenko commented on September 26, 2024

indeed setting innodb_stats_on_meta = 0 globally solves this problem and there does not appear to be another way around this otherwise.

Sorry for the confusion. I am closing this again.

from datajoint-matlab.

ahoenselaar avatar ahoenselaar commented on September 26, 2024

Oh I see! I had assumed from your previous descriptions that innodb_stats_on_meta had been set to 0 the whole time. I am going to add this to our server configuration files in the wiki.

from datajoint-matlab.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.