Comments (9)

iluminae commented on August 16, 2024

I have filed a bug with CockroachDB. Thank you for your help setting up a debug environment, @brandond. I actually prefer running kine separately from k3s anyway, now that I have that set up.

from kine.

brandond commented on August 16, 2024

Does CockroachDB have some weird semantics around auto-increment primary keys or something? The resource version field just comes from the id column and should not under any circumstances be 0.

Note that CockroachDB isn't on our supported engine list; I'm not sure anyone other than you has even tried it yet.

iluminae commented on August 16, 2024

I assume other people use it, as we have another open issue concerning it: #44.

I understand that the id column should not be 0, and it is not:

select count(id) from kine where id=0;
  count
---------
      0

select count(id) from kine where id>0;
   count
------------
  11599894

 select * from kine_auto_inc;
  last_value | log_cnt | is_called
-------------+---------+------------
    48305999 |       0 |   true

and yet this error exists, which makes me think the problem is with a query. Is there a query I can run that would exhibit the issue and cause that error? Or, can I turn on query logging in kine (within k3s) some way?

brandond commented on August 16, 2024

You're right, other people have tried it - but I don't think they managed to get or keep it working for as long as you have.

There is no way to turn the logging for kine embedded in K3s up to the trace level required to get the raw SQL statements, but you can run a standalone kine instance either from the release artifacts, or the docker hub image (docker.io/rancher/kine). See https://github.com/k3s-io/kine/blob/v0.6.0/examples/minimal.md for an example, and be sure to add --debug for extra verbosity. For testing, you can skip the CA certs and just serve plaintext HTTP.

Once that's up, you can point K3s at kine instead of directly at the database with the --datastore-endpoint=http://IP:PORT CLI flag (replace these with the IP and Port of your standalone kine instance), and kine should begin spewing copious amounts of data.

iluminae commented on August 16, 2024

Wow, you were not kidding about the copious information... I can't even open vim with a log file this big after 60s... My kine table has 12 million rows in it, and it seems like it's trying to run 12 million queries right now on startup.

I'll try to wrangle this into something intelligible... It seems like just millions of runs of this, even before having an etcd client:

listSQL = fmt.Sprintf(`
SELECT (%s), (%s), %s
FROM kine AS kv
JOIN (
SELECT MAX(mkv.id) AS id
FROM kine AS mkv
WHERE
mkv.name LIKE ?
%%s
GROUP BY mkv.name) maxkv
ON maxkv.id = kv.id
WHERE
(kv.deleted = 0 OR ?)
ORDER BY kv.id ASC
`, revSQL, compactRevSQL, columns)

iluminae commented on August 16, 2024

OK, that was a serious amount of logs, but the query I mentioned above seems to have a very odd response that is probably the issue (columns redacted or removed for brevity):

SELECT
    ( SELECT MAX(rkv.id) AS id FROM kine AS rkv ),
    ( SELECT MAX(crkv.prev_revision) AS prev_revision FROM kine AS crkv WHERE crkv.name = 'compact_rev_key' ),
    kv.id AS theid, kv.name, kv.created, kv.deleted, kv.create_revision, kv.prev_revision, kv.lease
FROM kine AS kv
JOIN (
    SELECT MAX(mkv.id) AS id
    FROM kine AS mkv
    WHERE mkv.name LIKE '/registry/deployments/%'
    GROUP BY mkv.name
) maxkv ON maxkv.id = kv.id
WHERE (kv.deleted = 0 OR false)
ORDER BY kv.id ASC
LIMIT 10001;
     id    | prev_revision |  theid   |                             name                             | create_revision | prev_revision
-----------+---------------+----------+--------------------------------------------------------------+-----------------+----------------
  48314093 |      36706371 | 32090805 | /registry/deployments/rook-ceph/rook-ceph-operator           |           20219 |      32090511
         0 |             0 | 36707040 | /registry/deployments/kube-system/metrics-server             |             263 |      36707033
  48314093 |      36706371 | 36707269 | /registry/deployments/kube-system/traefik                    |            1478 |      36707252
         0 |             0 | 36707595 | /registry/deployments/git/git-memcached                      |        21215058 |      36707546
  48314093 |      36706371 | 36707654 | /registry/deployments/kube-system/coredns                    |             228 |      36707583
         0 |             0 | 38998290 | /registry/deployments/automation/auto-home-assistant         |        21205977 |      38998270
  48314093 |      36706371 | 39079914 | /registry/deployments/monitoring/prom-prometheus-server      |        39079590 |      39079649
         0 |             0 | 39099441 | /registry/deployments/monitoring/grafana                     |        39081983 |      39099434
  48314093 |      36706371 | 48300495 | /registry/deployments/kube-system/local-path-provisioner     |             243 |      48300488
         0 |             0 | 48312788 | /registry/deployments/rook-ceph/csi-rbdplugin-provisioner    |           21296 |      48300496
  48314093 |      36706371 | 48312789 | /registry/deployments/rook-ceph/csi-cephfsplugin-provisioner |           21340 |      48300493

But if you look at the individual rows, none of them has an id of 0 (as I pointed out before, nothing is 0 in the id column).

The oddness is in the MAX(id) function within the join - I am trying to get a small example that reproduces this.
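
One way to spot the bad rows mechanically: the first column is the maximum id of the whole table, so it can never be smaller than a row's own id. Here is a rough sketch of that check, using four rows copied from the output above. The `row` type and `suspectRows` helper are hypothetical names for illustration, not part of kine.

```go
package main

import "fmt"

// row holds three columns of the quoted output: the scalar-subquery result
// (current revision), the row's own id (theid), and the key name.
type row struct {
	currentRev int64
	id         int64
	name       string
}

// suspectRows returns the names of rows whose current revision is smaller
// than their own id, which cannot happen if MAX(id) is computed correctly.
func suspectRows(rows []row) []string {
	var bad []string
	for _, r := range rows {
		if r.currentRev < r.id {
			bad = append(bad, r.name)
		}
	}
	return bad
}

func main() {
	// Values taken from the query output in this thread.
	rows := []row{
		{48314093, 32090805, "/registry/deployments/rook-ceph/rook-ceph-operator"},
		{0, 36707040, "/registry/deployments/kube-system/metrics-server"},
		{48314093, 36707269, "/registry/deployments/kube-system/traefik"},
		{0, 36707595, "/registry/deployments/git/git-memcached"},
	}
	for _, name := range suspectRows(rows) {
		fmt.Println("suspect:", name)
	}
}
```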

iluminae commented on August 16, 2024

OK here is a small example of the issue:

SELECT
    ( SELECT MAX(rkv.id) AS id FROM kine AS rkv ),
    kv.name,
    MAX(kv.id) AS theid
FROM kine AS kv
WHERE kv.deleted = 0
AND kv.name like '/registry/deployments/kube-system/%'
GROUP BY kv.name;
     id    |                           name                           |  theid
-----------+----------------------------------------------------------+-----------
  48319912 | /registry/deployments/kube-system/coredns                | 36707654
         0 | /registry/deployments/kube-system/local-path-provisioner | 48300495
  48319912 | /registry/deployments/kube-system/metrics-server         | 36707040
         0 | /registry/deployments/kube-system/traefik                | 36707269

However, if you take the MAX(kv.id) out, the first column comes back correct:

SELECT
    ( SELECT MAX(rkv.id) AS id FROM kine AS rkv ),
    kv.name
FROM kine AS kv
WHERE kv.deleted = 0
AND kv.name like '/registry/deployments/kube-system/%'
GROUP BY kv.name;
     id    |                           name
-----------+-----------------------------------------------------------
  48320069 | /registry/deployments/kube-system/coredns
  48320069 | /registry/deployments/kube-system/local-path-provisioner
  48320069 | /registry/deployments/kube-system/metrics-server
  48320069 | /registry/deployments/kube-system/traefik

I will make a ticket with crdb since that looks to me like it just should not be happening.

brandond commented on August 16, 2024

Huh yeah it seems to be getting really confused by that subquery. The intent there is to return the 'current revision' (maximum id from the whole table) at the time the query is executed, alongside the name and id of the most recent revision of each individual resource.
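
To make that intent concrete, here is a small in-memory sketch of what the query is meant to return, under the assumption that the `deleted` filter applies to the most recent row of each name (as the join on `maxkv.id = kv.id` suggests). The `record` and `listed` types, the `listLatest` function, and the sample data are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// record is a simplified stand-in for one row of the kine table.
type record struct {
	id      int64
	name    string
	deleted bool
}

// listed is one output row: the table-wide current revision plus the
// most recent revision of a single resource.
type listed struct {
	currentRev int64
	id         int64
	name       string
}

// listLatest returns, for each name whose newest row is not deleted, that
// newest row paired with the maximum id of the whole table, ordered by id.
func listLatest(records []record) []listed {
	var currentRev int64
	latest := make(map[string]record)
	for _, r := range records {
		if r.id > currentRev {
			currentRev = r.id
		}
		if cur, ok := latest[r.name]; !ok || r.id > cur.id {
			latest[r.name] = r
		}
	}
	out := make([]listed, 0, len(latest))
	for _, r := range latest {
		if r.deleted {
			continue
		}
		out = append(out, listed{currentRev: currentRev, id: r.id, name: r.name})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].id < out[j].id })
	return out
}

func main() {
	records := []record{
		{1, "/registry/a", false},
		{2, "/registry/b", false},
		{3, "/registry/a", false},
		{4, "/registry/c", true},
	}
	for _, l := range listLatest(records) {
		fmt.Printf("%d | %d | %s\n", l.currentRev, l.id, l.name)
	}
}
```

Every output row carries the same current revision, which is why a mix of 0 and non-zero values in that column points at the database rather than the data.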

iluminae commented on August 16, 2024

I just created a fresh and clean CockroachDB and restored from a backup made 2 days ago, and it completely fixed the issue. They just came out with a new RocksDB-compatible storage engine called Pebble that may have a bug in it; I will follow up with them.

But that tells me this issue is not needed.

For the record (storage bug aside), CockroachDB is usable and excellent in terms of HA for k3s.
