Giter Club home page Giter Club logo

Comments (29)

chrisfw avatar chrisfw commented on July 17, 2024 1

from odbc2parquet.

pacman82 avatar pacman82 commented on July 17, 2024

Hello @chrisfw thanks for opening this issue, and providing the detailed log output. The datatype (-25) in the error message likely relates to BigInt, which is a hint, that the driver does not like the fact, that the tool is trying to fetch the columns with scale 0 as 64Bit integers. It seems the driver does at least not support this conversion for the SQL DataType, maybe the buffer can stay i64. Don't know yet.

I won't have time to investigate this deeper until the weekend, however in the meanwhile it would be helpful to verify the hypothesis, by trying to do this query without the columns those buffer description have an i64 in the kind attribute (see log output).

Ps. Am I right to assume you run this under windows?

Cheers, Markus

from odbc2parquet.

chrisfw avatar chrisfw commented on July 17, 2024

from odbc2parquet.

chrisfw avatar chrisfw commented on July 17, 2024

from odbc2parquet.

pacman82 avatar pacman82 commented on July 17, 2024

Is there a free version of the database you are using available, so we can include it in the test suite? I'd very much like to have a reproducing example?

from odbc2parquet.

chrisfw avatar chrisfw commented on July 17, 2024

from odbc2parquet.

pacman82 avatar pacman82 commented on July 17, 2024

For reference:

from odbc2parquet.

pacman82 avatar pacman82 commented on July 17, 2024

Setting up an Oracle is quite involved, was hoping for a ready made container easy to integrate into CI 😅 .

Anyhow, I think I now get at least the error messeg. Contrary to what it says, it is not about the SQL Data Type, but about the C Buffer Type we are binding. As seen in the linked reference above. Note that the Constant is declared in a header file name sqlext.h, meaning that it is an extension to the standard.

So far however 64Bit integers were supported by every database I encounterd. However a search revealed this stack overflow question.

https://stackoverflow.com/questions/338609/binding-int64-sql-bigint-as-query-parameter-causes-error-during-execution-in-o

Seemingly indicating that Oracle does not support bindig variables to 64 Bit integer buffers. Please note that the question is kinda old. Are you running the neweset version? Maybe Oracle added support in the meanwhile. So far however it seems the biggest improvement is that they added the constant -25 to the error message, giving users who browse C headers and add constantst an idea of what might be going on.

Regards what can be done in terms of this tool. I think it is a reasonable default behaviour to query Numeric(1..=18,0) as i64. It's also very sensible to let the driver do the conversion. Yet I could think about adding a flag, allowing to query those as text, and convert them from text to i64 within the tool itself. However setting a magic flag after interpreting an obscure error message is not the kind of UX im aiming for.

As a workaroud you could think a about casting every Numeric column with scale zero and precision greater than 9 (so starting at 10) into a Varchar. That way these would be fetched as text. You would have to post process the resulting parquet file though, to turn these back into integer colums, as they would be treated as text.

The sudden stoppage of the tool is worrysome. It can mean one of two things. The Oracle ODBC driver does causes an invalid memory access, double free or other kind of segmentation fault. Or one of the invariants I thought always to be true during writing the code, turns out to be violated.

Since I still have trouble with my test setup, I would kindly ask you to urn the debug build of the tool. Instruction of how to achieve that will follow shortly, but I now have to leave my PC so my GF can use it for gaming.

Cheers, Markus

from odbc2parquet.

pacman82 avatar pacman82 commented on July 17, 2024

Hello @chrisfw ,

In case you find the time to help out diagnosing this thing.

To run obtain a debug build of the tool, you would need a rust toolchain.

https://rustup.rs/ contains instruction of how to install one. It's just executing the script, and at one point you need to download either Visual Studio, or the Visual Studio Build tools to get the msvc compiler backend.

Checking out the repository and typing cargo build will result in the debug executable present in the target\debug subdirectory.

I could also just mail, or upload the executable to you, if you give me some coordinates.

Cheers, Markus

from odbc2parquet.

chrisfw avatar chrisfw commented on July 17, 2024

from odbc2parquet.

pacman82 avatar pacman82 commented on July 17, 2024

Hello Chris,

Thanks for your thoughts on possible workarounds, but I’m really hoping that a better solution will present itself after some further debugging. Oracle is one of the most heavily utilized databases in the world, so it would be a shame if a workaround is required in order to utilize the tool with it.

Very much thinking the same thing.

With regards to the conversion:

The error state reported by the driver should be HY003 IMHO and not HY004. (Meaning it's the buffer type which can't be bound. If it were about the SQL Type it should have said -5.). Also the error should be returned on SQLBindParameter instead of SQLFetch.

Maybe that's worth an upstream bug ticket.

Anyhow, with regards to the conversion, the solution might be to query the NUMERIC columns as text, and perform the conversion within the tool, instead of relying on the driver. This switch of behaviour could be activated with a flag on the command line. However, it would be a better UX if odbc2parquet would detect the error and tried again automatically with the adapted behaviour. Preconditions for the better UX are however:

  1. We can test the error handling code paths for Oracle on CI
  2. First call to SQLFetch did not change the state of the cursor, so we can just try again with a different set of bound buffers, without losing data.

Early termination of the tool so far is a mystery. Curious on what kind of stack trace I'll see.

FYI: $env:RUST_BACKTRACE="1" sets the environment in power shell, so we see more, if it fails with the debug version of the tool.

Cheers, Markus

from odbc2parquet.

chrisfw avatar chrisfw commented on July 17, 2024

from odbc2parquet.

pacman82 avatar pacman82 commented on July 17, 2024

Hello Chris,

One thought I had that may be a possibility for you is to simply create a free oracle cloud infrastructure account and spin up a free tier oracle autonomous database.

Nice idea. I'll look into that. Especially for CI. I did manage to build an oracle XE locally by the way. Yet even my humple attempts to create a table failed with a syntax error. At least I managed to figure out the connection string and did successfully connect.

Same here, it will be quite a few days before I can look deeper into this again.

In terms of experiments (only of course if you find the time):

There are four types of columns in your query.

  1. Character columns which will be queried as text.
  2. Numeric columns with scale zero which fit into a 32Bit integer.
  3. Numeric columns with scale zero which fit into a 64Bit integer.
  4. Numeric columns with scale != 0, which are queried as text.

I wonder if there is a subset of columns, for which the tool would work?

Cheers, Markus

from odbc2parquet.

pacman82 avatar pacman82 commented on July 17, 2024

Yet even my humple attempts to create a table failed with a syntax error.

Writing this I just realized that round brackets are magic in bash. 🤦

from odbc2parquet.

chrisfw avatar chrisfw commented on July 17, 2024

from odbc2parquet.

chrisfw avatar chrisfw commented on July 17, 2024

from odbc2parquet.

pacman82 avatar pacman82 commented on July 17, 2024

Thanks Chris,

Boy, that's a relief. Well the mapping issue is well understood. You can even read in the official documentation that 64Bit Integers are not supported. The public Oracle Forums are mourning this since decades it seems. Workaround is also understood, query these as string and convert. I'll add a flag to the tool. I am a bit undecided if I should just check for Oracle in the database name, or give the user a better error message (with a hint to the flag) if he/she encounters an error.

Cheers, Markus

from odbc2parquet.

chrisfw avatar chrisfw commented on July 17, 2024

from odbc2parquet.

pacman82 avatar pacman82 commented on July 17, 2024

I should just check for Oracle in the database name

I ment to write driver name.

However I think I use the flag, so if any other ODBC drivers out there, don't like 64 Bit integers it can also be used for them.

Cheers, Markus

from odbc2parquet.

chrisfw avatar chrisfw commented on July 17, 2024

from odbc2parquet.

pacman82 avatar pacman82 commented on July 17, 2024

Released version 0.6.22 with the --driver-does-not-support-64bit-integers flag. Not closing the issue, because I haven't done the helpful error message yet. However the flag itself I can introduce and test without having Oracle part of the test setup, so I figured I unblock your use case right now.

Tested the insert path with a 64Bit. Expectedly the Oracle driver also failed, but without emitting any error at all, which I find a bit disheartening, on how to best tackle that.

from odbc2parquet.

chrisfw avatar chrisfw commented on July 17, 2024

from odbc2parquet.

pacman82 avatar pacman82 commented on July 17, 2024

It seems history (unsuprisingly) repeats itself here: blue-yonder/turbodbc#142

I'll go add the better error message upstream in odbc-api, this way useres of arrow-odbc and arrow-odbc-py, won't be left to interpret that all by themeselfes.

from odbc2parquet.

chrisfw avatar chrisfw commented on July 17, 2024

from odbc2parquet.

pacman82 avatar pacman82 commented on July 17, 2024

Hello @chrisfw,

I have created a free tier account in the oracle cloud; I created a database; I also have an ADMIN user with a password. Now I am trying to figure out, how the ODBC connection string would look like. Managed to setup Oracle locally and test, but for the CI I would prefer the cloud solution. Mostly because Oracle does not seem to have ready made containers on Docker Hub.

Pretty sure I can get this done on my own, but I thought I might save some time asking you, since you seem to have experience with that.

Thanks for nothing Oracle.

To be fair, it is very unlikely that it is ill will on their side. It is a hard problem to structure a big organization, so a group of people both understands the buisness value of something technical like a good ODBC driver and has the autonomy, the resources and skills to act on it.
Stuff like good error messages and 64Bit support seldomly make it into a sales call, but in a world there developers have increasing autonomy to choose the tech they use, it starts to matter. However much we suffer, my guess is, there is a person in Oracle who has the skills, and will to improve that driver, but just can't convince his/her manager to spend time on that. I am also guessing that person feels the pain more than we do.

from odbc2parquet.

chrisfw avatar chrisfw commented on July 17, 2024

from odbc2parquet.

pacman82 avatar pacman82 commented on July 17, 2024

I may be able to get you a docker image of 19c.

FYI: I've been able to build a docker image using this repository https://github.com/oracle/docker-images .

Yet I find checking this repository out. Calling the script, to build the docker container, to involved for CI. I just want a ready image with an empty database and a premade user, which I can just docker pull.

However publishing the one I have to docker hub, I'm to scared to do, because I do not want to get in hot water with regards to licensing.

I'll be okay with either just calling docker build on some nice docker file for the CI, or using the ATP instance I spun up on Oracle Cloud, whichever seems less effort.

Preferably still the dockerfile, because there is less friction with tests, if somebody wants to fork and run tests.

from odbc2parquet.

chrisfw avatar chrisfw commented on July 17, 2024

from odbc2parquet.

pacman82 avatar pacman82 commented on July 17, 2024

Closing this issue, because I'll probably won't find the time to include Oracle in the test setup in a sensible way. However, the tool will emit a hint for Oracle users regardless.

"Looks like you are using an Oracle database. Try the \

from odbc2parquet.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.