Giter Club home page Giter Club logo

Comments (12)

GeoffMontee avatar GeoffMontee commented on May 26, 2024

Hi @tskenb,

Thanks for your interest in contributing to tds_fdw again! It's much appreciated. Sorry for not responding sooner. I've been quite busy the last couple weeks.

I love the idea of working with raw values, rather than converting to C-strings and using BuildTupleFromCStrings. Looking into that is on my to-do list. I'm not entirely sure how hard or easy it would be yet.

On a similar topic, looking into whether using ExecStoreVirtualTuple would provide better performance than ExecStoreTuple is also on my to-do list.

from tds_fdw.

tskenb avatar tskenb commented on May 26, 2024

Hi @GeoffMontee,

I went ahead and started working on this. I'm getting good speedup for native types like ints and floats, and also for datetimes - more than 50%, although I hate to quote numbers since they depend on the environment you happen to be running in. I'm also getting modest speedup for character types. I haven't bothered with the other types - by making the same function call that BuildTupleFromCStrings uses to produce the Datum we can fall back to using strings on a column by column basis.

It took me a while to figure it all out, but in the end the changes are not that hard. For the native types I just use dbbind, like you are doing to get the row count. For datetimes it is just a matter of refactoring a little and then bypassing a couple of statements. For character types I'm converting the character array to a text Datum, which might not be exactly the same as what it was doing before, but so far I haven't noticed any difference in behavior. I'll test it a little more. Hopefully I'll have a patch ready within the next few days.

Interestingly, even after my changes, and even transferring a table that has only improved column types, I still see 100% CPU usage for the transfer in my environment. That surprises me. Maybe it has something to do with the ExecStoreTuple business that you mentioned.

from tds_fdw.

GeoffMontee avatar GeoffMontee commented on May 26, 2024

Wow, 50% speedup is great! Thanks for working on this!

Regarding the CPU usage, I wonder if that's related to how FreeTDS uses character sets. If the TDS protocol version is 7.0 or higher, all data is transferred in the UCS-2 character set, and then FreeTDS converts it to the client's character set when it's received. There's more information about that here. I bet this behavior could cause high CPU usage.

What value of tds_version are you using in your tests?

from tds_fdw.

tskenb avatar tskenb commented on May 26, 2024

In my create server statement I use tds_version 7.1. I guess that's what you mean?

But the high CPU usage is even with tables that are only ints, floats and datetimes.

from tds_fdw.

GeoffMontee avatar GeoffMontee commented on May 26, 2024

In my create server statement I use tds_version 7.1. I guess that's what you mean?

Yes, that's right.

But the high CPU usage is even with tables that are only ints, floats and datetimes.

Interesting. I would think that FreeTDS wouldn't have to perform any character set conversions with those data types. The high CPU usage might have another cause. Thanks for letting me know.

from tds_fdw.

tskenb avatar tskenb commented on May 26, 2024

I'm considering how to handle the situation that a column's type in the foreign table doesn't match the type in the input table - for example float -> int or int -> smallint. Currently, the input value is written to a string, and if that string can be parsed as the output type then great, otherwise an error is reported. For example, in the int -> smallint case, everything will be fine as long as all the input values are within the range of smallints.

My inclination is to fall back to doing the string method when the types don't match. It might be possible to get higher performance for some mismatches, but it would involve extra coding and uncertain portability (at least uncertain to me).

from tds_fdw.

GeoffMontee avatar GeoffMontee commented on May 26, 2024

Your proposed solution to fall back to string conversions sounds OK to me. I don't see a great way around it, and I especially wouldn't want to try to implement a conversion from every FreeTDS type to every PostgreSQL type. Continuing to use dbconvert() to handle conversions in unusual cases seems like a good idea, at least for now.

from tds_fdw.

tskenb avatar tskenb commented on May 26, 2024

I have finally posted a patch for this.

from tds_fdw.

GeoffMontee avatar GeoffMontee commented on May 26, 2024

This patch looks great. Thank you very much for this contribution!

I'm going to test it a bit, but it should be committed soon.

from tds_fdw.

GeoffMontee avatar GeoffMontee commented on May 26, 2024

@tskenb,

The patch has been pushed. Thanks!

I'd like to acknowledge your contributions in the release notes for the next release. Is that OK with you? Would you like me to mention your full name? If so, what is it? Or I could just say "GitHub user @tskenb," if you'd like?

from tds_fdw.

tskenb avatar tskenb commented on May 26, 2024

Thanks for the quick turnaround. You can mention me. My name is Ken Been. (I tried to use that as my github name, but it already existed, possibly because I created it a long time ago with a defunct email account, I don't know. Or maybe someone else with my name uses github.)

from tds_fdw.

GeoffMontee avatar GeoffMontee commented on May 26, 2024

Fixed in 1.0.6.

from tds_fdw.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.