Comments (6)

jondegenhardt commented on May 19, 2024

@Imperatorn Great to know there's someone trying the tools on Windows! Report any problems you have, Windows related or otherwise!

from tsv-utils.

porteusconf commented on May 19, 2024

Does the merge into v2.1.2 for #320 imply that, for newline handling going forward, it might be easiest to adopt option 3?

Option 3, reading both newline forms, but writing Unix newlines

In any case, while waiting for a full Windows release/build, could just csv2tsv.exe be made available, assuming it passes any needed tests? This would enable Windows users to generate valid TSV from CSV without Excel, perhaps by:

  • Validate foo.csv with some tool(s), for example https://csvlint.io/. But be warned: if you download the "standardized" CSV they offer, I think it silently adds double quotes around every field, including numbers. For example, if foo.csv has a row foo,22 it becomes "foo","22" in the "standardized" CSV file (not sure why).
  • Run csv2tsv.exe foo.csv > foo.tsv. Note that csv2tsv by default removes double quotes where they are not needed, so the row in foo.tsv would be fooTAB22.

Some documentation may be needed on how to pass escaped command-line arguments to csv2tsv.exe on Windows when using cmd or PowerShell...
For example, on Linux/macOS, we can create a file with scsv (semicolon-separated values) using something like these:

csv2tsv --tsv-delim   $";"    foo.csv  > foo.scsv
csv2tsv --tsv-delim   $';'    foo.csv  > foo.scsv
csv2tsv --tsv-delim    \;     foo.csv  > foo.scsv

I suspect none of the command lines above would work on Windows. Perhaps ^; would work in cmd, per "which-symbol-is-escape-character-in-cmd".

But if you need to specify a tab as a command-line argument, then instead of cmd, Windows folks may need to use PowerShell, which can escape a tab as backtick-t

`t

per the "About special characters" page in the PowerShell docs.

Finally, another workaround that avoids both cmd and PowerShell completely: install git-for-windows (choco install git) or some other bash shell for Windows. Then run csv2tsv.exe in that shell, if csv2tsv.exe can handle arguments passed to it from bash.exe.
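
One way to check the quoting side of this before involving csv2tsv.exe is to look at what a bash-style shell actually hands to a program. The sketch below is only an illustration: show_args is a hypothetical stand-in, not part of tsv-utils, and it shows what the shell passes, not how a native Windows executable then parses its command line.

```shell
# Stand-in for the real program: prints each argument it receives in brackets.
# (show_args is a hypothetical helper, not part of tsv-utils.)
show_args() {
    for arg in "$@"; do
        printf '[%s]' "$arg"
    done
    printf '\n'
}

# Three POSIX-portable ways to pass a literal semicolon as a single argument.
show_args --tsv-delim ';'
show_args --tsv-delim ";"
show_args --tsv-delim \;

# A literal tab, built with printf rather than the bash-only $'\t' form.
tab=$(printf '\t')
show_args --tsv-delim "$tab"
```

If the three semicolon forms print the same bracketed pair, the shell side is behaving as expected, and any remaining difference would come from how the .exe receives its arguments.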

jondegenhardt commented on May 19, 2024

Hi @porteusconf. Thanks for the feedback and suggestions. Some comments in-line below.

Does the merge into v2.1.2 for #320 imply that, for newline handling going forward, it might be easiest to adopt option 3?

Option 3, reading both newline forms, but writing Unix newlines

Option 1, Unix newlines only on both input and output, is by far the easiest (lowest investment cost). Option 3 is a fair bit more expensive. Much of this comes from increased test suite cost, and some from the fact that several tools have their own reader functionality (for example, tsv-sample).

A relevant question is how much additional benefit would come from investing in option 3. It's a question I don't know the answer to: how many users are affected, how prevalent such data files are, and how onerous the alternatives are, such as invoking dos2unix on the data first.
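
For a sense of scale, the conversion step mentioned above can be quite small. This is a minimal sketch, assuming a POSIX shell. It uses tr rather than dos2unix so it runs where dos2unix is not installed, and the function and file names are made up for illustration.

```shell
# Hypothetical helper: drop carriage returns so CRLF line endings become LF.
# Caveat: this removes every CR byte, including any inside field values.
crlf_to_lf() {
    tr -d '\r'
}

# Build a small sample file with Windows (CRLF) newlines, then convert it.
printf 'name\tcount\r\nfoo\t22\r\n' > demo_crlf.tsv
crlf_to_lf < demo_crlf.tsv > demo_lf.tsv

# Byte counts before and after; the difference is the removed CR bytes.
wc -c < demo_crlf.tsv
wc -c < demo_lf.tsv
```

The converted file can then be passed to any of the tools, or the function can sit at the front of a pipeline.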

In any case, while waiting for a full Windows release/build, could just csv2tsv.exe be made available, assuming it passes any needed tests? This would enable Windows users to generate valid TSV from CSV without Excel, ...

Well, I'm reluctant to create pre-built binary packages for only a single tool. However, I see the merit behind this idea, and perhaps there are ways to get the desired effect.

First, note that nothing prevents cloning the git repo and building the tools on Windows. The test suite is not complete for Windows, but that doesn't mean the tools won't work properly. And to your point, csv2tsv would likely pass a more complete test suite simply because the csv2tsv test suite already includes examples of files with Windows newlines.

What could be done in this regard is to: (a) Publish test suite status info for csv2tsv by itself; (b) Add any missing csv2tsv tests; (c) Add specific instructions describing how to build on Windows.

perhaps by:

  • Validate foo.csv with some tool(s), for example https://csvlint.io/. But be warned: if you download the "standardized" CSV they offer, I think it silently adds double quotes around every field, including numbers. For example, if foo.csv has a row foo,22 it becomes "foo","22" in the "standardized" CSV file (not sure why).
  • Run csv2tsv.exe foo.csv > foo.tsv. Note that csv2tsv by default removes double quotes where they are not needed, so the row in foo.tsv would be fooTAB22.

csv2tsv doesn't have any trouble reading any of these formats, but as you point out, it always generates escape-free TSV.
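
A quick way to see this for yourself is to convert a quoted and an unquoted version of the same row and compare the results. This is a sketch only, assuming csv2tsv is on the PATH and a POSIX shell is available; the file names are made up for illustration.

```shell
# Two CSV files with the same data: one unquoted, one with every field quoted
# (the shape described above for the "standardized" download).
printf 'foo,22\n' > plain.csv
printf '"foo","22"\n' > quoted.csv

# Only run the comparison if csv2tsv is actually installed.
if command -v csv2tsv >/dev/null 2>&1; then
    csv2tsv plain.csv > plain.tsv
    csv2tsv quoted.csv > quoted.tsv
    # cmp -s exits 0 when the two outputs are byte-identical.
    if cmp -s plain.tsv quoted.tsv; then
        echo "same TSV output"
    else
        echo "outputs differ"
    fi
else
    echo "csv2tsv not found; skipping comparison"
fi
```

Based on the behavior described in this thread, both inputs would be expected to produce the same tab-separated row.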

Some documentation may be needed on how to pass escaped command-line arguments to csv2tsv.exe on Windows when using cmd or PowerShell...

Good thoughts, thank you.

Finally, another workaround that avoids both cmd and PowerShell completely: install git-for-windows (choco install git) or some other bash shell for Windows. Then run csv2tsv.exe in that shell, if csv2tsv.exe can handle arguments passed to it from bash.exe.

Agreed, it might make sense to include this option in the documentation.

Imperatorn commented on May 19, 2024

Status?

jondegenhardt commented on May 19, 2024

Status?

Status as described in the main description is up-to-date. It is updated as things change. At present, there are no known failure cases on Windows. But, since the test suite doesn't run fully, it leaves unknowns. Also, there's a lack of real-world use on Windows, or at least use that gets reported. So it is more about unknowns at this point.

Do you have specific questions?

Imperatorn commented on May 19, 2024

No, I was just wondering why there weren't any Windows binaries. I've put them here for anyone interested:
https://github.com/Imperatorn/tsv-utils/releases
