kbcsv's Introduction

KBCsv

What?

KBCsv is an efficient, easy-to-use .NET library for parsing and writing the CSV (comma-separated values) format.

Why?

CSV is a common data format that developers need to work with, and .NET does not include intrinsic support for it. Implementing an efficient, standards-compliant CSV parser is not a trivial task, so using KBCsv avoids the need for developers to do so.

Where?

The easiest way to get KBCsv is to install via NuGet:

Install-Package KBCsv

Or, if you want the extensions:

Install-Package KBCsv.Extensions

Data-specific extensions are available as a separate package for .NET 4.5 (the other packages above are portable):

Install-Package KBCsv.Extensions.Data

How?

using (var streamReader = new StreamReader("data.csv"))
using (var csvReader = new CsvReader(streamReader))
{
    csvReader.ReadHeaderRecord();

    while (csvReader.HasMoreRecords)
    {
        var record = csvReader.ReadDataRecord();
        var name = record["Name"];
        var age = record["Age"];
    }
}

Please see the documentation for more details.

Who?

KBCsv is created and maintained by Kent Boogaart. Issues and pull requests are welcome.

Primary Features

  • Very easy to use
  • Very efficient
  • Separate extension libraries to provide additional (but optional) features such as working with System.Data types
  • Portable Class Library targeting netstandard 1.0
  • Full async support
  • Includes extensive documentation and examples in both C# and VB.NET
  • Conforms to the official CSV standard, RFC4180
  • Also conforms to common CSV pseudo-standards
  • Highly customizable, such as specifying non-standard value separators and delimiters
  • Very high test coverage

kbcsv's Issues

Create .NET 4.0 Nuget package

I installed this library, with its extensions, via NuGet into a .NET 4.0 desktop project written in F#, and ran into serious dependency problems. I could not build or run.
Could you create a specific package for the older development model, where all DLLs are referenced against the full .NET Framework rather than downloaded?

My case is:

  1. An F# DLL project targeting .NET 4.0 installs KBCsv.Extensions via NuGet.
  2. A C# project targeting .NET 4.0 references the F# project.

I cannot build them together.

Header record cannot be preceded by new lines

This fails:

[Fact]
public void REPRO()
{
    var test = @"


Col1,Col2";

    var reader = CsvReader.FromCsvString(test);
    var header = reader.ReadHeaderRecord();
    Assert.Equal("Col1", header[0]);
}

Add DataTable.WriteCsv to return header

I do the following:

let csv = new CsvWriter(target)
dataTable.WriteCsv(csv, true) |> ignore

Could WriteCsv return the headers in some form? Or could there be a public method to calculate them?

CsvReader(string???)

Kent, I am maintaining a project and found that it uses Kent.Boogaart.KBCsv. Of course the reference was broken, as the assembly was not included in source control. I found your project here, downloaded it, and compiled it. I have a problem with an instantiation of CsvReader(). The code I'm maintaining passes a string as a filename. The version of your project that I found only accepts a Stream or a TextReader. Did you at one time have a version with an overload that takes a string parameter?

https://www.screencast.com/t/cKRdp8Oc

Unexpected behavior when setting CsvWriter.NewLine to something other than new line

There are some unexpected behaviors when setting CsvWriter.NewLine to something "crazy".

While this value is (obviously) meant to be either \r\n or \n to support different OS new line conventions, since it is a string, it can be set to any value, for example, || (two pipes). Many tools also support both reading and writing such a file (like SQL Server bcp.exe or SSIS; Excel however seems not to).

When I set CsvWriter.NewLine to be double pipes, two unexpected things happened:

  • String values that contain double pipes are NOT automatically surrounded by ValueDelimiter, even though this seems necessary in this situation,
  • String values that contain new lines ARE still automatically surrounded by ValueDelimiter, even though this does NOT seem necessary in this situation.

It seems like the grammar rules from rfc4180 have been generalised from "comma" and "double quote" to "any character" for the ValueSeparator and ValueDelimiter, but not for NewLine (which could be thought of as "Row delimiter").

Flush not called when leaveOpen: true

I don't know if this is by design (the documentation doesn't call it out), but it surprised me because it didn't show up in my first tests.

When a CsvWriter is created with leaveOpen: true, Dispose() doesn't flush to the underlying stream. So the following code doesn't work correctly:

var stream = new MemoryStream();

using (var writer = new CsvWriter(stream, Encoding.UTF8, leaveOpen: true))
{
    await writer.WriteRecordsAsync(records, 0, records.Length);
}

Maybe Dispose() could be something like:

if (!disposing)
    return;

if (this.leaveOpen)
    this.textWriter.Flush();
else
    this.textWriter.Dispose();

if (disposing && !this.leaveOpen)
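
Until Dispose() flushes in this case, one possible workaround is to flush explicitly before the using block ends. The following is a minimal sketch rather than a confirmed fix. It assumes that CsvWriter exposes FlushAsync() and WriteRecordAsync(params string[]), and that the library's namespace is KBCsv (older releases used Kent.Boogaart.KBCsv). The class and method names are hypothetical.

```csharp
using System.IO;
using System.Text;
using System.Threading.Tasks;
using KBCsv; // assumption: 4.x namespace; older releases used Kent.Boogaart.KBCsv

// Hypothetical helper class, used only to illustrate the workaround.
public static class FlushWorkaround
{
    public static async Task<string> WriteCsvAsync()
    {
        var stream = new MemoryStream();

        using (var writer = new CsvWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            // Assumed API: WriteRecordAsync(params string[] values).
            await writer.WriteRecordAsync("Name", "Age");
            await writer.WriteRecordAsync("Kent", "33");

            // Assumed API: FlushAsync() pushes buffered text to the stream,
            // so nothing is lost when Dispose() leaves the stream open.
            await writer.FlushAsync();
        }

        // The stream is still open because of leaveOpen: true.
        stream.Position = 0;

        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return await reader.ReadToEndAsync();
        }
    }
}
```

Flushing inside the using block keeps the caller in control of the stream's lifetime, which is the point of leaveOpen.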

Change names to shorter

Right now the namespaces are:

Kent.Boogaart.KBCsv
Kent.Boogaart.KBCsv.Extensions

  1. Try the .NET BCL approach, where extensions extend the root namespace and do not have their own. Drop the Extensions namespace and keep it only in the DLL name. Put the DataTable extensions into DataTable's namespace.
  2. Kent.Boogaart.KBCsv contains KB twice. Drop the KB to get Kent.Boogaart.Csv.
  3. The name is long and hard to remember, with two o's and two a's. Rename it to kentcb to get
Kentcb.Csv

Value: ease of discovery and typing. You could do the renaming when making other backward-incompatible changes.

Nuget package for PCL Profile 259 missing

I am trying to get the NuGet package for PCL Profile 259 and getting the error below. How hard would it be to publish a NuGet package targeting PCL Profile 259?

Thanks!

PM> Install-Package KBCsv
Attempting to gather dependency information for package 'KBCsv.4.0.0' with respect to project 'MYCOMPANY.MYPROJECT.Core.ViewModels', targeting '.NETPortable,Version=v4.5,Profile=Profile259'
Attempting to resolve dependencies for package 'KBCsv.4.0.0' with DependencyBehavior 'Lowest'
Resolving actions to install package 'KBCsv.4.0.0'
Resolved actions to install package 'KBCsv.4.0.0'
GET https://api.nuget.org/packages/kbcsv.4.0.0.nupkg
OK https://api.nuget.org/packages/kbcsv.4.0.0.nupkg 25ms
Installing KBCsv 4.0.0.
Install failed. Rolling back...
Package 'KBCsv.4.0.0' does not exist in project 'MYCOMPANY.MYPROJECT.Core.ViewModels'
Package 'KBCsv.4.0.0' does not exist in folder 'C:\code\MYPATH\MYCOMPANY\Dev\Src\packages'
Install-Package : Could not install package 'KBCsv 4.0.0'. You are trying to install this package into a project that targets
'.NETPortable,Version=v4.5,Profile=Profile259', but the package does not contain any assembly references or content files that are compatible with that
framework. For more information, contact the package author.
At line:1 char:1
+ Install-Package KBCsv
    + CategoryInfo          : NotSpecified: (:) [Install-Package], Exception
    + FullyQualifiedErrorId : NuGetCmdletUnhandledException,NuGet.PackageManagement.PowerShellCmdlets.InstallPackageCommand

PM>

Buffered read fills buffer if final value is missing

The following test fails (read is 100, not 2):

[Fact]
public async void kbcsv_repro()
{
    var csv = @"Col1,Col2,Col3
val1,val2,val3
val1,val2,";

    using (var reader = CsvReader.FromCsvString(csv))
    {
        await reader.ReadHeaderRecordAsync();
        var buffer = new DataRecord[100];
        var read = await reader.ReadDataRecordsAsync(buffer, 0, buffer.Length);

        Assert.Equal(2, read);
    }
}

Including a final value or a new line is enough to make the test pass, but neither should be required.

Convert perf tests to BenchmarkDotNet?

It probably makes sense to convert the perf test project over to use BenchmarkDotNet. Ideally, it would run on .NET Core, but I'm not sure BenchmarkDotNet supports that...

Non RFC based escaping support

Escaping the delimiter character is supported by doubling, e.g. "some ""quoted"" value", however many sources of CSV files take a different approach and would instead have that as "some "quoted" value".

It would be useful to be able to specify an escape character to override RFC behaviour.

RFC 4180 Conformance

See if KBCsv can conform to RFC4180. This RFC came out after KBCsv was initially released, so I instead ensured it conformed to the pseudo-standards available at the time.

Inconsistent number of columns

Hello,

I am a bit surprised that a CSV file in which one line has an extra column is read successfully, without raising any exception.
E.g.:
column1;column2
data11;data12
data21;data22
data31;data32;data33
data41;data42

Is it the expected behavior? Is there an option to control this?
Any help will be appreciated.

Thanks
Vincent
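
If stricter behavior is wanted, the caller can compare each record's value count with the header's. The following is a minimal sketch, not a built-in option of the library. It assumes that CsvReader has a settable ValueSeparator property, that header and data records expose Count, and that the namespace is KBCsv. The class and method names are hypothetical.

```csharp
using System.IO;
using KBCsv; // assumption: 4.x namespace; older releases used Kent.Boogaart.KBCsv

// Hypothetical helper class, not part of KBCsv.
public static class StrictColumnCount
{
    // Throws InvalidDataException if any data record has a different
    // number of values than the header record.
    public static void Validate(string csv)
    {
        using (var reader = CsvReader.FromCsvString(csv))
        {
            // Assumed API: the example file uses ';' rather than ','.
            reader.ValueSeparator = ';';

            var header = reader.ReadHeaderRecord();
            var recordNumber = 1; // the header is record 1

            while (reader.HasMoreRecords)
            {
                var record = reader.ReadDataRecord();
                recordNumber++;

                // Assumed API: records expose Count.
                if (record.Count != header.Count)
                {
                    throw new InvalidDataException(
                        $"Record {recordNumber} has {record.Count} values, " +
                        $"but the header has {header.Count}.");
                }
            }
        }
    }
}
```

With the file above, this check would reject the record data31;data32;data33 because it has three values against a two-column header.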

Help to read select rows from a csv file with different row & column structures

Hi,

Thanks for the library. I would like to parse information from a CSV file whose rows and columns vary in structure, like the attached file.

I would like to parse the stock group into one table and the advances/declines into another.

Is this possible? If so, could you please provide some sample code, as I am a newbie?

The CSV file has been renamed to .txt because CSV attachments are not allowed.

Thanks in advance for your help.
StockMoves.txt

LINQ integration

Would be cool if something like this was possible:

var foos =
    from fooRecord in CsvParser.ReadAllRecords("somefile.csv", includeHeader: false)
    select new Foo(fooRecord[0], fooRecord[1]);

Requires careful thinking, especially around resource management and async.
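
One way to approximate this today is a lazy iterator around the existing reader API. The following is only a sketch of the idea, covering the synchronous case. The CsvLinq class, its ReadAllRecords method, and the hasHeader parameter are hypothetical and not part of KBCsv. The namespace KBCsv is also an assumption.

```csharp
using System.Collections.Generic;
using System.IO;
using KBCsv; // assumption: 4.x namespace; older releases used Kent.Boogaart.KBCsv

// Hypothetical helper class, not part of KBCsv.
public static class CsvLinq
{
    // Lazily yields each data record in the file at the given path.
    // When hasHeader is true, the first record is consumed as the header.
    public static IEnumerable<DataRecord> ReadAllRecords(string path, bool hasHeader)
    {
        using (var streamReader = new StreamReader(path))
        using (var csvReader = new CsvReader(streamReader))
        {
            if (hasHeader)
            {
                csvReader.ReadHeaderRecord();
            }

            while (csvReader.HasMoreRecords)
            {
                yield return csvReader.ReadDataRecord();
            }
        }
    }
}

// Usage, with a hypothetical Foo type (requires using System.Linq):
//
//     var foos =
//         (from fooRecord in CsvLinq.ReadAllRecords("somefile.csv", hasHeader: false)
//          select new Foo(fooRecord[0], fooRecord[1]))
//         .ToList();
```

Because the method is an iterator, the file is opened only when enumeration starts and is closed when enumeration completes or the enumerator is disposed. A partially consumed query that is never disposed keeps the file open, which is one of the resource management concerns mentioned above.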

new CsvReader("data.csv")) ?

Hi. This simple example from your README.md doesn't appear to compile due to there being no constructor that takes a filename.

Am I missing something really obvious?

For now I'm just doing:

using (var streamReader = new StreamReader("foo.csv"))
using (CsvReader csvRdr = new CsvReader(streamReader))
{
}

Which is fine, but I'm not sure if the README just needs updating, if there was an intention to support the overload, or if I'm just missing a namespace using directive for some extension class.

Thanks!

Cannot build

I have VS 2013 and it is complaining about (load failed) projects and NuGet Package restore errors.

Not able to import the text containing quotes in the some of the use cases

I was trying to import a .csv file in which a string contains quotes, as shown in the examples below.
To read the file I was using KBCsv.dll (version 4.0.0.0).
CsvReader treats " as the ValueDelimiter, so the imported strings lose their quotes, as shown in Example 1.
Example 1: My"Example"String
Result of Example 1: MyExampleString

So we doubled the quotes ("") around that part of the string in order to import them.
Example 2: My""Example""String
Result of Example 2: MyExampleString

But when the string contains a dot (.), as in the example below, the quotes are not imported.
Example 3: S1.""opctags"".test8
Result of Example 3: S1.opctags.test8

Below is the entire text of the .csv file we tried to import:
Action Type,Point Name,Tag Name,Item Path,Min Scale,Max Scale,Unit,Data Type
Add,,,S1.""opctags"".test8,0,100,mm/s/lbf,float

One way to import the string in Example 3 is to override the ValueDelimiter.
Is there any other way to import this kind of text?
