thephpleague / csv
CSV data manipulation made easy in PHP
Home Page: https://csv.thephpleague.com
License: MIT License
Hi gang,
I installed last night with "league/csv": "~7.0"
in my composer.json, and today my builds are failing:
Installing league/csv (dev-master 0d3c28b)
Cloning 0d3c28b
0d3c28b is gone (history was rewritten?)
Failed to download league/csv from source: Failed to execute git checkout '0d3c28bf20ad26e1e23599e1746aaf7c680c0477' && git reset --hard '0d3c28bf20ad26e1e23599e1746aaf7c680c0477'
fatal: reference is not a tree: 0d3c28b
Now trying to download from dist
[Composer\Downloader\TransportException]
Could not authenticate against github.com
Did I screw something up?
Would it be possible to allow an empty string when setting the enclosure and the escape char? A client has a special format that, for some reason, must not use any characters other than the delimiter.
public function setEnclosure($enclosure = '"')
{
    if (1 != mb_strlen($enclosure)) {
        throw new InvalidArgumentException('The enclosure must be a single character');
    }
    $this->enclosure = $enclosure;
    return $this;
}

public function setEscape($escape = "\\")
{
    if (1 != mb_strlen($escape)) {
        throw new InvalidArgumentException('The escape character must be a single character');
    }
    $this->escape = $escape;
    return $this;
}
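A minimal sketch of how the setter could be relaxed to accept an empty string (a hypothetical change, not the library's current behavior; the class name CsvControls is made up for illustration):

```php
<?php
// Hypothetical relaxation of the setter: accept an empty string to mean
// "no enclosure", while still rejecting multi-character values.
class CsvControls
{
    protected $enclosure = '"';

    public function setEnclosure($enclosure = '"')
    {
        // mb_strlen('') === 0, so allow zero or one character instead of exactly one
        if (mb_strlen($enclosure) > 1) {
            throw new InvalidArgumentException(
                'The enclosure must be a single character or an empty string'
            );
        }
        $this->enclosure = $enclosure;
        return $this;
    }

    public function getEnclosure()
    {
        return $this->enclosure;
    }
}
```

One caveat: PHP's own fgetcsv()/fputcsv() expect a single-character enclosure, which is likely why the library enforces the strict check in the first place.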
I have the following csv file (without a csv header), line 1:
Surname;Name;;;;;3316157;12360000;"Bank name";DE49123600000003316157;GENODIF1DIR;12,456
With this line detectDelimiterList() can't find the correct delimiter:
$delimiters_list = $inputCsv->detectDelimiterList();
if (!$delimiters_list) {
    // no delimiter found
} elseif (1 == count($delimiters_list)) {
    $delimiter = $delimiters_list[0]; // the found delimiter
} else {
    // inconsistent CSV
    var_dump($delimiters_list); // all the delimiters found
}
shows:
array(2) {
[0]=>
string(1) ","
[1]=>
string(1) ";"
}
Using fetchAll() I get an extra row containing NULL in the result.
See following testcase:
<?php
require_once 'vendor/autoload.php';
use League\Csv\Reader;
$reader = Reader::createFromString("a,b".PHP_EOL."3,11");
$result = $reader->fetchAll();
var_export($result);
// array(0 => array(0 => 'a', 1 => 'b'), 1 => array(0 => '3', 1 => '11'), 2 => array(0 => NULL))
My environment:
Ubuntu 14.04.1 LTS
PHP 5.5.21-1+deb.sury.org~trusty+2
I didn't see it (maybe I missed it in the docs), but it would be nice if it were possible to iterate over associative arrays keyed by the first line.
something like:
$reader = \League\Csv\Reader::createFromPath('/file.csv');
$reader->setMode(Reader::MODE_ASSOC);
foreach ($reader as $row) {
    // here $row will be an array keyed with the names from the header
}
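Until such a mode exists, a similar effect can be had by combining the header row with each data row; a rough sketch using plain SplFileObject (the helper name iterateAssoc is made up; league/csv's own fetchAssoc() offers comparable behavior):

```php
<?php
// Sketch: iterate a CSV as associative arrays keyed by the header row.
function iterateAssoc($path)
{
    $file = new SplFileObject($path);
    $file->setFlags(SplFileObject::READ_CSV | SplFileObject::READ_AHEAD | SplFileObject::SKIP_EMPTY);

    $header = null;
    $rows = [];
    foreach ($file as $row) {
        if (!is_array($row) || $row === [null]) {
            continue; // skip empty trailing lines
        }
        if ($header === null) {
            $header = $row; // the first line becomes the keys
            continue;
        }
        $rows[] = array_combine($header, $row);
    }
    return $rows;
}
```

Usage would then be `foreach (iterateAssoc('/file.csv') as $row) { echo $row['name']; }`. A generator (`yield`) would avoid building the whole array, but the library targets PHP 5.4, which predates generators.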
I just discovered this library. So, after seeing I can iterate with a simple foreach, I tried with a very simple code:
<?php
require '../vendor/autoload.php';
use League\Csv\Reader;
$data = Reader::createFromPath('file.csv');
foreach ($data as $row) {
    print_r($row);
}
It works, but adds an empty element at the end of data, in the last iteration:
Array
(
[0] => 15581
[1] => 1
[2] => 1
[3] => 2
[4] => 339
[5] => 1400
)
Array
(
[0] =>
)
I definitely have to read the documentation, but I think this code should just work...
Let me clarify: I have ensured that there is no cause for an empty line. The file in this example has a single line and no more.
A very common scenario for me is:
My first naive approach to accomplish this with this package would look something like this:
$file = "result.csv";
$writer = new Writer();
$data = [/*...*/];
$writer->insertAll($data );
$writer->saveToCsv();
Unfortunately this does not seem to be possible at the moment, because Writer can only be instantiated via a from... method (right?).
My workaround looks like this:
$file = "result.csv";
Utility::writeEmptyFile($file);
$writer = Writer::createFromPath($file);
$data = [/*...*/];
$dataAsString = $writer->__toString();
Utility::writeContentToFile($file,$dataAsString);
So I'm forced to handle the file processing myself, which I'd really like to avoid. This becomes especially cumbersome when having to deal with different encodings (a problem you've already solved when reading csv files with your stream/filter-implementation). Do you plan on integrating this functionality? If not, why not :)?
Cheers
Pascal
I've been using this nice class for some time to read CSV files, but I recently noticed that some issues can happen with double quotes.
Here is an example of CSV file.
"Label","Login","Password","Web Site","Comments"
"Generic Bank #1","jdoe","superS3cret","https://www.genericbank.com","Checking accounts, etc"
"Generic Bank #2","jdoe","S3cret with spaces","https://www.genericbank.com","Checking accounts, etc"
"Generic Bank #2","jdoe","SpecialChars\\!"'@#$$%^&&*()@\\///\","https://www.genericbank.com","Checking accounts, etc"
"Retailer Chain #1","",""Secretstartingwithquotes","https://www.bigboxstore.com",""
"Retailer Chain #2","","'Secretstartingwithsinglequote","https://www.bigboxstore.com",""
"Retailer Chain #3","","'Twosinglequotes'","https://www.bigboxstore.com",""
"Health Care #1","jdoe","S3cretwithsinglequote'init","https://www.myhealthcare.com","Health care stuff"
"Health Care #2","jdoe","S3cretwithcomma,init","https://www.myhealthcare.com","Health care stuff"
"Health Care #3","jdoe","S3cretwithdoublequote"init","https://www.myhealthcare.com","Health care stuff"
Here is the output:
Array
(
[0] => Array
(
[Label] => "Label"
[Login] => Login
[Password] => Password
[Web site] => Web Site
[Comments] => Comments
)
[1] => Array
(
[Label] => Generic Bank #1
[Login] => jdoe
[Password] => superS3cret
[Web site] => https://www.genericbank.com
[Comments] => Checking accounts, etc
)
[2] => Array
(
[Label] => Generic Bank #2
[Login] => jdoe
[Password] => S3cret with spaces
[Web site] => https://www.genericbank.com
[Comments] => Checking accounts, etc
)
[3] => Array
(
[Label] => Generic Bank #2
[Login] => jdoe
[Password] => SpecialChars\\!'@#$$%^&&*()@\\///\"
[Web site] => https://www.genericbank.com
[Comments] => Checking accounts, etc
)
[4] => Array
(
[Label] => Retailer Chain #1
[Login] =>
[Password] => Secretstartingwithquotes"
[Web site] => https://www.bigboxstore.com
[Comments] =>
)
[5] => Array
(
[Label] => Retailer Chain #2
[Login] =>
[Password] => 'Secretstartingwithsinglequote
[Web site] => https://www.bigboxstore.com
[Comments] =>
)
[6] => Array
(
[Label] => Retailer Chain #3
[Login] =>
[Password] => 'Twosinglequotes'
[Web site] => https://www.bigboxstore.com
[Comments] =>
)
[7] => Array
(
[Label] => Health Care #1
[Login] => jdoe
[Password] => S3cretwithsinglequote'init
[Web site] => https://www.myhealthcare.com
[Comments] => Health care stuff
)
[8] => Array
(
[Label] => Health Care #2
[Login] => jdoe
[Password] => S3cretwithcomma,init
[Web site] => https://www.myhealthcare.com
[Comments] => Health care stuff
)
[9] => Array
(
[Label] => Health Care #3
[Login] => jdoe
[Password] => S3cretwithdoublequoteinit"
[Web site] => https://www.myhealthcare.com
[Comments] => Health care stuff
)
)
Take a look at the Password field and the different values: you will see that the passwords that contain double quotes are not handled correctly.
Here is how I'm launching the class.
$csv = new Reader($file);
$csv->setDelimiter(',');
$csv->setEnclosure('"');
$csv->setEscape('\\');
$csv->setFlags(SplFileObject::READ_AHEAD|SplFileObject::SKIP_EMPTY);
$res = $csv->fetchAssoc(['Label', 'Login', 'Password', 'Web site', 'Comments']);
print_r($res);
Can you help in any way?
Can we not have a develop branch for working on patch releases please? It's a bit odd, and makes the branch alias for getting dev changes pointless.
Taken from a comment on Reddit on the $open_mode validation:
the library check should be abandoned and we should let PHP internals deal with it.
Good morning gentlemen,
I've been fighting with this weird bug (?) right now:
UPDATE: This seems to be a problem in the Yii framework itself, not in thephpleague/csv (as I could reproduce the bug using PHP's native CSV functions alone). I'm unsure how to handle this issue now.
When using Yii framework 2.0.3 and outputting a .csv file created with thephpleague/csv 7.0 in the simplest, cleanest way, exactly like in the examples described, I always get a perfect file, but it has <?php
in the very first cell and puts the cell's content into double quotes.
Should be
test
But is
<?php"test"
My code (a standard controller in Yii2):
public function test()
{
    $writer = Writer::createFromFileObject(new \SplTempFileObject());
    $writer->insertOne(array("test", "", "test2")); // header line
    $writer->insertOne(array("123", "444")); // demo content line
    $writer->output('123.csv');
}
What I've tried:
Note:
I'm kindly asking for a short look if this could be a real bug or just something on my side.
This is a bug-report, not a help request :)
Cheers
<?php
use League\Csv\Writer;
//we fetch the info from a DB using a PDO object
$sth = $dbh->prepare(
"SELECT firstname, lastname, email FROM users LIMIT 200"
);
//because we don't want to duplicate the data for each row
// PDO::FETCH_NUM could also have been used
$sth->setFetchMode(PDO::FETCH_ASSOC);
$sth->execute();
//we create the CSV into memory
$csv = Writer::createFromFileObject(new SplTempFileObject());
//we insert the CSV header
$csv->insertOne(['firstname', 'lastname', 'email']);
// The PDOStatement Object implements the Traversable Interface
// that's why Writer::insertAll can directly insert
// the data into the CSV
$csv->insertAll($sth);
// Because you are providing the filename you don't have to
// set the HTTP headers yourself: Writer::output can
// set them for you.
// The file is downloadable.
$csv->output('users.csv');
Gives this error in lumen:
[Mon Apr 20 21:19:39 2015] PHP Fatal error: Class 'App\Http\Controllers\SplTempFileObject' not found in /Users/tim/Documents/lumen-api/API - source code/app/Http/Controllers/apiSoldiers.php on line 25
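The error suggests a namespace-resolution problem rather than a library bug; a small sketch of the likely cause and fix (the controller namespace below comes from the error message):

```php
<?php
// Inside a namespaced file (here App\Http\Controllers), an unqualified
// "new SplTempFileObject()" resolves to
// App\Http\Controllers\SplTempFileObject first -- which does not exist,
// hence the fatal error. Prefix the global class with a backslash
// (or add "use SplTempFileObject;" at the top of the controller):

$file = new \SplTempFileObject();
$file->fwrite("firstname,lastname,email\n");

// The controller call would then read:
// $csv = Writer::createFromFileObject(new \SplTempFileObject());
```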
When passing an array containing null to $writer->insertOne()
an InvalidArgumentException is thrown with message:
the provided data can not be transform into a single CSV data row
Example:
$writer->insertOne(["one", "two", null, "four"]);
This forces the client code to "sanitize" nulls before inserting.
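One workaround on the client side is to map nulls to empty strings before inserting; a small sketch (the helper name sanitizeRow is made up):

```php
<?php
// Sketch: coerce nulls to empty strings before handing rows to insertOne(),
// so ["one", "two", null, "four"] becomes ["one", "two", "", "four"].
function sanitizeRow(array $row)
{
    return array_map(function ($value) {
        return null === $value ? '' : $value;
    }, $row);
}

// $writer->insertOne(sanitizeRow(["one", "two", null, "four"]));
```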
As of now the library does not handle CSV encoded in different charsets well. To resolve this issue we can use PHP stream filter functions. But those functions are not restricted to encoding problems, which means we can enhance the library by providing a generic solution to apply PHP stream filtering capabilities to the CSV data.
A work in progress has already landed on the stream branch. This work needs reviewing before being merged into master. This feature is slated for inclusion in 5.4.
You can see a practical use of the stream filtering capabilities by viewing the stream example source code.
Feedback is welcome.
When using the offset/limit for pagination, it would be useful if the library had a count() method to quickly determine the total rows without having to fetchAll() including the data memory overhead it comes with.
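In the meantime, rows can be counted without materializing them all; a rough sketch using plain SplFileObject (the helper name countCsvRows is made up, and the flags shown are an assumption about the file's shape):

```php
<?php
// Sketch: count CSV rows without loading them all into memory.
// Each iteration yields one parsed row; the whole file is never held at once.
function countCsvRows($path)
{
    $file = new SplFileObject($path);
    $file->setFlags(SplFileObject::READ_CSV | SplFileObject::READ_AHEAD | SplFileObject::SKIP_EMPTY);

    $count = 0;
    foreach ($file as $row) {
        if (is_array($row) && $row !== [null]) {
            $count++;
        }
    }
    return $count;
}
```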
I came across a weird behaviour while evaluating this package. When reading a CSV file or string via Reader, the first newline in a value of each line is ignored.
Code to reproduce:
$input = <<<EOS
"line 1 field 1 with crlf: > \r\n < second crlf: > \r\n <, line 1 field 2 with crlf: > \r\n < second crlf: > \r\n <"
"line 2 field 1 with crlf: > \r\n < second crlf: > \r\n <, line 2 field 2 with crlf: > \r\n < second crlf: > \r\n <"
EOS;
$reader = Reader::createFromString($input);
$reader->setNewline("\r\n");
foreach($reader as $r){
var_dump($r);
}
Output
array(1) {
[0] =>
string(104) "line 1 field 1 with crlf: > < second crlf: >
<, line 1 field 2 with crlf: >
< second crlf: >
<"
}
array(1) {
[0] =>
string(104) "line 2 field 1 with crlf: > < second crlf: >
<, line 2 field 2 with crlf: >
< second crlf: >
<"
}
Please note the first "> <" (without \r\n) in the beginning of each csv line in the output.
The problem only occurs when SplFileObject::DROP_NEW_LINE
is defined as a flag -- unfortunately I could not find a way to "unset" this flag from outside the class, since it is automatically added in the Controls
class when setting the flags:
/**
* Set the Flags associated to the CSV SplFileObject
*
* @param int $flags
*
* @throws \InvalidArgumentException If the argument is not a valid integer
*
* @return $this
*/
public function setFlags($flags)
{
    if (false === filter_var($flags, FILTER_VALIDATE_INT, ['options' => ['min_range' => 0]])) {
        throw new InvalidArgumentException('you should use a `SplFileObject` Constant');
    }
    $this->flags = $flags | SplFileObject::READ_CSV | SplFileObject::DROP_NEW_LINE;
    return $this;
}
Could you confirm this behaviour? To me, this looks like a bug - although I'm not sure if I'm missing something.
Cheers
Pascal
I'm trying to import a large CSV file with 600,000+ records; do you have an example for this?
A function to chop the file into smaller parts and process them in memory, like this, would be nice:
function file_get_contents_chunked($file, $chunk_size, $callback)
{
    try {
        $handle = fopen($file, "r");
        $i = 0;
        while (!feof($handle)) {
            call_user_func_array($callback, array(fread($handle, $chunk_size), &$handle, $i));
            $i++;
        }
        fclose($handle);
    } catch (Exception $e) {
        trigger_error("file_get_contents_chunked::" . $e->getMessage(), E_USER_NOTICE);
        return false;
    }
    return true;
}

$success = file_get_contents_chunked("my/large/file", 4096, function ($chunk, &$handle, $iteration) {
    /*
     * Do what you will with the {$chunk} here.
     * {$handle} is passed in case you want to seek
     * to different parts of the file.
     * {$iteration} is the section of the file that has been read, so
     * ($iteration * 4096) is your current offset within the file.
     */
});

if (!$success) {
    // It failed
}
Got it from this question:
I deal with tab delimited files with no enclosures on a regular basis. There seems to be no way to configure the Reader to parse these files. If I try to setEnclosure(null) or setEnclosure(""), it throws an exception.
Hello.
Please add a config option to write a BOM (\xEF\xBB\xBF) at the beginning of the output file.
It solves the problem of opening a UTF-8 encoded CSV file in MS Excel. Refer to http://stackoverflow.com/questions/4348802/how-can-i-output-a-utf-8-csv-in-php-that-excel-will-read-properly
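Until such an option exists, the BOM can be prepended manually before sending the CSV body; a sketch (the helper name csvWithBom is made up):

```php
<?php
// Sketch: prepend a UTF-8 BOM so Excel detects the encoding correctly.
define('BOM_UTF8', "\xEF\xBB\xBF");

function csvWithBom($csvContent)
{
    // Avoid doubling the BOM if the content already starts with one.
    if (substr($csvContent, 0, 3) === BOM_UTF8) {
        return $csvContent;
    }
    return BOM_UTF8 . $csvContent;
}

// header('Content-Type: text/csv; charset=UTF-8');
// echo csvWithBom($writer->__toString());
```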
As of now the Reader::fetch* methods all return an array. This behavior is okay for Reader::fetchOne, as it always returns one row. In any other case these methods can lead to intensive memory usage.
To fix this issue we could:
- return an iterator instead of an array. In this respect the Reader::query method is an equivalent to the Reader::fetchAll method and they are both interchangeable, since the result of a CSV is often used in a foreach loop. IMHO this should not affect regular usage.
- use the yield keyword. The only drawback is that generators are only present in PHP 5.5+ and the library is PHP 5.4 compliant. If this solution is to be implemented, how should we treat PHP 5.4 users?
Hi!
In docs page http://csv.thephpleague.com/overview/ we have an example:
$reader->setDelimeter(',');
But the correct method name is setDelimiter();
Typo of "tree" instead of "three" in Reading section of docs. PR coming soon. Not particularly exciting but every little helps!
For example, someone uploaded a CSV with a couple of empty header fields at the end of the CSV, and the following exception was thrown:
if (! $this->isValidAssocKeys($res)) {
    throw new InvalidArgumentException(
        'Use a flat non empty array with unique string values'
    );
}
There's not much here that lets me differentiate it from another InvalidArgumentException other than the message, which I assume is not covered by BC?
The IteratorQuery trait enables sorting the CSV, but as of now you can only set one sorting condition per query.
The current IteratorQuery::setSortBy behaviour is somewhat difficult to understand, so I'm thinking this method needs an important rewrite to enable multiple conditions.
If we register multiple sorting conditions:
Hello,
As I see from the docs, UTF-8 should be supported without problems, but I am hitting some.
I also tried adding mb_internal_encoding("UTF-8"); at the top of the file (since this sometimes helps).
Basically the code is really simple:
$writer = \League\Csv\Writer::createFromFileObject(new SplTempFileObject());
$writer->setNewline("\r\n");
$writer->setEncodingFrom("UTF-8"); // this probably is not needed?
$headers = ["Title", "Start date", "End date", "Time", "Address", "Place", "Organizer", "Text", "Contact person", "Email", "Phone"];
$writer->insertOne($headers);
foreach ($items as $item) {
    $data = [
        strip_tags($item['title']),
        format_date($item['date_at']),
        format_date($item['expires_at']),
        $item['start_time'],
        $item['address'],
        $item['place'],
        $item['organizer'],
        strip_tags($item['body']),
        $item['user']['full_name'],
        $item['user']['email'],
        $item['user']['phone'],
    ];
    $writer->insertOne($data);
}
header('Content-Type: text/csv; charset=UTF-8');
$filename = 'events-'.format_date(time()).'.csv';
$writer->output($filename);
But what's strange: when I open the file using Sublime I see all those characters correctly, yet opening the same file in Excel 2014, for instance, gives me invalid characters. Any ideas?
Thank you!
thanks
Hi, I just pulled the latest version of the library and tried to integrate it into my application, but it fails at the beginning when I do
$stream = Reader::createFromPath('/home/nsitbon/export_members_20150217.csv');
if ($stream->isActiveStreamFilter()) {
    $stream->appendStreamFilter('convert.iconv.UTF-16/UTF-8//TRANSLIT');
}
it throws the following exception :
[RuntimeException]
SplFileObject::__construct(): unable to create or locate filter "convert.iconv.UTF-16"
whereas this code works perfectly
$stream = fopen('/home/nsitbon/export_members_20150217.csv', 'r');
stream_filter_append($stream, 'convert.iconv.UTF-16/UTF-8//TRANSLIT', STREAM_FILTER_READ);
any ideas?
Exporting data to CSV is made easy with the Writer::insertAll method; the same is not true if we want to import data from CSV into another storage medium. Right now you need to take the result of one of the Reader::fetch* methods and iterate over it to import your data. This can be memory expensive, and you end up doing 2 iterations!
I'm introducing the Reader::each method, which will help ease CSV data import by applying a callback to each CSV row:
As with all the other Reader::fetch* methods, this new method can be modified using the filter methods.
I have a simple function which writes a CSV file:
public function writeCsv($csvFilename, $rows)
{
    $writer = CSV\Writer::createFromPath(new \SplFileObject($csvFilename, 'w+'), 'w');
    // header
    $writer->insertOne(array_keys($rows[0]));
    // rows
    $writer->insertAll($rows);
    $fileSize = filesize($csvFilename);
    $this->logger->info("csv file $csvFilename written size=$fileSize");
}
My problem is that $fileSize is equal to 0, even though at the end of execution the CSV file is well flushed.
Is there a way to get the size of the file?
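One likely explanation is PHP's stat cache: filesize() called right after writing can return a stale value (often 0) until clearstatcache() is called. A small sketch (the helper name freshFilesize is made up):

```php
<?php
// Sketch: PHP caches stat results, so filesize() right after writing can
// report a stale (often zero) size. clearstatcache() forces a fresh lookup.
function freshFilesize($path)
{
    clearstatcache(true, $path);
    return filesize($path);
}

$tmp = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($tmp, "col1,col2\n");
$size = freshFilesize($tmp); // 10 bytes for this content
```

If the size is still 0, the writer may be holding an unflushed buffer; releasing the Writer instance (e.g. unset()) before calling filesize() can also help.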
When the CSV file is US-ASCII, containing only character codes from roughly 33 to 127, the import works fine.
With other localisations (non-English languages), letters like "óęłńśżźŻŹĆÓŁĘŚŃ" make all fields in the line come back empty (null), even if only one "strange" letter exists. After removing that letter, it works.
Code is simple:
if (move_uploaded_file($this->data['Payment']['file']['tmp_name'], $filename)) {
    $csv = Reader::createFromPath($filename);
    $headers = $csv->fetchOne(); // -----> works on first line
    echo debug($headers);
    $data = $csv->setOffset(0)->setLimit(2)->fetchAll(); // ----> not working
    echo debug($data);
}
Source file is:
--------------------cut here-------------
2014-09-09;2014-09-09;PRZELEW ZEWNETRZNY PRZYCHODZACY;"OPLATA ZA KURS, MICHAL ";"ANDRZEJ JERZY UL. 11 LISTOPADA 08-110 SIEDLCE";'6575676646444666444545454545';200,00;250,67;
2014-09-09;2014-09-09;PRZELEW ZEWN TRZNY PRZYCHODZ•CY;"OP£ATA ZA KURS, JAKUB ";"ANDRZEJ JERZY UL. 11 LISTOPADA 08-110 SIEDLCE";'4435446435354657656575634534534534535';200,00;450,67;
-----------cut-here-------------
Using a big file of 75MB we are getting the following error:
Fatal error in module Reader:
Allowed memory size of 134217728 bytes exhausted (tried to allocate 81 bytes)
Now I am going to play with @ini_set('memory_limit', '-1'); and see what happens. Please let me know if there is something I am missing to improve my query.
Thanks for your time and happy hacking!
This is the same piece of code that works but only using small files:
public function convertItemsFile($files)
{
    $itemsFileName = 'items_file.csv';
    $items = $this->fileExists($files, 'shortname', $itemsFileName);
    if ($items) {
        $input = $items[0]['name'];
        $csvItems = Reader::createFromPath($input);
        $headers = $csvItems->fetchOne();
        $dataItems = $csvItems
            ->setEncodingFrom('UTF-8')
            ->setDelimiter(',')
            ->setOffset(1)
            ->addFilter(array($this, 'filterItemsByStyleCodes'))
            ->fetchAssoc();
        $output = $this->_csvFilesPath . 'vp-' . $items[0]['shortname'];
        @unlink($output);
        touch($output);
        $csvItems = Writer::createFromPath($output);
        $csvItems->insertOne($headers);
        foreach ($dataItems as $row) {
            $csvItems->insertOne([
                $row['company'],
                $row['item-number'],
                $row['description'],
                $row['style-code']
            ]);
        }
    }
}
Hello,
I have a csv file from which I don't want to use the first 3 lines then I want to convert it to an html table.
So I used something like this:
$inputCsv = Reader::createFromString(.....);
$inputCsv->setDelimiter("\t");
$inputCsv->setEncodingFrom("UTF-8");
$inputCsv->setOffset(3);
echo $inputCsv->toHTML('table table-striped');
But the generated table contains the first 3 lines of my CSV file.
Bye,
Hervé
I have a .csv from Google AdWords that's actually tab-separated. This is the default in Google AdWords, not sure why. But when I var_dump() the array, I get a bunch of characters like this:
Here is the .csv I'm trying to use:
https://dl.dropboxusercontent.com/u/30874695/Keyword%20Planner%202015-03-27%20at%2011-13-26.csv
Again, this is straight from Google AdWords, and hasn't been opened/saved in Excel. I've confirmed the strings are 'ASCII'. Here's the code I'm using to read the csv:
$reader = Reader::createFromPath($csv_file);
$reader->setFlags(\SplFileObject::READ_AHEAD|\SplFileObject::SKIP_EMPTY);
$detect_delimiter = $reader->detectDelimiterList();
$delimiter = array_shift($detect_delimiter);
$reader->setDelimiter($delimiter);
$data = $reader
->setOffset(1)
->fetchAssoc($this->csv_headers);
// Remove duplicate entries
$data = array_map('unserialize', array_unique(array_map('serialize', $data)));
since it could mislead readers into wrong documentation
The IteratorQuery trait enables filtering the CSV, but as of now you can only set one filter per query. So if you want a complex filtering condition, you have to register one very complex function.
It would be interesting to be able to set as many filtering conditions as we want, so that the user can register small yet more readable filters.
If we register multiple filters:
I'm trying to read a CSV file with 150k rows. My server isn't massive, so I am trying to do this in a way that handles memory properly; I can't just return all rows as an array.
What is the best way to loop over this? I tried using a loop like this: foreach ($data as $lineIndex => $row)
but ran into memory issues.
I can create a PHP loop and then use ->fetchOne() row at a time, but I believe this 'rewinds' the file after every read, so if I am grabbing row 140,000 it takes 10 seconds to return the data. Any ideas for processing large files?
I am okay with storing the last line read in my cache, and then using ->fetchOne but is there a way to prevent the long seek delay?
I just updated from 7.0.1 to 7.1.0 and noticed that my internal tests failed due to a null value appended at the end of each CSV file that was imported. After some trial and error I figured out that the SplFileObject flags no longer seem to have an effect.
I'm missing the SplFileObject::SKIP_EMPTY flag in particular, since it seemed to make sure that there's no null value at the end if the file terminates on a newline.
See the following test script:
<?php
use League\Csv\Reader;
include __DIR__."/vendor/autoload.php";
$path = __DIR__."/tmp.txt";
$str = "1st\n2nd\n";
$obj = new SplFileObject($path,"w+");
$obj->fwrite($str);
$obj = new SplFileObject($path,"r");
$reader = Reader::createFromFileObject($obj);
$flags = [
"NONE" => 0,
"READ_AHEAD" => SplFileObject::READ_AHEAD,
"READ_AHEAD | DROP_NEW_LINE" => SplFileObject::READ_AHEAD | SplFileObject::DROP_NEW_LINE,
"READ_AHEAD | SKIP_EMPTY" => SplFileObject::READ_AHEAD | SplFileObject::SKIP_EMPTY,
"READ_AHEAD | DROP_NEW_LINE | SKIP_EMPTY" => SplFileObject::READ_AHEAD | SplFileObject::DROP_NEW_LINE | SplFileObject::SKIP_EMPTY,
"DROP_NEW_LINE" => SplFileObject::DROP_NEW_LINE ,
"DROP_NEW_LINE | SKIP_EMPTY" => SplFileObject::DROP_NEW_LINE | SplFileObject::SKIP_EMPTY,
"SKIP_EMPTY" => SplFileObject::SKIP_EMPTY ,
];
foreach ($flags as $flagName => $flag) {
    $reader->setFlags($flag);
    $lines = $reader->fetchAll();
    $vals = [];
    foreach ($lines as $line) {
        if ($line == null) {
            $vals[] = "<null>";
        } elseif (count($line)) {
            $val = array_shift($line);
            if ($val == null) {
                $val = "<null>";
            }
            $vals[] = $val;
        }
    }
    echo count($lines)." lines [".implode(', ', $vals)."]\tfor ".$flagName."\n";
}
Output in 7.0.1
3 lines [1st, 2nd, <null>] for NONE
3 lines [1st, 2nd, <null>] for READ_AHEAD
3 lines [1st, 2nd, <null>] for READ_AHEAD | DROP_NEW_LINE
2 lines [1st, 2nd] for READ_AHEAD | SKIP_EMPTY
2 lines [1st, 2nd] for READ_AHEAD | DROP_NEW_LINE | SKIP_EMPTY
3 lines [1st, 2nd, <null>] for DROP_NEW_LINE
2 lines [1st, 2nd] for DROP_NEW_LINE | SKIP_EMPTY
2 lines [1st, 2nd] for SKIP_EMPTY
Please note the lines with SplFileObject::SKIP_EMPTY
set: Those contain only 2 values (as expected).
Output in 7.1.0
3 lines [1st, 2nd, <null>] for NONE
3 lines [1st, 2nd, <null>] for READ_AHEAD
3 lines [1st, 2nd, <null>] for READ_AHEAD | DROP_NEW_LINE
3 lines [1st, 2nd, <null>] for READ_AHEAD | SKIP_EMPTY
3 lines [1st, 2nd, <null>] for READ_AHEAD | DROP_NEW_LINE | SKIP_EMPTY
3 lines [1st, 2nd, <null>] for DROP_NEW_LINE
3 lines [1st, 2nd, <null>] for DROP_NEW_LINE | SKIP_EMPTY
3 lines [1st, 2nd, <null>] for SKIP_EMPTY
Regardless of the flags, the output is always the same (3 lines, the last one null). I tried to figure out what changed between those versions but was unable to identify the cause - so this is my last desperate attempt to understand what's going on :)
It would be a really nice feature if both the Reader and the Writer could check whether the row lengths in the file are consistent. The Writer should probably throw an exception if a strict check is enabled and someone tries to write a row that is not consistent with the file.
I've created my own CSV package, as there were no decent CSV packages at that time. You can check it here. The row consistency check is implemented there.
Hello,
First, thanks for sharing your code.
By default, why not use the CSV file's first line to set the keys when we are doing a fetchAssoc() without keys?
Actually, we need to do:
$csv = Reader::createFromPath(dirname(__FILE__).'/my.csv');
$header = $csv->fetchOne();
$res = $csv->addFilter(function ($row, $index) { return $index > 0; })
    ->fetchAssoc($header);
I will propose a pull request in a few minutes.
We could use grunt to register precommit hooks that check for PSR-2 violations. It will allow collaborators to commit only if all the conditions pass.
Precommit hooks can, however, be overridden with --no-verify to skip the checks.
I will submit a PR if this is approved.
I tried installing your library and it didn't work... I "self-update"d composer, just in case and I still got the error which is kinda weird and it has never happened to me before... :)
C:\Zampps\www\building-blocks\databases-flat\LeagueCSV>composer require league/csv
Using version ~6.0 for league/csv
./composer.json has been created
Loading composer repositories with package information
Updating dependencies (including require-dev)
Hey,
I came across another problem when attempting to write a large CSV file which is somewhat related to #81
Code to reproduce
function writeFile($file, $content)
{
    $spl = new SplFileObject($file, "w");
    $spl->fwrite($content);
    $spl = null;
}
$writer = Writer::createFromString("");
$cols = 100;
$rows = 1000;
$chars = 100;
$text = implode(" ", array_fill(0, $chars, "a"));
$row = array_fill(0, $cols, $text);
$lines = array_fill(0, $rows, $row);
echo "Using ".memory_get_usage()/(1024*1024)." MB after creating data\n";
$writer->insertAll($lines);
echo "Using ".memory_get_usage()/(1024*1024)." MB after inserting data\n";
$str = $writer->__toString();
echo "Using ".memory_get_usage()/(1024*1024)." MB after converting data to string\n";
writeFile(__DIR__."/test2.csv",$str);
echo "Using ".memory_get_usage()/(1024*1024)." MB after writing data to file\n";
Output
Using 0.60456848144531 MB after creating data
Using 0.60501861572266 MB after inserting data
Using 19.885452270508 MB after converting data to string
Using 19.885543823242 MB after writing data to file
As you can see, I need to somehow "get" the CSV content (using $str = $writer->__toString();
) in order to write it to a .csv file.
The resulting string can be very large which leads to a massive increase in used memory (0.6MB => 19.8MB).
Imho it should be possible to perform chunked (as in line by line) writing, like this:
function writeCsv($file, $lines)
{
    $spl = new SplFileObject($file, "w");
    foreach ($lines as $line) {
        $spl->fputcsv($line);
    }
    $spl = null;
}
$cols = 100;
$rows = 1000;
$chars = 100;
$text = implode(" ", array_fill(0, $chars, "a"));
$row = array_fill(0, $cols, $text);
$lines = array_fill(0, $rows, $row);
echo "Using ".memory_get_usage()/(1024*1024)." MB after creating data\n";
writeCsv(__DIR__."/test3.csv",$lines);
echo "Using ".memory_get_usage()/(1024*1024)." MB after writing data to file\n";
Output
Using 0.37704467773438 MB after creating data
Using 0.37732696533203 MB after writing data to file
Since it uses SplFileObject internally, one could also use the filter API in order to apply a streaming filter (e.g. for encoding conversion). I feel the Writer would greatly benefit from a Writer::writeToFile method.
Any thoughts on that?
Cheers
Pascal
Hello,
I've tried out your and many other CSV libraries in the hope of finding one that automatically detects the delimiter, and sadly none of the libraries I've tried have had a go at this.
I find this odd, since it seems like a really useful abstraction that I, as the consumer of the library, don't want to worry about. The delimiter is a ';'? Bakame should have my back without me having to set this manually.
Just grabbing the first line and counting occurrences of valid delimiters for a CSV file should cover 95% of the cases. OpenOffice lists tab, comma, semicolon and space as valid delimiters for CSV.
I'm guessing that if space is the delimiter then you have to use an enclosure for it to work.
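The heuristic described above can be sketched in a few lines (the helper name guessDelimiter is made up; delimiters inside enclosed fields will skew the counts, which is one reason a library might hesitate to rely on this alone):

```php
<?php
// Sketch: count candidate delimiters on the first line and pick the most
// frequent one. Covers simple cases only.
function guessDelimiter($firstLine, array $candidates = [",", ";", "\t", " "])
{
    $best = null;
    $bestCount = 0;
    foreach ($candidates as $candidate) {
        $count = substr_count($firstLine, $candidate);
        if ($count > $bestCount) {
            $best = $candidate;
            $bestCount = $count;
        }
    }
    return $best; // null when no candidate appears at all
}
```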
Hello,
I am trying to create some CSV rows as in the example:
$data = [
['"1","name", "surname"'],
['"2","name", "surname"'],
['"3","name", "surname"']
];
I want to add each row as a single line, but the output is the following:
"""1"",""name"", ""surname"""
"""2"",""name"", ""surname"""
"""3"",""name"", ""surname"""
I want the output to be the following:
"1","name","surname"
"2","name","surname"
"3","name","surname"
etc.
I get more " characters than expected.
Am I doing something wrong?
Thanks
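The doubled quotes most likely come from the data shape: each row above is a single string that already contains quotes and commas, so the writer treats the whole string as one field and escapes every embedded quote. Passing the fields as separate array elements lets the writer handle the quoting itself; a sketch using plain fputcsv() (the same mechanism the library relies on):

```php
<?php
// Each row must be an array of fields, not one pre-quoted string.
// Wrong: ['"1","name", "surname"']  -- one field containing quotes/commas
// Right: ['1', 'name', 'surname']   -- three fields the writer can quote
$data = [
    ['1', 'name', 'surname'],
    ['2', 'name', 'surname'],
    ['3', 'name', 'surname'],
];

$handle = fopen('php://temp', 'r+');
foreach ($data as $row) {
    fputcsv($handle, $row);
}
rewind($handle);
$output = stream_get_contents($handle);
fclose($handle);
// $output is now "1,name,surname\n2,name,surname\n3,name,surname\n"
```

Note that fputcsv() only adds the enclosure when a field contains the delimiter, the enclosure, or a newline, so plain fields like these come out unquoted; most CSV consumers accept that.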
It would seem that HHVM supports CallbackFilterIterator since version 3.2. Yet the library test suite still does not pass on HHVM. We should investigate the failed tests and fix this issue once and for all.
The href link on some elements:
when I click the a tag, it opens my browser and goes to the target web site page?
I'm trying to create a new CSV file and write it to disk so that it can be emailed. This may not be an issue with the package but may be an issue with the way I am doing things.
Code:
// create empty file to store CSV data (is there a better way of doing this?)
$handle = fopen($filepath, "w");
fwrite($handle, "");
fclose($handle);
// create new writer instance
$csv = League\Csv\Writer::createFromPath(new SplFileObject($filepath));
// insert data from array
$csv->insertAll($csv_data);
The resulting CSV has all the data in it as expected but there is an additional line with no entry on it.
This is a problem because the resulting CSV has to be uploaded into an Oracle-based system and it does not like empty lines in CSV files.
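The extra line is most likely just the newline that terminates the last record, which every row write appends. If the consuming system rejects it, the final newline can be trimmed before saving; a sketch (the helper name writeCsvWithoutTrailingNewline is made up):

```php
<?php
// Sketch: each CSV record ends with a newline, so the file appears to end
// with one empty line. Build the CSV in memory, trim the trailing newline,
// then write the result to disk for consumers that reject it.
function writeCsvWithoutTrailingNewline($filepath, array $rows)
{
    $handle = fopen('php://temp', 'r+');
    foreach ($rows as $row) {
        fputcsv($handle, $row);
    }
    rewind($handle);
    $content = rtrim(stream_get_contents($handle), "\r\n");
    fclose($handle);

    file_put_contents($filepath, $content);
}
```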
Currently commits are being pushed to master directly.
It would help a lot, for following your work on this package, to create PRs even if you are the top lead or maintainer; this is how it is done in FOSS. I respect your decision if you ignore this request, though it would be a great deal of help to get PR notifications and to help review contributions.
Thank you
The docs say that we should rely on the extracting methods to remove the BOM character while the CSV is read, but that doesn't seem to be the case.
Even though the BOM could be detected, it is not automatically removed, but rather included in the first column read.
I've put together an example here to illustrate what I'm doing:
http://runnable.com/VRYJNJ-c0wNVqTc6/parsing-bom-character-for-php
Am I missing something, or is it really up to the developer to manually remove it?
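If it does turn out to be up to the developer, the strip itself is small; a sketch (the helper name stripUtf8Bom is made up):

```php
<?php
// Sketch: detect and strip a UTF-8 BOM manually, since the reader hands it
// back as part of the first column of the first row.
function stripUtf8Bom($value)
{
    if (substr($value, 0, 3) === "\xEF\xBB\xBF") {
        return substr($value, 3);
    }
    return $value;
}

// After fetching the first row:
// $row[0] = stripUtf8Bom($row[0]);
```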
When writing a file with an empty enclosure, the class replaces every space in the data with two spaces. For example, a pipe delimited file:
$fileWriter->setDelimiter("|");
$fileWriter->setEnclosure(" ");
exports this to the file (notice the multiple spaces):
Rapala - Pro Guide Electric Fillet Knife|more data
instead of the actual data:
Rapala - Pro Guide Electric Fillet Knife|more data
(Github seems to be mangling the double spaces in my example)