CSV for Comma-Separated Values is a well-established file format to import and export data.
In this tutorial, we will learn how to read and write CSV files in PHP with the help of examples.
We'll also explore how to perform the most common manipulation such as filtering, sorting, and transforming the data.
Last but not least, we'll learn how to parse a CSV file of 1 million lines in PHP in a fast and memory-efficient manner.
I used PHP8 while coding and testing the examples:
- The examples with native functions are compatible with very old PHP versions.
- The examples using
league/csv
require a PHP7 version as a minimum. - The whole code and examples are available in a GitHub repository; it comes packaged as a simple Symfony project and provides a Docker image.
Our Example of CSV File
We'll use a CSV file containing data about movies in this tutorial.
Here is a sample of this data set in plain text:
id,title,poster,overview,release_date,genres
181808,"Star Wars: The Last Jedi",https://.../mWII.jpg,"Rey develops her newly discovered abilities with the guidance of Luke Skywalker, [...]",1513123200,"Documentary"
383498,"Deadpool 2",https://.../3VAd.jpg,"Wisecracking mercenary Deadpool battles the evil and powerful Cable and other bad[...]",1526346000,"Action, Comedy, Adventure"
157336,"Interstellar",https://.../BvIx.jpg,"Interstellar chronicles the adventures of a group of explorers who make use of a [...]",1415145600,"Adventure, Drama, Science Fiction"
Usually, in the CSV format:
- The first line contains the headers
- Each following line is a data row
- A comma separates each cell in a row
- A double-quote encapsulate cells containing text
Sometimes, we can find different characters to separate the data, encapsulate strings, or escape special characters.
Let's look at our sample file in a tabular way:
id | title | poster | overview | release_date | genres |
---|---|---|---|---|---|
181808 | Star Wars: The Last Jedi | https://.../mWII.jpg | Rey develops her newly discovered abilities with the guidance of Luke Skywalker, [...] | 1513123200 | Documentary |
383498 | Deadpool 2 | https://.../3VAd.jpg | Wisecracking mercenary Deadpool battles the evil and powerful Cable and other bad[...] | 1526346000 | Action, Comedy, Adventure |
157336 | Interstellar | https://.../BvIx.jpg | Interstellar chronicles the adventures of a group of explorers who make use of a [...] | 1415145600 | Adventure, Drama, Science Fiction |
Read and Write CSV Files (Old-Fashioned Way)
Read a CSV File Using fgetcsv
The steps to read a CSV file in PHP with native functions:
- Open the file with
fopen
using the read mode - Parse the CSV rows from the file with
fgetcsv
- Close the file with
fclose
$path = 'data/movies-100.csv';
$handle = fopen($path, "r"); // open in readonly mode
while (($row = fgetcsv($handle)) !== false) {
var_dump($row);
}
fclose($handle);
It loads each row as an array; the first row contains the headers.
// the headers
array(6) {
[0]=>
string(2) "id"
[1]=>
string(5) "title"
[2]=>
string(6) "poster"
[3]=>
string(8) "overview"
[4]=>
string(12) "release_date"
[5]=>
string(6) "genres"
}
// another row
array(6) {
[0]=>
string(3) "807"
[1]=>
string(5) "Se7en"
[2]=>
string(63) "https://image.tmdb.org/t/p/w500/6yoghtyTpznpBik8EngEmJskVUO.jpg"
[3]=>
string(390) "Two homicide detectives are on a desperate hunt for a serial killer whose crimes are based on the 'seven deadly sins' in this dark and haunting film that takes viewers from the tortured remains of one victim to the next. The seasoned Det. Sommerset researches each sin in an effort to get inside the killer's mind, while his novice partner, Mills, scoffs at his efforts to unravel the case."
[4]=>
string(9) "811731600"
[5]=>
string(24) "Crime, Mystery, Thriller"
}
In some cases, we want to manipulate rows independently, let's transform each CSV row into an associative array:
// Parse the rows
$rows = [];
$handle = fopen($path, "r");
while (($row = fgetcsv($handle)) !== false) {
$rows[] = $row;
}
fclose($handle);
// Remove the first one that contains headers
$headers = array_shift($rows);
// Combine the headers with each following row
$array = [];
foreach ($rows as $row) {
$array[] = array_combine($headers, $row);
}
var_dump($array);
We now have each row as an associative array, with headers as indexes of each array. We can manipulate the rows independently as simple JSON objects.
array(100) {
[0]=>
array(6) {
["id"]=>
string(6) "287947"
["title"]=>
string(7) "Shazam!"
["poster"]=>
string(63) "https://image.tmdb.org/t/p/w500/xnopI5Xtky18MPhK40cZAGAOVeV.jpg"
["overview"]=>
string(98) "A boy is given the ability to become an adult superhero in times of need with a single magic word."
["release_date"]=>
string(10) "1553299200"
["genres"]=>
string(23) "Action, Comedy, Fantasy"
}
// more rows
Write a CSV File Using fputcsv
The steps to create a CSV file in PHP with native functions:
- Open the file with
fopen
using the write mode - Write each array in a CSV row with
fputcsv
- Close the file with
fclose
$rows = [
['id', 'title', 'poster', 'overview', 'release_date', 'genres'],
[181808, "Star Wars: The Last Jedi", "https://image.tmdb.org/t/p/w500/kOVEVeg59E0wsnXmF9nrh6OmWII.jpg", "Rey develops her newly discovered abilities with the guidance of Luke Skywalker, who is unsettled by the strength of her powers. Meanwhile, the Resistance prepares to do battle with the First Order.", 1513123200, "Documentary"],
[383498, "Deadpool 2", "https://image.tmdb.org/t/p/w500/to0spRl1CMDvyUbOnbb4fTk3VAd.jpg", "Wisecracking mercenary Deadpool battles the evil and powerful Cable and other bad guys to save a boy's life.", 1526346000, "Action, Comedy, Adventure"],
[157336, "Interstellar", "https://image.tmdb.org/t/p/w500/gEU2QniE6E77NI6lCU6MxlNBvIx.jpg", "Interstellar chronicles the adventures of a group of explorers who make use of a newly discovered wormhole to surpass the limitations on human space travel and conquer the vast distances involved in an interstellar voyage.",1415145600,"Adventure, Drama, Science Fiction"]
];
$path = 'data/new-file.csv';
$fp = fopen($path, 'w'); // open in write only mode (write at the start of the file)
foreach ($rows as $row) {
fputcsv($fp, $row);
}
fclose($fp);
The default separator is the comma, we can specify different parameters such as separator
, enclosure
, escape
and eol
when using fgetcsv
and fputcsv
functions.
Append New Rows in a CSV File Using fputcsv
Let's append rows at the end of an existing file:
$rows = [
['id', 'title', 'poster', 'overview', 'release_date', 'genres'],
[181808, "Star Wars: The Last Jedi", "https://image.tmdb.org/t/p/w500/kOVEVeg59E0wsnXmF9nrh6OmWII.jpg", "Rey develops her newly discovered abilities with the guidance of Luke Skywalker, who is unsettled by the strength of her powers. Meanwhile, the Resistance prepares to do battle with the First Order.", 1513123200, "Documentary"],
[383498, "Deadpool 2", "https://image.tmdb.org/t/p/w500/to0spRl1CMDvyUbOnbb4fTk3VAd.jpg", "Wisecracking mercenary Deadpool battles the evil and powerful Cable and other bad guys to save a boy's life.", 1526346000, "Action, Comedy, Adventure"],
[157336, "Interstellar", "https://image.tmdb.org/t/p/w500/gEU2QniE6E77NI6lCU6MxlNBvIx.jpg", "Interstellar chronicles the adventures of a group of explorers who make use of a newly discovered wormhole to surpass the limitations on human space travel and conquer the vast distances involved in an interstellar voyage.",1415145600,"Adventure, Drama, Science Fiction"]
];
$path = 'data/new-file.csv';
$fp = fopen($path, 'a'); // open in write only mode (with pointer at the end of the file)
foreach ($rows as $row) {
fputcsv($fp, $row);
}
fclose($fp);
The essential piece is to understand the mode
parameter used in the fopen
function:
r
is a read-only modew
is a write-only mode, starting at the beginning of the file (erase an existing file)w+
is a read/write mode, starting at the beginning of the file (erase an existing file)a
is a write-only mode, starting at the end of the file (append content at the end of the file)a+
is a read/write mode, starting at the end of the file (append content at the end of the file)
Using the modes w
, w+
, a
, a+
will create the CSV file if it does not exist yet.
Read and Write CSV Files (Modern Way)
This previous reading and writing approach is totally fine, but some modern libraries can make our lives easier and our code a bit more readable.
This tutorial uses the excellent league/csv
library, which provides a clean and straightforward CSV manipulation API.
We can also note the great box/spout
library, which is very strong too, offers support for other formats and can be used to read and write Excel files in PHP.
Install league/csv
Let's install the league/csv
library by using the composer package manager:
composer require league/csv
Read a CSV File Using League\Csv\Reader
Read CSV rows from a file using theReader
:
use League\Csv\Reader;
$path = 'data/movies-100.csv';
$csv = Reader::createFromPath($path, 'r');
$csv->setHeaderOffset(0); // use the first line as headers for rows
$header = $csv->getHeader();
var_dump($header);
$rows = $csv->getRecords();
foreach ($rows as $row) {
var_dump($row);
}
When using the setHeaderOffset
configuration, we can see a slight but interesting difference compared with the use of native PHP function. The league/csv
reader loads the rows as associative arrays (JSON object). It makes data manipulation easier as we can deal independently with each row.
array(6) {
["id"]=>
string(3) "807"
["title"]=>
string(5) "Se7en"
["poster"]=>
string(63) "https://image.tmdb.org/t/p/w500/6yoghtyTpznpBik8EngEmJskVUO.jpg"
["overview"]=>
string(390) "Two homicide detectives are on a desperate hunt for a serial killer whose crimes are based on the 'seven deadly sins' in this dark and haunting film that takes viewers from the tortured remains of one victim to the next. The seasoned Det. Sommerset researches each sin in an effort to get inside the killer's mind, while his novice partner, Mills, scoffs at his efforts to unravel the case."
["release_date"]=>
string(9) "811731600"
["genres"]=>
string(24) "Crime, Mystery, Thriller"
}
Write a CSV File Using League\Csv\Writer
Write CSV rows into a file using theWriter
:
use League\Csv\Writer;
$rows = [
['id', 'title', 'poster', 'overview', 'release_date', 'genres'],
[181808, "Star Wars: The Last Jedi", "https://image.tmdb.org/t/p/w500/kOVEVeg59E0wsnXmF9nrh6OmWII.jpg", "Rey develops her newly discovered abilities with the guidance of Luke Skywalker, who is unsettled by the strength of her powers. Meanwhile, the Resistance prepares to do battle with the First Order.", 1513123200, "Documentary"],
[383498, "Deadpool 2", "https://image.tmdb.org/t/p/w500/to0spRl1CMDvyUbOnbb4fTk3VAd.jpg", "Wisecracking mercenary Deadpool battles the evil and powerful Cable and other bad guys to save a boy's life.", 1526346000, "Action, Comedy, Adventure"],
[157336, "Interstellar", "https://image.tmdb.org/t/p/w500/gEU2QniE6E77NI6lCU6MxlNBvIx.jpg", "Interstellar chronicles the adventures of a group of explorers who make use of a newly discovered wormhole to surpass the limitations on human space travel and conquer the vast distances involved in an interstellar voyage.",1415145600,"Adventure, Drama, Science Fiction"]
];
$path = 'data/new-file2.csv';
$csv = Writer::createFromPath($path, 'w');
$csv->insertAll($rows);
This code snippet is an interesting one to export your database in a CSV file; in this case, you need to feed the file by reading rows from your database using some queries.
Let's keep in mind that when we use createFromPath
with the mode w
, we erase the entire file content if it already exists.
Append New Rows in a CSV File Using League\Csv\Writer
Sometimes, we need to append lines at the end of an existing file without erasing its current content.
In this case, we open the file by using the value a+
for the open_mode
parameter:
use League\Csv\Writer;
$appendRows = [
[324857, "Spider-Man: Into the Spider-Verse", "https://image.tmdb.org/t/p/w500/iiZZdoQBEYBv6id8su7ImL0oCbD.jpg", "Miles Morales is juggling his life between being a high school student and being a spider-man. When Wilson 'Kingpin' Fisk uses a super collider, others from across the Spider-Verse are transported to this dimension.",1544140800,"Action, Adventure, Animation, Science Fiction, Comedy"],
[456740, "Hellboy", "https://image.tmdb.org/t/p/w500/bk8LyaMqUtaQ9hUShuvFznQYQKR.jpg", "Hellboy comes to England, where he must defeat Nimue, Merlin's consort and the Blood Queen. But their battle will bring about the end of the world, a fate he desperately tries to turn away.",1554944400,"Fantasy, Action"]
];
$path = 'data/new-file2.csv';
$csv = Writer::createFromPath($path, 'a+');
$csv->insertAll($appendRows);
Advanced Tips & Tricks Using League\Csv
Now, we'll explore advanced yet commonly used operations on CSV.
Count the CSV Rows
use League\Csv\Reader;
$csv = Reader::createFromPath($path, 'r');
$csv->setHeaderOffset(0);
var_dump($csv->count());
Return the CSV Headers
use League\Csv\Reader;
$csv = Reader::createFromPath($path, 'r');
$csv->setHeaderOffset(0);
var_dump($csv->getHeader());
array(6) {
[0]=>
string(2) "id"
[1]=>
string(5) "title"
[2]=>
string(6) "poster"
[3]=>
string(8) "overview"
[4]=>
string(12) "release_date"
[5]=>
string(6) "genres"
}
Fetch One CSV Row by Using its Index
We fetch the 10th row of the file:
use League\Csv\Reader;
$csv = Reader::createFromPath($path, 'r');
$csv->setHeaderOffset(0);
var_dump($csv->fetchOne(10));
array(6) {
["id"]=>
string(3) "920"
["title"]=>
string(4) "Cars"
["poster"]=>
string(63) "https://image.tmdb.org/t/p/w500/qa6HCwP4Z15l3hpsASz3auugEW6.jpg"
["overview"]=>
string(386) "Lightning McQueen, a hotshot rookie race car driven to succeed, discovers that life is about the journey, not the finish line, when he finds himself unexpectedly detoured in the sleepy Route 66 town of Radiator Springs. On route across the country to the big Piston Cup Championship in California to compete against two seasoned pros, McQueen gets to know the town's offbeat characters."
["release_date"]=>
string(10) "1149728400"
["genres"]=>
string(36) "Animation, Adventure, Comedy, Family"
}
Fetch Many CSV Rows by Using Offset and Limit
We fetch 3 rows starting from the 10th row:
use League\Csv\Reader;
use League\Csv\Statement;
$csv = Reader::createFromPath($path, 'r');
$csv->setHeaderOffset(0);
$stmt = Statement::create()
->offset(10)
->limit(3);
$records = $stmt->process($csv);
foreach ($records as $row) {
var_dump($row);
}
Filter CSV Rows Based on a Condition
We fetch only the movies where the title contains "Deadpool":
use League\Csv\Reader;
use League\Csv\Statement;
$csv = Reader::createFromPath($path, 'r');
$csv->setHeaderOffset(0);
$stmt = Statement::create()
->where(function($row) { return strpos($row['title'], 'Deadpool') !== false; });
$records = $stmt->process($csv);
foreach ($records as $row) {
var_dump($row);
}
array(6) {
["id"]=>
string(6) "383498"
["title"]=>
string(10) "Deadpool 2"
["poster"]=>
string(63) "https://image.tmdb.org/t/p/w500/to0spRl1CMDvyUbOnbb4fTk3VAd.jpg"
["overview"]=>
string(108) "Wisecracking mercenary Deadpool battles the evil and powerful Cable and other bad guys to save a boy's life."
["release_date"]=>
string(10) "1526346000"
["genres"]=>
string(25) "Action, Comedy, Adventure"
}
array(6) {
["id"]=>
string(6) "293660"
["title"]=>
string(8) "Deadpool"
["poster"]=>
string(63) "https://image.tmdb.org/t/p/w500/yGSxMiF0cYuAiyuve5DA6bnWEOI.jpg"
["overview"]=>
string(348) "Deadpool tells the origin story of former Special Forces operative turned mercenary Wade Wilson, who after being subjected to a rogue experiment that leaves him with accelerated healing powers, adopts the alter ego Deadpool. Armed with his new abilities and a dark, twisted sense of humor, Deadpool hunts down the man who nearly destroyed his life."
["release_date"]=>
string(10) "1454976000"
["genres"]=>
string(25) "Action, Adventure, Comedy"
}
Sort the CSV Rows Based on a Condition
We fetch the five first movies ordered by ascendant alphabetical title:
use League\Csv\Reader;
use League\Csv\Statement;
$csv = Reader::createFromPath($path, 'r');
$csv->setHeaderOffset(0);
$stmt = Statement::create()
->orderBy(function($row1, $row2) { return strcmp($row1['title'], $row2['title']); })
->limit(5);
$records = $stmt->process($csv);
foreach ($records as $row) {
var_dump($row);
}
Convert CSV to JSON
Parsing a CSV file and converting its rows to a JSON string is a pretty straightforward operation:
use League\Csv\Reader;
$csv = Reader::createFromPath($path, 'r');
// use the first line as header for rows
$csv->setHeaderOffset(0);
// get all the rows as an array of associative arrays
$data = $csv->jsonSerialize();
// encode the array to a json string
$jsonString = json_encode($data, JSON_PRETTY_PRINT);
Convert JSON to CSV
Converting a JSON format to a CSV file requires a bit more work; we first need to extract the keys from one of the JSON objects to use them as the headers of our CSV file:
use League\Csv\Writer;
$path = 'data/movies-flat-10.json';
// read the file's content as a string
$jsonString = file_get_contents($path);
// convert the json string to an array of associative arrays (one per object)
$jsonObjects = json_decode($jsonString, true);
// fetch the keys of the first json object
$headers = array_keys(current($jsonObjects));
// flatten the json objects to keep only the values as arrays
$rows = [];
foreach ($jsonObjects as $jsonObject) {
$rows[] = array_values($jsonObject);
}
// insert the headers and the rows in the CSV file
$path = 'data/new-file2.csv';
$csv = Writer::createFromPath($path, 'w');
$csv->insertOne($headers);
$csv->insertAll($rows);
If you want to go further on JSON files manipulation, you can look at this dedicated post on reading and writing JSON files using PHP.
Read a Big CSV File While Using a Minimum of Memory
The trick to parse big CSV files by keeping a low memory usage is to never load all data in memory.
Luckily, both native and league/csv
reading methods allow us to iterate and stream the content.
Let's take an example of a file containing 1M of lines; this file weights almost 400MB.
-rw-r--r-- 1 nico nico 393M may 13 12:38 data/movies-1000000.csv
Let's load all its content:
// I'm using here the Symfony Console & StopWatch components
use Symfony\Component\Stopwatch\Stopwatch;
use League\Csv\Reader;
$section = 'read_csv_file';
$this->stopwatch->start($section);
$path = 'data/movies-1000000.csv';
$csv = Reader::createFromPath($path, 'r');
$csv->setHeaderOffset(0);
$rows = $csv->getRecords();
foreach ($rows as $row) {
// we do nothing, but we want to ensure we browse each row
}
$this->stopwatch->stop($section);
$output->writeln("I read ".$csv->count()." rows from the CSV File ".$path);
$output->writeln((string) $this->stopwatch->getEvent($section));
And here is the result:
I read 1000000 rows from the CSV File data/movies-1000000.csv
default/read_csv_file: 6.00 MiB - 26699 ms
We load our 1 million CSV lines in 26 seconds by using only 6MB of memory! 🤯
By streaming properly the data reading and the processing you want to apply, you can keep the memory usage very low.
Using this method, we can write a very efficient PHP CSV parser which is especially beneficial when importing a large CSV file into a database.
Download the Code and Examples
You can find all the code and examples in this GitHub repository.
It's packaged as a simple Symfony project, a set of commands, it also comes with a Docker image. 🐋