This package provides load and save support for CSV files under the FileIO.jl package.
Use Pkg.add("CSVFiles") in Julia to install CSVFiles and its dependencies.
To read a CSV file into a DataFrame, use the following Julia code:
using FileIO, CSVFiles, DataFrames
df = DataFrame(load("data.csv"))
The call to load returns a struct that is an iterable table as defined by IterableTables.jl, so it can be passed to any function that can handle iterable tables, i.e. all the sinks in IterableTables.jl. Here are some examples of materializing a CSV file into data structures that are not a DataFrame:
using FileIO, CSVFiles, DataTables, IndexedTables, TimeSeries, Temporal, Gadfly
# Load into a DataTable
dt = DataTable(load("data.csv"))
# Load into an IndexedTable
it = IndexedTable(load("data.csv"))
# Load into a TimeArray
ta = TimeArray(load("data.csv"))
# Load into a TS
ts = TS(load("data.csv"))
# Plot directly with Gadfly
plot(load("data.csv"), x=:a, y=:b, Geom.line)
One can load both local files and files that can be downloaded via either http or https. To download from a remote URL, simply pass a URL to the load function instead of just a filename.
The load function also takes a number of parameters:
load(f::FileIO.File{FileIO.format"CSV"}, delim=','; <arguments>...)
delim: the delimiter character.
quotechar: character used to quote strings, defaults to ".
escapechar: character used to escape quotechar in strings (could be the same as quotechar).
nrows: number of rows in the file. Defaults to 0, in which case we try to estimate this.
header_exists: boolean specifying whether the CSV file contains a header.
colnames: manually specified column names. Could be a vector or a dictionary from Int index (the column) to String column name.
colparsers: parsers to use for specified columns. This can be a vector or a dictionary from column name / column index (Int) to a "parser". The simplest parser is a type such as Int or Float64. It can also be a dateformat"..."; see CustomParser if you want to plug in custom parsing behavior.
type_detect_rows: number of rows to use to infer the initial colparsers, defaults to 20.
These are simply the arguments from TextParse.jl, which is used under the hood to read CSV files.
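As a minimal sketch of passing some of these arguments, the following reads a headerless file and supplies the column names by hand. The file name noheader.csv, its contents, and the column names id and value are made up for illustration.

```julia
using FileIO, CSVFiles, DataFrames

# Hypothetical headerless input, written here so the sketch is self-contained.
write("noheader.csv", "1,2.5\n2,3.5\n")

# header_exists and colnames are forwarded as keyword arguments to the reader.
df = DataFrame(load("noheader.csv", header_exists=false, colnames=["id", "value"]))
```

The delim argument is left out on purpose: the signature above lists it positionally, and this sketch does not assume how a given CSVFiles version expects it to be passed.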
The following code saves any iterable table as a CSV file:
using FileIO, CSVFiles
save("output.csv", it)
This will work as long as it is any of the types supported as sources in IterableTables.jl.
The save function takes a number of arguments:
save(f::FileIO.File{FileIO.format"CSV"}, data; delim=',', quotechar='"', escapechar='\\', header=true)
delim: the delimiter character, defaults to ,.
quotechar: character used to quote strings, defaults to ".
escapechar: character used to escape quotechar in strings, defaults to \.
header: whether a header should be written, defaults to true.
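For illustration, here is a small sketch that uses the keyword arguments from the signature above to write a semicolon-delimited file without a header row. The DataFrame contents are invented for the example.

```julia
using FileIO, CSVFiles, DataFrames

# Hypothetical data; any iterable table source would do.
df = DataFrame(a=[1, 2], b=["x", "y"])

# Use ';' as the delimiter and skip the header line.
save("output.csv", df; delim=';', header=false)
```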
Both load and save also support the pipe syntax. For example, to load a CSV file into a DataFrame, one can use the following code:
using FileIO, CSVFiles, DataFrames
df = load("data.csv") |> DataFrame
To save an iterable table, one can use the following form:
using FileIO, CSVFiles, DataFrames
# Acquire a DataFrame somehow; a small placeholder table is built here
df = DataFrame(a=[1, 2, 3], b=["x", "y", "z"])
df |> save("output.csv")
The pipe syntax is especially useful when combining it with Query.jl queries. For example, one can easily load a CSV file, pipe it into a query, then pipe it to the save function to store the results in a new file.
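Such a pipeline might look like the sketch below, which uses the standalone @filter macro from Query.jl. The input file, its columns name and age, and the threshold are assumptions made for the example.

```julia
using FileIO, CSVFiles, Query

# Hypothetical input file, written here so the sketch is self-contained.
write("data.csv", "name,age\nAnn,34\nBob,27\n")

# Load, keep only rows with age above 30, and write the result to a new file.
load("data.csv") |> @filter(_.age > 30) |> save("output.csv")
```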