Convert multiple files from a specified format (e.g., CSV) to Parquet format. The converted Parquet files are saved in the same location as the input files. Parquet is a columnar format, which allows efficient storage and retrieval of large datasets.
Arguments
- folder_path
A string path to the directory containing sub-directories of input files.
- pattern
A string used to filter the sub-directories of input files by name.
- input_format
A string indicating the format of the input files. Default is "csv". Other supported formats are listed in arrow::open_dataset().
- max_rows_per_file
Maximum number of rows per output Parquet file. Default is 1e7 (10 million rows).
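Examples
The function's own implementation is not shown on this page, so the following is only a minimal sketch of how the arguments above might map onto the arrow package, assuming the conversion is built on arrow::open_dataset() and arrow::write_dataset(). The folder name, sub-directory name, file names, and data are hypothetical and exist only to make the example self-contained.

```r
library(arrow)

# Hypothetical demo layout: one sub-directory holding two small CSV files.
folder_path <- file.path(tempdir(), "convert_demo")
unlink(folder_path, recursive = TRUE)
sub_dir <- file.path(folder_path, "sales_2024")
dir.create(sub_dir, recursive = TRUE)
write.csv(data.frame(id = 1:5, value = c(2.5, 3.1, 4.7, 1.2, 9.9)),
          file.path(sub_dir, "part1.csv"), row.names = FALSE)
write.csv(data.frame(id = 6:8, value = c(0.4, 7.3, 5.5)),
          file.path(sub_dir, "part2.csv"), row.names = FALSE)

pattern <- "sales"            # keeps sub-directories whose names match
input_format <- "csv"
max_rows_per_file <- 10000000L  # 1e7, the documented default

# Keep only the sub-directories whose names match the pattern.
sub_dirs <- list.dirs(folder_path, recursive = FALSE)
sub_dirs <- sub_dirs[grepl(pattern, basename(sub_dirs))]

for (d in sub_dirs) {
  # List the input files explicitly so that Parquet files already present
  # in the directory are not read as CSV.
  input_files <- list.files(d, pattern = "\\.csv$", full.names = TRUE)
  ds <- open_dataset(input_files, format = input_format)

  # Write Parquet output next to the inputs, splitting it into multiple
  # files once max_rows_per_file rows have been written to one file.
  write_dataset(ds, path = d, format = "parquet",
                max_rows_per_file = max_rows_per_file)
}

parquet_files <- list.files(sub_dir, pattern = "\\.parquet$", full.names = TRUE)
total_rows <- sum(vapply(parquet_files,
                         function(f) nrow(read_parquet(f)),
                         integer(1)))
```

In this sketch the original CSV files are left in place and the Parquet files appear beside them, which matches the description above; whether the actual function also keeps the inputs is not stated here.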