p3-shuffle

Scramble the Records in a File

p3-shuffle.pl [options]

This script reads a file in batches of records (500,000 by default) and writes them out in a different order. It is used to un-sort files for deep learning purposes.
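The batch-wise reordering described above can be illustrated with a minimal Python sketch. It is not the script's actual Perl implementation. It assumes that each record is one line, that each batch is shuffled independently before being written, and that no header line needs special handling. The names `shuffle_in_batches`, `batch_size`, and `rng` are hypothetical.

```python
import random
from itertools import islice


def shuffle_in_batches(lines, batch_size=500_000, rng=None):
    """Yield records from `lines`, reordered within consecutive batches.

    Assumption: each batch is shuffled on its own, so a record never
    leaves the batch it was read in.
    """
    if rng is None:
        rng = random.Random()
    it = iter(lines)
    while True:
        # Read up to batch_size records; only one batch is held in memory.
        batch = list(islice(it, batch_size))
        if not batch:
            break
        rng.shuffle(batch)  # in-place random permutation of this batch
        yield from batch


# Small demonstration with ten records and a batch size of four.
demo = [f"record{i}\n" for i in range(10)]
for line in shuffle_in_batches(demo, batch_size=4):
    print(line, end="")
```

Under these assumptions, a smaller batch size lowers memory use but keeps each record closer to its original position, so a batch size at least as large as the file is needed for the whole file to be mixed.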

Parameters

There are no positional parameters.

The standard input can be overridden using the options in Input Options.

Additional command-line options are as follows.

  • batchSize

Number of records to read in a batch. The default is 500,000.

  • verbose

If specified, progress messages will be written to STDERR.