There are many reasons why you might need to split a large text file into several smaller ones: a storage limit, faster transfers by copying the smaller parts onto different thumb drives at once, or keeping different parts of the file in different locations for safety.
The Linux command line is powerful, and you can do all of this in the terminal using pre-installed tools.
Creating a Sample File
First, I am going to create a large sample file from the output of dnf. You don't have to create one if you already have a large file; you can skip directly to the splitting part of this tutorial. (I'm using Fedora, so dnf is the package manager; use the package manager of your own distribution.)
In the terminal, type:
dnf list installed > dnflist.txt
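If dnf is not available on your system, a stand-in file works just as well for following along. The sketch below is one possible way to build one with seq; the file name sample.txt and the line prefix are placeholders made up for this example, not part of the steps above.

```shell
# Generate 5000 numbered lines as stand-in content.
# sample.txt and the "package-line-" prefix are hypothetical names.
seq 1 5000 | sed 's/^/package-line-/' > sample.txt

# Confirm the file was written by counting its lines.
wc -l < sample.txt
```

If you go this route, substitute sample.txt wherever the later commands mention dnflist.txt.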
Now, open the text file in a text editor such as vim or nano to verify that the file was indeed created:
vim dnflist.txt
Or for nano users:
nano dnflist.txt
Now, if you have customized either text editor to show line numbers, you can see the number of lines in the file.
Splitting the Document
You can split the document in two ways: by the number of lines or by size.
First, let's split the document by the number of lines inside it.
Split the Text File by the Number of Lines
If you want to know the number of lines in the document, you can use the following command:
wc -l dnflist.txt
Here, the -l flag tells wc to count the lines in the text file. Now, using the split command itself, type the following in the terminal:
split -l 100 --additional-suffix=.txt dnflist.txt
If you do not specify the number of lines, split starts a new file every 1000 lines by default. Also by default, each output file name begins with the prefix x followed by a two-letter suffix that starts at aa (then ab, ac, and so on), and our --additional-suffix flag adds .txt at the end of each file name.
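To see the naming scheme in action without touching your real data, here is a small sketch that splits a throwaway 250-line file. The file name lines.txt and the scratch directory are assumptions made for the example, and --additional-suffix requires GNU split.

```shell
# Work in a scratch directory so the pieces do not clutter the current one.
cd "$(mktemp -d)"

# Build a 250-line stand-in file (lines.txt is a hypothetical name).
seq 1 250 > lines.txt

# Split into pieces of at most 100 lines each.
split -l 100 --additional-suffix=.txt lines.txt

# Show the line count of every piece.
wc -l x*.txt
```

This should leave three pieces: xaa.txt and xab.txt with 100 lines each, and xac.txt with the remaining 50.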
Split the Text File by Size
First, to check the size of the original document, type the following command in the terminal:
ls -l dnflist.txt
As the output shows, the file here is 158159 bytes. We can split the file into several parts by byte size. Let's say we want each part to be at most 100000 bytes. Then type the following in the terminal:
split -b 100000 --additional-suffix=.txt dnflist.txt
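The arithmetic is easy to check on a throwaway file of the same size: 158159 bytes split at 100000 should give one full piece and one piece of 58159 bytes. The sketch below assumes GNU coreutils; bytes.txt is a hypothetical file name filled with dummy content.

```shell
# Work in a scratch directory so the pieces do not clutter the current one.
cd "$(mktemp -d)"

# Create a stand-in file of exactly 158159 bytes (bytes.txt is a hypothetical name).
head -c 158159 /dev/zero | tr '\0' 'a' > bytes.txt

# Split into pieces of at most 100000 bytes.
split -b 100000 --additional-suffix=.txt bytes.txt

# Print the size of each piece in bytes.
wc -c x*.txt
```

Keep in mind that -b cuts at exact byte offsets, so a line of text can end up divided between two pieces. If that matters for your file, GNU split also offers the -C option, which limits the size of each piece while keeping lines whole.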
You can also use numeric suffixes (00, 01, and so on) instead of letters by adding the -d flag. The extra argument at the end, output_text, replaces the default prefix x:
split -d -b 100000 --additional-suffix=.txt dnflist.txt output_text
Or for an alphabetical suffix, leave out the -d flag and just type:
split -b 100000 --additional-suffix=.txt dnflist.txt output_text
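Once the pieces have been copied or stored, you will probably want to put them back together. One possible way, sketched here on a throwaway file, is to concatenate the pieces with cat and compare the result against the original with cmp. The names original.txt and rejoined.txt are assumptions made for the example.

```shell
# Work in a scratch directory so the pieces do not clutter the current one.
cd "$(mktemp -d)"

# Build a stand-in file (original.txt is a hypothetical name).
seq 1 50000 > original.txt

# Split with numeric suffixes and the output_text prefix, like the -d command shown earlier.
split -d -b 100000 --additional-suffix=.txt original.txt output_text

# The shell expands the glob in sorted order, so the pieces are joined in sequence.
cat output_text*.txt > rejoined.txt

# cmp prints nothing and exits with status 0 when the two files are identical.
cmp original.txt rejoined.txt && echo "files match"
```

This relies on the suffixes sorting in the same order in which split wrote the pieces, which holds for both the numeric and the default alphabetical suffixes.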
References: ArchWiki – split command