The split command in Linux – Break large files into smaller files


The split command in Linux lets you split large files into smaller files. The smaller files by default contain 1000 lines each. However, the split command also gives you the option to customize the number of lines and bytes in each of the smaller files. In this tutorial, we will learn how to use the split command to split large files into smaller files.

Let’s get started.

How to use the split command in Linux?

First, we will create a sample text file. We’ll then move on to splitting the file using the split command.

1. Create a sample file

Let’s create a sample file using the cat command. The text for the sample file is as follows :

This is line number 1
This is line number 2
This is line number 3
This is line number 4
This is line number 5
This is line number 6
This is line number 7
This is line number 8
This is line number 9 
This is line number 10
This is line number 11
This is line number 12
This is line number 13
This is line number 14

There are 14 lines in total in the sample text file. To create the file, run the command below, type or paste the lines, then press Ctrl+D on a new line to finish :

$ cat > sample.txt
This is line number 1
This is line number 2
This is line number 3
This is line number 4
This is line number 5
This is line number 6
This is line number 7
This is line number 8
This is line number 9 
This is line number 10
This is line number 11
This is line number 12
This is line number 13
This is line number 14
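If you would rather not type the lines by hand, the same kind of file can be generated by a short script. This is only a sketch: the scratch directory made with mktemp is our own choice for illustration, and it assumes seq and printf are available, as they are on most Linux systems.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# printf repeats its format string once per argument, so this writes 14 lines.
printf 'This is line number %s\n' $(seq 1 14) > sample.txt

# Confirm the line count.
line_count=$(wc -l < sample.txt)
echo "sample.txt has $line_count lines"
```

The file ends up in the scratch directory, so you can delete that directory when you are done experimenting.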


2. Using the split command to split the text file

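If we use the split command without any options, it starts a new output file after every 1000 lines. Our sample file has only 14 lines, so this produces a single file, xaa, with the same contents as the original.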

split sample.txt 
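To see what the default does with a file shorter than 1000 lines, we can count the pieces and compare the result with the original. This is a rough sketch that rebuilds the sample file in a scratch directory (our own setup, not part of split itself).

```shell
# Recreate the 14-line sample file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'This is line number %s\n' $(seq 1 14) > sample.txt

# No options: pieces of up to 1000 lines, named xaa, xab, and so on.
split sample.txt

# Count the pieces (names made of x followed by two characters).
piece_count=$(ls x?? | wc -l)
echo "pieces created: $piece_count"

# cmp prints nothing and returns 0 when two files are byte-for-byte identical.
cmp sample.txt xaa && echo "xaa is identical to sample.txt"
```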

Let’s see how we can split a large file into smaller files, each containing n lines.

3. Fix the number of lines in the files after splitting

To provide the number of lines that the smaller files should contain after splitting, use the following syntax.

split -l 8 sample.txt

This will break our original file after every 8 lines. To see the names of the files that are created, use the ls command.

ls


The two new files are ‘xaa’ and ‘xab’.

We can view the contents of the file by using the cat command.

cat xaa

Output :

This is line number 1
This is line number 2
This is line number 3
This is line number 4
This is line number 5
This is line number 6
This is line number 7
This is line number 8

cat xab

Output :

This is line number 9 
This is line number 10
This is line number 11
This is line number 12
This is line number 13
This is line number 14
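To double-check the split without reading each file, we can count the lines in every piece with wc. The sketch below recreates the sample file in a scratch directory first. The variable names are ours.

```shell
# Recreate the 14-line sample file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'This is line number %s\n' $(seq 1 14) > sample.txt

# Split after every 8 lines.
split -l 8 sample.txt

# Count the lines in each piece.
first=$(wc -l < xaa)
second=$(wc -l < xab)
echo "xaa: $first lines, xab: $second lines"
```

The first piece holds the full 8 lines and the second holds the 6 that remain.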

Let’s try breaking the sample file into smaller files with three lines each.

split -l 3 sample.txt

The new files that we get are :

  • xaa
  • xab
  • xac
  • xad
  • xae

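We can view the contents of each file using the cat command. The first four files contain three lines each, and the last one, xae, contains the remaining two lines (13 and 14).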

You can see that the files are named like xaa, xab, xac, xad and so on.
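Because the names sort alphabetically, the pieces can be joined back together in the right order with cat. Here is a small sketch that splits the sample file and then checks that the rejoined copy matches the original. The name rebuilt.txt is just an illustrative choice.

```shell
# Recreate the 14-line sample file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'This is line number %s\n' $(seq 1 14) > sample.txt

# Split into pieces of three lines each: xaa, xab, xac, xad, xae.
split -l 3 sample.txt

# The shell expands x?? in sorted order, so the pieces are joined in sequence.
cat x?? > rebuilt.txt

# cmp returns 0 only if both files are byte-for-byte identical.
cmp sample.txt rebuilt.txt && echo "rebuilt.txt matches sample.txt"
```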

The split command also lets you give a name of your choice to the new files. Let’s learn how to do that in the next section.

4. Give the new files a name of your choice

To provide a name for the new files, use the following syntax :

 split -l 3 sample.txt [filename] 

The files created will have names like filenameaa, filenameab, filenameac, filenamead and so on.

Let’s see an example :

 split -l 3 sample.txt three_lines

Now if we run ls, we can see the new files, each starting with three_lines.

You can view the content of the files using the cat command.
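With a custom prefix, it is easy to loop over just the pieces you created. The following sketch prints each piece under a small header. The scratch directory and the header format are our own additions for illustration.

```shell
# Recreate the 14-line sample file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'This is line number %s\n' $(seq 1 14) > sample.txt

# Split into three-line pieces named three_linesaa, three_linesab, ...
split -l 3 sample.txt three_lines

# Print every piece, preceded by its file name.
for piece in three_lines??; do
    echo "== $piece =="
    cat "$piece"
done
```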


You can also generate a verbose output with the split command. The verbose output will contain the name of the new files that are created.

5. Generate a verbose output from the split command

To generate verbose output, use the --verbose flag along with the split command.

split -l 3 sample.txt three_lines --verbose

Output :

creating file 'three_linesaa'
creating file 'three_linesab'
creating file 'three_linesac'
creating file 'three_linesad'
creating file 'three_linesae'

The output now contains the names of the new files that are being created after splitting.
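Since split prints one message per file, the verbose output can also be used to count how many pieces were made. A minimal sketch, assuming GNU split. We redirect with 2>&1 so the count does not depend on which output stream the messages are written to.

```shell
# Recreate the 14-line sample file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'This is line number %s\n' $(seq 1 14) > sample.txt

# One "creating file" message is printed per piece, so counting lines counts pieces.
created=$(split --verbose -l 3 sample.txt three_lines 2>&1 | wc -l)
echo "split reported $created new files"
```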

6. Fix the number of bytes in the files after splitting

Just like the -l flag lets you change the number of lines in the smaller files, the -b flag lets you change the number of bytes in the smaller files.

split -b 50 sample.txt --verbose

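This command will break the sample file into pieces of 50 bytes each, with the last piece holding whatever is left over. Because the split is by byte count, a piece can end in the middle of a line. Output: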

creating file 'xaa'
creating file 'xab'
creating file 'xac'
creating file 'xad'
creating file 'xae'
creating file 'xaf'
creating file 'xag'

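You can also give the split size in larger units by adding a suffix to the number. In GNU split, the suffixes K, M and G are multiples of 1024 bytes (kibibytes, mebibytes and gibibytes), while KB, MB and GB are multiples of 1000 bytes. The syntax is as follows :

  • K (1024 bytes): split -b nK {file_name}
  • M (1024 K): split -b nM {file_name}
  • G (1024 M): split -b nG {file_name}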

Here n is a numeric value.
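The difference between the K and KB suffixes is easy to see on a small test file. The sketch below assumes GNU split and uses a hypothetical 3000-byte file, blob.bin, filled with zero bytes. The k_ and kb_ prefixes are also just illustrative names.

```shell
# Create a 3000-byte test file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
head -c 3000 /dev/zero > blob.bin

# 1K means 1024 bytes per piece; 1KB means 1000 bytes per piece.
split -b 1K blob.bin k_
split -b 1KB blob.bin kb_

# Show the size in bytes of every piece.
wc -c k_* kb_*
```

With 1K the pieces are 1024, 1024 and 952 bytes, while with 1KB all three pieces are exactly 1000 bytes.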

7. Change the file naming system from Alphabetic to Numeric

Currently, the files have a suffix like aa, ab, ac and so on. To change this naming system to a numeric system, use the -d flag along with the split command.

 split -db 50 sample.txt --verbose

Output:

creating file 'x00'
creating file 'x01'
creating file 'x02'
creating file 'x03'
creating file 'x04'
creating file 'x05'
creating file 'x06'

We can see that the suffixes (00, 01, 02 and so on) are now numeric.
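Two-digit suffixes run out after 100 pieces. If you expect more, the suffix length can be increased with the -a flag, which is not covered elsewhere in this tutorial. The sketch below asks for three-digit suffixes. The part_ prefix is an illustrative name.

```shell
# Recreate the 14-line sample file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'This is line number %s\n' $(seq 1 14) > sample.txt

# -d gives numeric suffixes, -a 3 makes them three characters long.
split -d -a 3 -l 3 sample.txt part_

# List the pieces: part_000, part_001, and so on.
ls part_*
```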

8. Break a large file into n smaller files

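Sometimes you want to break a large file into a fixed number of smaller files. You can do that using the -n flag along with the split command. With a plain number, the file is divided into that many pieces of roughly equal size in bytes, so a piece may end in the middle of a line. Let’s see an example: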

 split -dn4 sample.txt --verbose

Output :

creating file 'x00'
creating file 'x01'
creating file 'x02'
creating file 'x03'

Four new files were created and, since we used the -d flag, the new files follow the numeric naming system.
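If the pieces must contain only whole lines, GNU split also accepts the form -n l/N, which makes N pieces without breaking any line. The following sketch compares the two forms by checking whether each piece ends with a newline. The bytes_ and lines_ prefixes are our own illustrative names.

```shell
# Recreate the 14-line sample file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'This is line number %s\n' $(seq 1 14) > sample.txt

# Four pieces by byte count, then four pieces made of whole lines only.
split -d -n 4 sample.txt bytes_
split -d -n l/4 sample.txt lines_

# tail -c 1 prints the last byte. Command substitution strips a trailing
# newline, so an empty result means the piece ended with a newline.
for piece in bytes_0? lines_0?; do
    if [ -z "$(tail -c 1 "$piece")" ]; then
        echo "$piece ends on a line boundary"
    else
        echo "$piece ends in the middle of a line"
    fi
done
```

Both sets of pieces can still be joined back into the original file with cat. The only difference is where the cuts fall.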

Conclusion

This tutorial was about the split command in Linux. We learnt different ways of breaking a large file into smaller files. Hope you had fun learning with us!