sed is a stream-oriented, non-interactive text editor. It can be used for replacing, deleting, or adding text in a file. The S in sed stands for stream and the ‘ed’ part stands for editor.
One important thing to note while working with sed is that the original input file is left unchanged and the results are sent to standard output. To save the output to a file, the redirection operator (>) has to be used. To make changes to the original file itself (not recommended), the “-i” option can be used.
The sed command performs operations on the text in a single pass through the stream and that makes it very efficient.
A sed instruction has two parts: an address and a command. The address specifies the lines where the change is to be made. It can be a line number, a literal string, or a regular expression, and if it is omitted the command applies to every line. The command selects the action to perform, such as insert, delete, or substitute.
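To make the address/command split concrete, here is a minimal sketch. The three-line sample text and the variable names are made up for illustration and are not the tutorial's output.txt.

```shell
# Hypothetical three-line sample, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'apples are red' 'oranges are orange' 'apples are sweet' > "$sample"

# Address = line number 2, command = d (delete).
by_number=$(sed '2d' "$sample")

# Address = regular expression /orange/, command = d.
by_regex=$(sed '/orange/d' "$sample")

# Address = line 3, command = s (substitute).
by_subst=$(sed '3s/sweet/tasty/' "$sample")

printf '%s\n' "$by_number" '--' "$by_regex" '--' "$by_subst"
rm -f "$sample"
```

In each case the part before the command letter picks the lines, and the letter (d or s) says what to do with them.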
Creating a file for editing
First, we create a file that will be used throughout this tutorial to apply the transformations.
$ cat > output.txt
Type or paste the text below, then press Ctrl+D to finish. The text we are going to use is:
Apples are sweet in taste. Apples are red in colour. Orange is sweet too. Orange is Orange in colour. Colour of a fruit can’t be changed. Apples and oranges are healthy. Orange needs to be pealed while an apple can be consumed without pealing. Apples are sometimes green in colour. An orange if green in colour is not ripe enough to eat. Apples and oranges grow on trees. More people like to eat apples. That's about it for apples and oranges.
Note the multiple occurrences of words like apples and oranges.
The sed command can be used to find occurrences of a particular word across the text and replace them. This is useful when a word is misspelled and needs to be corrected.
$ sed 's/colour/color/' output.txt
This command replaces only the first occurrence of the target word in each line. Notice that in the third line ‘Colour’ has still not been replaced. There are two reasons for this:
- The command we wrote doesn’t ignore the case of the words. That is to say, it is case-sensitive.
- Even if it ignored the case, it would still not replace the second instance of a word in a line.
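Both points can be checked on a small scale. The sketch below uses a made-up two-line sample (not the tutorial's file) and the same substitution.

```shell
# Hypothetical sample: the second line holds three spellings of the word.
sample=$(mktemp)
printf '%s\n' 'Red in colour.' 'Orange in colour. Colour is fixed. No other colour.' > "$sample"

# Plain s command: first match per line only, and case-sensitive.
first_only=$(sed 's/colour/color/' "$sample")

printf '%s\n' "$first_only"
rm -f "$sample"
```

On the second line only the first ‘colour’ changes. ‘Colour’ is skipped because of its capital letter, and the last ‘colour’ is skipped because it is not the first match on the line.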
Notice that the words ‘pealed’ and ‘pealing’ are misspelled. We can correct them individually:
$ sed 's/pealed/peeled/' output.txt
$ sed 's/pealing/peeling/' output.txt
Because each command reads the unchanged file, the output of each one contains only its own correction and the other misspelling is still there. Later we will see how regular expressions can be used to achieve the same result.
Replacing multiple words
In the previous example, instead of using the sed command twice we can use a single command combining the two.
$ sed -i 's/pealed/peeled/;s/pealing/peeling/' output.txt
This command performs both substitutions in a single run. Here we use “-i” to make the change in the original file, so nothing is printed and we have to run a cat command to see the modified file. Don’t confuse this “-i” option with the i flag that is used for making case-insensitive substitutions (see next example).
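Since in-place editing overwrites the file, a more cautious variant is to preview the result first and to keep a backup. This is a sketch on a made-up sample file. It assumes a sed that accepts a suffix attached to -i (such as -i.bak), which both GNU and BSD sed do.

```shell
# Hypothetical sample file containing both misspellings.
sample=$(mktemp)
printf '%s\n' 'Orange needs to be pealed.' 'No pealing needed.' > "$sample"

# Preview: two substitutions separated by ';', result goes to standard output.
preview=$(sed 's/pealed/peeled/;s/pealing/peeling/' "$sample")

# In-place edit; the original content is kept in "$sample.bak".
sed -i.bak 's/pealed/peeled/;s/pealing/peeling/' "$sample"
edited=$(cat "$sample")
backup=$(cat "$sample.bak")

printf '%s\n' "$edited"
rm -f "$sample" "$sample.bak"
```

The backup file makes it possible to recover the original text if the substitution turns out to be wrong.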
Ignore case while replacing
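sed is case-sensitive by default. To ignore case, append the i flag to the end of the substitute command (this is a GNU sed extension and can also be written as I). It is not the same as the “-i” command-line option.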
$ sed 's/apples/mango/i' output.txt
The first instance of the word apples in each line has been changed to mango irrespective of the case. You can see that in the second-last line we still have ‘apples’ as it is. Being the second instance of the word in the same line, it has not been replaced.
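The same behaviour can be reproduced on a made-up two-line sample. This sketch assumes GNU sed, since the case-insensitive flag on the s command is a GNU extension.

```shell
# Hypothetical sample: capitalised and lower-case spellings of the word.
sample=$(mktemp)
printf '%s\n' 'Apples are sweet.' 'More people like apples. So much for apples.' > "$sample"

# Trailing I (GNU sed): match regardless of case, still first match per line only.
ignore_case=$(sed 's/apples/mango/I' "$sample")

printf '%s\n' "$ignore_case"
rm -f "$sample"
```

‘Apples’ on the first line is replaced despite the capital letter, while the second ‘apples’ on the last line is left alone.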
Replacing all occurrences
As we saw above, the command we have been using so far only replaces the first occurrence in each line. To replace all instances of a word, add the g flag at the end of the substitute command. Let’s have another look at that second instance of ‘Colour’ in line 3.
$ sed 's/colour/color/gi' output.txt
- g makes sure that all instances are looked at
- i makes sure that matching is case-insensitive
Nice! Both instances of the word colour have been changed to color in line 3.
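A small sketch of the two flags working together, again on a made-up line and assuming GNU sed for the case-insensitive flag:

```shell
# Hypothetical one-line sample with two spellings of the word.
sample=$(mktemp)
printf '%s\n' 'Orange is orange in colour. Colour of a fruit is fixed.' > "$sample"

# g = every match on the line, I = ignore case (GNU sed).
all_matches=$(sed 's/colour/color/gI' "$sample")

printf '%s\n' "$all_matches"
rm -f "$sample"
```

One side effect worth knowing: the replacement text is inserted exactly as written, so the capitalised ‘Colour’ becomes lower-case ‘color’.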
Replacing selective occurrence
Now suppose we just want to change the second occurrence of a word in each line. In our text, this means changing only the second occurrence of ‘colour’ in line 3.
$ sed 's/colour/color/2i' output.txt
In the output, line #2 is unchanged because it has only one occurrence of colour, and in line #3 the first occurrence is left as it is while the second has been replaced by ‘color’. The same is true for the third-last line. By replacing g with a number n, we can use sed to replace only the nth occurrence of a word in each line.
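Here is the numeric flag on its own, as a sketch on a made-up sample (without the case-insensitive flag, so it works in any POSIX sed):

```shell
# Hypothetical sample: one match on line 1, three matches on line 2.
sample=$(mktemp)
printf '%s\n' 'Red in colour.' 'Orange in colour. Its colour is fixed. No other colour.' > "$sample"

# The trailing 2 selects only the second match on each line.
second_only=$(sed 's/colour/color/2' "$sample")

printf '%s\n' "$second_only"
rm -f "$sample"
```

Lines with fewer than two matches pass through untouched.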
Adding a line after every line
Sometimes it is nice to space out the lines in a file. This can be done with sed’s G command, which appends a newline followed by the contents of the hold space (empty by default) to each line.
$ sed G output.txt
As we can see, a blank line has been added after every line of text.
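The effect is easy to verify by counting lines. This is a sketch on a made-up two-line file.

```shell
# Hypothetical two-line sample.
sample=$(mktemp)
printf '%s\n' 'first' 'second' > "$sample"

# G appends a newline plus the (empty) hold space to every line.
spaced=$(sed G "$sample")
line_count=$(sed G "$sample" | wc -l)

printf '%s\n' "$spaced"
echo "lines: $line_count"
rm -f "$sample"
```

Two input lines become four output lines, every second one empty.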
Replacing words in selective lines
We can mention specific lines to run the sed editor tool on. This would only make the changes in the lines mentioned explicitly and ignore the rest.
$ sed '2,4 s/Apples/mango/' output.txt
We can see that only lines 2 to 4 have been modified. Since we didn’t mention g, only the first occurrences have been modified.
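A range address restricts any command, not just s. The sketch below applies the same substitution to a made-up five-line file so that the boundary lines are easy to see.

```shell
# Hypothetical sample: five lines that all contain the word.
sample=$(mktemp)
printf 'Apples %s\n' 1 2 3 4 5 > "$sample"

# 2,4 limits the substitution to lines 2 through 4 (inclusive).
ranged=$(sed '2,4s/Apples/mango/' "$sample")

printf '%s\n' "$ranged"
rm -f "$sample"
```

Lines 1 and 5 keep ‘Apples’ and lines 2, 3, and 4 get ‘mango’.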
Printing selected lines
The sed command can be used to display particular lines from the file.
$ sed -n '2,5p' output.txt
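The -n option matters here: sed normally prints every line automatically, and p prints the selected lines a second time. This sketch uses a made-up six-line file to show the difference.

```shell
# Hypothetical six-line sample.
sample=$(mktemp)
printf 'line %s\n' 1 2 3 4 5 6 > "$sample"

# With -n, automatic printing is off, so only the lines hit by p appear.
selected=$(sed -n '2,5p' "$sample")

# Without -n, all 6 lines are printed and lines 2-5 are printed twice.
without_n=$(sed '2,5p' "$sample" | wc -l)

printf '%s\n' "$selected"
echo "line count without -n: $without_n"
rm -f "$sample"
```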
Deleting selected lines
The sed command can be used to remove certain lines while displaying. The command doesn’t delete the lines from the actual file, rather it just doesn’t display them when the command is run.
$ sed '2,5d' output.txt
This won’t affect the original file, so think of it as another way of selectively displaying a file. To save the output to another file we can use the redirection operator. We’ll see this next.
Saving sed output to a file
The modifications made by the sed command are only visible as output on the command line. This output is not saved and the changes are not reflected in the original file. To save the changes to a file, the redirection operator (>) can be used.
$ sed '2,5d' output.txt > new_file.txt
Note that if new_file.txt doesn’t exist, this command will create the file and then write the output to it. If it already exists, its contents are overwritten. The file is created in the current working directory.
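The following sketch checks both halves of that claim in a temporary directory: the new file receives the filtered text and the input file keeps all of its lines. The file name input.txt and the directory are made up for illustration.

```shell
# Hypothetical working directory and six-line input file.
workdir=$(mktemp -d)
printf 'line %s\n' 1 2 3 4 5 6 > "$workdir/input.txt"

# Delete lines 2-5 from the output and save the result to a new file.
sed '2,5d' "$workdir/input.txt" > "$workdir/new_file.txt"

kept=$(cat "$workdir/new_file.txt")
original_count=$(wc -l < "$workdir/input.txt")

printf '%s\n' "$kept"
echo "input still has $original_count lines"
rm -rf "$workdir"
```

Avoid redirecting to the same file that sed is reading, because the shell empties that file before sed starts.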
Displaying multiple consecutive lines
Just like performing multiple substitutions in the same command, sed can display several ranges of lines in a single command. Each range is passed as its own expression with the -e option.
$ sed -n -e '2,3p' -e '5,6p' output.txt
Lines 2 through 3 and lines 5 through 6 are displayed.
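As a sketch on a made-up six-line file, the -e form and the ‘;’ form used earlier for substitutions give the same result:

```shell
# Hypothetical six-line sample.
sample=$(mktemp)
printf 'line %s\n' 1 2 3 4 5 6 > "$sample"

# Two expressions, each given with its own -e.
with_e=$(sed -n -e '2,3p' -e '5,6p' "$sample")

# The same two commands in one script, separated by ';'.
with_semicolon=$(sed -n '2,3p;5,6p' "$sample")

printf '%s\n' "$with_e"
rm -f "$sample"
```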
Printing lines that contain a particular word
The sed command can be used to print lines that contain a certain word or a pattern.
$ sed -n /colour/p output.txt
This command prints all the lines that contain the word ‘colour’.
$ sed -n /Colour/p output.txt
Since the pattern is matched case-sensitively, the two commands give different outputs. In GNU sed, adding I after the pattern (/colour/Ip) makes the match case-insensitive.
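The three variants can be compared side by side. This is a sketch on a made-up sample, and the I modifier on the address assumes GNU sed.

```shell
# Hypothetical three-line sample.
sample=$(mktemp)
printf '%s\n' 'Red in colour.' 'Colour is fixed.' 'Apples are sweet.' > "$sample"

lower=$(sed -n '/colour/p' "$sample")    # case-sensitive, lower-case pattern
upper=$(sed -n '/Colour/p' "$sample")    # case-sensitive, capitalised pattern
either=$(sed -n '/colour/Ip' "$sample")  # GNU sed: I ignores case in the address

printf '%s\n' "$lower" '--' "$upper" '--' "$either"
rm -f "$sample"
```

Quoting the script, as done here, keeps the shell from interpreting any special characters in the pattern.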
Using Regular Expressions
The sed command can be used along with regular expressions to search for patterns. Regular expressions are used to specify certain rules that can be used to match the text and look for patterns.
$ sed 's/[Cc]olour/color/g' output.txt
This solves the problem of matching both Colour and colour: the bracket expression [Cc] matches either an upper-case or a lower-case c, so [Cc]olour matches both spellings.
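A bracket expression works in any POSIX sed, so it is a portable alternative to the case-insensitive flag. The sketch below runs it on a made-up line. It also shows an optional variant, not part of the tutorial's command, that captures the first letter and reuses it so the original capitalisation is kept.

```shell
# Hypothetical one-line sample with both spellings.
sample=$(mktemp)
printf '%s\n' 'Orange in colour. Colour is fixed.' > "$sample"

# [Cc] matches either letter; the replacement is always lower-case.
lowered=$(sed 's/[Cc]olour/color/g' "$sample")

# Variant: \( \) captures the matched letter and \1 puts it back.
case_kept=$(sed 's/\([Cc]\)olour/\1olor/g' "$sample")

printf '%s\n' "$lowered" "$case_kept"
rm -f "$sample"
```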
Advanced RegEx substitution
Earlier we saw how to replace pealed with peeled and pealing with peeling. Let’s try doing that with a regular expression. We will create one regular expression that changes peal to peel and leaves the rest of the word as it is. This can be useful if the same word is used in many forms, such as past, present, or future tense.
$ sed 's/peal\([a-z]*\)\(\.*[[:space:]]\)/peel\1\2/g' output.txt
The regular expression identifies both pealed and pealing and corrects them. Let’s understand the regular expression.
In the basic regular expression syntax that sed uses by default, a backslash before a parenthesis turns it into a grouping operator, and a backslash before a dot makes the dot literal. Piece by piece, the pattern reads:
- peal matches the first four letters of the word.
- [a-z]* matches zero or more lower-case letters after peal. This is used for matching ‘ed’ and ‘ing’.
- \.* matches zero or more literal dots, used as a full stop. Used for pealing since it occurs at the end of a sentence.
- [[:space:]] matches the whitespace character after a word. Used for pealed.
- \([a-z]*\) is the first group and can be referenced as \1 in the replacement.
- \(\.*[[:space:]]\) is the second group and is referenced as \2.
On the substituting side we have peel\1\2. Here \1 and \2 are back-references to the groups matched by the pattern. peel is followed by \1, which is either ‘ed’ or ‘ing’, and then by \2, which is an optional full stop followed by a whitespace character. This regular expression achieves the following transformations:
- pealing. -> peeling.
- pealed[space] -> peeled[space]
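The pattern can be tried on a made-up sample, shown here as a sketch. The second command is an equivalent written in extended regular expression syntax with -E (supported by GNU and BSD sed), where parentheses group without backslashes.

```shell
# Hypothetical sample. Line 1 ends with a space; line 2 ends right after the word.
sample=$(mktemp)
printf '%s\n' 'Orange needs to be pealed while an apple needs no pealing. ' 'No pealing' > "$sample"

# Basic regular expressions: groups are written \( \).
bre=$(sed 's/peal\([a-z]*\)\(\.*[[:space:]]\)/peel\1\2/g' "$sample")

# Extended regular expressions (-E): groups are written ( ).
ere=$(sed -E 's/peal([a-z]*)(\.*[[:space:]])/peel\1\2/g' "$sample")

printf '%s\n' "$bre"
rm -f "$sample"
```

Both misspellings on the first line are corrected. The second line is left unchanged because the pattern requires a whitespace character after the word, and sed removes the trailing newline before matching. A word at the very end of a line is therefore matched only if a space follows it.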
The sed command combined with regular expressions can be used for very powerful text editing. We have covered only a small part of what sed can do. To go further, study regular expressions in more depth and refer to the sed documentation.