Ubuntu’s look Command Behaves Differently
For a simple, but useful, command, look certainly gave me the runaround when I was researching this article. There were two problems: compatibility and documentation.
This article was checked using Ubuntu, Fedora, and Manjaro. look was bundled with each of those distributions, which was great. The problem was the behavior wasn’t the same across all three. The Ubuntu version was very different. According to the Ubuntu Manpages, the behavior should be the same.
I eventually figured it out. look traditionally uses a binary search, while Ubuntu look uses a linear search. The online Ubuntu man pages for Bionic Beaver (18.04), Cosmic Cuttlefish (18.10), and Disco Dingo (19.04) all say the Ubuntu version uses a binary search, which is not the case.
If we take a look at the local Ubuntu man page, we see it clearly states that Ubuntu’s look uses a linear search. There is a command-line option to force it to use a binary search. Neither of the versions in the other distributions has an option to choose between search methods.
Scrolling down through the man page, we see the section that describes this version of look using a linear search instead of a binary search.
The moral of the story is to check the local man pages first.
Linear Search versus Binary Search
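A binary search is much faster than a linear search, and large files make this very apparent. It halves the remaining search range with every comparison, so it needs roughly log2(n) comparisons for a file of n lines, whereas a linear search may have to read every line. The downside to the binary search is that your file must be sorted. If you don’t want to sort your original file, sort a copy of it, and then use that copy with look.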
We’ll demonstrate this later in this article. Just be aware that on Fedora, Manjaro, and I expect most of the rest of the Linux world, you’ll need to create a sorted copy of your file and work with that.
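If you’re not sure whether a file is already sorted, sort can tell you without changing anything. Here’s a minimal sketch using two tiny, made-up files (unsorted-demo.txt and sorted-demo.txt are hypothetical names used only for this illustration):

```shell
# Hypothetical two-line files, created only for this illustration.
printf 'pear\napple\n' > unsorted-demo.txt
printf 'apple\npear\n' > sorted-demo.txt

# sort -c checks the order without writing anything. It exits with 0
# when the file is already sorted and with a non-zero status otherwise.
for f in unsorted-demo.txt sorted-demo.txt; do
    if sort -c "$f" 2>/dev/null; then
        echo "$f: already sorted"
    else
        echo "$f: not sorted, make a sorted copy first"
    fi
done
```

A file that fails this check is the kind of file a binary-search version of look can’t be trusted with.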
Installing words
look can work with any text file you choose, or it can work with the local dictionary file “words.”
On Manjaro you need to install the “words” file. Use this command:
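On Manjaro, that is typically a pacman install of the words package. The sketch below first checks whether the dictionary already exists at /usr/share/dict/words (the usual location, assumed here), and only prints the install command if the file is missing:

```shell
# Assumed location of the dictionary file; adjust if your system differs.
dict=/usr/share/dict/words

if [ -r "$dict" ]; then
    echo "Found $dict"
else
    # On Manjaro the file is provided by the "words" package.
    echo "Missing $dict; on Manjaro, install it with: sudo pacman -S words"
fi
```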
Using look
For this article, we’ll work with a text file of the Edward Lear poem “The Jumblies.”
Let’s look at its contents with this command:
Here’s the first part of the poem. Note that we’re using Ubuntu, so the file remains unsorted. For Fedora and Manjaro, we’d work with a sorted copy of the file, which we’ll cover later in this article.
If we look for lines that start with the word “They,” we’ll find out some of what the Jumblies did.
look responds by listing these lines:
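To try this yourself without the full poem, here’s a small sketch. The file name jumblies-sample.txt and its five lines are stand-ins made up for illustration, not the article’s actual file. The lines are deliberately listed in byte order so that a binary-search look gives the same answer as Ubuntu’s linear one:

```shell
# Stand-in file: a few illustrative lines, already in byte order.
cat > jumblies-sample.txt <<'EOF'
And they went to sea in a Sieve.
Far and few, far and few,
THEY sailed away in a Sieve,
They called aloud to their friends,
They went to sea in a Sieve, they did,
EOF

# Show the file, then list the lines that begin with "They".
cat jumblies-sample.txt
look They jumblies-sample.txt
```

With this stand-in file, only the two lines that begin with “They” should be listed. The all-caps line is skipped because the match is case-sensitive.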
Ignoring Character Case
To make look ignore differences between upper- and lowercase, use the -f (ignore case) option. We’ve used “they” as the search word again, but this time, it’s in lowercase.
This time, the results include an extra line.
The line that begins with “THEY” was missed in the last set of results because it’s in all uppercase and didn’t match our search term, “They.”
Ignoring case allows look to include it in the results.
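Here’s the same idea as a sketch you can run. Again, jumblies-sample.txt and its contents are hypothetical stand-ins, listed in byte order so the result doesn’t depend on which search method your look uses:

```shell
# Stand-in file: a few illustrative lines, already in byte order.
cat > jumblies-sample.txt <<'EOF'
And they went to sea in a Sieve.
Far and few, far and few,
THEY sailed away in a Sieve,
They called aloud to their friends,
They went to sea in a Sieve, they did,
EOF

# -f folds case, so "they" matches "They" and "THEY" alike.
look -f they jumblies-sample.txt
```

This time, the line starting with “THEY” should appear alongside the two “They” lines.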
Using look with a Sorted File
If your Linux distribution has a version of look that follows the traditional behavior of using a binary search, you must either sort your file or work with a sorted copy of it.
Let’s repeat the command to search for “They,” but this time on Manjaro.
As you can see, no results were returned, even though we know there are lines in the poem that start with the word “They.” The file is unsorted, so the binary search can’t find them.
Let’s make a sorted copy of the file. If you’re going to use the -f (ignore case) or -d (alphanumeric characters and spaces only) options with look, you must use the same options when you sort the file.
The -o (output) option allows you to specify the name of the file the sorted lines should be written to. In this example, it’s “sorted.txt.”
Now let’s use look with the -f and -d options on the sorted.txt file.
Now, we get the results we expected.
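The whole sequence can be sketched like this. The input file jumblies-unsorted.txt and its lines are made up for illustration; sorted.txt is the output name used above:

```shell
# Stand-in file: a few illustrative lines in no particular order.
cat > jumblies-unsorted.txt <<'EOF'
They went to sea in a Sieve, they did,
In a Sieve they went to sea:
Far and few, far and few,
THEY sailed away in a Sieve,
And they went to sea in a Sieve.
EOF

# Sort with the same -f and -d options we intend to pass to look,
# and write the result to sorted.txt.
sort -f -d -o sorted.txt jumblies-unsorted.txt

# Search the sorted copy, ignoring case and punctuation.
look -f -d They sorted.txt
```

The original file is left untouched, and look should list the two lines that start with “They” or “THEY” from the sorted copy.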
Only Consider Spaces and Alphanumerics
To make look ignore anything that isn’t an alphanumeric character or a space, use the -d (alphanumeric) option.
Let’s see if there are any lines that start with “Oh.”
No results are returned by look.
Let’s try again and tell look to ignore anything other than alphanumeric characters and spaces. That means characters and symbols, such as punctuation, will be ignored.
This time, we get a result. We didn’t find this line before because the quotation marks and exclamation point confused the search.
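A small sketch shows the difference. The file oh-sample.txt and its lines are hypothetical, and they’re already in the order sort -d would produce, so a binary-search look can cope with them:

```shell
# Stand-in file: illustrative lines, already in dictionary (-d) order.
cat > oh-sample.txt <<'EOF'
And they went to sea in a Sieve.
Far and few, far and few,
"Oh, what a long voyage!" they said.
They went to sea in a Sieve, they did,
EOF

# Without -d, the leading quotation mark prevents a match.
look Oh oh-sample.txt || echo "no matching lines"

# With -d, punctuation is ignored and the line is found.
look -d Oh oh-sample.txt
```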
Specifying the Terminating Character
You can tell look to use a specific character as the terminating character. When you do, look only compares the characters in the search term up to and including the first occurrence of that character.
The -t (terminate character) option allows us to specify the character we’d like to use. In this example, we’re going to use the apostrophe character. We need to escape it with a backslash so that the shell doesn’t treat it as the start of a quoted string.
We’re also quoting the search term because it includes a space. We’re searching for two words.
The results match the search term, terminated by the apostrophe we used as the terminating character.
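As a hedged sketch of how this plays out, assume a made-up file called dont-sample.txt (its lines are illustrative only and listed in byte order). According to the man page, only the part of the search term up to and including the first apostrophe is compared:

```shell
# Stand-in file: illustrative lines, already in byte order.
cat > dont-sample.txt <<'EOF'
And they went to sea in a Sieve.
Don't care a button,
Don't care a fig,
Don't sail tonight,
They went to sea in a Sieve, they did,
EOF

# \' passes a literal apostrophe to -t. The search term is in double
# quotes because it contains a space (and an apostrophe).
look -t \' "Don't care" dont-sample.txt
```

Because the comparison stops at the apostrophe, the effective search term is “Don'”, so all three lines that begin with “Don't” should be listed, including the one that doesn’t continue with “care.”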
Using look Without a File
If you don’t provide a filename on the command line, look uses the words file.
The command:
gives these results:
These are all the words in the file that begin with the string “circle.”
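If your words file is large, the list can be long. A quick sketch for taking a peek at just the first few matches (the limit of five is an arbitrary choice):

```shell
# No filename is given, so look searches the system words file.
# head keeps the output short.
look circle | head -n 5
```

What you see depends on the dictionary installed on your system.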
look No Further
That’s all there is to look.
It’s pretty easy once you know there are different behaviors across different Linux distributions, and you’ve bottomed out whether your version uses a binary or linear search.