Chapter 7 Replicating the head command

7.1 Learning objectives

  • Learn how to replicate the head command in Python

  • Understand how to read a file and print the first few lines

  • Learn how to use a for loop to limit the number of lines printed

7.2 Create a longer file

  • Let’s create a simple text file called sample2.txt using the Jupyter text editor

    The file should contain the following lines:

    apple
    banana
    cherry
    date
    elderberry
    fig
    grape
    peach
    kiwi
    lemon

7.3 Practice the head command

This exercise should be done in the Jupyter terminal.

The head command, run here from the Bash shell, prints the first few lines of a file. By default, it shows the first 10 lines, but you can specify a different number with the -n option.

  • In the terminal, run the command:

    head sample2.txt
    • This prints all 10 lines of sample2.txt, because the file has exactly 10 lines and the default limit is 10

      apple
      banana
      cherry
      date
      elderberry
      fig
      grape
      peach
      kiwi
      lemon
  • Now try running it with the -n option to print only the first 5 lines

    head -n 5 sample2.txt
    • You should see the following output

      apple
      banana
      cherry
      date
      elderberry

7.4 Designing the head algorithm

We can replicate the head command by adding some logic to the for loop we introduced in Ch 6. Parsing text files.

A for loop is well suited to iterating through the items in a collection, such as the lines in a file. To limit how many lines we print, we add a counter variable and an if statement that breaks out of the loop once the limit is reached.

i = 0                          # number of lines printed so far
for my_line in my_file:
    if i >= max_lines:         # limit reached, so stop reading
        break
    my_line = my_line.rstrip("\n")
    print( my_line )
    i = i + 1                  # count the line we just printed

Key points:

  • We can use a counter variable i to keep track of how many lines we’ve printed. i = i + 1 increments the counter by 1 each time we print a line.

  • The if statement checks whether the number of lines printed has reached the maximum specified by the user. If it has, break ends the loop before another line is printed

  • The max_lines variable is set to 10 by default, but can be changed by providing a second command line argument
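
To see the counter and break in isolation, the sketch below applies the same logic to a plain list of strings instead of an open file. The function name first_lines and the short fruits list are assumptions made for illustration; they are not part of the script built in this chapter.

```python
def first_lines(lines, max_lines=10):
    """Return up to max_lines items from lines, with the trailing newline removed."""
    result = []
    i = 0  # number of lines collected so far
    for my_line in lines:
        if i >= max_lines:
            break  # limit reached, so stop looping
        result.append(my_line.rstrip("\n"))
        i = i + 1
    return result


# A hypothetical stand-in for the lines of a file
fruits = ["apple\n", "banana\n", "cherry\n", "date\n"]

print(first_lines(fruits, 2))   # ['apple', 'banana']
print(first_lines(fruits, 10))  # all four items: the loop ends when the list runs out
print(first_lines(fruits, 0))   # []
```

Note that a limit larger than the number of lines is harmless: the for loop simply ends on its own before the counter ever reaches max_lines.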

7.5 Coding step by step

  • Create a new Python script called 04-head.py and start with the code we introduced in Ch 6. Parsing text files

    #!/usr/bin/env python3
    
    import sys
    
    my_file = open( sys.argv[1] )
    
    for my_line in my_file:
        my_line = my_line.rstrip("\n")
        print( my_line )
    
    my_file.close()
  • Now add code after the open() call to set the maximum number of lines to print, defaulting to 10 if no second argument is given

    max_lines = 10
    if len(sys.argv) > 2:
        max_lines = int(sys.argv[2])
  • Finally, replace the existing for loop with one that uses a counter to limit the number of lines printed

    i = 0
    for my_line in my_file:
        if i >= max_lines:
            break
        my_line = my_line.rstrip("\n")
        print( my_line )
        i = i + 1
  • The complete script should look like this:

    #!/usr/bin/env python3
    
    import sys
    
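    # Open the file named by the first command line argument
    my_file = open( sys.argv[1] )
    
    # Print 10 lines by default, or the number given as the second argument
    max_lines = 10
    if len(sys.argv) > 2:
        max_lines = int(sys.argv[2])
    
    # Count printed lines and stop once the limit is reached
    i = 0
    for my_line in my_file:
        if i >= max_lines:
            break
        my_line = my_line.rstrip("\n")
        print( my_line )
        i = i + 1
    
    my_file.close()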
  • Save the file and make it executable

    chmod +x 04-head.py
  • Run the script with the file name and number of lines as arguments

    ./04-head.py sample2.txt 5

    This will print the first 5 lines of sample2.txt

    apple
    banana
    cherry
    date
    elderberry
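
Once the counter version works, the same behaviour can be written more compactly. The sketch below is one possible alternative, not the approach used in this chapter: it swaps the manual counter for itertools.islice from the standard library and uses a with block so the file is closed automatically. The function names head and parse_max_lines are assumptions chosen for illustration.

```python
import itertools


def parse_max_lines(argv, default=10):
    """Return the line limit from argv[2] if it was given, otherwise the default."""
    if len(argv) > 2:
        return int(argv[2])  # raises ValueError if the argument is not a whole number
    return default


def head(path, max_lines=10):
    """Return the first max_lines lines of the file at path, newlines removed."""
    # The with block closes the file even if an error occurs while reading
    with open(path) as my_file:
        # islice stops after max_lines items, or earlier if the file is shorter
        return [my_line.rstrip("\n") for my_line in itertools.islice(my_file, max_lines)]
```

With these two helpers, a script could call head(sys.argv[1], parse_max_lines(sys.argv)) and print each returned line. The counter-based loop remains the clearer choice while learning how break works.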

7.6 Summary

Congratulations! You have just:

  • Created a Python script that replicates the head command

  • Used command line arguments to specify the file and number of lines to print