Joy of Elixir

11. Working with files

So far, we've worked with data that we've defined ourselves in the iex prompt. This has been incredibly handy because it has allowed us (and Izzy) to experiment with Elixir's features. When we program in the real world, data typically comes from sources external to our code. One of those sources is files from the file system.

In this chapter, we'll look at how we can read existing files, create new ones and even delete files using functions that are built-in to Elixir.

Reading a file

The first thing we're going to take a look at here is how to read a file's contents and then use Elixir to do something with those contents.

Certain files may contain data that we can use in our Elixir programs. Elixir can read any file you throw at it. In this chapter, just for simplicity, we're going to stick to a single file format: text files.

A text file is just a file with a bunch of words in it. You probably have a file like that on your computer right now. You could open one of these files in your text editor and read what was written in that file. If for some weird reason you don't have one of these, you can take this content and put it in a new file called haiku.txt:

rixilE evol I
nrael ot ysae os si tI
edoc lanoitcnuf taerG

I've put this file in a directory on my computer called "Joy of Elixir":

Figure 11.1: The haiku.txt file on my Mac within the Finder

And here's the content of that file that I see when I double-click on it:

Figure 11.2: The haiku.txt file's contents within my Mac's TextEdit program

Izzy squints at the haiku. "Um... This haiku looks a little... backwards". Yes, Izzy! Each line is written backwards! The haiku should read:

I love Elixir
It is so easy to learn
Great functional code

We could go into the file and correct this ourselves, but we're programmers learning a super-duper-awesome programming language and by golly if we aren't going to use it to solve every problem we come across. To solve this problem, we're going to use the power of Elixir.

What we're going to do is to read this haiku into Elixir, and then we're going to reverse each line so that its characters run the right way around. I make it sound so easy, and that's because it is. Really!

To read this file, we'll need to open an iex prompt where the file is, and then we can use this code to read it:

iex> File.read("haiku.txt")

This code tells Elixir to read a file by calling the File.read/1 function. Calling this function will give us the following output:

{:ok, "rixilE evol I\nnrael ot ysae os si tI\nedoc lanoitcnuf taerG\n"}

In this output, there's not one but two new concepts. You've gotten so far through the book that you're now so great at learning and that means I can introduce things rapid-fire and you'll pick 'em up with no effort.

The first of these is the curly brackets. Did you notice them? They wrap all of the output from the File.read/1 call. Did you notice that these curly brackets, unlike all the other curly brackets we've seen before, are not prefixed with a percent-sign (%)?

This particular concept in Elixir is called a tuple. You can think of tuples as a fixed-length list and they're used to link a bunch of information together in a particular order. In this case, it's telling us that the file operation was :ok, and then it's giving us a string containing all the file's information.

Izzy pipes up: "What's that :ok thing and why does it have a colon before it?". Good spot, Izzy. That is called an atom. We first saw these back in Chapter 4 when they were used as keys in maps.

We've seen atoms since then, but here's a refresher: Atoms are used to provide informational messages, like in this case. An atom's name is its value. This is unlike strings, maps and lists, which we would assign to a variable in order to give them a meaningful name. This atom is telling us that the operation we asked File.read/1 to perform went "ok"; we were able to read the file successfully.

If we specified a file that wasn't present, File.read/1 would give us a different atom in that first place:

iex> File.read("haiboo.txt")
{:error, :enoent}

This cryptic error message uses a tuple containing two atoms, :error and :enoent. The first one is self-explanatory -- there was an error loading this file. The second one gives us a non-regular-human-friendly error message: :enoent. This is computer-lingo for "I couldn't find that file you were looking for, sorry."

The best part about these tuples and atoms being returned from the File.read/1 call is that we can use pattern matching (Chapter 6). If we want to only proceed if our File.read/1 function executes successfully, we can pattern match like this:

iex> {:ok, contents} = File.read("haiku.txt")

Go ahead and try this code out in your iex prompt. Try it with the wrong path too and see what happens. If you give it the wrong path, you should see it fail like this:

** (MatchError) no match of right hand side value: {:error, :enoent}

This error happens because we're telling Elixir to expect that the tuple returned from this call contains an :ok atom at the start, but it doesn't. Pattern matching can be a useful way of stopping your program in its tracks like this.
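As a small aside, that cryptic :enoent atom can be turned into something friendlier. The sketch below assumes that a file called no-such-file.txt (a made-up name) does not exist in the current directory. It matches on the :error tuple instead of the :ok one, and then asks :file.format_error/1 for a description. That function comes from Erlang, the language Elixir is built on, and isn't something this book has covered, so treat it as a peek ahead rather than something you need to know right now.

```elixir
# Assumption: "no-such-file.txt" does not exist in the current directory.
{:error, reason} = File.read("no-such-file.txt")

# :file.format_error/1 comes from Erlang and returns a charlist,
# so we convert it to a regular Elixir string before printing it.
message = reason |> :file.format_error() |> to_string()

IO.puts("Could not read the file: #{message}")
```

Here the pattern match succeeds precisely because the read failed, and reason ends up holding the :enoent atom.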

Let's look back at the successful read:

iex> {:ok, contents} = File.read("haiku.txt")

You'll see exactly the same values come back as when we did the first File.read/1 invocation:

{:ok, "rixilE evol I\nnrael ot ysae os si tI\nedoc lanoitcnuf taerG\n"}

The difference is: this time, we've got the contents of the file in a contents variable. Oooh, that was sneaky! Let's get back to the task at hand: we still need to correct this haiku back to its proper form. We've now got the contents of this file and we need to reverse each line. To do that, we need some way of splitting apart the string so that we can process each line separately from each other line. To do this, we can use our old friend, String.split/3:

iex> contents |> String.split("\n", trim: true)

This code takes the string stored in contents and passes it as the first argument to String.split/3. The other two arguments are: 2) the string "\n" and 3) the option trim: true. The 2nd argument tells String.split/3 to split the string on the newline characters (\n), and the 3rd argument tells String.split/3 to remove any empty strings from the resulting list. Without it, the newline at the very end of the file would leave us with an empty string as the last item.
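If you're curious what trim: true changes, here is a minimal sketch that splits the same string with and without the option. The text variable is made up for this example and stands in for any string that ends with a newline.

```elixir
# A made-up string that, like our file, ends with a newline.
text = "one\ntwo\nthree\n"

# Without trim: true, the final newline leaves an empty string at the end.
untrimmed = String.split(text, "\n")
IO.inspect(untrimmed)
# => ["one", "two", "three", ""]

# With trim: true, empty strings are dropped from the result.
trimmed = String.split(text, "\n", trim: true)
IO.inspect(trimmed)
# => ["one", "two", "three"]
```

Now, back to the haiku and the String.split/3 call we just wrote.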

When this function does its thing, we'll see this output:

["rixilE evol I", "nrael ot ysae os si tI", "edoc lanoitcnuf taerG"]

Yay! We've now got a list of strings here. We need each string here to be reversed. Izzy sticks her hand up and wiggles it around. "Ooh ooh ooh I know how to do this! String.reverse/1!", she says, monospaced font and all. Impressive. Yes, Izzy is correct! We can reverse a string by calling String.reverse/1 as we first saw back in Chapter 8. We know we can do it with a single string like this:

iex> "rixilE evol I" |> String.reverse
"I love Elixir"

But we don't have a single string here, we have three strings within a list. But we have knowledge on our side. We have special skills that we built up in Chapter 9, and with those special skills we know that we can call Enum.map/2 to apply a function to multiple elements within a list. We've done exactly this back in Chapter 9:

iex> Enum.map(cities, &String.capitalize/1)

So let's take our list, adapt this code a little bit to use the pipe operator and String.reverse/1 and reverse each string with this code:

iex> contents \
|> String.split("\n", trim: true) \
|> Enum.map(&String.reverse/1)

We're using a multi-line pipe operator chain here to accomplish what we want. Because we're in iex, we need to put a backslash (\) on the end of every line to tell Elixir to treat all these lines as one long line and one chain of operations. If we were writing code in an Elixir file, we wouldn't need to do this. iex is a little special in this regard. Another way we could've written the code is like this:

iex> contents |> String.split("\n", trim: true) |> Enum.map(&String.reverse/1)

But it is considered best practice to split long pipe chains across multiple lines to help with readability. Think of it as the same rule that applies when you're writing a book: you split your text at logical breaks into separate paragraphs to make it easier for people to read. We're applying almost that same rule to our Elixir code.

This code (in either the one-line or three-line form) will give us back a list of strings that now look like proper English:

["I love Elixir", "It is so easy to learn", "Great functional code"]
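For comparison, here is a sketch of how the same chain might look inside a file rather than in iex. The file name is an assumption: imagine saving this as fix_haiku.exs and running it with elixir fix_haiku.exs. No backslashes are needed there. The contents string is typed in by hand so that the example runs on its own, without needing haiku.txt.

```elixir
# The reversed haiku, typed in directly so this file can run by itself.
contents = "rixilE evol I\nnrael ot ysae os si tI\nedoc lanoitcnuf taerG\n"

# Inside a file, each |> can start a new line with no trailing backslash.
fixed_lines =
  contents
  |> String.split("\n", trim: true)
  |> Enum.map(&String.reverse/1)

IO.inspect(fixed_lines)
```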

All we need to do now is to put these lines back together into one big string, which we can do with a new-to-us function called Enum.join/2:

iex> fixed_contents = contents \
|> String.split("\n", trim: true) \
|> Enum.map(&String.reverse/1) \
|> Enum.join("\n")
"I love Elixir\nIt is so easy to learn\nGreat functional code"

Excellent! We've now got our file's text around the right way. We did this with a combination of quite a few functions, and that really demonstrates the repertoire of things that we know how to do with Elixir now. We have used the File.read/1 function to read this file's contents and to bring them into our Elixir code. Once we have the contents there, we can do whatever we wish with them.

Before we move on, I want to show you another way that Elixir has for reading files.

Streaming a file

Elixir provides us at least one very clear way to read a file: File.read/1. We know how this works. But there is another function that Elixir gives us which can also be used to read files. That function is called File.stream!/1.

This function works by reading a file one line at a time. This is useful in cases where we might be reading very large files. If we used File.read/1 to load a large file, it might take a long time for Elixir to read it all in and it might take up a lot of our computer's resources. In contrast, File.stream!/1 can still be used to read an entire file, but it will read that file line-by-line.

We call this type of thing in programming parlance "eagerness" or "laziness". The File.read/1 function can be said to be eager: it eagerly loads the whole file without a care in the world if we're going to use the whole lot or not. While the File.stream!/1 function is its more chilled out cousin: it will lazily read a single line at a time from the file.

Think of it in terms of your favourite internet streaming service. When you stream from that service, you don't have to wait for every single episode of your favourite Scandi Noir TV series to download. You don't even have to wait for a single episode to fully download because it only sends through a small part of the episode at a time. This is what File.stream!/1 does: gives us the file line-by-line.

Ok, I think we've talked enough about what File.stream!/1 does! It's time to see this in action. Let's use this code to start reading our haiku.txt file again:

iex> stream = File.stream!("haiku.txt")
%File.Stream{
  line_or_bytes: :line,
  modes: [:raw, :read_ahead, :binary],
  path: "haiku.txt",
  raw: true
}

This time, we're not given back the contents of the file. We're given back something we've never seen before: a struct.

We can think of structs as maps that follow a particular set of rules. Just like maps, structs have key-value pairs, as we can see from the output here. If we removed the File.Stream part of this output, the syntax would be a map:

iex> %{
  line_or_bytes: :line,
  modes: [:raw, :read_ahead, :binary],
  path: "haiku.txt",
  raw: true
}

What makes a struct different here is that it is named, and structs with the same name always have the same keys. Every time we call File.stream!/1, we'll see a File.Stream struct come back with exactly the same keys.
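To make the "structs are maps with rules" idea a little more concrete, here is a small sketch. It makes one assumption: rather than relying on haiku.txt, it creates a throwaway file in the system's temporary directory (the file name is made up for this example). It then reads a key from the struct, pattern matches on the struct's name, and checks that a struct really is a map underneath.

```elixir
# Assumption: a throwaway file in the temporary directory, made up for this example.
path = Path.join(System.tmp_dir!(), "joy-of-elixir-struct-example.txt")
File.write(path, "just one line\n")

stream = File.stream!(path)

# A struct's keys can be read with a dot, just like a map with atom keys.
IO.inspect(stream.path)

# Pattern matching works too, and the struct's name is part of the pattern.
%File.Stream{path: matched_path} = stream
IO.inspect(matched_path == path)

# Underneath, a struct is still a map.
IO.inspect(is_map(stream))

# Tidy up the throwaway file.
File.rm(path)
```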

The keys in this particular struct tell Elixir what this file stream is all about. It tells Elixir that we're going to read from the haiku.txt (path) file line-by-line (line_or_bytes). I won't go into what modes and raw are for here.

At this point, all we have is this File.Stream struct. Consider it as an "intent to read" the file. No reading has occurred at this point yet. To trigger this reading, we can pipe this stream into any Enum function. For instance, we could trim all the newline characters (\n) off each line like this:

iex> fixed_contents = stream \
|> Enum.map(&String.trim/1)
["rixilE evol I", "nrael ot ysae os si tI", "edoc lanoitcnuf taerG"]

That's a good start. We're not going to be able to peer into exactly what Elixir is doing here, so you'll have to take my word for it that Elixir is now reading each line in this file one at a time and sending it through to Enum.map/2. The output from calling Enum.map/2 is good, but what we really need is a list of sentences the right way around. We can flip these by piping the output again:

iex> fixed_contents = stream \
|> Enum.map(&String.trim/1) \
|> Enum.map(&String.reverse/1)
["I love Elixir", "It is so easy to learn", "Great functional code"]

That's getting better! Now we need to join these back together again and put line breaks in between them with Enum.join/2 again:

iex> fixed_contents = stream \
|> Enum.map(&String.trim/1) \
|> Enum.map(&String.reverse/1) \
|> Enum.join("\n")
"I love Elixir\nIt is so easy to learn\nGreat functional code"

Excellent! We now have the fixed haiku in string form inside Elixir. We could get this through either File.read/1 or File.stream!/1. Normally, we would only use File.stream!/1 if the file was really long, but in this section we've used it for illustrative purposes and to prove that there's more than one way to read a file in Elixir.
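To get a small feel for the laziness of File.stream!/1, here is a sketch that asks a stream for only its first line. It makes a couple of assumptions: it creates its own throwaway file in the system's temporary directory (the file name is made up for this example), and it uses Enum.take/2, which may be new to you, to take a given number of items from the front of a collection.

```elixir
# Assumption: a throwaway file in the temporary directory, made up for this example.
path = Path.join(System.tmp_dir!(), "joy-of-elixir-lazy-example.txt")
File.write(path, "first line\nsecond line\nthird line\n")

# Enum.take/2 stops asking the stream for lines once it has the one it wants.
first_line = path |> File.stream!() |> Enum.take(1)
IO.inspect(first_line)
# => ["first line\n"]

# Tidy up the throwaway file.
File.rm(path)
```

With a three-line file the saving is tiny, but the same code would behave the same way on a file with millions of lines.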

It would be really handy if we could take this string out of Elixir and put it into a new file so that nobody else has to read a backwards haiku. That's what we'll be looking at next!

Creating and writing to a new file

We now have the re-arranged contents of our haiku stored within a variable called fixed_contents. If you've lost this since the last section, you can use this code to get it back:

iex> fixed_contents = "haiku.txt" \
|> File.stream! \
|> Enum.map(&String.trim/1) \
|> Enum.map(&String.reverse/1) \
|> Enum.join("\n")

The pipe operator makes this code so much easier to read! Now that we most certainly have the file contents stored in Elixir and put the right way around, we want to write these contents to a new file. We saw that we could read a file with File.read/1, so it would make sense for there to be a File.write function too. It definitely does exist, and it's called File.write/2. It takes two arguments: 1) the path we want to write to and 2) the contents that we want to put in the file. It works like this:

iex> File.write("fixed-haiku.txt", fixed_contents)
:ok

Elixir dutifully takes the file path and the fixed contents and writes those contents into a file called fixed-haiku.txt. Well, we assume so. All that we know about this operation is that Elixir thought it went OK, as indicated by the atom that it returned: :ok.

If we want to make sure that the fixed-haiku.txt file is definitely there and contains what we expect it to contain, we could simply find that file in the file browser on our computers and double click on it. That would be the easiest way. But, because we're programmers and we've got powerful new skills up our sleeves, we can use what we know to check this. Let's use File.read/1 to see if that file is there:

iex> File.read("fixed-haiku.txt")
{:ok, "I love Elixir\nIt is so easy to learn\nGreat functional code"}

Great! Our file has definitely been written with the right content. Our haiku is in the right order.
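One thing worth knowing before you use File.write/2 on files you care about: writing to a path that already exists replaces what was there. Here is a minimal sketch of that behaviour. It assumes a throwaway file in the system's temporary directory (the file name is made up for this example), so none of our haiku files are touched.

```elixir
# Assumption: a throwaway file in the temporary directory, made up for this example.
path = Path.join(System.tmp_dir!(), "joy-of-elixir-write-example.txt")

:ok = File.write(path, "first version")
{:ok, first_read} = File.read(path)

# Writing to the same path again replaces the old contents
# rather than adding to them.
:ok = File.write(path, "second version")
{:ok, second_read} = File.read(path)

IO.inspect({first_read, second_read})
# => {"first version", "second version"}

# Tidy up the throwaway file.
File.rm(path)
```

Notice the pattern matching on :ok and {:ok, ...} here. If any of these steps failed, the script would stop right there with a MatchError.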

Renaming a file

Izzy asks another question: "Why don't we switch these haiku.txt and fixed-haiku.txt files around? If I was opening a file called haiku.txt, I would expect that to be a proper haiku, not this weird reversed one! I wouldn't think to look in fixed-haiku.txt."

Izzy's right. It's weird that haiku.txt doesn't contain the haiku. We should rename this strange reversed version to reversed-haiku.txt, and then rename fixed-haiku.txt to haiku.txt to ease any confusion people might have.

To read a file there was File.read/1. To write a file there was File.write/2. So it makes sense that in order to rename a file, there is File.rename/2. Elixir is pretty sensible like that, if you haven't realised already. File.rename/2's two arguments are the original name of the file, and then the new name we want.

Let's use this function to first rename haiku.txt to reversed-haiku.txt:

iex> File.rename("haiku.txt", "reversed-haiku.txt")
:ok

Elixir has told us that this operation has succeeded. Well, if that's the case then we should see when we look into that directory again that the haiku.txt file isn't there, but a reversed-haiku.txt file is there instead. Oh, and fixed-haiku.txt exists too. Let's take a peek:

Figure 11.3: Two files now exist, fixed-haiku.txt and reversed-haiku.txt.

Yes! Our file has been renamed successfully. That's a good start. Let's rename fixed-haiku.txt into haiku.txt with File.rename/2 too:

iex> File.rename("fixed-haiku.txt", "haiku.txt")
:ok

Elixir tells us that the file rename operation was successful yet again. Two for two! Peeking into that directory one more time, we'll see that indeed it was:

Figure 11.4: The haiku (haiku.txt), and its reversed version (reversed-haiku.txt) are safely in place.

Wonderful stuff. Whenever we need to rename a file within Elixir we now know that we can reach for File.rename/2.
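We've been peeking into the directory by hand to confirm each rename, but Elixir can do the peeking too. The sketch below uses File.exists?/1, a function this chapter hasn't covered, which returns true or false depending on whether there is a file at the given path. The file names are made up for this example and live in the system's temporary directory, so our haiku files are left alone.

```elixir
# Assumption: throwaway file names in the temporary directory, made up for this example.
dir = System.tmp_dir!()
old_path = Path.join(dir, "joy-of-elixir-old-name.txt")
new_path = Path.join(dir, "joy-of-elixir-new-name.txt")

File.write(old_path, "rename me")
:ok = File.rename(old_path, new_path)

# After the rename, only the new name should be present.
old_exists = File.exists?(old_path)
new_exists = File.exists?(new_path)
IO.inspect({old_exists, new_exists})
# => {false, true}

# Tidy up the throwaway file.
File.rm(new_path)
```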

So now we've seen how to read existing files, create new ones and rename them. The final file operation that we'll look at in this chapter is deleting files.

Deleting a file

We've all deleted files from our computers in our lifetime. Sometimes even on purpose. The files reach a point where they're no longer useful and we want them gone. The way we might do this is to drag the file to the Recycle Bin, or to right click and choose "Delete", or... well, there's so many ways to delete files.

Elixir provides us a way to delete files too, but it isn't as intuitively named as the operations for reading (File.read/1), writing (File.write/2), and renaming (File.rename/2). To delete a file in Elixir we call the function called File.rm/1.

"RM? Like R.M. Williams, the world famous Australian manufacturer of leather things? What do leather shoes and belts have to do with removing files?", Izzy asks with quite the perplexed look on her face, cork hat bobbling quizzically along. Woah, slow down there Izzy. Elixir's File.rm/1 has nothing to do with boots!

This File.rm/1 function is called rm as a shorthand for "remove". We can use this function to remove any file we like. We would like to keep our haiku files, so let's create a separate file that we can remove later by using File.write/2 first, and then we'll delete it with File.rm/1. Let's do this slowly so that we can see that the file definitely does exist before we delete it. Let's start with creating the file:

iex> File.write("delete-me.txt", "delete me")
:ok

If we look in our directory, we can see that this file definitely exists:

Figure 11.5: The delete-me.txt file exists.

Let's try removing this file now:

iex> File.rm("delete-me.txt")
:ok

Elixir says that removing this file was successful. Let's take a look at the directory:

Figure 11.6: The delete-me.txt file is gone!

The file is gone!
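If you'd rather have Elixir confirm this instead of looking in the directory yourself, here is a minimal sketch of the same create-then-delete steps. It relies on File.exists?/1, a function not covered in this chapter that returns true or false depending on whether a file is at the given path. The file lives in the system's temporary directory and its name is made up for this example.

```elixir
# Assumption: a throwaway file in the temporary directory, made up for this example.
path = Path.join(System.tmp_dir!(), "joy-of-elixir-delete-example.txt")

:ok = File.write(path, "delete me")
existed_before = File.exists?(path)

:ok = File.rm(path)
exists_after = File.exists?(path)

IO.inspect({existed_before, exists_after})
# => {true, false}
```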

We've now got our head around some of the essential things that we can do with files in Elixir. We know that we can read existing files with File.read/1 (or line-by-line with File.stream!/1), write completely new ones with File.write/2, rename them with File.rename/2 and delete existing files with File.rm/1. For good measure, we also learned that these functions may return tuples that indicate errors, like {:error, :enoent}, which tells us that a file does not exist.

We've now encountered a situation (well, situations) in Elixir where we can call code and get different outcomes depending on external factors like whether files exist or not. The code that we've written so far hasn't been particularly resilient to this sort of thing. We've become accustomed to running code and always seeing the same result.

But the rules have changed. We now will need to account for this sort of thing. In the next chapter I will show you a few ways that we can make our code more resilient by making it behave differently, depending on the circumstances.

Exercises

  • Can you make Elixir write a program for itself? Put this code into a file called script.ex with File.write/2: IO.puts "This file was generated from Elixir" and then make it run by running elixir script.ex.
  • Figure out what happens if you try to delete a file that doesn't exist with File.rm/1. Is this what you expected to happen?