Plain text is based on representing characters as a single stream per file, using a byte for every character (or from one to
The Pragmatic Programmer: From Journeyman to Master (the first Pragmatic Bookshelf title) has a chapter titled The Power of Plain Text, which discusses the advantages of text files over binary formats: plain text does not become obsolete, and you can leverage every kind of existing tool, confident that it can handle plain text. Plain text in UTF-8 will still be readable thirty years from now.
In fact, one of the pillars of the Unix philosophy is:
Write programs to handle text streams, because that is a universal interface. -- Doug McIlroy
There is no limit to what you can do with data in plain text, because you can chain together hundreds of Unix programs that work together seamlessly. In a Unix system, no configuration file works with a unit smaller than a byte: all directives are kept in plain text files, in a structured but human-editable form.
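To make the chaining idea concrete, here is a minimal sketch in Python that imitates a small pipeline of text filters. The function names and the sample log are made up for illustration and do not correspond to any real tool; the only point is that each stage takes lines of text and gives lines of text back, so stages can be combined freely.

```python
import re
from typing import Iterable, Iterator, List


def grep(pattern: str, lines: Iterable[str]) -> Iterator[str]:
    """Keep only the lines matching a regular expression."""
    regex = re.compile(pattern)
    return (line for line in lines if regex.search(line))


def first_field(lines: Iterable[str]) -> Iterator[str]:
    """Yield the first whitespace-separated field of each non-empty line."""
    for line in lines:
        fields = line.split()
        if fields:
            yield fields[0]


def sort_unique(lines: Iterable[str]) -> List[str]:
    """Return the distinct lines in sorted order."""
    return sorted(set(lines))


# Hypothetical sample data: one "user action result" record per line.
LOG = """\
alice login ok
bob login failed
alice logout ok
carol login failed
bob login failed
"""

# Each stage consumes the text produced by the previous one.
failed_users = sort_unique(first_field(grep(r"failed", LOG.splitlines())))
print(failed_users)
```

Because every stage only agrees on "lines of text", any of them could be swapped for another filter without touching the rest.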
As an example, I put together a list of what I use plain text for. When I first really thought about it, I was impressed and pleased:
- Todo lists: a list of tasks I have to complete in the near future, divided into Urgent/Important/... sections. Since it is a list, items have a "-" before them, and I can indent subpoints by using more than one hyphen, as in "-- subtask" or "--- subsubtask". To mark a completed item and gain confidence, I replace the hyphens with "+", maintaining indentation. Using vim, it is also fairly simple to reorder items or to mark them with a macro.
- Specific todo lists: I have one todo list for this blog, for example. A general list can grow too large to remain manageable, so it's a good choice to gather todos for particular projects in their own lists. This is somewhat similar to the Getting Things Done project actions management, but at a simpler level since I do not need a more elaborate one.
- Source code: this is pretty obvious, but I wanted to point out that source code is usually plain text.
- Lists of any kind: for example, books I want to read or to find reviews for.
- Svn diffs and patches: when submitting a patch to a project like Zend Framework or Doctrine, the process involves checking out the Subversion working copy and making the changes needed to address a bug or add a feature. Then, if you do not have commit access, svn diff > myfix.patch saves the changes in a patch you can upload to the bug tracker for evaluation. The patch format builds on plain text, so it's still readable, and before committing to my projects I usually run svn diff | more to explore the changeset (another example of plain text as a universal interface).
- Goals: it is mandatory to write down your goals for the short and long term, if you are serious about achieving them. Plain text is a good choice since you can find programs to edit it everywhere, even five years from now.
- Blog posts: when writing a new article, I start with a blank vim screen (maybe I should use a template) and write all the content, the most important part of a post. Formatting and images are inserted while putting the post online and proofreading it, and emphasis on words and phrases can be specified by '' or * marks.
- Email: text emails are more portable than html ones and can be forwarded and quoted easily.
- Wiki articles: when I edit a wiki article, not only on Wikipedia but on any wiki, I use wiki formatting, which is a superset of plain text. I have included this usage since wiki formatting is very readable and can be used without a subsequent "real" formatting phase, for instance for lists like my todo ones.
- Schedules: I might use Google Calendar in the future, but now that I'm trying out scheduling my working days a simple text file named 2008-10-15.txt is perfect.
08:00 wake up&breakfast
08:15 mail&reader
08:30 nakedphp user stories estimation
...
Tabulations, even when using spaces instead of \t characters, are very useful to align text and provide spreadsheet-like capabilities. In the schedule case, I only specify the tasks for the next day, so one file is enough.
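As a rough illustration of how little tooling these formats need, the todo convention described in the list above (hyphens for depth, "+" for completed items) can be read by a few lines of Python. This is only a sketch: the TodoItem type, the parse_item function and the sample list are hypothetical names invented here, and I assume one item per line with section titles such as Urgent on lines of their own.

```python
import re
from typing import NamedTuple, Optional


class TodoItem(NamedTuple):
    depth: int   # number of marker characters: "-" is 1, "--" is 2, ...
    done: bool   # True when the markers are "+" instead of "-"
    text: str


# A run of hyphens or a run of pluses, followed by the item text.
_ITEM = re.compile(r"^\s*(-+|\++)\s*(.*)$")


def parse_item(line: str) -> Optional[TodoItem]:
    """Parse one todo line; return None for section titles and blank lines."""
    match = _ITEM.match(line)
    if match is None:
        return None
    marker, text = match.groups()
    return TodoItem(depth=len(marker), done=marker[0] == "+", text=text.strip())


# Made-up sample in the format described above.
SAMPLE = """\
Urgent
- write blog post
-- proofread draft
++ choose title
+ answer comments
"""

items = [item for item in map(parse_item, SAMPLE.splitlines()) if item]
open_items = [item.text for item in items if not item.done]
print(open_items)
```

The same file stays perfectly editable by hand in vim; the script is an optional extra, not a requirement of the format.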
There are only two problems that can arise with plain text: encodings and newlines. Specifying UTF-8 and the type of newlines (LF, CRLF, or CR) will make your text files universal and consistent. Compare these requirements to the ones for working with docx files.
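Both problems can be handled in a handful of lines. The following Python sketch converts a file's bytes to UTF-8 with LF newlines; the normalize_text name is made up for this example, and it assumes you already know the source encoding (guessing it reliably is a separate problem that this sketch does not attempt).

```python
def normalize_text(raw: bytes, source_encoding: str = "utf-8") -> bytes:
    """Return the same text encoded as UTF-8 with LF-only newlines."""
    text = raw.decode(source_encoding)
    # Convert CRLF first, so that the CR of a CRLF pair is not turned
    # into a second newline by the next replacement.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.encode("utf-8")


# Made-up input: Latin-1 bytes mixing CRLF and a lone CR.
legacy = "caffè\r\nlatte\r".encode("latin-1")
normalized = normalize_text(legacy, source_encoding="latin-1")
print(normalized)
```

Choosing LF here is just a convention for the sketch; what matters is picking one newline style and one encoding and sticking to them.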
While I advocate that web applications are the future choice for many tasks, I have never abandoned plain text. When testing out some new practice like writing goals or maintaining an effective todo list, I always start from plain text. This way, if it simply does not work for me or I am not satisfied with the results, I'll simply delete a folder on my PC. No need to register with powerful web applications such as Remember The Milk: I'm sure it works pretty well for todo lists and it's accessible from every machine connected to the Internet, but I am not ready at the moment. I'm only exploring possibilities, with a next-to-zero cost in time: I only have to open vim or gedit.
Now, before registering with dozens of web services, think about using plain text for your lists, goals, schedules... Often the simplest solution is overlooked.
This is by no means an encouragement to write a book in plain text: use complex formats for complex tasks, because there they repay their extra weight.
Thanks for the good examples.
There are several lightweight wiki-like text formats that can be converted to HTML/PDF/RTF easily. The most popular and useful is Almost Free Text, others are listed in this article.
Nitpicking: the maximum length of a UTF-8 character is 4 bytes, not 6 (characters up to U+10FFFF are allowed).
Thanks for this article.
For me the biggest advantage of plain text files is that I can store them easily in a source control system like git.
Btw: for writing a book in plain text have a look at LaTeX.
Peter,
the maximum length of an encoded sequence was originally 6 bytes; RFC 3629 later limited it to 4. In fact, Unicode characters need at most four-byte sequences. I've corrected the sentence in the post.
cakebaker,
I too store my plain text files under Subversion, since it is capable of diffing this format.