File this under…well…files

One of the cool things about having a blog is that I can see some of the search terms people use to find the site.

I see quite a few references to the Topaz format in those search terms, so I figured I would try to answer them.

This post is not going to be about the different formats that you put on your Kindle: I addressed that in this earlier post.

This post is about the file formats that Amazon puts on your Kindle.


When you buy a book from Amazon, the most common format you’ll see is .azw.  File extensions (the three- or four-letter parts after the period in your file names) usually stand for something: .txt is a text file, .htm is an HTML (Hypertext Markup Language) file, and so on.  They are usually pretty obvious.  The rumor is that this one stands for Amazon Whispernet, but I don’t think Amazon has said that.

Azw is a proprietary format…it’s owned by Amazon, and only they use it. 

It seems to be based very much on the .mobi format, which makes sense.  .mobi is the file type used by Mobipocket…and Amazon owns them.  The Kindle does read mobi files without conversion.

The azw file is the book that you get from the Kindle store.  The font they use is called Caecilia.  The files can contain DRM (Digital Rights Management).


The other format in which you’ll get books from the Kindle store is called Topaz.  It has two different file extensions, depending on how you put it on your Kindle.  If you get it wirelessly, through the Whispernet (the Kindle’s wireless internet connection), the extension will be .azw1.  Presumably, that’s Amazon Whispernet 1, to differentiate it from just plain old azw.  It might have made more sense to go with azw2, but that’s geeks for you.  🙂

If you download a Topaz file to your computer first, and then transfer it to your Kindle using the included USB cord, it will have an extension of .tpz, presumably for Topaz.

The most obvious user difference is that it can have “embedded” fonts.  In other words, the publishers get to include fonts in the file.

That makes Topaz sound better, right?  Well, there are reasons that it has earned the nickname of “The Dreaded Topaz”.

One of the biggest things, confirmed by Amazon, was that the notes you made on a Topaz file weren’t searched: the ones you made on an azw were.  I say that they “weren’t” searched, because the last thing I heard was that they were working on it.

Why does that matter?  I use the tagging method on my books as a way to organize them.   I’ve written about that in this earlier post.  The basic idea is that you make a note on a book that has your search terms, and then you search for them.  If the notes you make on a Topaz book aren’t searched, that doesn’t work.

I’ve also seen people saying that the Topaz books looked…sloppy, like a photocopy where the original wasn’t quite straight.  I’m not sure why that would be true.

Also, some people report the Kindle not remembering where they were in a book.  When I’ve asked diagnostic questions, and seen other people do that, it appears to be more of a problem with Topaz books (once we eliminate human error).  Mark that one as a rumor, though.

Another rumor: Topaz is based on EPUB.  I’ve seen this stated a couple of places, but since Amazon doesn’t want people reverse engineering stuff (that’s against the Terms of Use), I’m not going to take one apart and look at it.  However, if that’s true, it’s really interesting.  One of the “knocks” on Amazon recently has been that the Kindles don’t support EPUB.  Well, there was an effort to make EPUB an industry standard…but it never really took off until just recently, when both Google and Sony said they would use it.  The nook (sic) from Barnes and Noble will also support it.

While EPUB books (without Digital Rights Management) are easily converted for the Kindle using the free program Calibre, the Kindles do not read those files natively.  Some libraries also use EPUB, but that’s not why you typically can’t get books from public libraries on your Kindle.  That has to do with the Kindle not telling you its PID (Personal Identity), which libraries need to be able to control your use of copyrighted files.

If Topaz is based on EPUB…that doesn’t make me anxious to have it as a standard.

So, .azw and .azw1/.tpz are the two file formats in which you’ll get books from the Kindle store.


You can’t (within the system) change those book files themselves.  But you are allowed to make notes, highlights, bookmarks…if you can’t edit the book files, how does that work?

The answer is that there is a separate file associated with each book file.  It will have the same name and have either an extension of mbp (for the azw files…probably MobiPocket) or tan (for the Topaz files…presumably Topaz Annotations).  mbp is also used for the “associated information” files for  pretty much everything you get from other sources (mobi, html, txt, and so on).    That associated information, I believe, also includes the “last page read” data. 
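For anyone who likes to see that naming rule spelled out, here is a minimal Python sketch of it.  It only encodes what I described above (same name, with mbp for azw and other sources, tan for Topaz).  The function name sidecar_for is my own invention for illustration, and none of this is anything Amazon has documented.

```python
from pathlib import Path

# Topaz books, per the description above, arrive as .azw1 or .tpz.
TOPAZ_EXTENSIONS = {".azw1", ".tpz"}


def sidecar_for(book):
    """Guess the name of the associated-information file for a book file.

    Assumption: Topaz books pair with a .tan file and everything else
    (azw, mobi, html, txt, and so on) pairs with an .mbp file.
    """
    book = Path(book)
    suffix = ".tan" if book.suffix.lower() in TOPAZ_EXTENSIONS else ".mbp"
    # Same folder, same base name, different extension.
    return book.with_suffix(suffix)
```

So a book file called Dracula.azw would be matched with Dracula.mbp in the same folder, and Dracula.tpz with Dracula.tan.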

It’s not intended that you look at these files, but it’s important that you know about them.  I recommend that you regularly back up your Kindle’s documents folder (just attach the USB and copy it to your computer).  If you only copy the book files (.azw, .azw1, .tpz), you won’t get your notes (more on that later), highlighting, bookmarks, and last page read.
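If you would rather script that backup than drag folders around, something like the following Python sketch would do it.  It copies the whole documents folder (so the mbp and tan files come along) and then lists any Kindle store books that have no associated file yet.  The function name and the folder paths you pass in are hypothetical; where your Kindle shows up on your computer depends on your setup.

```python
import shutil
from pathlib import Path

BOOK_EXTENSIONS = {".azw", ".azw1", ".tpz"}
SIDECAR_EXTENSIONS = {".mbp", ".tan"}


def back_up_documents(documents_dir, backup_dir):
    """Copy the entire documents folder, then report unannotated books.

    Returns the names of book files in the backup that have no .mbp or
    .tan file with the same base name (for example, books never opened).
    """
    documents_dir = Path(documents_dir)
    backup_dir = Path(backup_dir)

    # Copy everything, not just the book files, so annotations survive.
    # dirs_exist_ok needs Python 3.8 or later.
    shutil.copytree(documents_dir, backup_dir, dirs_exist_ok=True)

    files = [p for p in backup_dir.iterdir() if p.is_file()]
    sidecar_stems = {
        p.stem for p in files if p.suffix.lower() in SIDECAR_EXTENSIONS
    }
    return sorted(
        p.name
        for p in files
        if p.suffix.lower() in BOOK_EXTENSIONS and p.stem not in sidecar_stems
    )
```

A book showing up in that list isn't necessarily a problem…it may just mean you haven't opened or annotated it yet.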

On the other hand, it appears that those files aren’t keyed to your specific copy of the book.  Hypothetically, you could give your associated information file for 20,000 Leagues Under the Sea to somebody, and your notes would show up in their copy of the book.  That could be a pretty cool way for a teacher to give notes to a class of students with Kindles, for example.

Those files don’t combine, though.  In other words, if the student had started to put notes in a book, and the teacher gave the student another mbp file for the same title, it would overwrite what the student had already done.  The student could back up her or his mbp first, but couldn’t combine the two.

Some people have reported that as a problem…since Amazon syncs your notes to its servers (for Kindle store books), it has been said that a blank Amazon version will overwrite an existing version when it is first downloaded.  This would be the supposed negative sequence:

  1. The consumer buys the book and tells the store to send it wirelessly to the Kindle
  2. The file can’t be delivered because the consumer is out of the Whispernet area
  3. The consumer downloads the file to the computer and copies it to the Kindle’s documents folder using the USB
  4. The consumer makes notes
  5. The consumer takes the Kindle into a Whispernet area…the files download, and blank mbp files wipe out the notes the consumer had made

I don’t know for sure, but I’ve seen that reported.  A new risk is the same thing happening to people using the international wireless.  They may have bought the books when they couldn’t download by wireless, but now they can, and the download happens…wiping out the notes they’ve made on files they downloaded to the computer.  I’ve also seen reports of those folks being charged the $1.99 wireless download fee for each title, even though they already had the books via computer download.

We’ve been given a new option, which is to choose “download via computer” as a purchase option (instead of sending directly to the Kindle).  You still choose for which Kindle it is intended, which Amazon needs to be able to key the file to your specific device.  That should mean, then, that when it downloads is up to you.  🙂  If you are a US customer living outside the US, I’d always choose “download via computer” as the purchase option.

If you have backed up your associated information files, you could simply replace the blank one from Amazon with the one you backed up.

By the way, you can do a little trouble-shooting with this.  If your Kindle isn’t going to the correct last page, try removing the mbp or tan file.  That will also lose all your other annotations, if any, but if you haven’t annotated yet, that’s worth a try. 
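If you want to try that without actually throwing anything away, you could rename the file instead of deleting it.  Here is a small Python sketch of that idea.  It assumes the associated file sits next to the book with the same base name, as described earlier, and the .bak naming and the function name are just my own choices.

```python
from pathlib import Path


def set_aside_sidecar(book_path):
    """Rename a book's .mbp/.tan file to *.bak so it can be restored later.

    Returns the list of renamed files (empty if there was nothing to move).
    """
    book = Path(book_path)
    moved = []
    for ext in (".mbp", ".tan"):
        sidecar = book.with_suffix(ext)
        if not sidecar.exists():
            continue
        target = sidecar.with_name(sidecar.name + ".bak")
        if target.exists():
            # Refuse to clobber an earlier set-aside copy.
            raise FileExistsError(f"{target} already exists")
        sidecar.rename(target)
        moved.append(target)
    return moved
```

If removing the file doesn't fix the last page problem, renaming the .bak copy back to its original name should bring your annotations back.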

This is all a little hypothetical, but I think that’s how it works.  If you’ve had a different experience, let me know.


In addition to the files created for each title, there is also a central repository for your clippings and bookmarks.  It’s called My Clippings.txt, and it’s fine with Amazon that you look at it.  🙂  It’s just a text file, and it is in your Kindle’s documents folder.  You can even copy and paste your clippings from there into, for example, an e-mail.  It’s not that easy to find a particular one, though…they are just listed chronologically.

A clipping looks like this:

– Clipping Loc. 333-38 | Added on Saturday, February 09, 2008, 07:01 AM

 doubt, in the monastery he fully believed in miracles, but, to my thinking, miracles are never a stumbling-block to the realist. It is not miracles that dispose realists to belief. The genuine realist, if he is an unbeliever, will always find strength and ability to disbelieve in the miraculous, and if he is confronted with a miracle as an irrefutable fact he would rather disbelieve his own senses than admit the fact. Even if he admits it, he admits it as a fact of nature till then unrecognised by him. Faith does not, in the realist, spring from the miracle but the miracle from faith. If the realist once believes, then he is bound by his very realism to admit the miraculous
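Since it's just a text file, you could pull the details out of a header line like that one with a few lines of Python.  This is only a sketch based on the single example above: the pattern is my guess at the layout, I'm assuming "333-38" is shorthand for locations 333 through 338, and the date parsing assumes English day and month names.

```python
import re
from datetime import datetime

# Pattern guessed from the one example header shown above.
HEADER = re.compile(
    r"^\s*[-–]\s*(?P<kind>\w+) Loc\. (?P<start>\d+)(?:-(?P<end>\d+))?"
    r"\s*\|\s*Added on (?P<added>.+?)\s*$"
)


def parse_clipping_header(line):
    """Return the kind, location range, and timestamp, or None if no match."""
    match = HEADER.match(line)
    if match is None:
        return None
    start = match["start"]
    end = match["end"] or start
    # Assumption: a shorter end number reuses the leading digits of the
    # start number, so "333-38" is read as 333 through 338.
    if len(end) < len(start):
        end = start[: len(start) - len(end)] + end
    added = datetime.strptime(match["added"], "%A, %B %d, %Y, %I:%M %p")
    return {
        "kind": match["kind"],
        "start": int(start),
        "end": int(end),
        "added": added,
    }
```

With the timestamps parsed, you could sort or filter your clippings by date, which gets around the "just listed chronologically" limitation a bit.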

I actually find it easier to work with my clippings (just the ones on books from the Amazon store) at Amazon’s website for that purpose, which I described in this earlier post.

Those are the main files Amazon puts on your Kindle.  If you get a software update, that’s another one, but that’s not typical.

Questions?  Contrary experiences?  Let me know…

This post by Bufo Calvin originally appeared in the I Love My Kindle blog.

