No SWAP Partition, Journaling Filesystems, … on a SSD?

December 7, 2008

I’m going to get an Asus Eee PC 901go, which has a Solid State Disk (SSD) instead of a normal hard disk (HD). As you may know, I’ll remove the preinstalled Linux and install my own Kubuntu. I soon started looking for the best way to install it and found the following recommendations copy-and-pasted across various sites:

  1. Never choose to use a journaling file system on the SSD partitions
  2. Never use a swap partition on the SSD
  3. Edit your new installation fstab to mount the SSD partitions “noatime”
  4. Never log messages or error log to the SSD

Are they really true, or just copied and pasted without much thought? But first: why should this be a problem at all? SSDs have a limited number of write (erase) cycles. Depending on the type of flash-memory cell, they will fail after only 10,000 write cycles (MLC) or up to 100,000 write cycles (SLC), while high-endurance cells may survive 1–5 million write cycles. Special file systems (e.g. jffs, jffs2, logfs for Linux) or firmware designs can mitigate this problem by spreading writes over the entire device (so-called wear leveling) rather than rewriting files in place. So theoretically there is a problem, but what does this mean in practice?

The experts at storagesearch.com have written an article, SSD Myths and Legends – “write endurance”, which takes a closer look at this topic. They provide the following simple calculation:

  • One SSD, 2 million cycles, 80MB/sec write speed (among the fastest SSDs on the market at the time), 64GB (entry level for enterprise SSDs – a larger capacity only increases the lifetime)
  • They assume perfect wear leveling, which means they need to fill the disk 2 million times to reach the write endurance limit.
  • 2 million (write endurance) x 64G (capacity) divided by 80M bytes / sec gives the endurance limited life in seconds.
  • That’s a meaningless number – which needs to be divided by seconds in an hour, hours in a day etc etc to give…

The end result is 51 years!
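Their back-of-the-envelope arithmetic is easy to reproduce — a sketch of the same calculation (perfect wear leveling, writing at full speed nonstop; whether you land on 51 or 52 years depends on decimal vs. binary gigabytes):

```python
# Endurance-limited lifetime = (write cycles * capacity) / write speed.
# Assumes perfect wear leveling and continuous writing at full speed.

SECONDS_PER_YEAR = 60 * 60 * 24 * 365

def endurance_years(cycles, capacity_gb, write_mb_per_s):
    total_mb = cycles * capacity_gb * 1024        # GB -> MB
    return total_mb / write_mb_per_s / SECONDS_PER_YEAR

# 2 million cycles, 64 GB, 80 MB/sec sustained:
print(endurance_years(2_000_000, 64, 80))         # roughly 52 years
```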

OK, that’s for servers, but what about my Asus 901go?

  • Let’s take the benchmark values from eeepc.it, which show a maximum of 50 MByte/sec. But that is sequential write, which is not the write profile of our atime, swap, and journaling traffic. Those are typically 4k blocks, which leads to about 2 MByte/sec. (Side note: the EeePC 901go mounts the same SSD as the EeePC S101, to be precise model ASUS SATA JM-chip Samsung S41.)
  • We also stick with the 2 million cycles and assume a 16GB SSD.
  • With 50 MByte/sec we get 20 years!
  • With 2 MByte/sec we get 519 years!
  • And even if we reduce the write cycles to 100,000 and write with 2 MByte/sec all the time, we’re still at 26 years!

And all this assumes writing all the time – even ext3 writes its journal only every 30 seconds if no data needs to be written. So the recommendation to safeguard SSDs because they supposedly cannot handle frequent writes is bullshit!

So let’s take a closer look at the 4 points from the beginning of this blog post.

  1. Never choose to use a journaling file system on the SSD partitions: Bullshit – you’re just risking your data’s integrity. Stay with ext3.
  2. Never use a swap partition on the SSD: If you have enough space on your SSD, use a swap partition. It won’t be written to until you run low on RAM, in which case you can run a program or perform a task that otherwise would not be possible. Also take a look at this article.
  3. Edit your new installation fstab to mount the SSD partitions “noatime”: That is a good idea, provided all your programs work with this setting, as it will speed up your read performance, especially with many small files. Also take a look at nodiratime.
  4. Never log messages or error log to the SSD: Come on, how many log entries do you get on a netbook? This is not an email server with more than 1000 log lines per second.
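For point 3, the change is one line in /etc/fstab – a hypothetical example entry (your device name, filesystem, and other options will differ):

```
# /etc/fstab – example only; the device/UUID depends on your system
/dev/sda1  /  ext3  defaults,noatime  0  1
```

After editing, `mount -o remount /` applies the option without a reboot.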

Please write a comment if you disagree – or agree – with my blog post. Thx!

38 Comments »


  1. [...] I’ve written before I got an Asus EeePC 901go for Christmas and of course I’ve installed Kubuntu (8.10 / [...]

    Pingback by Tips for running Kubuntu 8.10 (Intrepid) on an EeePC 901go | Robert Penz Blog — December 26, 2008 #

  2. Thank you for the explanation.
    I’ve read quite many threads on SSD but few comes from people used to Linux.

    So now, I know that I’ll be able to make a swap partition without issues and use a good old ext3 file system.

    Regards.

    Comment by Sagittarius — January 4, 2009 #

  3. It’s nice to see someone dispelling the myths about the SSD lifetime. The SSD will probably outlive the gadget that contains it.

    A small note on your 3rd point: “noatime” implies “nodiratime”, so specifying “noatime” is enough. The option “relatime” might also be used instead of “noatime” as it provides a balance between having accurate last access timestamps and minimizing disk writes.

    Comment by Prasinos — January 9, 2009 #

  4. The information you are using is based on single-level cells (SLC) and not the cheaper MLC which is coming to most consumers now. Going by your calculations (for a 10,000 write cycle life) we might only be getting 2.6 years before failure at 2 MB/sec. Realistically though they are saying something like 7 years for a casual user. That’s significant as some random drives will fail sooner.

    So for my MLC drive I will still be taking some measures to reduce writes, thanks.

    Comment by Dantroline — January 12, 2009 #

  5. My experience with an MTRON MSP-7035 32GB ssd (SLC drive) is that it failed within a year. It failed on rrd files accessed and written to by the Ganglia monitoring system. These files were updated every few seconds.
    I noticed it when I wanted to backup that directory with a simple tar command, when I accessed that particular file the whole drive errored out and was not accessible anymore, which meant other partitions as well.

    So at least point 4 (no logs) seems to be true in my case, with my drive.

    Comment by Gerb — January 22, 2009 #

  6. Really helpful stuff. Thanks.

    I really needed this when looking at putting on a new distro on my Eee 1000.

    Comment by Malcolm Bastien — January 27, 2009 #

  7. One more thing to consider is that flash-devices handle their space in blocks. The blocksize typically varies between 16KB and 512 KB. Therefore writing one byte may cause erase and rewrite of up to 512KB.

    The way it works is that all the bits in a block are first reset to 1 and then the required ones set to 0. Bits with value 1 can be changed to 0 without erasing the block, but not vice versa.

    Comment by Janice Parson — January 30, 2009 #

  8. I tmpfs log directories.

    Many times I’ve seen bad drivers/hardware spew hundreds of messages a second into logs…
    every second…

    Comment by Chris — February 10, 2009 #

  9. I just yesterday got my EeePC 901 delivered, and after updating to the current BIOS, upgrading the RAM to 2GB, and running a pass with memtest+, I wiped and installed.

    I came to similar conclusions to what you did – the fear of wearing out your drives is probably overstated. I’m not sure I believe the over-optimistic “decades” estimates, but so long as one avoids frequent (re-)writing of files it ought to last at least a few years – by which time a replacement drive ought to only cost about $25…

    In my case, I’m eschewing a swap partition, and I am setting noatime (but then again, I’ve been doing this on normal HDD’s for years, too), but I’m allowing syslog-ng to log, and I’m even using ext4. We’ll see how it goes. (I may also set aside 256M to mount /tmp as a tmpfs filesystem to further minimize writes).

    Comment by Epicanis — April 17, 2009 #

  10. I don’t know why you think that the people at the website you linked to are experts, but I disagree. First of all, I do not know of any NAND flash product that is rated at more than about 100,000 write cycles. The 2 million that they/you suggest would give it (theoretically) 20 times the life span. Secondly, as you said yourself, the article assumed perfect wear leveling – which is impossible, unless all of the files (and other data) that are written to the SSD are multiples of the block size. (Which would be impossible for any practical use; that is, any use that is not in a controlled environment of some sort designed specifically for this.) Third, with the wear-leveling used in the SSDs in the netbooks, it would be farther from perfect than that of the high-end SSDs that would be used in servers.

    Comment by computer_freak_8 — May 4, 2009 #

  11. Conversely, let’s assume best conditions, with extremely imperfect wear leveling (i.e., the same block is repeatedly re-written at the fastest possible speed).

    Best Parameters:

    2 MB/s writes
    512 KB block size
    2 million write cycles (unlikely for consumer devices)

    Calculations:

    2 MB/s writes / 512 KB/block-write = 4 block writes per second
    2 million writes life cycle / 4 writes per second = 500,000 seconds of life
    500,000s / 60s/m / 60m/h = 139 HOURS before likely failure.

    Now lets assume worse conditions:

    50 MB/s writes
    4 KB block size
    100,000 write cycles

    Worst possible scenario:

    50 MB/s writes / 4 KB/block-write = 12,800 block-writes per second
    100,000 writes life cycle / 12,800 writes per second = 7.8 SECONDS before failure

    Conclusion:

    Completely ignoring the life cycles of SSD memory probably isn’t the prudent thing to do. The idea that a block on an SSD drive can be destroyed in as little as 5 days and as quickly as 8 seconds convinces me to take some precautions with regard to wear-leveling/preventing writes on my SSD drives. It’s not a matter of the cost of the drive, it’s a matter of the time spent replacing it and the data loss (I personally am not religious about making my backups as many others are).

    Disclaimer:

    I am not a filesystem expert (far from it). I do not know which filesystems are most likely to damage SSDs or how to best configure your distro/filesystem. I just hate seeing someone present a best case scenario without also presenting a worst case scenario.

    Comment by ssdnub — May 7, 2009 #

  12. @ssdnub: My EeePC does not write with 50MB/s ;-) . I have a 16 GB SSD, so I would need over 300 seconds to write through it once. Your worst-case scenario therefore isn’t even possible.

    So please be realistic. I have a swap partition of 2GB, and as I have 2GB RAM it is empty most of the time according to top. And what does ext3 do? Write a journal every 30 seconds – this can even be changed to e.g. 5 minutes. If you’re using an IRC client and writing a log of the channel messages, you have more writes than swap and ext3 produce in a normal setup.

    @gerb: rrd is really hard on an SSD or a HDD, as the file changes constantly as it ages entries. I’m running an OpenNMS server for monitoring a large network, and the SAS hard disks which contain only the rrd files are the ones that lead to a 10-20% I/O wait on the system. The other HDDs, e.g. for the database, have much lower load.

    Comment by robert — June 18, 2009 #

  13. To (11) ssdnub

    You are only considering wear-levelling flash-cards/SSDs with respective sizes 2MB and 50MB.

    Wear-levelling flash-cards/SSDs come in e.g. 8GByte, 16GByte…

    Comment by Glenn — September 6, 2009 #

  14. The problem I have with these calculations is that they’re all based on some magic figures reported by the manufacturers. Take for example current mechanical SATA drives, say WD RE3 which is pretty high end. The manufacturer reports MTBF of 1,200,000 hours. That’s nearly 140 years. How many hard drives do you see that live even 7 years? Not that many. You should scale down manufacturer-reported statistical bullshit by at least a factor of 10, preferably 20, to get any real-life figures. So the 51 years would actually be around 2.5 years under constant heavy use. Considering the price and performance I’d say that’s still acceptable, but I definitely wouldn’t slap these things on the servers at work.

    Comment by JPa — September 25, 2009 #

  15. What is being ignored here is how filesystems work.
    Most filesystems keep prearranged ‘bookkeeping’ data, i.e. the file allocation table, superblocks, etc., which reside in a static part of the hard disk. Every time a file is written, these entries need to be updated as well. Even worse, some filesystems like NTFS maintain metadata even when files are merely READ!!! (look at the properties of a file in Windows and you will see the Accessed field on NTFS)
    You can use fsutil to turn this behavior off, by the way.

    This information is written in two locations on disk: in the file attributes and in the directory entry. It is thus most likely your SSD will fail first in the Master File Table of a commonly accessed directory (read or write).

    Comment by Steef — November 25, 2009 #

  16. @Steef: You are mistaken in thinking that the same position on the drive as seen by the OS maps to the same physical location – the SSD drive controller wear-levels it. It is therefore not always written to the same blocks, even if you always do “see” it at the same offset.

    What you describe would be if there was no wear-leveling on the device (maybe the earlier SSD didn’t have wear-leveling?).

    Comment by Elbereth — November 30, 2009 #

  17. ssdnub wrote:

    > Now lets assume worse conditions:
    >
    > 50 MB/s writes
    > 4 KB block size
    > 100,000 write cycles
    >
    > Worst possible scenario:
    >
    > 50 MB/s writes / 4 KB/block-write = 12,800 block-writes per second
    > 100,000 writes life cycle / 12,800 writes per second = 7.8 SECONDS before failure

    You assume that each write of a block takes one life cycle of all cells -> after writing ~400MB the SSD fails ?
    This calculation can be silently ignored…

    Comment by papischu — December 1, 2009 #

  18. I think he means writing the same 4k block 100,000 times with a speed of 50MB/s takes 7.8 seconds.

    Comment by konrad.c — December 1, 2009 #

  19. Please note that this is only if you use “real” SSDs. Many cheap “SSDs” are technically just SD cards with an IDE/SATA interface – I broke 3 cards of different brands (8GB/16GB) in a year…

    Comment by adlerweb — January 12, 2010 #

  20. I’ve been using an EeePC 901 for over a year now – no problem with the SSD with ext3 and swap so far.

    Comment by robert — January 12, 2010 #

  21. [...] Re: Partitioning 2 HD Are these remarks still relevant for ssd's? I don't know what kind of ssd the author has, but I've seen calculations that ssd's can last 27 years under normal usage. Let's say that you put a swap partition on it and it might be reduced to 7 years. This is still more than long enough for any home user. If you add trim support and reduce kernel swappiness, these number should even increase. Here is one article that I read about the subject: No SWAP Partition, Journaling Filesystems, … on a SSD? | Robert Penz Blog [...]

    Pingback by Partitioning 2 HD - openSUSE Forums — February 5, 2010 #

  22. [...] I will upgrade to OS 11.3) and I will set kernel swappiness to zero. According to this blog post: No SWAP Partition, Journaling Filesystems, … on a SSD? | Robert Penz Blog it should be fine. Either way I don't intend to use the SSD for longer than 5/6 years and all my [...]

    Pingback by [SSD] partition alignment during 11.2 install - openSUSE Forums — February 6, 2010 #

  23. [...] boring or familiar, just complex. I'm researching it myself. This was my jump-off point: No SWAP Partition, Journaling Filesystems, … on a SSD? | Robert Penz Blog Still reading… __________________ "On a clear drive, you can seek forever." (HP-UX [...]

    Pingback by SSD recommendations? - Hardware Canucks — February 23, 2010 #

  24. Some anecdotal data: the 16 GB SSD in my Asus EeePC 900 died in under a year, with read errors in the spot that used to contain the ext3 journal.

    I suspect poor write leveling and decided to go with ext2 in the replacement drive.

    Comment by Marius Gedminas — April 9, 2010 #

  25. I ordered Patriot SSD (http://www.patriotmemory.com/products/detailp.jsp?prodline=8&catid=21&prodgroupid=141&id=911&type=17) – manufacturer gives 10 years warranty for this disk. Do we really need to worry about data stored on such SSD more than on ordinary HD? I don’t think so…

    Comment by cane — April 11, 2010 #

  26. [...] Some people think that using a journaling filesystem will prematurely wear a USB stick, but this guy doesn’t think so. [...]

    Pingback by Installing Ubuntu 10.04 Server on a USB Stick | Stochastic Bytes — July 10, 2010 #

  27. ********************************
    > 100,000 write cycles

    Worst possible scenario:
    >
    > 50 MB/s writes / 4 KB/block-write = 12,800 block-writes per second
    > 100,000 writes life cycle / 12,800 writes per second = 7.8 SECONDS before failure
    ************************************
    ??? i don’t understand this math; you divide number of writes with # blocks and you get seconds?!
    so again… writes per blocks means time?

    your case is that: w speed of 50mb/s equals 12800 blocks writen/s, or 1 w cycle/12800 blocks in a second. presuming all ssd size is 50mb, one has to write on it 100000 times (@ 1 cycle/s) or 27,7 hours of writes second by second in the same place…

    Comment by Romania — November 30, 2010 #

  28. i would like my grand children to have my usb/ssd with all my data in it :)

    Comment by rex — March 10, 2011 #

  29. [...] Robert Penz goes into details to bust some of the myths around SSD. He concludes that on a normal user system, you don’t need to take special consideration when switching from spinning to solid drives. Only on the advice of using “noatime” he seems incorrect, challenged by this thread: “noatime is not necessary. Fedora defaults to relatime , which is a better choice: it reduces disk access almost as much as noatime, but preserves enough atime info for practical purposes”. tags: Corsair, ext4, Fedora, ssd, TRIM older » Panorama » No Responses to "Choosing an SSD". Add a comment? or Follow comments by RSS? Be the first and share your thoughts! [...]

    Pingback by Choosing an SSD « hblok.net - Linux, Electronics and Tech — April 9, 2011 #

  30. BRILLIANT!

    GJ you deserve a mythbuster badge!

    Comment by ppetrov — December 18, 2011 #

  31. The thing is,

    the calculations with the SSD failing in 8 seconds are unfortunately possible even with the newest SSDs.
    32nm Flash has the life of approx 20k cycles and uses considerably big sizes.
    Unless one properly utilizes TRIM and uses an optimized OS for a SSD, one can go bad in less than a week (my experience with cheap MLC non-SandForce 64G OCZ and Corsair SSDs).
    All you need is a program that uses block based access to the SSD and “touches” the same block many times per second (in my case a ionode benchmark).
    And the SSD went 5 feet under REALLY fast. Just sharing my 2 cnts

    Comment by SV — February 1, 2012 #

  32. @SV: That’s not possible, as all modern SSDs have a controller that does wear leveling – take a look at:

    http://en.wikipedia.org/wiki/Wear_leveling

    Comment by robert — February 2, 2012 #

  33. What would have been really cool, and I have heard A LOT OF PEOPLE say exactly this, is to hear from those who followed this advice NOW (2012 for blog written in ’08) so that we can KNOW that it’s right! I can only HOPE and presume that since the AUTHOR has not corrected himself, that he must be correct??????

    Dude… with something as important as potentially losing all your files, I would think the author could actually get a real boost from coming back around in 2012, readdressing this, and saying either (a) I TOLD YOU SO or (b) OOPS! I was wrong!

    Let us know man!!!!!!

    Comment by LEARNING_UNIX_AGAIN — March 3, 2012 #

  34. I’m no longer using the Asus Eee PC 901go, but my wife does and it still works.

    A well-known German IT magazine (c’t) wrote some months back about SSDs and swap and so on… they tell you the same as I did. Swap and SSD are fine. SSDs fail like HDDs do, but that’s not because of swap or journaling.

    Newer drives get slower if you don’t use TRIM support, but that’s just a mount option in fstab called discard. And you can use relatime if you can’t use noatime. Nowadays I’m using an SSD with swap and journaling in my home server – no problems so far.

    Comment by robert — March 3, 2012 #

  35. [...] fast. Although some of the commenters concern themselves with the reliability of SSD, this has been debunked a long time ago. Short summary: on a normal user system, you don’t need to take special [...]

    Pingback by SSD prices declining, but still expensive « hblok.net - Linux, Electronics and Tech — June 24, 2012 #

  36. Flash SSD devices that have high endurance use internal wear leveling, so you can use your ext3/4 journaling filesystems with no fear. But if the SSD doesn’t have such a system built in, it’s risky. And finding your priceless data on a corrupted journaling FS is far more difficult and sometimes impossible compared to an ext2 volume (unless you were using Windows/NTFS).

    Comment by A.Genchev — August 7, 2012 #

  37. [...] Even if there are frequent articles which crunch the actual numbers, the superstition persists. Back in 2008, Robert Penz concluded that your 64 GB SSD could be used for swap, a journalling file system, and consumer level [...]

    Pingback by Concerns about SSD reliability debunked (again) « hblok.net - Freedom, Electronics and Tech — March 3, 2013 #

  38. I know, very well, thank you for your advice

    Comment by wansview — January 9, 2014 #
