Tuesday, February 15, 2011

Why I no longer trust EMC [Update: maybe they are not so bad]

[Update: After publishing this blog post I received a very pleasant phone call from two representatives from Mozy informing me they had managed to recover my data. See the end of this blog post for more details.]

It's possible to argue that my entire research agenda over the past few years has focused on cloud computing. HadoopDB can be thought of as a large scale analytical database system for the cloud. My work on database determinism that focuses on building horizontally scalable database systems is entirely motivated by the elastic scalability of the cloud. In order for this research to have an impact, "the cloud" needs to be more than a temporary phenomenon. Therefore, I feel quite invested in the success or failure of the cloud.

One common argument people make against the cloud (amongst others) is that if you put your data in the cloud, you are losing control over your data. If the cloud provider does not have appropriate processes in place to safeguard data, it's quite possible that your data could get corrupted or lost. This is problematic since most users do not get to see the internal processes, so they need to (to some extent) blindly trust the cloud provider --- a tricky proposition for many people. The way I usually answer this criticism is that a competitive business climate will solve this problem --- the companies that have bad processes will lose data and go out of business, and the ones that have more safeguards in place will win.

However, the above argument only works if cases of data loss get publicized so that the companies that lose data will lose business. Since I recently went through the horrible experience of losing data I put in the cloud, I therefore feel obligated to share this experience on this blog.

About a year ago I felt that if I was going to go around talking about how great the cloud was, I should at least be using a cloud data backup service for my PC. I ended up deciding between Mozy and DropBox, and went with Mozy because it was owned by EMC. I figured that EMC was a trustworthy company, and they understand storage and the cloud better than most. I figured I would start out with the free version, and then would upgrade to the paid version when I ran out of space.

Around 2 months ago, the hard drive on my Sony Vaio laptop failed. Since the laptop was owned by Yale, I had to go through the Yale processes to get it replaced. It turned out to be a nightmare because Yale did not buy the laptop directly from Sony, but went through an intermediary organization. Although the laptop was under warranty, neither Sony nor the intermediary organization was willing to take responsibility for following through on the warranty. This caused significant delays in getting the hard drive replaced, especially during the holiday season at the end of the semester.

After around two months, my laptop was finally returned with a new hard drive. I was excited that I had an opportunity to take advantage of my EMC Mozy backup for the first time --- theoretically they should have been able to recover all my files and put them in the same places where they existed on my laptop before it failed.

When I went to log in, Mozy claimed that I had the wrong password. I tried again. And again. Mozy would not let me in. Finally, I gave up and clicked on "Forgot my password" and Mozy claimed to reset it and send it to me. But I never received an e-mail. So I tried again. And again. Still no e-mails. I e-mailed support. Four days later, I still had not received a response. I e-mailed again --- another few days and still no response. At this point I was getting desperate --- it had been a week and I had no way of logging in to retrieve my files. Since I wasn't a "MozyPro" customer, all my attempts to call up support were rebuffed. I tried calling up the Mozy sales number to see if they could help me, but they were unable to. I tried the online chat, and they were unable to as well, but suggested that I email "forgot@mozy.com" to try to get my password reset manually that way.

This last suggestion worked, and I was finally able to log in. But to my horror, all my files were gone! It's hard to describe the despair as one starts to realize that one put the trust in the wrong place and all files created in the last year might actually be gone. I e-mailed support and this time received a much faster response:

"Mozy may terminate your account and these Terms immediately and without notice if your computer fails to access the Services to perform a backup for more than thirty (30) days or you fail to comply with these Terms."

So, because it took so long to get my computer replaced (the whole reason why I was using Mozy in the first place), Mozy decided to delete my account (without telling me). I e-mailed back support and asked if there was any way to recover the files even though the account was deleted. They wrote back:

"I wish there was something I could do for you. I have even checked with an L2 tech to try and get the files back and he said that he was not able to recover them."

So I trusted EMC Mozy to back up my files, and they decided to delete them. And they do not have the processes in place to recover them. This is not how the cloud is supposed to work. Clearly EMC does not understand the cloud. I hope that anybody reading this blog does not make the same mistake: EMC's cloud services are not trustworthy. If you have similar stories, please share them with me --- cloud providers need to feel pressure not to arbitrarily delete data without first warning their customers. Otherwise the cloud cannot work.

[Update: It turns out that EMC Mozy does have important safeguards in place. After coming across this blog post, several members of the Mozy technical team met with each other to try to understand what happened, managed to recover my data, and called me afterward. Here's the scoop: the Mozy software is designed to notify you before your account is deleted. The problem was that my computer with the Mozy software installed had failed, so the software couldn't notify me. Mozy does indeed wait six months before deleting an account, but for me, due to a weird corner case involving a second computer that had previously been backing up to Mozy that I had stopped using, the six-month clock had started ticking in July. Thus, the timing of my second computer failing was really unlucky. However, since they did have safeguards in place, they did manage to recover this deleted data. I am obviously really grateful that they went to this great effort. They told me on the phone that they learned from this experience and are making improvements as a result --- most notably to do more than rely on the software to notify a user before the account is deleted. Given how helpful and straightforward the Mozy employees were over the course of this phone call, I wholeheartedly believe that they really are going to fix this issue. Hence, I have no qualms recommending Mozy to other people moving forward. Again, the most important thing was that there were safeguards in place --- obviously it took some additional motivation for these safeguards to be used, but as long as they exist, I feel comfortable using cloud storage moving forward.]
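
To make the corner case in the update concrete, here is a minimal sketch of a retention check that measures inactivity from the most recent activity on any machine or on the website, rather than from a single machine's last backup. This is not Mozy's actual implementation: the `Account` fields, the 30-day warning window, the channel names, and the dates in the usage note are all assumptions made for illustration. Only the six-month retention period comes from the account above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List, Optional

RETENTION = timedelta(days=180)    # "six months", approximated as 180 days
WARNING_LEAD = timedelta(days=30)  # assumed warning window before deletion


@dataclass
class Account:
    """Hypothetical account record; field names are illustrative only."""
    email: str
    # Last successful backup time, keyed by machine name.
    last_backup: Dict[str, datetime] = field(default_factory=dict)
    # Last time the user accessed files through the website, if ever.
    last_web_access: Optional[datetime] = None


def last_activity(account: Account) -> Optional[datetime]:
    """Return the most recent activity across all machines and the website."""
    times = list(account.last_backup.values())
    if account.last_web_access is not None:
        times.append(account.last_web_access)
    return max(times) if times else None


def retention_status(account: Account, now: datetime) -> str:
    """Classify an account as 'active', 'warn', or 'eligible_for_deletion'."""
    latest = last_activity(account)
    if latest is None:
        return "active"  # nothing was ever backed up, so nothing to expire
    idle = now - latest
    if idle >= RETENTION:
        return "eligible_for_deletion"
    if idle >= RETENTION - WARNING_LEAD:
        return "warn"
    return "active"


def warning_channels(account: Account) -> List[str]:
    """List where a warning should go; e-mail first, since the client may be dead."""
    return ["email:" + account.email, "client_popup"]
```

With illustrative dates, an account whose old desktop last backed up on July 1, 2010 and whose laptop last backed up on December 10, 2010 is still `"active"` on February 1, 2011. If the clock were keyed to the old desktop alone, the same account would already be `"eligible_for_deletion"`. Sending the warning by e-mail as well as through the client means a failed machine does not silence it.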


  1. I wish I had something cheerful to share, but as a database consultant I've seen the inside of many well-known online backup services. (Most of them use a database for at least metadata storage, sometimes more.) I would not rely on them, even if they did everything right, which they generally don't.

    I've done data recovery work for some of them who've lost data in the most irresponsible ways. Your case was about deliberate deletion of data by the provider -- but even if they don't do that, my experience is they'll lose your data because something routine happens (hard drive failure, mistake), and they themselves have no backups.

    I've seen and participated in lots of different ways to do backups, and this is my M.O. I have a set of GPG-encrypted hard drives that I regularly rsync and carry back and forth to a safe deposit box in my bank. If I lose my laptop hard drive, the one at home is usually only a few days old. If my house burns down, the bank has last week's backup. That is acceptable to me. I'm just too paranoid at this point to trust anyone else to store my data safely and securely. I insist on doing it myself the old-fashioned way because I believe that it's better. I don't think there is any reason to feel obligated to use the cloud for a service such as backups just because your career relies on the cloud -- I advocate a lot of things I don't trust my most valuable assets to.

  2. Interestingly the verbiage on their site says: "If you are a free user of the Service, Decho may terminate your account after e-mail notification if you have not accessed or used the Service for more than 6 consecutive months.

    "If you are a paying user of the Service, Decho may terminate your account after e-mail notification if you have failed to make payment in full for 3 consecutive months."

    Of course it's also wrapped up in the usual "you can't sue us for anything" clauses anyway.

  3. EMC is a broad company. Mozy was an acquisition. I am familiar with some of the inner workings of EMC and it is safe to say that Mozy is only one head of the hydra.

    I don't defend them. I merely point out that Mozy is not representative of EMC as a whole. This is the challenge of a company that does a lot of acquisitions. They inherit the good and bad, and of course they take the heat for all the bad once their name is emblazoned on a product.

    As for the cloud and trusting the cloud, I think a path hasn't yet emerged to permit it to prove itself. Proof by failure is unacceptable -- very few are going to be willing to subject themselves to data loss, especially companies.

    The cloud also has many facets. Cloud storage and cloud computing face different challenges and offerings. In the former, the whole point is to use it as a place to store things. Making a local backup defeats the purpose. This is a hard path to prove as viable because, as you said, failure is going to have to push bad companies out so the good ones win. In the latter, it is possible to make local backups of settings, content, etc. However, it can expose individual account information to attacks that the cloud provider must defend against (think eCommerce). If they fail in that regard, companies will revert to using their own systems, something under their control that they trust.

    To me, in either path, the cloud has a long way to go. The technology has a place within a company to permit elasticity in the infrastructure. But as far as tossing it out to "the net", there is a lot of trust yet to be gained.

  4. Eh, I'm having doubts about the cloud myself.

    We're in the process of moving from a traditional data center to Amazon's AWS, with all data stored with Amazon.

    I'm scared we're putting all of our eggs in one basket and that basket isn't as strong as we're led to believe.

  5. I did an 80 gig test restore with Backblaze recently. It didn't go well.

    They force you to materialize a subset of your files into a zip archive. The materialization process takes tens of minutes. Once the archive is materialized you are given the URL via a javascript button so you can't use a download manager. The URL is not reusable so you can't start the download to find the URL and then use a download manager. If the download fails (and it will) you have to start from the beginning.

    You are limited to materializing a maximum of two archives and they are deleted 7 days after materialization. Whatever you can download in 7 days is the limit on what you can materialize in a single archive.

    I emailed support and they told me about an undocumented utility that will reliably transfer the archives and allow resuming.

    It is shocking that they haven't streamlined the process given the age of the product. I don't see why customers should be bothered with implementation details like materializing archives and reliably transferring files. My only guess is that restores are not where they make their money unless you are considering paying $189 for a whopping 400 gigabyte hard drive.

    I also found that Vista's built-in zip functionality chokes on archives greater than 4 GB and produces 0-length files for all files beyond 4 GB. I emailed them and said they should document that bug as well as the utility for reliable restores, but I haven't gotten a response and there haven't been any updates to their website.

    Hanlon's razor and all, but this makes me think that they are trying to drive people towards the pay services when they are most desperate. I can't see how that makes business sense since the number of people doing restores must be dominated by the number of people silently backing up data.

    I will continue to use Backblaze to capture the tail of my data, and supplement that with an external USB drive or two so that I have two local copies.

  6. Thank you all for your comments.

    oraclesponge: If their official policy is six months, then maybe I have some legal options. I had used Mozy's Website to access individual files within the last month, and had done a backup within the last two months. Also, they had my e-mail address, but they definitely did not e-mail me.

    arch0njw: EMC has had 4 years to incorporate the acquisition. I feel it is fair to criticize EMC for the business and technical processes of Mozy at this point. If EMC is unwilling to take responsibility for its subsidiaries at this point, the users of EMC's other subsidiaries should not feel secure that they are owned by EMC.

    Seth: I think Amazon understands the cloud much better than EMC. I have not heard of cases of Amazon losing their customers' data. Though I can certainly better understand your hesitation now.

  7. Buy a 1TB external drive. They cost about $50.

  8. The Mozy free service is limited to 2GB of data. I don't quite see how you managed to back up your Sony laptop using this paltry amount.

    If you only have 2GB of valuable data, I now amend my previous advice. Invest in a USB memory stick. They come free with some boxes of breakfast cereals or as freebies at trade shows.

  9. I use S3 with Arq from Haystack Software. It works only on the Mac, though. You could find something equivalent for Linux or Windows.

  10. Lots of companies, including Mozy, delete free accounts when a customer stops using it. I don't see anything unusual there. Of course, Mozy will never delete a paid account as long as the customer is paying the bills.

    Of course, we are only talking about backup data. One can assume the original copies are still available. The odds of losing BOTH the original files plus the backup copies at the same time are remote. Not impossible, but hopefully rare. In any case, everything was always under control of Daniel Abadi. He seems to be blaming the wrong organization. The problem was not with Mozy, but with his own employer's delays in replacing the laptop. If he needs more than 30 days of storage, he needs to pay for it.

    In addition, he only had ONE backup??? That's an invitation for disaster. He needs at LEAST two backups at all times, stored in different places. Three or four backups would be better. (I have at least four backups of every one of my important files, stored in three different places.)

    I think Daniel Abadi is too quick to complain about others when, in fact, he isn't doing what he should be doing. Mozy should be just one of the tools that he uses, not his complete protection plan. Any data professional will tell you that you NEVER rely on any one backup. He placed all his eggs in one basket and now he is paying the price for it.

  11. I've been using Hybir backup for online + offline backup. The application can back up to multiple locations, including their online service. So I have a local ioMega NAS plus the online backup service. They keep your data as long as you pay.

    Plus, Hybir does an image backup so you can use a boot CD to restore your HDD to the exact state it was in at the time of the backup. They de-dupe across all their clients, so for online backup you only have to upload a small percentage of your total data, as much of it (like Windows 7 files) will already be in their repository.

  12. This comment has been removed by the author.

  13. "The problem was that my computer with the Mozy software installed had failed, so the software couldn't notify me."

    This is inexcusable incompetence, hardly an honest oversight, and for that decision alone (by Mozy), I would never trust them again.

    They want me to believe, that their upcoming-deletion-warning would/could only be delivered via the computer that would very likely be offline because of disk failure?! Not even an email to an external acct?! Come on...

    The only reason they jumped up to help you, is because you publicized your experience on the WWW. Had you not done that, I've no doubt that all you would have ever received from them, would be their canned, brush-off template-mail.

    I highly suspect the internal policy at Mozy, is actually, "ha, screw the free users, we don't owe them anything, it's all about forcing people to pay..." like even the way you're able to contact them, to get their mistake sorted out. It's a form of extortion, IMO.

    As another commenter said, I do my own backups, to my own local hardware. I save the rip-off prices that cloud storage "providers" charge, plus exactly these kinds of disasters when the day comes that I'd actually need to rely on them.

    Cloud storage (value) is like web hosting now, you can't really know what you're getting out of the deal until the sh*t hits the fan, and then it may well be way too late.

  14. "The problem was not with Mozy, but with his own employer's delays in replacing the laptop. If he needs more than 30 days of storage, he needs to pay for it."

    I couldn't disagree more. The problem (for users only, of course) is that Mozy doesn't really care if you can receive the notification that they're about to delete all your data.

    Paid or free, one or four back-ups, none of that matters. If Mozy actually cared about their users, they'd ensure that you could be notified. Either by one or two email addresses or by telephone or text.

    Notifying a user that you're about to delete their data, is no trivial issue. Saying, "oh well, we tried to msg you through your dead laptop, but you never responded, so too bad..."

    What a "policy" for happy customers and profitable business. What a joke. You're telling me none of their so-helpful engineers foresaw exactly this scenario, and didn't care to integrate alternate methods of customer contact?

    And only now, they've graciously decided to "make some changes"? That a company whose only product is safe and reliable back-up, can't bother to legitimately notify customers of impending deletion...

  15. Check out crashplan.com. It is MUCH better than Mozy.

  16. This comment has been removed by the author.

  17. AltDrive.com has unlimited backup for your Windows, Mac or Linux machines. Works like a champ.

  18. This comment has been removed by the author.

  19. /nelson mode on

    "Ha Ha"

    /nelson mode off

  20. I use http://rsync.net/ to push my critical files to a standard offsite filesystem. When I've had to run a restore (only once, accidental corruption of my encrypted drive) rsync could not have been more helpful. They produced a tar for me so I could readily download the whole backup in one go rather than file by file. I can't recommend them highly enough; for my most trusted files, I only use http://rsync.net/

    PS> OpenID on this comment form appears to be broken, at least for me.

  21. I have had great experience with CrashPlan.com and would highly recommend them.

  22. In addition, try out Wuala.com, made in Switzerland. Mount your cloud space as a volume and copy / sync your files to it.

  23. Gee, I was even thinking of switching from Carbonite to Mozy, because Carbonite had been less than prompt about answering some questions and such. But when I did lose the contents of my disk, clear it off and re-install Windows, Carbonite restored the whole thing just fine. Of course it's probably not a good idea to draw sweeping conclusions from a very small number of anecdotes, but I'm not comfortable telling people to use Mozy after having read this. (I'm using the basic monthly-fee plan at Carbonite. For my desktop I also have an external drive; belt and suspenders. I didn't have that during the aforementioned meltdown.)

  24. Hello Daniel,
    Cloud computing is a good technique and is free of the risk of losing important data. I have experience with a company; they do data recovery work in a more responsible manner.

  25. > After coming across this blog post, several members of the Mozy technical team met with each other to try to understand what happened[...]

    Hi Daniel,

    I'm glad to hear you got your data back. I've quoted this snippet from your post, however, to make the point that I think your data would have ended up totally beyond recovery if not for the pressure felt by this blog post. The safeguards you speak of are likely not available to mere mortals.

    I find this post very timely in my own personal situation. I started the new year by treating myself to a system build for Windows 7. I picked an ASUS mobo, Core i5 7XX, maxed out main memory and attached 4 1TB SATA2 drives to the Intel RAID controller on the mobo. I configured it in RAID10. That was the first week of January. Like a fool I did not buy extra hard drives, and at $70.00 per unit I can't explain why. Nonetheless, I finally found a need to reboot this PC--the first time in over 80 days. During boot up I discovered my RAID configuration was in degraded mode as one of the SATA2 drives bit the dust. Given the murderously poor MTBF on these sorts of low-end drives I was totally at risk of suffering a double disk failure. If only the RAID controller had the ability to interrupt the OS to throw a system alert of some sort so I at least could have found out while the system was up.

    I too heartily embrace the cloud, and just as you at one point found yourself a bit short on personal testimony to the virtues of using cloud services, so was I at the very moment I saw the word DEGRADED on the boot-time RAID status window of my PC.

    What an odd coincidence that I found your blog post while Googling for social network content on Mozy and DropBox--the two cloud offerings I'm considering.

    Oh well, so I ramble on in a comment thread of a two-week old post :-)

    In short, thanks for the post. And, indeed, thanks for showing that social networking pressure can be felt by very large commercial companies.

  26. I'm interested in knowing whether this experience changed your attitude toward the idea of auditing service providers. Do you think (a) internal auditing by experts would have helped you feel more comfortable about the service and the recoverability of the data, (b) external auditing by automatic tools that checked the integrity of the data would have alerted you when there were signs of trouble?

  27. P.S. Here's a link to the paper: http://www.hpl.hp.com/personal/Mehul_Shah/papers/hotos11_2007_shah.pdf: Auditing to Keep Online Storage Services Honest.

    I realize that EMC fixed your problem, but there are many that would have not received the same attention. Perhaps that made you feel better about the cloud. But, there are innumerable stories of folks not being able to recover their data from all of the most "reliable" and well-known vendors. I don't see how ad-hoc public shaming is making much of a difference. Don't you think that a more principled approach other than bad publicity is needed here?

  28. We are under attack by DECHO (Mozy). If you have anything that can help us, please notify us at info@mozyup.com
    Shalom ~ Jimmy