JF002
Posts: 94
Joined: Sat Feb 04, 2012 8:49 am
Contact: Website

Personal distributed "cloud" storage

Sun Jun 28, 2015 5:42 pm

Hi,

Like many of you, I have a lot of data (music, files, work, pictures...) spread across several devices and hard drives.
I would like to back up all my data safely and easily, and perhaps the data of my family and friends too.

I could buy a NAS and a bunch of hard drives, but that's quite expensive, and the NAS would be a single point of failure. What happens if the NAS breaks down?

I could use "free" online services like Dropbox and others, but I don't really trust them... And I would rather have a self-hosted solution.

That's why I was thinking about a distributed storage system: I could use a little piece of hardware (a Raspberry Pi, for example) and a hard drive, and copy all my data onto it. OK, on its own that's not safe.
Now, I could build another one, deploy it somewhere else (at a friend's place, for example), and sync all my data to it. Mhm, much better! And then I could deploy yet another box to improve the reliability of the system further.

But why would someone agree to install this little "black box" and spend power and bandwidth on it? Because they would get access to this distributed and safe storage! All the data would be spread across all the available nodes.

OK, this is my idea of the ideal personal distributed cloud storage. Now, I'm wondering whether such a thing already exists.
My web site : https://codingfield.com

default_user8
Posts: 658
Joined: Mon Nov 18, 2013 3:11 am

Re: Personal distributed "cloud" storage

Sun Jun 28, 2015 7:52 pm

Someone has posted this type of solution before, twice that I can remember off the top of my head. One was similar to what you described: he had a Pi at his house and one at his parents' home that he used as a backup of his local drive via rsync over SSH. The other was a server for home movies: he uploaded them from his local Pi to a Pi he set up at his parents' home overseas so they could watch videos of their grandkids.

Another solution, though one that may not be as popular, would be to set up btsync on one Pi and have the other Pis synced to the main Pi. That way all files would exist on all Pis simultaneously, and a catastrophic failure of any one drive would not cause any loss of data. In effect you would be running your own personal Dropbox. There are many ways to accomplish this; do your research and choose what suits your needs best.
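
For the rsync-over-SSH approach, here is a minimal Python sketch that only builds the command line for a one-way backup; it does not run anything. The host name, paths and the `build_rsync_command` helper are made up for illustration. The flags are standard rsync options (`-a` archive mode, `-z` compression, `-e ssh` to use SSH as the transport, `--delete` to mirror deletions).

```python
import shlex


def build_rsync_command(source_dir, remote_host, remote_dir, delete=False):
    """Return the argv list for a one-way rsync backup over SSH.

    All names here are hypothetical; adapt host and paths to your own setup.
    """
    # A trailing slash on the source copies the directory's contents,
    # not the directory itself.
    source = source_dir.rstrip("/") + "/"
    command = ["rsync", "-az", "-e", "ssh"]
    if delete:
        # Mirror deletions too. Careful: an accidental local delete
        # then also disappears from the backup.
        command.append("--delete")
    command += [source, f"{remote_host}:{remote_dir}"]
    return command


# Hypothetical host and paths, for illustration only.
cmd = build_rsync_command("/mnt/data", "pi@backup.example.org", "/mnt/backup/data")
print(shlex.join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` from a cron job, assuming key-based SSH login is already set up between the two Pis.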
Two heads are better than one, unless one's a goat head.

JF002
Posts: 94
Joined: Sat Feb 04, 2012 8:49 am
Contact: Website

Re: Personal distributed "cloud" storage

Mon Jun 29, 2015 9:23 pm

Yeah, rsync is an easy solution, but it is not really "distributed". Rsync synchronises data from one box to another... But if I have 3 or 4 boxes, keeping them all in sync might become a nightmare...
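
To put a number on that, here is a rough Python sketch that counts the one-way rsync jobs needed if every box has to push its files to every other box. The box names and the `one_way_sync_jobs` helper are hypothetical, and a full mesh is only one possible layout.

```python
from itertools import permutations


def one_way_sync_jobs(boxes):
    """List every (source, destination) pair so each box pushes to every other box."""
    return list(permutations(boxes, 2))


# Hypothetical box names, for illustration only.
boxes = ["home", "family", "friend1", "friend2"]
jobs = one_way_sync_jobs(boxes)
print(len(jobs))  # 12 one-way jobs for 4 boxes, i.e. n * (n - 1)
```

Each of those jobs also has to cope with the destination being offline, which is where hand-rolled rsync scripts start to hurt.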

Here is an example of what I have in mind: I build 4 identical boxes (RPi + hard drive). I keep one in my house, give one to my family, and 2 to friends. I ask them to keep the boxes powered on most of the time. Each box would expose a network drive where its owner could copy important files. When files are added to a box, it would automatically replicate the data to the other boxes in such a way that the data could be restored even if 1 drive fails.
This way, each owner's data is always accessible from their own box, and can be recovered from the other ones.
It's not very easy to explain what I have in mind because English is not my first language.
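
One simple way to survive the loss of 1 drive out of 4 without keeping 4 full copies is XOR parity, the same idea RAID 5 uses. The Python sketch below is only an illustration of that idea, not how any of the tools mentioned in this thread work: it splits the data into 3 chunks plus 1 parity chunk, one per box. The function names are made up, and here no single box holds a complete copy, so the owner's own box would still need the originals.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data, data_boxes=3):
    """Split data into equal chunks plus one XOR parity chunk (one share per box)."""
    chunk_len = -(-len(data) // data_boxes)  # ceiling division
    padded = data.ljust(chunk_len * data_boxes, b"\0")  # pad so chunks are equal
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(data_boxes)]
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity]


def recover(shares, original_len):
    """Rebuild the data from the shares, where at most one entry is None."""
    missing = [i for i, share in enumerate(shares) if share is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can only survive the loss of one box")
    shares = list(shares)
    if missing:
        present = [share for share in shares if share is not None]
        # XOR of all surviving shares reproduces the missing one.
        shares[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity share and the padding.
    return b"".join(shares[:-1])[:original_len]


data = b"family photos 2015"
shares = split_with_parity(data)
shares[1] = None  # simulate one dead drive
assert recover(shares, len(data)) == data
```

With this layout the 4 boxes together store about 1.33 times the original size instead of 4 times.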

Here are some interesting examples:
  • Tahoe-LAFS: A distributed filesystem, but I don't know whether it would work well with only 3 or 4 nodes, or how it would react to nodes that connect and disconnect often.
  • Symform: An online service that gives users some free cloud space if they allocate part of their hard drive to other users. Seems cool, but it's not open source, and our data would be scattered all over the world, out of our control.
  • BitTorrent Sync: Not open source, but more private (in the sense that the data is copied only to computers that know the unique ID of the share). It would not be very storage-efficient because all the data would be fully replicated on every node (Tahoe-LAFS handles that more intelligently).
I doubt that what I'm looking for already exists, but if you know of any other project of this kind, I would be interested :)
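
To make the storage-efficiency point concrete, here is a small Python sketch comparing full replication (every node keeps everything, as with BitTorrent Sync) against k-of-n erasure coding (the kind of scheme Tahoe-LAFS uses, where any `needed` shares out of `total` are enough to rebuild a file). The function names and the 100 GB figure are just examples, and the 3-of-4 setting is my own assumption for a 4-box setup, not a recommended configuration.

```python
def replication_overhead(nodes):
    """Total stored bytes per byte of data when every node keeps a full copy."""
    if nodes < 1:
        raise ValueError("need at least one node")
    return float(nodes)


def erasure_overhead(needed, total):
    """Total stored bytes per byte of data with k-of-n erasure coding."""
    if not 0 < needed <= total:
        raise ValueError("need 0 < needed <= total")
    return total / needed


data_gb = 100  # example figure
print(f"4 full copies: {data_gb * replication_overhead(4):.0f} GB in total")
print(f"3-of-4 coding: {data_gb * erasure_overhead(3, 4):.0f} GB in total")
```

Both layouts survive the loss of 1 box, but full replication survives the loss of 3, so the saving in disk space is paid for with less redundancy.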
