IPFS: Duplicate Pinning - Pin a file in two different locations?

I'm currently using Filebase - a paid IPFS pinning service - to pin my IPFS files. Filebase is still in beta, so the prices are low but could rise in the future.
Therefore I'd like to keep copies of all my files on a drive on my home computer too, just in case Filebase has problems, outages, etc. in the future. My home computer will be running an IPFS node itself.
Should I pin these local files on my computer as well? Or will duplicate pinning cause problems?

You can totally pin the files on the local IPFS node running on your home computer. However, it is more likely that the node on your home computer will crash, or that the machine's storage will become corrupted, than that Filebase will have problems. It's true Filebase is a centralized service with a single point of failure, but they are likely running several IPFS nodes pinning your files and storing them long-term on the Sia network.
The safest option is to explore storing your data on Filecoin, which is equivalent to Sia in the sense that it is built for long-term persistence (which IPFS is not). Think of IPFS as the decentralized version of Redis: it can cache and store information, but it's not designed to persist it the way a database is.
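If it helps, here is a rough sketch of what the duplicate pin would look like from your home machine, assuming a local Kubo (go-ipfs) daemon with its RPC API on the default port 5001; the CID is a placeholder for a file you already have on Filebase:

    import requests

    # Rough sketch: ask a local Kubo/go-ipfs daemon (default RPC port 5001) to pin a CID.
    # The CID below is a placeholder for one of the files already pinned on Filebase.
    CID = "QmYourFileCidHere"  # hypothetical CID

    resp = requests.post(
        "http://127.0.0.1:5001/api/v0/pin/add",
        params={"arg": CID},
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json())  # typically something like {"Pins": ["<cid>"]}

Pinning the same CID on both Filebase and your own node causes no conflict - a pin is just a local instruction telling that node not to garbage-collect those blocks.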

Related

Do I need to have a running ipfs node to be able to store and retrieve files?

I have a basic Flask application that stores and retrieves images. I want to store those images on IPFS by simply posting a request to the application, which is probably going to be hosted on Heroku. So I wouldn't have a running IPFS node. Is this possible?
You can use an IPFS Gateway to access files without running your own node.
When you pin an IPFS file to your own node and shut it down, your files will not be accessible anymore by yourself or others unless another node pins them as well and stays online.
You can pay IPFS file hosting services to pin your file on their nodes; Cloudflare and Eternum are two of them.
Here is a list of more: https://www.reddit.com/r/ipfs/comments/9pb5pf/are_there_any_ipfs_file_hosting_services/
There's also Pinata as a pinning service - with a free tier to get started. :)
Files can be accessed via a public gateway, like ipfs.io or the Cloudflare one.
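As a rough illustration of the gateway approach (the CID is a placeholder; ipfs.io is one public gateway, the Cloudflare one is another):

    import requests

    # Rough sketch: fetch a pinned file by CID through a public gateway, no local node required.
    CID = "QmYourImageCidHere"  # hypothetical CID returned when the image was added/pinned

    resp = requests.get(f"https://ipfs.io/ipfs/{CID}", timeout=60)
    resp.raise_for_status()

    with open("image.jpg", "wb") as f:
        f.write(resp.content)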

Storing images for my website

I want to set up a separate Amazon EC2 instance where I store all the images uploaded to my website by users. I want to be able to serve images from this dedicated server. I know how to set up DNS names that would point to this server, but I would like to know how to set up the directories. For example, if I refer to an image URL as http://images.mydomain.com/images/sample.jpg, then
images.mydomain.com is the server name and
images should be the folder name
Now the question is: should a web server be running on this machine to serve the images, or can I just make the images folder public so that it is visible to the entire world? How do I avoid directory listing?
A pointer to any documentation would be greatly appreciated.
It certainly is possible to set up a separate EC2 instance to serve your images. You may have good reasons to do that - for example, you may want to authorize only specific users or groups of users to access certain images, in a way that's closely controlled by program logic.
OTOH, if you're just looking to segment the access of image/media files away from the server that provides HTML/web content, you will get much better performance / scalability by moving those files to a service that is specifically tuned for storage and web access. Amazon's S3 (Simple Storage Service) is one relatively straightforward option. Amazon's CloudFront content distribution network (CDN) or a competing CDN would be an even higher performance option.
Using a CDN for file access does add the complexity of configuring the CDN, but if you're going to the trouble of segmenting media access from your primary web server, and if you're expecting any significant I/O load, I've found it to be a high-return-for-effort-expended approach.
I would definitely not implement this as you are planning. You should store all your images in an Amazon S3 bucket and serve them via Amazon's CloudFront CDN. Why go through the hassle of setting up and maintaining an EC2 instance to do what Amazon has already done? S3 provides effectively unlimited storage and manages permissions, metadata, etc. CloudFront provides fast access to your images, caching them at edge locations all around the world. Additionally, you can use Amazon Route 53 (or some other DNS service) to point various CNAMEs to your CloudFront distribution.
If you're interested in this approach I'd be happy to provide more info on how to set this up.
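To give a rough idea of that approach, here's a minimal sketch using boto3; the bucket name and the CNAME pointed at the CloudFront distribution are placeholders for your own setup:

    import boto3

    # Rough sketch of the S3 + CloudFront approach; names below are placeholders.
    BUCKET = "my-site-images"              # hypothetical bucket
    CDN_DOMAIN = "images.mydomain.com"     # CNAME pointing at the CloudFront distribution

    s3 = boto3.client("s3")
    s3.upload_file(
        "sample.jpg",                # local file uploaded by the user
        BUCKET,
        "images/sample.jpg",         # object key
        ExtraArgs={"ContentType": "image/jpeg"},
    )

    # The image is then served through CloudFront rather than directly from S3 or EC2.
    print(f"https://{CDN_DOMAIN}/images/sample.jpg")

Uploads go straight to S3, and CloudFront caches the objects at its edge locations on first request.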
Yes, you will definitely need to run a web server on the machine. Otherwise it will not be possible for clients to connect via HTTP/port 80 and view the images in a browser. This has nothing to do with directory listing being enabled. Once you have a web server running, you can disable directory listing in its configuration.
Install Apache on your server and run it (http://httpd.apache.org/docs/2.0/install.html). You then set up what's called a 'site' in its configuration, which points to a local directory that will then be the base directory for your server. It could, for example, be /home/apache on a Unix system. There you create your images folder. If your Apache is set up correctly, you can then access your images via http://images.mydomain.com/images/sample.jpg.

Synchronizing many binary files

I have about 100,000 files on an office server (images, PDFs, etc.).
Each day the file count grows by about 100-500 items, and about 20-50 old files change.
What is the best way to synchronize the web server with these files?
Can any system like Mercurial or Git help?
(On the office server I'll commit changes, and the web server will periodically do updates.)
The second problem is that on the web server I have user-generated content (binary files) - a separate set of files.
Each day users upload about 1000-2000 new files. Old files don't change.
And I need to back up these files to my local machine.
Can any system like Mercurial or Git help in this situation?
(On the web server I'll commit these files via cron, and on my local machine I'll do updates.)
Thanks
UPD.
Office server is Windows Server 2008 R2
Web server is Debian 5 (Lenny)
The simplest and most reliable mechanism (in my experience) is rsync.
On Windows, however, rsync over ssh is badly broken due to issues with how Cygwin interacts with named pipes. Rsync over its own protocol works (as long as you don't care about encryption), but I've had lots of problems getting rsync to stay up as a Windows service for more than a few days at a time. DeltaCopy is a Windows app that uses the rsync tools behind the scenes; it seems to work very well, though I haven't tried the ssh option.
A DVCS is not a good solution in this case: it will keep all the history, which you don't always need, and will make any clone a massive operation.
An artifact repository like Nexus is much better suited if you need some kind of versioning with integrity checks associated with your binaries.
Otherwise (no versioning), a simple rsync like Marcelo proposes is enough.
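For the no-versioning route, here's a minimal sketch of the rsync job, meant to run from cron. Hostnames, paths, and the ssh user are placeholders; on the Windows 2008 box this would go through DeltaCopy or rsync's own protocol as discussed above:

    import subprocess

    # Rough sketch: one-way mirror of the office file share to the web server with rsync.
    SRC = "/srv/office-files/"                             # trailing slash = sync the directory's contents
    DEST = "deploy@webserver.example.com:/var/www/files/"  # hypothetical destination

    subprocess.run(
        [
            "rsync",
            "-avz",        # archive mode (permissions, times), verbose, compress in transit
            "--delete",    # drop files on the destination that were removed at the source
            SRC,
            DEST,
        ],
        check=True,
    )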

Where do you keep the configuration files for your stack?

For the website(s) I develop, we have a number of different technologies making up our stack, each with a different set of configurations, etc.
This is a Rails stack, so we're running things including:
Nginx w/ Passenger
Varnish
Redis
Memcached
MySQL
MongoDB
We're continually tweaking our configs and changing them to support our continually changing system, and if we were to 'lose' the configurations (e.g. due to a server crash or otherwise) it would be a huge pain to rebuild them from memory.
Given that version control would be extremely useful, I can quite easily add these files to a Git repo or similar and store them in the cloud somewhere, but what about application-specific configuration (for example, URL rewrite config for a website on a shared server)? Should that be in the same repo as well?
Put website specific stuff in the Git repo of that website, and system-wide stuff in a "systems" git repo.
If you are not currently using Source Control (of any kind) in your development environment, stop whatever you are doing and sort that out right now. That is the most important aspect of your setup.
At the very minimum you should keep EVERYTHING that is a text file and relates to your app (yes, all config files and URL rewrites).
Others suggest you can put binary files in as well, but at the very minimum all source code, all config, etc. should be in source control.
By the end of the day :)
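As a sketch of the "systems" repo idea, something like the following could be run after config changes (or from cron); the paths and repo location are examples only, and a tool like etckeeper automates this for /etc:

    import subprocess

    # Rough sketch: snapshot system-wide config files into a "systems" Git repo.
    REPO = "/srv/systems-config"                 # hypothetical repo location
    CONFIGS = [
        "/etc/nginx/nginx.conf",
        "/etc/varnish/default.vcl",
        "/etc/redis/redis.conf",
        "/etc/mysql/my.cnf",
    ]

    subprocess.run(["git", "init", REPO], check=True)                 # harmless if the repo already exists
    subprocess.run(["cp", "--parents", *CONFIGS, REPO], check=True)   # keep the /etc/... directory layout
    subprocess.run(["git", "-C", REPO, "add", "-A"], check=True)
    subprocess.run(["git", "-C", REPO, "commit", "-m", "config snapshot"], check=False)  # exits non-zero if nothing changed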

How can I organize/use Mercurial in a multi-location, non-networked environment?

My team has a local development network which is not physically connected to any outside network. This is a contractual obligation and CANNOT be avoided. We also have to coordinate with a team which is located halfway across the country and, as previously implied, has no direct network connectivity to us. Our only method of transferring data involves copying it to a USB disk and sending it via email/ftp/etc.
NOTE: Let's not discuss the network connection issue or the obvious security flaw with the USB disk access. These issues are non-negotiable.
We're still convincing the external team to use Mercurial (they currently don't use ANY SCM). Assume for the rest of this question that they're using Mercurial. We're going to force their hand any day now.
We switched to Mercurial in hopes of being able to utilize its distributed nature to better sync with the external team. Internally, we're using Mercurial much like a centralized SCM. Each developer clones from a master repo on our integration server. Changes are pushed/pulled from this central location.
Here lies the actual question content:
What is the best way to communicate changes to the remote team (assuming they're using a similar Mercurial setup to ours)? Should I have a local master repo (for local push/pull) and a local integration repo (for remote push/pull)? How can I best avoid the complicated merge issues that will arise? If we use Mercurial bundles to push changes, who will do the merges, and against which repository?
You can basically use it in exactly the same way as if you were online.
You just need to replicate the remote repo locally and unbundle every changeset they send you into it. You should never push your own changes directly into this local mirror (it should always reflect the state of the remote team).
Afterwards you decide what you want: do the merges on your side or on their side; it doesn't really matter.
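A rough sketch of the bundle round-trip, with repo paths, revisions, and the bundle filename as placeholders - the key point is that the local mirror only ever receives the remote team's changesets:

    import subprocess

    # Rough sketch of applying the remote team's bundle; all paths are placeholders.
    LOCAL_MIRROR = "/repos/remote-mirror"    # read-only mirror of the remote team's repo
    INTEGRATION = "/repos/integration"       # our own central/integration repo

    # Their side would produce the bundle with something like:
    #   hg -R <their-repo> bundle --base <last-shared-rev> outgoing.hg

    # Our side: apply their bundle to the mirror (never push our own work into it)...
    subprocess.run(["hg", "-R", LOCAL_MIRROR, "unbundle", "/media/usb/outgoing.hg"], check=True)

    # ...then pull from the mirror into the integration repo and do the merges there.
    subprocess.run(["hg", "-R", INTEGRATION, "pull", LOCAL_MIRROR], check=True)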