Can SAN/NAS storage be attached to a cloud server? - mysql

My boss has asked my team to relocate our database to a cloud server (Windows). Besides that, he also asked us to attach SAN/NAS storage to that server for better speed/performance. The problem is that we have no experience with SAN/NAS storage.
The question is: can SAN/NAS storage be attached to a cloud server? If it can, is this good practice? We are currently using MySQL for our database.
Thanks

Are we talking about a private or a public cloud (AWS, Azure)? Though there are storage arrays that can proxy cloud storage, I don't think there is a product that attaches an on-site storage array to a server in a public cloud.
The reason you want to use e.g. a SAN is performance, i.e. minimum latency. Imagine the connection between a storage array in a separate datacenter and a cloud server over TCP/IP, possibly far apart. The latency would make it unusable for e.g. a high-transaction workload and would defeat the purpose of a storage array.
If you were talking about a private cloud (VMware-orchestrated or OpenStack), then that might be possible via RDM (VMware) or Cinder (probably a Cinder storage node). I think Azure is adding a feature where you can integrate part of your local infrastructure into Azure as an availability zone, so there might be possibilities.


Are GCP CloudSQL instances billed by usage?

I'm starting a project where a Cloud SQL instance would be a great fit; however, I've noticed they are twice the price of a VM with the same specification on GCP.
I've been told by several DevOps engineers I work with that they are billed by usage only, which would be perfect for me. However, the pricing page states that "Instance pricing for MySQL is charged for every second that the instance is running".
https://cloud.google.com/sql/pricing#2nd-gen-pricing
I also see several people around the web saying they are usage only.
Cloud SQL or VM Instance to host MySQL Database
Am I interpreting Google's pricing pages incorrectly?
Am I going to be billed for the instance being on or for its usage?
Billed by usage
It all depends on what you mean by USAGE. When you run a Cloud SQL instance, it's like a server (Compute Engine). Until you stop it, you will pay for it. It's not pay-per-request pricing, as you have with BigQuery.
With Cloud SQL, you will also pay for the storage that you use, and the storage can grow automatically according to usage. Be careful: the storage can't be reduced, even if you delete data in the database!
Price is twice that of a similar Compute Engine
True! A Compute Engine n1-standard-1 is about $20 per month, and the same config on Cloud SQL is about $45.
BUT, what about the cost of managing your own SQL instance?
You have to update/patch the OS
You have to update/patch the DB engine (MySQL or Postgres)
You have to manage the security/network access
You have to perform snapshots, ensure that the restoration works
You have to ensure the High Availability (people on call in case of server issue)
You have to tune the Database parameters
You have to watch your storage and increase it when needed
You have to set up your replicas manually
Is it worth twice the price? For me, yes. It all depends on your skills and your opinion.
There are a lot of hidden configuration options that, when modified, can each quickly halve your costs.
Practically speaking, GCP's Cloud SQL product only works by running 24/7; there is no time-based 'by usage' option, short of manually stopping and restarting the instance yourself.
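If you do want to avoid paying for idle time, a rough sketch of that manual stop/start (the instance name my-instance is a placeholder) is to change the instance's activation policy with gcloud:
# Stop the instance: the per-second instance charge stops, but storage is still billed
$ gcloud sql instances patch my-instance --activation-policy=NEVER
# Start it again when you need it
$ gcloud sql instances patch my-instance --activation-policy=ALWAYS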
There are a lot of tricks you can follow to lower costs; you can read about many of them here: https://medium.com/#the-bumbling-developer/can-you-use-google-cloud-platform-gcp-cheaply-and-safely-86284e04b332

Host a MySQL Server

I am making a JavaFX program and need to use a small MySQL database. Currently I am hosting one on my computer, but I can't access it from other computers on other networks. I need the MySQL server to be accessible from anywhere. How do I host one that does that? Thanks in advance, all help is welcome.
Well you have a few options depending on how important this MySQL database is to you, how you intend to connect to it from outside, and what you want to do with it.
The naive implementation would involve opening your firewall and forwarding all incoming traffic on whatever port you have configured MySQL for to the IP address of your server. If you do this, you absolutely must secure your database with a password!!! You'll also need to keep the server's public IP address handy so you know how to find it when you're out.
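A rough sketch of that naive setup on a Linux host (this assumes Ubuntu's default MySQL config path, ufw as the firewall, and the default port 3306; the user, password, and database names are placeholders):
# Let MySQL listen on all interfaces instead of only 127.0.0.1 (Ubuntu config path assumed)
$ sudo sed -i 's/^bind-address.*/bind-address = 0.0.0.0/' /etc/mysql/mysql.conf.d/mysqld.cnf
$ sudo systemctl restart mysql
# Open the MySQL port on the host firewall (your router must also forward 3306)
$ sudo ufw allow 3306/tcp
# Create a password-protected, least-privilege account instead of exposing root
$ mysql -u root -p -e "CREATE USER 'appuser'@'%' IDENTIFIED BY 'use-a-strong-password'; GRANT SELECT, INSERT, UPDATE, DELETE ON appdb.* TO 'appuser'@'%';"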
Use Amazon AWS, Google Compute Engine, Google App Engine, or some other cloud platform to host a MySQL instance. All the big players also tend to host pretty awesome RDBMS solutions. The advantage here is that you're not exposing your home computer to malice, and you are connecting into an ecosystem that will answer a lot of other questions for you as they come up along the way (i.e. how do you ensure redundancy? Backups? Scale your network for traffic?). There's a ton of other advantages too. It's the cloud... dude...
Use a SaaS DB service such as Firebase (Note: We are leaving MySQL and SQL database territory with Firebase)
If you plan to let other parties access your MySQL instance to make use of your data, you might also want to consider implementing a REST API (or a SOAP API, if you hate the future) that acts as an abstraction layer to interact with your database and provide its data in a consistent and reliable format.
That's the best answer I can give with the details afforded. Look around, though; the options in this arena are nearly limitless depending on what you're trying to do and how.
You should be able to access your machine from your LAN pretty easily unless there are firewall rules preventing connections to your machine. Another way: many cloud hosting providers have a free tier you can sign up for to bring up a test instance of MySQL. Example: OpenShift.

Share a persistent disk between Google Compute Engine VMs

From Google's documentation:
It is possible to attach a persistent disk to more than one instance. However, if you attach a persistent disk to multiple instances, all instances must attach the persistent disk in read-only mode. It is not possible to attach the persistent disk to multiple instances in read-write mode.
If you attach a persistent disk in read-write mode and then try to attach the disk to subsequent instances, Google Compute Engine returns an error.
So I need to have a shared persistent disk as a frontend for all my Compute Engine instances. Good, but how can you write to this shared disk?
My guess (I hope) is that a read/write persistent disk can be attached to only one Compute Engine instance, but that the same disk can be shared read-only with other VMs. Is that right?
Let's say I have 2 Compute Engine VMs and 2 persistent disks.
Is this flow possible?
compute1 read/write disk1 and read only disk2
compute2 read/write disk2 and read only disk1
Update: this is available as of 2020-06-16
As per another answer by Matthew Lenz, the functionality for creating multi-writer persistent disks is available, but it's still in alpha status (even though it's documented as being in the beta track) and requires special per-project enablement.
Note: This GitHub issue notes that the functionality is still in alpha, even though it's labelled as beta. You can submit feedback via Cloud Console to request it for your project if you'd like to get early access to this functionality, but it's not guaranteed to be enabled.
Assuming your project has the permissions to use this feature (or the feature becomes public-access), note that it comes with some caveats:
--multi-writer
Create the disk in multi-writer mode so that it can be attached with read-write access to multiple VMs. Can only be used with zonal SSD persistent disks. Disks in multi-writer mode do not support resize and snapshot operations.
You can use this via:
$ gcloud beta compute disks create DISK_NAME --multi-writer [...]
Note the caveats:
zonal SSD persistent disks only
no disk resizing
no snapshots
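For example, a hedged sketch (the disk name, VM names, and zone are made up) of creating a multi-writer SSD disk and attaching it read-write to two VMs in the same zone:
# Create a zonal SSD persistent disk in multi-writer mode
$ gcloud beta compute disks create shared-data --size=100GB --type=pd-ssd --multi-writer --zone=us-central1-a
# Attach it read-write to two existing VMs in that zone
$ gcloud compute instances attach-disk vm-1 --disk=shared-data --zone=us-central1-a
$ gcloud compute instances attach-disk vm-2 --disk=shared-data --zone=us-central1-a
Note that a multi-writer block device still needs a cluster-aware filesystem (or an application that coordinates writes) on top of it; mounting an ordinary ext4 filesystem read-write from two VMs at once will corrupt it.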
If these trade-offs are not acceptable to you, see the original answer (below) which has a long list of recommended storage alternatives for sharing data between multiple GCE VMs.
Original answer (valid prior to 2020-06-16)
No, this is not possible, as the documentation that you cited said at the time of writing (it has since been updated):
However, if you attach a persistent disk to multiple instances, all instances must attach the persistent disk in read-only mode.
The documentation has been re-arranged since then; the new docs are at a different URL but with the same content:
You can attach a non-root persistent disk to more than one virtual machine instance in read-only mode, which allows you to share static data between multiple instances. Sharing static data between multiple instances from one persistent disk is cheaper than replicating your data to unique disks for individual instances.
If you attach a persistent disk to multiple instances, all of those instances must attach the persistent disk in read-only mode. It is not possible to attach the persistent disk to multiple instances in read-write mode. If you need to share dynamic storage space between multiple instances, connect your instances to Cloud Storage or create a network file server.
If you have a persistent disk with data that you want to share between multiple instances, detach it from any read-write instances and attach it to one or more instances in read-only mode.
which means you cannot have one instance have write access while another has read-only access.
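In practice, the read-only sharing that is allowed looks roughly like this (all disk, VM, and zone names here are hypothetical):
# Detach the disk from the instance that was writing to it
$ gcloud compute instances detach-disk writer-vm --disk=static-data --zone=us-central1-a
# Attach it read-only to as many instances as you like
$ gcloud compute instances attach-disk vm-1 --disk=static-data --mode=ro --zone=us-central1-a
$ gcloud compute instances attach-disk vm-2 --disk=static-data --mode=ro --zone=us-central1-a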
If you want to share data between them, you need to use something other than Persistent Disk. Below are some possible solutions.
You can use any of the following hosted/managed services:
Google Cloud Filestore — perhaps closest to what you're looking for, as it provides an NFSv3 file system
You can also use Elastifile on GCP as a fully-managed service; note that GCP acquired Elastifile in July 2019
Google Cloud Datastore
Google Cloud Storage, which you can use via the GCS API (JSON or XML) or mount as a filesystem using gcsfuse (see the mount sketch after this list)
Google Cloud Bigtable
Google Cloud SQL
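For the Cloud Storage option above, a minimal gcsfuse sketch (the bucket name is made up); run it on each VM that needs shared access:
$ sudo mkdir -p /mnt/shared
$ gcsfuse my-shared-bucket /mnt/shared      # mount the bucket as a filesystem
$ fusermount -u /mnt/shared                 # unmount when done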
Alternatively, you can run your own:
self-managed or third-party-managed file server solutions, including NetApp and Panzura
self-managed Elastifile storage deployment (for fully-managed, see previous section for the link)
database (whether SQL or NoSQL)
distributed filesystem such as Ceph, GlusterFS, OrangeFS, ZFS, etc.
file server such as NFS or SAMBA
single VM as a data storage node, and use sshfs to create a FUSE mount from other VMs that want to access that data
GCP has alpha functionality for 'multi-writer' persistent disks. It's been in alpha for quite a long time, so who knows if it'll make it to beta or GA any time soon. Here is a link to the documentation: https://cloud.google.com/sdk/gcloud/reference/beta/compute/disks/create#--multi-writer
EDIT: 2020-06-16. This has been promoted to beta.

Storage options for diskless servers [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
I am trying to build a neural network simulation running on several high-CPU diskless instances. I am planning to use a persistent disk to store my simulation code and training data and to mount it on all server instances. It is basically a map-reduce kind of task (several nodes work on the same training data, and the results of all nodes need to be collected into one single results file).
My only question now is: what are my options for (permanently) saving the simulation results of the different servers (either at some points during the simulation or once at the end)? Ideally, I would love to write them to the single persistent disk mounted on all servers, but this is not possible because I can only mount it read-only on more than one server.
What is the smartest (and cheapest) way to collect all simulation results of all servers back to one persistent disk?
Google Cloud Storage is a great way to permanently store information in the Google Cloud. All you need to do is enable that product for your project, and you'll be able to access Cloud Storage directly from your Compute Engine virtual machines. If you create your instances with the 'storage-rw' service account scope, access is even easier, because you can use the gsutil command built into your virtual machines without needing to do any explicit authorization.
To be more specific, go to the Google Cloud Console, select the project with which you'd like to use Compute Engine and Cloud Storage, and make sure both of those services are enabled. Then use the 'storage-rw' service account scope when creating your virtual machine. If you use gcutil to create your VM, simply add the --storage_account_scope=storage-rw flag (there's also an intuitive way to set the service account scope if you're using the Cloud Console to start your VM). Once your VM is up and running, you can use the gsutil command freely without worrying about interactive login or OAuth steps. You can also script your usage by integrating any desired gsutil requests into your application (gsutil will also work in a startup script).
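As a rough illustration (the bucket and file names are placeholders), once the VM has the storage-rw scope you can push results straight to a bucket:
$ gsutil mb gs://my-simulation-results                    # create the bucket once
$ gsutil cp results-node-01.csv gs://my-simulation-results/
$ gsutil ls gs://my-simulation-results/                   # verify the upload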
More background on the service account features of GCE can be found here.
Marc's answer is definitely best for long-term storage of results. Depending on your I/O and reliability needs, you can also set up one server as an NFS server, and use it to mount the volume remotely on your other nodes.
Typically, the NFS server would be your "master node", and it can serve both binaries and configuration. Workers would periodically re-scan the directories exported from the master to pick up new binaries or configuration. If you don't need a lot of disk I/O (you mentioned neural simulation, so I'm presuming the data set fits in memory, and you only output final results), it can be acceptably fast to simply write your output to NFS directories on your master node, and then have the master node backup results to some place like GCS.
The main advantage of using NFS over GCS is that NFS offers familiar filesystem semantics, which can help if you're using third-party software that expects to read files off filesystems. It's pretty easy to sync down files from GCS to local storage periodically, but does require running an extra agent on the host.
The disadvantages of setting up NFS are that you probably need to sync UIDs between hosts, that NFS can be a security hole (I'd only expose NFS on my private network, not to anything outside 10/8), and that it will require installing additional packages on both client and server to set up the shares. Also, NFS will only be as reliable as the hosting machine, while an object store like GCS or S3 will be implemented with redundant servers and possibly even geographic diversity.
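A hedged sketch of that NFS setup on Debian/Ubuntu images (the hostname, paths, and the 10/8 range are placeholders for your own network):
# On the master node: install the server and export a results directory
$ sudo apt-get install -y nfs-kernel-server
$ sudo mkdir -p /srv/results
$ echo '/srv/results 10.0.0.0/8(rw,sync,no_subtree_check)' | sudo tee -a /etc/exports
$ sudo exportfs -ra
# On each worker: install the client and mount the export
$ sudo apt-get install -y nfs-common
$ sudo mkdir -p /mnt/results
$ sudo mount -t nfs master-node:/srv/results /mnt/results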
If you want to stay in the Google product space, how about Google Cloud Storage?
Otherwise, I've used S3 and boto for these kinds of tasks.
As a more general option, you're asking for some sort of general object store. Google, as noted in previous responses, makes a nice package, but nearly all cloud providers offer some storage option. Make sure your cloud provider has BOTH key options: a volume store (storage for data similar to a virtual disk) and an object store (a key/value store). Both have their strengths and weaknesses. Volume stores are drop-in replacements for virtual disks. If you can use stdio, you can likely use a remote volume store. The problem is, they often have the structure of a disk. If you want anything more than that, you're asking for a database. The object store is a "middle ground" between the disk and the database. It's fast and semi-structured.
I'm an OpenStack user myself -- first, because it provides both storage families, and second, because it's supported by a variety of vendors, so if you decide to move from vendor A to vendor B, your code can remain unchanged. You can even run a copy of it on your own machines (go to www.openstack.org). Note, however, that OpenStack does like memory. You're not going to run your private cloud on a 4GB laptop! Consider two 16GB machines.

Hosted Database v Cloud Database

I have looked everywhere...
What's the difference between a hosted database and a cloud database? They seem like the same thing.
Thanks
Both "hosted database" and "cloud database" mean that the database lives on the servers of some external provider/hoster.
The hoster might even be the same in both cases.
The main difference is that the "cloud" plans are usually meant to scale more (at a higher monthly fee), so you'd use them when you expect your site to get huge soon and need to quickly adjust server capacity when needed.
On the other hand, "hosted" plans are not that expensive, but have more limitations (server speed, database size...) and are more suited for "small" websites.
Of course this isn't by any means an "official" description of the two terms, but that's the impression that I get every time I see "cloud" or "hosted" webspaces/databases/services/whatever.
It depends on the context in which they're being used, but, yes, they usually mean the same thing. When I see the term cloud database being used they are usually referencing some cloud platform like Amazon EC2 or Microsoft Azure instead of GoDaddy or HostGator or something. Plus, cloud is the new buzz word - I'm sure it sells better. Lol.
As Christian Specht said, cloud servers really do scale more. So why do you need more scaling? And why are there so many feature options when selecting a cloud database service?
Things are not like before. Before smartphones, and on earlier PC operating systems, users got information from the server only when they logged on to a specific web page using their credentials. But now apps like Facebook show notifications, serve ads, etc., and collect/push other data in parallel while we are looking at something else entirely.
A hosted database is fine when users only access the database after logging onto the web page. But the latest smartphone applications need to access the database all the time, starting from their birth (installation on the device). So with each installation, the minimum workload on the server is expected to rise.
So more scalability is required here: more simultaneous connections and more input/output requests are expected every day. With dedicated servers built for this core purpose, and configurable packages you can select based on your expected user count and bandwidth usage, "cloud service" is not just another marketing term but a genuinely helpful service.