So I saw there is an option in Google Compute Engine (I assume the same option exists with other cloud VM providers, so the question isn't specifically about Google Compute Engine but about the underlying technology) to resize the disk without having to restart the machine, and I'm asking: how is this possible?
Even if it uses some sort of abstraction over the disk and they don't actually assign a physical disk to the VM, but just part of a disk (or parts of a number of disks), once the disk is created the guest VM sees it as having a certain size. How can that change without needing a restart? Does it use NFS somehow?
This is built directly into disk protocols these days. The capability has existed for a while, since disks have been virtualized since the late 1990s (either through network protocols like iSCSI / Fibre Channel, or through software-emulated hardware as in VMware).
Like the VMware model, GCE doesn't require any additional network hops or protocols to do this; the hypervisor just exposes the virtual disk as if it were a physical device, and the guest knows that its size can change and handles that. GCE uses a virtualization-specific driver type for its disks called VirtIO SCSI, but this feature is implemented in many other driver types (across many OSes) as well.
Since a disk can be resized at any time, disk protocols need a way to tell the guest that an update has occurred. In general terms, this works as follows in most protocols:
Administrator resizes disk from hypervisor UI (or whatever storage virtualization UI they're using).
Nothing happens inside the guest until it issues an IO to the disk.
Guest OS issues an IO command to the disk, via the device driver in the guest OS.
Hypervisor emulates that IO command, notices that the disk has been resized and the guest hasn't been alerted yet, and returns a response to the guest telling it to update its view of the device.
The guest OS recognizes this response and re-queries the device size and other details via some other command.
I'm not 100% sure, but I believe the reason it's structured like this is that traditionally disks cannot send updates to the OS unless the OS requests them first. This is probably because the disk has no way to know what memory is free to write to, and even if it did, no way to synchronize access to that memory with the OS. However, those constraints are becoming less true to enable ultra-high-throughput / ultra-low-latency SSDs and NVRAM, so new disk protocols such as NVMe may do this slightly differently (I don't know).
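To make that concrete, here is a rough sketch of what the flow looks like with a Linux guest on GCE, assuming a persistent disk named my-disk in zone us-central1-a that is exposed inside the guest as /dev/sda with an ext4 filesystem on partition 1 (all of those names are placeholders):

```
# Resize the virtual disk from outside the VM (no reboot needed).
gcloud compute disks resize my-disk --size=100GB --zone=us-central1-a

# Inside the guest: ask the SCSI layer to re-read the device size.
# (Many distros pick this up automatically on the next IO, as described above.)
echo 1 | sudo tee /sys/class/block/sda/device/rescan

# Grow the partition and then the filesystem to use the new space.
sudo growpart /dev/sda 1
sudo resize2fs /dev/sda1    # use xfs_growfs instead for XFS filesystems
```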
While configuring the IoT Agent for Ultralight 2.0 there is the possibility to set the Docker variable IOTA_REGISTRY_TYPE: whether to hold IoT device info in memory or in a database (MongoDB by default). This is the documentation I'm referencing.
Firstly, I would like to set it to memory: what would that imply?
Could the data be preserved only in some allocated part of memory within the Docker environment? Could I omit further variables in the configuration file, like IOTA_MONGO_HOST (the hostname of MongoDB, used for holding device information)?
The architecture of my system has a Raspberry Pi running the IoT Agent and a VM running the Orion Context Broker and MongoDB. Both are reachable because they see each other on the LAN. Is it necessary for MongoDB to be the same database for the IoT Agent and the Orion Context Broker if they are linked?
Is it possible to run the IoT Agent with memory-only device information persistence (instead of the database type)? Will it have any effect on the whole running infrastructure besides the obvious lack of persisted device data?
Firstly, I would like to set it to memory: what would that imply?
There would be no need for a MongoDB database attached to the IoT Agent, but there would be no persistence of provisioned devices in the event of disaster recovery.
Could the data be preserved only in some allocated part of memory within the Docker environment?
No
Could I omit further variables in the configuration file, like IOTA_MONGO_HOST (the hostname of MongoDB, used for holding device information)?
The Docker ENV parameters are merely overrides of the values found in the config.js within the enabler itself, so all of the ENV variables can be omitted if you are using the defaults.
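As a rough sketch (the port mappings and values are placeholders and the flag list is abbreviated), running the Ultralight agent with the in-memory registry could look like this; note that the Mongo-related variables are simply left out:

```
# Hypothetical minimal invocation of the Ultralight IoT Agent using the
# in-memory device registry; no IOTA_MONGO_* variables are needed.
docker run -d --name iot-agent \
  -p 4041:4041 -p 7896:7896 \
  -e "IOTA_CB_HOST=orion" \
  -e "IOTA_CB_PORT=1026" \
  -e "IOTA_NORTH_PORT=4041" \
  -e "IOTA_HTTP_PORT=7896" \
  -e "IOTA_REGISTRY_TYPE=memory" \
  fiware/iotagent-ul
```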
Is it necessary for MongoDB to be the same database for the IoT Agent and the Orion Context Broker if they are linked?
The IoT Agent and Orion can run entirely separately and would usually use separate MongoDB instances; at least, this would be the case in a properly architected production environment.
The Step-by-Step Tutorials are lumping everything together on one Docker engine for simplicity. A proper architecture has been sacrificed to keep the narrative focused on the learning goals. You don't need two MongoDB instances to handle fewer than 20 dummy devices.
When deploying to a production environment, try looking at the SmartSDK Recipes in order to scale up to a proper architecture: see https://smartsdk.github.io/smartsdk-recipes/
Is it possible to run the IoT Agent with memory-only device information persistence (instead of the database type)? Will it have any effect on the whole running infrastructure besides the obvious lack of persisted device data?
I haven't checked this, but there may be a slight difference in performance, since memory access should be slightly faster. The trade-off is that you will lose the provisioned state of all devices if a failure occurs. If you need to invest in disaster recovery, then MongoDB is the way to go; periodically back up your database so you can always return to a last known good state.
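For the back-up part, something along these lines is usually enough for a small deployment (the host address and paths are placeholders):

```
# Dump the databases from the MongoDB host on the VM.
mongodump --host 192.168.1.20 --port 27017 --out /backups/mongo-$(date +%F)

# Restore a last-known-good dump after a failure.
mongorestore --host 192.168.1.20 --port 27017 /backups/mongo-2024-01-01
```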
I have a Compute Engine instance on Google Cloud which is running fine. The user base is increasing and I wish to upgrade to a bigger Compute Engine instance in terms of CPU and memory.
What is the easiest way to do such a migration?
What are the snapshot, image, and persistent disk features in Google Compute Engine? Are they in any way useful for my task?
I figured it out. Lennert's answer is good; I will add a few more things to complete it. You can always stop a VM, edit the CPU/memory, and restart the VM, but this action may change the external IP address and cause a lot of issues. You can handle it, but it may cause further downtime: you may have to update the new IP address in DNS and inside your code. One way to avoid this hassle is to reserve a static IP address (in the console, go to NETWORKING > EXTERNAL IP ADDRESSES > RESERVE A STATIC IP ADDRESS). If you do this, your IP address will not change when you restart the VM.
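If you prefer the command line, the same reservation can be done by promoting the VM's current ephemeral IP to a static address, roughly like this (the name, address and region are placeholders):

```
# Reserve the VM's existing external IP as a static address so it
# survives stop/start cycles.
gcloud compute addresses create my-static-ip \
  --addresses=203.0.113.10 \
  --region=us-central1
```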
An image is essentially the operating system. While creating a VM you are asked to choose a boot disk, the disk used to boot your VM from. You can select from pre-defined images.
A snapshot is a copy of a disk. If it is a boot disk, it contains the operating system image too. We can create a snapshot of an existing disk and use it as the boot disk while creating a new VM.
A persistent disk is a disk that persists even if you delete the VM (provided you have deselected the option to delete it when deleting the VM). We can delete a VM and use its persistent disk to create new ones. We can even pay for just a persistent disk, without having any VM.
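For example, snapshotting an existing disk and creating a fresh disk from that snapshot can be done from the CLI roughly like this (the disk name, snapshot name and zone are placeholders):

```
# Snapshot the current boot disk.
gcloud compute disks snapshot my-boot-disk \
  --snapshot-names=my-boot-snap --zone=us-central1-a

# Create a new persistent disk from that snapshot (usable as the boot disk
# for a new, bigger VM).
gcloud compute disks create my-new-disk \
  --source-snapshot=my-boot-snap --zone=us-central1-a
```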
The easiest way is to stop the machine, change the machine type from the console and start the machine again. No need to create backups (snapshots), new VMs, etc.
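With the gcloud CLI the same three steps look roughly like this (instance name, zone and machine type are placeholders):

```
gcloud compute instances stop my-instance --zone=us-central1-a

# The machine type can only be changed while the instance is stopped.
gcloud compute instances set-machine-type my-instance \
  --machine-type=n1-standard-4 --zone=us-central1-a

gcloud compute instances start my-instance --zone=us-central1-a
```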
From what I gather, the only way to use a MySQL database with Azure Websites is to use ClearDB, but can I install MySQL on the VMs provided in Azure Cloud Services? And if so, how?
This question might get closed and moved to ServerFault (where it really belongs). That said: ClearDB provides MySQL-as-a-Service in Azure. It has nothing to do with what you can install in your own Virtual Machines. You can absolutely do a VM-based MySQL install (or any other database engine that you can install on Linux or Windows). In fact, the Azure portal even has a tutorial for a MySQL installation on OpenSUSE.
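As a sketch of the VM-based route on an Ubuntu VM (the resource group and VM name are placeholders, and the port-opening step uses the current Azure CLI):

```
# On the VM: install and enable MySQL.
sudo apt-get update
sudo apt-get install -y mysql-server
sudo systemctl enable --now mysql

# From your workstation: open the MySQL port on the VM's network rules.
# (Only do this if you really need public access; prefer a vnet.)
az vm open-port --resource-group my-rg --name my-mysql-vm --port 3306
```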
If you're referring to installing in web/worker roles: This simply isn't a good fit for database engines, due to:
the need to completely script/automate the install with zero interaction (which might take a long time). This includes all necessary software being downloaded/installed to the VM images every time a new instance is spun up.
the likely inability of a database cluster to cope with arbitrary scale-out (the typical use case for web/worker roles). Database clusters may or may not work well when a scale-out occurs (adding an additional VM). Same thing when scaling in (removing a VM).
less-optimal attached-storage configuration
inability to use Linux VMs
So, assuming you're still OK with Virtual Machines (vs. stateless Cloud Service VMs): you'll need to carefully plan your deployment, with decisions such as:
Distro (Ubuntu, CentOS, etc). Azure-supported Linux distro list here
Selecting proper VM size (the DS series provide SSD attached disk support; the G series scale to 448GB RAM)
Azure Storage attached disks being non-Premium or Premium (premium disks are SSD-backed, durable disks scaling to 1TB/5000 IOPS per disk, up to 32 disks per VM depending on VM size)
Virtual network configuration (for multi-node cluster)
Accessibility of the database cluster (whether your app is in the vnet or accesses it through a public endpoint; and if the latter, setting up ACLs)
Backup / HA / DR planning
Someone else mentioned using a pre-built VM image from VM Depot. Just realize that, if you go that route, you're relying on someone else to configure the database engine install for you. This may or may not be optimal for what you're trying to achieve. And the images may or may not be up-to-date with the latest versions, patches, etc.
Of course, what I wrote applies to any database engine you install in your own virtual machines, where a service provider (such as ClearDB) tends to take care of most of these things for you.
If you are talking about standard VMs then you can use pre-built images from VM Depot for that.
If you are talking about web or worker roles (PaaS) I wouldn't recommend it, but if you really want to you could. You would need to fully script the install of the solution on the host. The only downside (and it's a big one) is that the role will be moved to a new host at some point, which means your MySQL data files would be lost. If you backed up frequently and were happy to lose some data, then this option may work for you.
I think the main question is: what do you want to achieve? As I see it, you want to use a PaaS solution with Web Apps or a Cloud Service and you need a MySQL database. If so, you have two options (both technically possible, as David Makogon said). The first is to deploy your own (single) server with MySQL and connect to it from the outside (internet side). The second is to create a MySQL server or cluster and connect your application internally within an Azure virtual network. With a Cloud Service this is simple, but with a Web App it is not: you must create a VPN gateway in the Azure virtual network and connect your Web App to that gateway. This way you will have an internal connection from your application to your own MySQL cluster.
I am new to GCE. I was able to create a new instance using the gcutil tool and the GCE console. There are a few questions unclear to me and I need help:
1) Does GCE provide a persistent disk when a new instance is created? I think it's 10 GB by default, but I'm not sure. What is the right way to stop the instance without losing the data saved on it, and what will be the charge (US zone) if, say, I need 20 GB of disk space for that?
2) If I need SSL to enable HTTPS, is there any extra step I should take? I think I will need to add a firewall rule as per the gcutil addfirewall command and create a certificate (or install one from a third party)?
1) Persistent disk is definitely the way to go if you want a root drive on which data retention is independent of the life cycle of any virtual machine. When you create a Compute Engine instance via the Google Cloud Console, the “Boot Source” pull-down menu presents the following options for your boot device:
New persistent disk from image
New persistent disk from snapshot
Existing persistent disk
Scratch disk from image (not recommended)
The default option is the first one ("New persistent disk from image"), which creates a new 10 GB PD, named after your instance name with a 'boot-' prefix. You could also separately create a persistent disk and then select the "Existing persistent disk" option (along with the name of your existing disk) to use an existing PD as a boot device. In that case, your PD needs to have been pre-loaded with an image.
Re: your question about cost of a 20 GB PD, here are the PD pricing details.
Read more about Compute Engine persistent disks.
2) You can serve SSL/HTTPS traffic from a GCE instance. As you noted, you'll need to configure a firewall to allow your incoming SSL traffic (typically port 443) and you'll need to configure https service on your web server and install your desired certificate(s).
Read more about Compute Engine networking and firewalls.
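For example, with the current gcloud CLI (gcutil addfirewall was the older equivalent), opening port 443 looks roughly like this (the rule name is a placeholder):

```
# Allow incoming HTTPS traffic to instances in the default network.
gcloud compute firewall-rules create allow-https --allow=tcp:443
```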
As an alternative approach, I would suggest deploying VMs using Bitnami. There are many stacks you can choose from, and this will save you time when deploying the VM. I would also suggest you go with SSD disks, as the pricing is close between magnetic disks and SSDs, but the performance boost is huge.
As for serving the content over SSL, you need to figure out how the requests will be processed. You can use NGINX or Apache servers. In this case you would need to configure the virtual hosts for the default ports: 80 for non-encrypted and 443 for SSL traffic.
The easiest way to serve SSL traffic from your VM is to generate SSL certificates using the Let's Encrypt service.
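A rough sketch for an NGINX-based VM on a Debian/Ubuntu image (the domain is a placeholder):

```
# Install certbot with the NGINX plugin, then request and install a certificate.
sudo apt-get install -y certbot python3-certbot-nginx
sudo certbot --nginx -d example.com
```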
Is there an easy way to set up an environment on one machine (or a VM) with MySQL replication? I would like to put together a proof of concept of MySQL replication with one master write instance and two slave instances for reads.
I can see doing it across 2 or 3 VMs running on my computer, but that would really bog down my system. I'd rather have everything running on the same VM. What's the best way to proof out scalability solutions like this in a local dev environment?
Thanks for your help,
Dave
I think that to truly test MySQL replication it is important to do so under realistic constraints.
If you put all the replica nodes under one operating system then you no longer have the bandwidth constraint; the data transfer speed would be much higher than what you would get if those replica DBs were on different sites.
Putting everything under one VM is also a configuration shortcut; for instance, it does not make you go through the networking configuration.
I suggest you use multiple VMs, even if you have to put them on one physical machine; you can always configure the hypervisor to make the packets go through a router, in which case the I/O will be bound by whatever throughput the network interface has.
I can see doing it across 2 or 3 VMs running on my computer, but that would really bog down my system.
You can try making a few VMs with JeOS (Just Enough OS) versions of the operating system you want. I know Ubuntu has one, and it can boot with 128 MB of RAM, which makes it convenient to deploy lots of cloned VMs under one physical machine without monster RAM.
Next step would be doing the same thing on a cloud (Infrastructure as a Service, IaaS) provider, and try your setup on different geographical sites.
If what you're testing is machine-to-machine replication, then setting up multiple VMs on a virtual private network would be the correct environment to test it. If you use Ubuntu Server, you don't have to install more than you actually need -- just give the VMs enough space for a base install + MySQL + your data. Memory usage can be as little as 256MB per VM. All you have to do is suspend or shutdown the VMs when you're not running a full-up test.
I've had situations where I was running 4 or more VMs simultaneously on my workstation, either for development or testing purposes -- it's not that taxing unless you're trying to do video rendering in each VM.
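To round this out, once the VMs are on the same virtual network, a minimal master/replica sketch might look like the following (the IPs, user, password and binlog coordinates are placeholders; this uses the classic MySQL binlog-position syntax):

```
# --- On the master VM ---
cat <<'EOF' | sudo tee /etc/mysql/conf.d/replication.cnf
[mysqld]
server-id = 1
log_bin   = /var/log/mysql/mysql-bin.log
EOF
sudo systemctl restart mysql
mysql -u root -p -e "CREATE USER 'repl'@'%' IDENTIFIED BY 'secret';
                     GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';
                     SHOW MASTER STATUS;"

# --- On each slave VM (use server-id 2 and 3) ---
cat <<'EOF' | sudo tee /etc/mysql/conf.d/replication.cnf
[mysqld]
server-id = 2
EOF
sudo systemctl restart mysql
# Fill in the file/position reported by SHOW MASTER STATUS above.
mysql -u root -p -e "CHANGE MASTER TO MASTER_HOST='192.168.56.10',
                     MASTER_USER='repl', MASTER_PASSWORD='secret',
                     MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=154;
                     START SLAVE;"
```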