What is the difference between BigQuery and MySQL?

As a beginner starting out in Data Analytics, I would like to know if they are similar (or different versions of the same thing), or if I have them confused for two entirely different concepts.

Similarities
What the two have in common is that you can use SQL to query data stored in both MySQL and BigQuery.
Differences
The two technologies have completely different use cases, so their philosophy, design, and internal architecture differ as well.
You can use MySQL to store data for a transactional system, or OLTP (online transaction processing). For example, if you have an e-commerce website, you can use a MySQL database to store data about users, orders, payments, and so on. You may have many transactions per second, but each transaction usually touches only one or a few rows in your database. MySQL and other relational database engines are good at that. They use some form of normalization to make write operations efficient and keep data consistent.
Now imagine you need to analyze the data of your e-commerce website over the last 5 years. Your query will now involve all your entries (or rows), but usually only some of the columns. You also won't have as many queries per second as in the previous situation. The two workloads are clearly different, and in this situation MySQL is no longer the optimal choice; an OLAP (online analytical processing) system is. BigQuery is an example of an OLAP system. With BigQuery, you store data for analysis, not for operational purposes.
Now that you see the two technologies serve different purposes, you can understand the differences in their design and architecture. For example, with BigQuery you're encouraged to denormalize data to avoid expensive JOIN operations. Internally, BigQuery stores data by column, not by row like MySQL. These decisions share a common goal: making analytic queries run efficiently.
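To make the row vs. column point concrete, here is a minimal Python sketch. It is only an illustration of the two access patterns, not how MySQL or BigQuery are actually implemented; the order data and the function names are made up for the example.

```python
# Hypothetical order data, for illustration only.
orders = [
    {"order_id": 1, "user_id": 10, "amount": 25.0, "status": "paid"},
    {"order_id": 2, "user_id": 11, "amount": 40.0, "status": "paid"},
    {"order_id": 3, "user_id": 10, "amount": 15.5, "status": "refunded"},
]

# Row-oriented layout (the MySQL-style idea): each record is kept together,
# so reading or updating one order touches one small unit of data.
rows_by_id = {row["order_id"]: row for row in orders}


def get_order(order_id):
    """OLTP-style access: fetch a single row by its key."""
    return rows_by_id[order_id]


# Column-oriented layout (the BigQuery-style idea): each column is kept
# together, so an aggregate over one column never reads the other columns.
columns = {name: [row[name] for row in orders] for name in orders[0]}


def total_amount():
    """OLAP-style access: scan one column across every row."""
    return sum(columns["amount"])


print(get_order(2))
print(total_amount())
```

The lookup by key is cheap in the row layout, while the sum only needs the "amount" list in the column layout. That is the trade-off the answer describes.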
You can research further about OLTP vs OLAP :).

MySQL is a free, general-purpose RDBMS that runs everywhere. It is extremely popular, really well supported, and extremely flexible.
BigQuery is a proprietary, Google-owned analytical database that can be expensive. It uses SQL but is more limited in features, although it can be easier to scale for certain types of problems and is more deeply embedded in the Google ecosystem.
You should always default to MySQL or Postgres unless you have a specific reason to use something like BigQuery. If you don't know which one you should use, you should use MySQL or Postgres.

Related

Using MySQL or MariaDB to store locations

I am designing a transportation system in which I need to store the location of each vehicle at least once or twice a minute. I want to find out which database is the better choice (MySQL or MariaDB) for this case in terms of performance and scalability. How much would it be worth switching to a NoSQL database such as MongoDB or something similar?
If you want to use features provided by NoSQL, you may choose MariaDB. It has a Cassandra storage engine, and you can use dynamic columns to store data NoSQL-style inside the MySQL engine.
In terms of scaling
NoSQL’s simpler data models can make the process easier, and many have been built with scaling functionality from the start. That is a generalization, so seek expert advice if you encounter this situation.
In terms of performance
NoSQL’s simpler denormalized store allows you to retrieve all information about a specific item in a single request. There’s no need for related JOINs or complex SQL queries.
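As a rough sketch of that difference: below, sqlite3 stands in for MySQL/MariaDB and a plain dict stands in for a document store, purely so the example runs without a server. The table names, columns, and sample values are hypothetical.

```python
import sqlite3

# Relational side: sqlite3 is used as a stand-in for MySQL (an assumption
# made so the sketch is self-contained). Table names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicles (id INTEGER PRIMARY KEY, plate TEXT);
    CREATE TABLE drivers (id INTEGER PRIMARY KEY, vehicle_id INTEGER, name TEXT);
    INSERT INTO vehicles VALUES (1, 'ABC-123');
    INSERT INTO drivers VALUES (1, 1, 'Dana');
""")


def vehicle_relational(vehicle_id):
    """Normalized data: the vehicle and its driver are assembled with a JOIN."""
    return conn.execute(
        "SELECT v.plate, d.name FROM vehicles v "
        "JOIN drivers d ON d.vehicle_id = v.id WHERE v.id = ?",
        (vehicle_id,),
    ).fetchone()


# Document side: a dict stands in for a NoSQL document store. Everything
# about the vehicle lives in one denormalized document.
documents = {1: {"plate": "ABC-123", "driver": {"name": "Dana"}}}


def vehicle_document(vehicle_id):
    """Denormalized data: one lookup returns the whole item."""
    return documents[vehicle_id]


print(vehicle_relational(1))
print(vehicle_document(1))
```

The price of the single lookup is that the driver's name is now copied into every document that mentions that driver, so updates have to touch each copy.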
When do you need NoSQL?
Your data requirements are unrelated, indeterminate, or evolving.
Speed and scalability are imperative.
When do you need MySQL?
Your data requirements are logical, related, and discrete, and can be identified up front.
Data integrity is essential.
EDIT:
You may check this link. He explained RDBMS vs NoSQL very well!

NoSQL Database Design

I'm looking at SQL & NoSQL Databases - namely MySQL and DynamoDB (both at AWS).
I'm building a dating/social network, and the demos I've built have been using a MySQL database with around 50 tables for logical separation of data, then using SQL queries (often with joins) to extract the required chunks of data to send back to browsers.
I'm moving to AWS and am doing a rebuild of the system, and I wanted to know if it would be possible to write a site like this 100% in NoSQL. I understand you don't know the specifics of the site, but it could be compared to any other dating/social network like Facebook (obviously more involved) or eHarmony/Match Maker etc...
Could a social site be built 100% on NoSQL? Or would a mix of NoSQL and SQL be more realistic?
thx
It's a very difficult question to answer without a deeper understanding of exactly what features you're after, and what language you are going to be writing the site in. There are lots of different types of NoSQL solutions.
NoSQL databases like Dynamo and Cassandra are Key-Value Stores. They offer a very different set of features than Document Databases like MongoDB and RavenDB. There are many other types as well.
Personally, I would be more than comfortable writing a social media site based entirely on RavenDB. But that's because I tend to focus on Domain Driven Design, and like to write in .NET/C#. It has all the features you would need, like querying indexes, map/reduce for big data jobs, full-text search, and spatial distance proximity searches. You could use their HTTP/REST API if you wanted to program from PHP or JavaScript, but their C# client is much easier to use.
Your requirements may be different than mine would be though. I would encourage you to try out several different NoSQL technologies before you settle on one. You may still find that you need a SQL (or MySQL) database for certain things that your NoSQL solution doesn't handle. For example, RavenDB isn't recommended for ad-hoc reporting - so many people set up a separate SQL Server database and replicate data from Raven into SQL so they can provide a separate reporting database to their power users.
The biggest thing to remember is that most NoSQL engines (like Cassandra) don't support the ad-hoc querying you get from SQL, so that has to be a factor in your design (i.e. many things you take for granted in SQL, like JOINs, are much harder in a NoSQL solution). With that being said, you most definitely can build full-featured applications using a NoSQL solution. I encourage you to look into the resources available from the many providers out there, like Cassandra, MongoDB, Dynamo, and many others.
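To give a feel for what "JOINs are much harder" means in practice, here is a small Python sketch. A dict stands in for a key-value store, and the key naming scheme ("user:<id>", "friends:<id>") is a hypothetical convention, not something DynamoDB or Cassandra prescribes.

```python
# A dict stands in for a key-value store. The key naming scheme is a
# hypothetical convention chosen for this example.
store = {
    "user:1": {"name": "Ann"},
    "user:2": {"name": "Bob"},
    "user:3": {"name": "Cy"},
    "friends:1": [2, 3],
}


def friends_of(user_id):
    """Application-side 'join': one lookup for the ids, then one per friend."""
    friend_ids = store.get(f"friends:{user_id}", [])
    return [store[f"user:{fid}"]["name"] for fid in friend_ids]


print(friends_of(1))
```

In SQL this would be a single query with a JOIN. Here the application has to know the access pattern up front and shape the keys around it, which is why the data model needs to be designed around the queries you expect to run.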

MongoDB vs MySQL storage space comparison

I am building a data warehouse in the range of 15+ TB. While storage is cheap, our budget is limited, so we have to squeeze as much data as possible into that space while maintaining performance and flexibility, since the data format changes quite frequently.
I tried Infobright (community edition) as a SQL solution and it works wonderfully in terms of storage and performance, but the limitations on data/table alteration make it almost a no-go, and Infobright's pricing for the enterprise version is quite steep.
After checking out MongoDB, it seems promising except for one thing. I was in a chat with a 10gen guy, and he stated that they don't really give much thought to storage space, since they flatten out the data to achieve performance and flexibility, and in their opinion storage is too cheap nowadays to be bothered with.
So can any experienced Mongo user out there comment on its storage space vs MySQL (as it is the standard we are comparing against right now)? If it's larger or smaller, can you give a rough ratio? I know it depends heavily on what sort of data you put in SQL and how you define the fields, indexing and such... but I am just trying to get a general idea.
Thanks for the help in advance!
MongoDB is not optimized for small disk space - as you've said, "disk is cheap".
From what I've seen and read, it's pretty difficult to estimate the required disk space due to:
Padding of documents to allow in-place updates
Attribute names are stored in each document, so you might save quite a bit by using abbreviations
No built in compression (at the moment)
...
IMHO the general approach is to build a prototype, insert data and see how much disk space your specific use case requires. The more realistic you can model your queries (inserts and updates) the better your result will be.
For more details see http://www.mongodb.org/display/DOCS/Excessive+Disk+Space as well.
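As a very rough way to see the effect of attribute names on size before building a full prototype, you can compare serialized sizes in Python. JSON is used here only as a stand-in for BSON (an assumption; real MongoDB sizes will differ because of padding, type information, and indexes), and the field names are made up.

```python
import json


def encoded_size(docs):
    """Total bytes when every document is serialized with its own field names."""
    return sum(
        len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))
        for doc in docs
    )


# Hypothetical records: identical values, long vs. abbreviated field names.
long_names = [
    {"customer_identifier": i, "transaction_amount": 100} for i in range(1000)
]
short_names = [{"cid": i, "amt": 100} for i in range(1000)]

long_size = encoded_size(long_names)
short_size = encoded_size(short_names)

print(long_size, short_size, long_size - short_size)
```

Because the names are repeated in every document, the saving grows linearly with the number of documents. A table in MySQL stores column names once in the schema, so it does not pay this per-row cost.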
Pros and Cons of MongoDB
For the most part, users seem to like MongoDB. Reviews on TrustRadius give the document-oriented database 8.3 out of 10 stars.
Some of the things that authenticated MongoDB users say they like about the database include its:
Scalability.
Readable queries.
NoSQL.
Change streams and graph queries.
A flexible schema for altering data elements.
Quick query times.
Schema-less data models.
Easy installation.
Users also have negative things to say about MongoDB. Some cons reported by authenticated users include:
User interface, which has a fairly steep learning curve.
Lack of joins, which can make some data retrieval projects difficult.
Occasional slowness in the cloud environment.
High memory consumption.
Poorly structured documentation.
Lack of built-in analytics.
Pros and Cons of MySQL
MySQL gets a slightly higher rating (8.6 out of 10 stars) on TrustRadius than MongoDB. Despite the higher rating, authenticated users still mention plenty of pros and cons of choosing MySQL.
Some of the positive features that users mention frequently include MySQL’s:
Portability that lets it connect to secondary databases easily.
Ability to store relational data.
Fast speed.
Excellent reliability.
Exceptional data security standards.
User-friendly interface that helps beginners complete projects.
Easy configuration and management.
Quick processing.
Of course, even people who enjoy using MySQL find features that they don’t like. Some of their complaints include:
Reliance on SQL, which creates a steeper learning curve for users who do not know the language.
Lack of support for full-text searches in InnoDB tables.
Occasional stability issues.
Dependence on add-on features.
Limitations on fine-tuning and common table expressions.
Difficulties with some complex data types.
MongoDB vs MySQL Performance
When comparing the performance of MongoDB and MySQL, you must consider how each database will affect your projects on a case-by-case basis. While some performance features may appear to be objectively promising, your team members may never use the features that drew you to a database in the first place.
MongoDB Performance
Many people claim that MongoDB outperforms MySQL because it allows them to create queries in multiple ways. To put it another way, MongoDB can be used without knowing SQL. While the flexibility improves MongoDB's performance for some organizations, SQL queries will suffice for others.
MongoDB is also praised for its ability to handle large amounts of unstructured data. Depending on the types of data you collect, this feature could be extremely useful.
MongoDB does not bind you to a single vendor, giving you the freedom to improve its performance. If a vendor fails to provide you with excellent customer service, look for another vendor.
MySQL Performance
MySQL performs extremely well for teams that want an open-source relational database that can store information in multiple tables. The performance that you get, however, depends on how well you configure the MySQL database. Configurations should differ depending on the intended use. An e-commerce site, for example, might need a different MySQL configuration than a team of research scientists.
No matter how you plan to use MySQL, the database’s performance gets a boost from full-text indexes, a high-speed transactional system, and memory caches that prevent you from losing crucial information or work.
If you don’t get the performance that you expect from MySQL data warehouses and databases, you can improve performance by integrating them with an excellent ETL tool that makes data storage and manipulation easier than ever.
MySQL vs MongoDB Speed
In most speed comparisons between MySQL and MongoDB, MongoDB is the clear winner. MongoDB is much faster than MySQL at accepting large amounts of unstructured data. When dealing with large projects, it's difficult to say how much faster MongoDB is than MySQL. The speed you get depends on a number of factors, including the bandwidth of your internet connection, the distance between your location and the database server, and how well you organise your data.
If all else is equal, MongoDB should be able to handle large data projects much faster than MySQL.
Choosing Between MySQL and MongoDB
Whether you choose MySQL or MongoDB probably depends on how you plan to use your database.
Choosing MySQL
For projects that require a strong relational database management system, such as storing data in a table format, MySQL is likely to be the better choice. MySQL is also a great choice for cases requiring data security and fault tolerance. MySQL is a good choice if you have high-quality data that you've been collecting for a long time.
Keep in mind that to use MySQL, your team members will need to know SQL. You'll need to provide training to get them up to speed if they don't already know the language.
Choosing MongoDB
When you want to use data clusters and query languages other than SQL, MongoDB may be a better option. Anyone who knows how to code in a modern language will be able to get started with MongoDB. MongoDB is also good at scaling quickly, allowing multiple teams to collaborate, and storing data in a variety of formats.
Because MongoDB does not use data tables to make browsing easy, some people may struggle to understand the information stored there. Users can grow accustomed to MongoDB's document-oriented storage system over time.

Using both MongoDB and MySQL in one project

I have been working for a week to learn MongoDB well enough to use it in my project. In my project, I will store a huge amount of geolocation data, and I think MongoDB is the most appropriate place to store it. In addition, speed is very important to me, and MongoDB responds faster than MySQL.
However, I will use some joins in some parts of the project, and I'm not sure whether I should store users' information in MongoDB or not. I heard some issues can occur in MongoDB during the writing process. Should I use only MongoDB with collections (instead of joins), or both of them?
In most situations I would recommend choosing one db for a project, if the project is not huge. On really big projects (or enterprises in general), I think long term organizations will use a combination of
RDBMS for highly transactional OLTP
NoSQL
a datawarehousing/BI project
But for things of more reasonable scope, just pick the one that does the core of the use case, and use it for everything.
IMO storing user data in MongoDB is fine -- you can do atomic operations on single BSON documents, so operations like "allocate me this username atomically" are doable. With redo logs (--journal) (v1.8+), replication, and slave-delayed replication, it is possible to have a pretty high degree of data safety -- as high as other db products on paper. The main argument against safety would be that the product is new and old software is always safer.
If you need to do very complex ACID transactions -- such as accounting -- use an RDBMS.
Also, if you need to do a lot of reporting, MySQL may be better at the moment, especially if the data set fits on one server. The SQL GROUP BY clause is quite powerful.
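For a sense of what GROUP BY buys you in reporting, here is a small sketch. It uses Python's sqlite3 as a stand-in for MySQL so it runs without a server; the "checkins" table and its rows are hypothetical, and the query itself is plain SQL that should carry over to MySQL.

```python
import sqlite3

# sqlite3 stands in for MySQL here (an assumption to keep this runnable).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE checkins (user_id INTEGER, city TEXT);
    INSERT INTO checkins VALUES
        (1, 'Oslo'), (2, 'Oslo'), (3, 'Lima'), (1, 'Lima'), (1, 'Oslo');
""")

# One statement produces a per-city report: total visits and distinct visitors.
report = conn.execute(
    "SELECT city, COUNT(*) AS visits, COUNT(DISTINCT user_id) AS visitors "
    "FROM checkins GROUP BY city ORDER BY city"
).fetchall()

for city, visits, visitors in report:
    print(city, visits, visitors)
```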
You won't be JOINing between MongoDB and MySQL.
I'm not sure I agree with all of your statements. Relative speed is something that's best benchmarked with your use case.
What you really need to understand is what the relative strengths and weaknesses of the two databases are:
MySQL supports the relational model, sets, and ACID; MongoDB does not.
MongoDB is better suited for document-based problems that can afford to forego ACID and transactions.
Those should be the basis for your choice.
MongoDB has some nice features to support geolocation work. It is not, however, necessarily faster out of the box than MySQL. Numerous benchmarks have been run that indicate that MySQL in many instances outperforms MongoDB (e.g. http://mysqlha.blogspot.com/2010/09/mysql-versus-mongodb-yet-another-silly.html).
Having said that, I've yet to have a problem with MongoDB losing information during writing. I would suggest that if you want to use MongoDB, you use it for the users as well, which will avoid having to do cross-database 'associations', and then only migrate the users to MySQL if it becomes necessary.

XML or MySQL: which should be used for storing connected data?

I am writing code for a friend list and messaging system for my college website. I need to store interconnected data and search it. It has about 3500 records. Which way should I proceed, MySQL or XML? Which is fastest, which is best, and why?
I'm going to use one of my professor's favorite answers here: "it depends."
XML and MySQL have very different applications. If you need to run lots of simultaneous queries for all sorts of sophisticated things, MySQL is your clear winner. MySQL can sometimes be hard to use because you must first create a database schema to fit your data. It sounds, though, like you have many records with the same structure, and it would be easy enough to put them into a database. With a SQL-based database engine like MySQL, you can also construct queries using the standard SQL language. Database optimizations can also help to increase the performance of these types of queries; for example, you can use indexes and keys. If your data needs to be updated regularly, then MySQL will likely provide better performance, as it will not have to rewrite the XML file. If you need your application to scale to many simultaneous connections running sophisticated queries, you are definitely going to want to go with some sort of SQL solution.
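A friend list like the one in the question could be sketched roughly as follows. This uses Python's sqlite3 as a stand-in for MySQL so it runs without a server; the table names, the index name, and the sample students are all hypothetical, and the schema is only one possible design.

```python
import sqlite3

# sqlite3 stands in for MySQL here (an assumption to keep this runnable).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE friendships (
        student_id INTEGER NOT NULL REFERENCES students(id),
        friend_id  INTEGER NOT NULL REFERENCES students(id),
        PRIMARY KEY (student_id, friend_id)
    );
    -- An index on name lets the engine find a student without scanning
    -- every row, which is the kind of optimization an XML file lacks.
    CREATE INDEX idx_students_name ON students(name);
    INSERT INTO students VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    INSERT INTO friendships VALUES (1, 2), (1, 3), (2, 1);
""")


def friends_of(name):
    """Return the names of a student's friends, sorted alphabetically."""
    rows = conn.execute(
        "SELECT f.name FROM students s "
        "JOIN friendships fs ON fs.student_id = s.id "
        "JOIN students f ON f.id = fs.friend_id "
        "WHERE s.name = ? ORDER BY f.name",
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


print(friends_of("Asha"))
```

Adding a friendship is a single INSERT into one table, whereas with an XML file the whole document would typically have to be parsed and written back.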
Depending upon your application, though, sometimes there are other ways to store and access your data. I, for one, once needed to create a persistent data structure on disk that could be accessed very quickly but never updated. For that, I used cdb. There are also other database systems out there, like Berkeley DB, and some NoSQL solutions such as CouchDB and MongoDB. I posed a somewhat interesting question here on Stack Overflow on the use of NoSQL solutions a little while back, which you may find interesting as well.
This is really just a sampling of different considerations you may want to make when you are choosing how you want to store your data. Think about questions like: How frequently will things be queried? or updated? What will your queries look like? What kinds of applications do you need to access your information from? etc.