I'm developing an application to post news. Could anyone suggest the best approach, keeping performance and scalability in view?
Approach 1:
Use one database per country, with a table per city, and post each news item to the relevant city table. This way I'm directing traffic to the respective databases/cities.
But it's very hard to show world news: I would need to join across multiple databases, and I still don't know how that would perform.
Approach 2:
Use a single database and save the news into multiple city tables. This way I would only need to join multiple tables for world news. I don't see much difference between approach 1 and approach 2.
Approach 3:
Use a single database and store each news item with a country_id and a city_id, with proper indexes on country_id and city_id.
I'm inclined towards approach 3, but won't searches become too heavy once there are 1M records?
Can anyone suggest an approach, please?
I'm using a MySQL database.
Thanks.
Approach 3 is the only sensible choice. MySQL will handle a million rows without any trouble.
It's pretty simple to build a prototype, stuff 10 million rows of random(ish) data in it, and measure performance. 10 million is not a typo. When it's practical, test with 10 times the data you expect. Learn to use EXPLAIN.
You should be able to build this kind of prototype in less than half an hour. It's a good skill to practice.
One database per country and one table per city are poor substitutes for partitioning.
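To make the prototype concrete, here is a minimal sketch of approach 3; the table and column names are assumptions made for the example, not something from the question. The composite indexes let a city or country feed be read in index order, and EXPLAIN shows whether a query actually uses them.

    -- Hypothetical schema for approach 3; names are illustrative only
    CREATE TABLE news (
      news_id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
      country_id   SMALLINT UNSIGNED NOT NULL,
      city_id      INT UNSIGNED NOT NULL,
      title        VARCHAR(255) NOT NULL,
      body         TEXT,
      published_at DATETIME NOT NULL,
      KEY idx_city_published (city_id, published_at),
      KEY idx_country_published (country_id, published_at)
    ) ENGINE=InnoDB;

    -- Crude way to grow test data after seeding a few rows:
    -- each run roughly doubles the row count
    INSERT INTO news (country_id, city_id, title, body, published_at)
    SELECT country_id, FLOOR(1 + RAND() * 10000), title, body,
           published_at - INTERVAL FLOOR(RAND() * 365) DAY
    FROM news;

    -- Check that the city feed uses idx_city_published instead of a full scan
    EXPLAIN
    SELECT news_id, title, published_at
    FROM news
    WHERE city_id = 42
    ORDER BY published_at DESC
    LIMIT 20;

World news is then simply the same query without the WHERE clause, which is exactly the flexibility the per-country and per-city designs give up.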
Say there are 1,000 tables in a database, and each table has 1,000 rows. When I search for a single table among these 1,000 tables, is the search time the same as that required to search for data within one of the tables?
In other words, does SQL use the same search algorithm to find a table out of 1,000 tables as it does to get data from a table with 1,000 rows?
No, MySQL doesn't use the same search algorithm to find a table.
MySQL maintains an in-memory "data dictionary" so when you run a query that names a specific table, it looks up that table very quickly. It's much faster for MySQL to identify a table than to search for data within a table. For example, the database servers I maintain at my job have over 150,000 tables, and this isn't a problem.
Does this mean you should split up your data over many tables to make it run faster? No -- that's not usually a good tradeoff. It makes your application code more complex, since your code needs to pick which table to query. You may also find cases where you wish the data were in one table, if you have to search for results across many of your tables.
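To illustrate that tradeoff, here is a small, hypothetical comparison; the table names are invented for the example, not taken from the question.

    -- Data split across many tables: the application must know and enumerate them
    SELECT * FROM orders_london  WHERE customer_id = 7
    UNION ALL
    SELECT * FROM orders_paris   WHERE customer_id = 7
    UNION ALL
    SELECT * FROM orders_berlin  WHERE customer_id = 7;

    -- Data in one table: one query, served by a single index on customer_id
    SELECT * FROM orders WHERE customer_id = 7;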
Here are a couple of principles to follow:
"Everything should be made as simple as possible, but not simpler." (attributed to Albert Einstein)
"First make it work, then make it right, and, finally, make it fast." (Stephen C. Johnson and Brian W. Kernighan, 1983)
Okay, so I have my user table ready with columns for all the technical information, such as username, profile picture, password and so on. Now I'm in a situation where I need to add superficial profile information, such as location, age, self-description, website, Facebook account, Twitter account, interests, etc. In total, I calculated this would amount to 12 new columns, and since my user table already has 18 columns, I'm at a crossroads. Other questions I read about this didn't really give a bottom-line answer on which method is most efficient.
I need to find out whether there is a more efficient way, and what the most efficient way to store this kind of information is. The base assumption is that my website will have millions of users in the future, so I need an option that is able to scale.
I have so far come up with two different options:
Option 1: Store the superficial data in the user table, taking the total column count in the users table up to 30.
Or
Option 2: Store the superficial data in a separate table, connected to the users table.
Which of these scales better? Which is more efficient? Is there a third option that is better than these two?
As an extra question, if anyone has information about this: how do the biggest sites on the internet handle it? Thanks to anyone who contributes an answer; it is hugely appreciated.
My current database is MySQL with the mysql2 gem in Rails 4.
In your case, I would go with the second option. I suppose this would be more efficient because you would retrieve data from table 1 whenever the user logs in, and you would use data from table 2 (the superficial data) whenever you change their preferences. You would not have to retrieve all the data every time you want to do something. Bottom line, I would suggest modelling your data according to your usage scenarios (use cases), creating data entities (e.g. tables) that match your use-case entities. Then you should take the database normalization principles into account.
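A minimal sketch of the second option, assuming a possible column split; the names are illustrative, not taken from the question.

    -- Core table: what login and session checks touch
    CREATE TABLE users (
      user_id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
      username        VARCHAR(64) NOT NULL UNIQUE,
      password_digest VARCHAR(255) NOT NULL,
      profile_picture VARCHAR(255),
      created_at      DATETIME NOT NULL
    ) ENGINE=InnoDB;

    -- Superficial profile data: read mainly on profile pages, 1:1 with users
    CREATE TABLE user_profiles (
      user_id   BIGINT UNSIGNED NOT NULL PRIMARY KEY,
      location  VARCHAR(100),
      age       TINYINT UNSIGNED,
      bio       TEXT,
      website   VARCHAR(255),
      facebook  VARCHAR(100),
      twitter   VARCHAR(100),
      CONSTRAINT fk_profiles_users FOREIGN KEY (user_id) REFERENCES users (user_id)
    ) ENGINE=InnoDB;

In Rails 4 this could map to a has_one association that is joined only on the pages that actually need the profile data.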
If you are interested in how these issues are handled by the biggest sites in the world, you should know that they do not use relational (SQL) databases. They actually use NoSQL databases, which run in a distributed fashion. This is a much more complicated scenario than yours. If you want to look at related tools, you could start reading about Cassandra and Hadoop.
Hope I helped!
If you need to access these 30 columns of information frequently, you could put all of them in the same table. That's what some widely used CMSes do because, even though a row is big, it's faster to retrieve one big row than plenty of small rows from various tables (more SQL requests, more lookups, more indexes, ...).
Also a good read for your problem is Database normalization.
For a website with 2,000,000 users, where each user shares thousands of pictures and each picture attracts thousands of comments, there will be more than 2,000 million comments. How can I manage that much data using MySQL, and how can the following methods improve the performance of my database server?
Use of table partitioning
Use of MySQL Cluster
Use of MySQL with memcached
Please explain other methods and best practices for handling such big database tables.
On top of the optimizations you mention, choosing the right indexes on the right fields is crucial for query performance: make sure your tables are indexed on everything you group, order, or search by.
Also check out Chapter 8 of the MySQL Reference Manual, which discusses optimization.
What you really should be focusing on is optimizing the structure, queries and indexes before getting into memcached and MySQL Cluster.
As your database grows, monitor performance and optimize accordingly.
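As a rough illustration of "structure, queries and indexes first", a comments table for this workload might look like the sketch below. The names and the year-based partitioning are assumptions for the example, not a prescription.

    -- Hypothetical comments table; names and partitioning scheme are illustrative
    CREATE TABLE comments (
      comment_id  BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
      picture_id  BIGINT UNSIGNED NOT NULL,
      user_id     BIGINT UNSIGNED NOT NULL,
      body        TEXT NOT NULL,
      created_at  DATETIME NOT NULL,
      PRIMARY KEY (comment_id, created_at),
      KEY idx_picture_created (picture_id, created_at)
    ) ENGINE=InnoDB
    PARTITION BY RANGE (YEAR(created_at)) (
      PARTITION p2014 VALUES LESS THAN (2015),
      PARTITION p2015 VALUES LESS THAN (2016),
      PARTITION pmax  VALUES LESS THAN MAXVALUE
    );

    -- Typical query: latest comments on a picture, using idx_picture_created
    SELECT comment_id, user_id, body, created_at
    FROM comments
    WHERE picture_id = 123456
    ORDER BY created_at DESC
    LIMIT 50;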
In this case I don't think a traditional RDBMS is what you need :) something like NoSQL would serve you best.
I'm fairly new to structuring databases, and I was wondering: say I have 38 different pieces of data that I want to have per record. Is it better to break that up into, say, a couple of different tables, or can I just keep it all in one table?
In this case I have a table of energy usage data for accounts: monthly usage, monthly demand, and demand percentage, plus two identifying keys for each, which comes out to 38 pieces of data for each record.
So is it good practice to break it up, or should I just leave it all as one table? Also, are there any effects on query efficiency once this database ends up accumulating a couple of thousand records at its peak?
Edit: I'm using Hibernate to query; I'm not sure if that would have any effect on efficiency depending on how I end up breaking this data up.
First, check the normal forms:
1) Wiki
2) A Simple Guide to Five Normal Forms in Relational Database Theory
Second, aggregated data like "monthly sales" or "daily clicks" typically goes into separate tables. This is motivated not only by the normal forms, but also by the implementation of the database.
For example, MySQL offers the Archive storage engine, which is designed for that.
If you're looking at the current month's data, it may appear in the same table or can be stored in a cache. The per-month data in a separate table can be computed on the first day of the month.
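A minimal sketch of that split, with column names guessed from the question rather than taken from it:

    -- Base table: one row per account
    CREATE TABLE accounts (
      account_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
      name       VARCHAR(100) NOT NULL
    ) ENGINE=InnoDB;

    -- Monthly aggregates: one narrow row per account per month
    CREATE TABLE monthly_energy_usage (
      account_id  INT UNSIGNED NOT NULL,
      usage_month DATE NOT NULL,           -- first day of the month
      usage_kwh   DECIMAL(12,3) NOT NULL,
      demand_kw   DECIMAL(12,3) NOT NULL,
      demand_pct  DECIMAL(5,2)  NOT NULL,
      PRIMARY KEY (account_id, usage_month),
      CONSTRAINT fk_usage_account FOREIGN KEY (account_id) REFERENCES accounts (account_id)
    ) ENGINE=InnoDB;

Instead of 38 columns in one wide row, each month becomes its own narrow row, so the schema stays the same as new months of data arrive.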
When you read a record, do you often use all the data? Or do you have different sections or views (loaded separately) to show energy usage data, monthly statistics and so on?
How many records do you plan to have in this table? If they grow dramatically and continually, is it possible to create tables with a suffix for grouping them by period (month, half year, year, ...)?
I have a copy of the PAF (UK postcode) database. It is currently stored in a MySQL database, and I use it on my site to pre-fill address details. However, the database is huge (28,000,000+ records) and it is very slow to search.
Any ideas how I could split the DB to improve performance?
Thanks for the help guys!
That is not a large database, not even a large table. Set appropriate indexes on the table and you will get good performance.
There are several ideas you could try:
Create indexes, meaningful ones of course
Review your schema. Avoid datatypes larger than you need, like BIGINT or TEXT, unless absolutely required
Optimize your queries so they use indexes; the EXPLAIN statement can help (see the sketch after this list)
Split your table into multiple smaller tables, for example based on zones: north, east, west, south, etc.
If your table doesn't require many INSERTs or UPDATEs, which I assume it might not, being a postcode table, the query cache can be a big help for faster queries
You will need to play around and see what option works best for you. But I think the first two should be enough.
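For the first two points, here is a hedged sketch of what an indexed postcode lookup could look like; the table and column names are assumptions about the PAF import, not its actual schema.

    -- Assumed PAF table layout; real column names will differ
    ALTER TABLE paf_addresses
      MODIFY postcode CHAR(8) NOT NULL,
      ADD INDEX idx_postcode (postcode);

    -- Verify the lookup hits the index instead of scanning 28M rows
    EXPLAIN
    SELECT address_line_1, address_line_2, post_town, postcode
    FROM paf_addresses
    WHERE postcode = 'SW1A 1AA';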
Hope it helps!