I've done some searching around but I haven't found a clear answer and explanation to my question.
I have 5 tables called table1, table2, table3, table4 and table5 and I want to do COUNT(*) on each of the tables to get the number of rows.
Should I try to combine these into one query or use 5 separate queries? I have always been taught that the fewer queries the better, so I am guessing I should try to combine them into one query.
One way of doing it is to use UNION but does anyone know what the most efficient way of doing this is and why?
Thanks for any help.
Assuming you just want a count(*) from each one, then
SELECT
( SELECT count(*) from table1 ) AS table1,
( SELECT count(*) from table2 ) AS table2,
( SELECT count(*) from table3 ) AS table3,
( SELECT count(*) from table4 ) AS table4,
( SELECT count(*) from table5 ) AS table5
would give you those counts as a single row. The DB server would still be running n+1 queries (n tables, 1 parent query) to get those counts, but it'd be the same story if you were using a UNION anyway. The union would produce multiple rows with 1 value in each, vs. the 1 row with multiple values of the subselect method.
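To make the difference in result shape concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the real server (an assumption; the question does not say which client is used), and the row counts are made up for the demo.

```python
import sqlite3

# In-memory SQLite stands in for the real database server (assumption for this demo).
conn = sqlite3.connect(":memory:")
for name, rows in (("table1", 3), ("table2", 5), ("table3", 2)):
    conn.execute(f"CREATE TABLE {name} (id INTEGER PRIMARY KEY)")
    conn.executemany(f"INSERT INTO {name} (id) VALUES (?)", [(i,) for i in range(rows)])

# Subselect form: a single row with one column per table.
single_row = conn.execute(
    """
    SELECT
      (SELECT COUNT(*) FROM table1) AS table1,
      (SELECT COUNT(*) FROM table2) AS table2,
      (SELECT COUNT(*) FROM table3) AS table3
    """
).fetchone()

# UNION ALL form: one row per table, labelled so the rows can be told apart.
per_table_rows = conn.execute(
    """
    SELECT 'table1' AS tablename, COUNT(*) AS n FROM table1
    UNION ALL
    SELECT 'table2', COUNT(*) FROM table2
    UNION ALL
    SELECT 'table3', COUNT(*) FROM table3
    """
).fetchall()

print(single_row)
print(sorted(per_table_rows))
```

Either form is one round trip from the application; which one is more convenient mostly depends on whether the calling code wants columns or rows.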
Provided you have read access to it (so probably not on shared hosting, where you don't have your own database instance), you could read that info from the INFORMATION_SCHEMA.TABLES table, which has a TABLE_ROWS column.
(Be aware of what the documentation says about InnoDB tables there: the value is only an estimate. So if you don't use MyISAM and need precise counts, the other answer has the better solution.)
I have about 20 tables. These tables have only an id (primary key) and a description (varchar). There is a fair amount of data, up to about 400 rows in one table.
Right now I have to get the data of at least 15 tables at a time.
I am currently querying them one by one, which means that in one session I make 15 calls. This is making my process slow.
Can anyone suggest a better way to get the results from the database?
I am using a MySQL database with Java Spring on the server side. Will making a view that combines all of them help me?
The application is becoming slow because of this issue and I need a solution that will make my process faster.
It sounds like your schema isn't so great. 20 tables of id/varchar sounds like a broken EAV, which is generally considered broken to begin with. Just the same, I think a UNION query will help out. This would be the "View" to create in the database so you can just SELECT * FROM thisviewyoumade and let it worry about hitting all the tables.
A UNION query works by having multiple SELECT statements "stacked" on top of one another. It's important that each SELECT statement has the same number, order, and types of fields so that when it stacks the results, everything matches up.
In your case, it makes sense to manufacture an extra field so you know which table each row came from. Something like the following:
SELECT 'table1' as tablename, id, col2 FROM table1
UNION ALL
SELECT 'table2', id, col2 FROM table2
UNION ALL
SELECT 'table3', id, col2 FROM table3
... and on and on
The names or aliases of the fields in the first SELECT statement are the field names used in the result set that is returned, so there is no need for a bunch of AS blahblahblah in subsequent SELECT statements.
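A small sketch of the view idea, run against in-memory SQLite rather than MySQL (an assumption made so the example is self-contained). The view name all_lookups is hypothetical, and the second column is called description here to match the question, where the query above uses col2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name in ("table1", "table2", "table3"):
    conn.execute(f"CREATE TABLE {name} (id INTEGER PRIMARY KEY, description TEXT)")
conn.execute("INSERT INTO table1 VALUES (1, 'red'), (2, 'blue')")
conn.execute("INSERT INTO table2 VALUES (1, 'small')")
conn.execute("INSERT INTO table3 VALUES (1, 'round')")

# The view stacks the tables and tags every row with the table it came from.
# Only the first SELECT needs the alias for the manufactured column.
conn.execute(
    """
    CREATE VIEW all_lookups AS
    SELECT 'table1' AS tablename, id, description FROM table1
    UNION ALL
    SELECT 'table2', id, description FROM table2
    UNION ALL
    SELECT 'table3', id, description FROM table3
    """
)

# One call from the application now returns the rows of every table.
cursor = conn.execute("SELECT * FROM all_lookups")
column_names = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(column_names)
print(sorted(rows))
```

The application can then group the rows by tablename in memory instead of issuing one call per table.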
The real question is whether this union query will perform faster than 15 individual calls on such a tiny tiny tiny amount of data. I think the better option would be to change your schema so this stuff is already stored in one table just like this UNION query outputs. Then you would need a single select statement against a single table. And 400x20=8000 is still a dinky little table to query.
To get a row of all descriptions into app code in a single round trip, send a query along these lines:
select t1.description, ... t15.description
from t -- this should contain all needed ids
join table1 t1 on t1.id = t.t1id
...
join table15 t15 on t15.id = t.t15id
I can't give you exactly what you need, but here is a way to merge all those table values into a single table:
CREATE TABLE table_name AS (
SELECT *
FROM table1 t1
LEFT JOIN table2 t2 ON t1.ID=t2.ID
...
LEFT JOIN tableN tN ON tN-1.ID=tN.ID
)
I have 3 Access tables with information from the past 3 years. There are tons of the same records in each, but there are also unique records in each.
2 tables have the same unique primary key (ID), while the 3rd table has a different set of unique IDs.
How do I combine and select all the unique IDs into one master table? Thanks
Not 100% sure I understand where the overlaps occur and not, but try this:
select ID
into All_Id
from (
select ID from Table1
union
select ID from Table2
union all
select ID from Table3
)
This presupposes that Table1 and Table2 might share some IDs, and you only want them listed once, but Table3 doesn't have any overlaps.
Truth be told, there is no harm in making them all UNION, other than maybe having the query run slower.
If you want unique IDs, use a UNION query. If you want everything, use a UNION ALL.
UNION = no dupes
UNION ALL = returns all records including dupes
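A quick way to see the difference, sketched with Python's sqlite3 module and an in-memory database instead of Access (an assumption for the sake of a runnable demo; the sample IDs are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (ID INTEGER)")
conn.execute("CREATE TABLE Table2 (ID INTEGER)")
conn.executemany("INSERT INTO Table1 VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO Table2 VALUES (?)", [(2,), (3,), (4,)])

# UNION: IDs 2 and 3 exist in both tables but come back only once.
union_ids = sorted(
    row[0] for row in conn.execute("SELECT ID FROM Table1 UNION SELECT ID FROM Table2")
)

# UNION ALL: every row from both tables is returned, duplicates included.
union_all_ids = sorted(
    row[0] for row in conn.execute("SELECT ID FROM Table1 UNION ALL SELECT ID FROM Table2")
)

print(union_ids)
print(union_all_ids)
```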
The Access engine supports union queries but you have to manually write the union query in the SQL view. Design view is not available.
Depending on how much data you have from the past three years, the UNION might take some time and may even fail a few times. I'd make a backup copy first just in case.
If you want purely unique IDs and a new table, here's what I would do:
1.) Write your union query.
SELECT ID FROM Table1
UNION
SELECT ID FROM Table2
...
2.) Save the query.
3.) Create a make table query (to select and combine all unique IDs into a presumably new master table).
4.) Run the make table query. The new table will be created.
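The four steps can be sketched outside Access as well. The example below uses in-memory SQLite, where a view plays the role of the saved query and CREATE TABLE ... AS SELECT plays the role of the make-table query; both substitutions are assumptions for the demo, and the name all_ids_query is hypothetical. All_Id is the target table name used in the other answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name, ids in (("Table1", [1, 2, 3]), ("Table2", [2, 3, 4]), ("Table3", [10, 11])):
    conn.execute(f"CREATE TABLE {name} (ID INTEGER)")
    conn.executemany(f"INSERT INTO {name} (ID) VALUES (?)", [(i,) for i in ids])

# Steps 1-2: write the union query and save it (here: as a view).
conn.execute(
    """
    CREATE VIEW all_ids_query AS
    SELECT ID FROM Table1
    UNION
    SELECT ID FROM Table2
    UNION
    SELECT ID FROM Table3
    """
)

# Steps 3-4: materialise the saved query into a new master table.
conn.execute("CREATE TABLE All_Id AS SELECT ID FROM all_ids_query")

master_ids = [row[0] for row in conn.execute("SELECT ID FROM All_Id ORDER BY ID")]
print(master_ids)
```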
Hope that helps. Let us know how you make out!
I'm working on a project involving joins between datasets, and we have a requirement to allow previews of arbitrary joins between arbitrary datasets. Which is crazy, but that's why it's fun. This is user facing, so given a join I want to show ~10 rows of results quickly.
I've been experimenting with different ways to sub-sample the tables so that I get at least a few result rows, while keeping the samples small enough that the join is fast and the sampling itself is not expensive.
Here are the methods I've found that pass the smell test. I would like to know a few things about them:
What types of joins or datasets would these fail at?
How could I identify those datasets?
If both of these are bad at the same thing, how could they be improved?
Is there a type of sampling I have not put here that is better?
Subselect with a limit.
Takes a sample of one dataset (whatever rows the LIMIT happens to return) to reduce the overall size.
SELECT col1, col2 FROM table1 JOIN
(SELECT col1, col2 FROM table2 LIMIT #) AS sample2
on table1.col1 = sample2.col1
LIMIT 10;
I like this because it's easy and there is potential in the future to be smart about which table to sample from. However, it is possible to select a portion where table1.col1 never equals sample2.col1, so no results are returned.
Find equal values of col1 and sample them
A more complicated, multi-query approach. Here I would do a distinct select of the columns to join on, compare the results to find common values, and then do a subselect limiting the results to the common values.
SELECT DISTINCT col1 FROM table1;
SELECT DISTINCT col1 FROM table2;
commonVals = intersection of above results
SELECT col1, col2 FROM table1 JOIN
(SELECT col1, col2 FROM table2 WHERE col1 IN(commonVals) LIMIT #) as sample2
on table1.col1 = sample2.col1
LIMIT 10;
This gets us a good sample of table2, but the select distinct query may be more expensive than the join. I believe there may be a way to determine whether this method is faster if you knew something about how long the distinct calls would take, but at this point we don't have that much knowledge of the datasets.
Slap a LIMIT on the join
This is the easiest and the one I'm leaning towards.
SELECT col1, col2 FROM table1 JOIN table2 ON table1.col1 = table2.col1 LIMIT #
Assuming the join is good, this will always return data and for at least a large set of cases it will do it fast.
The problem with the first approach is that the rows in the first table might not have a match in the second table. Remember, inner joins not only do matching, they also do filtering.
The second approach could work, if all the columns used for joining have indexes on them. You can then get a list of matching ids by doing something like:
where id in (select id from table1) and id in (select id from table2) . . .
This gets rid of the separate DISTINCT queries and the intersection step in application code, and should be pretty fast.
The third method is using the capabilities of the database most directly. You would be depending on the ability of MySQL to optimize according to the size of the result set. This is something that it does, at least in theory.
I would strongly recommend the third approach in conjunction with indexes on the columns used in the joins. This requires minimal changes to the query (just add a limit clause). It allows the database to pursue additional optimizations, if appropriate. It works on a more general set of queries.
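The failure mode of the first approach, and why the third avoids it, can be reproduced on toy data. This is only a sketch: it runs on in-memory SQLite rather than MySQL, the data is invented, and an ORDER BY is added inside the sampling subquery purely to make the demo deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 TEXT)")
conn.execute("CREATE TABLE table2 (col1 INTEGER, col2 TEXT)")
# Indexes on the join columns, as recommended above.
conn.execute("CREATE INDEX idx_t1_col1 ON table1 (col1)")
conn.execute("CREATE INDEX idx_t2_col1 ON table2 (col1)")

# table1 holds keys 100..119; table2 holds 1..99 (no match) plus 100..119 (match).
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(k, f"a{k}") for k in range(100, 120)])
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [(k, f"b{k}") for k in range(1, 120)])

# Approach 1: sample table2 first, then join. The sample contains keys 1..5,
# none of which exist in table1, so the inner join filters everything out.
sampled_join = conn.execute(
    """
    SELECT table1.col1, sample2.col2
    FROM table1
    JOIN (SELECT col1, col2 FROM table2 ORDER BY col1 LIMIT 5) AS sample2
      ON table1.col1 = sample2.col1
    LIMIT 10
    """
).fetchall()

# Approach 3: join the full tables and let the LIMIT stop the work early.
limited_join = conn.execute(
    """
    SELECT table1.col1, table2.col2
    FROM table1
    JOIN table2 ON table1.col1 = table2.col1
    LIMIT 10
    """
).fetchall()

print(len(sampled_join), len(limited_join))
```

How early the engine actually stops depends on the query plan it picks, so it is worth checking the plan on representative data before relying on this.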
Forgive me if this seems like common sense as I am still learning how to split my data between multiple tables.
Basically, I have two:
general with the fields userID,owner,server,name
count with the fields userID,posts,topics
I wish to fetch the data from them and cannot decide how I should do it: in a UNION:
SELECT `userID`, `owner`, `server`, `name`
FROM `english`.`general`
WHERE `userID` = 54 LIMIT 1
UNION
SELECT `posts`, `topics`
FROM `english`.`count`
WHERE `userID` = 54 LIMIT 1
Or a JOIN:
SELECT `general`.`userID`, `general`.`owner`, `general`.`server`,
`general`.`name`, `count`.`posts`, `count`.`topics`
FROM `english`.`general`
JOIN `english`.`count` ON
`general`.`userID`=`count`.`userID` AND `general`.`userID`=54
LIMIT 1
Which do you think would be the more efficient way and why? Or perhaps both are too messy to begin with?
It's not about efficiency, but about how they work.
UNION just unions 2 different independent queries. So you get 2 result sets one after another.
JOIN appends each row from one result set to each matching row from another result set, so the combined result set has "long" rows (in terms of the number of columns).
Just for completeness as I don't think it's mentioned elsewhere: often UNION ALL is what's intended when people use UNION.
UNION will remove duplicates (so it is relatively expensive, because it typically requires a sort). This removes duplicates in the final result (so it doesn't matter whether a duplicate comes from within a single query or from the same data in separate SELECTs). UNION is a set operation.
UNION ALL just sticks the results together: no sorting, no duplicate removal. This is going to be quicker (or at least no worse) than UNION.
If you know the individual queries won't return duplicate results use UNION ALL. (In fact often best to assume UNION ALL and think about UNION if you need that behaviour; using SELECT DISTINCT with UNION is redundant).
You want to use a JOIN. Joining is used to create a single set that is a combination of related data. Your union example doesn't make sense (and probably won't run). UNION is for linking two result sets with identical columns to create a set that has the combined rows (it does not 'union' the columns).
If you want to fetch users along with their post and topic counts, you need to write a query using JOIN like this:
SELECT general.*,count.posts,count.topics FROM general LEFT JOIN count ON general.userID=count.userID
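A minimal sketch of both behaviours, using in-memory SQLite in place of MySQL (an assumption; the sample user row is invented). The table name count is double-quoted here only to keep it unambiguous next to the COUNT function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE general (userID INTEGER PRIMARY KEY, owner TEXT, server TEXT, name TEXT)")
conn.execute('CREATE TABLE "count" (userID INTEGER PRIMARY KEY, posts INTEGER, topics INTEGER)')
conn.execute("INSERT INTO general VALUES (54, 'alice', 'srv1', 'forum')")
conn.execute('INSERT INTO "count" VALUES (54, 10, 3)')

# JOIN: one wide row that carries the columns of both tables.
joined_row = conn.execute(
    '''
    SELECT general.userID, general.owner, general.server, general.name,
           "count".posts, "count".topics
    FROM general
    JOIN "count" ON general.userID = "count".userID
    WHERE general.userID = ?
    ''',
    (54,),
).fetchone()

# UNION: the two SELECTs have 4 and 2 columns, so the statement is rejected.
try:
    conn.execute(
        "SELECT userID, owner, server, name FROM general WHERE userID = 54 "
        'UNION SELECT posts, topics FROM "count" WHERE userID = 54'
    )
    union_failed = False
except sqlite3.OperationalError:
    union_failed = True

print(joined_row, union_failed)
```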
How do I select all the data from many tables?
I tried
"SELECT * FROM `table1`, `table2`",
but the result makes no sense to me. It returns only some rows from table1, and all the data from table2 3 times. I've read a similar question here, but I don't understand the answer. So could you help me? Thanks in advance.
update:
when i try
(SELECT * FROM `table1`) UNION (SELECT * FROM `table2`)
it returns
#1222 - The used SELECT statements have a different number of columns
By doing that select with the "," between 2 tables and no WHERE clause, you are doing an implicit cross join of the 2 tables (all combinations of rows between the 2 tables). This is likely Not What You Want. See UNION, as mentioned by other answers.
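The effect is easy to check on small tables. The sketch below uses in-memory SQLite as a stand-in for MySQL, with made-up columns and 3 and 4 rows respectively (all assumptions for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, a TEXT)")
conn.execute("CREATE TABLE table2 (id INTEGER, b TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(1, "x"), (2, "y"), (3, "z")])
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [(1, "p"), (2, "q"), (3, "r"), (4, "s")])

# The comma with no WHERE clause pairs every table1 row with every table2 row.
cross_rows = conn.execute("SELECT * FROM table1, table2").fetchall()

# Each table1 row therefore shows up once per table2 row.
times_first_row_appears = sum(1 for row in cross_rows if row[0] == 1)

print(len(cross_rows), times_first_row_appears)
```

With 3 and 4 rows the result has 3 x 4 = 12 rows, which is why the output looks like one table repeated several times.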
Use the UNION SELECT construct
How do you want the data displayed? Are both tables of the same schema? If so you could use the UNION operator.
http://www.w3schools.com/sql/sql_union.asp
If you are just trying to show the data from many tables and there is no relationship between the data, you have to use program logic instead of database logic.
show tables (SQL command)
foreach result (programming language of your choice)
select * from tablename (SQL command)
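The loop above could look roughly like this in Python. The sketch runs on in-memory SQLite, so the table list comes from sqlite_master rather than MySQL's SHOW TABLES (a substitution made for the demo), and the two sample tables are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, description TEXT)")
conn.execute("CREATE TABLE table2 (code TEXT)")
conn.execute("INSERT INTO table1 VALUES (1, 'first')")
conn.executemany("INSERT INTO table2 VALUES (?)", [("x",), ("y",)])

# SQLite's counterpart to "show tables": list the tables from sqlite_master.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# One SELECT per table; results stay separate because the schemas differ.
# Identifiers cannot be bound as parameters, so the name is quoted into the SQL text.
data_by_table = {}
for name in table_names:
    data_by_table[name] = conn.execute(f'SELECT * FROM "{name}"').fetchall()

print(data_by_table)
```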