I need to find the latest post for each author and then group the results so I only get a single latest post for each author.
SELECT wp_posts.* FROM wp_posts
WHERE wp_posts.post_status='publish'
AND wp_posts.post_type='post'
GROUP BY wp_posts.post_author
ORDER BY wp_posts.post_date DESC
This is correctly grouping the output so I only get one post per author, but it is ordering the results after they have been grouped and not before they have been selected.
select wp_posts.* from wp_posts
where wp_posts.post_status='publish' and wp_posts.post_type='post'
group by wp_posts.post_author
having wp_posts.post_date = MAX(wp_posts.post_date) /* ONLY THE LAST POST FOR EACH AUTHOR */
order by wp_posts.post_date desc
EDIT:
After some comments I have decided to add some additional information.
The company I am working at also uses Postgres and especially SQL Server. These databases don't allow such queries. So I know that there is another way to do this (I write a solution below). You should also know what you are doing if you don't group by all columns in the projection or use aggregate functions. Otherwise leave it be!
I chose the solution above because it's a specific question. Tom wants to get the most recent post for each author on a WordPress site. In my mind it is negligible for the analysis if an author writes more than one post per second. WordPress should even forbid it through its duplicate-post spam detection. I know from personal experience that there is a really significant performance benefit in doing such a dirty GROUP BY with MySQL. But if you know what you are doing, then you can do it! I have such dirty GROUP BYs in apps that I'm professionally accountable for. There I have tables with several million rows where the query needs 5-15 s instead of 100+ seconds.
This may be useful for some pros and cons: http://ftp.nchu.edu.tw/MySQL/tech-resources/articles/debunking-group-by-myths.html
SELECT
wp_posts.*
FROM
wp_posts
JOIN
(
SELECT
g.post_author,
MAX(g.post_date) AS post_date
FROM wp_posts as g
WHERE
g.post_status='publish'
AND g.post_type='post'
GROUP BY g.post_author
) as t
ON wp_posts.post_author = t.post_author AND wp_posts.post_date = t.post_date
ORDER BY wp_posts.post_date
But if there is more than one post per second for an author, you will get more than one row and not only the last one.
Now you can spin the wheel again and get the post with the highest ID. Even then it is not guaranteed that you really get the last one.
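The tie problem and the highest-ID tie-break can be sketched as follows. This is a minimal illustration, not the answer's own code: it uses Python's sqlite3 module as a stand-in for MySQL, and the trimmed-down wp_posts table and its four rows are made up for the example.

```python
import sqlite3

# Hypothetical, trimmed-down stand-in for wp_posts (only the columns used here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wp_posts (
    ID INTEGER PRIMARY KEY,
    post_author INTEGER,
    post_date TEXT,
    post_status TEXT,
    post_type TEXT
);
INSERT INTO wp_posts VALUES
    (1, 10, '2012-01-01 09:00:00', 'publish', 'post'),
    (2, 10, '2012-01-02 09:00:00', 'publish', 'post'),
    (3, 10, '2012-01-02 09:00:00', 'publish', 'post'),  -- same second as ID 2
    (4, 20, '2012-01-03 09:00:00', 'publish', 'post');
""")

# Shared FROM/JOIN/WHERE: each author's rows at that author's latest post_date.
LATEST_ROWS = """
    FROM wp_posts AS p
    JOIN (
        SELECT g.post_author, MAX(g.post_date) AS post_date
        FROM wp_posts AS g
        WHERE g.post_status = 'publish' AND g.post_type = 'post'
        GROUP BY g.post_author
    ) AS t ON p.post_author = t.post_author AND p.post_date = t.post_date
    WHERE p.post_status = 'publish' AND p.post_type = 'post'
"""

# Joining on the date alone returns author 10 twice because of the tie.
by_date = conn.execute(
    "SELECT p.ID, p.post_author" + LATEST_ROWS + "ORDER BY p.ID"
).fetchall()

# Tie-break: among the rows at the latest date, keep the highest ID per author.
by_date_then_id = conn.execute(
    "SELECT MAX(p.ID), p.post_author" + LATEST_ROWS
    + "GROUP BY p.post_author ORDER BY p.post_author"
).fetchall()

print(by_date)          # author 10 appears twice
print(by_date_then_id)  # one row per author
```

The second query only returns the ID; joining that ID back to wp_posts would give the full row. As noted above, the highest ID is only a proxy for "last", not a guarantee.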
Not sure if I understand your requirement correctly, but the following inner statement gets the latest post_date for each author and joins these back to the wp_posts table to get the complete record.
SELECT *
FROM wp_posts wp
INNER JOIN (
SELECT post_author
, MAX(post_date) AS post_date
FROM wp_posts
WHERE post_status = 'publish'
AND post_type = 'post'
GROUP BY
post_author
) wpmax ON wpmax.post_author = wp.post_author
AND wpmax.post_date = wp.post_date
ORDER BY
wp.post_date DESC
I think that #edze's response is wrong.
In the MySQL manual you can read:
MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause. You can use
this feature to get better performance by avoiding unnecessary column
sorting and grouping. However, this is useful primarily when all
values in each nonaggregated column not named in the GROUP BY are the
same for each group. The server is free to choose any value from each
group, so unless they are the same, the values chosen are
indeterminate. Furthermore, the selection of values from each group
cannot be influenced by adding an ORDER BY clause. Sorting of the
result set occurs after values have been chosen, and ORDER BY does
not affect which values the server chooses.
Two great references:
http://kristiannielsen.livejournal.com/6745.html
http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/
Sorry, but I cannot comment on #edze's response because of my reputation, so I have written a new answer.
Do a GROUP BY after the ORDER BY by wrapping your query with the GROUP BY like this:
SELECT t.* FROM (SELECT * FROM table ORDER BY time DESC) t GROUP BY t.author
It doesn't matter whether you order before or after the GROUP BY, because ordering only changes the sequence of rows (213 becomes 123 or 321) and nothing more. GROUP BY takes just some entry per group, not necessarily the latest. I suggest working with subselects here, like
SELECT wp_posts.* FROM wp_posts
WHERE wp_posts.post_status='publish'
AND wp_posts.post_type='post'
AND wp_posts.post_date = (Select max(post_date) from wp_posts where author = ... )
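One way the subselect idea could look when written out in full is sketched below. It is an assumption-laden illustration: the correlation on post_author and the repeated status/type filter inside the subselect are my guess at a complete version, the table and rows are made up, and Python's sqlite3 stands in for MySQL.

```python
import sqlite3

# Hypothetical, trimmed-down stand-in for wp_posts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wp_posts (
    ID INTEGER PRIMARY KEY,
    post_author INTEGER,
    post_date TEXT,
    post_status TEXT,
    post_type TEXT
);
INSERT INTO wp_posts VALUES
    (1, 10, '2012-01-01 09:00:00', 'publish', 'post'),
    (2, 10, '2012-01-05 09:00:00', 'publish', 'post'),
    (3, 10, '2012-01-09 09:00:00', 'draft',   'post'),  -- newer, but not published
    (4, 20, '2012-01-03 09:00:00', 'publish', 'post');
""")

# Correlated subselect: for each outer row, look up that author's latest
# published post date and keep the row only if it matches.
latest = conn.execute("""
    SELECT p.ID, p.post_author
    FROM wp_posts AS p
    WHERE p.post_status = 'publish'
      AND p.post_type = 'post'
      AND p.post_date = (
          SELECT MAX(i.post_date)
          FROM wp_posts AS i
          WHERE i.post_author = p.post_author
            AND i.post_status = 'publish'
            AND i.post_type = 'post'
      )
    ORDER BY p.post_author
""").fetchall()

print(latest)
```

The filter is repeated inside the subselect on purpose: without it, the draft with ID 3 would supply the maximum date for author 10, and that author would drop out of the result entirely.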
What do you think about this?? Seems to work for me
SELECT wp_posts.post_author, MAX(wp_posts.post_date), wp_posts.post_status, wp_posts.post_type
FROM wp_posts
WHERE wp_posts.post_status='publish'
AND wp_posts.post_type='post'
GROUP BY wp_posts.post_author
It brings me all the Authors with the most updated post_date ... Do you identify a problem there?? I don't
SELECT wp_posts.*,max(wp_posts.post_date) FROM wp_posts
WHERE wp_posts.post_status='publish'
AND wp_posts.post_type='post'
GROUP BY wp_posts.post_author
When the table becomes large, performance needs to be checked as well.
I checked all the options given here, with a PM system with 136K messages and a link table with 83K rows.
When you need only a count, or only IDs, Alex's solution is the best.
SELECT wp_posts.post_author, MAX(wp_posts.post_date), wp_posts.post_status, wp_posts.post_type
FROM wp_posts
WHERE wp_posts.post_status='publish'
AND wp_posts.post_type='post'
GROUP BY wp_posts.post_author
When you need other fields, I had to modify Husky110's solution (adapted to my table design; here it is only an example and not checked), which on my tables is 10x faster than the subquery option:
SELECT wp_posts.* FROM wp_posts,
(Select post_id as pid, max(post_date) maxdate from wp_posts where author = ... group by author order by maxdate desc limit 4) t
WHERE wp_posts.post_status='publish'
AND wp_posts.post_type='post'
AND wp_posts.post_id = pid
This change can select more than one post (one per user, for example), and can be adapted to other solutions.
Moshe.
Use the below code...
<?php
//get all users, iterate through users, query for one post for the user,
//if there is a post then display the post title, author, content info
$blogusers = get_users_of_blog();
if ($blogusers) {
foreach ($blogusers as $bloguser) {
$args = array(
'author' => $bloguser->user_id,
'showposts' => 1,
'caller_get_posts' => 1
);
$my_query = new WP_Query($args);
if( $my_query->have_posts() ) {
// $user = get_userdata($bloguser->user_id);
// echo 'This is one post for author with User ID: ' . $user->ID . ' ' . $user->user_firstname . ' ' . $user->user_lastname;
while ($my_query->have_posts()) : $my_query->the_post(); ?>
<?php the_title(); ?>
<small><?php the_time('F jS, Y') ?> by <?php the_author_posts_link() ?> </small><?php
the_content();
endwhile;
}
}
}
?>
Here is a simple answer from
http://www.cafewebmaster.com/mysql-order-sort-group
SELECT * FROM
(
select * from `my_table` order by timestamp desc
) as my_table_tmp
GROUP BY catid
ORDER BY nid desc
It worked wonders for me.
Related
I have the following query, which currently takes about 0.3s to load, causing a heavy load on my Wordpress site.
SELECT SQL_CALC_FOUND_ROWS wp11_posts.ID
FROM wp11_posts
WHERE 1=1
AND ( wp11_posts.ID NOT IN (
SELECT object_id
FROM wp11_term_relationships
WHERE term_taxonomy_id IN (137,141) )
AND (
SELECT COUNT(1)
FROM wp11_term_relationships
WHERE term_taxonomy_id IN (53)
AND object_id = wp11_posts.ID ) = 1 )
AND wp11_posts.post_type = 'post'
AND ((wp11_posts.post_status = 'publish'))
GROUP BY wp11_posts.ID
ORDER BY wp11_posts.post_date DESC
LIMIT 0, 5
Where should I start to make it execute faster? Is there an apparent mistake standing out that should definitely have been done differently?
You have a so-called dependent subquery (a/k/a correlated subquery) in your example. It's a performance killer.
WHERE (
SELECT COUNT(1)
FROM wp_term_relationships
WHERE term_taxonomy_id IN (53)
AND object_id = wp_posts.ID
) = 1
Refactoring it to an independent subquery looks like this:
SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
FROM wp_posts
JOIN (
SELECT object_id
FROM wp_term_relationships
WHERE term_taxonomy_id IN (53)
GROUP BY object_id
HAVING COUNT(*) = 1
) justone ON wp_posts.ID = justone.object_id
... WHERE ...
See how this works? It needs to scan term_relationships just one time looking for object_ids meeting your criterion (just one). Then the ordinary inner JOIN excludes posts rows that don't meet that criterion. (The dependent subquery loops to scan the table multiple times, while we wait.)
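The refactoring can be checked for equivalence on a small data set. The sketch below is only an illustration under assumptions: the two tables are reduced to the columns the queries touch, the rows are invented, and Python's sqlite3 stands in for MySQL, so it says nothing about speed, only that both forms select the same posts.

```python
import sqlite3

# Hypothetical miniature versions of the two WordPress tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_type TEXT, post_status TEXT);
CREATE TABLE wp_term_relationships (object_id INTEGER, term_taxonomy_id INTEGER);
INSERT INTO wp_posts VALUES
    (1, 'post', 'publish'), (2, 'post', 'publish'), (3, 'post', 'publish');
INSERT INTO wp_term_relationships VALUES (1, 53), (2, 99), (3, 53), (3, 137);
""")

# Dependent (correlated) form: the inner SELECT refers to the outer wp_posts.ID.
dependent = """
    SELECT wp_posts.ID
    FROM wp_posts
    WHERE (SELECT COUNT(1)
           FROM wp_term_relationships
           WHERE term_taxonomy_id IN (53)
             AND object_id = wp_posts.ID) = 1
    ORDER BY wp_posts.ID
"""

# Independent form: the derived table is computed once, then joined.
independent = """
    SELECT wp_posts.ID
    FROM wp_posts
    JOIN (SELECT object_id
          FROM wp_term_relationships
          WHERE term_taxonomy_id IN (53)
          GROUP BY object_id
          HAVING COUNT(*) = 1) AS justone
      ON wp_posts.ID = justone.object_id
    ORDER BY wp_posts.ID
"""

dependent_ids = [row[0] for row in conn.execute(dependent)]
independent_ids = [row[0] for row in conn.execute(independent)]
print(dependent_ids, independent_ids)
```

Whether the independent form is actually faster on a given server is best confirmed with EXPLAIN on the real tables.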
The SQL_CALC_FOUND_ROWS thing: WordPress puts it there to help with "pagination" -- it lets WordPress figure out how many pages (in your case of five items) there are to display. It provides data to the familiar
987 Items << < 2 of [ 20 ] > >>
page-selection interface you see in many parts of WordPress: it counts all the items matched by your query (987 in this example), not just one pageload of them.
If you don't need that pagination you can turn it off by giving a 'no_found_rows' => true element to WP_Query(). But if your query only yields a small number of items without the LIMIT clause, this probably doesn't matter much. If you wrote the query yourself, just leave it out along with the ORDER BY and LIMIT clauses.
So, leaving in the pagination stuff, a better query is
SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
FROM wp_posts
JOIN (
SELECT object_id
FROM wp_term_relationships
WHERE term_taxonomy_id IN (53)
GROUP BY object_id
HAVING COUNT(*) = 1
) justone ON wp_posts.ID = justone.object_id
WHERE 1 = 1
AND (
wp_posts.ID NOT IN (
SELECT object_id
FROM wp_term_relationships
WHERE term_taxonomy_id IN (137,141)
)
AND wp_posts.post_type = 'post'
AND wp_posts.post_status = 'publish')
GROUP BY wp_posts.ID
ORDER BY wp_posts.post_date DESC LIMIT 0, 5
You also have an unnecessary GROUP BY near the end of your query. It doesn't hurt performance: MySQL can tell it's not needed in this case and doesn't do anything with it. But it is extra stuff. If you wrote the query yourself leave it out.
SQL_CALC_FOUND_ROWS requires doing nearly as much work as the same query without the LIMIT. [However, removing it without doing most of the following things probably won't help much.]
Do you already have this plugin installed? https://wordpress.org/plugins/index-wp-mysql-for-speed/ If not, that may be a good starting point.
WP is not designed to handle millions of posts/attributes/terms; you may have to move beyond WP.
Using JOIN or LEFT JOIN or [NOT] EXISTS ( SELECT 1 ... ) may be more efficient than IN ( SELECT ... ), especially in older versions of MySQL.
Is your SELECT COUNT(1) attempting to demand exactly 1? That is, 2 would be disallowed? If you really wanted to know if any exist, then use
AND EXISTS( SELECT 1 FROM wp11_term_relationships
WHERE term_taxonomy_id IN (53)
AND object_id = wp11_posts.ID )
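The difference between "exactly one" and "at least one" only shows up when a post can match more than once. The sketch below makes that visible by testing against two term IDs, 53 and 54; that second ID is an assumption for illustration (the question uses only 53), the rows are invented, and Python's sqlite3 stands in for MySQL.

```python
import sqlite3

# Hypothetical miniature versions of the two tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wp11_posts (ID INTEGER PRIMARY KEY);
CREATE TABLE wp11_term_relationships (object_id INTEGER, term_taxonomy_id INTEGER);
INSERT INTO wp11_posts VALUES (1), (2), (3);
INSERT INTO wp11_term_relationships VALUES (1, 53), (1, 54), (2, 53), (3, 99);
""")

# COUNT(1) = 1 demands exactly one matching relationship row.
exactly_one = [row[0] for row in conn.execute("""
    SELECT p.ID
    FROM wp11_posts AS p
    WHERE (SELECT COUNT(1)
           FROM wp11_term_relationships
           WHERE term_taxonomy_id IN (53, 54)
             AND object_id = p.ID) = 1
    ORDER BY p.ID
""")]

# EXISTS is satisfied by one or more matching rows.
at_least_one = [row[0] for row in conn.execute("""
    SELECT p.ID
    FROM wp11_posts AS p
    WHERE EXISTS (SELECT 1
                  FROM wp11_term_relationships
                  WHERE term_taxonomy_id IN (53, 54)
                    AND object_id = p.ID)
    ORDER BY p.ID
""")]

print(exactly_one)   # post 1 is excluded: it matches twice
print(at_least_one)  # post 1 is included
```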
A better index for wp11_posts [I don't know whether your WP or the Plugin has this already]:
INDEX(post_status, post_type, -- first, either order is OK
post_date, ID) -- last, in this order
Having the GROUP BY and ORDER BY the 'same' may eliminate a sort. The following change will probably give you the same results, but faster.
GROUP BY wp11_posts.ID
ORDER BY wp11_posts.post_date DESC
-->
GROUP BY wp11_posts.post_date, wp11_posts.ID
ORDER BY wp11_posts.post_date DESC, wp11_posts.ID DESC
I have a table wp_views, with columns postid and views
I want to get the IDs that have the highest values of views (top 4)
Then return the title and link from wp_posts using the postid.
What's the right way of doing this?
You can try the following
global $wpdb;
$top4=$wpdb->get_results('SELECT post_title, post_name from `'.$wpdb->prefix.'views`
INNER JOIN `'.$wpdb->prefix.'posts` ON `postid`=`ID`
ORDER BY `views` DESC
LIMIT 4;', ARRAY_A);
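The join itself can be tried outside WordPress. The following is a rough sketch with assumptions stated: the table prefix is taken to be wp_, the two tables carry only the columns named in the question plus post_title and post_name, the five rows are invented, and Python's sqlite3 stands in for MySQL and $wpdb.

```python
import sqlite3

# Hypothetical miniature wp_views and wp_posts tables.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows behave like dicts, similar to ARRAY_A
conn.executescript("""
CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_title TEXT, post_name TEXT);
CREATE TABLE wp_views (postid INTEGER, views INTEGER);
INSERT INTO wp_posts VALUES
    (1, 'Alpha', 'alpha'), (2, 'Beta', 'beta'), (3, 'Gamma', 'gamma'),
    (4, 'Delta', 'delta'), (5, 'Epsilon', 'epsilon');
INSERT INTO wp_views VALUES (1, 50), (2, 400), (3, 10), (4, 300), (5, 120);
""")

# Join views to posts, sort by view count, keep the four most viewed.
top4 = conn.execute("""
    SELECT post_title, post_name
    FROM wp_views
    INNER JOIN wp_posts ON postid = ID
    ORDER BY views DESC
    LIMIT 4
""").fetchall()

top4_titles = [row["post_title"] for row in top4]
print(top4_titles)
```

The query returns post_name (the slug), not a full URL; inside WordPress the permalink would still have to be built from it.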
I have tried to replicate your table structure from what you wrote, and from this I have come up with the following:
SELECT id, title, link
FROM wp_views RIGHT JOIN wp_posts ON wp_posts.id = wp_views.post_id
ORDER BY views DESC
LIMIT 4;
you can try it out here: http://sqlfiddle.com/#!9/1cea23/1
I am using RIGHT JOIN so that every wp_posts row is kept; posts without a matching wp_views row come back with NULL in the wp_views columns. If you want to avoid NULL values in your results you can use INNER JOIN instead.
I have a meta key which is set by a select drop down so a user can select an option between 1 and 14 and then save their post. I want the posts to display on the page from 1 to 14 ordered by date but if the user creates a new set of posts the next day I also want this to happen so you have posts 1 to 14 each day displaying in that order.. the SQL i have so far is as follows
SELECT SQL_CALC_FOUND_ROWS
wp_postmeta.meta_key,
wp_postmeta.meta_value,
wp_posts.*
FROM wp_posts
INNER JOIN wp_postmeta ON (wp_posts.ID = wp_postmeta.post_id)
WHERE 1=1
AND wp_posts.post_type = 'projectgallery'
AND ( wp_posts.post_status = 'publish'
OR wp_posts.post_status = 'private')
AND (wp_postmeta.meta_key = 'gallery_area' )
GROUP BY wp_posts.post_date asc
ORDER BY CAST(wp_postmeta.meta_value AS UNSIGNED) DESC,
DATE(wp_posts.post_date) desc;
Which gives me the following output. Notice that the posts entered on different dates with either 1 or 3 show up in sequence; ideally I want the latest ones to display directly after 14 so it starts over again. The number 14 should not be static either: if someone adds another option to the select it will increase, and it will decrease if an option is removed.
GROUP BY is confusingly named. It only makes sense when there's a SUM() or COUNT() or some such function in the SELECT clause. It's not useful here.
The canonical way of getting a post_meta.value into a result set of post items is this. You're close but this makes it easier to read.
SELECT SQL_CALC_FOUND_ROWS
ga.meta_value gallery_area,
p.*
FROM wp_posts p
LEFT JOIN wp_postmeta ga ON p.ID = ga.post_id AND ga.meta_key = 'gallery_area'
WHERE 1=1
AND p.post_status IN ('publish', 'private')
AND p.post_type = 'projectgallery'
Notice the two parts of the ON clause in the JOIN. That way of doing the SQL gets you just the meta_key value you want cleanly.
So, that's your result set. You'll get a row for every post. If the metadata is missing, you'll get a NULL value for gallery_area.
Then you have to order the result set the way you want. First order by date, then order by gallery_area, like so:
ORDER BY DATE(p.post_date) DESC,
0+gallery_area ASC
The 0+value trick is MySQL shorthand for casting the value to a number.
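Why the numeric cast matters can be seen on a handful of rows. This is a sketch under assumptions: Python's sqlite3 stands in for MySQL, CAST(... AS INTEGER) replaces the 0+value shorthand, and the posts and gallery_area values are invented.

```python
import sqlite3

# Hypothetical miniature wp_posts / wp_postmeta tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_date TEXT);
CREATE TABLE wp_postmeta (post_id INTEGER, meta_key TEXT, meta_value TEXT);
INSERT INTO wp_posts VALUES
    (1, '2020-05-01 10:00:00'), (2, '2020-05-01 11:00:00'),
    (3, '2020-05-01 12:00:00'), (4, '2020-05-02 09:00:00'),
    (5, '2020-05-02 10:00:00');
INSERT INTO wp_postmeta VALUES
    (1, 'gallery_area', '10'), (2, 'gallery_area', '2'),
    (3, 'gallery_area', '1'),  (4, 'gallery_area', '2'),
    (5, 'gallery_area', '1');
""")

BASE = """
    SELECT p.ID
    FROM wp_posts AS p
    LEFT JOIN wp_postmeta AS ga
           ON p.ID = ga.post_id AND ga.meta_key = 'gallery_area'
"""

# Newest day first; within a day, gallery_area compared as a number.
numeric_order = [row[0] for row in conn.execute(
    BASE + "ORDER BY DATE(p.post_date) DESC, CAST(ga.meta_value AS INTEGER) ASC"
)]

# Same query without the cast: meta_value is compared as text, so '10' < '2'.
text_order = [row[0] for row in conn.execute(
    BASE + "ORDER BY DATE(p.post_date) DESC, ga.meta_value ASC"
)]

print(numeric_order)
print(text_order)
```

With the cast, each day runs 1, 2, ... 10 as intended; without it, the post with gallery_area 10 jumps ahead of the one with 2.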
Edit. Things can get fouled up if the meta_value items contain extraneous characters like leading spaces. Try diagnosing with these changes. Put
DATE(p.post_date) pdate,
0+ga.meta_value numga,
ga.meta_value gallery_area
in your SELECT clause. If some of the numga items come up zero, this is your problem.
Also try
ORDER BY DATE(p.post_date) DESC,
0+TRIM(gallery_area) ASC
in an attempt to get rid of the spaces. But they might not be spaces.
I have this query to load the user stream in my app. Is it too heavy if we have 10000 matched rows in 'follow'?
SELECT *
FROM post
WHERE user_id
IN (SELECT follow_id
FROM follow
WHERE id='$some_id')
AND type='accepted'
ORDER BY id DESC LIMIT $page , 20
Syntactically your code looks correct; I don't see any errors. So if you're talking about efficiency, I would join the tables and include the second filter in the JOIN:
SELECT p.*
FROM post p
JOIN follow f
ON f.follow_id = p.user_id
AND f.id = '$some_id'
WHERE p.type = 'accepted'
ORDER BY p.id DESC LIMIT $page , 20
MySQL often handles large sets of data a lot better through a JOIN than with an IN().
Think of it this way: because the IN() can have pretty much anything inside of it, MySQL may have to check each row against everything the subquery returns, instead of matching once when you JOIN.
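The two forms can be compared on a toy data set. The sketch below rests on assumptions: Python's sqlite3 stands in for MySQL, the tables and rows are invented, and the '$some_id' and $page values are passed as bound parameters rather than interpolated into the SQL string.

```python
import sqlite3

# Hypothetical miniature post / follow tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE post (id INTEGER PRIMARY KEY, user_id INTEGER, type TEXT);
CREATE TABLE follow (id INTEGER, follow_id INTEGER);
INSERT INTO post VALUES
    (1, 100, 'accepted'), (2, 200, 'pending'),
    (3, 300, 'accepted'), (4, 200, 'accepted');
INSERT INTO follow VALUES (7, 100), (7, 200), (8, 300);
""")

IN_QUERY = """
    SELECT p.id FROM post AS p
    WHERE p.user_id IN (SELECT follow_id FROM follow WHERE id = ?)
      AND p.type = 'accepted'
    ORDER BY p.id DESC LIMIT 20 OFFSET ?
"""
JOIN_QUERY = """
    SELECT p.id FROM post AS p
    JOIN follow AS f ON f.follow_id = p.user_id AND f.id = ?
    WHERE p.type = 'accepted'
    ORDER BY p.id DESC LIMIT 20 OFFSET ?
"""

def stream(query, some_id, offset=0):
    """Run one of the queries with bound parameters and return the post ids."""
    return [row[0] for row in conn.execute(query, (some_id, offset))]

in_ids = stream(IN_QUERY, 7)
join_ids = stream(JOIN_QUERY, 7)

# If follow ever holds the same (id, follow_id) pair twice, the JOIN repeats
# the matching posts while IN does not.
conn.execute("INSERT INTO follow VALUES (7, 100)")
in_ids_dup = stream(IN_QUERY, 7)
join_ids_dup = stream(JOIN_QUERY, 7)

print(in_ids, join_ids)
print(in_ids_dup, join_ids_dup)
```

So the two forms are interchangeable only when follow has no duplicate pairs, for example because of a unique key on (id, follow_id).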
With that many returning, I have a feeling a Join might be more efficient
SELECT *
FROM post p Join
follow f On p.user_id = f.follow_id
WHERE f.id='$some_id'
AND p.type='accepted'
ORDER BY p.id DESC LIMIT $page , 20
Was looking for some help and found some but nothing for when the FROM also is a subquery.
SELECT COUNT(*)
FROM
( SELECT tc.*,
( SELECT status FROM test_case_executions tce
WHERE tce.test_case_id = tc.id
ORDER BY tce.execution_date DESC, tce.id DESC LIMIT 1
) AS last_status FROM test_cases tc
) a
WHERE a.last_status = '$status'
Is there a way in CI to just use this and execute it or can someone help me write this in the way CI wants it? Thanks
Everything you need can really be found here, as mentioned in above comments. Just to get you started, here's how you could do it:
$this->db->query("
SELECT COUNT(*) AS amount
FROM ( SELECT tc.*,
( SELECT status
FROM test_case_executions AS tce
WHERE tce.test_case_id = tc.id
ORDER BY tce.execution_date DESC, tce.id DESC
LIMIT 1) AS last_status
FROM test_cases AS tc
) AS a
WHERE a.last_status = ?
", array($status));
Basically this is what the comments are saying. What makes this more "CI convenient" than a plain mysql_query etc. is that the passed values are escaped, which protects you from errors and SQL injection. Note the ? at the end of the query and the second parameter, array($status). I also styled the query to be a bit easier on the eye (imo).
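The effect of the ? placeholder can be tried without CodeIgniter. The sketch below is only an analogy under assumptions: Python's sqlite3 (which also uses ? placeholders) stands in for the CI database layer, the helper name count_by_last_status is made up, and the tables and rows are invented.

```python
import sqlite3

# Hypothetical miniature test_cases / test_case_executions tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test_cases (id INTEGER PRIMARY KEY);
CREATE TABLE test_case_executions (
    id INTEGER PRIMARY KEY,
    test_case_id INTEGER,
    status TEXT,
    execution_date TEXT
);
INSERT INTO test_cases VALUES (1), (2), (3);
INSERT INTO test_case_executions VALUES
    (1, 1, 'failed', '2020-01-01'),
    (2, 1, 'passed', '2020-01-02'),
    (3, 2, 'failed', '2020-01-03');
""")

def count_by_last_status(status):
    """Count test cases whose most recent execution has the given status."""
    row = conn.execute("""
        SELECT COUNT(*) AS amount
        FROM ( SELECT tc.*,
                      ( SELECT status
                        FROM test_case_executions AS tce
                        WHERE tce.test_case_id = tc.id
                        ORDER BY tce.execution_date DESC, tce.id DESC
                        LIMIT 1) AS last_status
               FROM test_cases AS tc
             ) AS a
        WHERE a.last_status = ?
    """, (status,)).fetchone()
    return row[0]

passed = count_by_last_status("passed")
failed = count_by_last_status("failed")
# A hostile value is compared as plain data, not executed as SQL.
hostile = count_by_last_status("passed' OR '1'='1")
print(passed, failed, hostile)
```

Test case 3 has no executions, so its last_status is NULL and it is counted under neither status.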
You might think "But I wanna use Active Records! D:", however more advanced stuff requires you to leave the comfort zone. Good luck!