MySQL: showing totals

I am trying to figure out how to have PHP check and print two different things.
Both questions refer to a table called "remix".
The first, and more important problem at the moment: I would like to show how many DIFFERENT values are under "author", so I can display the total number of registered authors. I need to know not only how to use COUNT most efficiently to return UNIQUE names under "author", but also how to show that inline with the total number of rows, which are currently numbered.
The second question is how I could set up a top 3 artists list, based on how many times each name occurs in the table. This would also show on the same page as the code above.
Here is my current code:
require 'remix/archive/connect.php';
mysql_select_db($remix);
$recentsong = mysql_query("SELECT ID,song,author,filename FROM remix ORDER by ID desc limit 1;");
$row = mysql_fetch_array($recentsong);
echo'
<TABLE BORDER=1><TR><TD WIDTH=500>
Currently '.$row['ID'].' Remixes by **(want total artists here)** artists.<BR>
Most recent song: <A HREF=remix/archive/'.$row['filename'].'>'.$row['song'].'</A> by <FONT COLOR=white>'.$row['author'].'</FONT>
So as you can see, I currently have it set up to show the most recent song (not in the most efficient way), but I want the other things in there too, such as at least the top contributor. I don't know whether I can put it all in one PHP block, have to break it up, or can do it all within one query call with the right code.
Thanks for any help!

I'm not sure I really understood everything in your question but we'll work this through together :p
I've created an SQLFiddle to work on some test data: http://sqlfiddle.com/#!2/9b613/1/0.
Note the INDEX on the author field; it will ensure good performance :)
To show how many DIFFERENT values are under "author", you can use:
SELECT COUNT(DISTINCT author) as TOTAL_AUTHORS
FROM remix;
To get the total number of rows, which are currently numbered, you can use:
SELECT COUNT(*) as TOTAL_SONGS
FROM remix;
And you can combine both in a single query:
SELECT
COUNT(DISTINCT author) as TOTAL_AUTHORS,
COUNT(*) as TOTAL_SONGS
FROM remix;
Now for the top 3. This query will give you the 3 authors with the greatest number of songs, with the first one on top:
SELECT
author,
COUNT(*) as AUTHOR_SONGS
FROM remix
GROUP BY author
ORDER BY AUTHOR_SONGS DESC
LIMIT 3;
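If you want to try both queries without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. The sample rows are invented for illustration, and the extra "author ASC" tie-breaker is my own addition (it is not in the query above); it only makes the order stable when two authors have the same number of songs.

```python
import sqlite3

# In-memory stand-in for the "remix" table; all rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE remix (ID INTEGER PRIMARY KEY, song TEXT, author TEXT, filename TEXT)"
)
conn.execute("CREATE INDEX idx_author ON remix (author)")
songs = [
    ("Song A", "alice", "a.mp3"),
    ("Song B", "bob", "b.mp3"),
    ("Song C", "alice", "c.mp3"),
    ("Song D", "carol", "d.mp3"),
    ("Song E", "alice", "e.mp3"),
    ("Song F", "bob", "f.mp3"),
    ("Song G", "dave", "g.mp3"),
]
conn.executemany("INSERT INTO remix (song, author, filename) VALUES (?, ?, ?)", songs)

# Both totals in a single query, as in the combined query above.
total_authors, total_songs = conn.execute(
    "SELECT COUNT(DISTINCT author) AS TOTAL_AUTHORS, COUNT(*) AS TOTAL_SONGS FROM remix"
).fetchone()

# Top 3 authors by number of songs; "author ASC" is an added tie-breaker (assumption).
top3 = conn.execute(
    "SELECT author, COUNT(*) AS AUTHOR_SONGS FROM remix "
    "GROUP BY author ORDER BY AUTHOR_SONGS DESC, author ASC LIMIT 3"
).fetchall()

print(f"Currently {total_songs} Remixes by {total_authors} artists.")
for author, song_count in top3:
    print(author, song_count)
```

The same two statements should work unchanged against MySQL; only the connection code would differ.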
Let me know if this answer is incomplete, and have fun with SQL!
Edit1: Well, just rewrite your PHP code as:
(...)
$recentsong = mysql_query("SELECT COUNT(DISTINCT author) as TOTAL_AUTHORS, COUNT(*) as TOTAL_SONGS FROM remix;");
$row = mysql_fetch_array($recentsong);
(...)
Currently '.$row['TOTAL_SONGS'].' Remixes by '.$row['TOTAL_AUTHORS'].' artists.<BR>
(...)
For the top3 part, use another mysql_query and create your table on the fly :)

Related

yii pagination issue trying to use 2 criterias

Disclaimer: I'm self-taught. I got my rudimentary knowledge of PHP from reading forums. I'm an SQL newb, and I know next to nothing about Yii.
I've got a controller that shows the products on our webstore. I would like the out of stock products to show up on the last pages.
I know I could sort by stock quantity but would like the in stock products to change order every time the page is reloaded.
My solution (probably wrong but kinda works) is to run two queries. One for the product that has stock, sorted randomly. One for the out of stock product also ordered randomly. I then merge the two resulting arrays. This much has worked using the code below (although I feel like there must be a more efficient way than running two queries).
The problem is that this messes up the pagination. Every product returned is listed on the same page and changing pages shows the same results. As far as I can tell the pagination only works for 1 CDbCriteria at a time. I've looked at the yii docs for CPagination for a way around this but am not getting anywhere.
$criteria=new CDbCriteria;
$criteria->alias = 'Product';
$criteria->addCondition('(inventory_avail>0 OR inventoried=0)');
$criteria->addCondition('Product.parent IS NULL');
$criteria->addCondition('web=1');
$criteria->addCondition('current=1');
$criteria->addCondition('sell>sell_web');
$criteria->order = 'RAND()';
$criteria2=new CDbCriteria;
$criteria2->alias = 'Product';
$criteria2->addCondition('(inventory_avail<1 AND inventoried=1)');
$criteria2->addCondition('Product.parent IS NULL');
$criteria2->addCondition('web=1');
$criteria2->addCondition('current=1');
$criteria2->addCondition('sell>sell_web');
$criteria2->order = 'RAND()';
$crit1=Product::model()->findAll($criteria);
$crit2=Product::model()->findAll($criteria2);
$models=array_merge($crit1,$crit2);
//I know there is something wrong here, no idea how to fix it..
$count=Product::model()->count($criteria);
$pages=new CPagination($count);
//results per page
$pages->pageSize=30;
$pages->applyLimit($criteria);
$this->render('index', array(
'models' => $models,
'pages' => $pages
));
Clearly I am in over my head. Any help would be much appreciated.
Edit:
I figured that a third CDbCriteria that includes both the in stock and out of stock items could be used for the pagination (as it would include the same number of products as the combined results of the first 2). So I tried adding this (criteria1 and criteria2 remain the same):
$criteria3=new CDbCriteria;
$criteria3->alias = 'Product';
//$criteria3->addCondition('(inventory_avail>0 OR inventoried=0)');
$criteria3->addCondition('Product.parent IS NULL');
$criteria3->addCondition('web=1');
$criteria3->addCondition('current=1');
$criteria3->addCondition('sell>sell_web');
//$criteria3->order = 'RAND()';
$crit1=Product::model()->findAll($criteria);
$crit2=Product::model()->findAll($criteria2);
$models=array_merge($crit1,$crit2);
$count=Product::model()->count($criteria3);
$pages=new CPagination($count);
//results per page
$pages->pageSize=30;
$pages->applyLimit($criteria3);
$crit1=Product::model()->findAll($criteria);
$crit2=Product::model()->findAll($criteria2);
$models=array_merge($crit1,$crit2);
$this->render('index', array(
'models' => $models,
'pages' => $pages
));
I'm sure I'm missing something super obvious here... Been searching all day getting nowhere.
So you are running into what is, IMO, one of the potential drawbacks of natural language query builder frameworks. Working only with the "out of the box" methods for building queries can lead your thinking about a SQL problem down a bad path. Sometimes you need to consider the raw SQL query capabilities that almost every framework provides in order to best address your problem.
So let's start with the basic SQL for how I would suggest you approach your problem. You can either work this into your query builder style (if possible) or make a raw query.
You could easily form a calculated field representing binary inventory status for sorting, then sort by another criterion secondarily.
SELECT
field1,
field2,
/* other fields */
IF(inventory_avail > 0, 1, 0) AS in_inventory
FROM product
WHERE /* where conditions */
ORDER BY
in_inventory DESC, /* sort items in inventory first */
other_field_to_sort ASC /* other sort criteria */
LIMIT ?, ? /* pagination offset and row count */
Note that this approach only returns the rows of data you need to display. You move away from your current approach of doing a lot of work in the application to merge record sets and such.
I do question the use of RAND() for pagination purposes. Because the order is reshuffled on every request, the same product can appear on one page after another as the user pages through, while other products may never show up at all. Either that, or you need to add complexity to your application to somehow track the "randomized" version of the entire result set for each specific user. For this reason, it is really unusual to see order randomization for paginated results display.
I know you mentioned you might like to spike out a randomized view to the user on a "first page". If that is the goal, that is OK, but perhaps decouple or differentiate that specific view from the wider paginated product listing, so as not to confuse the end user with a seemingly unpredictable pagination interface.
In your ORDER BY clause, you should always have enough sorting conditions that the final (most specific) one guarantees a predictable result order. Oftentimes this means including an autoincrementing primary key field, or a similar field that is unique for each row.
So let's say, for example, you let the user sort items by price, but you still obviously want to show all inventoried items first. Now let's say you have 100K products, such that many "pages" of products share a common price when ordered by price.
If you used this for ordering:
ORDER BY in_inventory DESC, price ASC
You could still have the problem of a user seeing the same product repeated when navigating between pages, because no criterion more specific than price was given, and ordering beyond that is not guaranteed.
You would probably want to do something like:
ORDER BY in_inventory DESC, price ASC, unique_id ASC
That way the order is totally predictable (even though the user may not even know that sorting by unique id is being applied).
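To make the paging behaviour concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The table layout, the sample rows, the page size and the fetch_page helper are all assumptions made up for illustration. SQLite has no IF() function, so the calculated field is written with CASE instead, and the LIMIT uses the "LIMIT count OFFSET offset" spelling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL, "
    "inventory_avail INTEGER)"
)
# Made-up sample data: only two distinct prices, so price ties are common,
# and every third product is out of stock.
rows = [(f"item{i}", 9.99 if i % 2 else 19.99, i % 3) for i in range(1, 11)]
conn.executemany(
    "INSERT INTO product (name, price, inventory_avail) VALUES (?, ?, ?)", rows
)

PAGE_SIZE = 4  # hypothetical page size


def fetch_page(page):
    """Return the product ids on a 1-based page: in-stock first, ties broken by id."""
    offset = (page - 1) * PAGE_SIZE
    sql = (
        "SELECT id, CASE WHEN inventory_avail > 0 THEN 1 ELSE 0 END AS in_inventory "
        "FROM product "
        "ORDER BY in_inventory DESC, price ASC, id ASC "
        "LIMIT ? OFFSET ?"
    )
    return [row[0] for row in conn.execute(sql, (PAGE_SIZE, offset))]


pages = [fetch_page(p) for p in (1, 2, 3)]
seen = [product_id for page in pages for product_id in page]
print(pages)
```

Because the final sort key is unique, every product lands on exactly one page, and the out-of-stock ones end up on the last pages. Only one page of rows is fetched per request, so nothing has to be merged in application code.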

Get the number of created Wikipedia articles during a specific period of time

I want to get the number of articles written/created in some language (say English) during a specific week (say last week). How can I run this query on Wikipedia?
I have no experience with wikipedia-api.
You can run this query on the Wikipedia database:
SELECT COUNT(*)
FROM enwiki_p.recentchanges
WHERE rc_new = 1
AND rc_namespace = 0
AND rc_timestamp BETWEEN 20160417000000 AND 20160424000000
See the results here. If you want to count the new articles for another language, replace the "en" in enwiki_p with that language's code (for example, dewiki_p for German).
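The rc_timestamp bounds in that query are 14-digit YYYYMMDDHHMMSS values. If you need them for a different week, a small helper can generate them; the sketch below is only an illustration, and the function names week_bounds and build_query are made up here, not part of any Wikipedia tooling.

```python
from datetime import date, datetime, time, timedelta


def week_bounds(end_day):
    """Return (start, end) as YYYYMMDDHHMMSS strings for the 7 days before end_day."""
    fmt = "%Y%m%d%H%M%S"
    end = datetime.combine(end_day, time.min)  # midnight at the start of end_day
    start = end - timedelta(days=7)
    return start.strftime(fmt), end.strftime(fmt)


def build_query(wiki_db, end_day):
    """Build the counting query above for a given database name and week."""
    start, end = week_bounds(end_day)
    return (
        "SELECT COUNT(*)\n"
        f"FROM {wiki_db}.recentchanges\n"
        "WHERE rc_new = 1\n"
        "AND rc_namespace = 0\n"
        f"AND rc_timestamp BETWEEN {start} AND {end}"
    )


query = build_query("enwiki_p", date(2016, 4, 24))
print(query)
```

Note that BETWEEN includes both endpoints, so a change stamped exactly at midnight on the end day would be counted in two adjacent weeks.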

MySQL Join not Producing Desired Results

I'm working on writing a snippet for ModX that will find all documents with the specified TV set to a user-submitted value.
Here is a description of the tables I'm working with.
http://wiki.modxcms.com/index.php/Template_Variable_Database_Tables
Here is my query:
SELECT contentid
FROM prefix_site_tmplvar_contentvalues
JOIN prefix_site_tmplvars
ON prefix_site_tmplvars.id = prefix_site_tmplvar_contentvalues.tmplvarid
WHERE value="Red"
Currently it's producing results such as this:
http://pastebin.com/mEJ1w2be
Each document ID gets a new row in the results for each Template Variable. So for 7455 in the example there will be one array for color="red", one for material="wood", and one for size="small". That makes it difficult to find a product that is red, small, and made of wood.
Is there a way I could join these tables so that I get one row per product, with the document ID and its set of template variables and associated values, not all broken up?
try
GROUP BY contentid
this will smush all the rows with the same contentid together.
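For the "red, small, and made of wood" case specifically, one common pattern is to combine GROUP BY contentid with a HAVING clause that counts how many of the wanted attributes matched. The sketch below uses Python's built-in sqlite3 module with invented sample rows. The unprefixed table names and the name column on the template variable table are assumptions for illustration, so check them against your actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE site_tmplvars (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE site_tmplvar_contentvalues (contentid INTEGER, tmplvarid INTEGER, value TEXT);
INSERT INTO site_tmplvars VALUES (1, 'color'), (2, 'material'), (3, 'size');
INSERT INTO site_tmplvar_contentvalues VALUES
  (7455, 1, 'Red'),  (7455, 2, 'Wood'),  (7455, 3, 'Small'),
  (7456, 1, 'Red'),  (7456, 2, 'Metal'), (7456, 3, 'Small'),
  (7457, 1, 'Blue'), (7457, 2, 'Wood'),  (7457, 3, 'Small');
""")

# Attribute values a document must have (made-up example input).
wanted = {"color": "Red", "material": "Wood", "size": "Small"}

# One "(name = ? AND value = ?)" test per wanted attribute, OR-ed together.
conditions = " OR ".join("(tv.name = ? AND cv.value = ?)" for _ in wanted)
params = [item for pair in wanted.items() for item in pair]

sql = (
    "SELECT cv.contentid "
    "FROM site_tmplvar_contentvalues cv "
    "JOIN site_tmplvars tv ON tv.id = cv.tmplvarid "
    f"WHERE {conditions} "
    "GROUP BY cv.contentid "
    # Keep only documents where every wanted attribute matched.
    "HAVING COUNT(DISTINCT tv.name) = ? "
    "ORDER BY cv.contentid"
)
matches = [row[0] for row in conn.execute(sql, params + [len(wanted)])]
print(matches)
```

Each surviving row is one document ID, so the result is already one row per product.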

inner query of subqery returning multiple rows

I am not that experienced in SQL, so please forgive me if this is not a good question to ask, but I have researched for almost 3-4 days and have not been able to solve it.
My problem is that I have a table with multiple image names in it. For whoever is a follower of a particular user, I have to get the images from this table. One user can have multiple followers, so I have to fetch the images posted by all the followers.
Here is the subquery code snippet i am using.
SELECT id,
outfit_image,
img_title,
description
FROM outfitpic_list r2
WHERE Email=ANY(SELECT being_followed
FROM follower_table
WHERE follower='test@gmail.com')
So the inner query here returns multiple values. For each value (being_followed) I have to fetch all the images and display them, but with this query I get only one image each time. I tried IN as well, but it did not work.
Table structure:-
Outfitpic_list table
id|outfit_image|datetime|Email|image_title|description
Follower_table
being_followed|follower
Please help,I am stuck..!!
Thank you..!!
I think your problem may be the = sign between "E-mail" and "Any". Try this statement:
SELECT
id,
outfit_image,
img_title,
description
FROM outfitpic_list r2
WHERE Email IN
(
SELECT being_followed
FROM follower_table
WHERE follower='test@gmail.com'
)
It's the same statement, without the = sign, and the ANY keyword replaced with IN. (I cleaned it up a little to make it more readable)
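As a quick way to check how IN behaves when the subquery returns several rows, here is a minimal sketch using Python's built-in sqlite3 module. The email addresses and image names are placeholders invented for the test, and only the columns needed for the check are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE follower_table (being_followed TEXT, follower TEXT);
CREATE TABLE outfitpic_list (id INTEGER PRIMARY KEY, outfit_image TEXT, Email TEXT);
INSERT INTO follower_table VALUES
  ('ann@example.com', 'me@example.com'),
  ('bob@example.com', 'me@example.com'),
  ('cat@example.com', 'someone.else@example.com');
INSERT INTO outfitpic_list (outfit_image, Email) VALUES
  ('ann1.jpg', 'ann@example.com'),
  ('ann2.jpg', 'ann@example.com'),
  ('bob1.jpg', 'bob@example.com'),
  ('cat1.jpg', 'cat@example.com');
""")

# The subquery returns two addresses; IN matches rows for either of them.
images = [
    row[0]
    for row in conn.execute(
        "SELECT outfit_image FROM outfitpic_list "
        "WHERE Email IN (SELECT being_followed FROM follower_table WHERE follower = ?) "
        "ORDER BY id",
        ("me@example.com",),
    )
]
print(images)
```

Every image posted by either followed address comes back, and the image from the address that is not followed is left out.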

Stumbleupon type query

Wow, makes your head spin!
I am about to start a project, and although my MySQL is OK, I can't get my head around what's required for this:
I have a table of web addresses.
id,url
1,http://www.url1.com
2,http://www.url2.com
3,http://www.url3.com
4,http://www.url4.com
I have a table of users.
id,name
1,fred bloggs
2,john bloggs
3,amy bloggs
I have a table of categories.
id,name
1,science
2,tech
3,adult
4,stackoverflow
I have a table of categories the user likes as numerical ref relating to the category unique ref. For example:
user,category
1,4
1,6
1,7
1,10
2,3
2,4
3,5
.
.
.
I have a table of scores relating to each website address. When a user visits one of these sites and says they like it, it's stored like so:
url_ref,category
4,2
4,3
4,6
4,2
4,3
5,2
5,3
.
.
.
So based on the above data, URL 4 would score (in its own right) as follows: 2=2, 3=2, 6=1
What I was hoping to do was pick out a random URL from over 2,000,000 records based on the current user's interests.
So if the logged-in user likes categories 1, 2 and 3, then I would like to ORDER BY a score generated from their interests.
If the logged-in user likes categories 2, 3 and 6, then the total score for URL 4 would be 5. However, if the logged-in user only likes categories 2 and 6, the URL score would be 3. So the ORDER BY would be in the context of the logged-in user's interests.
Think of stumbleupon.
I was thinking of using a set of VIEWS to help with sub queries.
I'm guessing that all 2,000,000 records will need to be looked at and based on the id of the url it will look to see what scores it has based on each selected category of the current user.
So we need to know the user ID and this gets passed into the query as a constant from the start.
Ain't got a clue!
Chris Denman
What I was hoping to do was pick out a random URL from over 2,000,000 records based on the current user's interests.
This screams for predictive modeling, something you probably wouldn't be able to pull off in the database. Basically, you'd want to precalculate your score for a given interest (or more likely set of interests) / URL combination, and then query based on the precalculated values. You'd most likely be best off doing this in application code somewhere.
Since you're trying to guess whether a user will like or dislike a link based on what you know about them, Bayes seems like a good starting point (sorry for the wikipedia link, but without knowing your programming language this is probably the best place to start): Naive Bayes Classifier
edit
The basic idea here is that you continually run your precalculation process, and once you have enough data you can try to distill it to a simple formula that you can use in your query. As you collect more data, you continue to run the precalculation process and use the expanded results to refine your formula. This gets really interesting if you have the means to suggest a link and then find out whether the user liked it or not, as you can use this feedback loop to really improve the prediction algorithm (have a read on machine learning, particularly genetic algorithms, for more on this).
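Before going as far as a classifier, the plain per-user score described in the question (count a URL's likes only in the categories the user cares about) can be sketched as a join plus GROUP BY. The example below uses Python's built-in sqlite3 module. The table names url_likes and user_categories, the user_id column and the two users' preferences are assumptions invented for illustration; the like rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE url_likes (url_ref INTEGER, category INTEGER);
CREATE TABLE user_categories (user_id INTEGER, category INTEGER);
-- Like rows taken from the question's sample data.
INSERT INTO url_likes VALUES (4,2),(4,3),(4,6),(4,2),(4,3),(5,2),(5,3);
-- Hypothetical users: user 1 likes categories 2, 3, 6; user 2 likes 2 and 6.
INSERT INTO user_categories VALUES (1,2),(1,3),(1,6),(2,2),(2,6);
""")


def scores_for(user_id):
    """Return (url_ref, score) pairs, counting only likes in the user's categories."""
    sql = (
        "SELECT l.url_ref, COUNT(*) AS score "
        "FROM url_likes l "
        "JOIN user_categories uc ON uc.category = l.category "
        "WHERE uc.user_id = ? "
        "GROUP BY l.url_ref "
        "ORDER BY score DESC, l.url_ref ASC"
    )
    return conn.execute(sql, (user_id,)).fetchall()


print(scores_for(1))
print(scores_for(2))
```

Rows like these are the kind of result a precalculation job could store per user, so the page request only has to read from the stored scores instead of scanning every URL.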
I did this in the end:
$dbh = new NewSys::mySqlAccess("xxxxxxxxxx","xxxxxxxxxx","xxxxxxxxx","localhost");
$icat{1}='animals pets';
$icat{2}='gadget addict';
$icat{3}='games online play';
$icat{4}='painting art';
$icat{5}='graphic designer design';
$icat{6}='philosophy';
$icat{7}='strange unusual bizarre';
$icat{8}='health fitness';
$icat{9}='photography photographer';
$icat{10}='reading books';
$icat{11}='humour humor comedy comedian funny';
$icat{12}='psychology psychologist';
$icat{13}='cartoons cartoonist';
$icat{14}='internet technology';
$icat{15}='science scientist';
$icat{16}='clothing fashion';
$icat{17}='movies movie latest';
$icat{18}="\"self improvement\"";
$icat{19}='drawing art';
$icat{20}='latest band member';
$icat{21}='shop prices';
$icat{22}='recipe recipes food';
$icat{23}='mythology';
$icat{24}='holiday resorts destinations';
$icat{25}="(rude words)";
$icat{26}="www website";
$dbh->Sql("DELETE FROM precalc WHERE member = '$fdat{cred_id}'");
$dbh->Sql("SELECT * FROM prefs WHERE member = '$fdat{cred_id}'");
@chos=();
while($dbh->FetchRow()){
$cat=$dbh->Data('category');
$cats{$cat}='#';
}
foreach $cat (keys %cats){
push @chos,"\'$cat\'";
push @strings,$icat{$cat};
}
$sqll=join("\,",@chos);
$words=join(" ",@strings);
$dbh->Sql("select users.id,users.url,IFNULL((select sum(scoretot.scr) from scoretot where scoretot.id = users.id and scoretot.category IN \($sqll\)),0) as score from users WHERE MATCH (description,lasttweet) AGAINST ('$words' IN BOOLEAN MODE) AND IFNULL((SELECT ref FROM visited WHERE member = '$fdat{cred_id}' AND user = users.id LIMIT 1),0) = 0 ORDER BY score DESC limit 30");
$cnt=0;
while($dbh->FetchRow()){
$id=$dbh->Data('id');
$url=$dbh->Data('url');
$score=$dbh->Data('score');
$dbh2->Sql("INSERT INTO precalc (member,user,url,score) VALUES ('$fdat{cred_id}','$id','$url','$score')");
$cnt++;
}
I came up with this answer about three months ago, and I just cannot read it anymore. So sorry, I can't explain how it finally worked, but it managed to query 2 million websites and choose one based on the history of a user's past votes on other sites.
Once I got it working, I moved on to another problem!
http://www.staggerupon.com is where it all happens!
Chris