Select the most used value in document from SQL - sql-server-2008

I have documents with several lines each. Each line refers to a product, and each product has a category, so categories can be mixed within the same document. For example, document 4 might have 10 lines, 4 of them in the "fruit" category and the other 6 in the "ice cream" category. When I extract this document, I get a result like this:
CUSTOMER_NAME | DOC NUMBER | CATEGORY
CUSTOMER_1    | 10         | FRUIT
CUSTOMER_1    | 10         | ICE CREAM
I need to retrieve only the row with the most used category, in this case "ICE CREAM".
This is my code:
DECLARE @dataa DATETIME;
DECLARE @datada DATETIME;
SET @datada = DATEADD(DAY, -1, GETDATE());
SET @dataa = DATEADD(DAY, -60, GETDATE());
SELECT
DSCCONTO1, TABCATEGORIE.DESCRIZIONE, TESTEDOCUMENTI.NUMERODOC
FROM
.dbo.TESTEDOCUMENTI
INNER JOIN
.dbo.ANAGRAFICACF ON CODCLIFOR = CODCONTO
INNER JOIN
.dbo.RIGHEDOCUMENTI ON PROGRESSIVO = IDTESTA
INNER JOIN
.dbo.ANAGRAFICAARTICOLI ON CODART = ANAGRAFICAARTICOLI.CODICE
INNER JOIN
.dbo.TABCATEGORIE ON CATEGORIA = TABCATEGORIE.CODICE
INNER JOIN
.dbo.TABCATEGORIESTAT ON CODCATEGORIASTAT = TABCATEGORIESTAT.CODICE
WHERE
.dbo.TESTEDOCUMENTI.DOCCHIUSO = '0'
AND .dbo.TESTEDOCUMENTI.BLOCCATO = '0'
AND DATADOC BETWEEN @dataa AND @datada
AND CODCLIFOR LIKE '%C%'
AND TESTEDOCUMENTI.TIPODOC = 'PCL'
GROUP BY
DSCCONTO1, TABCATEGORIE.DESCRIZIONE, TESTEDOCUMENTI.NUMERODOC
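The query above returns one row per document and category, but it never counts how often each category occurs. One possible approach, sketched here, is to count the lines per document and category and keep the category whose count equals that document's maximum. The sketch runs against an in-memory SQLite database through Python's sqlite3 module so that it is self-contained; the table doc_lines and its columns are hypothetical stand-ins for the joined result, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE doc_lines (customer TEXT, doc_number INTEGER, category TEXT)"
)

# Hypothetical sample: document 10 has 4 FRUIT lines and 6 ICE CREAM lines.
rows = [("CUSTOMER_1", 10, "FRUIT")] * 4 + [("CUSTOMER_1", 10, "ICE CREAM")] * 6
conn.executemany("INSERT INTO doc_lines VALUES (?, ?, ?)", rows)

MOST_USED_SQL = """
WITH counts AS (
    -- one row per document and category, with the number of lines
    SELECT customer, doc_number, category, COUNT(*) AS n
    FROM doc_lines
    GROUP BY customer, doc_number, category
)
SELECT c.customer, c.doc_number, c.category
FROM counts AS c
WHERE c.n = (SELECT MAX(c2.n)          -- highest count within the same document
             FROM counts AS c2
             WHERE c2.customer = c.customer
               AND c2.doc_number = c.doc_number)
ORDER BY c.customer, c.doc_number, c.category
"""

most_used = conn.execute(MOST_USED_SQL).fetchall()
print(most_used)
```

If two categories tie for the highest count, this returns both rows. The same shape (a common table expression plus a correlated MAX) should be expressible in SQL Server 2008, which supports common table expressions, but that has not been verified against the schema in the question.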

Related

Fetch Data that do not exist in a table from four related tables

I have four tables in my Database which are related to each other.
document_category(document_category_id, document_category)
document_type(document_type_id, document_category.document_category_id, document_type)
student(student_id, f_name, l_name, ...other_columns)
student_document(id, student.student_id, document_type.document_type_id, file)
document_category, document_type, student and student_document
Table student_document stores uploaded documents. I want a query to display a list of documents that a student did not upload.
I have tried
(SELECT document_type FROM document_category JOIN document_type ON document_category.document_category_id = document_type.document_category_id
) LEFT JOIN(SELECT FILE FROM student_document JOIN student ON student.student_id = student_document.student_id) ON document_type.document_type_id = student_document.document_type_id
And I get an error
#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'LEFT JOIN(
SELECT FILE
FROM
student_document
JOIN student ON stud...' at line 8
I also tried this
SELECT * FROM document_type A LEFT JOIN student_document B ON A.document_type_id = B.document_type_id WHERE B.document_type_id is null
which gives me results, but I cannot get data for a specific student.
The last one I tried is
SELECT student.email, student_document.file, document_type, document_category FROM student, document_type, document_category, student_document WHERE NOT EXISTS(SELECT * FROM student_document WHERE student_id = 'M054/T19' AND document_type_id ='20') AND student.student_id = student_document.student_id AND document_type.document_category_id = document_category.document_category_id
which gives undesirable results; it is not what I want.
DECLARE @dc TABLE(dc_id Int, ctg VarChar(30));
INSERT INTO @dc VALUES (3,'Admission'),(5,'Payment');
DECLARE @dt TABLE (dt_id Int, dc_id Int, dtp VarChar(30));
INSERT INTO @dt VALUES (27,3, 'Admission Offer'),
(28,3,'Acceptance Letter');
DECLARE @s TABLE(s_id Int, f_name VarChar(30))
INSERT INTO @s VALUES (1, 'Marco'), (2, 'Mike')
DECLARE @sd TABLE (sd_id Int, s_id Int, dt_id Int, [file] VarChar(30))
INSERT INTO @sd VALUES (10,1,27,'File01');
-- All expected documents
SELECT dc.dc_id, ctg, dt.dt_id, dt.dtp, s.s_id, f_name FROM @dc dc
JOIN @dt dt ON dc.dc_id = dt.dc_id
CROSS APPLY @s s
dc_id | ctg       | dt_id | dtp               | s_id | f_name
3     | Admission | 27    | Admission Offer   | 1    | Marco
3     | Admission | 27    | Admission Offer   | 2    | Mike
3     | Admission | 28    | Acceptance Letter | 1    | Marco
3     | Admission | 28    | Acceptance Letter | 2    | Mike
-- Provided documents
SELECT dc.dc_id, ctg, dt.dt_id, dt.dtp, s.s_id, f_name FROM @dc dc
JOIN @dt dt ON dc.dc_id = dt.dc_id
JOIN @sd sd ON sd.dt_id = dt.dt_id
JOIN @s s ON s.s_id = sd.s_id
dc_id | ctg       | dt_id | dtp               | s_id | f_name
3     | Admission | 27    | Admission Offer   | 1    | Marco
-- Subtracting Set from Set
SELECT dc.dc_id, ctg, dt.dt_id, dt.dtp, s.s_id, f_name FROM @dc dc
JOIN @dt dt ON dc.dc_id = dt.dc_id
CROSS APPLY @s s
EXCEPT
SELECT dc.dc_id, ctg, dt.dt_id, dt.dtp, s.s_id, f_name FROM @dc dc
JOIN @dt dt ON dc.dc_id = dt.dc_id
JOIN @sd sd ON sd.dt_id = dt.dt_id
JOIN @s s ON s.s_id = sd.s_id
dc_id | ctg       | dt_id | dtp               | s_id | f_name
3     | Admission | 27    | Admission Offer   | 2    | Mike
3     | Admission | 28    | Acceptance Letter | 1    | Marco
3     | Admission | 28    | Acceptance Letter | 2    | Mike
This code worked perfectly. The part of the query before WHERE NOT EXISTS pairs every student with every document type, which gives the list of documents each student has to upload. The subquery retrieves the documents a student has already uploaded.
So the query returns the required documents that are not present in the list of uploaded documents.
SELECT sp.student_id, do.document_type_id, do.document_type
FROM student_profile sp
CROSS JOIN document_type do
WHERE NOT EXISTS (SELECT sd.student_id, sd.document_type_id, d.document_type FROM student_document sd
INNER JOIN document_type d ON sd.document_type_id = d.document_type_id
WHERE do.document_type_id = sd.document_type_id
AND sp.student_id = sd.student_id );
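To make the CROSS JOIN plus NOT EXISTS pattern concrete, here is a minimal sketch run against an in-memory SQLite database with Python's sqlite3 module. It reuses the sample rows from the table-variable answer above (Marco, Mike, document types 27 and 28) and the table names from the question. The optional student filter and the helper name missing_documents are additions for illustration, not part of either answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, f_name TEXT);
CREATE TABLE document_type (document_type_id INTEGER PRIMARY KEY, document_type TEXT);
CREATE TABLE student_document (
    id INTEGER PRIMARY KEY,
    student_id INTEGER,
    document_type_id INTEGER,
    file TEXT
);
INSERT INTO student VALUES (1, 'Marco'), (2, 'Mike');
INSERT INTO document_type VALUES (27, 'Admission Offer'), (28, 'Acceptance Letter');
INSERT INTO student_document VALUES (10, 1, 27, 'File01');
""")

MISSING_SQL = """
SELECT s.student_id, s.f_name, dt.document_type_id, dt.document_type
FROM student AS s
CROSS JOIN document_type AS dt            -- every student x every required document
WHERE NOT EXISTS (                        -- keep only pairs with no upload
    SELECT 1
    FROM student_document AS sd
    WHERE sd.student_id = s.student_id
      AND sd.document_type_id = dt.document_type_id
)
  AND (:student_id IS NULL OR s.student_id = :student_id)
ORDER BY s.student_id, dt.document_type_id
"""

def missing_documents(student_id=None):
    """Return missing documents for one student, or for all if student_id is None."""
    return conn.execute(MISSING_SQL, {"student_id": student_id}).fetchall()

all_missing = missing_documents()
marco_missing = missing_documents(1)
print(all_missing)
print(marco_missing)
```

Passing a student id narrows the result to that student, which is the part the LEFT JOIN ... IS NULL attempt in the question could not do, because it never brought the student table into the query.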

Function to return many values in one column based on multiple queries

I am about to build a notification feature.
The app is a car ads website.
The dealer inserts car ads.
The visitor can save searches as a string (URL):
---------------------------------------------------------
saved_search_id|visitor_id |search_url
---------------------------------------------------------
0 | 1 |type=0&price_max=10000&color=red
1 | 1 |type=2&price_max=15000&color=black
2 | 2 |type=3&price_max=20000&color=white
When the dealer inserts a new car, I parse all saved searches into SQL queries:
//array(arrays(saved_search_id, saved_search_query))
array(
array(0, "EXISTS(SELECT car_id FROM Car WHERE type=0 AND price <= 10000 AND color = 'red')"),
array(1, "EXISTS(SELECT car_id FROM Car WHERE type=2 AND price <= 15000 AND color = 'black')"),
array(2, "EXISTS(SELECT car_id FROM Car WHERE type=3 AND price <= 20000 AND color = 'white')")
)
For each saved_search_query I check whether the new car is included in the search result. If it is, I send an email to notify the visitor.
I can't figure out how to build one query that returns the relevant saved_search_id values … instead of running all the queries one by one (there are thousands of saved searches).
Below is the closest expression to what I am trying to translate:
CREATE FUNCTION get_saved_search_id(query, id){
if(query){
return id;
}
}
SELECT get_saved_search_id('EXISTS(SELECT car_id FROM Car WHERE type=0 AND price <= 10000 AND color = ''red'')', 0)
UNION
SELECT get_saved_search_id('EXISTS(SELECT car_id FROM Car WHERE type=2 AND price <= 15000 AND color = ''black'')', 1)
UNION
SELECT get_saved_search_id('EXISTS(SELECT car_id FROM Car WHERE type=3 AND price <= 20000 AND color = ''white'')', 2)
You could potentially do it by using a CROSS JOIN and generating a humongous WHERE/OR clause (instead of your EXISTS), with one condition for each saved_search_id, as follows:
SELECT saved_search_id,
visitor_id,
car_id
FROM searches a
CROSS JOIN cars b
-- generated WHERE clause below based on saved_search_id + search_url column
WHERE (saved_search_id = 0 AND type = 0 AND price <= 10000 AND color = 'red')
OR (saved_search_id = 1 AND type = 2 AND price <= 15000 AND color = 'black')
OR (saved_search_id = 2 AND type = 3 AND price <= 20000 AND color = 'white')
EDIT: add in a filter on inserted car id (10, for example)
SELECT saved_search_id,
visitor_id,
car_id
FROM searches a
CROSS JOIN cars b
-- generated WHERE clause below based on saved_search_id + search_url column
WHERE (
(saved_search_id = 0 AND type = 0 AND price <= 10000 AND color = 'red')
OR (saved_search_id = 1 AND type = 2 AND price <= 15000 AND color = 'black')
OR (saved_search_id = 2 AND type = 3 AND price <= 20000 AND color = 'white')
)
AND car_id = 10 --<-- inserted car id
The dealer inserts a new car with a particular @type, @price, and @color. Use these parameters to find the related searches:
select *
from searches
where search_url like concat('type=', @type, '%color=', @color)
and cast(substring_index(search_url, '&price_max=', -1) as signed) >= @price;
(Casting to an integer in MySQL takes the leading number only and ignores the rest of the string.)
And if you want to use the already inserted car row instead:
select *
from searches s
where exists
(
select null
from cars c
where c.id = 12345 -- the inserted row's ID
and s.search_url like concat('type=', c.type, '%color=', c.color)
and cast(substring_index(s.search_url, '&price_max=', -1) as signed) >= c.price
);
Of course you can also write a function for this accepting the search string and the car parameters - or the car ID for an already inserted car row. With the latter you'd have something like:
select *
from searches s
where search_matches_car(s.search_url, 12345);
The same with a join for the case you want to see car information, too:
select *
from cars c
join searches s on search_matches_car(s.search_url, c.id)
where c.id = 12345;
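The answer above uses search_matches_car without showing a body. As a rough illustration of what such a check has to do, the sketch below performs the match in application code (Python) rather than in a stored function, which is a different technique from the one the answer proposes. It assumes every saved search uses only the three parameters from the sample (type, price_max, color). The dictionary layout of the car and the helper matching_search_ids are hypothetical.

```python
from urllib.parse import parse_qsl

# Sample rows from the question: (saved_search_id, visitor_id, search_url)
saved_searches = [
    (0, 1, "type=0&price_max=10000&color=red"),
    (1, 1, "type=2&price_max=15000&color=black"),
    (2, 2, "type=3&price_max=20000&color=white"),
]

def search_matches_car(search_url, car):
    """Return True if the car satisfies every criterion present in the URL."""
    criteria = dict(parse_qsl(search_url))
    if "type" in criteria and int(criteria["type"]) != car["type"]:
        return False
    if "price_max" in criteria and car["price"] > int(criteria["price_max"]):
        return False
    if "color" in criteria and criteria["color"] != car["color"]:
        return False
    return True

def matching_search_ids(searches, car):
    """Collect the ids of all saved searches that the car matches."""
    return [sid for sid, _visitor_id, url in searches if search_matches_car(url, car)]

new_car = {"type": 2, "price": 14000, "color": "black"}
matched = matching_search_ids(saved_searches, new_car)
print(matched)
```

Parsing the URL into named criteria avoids the position-dependent LIKE pattern, at the cost of reading all saved searches into the application. With thousands of saved searches that may or may not be acceptable; storing the criteria in separate indexed columns would let the database do the filtering instead.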

Conditional condition in ON clause

I am trying to apply a conditional condition inside ON clause of a LEFT JOIN. What I am trying to achieve is somewhat like this:
Pseudo Code
SELECT * FROM item AS i
LEFT JOIN sales AS s ON i.sku = s.item_no
AND (some condition)
AND (
IF (s.type = 0 AND s.code = 'me')
ELSEIF (s.type = 1 AND s.code = 'my-group')
ELSEIF (s.type = 2)
)
I want the query to return the row, if it matches any one of the conditions (Edit: and if it matches one, should omit the rest for the same item).
Sample Data
Sales
item_no | type | code | price
1 0 me 10
1 1 my-group 12
1 2 14
2 1 my-group 20
2 2 22
3 2 30
4 0 not-me 40
I want the query to return
item_no | type | code | price
1 0 me 10
2 1 my-group 20
3 2 30
Edit: The sales table is used to apply special prices for individual users, user groups, and/or all users.
if type = 0, code contains username. (for a single user)
if type = 1, code contains user-group. (for users in a group)
if type = 2, code contains empty-string (for all users).
Use the following SQL (assuming the sales table has a unique id field, as is usual in Yii):
SELECT * FROM item AS i
LEFT JOIN sales AS s ON i.sku = s.item_no
AND s.id = (
SELECT id FROM sales
WHERE item_no = i.sku
AND (type = 0 AND code = 'me' OR
type = 1 AND code = 'my-group' OR
type = 2)
ORDER BY type
LIMIT 1
)
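The ORDER BY type ... LIMIT 1 idea can be checked against the sample data from the question. The sketch below does that with Python's sqlite3 module and an in-memory database. The item table with a single sku column, the autoincrementing id on sales, and the parameter names user and user_group are assumptions for illustration; the "(some condition)" placeholder from the question is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (sku INTEGER PRIMARY KEY);
CREATE TABLE sales (
    id INTEGER PRIMARY KEY,
    item_no INTEGER,
    type INTEGER,
    code TEXT,
    price INTEGER
);
INSERT INTO item VALUES (1), (2), (3), (4);
INSERT INTO sales (item_no, type, code, price) VALUES
    (1, 0, 'me', 10),
    (1, 1, 'my-group', 12),
    (1, 2, '', 14),
    (2, 1, 'my-group', 20),
    (2, 2, '', 22),
    (3, 2, '', 30),
    (4, 0, 'not-me', 40);
""")

BEST_PRICE_SQL = """
SELECT i.sku, s.type, s.code, s.price
FROM item AS i
LEFT JOIN sales AS s
       ON s.id = (
           -- lowest type wins: user price, then group price, then everyone
           SELECT s2.id
           FROM sales AS s2
           WHERE s2.item_no = i.sku
             AND (   (s2.type = 0 AND s2.code = :user)
                  OR (s2.type = 1 AND s2.code = :user_group)
                  OR  s2.type = 2)
           ORDER BY s2.type
           LIMIT 1
       )
ORDER BY i.sku
"""

best_prices = conn.execute(
    BEST_PRICE_SQL, {"user": "me", "user_group": "my-group"}
).fetchall()
print(best_prices)
```

Because this is a LEFT JOIN, item 4 still appears, with NULL sales columns, since its only sales row belongs to another user. Switching to an INNER JOIN would drop it and give exactly the three rows listed in the question.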
Try the following:
SELECT *, SUBSTRING_INDEX(GROUP_CONCAT(s.type ORDER BY s.type), ',', 1) AS `type`, SUBSTRING_INDEX(GROUP_CONCAT(s.code ORDER BY s.type), ',', 1) AS `code`, SUBSTRING_INDEX(GROUP_CONCAT(s.price ORDER BY s.type), ',', 1) AS `price`
FROM item AS i
LEFT JOIN sales AS s
ON i.sku = s.item_no AND (SOME CONDITION)
GROUP BY i.sku

Determine ranking with single mysql query

I am selecting a set of items from my table and determining their ranking to display on my page. My code for selecting the items:
<?php
$attra_query = mysqli_query($link, "select * from table WHERE category ='4'");
if (mysqli_num_rows($attra_query) > 0) {
    while ($attra_data = mysqli_fetch_array($attra_query, MYSQLI_ASSOC)) {
?>
In the while loop I determine the ranking for each of those items like so:
SELECT COUNT(mi.location) + 1 rank
FROM table m
LEFT JOIN (
SELECT id,location,country, ROUND(COALESCE(total_rating/total_rating_amount,0),10) rating_per_vote
FROM table WHERE category = '4'
) mi
ON mi.location = m.location
AND mi.country = m.country
AND mi.rating_per_vote > ROUND(COALESCE(m.total_rating/m.total_rating_amount,0),10)
WHERE m.id = '$attra_id';
I figure this is highly inefficient. Is there a way to combine the two queries into a single one so I don't have to run the ranking query for each item separately?
//EDIT
Sample data:
id | location | country | category | total_rating | total_rating_amount
1 berlin DE 4 12 2
2 munich DE 4 9 1
The vote system is 1-10 points. In the sample data, berlin has received a total rating of 12 from 2 votes and munich has received a rating of 9 from 1 vote, so berlin has a rating of 6/10 and munich a rating of 9/10. Munich should therefore be ranked #1.
SELECT COUNT(m.id) rank, m.id
FROM
(SELECT * FROM table WHERE category = '4') m
LEFT JOIN (
SELECT id,location,country, ROUND(COALESCE(total_rating/total_rating_amount,0),10) rating_per_vote
FROM table WHERE category = '4'
) mi
ON (mi.location = m.location
AND mi.country = m.country
AND mi.rating_per_vote > ROUND(COALESCE(m.total_rating/m.total_rating_amount,0),10))
OR mi.id=m.id
GROUP BY m.id
This should do it, I suppose. I don't know if this is the best possible solution.
In MySQL, you can do the ranking using variables. It is a bit hard to tell what you want to rank by from your query, but it would be something like this:
select t.*, (@rn := @rn + 1) as ranking
from table t cross join
(select @rn := 0) vars
where category = '4'
order by rating_per_vote;
If you provide sample data and desired results, it would be possible to refine this solution.
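Using the two sample rows from the question, the following sketch computes the ranking of every item in one query instead of one query per item. It counts, for each row, how many rows have a strictly higher rating per vote. It runs on an in-memory SQLite database via Python's sqlite3 module. The table name places is hypothetical (the question only calls it "table"), and ranking within the same country is an assumption based on the expected result; the question's own query also matches on location.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE places (
    id INTEGER PRIMARY KEY,
    location TEXT,
    country TEXT,
    category INTEGER,
    total_rating INTEGER,
    total_rating_amount INTEGER
);
INSERT INTO places VALUES
    (1, 'berlin', 'DE', 4, 12, 2),
    (2, 'munich', 'DE', 4, 9, 1);
""")

RANKING_SQL = """
WITH rated AS (
    -- 1.0 * forces real division; items without votes get a rating of 0
    SELECT id, location, country,
           COALESCE(1.0 * total_rating / total_rating_amount, 0) AS rating_per_vote
    FROM places
    WHERE category = :category
)
SELECT r.id, r.location, r.rating_per_vote,
       1 + (SELECT COUNT(*)              -- number of better-rated items
            FROM rated AS o
            WHERE o.country = r.country
              AND o.rating_per_vote > r.rating_per_vote) AS ranking
FROM rated AS r
ORDER BY ranking, r.id
"""

rankings = conn.execute(RANKING_SQL, {"category": 4}).fetchall()
print(rankings)
```

Items with equal ratings share a rank under this scheme, which matches the COUNT(...) + 1 logic of the per-item query in the question.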

MySQL: find most recent value for list of subdocuments

I have a collection content that has four columns; id, timestamp, locationID, and authorID. Here is an example of my data; in production, this is tens of millions of rows in length.
id timestamp locationID authorID
1 2012-03-01 11:52:00 1 1
2 2012-03-16 19:56:00 1 2
3 2012-04-02 11:26:00 2 1
4 2012-04-22 11:52:00 2 3
5 2012-05-19 09:48:00 2 2
6 2012-05-30 07:12:00 2 1
7 2012-06-04 19:17:00 1 2
I'd like to collect the list of authorIDs whose most recent content (ordered by timestamp) matched a specific locationID.
The correct values for a query of locationID = 2 would be: [ 1, 3 ], as authorID 1 and 3 were most recently 'seen' at locationID = 2, while authorID 2's most recent content was at locationID 1.
I can certainly execute one query per authorID, but in production the authorID array has a length >100,000. This seems terribly inefficient (especially when each 'subquery' would be hitting this multi-million row content collection), and I'm looking for a better way to extract this data from my dataset, ideally fast enough to be executed on a page render.
Something like this? This is from SQL Server, but I think it should work in MySQL as well.
DECLARE @locationId INT
SET @locationId = 2;
SELECT *
FROM (SELECT AuthorId, Max(TimeStamp) as MaxTimeStamp
FROM Content C
WHERE LocationId = @locationId
GROUP BY AuthorId) AS CBL
LEFT JOIN Content AS C ON CBL.AuthorId = C.AuthorId
AND C.TimeStamp > CBL.MaxTimeStamp
WHERE C.AuthorId IS NULL
For locationId = 2, it returns 1 and 3; and for locationId = 1, it returns 2
Per JW (thanks!), the correct MySQL approach:
SET @locationId := 2;
SELECT *
FROM (SELECT AuthorId, Max(TimeStamp) as MaxTimeStamp
FROM Content C
WHERE LocationId = @locationId
GROUP BY AuthorId) AS CBL
LEFT JOIN Content AS C ON CBL.AuthorId = C.AuthorId
AND C.TimeStamp > CBL.MaxTimeStamp
WHERE C.AuthorId IS NULL
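For comparison, here is a sketch of a closely related variant, not the query above: it first finds each author's latest timestamp across all locations and then keeps the authors whose latest row is at the requested location. It runs the seven sample rows from the question through an in-memory SQLite database using Python's sqlite3 module. The helper name authors_last_seen_at is hypothetical, and timestamps are stored as ISO-formatted text so that MAX compares them chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content ("
    "id INTEGER PRIMARY KEY, timestamp TEXT, locationID INTEGER, authorID INTEGER)"
)
conn.executemany(
    "INSERT INTO content VALUES (?, ?, ?, ?)",
    [
        (1, "2012-03-01 11:52:00", 1, 1),
        (2, "2012-03-16 19:56:00", 1, 2),
        (3, "2012-04-02 11:26:00", 2, 1),
        (4, "2012-04-22 11:52:00", 2, 3),
        (5, "2012-05-19 09:48:00", 2, 2),
        (6, "2012-05-30 07:12:00", 2, 1),
        (7, "2012-06-04 19:17:00", 1, 2),
    ],
)

LATEST_AT_LOCATION_SQL = """
SELECT c.authorID
FROM content AS c
JOIN (SELECT authorID, MAX(timestamp) AS latest   -- latest row per author
      FROM content
      GROUP BY authorID) AS m
  ON m.authorID = c.authorID
 AND m.latest = c.timestamp
WHERE c.locationID = ?                            -- keep authors last seen here
ORDER BY c.authorID
"""

def authors_last_seen_at(location_id):
    """Return the authorIDs whose most recent content is at location_id."""
    return [row[0] for row in conn.execute(LATEST_AT_LOCATION_SQL, (location_id,))]

print(authors_last_seen_at(2))
print(authors_last_seen_at(1))
```

On tens of millions of rows, either form would likely depend on a composite index covering authorID and timestamp; that is a guess about the production table, not something measured here.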
Try a derived subquery:
SELECT
*
FROM content as c
INNER JOIN(
SELECT
MAX(id) as ID
FROM content
WHERE locationID = 2
GROUP BY authorID
) as t on t.ID = c.id