Sum of long vectors in SQL - mysql

I know it is easy to compute a sparse dot product in SQL, but what is the best way to do a sum (for very long vectors)?
A join is not enough because if a coordinate is filled in one vector but not in the other, it will be ignored.
Thus, I computed the sum with a PHP loop... and that was a pretty stupid idea.
I'm currently thinking of filling the missing 0's in order to prepare an inner join, but is there a shortcut (like an outer join converting NULL to 0)?
Edit. Here is the structure of my table of vectors:
CREATE TABLE `eigaki_vectors` (
`name` varchar(2) COLLATE utf8_unicode_ci NOT NULL,
`i1` int(10) NOT NULL,
`i2` int(10) NOT NULL,
`value` double NOT NULL,
UNIQUE KEY `key` (`name`,`i1`,`i2`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
In this particular case, a vector has composite indices: v_{i_1, i_2}, but this has nothing to do with the problem.
I expected to do something like (thanks xQbert):
SELECT v1.i1, v1.i2, isNull(v1.value, 0) + isNull(v2.value, 0)
FROM eigaki_vectors v1 FULL OUTER JOIN eigaki_vectors v2
ON v1.i1 = v2.i1 AND v1.i2 = v2.i2
AND v1.name = 'a' AND v2.name = 'b'
to add vectors a and b. But FULL OUTER JOIN doesn't exist on MySQL, and I think I'm clumsy with the name column. Any ideas?

COALESCE(Field, OtherField, AnotherField, 0)
COALESCE works like a chain of if/else checks: it returns the first non-NULL value from its list of arguments.
ISNULL does the same thing for exactly two values in SQL Server; in MySQL the two-argument function is IFNULL:
IFNULL(Field, 0)

I managed to get something, thanks to the snippet provided in MySQL: Union of a Left Join with a Right Join:
SELECT IFNULL(v1.value, 0) + IFNULL(v2.value, 0) FROM
(
SELECT i1, i2 FROM eigaki_vectors WHERE name = 'a'
UNION
SELECT i1, i2 FROM eigaki_vectors WHERE name = 'b'
) indices
LEFT OUTER JOIN eigaki_vectors v1 ON indices.i1 = v1.i1 AND indices.i2 = v1.i2 AND v1.name = 'a'
LEFT OUTER JOIN eigaki_vectors v2 ON indices.i1 = v2.i1 AND indices.i2 = v2.i2 AND v2.name = 'b'
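For anyone who wants to try the union-of-indices approach without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for MySQL here (it also supports IFNULL and LEFT OUTER JOIN), the column types are simplified, the sample values are made up, and the index columns and ORDER BY are added to the SELECT only to make the output readable.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eigaki_vectors ("
    " name TEXT NOT NULL, i1 INTEGER NOT NULL, i2 INTEGER NOT NULL,"
    " value REAL NOT NULL, UNIQUE (name, i1, i2))"
)
# Made-up sparse vectors: 'a' and 'b' only share the coordinate (1, 2).
conn.executemany(
    "INSERT INTO eigaki_vectors VALUES (?, ?, ?, ?)",
    [
        ("a", 1, 1, 1.0),
        ("a", 1, 2, 2.0),
        ("b", 1, 2, 10.0),
        ("b", 2, 1, 20.0),
    ],
)

query = """
SELECT indices.i1, indices.i2,
       IFNULL(v1.value, 0) + IFNULL(v2.value, 0) AS total
FROM (
    -- every coordinate that is filled in at least one of the two vectors
    SELECT i1, i2 FROM eigaki_vectors WHERE name = 'a'
    UNION
    SELECT i1, i2 FROM eigaki_vectors WHERE name = 'b'
) AS indices
LEFT OUTER JOIN eigaki_vectors v1
    ON indices.i1 = v1.i1 AND indices.i2 = v1.i2 AND v1.name = 'a'
LEFT OUTER JOIN eigaki_vectors v2
    ON indices.i1 = v2.i1 AND indices.i2 = v2.i2 AND v2.name = 'b'
ORDER BY indices.i1, indices.i2
"""
vector_sum = conn.execute(query).fetchall()
print(vector_sum)
```

Coordinates present in only one vector keep their value because the missing side is turned into 0 by IFNULL.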

Related

Error Code: 1054. Unknown column in 'on clause'

The query is meant to return the percentage based on value from one table being divided by the value of another table. However, there is something wrong and I am missing it.
Similar problems noted on the board looked related to JOIN, but that did not seem to be the problem here: when I tried an explicit join, MySQL basically told me I was an idiot. Either I did that wrong or that is not the problem.
SELECT (pathogenPop / locationpop) as PercentInfected
FROM (
(SELECT apinfectcount.APInfectCountInfected
as pathogenPop, apinfectcount.APInfectCountLocation
FROM apstart.apinfectcount
GROUP BY apinfectcount.APInfectCountLocation) as pathogenPop
Inner JOIN
(SELECT apcountrypop.apcountrypopPopulation
as locationpop, apcountrypop.apcountrypopCountry
FROM apstart.apcountrypop
GROUP BY apcountrypop.apcountrypopCountry)
as locationpop
on apinfectcount.APInfectCountLocation = apcountrypop.apcountrypopCountry
and apinfectcount.APInfectCountWeek = 23);
Table Schema: apcountrypop
idapcountrypop INT(11)
apcountrypopCountry VarChar(45)
apcountrypopPopulation FLOAT
Table Schema: apinfectcount
idAPInfectCount INT(11)
APInfectCountLocation VarChar(45)
APInfectCountOutBreak VarChar(45)
APInfectCountPathogen VarChar(45)
APInfectCountInfected FLOAT
APInfectCountDead FLOAT
APInfectCountWeek VarChar(45)
If it worked --
it would assign apinfectcount.APInfectCountInfected to pathogenPop
and apcountrypop.apcountrypopPopulation to locationpop
for the values where the locations are the same(apinfectcount.APInfectCountLocation = apcountrypop.apcountrypopCountry)
then it would return the apinfectcount value divided by the apcountrypop value to give the percentage.
In this specific example I only have sample data and just want to return one value, so I added the week condition to test the logic and syntax.
I appreciate the help.
You have assigned the aliases pathogenPop and locationpop to the derived tables, so
you need pathogenPop.APInfectCountLocation = locationpop.apcountrypopCountry
and pathogenPop.APInfectCountWeek = 23 in the ON clause:
SELECT (pathogenPop / locationpop) as PercentInfected
FROM (
(SELECT apinfectcount.APInfectCountInfected
as pathogenPop, apinfectcount.APInfectCountLocation,
apinfectcount.APInfectCountWeek -- selected so the ON clause can reference it
FROM apstart.apinfectcount
GROUP BY apinfectcount.APInfectCountLocation) as pathogenPop
Inner JOIN
(SELECT apcountrypop.apcountrypopPopulation
as locationpop, apcountrypop.apcountrypopCountry
FROM apstart.apcountrypop
GROUP BY apcountrypop.apcountrypopCountry)
as locationpop
on pathogenPop.APInfectCountLocation = locationpop.apcountrypopCountry
and pathogenPop.APInfectCountWeek = 23) T;
and also a table alias for the outer FROM(..) T
I don't have the database to test against so I'm not 100% certain this will run, but would the following query not be a bit simpler?
SELECT (apinfectcount.APInfectCountInfected / apcountrypop.apcountrypopPopulation) as PercentInfected, apinfectcount.APInfectCountLocation
FROM apinfectcount
INNER JOIN apcountrypop ON apcountrypop.apcountrypopCountry = apinfectcount.APInfectCountLocation
WHERE apinfectcount.APInfectCountWeek = 23
GROUP BY apinfectcount.APInfectCountLocation
And I assume there is only one location record per location in each table?
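A quick way to check the simpler join is to run it against sample data. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the country names and numbers are invented, the tables are trimmed to the columns the query needs, the GROUP BY is left out because each location has one row per week in this sample, and the week is passed as a string because the column is a VarChar in the posted schema.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE apcountrypop (
    idapcountrypop INTEGER PRIMARY KEY,
    apcountrypopCountry TEXT,
    apcountrypopPopulation REAL
);
CREATE TABLE apinfectcount (
    idAPInfectCount INTEGER PRIMARY KEY,
    APInfectCountLocation TEXT,
    APInfectCountInfected REAL,
    APInfectCountWeek TEXT
);
-- invented sample rows
INSERT INTO apcountrypop VALUES (1, 'Freedonia', 1000.0), (2, 'Sylvania', 500.0);
INSERT INTO apinfectcount VALUES
    (1, 'Freedonia', 250.0, '23'),
    (2, 'Sylvania', 50.0, '23'),
    (3, 'Freedonia', 900.0, '24');
""")

query = """
SELECT ic.APInfectCountLocation,
       ic.APInfectCountInfected / cp.apcountrypopPopulation AS PercentInfected
FROM apinfectcount AS ic
INNER JOIN apcountrypop AS cp
    ON cp.apcountrypopCountry = ic.APInfectCountLocation
WHERE ic.APInfectCountWeek = ?
ORDER BY ic.APInfectCountLocation
"""
week_23 = conn.execute(query, ("23",)).fetchall()
print(week_23)
```

Note that the result is a fraction; multiply by 100 if an actual percentage is wanted.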
There is an issue in the query. The scope of the apinfectcount.APInfectCountLocation and apcountrypop.apcountrypopCountry columns is limited to their subqueries, so you cannot reference them by those names outside the subqueries (in the ON clause).
You can check out these docs on subqueries: https://learn.microsoft.com/en-us/sql/relational-databases/performance/subqueries?view=sql-server-2017
Refer to the code below.
SELECT (countInfected / countrypopulation) as PercentInfected
FROM (
(SELECT apinfectcount.APInfectCountInfected
as countinfected, apinfectcount.APInfectCountLocation, APInfectCountWeek as
countweek
FROM apstart.apinfectcount
GROUP BY apinfectcount.APInfectCountLocation) as pathogenPop
Inner JOIN
(SELECT apcountrypop.apcountrypopPopulation
as countrypopulation, apcountrypop.apcountrypopCountry
FROM apstart.apcountrypop
GROUP BY apcountrypop.apcountrypopCountry)
as locationpop
on pathogenPop.APInfectCountLocation = locationpop.apcountrypopCountry
and pathogenPop.countweek= 23);
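The scoping rule behind the "Unknown column" error can be reproduced in a few lines. This is a sketch in Python's sqlite3 module, not MySQL: the table and column names (infect_count, country_pop and so on) are shortened, hypothetical stand-ins, and SQLite words the error as "no such column" rather than error 1054, but the cause is the same: once a subquery has an alias, the outer query only sees that alias.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE infect_count (location TEXT, infected REAL);
CREATE TABLE country_pop (country TEXT, people REAL);
INSERT INTO infect_count VALUES ('Freedonia', 250.0);
INSERT INTO country_pop VALUES ('Freedonia', 1000.0);
""")

# The ON clause is filled in with either the base table names or the aliases.
template = """
SELECT p.infected / l.people
FROM (SELECT infected, location FROM infect_count) AS p
INNER JOIN (SELECT people, country FROM country_pop) AS l
    ON {left}.location = {right}.country
"""

# Base table names are not visible outside the derived tables, so this fails.
try:
    conn.execute(template.format(left="infect_count", right="country_pop"))
    base_names_error = None
except sqlite3.OperationalError as exc:
    base_names_error = str(exc)

# Using the derived-table aliases works.
fraction = conn.execute(template.format(left="p", right="l")).fetchone()[0]
print(base_names_error, fraction)
```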

How to get values through connected tables?

I have two tables: the first one contains comments, and the second contains the comment ids together with the id of the album on which each comment was left.
> CREATE TABLE `review` (`id` VARCHAR(32) NOT NULL,
> `user_id` VARCHAR(32) NOT NULL,`comment` MEDIUMTEXT NOT NULL,
> PRIMARY KEY (`id`) )
> CREATE TABLE `review_album` (`review_id` VARCHAR(32) NOT NULL,
> `album_id` VARCHAR(32) NOT NULL, PRIMARY KEY (`review_id`,
> `album_id`), INDEX `review_album_review_idx` (`review_id`) )
I tried this way:
SELECT * from review_album JOIN review WHERE album_id = '300001'
But I got the result twice.
How can I get comment text for a specific album_id?
The general syntax is:
SELECT column-names
FROM table-name1 JOIN table-name2
ON column-name1 = column-name2
WHERE condition
The general syntax with INNER is:
SELECT column-names
FROM table-name1 INNER JOIN table-name2
ON column-name1 = column-name2
WHERE condition
Note: The INNER keyword is optional: it is the default as well as the most commonly used JOIN operation.
Reference: https://www.dofactory.com/sql/join
Try with an inner join:
SELECT *
FROM review_album
JOIN review ON review_album.review_id=review.id
WHERE album_id = '300001'
You have forgotten the ON condition. Every time you use a join you should specify the join condition; otherwise you get every possible combination of rows from the two tables.
Here is the solution:
SELECT *
FROM review_album RA
JOIN review R ON RA.review_id = R.id
WHERE album_id = '300001'
Here is the documentation for joins: https://www.w3schools.com/sql/sql_join.asp
Try using this:
SELECT *
FROM review_album ra
JOIN review r ON ra.review_id = r.id
WHERE album_id = '300001'
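To see why the missing ON condition duplicates rows, here is a small sketch with Python's sqlite3 module standing in for MySQL. The two reviews and their ids are invented sample data; only the table and column names come from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE review (
    id TEXT PRIMARY KEY,
    user_id TEXT NOT NULL,
    comment TEXT NOT NULL
);
CREATE TABLE review_album (
    review_id TEXT NOT NULL,
    album_id TEXT NOT NULL,
    PRIMARY KEY (review_id, album_id)
);
-- invented sample rows: one review per album
INSERT INTO review VALUES ('r1', 'u1', 'Great album'), ('r2', 'u2', 'Not my taste');
INSERT INTO review_album VALUES ('r1', '300001'), ('r2', '300002');
""")

# No ON condition: the matching review_album row is paired with EVERY review.
without_on = conn.execute("""
    SELECT review.comment
    FROM review_album JOIN review
    WHERE album_id = '300001'
    ORDER BY review.id
""").fetchall()

# With the ON condition only the review that belongs to the album remains.
with_on = conn.execute("""
    SELECT review.comment
    FROM review_album
    JOIN review ON review_album.review_id = review.id
    WHERE album_id = '300001'
""").fetchall()

print(without_on, with_on)
```

With two reviews in the table, the first query returns two rows for the album and the second returns one.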

SQL Join For Exact Result

I have the following table in my database. Its purpose is to hold colour sets. I.e. [red + black], [blue + green + yellow], etc.
CREATE TABLE `df_productcolours`
(
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_colourSet` int(11) NOT NULL,
`id_colour` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQUE` (`id_colourSet`,`id_colour`),
KEY `idx_colourSet` (`id_colourSet`),
KEY `idx_colour_id` (`id_colour`),
CONSTRAINT `fk_colourid` FOREIGN KEY (`id_colour`) REFERENCES `df_lu_color` (`id`)
ON DELETE NO ACTION ON UPDATE NO ACTION
)
I made a stored proc that takes an array of id_colour integers as input, and returns a colour set id. What it's meant to do is return the set that contains those colours, and ONLY those colours that are provided as input. What it's actually doing is returning sets that contain the colours requested plus some others.
This is the code that I have so far:
SET @count = (SELECT COUNT(*) FROM tempTable_inputColours);
SELECT A.id_colourSet
FROM df_productcolours AS A
INNER JOIN tempTable_inputColours AS B
ON A.id_colour = B.id_colour
GROUP BY A.id_colourSet
HAVING COUNT(A.id_colour) = @count
AND COUNT(B.id_colour) = @count;
I have a feeling the issue may be with the way I'm joining, but I just can't seem to get it. Any help would be appreciated. Thanks.
You can try this:
SELECT A.id_colourSet
FROM df_productcolours AS A
INNER JOIN tempTable_inputColours AS B
ON A.id_colour = B.id_colour
WHERE A.id_colourSet IN (SELECT id_colour FROM tempTable_inputColours)
AND A.id_colour IN (SELECT id_colour FROM tempTable_inputColours)
EDIT
SELECT A.id_colourSet
FROM df_productcolours AS A
INNER JOIN tempTable_inputColours AS B
ON A.id_colour = B.id_colour
WHERE A.id_colourSet =(SELECT SUM(id_colour) FROM tempTable_inputColours)
I think I solved it myself after a few days of punishment. Here's the code:
SET clrCount = (SELECT COUNT(*) FROM _tmp_colourset);
-- The first half of the query does an inner join,
-- it will return all sets that have ANY of our requested colours.
-- But the HAVING condition will make it return sets that have AT LEAST all of the colours we are requesting.
-- So at this point we have all the super-sets, if you will.
-- Then, the second half of the query will restrict that further,
-- to only sets that have the same number of colours as we are requesting.
-- And voila :)
-- FIND ALL COLOUR SETS THAT HAVE ALL REQUESTED COLOURS
SET colourSetId = (SELECT A.id_colourSet
FROM df_productcolours AS A
INNER JOIN _tmp_colourset AS B
ON A.id_colour = B.id_colour
GROUP BY A.id_colourSet
HAVING COUNT(A.id_colour) = clrCount
-- FIND ALL COLOUR SETS THAT HAVE EXACTLY N COLOURS
AND A.id_colourSet IN (SELECT A.id_colourSet
FROM df_productcolours AS A
GROUP BY A.id_colourSet
HAVING COUNT(A.id_colour) = clrCount));
Hope it saves someone pulling their hair out.
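For comparison, the same "exactly these colours" test can also be written as a single grouped query. This is an alternative sketch, not the code from the answer above: it uses a LEFT JOIN so that colours outside the request still count towards the size of the set. It runs on Python's sqlite3 module as a stand-in for MySQL, and the table requested_colours and the sample sets are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE df_productcolours (
    id INTEGER PRIMARY KEY,
    id_colourSet INTEGER NOT NULL,
    id_colour INTEGER NOT NULL,
    UNIQUE (id_colourSet, id_colour)
);
-- hypothetical stand-in for the temporary table of requested colours
CREATE TABLE requested_colours (id_colour INTEGER PRIMARY KEY);

-- set 1 = {10, 20}, set 2 = {10, 20, 30} (a super-set), set 3 = {10} (a sub-set)
INSERT INTO df_productcolours (id_colourSet, id_colour) VALUES
    (1, 10), (1, 20),
    (2, 10), (2, 20), (2, 30),
    (3, 10);
INSERT INTO requested_colours VALUES (10), (20);
""")

query = """
SELECT A.id_colourSet
FROM df_productcolours AS A
LEFT JOIN requested_colours AS B ON A.id_colour = B.id_colour
GROUP BY A.id_colourSet
-- COUNT(*) is the size of the set, COUNT(B.id_colour) the number of matches
HAVING COUNT(*) = (SELECT COUNT(*) FROM requested_colours)
   AND COUNT(B.id_colour) = (SELECT COUNT(*) FROM requested_colours)
"""
exact_sets = conn.execute(query).fetchall()
print(exact_sets)
```

Only set 1 survives: set 2 is too large and set 3 does not contain every requested colour.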

Multiple entries in syscolumns for each column of type 'geography'

First, I have created a table called Placemarks containing a column of type 'geography'.
CREATE TABLE [dbo].[Placemarks](
[ID] [int] NOT NULL,
[Name] [nvarchar](50) NOT NULL,
[Location] [geography] NOT NULL,
CONSTRAINT [PK_Placemarks]
PRIMARY KEY CLUSTERED([ID] ASC)
)
Then, I use the following query in a stored procedure to get a list of all columns in the table with their data types.
SELECT
b.name, c.name as TypeName, b.length, b.isnullable, b.collation, b.xprec, b.xscale
FROM sysobjects a
inner join syscolumns b on a.id = b.id
inner join systypes c on b.xtype = c.xtype and c.name <> 'sysname'
WHERE a.id = object_id(N'[dbo].[Placemarks]')
and OBJECTPROPERTY(a.id, N'IsUserTable') = 1
ORDER BY b.colId
The result contains extra rows for the Location column: besides geography, it is also listed with TypeName geometry and hierarchyid.
I am using this query in a stored procedure and need to get a single row for each column in my Placemarks table. I could filter out rows with TypeName = geometry or hierarchyid.
But I may use the geometry datatype in the future and want the query to be forward compatible. Any other ideas?
The additional rows are being brought in by the join on systypes. Changing the join condition to
inner join systypes c on b.xtype = c.xtype and b.xusertype=c.xusertype
seems to work. You should use sys.columns, sys.types etc. instead of the deprecated syscolumns, systypes backward compatibility views.
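The effect of the extra join condition can be illustrated with a toy model. The sketch below does not query SQL Server; it builds two hypothetical tables (toy_columns and toy_types) in SQLite through Python's sqlite3 module, with type ids chosen for illustration so that the three CLR types share one xtype but have different xusertype values.

```python
import sqlite3

# Toy model only: these tables imitate syscolumns/systypes, they are not the real views.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE toy_types (name TEXT, xtype INTEGER, xusertype INTEGER);
CREATE TABLE toy_columns (name TEXT, xtype INTEGER, xusertype INTEGER);
-- illustrative ids: the three CLR types share xtype 240
INSERT INTO toy_types VALUES
    ('int', 56, 56),
    ('hierarchyid', 240, 128),
    ('geometry', 240, 129),
    ('geography', 240, 130);
INSERT INTO toy_columns VALUES ('ID', 56, 56), ('Location', 240, 130);
""")

# Joining on xtype alone matches Location against all three CLR types.
xtype_only = conn.execute("""
    SELECT b.name, c.name
    FROM toy_columns AS b
    INNER JOIN toy_types AS c ON b.xtype = c.xtype
    ORDER BY b.name, c.name
""").fetchall()

# Adding xusertype to the join condition leaves one row per column.
with_usertype = conn.execute("""
    SELECT b.name, c.name
    FROM toy_columns AS b
    INNER JOIN toy_types AS c
        ON b.xtype = c.xtype AND b.xusertype = c.xusertype
    ORDER BY b.name, c.name
""").fetchall()

print(xtype_only)
print(with_usertype)
```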
I would recommend using the newer sys catalog views rather than the old sysobjects and similar views. The old ones are deprecated backward-compatibility views and may be removed in a future version of SQL Server.
With this query, you should get your desired result:
SELECT
c.name 'ColName',
ty.Name 'TypeName',
c.max_length, c.is_nullable, c.collation_name, c.precision, c.scale
FROM
sys.tables t
INNER JOIN
sys.columns c ON t.object_id = c.object_id
INNER JOIN
sys.types ty ON c.user_type_id = ty.user_type_id
WHERE
t.name = 'Placemarks'
At least in my case, I now get:
ColName   TypeName   max_length  is_nullable  collation_name        precision  scale
ID        int        4           0            NULL                  10         0
Name      nvarchar   100         0            Latin1_General_CI_AS  0          0
Location  geography  -1          0            NULL                  0          0

Mysql query to check if all sub_items of a combo_item are active

I am trying to write a query that looks through all combo_items and only returns the ones where all sub_items that it references have Active=1.
I think I should be able to count how many sub_items there are in a combo_item total and then compare it to how many are Active, but I am failing pretty hard at figuring out how to do that...
My table definitions:
CREATE TABLE `combo_items` (
`c_id` int(11) NOT NULL,
`Label` varchar(20) NOT NULL,
PRIMARY KEY (`c_id`)
)
CREATE TABLE `sub_items` (
`s_id` int(11) NOT NULL,
`Label` varchar(20) NOT NULL,
`Active` int(1) NOT NULL,
PRIMARY KEY (`s_id`)
)
CREATE TABLE `combo_refs` (
`r_id` int(11) NOT NULL,
`c_id` int(11) NOT NULL,
`s_id` int(11) NOT NULL,
PRIMARY KEY (`r_id`)
)
So for each combo_item, there are at least 2 rows in the combo_refs table linking to the multiple sub_items. My brain is about to make bigbadaboom :(
I would just join the three tables usually and then combo-item-wise sum up the total number of sub-items and the number of active sub-items:
SELECT ci.c_id, ci.Label, SUM(1) AS total_sub_items, SUM(si.Active) AS active_sub_items
FROM combo_items AS ci
INNER JOIN combo_refs AS cr ON cr.c_id = ci.c_id
INNER JOIN sub_items AS si ON si.s_id = cr.s_id
GROUP BY ci.c_id
Of course, instead of using SUM(1) you could just say COUNT(ci.c_id), but I wanted an analog of SUM(si.Active).
The approach proposed assumes Active to be 1 (active) or 0 (not active).
To get only those combo-items all of whose sub-items are active, adding WHERE si.Active = 1 is not enough: it only drops the inactive rows before grouping, so a combo-item with at least one active sub-item would still be returned. Compare the two aggregates in a HAVING clause instead:
SELECT ci.c_id, ci.Label
FROM combo_items AS ci
INNER JOIN combo_refs AS cr ON cr.c_id = ci.c_id
INNER JOIN sub_items AS si ON si.s_id = cr.s_id
GROUP BY ci.c_id
HAVING SUM(si.Active) = COUNT(*)
By the way, the INNER JOIN ensures that every returned combo-item has at least one sub-item.
(I have not tested it.)
See this answer:
MySQL: Selecting foreign keys with fields matching all the same fields of another table
Select ...
From combo_items As C
Where Exists (
Select 1
From sub_items As S1
Join combo_refs As CR1
On CR1.s_id = S1.s_id
Where CR1.c_id = C.c_id
)
And Not Exists (
Select 1
From sub_items As S2
Join combo_refs As CR2
On CR2.s_id = S2.s_id
Where CR2.c_id = C.c_id
And S2.Active = 0
)
The first subquery ensures that at least one sub_item exists. The second ensures that none of the sub_items are inactive.
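Both ways of expressing "all sub-items are active" can be compared on a tiny data set. The sketch below uses Python's sqlite3 module as a stand-in for MySQL with invented rows: combo 1 references two active sub-items, combo 2 references one active and one inactive sub-item. It also runs a plain WHERE si.Active = 1 filter for comparison, which over-selects.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE combo_items (c_id INTEGER PRIMARY KEY, Label TEXT NOT NULL);
CREATE TABLE sub_items (s_id INTEGER PRIMARY KEY, Label TEXT NOT NULL, Active INTEGER NOT NULL);
CREATE TABLE combo_refs (r_id INTEGER PRIMARY KEY, c_id INTEGER NOT NULL, s_id INTEGER NOT NULL);
-- invented sample rows
INSERT INTO combo_items VALUES (1, 'all active'), (2, 'one inactive');
INSERT INTO sub_items VALUES (1, 's1', 1), (2, 's2', 1), (3, 's3', 0);
INSERT INTO combo_refs VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 2, 3);
""")

# Aggregate approach: number of active sub-items must equal the total number.
having_result = conn.execute("""
    SELECT ci.c_id
    FROM combo_items AS ci
    INNER JOIN combo_refs AS cr ON cr.c_id = ci.c_id
    INNER JOIN sub_items AS si ON si.s_id = cr.s_id
    GROUP BY ci.c_id
    HAVING SUM(si.Active) = COUNT(*)
    ORDER BY ci.c_id
""").fetchall()

# EXISTS / NOT EXISTS approach: at least one sub-item and no inactive one.
not_exists_result = conn.execute("""
    SELECT C.c_id
    FROM combo_items AS C
    WHERE EXISTS (
        SELECT 1 FROM sub_items AS S1
        JOIN combo_refs AS CR1 ON CR1.s_id = S1.s_id
        WHERE CR1.c_id = C.c_id
    )
    AND NOT EXISTS (
        SELECT 1 FROM sub_items AS S2
        JOIN combo_refs AS CR2 ON CR2.s_id = S2.s_id
        WHERE CR2.c_id = C.c_id AND S2.Active = 0
    )
    ORDER BY C.c_id
""").fetchall()

# For comparison: filtering rows with WHERE keeps combo 2 as well.
where_only_result = conn.execute("""
    SELECT ci.c_id
    FROM combo_items AS ci
    INNER JOIN combo_refs AS cr ON cr.c_id = ci.c_id
    INNER JOIN sub_items AS si ON si.s_id = cr.s_id
    WHERE si.Active = 1
    GROUP BY ci.c_id
    ORDER BY ci.c_id
""").fetchall()

print(having_result, not_exists_result, where_only_result)
```

The first two queries agree on combo 1 only, while the WHERE-only variant also returns combo 2 because one of its sub-items is active.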