mysql - query to extract report from book register - mysql

I have the below query in mysql, when I run the query, it gives me the complete report and "where clause does not work"
SELECT oo.dateaccessioned AS 'Date',
oo.barcode AS 'Acc. No.',
ooo.title AS 'Title',
ooo.author AS 'Author/Editor',
concat_ws(' , ', o.editionstatement, oo.enumchron) AS 'Ed./Vol.',
concat_ws(' ', o.place, o.publishercode) AS 'Place & Publisher',
ooo.copyrightdate AS 'Year', o.pages AS 'Page(s)',
ooooooo.name AS 'Source',
oo.itemcallnumber AS 'Class No./Book No.',
concat_ws(', ₹', concat(' ', ooooo.symbol, oooo.listprice), oooo.rrp_tax_included) AS 'Cost',
concat_ws(' , ', oooooo.invoicenumber, oooooo.shipmentdate) AS 'Bill No. & Date',
'' AS 'Withdrawn Date',
'' AS 'Remarks'
FROM biblioitems o
LEFT JOIN items oo ON oo.biblioitemnumber=o.biblioitemnumber
LEFT JOIN biblio ooo ON ooo.biblionumber=o.biblionumber
LEFT JOIN aqorders oooo ON oooo.biblionumber=o.biblionumber
LEFT JOIN currency ooooo ON ooooo.currency=oooo.currency
LEFT JOIN aqinvoices oooooo ON oooooo.booksellerid=oo.booksellerid
LEFT JOIN aqbooksellers ooooooo ON ooooooo.id=oo.booksellerid
WHERE cast(oo.barcode AS UNSIGNED) BETWEEN <<Accession Number>> AND <<To Accession Number>>
GROUP BY oo.barcode
ORDER BY oo.barcode ASC
Can you please help me generate a report based on the above query? oo.barcode is a varchar. I am a library team member rather than a database administrator. My oo.barcode values begin with HYD followed by numerics. I know the above query works without any issue if oo.barcode is a number-only field.
I searched for how CAST works but could not understand it, as I am not into database administration.

If the barcode column is VARCHAR and begins with "HYD", CAST AS UNSIGNED will cause a value of HYD123 to result in 0.
The non-numeric characters of the string would need to be removed prior to casting the value as an integer.
This can be achieved by trimming the leading text "HYD" from the barcode.
CAST(TRIM(LEADING 'HYD' FROM barcode) AS UNSIGNED)
Otherwise, if the prefix is always 3 characters, the substring position of barcode can be used.
CAST(SUBSTR(barcode, 4) AS UNSIGNED)
If any other non-numeric characters are contained within the string, such as HYD-123-456-789, HYD123-456-789PT, HYD123-456.789, etc., they will also need to be removed, as the type conversion will treat them in unexpected ways.
In addition, any leading 0's of the resulting numeric string value will be truncated from the resulting integer, causing 0123 to become 123.
For more details on how CAST functions see: 12.3 Type Conversion in Expression Evaluation
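The caveat about stray non-numeric characters can also be handled in application code before the value reaches the query. A minimal Python sketch, assuming the accession number is simply every digit in the barcode read left to right (the helper name `accession_number` is hypothetical and not part of the original query):

```python
import re

def accession_number(barcode: str) -> int:
    """Return the digits of a barcode as an integer (0 if there are none)."""
    digits = re.sub(r"\D", "", barcode)  # drop every non-digit character
    return int(digits) if digits else 0

print(accession_number("HYD0123"))          # the leading zero is dropped
print(accession_number("HYD-123-456-789"))  # separators are ignored
```

Whether joining the digit groups like this is the right rule depends on how the barcodes are actually assigned, so treat it as a starting point only.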
Examples db<>fiddle
CREATE TABLE tester (
barcode varchar(255)
);
INSERT INTO tester(barcode)
VALUES ('HYD123'), ('HYD0123'), ('HYD4231');
Results
SELECT cast(barcode AS UNSIGNED)
FROM tester;
cast(barcode AS UNSIGNED)
0
0
0
SELECT CAST(TRIM(LEADING 'HYD' FROM barcode) AS UNSIGNED)
FROM tester;
CAST(TRIM(LEADING 'HYD' FROM barcode) AS UNSIGNED)
123
123
4231
SELECT barcode
FROM tester
WHERE CAST(TRIM(LEADING 'HYD' FROM barcode) AS UNSIGNED) BETWEEN 120 AND 4232;
barcode
HYD123
HYD0123
HYD4231
SELECT CAST(SUBSTR(barcode, 4) AS UNSIGNED)
FROM tester;
CAST(SUBSTR(barcode, 4) AS UNSIGNED)
123
123
4231
SELECT barcode
FROM tester
WHERE CAST(SUBSTR(barcode, 4) AS UNSIGNED) BETWEEN 120 AND 4232;
barcode
HYD123
HYD0123
HYD4231
JOIN optimization
To obtain the expected results, you most likely want an INNER JOIN of the items table with ON criteria matching the desired barcode range condition. Your current WHERE criteria already exclude rows that have no match in the items table (where oo.barcode is NULL), so the LEFT JOIN on items already behaves like an INNER JOIN.
INNER JOIN items AS oo
ON oo.biblioitemnumber = o.biblioitemnumber
AND CAST(SUBSTR(oo.barcode, 4) AS UNSIGNED) BETWEEN ? AND ?
Full-Table Scanning
It is important to understand that transforming the column value inside the condition prevents MySQL from using an index on that column, so every row has to be read and converted, which can run very slowly on a large table.
Instead it is best to store the integer-only version of the value in the database to see the benefits of indexing.
This can be accomplished in many ways, such as generated columns.
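As a rough illustration of storing the integer-only value, the sketch below uses SQLite through Python's built-in sqlite3 module as a stand-in for MySQL. The column name `barcode_num` and the index name are assumptions made for the example; in MySQL the same idea could be expressed with a generated column instead of the one-off UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (barcode TEXT, barcode_num INTEGER);
    INSERT INTO items (barcode)
    VALUES ('HYD123'), ('HYD0123'), ('HYD4231'), ('HYD9000');
    -- One-off backfill: keep the numeric part in its own column.
    UPDATE items SET barcode_num = CAST(SUBSTR(barcode, 4) AS INTEGER);
    CREATE INDEX idx_items_barcode_num ON items (barcode_num);
""")

# The range condition now compares the stored integer directly,
# so no per-row string manipulation is needed at query time.
rows = conn.execute(
    "SELECT barcode FROM items "
    "WHERE barcode_num BETWEEN ? AND ? "
    "ORDER BY barcode_num, barcode",
    (120, 4232),
).fetchall()
matches = [row[0] for row in rows]
print(matches)
```

The stored column has to be kept in sync when barcodes are inserted or changed, which is the part a generated column would take care of automatically.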
GROUP BY without an aggregate
Lastly, you should avoid using GROUP BY without an aggregate function. You most likely are expecting DISTINCT or similar form of limiting the record set. Please see MySQL select one column DISTINCT, with corresponding other columns on ways to accomplish this.
To ensure MySQL is not selecting "any value from each group" at random (leading to differing results between query executions), limit the subset data to the distinct biblioitemnumber column values from the available barcode matches. One approach to accomplish the limited subset is as follows.
/* ... */
FROM biblioitems o
INNER JOIN (
SELECT biblioitemnumber, barcode, booksellerid, enumchron, itemcallnumber
FROM items WHERE biblioitemnumber IN(
SELECT MIN(biblioitemnumber)
FROM items
WHERE CAST(SUBSTR(barcode, 4) AS UNSIGNED) BETWEEN ? AND ?
GROUP BY barcode
)
) AS oo
ON oo.biblioitemnumber = o.biblioitemnumber
LEFT JOIN biblio ooo ON ooo.biblionumber=o.biblionumber
LEFT JOIN aqorders oooo ON oooo.biblionumber=o.biblionumber
LEFT JOIN currency ooooo ON ooooo.currency=oooo.currency
LEFT JOIN aqinvoices oooooo ON oooooo.booksellerid=oo.booksellerid
LEFT JOIN aqbooksellers ooooooo ON ooooooo.id=oo.booksellerid
ORDER BY oo.barcode ASC

Try this :
...
WHERE cast(SUBSTRING_INDEX(oo.barcode,'HYD',-1) AS UNSIGNED INTEGER) BETWEEN <<Accession Number>> AND <<To Accession Number>>
...
SUBSTRING_INDEX(oo.barcode,'HYD',-1) will transform HYD132453741 to 132453741
demo here

Related

Aggregating row values in MySQL or Snowflake

I would like to calculate the standard deviation, min and max of the mer_data array into 3 other fields called std_dev, min_mer and max_mer, grouped by mac and timestamp.
This needs to be done without flattening the data as each mer_data row consists of 4000 float values and multiplying that with 700k rows gives a very high dimensional table.
The mer_data field is currently saved as varchar(30000) and maybe Json format might help, I'm not sure.
Input:
Output:
This can be done in Snowflake or MySQL.
Also, the query needs to be optimized so that it does not take much computation time.
While you don't want to split the data up, you will need to if you want to do it in pure SQL. Snowflake has no problems with such aggregations.
WITH fake_data(mac, mer_data) AS (
SELECT * FROM VALUES
('abc','43,44.25,44.5,42.75,44,44.25,42.75,43'),
('def','32.75,33.25,34.25,34.5,32.75,34,34.25,32.75,43')
)
SELECT f.mac,
avg(d.value::float) as avg_dev,
stddev(d.value::float) as std_dev,
MIN(d.value::float) as MIN_MER,
Max(d.value::float) as Max_MER
FROM fake_data f, table(split_to_table(f.mer_data,',')) d
GROUP BY 1
ORDER BY 1;
I would however discourage the use of strings in the grouping process, so would break it apart like so:
WITH fake_data(mac, mer_data, timestamp) AS (
SELECT * FROM VALUES
('abc','43,44.25,44.5,42.75,44,44.25,42.75,43', '01-01-22'),
('def','32.75,33.25,34.25,34.5,32.75,34,34.25,32.75,43', '02-01-22')
), boost_data AS (
SELECT seq8() as seq, *
FROM fake_data
), math_step AS (
SELECT f.seq,
avg(d.value::float) as avg_dev,
stddev(d.value::float) as std_dev,
MIN(d.value::float) as MIN_MER,
Max(d.value::float) as Max_MER
FROM boost_data f, table(split_to_table(f.mer_data,',')) d
GROUP BY 1
)
SELECT b.mac,
m.avg_dev,
m.std_dev,
m.MIN_MER,
m.Max_MER,
b.timestamp
FROM boost_data b
JOIN math_step m
ON b.seq = m.seq
ORDER BY 1;
MAC  AVG_DEV       STD_DEV       MIN_MER  MAX_MER  TIMESTAMP
---  ------------  ------------  -------  -------  ---------
abc  43.5625       0.7529703087  42.75    44.5     01-01-22
def  34.611111111  3.226141056   32.75    43       02-01-22
performance testing:
so using this SQL to make 70K rows of 4000 values each:
create table fake_data_tab AS
WITH cte_a AS (
SELECT SEQ8() as s
FROM TABLE(GENERATOR(ROWCOUNT =>70000))
), cte_b AS (
SELECT a.s, uniform(20::float, 50::float, random()) as v
FROM TABLE(GENERATOR(ROWCOUNT =>4000))
CROSS JOIN cte_a a
)
SELECT s::text as mac
,LISTAGG(v,',') AS mer_data
,dateadd(day,s,'2020-01-01')::date as timestamp
FROM cte_b
GROUP BY 1,3;
takes 79 seconds on a XTRA_SMALL,
now with that we can test the two solutions:
The second set of code (group by numbers, with a join):
WITH boost_data AS (
SELECT seq8() as seq, *
FROM fake_data_tab
), math_step AS (
SELECT f.seq,
avg(d.value::float) as avg_dev,
stddev(d.value::float) as std_dev,
MIN(d.value::float) as MIN_MER,
Max(d.value::float) as Max_MER
FROM boost_data f, table(split_to_table(f.mer_data,',')) d
GROUP BY 1
)
SELECT b.mac,
m.avg_dev,
m.std_dev,
m.MIN_MER,
m.Max_MER,
b.timestamp
FROM boost_data b
JOIN math_step m
ON b.seq = m.seq
ORDER BY 1;
takes 1m47s
the original group by strings/dates
SELECT f.mac,
avg(d.value::float) as avg_dev,
stddev(d.value::float) as std_dev,
MIN(d.value::float) as MIN_MER,
Max(d.value::float) as Max_MER,
f.timestamp
FROM fake_data_tab f, table(split_to_table(f.mer_data,',')) d
GROUP BY 1,6
ORDER BY 1;
takes 1m46s
Hmm, so leaving the "mac" as a number made the code very fast (~3s), while dealing with strings either way raised the data processed to 1.5GB, versus 150MB for numbers.
If the numbers were in rows, not packed together like that, we can discuss how to do it in SQL.
In rows, GROUP_CONCAT(...) can construct a commalist like you show, and MIN(), STDDEV(), etc can do the other stuff.
If you continue to have the commalist, then do the rest of the work in your app programming language. (It is very ugly to have SQL pick apart an array.)
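A minimal sketch of doing that work in application code, here in Python with only the standard library. It assumes mer_data is a plain comma-separated list of numbers and that the sample standard deviation is wanted; the names `MerStats` and `mer_stats` are made up for the example.

```python
import statistics
from typing import NamedTuple

class MerStats(NamedTuple):
    std_dev: float
    min_mer: float
    max_mer: float

def mer_stats(mer_data: str) -> MerStats:
    """Parse one comma-separated mer_data value and summarise it."""
    values = [float(v) for v in mer_data.split(",")]
    # statistics.stdev is the sample standard deviation and needs >= 2 values.
    return MerStats(statistics.stdev(values), min(values), max(values))

stats = mer_stats("43,44.25,44.5,42.75,44,44.25,42.75,43")
print(stats)
```

Each row is processed independently, so the rows can be streamed from the database and summarised without ever materialising the flattened table.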

Is there a way to use aggregate COUNT() values within CASE?

I need to retrieve unique yet truncated part numbers, with their description values being conditionally determined.
DATA:
Here's some simplified sample data:
(the real table has half a million rows)
create table inventory(
partnumber VARCHAR(10),
description VARCHAR(10)
);
INSERT INTO inventory (partnumber,description) VALUES
('12345','ABCDE'),
('123456','ABCDEF'),
('1234567','ABCDEFG'),
('98765','ZYXWV'),
('987654','ZYXWVU'),
('9876543','ZYXWVUT'),
('abcde',''),
('abcdef','123'),
('abcdefg','321'),
('zyxwv',NULL),
('zyxwvu','987'),
('zyxwvut','789');
TRIED:
I've tried too many things to list here.
I've finally found a way to get past all the 'unknown field' errors and at least get SOME results, but:
it's SUPER kludgy!
my results are not limited to unique prods.
Here's my current query:
SELECT
LEFT(i.partnumber, 6) AS prod,
CASE
WHEN agg.cnt > 1
OR i.description IS NULL
OR i.description = ''
THEN LEFT(i.partnumber, 6)
ELSE i.description
END AS `descrip`
FROM inventory i
INNER JOIN (SELECT LEFT(ii.partnumber, 6) t, COUNT(*) cnt
FROM inventory ii GROUP BY ii.partnumber) AS agg
ON LEFT(i.partnumber, 6) = agg.t;
GOAL:
My goal is to retrieve:
prod    descrip
------  -------
12345   ABCDE
123456  123456
98765   ZYXWV
987654  987654
abcde   abcde
abcdef  abcdef
zyxwv   zyxwv
zyxwvu  zyxwvu
QUESTION:
What are some cleaner ways to use the COUNT() aggregate data with a CASE type conditional?
How can I limit my results so that all prods are UNIQUE?
You can check if a left(partnumber, 6) is not unique in the result by checking if count(*) > 1. In such a case let descrip be left(partnumber, 6). Otherwise you can use max(description) (or min(description)) to get the single description but satisfy the needs to use an aggregation function on columns not in the GROUP BY. To replace empty or NULL descriptions, nullif() and coalesce() can be used.
That would lead to the following using just one level of aggregation and no joins:
SELECT left(partnumber, 6) AS prod,
CASE
WHEN count(*) > 1 THEN
left(partnumber, 6)
ELSE
coalesce(nullif(max(description), ''), left(partnumber, 6))
END AS descrip
FROM inventory
GROUP BY left(partnumber, 6)
ORDER BY left(partnumber, 6);
But there seems to be a bug in MySQL and this query fails. The engine doesn't "see" that, in the list after SELECT, partnumber is only used in the expression left(partnumber, 6), which is also in the GROUP BY. Instead, the engine falsely complains about partnumber not being in the GROUP BY and not being subject to an aggregation function.
As a workaround, we can use a derived table that does the shortening of partnumber to its first six characters. We then use that column of the derived table instead of left(partnumber, 6).
SELECT l6pn AS prod,
CASE
WHEN count(*) > 1 THEN
l6pn
ELSE
coalesce(nullif(max(description), ''), l6pn)
END AS descrip
FROM (SELECT left(partnumber, 6) AS l6pn,
description
FROM inventory) AS x
GROUP BY l6pn
ORDER BY l6pn;
Or we slap some actually pointless max()es around the left(partnumber, 6) other than the first, to work around the bug.
SELECT left(partnumber, 6) AS prod,
CASE
WHEN count(*) > 1 THEN
max(left(partnumber, 6))
ELSE
coalesce(nullif(max(description), ''), max(left(partnumber, 6)))
END AS descrip
FROM inventory
GROUP BY left(partnumber, 6)
ORDER BY left(partnumber, 6);
db<>fiddle (Change the DBMS to some other like Postgres or MariaDB to see that they also accept the first query.)
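To check the derived-table variant against the sample data without a MySQL server, here is a small sketch using SQLite via Python's sqlite3 module. This is a stand-in, not MySQL: SQLite has no left(), so substr(partnumber, 1, 6) is used instead, and the column types are simplified to TEXT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (partnumber TEXT, description TEXT);
    INSERT INTO inventory (partnumber, description) VALUES
        ('12345','ABCDE'), ('123456','ABCDEF'), ('1234567','ABCDEFG'),
        ('98765','ZYXWV'), ('987654','ZYXWVU'), ('9876543','ZYXWVUT'),
        ('abcde',''), ('abcdef','123'), ('abcdefg','321'),
        ('zyxwv',NULL), ('zyxwvu','987'), ('zyxwvut','789');
""")

# Same shape as the derived-table workaround: shorten first, then aggregate.
query = """
    SELECT l6pn AS prod,
           CASE WHEN count(*) > 1 THEN l6pn
                ELSE coalesce(nullif(max(description), ''), l6pn)
           END AS descrip
    FROM (SELECT substr(partnumber, 1, 6) AS l6pn, description
          FROM inventory) AS x
    GROUP BY l6pn
    ORDER BY l6pn
"""
result = conn.execute(query).fetchall()
for prod, descrip in result:
    print(prod, descrip)
```

The rows printed line up with the table in the GOAL section of the question.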

UNION in MySQL 5.7.2

I'm using MySQL 5.7.
I am getting bad results by a UNION of COUNT(*).
SELECT
COUNT(*) AS Piezas
, ''Motor
from parque
where `parque`.`CausasParalizacion` = 2
UNION
SELECT
''Piezas
, COUNT(*) AS Motor
from parque
where `parque`.`CausasParalizacion` = 3
The result should be 30 and 12, and I am getting 3330 and 3132.
Can anyone help?
I don't think MySQL is returning a "bad" result. The results returned by MySQL are per the specification.
Given no GROUP BY each of the SELECT statements will return one row. We can verify by running each SELECT statement separately. We'd expect the UNION result of the two SELECT to be something like
Piezas Motor
------ -----
mmm
ppp
You say the results should be '30' and '12'
My guess is that MySQL is returning the characters '30' and '12'.
But we should be very suspicious, and note the hex representation of the ASCII encoding of those characters
x'30' -> '0'
x'31' -> '1'
x'32' -> '2'
x'33' -> '3'
As a demonstration
SELECT HEX('30'), HEX('12')
returns
HEX('30') HEX('12')
--------- ---------
3330 3132
I don't think MySQL is returning "bad" results. I suspect that the column metadata is confusing the client. (Note that each of the columns is a mix of two different datatypes being UNION'd: on one row the datatype is string/varchar (an empty string), and on the other row it is integer/numeric (the result of the COUNT() aggregate).)
And I'm not sure what the resultset metadata for the columns ends up as.
I suspect that the issue is with the client's interpretation of the resultset metadata when determining the datatype of the columns: the client decides that the most appropriate way to display the values is as a hex representation of the raw bytes.
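The hex reading of the two values can be reproduced outside the database. A tiny Python sketch, assuming the counts arrive at the client as the ASCII strings '30' and '12':

```python
# Map each displayed count to the hex dump of its ASCII bytes.
hex_views = {text: text.encode("ascii").hex() for text in ("30", "12")}

for text, dumped in hex_views.items():
    print(text, "->", dumped)
```

The dumps are the same digits the question reports, which supports the idea that the client is showing raw bytes rather than MySQL counting wrongly.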
Personally, I would avoid returning a UNION result of different/incompatible datatypes. I'd prefer the datatypes be consistent.
If I had to do the UNION of incompatible datatypes, I would include an explicit conversion into compatible/appropriate datatypes.
But once I am at that point, I have to question why I need any of that rigmarole with the mismatched datatypes, why we need to return two separate rows, when we could just return a single row (probably more efficiently to boot)
SELECT SUM( p.`CausasParalizacion` = 2 ) AS Piezas
, SUM( p.`CausasParalizacion` = 3 ) AS Motor
FROM parque p
WHERE p.`CausasParalizacion` IN (2,3)
To avoid the aggregate functions returning NULL,
we can wrap the aggregate expressions in an IFNULL (or ANSI-standard COALESCE) function..
SELECT IFNULL(SUM( p.`CausasParalizacion` = 2 ),0) AS Piezas
, IFNULL(SUM( p.`CausasParalizacion` = 3 ),0) AS Motor
FROM parque p
WHERE p.`CausasParalizacion` IN (2,3)
-or-
we could use a COUNT() of an expression that is either NULL or non-NULL
SELECT COUNT(IF( p.`CausasParalizacion` = 2 ,1,NULL)) AS Piezas
, COUNT(IF( p.`CausasParalizacion` = 3 ,1,NULL)) AS Motor
FROM parque p
WHERE p.`CausasParalizacion` IN (2,3)
If, for some reason it turns out it is faster to run two separate SELECT statements, we could still combine the results into a single row. For example:
SELECT s.Piezas
, t.Motor
FROM ( SELECT COUNT(*) AS Piezas
FROM parque p
WHERE p.`CausasParalizacion` = 2
) s
CROSS
JOIN ( SELECT COUNT(*) AS Motor
FROM parque q
WHERE q.`CausasParalizacion` = 3
) t
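For a quick check of the single-row approach, the sketch below runs the IFNULL(SUM(...)) form against a throwaway table. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL (both treat a comparison as 1 or 0 inside SUM), and the six sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parque (CausasParalizacion INTEGER);
    -- Invented sample: three rows with cause 2, two with cause 3, one other.
    INSERT INTO parque VALUES (2), (2), (2), (3), (3), (5);
""")

piezas, motor = conn.execute("""
    SELECT IFNULL(SUM(p.CausasParalizacion = 2), 0) AS Piezas,
           IFNULL(SUM(p.CausasParalizacion = 3), 0) AS Motor
    FROM parque p
    WHERE p.CausasParalizacion IN (2, 3)
""").fetchone()
print(piezas, motor)
```

Both counts come back as integers in one row, so there is no mixed-type column for a client to misread.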
Spencer, I think the problem was about encoding. E.g., when I execute the query in the console, the result is as expected, but not in phpMyAdmin.
However, I must say that your first solution works perfectly, Thanks a lot bro.

field showing null value when joining table

below is my query
select C.cName,
DATE_FORMAT(CT.dTransDate,'%d-%M-%Y') as dTransDate,
(c.nOpBalance+IFNULL(CT.nAmount,0)) AS DrAMount,
IFNULL(CTR.nAmount,0) AS CrAMount,
((c.nOpBalance+IFNULL(CT.nAmount,0))-IFNULL(CTR.nAmount,0)) AS Balance,
CT.cTransRefType,CT.cRemarks,
cinfo.cCompanyName,cinfo.caddress1,cinfo.cPhoneOffice,cinfo.cMobileNo,cinfo.cEmailID,cinfo.cWebsite
from Customer C
LEFT JOIN Client_Transaction CT ON CT.nClientPk = C.nCustomerPk
AND CT.cTransRefType='PAYMENT' AND CT.cClientType='CUSTOMER'
AND CT.dTransDate between '' AND ''
LEFT JOIN Client_Transaction CTR ON CTR.nClientPk = C.nCustomerPk
AND CTR.cTransRefType='RECEIPT' AND CTR.cClientType='CUSTOMER'
AND CTR.dTransDate between '2015-05-01' AND '2015-05-29'
LEFT JOIN companyinfo cinfo ON cinfo.cCompanyName like '%Fal%'
Where C.nCustomerPk = 4
Order By dTransDate
It shows all the values, but dTransDate, cTransRefType and cRemarks show NULL.
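One obvious thing jumps out at us: the date range for CT is a pair of empty strings. Presumably no CT row can satisfy that join condition, which would explain why every column taken from CT (dTransDate, cTransRefType, cRemarks) comes back NULL.
CT.dTransDate BETWEEN '' AND ''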
Another thing that jumps out at us is that there's a semi-Cartesian join between rows from CT and rows from CTR. If 5 rows are returned from CT for a given customer, and 5 rows are returned from CTR, that's going to produce a total of 5*5 = 25 rows. That just doesn't seem like a resultset that you'd really want returned.
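The row multiplication is easy to reproduce on a toy schema. The sketch below uses SQLite via Python's sqlite3 module; the table and column names (`customer`, `txn`, `kind`) are simplified stand-ins, not the tables from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER);
    CREATE TABLE txn (customer_id INTEGER, kind TEXT, amount INTEGER);
    INSERT INTO customer VALUES (4);
    -- Two payments and three receipts for the same customer.
    INSERT INTO txn VALUES
        (4, 'PAYMENT', 10), (4, 'PAYMENT', 20),
        (4, 'RECEIPT', 5), (4, 'RECEIPT', 6), (4, 'RECEIPT', 7);
""")

# Joining the transaction table twice pairs every payment with every receipt.
rows = conn.execute("""
    SELECT ct.amount, ctr.amount
    FROM customer c
    LEFT JOIN txn ct  ON ct.customer_id  = c.id AND ct.kind  = 'PAYMENT'
    LEFT JOIN txn ctr ON ctr.customer_id = c.id AND ctr.kind = 'RECEIPT'
""").fetchall()
row_count = len(rows)
print(row_count)
```

Five transactions turn into six result rows (2 x 3), and the gap widens quickly as either side grows.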
Also, if more than one row is returned from cinfo, that's also going to cause another semi-Cartesian join. If there's two rows returned from cinfo, the total number or rows in the resultset will be doubled. It's valid to do that in SQL, but this is an unusual pattern.
The calculation of the balance is also very strange. For each row, the nAmount is added/subtracted from opening balance. On the next row, the same thing, on the original opening balance. There's nothing invalid SQL-wise with doing that, but the result being returned just seems bizarre. (It seems much more likely that you'd want to show a running balance, with each transaction.)
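For comparison, a running balance applies each transaction to the balance left by the previous one. A minimal Python sketch with invented figures, assuming the sign convention of the original query (PAYMENT amounts are added to the opening balance, RECEIPT amounts are subtracted):

```python
from itertools import accumulate

opening_balance = 1000  # invented opening balance
transactions = [        # invented rows: (date, type, amount), already in date order
    ("2015-05-03", "PAYMENT", 200),
    ("2015-05-10", "RECEIPT", 150),
    ("2015-05-21", "PAYMENT", 50),
]

# Turn each transaction into a signed change, then accumulate the changes.
signed = [amt if kind == "PAYMENT" else -amt for _, kind, amt in transactions]
balances = list(accumulate(signed, initial=opening_balance))[1:]

for (day, kind, amt), balance in zip(transactions, balances):
    print(day, kind, amt, balance)
```

Each printed balance builds on the one before it, instead of every row restarting from the opening balance.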
Another thing that jumps out at us is that you are ordering the rows by a string representation of a DATE, with the day as the leading portion. (As long as all the rows have date values in the same year and month, that will probably work, but it just seems bizarre that we wouldn't sort on the DATE value, or a canonical string representation.)
I strongly suspect that you want to run a query that's more like this. (This doesn't do a "running balance" calculation. It does return the 'PAYMENT' and 'RECEIPT' rows as individual rows, without producing a semi-Cartesian result.)
SELECT c.cName
, DATE_FORMAT(t.dTransDate,'%d-%M-%Y') AS dTransDate
, c.nOpBalance
, IF(t.cTransRefType='PAYMENT',IFNULL(t.nAmount,0),0) AS DrAMount
, IF(t.cTransRefType='RECEIPT',IFNULL(t.nAmount,0),0) AS CrAMount
, t.cTransRefType
, t.cRemarks
, ci.*
FROM Customer c
LEFT
JOIN Client_Transaction t
ON t.nClientPk = c.nCustomerPk
AND t.cClientType = 'CUSTOMER'
AND t.dTransDate >= '2015-05-01'
AND t.dTransDate <= '2015-05-29'
AND t.cTransRefType IN ('PAYMENT','RECEIPT')
CROSS
JOIN ( SELECT cinfo.cCompanyName
, cinfo.caddress1
, cinfo.cPhoneOffice
, cinfo.cMobileNo
, cinfo.cEmailID
, cinfo.cWebsite
FROM companyinfo cinfo
WHERE cinfo.cCompanyName LIKE '%Fal%'
ORDER BY cinfo.cCompanyName
LIMIT 1
) ci
WHERE c.nCustomerPk = 4
ORDER BY t.dTransDate, t.cTransRefType, t.id

MySQL count grouping by 4 columns

Basically, this query returns me different values from counts()
Geographic Address(city),Office,Device type, Device unique type identifier, number case by device type
0001,1002,ORDENADOR,ORD1234,5 INCIDENCIAS
0001,1002,ORDENADOR,ORD3333,2 INCIDENCIAS
0001,1002,ORDENADOR,ORD2222,1 INCIDENCIAS
0001,1002,TECLADO,TECYYYY,2 INCIDENCIAS
0001,1002,TECLADO,TECXXXX,4 INCIDENCIAS
0001,1002,PANTALLA,PAN0000,1 INCIDENCIAS
Select
d.dt as 'Direccion Territorial',
t.centro as 'Oficina',
nombrelargo,
if(length(p.Oficina)=3,concat('0',p.Oficina),p.Oficina) as 'Oficina2',
p.Tipo_Disp as 'Dispositivo',
count(p.Tipo_Disp) as 'Nº de partes/Etiqueta',
p.Etq_Amarilla as 'Etiqueta',
------------ count(TOTAL INC DE ESE DISPOSITIVO) ---------------------------,
------------ count(TOTAL INC DE ESA OFICINA) ---------------------------
from textcentro t,dtdz d,ppp p
where
t.jcentro03=d.dt and
t.organizativo='OFIC./AGEN./DELEG.' and
t.situacion='ABIERTO' and
t.sociedad='0900' and
(p.Estado != "Abierto" and p.Estado!= 'Planificado') and
(month(p.Fecha_y_hora_de_creacion) = 8 and year(Fecha_y_hora_de_creacion)=2013) and
t.centro=if(length(p.Oficina)=3,concat('0',p.Oficina),p.Oficina)
GROUP BY d.dt,t.centro,p.Tipo_Disp,p.Etq_Amarilla
The grouping:
1 - d.dt ----> Postal code
2 - t.centro ----> Office code
3 - p.Tipo_Disp ----> Device Type
4 - d.Etq_Amarilla ----> Unique identifier for this device
The tables are :
1- textcentro ----> Specific information of the offices
2- dtdz ----> auxiliary table to find the Postal Code of the office
3- ppp ----> Table where we can find all the cases
So now, I want to sum the total number of cases by device type, should be this:
Postal Code,Office,Device type, Unique identifier for Device, total number of cases by unique identifier device, total number case by device type, total number case by office
0001,1002,ORDENADOR,ORD1234,5 INCIDENCIAS,8 INC,15
0001,1002,ORDENADOR,ORD3333,2 INCIDENCIAS,8 INC,15
0001,1002,ORDENADOR,ORD2222,1 INCIDENCIAS,8 INC,15
0001,1002,TECLADO,TECYYYY,2 INCIDENCIAS,6 INC,15
0001,1002,TECLADO,TECXXXX,4 INCIDENCIAS,6 INC,15
0001,1002,PANTALLA,PAN0000,1 INCIDENCIAS,1 INC,15
I'm trying with SUM and COUNT functions but I can't get there; I don't see any way to obtain the last two columns. I think I could get these numbers with a sub-query in the column list, but the performance would drop too much.
An example would be this, but the query has not even finished and I have been waiting around 12-13 minutes.
Select
d.dt as 'Direccion Territorial',
t.centro as 'Oficina',
nombrelargo,
if(length(p.Oficina)=3,concat('0',p.Oficina),p.Oficina) as 'Oficina2',
p.Tipo_Disp as 'Dispositivo',
count(p.Tipo_Disp) as 'Nº de partes/Etiqueta',
p.Etq_Amarilla as 'Etiqueta',
(Select count(*) from People_DB pp where pp.Oficina=p.Oficina and pp.Tipo_Disp=Dispositivo and (month(pp.Fecha_y_hora_de_creacion) = 8 and year(pp.Fecha_y_hora_de_creacion)=2013) and (pp.Estado != "Abierto" and pp.Estado!= 'Planificado') )
from textcentro t,dtdz d,ppp p
where
t.jcentro03=d.dt and
t.organizativo='OFIC./AGEN./DELEG.' and
t.situacion='ABIERTO' and
t.sociedad='0900' and
(p.Estado != "Abierto" and p.Estado!= 'Planificado') and
(month(p.Fecha_y_hora_de_creacion) = 8 and year(Fecha_y_hora_de_creacion)=2013) and
t.centro=if(length(p.Oficina)=3,concat('0',p.Oficina),p.Oficina)
GROUP BY d.dt,t.centro,p.Tipo_Disp,p.Etq_Amarilla
Sorry for my poor english, maybe this post is unintelligible
May I make some suggestions:
First, your choice of tables looks like this:
from textcentro t,dtdz d,ppp p
For the sake of clarity I suggest you employ explicit JOIN statements instead. For example
FROM textcentro AS t
JOIN dtdz AS d ON t.jcentro03=d.dt
JOIN ppp AS p ON XXXXXXXXX
You may want to use LEFT JOIN in cases where, for example, there might be no corresponding row in dtdz to go with a row in textcentro.
I cannot tell from your sample query what the ON constraint for the JOIN to ppp should be. I have shown that with XXXXXXXXX in my code above. I think your condition is this:
t.centro=if(length(p.Oficina)=3,concat('0',p.Oficina),p.Oficina)
but that is a nasty expression to compute, and therefore very slow. It looks like your t.centro is a char column containing an integer with leading zeros, and your p.Oficina is the same but without the leading zeros. Instead of adding the leading zero to p.Oficina, try stripping it from the t.centro column.
CAST(t.centro AS UNSIGNED) = p.Oficina
Keep in mind that without a simple JOIN constraint you get a combinatorial explosion: m times n rows. This makes things slow and possibly wrong.
So, your table selection becomes:
FROM textcentro AS t
JOIN dtdz AS d ON t.jcentro03=d.dt
JOIN ppp AS p ON CAST(t.centro AS UNSIGNED) = p.Oficina
Second, your date/time search expressions are not built for speed. Try this:
p.Fecha_y_hora_de_creacion >= '2013-08-01'
AND p.Fecha_y_hora_de_creacion < '2013-08-01' + INTERVAL 1 MONTH
If you have an index on your p.Fecha... column, this will permit a range-scan search on that column.
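If the month is chosen by the application, the two bounds of that half-open range can be computed up front and passed in as query parameters. A small Python sketch; the helper name `month_range` is hypothetical:

```python
from datetime import date
from typing import Tuple

def month_range(year: int, month: int) -> Tuple[date, date]:
    """Return (first day of the month, first day of the next month)."""
    start = date(year, month, 1)
    # December rolls over into January of the following year.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end

start, end = month_range(2013, 8)
print(start, end)  # use as: column >= start AND column < end
```

Using an exclusive upper bound avoids having to know how many days the month has or what time precision the column stores.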
Third, this item in your SELECT list is killing performance.
(Select count(*)
from People_DB pp
where pp.Oficina=p.Oficina
and pp.Tipo_Disp=Dispositivo
and (month(pp.Fecha_y_hora_de_creacion) = 8
and year(pp.Fecha_y_hora_de_creacion)=2013)
and (pp.Estado != "Abierto" and pp.Estado!= 'Planificado') )
Refactor this to be a virtual table in your JOIN list, as follows.
(SELECT COUNT(*) AS NumPersonas,
Oficina,
Tipo_Disp
FROM People_DB
WHERE Fecha_y_hora_de_creacion >= '2013-08-01'
AND Fecha_y_hora_de_creacion < '2013-08-01' + INTERVAL 1 MONTH
AND Estado != 'Abierto'
AND Estado != 'Planificado'
GROUP BY Oficina, Tipo_Disp
) AS pp_summary ON ( pp_summary.Oficina=p.Oficina
AND pp_summary.Tipo_Disp=p.Tipo_Disp)
So, this is your final list of tables.
FROM textcentro AS t
JOIN dtdz AS d ON t.jcentro03=d.dt
JOIN ppp AS p ON CAST(t.centro AS UNSIGNED) = p.Oficina
JOIN (
SELECT COUNT(*) AS NumPersonas,
Oficina,
Tipo_Disp
FROM People_DB
WHERE Fecha_y_hora_de_creacion >= '2013-08-01'
AND Fecha_y_hora_de_creacion < '2013-08-01' + INTERVAL 1 MONTH
AND Estado != 'Abierto'
AND Estado != 'Planificado'
GROUP BY Oficina, Tipo_Disp
) AS pp_summary ON ( pp_summary.Oficina=p.Oficina
AND pp_summary.Tipo_Disp=p.Tipo_Disp)
Three of these tables are "physical" tables, and the fourth is a "virtual" table, constructed as a summary of the physical table called People_DB.
You can include
pp_summary.NumPersonas
in your SELECT list.
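The same pre-aggregated "virtual table" idea can produce both extra totals the question asks for. The sketch below is an assumption-laden miniature: it uses SQLite via Python's sqlite3 module, keeps only three columns of ppp, takes both summaries from ppp itself rather than People_DB, and loads the case counts from the question's expected output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ppp (Oficina TEXT, Tipo_Disp TEXT, Etq_Amarilla TEXT)")
cases = {  # (device type, label) -> number of cases, from the question
    ("ORDENADOR", "ORD1234"): 5, ("ORDENADOR", "ORD3333"): 2,
    ("ORDENADOR", "ORD2222"): 1, ("TECLADO", "TECYYYY"): 2,
    ("TECLADO", "TECXXXX"): 4, ("PANTALLA", "PAN0000"): 1,
}
for (tipo, etq), n in cases.items():
    conn.executemany("INSERT INTO ppp VALUES ('1002', ?, ?)", [(tipo, etq)] * n)

# Each summary is computed once and joined, not re-run per output row.
query = """
    SELECT p.Tipo_Disp, p.Etq_Amarilla,
           COUNT(*) AS per_label, by_type.n AS per_type, by_office.n AS per_office
    FROM ppp AS p
    JOIN (SELECT Oficina, Tipo_Disp, COUNT(*) AS n
          FROM ppp GROUP BY Oficina, Tipo_Disp) AS by_type
      ON by_type.Oficina = p.Oficina AND by_type.Tipo_Disp = p.Tipo_Disp
    JOIN (SELECT Oficina, COUNT(*) AS n
          FROM ppp GROUP BY Oficina) AS by_office
      ON by_office.Oficina = p.Oficina
    GROUP BY p.Oficina, p.Tipo_Disp, p.Etq_Amarilla, by_type.n, by_office.n
    ORDER BY p.Tipo_Disp, p.Etq_Amarilla
"""
totals = conn.execute(query).fetchall()
for row in totals:
    print(row)
```

Because each summary contributes exactly one row per office or per office and device type, the per-label COUNT(*) is not inflated by the joins.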
Fourth, avoid the nonstandard extensions to MySQL GROUP BY functionality, and use standard SQL. Read this for more information.
http://dev.mysql.com/doc/refman/5.0/en/group-by-extensions.html
Fifth, add appropriate indexes to your tables.