Sorry for the long back-story, but it's needed to clarify the question.
In my org, computers have names matching CNT30[0-9]{3}[1-9a-z], for example cnt300021 or cnt30253a.
The last character is a "qualifier": a single workplace may have several identically numbered computers assigned to it, distinguished only by this qualifier. For example, cnt300021 may be the desktop computer at workplace #002, and cnt30002a may be the notebook assigned to the same workplace. Workplaces are "virtual" and exist just for our (IT dept) convenience.
Each dept has its own unique range [0-9]{3}. For example, the accounting computers have names from cnt302751 up to cnt30299z, which gives them 25 unique workplaces max, with up to 35 computers per workplace. (In real life most users have one desktop PC, far fewer have a desktop and a notebook, and only 2 or 3 technicians have more than one notebook at their disposal.)
Recently, while doing an inventory of the computers' passports (not sure about the term: a paper that is for a computer what a passport is for a human), I found some holes in the sequential numbering. For example, we have cnt302531 and cnt302551, but no cnt302541, which means there is no workplace #254.
What do I want to do? I want to find these gaps without searching manually. For this I need a loop from 1 to MaxComp=664 (no higher workplace numbers have been assigned yet).
Here is what I could write in some pseudo-SQL-BASIC:
for a=1 to MaxComp
a$="CNT30"+right(a+1000,3)
'comparing only 8 leftmost characters, ignoring 9th one - the qualifier
b$=(select name from table where left(name,8) like a$)
print a$;b$
next a
That code should give me two columns: possible names and existing ones.
But I can't figure out how to implement this in an SQL query. What I tried:
# because of the qualifier there may be several computers with the same
# 8 leftmost characters
select @cnum:=@cnum+1 as CompNum, group_concat(name separator ',')
# PCs are inventoried by OCS-NG Inventory software
from hardware
cross join (select @cnum:=0) cnt
where left(hardware.name,8)=concat('CNT30',right(@cnum+1000,3))
limit 100
But this construct returns exactly one row. I can't understand whether this is possible without stored procedures at all, and if it is, what did I do wrong?
I found a working path.
(At first I tried to use a stored function.)
CREATE FUNCTION `count_comps`(num smallint) RETURNS tinytext CHARSET utf8
BEGIN
return (select group_concat(name separator ',')
from hardware where left(hardware.name,8)=concat('CNT30',right(num+1000,3))
);
END
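Called on its own, the function already answers for a single workplace number. For the example gap above:

select count_comps(254);
# returns NULL - no computer names start with 'CNT30254'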
Then I tried hard to replicate the function's results in a subquery, and I did it! Note: the inner select returns exactly the same results as the function does.
# Starting point. May be INcreased to narrow the results list
set @cnum:=0;
select
@cnum:=@cnum+1 as CompNum,
concat('CNT30',right(@cnum+1000,3)) as CalcNum,
# this
count_comps(@cnum) as hwns,
# and this gives equal results
(select group_concat(name separator ',')
from hardware where left(name,8)=calcnum
) hwn2
from hardware
# no more dummy tables here
# Ending point. May be DEcreased to narrow the results list
where @cnum<665;
So, the wrong part of the "classical" approach was the use of a dummy table, which turns out not to be necessary.
Partial results example (starting with set @cnum:=479; and ending with where @cnum<530;):
CompNum, CalcNum, hwns, hwn2
'488', 'CNT30488', 'CNT304881', 'CNT304881'
'489', 'CNT30489', 'CNT304892', 'CNT304892'
'490', 'CNT30490', 'CNT304901,CNT304902,CNT304903', 'CNT304901,CNT304902,CNT304903'
'491', 'CNT30491', NULL, NULL
'492', 'CNT30492', NULL, NULL
'493', 'CNT30493', 'CNT304932', 'CNT304932'
'494', 'CNT30494', 'CNT304941', 'CNT304941'
I found that there are no workplaces #491 and #492. The next time PCs are added for the 'October Region' dept (range 480-529), at least two of the new PCs will get the names CNT304911 and CNT304921, filling this gap.
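As a side note: on MySQL 8.0 or later the whole gap search can be written without user variables at all. A sketch, assuming the same hardware table from OCS-NG, using a recursive CTE as the number generator and keeping only the numbers with no matching computer:

with recursive nums as (
  select 1 as n
  union all
  select n + 1 from nums where n < 664
)
select concat('CNT30', right(n + 1000, 3)) as missing
from nums
where not exists (
  select 1 from hardware
  where left(hardware.name, 8) = concat('CNT30', right(n + 1000, 3))
);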
In pseudo code I'd like to accomplish this:
select
count(groupA) as gA,
count(groupB) as gB,
count(groupC) as gC,
sum(gA,gB,gC) as groupABCTotal
But the various ways I've tried have resulted in syntax errors. What's the correct way to achieve this without re-selecting the group counts for the sum?
If you want the sum as another column of output, you'll have to re-select the constituent columns in one fashion or another.
I find this rather readable:
SELECT d.*, gA+gB+gC AS groupABCTotal
FROM (SELECT COUNT(groupA) AS gA,
COUNT(groupB) AS gB,
COUNT(groupC) AS gC
FROM tbl) d;
But this works, too, as you know:
SELECT COUNT(groupA) AS gA,
COUNT(groupB) AS gB,
COUNT(groupC) AS gC,
COUNT(groupA)+COUNT(groupB)+COUNT(groupC) AS groupABCTotal
FROM tbl;
Now, MySQL is probably smart enough not to recompute redundant aggregates, so COUNT(groupA) would be computed only once in the second form above.
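If you're on MySQL 8.0 or later, a CTE gives a third variant that names each count exactly once and reads much like the derived-table form:

WITH d AS (
  SELECT COUNT(groupA) AS gA,
         COUNT(groupB) AS gB,
         COUNT(groupC) AS gC
  FROM tbl
)
SELECT d.*, gA + gB + gC AS groupABCTotal
FROM d;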
I am using the following MySQL query, which is working fine (I mean, it gives me the desired output), but... let's first look at the query:
select
fl.file_ID,
length(fl.filedesc) as l,
case
when
fl.part_no is null
and length(fl.filedesc)>60
then concat(fl.fileno,' ', left(fl.filedesc, 60),'...')
when
fl.part_no is null
and length(fl.filedesc)<=60
then concat(fl.fileno,' ',fl.filedesc)
when
fl.part_no is not null
and length(fl.filedesc)>60
then concat(fl.fileno,'(',fl.part_no,')', left(fl.filedesc, 60),'...')
when
fl.part_no is not null
and length(fl.filedesc)<=60
then concat(fl.fileno,'(',fl.part_no,')',fl.filedesc)
end as filedesc
from filelist fl
I don't want to use the length function repeatedly, because I guess it hits the database every time, causing a performance issue. Please suggest whether I can store the length once and use it several times.
Once you have accessed a given row, what you do with the columns has only a small impact on performance. So repeated use of that length function doesn't "hit the database" nearly as much as you fear.
The analogy I would use is a postal carrier delivering mail to your house, which is miles outside of town. He drives for 20 minutes to reach your mailbox, then worries that it takes too much time to insert the letters one at a time instead of all at once. The cost of that inefficiency is insignificant compared to the long drive.
That said, you can make the query more concise or easier to code or to look at. But this probably won't have a big benefit for performance.
select
fl.file_ID,
concat(fl.fileno,
ifnull(concat('(',fl.part_no,')'), ' '),
left(fl.filedesc,60),
if(length(fl.filedesc)>60,'...','')
) as filedesc
from filelist fl
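And if you really do want to spell length() only once, a derived table can carry it under an alias. A minimal sketch, dropping the part_no branches for brevity:

select d.file_ID,
       if(d.len > 60,
          concat(d.fileno, ' ', left(d.filedesc, 60), '...'),
          concat(d.fileno, ' ', d.filedesc)) as filedesc
from (select file_ID, fileno, part_no, filedesc,
             length(filedesc) as len
      from filelist) d;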
Is there any way to directly access the stemmer used by the FORMSOF() option of a CONTAINS full-text search query, so that it returns the stems/inflections of an input word, not just those derivations that exist in a searched column?
For example, the query
SELECT * FROM dbo.MyDB WHERE contains(CHAR_COL,'FORMSOF(INFLECTIONAL, prettier)')
returns the stem "pretty" and other inflections such as "prettiest" if they exist in the CHAR_COL column. What I want is to call the FORMSOF() function directly, without referencing a column at all. Any chance?
EDIT:
The query that met my needs ended up being
SELECT * FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY group_ID ORDER BY GROUP_ID) ord, display_term
from sys.dm_fts_parser('FORMSOF( FREETEXT, running) and FORMSOF(FREETEXT, jumping)', 1033, null, 1)) a
WHERE ord=1
(Requires membership in the sysadmin fixed server role and access rights to the specified stoplist.)
No, you cannot do this: there is no way to access the stemmer directly.
You can get an idea of how it works by looking at the Solr source code, but it might (and I guess will) differ from the one implemented in MS SQL full-text search.
UPDATE: It turns out that in SQL Server 2008 R2 you can do something quite close to what you want. A special table-valued UDF was added:
sys.dm_fts_parser('query_string', lcid, stoplist_id, accent_sensitivity)
It allows you to get the tokenization result (i.e. the result after word breaking, thesaurus expansion, and stop-list application). So if you feed it 'FORMSOF(....)' it will give you the result you want (well, you will have to process the result set anyway). Here's the corresponding article on MSDN.
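For example, with the asker's original word (1033 is the English LCID), something along these lines lists the inflections directly:

SELECT display_term
FROM sys.dm_fts_parser('FORMSOF(INFLECTIONAL, prettier)', 1033, NULL, 1);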
I'm running this query
SELECT
country,
countries.code,
countries.lat,
countries.lng,
countries.zoom,
worldip.start,
worldip.end
FROM countries, worldip
WHERE countries.code = worldip.code
AND
'91.113.120.5' BETWEEN worldip.start AND worldip.end
ORDER BY worldip.start DESC
on tables with these fields:
worldip countries
-------------- ----------------
start code
end country
code lat
country_name lng
zoom
And sometimes I'm getting two results in two different countries for one IP. I understand why
'91.113.120.5' BETWEEN worldip.start AND worldip.end
would return two different results, since 10 is between 9 and 11, but also between 5 and 12. I would have thought that including WHERE countries.code = worldip.code would have prevented this, or at least ensured I got the right country no matter how many results were returned, but it doesn't.
I also added ORDER BY worldip.start DESC, which seems to work, since the more specific the matching range, the higher up the list it appears. You can see it working (or not) here. But that's a quick fix and I'd like to do it right.
SQL is a real weak point for me. Can anyone explain what I'm doing wrong?
Firstly, nice app. I was looking for flights: I would love price comparisons, and no #-based links please. You could try a free geolocation service instead of using your own GeoIP database. That aside: are your IP fields stored in a MySQL datatype that allows numeric comparison? That may help you get the correct ordering. Otherwise the values are compared as strings, and problems can arise when IPs have different lengths, and so on.
With an integer representation of IPs you can use the <= and >= operators.
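For example, assuming start and end hold integer values produced by MySQL's INET_ATON(), the question's lookup becomes (a sketch, keeping the question's table and column names):

SELECT country, countries.code
FROM countries
JOIN worldip ON countries.code = worldip.code
WHERE INET_ATON('91.113.120.5') BETWEEN worldip.start AND worldip.end;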
Is it particularly bad to have a very, very large SQL query with lots of (potentially redundant) WHERE clauses?
For example, here's a query I've generated from my web application with everything turned off, which should be the largest possible query for this program to generate:
SELECT *
FROM 4e_magic_items
INNER JOIN 4e_magic_item_levels
ON 4e_magic_items.id = 4e_magic_item_levels.itemid
INNER JOIN 4e_monster_sources
ON 4e_magic_items.source = 4e_monster_sources.id
WHERE (itemlevel BETWEEN 1 AND 30)
AND source!=16 AND source!=2 AND source!=5
AND source!=13 AND source!=15 AND source!=3
AND source!=4 AND source!=12 AND source!=7
AND source!=14 AND source!=11 AND source!=10
AND source!=8 AND source!=1 AND source!=6
AND source!=9 AND type!='Arms' AND type!='Feet'
AND type!='Hands' AND type!='Head'
AND type!='Neck' AND type!='Orb'
AND type!='Potion' AND type!='Ring'
AND type!='Rod' AND type!='Staff'
AND type!='Symbol' AND type!='Waist'
AND type!='Wand' AND type!='Wondrous Item'
AND type!='Alchemical Item' AND type!='Elixir'
AND type!='Reagent' AND type!='Whetstone'
AND type!='Other Consumable' AND type!='Companion'
AND type!='Mount' AND (type!='Armor' OR (false ))
AND (type!='Weapon' OR (false ))
ORDER BY type ASC, itemlevel ASC, name ASC
It seems to work well enough, but the site is not particularly high-traffic (a few hundred hits a day or so), and I wonder if it would be worth the effort to try to optimize the queries to remove redundancies and such.
Reading your query makes me want to play an RPG.
This is definitely not too long. As long as queries are well formatted, I'd say a practical limit is about 100 lines. After that, you're better off breaking subqueries into views just to keep your eyes from crossing.
I've worked with some queries that are 1000+ lines, and that's hard to debug.
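To illustrate the view idea, here is a sketch that lifts the item/level join out of the main query (hypothetical view name, and assuming itemlevel lives in 4e_magic_item_levels):

create view items_with_levels as
select mi.*, mil.itemlevel
from 4e_magic_items mi
inner join 4e_magic_item_levels mil on mi.id = mil.itemid;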
By the way, may I suggest a reformatted version? This is mostly to demonstrate the importance of formatting; I trust this will be easier to understand.
select *
from
4e_magic_items mi
,4e_magic_item_levels mil
,4e_monster_sources ms
where mi.id = mil.itemid
and mi.source = ms.id
and itemlevel between 1 and 30
and source not in(16,2,5,13,15,3,4,12,7,14,11,10,8,1,6,9)
and type not in(
'Arms' ,'Feet' ,'Hands' ,'Head' ,'Neck' ,'Orb' ,
'Potion' ,'Ring' ,'Rod' ,'Staff' ,'Symbol' ,'Waist' ,
'Wand' ,'Wondrous Item' ,'Alchemical Item' ,'Elixir' ,
'Reagent' ,'Whetstone' ,'Other Consumable' ,'Companion' ,
'Mount'
)
and ((type != 'Armor') or (false))
and ((type != 'Weapon') or (false))
order by
type asc
,itemlevel asc
,name asc
/*
Some thoughts:
==============
0 - Formatting really matters, in SQL even more than most languages.
1 - consider selecting only the columns you need, not "*"
2 - use of table aliases makes it short & clear ("MI", "MIL" in my example)
3 - joins in the WHERE clause will un-clutter your FROM clause
4 - use NOT IN for long lists
5 - logically, the last two lines can be added to the "type not in" section.
I'm not sure why you have the "or false", but I'll assume some good reason
and leave them here.
*/
The default MySQL 5.0 server limit is 1MB, configurable up to 1GB.
This is configured via the max_allowed_packet setting on both client and server, and the effective limit is the lesser of the two.
Caveats:
It's likely that this "packet" limit does not map directly to characters in a SQL statement. Surely you want to take into account character encoding within the client, some packet metadata, etc.
SELECT @@global.max_allowed_packet
This is the only real limit; it's adjustable on a server, so there is no real straight answer.
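For example, to raise the server-wide limit to 64MB (the value is in bytes; requires administrative privileges, and connections opened after the change pick up the new value):

SET GLOBAL max_allowed_packet = 67108864;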
From a practical perspective, I generally consider any SELECT that ends up taking more than 10 lines to write (putting each clause/condition on a separate line) to be too long to easily maintain. At this point, it should probably be done as a stored procedure of some sort, or I should try to find a better way to express the same concept--possibly by creating an intermediate table to capture some relationship I seem to be frequently querying.
Your mileage may vary, and there are some exceptionally long queries that have a good reason to be. But my rule of thumb is 10 lines.
Example (mildly improper SQL):
SELECT x, y, z
FROM a, b
WHERE fiz = 1
AND foo = 2
AND a.x = b.y
AND b.z IN (SELECT q, r, s, t
FROM c, d, e
WHERE c.q = d.r
AND d.s = e.t
AND c.gar IS NOT NULL)
ORDER BY b.gonk
This is probably too large; optimizing, however, would depend largely on context.
Just remember, the longer and more complex the query, the harder it's going to be to maintain.
Most databases support stored procedures to avoid this issue. If your code is fast enough to execute and easy to read, you don't want to have to change it in order to get the compile time down.
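A sketch of the stored-procedure route against the question's tables (hypothetical procedure name; the full condition list from the question would go inside):

DELIMITER //
CREATE PROCEDURE filtered_magic_items()
BEGIN
  SELECT *
  FROM 4e_magic_items mi
  INNER JOIN 4e_magic_item_levels mil ON mi.id = mil.itemid
  WHERE itemlevel BETWEEN 1 AND 30;
END //
DELIMITER ;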
An alternative is to use prepared statements, so you take the parsing hit only once per client connection and then pass in only the parameters for each call.
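In MySQL, server-side prepared statements look like this (a sketch; the real statement would carry the full condition list from the question):

PREPARE sel FROM
  'SELECT * FROM 4e_magic_items WHERE source <> ? AND type <> ?';
SET @src = 16, @typ = 'Arms';
EXECUTE sel USING @src, @typ;
DEALLOCATE PREPARE sel;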
I'm assuming that by "turned off" you mean a field doesn't have a value?
Instead of checking that something is not this and also not that, etc., can't you just check whether the field is null? Or set the field to 'off' and check whether type (or whatever) equals 'off'. A sketch of this idea follows.
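The long exclusion chain would then collapse to a single test (assuming unset fields are stored as NULL, or as the sentinel 'off'):

SELECT * FROM 4e_magic_items WHERE type IS NULL;
-- or, with a sentinel value:
SELECT * FROM 4e_magic_items WHERE type = 'off';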