How to get the current text of a given page id - mediawiki

I have a bot that analyses the current text of some pages directly from the database. The page ids are known. In the past the bot used where revision.rev_id = page.page_latest && text.old_id = revision.rev_text_id. After a MediaWiki update, the bot no longer works.
The column revision.rev_text_id no longer exists. The documentation says that text.old_id is now referenced by the content table. My problem now is to find a path from page_id to the content table.

After posting the question, I continued my investigation, read the documentation again and found the solution (the slots table):
SELECT p.page_title, t.old_id, t.old_text
FROM `page` p,
`slots` s,
`content` c,
`text` t
WHERE p.page_id = $page_id
AND s.slot_origin = p.page_latest
AND c.content_id = s.slot_content_id
AND substr(c.content_address,1,3) = 'tt:'
AND t.old_id = substr(c.content_address,4)
But it is much slower than the old bot (tested on the same server): 7 min instead of 1.55s for 11274 pages. Maybe I should add some indexes.
EDIT
After adding a key with alter table slots add index (slot_origin), the process takes 1.162s (a little faster than the old bot).
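The lookup path page -> slots -> content -> text can be tried out on a toy database. The following is only a minimal sketch, assuming an in-memory SQLite stand-in with just the columns used in the query above (a real MediaWiki installation runs on MySQL/MariaDB and has many more columns); the helper names `build_toy_wiki` and `current_text` are made up for illustration.

```python
import sqlite3


def build_toy_wiki():
    """Create a tiny in-memory stand-in for the four tables used above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page    (page_id INTEGER PRIMARY KEY, page_title TEXT, page_latest INTEGER);
        CREATE TABLE slots   (slot_origin INTEGER, slot_content_id INTEGER);
        CREATE TABLE content (content_id INTEGER PRIMARY KEY, content_address TEXT);
        CREATE TABLE "text"  (old_id INTEGER PRIMARY KEY, old_text TEXT);
        -- Counterpart of: alter table slots add index (slot_origin)
        CREATE INDEX slots_origin ON slots (slot_origin);
        INSERT INTO page    VALUES (1, 'Main_Page', 42);
        INSERT INTO slots   VALUES (42, 7);
        INSERT INTO content VALUES (7, 'tt:99');
        INSERT INTO "text"  VALUES (99, 'Hello wiki');
    """)
    return conn


def current_text(conn, page_id):
    """Return (page_title, old_id, old_text) for a page id, or None."""
    return conn.execute("""
        SELECT p.page_title, t.old_id, t.old_text
        FROM page p
        JOIN slots s   ON s.slot_origin = p.page_latest
        JOIN content c ON c.content_id = s.slot_content_id
        JOIN "text" t  ON t.old_id = CAST(substr(c.content_address, 4) AS INTEGER)
        WHERE p.page_id = ?
          AND substr(c.content_address, 1, 3) = 'tt:'
    """, (page_id,)).fetchone()


conn = build_toy_wiki()
row = current_text(conn, 1)
```

The page id is bound as a parameter instead of being interpolated into the SQL string. The MediaWiki schema also has a slot_revision_id column in slots; since slot_origin points at the revision in which the content was first introduced, joining on slot_revision_id instead may be worth testing against your own database.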


Updating JSON in SQLite with JSON1

The SQLite JSON1 extension has some really neat capabilities. However, I have not been able to figure out how I can update or insert individual JSON attribute values.
Here is an example
CREATE TABLE keywords
(
id INTEGER PRIMARY KEY,
lang INTEGER NOT NULL,
kwd TEXT NOT NULL,
locs TEXT NOT NULL DEFAULT '{}'
);
CREATE INDEX kwd ON keywords(lang,kwd);
I am using this table to store keyword searches and to record the locations from which each search was initiated in the object locs. A sample entry in this database table would look like the one shown below
id:1,lang:1,kwd:'stackoverflow',locs:'{"1":1,"2":1,"5":1}'
The location object attributes here are indices to the actual locations stored elsewhere.
Now imagine the following scenarios
A search for stackoverflow is initiated from location index "2". In this case I simply want to increment the value at that index so that after the operation the corresponding row reads
id:1,lang:1,kwd:'stackoverflow',locs:'{"1":1,"2":2,"5":1}'
A search for stackoverflow is initiated from a previously unknown location index "7" in which case the corresponding row after the update would have to read
id:1,lang:1,kwd:'stackoverflow',locs:'{"1":1,"2":1,"5":1,"7":1}'
It is not clear to me that this can in fact be done. I tried something along the lines of
UPDATE keywords json_set(locs,'$.2','2') WHERE kwd = 'stackoverflow';
which gave the error message error near json_set. I'd be most obliged to anyone who might be able to tell me how/whether this should/can be done.
It is not necessary to create such complicated SQL with subqueries to do this.
The SQL below would solve your needs.
UPDATE keywords
SET locs = json_set(locs,'$.7', IFNULL(json_extract(locs, '$.7'), 0) + 1)
WHERE kwd = 'stackoverflow';
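To check that this single UPDATE covers both scenarios from the question (an existing index and a previously unknown one), it can be run from Python's built-in sqlite3 module. This is a sketch that assumes the SQLite library bundled with your Python was compiled with the JSON functions (the default in recent builds); the function name `record_search` is made up for illustration.

```python
import json
import sqlite3


def record_search(conn, kwd, loc):
    """Increment the counter for location index `loc`, creating it at 1 if absent."""
    path = f"$.{int(loc)}"  # int() keeps arbitrary text out of the JSON path
    conn.execute(
        "UPDATE keywords "
        "SET locs = json_set(locs, ?, IFNULL(json_extract(locs, ?), 0) + 1) "
        "WHERE kwd = ?",
        (path, path, kwd),
    )


conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE keywords (
    id INTEGER PRIMARY KEY, lang INTEGER NOT NULL, kwd TEXT NOT NULL,
    locs TEXT NOT NULL DEFAULT '{}')""")
conn.execute("""INSERT INTO keywords (lang, kwd, locs)
                VALUES (1, 'stackoverflow', '{"1":1,"2":1,"5":1}')""")

record_search(conn, "stackoverflow", 2)  # known location: 1 becomes 2
record_search(conn, "stackoverflow", 7)  # unknown location: created with 1

locs = json.loads(conn.execute("SELECT locs FROM keywords WHERE id = 1").fetchone()[0])
```

The result is parsed with json.loads before inspecting it, so the check does not depend on how SQLite happens to format the stored JSON text.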
I know this is old, but it's the first result when searching, so it deserves a better solution.
I could have just deleted this question but given that the SQLite JSON1 extension appears to be relatively poorly understood I felt it would be more useful to provide an answer here for the benefit of others. What I have set out to do here is possible but the SQL syntax is rather more convoluted.
UPDATE keywords set locs =
(select json_set(json(keywords.locs),'$.N',
ifnull(
(select json_extract(keywords.locs,'$.N') from keywords where id = '1'),
0)
+ 1)
from keywords where id = '1')
where id = '1';
will accomplish both of the updates I have described in my original question above. Given how complicated this looks a few explanations are in order
The UPDATE keywords part does the actual updating, but it needs to know what to update
The SELECT json_set part is where we establish the value to be updated
If the relevant value does not exist in the first place, we do not want to do a + 1 on a null value, so we do an IFNULL test
The WHERE id = bits ensure that we target the right row
Having now worked with JSON1 in SQLite for a while, I have a tip to share with others going down the same road. It is easy to waste your time writing extremely convoluted and hard-to-maintain SQL in an effort to perform in-place JSON manipulation. Consider using SQLite in memory tables - CREATE TEMP TABLE... to store intermediate results and write a sequence of SQL statements instead. This makes the code a whole lot easier to understand and to maintain.
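As an illustration of that tip, the counter update from the question can be split into small steps around a temporary table. This is only a sketch under assumptions: the temp table `loc_counts` and the function `record_search_stepwise` are hypothetical names, and it relies on the JSON1 functions json_each and json_group_object being available in your SQLite build.

```python
import json
import sqlite3


def record_search_stepwise(conn, row_id, loc):
    """Same effect as the one-statement update, as a sequence of simple steps."""
    cur = conn.cursor()
    # 1. Unpack the JSON object into ordinary rows.
    cur.execute("CREATE TEMP TABLE loc_counts (loc TEXT PRIMARY KEY, n INTEGER NOT NULL)")
    cur.execute(
        "INSERT INTO loc_counts (loc, n) "
        "SELECT j.key, j.value FROM keywords AS k, json_each(k.locs) AS j "
        "WHERE k.id = ?",
        (row_id,),
    )
    # 2. Make sure the location exists, then increment it.
    cur.execute("INSERT OR IGNORE INTO loc_counts (loc, n) VALUES (?, 0)", (str(loc),))
    cur.execute("UPDATE loc_counts SET n = n + 1 WHERE loc = ?", (str(loc),))
    # 3. Pack the rows back into a JSON object and store it.
    cur.execute(
        "UPDATE keywords SET locs = (SELECT json_group_object(loc, n) FROM loc_counts) "
        "WHERE id = ?",
        (row_id,),
    )
    cur.execute("DROP TABLE loc_counts")


conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE keywords (
    id INTEGER PRIMARY KEY, lang INTEGER NOT NULL, kwd TEXT NOT NULL,
    locs TEXT NOT NULL DEFAULT '{}')""")
conn.execute("""INSERT INTO keywords (lang, kwd, locs)
                VALUES (1, 'stackoverflow', '{"1":1,"2":1,"5":1}')""")
record_search_stepwise(conn, 1, 2)
record_search_stepwise(conn, 1, 7)
result = json.loads(conn.execute("SELECT locs FROM keywords WHERE id = 1").fetchone()[0])
```

For a single counter this is more code than the one-line json_set update, so the approach pays off mainly when several JSON manipulations have to be combined.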

trying to understand if my page is vulnerable

I am creating a PHP page with a small and simple database.
When I visit it online and pass the parameter "length" in the URL, like index.php/?length=1, it works fine and fetches the data.
If I add a single quote, like index.php/?length=1', I get no SQL error on the page...
but if I use index.php/?length=-1 I see an SQL error on my page.
Does this mean that my page is vulnerable?
How can I further test it and fix the problem?
Edit: added the code
$shirt = $wpdb->get_results( $wpdb->prepare("SELECT `title`, `website`, `material`, `color`, `width`, `height`, `group`, `category`, `numbers_positive`, `numbers_negative`, `custom` FROM {$wpdb->shirts} WHERE `id` = %d ORDER BY `rank` ASC, `id` ASC", intval($shirt_id)) );
if (!isset($shirt[0])) return false;
$shirt= $shirt[0];
$shirt->title = htmlspecialchars(stripslashes($shirt->title), ENT_QUOTES);
$shirt->custom = maybe_unserialize($shirt->custom);
$shirt->color = maybe_unserialize($shirt->color);
if ( $this->hasBridge() ) {
global $lmBridge;
$shirt->shirtColor = $lmBridge->getShirtColor($shirt->color);
}
$shirt = (object)array_merge((array)$shirt,(array)$shirt->custom);
unset($shirt->custom);
return $shirt;
Yes, from the URL examples you have given, it seems that you take user input and insert it directly into your MySQL statement. That is the absolute worst. You should always sanitize user input, because raw input from a user can terminate the quoted string in the query and let them delete every table in your DB. This is a great example: Bobby Tables
Also, this has been a topic of great discussion. There is a great answer here
Edit* Since you are using the WordPress framework, and looking at your code, it's not as bad as it seemed.
Accepting the input but generating an error on -1 does not necessarily mean you are susceptible to an injection attack. As long as you are verifying that the input is an integer and only using the integer component, you're fairly safe.
Prepared statements make it even more secure by separating the data from the query. Doing that means someone can never 'break out' of what you are supposed to be working on. It's absolutely the right way to use SQL.
We can even take it a step further by limiting the account's ability to do anything other than run stored queries, and storing your queries on the SQL server side rather than in your PHP. At that point, even if they broke out (which they can't), they would only be able to access those defined queries.
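The two defences described above (accept only an integer, then bind it as a parameter) can be sketched outside of WordPress as well. The example below uses Python's sqlite3 module purely for illustration; the `shirts` table with a `length` column and the function name are assumptions, not the asker's actual schema.

```python
import sqlite3


def fetch_titles_by_length(conn, raw_length):
    """Validate the raw query-string value, then bind it as a parameter."""
    try:
        length = int(raw_length)
    except (TypeError, ValueError):
        return []  # not an integer (e.g. "1'"): refuse instead of guessing
    if length < 0:
        return []  # negative lengths are rejected before reaching the database
    # The ? placeholder keeps the value out of the SQL text entirely.
    return conn.execute(
        "SELECT title FROM shirts WHERE length = ? ORDER BY title", (length,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shirts (id INTEGER PRIMARY KEY, title TEXT, length INTEGER)")
conn.executemany(
    "INSERT INTO shirts (title, length) VALUES (?, ?)",
    [("Plain", 1), ("Striped", 1), ("Long", 2)],
)
```

Rejecting -1 up front also avoids showing a raw SQL error to visitors, which is worth fixing on its own even when no injection is possible.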

MySQL order by problems

I have the following code:
echo "<form><center><input type=submit name=subs value='Submit'></center></form>";
$val=$_POST['resulta']; //this is from a textarea name='resulta'
if (isset($_POST['subs'])) //from submit name='subs'
{
$aa=mysql_query("select max(reservno) as 'maxr' from reservation") or die(mysql_error()); //select maximum reservno
$bb=mysql_fetch_array($aa);
$cc=$bb['maxr'];
$lines = explode("\n", $val);
foreach ($lines as $line) {
mysql_query("insert into location_list (reservno, location) values ('$cc', '$line')")
or die(mysql_error()); //insert value of textarea then save it separately in location_list if \n is found
}
If I input the following data on the textarea (assume that I have maximum reservno '00014' from reservation table),
Davao - Cebu
Cebu - Davao
then submit it, I'll have these data in my location_list table:
loc_id || reservno || location
00001 || 00014 || Davao - Cebu
00002 || 00014 || Cebu - Davao
Then this code:
$gg=mysql_query("SELECT GROUP_CONCAT(IF((@var_ctr := @var_ctr + 1) = @cnt,
location,
SUBSTRING_INDEX(location,' - ', 1)
)
ORDER BY loc_id ASC
SEPARATOR ' - ') AS locations
FROM location_list,
(SELECT @cnt := COUNT(1), @var_ctr := 0
FROM location_list
WHERE reservno='$cc'
) dummy
WHERE reservno='$cc'") or die(mysql_error()); //QUERY IN QUESTION
$hh=mysql_fetch_array($gg);
$ii=$hh['locations'];
mysql_query("update reservation set itinerary = '$ii' where reservno = '$cc'")
or die(mysql_error());
is supposed to update reservation table with 'Davao - Cebu - Davao' but it's returning this instead, 'Davao - Cebu - Cebu'. I was previously helped by this forum to have this code working but now I'm facing another difficulty. Just can't get it to work. Please help me. Thanks in advance!
I got it working (without ORDER BY loc_id ASC) as long as I set phpMyAdmin operations loc_id ascending. But whenever I delete all data, it goes back as loc_id descending so I have to reset it. It doesn't entirely solve the problem but I guess this is as far as I can go. :)) I just have to make sure that the table column loc_id is always in ascending order. Thank you everyone for your help! I really appreciate it! But if you have any better answer, like how to set the table column always in ascending order or better query, etc, feel free to post it here. May God bless you all!
The database server is allowed to rewrite your query to optimize its execution. This might affect the order of the individual parts, in particular the order in which the various assignments are executed. I assume that some such reordering causes the result of the query to become undefined, in such a way that it works on sqlfiddle but not on your actual production system.
I can't put my finger on the exact location where things go wrong, but I believe that the core of the problem is that SQL is intended to work on relations, while you are trying to abuse it for sequential programming. I suggest you retrieve the data from the database using portable SQL without any variable hackery, and then use PHP to perform any post-processing you might need. PHP is much better suited to express the ideas you're formulating, and no optimization or reordering of statements will get in your way there. And as your query currently only results in a single value, fetching multiple rows and combining them into a single value in the PHP code shouldn't increase complexity too much.
Edit:
While discussing another answer using a similar technique (by Omesh as well, just as the answer your code is based upon), I found this in the MySQL manual:
As a general rule, you should never assign a value to a user variable
and read the value within the same statement. You might get the
results you expect, but this is not guaranteed. The order of
evaluation for expressions involving user variables is undefined and
may change based on the elements contained within a given statement;
in addition, this order is not guaranteed to be the same between
releases of the MySQL Server.
So there are no guarantees about the order in which these variable assignments are evaluated, and therefore no guarantees that the query does what you expect. It might work, but it might fail suddenly and unexpectedly. Therefore I strongly suggest you avoid this approach unless you have some reliable mechanism to check the validity of the results, or really don't care about whether they are valid.
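The post-processing suggested above is short once the rows are fetched with a plain SELECT location FROM location_list WHERE reservno = ... ORDER BY loc_id. The sketch below shows the joining logic in Python rather than PHP, only to keep it self-contained; the function name `build_itinerary` is made up, and it assumes every row has the form 'Origin - Destination'.

```python
def build_itinerary(legs, sep=" - "):
    """Join legs like 'Davao - Cebu', 'Cebu - Davao' into 'Davao - Cebu - Davao'.

    `legs` must already be in loc_id order.
    """
    # Drop empty lines and stray whitespace such as a trailing '\r' from a textarea.
    legs = [leg.strip() for leg in legs if leg.strip()]
    if not legs:
        return ""
    # Keep only the origin of every leg except the last, which is kept whole.
    parts = [leg.split(sep, 1)[0] for leg in legs[:-1]]
    parts.append(legs[-1])
    return sep.join(parts)


itinerary = build_itinerary(["Davao - Cebu", "Cebu - Davao"])
```

Because the order comes from an explicit ORDER BY on the SELECT and the combining happens in application code, nothing depends on the evaluation order of user variables.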

Insert bulk failed due to a schema change of the target table

select
FiscalMonthID = (select FiscalMonthID from CurrentFiscalMonth (nolock)),
T.OrgKey,
DataSourceKey = 26,
OrganizationTypeKey = 2,
SourceSystemID = MAX(T.MbsId),
WEGFlag = convert(bit,0),
D.CreateDT,
D.CreateBy,
D.UpdateDT,
D.UpdateBy
from WorkDB.dbo.TempMbsOrgMap (nolock) as T
join WorkDB.dbo.MBSOrganization_Denorm2 (nolock) as D
on T.MbsId = D.OrganizationID
--where OrgKey not in (select OrgKey from OrgMap where FiscalMonthID=258 and DataSourceKey=26 and OrganizationTypeKey=2)
group by
T.OrgKey,
D.CreateDT,
D.CreateBy,
D.UpdateDT,
D.UpdateBy
I don't know whether this person ever got the problem resolved. If anyone else hits this error, the most helpful article I have found so far is:
https://learn.microsoft.com/en-us/archive/blogs/sqlserverfaq/executing-bcp-fails-with-sqlstate-37000-nativeerror-4891-error-microsoftodbc-sql-server-driversql-serverinsert-bulk-failed-due-to-a-schema-change-of-the-target-table
The recommendations from the article (it's worth reading the whole thing though) are as follows:
Below is some of the action plan you can try, this helped in my case
though.
Drop the Constraints before the BCP run and recreate them after the
run
Disable the Auto update stats (To isolate the issue)
Check if any parallel index rebuilds happening.
If still the issue persists after implementing the above change,
collect the Profiler trace to capture the activity when bcp is failing
to further investigation.

Can we control LINQ expression order with Skip(), Take() and OrderBy()

I'm using LINQ to Entities to display paged results. But I'm having issues with the combination of Skip(), Take() and OrderBy() calls.
Everything works fine, except that OrderBy() is applied too late: it's executed after the result set has been cut down by Skip() and Take().
So each page of results has its items in order, but the ordering is done on a single page of data instead of ordering the whole set and then limiting those records with Skip() and Take().
How do I set precedence with these statements?
My example (simplified)
var query = ctx.EntitySet.Where(/* filter */).OrderByDescending(e => e.ChangedDate);
int total = query.Count();
var result = query.Skip(n).Take(x).ToList();
One possible (but a bad) solution
One possible solution would be to apply clustered index to order by column, but this column changes frequently, which would slow database performance on inserts and updates. And I really don't want to do that.
EDIT
I ran ToTraceString() on my query where we can actually see when order by is applied to the result set. Unfortunately at the end. :(
SELECT
-- columns
FROM (SELECT
-- columns
FROM (SELECT -- columns
FROM ( SELECT
-- columns
FROM table1 AS Extent1
WHERE EXISTS (SELECT
-- single constant column
FROM table2 AS Extent2
WHERE (Extent1.ID = Extent2.ID) AND (Extent2.userId = :p__linq__4)
)
) AS Project2
limit 0,10 ) AS Limit1
LEFT OUTER JOIN (SELECT
-- columns
FROM table2 AS Extent3 ) AS Project3 ON Limit1.ID = Project3.ID
UNION ALL
SELECT
-- columns
FROM (SELECT -- columns
FROM ( SELECT
-- columns
FROM table1 AS Extent4
WHERE EXISTS (SELECT
-- single constant column
FROM table2 AS Extent5
WHERE (Extent4.ID = Extent5.ID) AND (Extent5.userId = :p__linq__4)
)
) AS Project6
limit 0,10 ) AS Limit2
INNER JOIN table3 AS Extent6 ON Limit2.ID = Extent6.ID) AS UnionAll1
ORDER BY UnionAll1.ChangedDate DESC, UnionAll1.ID ASC, UnionAll1.C1 ASC
My workaround solution
I've managed to work around this problem. Don't get me wrong here: I haven't solved the precedence issue yet, but I've mitigated it.
What I did?
This is the code I've used until I get an answer from Devart. If they won't be able to overcome this issue I'll have to use this code in the end.
// get ordered list of IDs
List<int> ids = ctx.MyEntitySet
.Include(/* Related entity set that is needed in where clause */)
.Where(/* filter */)
.OrderByDescending(e => e.ChangedDate)
.Select(e => e.Id)
.ToList();
// get total count
int total = ids.Count;
if (total > 0)
{
// get a single page of results
List<MyEntity> result = ctx.MyEntitySet
.Include(/* related entity set (as described above) */)
.Include(/* additional entity set that's needed in end results */)
.Where(string.Format("it.Id in {{{0}}}", string.Join(",", ids.ConvertAll(id => id.ToString()).Skip(pageSize * currentPageIndex).Take(pageSize).ToArray())))
.OrderByDescending(e => e.ChangedDate)
.ToList();
}
First of all I'm getting the ordered IDs of my entities. Getting only IDs performs well even with a larger set of data: the MySQL query is quite simple and fast. In the second part I partition these IDs and use them to get the actual entity instances.
Thinking about it, this should perform even better than the way I was doing it at the beginning (as described in my question), because getting the total count is much quicker due to the simplified query. The second part is practically the same, except that my entities are selected by their IDs instead of being partitioned using Skip and Take...
Hopefully someone may find this solution helpful.
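The same two-step idea (ordered IDs first, then full rows for one page) can be sketched independently of Entity Framework. The example below uses Python's sqlite3 module with a hypothetical `entity` table; the table, its columns and the function name are assumptions for illustration only.

```python
import sqlite3


def fetch_page(conn, page_index, page_size):
    """Return (total, rows) for one page, ordered over the whole set."""
    # Step 1: ordered list of IDs only; its length is also the total count.
    ids = [r[0] for r in conn.execute(
        "SELECT id FROM entity ORDER BY changed_date DESC, id ASC")]
    total = len(ids)
    page_ids = ids[page_index * page_size:(page_index + 1) * page_size]
    if not page_ids:
        return total, []
    # Step 2: load full rows for this page only, re-applying the same order.
    placeholders = ",".join("?" * len(page_ids))
    rows = conn.execute(
        f"SELECT id, title, changed_date FROM entity WHERE id IN ({placeholders}) "
        "ORDER BY changed_date DESC, id ASC",
        page_ids,
    ).fetchall()
    return total, rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY, title TEXT, changed_date TEXT)")
conn.executemany("INSERT INTO entity VALUES (?, ?, ?)", [
    (1, "a", "2020-01-01"), (2, "b", "2020-01-05"), (3, "c", "2020-01-03"),
    (4, "d", "2020-01-04"), (5, "e", "2020-01-02"),
])
total, first_page = fetch_page(conn, 0, 2)
```

Note that the ID list grows with the filtered set, so for very large result sets the first step may itself need a limit.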
I haven't worked directly with Linq to Entities, but it should have a way to hook specific stored procedures into certain locations when needed. (Linq to SQL did.) If so, you could turn this query into a stored procedure, doing exactly what is required, and doing it efficiently.
Assuming from your comment that persisting the values in a List is not acceptable:
There's no way to completely minimize the iterations, as you intended (and as I would have tried too, living in hope). Cutting the iterations down by one would be nice. Is it possible to just get the Count once and cache/session it? Then you could:
int total = ctx.EntitySet.Count(); // Hopefully you can avoid repeating this.
var result = ctx.EntitySet.Where(/* filter */).OrderBy(/* expression */).Skip(n).Take(x).ToList();
Hopefully you can cache the Count somehow, or avoid needing it every time. Even if you can't, this is the best you can do.
Could you please create a sample illustrating the problem and send it to us (support * devart * com, subject "EF: Skip, Take, OrderBy")?
Hope we will be able to help you.
You can also contact us using our forums or contact form.
Are you absolutely certain the ordering is off? What does the SQL look like?
Can you reorder your code as follows and post the output?
// Redefine your queries.
var query = ctx.EntitySet.Where(/* filter */).OrderBy(e => e.ChangedDate);
var skipped = query.Skip(n).Take(x);
// let's look at the SQL, shall we?
var querySQL = query.ToTraceString();
var skippedSQL = skipped.ToTraceString();
// actual execution of the queries...
int total = query.Count();
var result = skipped.ToList();
Edit:
I'm absolutely certain. You can check my "edit" to see trace result of my query with skipped trace result that is imperative in this case. Count is not really important.
Yeah, I see it. Wow, that's a stumper. Might even be an outright bug. I note you're not using SQL Server... what DB are you using? Looks like it might be MySQL.
One way:
var query = ctx.EntitySet.Where(/* filter */).OrderBy(/* expression */).ToList();
int total = query.Count;
var result = query.Skip(n).Take(x).ToList();
Convert it to a List before skipping. It's not too efficient, mind you...
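The difference between ordering before and after the cut can be seen on a plain in-memory list. This is a small Python sketch with made-up function names, not a model of what any LINQ provider does internally.

```python
def page_sorted_first(items, key, skip, take, reverse=False):
    """Order the whole set, then cut out one page (the intended behaviour)."""
    ordered = sorted(items, key=key, reverse=reverse)
    return ordered[skip:skip + take]


def page_sorted_last(items, key, skip, take, reverse=False):
    """Cut out a page first, then order only that page (the behaviour observed)."""
    return sorted(items[skip:skip + take], key=key, reverse=reverse)


values = [5, 1, 4, 2, 3]
good = page_sorted_first(values, key=lambda v: v, skip=0, take=2)
bad = page_sorted_last(values, key=lambda v: v, skip=0, take=2)
```

Here `good` holds the two smallest values of the whole list, while `bad` merely holds the first two stored values in sorted order. Sorting everything in memory costs time and memory proportional to the full result set, which is why it is only a fallback.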