SQL RegEx Create Split: Include SET - mysql

I'm writing a regex pattern to split MySQL CREATE statements into column definition arrays. So far, it works great for everything aside from SET columns. Here's the process:
$lines = explode("\n", $create_sql);
foreach ($lines as $line) {
$line = str_replace('NOT NULL', 'NOT_NULL', $line);
$pattern = '/`(.*)`[\s]*([^\s]*)[\s]*(NOT_NULL|NULL)[\s]*(.*),/';
$search = preg_match($pattern, $line, $matches);
if ($search !== false && count($matches) === 5) {
$columns[$matches[1]] = array(
'type' => $matches[2],
'null' => $matches[3] === 'NULL',
'extra' => $matches[4],
);
}
}
This process works great on a column with a definition like this:
`id` INT(11) NOT NULL AUTO_INCREMENT,
...but fails on SET columns. How can I best accommodate for the extra quotes and parentheses in this line?
`platform` SET('iOS', 'Android') NULL DEFAULT NULL,
To summarize, I need to integrate a pattern to match SET('one', 'two', n) in a string.

First just let slightly modify your pattern expression, first for readability (\s* instead of [\s]*), then to ensure working even with lowercase statements :
$pattern = '/`(.*)`\s*([^\s]*)\s*(NOT_NULL|NULL)\s*(.*),/i';
Then, to answer your question, prepare the pattern depending on a SET is present or not in the line:
$sub_pattern = stripos($line, ' SET(') ?
'SET\([^\)]*\)\s*[^\s]*'
: '[^\s]*';
$pattern = '/`(.*)`\s*(' . $sub_pattern . ')\s*(NOT_NULL|NULL)\s*(.*),/i';
Note that, however, this code is not secured against syntax errors in the source lines.

// Input: $sql
$match = '(
([^(),]+?| # no parens, or contains ():
[^(),]+?
\( [^)]+? \) # (123), (enum...), etc
[^(),]*?
) \s*(,\s*|$) # separator or terminator
)';
preg_match('/(CREATE\s+TABLE.*?\()(.*)(\)[^()]*(;|$))/si', $sql, $m);
list($x, $pre, $list, $post, $z) = $m;
$list = trim($list);
$list = preg_replace('/\s+/', ' ', $list);
$list = preg_replace("/$match/sx", "$1
", $list);
$list = trim($list);
// Output:
echo "<pre>
$pre
$list
$post\n</pre>\n";
Notes:
Certain quoted strings will fail to parse correctly.
I think all cases of parentheses work ok.
You will still need to parse each element in $list[], which should be one per column.

Related

Calculations cannot be performed by passed value from two different subroutines to a new subroutine: Perl

I used an anonymous hash to pass value from two different subroutines to a new subroutine. But, now I'm not able to perform calculations using the passed variables.
use warnings;
use strict;
use feature 'say';
use DBI;
use autodie;
use Data::Dumper;
use CGI;
print "Enter sequence";
my $seq = <STDIN>;
chomp $seq;
$len = length $seq;
my $f = nuc($seq);
perc({ len => $len });
sub nuc {
my ($c) = #_;
chomp $c;
my $len = length $c;
for (my $i = 0; $i< = $len; $i++) {
my $seq2 = substr($c, $i, 1);
$nuc=$nuc . $seq2;
chomp $nuc;
}
my $l = perc({nuc => $nuc});
}
sub perc {
my $params = shift;
my $k = $params->{nuc};
my $w = $params->{len};
my $db = "hnf1a";
my $user = "root";
my $password = "";
my $host = "localhost";
my $dbh = DBI->connect("DBI:mysql:database=$db:$host",$user,$password);
my $sth = $dbh->prepare('SELECT COUNT(*) FROM mody where nm = ?');
for (1..100) {
$sth->execute(int(rand(10)));
}
chomp (my $input = $k);
my #num = split /':'/, $input;
for my $num(#num) {
say "rows matching input nuc <$num>:";
$sth->execute($num);
my $count = $sth->fetchrow_array;
say "$count";
$u += $count;
}
}
$h = $u / $w;
print $h;
I passed the variables : $nuc and $len to the last subroutine 'perc' by declaring an anonymous hash.
When I use these variables to perform calculations I don't get a proper answer.
For the above division performed I got a statement as 'Illegal division'.
Please help me out. Thanks in advance.
You are making two separate calls to perc, each with only one of the required values in the hash. You can't do that: the subroutine won't "remember" a value passed to it across separate calls unless you write the code to do that
You need to collect all the values and pass them in a single call to perc
There are rather a lot of misunderstandings here. Let's go through your code.
use CGI;
Using CGI.pm is a bit dated, but it's not a terrible idea if you're writing a CGI program. But this isn't a CGI program, so this isn't necessary.
print "Enter sequence";
my $seq = <STDIN>;
chomp $seq;
$len = length $seq;
my $f = nuc($seq);
This looks OK. You prompt the user, get some input, remove the newline from the end of the input, get the length of the input and then pass your input into nuc().
So, let's look at nuc() - which could probably have a better name!
sub nuc {
my ($c) = #_;
chomp $c;
my $len = length $c;
for (my $i = 0; $i< = $len; $i++) {
my $seq2 = substr($c, $i, 1);
$nuc=$nuc . $seq2;
chomp $nuc;
}
my $l = perc({nuc => $nuc});
}
You get the parameter that has been passed in and remove the newline from the end of it (which does nothing as this is $seq which has already had its newline removed). You then get the length of this string (again!)
Then it gets very strange. Firstly, there's a syntax error (< = should be <=). Then you use a C-style for loop together with substr() too... well, basically you just copy $c to $nuc in a really inefficient manner. So this subroutine could be written as:
sub nuc {
my ($c) = #_;
$nuc = $c;
my $l = perc({ nuc => $nuc });
}
Oh, and I don't know why you chomp($nuc) each time round the loop.
Two more strange things. Firstly, you don't declare $nuc anywhere, and you have use strict in your code. Which means that this code doesn't even compile. (Please don't waste our time with code that doesn't compile!) And secondly, you don't explicitly return a value from nuc(), but you store the return value in $f. Because of the way Perl works, this subroutine will return the value in $l. But it's best to be explicit.
Then there's your perc() subroutine.
sub perc {
my $params = shift;
my $k = $params->{nuc};
my $w = $params->{len};
my $db = "hnf1a";
my $user = "root";
my $password = "";
my $host = "localhost";
my $dbh = DBI->connect("DBI:mysql:database=$db:$host",$user,$password);
my $sth = $dbh->prepare('SELECT COUNT(*) FROM mody where nm = ?');
for (1..100) {
$sth->execute(int(rand(10)));
}
chomp (my $input = $k);
my #num = split /':'/, $input;
for my $num(#num) {
say "rows matching input nuc <$num>:";
$sth->execute($num);
my $count = $sth->fetchrow_array;
say "$count";
$u += $count;
}
}
You get the hash ref which is passed in an store that in $params. You then extract the nuc and len values from that hash and store them in variables called $k and $w (you really need to improve your variable and subroutine names!) But each call to perc only has one of those values set - so only one of your two variables get a value, the other will be undef.
So then you connect to the database. And you run a select query a hundred times passing in random integers between 0 and 9. And ignore the value returned from the select statement. Which is bizarre and pointless.
Eventually, you start doing something with one of your input parameters, $k (the other, $w, is completely ignored). You copy it into another scalar variable before splitting it into an array. You then run the same SQL select statement once for each element in that array and add the number you get back to the running total in $u. And $u is another variable that you never declare, so (once again) this code doesn't compile.
Outside of your subroutines, you then do some simple maths with $u (an undeclared variable) and $w (a variable that was declared in a different scope) and store the result in $h (another undeclared variable).
I really don't understand what this code is supposed to do. And, to be honest, I don't think you do too. If you're at school, then you need to go back to your teacher and say that you have no idea what you are doing. If you're in a job, you need to tell your boss that you're not the right person for this task.
Either way, if you want to be a programmer, you need to go right back to the start and cover the very basics again.

Perl JSON issue when string starts with [ and not {

Hope some Perl gurus out there can help me out here. Basically my issue is when a JSON string starts with a "[" instead of a "{", Perl doesn't treat the variable as a hash after I use decode_json.
Here's a sample code.
#!/usr/bin/perl
use JSON;
use Data::Dumper;
$string1 = '{"Peti Bar":{"Literature":88,"Mathematics":82,"Art":99},"Foo Bar":{"Literature":67,"Mathematics":97}}';
$string = '[{"ActionID":5,"ActionName":"TEST- 051017"},{"ActionID":10,"ActionName":"Something here"},{"ActionID":13,"ActionName":"Some action"},{"ActionID":141,"ActionName":"Email Reminder"}]';
print "First string that starts with \"{\" below:\n$string1\n\n";
my $w = decode_json $string1;
my $count = keys %$w;
print "printing \$count's value -> $count\n\n";
print "Second string starts with \"[\" below:\n$string\n\n";
my $x = decode_json $string;
my $count2 = keys %$x;
print "printing \$count2's value -> $count2\n\n";
Below is the script output.
Both $w and $x works though. It's just I have to use keys $x instead of keys %$x on the other json string.
Now the issue with using that is I get a keys on reference is experimental at tests/jsontest.pl error. It won't stop the script but I'm worried about future compatibility issues.
What's the best way to approach this?
Use the ref function to determine what type the reference is. See perldoc -f ref.
my $w = decode_json $string1;
my $count = 1;
if( my $ref = ref( $w ) ){
if( $ref eq 'HASH' ){
$count = keys %$w;
}elsif( $ref eq 'ARRAY' ){
$count = scalar #$w;
}else{
die "invalid reference '$ref'\n";
}
}

DBI convert fetched arrayref to hash

I'm trying to write a program to fetch a big MySQL table, rename some fields and write it to JSON. Here is what I have for now:
use strict;
use JSON;
use DBI;
# here goes some statement preparations and db initialization
my $rowcache;
my $max_rows = 1000;
my $LIMIT_PER_FILE = 100000;
while ( my $res = shift( #$rowcache )
|| shift( #{ $rowcache = $sth->fetchall_arrayref( undef, $max_rows ) } ) ) {
if ( $cnt % $LIMIT_PER_FILE == 0 ) {
if ( $f ) {
print "CLOSE $fname\n";
close $f;
}
$filenum++;
$fname = "$BASEDIR/export-$filenum.json";
print "OPEN $fname\n";
open $f, ">$fname";
}
$res->{some_field} = $res->{another_field}
delete $res->{another_field}
print $f $json->encode( $res ) . "\n";
$cnt++;
}
I used the database row caching technique from
Speeding up the DBI
and everything seems good.
The only problem I have for now is that on $res->{some_field} = $res->{another_field}, the row interpreter complains and says that $res is Not a HASH reference.
Please could anybody point me to my mistakes?
If you want fetchall_arrayref to return an array of hashrefs, the first parameter should be a hashref. Otherwise, an array of arrayrefs is returned resulting in the "Not a HASH reference" error. So in order to return full rows as hashref, simply pass an empty hash:
$rowcache = $sth->fetchall_arrayref({}, $max_rows)

Perl XML2JSON : How to preserve XML element order?

I have a configuration file which is in XML format. I need to parse the XML and convert to JSON. I'm able to convert it with XML2JSON module of perl. But the problem is, it is not maintaining the order of XML elements. I strictly need the elements in order otherwise I cannot configure
My XML file is something like this. I have to configure an IP address and set that IP as a gateway to certain route.
<Config>
<ip>
<address>1.1.1.1</address>
<netmask>255.255.255.0</netmask>
</ip>
<route>
<network>20.20.20.0</network>
<netmask>55.255.255.0</netmask>
<gateway>1.1.1.1</gateway>
</route>
</Config>
This is my perl code to convert to JSON
my $file = 'config.xml';
use Data::Dumper;
open my $fh, '<',$file or die;
$/ = undef;
my $data = <$fh>;
my $XML = $data;
my $XML2JSON = XML::XML2JSON->new();
my $Obj = $XML2JSON->xml2obj($XML);
print Dumper($Obj);
The output I'm getting is,
$VAR1 = {'Config' => {'route' => {'netmask' => {'$t' => '55.255.255.0'},'gateway' => {'$t' => '1.1.1.1'},'network' => {'$t' => '20.20.20.0'}},'ip' => {'netmask' => {'$t' => '255.255.255.0'},'address' => {'$t' => '1.1.1.1'}}},'#encoding' => 'UTF-8','#version' => '1.0'};
I have a script which reads the json object and configure..
But it fails as it first tries to set gateway ip address to a route where the ip address is not yet configured and add then add ip address.
I strictly want key ip to come first and then route for proper configuration without error. Like this I have many dependencies where order of keys is a must.
Is there any way I can tackle this problem? I tried almost all modules of XML parsing like XML::Simple,Twig::XML,XML::Parser. But nothing helped..
Here's a program that I hacked together that uses XML::Parser to parse some XML data and generate the equivalent JSON in the same order. It ignores any attributes, processing instructions etc. and requires that every XML element must contain either a list of child elements or a text node. Mixing text and elements won't work, and this isn't checked except that the program will die trying to dereference a string
It's intended to be a framework for you to enhance as you require, but works fine as it stands with the XML data you show in your question
use strict;
use warnings 'all';
use XML::Parser;
my $parser = XML::Parser->new(Handlers => {
Start => \&start_tag,
End => \&end_tag,
Char => \&text,
});
my $struct;
my #stack;
$parser->parsefile('config.xml');
print_json($struct->[1]);
sub start_tag {
my $expat = shift;
my ($tag, %attr) = #_;
my $elem = [ $tag => [] ];
if ( $struct ) {
my $content = $stack[-1][1];
push #{ $content }, $elem;
}
else {
$struct = $elem;
}
push #stack, $elem;
}
sub end_tag {
my $expat = shift;
my ($elem) = #_;
die "$elem <=> $stack[-1][0]" unless $stack[-1][0] eq $elem;
for my $content ( $stack[-1][1] ) {
$content = "#$content" unless grep ref, #$content;
}
pop #stack;
}
sub text {
my $expat = shift;
my ($string) = #_;
return unless $string =~ /\S/;
$string =~ s/\A\s+//;
$string =~ s/\s+\z//;
push #{ $stack[-1][1] }, $string;
}
sub print_json {
my ($data, $indent, $comma) = (#_, 0, '');
print "{\n";
for my $i ( 0 .. $#$data ) {
# Note that $data, $indent and $comma are overridden here
# to reflect the inner context
#
my $elem = $data->[$i];
my $comma = $i < $#$data ? ',' : '';
my ($tag, $data) = #$elem;
my $indent = $indent + 1;
printf qq{%s"%s" : }, ' ' x $indent, $tag;
if ( ref $data ) {
print_json($data, $indent, $comma);
}
else {
printf qq{"%s"%s\n}, $data, $comma;
}
}
# $indent and $comma (and $data) are restored here
#
printf "%s}%s\n", ' ' x $indent, $comma;
}
output
{
"ip" : {
"address" : "1.1.1.1",
"netmask" : "255.255.255.0"
},
"route" : {
"network" : "20.20.20.0",
"netmask" : "55.255.255.0",
"gateway" : "1.1.1.1"
}
}
The problem isn't so much to do with XML parsing, but because perl hashes are not ordered. So when you 'write' some JSON... it can be any order.
The way to avoid this is to apply a sort function to your JSON.
You can do this by using sort_by to explicitly sort:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
use JSON::PP;
use Data::Dumper;
sub order_nodes {
my %rank_of = ( ip => 0, route => 1, address => 2, network => 3, netmask => 4, gateway => 5 );
print "$JSON::PP::a <=> $JSON::PP::b\n";
return $rank_of{$JSON::PP::a} <=> $rank_of{$JSON::PP::b};
}
my $twig = XML::Twig -> parse (\*DATA);
my $json = JSON::PP -> new;
$json ->sort_by ( \&order_nodes );
print $json -> encode( $twig -> simplify );
__DATA__
<Config>
<ip>
<address>1.1.1.1</address>
<netmask>255.255.255.0</netmask>
</ip>
<route>
<network>20.20.20.0</network>
<netmask>55.255.255.0</netmask>
<gateway>1.1.1.1</gateway>
</route>
</Config>
In some scenarios, setting canonical can help, as that sets ordering to lexical order. (And means your JSON output would be consistently ordered). This doesn't apply to your case.
You could build the node ordering via XML::Twig, either by an xpath expression, or by using twig_handlers. I gave it a quick go, but got slightly unstuck in figuring out how you'd 'tell' how to figure out ordering based on getting address/netmask and then network/netmask/gateway.
As a simple example you could:
my $count = 0;
foreach my $node ( $twig -> get_xpath ( './*' ) ) {
$rank_of{$node->tag} = $count++ unless $rank_of{$node->tag};
}
print Dumper \%rank_of;
This will ensure ip and route are always the right way around. However it doesn't order the subkeys.
That actually gets a bit more complicated, as you'd need to recurse... and then decide how to handle 'collisions' (like netmask - address comes before, but how does it sort compared to network).
Or alternatively:
my $count = 0;
foreach my $node ( $twig->get_xpath('.//*') ) {
$rank_of{ $node->tag } = $count++ unless $rank_of{ $node->tag };
}
This walks all the nodes, and puts them in order. It doesn't quite work, because netmask appears in both stanzas though.
You get:
{"ip":{"address":"1.1.1.1","netmask":"255.255.255.0"},"route":{"netmask":"55.255.255.0","network":"20.20.20.0","gateway":"1.1.1.1"}}
I couldn't figure out a neat way of collapsing both lists.

REGEXP BINARY sql syntax issue

I have this $value= 24153;
I have a field that can hold one or more values like this {"id":"2","value":["3"]} or like this {"id":"2","value":["3","4"]} or {"id":"2","value":["3","4","2"]}
I have this regex that works fine, but returns if the ONE value exists.
I need to improve this regex to the case there are more than one value in field.
REGEXP BINARY \'(.*{"id":"2","value":\["[^\"]*['.$value.'][^\"]*",?)+\]}.*\'
here is a regexp that will do what you want - but i have to agree with Your Common Sense - storing json and then using regex to extract data from it is sickening
REGEXP BINARY \'(.*{"id":"2","value":\[("[^\"]*",)*"'.$value.'"(,"[^\"]*")*)+\]}.*\'
$value = 4;
$subject_a = '{"id":"2","value":["4"]}';
$subject_b = '{"id":"2","value":["1","4","1","1"]}';
$subject_c = '{"id":"2","value":["1","1","1","1","1","44","1","1"]}';
$pattern = '/{"id":"2","value":\[("|\d|,)*"' . $value . '"("|\d|,)*\]/';
$matches[0] = preg_match($pattern,$subject_a);
$matches[1] = preg_match($pattern,$subject_b);
$matches[2] = preg_match($pattern,$subject_c);
echo '<pre>';
var_dump($matches);
echo '</pre>';
The result will be:
array(3) {
[0]=> int(1)
[1]=> int(1)
[2]=> int(0)
}