Django subquery: pass extra field, workaround for distinct on MySQL - mysql

I am trying to prepare subnet=address/netmask for presentation, and start_ip and end_ip for filtering purposes, in the lifs queryset. Then I want to use a subquery to pick out a single interface, since a vserver may have multiple. The problem is that I can't pass the extra fields generated in the lifs queryset through to the outer queryset.
lifs = (NetworkLif.objects.filter(vserverid=OuterRef('pk'))
        .annotate(network=Concat('address', Value('/'), 'netmasklength', output_field=CharField()))
        .extra(select={'start_ip': "INET_NTOA(INET_ATON(address) & 0xffffffff ^ ((0x1 << ( 32 - netmasklength) ) -1 ))"})
        .extra(select={'end_ip': "INET_NTOA(INET_ATON(address) | ((0x100000000 >> netmasklength ) -1 ))"})
        .order_by('network')
        )
queryset = (Vserver.objects.filter(type='DATA')
            .exclude(name__endswith='-mc')
            .annotate(subnet=Subquery(lifs.values('network')[:1]))
            # .annotate(start_ip_address=Subquery(lifs.values('start_ip')[:1]))
            # .annotate(end_ip_address=Subquery(lifs.values('end_ip')[:1]))
            .values('id', 'clusterid__name', 'name', 'state', 'nfsenabled', 'cifsenabled',
                    'ldapclientenabled', 'dnsenabled', 'subnet')  # plus 'start_ip_address', 'end_ip_address'
            .order_by('id')
            )
I'm getting this error, which I guess is because of the evaluation logic:
raise FieldError("Cannot resolve expression type, unknown output_field")
django.core.exceptions.FieldError: Cannot resolve expression type, unknown output_field
Is there any workaround? I know that in PostgreSQL distinct('name') would fix my problem; unfortunately, ActiveIQ is running on MySQL.
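One workaround to try (a sketch only, untested against this schema): that FieldError is raised when Django cannot infer the type of an expression, and both Subquery and RawSQL accept an explicit output_field. Rewriting the .extra() selects as RawSQL annotations keeps the type attached to each expression, and passing output_field to each Subquery tells the outer queryset what type to expect:

from django.db.models import CharField, OuterRef, Subquery, Value
from django.db.models.expressions import RawSQL
from django.db.models.functions import Concat

# Same MySQL expressions as the .extra() selects above, but as RawSQL
# annotations that carry an explicit output_field.
lifs = (NetworkLif.objects.filter(vserverid=OuterRef('pk'))
        .annotate(network=Concat('address', Value('/'), 'netmasklength',
                                 output_field=CharField()))
        .annotate(start_ip=RawSQL(
            "INET_NTOA(INET_ATON(address) & 0xffffffff ^ ((0x1 << (32 - netmasklength)) - 1))",
            [], output_field=CharField()))
        .annotate(end_ip=RawSQL(
            "INET_NTOA(INET_ATON(address) | ((0x100000000 >> netmasklength) - 1))",
            [], output_field=CharField()))
        .order_by('network'))

queryset = (Vserver.objects.filter(type='DATA')
            .exclude(name__endswith='-mc')
            # output_field on each Subquery avoids the "unknown output_field" error
            .annotate(subnet=Subquery(lifs.values('network')[:1],
                                      output_field=CharField()))
            .annotate(start_ip_address=Subquery(lifs.values('start_ip')[:1],
                                                output_field=CharField()))
            .annotate(end_ip_address=Subquery(lifs.values('end_ip')[:1],
                                              output_field=CharField()))
            .order_by('id'))

The values(...)[:1] subquery with an order_by acts as a substitute for PostgreSQL's distinct('name') here, which is the usual MySQL workaround for this pattern.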

Related

Python 3 psycopg2 COPY from stdin failed: error in .read()

I am trying to apply the code found on this page, in particular part 'Copy Data from String Iterator' of the Table of Contents, but run into an issue with my code.
Since not all lines coming from the generator (here log_lines) can be imported into the PostgreSQL database, I try to filter the correct lines (here row) using itertools.filterfalse like in the codeblock below:
def copy_string_iterator(connection, log_lines) -> None:
    with connection.cursor() as cursor:
        create_staging_table(cursor)
        log_string_iterator = StringIteratorIO((
            '|'.join(map(clean_csv_value, (
                row['date'],
                row['time'],
                row['cs_uri_query'],
                row['s_contentpath'],
                row['sc_status'],
                row['s_computername'],
                ...
                row['sc_substates'],
                row['s_port'],
                row['cs_version'],
                row['c_protocol'],
                row.update({'cs_cookie':'x'}),
                row['timetakenms'],
                row['cs_uri_stem'],
            ))) + '\n'
            for row in filterfalse(lambda line: "#" in line.get('date'), log_lines)
        ))
        cursor.copy_from(log_string_iterator, 'log_table', sep = '|')
When I run this, cursor.copy_from() gives me the following error:
QueryCanceled: COPY from stdin failed: error in .read() call
CONTEXT: COPY log_table, line 112910
I understand why this error happens, it is because in the test file I use there are only 112909 lines that meet the filterfalse condition. But why does it try to copy line 112910 and throw the error and not just stop?
Since Python doesn't have a null-coalescing operator, add something like:
'|'.join(map(clean_csv_value, (
    row['date'] if 'date' in row else None,
    :
    row['cs_uri_stem'] if 'cs_uri_stem' in row else None,
))) + '\n'
for each of your fields, so you can handle any missing fields in the JSON file. Of course, the fields should be nullable in the db if you use None; otherwise, replace None with some default value for that field.
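For what it's worth, dict.get already returns None for a missing key, so each of those conditionals can be shortened to the equivalent:

'|'.join(map(clean_csv_value, (
    row.get('date'),
    :
    row.get('cs_uri_stem'),
))) + '\n'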

Double Quotes in temporary JSON variable on MySQL using R

I have a table in MySQL that contains the user interactions with a web page. I need to extract the rows for the users where the date of the interaction is lower than a certain benchmark date, and that benchmark date is different for each customer (I extract it from a different database).
My approach was to set a JSON variable in which the key is a user and the value is the benchmark date, and to use it in the query to extract the intended fields.
Example in R:
#MainDF contains the user and the benchmark date from a different database
json_str <- mapply(function(uid, bench_date){
  paste0(
    '{', '"', uid, '"', ':', '"', bench_date, '"', '}'
  )
}, MainDF[, 'uid'],
   MainDF[, 'date']
)
json_str <- paste0("'", '[', paste0(json_str, collapse = ','), ']', "'")
temp_var <- paste('set @test=', json_str)
The intention was for temp_var to look like:
set @test= '{"0001":"2010-05-05",
"0012":"2015-05-05",
"0101":"2018-07-20"}'
but it actually looks like:
set @test= '{\"0001\":\"2010-05-05\",
\"0012\":\"2015-05-05\",
\"0101\":\"2018-07-20\"}'
then create the main query:
main_Q <- "select user_id, date
from interaction
where 1=1
and json_contains(json_keys(#test), concat('\"',user_id,'\"')) = 1
and date <= json_unquote(json_extract(#test,
concat('$.','\"',user_id, '\"')
)
)
"
For execution: first set the temporary variable, then execute the main query:
dbSendQuery(connection, temp_var)
resp <- dbSendQuery(connection, main_Q )
target_df <- fetch(resp, n=-1)
dbClearResult(resp )
When I test a fraction of it in a SQL IDE it does work. However, in R it doesn't return anything.
I think the issue is that R escapes the double quotes in temp_var, so SQL ends up reading
set @test= '{\"0001\":\"2010-05-05\",
\"0012\":\"2015-05-05\",
\"0101\":\"2018-07-20\"}'
which won't work.
For example if I execute:
set #test= '{"0001":"2010-05-05",
"0012":"2015-05-05",
"0101":"2018-07-20"}'
select json_keys(#test)
it will return an array with the keys, but that is not the case with
set #test= '{\"0001\":\"2010-05-05\",
\"0012\":\"2015-05-05\",
\"0101\":\"2018-07-20\"}'
select json_keys(#test)
I am not sure how to solve the issue, but I need double quotes to specify the JSON. Is there any other approach that I should try or a way to make this work?
First, I think it is generally better to use a well-known library/package for converting to/from JSON, for several reasons.
json_str <- jsonlite::toJSON(setNames(as.list(MainDF$date), MainDF$uid), auto_unbox=TRUE)
json_str
# {"0001":"2010-05-05","0012":"2015-05-05","0101":"2018-07-20"}
This gives you a string that you should be able to place just about anywhere.
And while looking at the object on the R console will show escaped double-quotes,
as.character(json_str)
# [1] "{\"0001\":\"2010-05-05\",\"0012\":\"2015-05-05\",\"0101\":\"2018-07-20\"}"
that is merely R's representation (it shows all strings within double-quotes, and therefore needs to escape any double-quotes within the string).
Adding it into some script should be straight-forward:
cat(paste('set @test=', sQuote(json_str)), '\n')
# set @test= '{"0001":"2010-05-05","0012":"2015-05-05","0101":"2018-07-20"}'
I'm assuming that having each on its own row is not critical. If it is, and indentation is important, perhaps this is more your style:
spaces <- strrep(' ', 2 + nchar('set @test = '))
cat(paste0('set @test = ', sQuote(gsub(",", paste0(",\n", spaces), json_str))), '\n')
# set @test = '{"0001":"2010-05-05",
#               "0012":"2015-05-05",
#               "0101":"2018-07-20"}'
Data:
MainDF <- read.csv(stringsAsFactors=FALSE, colClasses='character', text='
uid,date
0001,2010-05-05
0012,2015-05-05
0101,2018-07-20')

Querying Cassandra: error "no viable alternative at input 'ALLOW'"

I'm trying to run a query on Cassandra through spark.
When running this command:
val test = sc.cassandraTable[Person](keyspace,table)
.where("name=?","Jane").collect
I get the appropriate output for the query.
When I try to use the where statement to enter the query as a whole string I get an error.
I receive the query as a json:
{"clause": " name = 'Jane' "}
then turn it into a string.
When running
val query = (json \ "clause").get.as[String]
//turns json value into a string
val test = sc.cassandraTable[Person](keyspace,table)
.where(query).collect
I get the following error:
java.io.IOException: Exception during preparation of SELECT "uuid", "person", "age" FROM "test"."users" WHERE token("uuid") > ? AND token("uuid") <= ? AND name = Jane ALLOW FILTERING: line 1:232 no viable alternative at input 'ALLOW' (...<= ? AND name = [Jane] ALLOW...)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.createStatement(CassandraTableScanRDD.scala:288)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.com$datastax$spark$connector$rdd$CassandraTableScanRDD$$fetchTokenRange(CassandraTableScanRDD.scala:302)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$18.apply(CassandraTableScanRDD.scala:328)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$18.apply(CassandraTableScanRDD.scala:328)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at com.datastax.spark.connector.util.CountingIterator.hasNext(CountingIterator.scala:12)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at com.datastax.spark.connector.util.CountingIterator.foreach(CountingIterator.scala:4)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
I suspect that when I turn the json value " name = 'Jane' " into a string, I lose the single quotes, hence I get " name = Jane ", which of course raises an error. I tried escaping the single quotes with \ and with a second pair of single quotes around the name Jane: {"clause": " name = ''Jane'' "}. It doesn't solve the issue.
Edit: After further testing, it's definitely the JSON that loses the single quotes, and CQL needs them to perform the query. Can anyone suggest a way to escape/preserve the single quotes? I tried escaping with \ and with doubled single quotes ''. Is there a way to use JSON to provide proper whole CQL statements?
Please use the Unicode escape \u0027 for the single quotes.
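In other words (assuming the clause is read by a standard JSON parser, which decodes \uXXXX escapes while parsing), the payload would be:

{"clause": " name = \u0027Jane\u0027 "}

so the decoded string already contains the literal single quotes that CQL expects: name = 'Jane'.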

Jython: test to prevent exception "Cannot create PyString with non-byte value"?

Jython 2.7.0 (final release). OS: W7 (64-bit)
this code:
keys = javax.swing.UIManager.getDefaults().keys()
while keys.hasMoreElements():
    key = keys.nextElement()
    logger.info( "=== key %s" % str( key ) )
    try:
        value = javax.swing.UIManager.get(key)
    except java.lang.Throwable, t:
        logger.error( "=== thrown %s" % str( t ) )
produces all sorts of keys... until it outputs
=== key PasswordField.echoChar
it then throws
java.lang.IllegalArgumentException: Cannot create PyString with
non-byte value
I'm aware this is a known bug in Jython ... just wondering if there is a way of testing for this before the exception is thrown?
For me this gets triggered when using print() directly on a Java HashMap that contains any value with a Unicode character. A simple Python version of isBytes from the PyString class is one way to detect it, but frankly I don't think that is a good option unless you (a) know which element of the data is triggering the issue and/or (b) intend to mask or fix the values triggering it. Probably the best solution is to just catch the exception.
This definitely affects Jython >= 2.7 and is a very annoying bug when troubleshooting. For me, I just commented out the IllegalArgumentException in the PyString class code and recompiled. Now Jython will happily print out HashMaps directly and replace the Unicode characters with ?, just as it did in previous versions. I would guess that this causes issues somewhere, maybe with code dealing with a lot of Unicode, but I haven't found any issues as of yet.
Catch the exception:
from java.lang import IllegalArgumentException

keys = javax.swing.UIManager.getDefaults().keys()
while keys.hasMoreElements():
    key = keys.nextElement()
    logger.info( "=== key %s" % str( key ) )
    try:
        value = javax.swing.UIManager.get(key)
    except java.lang.Throwable, t:
        try:
            logger.error( "=== thrown %s" % str( t ) )
        except IllegalArgumentException:
            pass  # fix it or w/e
Python version of isBytes:
def test_if_char(value):
    for e in value:
        if ord(e) > 255:
            return False
    return True
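As a rough usage sketch (untested; it assumes the value coerces cleanly to a unicode string), you could screen a value with test_if_char before handing it to str():

text = unicode(value)
if test_if_char(text):
    logger.info("=== value %s" % str(text))
else:
    # contains non-byte characters, so avoid str() and encode explicitly
    logger.info("=== value %s" % text.encode('utf-8'))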
In case you use Jython 2.7.0, you can use the following (Java-side) code to handle arbitrary Unicode strings in your code:
PyString str = Py.newStringOrUnicode("颜军");

How do I cleanly extract MySQL enum values in Perl?

I have some code which needs to ensure some data is in a mysql enum prior to insertion in the database. The cleanest way I've found of doing this is the following code:
sub enum_values {
    my ( $self, $schema, $table, $column ) = @_;

    # don't eval to let the error bubble up
    my $columns = $schema->storage->dbh->selectrow_hashref(
        "SHOW COLUMNS FROM `$table` like ?",
        {},
        $column
    );
    unless ($columns) {
        X::Internal::Database::UnknownColumn->throw(
            column => $column,
            table  => $table,
        );
    }

    my $type = $columns->{Type} or X::Panic->throw(
        details => "Could not determine type for $table.$column",
    );

    unless ( $type =~ /\Aenum\((.*)\)\z/ ) {
        X::Internal::Database::IncorrectTypeForColumn->throw(
            type_wanted => 'enum',
            type_found  => $type,
        );
    }
    $type = $1;

    require Text::CSV_XS;
    my $csv = Text::CSV_XS->new;
    $csv->parse($type) or X::Panic->throw(
        details => "Could not parse enum CSV data: " . $csv->error_input,
    );
    return map { /\A'(.*)'\z/; $1 } $csv->fields;
}
We're using DBIx::Class. Surely there is a better way of accomplishing this? (Note that the $table variable is coming from our code, not from any external source. Thus, no security issue).
No need to be so heroic. Using a reasonably modern version of DBD::mysql, the hash returned by DBI's column_info() method contains a pre-split version of the valid enum values in the key mysql_values:
my $sth = $dbh->column_info(undef, undef, 'mytable', '%');
while (my $col_info = $sth->fetchrow_hashref)
{
    if ($col_info->{'TYPE_NAME'} eq 'ENUM')
    {
        # The mysql_values key contains a reference to an array of valid enum values
        print "Valid enum values for $col_info->{'COLUMN_NAME'}: ",
              join(', ', @{$col_info->{'mysql_values'}}), "\n";
    }
    ...
}
I'd say using Text::CSV_XS may be overkill, unless you have weird things like commas in your enums (a bad idea anyway, if you ask me). I'd probably use this instead:
my @fields = $type =~ / ' ([^']+) ' (?:,|\z) /msgx;
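For example, given the captured enum body 'small','medium','large', that match in list context yields ('small', 'medium', 'large').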
Other than that, I don't think there are shortcuts.
I spent part of the day asking the same question in the #dbix-class channel over on MagNet and came across this lack of an answer. Since I found the answer and nobody else seems to have posted it yet, I'll paste the transcript below the TL;DR here:
my $cfg   = Config::Simple->new( $rc_file );
my $mysql = $cfg->get_block('mysql');
my $dsn   =
    "DBI:mysql:database=$mysql->{database};" .
    "host=$mysql->{hostname};port=$mysql->{port}";
my $schema =
    DTSS::CDN::Schema->connect( $dsn, $mysql->{user}, $mysql->{password} );
my $valid_enum_values =
    $schema->source('Cdnurl')->column_info('scheme')->{extra}->{list};
And now the IRC log of me beating my head against a wall:
14:40 < cj> is there a cross-platform way to get the valid values of an
            enum?
15:11 < cj> it looks like I could add 'InflateColumn::Object::Enum' to the
            __PACKAGE__->load_components(...) list for tables with enum
            columns
15:12 < cj> and then call values() on the enum column
15:13 < cj> but how do I get dbic-dump to add
            'InflateColumn::Object::Enum' to
            __PACKAGE__->load_components(...) for only tables with enum
            columns?
15:20 < cj> I guess I could just add it for all tables, since I'm doing
            the same for InflateColumn::DateTime
15:39 < cj> hurm... is there a way to get a column without making a
            request to the db?
15:40 < cj> I know that we store in the DTSS::CDN::Schema::Result::Cdnurl
            class all of the information that I need to know about the
            scheme column before any request is issued
15:42 <@ilmari> cj: for Pg and mysql Schema::Loader will add the list of
                valid values to the ->{extra}->{list} column attribute
15:43 <@ilmari> cj: if you're using some other database that has enums,
                patches welcome :)
15:43 <@ilmari> or even just a link to the documentation on how to extract
                the values
15:43 <@ilmari> and a willingness to test if it's not a database I have
                access to
15:43 < cj> thanks, but I'm using mysql. if I were using sqlite for this
            project, I'd probably oblige :-)
15:44 <@ilmari> cj: to add components to only some tables, use
                result_components_map
15:44 < cj> and is there a way to get at those attributes without making a
            query?
15:45 < cj> can we do $schema->resultset('Cdnurl') without having it issue
            a query, for instance?
15:45 <@ilmari> $result_source->column_info('colname')->{extra}->{list}
15:45 < cj> and $result_source is $schema->resultset('Cdnurl') ?
15:45 <@ilmari> dbic never issues a query until you start retrieving the
                results
15:45 < cj> oh, nice.
15:46 <@ilmari> $schema->source('Cdnurl')
15:46 <@ilmari> the result source is where the result set gets the results
                from when they are needed
15:47 <@ilmari> names have meanings :)