I'm trying to optimize a migration that takes about 15 minutes every time it runs, because there is a lot of data in this table. It's an old database that stores dates as strings like '14102019' (%d%m%Y), and I need to convert them to DateFields. I created a DateField for each of the two columns.
Database is MySQL.
dtobito and dtnasc are the old strings that need to be converted
data_obito and data_nasc are the new DateFields
What works (very slowly):
def date_to_datefield(apps, schema_editor):
    Obitos = apps.get_model('core', 'Obitos')

    for obito in Obitos.objects.all():
        if obito.dtnasc and obito.dtnasc != '':
            obito.data_nasc = datetime.strptime(obito.dtnasc, '%d%m%Y')

        if obito.dtobito and obito.dtobito != '':
            obito.data_obito = datetime.strptime(obito.dtobito, '%d%m%Y')

        obito.save()
What doesn't work:
Obitos.objects.update(
    data_nasc=datetime.strptime(F('dtnasc'), '%d%m%Y'),
    data_obito=datetime.strptime(F('dtobito'), '%d%m%Y')
)
What could work, but I don't know how:
Obitos.objects.raw("""
    UPDATE obitos new,
    (
        SELECT
            STR_TO_DATE(dtnasc, '%d%m%Y') AS dtnasc,
            STR_TO_DATE(dtobito, '%d%m%Y') AS dtobito,
        FROM obitos
    ) old
    SET new.data_nasc = old.dtnasc
    SET new.data_obtio = old.dtobito
""")
Try using bulk_update:
from datetime import datetime

def date_to_datefield(apps, schema_editor):
    Obitos = apps.get_model('core', 'Obitos')
    objs = []

    def to_date_object(date_string):
        return datetime.strptime(date_string, '%d%m%Y')

    for obito in Obitos.objects.all():
        updated = False

        if obito.dtnasc:
            obito.data_nasc = to_date_object(obito.dtnasc)
            updated = True

        if obito.dtobito:
            obito.data_obito = to_date_object(obito.dtobito)
            updated = True

        if updated:
            objs.append(obito)

    if objs:
        Obitos.objects.bulk_update(objs, ['data_nasc', 'data_obito'])
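If that is still too slow, you can also let MySQL do the conversion itself in a single UPDATE, which is essentially the raw SQL you sketched. A minimal sketch of such a data migration (the table name core_obitos is an assumption based on Django's default naming for an Obitos model in the core app, and the dependency name is hypothetical):

from django.db import migrations


def convert_dates(apps, schema_editor):
    # Let MySQL convert the strings in one UPDATE instead of row by row in Python.
    # Empty strings become NULL so STR_TO_DATE never sees invalid input.
    with schema_editor.connection.cursor() as cursor:
        cursor.execute("""
            UPDATE core_obitos
            SET data_nasc  = IF(dtnasc  <> '', STR_TO_DATE(dtnasc,  '%d%m%Y'), NULL),
                data_obito = IF(dtobito <> '', STR_TO_DATE(dtobito, '%d%m%Y'), NULL)
        """)


class Migration(migrations.Migration):

    dependencies = [
        ('core', '0002_add_datefields'),  # hypothetical previous migration
    ]

    operations = [
        migrations.RunPython(convert_dates, migrations.RunPython.noop),
    ]

bulk_update also accepts a batch_size argument (for example bulk_update(objs, ['data_nasc', 'data_obito'], batch_size=1000)) if you prefer to stay in Python but keep each query smaller.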
I'm new to Python, and I suspect the issue in my code comes from some concept I'm not familiar with yet.
Yes, similar questions have been asked before, but mine is different. Believe me, I tried everything I could think of.
Everything worked until I wrapped the logic in the "if five in silos" statement.
After I enter the values for the 6 input calls, the program just finishes with exit code 0. Nothing else happens; the for loop never runs.
I want the code to accept either 103 or 106 when prompting for the "five" variable.
I'm using PyCharm and Python 3.7.
import mysql.connector

try:
    db = mysql.connector.connect(
        host="",
        user="",
        passwd="",
        database=""
    )

    one = int(input("Number of requested telephone numbers: "))
    two = input("Enter the prefix (4 characters) with a leading 0: ")[:4]
    three = int(input("Enter the ccid: "))
    four = int(input("Enter the cid: "))
    six = input("Enter case number: ")
    five = int(input("Enter silo (103, 106 only): "))

    cursor = db.cursor()
    cursor.execute(f"SELECT * FROM n1 WHERE ddi LIKE '{two}%' AND silo = 1 AND ccid = 0 LIMIT {one}")
    cursor.fetchall()

    silos = (103, 106)
    if five in silos:
        if cursor.rowcount > 0:
            for row in cursor:
                seven = input(f"{row[1]} has been found on our system. Do you want to continue? Type either Y or N.")
                if seven == "Y":
                    cursor.execute(f"INSERT INTO n{five} (ddi, silo, ccid, campaign, assigned, allocated, "
                                   f"internal_notes, client_notes, agentid, carrier, alias) VALUES "
                                   f"('{row[1]}', 1, {three}, {four}, NOW(), NOW(), 'This is a test.', '', 0, "
                                   f"'{row[13]}', '') "
                                   f"ON DUPLICATE KEY UPDATE "
                                   f"silo = VALUES (silo), "
                                   f"ccid = VALUES (ccid), "
                                   f"campaign = VALUES (campaign);")
                    cursor.execute(f"UPDATE n1 SET silo = {five}, internal_notes = '{six}', allocated = NOW() WHERE "
                                   f"ddi = '{row[1]}'")
                else:
                    print("The operation has been canceled.")
            db.commit()
        else:
            print(f"No results for prefix {two}.")
    else:
        print("Enter either silo 103 or 106.")

    cursor.close()
    db.close()
except (ValueError, NameError):
    print("Please, enter an integer for all questions, except case number.")
Because it must be:

for row in cursor.fetchall():
    # do something

In your code, cursor is the cursor object returned by db.cursor(), but you need to call its fetchall() method to read the rows contained in it.
You are currently calling cursor.fetchall() without doing anything with the result. Assign the call to a variable and then do this:

result = cursor.fetchall()
for row in result:
    # do something
I found the problem: I had to store cursor.fetchall() into a variable.
After I put: eight = cursor.fetchall() before the "silos" tuple, everything worked perfectly.
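For completeness, the relevant part of the script would then look roughly like this (a sketch based on your snippet; the %s placeholders are filled in by mysql.connector, which also avoids quoting problems):

cursor = db.cursor()
cursor.execute(
    "SELECT * FROM n1 WHERE ddi LIKE %s AND silo = 1 AND ccid = 0 LIMIT %s",
    (two + "%", one),
)
rows = cursor.fetchall()   # read the result set once and keep it

silos = (103, 106)
if five in silos:
    if rows:               # replaces the cursor.rowcount check
        for row in rows:
            print(f"{row[1]} has been found on our system.")
    else:
        print(f"No results for prefix {two}.")
else:
    print("Enter either silo 103 or 106.")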
I was not able to find anything relevant anywhere.
I would like to put a timeout on the "password" that I am saving to MySQL with Python.
cur.execute("INSERT INTO generatedcode(password, codetimeout) VALUES (%s,
%s)", [passwordd, timestamp])
As I'm not sufficiently acquainted with Python, I'll leave the specifics of the code to you; I will only present the idea of the solution and the relevant SQL commands (the syntax will need to be verified, as I don't have an environment to test it).
Let's say you want to set a timeout of 1 hour from the moment you save the password, and your table has (at least) the following two fields: Password and Expiration (I assume Password would be of a binary type to allow encryption, and Expiration would be of DATETIME type).
Then, you would implement the following SQL command:
INSERT INTO <your table> (Password, Expiration)
VALUES (%s, DATE_ADD(NOW(), INTERVAL 1 HOUR))
and send that string towards the DB where %s would be replaced by the value of the password.
What DATE_ADD(NOW(), INTERVAL 1 HOUR) does is:
Gets the current date and time,
Adds one hour to it.
Once you have that inserted into your table, you would later on retrieve the password using the following command:
SELECT Password
FROM <your table>
WHERE User = <could be Username or any other key that you are using now>
AND Expiration > NOW()
meaning: get the password (if any) whose expiration datetime is still in the future.
Hope this is what you were looking for.
Cheers!!
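In Python, using the cur and passwordd names from your earlier snippet, that could look roughly like this (a minimal sketch; remember to commit on whatever connection object you are using):

# Store the password together with an expiration one hour from now;
# MySQL computes the expiration, so no Python date handling is needed.
cur.execute(
    "INSERT INTO generatedcode (password, codetimeout) "
    "VALUES (%s, DATE_ADD(NOW(), INTERVAL 1 HOUR))",
    [passwordd],
)
# commit on your connection object after the INSERT

# Later, fetch only codes whose expiration is still in the future.
cur.execute("SELECT password FROM generatedcode WHERE codetimeout > NOW()")
valid_codes = cur.fetchall()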
EDIT:
Here is your code after the fixes:
@app.route('/signattendance', methods=['GET'])
def signattendance():
    stamp = "signed"
    error = None
    # now = datetime.datetime.today()
    # tdelta = datetime.timedelta(seconds=10000)
    now = datetime.datetime.now()

    if request.method == 'GET':
        cur1 = mysql.connection.cursor()
        result = cur1.execute("SELECT password FROM generatedcode WHERE codetimeout > NOW()")
        if not result:
            cur = mysql.connection.cursor()
            cur.execute("INSERT INTO attendance(studentattendance) VALUES (%s, DATE_ADD(NOW(), INTERVAL 1 HOUR))", [stamp])
            mysql.connection.commit()
            cur.close()
            # cur1.close()
            flash('Successfully signed', 'Accepted')
        else:
            flash('You could not sign attendance', 'Denied')
    else:
        return redirect(url_for('dashboard'))
    return render_template('signattendance.html', error=error)
Note that I leave it to the DB to check the current time.
if request.method == 'GET' or request.method == 'POST':
    cur1 = mysql.connection.cursor()
    result = cur1.execute("SELECT password FROM generatedcode "
                          "WHERE DATE_SUB(CURRENT_TIME(), INTERVAL 10 MINUTE) <= codetimeout;")
The SQLAlchemy ORM allows for:
instance1 = Class1()
instance1.foo = 'bar'
What I want to do, however, is refer to the column name 'foo' by passing it as a variable to the instance, something like:
column_name = 'foo'
instance1[column_name] = 'bar' # or
instance1.column(column_name) = 'bar' # or
instance1.Column(column_name) = 'bar'
Or, perhaps to do something like:
instance1=Class1(foo='bar') # or
instance1=Class1('foo':'bar')
or:
dict = {'foo':'bar'}
instance1 = Class1(dict) # or
instance1 = Class1(**dict) # <-- THIS *DOES* WORK
None of the above work, of course. Is there a method of referring to the column name with a variable, please?
FYI, the bigger picture version of what I'm trying to do is to collect data into a dictionary with keys that match a subset of column names in multiple tables. Then I want to dump whatever I've gathered in the dictionary into those tables:
instance1 = Class1()
instance2 = Class2()
for key in data_dictionary:
    instance1[key] = data_dictionary[key]
    instance2[key] = data_dictionary[key]
or simply:
instance1 = Class1(data_dictionary)
instance2 = Class2(data_dictionary)
Thanks for your help,
Andrew
Thank you, univerio - the following works:
dict = {'foo':'bar'}
instance = Class1(**dict)
As does your other suggestion:
setattr(instance, 'foo', 'bar')
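For the bigger-picture loop, setattr does the same thing key by key (a sketch; Class1, Class2, and session are placeholders/assumptions, not tested against your models):

data_dictionary = {'foo': 'bar'}

instance1 = Class1()
instance2 = Class2()

# Copy only keys that are actually mapped attributes on each class.
for key, value in data_dictionary.items():
    if hasattr(Class1, key):
        setattr(instance1, key, value)
    if hasattr(Class2, key):
        setattr(instance2, key, value)

session.add_all([instance1, instance2])
session.commit()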
I would like to scan an HBase table and see integers as strings (not their binary representation). I can do the conversion, but I have no idea how to write the scan statement using the Java API from the HBase shell:
org.apache.hadoop.hbase.util.Bytes.toString(
"\x48\x65\x6c\x6c\x6f\x20\x48\x42\x61\x73\x65".to_java_bytes)
org.apache.hadoop.hbase.util.Bytes.toString("Hello HBase".to_java_bytes)
I would be very happy to see examples of scan and get that search binary data (longs) and output normal strings. I am using the HBase shell, not Java.
HBase stores data as byte arrays (untyped). Therefore if you perform a table scan data will be displayed in a common format (escaped hexadecimal string), e.g: "\x48\x65\x6c\x6c\x6f\x20\x48\x42\x61\x73\x65" -> Hello HBase
If you want to get back the typed value from the serialized byte array you have to do this manually.
You have the following options:
Java code (Bytes.toString(...))
hack the to_string function in $HBASE_HOME/lib/ruby/hbase/table.rb :
replace toStringBinary with toInt for non-meta tables
write a get/scan JRuby function which converts the byte array to the appropriate type
Since you want it in the HBase shell, consider the last option:
Create a file get_result.rb :
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.HTable
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Result;
import java.util.ArrayList;
# Simple function equivalent to scan 'test', {COLUMNS => 'c:c2'}
def get_result()
  htable = HTable.new(HBaseConfiguration.new, "test")
  rs = htable.getScanner(Bytes.toBytes("c"), Bytes.toBytes("c2"))
  output = ArrayList.new
  output.add "ROW\t\t\t\t\t\tCOLUMN+CELL"
  rs.each { |r|
    r.raw.each { |kv|
      row = Bytes.toString(kv.getRow)
      fam = Bytes.toString(kv.getFamily)
      ql = Bytes.toString(kv.getQualifier)
      ts = kv.getTimestamp
      val = Bytes.toInt(kv.getValue)
      output.add " #{row} \t\t\t\t\t\t column=#{fam}:#{ql}, timestamp=#{ts}, value=#{val}"
    }
  }
  output.each { |line| puts "#{line}\n" }
end
load it in the HBase shell and use it:
require '/path/to/get_result'
get_result
Note: modify/enhance/fix the code according to your needs
Just for completeness' sake, it turns out that the call Bytes::toStringBinary gives the hex-escaped sequence you get in HBase shell:
\x0B\x2_SOME_ASCII_TEXT_\x10\x00...
Whereas, Bytes::toString will try to deserialize to a string assuming UTF8, which will look more like:
\u8900\u0710\u0115\u0320\u0000_SOME_UTF8_TEXT_\u4009...
You can add a scan_counter command to the HBase shell.
First, add the following to /usr/lib/hbase/lib/ruby/hbase/table.rb (after the scan function):
#----------------------------------------------------------------------------------------------
# Scans the whole table or a range of keys and returns rows matching specific criteria, with values shown as numbers
def scan_counter(args = {})
  unless args.kind_of?(Hash)
    raise ArgumentError, "Arguments should be a hash. Failed to parse #{args.inspect}, #{args.class}"
  end

  limit = args.delete("LIMIT") || -1
  maxlength = args.delete("MAXLENGTH") || -1

  if args.any?
    filter = args["FILTER"]
    startrow = args["STARTROW"] || ''
    stoprow = args["STOPROW"]
    timestamp = args["TIMESTAMP"]
    columns = args["COLUMNS"] || args["COLUMN"] || get_all_columns
    cache = args["CACHE_BLOCKS"] || true
    versions = args["VERSIONS"] || 1
    timerange = args["TIMERANGE"]

    # Normalize column names
    columns = [columns] if columns.class == String
    unless columns.kind_of?(Array)
      raise ArgumentError.new("COLUMNS must be specified as a String or an Array")
    end

    scan = if stoprow
      org.apache.hadoop.hbase.client.Scan.new(startrow.to_java_bytes, stoprow.to_java_bytes)
    else
      org.apache.hadoop.hbase.client.Scan.new(startrow.to_java_bytes)
    end

    columns.each { |c| scan.addColumns(c) }
    scan.setFilter(filter) if filter
    scan.setTimeStamp(timestamp) if timestamp
    scan.setCacheBlocks(cache)
    scan.setMaxVersions(versions) if versions > 1
    scan.setTimeRange(timerange[0], timerange[1]) if timerange
  else
    scan = org.apache.hadoop.hbase.client.Scan.new
  end

  # Start the scanner
  scanner = @table.getScanner(scan)
  count = 0
  res = {}
  iter = scanner.iterator

  # Iterate results
  while iter.hasNext
    if limit > 0 && count >= limit
      break
    end

    row = iter.next
    key = org.apache.hadoop.hbase.util.Bytes::toStringBinary(row.getRow)

    row.list.each do |kv|
      family = String.from_java_bytes(kv.getFamily)
      qualifier = org.apache.hadoop.hbase.util.Bytes::toStringBinary(kv.getQualifier)
      column = "#{family}:#{qualifier}"
      cell = to_string_scan_counter(column, kv, maxlength)

      if block_given?
        yield(key, "column=#{column}, #{cell}")
      else
        res[key] ||= {}
        res[key][column] = cell
      end
    end

    # One more row processed
    count += 1
  end

  return ((block_given?) ? count : res)
end

#----------------------------------------------------------------------------------------
# Helper methods

# Returns a list of column names in the table
def get_all_columns
  @table.table_descriptor.getFamilies.map do |family|
    "#{family.getNameAsString}:"
  end
end

# Checks if current table is one of the 'meta' tables
def is_meta_table?
  tn = @table.table_name
  org.apache.hadoop.hbase.util.Bytes.equals(tn, org.apache.hadoop.hbase.HConstants::META_TABLE_NAME) || org.apache.hadoop.hbase.util.Bytes.equals(tn, org.apache.hadoop.hbase.HConstants::ROOT_TABLE_NAME)
end

# Returns family and (when it has one) qualifier for a column name
def parse_column_name(column)
  split = org.apache.hadoop.hbase.KeyValue.parseColumn(column.to_java_bytes)
  return split[0], (split.length > 1) ? split[1] : nil
end

# Make a String of the passed kv
# Intercept cells whose format we know such as the info:regioninfo in .META.
def to_string(column, kv, maxlength = -1)
  if is_meta_table?
    if column == 'info:regioninfo' or column == 'info:splitA' or column == 'info:splitB'
      hri = org.apache.hadoop.hbase.util.Writables.getHRegionInfoOrNull(kv.getValue)
      return "timestamp=%d, value=%s" % [kv.getTimestamp, hri.toString]
    end
    if column == 'info:serverstartcode'
      if kv.getValue.length > 0
        str_val = org.apache.hadoop.hbase.util.Bytes.toLong(kv.getValue)
      else
        str_val = org.apache.hadoop.hbase.util.Bytes.toStringBinary(kv.getValue)
      end
      return "timestamp=%d, value=%s" % [kv.getTimestamp, str_val]
    end
  end

  val = "timestamp=#{kv.getTimestamp}, value=#{org.apache.hadoop.hbase.util.Bytes::toStringBinary(kv.getValue)}"
  (maxlength != -1) ? val[0, maxlength] : val
end

def to_string_scan_counter(column, kv, maxlength = -1)
  if is_meta_table?
    if column == 'info:regioninfo' or column == 'info:splitA' or column == 'info:splitB'
      hri = org.apache.hadoop.hbase.util.Writables.getHRegionInfoOrNull(kv.getValue)
      return "timestamp=%d, value=%s" % [kv.getTimestamp, hri.toString]
    end
    if column == 'info:serverstartcode'
      if kv.getValue.length > 0
        str_val = org.apache.hadoop.hbase.util.Bytes.toLong(kv.getValue)
      else
        str_val = org.apache.hadoop.hbase.util.Bytes.toStringBinary(kv.getValue)
      end
      return "timestamp=%d, value=%s" % [kv.getTimestamp, str_val]
    end
  end

  val = "timestamp=#{kv.getTimestamp}, value=#{org.apache.hadoop.hbase.util.Bytes::toLong(kv.getValue)}"
  (maxlength != -1) ? val[0, maxlength] : val
end
Second, add the following file, named scan_counter.rb, to /usr/lib/hbase/lib/ruby/shell/commands/:
#
# Copyright 2010 The Apache Software Foundation
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
module Shell
  module Commands
    class ScanCounter < Command
      def help
        return <<-EOF
Scan a table whose cell values are longs; pass the table name and optionally a dictionary of scanner
specifications. Scanner specifications may include one or more of:
TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH,
or COLUMNS. If no columns are specified, all columns will be scanned.
To scan all members of a column family, leave the qualifier empty as in
'col_family:'.

Some examples:

  hbase> scan_counter '.META.'
  hbase> scan_counter '.META.', {COLUMNS => 'info:regioninfo'}
  hbase> scan_counter 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
  hbase> scan_counter 't1', {FILTER => org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
  hbase> scan_counter 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}

For experts, there is an additional option -- CACHE_BLOCKS -- which
switches block caching for the scanner on (true) or off (false). By
default it is enabled. Examples:

  hbase> scan_counter 't1', {COLUMNS => ['c1', 'c2'], CACHE_BLOCKS => false}
        EOF
      end

      def command(table, args = {})
        now = Time.now
        formatter.header(["ROW", "COLUMN+CELL"])

        count = table(table).scan_counter(args) do |row, cells|
          formatter.row([row, cells])
        end

        formatter.footer(now, count)
      end
    end
  end
end
Finally, register scan_counter in /usr/lib/hbase/lib/ruby/shell.rb.
Replace the existing command group (you can find it by searching for 'DATA MANIPULATION COMMANDS') with this:
Shell.load_command_group(
  'dml',
  :full_name => 'DATA MANIPULATION COMMANDS',
  :commands => %w[
    count
    delete
    deleteall
    get
    get_counter
    incr
    put
    scan
    scan_counter
    truncate
  ]
)
I'm trying to save some data into a table. I get the data from a database, and that part works fine.
My problem is that the data is not saved in the table. It is a Lua table like table = {}, NOT a database table.
Maybe it is saved, but the prints seem to run before the saving even though I call them afterwards. In fact, my network request seems to run last in the program even though I call it first.
I would really like to know the reason for this. Any ideas?
Here is the code:
---TESTING!
print("Begin teting!")
--hej = require ( "test2" )

local navTable = {
    Eng_Spd = 0,
    Spd_Set = 0
}

local changeTab = function()
    navTable.Eng_Spd = 2
end

printNavTable = function()
    print("navTable innehåller: ")
    print(navTable.Eng_Spd)
    print(navTable.Spd_Set)
end

require "sqlite3"

local myNewData
local json = require ("json")
local decodedData

local SaveData2 = function()
    local i = 1
    local counter = 1
    local index = "livedata"..counter
    local navValue = decodedData[index]
    print(navValue)
    while (navValue ~= nil) do
        --tablefill ="INSERT INTO navaltable VALUES (NULL,'" .. navValue[1] .. "','" .. navValue[3] .."','" .. navValue[4] .."','" .. navValue[5] .."','" .. navValue[6] .."');"
        --print(tablefill)
        --db:exec(tablefill)
        if navValue[3] == "Eng Spd" then
            navTable.Eng_Spd = navValue[4]
        elseif navValue[3] == "Spd Set" then
            navTable.Spd_Set = navValue[4]
        else
            print("blah")
        end
        print(navTable.Eng_Spd)
        print(navTable.Spd_Set)
        counter = counter + 1
        index = "livedata"..counter
        navValue = decodedData[index]
    end
end

local function networkListener( event )
    if (event.isError) then
        print("Network error!")
    else
        myNewData = event.response
        print("From server: "..myNewData)
        decodedData = (json.decode(myNewData))
        SaveData2()
        --db:exec("DROP TABLE IN EXISTS navaltable")
    end
end

--function uppdateNavalTable()
network.request( "http://127.0.0.1/firstMidle.php", "GET", networkListener )
--end

changeTab()
printNavTable()
--uppdateNavalTable()
printNavTable()
print("Done!")
And here is the output:
Copyright (C) 2009-2012 Corona Labs Inc.
Version: 2.0.0
Build: 2012.971
Begin teting!
navTable innehåller:
2
0
navTable innehåller:
2
0
Done!
From server: {"livedata1":["1","0","Eng Spd","30","0","2013-03-15 11:35:48"],"li
vedata2":["1","1","Spd Set","13","0","2013-03-15 11:35:37"]}
table: 008B5018
30
0
30
13
And btw, navTable innehåller means navTable contains.
The answer is that network.request() is asynchronous: networkListener is only called when the response arrives, so it runs after the rest of the code has already executed.