Default size of integer in Rails tables (MySQL)

When I run
rails g model StripeCustomer user_id:integer customer_id:integer
annotate
I got
# == Schema Information
# Table name: stripe_customers
# id :integer(4) not null, primary key
# user_id :integer(4)
# customer_id :integer(4)
# created_at :datetime
# updated_at :datetime
Does this mean the table can hold only 9,999 records? (I am quite surprised at how small the default size for keys is.) How do I change the default IDs to be 7 digits in existing tables?
Thank you.

While the mysql client's describe command really uses the display width (see the docs), the schema information in the OP's question is very probably generated by the annotate_models gem's get_schema_info method, which uses the limit attribute of each column. The limit attribute is the number of bytes for :binary and :integer columns (see the docs).
The method reads (see how the last line adds the limit):
def get_schema_info(klass, header, options = {})
info = "# #{header}\n#\n"
info << "# Table name: #{klass.table_name}\n#\n"
max_size = klass.column_names.collect{|name| name.size}.max + 1
klass.columns.each do |col|
attrs = []
attrs << "default(#{quote(col.default)})" unless col.default.nil?
attrs << "not null" unless col.null
attrs << "primary key" if col.name == klass.primary_key
col_type = col.type.to_s
if col_type == "decimal"
col_type << "(#{col.precision}, #{col.scale})"
else
col_type << "(#{col.limit})" if col.limit
end
#...
end

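So integer(4) here means 4 bytes, i.e. the standard MySQL INT type (see the docs). A signed 4-byte integer ranges from -2,147,483,648 to 2,147,483,647, so the table is not limited to 9,999 records.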
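To make the byte-count reading concrete, here is a small Python sketch that computes the signed range for each MySQL integer storage size. The helper name signed_range is made up for illustration; only the storage sizes (1, 2, 3, 4 and 8 bytes) come from MySQL itself.

```python
def signed_range(num_bytes):
    """Return (min, max) for a two's-complement integer of num_bytes bytes."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# Storage sizes of the MySQL integer types, in bytes.
MYSQL_INTEGER_BYTES = [
    ("TINYINT", 1),
    ("SMALLINT", 2),
    ("MEDIUMINT", 3),
    ("INT", 4),
    ("BIGINT", 8),
]

for type_name, size in MYSQL_INTEGER_BYTES:
    low, high = signed_range(size)
    print(f"{type_name:<9} ({size} bytes): {low} to {high}")
```

The INT row is the one that corresponds to the :integer(4) lines in the annotate output.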

Proper use of flatMap

Why do I keep getting this error every time I run an action on my RDD, and how do I fix it?
/databricks/spark/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
317 raise Py4JJavaError(
318 "An error occurred while calling {0}{1}{2}.\n".
--> 319 format(target_id, ".", name), value)
320 else:
321 raise Py4JError(
I've tried to figure out which is the last RDD I can run an action on, and it's ratingsPerUser, which indicates the problem is in the flatMap.
What I'm trying to do: I take a CSV with (userID,movieID,rating) and want to create the unique combinations of movieID per userID along with the ratings. Different users can generate the same pair of movieIDs. For example, for this CSV:
1,2000,5
1,2001,2
1,2002,3
2,2000,4
2,2001,1
2,2004,5
I want RDD:
key (2000,2001), value (5,2,1)
key (2000,2002), value (5,3,1)
key (2001,2002), value (2,3,1)
key (2000,2001), value (4,1,1)
key (2000,2004), value (4,5,1)
key (2001,2004), value (1,5,1)
# First Map function - gets line and returns key(userID) value(movieID,rating)
def parseLine(line):
    fields=line.split(",")
    userID=int(fields[0])
    movieID=int(fields[1])
    rating=int(fields[2])
    return userID, (movieID,rating)
# Function to create movie unique pairs with ratings
# all pair start with the lowest ID
# returns key (movieIDj,movieIDi) & value (rating-j,rating-i,1)
# the 1 in value is added in order to count number of ratings in the reduce
def createPairs(userRatings):
    pairs=[]
    for i1 in range(len(userRatings[1])-1):
        for i2 in range(i1+1,len(userRatings[1])):
            if userRatings[i1][0]<userRatings[1][i2][0]:
                pairs.append(((userRatings[1][i1][0],userRatings[1][i2][0]),(userRatings[1][i1][1],userRatings[1][i2][1],1)))
            else:
                pairs.append(((userRatings[1][i2][0],userRatings[1][i1][0]),(userRatings[1][i2][1],userRatings[1][i1][1],1)))
    return pairs
# Create SC object from the ratings file
lines = sc.textFile("/FileStore/tables/dvmlbdnj1487603982330/ratings.csv")
# Map lines to Key(userID),Value(movieID,rating)
movieRatings = lines.map(parseLine)
# Join all rating by same user into one key
# (UserID1,(movie1,rating1)),(UserID1,(movie2,rating2)) --> UserID1,[(movie1,rating1),(movie2,rating2)]
ratingsPerUser = movieRatings.groupByKey()
# activate createPairs func
# We use flatMap, since each user have different number of ratings --> different number pairs
pairsOfMovies = ratingsPerUser.flatMap(createPairs)
The problem is the function passed to flatMap, not flatMap itself.
groupByKey returns an iterator:
It cannot be traversed multiple times
It cannot be indexed.
Convert to list first:
ratingsPerUser.mapValues(list).flatMap(createPairs)
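Besides the iterator issue, the comparison in createPairs reads userRatings[i1][0], which indexes the (userID, ratings) tuple rather than the ratings list (every other access uses userRatings[1][...]). The following is a minimal sketch of the pairing logic in plain Python, without Spark. create_pairs is a hypothetical rewrite, not the asker's code: it uses sorted() and itertools.combinations instead of index loops, so it accepts any iterable of (movieID, rating) tuples.

```python
from itertools import combinations


def create_pairs(user_ratings):
    """Turn (userID, iterable of (movieID, rating)) into movie-pair records.

    Each record is ((movieID_a, movieID_b), (rating_a, rating_b, 1)),
    with the lower movieID first.
    """
    _user_id, ratings = user_ratings
    # sorted() materialises the iterable and puts the lower movieID first.
    ordered = sorted(ratings)
    return [
        ((movie_a, movie_b), (rating_a, rating_b, 1))
        for (movie_a, rating_a), (movie_b, rating_b) in combinations(ordered, 2)
    ]


# The sample rows from the question, grouped by user. iter() stands in for
# the non-indexable value that groupByKey hands to the function.
user_1 = (1, iter([(2000, 5), (2001, 2), (2002, 3)]))
user_2 = (2, iter([(2000, 4), (2001, 1), (2004, 5)]))

pairs_user_1 = create_pairs(user_1)
pairs_user_2 = create_pairs(user_2)
for key, value in pairs_user_1 + pairs_user_2:
    print("key", key, "value", value)
```

With Spark this would be passed as ratingsPerUser.flatMap(create_pairs). Because sorted() builds a list from the values, the mapValues(list) step should not be needed in that case, though that is untested here.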

How to find a rails object based on a date value

I have a User object and I am attempting to do 2 different queries as part of a script that needs to run nightly. Given the schema below I would like to:
Get all the Users with a non nil end_date
Get all the Users with an end_date that is prior to today (I.E. has passed)
Users Schema:
# == Schema Information
#
# Table name: users
#
# id :integer not null, primary key
# name :string(100) default("")
# end_date :datetime
I've been trying to use User.where('end_date != NULL') and other things, but I cannot seem to get the syntax correct.
Define these as class methods inside the User model:
def self.users_with_end_date_not_null
  where.not(end_date: nil)
  # Below Rails 4 use:
  # where("end_date IS NOT NULL")
end
# Users whose end_date has already passed; rows with a NULL end_date never match
def self.past_users
  where("end_date < ?", Time.current)
end
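The question's end_date != NULL attempt returns no rows because, in SQL, any comparison with NULL evaluates to NULL rather than true; the test has to be IS NOT NULL. The sketch below shows this with Python's built-in sqlite3 module as a stand-in for MySQL (an assumption: both follow the same NULL comparison rule). The table contents are invented for the demonstration.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, end_date TEXT)")
conn.executemany(
    "INSERT INTO users (name, end_date) VALUES (?, ?)",
    [
        ("alice", None),                    # no end date
        ("bob", "2001-01-01 00:00:00"),     # end date in the past
        ("carol", "2999-01-01 00:00:00"),   # end date in the future
    ],
)


def names(sql, params=()):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql, params)]


# Comparing with NULL using != never yields true, so nothing matches.
not_equal_null = names("SELECT name FROM users WHERE end_date != NULL ORDER BY id")

# IS NOT NULL is the correct test for "has an end date".
is_not_null = names("SELECT name FROM users WHERE end_date IS NOT NULL ORDER BY id")

# "Has passed" is a less-than comparison; NULL rows drop out automatically.
now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
expired = names("SELECT name FROM users WHERE end_date < ? ORDER BY id", (now,))

print(not_equal_null, is_not_null, expired)
```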

mysql2sqlite.sh Auto_Increment

Original MySQL tbl_driver:
delimiter $$
CREATE TABLE `tbl_driver` (
`_id` int(11) NOT NULL AUTO_INCREMENT,
`Driver_Code` varchar(45) NOT NULL,
`Driver_Name` varchar(45) NOT NULL,
`AddBy_ID` int(11) NOT NULL,
PRIMARY KEY (`_id`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1$$
mysql2sqlite.sh
#!/bin/sh
# Converts a mysqldump file into a Sqlite 3 compatible file. It also extracts the MySQL `KEY xxxxx` from the
# CREATE block and create them in separate commands _after_ all the INSERTs.
# Awk is chosen because it's fast and portable. You can use gawk, original awk or even the lightning fast mawk.
# The mysqldump file is traversed only once.
# Usage: $ ./mysql2sqlite mysqldump-opts db-name | sqlite3 database.sqlite
# Example: $ ./mysql2sqlite --no-data -u root -pMySecretPassWord myDbase | sqlite3 database.sqlite
# Thanks to @artemyk and @gkuenning for their nice tweaks.
mysqldump --compatible=ansi --skip-extended-insert --compact "$@" | \
awk '
BEGIN {
FS=",$"
print "PRAGMA synchronous = OFF;"
print "PRAGMA journal_mode = MEMORY;"
print "BEGIN TRANSACTION;"
}
# CREATE TRIGGER statements have funny commenting. Remember we are in trigger.
/^\/\*.*CREATE.*TRIGGER/ {
gsub( /^.*TRIGGER/, "CREATE TRIGGER" )
print
inTrigger = 1
next
}
# The end of CREATE TRIGGER has a stray comment terminator
/END \*\/;;/ { gsub( /\*\//, "" ); print; inTrigger = 0; next }
# The rest of triggers just get passed through
inTrigger != 0 { print; next }
# Skip other comments
/^\/\*/ { next }
# Print all `INSERT` lines. The single quotes are protected by another single quote.
/INSERT/ {
gsub( /\\\047/, "\047\047" )
gsub(/\\n/, "\n")
gsub(/\\r/, "\r")
gsub(/\\"/, "\"")
gsub(/\\\\/, "\\")
gsub(/\\\032/, "\032")
print
next
}
# Print the `CREATE` line as is and capture the table name.
/^CREATE/ {
print
if ( match( $0, /\"[^\"]+/ ) ) tableName = substr( $0, RSTART+1, RLENGTH-1 )
}
# Replace `FULLTEXT KEY` or any other `XXXXX KEY` except PRIMARY by `KEY`
/^ [^"]+KEY/ && !/^ PRIMARY KEY/ { gsub( /.+KEY/, " KEY" ) }
# Get rid of field lengths in KEY lines
/ KEY/ { gsub(/\([0-9]+\)/, "") }
# Print all fields definition lines except the `KEY` lines.
/^ / && !/^( KEY|\);)/ {
gsub( /AUTO_INCREMENT|auto_increment/, "" )
gsub( /(CHARACTER SET|character set) [^ ]+ /, "" )
gsub( /DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP|default current_timestamp on update current_timestamp/, "" )
gsub( /(COLLATE|collate) [^ ]+ /, "" )
gsub(/(ENUM|enum)[^)]+\)/, "text ")
gsub(/(SET|set)\([^)]+\)/, "text ")
gsub(/UNSIGNED|unsigned/, "")
if (prev) print prev ","
prev = $1
}
# `KEY` lines are extracted from the `CREATE` block and stored in array for later print
# in a separate `CREATE KEY` command. The index name is prefixed by the table name to
# avoid a sqlite error for duplicate index name.
/^( KEY|\);)/ {
if (prev) print prev
prev=""
if ($0 == ");"){
print
} else {
if ( match( $0, /\"[^"]+/ ) ) indexName = substr( $0, RSTART+1, RLENGTH-1 )
if ( match( $0, /\([^()]+/ ) ) indexKey = substr( $0, RSTART+1, RLENGTH-1 )
key[tableName]=key[tableName] "CREATE INDEX \"" tableName "_" indexName "\" ON \"" tableName "\" (" indexKey ");\n"
}
}
# Print all `KEY` creation lines.
END {
for (table in key) printf key[table]
print "END TRANSACTION;"
}
'
exit 0
When I execute this script, my SQLite table ends up like this:
SQLite tbl_driver:
CREATE TABLE "tbl_driver" (
"_id" int(11) NOT NULL ,
"Driver_Code" varchar(45) NOT NULL,
"Driver_Name" varchar(45) NOT NULL,
"AddBy_ID" int(11) NOT NULL,
PRIMARY KEY ("_id")
)
I want to change "_id" int(11) NOT NULL ,
to "_id" int(11) NOT NULL PRIMARY KEY AUTO_INCREMENT,
or to "_id" int(11) NOT NULL AUTO_INCREMENT,
(without the primary key is also fine).
Any idea how to modify this script?
The AUTO_INCREMENT keyword is specific to MySQL.
SQLite has a keyword AUTOINCREMENT (without the underscore) which means the column auto-generates monotonically increasing values that have never been used before in the table.
If you leave out the AUTOINCREMENT keyword (as the script you show does currently), SQLite assigns the ROWID to a new row, which means it will be a value 1 greater than the current greatest ROWID in the table. This could re-use values if you delete rows from the high end of the table and then insert new rows.
See http://www.sqlite.org/autoinc.html for more details.
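The difference between the default ROWID behaviour and AUTOINCREMENT can be observed with a short Python sketch using the built-in sqlite3 module. The table names plain and auto and the helper function are invented for this demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plain (id INTEGER PRIMARY KEY, v TEXT)")
conn.execute("CREATE TABLE auto (id INTEGER PRIMARY KEY AUTOINCREMENT, v TEXT)")


def next_id_after_deleting_last(table):
    """Insert three rows, delete the highest id, insert again, return the new id."""
    for value in ("a", "b", "c"):
        conn.execute(f"INSERT INTO {table} (v) VALUES (?)", (value,))
    conn.execute(f"DELETE FROM {table} WHERE id = 3")
    cursor = conn.execute(f"INSERT INTO {table} (v) VALUES ('d')")
    return cursor.lastrowid


plain_id = next_id_after_deleting_last("plain")  # the freed id is handed out again
auto_id = next_id_after_deleting_last("auto")    # a never-used id is chosen

print("without AUTOINCREMENT:", plain_id)
print("with AUTOINCREMENT:   ", auto_id)
```

If reusing the ids of deleted rows is acceptable for the application, the plain INTEGER PRIMARY KEY form is enough.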
If you want to modify this script to add the AUTOINCREMENT keyword, it looks like you could modify this line:
gsub( /AUTO_INCREMENT|auto_increment/, "" )
To this:
gsub( /AUTO_INCREMENT|auto_increment/, "AUTOINCREMENT" )
Re your comments:
Okay I tried it on a dummy table using sqlite3.
sqlite> create table foo (
i int autoincrement,
primary key (i)
);
Error: near "autoincrement": syntax error
Apparently SQLite requires that autoincrement follow a column-level primary key constraint. It's not happy with the MySQL convention of putting the pk constraint at the end, as a table-level constraint. That's supported by the syntax diagrams in the SQLite documentation for CREATE TABLE.
Let's try putting primary key before autoincrement.
sqlite> create table foo (
i int primary key autoincrement
);
Error: AUTOINCREMENT is only allowed on an INTEGER PRIMARY KEY
And apparently SQLite doesn't like "INT", it prefers "INTEGER":
sqlite> create table foo (
i integer primary key autoincrement
);
sqlite>
Success!
So your awk script is not able to translate MySQL table DDL into SQLite as easily as you thought it would.
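The three attempts above can be replayed from Python with the built-in sqlite3 module. This is only a sketch for reproducing the session; try_ddl and the labels are made-up names, and each statement runs against a fresh in-memory database.

```python
import sqlite3

attempts = {
    "table-level primary key": "CREATE TABLE foo (i INT AUTOINCREMENT, PRIMARY KEY (i))",
    "INT column primary key": "CREATE TABLE foo (i INT PRIMARY KEY AUTOINCREMENT)",
    "INTEGER column primary key": "CREATE TABLE foo (i INTEGER PRIMARY KEY AUTOINCREMENT)",
}


def try_ddl(sql):
    """Return "ok" if SQLite accepts the statement, else the error message."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        conn.close()


results = {label: try_ddl(sql) for label, sql in attempts.items()}
for label, outcome in results.items():
    print(f"{label}: {outcome}")
```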
Re your comments:
You're trying to duplicate the work of a Perl module called SQL::Translator, which is a lot of work. I'm not going to write a full working script for you.
To really solve this, and make a script that can automate all syntax changes to make the DDL compatible with SQLite, you would need to implement a full parser for SQL DDL. This is not practical to do in awk.
I recommend that you use your script for some of the cases of keyword substitution, and then if further changes are necessary, fix them by hand in a text editor.
Also consider making compromises. If it's too difficult to reformat the DDL to use the AUTOINCREMENT feature in SQLite, consider if the default ROWID functionality is close enough. Read the link I posted above to understand the differences.
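For the narrow case shown in the question (one AUTO_INCREMENT column that is also the sole primary key), a post-processing step can be sketched in Python. This is not a general DDL translator, and it is not part of mysql2sqlite.sh: convert_autoincrement is a hypothetical helper that assumes the ANSI-quoted layout mysqldump produces, with one column per line, and it leaves anything it does not recognise untouched.

```python
import re
import sqlite3


def convert_autoincrement(ddl):
    """Rewrite a single-column AUTO_INCREMENT primary key for SQLite."""
    auto_col = None
    out = []
    for line in ddl.splitlines():
        match = re.match(r'\s*"(\w+)"\s+\w+(\(\d+\))?.*\bAUTO_INCREMENT\b', line, re.IGNORECASE)
        if match and auto_col is None:
            auto_col = match.group(1)
            out.append(f'  "{auto_col}" INTEGER PRIMARY KEY AUTOINCREMENT,')
        else:
            out.append(line)
    if auto_col is None:
        return ddl
    # Drop the now-redundant table-level PRIMARY KEY ("col") line.
    pk_line = re.compile(r'\s*PRIMARY KEY \("%s"\),?\s*$' % re.escape(auto_col), re.IGNORECASE)
    text = "\n".join(line for line in out if not pk_line.match(line))
    # Removing the last definition can leave a dangling comma before ")".
    return re.sub(r",\s*\n\s*\)", "\n)", text)


mysql_ddl = '''CREATE TABLE "tbl_driver" (
  "_id" int(11) NOT NULL AUTO_INCREMENT,
  "Driver_Code" varchar(45) NOT NULL,
  "Driver_Name" varchar(45) NOT NULL,
  "AddBy_ID" int(11) NOT NULL,
  PRIMARY KEY ("_id")
)'''

sqlite_ddl = convert_autoincrement(mysql_ddl)
print(sqlite_ddl)

# Check that SQLite accepts the result and generates ids.
conn = sqlite3.connect(":memory:")
conn.execute(sqlite_ddl)
cursor = conn.execute(
    'INSERT INTO "tbl_driver" ("Driver_Code", "Driver_Name", "AddBy_ID") VALUES (?, ?, ?)',
    ("D1", "Driver One", 1),
)
first_id = cursor.lastrowid
```

Composite keys, foreign keys and other constraints are outside what this sketch handles, which is the point made above about needing a real parser for the general case.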
I found a weird solution, but it works with PHP Doctrine.
Create a MySQL database.
Create Doctrine 2 entities from the database and make sure everything is consistent.
Doctrine 2 has a feature that compares the entities to the database and alters the database to match the entities.
Exporting the database with mysql2sqlite.sh produces exactly what you describe.
Then configure the Doctrine driver to use the SQLite database and run, via Composer:
vendor/bin/doctrine-module orm:schema-tool:update --force
This fixes up the auto increment without having to do it by hand.

How to access the mysql table field schema description column?

How to obtain or query the description column of the table schema?
Currently:
si_table_name = params[:rid]
@si_field_names = Array.new
si_cols = ActiveRecord::Base.connection.columns(si_table_name, "#{name} Columns")
si_cols.each do |c|
  @si_field_names << "#{c.name}:#{c.type}" # <---------------
end
Goal: (this example doesn't work... looking for the correct way to query this)
si_table_name = params[:rid]
@si_field_names = Array.new
si_cols = ActiveRecord::Base.connection.columns(si_table_name, "#{name} Columns")
si_cols.each do |c|
  @si_field_names << "#{c.name}:#{c.type}:#{c.description}" # <---------------
end
Not sure what you mean by 'description'.
In any case, the table metadata can be queried using the information schema.
See
http://dev.mysql.com/doc/refman/5.6/en/columns-table.html
and in particular the table information_schema.columns, column COLUMN_COMMENT.
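As a rough illustration of that lookup, the sketch below builds the information_schema query and formats the result as name:type:comment strings. It is written in Python against a generic DB-API cursor; describe_columns is a hypothetical helper, and the %s placeholder style assumes a MySQL driver such as PyMySQL or mysqlclient. No connection is opened here.

```python
# Columns of information_schema.columns used here: COLUMN_NAME, COLUMN_TYPE,
# COLUMN_COMMENT, TABLE_SCHEMA, TABLE_NAME and ORDINAL_POSITION.
COLUMN_COMMENT_SQL = """
SELECT COLUMN_NAME, COLUMN_TYPE, COLUMN_COMMENT
FROM information_schema.columns
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = %s
ORDER BY ORDINAL_POSITION
"""


def describe_columns(cursor, table_name):
    """Return "name:type:comment" for every column of table_name.

    cursor is assumed to be a DB-API cursor on a MySQL connection whose
    default database contains the table.
    """
    cursor.execute(COLUMN_COMMENT_SQL, (table_name,))
    return [
        f"{name}:{column_type}:{comment}"
        for name, column_type, comment in cursor.fetchall()
    ]
```

The table name is passed as a bound parameter rather than interpolated into the SQL string, which matters when it comes from request parameters as in the question.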

Fetch all table name and row count for specific table with Rails?

How can I fetch every table name, along with its row count and size, from a specific database?
Result
Table Name , Row Count , Table Size(MB)
---------------------------------------
table_1 , 10 , 2.45
table_2 , 20 , 4.00
ActiveRecord::Base.connection.tables.each do |table|
h = ActiveRecord::Base.connection.execute("SHOW TABLE STATUS LIKE '#{table}'").fetch_hash
puts "#{h['Name']} has #{h['Rows']} rows with size: #{h['Data_length']}"
end
The question is tagged mysql but you can do it in a DB-agnostic manner via ORM.
class DatabaseReport
def entry_counts
table_model_names.map do |model_name|
entity = model_name.constantize rescue nil
next if entity.nil?
{ entity.to_s => entity.count }
end.compact
end
private
def table_model_names
ActiveRecord::Base.connection.tables.map(&:singularize).map(&:camelize)
end
end
Note that this will skip tables for which you don't have an object mapping such as meta tables like ar_internal_metadata or schema_migrations. It also cannot infer scoped models (but could be extended to do so). E.g. with Delayed::Job I do this:
def table_model_names
ActiveRecord::Base.connection.tables.map(&:singularize).map(&:camelize) + ["Delayed::Job"]
end
I came up with my own version which is also db agnostic.
As it uses the descendants directly it also handles any tables where the table_name is different to the model name.
The rescue nil exists for cases when you have the class that inherits from ActiveRecord but for some reason don't have a table associated with it. It does give data for STI classes and the parent class.
my_models = ActiveRecord::Base.descendants
results = my_models.inject({}) do |result, model|
result[model.name] = model.count rescue nil
result
end
@temp_table = []
ActiveRecord::Base.connection.tables.each do |table|
  count = ActiveRecord::Base.connection.execute("SELECT COUNT(*) as count FROM #{table}").fetch_hash['count']
  size = ActiveRecord::Base.connection.execute("SHOW TABLE STATUS LIKE '#{table}'").fetch_hash
  @temp_table << {:table_name => table,
    :records => count.to_i,
    :size_of_table => ((BigDecimal(size['Data_length']) + BigDecimal(size['Index_length']))/1024/1024).round(2)
  }
end