I have a Couchbase database and I would like to store price without losing precision - double is really not good enough for my application. However, it seems that there is no support for currency data types in Couchbase.
Is there a preferred solution for this problem for this database engine?
I was thinking about storing each price twice, once as a string and once as a double, so that I can still run inequality queries against the price. That's better than nothing, but not really a nice solution.
This is really a problem with JSON, but since Couchbase uses pure JSON, it applies :)
One solution that I've seen is to store it as an integer.
For example, if you want to store a price of $129.99, you would store the number 12999. This could be kinda annoying, but depending on what language/framework you're using, it could be relatively easy to customize your (de)serializer to handle this automatically. Or you could create a calculated property in your class (assuming you're using OOP). Or you could use AOP.
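For instance, here's a rough sketch of the conversion in Java (the class and method names are made up; the same idea works in any language):

```java
import java.math.BigDecimal;

// Sketch only: the price is stored in the JSON document as an integer number of
// cents and converted to/from BigDecimal at the application boundary.
public final class PriceCents {

    // $129.99 -> 12999
    public static long toCents(BigDecimal price) {
        return price.movePointRight(2).longValueExact();
    }

    // 12999 -> 129.99
    public static BigDecimal fromCents(long cents) {
        return BigDecimal.valueOf(cents, 2);
    }

    public static void main(String[] args) {
        long stored = toCents(new BigDecimal("129.99"));
        System.out.println(stored);             // 12999
        System.out.println(fromCents(stored));  // 129.99
    }
}
```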
But in any case, your precision is stored. Your string solution would also work, with similar caveats.
Related
I need to store an array of integers of length about 1000 against an integer ID and string name. The number of such tuples is almost 160000.
I will pick one array and calculate the root mean square deviation (RMSD) elementwise with all others and store an (ID1,ID2,RMSD) tuple in another table.
Could you please suggest the best way to do this? I am currently using MySQL for other datatables in the same project but if necessary I will switch.
One possibility would be to store the arrays in a BINARY or a BLOB type column. Given that the base type of your arrays is an integer, you could step through four bytes at a time to extract values at each index.
If I understand the context correctly, the arrays must all be of the same fixed length, so a BINARY type column would be the most efficient, if it offers sufficient space to hold your arrays. You don't have to worry about database normalisation here, because your array is an atomic unit in this context (again, assuming I'm understanding the problem correctly).
If you did have a requirement to access only part of each array, then this may not be the most practical way to store the data.
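For illustration, packing and reading such an array in Java might look roughly like this (class and method names are made up; it assumes 32-bit integers, i.e. 4 bytes per element):

```java
import java.nio.ByteBuffer;

// Sketch only: pack a fixed-length int[] into a byte[] for a BINARY/BLOB column,
// and read a single element back without unpacking the whole array.
public final class IntArrayBlob {

    public static byte[] pack(int[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * 4);
        for (int v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    public static int[] unpack(byte[] blob) {
        ByteBuffer buf = ByteBuffer.wrap(blob);
        int[] values = new int[blob.length / 4];
        for (int i = 0; i < values.length; i++) {
            values[i] = buf.getInt();
        }
        return values;
    }

    // Read the element at a given index directly from the packed bytes.
    public static int elementAt(byte[] blob, int index) {
        return ByteBuffer.wrap(blob).getInt(index * 4);
    }
}
```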
A secondary consideration is whether to compute the RMSD value in the database itself, or in some external language on the server. As you've mentioned in your comments, this will be most efficient to do in the database. It sounds like queries are going to be fairly expensive anyway, though, and the execution time may not be a primary concern: simplicity of coding in another language may be more desirable. Also, depending on the cost of computing the RMSD value relative to the cost of round-tripping a query to the database, it may not even make that much of a difference.
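If you do compute it in application code, the RMSD itself is only a few lines; here is a sketch (Java, made-up names, arrays assumed to be the same length):

```java
// Sketch only: element-wise RMSD between two equal-length arrays, computed in
// application code after unpacking them from their BINARY/BLOB columns.
public final class Rmsd {

    public static double rmsd(int[] a, int[] b) {
        double sumOfSquares = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares / a.length);
    }

    public static void main(String[] args) {
        System.out.println(rmsd(new int[] {1, 2, 3}, new int[] {1, 4, 3})); // ~1.1547
    }
}
```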
Alternatively, as you've alluded to in your question, using Postgres could be worth considering, because of its more expressive PL/pgSQL language.
Incidentally, if you want to search around for more information on good approaches, searching for database and time series would probably be fruitful. Your data is not necessarily time series data, but many of the same considerations would apply.
I need to store a value that can be of different types. The value can represent a DATE, INT, BOOLEAN, DOUBLE, or any other type. I'm wondering whether it is possible to avoid using multiple tables, one per type (I suppose that would significantly complicate use of the stored values, mainly searching and filtering). Would there be significant storage and performance degradation from a table with multiple columns where most values in a row are NULL?
I'm thinking of a table with columns like the following, from which only one value column per row would be filled:
`id | valueVarchar | valueDate | valueBoolean | valueInt | valueDouble`
If that approach is clearly wrong please enlighten me.
I'm creating a JSF application using MySQL (InnoDB) (the database is not a big issue, it can be changed if necessary) and JPA.
Edit:
For now I have one table with a single text value field, and I convert values to/from the database on the server side. Because the project has only recently been started, and changing the model now will be less painful than later, I'm looking into whether a better approach exists.
If you are using EclipseLink, you can use a @TypeConverter to convert any data type to String. You could also have two columns, one for the value and one for the value's type; you could map this using an @Transformation mapping.
With generic JPA you could transform the type through get/set methods using property access.
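As a rough illustration of the property-access approach (the entity, column, and type names here are made up):

```java
import javax.persistence.Access;
import javax.persistence.AccessType;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Transient;

// Sketch only: the value is persisted as a String plus a type discriminator and
// converted back to a typed object in the property accessors.
@Entity
@Access(AccessType.PROPERTY)
public class StoredValue {

    private Long id;
    private String rawValue;   // persisted representation of the value
    private String valueType;  // e.g. "INT", "DOUBLE", "BOOLEAN", "DATE"

    @Id
    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }

    @Column(name = "value_text")
    public String getRawValue() { return rawValue; }
    public void setRawValue(String rawValue) { this.rawValue = rawValue; }

    @Column(name = "value_type")
    public String getValueType() { return valueType; }
    public void setValueType(String valueType) { this.valueType = valueType; }

    // Not persisted: reconstructs the typed value for use in business logic.
    @Transient
    public Object getTypedValue() {
        switch (valueType) {
            case "INT":     return Integer.valueOf(rawValue);
            case "DOUBLE":  return Double.valueOf(rawValue);
            case "BOOLEAN": return Boolean.valueOf(rawValue);
            default:        return rawValue;  // DATE etc. would be handled similarly
        }
    }
}
```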
From the comments, it appears that it is intended that for each entity, each attribute will have its value stored separately.
There is a name for this: it is called the Entity-Attribute-Value model, or EAV for short.
Although there are situations in which EAV is the only applicable solution, its use is generally regarded as an anti-pattern where viable alternatives are available.
A case study of an inappropriate implementation of an EAV database can be found here.
One of the most obvious purposes for which EAV can be used is for the persistence of Object data from OO designs, in relational databases. If this is how you want to use it, I urge you to consider Object-Relational Mapping (ORM for short) instead.
You can find EAV-related questions on SO using the eav tag.
You can store anything as a binary value and cast it to the proper type on the client side, but such a value can hardly be used efficiently in query conditions.
So I am creating a time tracking application using Ruby on Rails and am storing the time as a number representing hours.
Since anything smaller than 0.01 hours (36 seconds) is irrelevant, I only need 2 decimal places.
I am using a MySQL database with a float as the column type. While this works most of the time, every now and then I get an error with the calculation and rounding of floats.
I have done some research into my options and see that a lot of people recommend using BigDecimal. Since I use a lot of custom database queries with calculations, I wanted to know how changing the column type would affect this. Is the value stored as a string or YAML, or is it natively supported by MySQL?
Or is there an equivalent way to do fixed-point decimal arithmetic in Ruby / Rails?
I assume any approach is going to require a fair amount of refactoring; how can I minimize that?
Any insight is appreciated.
Instead of storing the time as a number representing hours, store it as a number representing increments of 36 seconds (or maybe individual seconds).
You shouldn't need a decimal-capable column type to do fixed-point arithmetic; simply divide in the business logic to get hours.
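A minimal sketch of that idea (in Java rather than Ruby, just to show the arithmetic; the column and variable names are made up):

```java
// Sketch only: the duration is stored as a plain integer number of seconds (or
// 36-second increments) and converted to hours in the business logic on demand.
public final class DurationAsSeconds {
    public static void main(String[] args) {
        long storedSeconds = 5400;               // value read from an INT column
        double hours = storedSeconds / 3600.0;   // 1.5 hours, derived for display
        long increments = storedSeconds / 36;    // 150 increments of 36 seconds
        System.out.println(hours + " h = " + increments + " increments");
    }
}
```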
MySQL does have a native fixed-point DECIMAL type, which Rails maps to BigDecimal: http://dev.mysql.com/doc/refman/5.1/en/precision-math-decimal-changes.html
I would suggest using that; it works well in my Rails applications. Allowing the database to handle that instead of the application makes life easier - you're using the abstractions the way they're designed.
Here's the migration code:
change_column :table_name, :column_name, :decimal
Reference: Rails migration for change column
We have actually built a time tracking app (http://www.yanomo.com) and store all our times as the number of hours they represent, with MySQL as the underlying DBMS. For the column type we use DECIMAL(precision,scale). In your case something like DECIMAL(5,2) would do. In our business logic (Java) we use BigDecimal.
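On the application side, the conversion is straightforward; a minimal sketch (the class and method names are made up, assuming a DECIMAL(5,2) hours column as described above):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Sketch only: hours kept as BigDecimal with scale 2, matching a DECIMAL(5,2)
// column, so 0.01 hours (36 seconds) is the smallest representable unit.
public final class TrackedTime {

    public static BigDecimal hoursFromSeconds(long seconds) {
        return BigDecimal.valueOf(seconds)
                .divide(BigDecimal.valueOf(3600), 2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        System.out.println(hoursFromSeconds(5400)); // 1.50
    }
}
```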
I'm developing a form generator, and wondering if it would be bad mojo to store JSON in an SQL database?
I want to keep my database & tables simple, so I was going to have
`pKey, formTitle, formJSON`
on a table, and then store
{["firstName":{"required":"true","type":"text"},"lastName":{"required":"true","type":"text"}}
in formJSON.
Any input is appreciated.
I use JSON extensively in my CMS (which hosts about 110 sites) and I find data access to be very fast. I was surprised that there wasn't more speed degradation. Every object in the CMS (Page, Layout, List, Topic, etc.) has an NVARCHAR(MAX) column called JSONConfiguration. My ORM tool knows to look for that column and reconstitute it as an object if needed. Or, depending on the situation, I will just pass it to the client for jQuery or Ext JS to process.
As for readability / maintainability of my code, you might say it's improved because I now have classes that represent a lot of the JSON objects stored in the DB.
I used JSON.net for all serialization / deserialization. https://www.newtonsoft.com/json
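For illustration only, here is the same pattern sketched in Java with Jackson rather than JSON.net (the config class and its properties are made up):

```java
import com.fasterxml.jackson.databind.ObjectMapper;

// Sketch only: a class representing the stored JSON is serialized into the
// object's JSONConfiguration column and reconstituted when it is read back.
public final class JsonColumnExample {

    public static class ListConfig {          // made-up config class
        public int pageSize = 25;
        public String sortBy = "title";
    }

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();

        String json = mapper.writeValueAsString(new ListConfig());      // -> column value
        ListConfig restored = mapper.readValue(json, ListConfig.class); // <- column value

        System.out.println(json + " / " + restored.sortBy);
    }
}
```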
I also use a single query to return meta-JSON with the actual data. As in the case of Ext JS, I have queries that return both the structure of the Ext JS object as well as the data the object will need. This cuts out one post back / SQL round trip.
I was also surprised at how fast the code was to parse a list of JSON objects and map them into a DataTable object that I then handed to a GridView.
The only downside I've seen to using JSON is indexing. If you have a property of the JSON you need to search, then you have to store it as a separate column.
There are JSON DBs out there that might serve your needs better: CouchDB, MongoDB, and Cassandra.
A brilliant way to make an object database out of SQL Server. I do this for all config objects and everything else that doesn't need any specific querying. Extending your object is easy: just create a new property in your class and initialize it with a default value. Don't need a property any more? Just delete it from the class. Easy roll-out, easy upgrade. Not suitable for all objects, but as long as you extract any property you need to index on into its own column, keep using it. A very modern way of using SQL Server.
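To make the "initialize with a default value" point concrete, a tiny sketch (Java with Jackson here, but any serializer that ignores missing properties behaves the same; the class is made up):

```java
import com.fasterxml.jackson.databind.ObjectMapper;

// Sketch only: a property added to the class later simply keeps its default
// value when older JSON rows in the database don't contain it yet.
public final class ConfigEvolution {

    public static class PageConfig {
        public String title;
        public int cacheSeconds = 60;   // new property with a default value
    }

    public static void main(String[] args) throws Exception {
        String oldRow = "{\"title\":\"Home\"}";   // stored before the field existed
        PageConfig cfg = new ObjectMapper().readValue(oldRow, PageConfig.class);
        System.out.println(cfg.title + " / " + cfg.cacheSeconds); // Home / 60
    }
}
```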
It will be slower than having the form defined in code, but one extra query shouldn't cause you much harm. (Just don't let 1 extra query become 10 extra queries!)
Edit: If you are selecting the row by formTitle instead of pKey (I would, because then your code will be more readable), put an index on formTitle
We have used a modified version of XML for exactly the purpose you describe for seven or eight years and it works great. Our customers' form needs are so diverse that we could never keep up with a table/column approach. We are too far down the XML road to change very easily, but I think JSON would work as well and maybe even better.
Reporting is no problem with a couple of good parsing functions and I would defy anyone to find a significant difference in performance between our reporting/analytics and a table/column solution to this need.
I wouldn't recommend it.
If you ever want to do any reporting or query based on these values in the future it's going to make your life a lot harder than having a few extra tables/columns.
Why are you avoiding making new tables? I say if your application requires them go ahead and add them in... Also if someone has to go through your code/db later it's probably going to be harder for them to figure out what you had going on (depending on what kind of documentation you have).
You should be able to use SisoDb for this. http://sisodb.com
I think it is not an optimal idea to store object data as a string in SQL. You have to do the transformation outside of SQL in order to parse it. That presents a performance issue, and you lose the leverage of SQL's native querying capabilities. A better way would be to store the data in SQL Server's XML datatype. This way you kill two birds with one stone: you don't have to create a shitload of tables and you still get the native querying benefits of SQL.
XML in SQL Server 2005? Better than JSON in Varchar?
Trying to make a MySQL-based application support MS SQL, I ran into the following issue:
I keep MySQL's auto_increment as unsigned integer fields (of various sizes) in order to make use of the full range, as I know there will never be negative values. MS SQL does not support the unsigned attribute on all integer types, so I have to choose between ditching half the value range or creating some workaround.
One very naive approach would be to put some code in the database abstraction layer, or in a stored procedure, that converts between negative values on the DB side and values from the upper half of the unsigned range. This would mess up sorting, of course, and it would also not work with the auto-id feature (or would it, somehow?).
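Roughly, the kind of mapping I have in mind would look like this (a Java sketch; it preserves the full 32-bit range, but as said, the natural ordering of the stored values is lost):

```java
// Sketch only: reuse the negative half of a signed 32-bit column to represent
// the upper half of the unsigned range. Note that this breaks natural ordering.
public final class UnsignedMapping {

    // unsigned value (0 .. 2^32-1) -> signed 32-bit value stored in MS SQL
    public static int toSigned(long unsigned) {
        return (int) unsigned;            // wraps values >= 2^31 into negatives
    }

    // signed stored value -> unsigned value seen by the application
    public static long toUnsigned(int stored) {
        return Integer.toUnsignedLong(stored);
    }

    public static void main(String[] args) {
        long big = 4_000_000_000L;                 // does not fit in a signed INT
        int stored = toSigned(big);                // -294967296
        System.out.println(toUnsigned(stored));    // 4000000000
    }
}
```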
I can't think of a good workaround right now, is there any? Or am I just being fanatic and should simply forget about half the range?
Edit:
@Mike Woodhouse: Yeah, I guess you're right. There's still a voice in my head saying that maybe I could reduce the field's size if I optimized its utilization. But if there's no easy way to do this, it's probably not worth worrying about.
When is the problem likely to become a real issue?
Given current growth rates, how soon do you expect signed integer overflow to happen in the MS SQL version?
Be pessimistic.
How long do you expect the application to live?
Do you still think the factor of 2 difference is something you should worry about?
(I have no idea what the answers are, but I think we should be sure that we really have a problem before searching any harder for a solution)
I would recommend using the BIGINT data type as this goes up to 9,223,372,036,854,775,807.
SQL Server does not offer a choice between signed and unsigned integer types.
I would say this: "How do we normally deal with differences between components?"
Encapsulate what varies.
You need to create an abstraction layer within your data access layer to get it to the point where it doesn't care whether the database is MySQL or MS SQL.
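A rough sketch of what that encapsulation might look like (the interface name and DDL strings are only illustrative):

```java
// Sketch only: hide the vendor difference behind a small interface so the rest
// of the application never knows whether MySQL or MS SQL is underneath.
public interface IdColumnStrategy {
    String idColumnDdl();   // column definition used when creating tables
}

final class MySqlIdColumn implements IdColumnStrategy {
    public String idColumnDdl() { return "id INT UNSIGNED NOT NULL AUTO_INCREMENT"; }
}

final class MsSqlIdColumn implements IdColumnStrategy {
    public String idColumnDdl() { return "id BIGINT NOT NULL IDENTITY(1,1)"; }
}
```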