What is a better way to store spatial data (say, tracks) in MySQL:
internally, or as references to external flat files?
MySQL has spatial extensions for storing geographic objects (objects with geometric attributes). More detail is available in the MySQL documentation.
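As a rough illustration of the "internal" option, a track can be turned into a WKT LINESTRING and inserted through a spatial function. This is only a sketch: the `tracks` table, its columns, and the sample coordinates are made up for the example, and the `%s` placeholders assume a typical Python MySQL driver. `ST_GeomFromText` is the name in newer MySQL versions; older versions spell it `GeomFromText`.

```python
def track_to_wkt(points):
    """Build a WKT LINESTRING from (x, y) pairs, e.g. (longitude, latitude)."""
    if len(points) < 2:
        raise ValueError("a LINESTRING needs at least two points")
    coords = ", ".join(f"{x} {y}" for x, y in points)
    return f"LINESTRING({coords})"


# Hypothetical table: tracks(id INT PRIMARY KEY AUTO_INCREMENT,
#                            name VARCHAR(100), path LINESTRING)
INSERT_SQL = "INSERT INTO tracks (name, path) VALUES (%s, ST_GeomFromText(%s))"

# Made-up sample track; with a real driver you would call
# cursor.execute(INSERT_SQL, params).
track = [(14.42, 50.08), (14.43, 50.09), (14.45, 50.11)]
params = ("morning run", track_to_wkt(track))
print(params[1])
```

Passing the WKT text as a parameter keeps the SQL itself fixed, so the same statement works for any track.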
I would recommend against MySQL if you want to store it as explicitly spatial information. Instead, I would recommend PostgreSQL/PostGIS if you want to stay with an open source database. MySQL implements very little of its documented spatial functionality. If you read the docs closely, most spatial functions are yet to be implemented.
If you don't care about explicitly spatial information, then go ahead and store it directly in the DB.
If you give some more background on what you want to do, we might be able to help more.
The "better way" to store data depends on several factors that you need to consider yourself:
Are the files ridiculously large? 50MB+? MySQL can time out on long transactions.
Are you working on a closed network environment where the file system is secure and controlled?
Do you plan only to serve the raw files? There's no point in processing them into MySQL format only to re-process them on the way out.
Is it expected that 'non technical' people are going to want to access this data? 'non technical' people generally don't like obfuscated information.
Do you have the capability in your application (if you have an application) to read the spatial data in the format that MySQL stores it in? There's no point in processing and storing a .gpx or .shp file in MySQL format if you can't read it back from there.
Do you have a system / service that will control the addition / removal / modification of the file structure and corresponding database records? Keeping a database and file system in sync is not an easy task. Especially when you consider the involvement of 'non technical' people.
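To give an idea of what keeping a database and a file system in sync involves, here is a minimal consistency check. It is only a sketch under assumptions: the database is taken to store file paths relative to one root directory, and the file names in the demo are invented. Fetching the recorded paths from MySQL is left out.

```python
import os
import tempfile


def find_sync_problems(db_paths, root_dir):
    """Compare paths recorded in the database with files actually on disk.

    db_paths: iterable of paths relative to root_dir, as stored in the DB.
    Returns (missing_on_disk, orphaned_on_disk) as sorted lists.
    """
    recorded = {os.path.normpath(p) for p in db_paths}
    on_disk = set()
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            full = os.path.join(dirpath, name)
            on_disk.add(os.path.normpath(os.path.relpath(full, root_dir)))
    return sorted(recorded - on_disk), sorted(on_disk - recorded)


# Demo with a throwaway directory: one recorded file exists, one recorded
# file is missing, and one file on disk has no database record.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "2009"))
    open(os.path.join(root, "2009", "a.gpx"), "w").close()
    open(os.path.join(root, "stray.gpx"), "w").close()
    missing, orphaned = find_sync_problems(
        [os.path.join("2009", "a.gpx"), os.path.join("2009", "b.gpx")], root
    )
    print("missing on disk:", missing)
    print("no DB record:", orphaned)
```

A check like this only detects drift; something still has to decide what to do with each missing or orphaned file.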
Related
Although this question has appeared in previous posts, different scenarios and different considerations decide which option is best.
I need to implement a system that can handle 200GB - 400GB of images yearly (approximately < 1MB per image). These are P&C images that only authorised personnel may access, and for VIEW only. I am planning to use an application-based system to INSERT into the MySQL database and a PHP web-based application for VIEW only.
I am thinking of using the FILESYSTEM because it is easy to back up and restore the images, and there is no need to worry about the size of the MySQL database.
I am using MySQL + Apache + PHP running on Windows Server.
Your advice and input is very much appreciated.
Thank you.
Regards,
Desmond
Also worth reading:
Best Practice in File Storage while Building Applications - Database (Blob Storage) Vs File System
BLOB Storage as the Best Solution
For better scalability. Although file systems are designed to handle a large number of objects of varying sizes, say files and folders, actually they are not optimized for a huge number (tens of millions) of small files. Database systems are optimized for such scenarios.
For better availability. Database servers have availability features that extend beyond those provided by the file system. Database replication is a set of solutions that allow you to copy, distribute, and potentially modify data in a distributed environment whereas Log shipping provides a way of keeping a stand-by copy of a database in case the primary system fails.
For central repository of data with controlled growth. DBA has the privilege to control and monitor the growth of database and split the database as and when needed.
For full-text index and search operations. You can index and search certain types of data stored in BLOB columns. When a database designer decides that a table will contain a BLOB column and the column will participate in a full-text index, the designer must create, in the same table, a separate character-based data column that will hold the file extension of the file in the corresponding BLOB field. During the full-text indexing operation, the full-text service looks at the extensions listed in the character-based column (.txt, .doc, .xls, etc.), applies the corresponding filter to interpret the binary data, and extracts the textual information needed for indexing and querying.
File System Storage as the Best Solution
When the application in which the images will be used requires streaming performance, such as real-time video playback.
For applications such as Microsoft PhotoDraw® or Adobe PhotoShop, which only know how to access files.
If you want to use some specific feature in the NTFS file system such as Remote Storage.
objects smaller than 256K are best stored in a database while objects
larger than 1M are best stored in the filesystem. Between 256K and 1M,
the read:write ratio and rate of object overwrite or replacement are
important factors.
source:
http://research.microsoft.com/apps/pubs/default.aspx?id=64525
Edit: The paper is about MS SQL Server, so MAYBE the same applies to MySQL :)
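The size rule quoted above can be written down as a small helper. This is just a sketch of that rule of thumb: the thresholds are the ones from the quote, the function name is made up, and the numbers were measured on a different database, so treat the result as a hint rather than a verdict.

```python
KIB = 1024
DB_BELOW = 256 * KIB    # "smaller than 256K" -> database
FS_ABOVE = 1024 * KIB   # "larger than 1M"    -> filesystem


def suggest_storage(size_bytes):
    """Return a storage hint for an object of the given size in bytes."""
    if size_bytes < DB_BELOW:
        return "database"
    if size_bytes > FS_ABOVE:
        return "filesystem"
    # In between, the quote says read:write ratio and overwrite rate decide.
    return "depends on read:write ratio and overwrite rate"


for size in (100 * KIB, 600 * KIB, 5 * 1024 * KIB):
    print(size, "->", suggest_storage(size))
```

For images of just under 1MB each, as in the question, the rule lands in the middle band, so it does not settle the matter on its own.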
I've only recently started to deal with database systems.
I'm developing an iOS app that will have a local database (SQLite) and will have to periodically update it with the contents of a database stored on a web server (MySQL). My question is: what's the best way to fetch the data from the web server and store it in the local database? A few options came to mind, though I don't know if all of them are possible:
Webserver->XML/JSON->Send it->Locally convert and store in local database
Webserver->backupFile->Send it->Feed it to the SQLite db
Are there any other options? Which one is better in terms of amount of data taken?
Thank you
The XML/JSON route is by far the simplest while providing sufficient flexibility to handle updates to the database schema/older versions of the app accessing your web service.
In terms of the second option you mention, there are two approaches - either use an SQL statement dump, or a CSV dump. However:
The "default" (i.e.: mysqldump generated) backup files won't import into SQLite without substantial massaging.
Using a CSV extract/import will mean you have considerably less flexibility in terms of schema changes, etc. so it's probably not a sensible approach if the data format is ever likely to change.
As such, I'd recommend sticking with the tried and tested XML/JSON approach.
In terms of the amount of data transmitted, JSON may be smaller than the equivalent XML, but it really depends on the variable/element names used, etc. (See the existing How does JSON compare to XML in terms of file size and serialisation/deserialisation time? question for more information on this.)
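For what it's worth, the "convert locally and store" step of the JSON route is small. The sketch below uses Python and its built-in sqlite3 module rather than iOS code, purely to show the shape of it; the `items` table, its columns, and the payload layout are all assumptions made up for the example, and the HTTP fetch is left out.

```python
import json
import sqlite3


def apply_sync_payload(conn, payload_text):
    """Insert or replace rows from a JSON payload in the local SQLite table."""
    payload = json.loads(payload_text)
    rows = [
        (item["id"], item["name"], item["updated_at"])
        for item in payload["items"]
    ]
    with conn:  # commits on success, rolls back if anything fails
        conn.executemany(
            "INSERT OR REPLACE INTO items (id, name, updated_at) "
            "VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


# Demo: an in-memory database with one stale row, then a payload that
# updates that row and adds a new one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
)
conn.execute("INSERT INTO items VALUES (1, 'old name', '2011-01-01')")
payload_text = json.dumps({"items": [
    {"id": 1, "name": "new name", "updated_at": "2011-06-01"},
    {"id": 2, "name": "second", "updated_at": "2011-06-02"},
]})
applied = apply_sync_payload(conn, payload_text)
print(applied, conn.execute("SELECT * FROM items ORDER BY id").fetchall())
```

Because the mapping from JSON keys to columns lives in the app, an older app version can simply ignore keys it does not know about, which is the flexibility mentioned above.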
I have an application where customers upload files like Powerpoints and Excel spreadsheets to the application through a web UI. The files then have meta data associated with them and they are stored as BLOBs in a MySQL database. The users may download these files occasionally, but not very often. The emphasis here is on archiving. Security of data is also important.
If that is the case, what are the pros and cons of storing the files as BLOBs in MySQL as opposed to putting them on Amazon S3? I've never used S3 before but hear that it's popular for storing files.
The main advantage of relational databases (such as MySQL) is the elegance with which they let you query data. BLOB columns, however, offer very little in terms of rich query semantics compared to other column types, so if that's your main use case, there's hardly any reason to use a relational database at all; it doesn't offer much above and beyond a regular filesystem or a simple key-value datastore (such as S3).
Dollars to bytes, S3 is likely much more cost effective.
On the other hand, there are some things that a relational database can bring that would be worthwhile. The most obvious is transactional semantics (only on the InnoDB engine, not available with MyISAM), so that you can safely know that whole groups of uploads or modifications take place consistently. Another advantage is that you can still add metadata about your blobs (even if it's only over time, as your application improves), so you can still get some benefit from the rich queries MySQL supports.
Storing binary data in BLOBs:
makes your database fat
has a size limitation (overcome in later versions of MySQL)
loses data portability (you need a MySQL API/client to access the data)
gives you no true security
If you are archiving the binary data,
store it in normal disk files
If security is important,
consider separating your UI server from your storage server,
but is hard to archive,
you can always consider to embed password / encryption into these binary files
security over amazon s3
http://docs.amazonwebservices.com/AmazonS3/latest/dev/index.html?UsingAuthAccess.html
http://docs.amazonwebservices.com/AmazonS3/latest/dev/index.html?S3_QSAuth.html
Security of data is also important.
Do note that files on S3 are not stored on encrypted disks, so you may have to encrypt client-side or on your servers before sending it up to S3.
I've been storing data in S3 for years and completely love it! What I do is upload the file to S3 (where it's copied multiple times, by the way) and then store a reference to the file path and name in my MySQL files table. If nothing else, it takes that much load off the MySQL DB, and S3 now offers AES-256 encryption with rotating master keys, so you know it's secure!
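The "upload, then store a reference" pattern looks roughly like this. It is a sketch with several assumptions: the `upload` argument is a hypothetical callable standing in for whatever S3 client you use, sqlite3 stands in for MySQL so the example runs on its own, and the `files` table and the hash-based key layout are my own choices, not anything S3 requires.

```python
import hashlib
import sqlite3


def store_file(conn, upload, filename, data):
    """Upload bytes to object storage, then record the reference in the DB."""
    digest = hashlib.sha256(data).hexdigest()
    key = f"uploads/{digest[:2]}/{digest}/{filename}"
    # Upload first: if the insert below fails, the worst case is an
    # unreferenced object, not a database row pointing at nothing.
    upload(key, data)
    with conn:
        conn.execute(
            "INSERT INTO files (filename, object_key, size_bytes) "
            "VALUES (?, ?, ?)",
            (filename, key, len(data)),
        )
    return key


# Demo with a dict pretending to be the bucket.
fake_bucket = {}
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (id INTEGER PRIMARY KEY, filename TEXT, "
    "object_key TEXT, size_bytes INTEGER)"
)
key = store_file(conn, fake_bucket.__setitem__, "report.xls", b"example bytes")
print(key)
```

The metadata the question mentions would go in extra columns of the same table, next to the object key.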
I am new to this, so any thoughts are much welcomed. :)
What I am trying to do is read serial data via an RS232 cable going into COM1 of a laptop and then save this data into a web server database of some kind. I think MySQL is the way to go to store my data. However, I don't see much documentation on how I can automate streaming the serial data into the database. I only found this webpage that says it is possible. Any thoughts? Pointers to tutorials and/or references?
Thanks.
MySQL is a relational database. Is the data you read on the serial port relational? From your usage of words, I doubt it.
If it is some kind of measurement data you need to store for a specific interval, the "Round Robin Database" might be a better choice. It even offers the option of storing old data with less resolution using less disk space.
If you insist on using mysql, you probably want to collect the data for a while, and save a standard sized chunk as a "binary large object" along with a timestamp.
A few questions come to mind: are you able to develop and install a software solution, or do you want to build this with off-the-shelf tools?
If you are allowed to install custom software, reading from RS-232 and connecting to MySQL is really simple with C#, so the whole program will be fewer than a hundred lines of code. You just read the stream and from time to time insert it into a table with a structure like id, datetime, TEXT. Depending on the nature of the stream, you can insert after a number of bytes, after a certain time has elapsed, or on some logical condition.
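The read-and-insert loop described above can be sketched as follows. This is Python rather than C#, and it makes a few assumptions: `stream` is any object with a `read(n)` method (a serial library's port object would be the real thing), sqlite3 stands in for MySQL so the example runs by itself, and the `serial_log` table name and the thresholds are invented.

```python
import io
import sqlite3
import time
from datetime import datetime, timezone


def log_stream(stream, conn, max_bytes=64, max_seconds=5.0, chunk_size=16):
    """Read from stream and insert buffered data as rows (id, datetime, TEXT).

    A row is written when the buffer reaches max_bytes or max_seconds have
    passed since the last write. Returns the number of rows written.
    """
    buffer = bytearray()
    last_flush = time.monotonic()
    rows = 0

    def flush():
        nonlocal last_flush, rows
        if buffer:
            text = buffer.decode("ascii", errors="replace")
            with conn:
                conn.execute(
                    "INSERT INTO serial_log (received_at, payload) "
                    "VALUES (?, ?)",
                    (datetime.now(timezone.utc).isoformat(), text),
                )
            buffer.clear()
            rows += 1
        last_flush = time.monotonic()

    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # end of input here; a live port would keep polling
            break
        buffer.extend(chunk)
        if len(buffer) >= max_bytes or time.monotonic() - last_flush >= max_seconds:
            flush()
    flush()  # write whatever is left
    return rows


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE serial_log (id INTEGER PRIMARY KEY, "
    "received_at TEXT, payload TEXT)"
)
rows_written = log_stream(io.BytesIO(b"x" * 150), conn)
print(rows_written)
```

A "logical condition" trigger, such as flushing on a newline, would be one more test next to the byte-count and elapsed-time checks.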
My question is similar to one another user posted here. We are trying to develop an application that supports possibly terabytes of information, based on a land registry in Paraguay, with images and normal data.
The problem is that we want to reduce the cost of operation to the minimum possible, because this is like a competition between companies, and for that reason we want to use a free database. I have read a lot of information about it but I am still confused. We have to keep in mind that the people who are going to use it are government people, so the DB also has to be easy to manage.
What would you recommend?
Thank you very much
MySQL and even SQLite already have spatial indexes, so no problem there.
To store the datafiles you could use a BLOB field, but it's usually much better (and easier to optimise) to store as files. To keep the files related to the DB records you can either put the full path (or URL) in a varchar field, or store the image in a path calculated by the record's ID.
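For the "path calculated by the record's ID" option, something like this works. It is a sketch: the root directory, the nine-digit padding, and the two directory levels are arbitrary choices for the example.

```python
def image_path(record_id, root="/srv/images", ext=".jpg"):
    """Derive a file path from a record ID, spreading files over directories.

    The ID is zero-padded to nine digits; the first two groups of three
    digits become directory names, so no directory grows without bound.
    """
    if record_id < 0:
        raise ValueError("record_id must be non-negative")
    padded = f"{record_id:09d}"
    return "/".join([root, padded[0:3], padded[3:6], padded + ext])


print(image_path(1234567))
```

With this scheme no path column is needed at all, as long as the rule never changes once files have been written.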
To easily scale into a multi-terabyte store, plan from the start on using several servers. If the data is read-mostly, an easy way is to store the images on different hosts, each with a static HTTP server, while the database records where each image is. Then put a webapp frontend on the database, where the URL for each image points directly to the appropriate storage server. That way you can keep adding storage without creating a bottleneck on the 'central' server.
PostgreSQL, SQL Server 2008 and any recent version of Oracle all have spatial indexing, table partitioning and BLOBs, and are capable of acting as the back-end of a large geographic database. You might also want to check out two open-source GIS applications, GRASS and QGIS, which might support doing what you want with less modification work than writing a bespoke application. Both can use PostgreSQL and other database back-ends.
As for support, any commercial or open-source database is going to need the attentions of a competent DBA if you want to get it to work well on terabyte-size databases. I don't think you will get away with a model of pure end-user support - attempts to do this are unlikely to work.
It sounds like the image files will be a considerable part of your storage. Don't store them in a database; just store the file location details in the database.
(If you want access via the internet, try Amazon Storage. It isn't free, but it is very cheap and they handle the scalability for you.)
Another cautionary note on using B/C/LOBs, as I've been bitten on exponential DB growth by storing internally w/in the DB.
What about storing the GIS maps on a separate server and just store the LAT/LONG "shape" of the area w/in the DB. The GIS can be updated separately w/out the cost of storing the images in the main database.
Smaller to admin. Less cost to backup.
Whilst not meeting your criterion of being free, I would strongly recommend you consider using SQL Server 2008, because of two features in this version which could help:
FILESTREAM - allows you to store your binary images within the filesystem, rather than within the database itself. This will make your database much more manageable whilst still allowing you to query the data in the usual way.
GEOGRAPHIC DATA TYPES - support for geospatial (lat/long) datatypes is likely to be very valuable to your solution.
Good luck!
Use ESRI's Image Server. You won't need a database to serve the images. It's very easy to use. It also works off files, it's fast, and it handles many image formats. Plus, it does image processing on the fly and supports many clients: AutoCAD, Microstation, ArcMap, ArcIMS, ArcServer, etc.