What is the best way to create a document archive of images in a database? - mysql

What is the best way to create an archive of image documents in a database?
We have about 2-10 million records, and each record includes 2-4 images and about 20 text fields. What is the best way to build this archive so that we get good speed and strong data security?
Also, which database is a good fit for this project?

Definitely use the file system as Minor suggested.
One option is SQL Server FILESTREAM. See http://msdn.microsoft.com/en-us/library/cc949109.aspx.

Use file system storage for the archived images, and save a link (the file path) to each image file in the DB. If you serve the content over HTTP, you can put a caching proxy server such as Squid or Nginx in front of it.
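A minimal sketch of the "file on disk, path in the DB" idea, in Python with SQLite standing in for whichever database is chosen. The table name record_image, the .img extension and the hash-based directory layout are all assumptions made for illustration; nothing in this thread prescribes them.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def init_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical table: one row per stored image, pointing at a file on disk.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS record_image ("
        "id INTEGER PRIMARY KEY, record_id INTEGER NOT NULL, rel_path TEXT NOT NULL)"
    )


def store_image(conn: sqlite3.Connection, root: Path, record_id: int, data: bytes) -> str:
    """Write the image bytes under root and record the relative path in the DB."""
    digest = hashlib.sha256(data).hexdigest()
    # Shard by the leading hex characters so no single directory grows too large.
    rel_path = Path(digest[:2]) / digest[2:4] / f"{digest}.img"
    full_path = root / rel_path
    full_path.parent.mkdir(parents=True, exist_ok=True)
    full_path.write_bytes(data)
    conn.execute(
        "INSERT INTO record_image (record_id, rel_path) VALUES (?, ?)",
        (record_id, rel_path.as_posix()),
    )
    conn.commit()
    return rel_path.as_posix()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_schema(conn)
    with tempfile.TemporaryDirectory() as tmp:
        print(store_image(conn, Path(tmp), 1, b"fake image bytes"))
```

Storing a path relative to a configurable root, rather than an absolute path, keeps the rows valid if the archive is later moved to another disk or server.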

More questions for you:
How dynamic is the data? Do you store it once and never change it, or does it change frequently?
Do you need versioning for the documents, or does the latest version simply overwrite the previous one?
Are the documents always edited using one application, or can they be changed outside it (e.g. using Word)?
Are the documents related to other "non-document" data (database rows), or are they the only thing that you need to store?

File system won't offer any real security, so I would discount that straight off.
In Oracle there is built-in image support through the ORDImage type.
Check out Marcel's blog as he, and the Piction company, do a lot of work in this area and he has lots of useful material to download.

You can use controlled downloads (nginx X-Accel-Redirect). Look at http://kovyrin.net/2006/11/01/nginx-x-accel-redirect-php-rails/lang/en/
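As a rough illustration of the controlled-download idea: the application decides whether the user may see the file, and instead of streaming the bytes itself it answers with an X-Accel-Redirect header that tells nginx to serve the file from a location marked `internal`. The sketch below only builds the status code and headers; the /protected/ prefix and the in-memory ACL dictionary are hypothetical and would have to match your own nginx configuration and permission model.

```python
from typing import Dict, Set, Tuple

# Hypothetical prefix; it must match an nginx location declared `internal`.
INTERNAL_PREFIX = "/protected/"


def download_response(
    user: str, rel_path: str, acl: Dict[str, Set[str]]
) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a download request.

    acl maps a relative file path to the set of users allowed to read it.
    """
    if user not in acl.get(rel_path, set()):
        # Unknown file or unauthorized user: never reveal the internal path.
        return 403, {}
    # nginx intercepts this header and serves the file itself.
    return 200, {"X-Accel-Redirect": INTERNAL_PREFIX + rel_path}


if __name__ == "__main__":
    acl = {"2024/scan-001.jpg": {"alice"}}
    print(download_response("alice", "2024/scan-001.jpg", acl))
    print(download_response("bob", "2024/scan-001.jpg", acl))
```

Because the internal location cannot be requested directly by clients, the files stay on the file system while access control stays in the application.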

Related

Right way to manage application generated files

tl;dr
In my node.js application I create pdf documents. What is the best/right way to save them? Right now I use node.js fileserver and shell.js to do it.
I am working on a node.js web application to manage apartments and tenants for learning purposes, and at some point I create PDF documents that I want to save under a path like
/documents/building_name/apartment_name/tenant_name/year/example.pdf
Now if the user wants to change the building, apartment or tenant name via an HTTP PUT request, I change the database, but I also want to change the path.
Well, both work, but I can't write good tests for these functions.
Now a friend told me that it's bad practice to save documents on a file server and that I should use BLOBs instead.
On the other hand, Google doesn't really agree on using BLOBs.
So what is the right way to save documents?
Thanks
Amit
You should first define a source of truth. Unless you're legally obliged to keep copies of those files and they are not being accessed very often, I wouldn't even bother storing those at all and just generate them upon request.
If not, keep the DB clean; blobs will make it huge. Put the files into cold storage (again assuming they are not accessed too frequently) without those paths. If the paths rely on information that changes often, that can't be performant for either the file server or your system.
Instead, store a revision number in your DB that the file can be found under, and limit the path structure to information that rarely changes.
Like {building}/{apartment}/{tenant}_{revision}.pdf
That, depending on your backup structure, will allow you to time-travel if necessary and doesn't force a re-index all the time.
Note: I don't know too much about your use case.
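A small sketch of the revision-based path scheme described in this answer. Using numeric database IDs for building, apartment and tenant instead of their display names is an assumption added here so that renaming never touches the path; the DocumentRef type and function names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DocumentRef:
    # Assumed to be stable database IDs, not user-editable names.
    building_id: int
    apartment_id: int
    tenant_id: int
    revision: int


def document_path(ref: DocumentRef) -> str:
    """Build the storage path from rarely-changing keys plus the revision."""
    return f"{ref.building_id}/{ref.apartment_id}/{ref.tenant_id}_{ref.revision}.pdf"


def next_revision(ref: DocumentRef) -> DocumentRef:
    """A regenerated document gets a new revision; older files stay untouched."""
    return replace(ref, revision=ref.revision + 1)


if __name__ == "__main__":
    ref = DocumentRef(building_id=7, apartment_id=12, tenant_id=345, revision=1)
    print(document_path(ref))
    print(document_path(next_revision(ref)))
```

Functions like these are pure, which also makes them easy to unit test without touching the file system.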

From blob to filesystem. Which is the best location to store images?

I have read a lot of discussions about this topic and we all agree that, in most cases, the best solution is to save the images on the filesystem, and record the path in the database.
For this reason, we are redesigning our web application to get all the advantages of this type of solution.
My question, however, is:
What is the best location to save the images in, especially to avoid problems with version control and deployment? As with the database, I think it's better that these files are not under version control, right?
So I would like advice on the best location on the file system to save them:
Inside the working copy, telling version control to ignore them?
Outside the working copy?
In the folder where the database stores its data, so you have "all the data together"?
There are 3000 images of about 2 MB each.
The version control system is svn, but we will soon migrate to git.
The DB is MySQL with InnoDB tables and innodb_file_per_table.

Preemptively getting pages with HTML5 offline manifest or just their data

Background
I have a (glorified) CRUD application that I'd like to enable HTML5 offline support with. The cache-manifest system looks simple yet powerful, but I'm curious about how I can allow users to access data while offline.
For example, suppose I have these pages for the entity "Case" (i.e. this is CRM case-management software):
http://myapplication.com/Case
http://myapplication.com/Case/{id}
http://myapplication.com/Case/Create
The first URI contains a paged listing of all cases, using the querystring parameters pageIndex and pageSize, e.g. /Case?pageIndex=2&pageSize=20.
The second URI is the template for editing individual cases, e.g. /Case/1 or /Case/56.
Finally, /Case/Create is the form used to create cases.
The Problem
I would like all three to be available offline.
/Case
The simple way would be to add /Case to the cache-manifest, however that would break paging (as the links wouldn't work).
I think I could instead add something like /Case/AllData which is an XML resource, which is cached and if offline then a script on /Case would use this XML data to populate the list and provide for pagination.
If I go for the latter, how can I have this XML data stored in the in-browser SQL database instead of as a cached resource? I think using the SQL database would be more resilient.
/Case/{id}
This is more complicated. There is the simple solution of manually adding /Case/1, /Case/2, /Case/3, etc., up to /Case/1234 to the manifest, but there can be hundreds or even thousands of cases, so this isn't very practical.
I think the system should provide access to the 30 most recent cases, for example. As above, how can I store this data in the database?
Also, how would this work? If I don't explicitly add /Case/34 to the manifest and the user clicks on to /Case/34 how can I get the browser to load a page that my JavaScript will populate based on the browser's SQL database data and not display the offline message?
/Case/Create
This one is simpler, as it's just an empty page: on the <form>'s submit action my script would detect whether it's offline, and if so it would add the case to the browser's SQL database. Does this sound okay?
Thanks!
I think you need to look at LocalStorage (though it does have some downsides); there are also alternatives such as WebSQL and IndexedDB.
Also, I don't think you should use numeric IDs if you allow people to create cases offline, as you will get primary key conflicts; it is probably best to use something like a GUID.
Another thing you need is the ability to push those new cases onto the server. There could be multiple...
Can they be edited? If they can, I think you really need to think very hard about synchronization and conflict resolution.
Shameless self-promotion: I have a project that is designed to handle these very issues. It's not done, but it's close. You can see it (with an ugly but very functional demo) at https://github.com/forbesmyester/SyncIt

Storing an image in an image column of SQL Server. Is it better than storing the image in a folder on the website?

I have to display images on a website. I can store the images in a folder on my website, or I can store them in an image column of SQL Server.
So which way of storing images is better: in a folder or in an image column of SQL Server?
1. Which way of storing an image and retrieving it is faster?
With SQL Server 2008, while you can store BLOB data, it's best to avoid it. I've done it in the past, grudgingly, and it did have negative performance implications. Unless you have some constraint which prevents it, use the file system. That's what it's built for, and it's much faster.
As @Martin Smith pointed out, you could use FILESTREAM. We started storing our files using FILESTREAM so that we could also add full-text indexing and allow users to search not only the data but also the files on our site. It is also nice because we can easily move our files along with the database to other environments (Dev, Test).
nice file stream Article: Here
Also, please use varbinary(max) if you are going to store images in the DB. The image data type is deprecated and will be removed in a future version of SQL Server.
--S

Dealing with mass images, some small - some large, in spring/java application using mysql

I was wondering what the best pattern was to handle the management of images these days when using spring/java and mysql.
I have several options. Some of the images are just small avatars for the users. Is it fine to put these directly into mysql? Or use the file system?
For the larger images, is file system pretty much the only option, and then use mysql to store the location on the file system?
Where is a good spot to put them on a linux server? /var/files/images?
Since the files are hidden from the war deployment directory, what is the best way to stream them? Use some kind of a file output stream as the response body for an http request?
Also, do I have to develop all of the file management stuff myself, like cleaning up unused files and the like?
What about image security? Some images should not be accessed by everyone. I think I'd need to use a separate url with Spring security checking the current user for this.
I'd appreciate advice on all of these questions. Thanks.
You could use MySQL, and that would have the advantage of centralization and easy cleanup, but IMHO it's a waste of the database's resources if you plan to scale.
For data like images where everything is public, consider something like Amazon S3 which allows you to serve images directly from S3's web servers. If you plan to host everything yourself, just serve from a directory. Just remember to turn directory listings off :)
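On the cleanup question raised above: if images live in a directory and the database only stores their paths, one possible approach is a periodic job that lists files no row refers to. The sketch below is in Python rather than Java purely for brevity, and it assumes the database stores paths relative to the image root; the function name find_orphans is hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def find_orphans(root: Path, referenced: Iterable[str]) -> List[Path]:
    """Return files under root whose relative path is not referenced in the DB."""
    keep = set(referenced)
    orphans = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.relative_to(root).as_posix() not in keep:
            orphans.append(path)
    return orphans


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "avatars").mkdir()
        (root / "avatars" / "kept.jpg").write_bytes(b"1")
        (root / "avatars" / "stale.jpg").write_bytes(b"2")
        # In a real job the referenced paths would come from a DB query.
        for orphan in find_orphans(root, ["avatars/kept.jpg"]):
            print("unreferenced:", orphan.name)
```

Listing candidates first and deleting them in a separate, reviewed step is the safer design, since a file written just before its row is committed would otherwise look like an orphan.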