Couchbase document size affecting view visibility?

I'm trying to save some larger files (0.1 - 10 MB images, dimensions from 128x128 up to 4096x4096) in Couchbase.
My problem:
I am not able to get the keys of the larger files in a view. They are simply missing, even in the Couchbase Console and even with stale=false.
The view code is simply (show me all the keys):
function (doc, meta) {
  emit(meta.id, null);
}
I was able to narrow down the following:
The key name is irrelevant.
"Data Buckets" -> "Documents" shows all the entries in the Couchbase Console.
All files are accessible directly by their key.
The file type does not seem to matter (jpg, png, bmp).
The file size matters: somewhere between 2300 KB and 2800 KB the indexing behaviour changes. Smaller documents are included, bigger ones are no longer included in the view, although they remain visible in the Documents tab of the Couchbase Console. (See the diagnostic view sketch below.)
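For reference, this is the diagnostic view sketch I mean. It is only a sketch, and it assumes the meta object passed to 2.x map functions exposes a type field ("json" or "base64") alongside the id, so you can see how the indexer classifies the documents it actually processes:
// Diagnostic map function (a sketch): emit meta.type together with the key,
// so indexed documents can be compared against the missing ones.
function (doc, meta) {
  emit(meta.id, meta.type);
}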
I was unable to find anything about this in the documentation or elsewhere; only a 20 MB soft maximum file size is mentioned. Am I missing something? Deleting several entries without being able to use views gets rather complicated.
I am running version 2.5.1 Enterprise Edition (build-1083).
Thanks

Related

Does a higher number of folders affect website load speed?

I have a question about page load speed. Consider a case where I have different kinds of images on my web page, such as icons, logos, images inside content, etc.
I want to know whether having separate folders for each media category may affect the page load speed:
/logos
/icons
/images
Will the webpage load faster if the images of all categories are located in a single folder rather than in multiple ones?
Thanks in advance for your advice.
Even though performance-related questions often get closed because they can't be answered without benchmarks on the machine in question, this one is worth an answer: unless you run a potato-based computer, you won't see any performance impact.
Directories are not actually physical folders like you would have in real life.
They are simply tables of pointers to the disk locations where your files are stored. (Of course this is massively over-simplified, as it involves file systems and more low-level details, but that's not needed at this point.)
To come back to your question, the difference between loading two files from two directories:
/var/foobar/dir1/image1.jpeg
/var/foobar/dir2/image2.jpeg
or one directory:
/var/foobar/dir1/image1.jpeg
/var/foobar/dir1/image2.jpeg
...is that your file system has to look up two different directory tables. With modern file systems and moderate (even low-end) hardware, this causes no issues.
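If you want to convince yourself, a quick sketch along these lines (Node.js, reusing the placeholder paths above; this is a rough probe, not a proper benchmark) will show the difference is lost in the noise:
// Times synchronous reads of a list of files and returns milliseconds.
const fs = require('fs');

function timeReads(paths) {
  const start = process.hrtime.bigint();
  for (const p of paths) fs.readFileSync(p);   // forces the directory lookups
  return Number(process.hrtime.bigint() - start) / 1e6;
}

console.log('two dirs:', timeReads(['/var/foobar/dir1/image1.jpeg',
                                    '/var/foobar/dir2/image2.jpeg']), 'ms');
console.log('one dir :', timeReads(['/var/foobar/dir1/image1.jpeg',
                                    '/var/foobar/dir1/image2.jpeg']), 'ms');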
As #AjitZero mentioned, the performance impact here will come from the size of the files, the number of distinct HTTP requests (i.e. how many images, CSS files, scripts, etc.) and the way you cache data on the user's machine.
No, the number of folders doesn't affect page-load speed.
A lower number of HTTP requests does matter, however, so you can use sprite sheets.

Chrome localStorage Limits

From what I have understood, and been able to test thus far, Chrome (v19 on Windows) "limits" localStorage to 2.49 MB, a figure I have verified for myself. However, the storage scenario I have to deal with is rather more complicated. I have an IDE-like interface for which I fetch context-sensitive help from the server when the user hovers over something relevant. Once this is done I store that help text (HTML, typically between 120 and 1024 chars) in localStorage. No problems thus far. However, the IDE is very large and complex, and in due course localStorage will contain hundreds or even thousands of keys. What is not clear to me is this: will the results of the rather rudimentary localStorage limit tests (my own and ones I ran into on the web) still be valid? Those tests store one long string under a single key, which is significantly different from what I have described above. I assume that, at the very least, there is an overhead associated with the space consumed by key storage itself.
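To make the question concrete, here is the kind of probe I have in mind (a rough sketch, not a definitive benchmark; the 512-character chunk size and the 'help-' key prefix are just illustrative, and it clears localStorage first, so run it on a throwaway origin):
// Fills localStorage with many small entries and reports how much fitted.
function probeLocalStorage(chunkChars, maxEntries) {
  localStorage.clear();
  var chunk = new Array(chunkChars + 1).join('x');   // string of chunkChars characters
  var stored = 0;
  try {
    for (var i = 0; i < maxEntries; i++) {
      localStorage.setItem('help-' + i, chunk);
      stored++;
    }
  } catch (e) {
    // setItem throws once the quota is exceeded
  }
  return { entries: stored, approxChars: stored * chunkChars };
}

// Many small keys (like my help snippets) vs. the usual one-big-string test:
console.log(probeLocalStorage(512, 1000000));
console.log(probeLocalStorage(512 * 1024, 1000));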

Why do people always encourage a single js file for a website?

I have read some website development material on the web, and every time someone asks how to organise a website's js, css, html and php files, people suggest a single js file for the whole website. The argument is speed.
I clearly understand that the fewer requests there are, the faster the page responds. But I have never understood the single-js argument. Suppose you have 10 webpages and each webpage needs a js function to manipulate its DOM objects. If you put all 10 functions into a single js file and execute that file on every webpage, 9 out of 10 functions are doing useless work, and CPU time is wasted searching for non-existent DOM objects.
I know that CPU time on an individual client machine is trivial compared to bandwidth on a single server machine. I am not saying that you should have many js files on a single webpage. But I don't see anything wrong with every webpage referring to 1 to 3 js files, as long as those js files are cached on the client machine. There are many good ways to do caching; for example, you can use an expiry date or include a version number in the js file name. Compared to cramming the functionality needed by many webpages into one big js file, I far prefer splitting the js code into smaller files.
Any criticism of or agreement with my argument? Am I wrong? Thank you for your suggestions.
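To be concrete about the version-number caching I mentioned, here is a minimal sketch (the file name and version number are hypothetical):
// Load a page-specific script whose name carries a version; bumping the
// version on each release busts the cache, while unchanged files stay cached.
var script = document.createElement('script');
script.src = '/js/editor.v12.js';
document.head.appendChild(script);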
A function does zero work unless it is called, so 9 uncalled functions do zero work; they just take up a little extra space.
A client only has to make one request to download one big JS file, which is then cached for every other page load. That is less work than making several small requests on every single page.
I'll give you the answer I always give: it depends.
Combining everything into one file has many great benefits, including:
less network traffic - even retrieving one file means sending and receiving multiple packets, and every new connection starts with the SYN, SYN-ACK, ACK handshake over TCP. A large part of the transfer time goes into establishing the session, and there is a lot of overhead in the packet headers, so fewer requests means less of both.
one location/manageability - although you may only have a few files, it's easy for functions (and class objects) to grow between versions. With the multiple-file approach, functions in one file sometimes call functions/objects in another file (e.g. ajax in one file and arithmetic functions in another; the arithmetic functions might grow to need the ajax code and expect a certain type returned). What ends up happening is that the set of files has to be versioned as one unit rather than each file carrying its own version. Things get hairy down the road if you don't have good management in place, and it's easy to fall out of line with JavaScript files, which are always changing. Having one file makes it easy to manage the version across each of your pages and your (1 to many) websites.
Other topics to consider:
dormant code - you might think that the uncalled functions reduce performance by taking up space in memory, and you'd be right; however, the impact is so minuscule that it doesn't matter. Functions are indexed in memory, and while the index table may grow, it's trivial for small projects, especially given today's hardware.
memory leaks - this is probably the largest reason why you wouldn't want to combine all the code; however, it's a small issue given the amount of memory in systems today and the better garbage collection browsers now have. It's also something that you, as a programmer, can control: quality code leads to fewer problems like this.
Why does it depend?
While it's easy to say "throw all your code into one file", that would be wrong. It depends on how large your code is, how many functions there are, who maintains it, etc. Surely you wouldn't pack your locally written functions into the jQuery package, and you may have different programmers maintaining different blocks of code; it depends on your setup.
It also depends on size. Some programmers embed encoded images as ASCII (base64) in their files to reduce the number of requests, and this can bloat files. Surely you don't want to package everything into one 50 MB file, especially if there are core functions that are needed for the page to load.
So to bring my response to a close: we'd need more information about your setup, because it depends. Surely 3 files is acceptable regardless of size, combining where you see fit; it probably won't really hurt network traffic, but 50 files is unreasonable. I use the hand rule (no more than 5), and you'll certainly see a benefit from combining five 1 KB files into one 5 KB file.
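If it helps, the combining itself can be a tiny build step. Here is a minimal sketch (Node.js; the file names are hypothetical, and a real project would add minification on top):
// build.js - concatenates a fixed list of sources into one file.
const fs = require('fs');

const sources = ['js/common.js', 'js/editor.js', 'js/reports.js'];

const combined = sources
  .map(file => '/* ' + file + ' */\n' + fs.readFileSync(file, 'utf8'))
  .join('\n;\n');   // the lone ';' guards against files missing a trailing semicolon

fs.writeFileSync('dist/site.js', combined);
console.log('wrote dist/site.js, ' + combined.length + ' characters');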
Two reasons that I can think of:
Less network latency. Each .js requires another request/response to the server it's downloaded from.
Fewer bytes on the wire and less memory. If it's a single file you can strip out unnecessary characters and minify the whole thing as one unit.
The Javascript should be designed so that the extra functions don't execute at all unless they're needed.
For example, you can define a set of functions in your script but only call them in (very short) inline <script> blocks in the pages themselves.
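A sketch of that pattern (names hypothetical): the shared file only defines the initialisers, and each page triggers exactly one of them with a one-line inline call:
// shared.js - defines per-page initialisers but calls none of them.
var App = {
  pages: {
    home:   function () { /* home-only DOM wiring */ },
    editor: function () { /* editor-only DOM wiring */ }
  },
  init: function (name) {
    var fn = this.pages[name];
    if (fn) { fn(); }          // pages without an initialiser cost nothing
  }
};

// Each page then contains one very short inline call, e.g.:
// <script>App.init('editor');</script>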
My line of thought is that you have fewer requests. When you make a request in the header of the page, it stalls the output of the rest of the page: the user agent cannot render the rest of the page until the javascript files have been fetched. Also, javascript files download synchronously; they queue up rather than being pulled all at once (at least that is the theory).

Tools for viewing logs of unlimited size

It's no secret that application logs can go well beyond the limits of naive log viewers, and the desired viewer functionality (say, filtering the log based on a condition, or highlighting particular message types, or splitting it into sublogs based on a field value, or merging several logs based on a time axis, or bookmarking etc.) is beyond the abilities of large-file text viewers.
I wonder:
Whether decent specialized applications exist (I haven't found any)
What functionality might one expect from such an application? (I'm asking because my student is writing such an application, and the functionality above has already been implemented to a reasonably usable extent.)
I've been using Log Expert lately.
(Screenshot: http://www.log-expert.de/images/stories/logexpertshowcard.gif)
It can take a while to load large files, but it will in fact load them. I couldn't find a file-size limit (if there is one) on the site, but just to test it I loaded a 300 MB log file, so it can at least go that high.
Windows Commander has a built-in program called Lister which works very quickly for any file size. I've used it with GBs worth of log files without a problem.
http://www.ghisler.com/lister/
A slightly more powerful tool I sometimes use is Universal Viewer from http://www.uvviewsoft.com/.

Branching failures in SourceGear's Vault?

I'm using SourceGear's Vault version control software (v4.1.2) and am experiencing DBReadFailures when attempting to branch a folder. I don't really know whether I'd call the folder "large" or not (the tree size is 680 MB and the disk space used is 1.3 GB), but during the branch operation the SQL Server it queries times out (after approx. 5 minutes) and the transaction fails. During the branch operation the database server pegs 1 of its 4 CPUs at 100%, which tells me the operation isn't really hardware constrained so much as constrained by its algorithm. The DB server is also not memory bound (it has 4 GB and only uses 1.5 GB during this process). So I'm left thinking that there is simply a finite limit to the size of the folders you can branch in the Vault product. Does anyone have similar experiences with this product that might help me resolve this?
When I attempt to branch smaller folders (i.e. just the sub-folders within the main folder I'm trying to branch), it works, which looks like another indicator that I'm simply hitting a size limitation. Is there a way to increase the 5-minute timeout?
In the Vault config file, there's a SqlCommandTimeout item - have you tried modifying that? I'm not sure what the default is, but ours is set as follows:
<SqlCommandTimeout>360</SqlCommandTimeout>
There's a posting on the SourceGear support site here that seems to describe your exact problem.
The first reply in that posting mentions where to find the config file, if you're not familiar with it.