Branching failures in SourceGear's Vault? - sourcegear-vault

I'm using SourceGear's Vault version control software (v4.1.2) and am experiencing DBReadFailures when attempting to branch a folder. I don't really know if I'd call the folder "large" or not (treesize is 680MB and the disk space used is 1.3GB)... but during the branch operation, the sql server it's querying times out (approx 5m) and the transaction fails. During the branch operation, the database server pegs 1 of it's 4 CPUs at 100%, which tells me the operation isn't really hardware constrained so much as it is constrained by it's algorithm). The db server is also not memory bound (has 4GB and only uses 1.5GB during this process). So I'm left thinking that there is just a finite limit to the size of the folders you can branch in the Vault product. Anyone have any similar experiences with this product that might help me resolve this?
When attempting to branch smaller folders (i.e. just the sub folders within the main folder I'm trying to branch) it apparently works. Looks like another indicator that it's just size limitations I'm hitting. Is there a way to increase the 5m timeout?

In the Vault config file, there's a SqlCommandTimeout item - have you tried modifying that? I'm not sure what the default is, but ours is set as follows:
<SqlCommandTimeout>360</SqlCommandTimeout>
There's a posting on the SourceGear support site here that seems to describe your exact problem.
The first reply in that posting mentions where to find the config file, if you're not familiar with it.

Related

What would be an ideal way to share writable volume across containers for a web server?

The application in question is Wordpress, I need to create replicas for rolling deployment / scaling purposes.
It seem can't create more then 1 instance of the same container, if it uses a persistent volume (GCP term):
The Deployment "wordpress" is invalid: spec.template.spec.volumes[0].gcePersistentDisk.readOnly: Invalid value: false: must be true for replicated pods > 1; GCE PD can only be mounted on multiple machines if it is read-only
What are my options? There will be occasional writes and many reads. Ideally writable by all containers. I'm hesitant to use the network file systems as I'm not sure whether they'll provide sufficient performance for a web application (where page load is rather critical).
One idea I have is, create a master container (write and read permission) and slaves (read only permission), this could work - I'll just need to figure out the Kubernetes configuration required.
In https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistent-volumes you can see a table with the possible storage classes that allow ReadWriteMany (the option you are looking for).
AzureFile (not suitable if you are using GCP)
CephFS
Glusterfs
Quobyte
NFS
PortworxVolume
The one that I've tried is that of NFS. I had no issues with it, but I guess you should also consider potential performance issues. However, if the writes are to be occassional, it shouldn't be much of an issue.
I think what you are trying to solve is having a central location for wordperss media files, in that case this would be a better solution: https://wordpress.org/plugins/gcs/
Making your kubernetes workload truly stateless and you can scale horizontally.
You can use Regional Persistent Disk. It can be mounted to many nodes (hence pods) in RW more. These nodes can be spread across two zones within one region. Regional PDs can be backed by standard or SSD disks. Just note that as of now (september 2018) they are still in beta and may be subject to backward incompatible changes.
Check the complete spec here:
https://cloud.google.com/compute/docs/disks/#repds

CaseStudy in Bugzilla

I wish to conduct a case study in bugzilla, where I would like to ideally find out some statistics such as
The number of Memory Leaks
The percentage of bugs which are performance bugs
The percentage of semantic bugs
How can I search through the bugzilla database for softwares such as apache http server or mysql database server to generate such statistics. I would like an idea of how to get started?
I finally figured it out and am going to show my approach here. It's more or less a manual process. Hopefully this helps others as well who might be doing case-studies:
Selecting the bugs:
I went to bugs.mysql.com and searched for all bugs which were marked as resolved and fixed. Unfortunately, for mysql you cannot select specific components. I filtered out a random time range (2013-2014). And saved all of these in excel file(csv)
Classifying and filtering the bugs:
I manually went through the bugs, I skipped the ones which I could see clearly belonged to the documentation component, installation, compilation failure, or required restarts.
Then I read the report, and checked if the report actually suggested a bug fix, and if the bugfix was semantic (i.e. change limits, add a condition check, make sure some if condition is correctly marked for an edge case etc. - most were along similar lines). I followed a similar process for performance, resource-leak (both cpu resources, and memory leak considered in this category), and concurrency bugs.

Impact of clustered file systems on programming if any?

I have a huge number of files that my cluster nodes must reach (think a bout a bunch of configuration parameters i have to apply to my computations).
We are considering deploying a SAN with a clustered file system (GFS or OCFS), the thing is I dont't know how my program would be impacted (would I have to deal with file locking ?)
This is maybe completely dumb, but it's a bit like a paradigm shift for me and I need to be sure.
thanks

How to get your code ready for Loadbalancing

As we did this in the past, i'd like to gather useful information for everyone moving to loadbalancing, as there are issues which your code must be aware of.
We moved from one apache server to squid as reverse proxy/loadbalancer with three apache servers behind.
We are using PHP/MySQL, so issues may differ.
Things we had to solve:
Sessions
We moved from "default" php sessions (files) to distributed memcached-sessions. Simple solution, has to be done. This way, you also don't need "sticky sessions" on your loadbalancer.
Caching
To our non-distributed apc-cache per webserver, we added anoter memcached-layer for distributed object caching, and replaced all old/outdated filecaching systems with it.
Uploads
Uploads go to a shared (nfs) folder.
Things we optimized for speed:
Static Files
Our main NFS runs a lighttpd, serving (also user-uploaded) images. Squid is aware of that and never queries our apache-nodes for images, which gave a nice performance boost. Squid is also configured to cache those files in ram.
What did you do to get your code/project ready for loadbalancing, any other concerns for people thinking about this move, and which platform/language are you using?
When doing this:
For http nodes, I push hard for a single system image (ocfs2 is good for this) and use either pound or crossroads as a load balancer, depending on the scenario. Nodes should have a small local disk for swap and to avoid most (but not all) headaches of CDSLs.
Then I bring Xen into the mix. If you place a small, temporal amount of information on Xenbus (i.e. how much virtual memory Linux has actually promised to processes per VM aka Committed_AS) you can quickly detect a brain dead load balancer and adjust it. Oracle caught on to this too .. and is now working to improve the balloon driver in Linux.
After that I look at the cost of splitting the database usage for any given app across sqlite3 and whatever db the app wants natively, while realizing that I need to split the db so posix_fadvise() can do its job and not pollute kernel buffers needlessly. Since most DBMS services want to do their own buffering, you must also let them do their own clustering. This really dictates the type of DB cluster that I use and what I do to the balloon driver.
Memcache servers then boot from a skinny initrd, again while the privileged domain watches their memory and CPU use so it knows when to boot more.
The choice of heartbeat / takeover really depends on the given network and the expected usage of the cluster. Its hard to generalize that one.
The end result is typically 5 or 6 physical nodes with quite a bit of memory booting a virtual machine monitor + guests while attached to mirrored storage.
Storage is also hard to describe in general terms.. sometimes I use cluster LVM, sometimes not. The not will change when LVM2 finally moves away from its current string based API.
Finally, all of this coordination results in something like Augeas updating configurations on the fly, based on events communicated via Xenbus. That includes ocfs2 itself, or any other service where configurations just can't reside on a single system image.
This is really an application specific question .. can you give an example? I love memcache, but not everyone can benefit from using it, for instance. Are we reviewing your configuration or talking about best practices in general?
Edit:
Sorry for being so Linux centric ... its typically what I use when designing a cluster.

Distributed filesystem sanity check [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 7 years ago.
Improve this question
I'm in need of a distributed file system that must scale to very large sizes (about 100TB realistic max). Filesizes are mostly in the 10-1500KB range, though some files may peak at about 250MB.
I very much like the thought of systems like GFS with built-in redundancy for backup which would - statistically - render file loss a thing of the past.
I have a couple of requirements:
Open source
No SPOFs
Automatic file replication (that is, no need for RAID)
Managed client access
Flat namespace of files - preferably
Built in versioning / delayed deletes
Proven deployments
I've looked seriously at MogileFS as it does fulfill most of the requirements. It does not have any managed clients, but it should be rather straight forward to do a port of the Java client. However, there is no versioning built in. Without versioning, I will have to do normal backups besides the file replication built into MogileFS.
Basically I need protection from a programming error that suddenly purges a lot of files it shouldn't have. While MogileFS does protect me from disk & machine errors by replicating my files over X number of devices, it doesn't save me if I do an unwarranted delete.
I would like to be able to specify that a delete operation doesn't actually take effect until after Y days. The delete will logically have taken place, but I can restore the file state for Y days until it's actually deleten. Also MogileFS does not have the ability to check for disk corruption during writes - though again, this could be added.
Since we're a Microsoft shop (Windows, .NET, MSSQL) I'd optimally like the core parts to be running on Windows for easy maintainability, while the storage nodes run *nix (or a combination) due to licensing.
Before I even consider rolling my own, do you have any suggestions for me to look at? I've also checked out HadoopFS, OpenAFS, Lustre & GFS - but neither seem to match my requirements.
Do you absolutely need to host this on your own servers? Much of what you need could be provided by Amazon S3. The delayed delete feature could be implemented by recording deletes to a SimpleDB table and running a garbage collection pass periodically to expunge files when necessary.
There is still a single point of failure if you rely on a single internet connection. And of course you could consider Amazon themselves to be a point of failure but the failure rate is always going to be far lower because of scale.
And hopefully you realize the other benefits, the ability to scale to any capacity. No need for IT staff to replace failed disks or systems. Usage costs will continually drop as disk capacity and bandwidth gets cheaper (while disks you purchase depreciate in value).
It's also possible to take a hybrid approach and use S3 as a secure backend archive and cache "hot" data locally, and find a caching strategy that best fits your usage model. This can greatly reduce bandwidth usage and improve I/O, epecially if data changes infrequently.
Downsides:
Files on S3 are immutable, they can
only be replaced entirely or
deleted. This is great for caching,
not so great for efficiency when
making small changes to large files.
Latency and bandwidth are those of
your network connection. Caching can
help improve this but you'll never
get the same level of performance.
Versioning would also be a custom solution, but could be implemented using SimpleDB along with S3 to track sets of revisions to a file. Overally, it really depends on your use case if this would be a good fit.
You could try running a source control system on top of your reliable file system. The problem then becomes how to expunge old check ins after your timeout. You can setup an Apache server with DAV_SVN and it will commit each change made through the DAV interface. I'm not sure how well this will scale with large file sizes that you describe.
#tweakt
I've considered S3 extensively as well, but I don't think it'll be satisfactory for us in the long run. We have a lot of files that must be stored securely - not through file ACL's, but through our application layer. While this can also be done through S3, we do have one bit less control over our file storage. Furthermore there will also be a major downside in forms of latency when we do file operations - both initial saves (which can be done asynchronously though), but also when we later read the files and have to perform operations on them.
As for the SPOF, that's not really an issue. We do have redundant connections to our datacenter and while I do not want any SPOFs, the little downtime S3 has had is acceptable.
Unlimited scalability and no need for maintenance is definitely an advantage.
Regarding a hybrid approach. If we are to host directly from S3 - which would be the case unless we want to store everything locally anyways (and just use S3 as backup), the bandwidth prices are simply too steep when we add S3 + CloudFront (CloudFront would be necessary as we have clients from all around). Currently we host everything from our datacenter in Europe, and we have our own reverse squids setup in the US for a low-budget CDN functionality.
While it's very domain dependent, ummutability is not an issue for us. We may replace files (that is, key X gets new content), but we will never make minor modifications to a file. All our files are blobs.