Strategies for learning and writing code when I'm not allowed to be "polluted" with open source code? [closed] - open-source

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I work at a company where the rule basically is (as I understand it) that you cannot use any code unless (a) you write the code yourself or (b) there is some explicit indemnification clause guarding your use of any other code (like open source code). I am finding that this makes my coding difficult.
For example, coding samples in books are pretty much use "as-is". Microsoft SDK Code Samples are use "as-is". Blog posts about coding are use "as-is". There are several sites out there with code samples (including SO) that are use at your own risk. No warranties implied or indemnification against intellectual property lawsuits, blah, blah, etc.
Basically, I'm confined to using Asp.Net and the .Net Framework and nothing else and to bar my eyes from accidentally picking something up that I haven't created (ok...that may be my anal interpretation of the rule ;-).
I find this difficult because a big part of learning to code I think is reading other code. Reading blogs that have code, reading books that have code, looking at coding samples, using code from SDK samples etc. Also, I would think it is safe to use code that people have shown to be a good solution or pattern for something and freely put up for others to use. I'm not about to think that I can code everything myself. I definitely have to stand on the coding shoulders of others to reach certain heights.
It could be that I don't understand licensing very well either. From the company's perspective (I suppose) they don't want to incur any risk of being sued for IP infringement.
My thought is that you have to weigh risks. Taking a coding snippet from a book is low risk. Incorporating code from an open source library could be high-risk. I say make decisions based on how much risk you are willing to take.
Has anybody had experience working in a situation like this or similar to this? Is this a rare thing or is it common in some sectors? Are there others in the same position like me out there?
Any insight or guidance would be appreciated! Thanks!
Edit:
Thanks for the responses! To clear up some things: I'm not advocating stealing code. I'm talking about code that has some kind of public license that allows it to be used in its defined legal way. The key is that public licenses offer no indemnification for using the code. That means you use it at your own legal risk (and other risk). If someone sues an open source project that you used code from, you could be roped into the lawsuit as well because you are using the code, even though it had a public license.
In 2005, Microsoft was using indemnification to compete against open source vendors by promising its partners that Microsoft would protect them against IP lawsuits. http://www.microsoft.com/presspass/press/2005/jun05/06-22PartnerIndemnificationPR.mspx
So, even if the risk of being sued for IP infringement may be extremely low, it is a non-zero probability. Thus, I can't use any of it. Even if it has a public license of some sort. :-(

The "risk of being sued for IP infringement" isn't really the right way to think about it. This isn't a "risk" thing.
Either
You have a license and can use the source. There's no risk. You have the license. There can't be a lawsuit.
Or
You don't have a license and you're in violation. Effectively, you will be sued. There's no risk here, either. You're in violation of someone's copyrights (or worse).
Companies are averse to Open Source for a variety of strange reasons. Risk of lawsuit is not one of them.
Things I've heard.
What if it has a virus?
What if it doesn't work as advertised?
What if it "crashes" something? Who do we sue?
None of these are "risk" items. They're "due diligence" items. And mostly, they're easy to address: pick products with enough users that someone else vets the code before you; QA open source as if one of your own people typed it in. Except for one.
This leads us to the real reason. [Hint: It's not "risk of lawsuit".]
There's no one to sue when you didn't perform due diligence on open source.
Most shops don't have real solid configuration management or QA policies (the kind that would stand up in court as best practices). Until they have these things in place, they don't dare think about introducing open source for which you really need solid QA and configuration management.

I think what your company is really worried about is you directly copying large segments of code for which there may be licensing issues, presenting a legal problem to the company if they are caught using it. However, you may read blogs or other non-licensed code and discover a solution which works for the particular problem you are working on. In that case, you would be better off rewriting the code (that is, look at the solution and reproduce it) as opposed to just copying the code and making modifications to it. At my company, that is what they generally recommend for using non-proprietary code.
As well, for small amounts of code (e.g. a standard implementation of a cache) where everyone implements this the same way, every time, your company is unlikely to be afraid of using outside code, as long as you are sure to test it carefully.

By "indemnification", I assume they mean assurance that the code is free of copyright or patent or maybe trade secret encumbrance that they don't know about up front, or that somebody's willing to compensate them if something like that turns up. I've never been in a company that worried about this, nor have I heard of one before.
It's not clear what you actually want here, other than sympathy (and I do have sympathy for people trapped in corporate foolishness). It sounds like the policy is quite rigid, if you're worried about sample code in books. This is a bad policy, and will hinder you, but I don't know what you can do about it. Unlike Joel's blog post on getting things done as a grunt, it sounds like you can't just start doing things intelligently without being in clear violation of corporate policy.
Not knowing your situation, my suggestion would be to look for another job. This one will definitely stifle your professional growth, and a company with that policy is unlikely to be reasonable about it.
(It would be nice if you could assure them there was no danger, but that's not true. People have lied about copyrights, although open source projects tend not to, and only a fool would claim definitely that a large chunk of code did not infringe on any patents in the US; even if it was written a year before software patents were first awarded, that would be merely good grounds for a court fight, rather than avoiding a court fight. GPLed software is actually better than BSD software, since it requires some patent licensing downstream, but it can't deal with third-party patents. Of course, if they're that worried about being sued, writing in-house software is no solution. That can infringe on patents.)

You could rename the variables, and how would they find out? Do they check every line of code? Universities tell you all the time not to copy code without referencing it. Why don't you try coding something and using parts of code you find on the Internet?
Generally you will use more from communities like stack overflow or blogs than from open source projects.
Finally, since the code has no warranties, it's at your own risk. Well, that is the same case if you came up with the code by yourself: it's at your own risk.
Hope that helps... and good luck.

It could be that I don't understand licensing very well either. From the company's perspective (I suppose) they don't want to incur any risk of being sued for IP infringement.
My thought is that you have to weigh risks. Taking a coding snippet from a book is low risk. Incorporating code from an open source library could be high-risk. I say make decisions based on how much risk you are willing to take.
I'm not sure if I understood correctly. If you are saying that license infringement is fine when you don't get caught, I will have to disagree with you.
You can learn by reading code without breaking laws or getting fired. Just don't copy the code to your company's code base if the license doesn't allow it.

If you're not aware of the "clean room" concept, then there's always that approach. Have a friend look at some open source code and get them to tell you how they think it works. Diagram it out, and then code it yourself.
If it worked for IBM, right?

Keep in mind that not all Open Source is GPL. Your company can copy as much BSD-licensed code as they like. BSD-licensed code has made it into OS X (that's probably my biggest understatement of today) and to a lesser extent Windows NT.

Related

About to release code into the wild [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 7 years ago.
I have a program I wrote and I have been encouraged by folks to release it into public.
What would be the best way to go about it? Just dump it on a public site and hope for the best?
How much criticism will come (on the standards, decisions made etc...) and how best to prepare for that. I have been the sole developer for this app for about two years.
And how much difference does the license (GPL, MIT etc...) practically make?
Any experiences?
A license is a good idea, even if you don't care what people do with the code - most of the time people will happily take code "as is" and if it doesn't do what they want they will just throw it away - but you never know when some idiot might try to sue you because they burned their mouth drinking a hot coffee while reading your code. You may also wish to restrict usage (derivative works etc) where someone else makes profit out of your hard work. From the other side of the fence, people who might take and use your product/code like to know where they stand with regard to use/copying/distribution. By asking that your name stays on the code, you can also ensure that you get credit for the work, and that any improvements/suggestions that happen in the wild can make their way back to you.
If you just want to give away the code without much ongoing development, then a great place is CodeProject - you can release the application and write a small article describing it, and then it's up to you to decide if/when you will post updates.
If you want other people to collaborate then there are plenty of open-source websites that will support this approach.
As for criticism, you are likely to get a few mails from people who need tech support, or who want to suggest extra features. Most people are very polite though. If you wrote the program for yourself, there is a good chance that when it gets into the wild you will discover all the bits that have to be used in a particular way to work well, and all the additional options that you don't care about but which the product needs to make it applicable to a wider audience - you can get sucked into a lot of support work if you're not careful. Ultimately don't be afraid to say "no" to someone if they ask for something you don't want to support - it's your program and your time after all.
The main thing is to have fun :-)
Using a well-known, well-tested open-source license will make it easier for your users to know where they stand with regard to your code. The worst thing you can do is release your code without a license. No license means no use, since in most jurisdictions software is automatically copyrighted with no right of use or reuse.
If you don't want the project to wither away from lack of interest, you'll need to get it in front of developers. Releasing it at a large open source project site (such as SourceForge, GitHub, or Google Code) will help you get that visibility, and will provide a lot of infrastructure for managing your project. The more you do, the better the chances that others will find it, try it, and use it.
CodeProject is a good suggestion, but it really depends on the platform. Typically users of each major development platform flock to other sites for their Open Source extensions or apps. For example, lots of developers on the Microsoft stack look for things in the Visual Studio Gallery or on CodePlex. SourceForge obviously has its own religious following as well. I would suggest promoting your new app on a site where you would go to find something like it. The Google page rank of whatever public site you use to host it will also impact how many people find it and ultimately how much criticism (constructive or otherwise) you get on the project. Licensing is always a good plan. It has been my experience that each major open source collaboration site tends to lean towards a specific licensing mechanism, so I would just do what seems to be the most popular if you don't have any specific requirements.

Are licenses relevant for small code snippets? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 7 years ago.
When I'm about to write a short algorithm, I first check in the base class library I'm using whether the algorithm is implemented in it. If not, I often do a quick google search to see if someone has done it before (which is the case, 19 times out of 20).
Most of the time, I find the exact code I need. Sometimes it's clear what license applies to the source code, sometimes not. It may be GPL, LGPL, BSD or whatever. Sometimes people have posted a code snippet on some random forum which solves my problem.
It's clear to me that I can't reuse the code (copy/paste it into my code) without caring about the license if the code is in some way substantial. What is not clear to me is whether I can copy a code snippet containing 5 lines or so without committing a license violation.
Can I copy/paste a 5-line code snippet without caring about the license? What about a one-liner? What about 10 lines? Where do I draw the line (no pun intended)?
My second problem is that if I have found a 10-line code snippet which does exactly what I need, but feel that I cannot copy it because it's GPL-licensed and my software isn't, I have already memorized how to implement it so when I go around implementing the same functionality, my code is almost identical to the GPL licensed code I saw a few minutes ago. (In other words, the code was copied to my brain and my brain after that copied it into my source code).
Edit: I'm located in Sweden. It makes me even more confused that this is country-dependent. What if I re-use a piece of code (in a manner which is legal where I live) and I sell this source code to a company in a country where the re-use of code would be illegal.
I am not a lawyer - but I've recently been involved in looking at issues like this. Copying and pasting code from blogs can certainly be considered copyright infringement unless the blog states the license that the code is under and how it can be reused.
I'd recommend using sample code like this only to give you the general process/idea for a solution - then reimplement the idea from your own head and in your own style.
As also suggested, mailing for permission is another alternative. Most people blogging code are open to having it reused.
On the first problem: silly as that law may be, technically copyright applies to any expression, and applies without requiring the author of the expression to assert it explicitly; if there is no license, you might in theory be liable for copyright violation even for small snippets. Possible defenses are based on fair use, but (again, in theory) you might end up in court to defend yourself with that (your fair use claim does not stop the copyright holder from suing -- nothing does, except common sense -- but the judge might decide in your favor if he or she decides the use is indeed fair).
Your second problem hinges on whether your code is a derivative work of the snippet, another thorny concept which mingles with the "fair use" issue. Again, the only definitive answer is the one a judge gives in the specific case ("definitive" unless overruled by an appeals court, actually;-).
Remember, most lawmakers are lawyers by training: sometimes one may wonder if they make the laws subtle and difficult just in order to ensure lawyers will always have plenty of jobs;-).
It largely depends on the country. In some countries programs are treated as pieces of literature, so a small amount of 'quoting' is allowed as fair use.
Unfortunately you have to state which country you live in and check what the local copyright law says. In most cases the cheaper solution is mailing the author for permission (especially if it is an open source project).
Copyright law (as in the Berne Convention) protects even small pieces of writing to some extent, so you'd have to consult somebody knowledgeable in the law where you live. There may be something available locally in a library, or you could consult a lawyer.
As far as what happens when you do something legal in Sweden and send it to me in the US where it might be illegal, I don't really know. I think I'd be the one in legal trouble, although there's the Dmitri Sklyarov case to worry about (he did something legal in Russia, came to the US, and was arrested under legal circumstances I don't really understand). Again, consult a lawyer.

Is contributing internal tools to open source worth the effort? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I know this is a general question, but I'd like to hear other people's opinion about our case:
I work in a small company. Our main development tool is PowerBuilder, which is a very limited IDE with a shrinking community. We've created some tools, which we use internally to solve certain needs. They have neither been properly designed nor properly tested, and are not of production quality. OTOH, they do save us quite some time, and might help others as well. I'm sure other companies have the same kind of tools, and I was wondering how common a practice it is to share them with others. As I see it -
The pros:
Good karma
More attention to our website
Perhaps getting fixes and improvements from others
The cons:
Without investing more development, the tools might make us look bad
Publishing of the code requires some effort
Some of the tools might be too specialized for our needs
The whole effort might go unnoticed given the shrinking community
Have you or your company ever contributed such tools, or used such tools developed by others? Is it worth the effort?
EDIT:
For those who wondered, the tools I had in mind include -
A tool that makes using SourceSafe easier, by listing objects that are checked out to the current user or others, backing up checked-out objects, and reconstructing PBGs.
A tool that recognizes PB controls at runtime, like Spy++ does (requires some infrastructure at the target app).
PBNI wrapper for SQLite (in-process access, no ODBC).
An SQL client, text measurement tool etc.
"Open source" originally meant you published a tool, and you made the source available. Because of some projects that expected, and in some cases through licenses demanded, that changes to the source code be resubmitted for sharing, "open source" now quite often adds the concept of collaborative development to the mix. I did (or attempted to do) the latter; allow me to share.
There are magnitudes of difference between the effort associated with source available and collaborative development open source.
Leadership: You need to tell people the who, what, where, when, why and how of changes. And very possibly, you'll need to diplomatically poke and prod your volunteers. You may need to define the vision and prioritize goals of the project, and then enforce them when someone tries to take things another way. And, unless you only want people to come across your tool through serendipity, you'll have to advertise, running that very thin line (even thinner on the Internet) between attention-getting and gaudy. If the project is going to implement the concept of meritocracy, as many open source proponents say should happen, then someone will have to judge people's accomplishments and dole out the rights and responsibilities appropriately.
Work flow: I haven't done an exhaustive search by any stretch of the imagination, but I have yet to see a collaborative development platform that did all the things I needed. Part of the point of open source collaborative development is that the quantity involved in code review will cover any potential issues in quality of code submissions; I haven't seen a free tool integrated into a collaborative development platform that helped manage that cleanly yet (e.g. counting code reviews; auto-promoting after x reviews). We had to handle that, hacking manual methods into the existing tools. Probably at some point you'll have to define a version and create a build. Then there's the grunt tasks like documentation. (Ever try to release a new version of something free without release notes? The furor!! grin)
PB-specific issues: PowerBuilder is a commercial tool, and while there are cheap versions available, there are not free versions. The DRM added to PB11 has probably reduced or eliminated piracy that developers were probably doing to take copies of their office PB home, and while PB11 and later have a dual license policy that would allow developers to take home a copy legally (with permission and cooperation of the original license owners to create a second license), I don't see a lot doing it. (No scientific study, that's just what I see.) That cuts down a lot of potential collaboration, even from enthusiasts. Issues of compatibility of code between versions of PowerBuilder, plus the fact that very few people will own every version, will limit again your list of potential contributors.
Don't get me wrong. I'd love to see more collaborative development open source in the PowerBuilder community. I'd love to know how to work out the issues myself, and I have an effort in the works to see if I can make a new model work. (My first effort to follow the popular model failed miserably, IMHO.)
Is there a reason to feel badly about firing a ZIP file up to the web and forgetting it? I don't know. Is there any more pride or embarrassment in a 4 year old ZIP file as opposed to a SourceForge project whose last contribution 3 1/2 years ago was a post "Where the heck is everyone?" There is a reason why Sybase CodeXchange devolved from a collaborative development platform to a source available platform: next to no one was using the collaborative development features. If you source available open source your code, you'll have plenty of company.
BTW, CodeXchange may be an answer to your concern about visibility to the PowerBuilder community, although you'll lose the web site traffic. The PowerBuilder Web Ring is another, significantly less effective, method to help your visibility that keeps traffic on your web site, but it demands a navigation bar on the target page on your site. CodeXchange may also be a way to get over your concerns about code quality and narrowness of purpose of what you have to share. grin
What should you do? Don't underestimate the effort with a collaborative development sharing, but don't let it stop you from a source available sharing.
Good luck,
Terry.
You can probably discount one of your cons: Anyone interested enough in this kind of tool to be evaluating your offering is unlikely to be writing Company X are teh suxors on your feedback form; rather if they find some deficiency in what you have put out there, you are likely to get helpful bug reports or even patches.
If you can get your company to buy off on contributing to the community then I would go for it. It is always worth the effort to give back a little bit, and this would definitely be a good way to get some of your tools out to the public and improved upon by the community.
As far as the cons go, I wouldn't worry too much about the criticism; it can only help you guys improve the next product you deliver, and people will respect you for learning from your mistakes. Nobody is perfect.
Even if your effort goes unnoticed by your shrinking community, future employees and clients will see that you are contributing outside of the company, which may help your reputation with them.
I think the pros far outweigh the cons on this one.
In short: go for it. There's little to lose, but much to gain.
The pros:
Good karma
never a bad thing to have.
More attention to our website
possibly a con if your code is really bad :)
Perhaps getting fixes and improvements from others
this is possibly the best thing you get from open-sourcing your code. It's all about sharing and helping each other: you get to use others' code, they get to use yours, and everyone gains from the trade.
The cons:
Without investing more development, the tools might make us look bad
I'd search through to remove dodgy/rude/stupid comments, tidy up the formatting etc.
Publishing of the code requires some effort
requires barely any effort - set up an account on SourceForge, create an SVN repo there and import your code. Then create a binary package (a zip file will do) and release it using the website. Might take you an hour, if you stop to read all the documentation.
Some of the tools might be too specialized for our needs
You could set the whole lot up as a group - e.g. PowerBuilder Tools - then people who see the really specialised tools won't have wasted their time getting them, as they'll still have the 'more readily useful' tools.
The whole effort might go unnoticed given the shrinking community
Possibly, but then there's really no reason not to release the code. If you don't it may get completely lost to everyone when/if you change development tools.
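For what it's worth, the "binary package (a zip file will do)" step from the publishing point above can be scripted in a few lines. This is only a rough sketch in Python using the standard-library zipfile module; the function name build_release_zip and both paths are made up for illustration, and it assumes the zip file is written somewhere outside the folder being packed.

```python
import os
import zipfile


def build_release_zip(source_dir, zip_path):
    """Pack every file under source_dir into zip_path; return the archived names."""
    archived = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, files in os.walk(source_dir):
            for name in sorted(files):
                full_path = os.path.join(folder, name)
                # Store paths relative to source_dir so the archive unpacks cleanly.
                relative = os.path.relpath(full_path, source_dir)
                archive.write(full_path, relative)
                archived.append(relative)
    return archived
```

Calling build_release_zip("tools", "pb-tools.zip") (both names hypothetical) would give you a single file to upload to whichever hosting site you pick.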
Publishing your source is a great way to get feedback. If you look bad because of it, that's ok. Just be willing to fix the problem. If you want help with your improvements I can't think of a better way than asking for help.
By the way, plenty of open source projects can be credited with the growth of communities that were previously shrinking.
I think you've done a good job of identifying the pros and cons. And it's probably true that the pros will outweigh the cons. If no one likes the utilities and does nothing to or with them, then you've lost nothing really; bad code shouldn't scare experienced developers (most experienced developers, especially PB ones, have seen their share of legacy code). If even one person benefits, then you get the karma, eh?
If you proceed to submit your tools to the open source community, do as you have here, and admit up front that the tools are not polished. This may deter some from even looking at them; however, if they are at least functional and can be easily modified, then they still represent a head-start for any prospective beneficiaries. As a PB user myself, I would be curious to know more about free tools that can give us an edge in productivity.
Have you looked into Sybase CodeXchange? They have some open-source PB things there, including the PowerBuilder Foundation Class framework.
I just saw your response to my question - amazing that you have developed something similar already. :-)
Regarding your question: the company I work for has a specific section on the web site where tools which we used internally and/or simple solutions (or code snippets) which customers frequently ask for are published. The license of these offerings is very liberal as well, I think it qualifies as open source.
In your particular case, I'm fairly interested in the Spy++-like application you talked about since I was looking for (and/or trying to develop) something like that myself.
I'm aiming for something which doesn't require any infrastructure in the target application, but so far I'd be happy to play with anything which works, even if it requires modifications to the applications. I'm just not familiar enough with the PowerBuilder API yet to make a judgement on whether this is possible without modifying the target application.
As I mentioned, I already developed similar Spy-like applications for ordinary Windows applications as well as managed code applications (which require interaction with the VM to query the state of the object tree), so my hope is that I'll be able to find a solution which does not require any target infrastructure.
Do you have the source code up somewhere already? It doesn't need to be compilable; I'd just be happy to see how you did it in principle so that I can (hopefully) derive something from it which solves my particular problem. In case you haven't uploaded the source code yet, maybe you can provide some email address which I can use to contact you privately? I tried looking for something on your profile, but so far - no luck. :-)

Is it possible for open-source software to have viruses/spyware/malware? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
Sorry if this is a stupid question, but sometimes I see Easter eggs and stuff in programs like Aptitude (the package manager for Debian).
Is it possible, or even likely, that more sinister features make their way into open-source software?
It's certainly possible, but it's more complicated. I don't know of any actual malware going around, but people have made mistakes with similar effects. (I know of mistakes that have been found; obviously, I don't know how many haven't been.)
If you put malware into closed-source software, the only way to find it is to detect the effects and analyze the binary. There are people who are very good at analyzing the binary.
In open-source software, anybody can look at the source code. Not many will, for most packages, but there's a much higher chance of being found out. Once found out, anybody can patch the software to do the good things without the bad. Moreover, most open source software has publicly available repositories, which means that anybody can track down the history of the code, and (at least to a pseudonym) who did what. There is also a tendency to produce more readable code in open source, so that changes will stand out more.
The caveat, of course, is that most of us really don't know what to look for in software security. If I run a compression program, and it compresses my file to a shorter version that looks like gibberish, and I can get the original back, I know that's working. If it changes it to gibberish that it claims is encrypted, I don't know a priori how to tell if it's well encrypted.
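The compression half of that caveat is easy to check mechanically. A minimal sketch in Python, using the standard-library zlib module as a stand-in for whatever compression program is actually under test (the helper name round_trip_ok is made up for illustration):

```python
import zlib


def round_trip_ok(data: bytes) -> bool:
    """Compress data, decompress it again, and confirm nothing was lost."""
    compressed = zlib.compress(data)
    restored = zlib.decompress(compressed)
    return restored == data


if __name__ == "__main__":
    sample = b"the same line repeated many times\n" * 100
    # Does the original come back unchanged?
    print(round_trip_ok(sample))
    # Is the compressed form actually shorter for this repetitive input?
    print(len(zlib.compress(sample)) < len(sample))
```

There is no comparably simple check for the encryption case: output that merely looks random says nothing about whether it is well encrypted, which is the point of the caveat.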
It's possible but sort of harder because the source code is there. The author would be counting on no one bothering to read the source code before running it which is true for a lot of people I suppose. I know I don't bother to read the source code of the open source programs I run. In a larger project it's harder because the code is often reviewed but if there's just one author then it becomes a lot easier.
Any software can contain malicious parts (intentionally or unintentionally). The advantage of open source is that you can check it (if you like and have the time to do so).
Yes, it's possible; see the Debian OpenSSL debacle for a nice example:
http://www.metasploit.com/users/hdm/tools/debian-openssl/
Although this is not a virus/spyware/malware, it clearly shows what could go wrong in open source software.
I can't believe nobody's mentioned Ken Thompson's compiler virus yet.
Having access to the source code offers a reasonable level of assurance that the program won't behave maliciously. However, unless you've inspected one of:
The binary output of the compiler
The binary code of the compiler itself
The source code of the compiler (and its compiler, and its compiler, etc.) and the binary code of the compiler used to compile it.
you could end up with malicious code in the compiled binary that never appears in the source. Admittedly it's a very unlikely and extremely difficult form of attack, but it's theoretically possible to introduce malicious code into an open source project in a way that cannot be detected in the source code (of either the project or the compiler).
If you're not working for the CIA (or the equivalent agency of another government), compiler security probably isn't something you have to worry about. But it is a very cool concept to think about.
I'd say that it obviously is possible. All it requires is that code gets accepted without sufficient review. It's not hard to come up with scenarios permitting that, since reviewers are human.
The more interesting question then becomes how likely it is that malware gets accepted into some package where it can do harm. This is far harder to answer, unfortunately. We seem to be doing okay so far (knocks on wood), at least.
I guess Linus's law ("with enough eyes, all bugs are shallow") holds true, but it is easy to think that just because something is open source, people will spend a lot of time eyeballing its code. That is not generally true, as far as I know.
EDIT: Changed the wording about Linus's law above, had it wrongly attributed.
Is it possible in principle? Of course. Any software can do things people don't want.
Is it possible in practice? The argument against is of course that the software is available, and that many eyes are looking at it, so it'll be discovered before it can do too much damage.
On the other hand, there is the Underhanded C Code Contest, http://www.underhanded-c.org/, in which you submit programs that intentionally misbehave, but where the cause of the bad behavior is not apparent from an inspection of the source.
Then there's of course the Debian SSL bug, where SSL keys generated with the OpenSSL library on Debian were quite insecure. This, apparently, was just an act of incompetence (Hanlon's Razor, everybody), but it shows how security problems can sneak into open source code. With weak keys and SSH access, you don't need a virus in the code; you just exploit weak code when it's running on production systems.
Take that as a yes/no/maybe :)
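As a rough illustration of why the Debian keys mentioned above were so weak: the bug left the process ID as essentially the only varying input to key generation, and Linux PIDs default to fewer than 32,768 values. The Python sketch below is a toy model under that assumption. `toy_key_from_pid` is a hypothetical stand-in, not OpenSSL's actual key generation, but it shows how small the search space becomes.

```python
import hashlib

MAX_PID = 32768  # default upper bound on Linux process IDs


def toy_key_from_pid(pid: int) -> str:
    """Hypothetical key generator whose only 'randomness' is the process ID."""
    return hashlib.sha256(f"pid:{pid}".encode()).hexdigest()


def find_pid_for_key(key: str):
    """Brute-force the whole seed space; returns the PID that produced the key, or None."""
    for pid in range(1, MAX_PID):
        if toy_key_from_pid(pid) == key:
            return pid
    return None


# The "victim" generates a key; the attacker recovers the seed by enumeration.
victim_key = toy_key_from_pid(12345)
recovered_pid = find_pid_for_key(victim_key)
```

With so few possible seeds, an attacker can precompute every possible key once and simply look up a match, with no malicious code in the victim's software at all.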
Yes, it is possible, just as it's possible for closed-source software to have the same thing occur (a malicious developer on the team, etc.).
It's arguably less likely with open source, though, as the moment anything like that is noticed, any other user can pull the problem code and it's no longer a problem.
Similar to how closed-source software can be a virus, spyware, or malware, open source can be as well. There's also tons of horrible open source software.
So far, every closed-source Windows program I've seen has been some sort of malware, though, so my bias is that closed-source apps have a higher chance of being crap overall.
Who prevents it? Even if the software is open source, only a poorly run project allows anyone to touch the release repository without authorization. Usually there are maintainer(s) who review all the incoming patches.
99% of all software produced and used is poor quality and bug ridden.
It is, but usually it's noticed and removed before it becomes an issue. With any well-maintained open source software, there are many people who check each revision for any changes that were made.
Yes it is possible, it depends how carefully controlled commit access to the source code is and how carefully monitored those commits are. Some projects have a few lead developers who request patches from the community and commit this to the code base, other projects will grant access to many more developers. Equally some projects have a large number of people reviewing the source code as changes are made.
It is indeed possible, since it is code like any proprietary software. However, the main difference is that you, and the community, have access to the code, and this fact is enough to stop it from happening in almost all cases. Also, the vast number of versions of libraries and kernels makes malware less likely to succeed.
Do you really know what you use? Do you check? Does the typical user check or care?
For example, search Google for keywords such as "repository compromised" or "gpg repository compromised", or something along those lines.
It is possible, but not very likely.
There's nothing special about open source code that makes it magically resistant to containing bad things, but open source which is actively developed by a group of people is very unlikely to contain malicious code, because someone would notice and blow the whistle.
In addition, in most open source projects it's possible to trace the history of any particular piece of code, by looking through the project's source repository, which means that the author of a malicious piece of code can be identified.
If in doubt, you can always review the code yourself, or hire someone to review it for you. Code review generally won't catch subtle bugs or errors, of course, but malicious code is likely to be more obvious.
Another example where open-source security is superior to that of closed source is the InterBase backdoor.
From The Register:
A back door password has been hidden in Borland/Inprise's popular Interbase database software for at least seven years, potentially exposing tens of thousands of private databases at corporations and government agencies to unauthorized access and manipulation over the Internet, experts say.
The password was discovered when Interbase was made open source.
Which doesn't mean that security of open-source software is perfect, or even good. Who needs to insert malware into original software when there are thousands of remotely exploitable security holes all over the place?
Is it likely that more sinister features make their way into open-source software?
Officially no, it's relatively unlikely that malware features will get in and go unnoticed for long. But:
servers holding distribution sources can be (and have been) compromised so that what you're downloading doesn't correspond to the open-source development work;
in the case of binary distributions (usually for Windows), the installer for the software can be packaged with malware. Again, officially this happens quite rarely; one example is early versions of LimeWire, which installed a ‘shopping helper’ affiliate-fee-stealer BHO to “support the project”, and lost a lot of goodwill doing so;
but, there are also some scam artists who squat search results for well-known open source projects (again, most commonly with file-sharing software) and deliver their own tweaked installers bundled with malware. Always find the project's official site before downloading.
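One common guard against the tampered-download cases listed above is to compare the file you downloaded against a checksum the project publishes. The answer itself does not mention this, so treat the following as an assumed technique: a minimal Python sketch using the standard `hashlib` module, with a throwaway temporary file standing in for a downloaded installer. The file contents and function names are hypothetical.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large installers need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: str, published_hex: str) -> bool:
    """Compare the file's SHA-256 with the hex digest the project published."""
    return sha256_of_file(path) == published_hex.strip().lower()


# Demo: a temporary file plays the role of the downloaded installer.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"pretend this is an installer")
    installer_path = tmp.name

published = hashlib.sha256(b"pretend this is an installer").hexdigest()
untampered_ok = matches_published_checksum(installer_path, published)

# Simulate someone bundling extra content into the installer.
with open(installer_path, "ab") as f:
    f.write(b" plus something extra")
tampered_ok = matches_published_checksum(installer_path, published)

os.remove(installer_path)
```

A checksum only helps if it comes from a source the attacker does not control; if the same compromised server hosts both the file and the checksum, a cryptographic signature from the project's maintainers is the stronger check.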
Just a reminder,
SSH has been backdoored before
WordPress has been backdoored before
Both of these happened as the result of an attack, so the backdoors weren't planted by the original developers. I think this kind of proves that it can happen, but that it's not that likely and will get picked up quite soon if the application is popular enough.

Which factors determine the success of an open source project? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
We have a series of closed source applications and libraries, for which we think it would make sense opening up the source code.
What has been blocking us, so far, is the effort needed to clean up the code base and documenting the source before opening up.
We want to open up the source only if we have a reasonable chance of the projects being successful -- i.e. having contributors. We are convinced that the code would be interesting to a large base of developers.
Which factors, excluding the "interestingness" and "usefulness" of the project, determine the success of an open source project?
Examples:
Cleanliness of code
Availability of source code comments
Fully or partially documented APIs
Choice of license (GPL vs. LGPL vs. BSD, etc...)
Choice of public repository
Investment in a public website
There are several things which dominate the success of an open source project. All of these must be achieved for the slightest chance of adoption.
Market - There must be a market for your open source project. If your project is an orange juicer in space, I doubt that you'll be very successful. You must make sure your project gets a large adoption amongst users and developers. It is twice as likely to succeed if you can get other corporations to adopt it as well.
Documentation - As you touched on earlier, documentation is key. This documentation includes commented code, architectural decisions, and API notes. Even if your documentation contains errors, or documents bugs in your software, that is OK. Remember, transparency is key.
Freedom - You must allow your code to be "free" - by this I mean free as in speech, not as in beer. If you have a feeling your market is as a library for other corporations, a BSD license is optimal. If your piece of software is going to run on desktops, then the GPL is your choice.
Transparency - You must write software in a transparent environment. Once you go open source there are no hidden secrets. You must explain everything you do, and what you are doing. This will piss off developers like no other.
Developer Community - A strong developer community is required. This must be existing. Only about 5% of users contribute back to the project. If someone notices there haven't been any releases for a year they won't think "Wow, this piece of software is done," they will think "developers must have dumped it." Keep your developers working on it, even if it means they are costing you money.
Communications - You must make sure your community is able to communicate. They must be able to file bugs, discuss workarounds, and publish patches. Without feedback, it is pointless to open-source the project.
Availability - Making your code easy to get is necessary, even if it means pissing off lawyers. You have to make sure your project is easy to download and use. You don't want the user to have to jump through 18 nag screens and sign a contract in order to do this. You have to make things simple and clean.
I think that the single most important factor is the number of users that are using your project.
Otherwise it's just a really well written, useful and well documented bunch of stuff that sits on a server not doing very much...
To acquire contributors, you first need users, then you need some incompleteness. You need to trigger the "This is cool, but I really wish it had this or was different in this way." If you are missing an obvious feature, it's extremely likely a user will become a contributor to add it.
The most important thing is that the program be good. If it's not good, nobody will use it. You cannot hope that the chicken-and-egg will reverse and that people will take it for granted until it becomes good.
Of course, "good" merely means "better than any other practical option for a significant number of people"; it doesn't mean that it's strictly the best, only that it has some features that make it, for many people, better than other options. Sometimes the program has no equivalent anywhere else, in which case there's almost no requirement in this regard.
When a program is good, people will use it. Obviously, it has to have a market among users--a good program that does something nobody wants isn't really good no matter how well it's designed. One could make a point about marketing, but truly good products, up to a point, have a tendency to market themselves. It's much harder to promote something that isn't good, so clearly one's first priority should be the product itself, not promoting the product.
The real question then is--how do you make it good? And the answer to that is a dedicated, skilled development team. One person can rarely create a good product on their own; even if they're far better than the other developers, multiple perspectives have an incredibly useful effect on the project. This is why having corporate sponsors is so useful--it puts other developers' (from the corporation) minds on the problem to give their own opinion. This is especially useful in the case that developing the program requires significant expertise that isn't commonly available in the community.
Of course, I'm saying this all from experience. I'm one of the main developers on x264 (currently the most active one), one of the most popular video encoders. We have two main developers, various minor developers in the community that contribute patches, and corporate sponsorship from Joost (Gabriel Bouvigne, who maintains ratecontrol algorithms), from Avail Media (who I work for sometimes on contract and who are currently hiring coders on contract to add MBAFF interlacing support), and from a few others that pop up from time to time.
One good developer doesn't make a project--many good developers do. And the end result of this is a program that encodes video faster and at a far better quality than most commercial competitors, hardware or software, even those with utterly enormous development budgets.
In looking at these issues you might be interested in checking out the online version of a course on open source at UC Berkeley, called Open Source Development and Distribution of Digital Information: Technical, Economic, Social, and Legal Perspectives. It’s co-taught by Mitch Kapor (Lotus founder) and Pamela Samuelson, a law school professor. I have a long commute and put the audio of the course on my iPod last year. They talk a lot about what works, what doesn’t and why, from a very broad (though obviously academic) perspective.
Books have been written on the subject. In fact, you can find a free book here: Producing Open Source Software
Really, I think the answer is 'how you run the project'.
All of your examples matter, yes, but the key things are how the interaction between developers is managed, how patches etc are handled/accepted, who's 'in charge' and how they handle that responsibility, and so on and so forth.
Compare and contrast (the history isn't hard to track down!) the management of the development of Class::DBI and DBIx::Class in Perl.
I was just reading tonight an excellent post on the usability aspect of successful vs unsuccessful open source projects.
Excerpt:
A lot of bandwidth has been wasted arguing over the lack of usability in open-source software/free software (henceforth “OSS”). The debate continues at this moment on blogs, forums, and Slashdot comment threads. Some people say that bad usability is endemic to the entire OSS world, while others say that OSS usability is great but that the real problem is the closed-minded users who expect every program to clone Microsoft. Some people contend that UI problems are temporary growing pains, while others say that the OSS development model systematically produces bad UI. Some people even argue that the GPL indirectly rewards software that’s difficult to use! (For the record, I disagree.)
http://humanized.com/weblog/2007/10/05/make_oss_humane/
Just open-source it. Most probably, nobody will start contributing yet. But at least you can write in your press releases that your product is GPL or whatever.
The first step is that people start using it...
And maybe then, after users get comfortable, they will start contributing.
Everyone's answers have been good so far, but there's one thing missing and that's good oversight. Nothing kills an open source project faster than not having some sort of project management. Not to tell people what to do so much as to just add some structure and tasking for the developers you are hoping to attract.
Disorganized projects fall apart fast. It's not a bird you just let go and watch fly away.