How to clean confidential data in Mercurial changesets?

I want to sell a copy of my system and need to transfer the source code to my customers. I use Mercurial as the VCS. There is some confidential data in my code: for example, Amazon access keys/secret keys, database passwords, and SSL private keys. Those keys are written in the code or configuration files, like this:
# settings of Amazon S3 storage
s3.storages:
access_key: <secret>
secret_key: <secret>
Before I transfer my code to them, I need to clean all of this confidential data out of the code base. But all of it is in the history (changesets). With Mercurial, how can I clean out those secrets?

If you're giving the customers only a snapshot, you can do it after you run hg archive.
If you want to give them access to the repository with full history, you need to use hg convert to exclude those files (see the sketch below).
In that case you're probably better off just invalidating the AWS key and using a new one in the future -- Amazon makes that very easy.
Going forward, you're better off not putting those keys into source control. Instead, put in a config.sample file and then add config.actual to your .hgignore.
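A minimal sketch of the hg convert route, assuming the secrets live in a file named conf/secrets.yaml (a hypothetical path) and that the bundled convert extension is enabled:
# Enable the extension in your .hgrc:
#   [extensions]
#   convert =
#
# filemap.txt lists what to drop from history:
#   exclude conf/secrets.yaml
hg convert --filemap filemap.txt myrepo myrepo-clean
Note that hg convert rewrites history, so the cleaned repository will have different changeset hashes than the original.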

Related

How do I host and edit a JSON file separate from my GitHub repository?

I'm building a Discord bot and am trying to host it with Heroku and GitHub. I intend to store user data in a JSON file but cannot figure out how to edit the JSON file because I cannot edit it while it is in the repository. I am hoping there is a way to do it through Heroku, without using a separate website.
Note: I know how you would normally edit the JSON file, but because it is in a GitHub Repository it doesn't work the normal way.
Don't use a file as a database. Use a database as a database.
This is generally good advice, but especially important on Heroku where the ephemeral filesystem prevents changes to files from persisting long-term.
Heroku Postgres is a relatively easy way to get started. Its base plan is free.
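Provisioning it takes a single Heroku CLI command; the plan name below is an assumption, so check Heroku's current plans:
# Attach a Postgres instance to your app (app name is hypothetical):
heroku addons:create heroku-postgresql:hobby-dev --app your-bot-app
Heroku then exposes the connection string to your dyno through the DATABASE_URL environment variable.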
I believe GitLab allows you to edit files in place, and they have a free tier like GitHub. As mentioned by Chris, this is generally not recommended, but it may work for your needs.
https://about.gitlab.com/

Git Repository And Database Schemas

My company uses Git for “version control”, etc. Currently it is used for C, C# and Python. I have been asked to add the database schemas, together with the more “complex” SQL (no idea when it becomes “complex”), to the repository. Currently the database is backed up after changes have been made to the schemas or after data has been added (at the moment it is purely a development environment). Having looked at Git, database schemas and the like do not really seem (to me) to map onto it. Should I be considering another package for “source control” to complement the existing MySQL backups?
Thank you...
Assuming you are just wanting to store the SQL scripts that can recreate your DB schema without any data in it (CREATE TABLE, VIEW, INDEX, etc.) then Git seems like a perfectly good option. Git is generally good for version control of textual data, such as SQL scripts.
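For example, you could export just the schema with mysqldump and commit the result (the database name and path are hypothetical):
# Dump table/view definitions only, no rows:
mysqldump --no-data mydb > schema/mydb.sql
git add schema/mydb.sql
git commit -m "Snapshot current DB schema"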
The rule of thumb is not to store large, frequently modified files in Git, for several reasons (beyond the scope of this answer: delta heuristics, snapshots, etc.), so I would suggest not adding full backups to Git directly and instead storing them in a submodule as a standalone repository.
This way you can still use Git to track changes, but your main repository will not grow to a huge size (pack files), and you can manage it inside your project.
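A sketch of that submodule setup (the repository URL and path are assumptions):
# Reference a separate repo for the dumps from inside the main repo:
git submodule add https://example.com/db-dumps.git db-dumps
git commit -m "Track database dumps as a submodule"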
If you only want to store the SQL scripts, Git is a good choice since it will handle them like any other file.

Maintaining individual folder structure with Mercurial

We have a main repository of code, separated into subfolders such as SQL, Websites, etc. We are looking to add an additional folder for our SSIS Packages and similar items so that we can keep backups. However, they are stored in folder structures external from our code repository in three separate servers. We need to maintain the original folder structure on these servers containing the SSIS packages.
Normally I would put Mercurial on each of the machines hosting files that were to be added to the repository, pull down a copy of our repo, and add the files to the repo. However, in this case I won't be able to move the files into the repo folder after I pull down a copy, because the directory structure needs to be maintained.
The only alternative I can think of is to create a second repository for the SSIS packages, but this still runs into the issue that there are three servers with SSIS files to be added, all with their own unique folder structure. Which means that we would have to either change the directory structure of the three servers to match each other, or create three separate repositories.
Please note that the goal here is not for "installing" these files to the servers - the files are already on the intended servers and are (rarely) modified locally when required - it's purely for backing up the versions and maintaining a minimal number of mercurial repositories.
Is it possible to maintain a unique individual folder structure on a local machine while committing files to Mercurial?
An example folder structure was requested in the comments:
Current Repo structure:
Repo
    Documentation
    Reports
    SQL
    Applications
        Application A
        Application B
The goal is to add a new folder so it will look like this:
Repo
    Documentation
    SSIS
        Server 1
        Server 2
    Reports
However, each server stores the files in a different directory structure (one directly dropped in the base C drive, one in a separate subdirectory, and the third deep in a legacy Visual Source Safe directory), and they are referenced externally, so changing them to match the repo is suboptimal.
Hence the question - is there a way to add files to a mercurial repo without dropping them into a pulled copy of the repo?
First of all, what you're trying to do doesn't sound like a great idea to me. That said, I think you can do what you want using symlinks in the file system.
Mercurial knows how to manage symbolic links, and you could make a tree of symbolic links for each machine pointing back to the real files in a different tree.
So it's possible, but it seems like a very error-prone process to me. I think you're far better off using an install tool and putting the Makefile, rake tasks, or similar in your repo.
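One way to read that, sketched for a Unix-like system (all paths are hypothetical; on Windows you would need mklink and the right privileges): the files physically live in the repo clone, and each server's legacy path becomes a link into it, so external references keep working while Mercurial tracks the real content.
hg clone https://example.com/repo /opt/repo
# Legacy location becomes a symlink into the working copy:
ln -s /opt/repo/SSIS/Server1/PackageA.dtsx /legacy/ssis/PackageA.dtsx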

A practical way to provide code updates via Mercurial without sharing main BitBucket account

I suspect this might be really obvious but I can't find a straightforward solution in the documentation or forums:
I have written some code that is held in a Mercurial repository on BitBucket.
I use this code to build Linux virtual servers. When I build a server, I clone the repo onto the server, run my build script, and then delete the clone. The result is a configured server with several files from my repo located in various folders on the server.
Now, I'm looking for a mechanism where I can roll out bug fixes and improvements to my users' servers after I have handed them over. At that time, I won't have SSH access to the servers and I cannot expect my end users to do anything more complicated than kick off a cron job or launch a script.
To achieve this, I'm thinking of setting up a BitBucket account for my users with read-only access to my repo.
I have no problem writing a script to clone my repo, via this read-only account, and apply the updates, but I don't want to include all my files. In particular, I want to exclude my build script as it is commercially sensitive. I know I could remove it from my repo, but then my build wouldn't work.
Reading around, it seems I may need to create a branch or a fork of my repo (which?). Or maybe a sub-repo? Then, I could remove the sensitive files from that branch/fork/sub-repo and allow my users to clone it via a script.
That's OK, but I need a way to update the branched/forked/sub repo as I make changes to the main one. Can this be automatic? In other words, can it be set up to always reflect the updates made in the main repo? Excluding the sensitive files of course.
I'm not sure I'd want updates to be automatic though, so I'd also like to know how to transfer updates from the main to the branch/fork/sub manually. A merge? If I do a merge, how do I make sure my sensitive files don't get copied across?
To sum up, I have a main repo which contains some sensitive files and I need a way to roll out updates of all but those sensitive files to my read-only users.
Sorry if this is hugely obvious. I'm sure it's a case of not seeing the wood for the trees and being overwhelmed by the possibilities. Thanks.
I don't think that you need to solve this in Mercurial at all.
What you actually need is Continuous Integration / a build server.
The simplest solution goes like this:
Set up a build server with something like TeamCity or Jenkins that is always online and monitors your Bitbucket repository for changes.
You can set it up so that when there's a change in your repository, the build server runs your build script and copies the output to some FTP server, or download site, or whatever.
Now you have a single location that always contains the most recent code changes, but without the sensitive files like the build script.
Then, you can set up a script or cron job that your end users can run to get the newest version of the code from that central location.
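That end-user script can be as simple as the following sketch (the download URL and install path are assumptions):
#!/bin/sh
# update-myapp.sh -- fetch and unpack the latest build output
curl -fsSL https://downloads.example.com/myapp/latest.tar.gz -o /tmp/latest.tar.gz
tar -xzf /tmp/latest.tar.gz -C /opt/myapp
A crontab entry such as 0 3 * * * /usr/local/bin/update-myapp.sh would then pull updates nightly.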
You would be OK with two branches: one for the users to clone (main) and another for your main development (dev). The tricky part is merging the new changes from dev to main.
You can solve this by excluding files in the merge process (see "Excluding a file while merging in Mercurial").
By setting the [merge-patterns] section in your .hgrc you can specify which files are not affected by the merge.
[merge-patterns]
build.sh = internal:local
For more info read hg help merge-tools.
"internal:local"
Uses the local version of files as the merged version.
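With that in place, transferring updates manually is just an ordinary merge; a sketch, assuming branches named dev and main as above:
hg update main
hg merge dev        # build.sh keeps main's version via [merge-patterns]
hg commit -m "Merge dev into main, sensitive files untouched"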
Entire Mercurial trees always get moved around together, so you can't clone or pull just part of a repository (along the file-tree axis). You could keep a branch that has only part of the files, and then keep another branch that has everything, making it easy to merge the partial (in terms of files) branch into the other branch (but merging the other way wouldn't be particularly easy).
I'm thinking maybe subrepositories work for your particular use case.
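A minimal subrepository sketch (the path and URL are hypothetical): keep the sensitive files in their own repository and reference it from the main one via an .hgsub file.
# Clone the private repo into place, then register it:
hg clone https://bitbucket.org/you/private-scripts private
echo "private = https://bitbucket.org/you/private-scripts" > .hgsub
hg add .hgsub
hg commit -m "Track sensitive files as a subrepository"
Be aware, though, that users without access to the private repository may then have trouble cloning or updating the parent, so test this against your read-only account first.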

How To share configurations between Two or more TRAC environments

I hope this is a good spot for my question,
as it is software-related, but not code-related.
At our company we are using Trac for issue tracking and management of the code links,
and I am very satisfied with it and like how it works.
I have several environments (one per project), and every time we change a setting in the configuration (e.g. users & permissions, severity, ticket types, etc.) we need to change all of them.
I use
[inherit]
file=../../../sharedTrac.ini
and delete the shared parts from the file.
That works for the preferences, but I didn't find a way to share the configurations.
This is bad for several reasons, the main one being that it "bugs me!!!" :p
Can TRAC read its configurations from a central definition, and the data from a local DB?
EDIT:
I noticed all these configurations are in the .db file (sqlite file)...
Is there a ready-made tool to copy the configurations from DB to DB?
Or should I go ahead and analyse what should be copied and how?
You're almost there. Note, though, that local settings will always overrule inherited ones, so you must delete them in your <env>/conf/trac.ini files to make central configuration effective.
Specifically for the part of the configuration stored inside the Trac db: no, there is no sync tool yet. Given that the one for user accounts has remained in beta for years, there's not much interest. You should use the trac-admin command-line tool (as already advised here) or start to directly sync parts of the db by means of your own (Python) scripts or custom db synchronisation. For a start, have a look at the Trac db schema.
You can try to do this through the command line. Just call the appropriate trac-admin command for each instance. An example one-liner to add a user profile to every environment:
for D in */; do trac-admin "$D" session add username "Full Name" user@email.com; done