When you deploy applications to OpenShift from source code, you will typically provide the source code by specifying a URL to a repository managed by a Git hosting service such as GitHub, GitLab or Bitbucket. When the build process runs to create the image for your application, the first step will be to pull down the source code from that hosted Git repository.
If the hosted Git repository is publicly accessible, there is nothing else to do. If, however, you want to use a private Git repository, you will need to provide OpenShift with access credentials that the build process can use when accessing the Git repository.
The principles of using private Git repositories with OpenShift have been covered in a number of prior posts:
- Deploying From Private Git Repositories
- Using SSH Key for S2I Builds
- OpenShift Online: How to Deploy from Private Git Repositories
These looked at key generation and setting up a build configuration to use the key. Beyond these basics, however, there are additional features in OpenShift which make using private Git repositories easier. There are also various best practices you can adopt to ensure you are using the most secure mechanism possible, without risking your most important access credentials.
In this series of posts I will look again at the topic of accessing private Git repositories, but explore newer, recommended ways of using OpenShift and hosting services such as GitHub, GitLab, and Bitbucket to best secure access to your source code.
Methods for Accessing a Repository
To access a hosted Git repository, a number of different protocol types are supported.
- Local Protocol - This is used where the Git repository resides in your local file system and clients access the repository directly. A Git repository on a remote machine can only be accessed using this protocol if its file system is shared with the local host from which access is required.
- The Git Protocol - This is used where a special Git daemon process is run to mediate access to a repository. The repository can be accessed remotely if the daemon is exposed, but the protocol provides no authentication mechanism for controlling access.
- The HTTP Protocols - This provides access to a hosted Git repository over the HTTP protocol. Access to the repository can be secured, with access credentials supplied using the HTTP Basic Authentication protocol. Because HTTP Basic Authentication only encodes your password rather than encrypting it, credentials should only be passed over a secure HTTPS connection so that your password cannot be viewed in transit.
- The SSH Protocol - This provides access to a hosted Git repository over an SSH connection. Access to the repository is secured using public-key authentication. You have to generate a key pair and register the public key with the server hosting the Git repository. The identity of the key pair is then used to control access.
For more information about communicating with a Git repository using these protocols see the hosted version of the Pro Git book.
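As a rough illustration of how the protocols listed above show up in practice, the following Python sketch classifies a repository URL by the protocol it implies. The function name `git_protocol` and the example URLs are hypothetical, and the pattern used to recognise the scp-like SSH form (`user@host:path`) is a simplification of what Git itself does.

```python
import re
from urllib.parse import urlparse

# scp-like syntax Git accepts for SSH, e.g. git@github.com:example/repo.git
# (simplified: an optional "user@", a host with no slash, then ":path").
_SCP_LIKE = re.compile(r"^(?:[^@/\s]+@)?[^:/\s]+:(?!//)\S+$")


def git_protocol(url: str) -> str:
    """Return 'local', 'git', 'http', 'https' or 'ssh' for a repository URL."""
    scheme = urlparse(url).scheme.lower()
    if scheme in ("http", "https", "git", "ssh"):
        return scheme
    if scheme == "file":
        return "local"
    if _SCP_LIKE.match(url):
        return "ssh"
    # Anything else is treated as a path on the local file system.
    return "local"


if __name__ == "__main__":
    for example in (
        "/srv/git/project.git",
        "git://git.example.com/project.git",
        "https://github.com/example/private-repo.git",
        "git@github.com:example/private-repo.git",
    ):
        print(f"{example} uses the {git_protocol(example)} protocol")
```

Only the `https` and `ssh` results are relevant when using a hosting service such as GitHub, GitLab or Bitbucket.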
Git repository hosting services such as GitHub, GitLab, and Bitbucket support accessing a repository using both the HTTPS and SSH protocols. The terminology for each can vary, but the credential types offered by these hosting services fall into the following categories:
- Personal password - This is the password for your account and can be used when accessing the service over the HTTPS protocol. This gives access to any repositories in your account, as well as repositories under other accounts to which you have been granted access. Depending on the hosting service used, you may have full read/write permissions on any repository you don't own, or you may be restricted to only being able to read a repository.
- Personal SSH keys - This is an SSH key linked to your account with the hosting service and can be used when accessing the service over the SSH protocol. This gives access to any repositories in your account, as well as repositories under other accounts to which you have been granted access. Depending on the hosting service used, you may have full read/write permissions on any repository you don't own, or you may be restricted to only being able to read a repository.
- Personal access tokens - This is an alternative to a personal password which can be used to access a repository over the HTTPS protocol. Any number of access tokens can be created, so you can create a separate token for each client you want to use to access your account. By using a separate token per client you can control what level of access each has, including whether it has read-only access or can update repositories. A token for a specific client can be revoked without affecting other clients. Although the actions a specific client can perform can be controlled, it would still be able to see all repositories that you as a user have access to.
- Repository SSH keys - This is an SSH key linked to a specific repository and can only be used with that repository. The same SSH key cannot be used as a personal SSH key and, depending on the hosting service, may also not be usable with more than one repository. By default, read-only access is granted, but write access can be enabled if desired.
When using a private Git repository with OpenShift, you should always aim to use a unique repository SSH key and ensure it has read-only access to the repository. This is because you will need to upload the private key of the key-pair to OpenShift.
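To give a feel for what uploading the private key involves, here is a minimal Python sketch that builds the kind of secret which can hold an SSH private key, along with the part of a build configuration that refers to it. It assumes the standard `kubernetes.io/ssh-auth` secret type with its `ssh-privatekey` field, and the `sourceSecret` field of a build's source definition. The function names, the secret name `repo-at-github`, the repository URL and the placeholder key text are all made up for illustration.

```python
import json


def ssh_auth_secret(name: str, private_key_pem: str) -> dict:
    """Build a Secret manifest holding an SSH private key for Git access."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "kubernetes.io/ssh-auth",
        # stringData lets the key be supplied as plain text rather than base64.
        "stringData": {"ssh-privatekey": private_key_pem},
    }


def build_source_fragment(repo_ssh_url: str, secret_name: str) -> dict:
    """Build the 'source' part of a build configuration that uses the secret."""
    return {
        "source": {
            "git": {"uri": repo_ssh_url},
            "sourceSecret": {"name": secret_name},
        }
    }


if __name__ == "__main__":
    # Placeholder text only; a real private key must never be committed or printed.
    secret = ssh_auth_secret("repo-at-github", "<contents of the repository private key>")
    fragment = build_source_fragment(
        "git@github.com:example/private-repo.git", "repo-at-github"
    )
    print(json.dumps(secret, indent=2))
    print(json.dumps(fragment, indent=2))
```

Because the private key ends up stored in the cluster in this form, a key that can only read one repository limits what anyone who obtains it can do.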
When setting up a repository SSH key you should not use your primary identity SSH key. This is because you should never upload its private key to a system you do not control or which can be accessed by others. Using a primary identity key carries additional risk as it is likely used for other purposes, such as gaining secure shell access over SSH to other hosts. If someone were able to get the private key, they could access those other systems.
One saving grace is that OpenShift will not allow you to use a key-pair where the private key has a passphrase. A best practice, which I am sure you are following, is that your primary identity SSH key should always have a passphrase; this will prevent you from inadvisedly using your primary identity SSH key.
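If you want to check up front whether a private key file is protected by a passphrase, the following Python sketch shows one way it might be done. It is not how OpenShift performs its check. It relies on the OpenSSH private key format recording a cipher name of `none` for unprotected keys, and on older PEM files marking encrypted keys with a `Proc-Type` header or an `ENCRYPTED PRIVATE KEY` label. The function name `private_key_has_passphrase` is hypothetical.

```python
import base64
import struct

_OPENSSH_MAGIC = b"openssh-key-v1\x00"


def private_key_has_passphrase(pem_text: str) -> bool:
    """Return True if the PEM-encoded private key appears to be encrypted."""
    lines = [line.strip() for line in pem_text.strip().splitlines()]
    if len(lines) < 3 or not lines[0].startswith("-----BEGIN "):
        raise ValueError("not a PEM-encoded private key")
    header = lines[0]

    if "OPENSSH PRIVATE KEY" in header:
        blob = base64.b64decode("".join(lines[1:-1]))
        if not blob.startswith(_OPENSSH_MAGIC):
            raise ValueError("unrecognised OpenSSH private key data")
        # After the magic marker comes a length-prefixed cipher name.
        offset = len(_OPENSSH_MAGIC)
        (length,) = struct.unpack(">I", blob[offset:offset + 4])
        cipher = blob[offset + 4:offset + 4 + length]
        return cipher != b"none"

    if "ENCRYPTED PRIVATE KEY" in header:
        # PKCS#8 container for an encrypted key.
        return True

    # Traditional PEM keys flag encryption in a header line.
    return any(
        line.startswith("Proc-Type:") and "ENCRYPTED" in line
        for line in lines[1:]
    )
```

A key generated specifically for a repository would be expected to return `False` here, while your primary identity key should return `True`.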
So using a repository SSH key and accessing the repository over the SSH protocol is the preferred method. One situation where this will not work, though, is where the OpenShift instance you are using sits behind a firewall, and that firewall blocks SSH connections to an external Git hosting service.
In this situation, you would have to fall back to using a personal access token over an HTTPS connection. Although the access rights of personal access tokens can be limited to read-only access, they cannot be linked to just a single repository and could be used with any repository accessible to the user account. If you are deploying a personal project to OpenShift, this may be acceptable. However, if this were a project of your company hosted on the hosting service under an organization, you should not rely on using a personal access token of a specific developer. If that developer were to leave, you would have the problem that your builds are linked to that developer's account, and they could revoke the access token and break the builds. The developer also puts themselves at risk, as the access token could be used to access other private repositories they have which are not related to their work at the company.
If you cannot use the SSH protocol and must use a personal access token with the HTTPS protocol, for a company it would be better to create a separate machine user account on the Git repository hosting service and use it as the owner of the personal access token.
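When a personal access token is used over HTTPS, it is typically supplied in place of the password in HTTP Basic Authentication. The following Python sketch, using hypothetical function names and the well-known example credentials from the Basic Authentication specification, shows how the header is built and how easily it can be reversed. This is why the token should only ever be sent over HTTPS.

```python
import base64
from typing import Tuple


def basic_auth_header(username: str, secret: str) -> str:
    """Build the value of an HTTP 'Authorization' header for Basic Authentication."""
    raw = f"{username}:{secret}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth_header(value: str) -> Tuple[str, str]:
    """Recover the username and secret from a Basic Authentication header value."""
    scheme, _, encoded = value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authentication header")
    username, _, secret = base64.b64decode(encoded).decode("utf-8").partition(":")
    return username, secret


if __name__ == "__main__":
    header = basic_auth_header("Aladdin", "open sesame")
    print(header)
    # Anyone who can see the header can recover the secret without any key.
    print(decode_basic_auth_header(header))
```

The encoding is only Base64, so it offers no protection by itself. The protection comes entirely from the HTTPS connection.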
Setting Up a Repository SSH Key
In this post we have covered the different protocols and credential types you can use to access a hosted Git repository, as well as listed some best practices around the credential type used. In the next post in this series, we will look at setting up a repository SSH key, using the GitHub hosting service as an example.
Access the rest of the series here:
- Private Git Repositories: Part 2A - Repository SSH Keys
- Private Git Repositories: Part 2B - Repository SSH Keys
- Private Git Repositories: Part 3 - Personal Access Tokens
- Private Git Repositories: Part 4 - Hosting Repositories on GitLab
- Private Git Repositories: Part 5 - Hosting Repositories on Bitbucket