
Thursday, December 12, 2019

GitLab - Stop showing your GitLab setup and repositories in Google search results


If your GitLab setup is reachable at a public URL but you do not want it to show up in Google search results, here are some solutions.

1. You can restrict your GitLab URL to a limited set of IPs. GitLab listens on port 80 by default, and you are probably using a web server such as Apache or nginx to serve the GitLab URL publicly. You can add access-control directives to the virtual host so that requests from IPs not on your list are refused.
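
The real restriction lives in your web server configuration, but the matching rule it applies can be sketched in a few lines of Python. This is only an illustration: the networks in `ALLOWED_NETWORKS` and the function name `is_allowed` are made up for the example, so replace them with your own office or VPN ranges.

```python
import ipaddress

# Hypothetical allowlist (documentation address ranges, not real ones).
ALLOWED_NETWORKS = ["203.0.113.0/24", "198.51.100.7/32"]


def is_allowed(client_ip, allowed=ALLOWED_NETWORKS):
    """Return True if client_ip falls inside any allowed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(net) for net in allowed)


if __name__ == "__main__":
    for ip in ("203.0.113.42", "198.51.100.7", "192.0.2.1"):
        print(ip, "allowed" if is_allowed(ip) else "denied")
```

A crawler coming from an address outside the list never gets a page to index, which is why this option is the strictest of the three.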

2a. Sometimes you need to access GitLab from arbitrary IPs, and it is not feasible to change the virtual host settings every time. If you cannot restrict your GitLab setup to specific IPs but still do not want to appear in Google search results, try this solution.
Always keep your repositories and groups private. Do not create any public repository or public group, because these are visible in Google search results. GitLab has a setting for this in the Admin Area. Once it is enabled, regular registered users can no longer create a public repository or group. Only an admin can.
Here is the setting.

Admin Area > Settings > General > Visibility and access controls > Restricted visibility levels

Check the Public box.
From now on, no regular registered user can create a public repository or group.
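
If you prefer to script this instead of clicking through the Admin Area, the same setting can probably be changed through the GitLab application settings API. The sketch below only builds the request and does not send it. It assumes your GitLab version exposes the `restricted_visibility_levels` field on `PUT /api/v4/application/settings` and that you have an admin personal access token. The host `gitlab.example.com`, the token placeholder and the function name are made up for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_restrict_public_request(base_url, admin_token):
    """Build (but do not send) a request that restricts the Public level."""
    # Form-encoded array parameter; assumed to match the settings API.
    body = urlencode([("restricted_visibility_levels[]", "public")]).encode("ascii")
    return Request(
        url=base_url.rstrip("/") + "/api/v4/application/settings",
        data=body,
        method="PUT",
        headers={"PRIVATE-TOKEN": admin_token},
    )


if __name__ == "__main__":
    req = build_restrict_public_request("https://gitlab.example.com", "<admin-token>")
    print(req.get_method(), req.full_url)
    # To actually apply it: urllib.request.urlopen(req)
```

Check the result in the Admin Area afterwards, since the checkbox there is the source of truth.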

2b. After applying solution 2a, you should also apply solution 2b. Like most web applications, GitLab has a robots.txt file.

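robots.txt is a set of instructions for search engines and other crawlers. Well-behaved crawlers such as Googlebot honor it, but it is advisory only and is not access control. If you write a rule that disallows your whole site, compliant crawlers will stop crawling your web application and it will drop out of their results.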

By default GitLab allows the login page and the Explore page to be listed in search results. The Explore page lists all public repositories and groups, so if your GitLab has any of them, they will be listed in search results. You need to modify your GitLab robots.txt. Here is the path.

/opt/gitlab/embedded/service/gitlab-rails/public/robots.txt

Now comment out every line except these two:

 User-Agent: *
 Disallow: /

This stops your GitLab URL from showing in search results. If it is already listed, it should disappear some days after you change robots.txt.
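
The edit above can be sketched as a small Python script, which also checks the result with the standard library's `urllib.robotparser`. It is a minimal sketch, not GitLab's own tooling: `SAMPLE` is a made-up stand-in for the real file, `gitlab.example.com` is a placeholder host, and the function names are hypothetical. The script comments out every existing rule and appends the two block-all lines, which has the same effect as the manual edit.

```python
from urllib.robotparser import RobotFileParser

BLOCK_ALL = ["User-Agent: *", "Disallow: /"]


def lock_down(robots_txt):
    """Comment out every active line, then append the block-all rules."""
    out = []
    for line in robots_txt.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            out.append("# " + line)
        else:
            out.append(line)  # blank lines and existing comments stay as-is
    out.extend(BLOCK_ALL)
    return "\n".join(out) + "\n"


def can_crawl(robots_txt, url, agent="Googlebot"):
    """Ask the stdlib parser whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Made-up sample content, not the real GitLab robots.txt.
SAMPLE = "User-Agent: *\nDisallow: /admin\nAllow: /explore\n"

if __name__ == "__main__":
    url = "https://gitlab.example.com/explore"
    print("before:", can_crawl(SAMPLE, url))
    print("after: ", can_crawl(lock_down(SAMPLE), url))
```

To use it on a real server you would read the file at the path given above, pass its text to `lock_down`, and write the result back, keeping a backup of the original first.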