This tutorial aims to teach how to install a Hadoop cluster with 1 NameNode and 2 DataNodes. Hadoop was installed on 3 CentOS VMs (KVM), and this tutorial merges 3 others that you can find on the internet.
First of all, you need to get the registration token. It can be found at https://gitlab.domain.tld/admin/runners. Then start the registration:
gitlab-ci-multi-runner register
You should be asked something like this:
Running in system-mode.

Please enter the gitlab-ci coordinator URL (e.g. https://gitlab.com/):
https://gitlab.domain.tld/
Please enter the gitlab-ci token for this runner:
_z2PxQuMW7dAeHJPJ4jo
Please enter the gitlab-ci description for this runner:
[host.domain.tld]: docker-dind
Please enter the gitlab-ci tags for this runner (comma separated):
docker, dind
Whether to run untagged builds [true/false]:
[false]:
Whether to lock Runner to current project [true/false]:
[false]:
Registering runner... succeeded                     runner=_z2PxQuM
Please enter the executor: parallels, docker-ssh+machine, kubernetes, docker, docker-ssh, shell, ssh, virtualbox, docker+machine:
docker
Please enter the default Docker image (e.g. ruby:2.1):
docker:latest
Runner registered successfully. Feel free to start it, but if it's running already the config should be automatically reloaded!
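The same answers can also be passed as flags instead of being typed at the prompts. The following is only a sketch, not the method used above: the `register_runner` function and the `RUNNER_BIN` variable are hypothetical names, and it assumes your runner version accepts these `register` flags (`gitlab-ci-multi-runner register --help` lists the ones it supports).

```shell
# Binary name as used in this post; override it if yours is called differently.
RUNNER_BIN="${RUNNER_BIN:-gitlab-ci-multi-runner}"

# register_runner URL TOKEN
# Registers a Docker executor with the same answers as the transcript above.
register_runner() {
  url=$1
  token=$2
  "$RUNNER_BIN" register \
    --non-interactive \
    --url "$url" \
    --registration-token "$token" \
    --description "docker-dind" \
    --tag-list "docker,dind" \
    --executor "docker" \
    --docker-image "docker:latest"
}

# Example call (replace the token with the one from the admin page):
#   register_runner "https://gitlab.domain.tld/" "YOUR_TOKEN"
```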
environment = ["VAR1=value1", "VAR2=value2"]: Use this to pass environment variables to the runner.
privileged = true: The container must run in privileged mode to use Docker-in-Docker.
volumes = ["/cache", "/gitlab/docker-images-pipeline:/images:rw"]: The last volume (docker-images-pipeline) keeps Docker images between pipeline steps. You can use docker save -o /images/NameOfTheImage.img NameOfTheImage to save an image and docker load -i /images/NameOfTheImage.img to load it again in the next step.
services = ["docker:dind"]: This entry starts another container, in this case dind, to run a service needed by the runner image. The dind service runs a Docker daemon that provides the Docker service to the runner.
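To see how the four entries above fit together, here is a sketch that writes a sample runner section to a scratch file and prints it. The placement of `environment` under `[[runners]]` and of the other three keys under `[runners.docker]` follows the layout of runner versions from this era, which is an assumption about your setup. The scratch file name is hypothetical, and the token line is left out on purpose. The live file is usually /etc/gitlab-runner/config.toml when running in system-mode.

```shell
# Write the sample section to a scratch file, not to the live config.
out="${TMPDIR:-/tmp}/runner-config-sketch.toml"

cat > "$out" <<'EOF'
[[runners]]
  name = "docker-dind"
  url = "https://gitlab.domain.tld/"
  executor = "docker"
  environment = ["VAR1=value1", "VAR2=value2"]
  [runners.docker]
    image = "docker:latest"
    privileged = true
    volumes = ["/cache", "/gitlab/docker-images-pipeline:/images:rw"]
    services = ["docker:dind"]
EOF

# Print it so it can be compared with the real config by eye.
cat "$out"
```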
A useful command that I discovered is docker ps --size. It shows the size of each container.
$ docker ps --size
CONTAINER ID   IMAGE               COMMAND                  CREATED       STATUS              PORTS      NAMES             SIZE
dbffabfb478a   docker:17.05-dind   "dockerd-entrypoin..."   4 weeks ago   Up About a minute   2375/tcp   docker1705-dind   10.2MB (virtual 110MB)
In this case, the SIZE column shows 10.2MB (virtual 110MB). Compare that with the output of docker images:
$ docker images
REPOSITORY   TAG          IMAGE ID       CREATED       SIZE
docker       17.05-dind   b547d892dffa   4 weeks ago   99.6MB
The virtual size is the image size plus 10.2MB: 99.6MB + 10.2MB is about 110MB (the extra space was used by a file created with the dd command).
The layers that make up the base image are not duplicated for each container that you run with Docker; only the difference is stored on the hard disk, represented here by the 10.2MB.
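The arithmetic can be checked quickly in the shell. This sketch only re-adds the two figures from the outputs above; the variable names are chosen for illustration.

```shell
image_mb=99.6      # SIZE column from `docker images`
writable_mb=10.2   # first figure in the SIZE column of `docker ps --size`

# The shell has no floating-point arithmetic, so let awk do the sum.
virtual_mb=$(awk -v i="$image_mb" -v w="$writable_mb" 'BEGIN { printf "%.1f", i + w }')

echo "virtual size: ${virtual_mb}MB"
```

The result, 109.8MB, is what docker ps --size displays rounded as 110MB.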
If you notice a PID using a lot of resources (such as CPU or memory), use this command to list all running containers with their respective PID - ID - NAME.
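The command itself is not reproduced here. One common way to get that listing, offered as a sketch and not necessarily the original command, is to feed the IDs from docker ps -q to docker inspect with a format template. The function name `list_container_pids` is hypothetical.

```shell
# Print "PID - ID - NAME" for every running container.
list_container_pids() {
  ids=$(docker ps -q)
  if [ -z "$ids" ]; then
    echo "no running containers" >&2
    return 0
  fi
  # $ids is left unquoted on purpose so that each ID becomes its own argument.
  docker inspect --format '{{.State.Pid}} - {{.Id}} - {{.Name}}' $ids
}

# Example: find the container that owns a PID seen in top (12345 is a placeholder).
#   list_container_pids | grep '^12345 '
```

The PID reported is that of the container's main process as seen from the host, so a busy child process may need to be traced back to its parent first.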