Update!
After continuing to play around with this problem, I’ve discovered it might not be a dind problem but more of a TeamCity-hosted-on-docker problem. However, it is obviously possible to host TeamCity on docker, and many people do it, so there must be a solution.
What I know now:
- Regardless of whether I run the docker build on my dind container (discussed below), on the agent itself, on an external VM running a docker server, or by binding the docker.sock of the agent container to its host, I encounter the same problem as described below (see the endpoint sketch just after this list). The tl;dr: it fails when it tries to run `/bin/sh -c . /opt/buildagent/temp/agentTmp/docker-wrapper-5897200216492230985.sh && docker exec -w /opt/buildagent/work/33e31afd4a2c64f0 6ec5e71a142a711bf3bc97b2258dd89e153eaaf9b8a906439b31d728c887588e /bin/sh -c /opt/buildagent/temp/agentTmp/docker-shell-script-10444342258323450815.sh`, which results in this error being logged: `/bin/sh: /opt/buildagent/temp/agentTmp/docker-shell-script-10444342258323450815.sh: not found`. Upon further investigation it becomes apparent that `docker-shell-script-{id}.sh` is available on the build agent but is unable to find `custom_script{id2}`, which causes the `not found` error.
- The error is not caused by dind, since the same error happens even when not using dind.
- However, the problem does go away when I run a build that does not use docker at all.
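Concretely, these are the daemon endpoints I have pointed the agent at, and each one reproduces the identical failure (the VM hostname here is just a placeholder):

# Each DOCKER_HOST value below reproduces the same "not found" failure.
export DOCKER_HOST=tcp://teamcity-dind:2375       # the dind container from the compose file below
export DOCKER_HOST=tcp://build-vm:2375            # external VM running a docker server (placeholder name)
export DOCKER_HOST=unix:///var/run/docker.sock    # agent container with the host's docker.sock bound in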
######################## Original Post ########################
Goals
- Host TeamCity on docker. This includes the server, database, agents, and any other containers needed to support that effort.
- Isolate the builds from the docker host to avoid negative impacts from builds on the other applications running on the docker server. (Performance tuning will be done after it is working.)
- Use docker compose to deploy the system.
- Avoid manual customization after running `docker-compose up -d`. Everything should be handled in the docker compose file or in a bash script that can be run before running docker.
The Setup
- Docker Host
- Debian 12
- Docker version 20.10.24
TeamCity on Docker
- Summary – This is a standard TeamCity-on-docker setup except there is one additional container, `teamcity-dind`. This container runs the `nestybox/ubuntu-noble-systemd-docker` image with the sysbox runtime. This combination gives me a container that is much more like a VM than a normal container, as it provides systemd and dockerd inside the container. I expose port 2375 for the agents to connect to when they need to run docker workloads (the mounted override.conf that makes dockerd listen there is sketched below). The rest of the configuration is pretty standard. After setting up, everything was imported from the Windows installation of TeamCity that we want to replace with this solution.
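For reference, the mounted `dind-override.conf` is a standard systemd drop-in along these lines (a sketch; the dockerd path can differ by image). It clears the stock `ExecStart` and relaunches dockerd listening on unencrypted tcp 2375, which is also why `DOCKER_TLS_CERTDIR` is blanked in the compose file:

# Pre-deploy script step: write the drop-in that compose mounts into the dind container.
cat > dind-override.conf <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
EOF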
The docker-compose.yaml:
services:
  teamcity-db:
    image: postgres:latest
    container_name: ${POSTGRES_HOST}
    restart: unless-stopped
    environment:
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_DB=${POSTGRES_DB}
      - PGDATA=/var/lib/postgresql/data
    volumes:
      - ./buildserver_pgdata:/var/lib/postgresql/data
    networks:
      - teamcity_net
  teamcity:
    image: jetbrains/teamcity-server:${TEAMCITY_VERSION}
    container_name: teamcity
    volumes:
      - ./data_dir:/data/teamcity_server/datadir
      - ./teamcity-server-logs:/opt/teamcity/logs
      - ./ssh:/opt/teamcity/.ssh
    labels:
      # HTTPS routing for the dashboard
      - "traefik.enable=true"
      - "traefik.http.routers.teamcity.rule=Host(`teamcity.srom.local`)"
      - "traefik.http.routers.teamcity.entrypoints=web"
      - "traefik.http.routers.teamcity.service=teamcity"
      - "traefik.http.services.teamcity.loadbalancer.server.port=8111"
    depends_on:
      - teamcity-db
    networks:
      - teamcity_net
    restart: unless-stopped
  teamcity-dind: # This container runs the build containers in an isolated area using sysbox.
    image: nestybox/ubuntu-noble-systemd-docker:latest
    container_name: teamcity-dind
    runtime: sysbox-runc
    privileged: false
    environment:
      - DOCKER_TLS_CERTDIR=""
    volumes:
      - teamcity-dind-data:/data
      - teamcity-dind-cache:/var/lib/docker
      - ./dind-override.conf:/etc/systemd/system/docker.service.d/override.conf:ro
    networks:
      - teamcity_net
    restart: unless-stopped
  teamcity-agent-1:
    build:
      context: .
      dockerfile: Dockerfile.agent # sketched after this file
      args:
        - TEAMCITY_VERSION=${TEAMCITY_VERSION}
        - NODE_VERSION=${NODE_VERSION}
    container_name: teamcity-agent-1
    depends_on:
      - teamcity-dind
    volumes:
      - ./agents/agent-1/conf:/data/teamcity_agent/conf
    environment:
      - SERVER_URL=http://teamcity:8111
      - DOCKER_HOST=tcp://teamcity-dind:2375 # Point to the Docker daemon running in the DinD container
    networks:
      - teamcity_net
    restart: unless-stopped
  teamcity-agent-2:
    build:
      context: .
      dockerfile: Dockerfile.agent
      args:
        - TEAMCITY_VERSION=${TEAMCITY_VERSION}
        - NODE_VERSION=${NODE_VERSION}
    container_name: teamcity-agent-2
    depends_on:
      - teamcity-dind
    volumes:
      - ./agents/agent-2/conf:/data/teamcity_agent/conf
    environment:
      - SERVER_URL=http://teamcity:8111
      - DOCKER_HOST=tcp://teamcity-dind:2375 # Point to the Docker daemon running in the DinD container
    networks:
      - teamcity_net
    restart: unless-stopped
networks:
  teamcity_net:
    external: true
volumes:
  teamcity-dind-data:
  teamcity-dind-cache: # image/layer cache for the dind daemon (/var/lib/docker)
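`Dockerfile.agent` is not reproduced here; it is essentially the official agent image plus the Node version our builds need. A hypothetical sketch of it (the install steps in the real file differ, and this assumes curl and xz are available in the base image):

cat > Dockerfile.agent <<'EOF'
ARG TEAMCITY_VERSION
FROM jetbrains/teamcity-agent:${TEAMCITY_VERSION}
ARG NODE_VERSION
USER root
# Fetch and unpack the requested Node.js version into /usr/local.
RUN curl -fsSL "https://nodejs.org/dist/v${NODE_VERSION}/node-v${NODE_VERSION}-linux-x64.tar.xz" \
    | tar -xJ -C /usr/local --strip-components=1
USER buildagent
EOF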
Build Configuration
This build step is very simple: it just echoes ‘hello world’, but it is instructed to run inside the `alpine:latest` container.
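Stripped of TeamCity’s wrapper plumbing (visible in the log below), the step amounts to something like:

# Hand-rolled equivalent of the build step: run the echo inside alpine:latest.
docker run --rm alpine:latest /bin/sh -c "echo 'hello world'"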
Expected Results
The build step completes and echoes ‘hello world’. This works just fine if I execute it on the agent instead of passing it off to a container.
Actual Results
The build process fails:
Step 1: test (Command Line)
22:34:05 Running step within container alpine:latest
22:34:05 Starting: . /opt/buildagent/temp/agentTmp/docker-wrapper-1006525611025290252.sh && docker run --rm -w /opt/buildagent/work/33e31afd4a2c64f0 --label jetbrains.teamcity.buildId=2153 -id -v "/opt/buildagent/lib:/opt/buildagent/lib:ro" -v "/opt/buildagent/tools:/opt/buildagent/tools:ro" -v "/opt/buildagent/plugins:/opt/buildagent/plugins:ro" -v "/opt/buildagent/work/33e31afd4a2c64f0:/opt/buildagent/work/33e31afd4a2c64f0" -v "/opt/buildagent/temp/agentTmp:/opt/buildagent/temp/agentTmp" -v "/opt/buildagent/temp/buildTmp:/opt/buildagent/temp/buildTmp" -v "/opt/buildagent/system:/opt/buildagent/system" --env-file /opt/buildagent/temp/agentTmp/docker-wrapper-12799287116685116967.envList --entrypoint /bin/sh "alpine:latest"
22:34:06 Process exited with code 0
22:34:06 Successfully created a reusable container, container id = 704738e10303a5912b0b06d29f9d9b00e141b12a71c83949df69c5bc163483bf
22:34:06 Starting: /bin/sh -c . /opt/buildagent/temp/agentTmp/docker-wrapper-5770134226595135069.sh && docker exec -w /opt/buildagent/work/33e31afd4a2c64f0 704738e10303a5912b0b06d29f9d9b00e141b12a71c83949df69c5bc163483bf /bin/sh -c /opt/buildagent/temp/agentTmp/docker-shell-script-11090623290713555801.sh
22:34:06 in directory: /opt/buildagent/work/33e31afd4a2c64f0
22:34:06 /bin/sh: /opt/buildagent/temp/agentTmp/docker-shell-script-11090623290713555801.sh: not found
22:34:06 Process exited with code 127
22:34:06 Process exited with code 127 (Step: test (Command Line))
22:34:07 Step test (Command Line) failed
As I investigated I discovered that the agent did in fact have the file that was ‘not found’, `/opt/buildagent/temp/agentTmp/docker-shell-script-11090623290713555801.sh`, but that script tries to run another file, `/opt/buildagent/temp/agentTmp/custom_script17351432578670884465`, which is absent from the file system.
buildagent@c6a26dfd854d:/$ ls -la /opt/buildagent/temp/agentTmp/custom_script17351432578670884465
ls: cannot access '/opt/buildagent/temp/agentTmp/custom_script17351432578670884465': No such file or directory
This seems to be the cause of the failure. It appears this file is never passed to the agent by the server, unless it is somehow deleted before I can see it. When I investigate the dind container, I can see that the agent has been communicating with it, since the container image has been downloaded there.
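One check that should narrow this down (a sketch to run from inside the agent container, where `DOCKER_HOST` already points at the dind daemon): create a file in agentTmp locally, then see whether a container started with the same `-v` mapping TeamCity uses can see it. If it cannot, the bind mount is resolving against the daemon’s filesystem rather than the agent’s.

# On the agent: the file clearly exists locally.
touch /opt/buildagent/temp/agentTmp/probe.txt
ls -la /opt/buildagent/temp/agentTmp/probe.txt

# Through the remote daemon, using the same -v mapping TeamCity's wrapper uses.
docker run --rm \
  -v /opt/buildagent/temp/agentTmp:/opt/buildagent/temp/agentTmp \
  alpine:latest ls -la /opt/buildagent/temp/agentTmp/probe.txt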
Things I Plan to Try Next
- Set up an agent on a VM instead of in a container and see if that works. I’m guessing it will, but I don’t want that setup.
  - Tried this: the result is unexpectedly the same as when running in the dind container, so the problem is not the dind container itself.
- Figure out what this missing custom_script is supposed to be and why it isn’t being passed.
- Hope someone here has some insights.
Why even bother?
I think there is a lot of value in being able to run docker builds inside a more isolated sysbox container instead of having to spin up additional VMs or give the agents free rein over the host’s docker socket. I like being able to control the resource allocation so a bad build doesn’t bring down the other applications hosted on my docker server. It seems like a much better setup, but it has been resisting my attempts to make it work. This TeamCity instance will not see high traffic, or I would consider placing my build agents in their own VMs.