Tarantool development patches archive
* [Tarantool-patches] [PATCH v6] gitlab-ci: implement packing into MCS S3
@ 2020-01-27  5:13 Alexander V. Tikhonov
  2020-01-28 13:18 ` [Tarantool-patches] [PATCH v7] " Igor Munkin
  2020-01-30 15:49 ` Alexander Turenko
  0 siblings, 2 replies; 6+ messages in thread
From: Alexander V. Tikhonov @ 2020-01-27  5:13 UTC
  To: Igor Munkin, Alexander Turenko; +Cc: tarantool-patches

The changes introduce new Gitlab-CI rules for building packages on
branches with the "-full-ci" suffix and for their subsequent deployment
to the 'live' repository for master and release branches. Packages
built for tagged commits are also delivered to the corresponding
'release' repository.

The PackageCloud storage is replaced with a new self-hosted one (based
on S3 object storage) to which all old packages have been synced. New
builds will be pushed only to the S3-based repos. The benefits of the
introduced approach are the following:
* Since all contents of the self-hosted repos are fully controlled,
their layout is the same as the one provided by the corresponding
distro
* Rebuilding the repo metadata from scratch is unnecessary because the
repo layout is known
* Old packages are not pruned since they do not affect the metadata
rebuild time

For this purpose, a standalone script that pushes DEB and RPM packages
to the self-hosted repositories is introduced. The script implements
the following flow:
* creates new metafiles for the new packages
* copies new packages to S3 storage
* fetches relevant metafiles from the repo
* merges the new metadata with the fetched one
* pushes the updated metadata to S3 storage
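
In short, the flow can be sketched as follows (an illustrative
pseudo-shell outline with placeholder names, not the exact code from
tools/update_repo.sh):

    createrepo .                                      # create new metafiles
    aws s3 cp --acl public-read <package> <bucket>    # copy new packages
    aws s3 cp <bucket>/<metafile> <metafile>.saved    # fetch old metadata
    cat <metafile> >><metafile>.saved                 # merge new with old
    aws s3 sync --acl public-read <metadir> <bucket>  # push merged metadata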

There are distro-dependent parts in the script:
* For RPM packages it updates the metadata separately for each repo,
following the behaviour of the 'createrepo' util
* For DEB packages it updates the metadata for all repos at once,
following the behaviour of the 'reprepro' util

Closes #3380

@TarantoolBot
Title: Update download instructions on the website

The download instructions on the website need to be updated, since the
packages are now served from the new MCS S3 based repository.
---

Github: https://github.com/tarantool/tarantool/tree/avtikhon/gh-3380-push-packages-s3-full-ci
Issue: https://github.com/tarantool/tarantool/issues/3380

v6: https://lists.tarantool.org/pipermail/tarantool-patches/2020-January/013763.html
v5: https://lists.tarantool.org/pipermail/tarantool-patches/2020-January/013636.html
v4: https://lists.tarantool.org/pipermail/tarantool-patches/2020-January/013568.html
v3: https://lists.tarantool.org/pipermail/tarantool-patches/2019-December/013060.html
v2: https://lists.tarantool.org/pipermail/tarantool-patches/2019-November/012352.html
v1: https://lists.tarantool.org/pipermail/tarantool-patches/2019-October/012021.html
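
For a local run the script can also be invoked directly; an
illustrative example (the bucket and the path to the built packages
are hypothetical):

    ./tools/update_repo.sh -o=ubuntu -d=bionic \
        -b=s3://tarantool_repo/live/2x ./build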

Changes v7:
- removed the additional functionality for working with DEB repositories
  using the complete pool path w/o specifying packages
- implemented the new '-f|--force' flag that allows overwriting packages
  at MCS S3 when their checksum has changed, and added the corresponding
  check for new packages
- implemented a check that prints a warning for new RPM packages whose
  checksum is already registered

Changes v6:
- implemented 2 MCS S3 repositories: 'live' and 'release'
- added AWS and GPG keys to Gitlab-CI
- corrected the commit message
- corrected the return code handling in the script
- moved all source tarball changes to a standalone patch set

Changes v5:
- code style
- commits squashed
- rebased to master

Changes v4:
- minor corrections

Changes v3:
- common code parts merged to standalone routines
- corrected code style, minor updates
- script is ready for release

Changes v2:
- brought the script from draft to pre-release state

 .gitlab-ci.yml       | 152 ++++++++++--
 .gitlab.mk           |  30 ++-
 tools/update_repo.sh | 576 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 738 insertions(+), 20 deletions(-)
 create mode 100755 tools/update_repo.sh

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 3af5a3c8a..c68594c1a 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -10,6 +10,10 @@ variables:
   only:
     refs:
       - master
+
+.fullci_only_template: &fullci_only_definition
+  only:
+    refs:
       - /^.*-full-ci$/
 
 .docker_test_template: &docker_test_definition
@@ -24,13 +28,29 @@ variables:
   tags:
     - docker_test
 
+.pack_template: &pack_definition
+  <<: *fullci_only_definition
+  stage: test
+  tags:
+    - deploy
+  script:
+    - ${GITLAB_MAKE} package
+
+.pack_test_template: &pack_test_definition
+  <<: *fullci_only_definition
+  stage: test
+  tags:
+    - deploy_test
+  script:
+    - ${GITLAB_MAKE} package
+
 .deploy_template: &deploy_definition
   <<: *release_only_definition
   stage: test
   tags:
     - deploy
   script:
-    - ${GITLAB_MAKE} package
+    - ${GITLAB_MAKE} deploy
 
 .deploy_test_template: &deploy_test_definition
   <<: *release_only_definition
@@ -38,7 +58,7 @@ variables:
   tags:
     - deploy_test
   script:
-    - ${GITLAB_MAKE} package
+    - ${GITLAB_MAKE} deploy
 
 .vbox_template: &vbox_definition
   stage: test
@@ -141,96 +161,194 @@ freebsd_12_release:
 # Packs
 
 centos_6:
-  <<: *deploy_definition
+  <<: *pack_definition
   variables:
     OS: 'el'
     DIST: '6'
 
 centos_7:
-  <<: *deploy_test_definition
+  <<: *pack_test_definition
   variables:
     OS: 'el'
     DIST: '7'
 
 centos_8:
-  <<: *deploy_test_definition
+  <<: *pack_test_definition
   variables:
     OS: 'el'
     DIST: '8'
 
 fedora_28:
-  <<: *deploy_test_definition
+  <<: *pack_test_definition
   variables:
     OS: 'fedora'
     DIST: '28'
 
 fedora_29:
-  <<: *deploy_test_definition
+  <<: *pack_test_definition
   variables:
     OS: 'fedora'
     DIST: '29'
 
 fedora_30:
-  <<: *deploy_test_definition
+  <<: *pack_test_definition
   variables:
     OS: 'fedora'
     DIST: '30'
 
 fedora_31:
-  <<: *deploy_test_definition
+  <<: *pack_test_definition
   variables:
     OS: 'fedora'
     DIST: '31'
 
 ubuntu_14_04:
-  <<: *deploy_definition
+  <<: *pack_definition
   variables:
     OS: 'ubuntu'
     DIST: 'trusty'
 
 ubuntu_16_04:
-  <<: *deploy_definition
+  <<: *pack_definition
   variables:
     OS: 'ubuntu'
     DIST: 'xenial'
 
 ubuntu_18_04:
-  <<: *deploy_definition
+  <<: *pack_definition
   variables:
     OS: 'ubuntu'
     DIST: 'bionic'
 
 ubuntu_18_10:
-  <<: *deploy_definition
+  <<: *pack_definition
   variables:
     OS: 'ubuntu'
     DIST: 'cosmic'
 
 ubuntu_19_04:
-  <<: *deploy_definition
+  <<: *pack_definition
   variables:
     OS: 'ubuntu'
     DIST: 'disco'
 
 ubuntu_19_10:
-  <<: *deploy_definition
+  <<: *pack_definition
   variables:
     OS: 'ubuntu'
     DIST: 'eoan'
 
 debian_8:
-  <<: *deploy_definition
+  <<: *pack_definition
   variables:
     OS: 'debian'
     DIST: 'jessie'
 
 debian_9:
-  <<: *deploy_definition
+  <<: *pack_definition
   variables:
     OS: 'debian'
     DIST: 'stretch'
 
 debian_10:
+  <<: *pack_definition
+  variables:
+    OS: 'debian'
+    DIST: 'buster'
+
+# Deploy
+
+centos_6_deploy:
+  <<: *deploy_definition
+  variables:
+    OS: 'el'
+    DIST: '6'
+
+centos_7_deploy:
+  <<: *deploy_test_definition
+  variables:
+    OS: 'el'
+    DIST: '7'
+
+centos_8_deploy:
+  <<: *deploy_test_definition
+  variables:
+    OS: 'el'
+    DIST: '8'
+
+fedora_28_deploy:
+  <<: *deploy_test_definition
+  variables:
+    OS: 'fedora'
+    DIST: '28'
+
+fedora_29_deploy:
+  <<: *deploy_test_definition
+  variables:
+    OS: 'fedora'
+    DIST: '29'
+
+fedora_30_deploy:
+  <<: *deploy_test_definition
+  variables:
+    OS: 'fedora'
+    DIST: '30'
+
+fedora_31_deploy:
+  <<: *deploy_test_definition
+  variables:
+    OS: 'fedora'
+    DIST: '31'
+
+ubuntu_14_04_deploy:
+  <<: *deploy_definition
+  variables:
+    OS: 'ubuntu'
+    DIST: 'trusty'
+
+ubuntu_16_04_deploy:
+  <<: *deploy_definition
+  variables:
+    OS: 'ubuntu'
+    DIST: 'xenial'
+
+ubuntu_18_04_deploy:
+  <<: *deploy_definition
+  variables:
+    OS: 'ubuntu'
+    DIST: 'bionic'
+
+ubuntu_18_10_deploy:
+  <<: *deploy_definition
+  variables:
+    OS: 'ubuntu'
+    DIST: 'cosmic'
+
+ubuntu_19_04_deploy:
+  <<: *deploy_definition
+  variables:
+    OS: 'ubuntu'
+    DIST: 'disco'
+
+ubuntu_19_10_deploy:
+  <<: *deploy_definition
+  variables:
+    OS: 'ubuntu'
+    DIST: 'eoan'
+
+debian_8_deploy:
+  <<: *deploy_definition
+  variables:
+    OS: 'debian'
+    DIST: 'jessie'
+
+debian_9_deploy:
+  <<: *deploy_definition
+  variables:
+    OS: 'debian'
+    DIST: 'stretch'
+
+debian_10_deploy:
   <<: *deploy_definition
   variables:
     OS: 'debian'
diff --git a/.gitlab.mk b/.gitlab.mk
index 48a92e518..243f83f2c 100644
--- a/.gitlab.mk
+++ b/.gitlab.mk
@@ -98,14 +98,38 @@ vms_test_%:
 vms_shutdown:
 	VBoxManage controlvm ${VMS_NAME} poweroff
 
-# ########################
-# Build RPM / Deb packages
-# ########################
+# ###########################
+# Sources tarballs & packages
+# ###########################
+
+# Push alpha and beta versions to <major>x bucket (say, 2x),
+# stable to <major>.<minor> bucket (say, 2.2).
+GIT_DESCRIBE=$(shell git describe HEAD)
+MAJOR_VERSION=$(word 1,$(subst ., ,$(GIT_DESCRIBE)))
+MINOR_VERSION=$(word 2,$(subst ., ,$(GIT_DESCRIBE)))
+BUCKET=$(MAJOR_VERSION)_$(MINOR_VERSION)
+ifeq ($(MINOR_VERSION),0)
+BUCKET=$(MAJOR_VERSION)x
+endif
+ifeq ($(MINOR_VERSION),1)
+BUCKET=$(MAJOR_VERSION)x
+endif
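+# Example (illustrative): 'git describe' printing '2.2.1-404-g1234567'
+# gives BUCKET=2_2, while '2.1.3-...' (a beta series) gives BUCKET=2x.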
 
 package: git_submodule_update
 	git clone https://github.com/packpack/packpack.git packpack
 	PACKPACK_EXTRA_DOCKER_RUN_PARAMS='--network=host' ./packpack/packpack
 
+deploy: package
+	for key in ${GPG_SECRET_KEY} ${GPG_PUBLIC_KEY} ; do \
+		echo $${key} | base64 -d | gpg --batch --import || true ; done
+	./tools/update_repo.sh -o=${OS} -d=${DIST} \
+		-b="s3://tarantool_repo/live/${BUCKET}" build
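+	# additionally deploy to the 'release' repo when HEAD is a tagged
+	# commit, i.e. 'git describe --long' of HEAD matches one of the tags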
+	for tag in $$(git tag) ; do \
+			git describe --long $${tag} ; \
+		done | grep "^$$(git describe --long)$$" >/dev/null && \
+		./tools/update_repo.sh -o=${OS} -d=${DIST} \
+			-b="s3://tarantool_repo/release/${BUCKET}" build
+
 # ############
 # Static build
 # ############
diff --git a/tools/update_repo.sh b/tools/update_repo.sh
new file mode 100755
index 000000000..60c66ac4f
--- /dev/null
+++ b/tools/update_repo.sh
@@ -0,0 +1,576 @@
+#!/bin/bash
+set -e
+
+rm_file='rm -f'
+rm_dir='rm -rf'
+mk_dir='mkdir -p'
+ws_prefix=/tmp/tarantool_repo_s3
+
+alloss='ubuntu debian el fedora'
+product=tarantool
+force=
+# the path with package binaries or the repository
+repo=.
+
+# AWS defines
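+# hb.bizmrg.com is the MCS (Mail.ru Cloud Solutions) S3 endpoint used
+# by default; it can be overridden via the AWS_S3_ENDPOINT_URL variable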
+aws="aws --endpoint-url ${AWS_S3_ENDPOINT_URL:-https://hb.bizmrg.com} s3"
+aws_cp_public="$aws cp --acl public-read"
+aws_sync_public="$aws sync --acl public-read"
+
+function get_os_dists {
+    os=$1
+    alldists=
+
+    if [ "$os" == "ubuntu" ]; then
+        alldists='trusty xenial bionic cosmic disco eoan'
+    elif [ "$os" == "debian" ]; then
+        alldists='jessie stretch buster'
+    elif [ "$os" == "el" ]; then
+        alldists='6 7 8'
+    elif [ "$os" == "fedora" ]; then
+        alldists='27 28 29 30 31'
+    fi
+
+    echo "$alldists"
+}
+
+function prepare_ws {
+    # temporarily lock publishing to the repository
+    ws_suffix=$1
+    ws=${ws_prefix}_${ws_suffix}
+    ws_lockfile=${ws}.lock
+    if [ -f $ws_lockfile ]; then
+        old_proc=$(cat $ws_lockfile)
+    fi
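+    # 'lockfile' is provided by the procmail package; '-l 60' treats a
+    # lock older than 60 seconds as stale and takes it over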
+    lockfile -l 60 $ws_lockfile
+    chmod u+w $ws_lockfile && echo $$ >$ws_lockfile && chmod u-w $ws_lockfile
+    if [ "$old_proc" != ""  -a "$old_proc" != "0" ]; then
+        kill -9 $old_proc >/dev/null || true
+    fi
+
+    # create temporary workspace with repository copy
+    $rm_dir $ws
+    $mk_dir $ws
+}
+
+function usage {
+    cat <<EOF
+Usage for storing package binaries from the given path:
+    $0 -o=<OS name> -d=<OS distribution> -b=<S3 bucket> [-p=<product>] <path to package binaries>
+
+Arguments:
+    <path>
+         Path to the directory with the DEB/RPM packages to be used.
+
+Options:
+    -b|--bucket
+        Existing MCS S3 bucket that will be used for storing the packages
+    -o|--os
+        OS to process, one of the following list:
+            $alloss
+    -d|--distribution
+        Distribution appropriate to the given OS:
+EOF
+    for os in $alloss ; do
+        echo "            $os: <"$(get_os_dists $os)">"
+    done
+    cat <<EOF
+    -p|--product
+         Product name to pack, the default name is 'tarantool'
+    -f|--force
+         Force overwriting the remote package with the local one even if their checksums differ
+    -h|--help
+         Usage help message
+EOF
+}
+
+for i in "$@"
+do
+case $i in
+    -b=*|--bucket=*)
+    bucket="${i#*=}"
+    shift # past argument=value
+    ;;
+    -o=*|--os=*)
+    os="${i#*=}"
+    if ! echo $alloss | grep -F -q -w $os ; then
+        echo "ERROR: OS '$os' is not supported"
+        usage
+        exit 1
+    fi
+    shift # past argument=value
+    ;;
+    -d=*|--distribution=*)
+    option_dist="${i#*=}"
+    shift # past argument=value
+    ;;
+    -p=*|--product=*)
+    product="${i#*=}"
+    shift # past argument=value
+    ;;
+    -f|--force)
+    force=1
+    ;;
+    -h|--help)
+    usage
+    exit 0
+    ;;
+    *)
+    repo="${i#*=}"
+    pushd $repo >/dev/null ; repo=$PWD ; popd >/dev/null
+    shift # past argument=value
+    ;;
+esac
+done
+
+# check that all needed options were set and correct
+if [ "$bucket" == "" ]; then
+    echo "ERROR: need to set -b|--bucket bucket option, check usage"
+    usage
+    exit 1
+fi
+if ! $aws ls $bucket >/dev/null ; then
+    echo "ERROR: bucket '$bucket' is not found"
+    usage
+    exit 1
+fi
+if [ "$option_dist" == "" ]; then
+    echo "ERROR: need to set -d|--option_dist OS distribuition name option, check usage"
+    usage
+    exit 1
+fi
+if [ "$os" == "" ]; then
+    echo "ERROR: need to set -o|--os OS name option, check usage"
+    usage
+    exit 1
+fi
+alldists=$(get_os_dists $os)
+if [ -n "$option_dist" ] && ! echo $alldists | grep -F -q -w $option_dist ; then
+    echo "ERROR: set distribution at options '$option_dist' not found at supported list '$alldists'"
+    usage
+    exit 1
+fi
+
+# set the subpath with binaries based on the first letter of the product name
+proddir=$(echo $product | head -c 1)
+
+# set bucket path of the given OS in options
+bucket_path="$bucket/$os"
+
+function update_deb_packfile {
+    packfile=$1
+    packtype=$2
+    update_dist=$3
+
+    locpackfile=$(echo $packfile | sed "s#^$ws/##g")
+    # register DEB/DSC pack file to Packages/Sources file
+    reprepro -Vb . include$packtype $update_dist $packfile
+    # remove the 'pool/$component' copy of the DEB/DSC file created by
+    # reprepro, it is not needed
+    $rm_dir $debdir/$component
+    # remove reprepro's own database so the DEB/DSC files are not
+    # tracked in its registry
+    $rm_dir db
+}
+
+function update_deb_metadata {
+    packpath=$1
+    packtype=$2
+
+    if [ ! -f $packpath.saved ] ; then
+        # get the latest metadata file from S3 or create an empty one
+        $aws ls "$bucket_path/$packpath" >/dev/null 2>&1 && \
+            $aws cp "$bucket_path/$packpath" $packpath.saved || \
+            touch $packpath.saved
+    fi
+
+    if [ "$packtype" == "dsc" ]; then
+        # WORKAROUND: for an unknown reason reprepro does not save the
+        # Sources file, so recreate it manually from its gzipped version
+        gunzip -c $packpath.gz >$packpath
+        # check if the DSC hash already exists in old Sources file from S3
+        # find the hash from the new Sources file
+        hash=$(grep '^Checksums-Sha256:' -A3 $packpath | \
+            tail -n 1 | awk '{print $1}')
+        # search the new hash in the old Sources file from S3
+        if grep " $hash .* .*$" $packpath.saved ; then
+            echo "WARNING: DSC file already registered in S3!"
+            return
+        fi
+        # check if the DSC file already exists in old Sources file from S3
+        file=$(grep '^Files:' -A3 $packpath | tail -n 1 | awk '{print $3}')
+        if [ "$force" == "" ] && grep " .* .* $file$" $packpath.saved ; then
+            echo "ERROR: the file already exists, but changed, set '-f' to overwrite it: $file"
+            echo "New hash: $hash"
+            # unlock the publishing
+            $rm_file $ws_lockfile
+            exit 1
+        fi
+        updated_dsc=1
+    elif [ "$packtype" == "deb" ]; then
+        # check if the DEB file already exists in the old Packages file
+        # from S3: find the hash in the new Packages file
+        hash=$(grep '^SHA256: ' $packpath | awk '{print $2}')
+        # search the new hash in the old Packages file from S3
+        if grep "^SHA256: $hash" $packpath.saved ; then
+            echo "WARNING: DEB file already registered in S3!"
+            return
+        fi
+        # check if the DEB file already exists in old Packages file from S3
+        file=$(grep '^Filename:' $packpath | awk '{print $2}')
+        if [ "$force" == "" ] && grep "Filename: $file$" $packpath.saved ; then
+            echo "ERROR: the file already exists, but changed, set '-f' to overwrite it: $file"
+            echo "New hash: $hash"
+            # unlock the publishing
+            $rm_file $ws_lockfile
+            exit 1
+        fi
+        updated_deb=1
+    fi
+    # store the new metadata entry
+    cat $packpath >>$packpath.saved
+}
+
+# The 'pack_deb' function handles DEB packages for Debian-based OSes
+# like Ubuntu and Debian. It is based on the well-known 'reprepro'
+# tool:
+#     https://wiki.debian.org/DebianRepository/SetupWithReprepro
+# The tool works with all distributions of the given OS at once.
+# The routine produces an APT repository with a file structure equal
+# to the Debian/Ubuntu one:
+#     http://ftp.am.debian.org/debian/pool/main/t/tarantool/
+#     http://ftp.am.debian.org/ubuntu/pool/main/t/
+function pack_deb {
+    # push packages into the 'main' component only
+    component=main
+
+    # Debian uses the special 'pool' directory for packages
+    debdir=pool
+
+    # check that packages exist at the given location
+    if ! ls $repo/*.deb $repo/*.dsc $repo/*.tar.*z >/dev/null ; then
+        echo "ERROR: files $repo/*.deb $repo/*.dsc $repo/*.tar.*z not found"
+        usage
+        exit 1
+    fi
+
+    # prepare the workspace
+    prepare_ws ${os}
+
+    # copy the binary packages of the single given distribution
+    repopath=$ws/pool/${option_dist}/$component/$proddir/$product
+    $mk_dir ${repopath}
+    cp $repo/*.deb $repo/*.dsc $repo/*.tar.*z $repopath/.
+    pushd $ws
+
+    # create the configuration file for 'reprepro' tool
+    confpath=$ws/conf
+    $rm_dir $confpath
+    $mk_dir $confpath
+
+    for loop_dist in $alldists ; do
+        cat <<EOF >>$confpath/distributions
+Origin: Tarantool
+Label: tarantool.org
+Suite: stable
+Codename: $loop_dist
+Architectures: amd64 source
+Components: $component
+Description: Tarantool DBMS and Tarantool modules
+SignWith: 91B625E5
+DebIndices: Packages Release . .gz .bz2
+UDebIndices: Packages . .gz .bz2
+DscIndices: Sources Release .gz .bz2
+
+EOF
+    done
+
+    # create standalone repository with separate components
+    for loop_dist in $alldists ; do
+        echo ================ DISTRIBUTION: $loop_dist ====================
+        updated_files=0
+
+        # 1(binaries). use reprepro tool to generate Packages file
+        for deb in $ws/$debdir/$loop_dist/$component/*/*/*.deb ; do
+            [ -f $deb ] || continue
+            updated_deb=0
+            # regenerate DEB pack
+            update_deb_packfile $deb deb $loop_dist
+            echo "Regenerated DEB file: $locpackfile"
+            for packages in dists/$loop_dist/$component/binary-*/Packages ; do
+                # update the 'Packages' metadata, keeping a saved copy
+                # so it is not removed by the new DEB version
+                update_deb_metadata $packages deb
+                [ "$updated_deb" == "1" ] || continue
+                updated_files=1
+            done
+            # save the registered DEB file to S3
+            if [ "$updated_deb" == 1 ]; then
+                $aws_cp_public $deb $bucket_path/$locpackfile
+            fi
+        done
+
+        # 1(sources). use reprepro tool to generate Sources file
+        for dsc in $ws/$debdir/$loop_dist/$component/*/*/*.dsc ; do
+            [ -f $dsc ] || continue
+            updated_dsc=0
+            # regenerate DSC pack
+            update_deb_packfile $dsc dsc $loop_dist
+            echo "Regenerated DSC file: $locpackfile"
+            # update the 'Sources' metadata, keeping a saved copy
+            # so it is not removed by the new DSC version
+            update_deb_metadata dists/$loop_dist/$component/source/Sources dsc
+            [ "$updated_dsc" == "1" ] || continue
+            updated_files=1
+            # save the registered DSC file to S3
+            $aws_cp_public $dsc $bucket_path/$locpackfile
+            tarxz=$(echo $locpackfile | sed 's#\.dsc$#.debian.tar.xz#g')
+            $aws_cp_public $ws/$tarxz "$bucket_path/$tarxz"
+            orig=$(echo $locpackfile | sed 's#-1\.dsc$#.orig.tar.xz#g')
+            $aws_cp_public $ws/$orig "$bucket_path/$orig"
+        done
+
+        # check if any DEB/DSC files were newly registered
+        [ "$updated_files" == "0" ] && \
+            continue || echo "Updating dists"
+
+        # finalize the Packages file
+        for packages in dists/$loop_dist/$component/binary-*/Packages ; do
+            mv $packages.saved $packages
+        done
+
+        # finalize the Sources file
+        sources=dists/$loop_dist/$component/source/Sources
+        mv $sources.saved $sources
+
+        # 2(binaries). update Packages file archives
+        for packpath in dists/$loop_dist/$component/binary-* ; do
+            pushd $packpath
+            sed "s#Filename: $debdir/$component/#Filename: $debdir/$loop_dist/$component/#g" -i Packages
+            bzip2 -c Packages >Packages.bz2
+            gzip -c Packages >Packages.gz
+            popd
+        done
+
+        # 2(sources). update Sources file archives
+        pushd dists/$loop_dist/$component/source
+        sed "s#Directory: $debdir/$component/#Directory: $debdir/$loop_dist/$component/#g" -i Sources
+        bzip2 -c Sources >Sources.bz2
+        gzip -c Sources >Sources.gz
+        popd
+
+        # 3. update the checksum entries of the Packages* files in the *Release files
+        # NOTE: the structure of the *Release files is stable, the
+        #       checksum entries always appear in the following order:
+        # MD5Sum:
+        #  <checksum> <size> <file orig>
+        #  <checksum> <size> <file debian>
+        # SHA1:
+        #  <checksum> <size> <file orig>
+        #  <checksum> <size> <file debian>
+        # SHA256:
+        #  <checksum> <size> <file orig>
+        #  <checksum> <size> <file debian>
+        #       The awk script below puts the 'md5' value at the 1st
+        #       found file entry, 'sha1' at the 2nd and 'sha256' at the 3rd
+        pushd dists/$loop_dist
+        for file in $(grep " $component/" Release | awk '{print $3}' | sort -u) ; do
+            sz=$(stat -c "%s" $file)
+            md5=$(md5sum $file | awk '{print $1}')
+            sha1=$(sha1sum $file | awk '{print $1}')
+            sha256=$(sha256sum $file | awk '{print $1}')
+            awk 'BEGIN{c = 0} ; {
+                if ($3 == p) {
+                    c = c + 1
+                    if (c == 1) {print " " md  " " s " " p}
+                    if (c == 2) {print " " sh1 " " s " " p}
+                    if (c == 3) {print " " sh2 " " s " " p}
+                } else {print $0}
+            }' p="$file" s="$sz" md="$md5" sh1="$sha1" sh2="$sha256" \
+                    Release >Release.new
+            mv Release.new Release
+        done
+        # re-sign the self-signed InRelease file
+        $rm_file InRelease
+        gpg --clearsign -o InRelease Release
+        # re-sign the Release file
+        $rm_file Release.gpg
+        gpg -abs -o Release.gpg Release
+        popd
+
+        # 4. sync the latest distribution path changes to S3
+        $aws_sync_public dists/$loop_dist "$bucket_path/dists/$loop_dist"
+    done
+
+    # unlock the publishing
+    $rm_file $ws_lockfile
+
+    popd
+}
+
+# The 'pack_rpm' function handles RPM packages for RPM-based OSes like
+# CentOS and Fedora. It is based on the well-known 'createrepo' tool:
+#     https://linux.die.net/man/8/createrepo
+# The tool works with a single distribution of the given OS at a time.
+# The routine produces a YUM repository with a file structure equal to
+# the CentOS/Fedora one:
+#     http://mirror.centos.org/centos/7/os/x86_64/Packages/
+#     http://mirrors.kernel.org/fedora/releases/30/Everything/x86_64/os/Packages/t/
+function pack_rpm {
+    if ! ls $repo/*.rpm >/dev/null ; then
+        echo "ERROR: Current '$repo' path doesn't have RPM packages in path"
+        usage
+        exit 1
+    fi
+
+    # prepare the workspace
+    prepare_ws ${os}_${option_dist}
+
+    # copy the needed package binaries to the workspace
+    cp $repo/*.rpm $ws/.
+
+    pushd $ws
+
+    # set the paths
+    if [ "$os" == "el" ]; then
+        repopath=$option_dist/os/x86_64
+        rpmpath=Packages
+    elif [ "$os" == "fedora" ]; then
+        repopath=releases/$option_dist/Everything/x86_64/os
+        rpmpath=Packages/$proddir
+    fi
+    packpath=$repopath/$rpmpath
+
+    # prepare local repository with packages
+    $mk_dir $packpath
+    mv *.rpm $packpath/.
+    cd $repopath
+
+    # copy the current metadata files from S3
+    mkdir repodata.base
+    for file in $($aws ls $bucket_path/$repopath/repodata/ | awk '{print $NF}') ; do
+        $aws ls $bucket_path/$repopath/repodata/$file || continue
+        $aws cp $bucket_path/$repopath/repodata/$file repodata.base/$file
+    done
+
+    # create the new repository metadata files
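+    # ('--simple-md-filenames' keeps the metadata file names free of
+    # checksum prefixes, so the merge code below can use fixed names)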
+    createrepo --no-database --update --workers=2 \
+        --compress-type=gz --simple-md-filenames .
+
+    updated_rpms=0
+    # loop by the new hashes from the new meta file
+    for hash in $(zcat repodata/other.xml.gz | grep "<package pkgid=" | \
+        awk -F'"' '{print $2}') ; do
+        updated_rpm=0
+        name=$(zcat repodata/other.xml.gz | grep "<package pkgid=\"$hash\"" | \
+            awk -F'"' '{print $4}')
+        # search the new hash in the old meta file from S3
+        if zcat repodata.base/filelists.xml.gz | grep "pkgid=\"$hash\"" | \
+            grep "name=\"$name\"" ; then
+            echo "WARNING: $name file already registered in S3!"
+            echo "File hash: $hash"
+            continue
+        fi
+        updated_rpms=1
+        # find the file name of the hashed package in the new metadata
+        file=$(zcat repodata/primary.xml.gz | \
+            grep -e "<checksum type=" -e "<location href=" | \
+            grep "$hash" -A1 | grep "<location href=" | \
+            awk -F'"' '{print $2}')
+        # check if the file already exists in S3
+        if [ "$force" == "" ] && zcat repodata.base/primary.xml.gz | \
+                grep "<location href=\"$file\"" ; then
+            echo "ERROR: the file already exists, but changed, set '-f' to overwrite it: $file"
+            echo "New hash: $hash"
+            # unlock the publishing
+            $rm_file $ws_lockfile
+            exit 1
+        fi
+    done
+
+    # check if any RPM files were newly registered
+    [ "$updated_rpms" == "0" ] && \
+        return || echo "Updating dists"
+
+    # move the repodata files to the standalone location
+    mv repodata repodata.adding
+
+    # merge metadata files
+    mkdir repodata
+    head -n 2 repodata.adding/repomd.xml >repodata/repomd.xml
+    for file in filelists.xml other.xml primary.xml ; do
+        # 1. take the 1st line only - to skip the line with
+        #    number of packages which is not needed
+        zcat repodata.adding/$file.gz | head -n 1 >repodata/$file
+        # 2. take 2nd line with metadata tag and update
+        #    the packages number in it
+        packsold=0
+        if [ -f repodata.base/$file.gz ] ; then
+            packsold=$(zcat repodata.base/$file.gz | head -n 2 | \
+                tail -n 1 | sed 's#.*packages="\(.*\)".*#\1#g')
+        fi
+        packsnew=$(zcat repodata.adding/$file.gz | head -n 2 | \
+            tail -n 1 | sed 's#.*packages="\(.*\)".*#\1#g')
+        packs=$(($packsold+$packsnew))
+        zcat repodata.adding/$file.gz | head -n 2 | tail -n 1 | \
+            sed "s#packages=\".*\"#packages=\"$packs\"#g" >>repodata/$file
+        # 3. take only 'package' tags from new file
+        zcat repodata.adding/$file.gz | tail -n +3 | head -n -1 \
+            >>repodata/$file
+        # 4. take only 'package' tags from old file if exists
+        if [ -f repodata.base/$file.gz ] ; then
+            zcat repodata.base/$file.gz | tail -n +3 | head -n -1 \
+                >>repodata/$file
+        fi
+        # 5. take the last closing line with metadata tag
+        zcat repodata.adding/$file.gz | tail -n 1 >>repodata/$file
+
+        # get the new data
+        chsnew=$(sha256sum repodata/$file | awk '{print $1}')
+        sz=$(stat --printf="%s" repodata/$file)
+        gzip repodata/$file
+        chsgznew=$(sha256sum repodata/$file.gz | awk '{print $1}')
+        szgz=$(stat --printf="%s" repodata/$file.gz)
+        timestamp=$(date +%s -r repodata/$file.gz)
+
+        # add info to repomd.xml file
+        name=$(echo $file | sed 's#\.xml$##g')
+        cat <<EOF >>repodata/repomd.xml
+<data type="$name">
+  <checksum type="sha256">$chsgznew</checksum>
+  <open-checksum type="sha256">$chsnew</open-checksum>
+  <location href="repodata/$file.gz"/>
+  <timestamp>$timestamp</timestamp>
+  <size>$szgz</size>
+  <open-size>$sz</open-size>
+</data>"
+EOF
+    done
+    tail -n 1 repodata.adding/repomd.xml >>repodata/repomd.xml
+    gpg --detach-sign --armor repodata/repomd.xml
+
+    # copy the packages to S3
+    for file in $rpmpath/*.rpm ; do
+        $aws_cp_public $file "$bucket_path/$repopath/$file"
+    done
+
+    # update the metadata at the S3
+    $aws_sync_public repodata "$bucket_path/$repopath/repodata"
+
+    # unlock the publishing
+    $rm_file $ws_lockfile
+
+    popd
+}
+
+if [ "$os" == "ubuntu" -o "$os" == "debian" ]; then
+    pack_deb
+elif [ "$os" == "el" -o "$os" == "fedora" ]; then
+    pack_rpm
+else
+    echo "USAGE: given OS '$os' is not supported, use any single from the list: $alloss"
+    usage
+    exit 1
+fi
-- 
2.17.1


* Re: [Tarantool-patches] [PATCH v7] gitlab-ci: implement packing into MCS S3
  2020-01-27  5:13 [Tarantool-patches] [PATCH v6] gitlab-ci: implement packing into MCS S3 Alexander V. Tikhonov
@ 2020-01-28 13:18 ` Igor Munkin
  2020-01-30 15:49 ` Alexander Turenko
  1 sibling, 0 replies; 6+ messages in thread
From: Igor Munkin @ 2020-01-28 13:18 UTC
  To: Alexander V. Tikhonov; +Cc: tarantool-patches

Sasha,

Thanks, LGTM with a minor comment.

On 27.01.20, Alexander V. Tikhonov wrote:
> The changes introduce new Gitlab-CI rules for creating packages on
> branches with "-full-ci" suffix and their subsequent deployment to the
> 'live' repository for master and release branches. Packages for tagged
> commits are also delivered to the corresponding 'release' repository.
> 
> The PackageCloud storage is replaced with the new self-hosted one
> (based on S3 object storage) where all old packages have been synced.
> The new builds will be pushed only to S3 based repos. Benefits of the
> introduced approach are the following:
> * As far as all contents of self-hosted repos are fully controlled
> theirs layout is the same as the ones provided by the corresponding
> distro
> * Repo metadata rebuild is excess considering the known repo layout
> * Old packages are not pruned since they do not affect the repo
> metadata rebuilding time
> 
> For these purposes the standalone script for pushing DEB and RPM
> packages to self-hosted repositories is introduced. The script
> implements the following flow:
> * creates new metafiles for the new packages
> * copies new packages to S3 storage
> * fetches relevant metafiles from the repo
> * merges the new metadata with the fetched one
> * pushes the updated metadata to S3 storage
> 
> There are distro dependent parts in the script:
> * For RPM packages it updates metadata separately per each repo
> considering 'createrepo' util behaviour
> * For DEB packages it updates metadata simultaniously for all repos
> considering 'reprepro' util behaviour
> 
> Closes #3380
> 
> @TarantoolBot
> Title: Update download instructions on the website
> 
> Need to update download instructions on the website, due to the new
> repository based on MCS S3.
> ---
> 
> Github: https://github.com/tarantool/tarantool/tree/avtikhon/gh-3380-push-packages-s3-full-ci
> Issue: https://github.com/tarantool/tarantool/issues/3380
> 
> v6: https://lists.tarantool.org/pipermail/tarantool-patches/2020-January/013763.html
> v5: https://lists.tarantool.org/pipermail/tarantool-patches/2020-January/013636.html
> v4: https://lists.tarantool.org/pipermail/tarantool-patches/2020-January/013568.html
> v3: https://lists.tarantool.org/pipermail/tarantool-patches/2019-December/013060.html
> v2: https://lists.tarantool.org/pipermail/tarantool-patches/2019-November/012352.html
> v1: https://lists.tarantool.org/pipermail/tarantool-patches/2019-October/012021.html
> 
> Changes v7:
> - removed additional functionality for working with DEB repositories
>   using complete pool path w/o specifing packages
> - implemented new flag '-f|--force' that helps to overwite the packages
>   at MCS S3 if it checksum changed - implemented check on the new
>   packages for it
> - implemented check with warning on the new RPM packages with the same checksum
> 
> Changes v6:
> - implemented 2 MCS S3 repositories 'live' and 'release'
> - added AWS and GPG keys into Gitlab-CI
> - corrected commit message
> - corrected return functionality code in script
> - moved all changes for sources tarballs at the standalone patch set
> 
> Changes v5:
> - code style
> - commits squashed
> - rebased to master
> 
> Changes v4:
> - minor corrections
> 
> Changes v3:
> - common code parts merged to standalone routines
> - corrected code style, minor updates
> - script is ready for release
> 
> Changes v2:
> - made changes in script from draft to pre-release stages
> 
>  .gitlab-ci.yml       | 152 ++++++++++--
>  .gitlab.mk           |  30 ++-
>  tools/update_repo.sh | 576 +++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 738 insertions(+), 20 deletions(-)
>  create mode 100755 tools/update_repo.sh
> 
> diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> index 3af5a3c8a..c68594c1a 100644
> --- a/.gitlab-ci.yml
> +++ b/.gitlab-ci.yml
> @@ -10,6 +10,10 @@ variables:
>    only:
>      refs:
>        - master
> +
> +.fullci_only_template: &fullci_only_definition
> +  only:
> +    refs:
>        - /^.*-full-ci$/
>  
>  .docker_test_template: &docker_test_definition
> @@ -24,13 +28,29 @@ variables:
>    tags:
>      - docker_test
>  
> +.pack_template: &pack_definition
> +  <<: *fullci_only_definition
> +  stage: test
> +  tags:
> +    - deploy
> +  script:
> +    - ${GITLAB_MAKE} package
> +
> +.pack_test_template: &pack_test_definition
> +  <<: *fullci_only_definition
> +  stage: test
> +  tags:
> +    - deploy_test
> +  script:
> +    - ${GITLAB_MAKE} package
> +
>  .deploy_template: &deploy_definition
>    <<: *release_only_definition
>    stage: test
>    tags:
>      - deploy
>    script:
> -    - ${GITLAB_MAKE} package
> +    - ${GITLAB_MAKE} deploy
>  
>  .deploy_test_template: &deploy_test_definition
>    <<: *release_only_definition
> @@ -38,7 +58,7 @@ variables:
>    tags:
>      - deploy_test
>    script:
> -    - ${GITLAB_MAKE} package
> +    - ${GITLAB_MAKE} deploy
>  
>  .vbox_template: &vbox_definition
>    stage: test
> @@ -141,96 +161,194 @@ freebsd_12_release:
>  # Packs
>  
>  centos_6:
> -  <<: *deploy_definition
> +  <<: *pack_definition
>    variables:
>      OS: 'el'
>      DIST: '6'
>  
>  centos_7:
> -  <<: *deploy_test_definition
> +  <<: *pack_test_definition
>    variables:
>      OS: 'el'
>      DIST: '7'
>  
>  centos_8:
> -  <<: *deploy_test_definition
> +  <<: *pack_test_definition
>    variables:
>      OS: 'el'
>      DIST: '8'
>  
>  fedora_28:
> -  <<: *deploy_test_definition
> +  <<: *pack_test_definition
>    variables:
>      OS: 'fedora'
>      DIST: '28'
>  
>  fedora_29:
> -  <<: *deploy_test_definition
> +  <<: *pack_test_definition
>    variables:
>      OS: 'fedora'
>      DIST: '29'
>  
>  fedora_30:
> -  <<: *deploy_test_definition
> +  <<: *pack_test_definition
>    variables:
>      OS: 'fedora'
>      DIST: '30'
>  
>  fedora_31:
> -  <<: *deploy_test_definition
> +  <<: *pack_test_definition
>    variables:
>      OS: 'fedora'
>      DIST: '31'
>  
>  ubuntu_14_04:
> -  <<: *deploy_definition
> +  <<: *pack_definition
>    variables:
>      OS: 'ubuntu'
>      DIST: 'trusty'
>  
>  ubuntu_16_04:
> -  <<: *deploy_definition
> +  <<: *pack_definition
>    variables:
>      OS: 'ubuntu'
>      DIST: 'xenial'
>  
>  ubuntu_18_04:
> -  <<: *deploy_definition
> +  <<: *pack_definition
>    variables:
>      OS: 'ubuntu'
>      DIST: 'bionic'
>  
>  ubuntu_18_10:
> -  <<: *deploy_definition
> +  <<: *pack_definition
>    variables:
>      OS: 'ubuntu'
>      DIST: 'cosmic'
>  
>  ubuntu_19_04:
> -  <<: *deploy_definition
> +  <<: *pack_definition
>    variables:
>      OS: 'ubuntu'
>      DIST: 'disco'
>  
>  ubuntu_19_10:
> -  <<: *deploy_definition
> +  <<: *pack_definition
>    variables:
>      OS: 'ubuntu'
>      DIST: 'eoan'
>  
>  debian_8:
> -  <<: *deploy_definition
> +  <<: *pack_definition
>    variables:
>      OS: 'debian'
>      DIST: 'jessie'
>  
>  debian_9:
> -  <<: *deploy_definition
> +  <<: *pack_definition
>    variables:
>      OS: 'debian'
>      DIST: 'stretch'
>  
>  debian_10:
> +  <<: *pack_definition
> +  variables:
> +    OS: 'debian'
> +    DIST: 'buster'
> +
> +# Deploy
> +
> +centos_6_deploy:
> +  <<: *deploy_definition
> +  variables:
> +    OS: 'el'
> +    DIST: '6'
> +
> +centos_7_deploy:
> +  <<: *deploy_test_definition
> +  variables:
> +    OS: 'el'
> +    DIST: '7'
> +
> +centos_8_deploy:
> +  <<: *deploy_test_definition
> +  variables:
> +    OS: 'el'
> +    DIST: '8'
> +
> +fedora_28_deploy:
> +  <<: *deploy_test_definition
> +  variables:
> +    OS: 'fedora'
> +    DIST: '28'
> +
> +fedora_29_deploy:
> +  <<: *deploy_test_definition
> +  variables:
> +    OS: 'fedora'
> +    DIST: '29'
> +
> +fedora_30_deploy:
> +  <<: *deploy_test_definition
> +  variables:
> +    OS: 'fedora'
> +    DIST: '30'
> +
> +fedora_31_deploy:
> +  <<: *deploy_test_definition
> +  variables:
> +    OS: 'fedora'
> +    DIST: '31'
> +
> +ubuntu_14_04_deploy:
> +  <<: *deploy_definition
> +  variables:
> +    OS: 'ubuntu'
> +    DIST: 'trusty'
> +
> +ubuntu_16_04_deploy:
> +  <<: *deploy_definition
> +  variables:
> +    OS: 'ubuntu'
> +    DIST: 'xenial'
> +
> +ubuntu_18_04_deploy:
> +  <<: *deploy_definition
> +  variables:
> +    OS: 'ubuntu'
> +    DIST: 'bionic'
> +
> +ubuntu_18_10_deploy:
> +  <<: *deploy_definition
> +  variables:
> +    OS: 'ubuntu'
> +    DIST: 'cosmic'
> +
> +ubuntu_19_04_deploy:
> +  <<: *deploy_definition
> +  variables:
> +    OS: 'ubuntu'
> +    DIST: 'disco'
> +
> +ubuntu_19_10_deploy:
> +  <<: *deploy_definition
> +  variables:
> +    OS: 'ubuntu'
> +    DIST: 'eoan'
> +
> +debian_8_deploy:
> +  <<: *deploy_definition
> +  variables:
> +    OS: 'debian'
> +    DIST: 'jessie'
> +
> +debian_9_deploy:
> +  <<: *deploy_definition
> +  variables:
> +    OS: 'debian'
> +    DIST: 'stretch'
> +
> +debian_10_deploy:
>    <<: *deploy_definition
>    variables:
>      OS: 'debian'
> diff --git a/.gitlab.mk b/.gitlab.mk
> index 48a92e518..243f83f2c 100644
> --- a/.gitlab.mk
> +++ b/.gitlab.mk
> @@ -98,14 +98,38 @@ vms_test_%:
>  vms_shutdown:
>  	VBoxManage controlvm ${VMS_NAME} poweroff
>  
> -# ########################
> -# Build RPM / Deb packages
> -# ########################
> +# ###########################
> +# Sources tarballs & packages
> +# ###########################
> +
> +# Push alpha and beta versions to <major>x bucket (say, 2x),
> +# stable to <major>.<minor> bucket (say, 2.2).
> +GIT_DESCRIBE=$(shell git describe HEAD)
> +MAJOR_VERSION=$(word 1,$(subst ., ,$(GIT_DESCRIBE)))
> +MINOR_VERSION=$(word 2,$(subst ., ,$(GIT_DESCRIBE)))
> +BUCKET=$(MAJOR_VERSION)_$(MINOR_VERSION)
> +ifeq ($(MINOR_VERSION),0)
> +BUCKET=$(MAJOR_VERSION)x
> +endif
> +ifeq ($(MINOR_VERSION),1)
> +BUCKET=$(MAJOR_VERSION)x
> +endif
>  
>  package: git_submodule_update
>  	git clone https://github.com/packpack/packpack.git packpack
>  	PACKPACK_EXTRA_DOCKER_RUN_PARAMS='--network=host' ./packpack/packpack
>  
> +deploy: package
> +	for key in ${GPG_SECRET_KEY} ${GPG_PUBLIC_KEY} ; do \
> +		echo $${key} | base64 -d | gpg --batch --import || true ; done
> +	./tools/update_repo.sh -o=${OS} -d=${DIST} \
> +		-b="s3://tarantool_repo/live/${BUCKET}" build
> +	for tag in $$(git tag) ; do \
> +			git describe --long $${tag} ; \
> +		done | grep "^$$(git describe --long)$$" >/dev/null && \
> +		./tools/update_repo.sh -o=${OS} -d=${DIST} \
> +			-b="s3://tarantool_repo/release/${BUCKET}" build
> +
>  # ############
>  # Static build
>  # ############
> diff --git a/tools/update_repo.sh b/tools/update_repo.sh
> new file mode 100755
> index 000000000..60c66ac4f
> --- /dev/null
> +++ b/tools/update_repo.sh
> @@ -0,0 +1,576 @@
> +#!/bin/bash
> +set -e
> +
> +rm_file='rm -f'
> +rm_dir='rm -rf'
> +mk_dir='mkdir -p'
> +ws_prefix=/tmp/tarantool_repo_s3
> +
> +alloss='ubuntu debian el fedora'
> +product=tarantool
> +force=
> +# the path with binaries either repository
> +repo=.
> +
> +# AWS defines
> +aws="aws --endpoint-url ${AWS_S3_ENDPOINT_URL:-https://hb.bizmrg.com} s3"
> +aws_cp_public="$aws cp --acl public-read"
> +aws_sync_public="$aws sync --acl public-read"
> +
> +function get_os_dists {
> +    os=$1
> +    alldists=
> +
> +    if [ "$os" == "ubuntu" ]; then
> +        alldists='trusty xenial bionic cosmic disco eoan'
> +    elif [ "$os" == "debian" ]; then
> +        alldists='jessie stretch buster'
> +    elif [ "$os" == "el" ]; then
> +        alldists='6 7 8'
> +    elif [ "$os" == "fedora" ]; then
> +        alldists='27 28 29 30 31'
> +    fi
> +
> +    echo "$alldists"
> +}
> +
> +function prepare_ws {
> +    # temporary lock the publication to the repository
> +    ws_suffix=$1
> +    ws=${ws_prefix}_${ws_suffix}
> +    ws_lockfile=${ws}.lock
> +    if [ -f $ws_lockfile ]; then
> +        old_proc=$(cat $ws_lockfile)
> +    fi
> +    lockfile -l 60 $ws_lockfile
> +    chmod u+w $ws_lockfile && echo $$ >$ws_lockfile && chmod u-w $ws_lockfile
> +    if [ "$old_proc" != ""  -a "$old_proc" != "0" ]; then
> +        kill -9 $old_proc >/dev/null || true
> +    fi
> +
> +    # create temporary workspace with repository copy
> +    $rm_dir $ws
> +    $mk_dir $ws
> +}
> +
> +function usage {
> +    cat <<EOF
> +Usage for store package binaries from the given path:
> +    $0 -o=<OS name> -d=<OS distribuition> -b=<S3 bucket> [-p=<product>] <path to package binaries>
> +
> +Usage for mirroring Debian|Ubuntu OS repositories:
> +    $0 -o=<OS name> -d=<OS distribuition> -b=<S3 bucket> [-p=<product>] <path to packages binaries>
> +
> +Arguments:
> +    <path>
> +         Path points to the directory with deb/prm packages to be used.
> +
> +Options:
> +    -b|--bucket
> +        MCS S3 bucket already existing which will be used for storing the packages
> +    -o|--os
> +        OS to be checked, one of the list:
> +            $alloss
> +    -d|--distribution
> +        Distribution appropriate to the given OS:
> +EOF
> +    for os in $alloss ; do
> +        echo "            $os: <"$(get_os_dists $os)">"
> +    done
> +    cat <<EOF
> +    -p|--product
> +         Product name to be packed with, default name is 'tarantool'
> +    -f|--force
> +         Force updating the remote package with the local one despite the checksum difference
> +    -h|--help
> +         Usage help message
> +EOF
> +}
> +
> +for i in "$@"
> +do
> +case $i in
> +    -b=*|--bucket=*)
> +    bucket="${i#*=}"
> +    shift # past argument=value
> +    ;;
> +    -o=*|--os=*)
> +    os="${i#*=}"
> +    if ! echo $alloss | grep -F -q -w $os ; then
> +        echo "ERROR: OS '$os' is not supported"
> +        usage
> +        exit 1
> +    fi
> +    shift # past argument=value
> +    ;;
> +    -d=*|--distribution=*)
> +    option_dist="${i#*=}"
> +    shift # past argument=value
> +    ;;
> +    -p=*|--product=*)
> +    product="${i#*=}"
> +    shift # past argument=value
> +    ;;
> +    -f|--force)
> +    force=1
> +    ;;
> +    -h|--help)
> +    usage
> +    exit 0
> +    ;;
> +    *)
> +    repo="${i#*=}"
> +    pushd $repo >/dev/null ; repo=$PWD ; popd >/dev/null
> +    shift # past argument=value
> +    ;;
> +esac
> +done
> +
> +# check that all needed options were set and correct
> +if [ "$bucket" == "" ]; then
> +    echo "ERROR: need to set -b|--bucket bucket option, check usage"
> +    usage
> +    exit 1
> +fi
> +if ! $aws ls $bucket >/dev/null ; then
> +    echo "ERROR: bucket '$bucket' is not found"
> +    usage
> +    exit 1
> +fi
> +if [ "$option_dist" == "" ]; then
> +    echo "ERROR: need to set -d|--option_dist OS distribuition name option, check usage"
> +    usage
> +    exit 1
> +fi
> +if [ "$os" == "" ]; then
> +    echo "ERROR: need to set -o|--os OS name option, check usage"
> +    usage
> +    exit 1
> +fi
> +alldists=$(get_os_dists $os)
> +if [ -n "$option_dist" ] && ! echo $alldists | grep -F -q -w $option_dist ; then
> +    echo "ERROR: set distribution at options '$option_dist' not found at supported list '$alldists'"
> +    usage
> +    exit 1
> +fi
> +
> +# set the subpath with binaries based on literal character of the product name
> +proddir=$(echo $product | head -c 1)
> +
> +# set bucket path of the given OS in options
> +bucket_path="$bucket/$os"
> +
> +function update_deb_packfile {
> +    packfile=$1
> +    packtype=$2
> +    update_dist=$3
> +
> +    locpackfile=$(echo $packfile | sed "s#^$ws/##g")
> +    # register DEB/DSC pack file to Packages/Sources file
> +    reprepro -Vb . include$packtype $update_dist $packfile
> +    # reprepro copied DEB/DSC file to component which is not needed
> +    $rm_dir $debdir/$component
> +    # to have all sources avoid reprepro set DEB/DSC file to its own registry
> +    $rm_dir db
> +}
> +
> +function update_deb_metadata {
> +    packpath=$1
> +    packtype=$2
> +
> +    if [ ! -f $packpath.saved ] ; then
> +        # get the latest Sources file from S3 either create empty file
> +        $aws ls "$bucket_path/$packpath" >/dev/null 2>&1 && \
> +            $aws cp "$bucket_path/$packpath" $packpath.saved || \
> +            touch $packpath.saved
> +    fi
> +
> +    if [ "$packtype" == "dsc" ]; then
> +        # WORKAROUND: unknown why, but reprepro doesn`t save the Sources
> +        # file, lets recreate it manualy from it's zipped version
> +        gunzip -c $packpath.gz >$packpath
> +        # check if the DSC hash already exists in old Sources file from S3
> +        # find the hash from the new Sources file
> +        hash=$(grep '^Checksums-Sha256:' -A3 $packpath | \
> +            tail -n 1 | awk '{print $1}')
> +        # search the new hash in the old Sources file from S3
> +        if grep " $hash .* .*$" $packpath.saved ; then
> +            echo "WARNING: DSC file already registered in S3!"
> +            return
> +        fi
> +        # check if the DSC file already exists in old Sources file from S3
> +        file=$(grep '^Files:' -A3 $packpath | tail -n 1 | awk '{print $3}')
> +        if [ "$force" == "" ] && grep " .* .* $file$" $packpath.saved ; then
> +            echo "ERROR: the file already exists, but changed, set '-f' to overwrite it: $file"
> +            echo "New hash: $hash"
> +            # unlock the publishing
> +            $rm_file $ws_lockfile
> +            exit 1
> +        fi
> +        updated_dsc=1
> +    elif [ "$packtype" == "deb" ]; then
> +        # check if the DEB file already exists in old Packages file from S3
> +        # find the hash from the new Packages file
> +        hash=$(grep '^SHA256: ' $packpath)
> +        # search the new hash in the old Packages file from S3
> +        if grep "^SHA256: $hash" $packpath.saved ; then
> +            echo "WARNING: DEB file already registered in S3!"
> +            return
> +        fi
> +        # check if the DEB file already exists in old Packages file from S3
> +        file=$(grep '^Filename:' | awk '{print $2}')
> +        if [ "$force" == "" ] && grep "Filename: $file$" $packpath.saved ; then
> +            echo "ERROR: the file already exists, but changed, set '-f' to overwrite it: $file"
> +            echo "New hash: $hash"
> +            # unlock the publishing
> +            $rm_file $ws_lockfile
> +            exit 1
> +        fi
> +        updated_deb=1
> +    fi
> +    # store the new DEB entry
> +    cat $packpath >>$packpath.saved
> +}
> +
> +# The 'pack_deb' function especialy created for DEB packages. It works
> +# with DEB packing OS like Ubuntu, Debian. It is based on globaly known
> +# tool 'reprepro' from:
> +#     https://wiki.debian.org/DebianRepository/SetupWithReprepro
> +# This tool works with complete number of distributions of the given OS.
> +# Result of the routine is the debian package for APT repository with
> +# file structure equal to the Debian/Ubuntu:
> +#     http://ftp.am.debian.org/debian/pool/main/t/tarantool/
> +#     http://ftp.am.debian.org/ubuntu/pool/main/t/
> +function pack_deb {
> +    # we need to push packages into 'main' repository only
> +    component=main
> +
> +    # debian has special directory 'pool' for packages
> +    debdir=pool
> +
> +    # get packages from pointed location
> +    if ! ls $repo/*.deb $repo/*.dsc $repo/*.tar.*z >/dev/null ; then
> +        echo "ERROR: files $repo/*.deb $repo/*.dsc $repo/*.tar.*z not found"
> +        usage
> +        exit 1
> +    fi
> +
> +    # prepare the workspace
> +    prepare_ws ${os}
> +
> +    # copy single distribution with binaries packages
> +    repopath=$ws/pool/${option_dist}/$component/$proddir/$product
> +    $mk_dir ${repopath}
> +    cp $repo/*.deb $repo/*.dsc $repo/*.tar.*z $repopath/.
> +    pushd $ws
> +
> +    # create the configuration file for 'reprepro' tool
> +    confpath=$ws/conf
> +    $rm_dir $confpath
> +    $mk_dir $confpath
> +
> +    for loop_dist in $alldists ; do
> +        cat <<EOF >>$confpath/distributions
> +Origin: Tarantool
> +Label: tarantool.org
> +Suite: stable
> +Codename: $loop_dist
> +Architectures: amd64 source
> +Components: $component
> +Description: Tarantool DBMS and Tarantool modules
> +SignWith: 91B625E5
> +DebIndices: Packages Release . .gz .bz2
> +UDebIndices: Packages . .gz .bz2
> +DscIndices: Sources Release .gz .bz2
> +
> +EOF
> +    done
> +
> +    # create standalone repository with separate components
> +    for loop_dist in $alldists ; do
> +        echo ================ DISTRIBUTION: $loop_dist ====================
> +        updated_files=0
> +
> +        # 1(binaries). use reprepro tool to generate Packages file
> +        for deb in $ws/$debdir/$loop_dist/$component/*/*/*.deb ; do
> +            [ -f $deb ] || continue
> +            updated_deb=0
> +            # regenerate DEB pack
> +            update_deb_packfile $deb deb $loop_dist
> +            echo "Regenerated DEB file: $locpackfile"
> +            for packages in dists/$loop_dist/$component/binary-*/Packages ; do
> +                # copy Packages file to avoid of removing by the new DEB version
> +                # update metadata 'Packages' files
> +                update_deb_metadata $packages deb
> +                [ "$updated_deb" == "1" ] || continue
> +                updated_files=1
> +            done
> +            # save the registered DEB file to S3
> +            if [ "$updated_deb" == 1 ]; then
> +                $aws_cp_public $deb $bucket_path/$locpackfile
> +            fi
> +        done
> +
> +        # 1(sources). use reprepro tool to generate Sources file
> +        for dsc in $ws/$debdir/$loop_dist/$component/*/*/*.dsc ; do
> +            [ -f $dsc ] || continue
> +            updated_dsc=0
> +            # regenerate DSC pack
> +            update_deb_packfile $dsc dsc $loop_dist
> +            echo "Regenerated DSC file: $locpackfile"
> +            # copy Sources file to avoid of removing by the new DSC version
> +            # update metadata 'Sources' file
> +            update_deb_metadata dists/$loop_dist/$component/source/Sources dsc
> +            [ "$updated_dsc" == "1" ] || continue
> +            updated_files=1
> +            # save the registered DSC file to S3
> +            $aws_cp_public $dsc $bucket_path/$locpackfile
> +            tarxz=$(echo $locpackfile | sed 's#\.dsc$#.debian.tar.xz#g')
> +            $aws_cp_public $ws/$tarxz "$bucket_path/$tarxz"
> +            orig=$(echo $locpackfile | sed 's#-1\.dsc$#.orig.tar.xz#g')
> +            $aws_cp_public $ws/$orig "$bucket_path/$orig"
> +        done
> +
> +        # check if any DEB/DSC files were newly registered
> +        [ "$updated_files" == "0" ] && \
> +            continue || echo "Updating dists"
> +
> +        # finalize the Packages file
> +        for packages in dists/$loop_dist/$component/binary-*/Packages ; do
> +            mv $packages.saved $packages
> +        done
> +
> +        # finalize the Sources file
> +        sources=dists/$loop_dist/$component/source/Sources
> +        mv $sources.saved $sources
> +
> +        # 2(binaries). update Packages file archives
> +        for packpath in dists/$loop_dist/$component/binary-* ; do
> +            pushd $packpath
> +            sed "s#Filename: $debdir/$component/#Filename: $debdir/$loop_dist/$component/#g" -i Packages
> +            bzip2 -c Packages >Packages.bz2
> +            gzip -c Packages >Packages.gz
> +            popd
> +        done
> +
> +        # 2(sources). update Sources file archives
> +        pushd dists/$loop_dist/$component/source
> +        sed "s#Directory: $debdir/$component/#Directory: $debdir/$loop_dist/$component/#g" -i Sources
> +        bzip2 -c Sources >Sources.bz2
> +        gzip -c Sources >Sources.gz
> +        popd
> +
> +        # 3. update checksums entries of the Packages* files in *Release files
> +        # NOTE: it is stable structure of the *Release files when the checksum
> +        #       entries in it in the following way:
> +        # MD5Sum:
> +        #  <checksum> <size> <file orig>
> +        #  <checksum> <size> <file debian>
> +        # SHA1:
> +        #  <checksum> <size> <file orig>
> +        #  <checksum> <size> <file debian>
> +        # SHA256:
> +        #  <checksum> <size> <file orig>
> +        #  <checksum> <size> <file debian>
> +        #       The script bellow puts 'md5' value at the 1st found file entry,
> +        #       'sha1' - at the 2nd and 'sha256' at the 3rd
> +        pushd dists/$loop_dist
> +        for file in $(grep " $component/" Release | awk '{print $3}' | sort -u) ; do
> +            sz=$(stat -c "%s" $file)
> +            md5=$(md5sum $file | awk '{print $1}')
> +            sha1=$(sha1sum $file | awk '{print $1}')
> +            sha256=$(sha256sum $file | awk '{print $1}')
> +            awk 'BEGIN{c = 0} ; {
> +                if ($3 == p) {
> +                    c = c + 1
> +                    if (c == 1) {print " " md  " " s " " p}
> +                    if (c == 2) {print " " sh1 " " s " " p}
> +                    if (c == 3) {print " " sh2 " " s " " p}
> +                } else {print $0}
> +            }' p="$file" s="$sz" md="$md5" sh1="$sha1" sh2="$sha256" \
> +                    Release >Release.new
> +            mv Release.new Release
> +        done
> +        # resign the selfsigned InRelease file
> +        $rm_file InRelease
> +        gpg --clearsign -o InRelease Release
> +        # resign the Release file
> +        $rm_file Release.gpg
> +        gpg -abs -o Release.gpg Release
> +        popd
> +
> +        # 4. sync the latest distribution path changes to S3
> +        $aws_sync_public dists/$loop_dist "$bucket_path/dists/$loop_dist"
> +    done
> +
> +    # unlock the publishing
> +    $rm_file $ws_lockfile
> +
> +    popd
> +}
> +
> +# The 'pack_rpm' function is specially created for RPM packages. It
> +# works with RPM-based OS like CentOS and Fedora. It is based on the
> +# widely known 'createrepo' tool:
> +#     https://linux.die.net/man/8/createrepo
> +# This tool works with a single distribution of the given OS.
> +# The result of the routine is a YUM repository with a file structure
> +# equal to the CentOS/Fedora one:
> +#     http://mirror.centos.org/centos/7/os/x86_64/Packages/
> +#     http://mirrors.kernel.org/fedora/releases/30/Everything/x86_64/os/Packages/t/
> +function pack_rpm {
> +    if ! ls $repo/*.rpm >/dev/null ; then
> +        echo "ERROR: Current '$repo' path doesn't have RPM packages in path"
> +        usage
> +        exit 1
> +    fi
> +
> +    # prepare the workspace
> +    prepare_ws ${os}_${option_dist}
> +
> +    # copy the needed package binaries to the workspace
> +    cp $repo/*.rpm $ws/.
> +
> +    pushd $ws
> +
> +    # set the paths
> +    if [ "$os" == "el" ]; then
> +        repopath=$option_dist/os/x86_64
> +        rpmpath=Packages
> +    elif [ "$os" == "fedora" ]; then
> +        repopath=releases/$option_dist/Everything/x86_64/os
> +        rpmpath=Packages/$proddir
> +    fi
> +    packpath=$repopath/$rpmpath
> +
> +    # prepare local repository with packages
> +    $mk_dir $packpath
> +    mv *.rpm $packpath/.
> +    cd $repopath
> +
> +    # copy the current metadata files from S3
> +    mkdir repodata.base
> +    for file in $($aws ls $bucket_path/$repopath/repodata/ | awk '{print $NF}') ; do
> +        $aws ls $bucket_path/$repopath/repodata/$file || continue
> +        $aws cp $bucket_path/$repopath/repodata/$file repodata.base/$file
> +    done
> +
> +    # create the new repository metadata files
> +    createrepo --no-database --update --workers=2 \
> +        --compress-type=gz --simple-md-filenames .
> +
> +    updated_rpms=0
> +    # loop by the new hashes from the new meta file
> +    for hash in $(zcat repodata/other.xml.gz | grep "<package pkgid=" | \
> +        awk -F'"' '{print $2}') ; do
> +        updated_rpm=0

Minor: unused variable.

> +        name=$(zcat repodata/other.xml.gz | grep "<package pkgid=\"$hash\"" | \
> +            awk -F'"' '{print $4}')
> +        # search the new hash in the old meta file from S3
> +        if zcat repodata.base/filelists.xml.gz | grep "pkgid=\"$hash\"" | \
> +            grep "name=\"$name\"" ; then
> +            echo "WARNING: $name file already registered in S3!"
> +            echo "File hash: $hash"
> +            continue
> +        fi
> +        updated_rpms=1
> +        # check if the hashed file already exists in old meta file from S3
> +        file=$(zcat repodata/primary.xml.gz | \
> +            grep -e "<checksum type=" -e "<location href=" | \
> +            grep "$hash" -A1 | grep "<location href=" | \
> +            awk -F'"' '{print $2}')
> +        # check if the file already exists in S3
> +        if [ "$force" == "" ] && zcat repodata.base/primary.xml.gz | \
> +                grep "<location href=\"$file\"" ; then
> +            echo "ERROR: the file already exists, but changed, set '-f' to overwrite it: $file"
> +            echo "New hash: $hash"
> +            # unlock the publishing
> +            $rm_file $ws_lockfile
> +            exit 1
> +        fi
> +    done
> +
> +    # check if any RPM files were newly registered
> +    [ "$updated_rpms" == "0" ] && \
> +        return || echo "Updating dists"
> +
> +    # move the repodata files to the standalone location
> +    mv repodata repodata.adding
> +
> +    # merge metadata files
> +    mkdir repodata
> +    head -n 2 repodata.adding/repomd.xml >repodata/repomd.xml
> +    for file in filelists.xml other.xml primary.xml ; do
> +        # 1. take the 1st line only - to skip the line with
> +        #    number of packages which is not needed
> +        zcat repodata.adding/$file.gz | head -n 1 >repodata/$file
> +        # 2. take 2nd line with metadata tag and update
> +        #    the packages number in it
> +        packsold=0
> +        if [ -f repodata.base/$file.gz ] ; then
> +            packsold=$(zcat repodata.base/$file.gz | head -n 2 | \
> +                tail -n 1 | sed 's#.*packages="\(.*\)".*#\1#g')
> +        fi
> +        packsnew=$(zcat repodata.adding/$file.gz | head -n 2 | \
> +            tail -n 1 | sed 's#.*packages="\(.*\)".*#\1#g')
> +        packs=$(($packsold+$packsnew))
> +        zcat repodata.adding/$file.gz | head -n 2 | tail -n 1 | \
> +            sed "s#packages=\".*\"#packages=\"$packs\"#g" >>repodata/$file
> +        # 3. take only 'package' tags from new file
> +        zcat repodata.adding/$file.gz | tail -n +3 | head -n -1 \
> +            >>repodata/$file
> +        # 4. take only 'package' tags from old file if exists
> +        if [ -f repodata.base/$file.gz ] ; then
> +            zcat repodata.base/$file.gz | tail -n +3 | head -n -1 \
> +                >>repodata/$file
> +        fi
> +        # 5. take the last closing line with metadata tag
> +        zcat repodata.adding/$file.gz | tail -n 1 >>repodata/$file
> +
> +        # get the new data
> +        chsnew=$(sha256sum repodata/$file | awk '{print $1}')
> +        sz=$(stat --printf="%s" repodata/$file)
> +        gzip repodata/$file
> +        chsgznew=$(sha256sum repodata/$file.gz | awk '{print $1}')
> +        szgz=$(stat --printf="%s" repodata/$file.gz)
> +        timestamp=$(date +%s -r repodata/$file.gz)
> +
> +        # add info to repomd.xml file
> +        name=$(echo $file | sed 's#\.xml$##g')
> +        cat <<EOF >>repodata/repomd.xml
> +<data type="$name">
> +  <checksum type="sha256">$chsgznew</checksum>
> +  <open-checksum type="sha256">$chsnew</open-checksum>
> +  <location href="repodata/$file.gz"/>
> +  <timestamp>$timestamp</timestamp>
> +  <size>$szgz</size>
> +  <open-size>$sz</open-size>
> +</data>"
> +EOF
> +    done
> +    tail -n 1 repodata.adding/repomd.xml >>repodata/repomd.xml
> +    gpg --detach-sign --armor repodata/repomd.xml
> +
> +    # copy the packages to S3
> +    for file in $rpmpath/*.rpm ; do
> +        $aws_cp_public $file "$bucket_path/$repopath/$file"
> +    done
> +
> +    # update the metadata at the S3
> +    $aws_sync_public repodata "$bucket_path/$repopath/repodata"
> +
> +    # unlock the publishing
> +    $rm_file $ws_lockfile
> +
> +    popd
> +}
> +
> +if [ "$os" == "ubuntu" -o "$os" == "debian" ]; then
> +    pack_deb
> +elif [ "$os" == "el" -o "$os" == "fedora" ]; then
> +    pack_rpm
> +else
> +    echo "USAGE: given OS '$os' is not supported, use any single from the list: $alloss"
> +    usage
> +    exit 1
> +fi
> -- 
> 2.17.1
> 

-- 
Best regards,
IM


* Re: [Tarantool-patches] [PATCH v7] gitlab-ci: implement packing into MCS S3
  2020-01-27  5:13 [Tarantool-patches] [PATCH v6] gitlab-ci: implement packing into MCS S3 Alexander V. Tikhonov
  2020-01-28 13:18 ` [Tarantool-patches] [PATCH v7] " Igor Munkin
@ 2020-01-30 15:49 ` Alexander Turenko
  2020-01-31  4:59   ` Alexander Tikhonov
  1 sibling, 1 reply; 6+ messages in thread
From: Alexander Turenko @ 2020-01-30 15:49 UTC (permalink / raw)
  To: Alexander V. Tikhonov; +Cc: Oleg Piskunov, tarantool-patches

CCed Oleg.

Oleg, please look at the commit message: is it understandable w/o much
context? If not, please work together with Alexander on it.

I almost didn't look into the script (because I guess I won't add much
value here and because Igor already did).

I have several small comments, but I'm not against the patch as a whole.

WBR, Alexander Turenko.

> The changes introduce new Gitlab-CI rules for creating packages on
> branches with "-full-ci" suffix and their subsequent deployment to the
> 'live' repository for master and release branches. Packages for tagged
> commits are also delivered to the corresponding 'release' repository.
> 
> The PackageCloud storage is replaced with the new self-hosted one

Packagecloud rules are not removed (and this is what we planned to do at
the moment), so the word 'replaced' looks confusing here.

> (based on S3 object storage) where all old packages have been synced.
> The new builds will be pushed only to S3 based repos. Benefits of the

This is not so, we don't plan to disable packagecloud repositories right
now.

> introduced approach are the following:
> * As far as all contents of self-hosted repos are fully controlled
> theirs layout is the same as the ones provided by the corresponding
> distro
> * Repo metadata rebuild is excess considering the known repo layout
> * Old packages are not pruned since they do not affect the repo
> metadata rebuilding time
> 
> For these purposes the standalone script for pushing DEB and RPM
> packages to self-hosted repositories is introduced. The script
> implements the following flow:
> * creates new metafiles for the new packages
> * copies new packages to S3 storage
> * fetches relevant metafiles from the repo
> * merges the new metadata with the fetched one
> * pushes the updated metadata to S3 storage

Is some blocking mechanism implemented? What will go on if two CI
jobs fetch metadata, update it locally and then push it back? Does it
depend on whether those CI jobs are run on the same machine or on
different ones?

> 
> There are distro dependent parts in the script:
> * For RPM packages it updates metadata separately per each repo
> considering 'createrepo' util behaviour
> * For DEB packages it updates metadata simultaniously for all repos
> considering 'reprepro' util behaviour

What does 'repo' mean here? We have the 1.10 repo. But it contains
centos/7, debian/jessie and other repos. I don't understand this
paragraph, to be honest.

Aside from that, we always update only one repository (say, 1.10 x
centos/7) at the moment. 'Simultaneous update' looks even more
confusing.

> 
> Closes #3380
> 
> @TarantoolBot
> Title: Update download instructions on the website
> 
> Need to update download instructions on the website, due to the new
> repository based on MCS S3.

First, it is a stub, not a request to the documentation team. What
actions are requested here?

Second, since we decided to keep the current repository structure
(separate 1.10/2.1/... repos and no version in a package name), no
actions are actually required.

> ---
> 
> Github: https://github.com/tarantool/tarantool/tree/avtikhon/gh-3380-push-packages-s3-full-ci
> Issue: https://github.com/tarantool/tarantool/issues/3380
> 
> v6: https://lists.tarantool.org/pipermail/tarantool-patches/2020-January/013763.html
> v5: https://lists.tarantool.org/pipermail/tarantool-patches/2020-January/013636.html
> v4: https://lists.tarantool.org/pipermail/tarantool-patches/2020-January/013568.html
> v3: https://lists.tarantool.org/pipermail/tarantool-patches/2019-December/013060.html
> v2: https://lists.tarantool.org/pipermail/tarantool-patches/2019-November/012352.html
> v1: https://lists.tarantool.org/pipermail/tarantool-patches/2019-October/012021.html

> diff --git a/.gitlab.mk b/.gitlab.mk
> index 48a92e518..243f83f2c 100644
> --- a/.gitlab.mk
> +++ b/.gitlab.mk
> @@ -98,14 +98,38 @@ vms_test_%:
>  vms_shutdown:
>  	VBoxManage controlvm ${VMS_NAME} poweroff
>  
> -# ########################
> -# Build RPM / Deb packages
> -# ########################
> +# ###########################
> +# Sources tarballs & packages
> +# ###########################

But it is about packages, not tarballs. It seems you copied the comment
from .travis.yml, but it is not appropriate here.

> +
> +# Push alpha and beta versions to <major>x bucket (say, 2x),
> +# stable to <major>.<minor> bucket (say, 2.2).
> +GIT_DESCRIBE=$(shell git describe HEAD)
> +MAJOR_VERSION=$(word 1,$(subst ., ,$(GIT_DESCRIBE)))
> +MINOR_VERSION=$(word 2,$(subst ., ,$(GIT_DESCRIBE)))
> +BUCKET=$(MAJOR_VERSION)_$(MINOR_VERSION)
> +ifeq ($(MINOR_VERSION),0)
> +BUCKET=$(MAJOR_VERSION)x
> +endif
> +ifeq ($(MINOR_VERSION),1)
> +BUCKET=$(MAJOR_VERSION)x
> +endif

Let's push to 2.1; we'll add 2x for compatibility using redirects on
download.tarantool.org.

>  
>  package: git_submodule_update
>  	git clone https://github.com/packpack/packpack.git packpack
>  	PACKPACK_EXTRA_DOCKER_RUN_PARAMS='--network=host' ./packpack/packpack
>  
> +deploy: package
> +	for key in ${GPG_SECRET_KEY} ${GPG_PUBLIC_KEY} ; do \
> +		echo $${key} | base64 -d | gpg --batch --import || true ; done

I guess we need just the secret key for signing.

> +	./tools/update_repo.sh -o=${OS} -d=${DIST} \
> +		-b="s3://tarantool_repo/live/${BUCKET}" build

I would hide name of an S3 bucket under a CI variable. It is more about
our infrastructure and it would be good to show less in the repo.

> +	for tag in $$(git tag) ; do \
> +			git describe --long $${tag} ; \
> +		done | grep "^$$(git describe --long)$$" >/dev/null && \

Simpler way:

 | git name-rev --name-only --tags --no-undefined HEAD 2>/dev/null

See https://stackoverflow.com/a/11489642/1598057
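
For instance, the whole tag check in the quoted recipe could collapse to
something like this (an untested sketch; the update_repo.sh invocation is
the one from the diff below):

    if git name-rev --name-only --tags --no-undefined HEAD >/dev/null 2>&1 ; then \
        ./tools/update_repo.sh -o=${OS} -d=${DIST} \
            -b="s3://tarantool_repo/release/${BUCKET}" build ; \
    fi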

> +		./tools/update_repo.sh -o=${OS} -d=${DIST} \
> +			-b="s3://tarantool_repo/release/${BUCKET}" build

> +function get_os_dists {
> +    os=$1
> +    alldists=
> +
> +    if [ "$os" == "ubuntu" ]; then
> +        alldists='trusty xenial bionic cosmic disco eoan'
> +    elif [ "$os" == "debian" ]; then
> +        alldists='jessie stretch buster'
> +    elif [ "$os" == "el" ]; then
> +        alldists='6 7 8'
> +    elif [ "$os" == "fedora" ]; then
> +        alldists='27 28 29 30 31'
> +    fi

We have no Fedora 27 in CI. Should it be here? I don't mind, just found
a tiny inconsistency.

> +
> +    echo "$alldists"
> +}
> +
> +function prepare_ws {
> +    # temporary lock the publication to the repository
> +    ws_suffix=$1
> +    ws=${ws_prefix}_${ws_suffix}
> +    ws_lockfile=${ws}.lock
> +    if [ -f $ws_lockfile ]; then
> +        old_proc=$(cat $ws_lockfile)
> +    fi
> +    lockfile -l 60 $ws_lockfile
> +    chmod u+w $ws_lockfile && echo $$ >$ws_lockfile && chmod u-w $ws_lockfile
> +    if [ "$old_proc" != ""  -a "$old_proc" != "0" ]; then
> +        kill -9 $old_proc >/dev/null || true
> +    fi

So the current script can be killed by another instance of the script?
This means that the lock does not work.

> +
> +    # create temporary workspace with repository copy
> +    $rm_dir $ws
> +    $mk_dir $ws

Do I understand right: we don't create a copy of the remote repository?


* Re: [Tarantool-patches] [PATCH v7] gitlab-ci: implement packing into MCS S3
  2020-01-30 15:49 ` Alexander Turenko
@ 2020-01-31  4:59   ` Alexander Tikhonov
  2020-01-31 22:53     ` Alexander Turenko
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Tikhonov @ 2020-01-31  4:59 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: Oleg Piskunov, tarantool-patches

>Thursday, 30 January 2020, 18:49 +03:00 from Alexander Turenko <alexander.turenko@tarantool.org>:
>
>CCed Oleg.
>
>Oleg, please look at the commit message: is it understandable w/o much
>context? If not, please work together with Alexander on it.
>
>I almost didn't look into the script (because I guess I won't add much
>value here and because Igor already did).
>
>I have several small comments, but I'm not against the patch as a whole.
>
>WBR, Alexander Turenko.
Oleg, I've added the answers, please proceed with the check.
>
>> The changes introduce new Gitlab-CI rules for creating packages on
>> branches with "-full-ci" suffix and their subsequent deployment to the
>> 'live' repository for master and release branches. Packages for tagged
>> commits are also delivered to the corresponding 'release' repository.
>> 
>> The PackageCloud storage is replaced with the new self-hosted one
>
>Packagecloud rules are not removed (and this is what we planned to do at
>the moment), so the word 'replaced' looks confusing here.
Corrected to:
    Package creation activities are relocated from the PackageCloud storage
    to the new self-hosted MCS S3 one, where all old packages have been synced.
>
>> (based on S3 object storage) where all old packages have been synced.
>> The new builds will be pushed only to S3 based repos. Benefits of the
>
>This is not so, we don't plan to disable packagecloud repositories right
>now. 
Removed the comment.
>
>
>> introduced approach are the following:
>> * As far as all contents of self-hosted repos are fully controlled
>> theirs layout is the same as the ones provided by the corresponding
>> distro
>> * Repo metadata rebuild is excess considering the known repo layout
>> * Old packages are not pruned since they do not affect the repo
>> metadata rebuilding time
>> 
>> For these purposes the standalone script for pushing DEB and RPM
>> packages to self-hosted repositories is introduced. The script
>> implements the following flow:
>> * creates new metafiles for the new packages
>> * copies new packages to S3 storage
>> * fetches relevant metafiles from the repo
>> * merges the new metadata with the fetched one
>> * pushes the updated metadata to S3 storage
>
>Is some blocking mechanism implemented? What will go on if two CI
>jobs fetch metadata, update it locally and then push it back? Does it
>depend on whether those CI jobs are run on the same machine or on
>different ones?
A lock file with the owner's PID in it is used: any subsequent process checks
whether the lock file exists and, if so, gives its owner a 60-second timeout to
finish its job, after which the owner is forcibly killed to avoid hung processes.
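
In short, the described flow is roughly the following (a condensed sketch
of the prepare_ws logic quoted further below):

    if [ -f $ws_lockfile ]; then
        old_proc=$(cat $ws_lockfile)    # PID of the current lock owner
    fi
    # wait for the lock; 'lockfile -l 60' steals a stale lock after 60s
    lockfile -l 60 $ws_lockfile
    # record our own PID as the new owner (lockfile creates it read-only)
    chmod u+w $ws_lockfile && echo $$ >$ws_lockfile && chmod u-w $ws_lockfile
    # evict the previous owner if it is still hanging around
    [ -n "$old_proc" ] && kill -9 $old_proc 2>/dev/null || true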
>
>> 
>> There are distro dependent parts in the script:
>> * For RPM packages it updates metadata separately per each repo
>> considering 'createrepo' util behaviour
>> * For DEB packages it updates metadata simultaniously for all repos
>> considering 'reprepro' util behaviour
>
>What does 'repo' mean here? We have the 1.10 repo. But it contains
>centos/7, debian/jessie and other repos. I don't understand this
>paragraph, to be honest.
>
>Aside from that, we always update only one repository (say, 1.10 x
>centos/7) at the moment. 'Simultaneous update' looks even more
>confusing.
This part just says that RPM metadata is updated for each distribution
separately, because metafiles exist per distribution, while DEB
repositories have only a single set of metafiles for all distributions
within a single OS.
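
For illustration, the difference in metadata placement looks roughly like
this (an indicative sketch; the exact paths follow the layout the script
creates):

    # RPM: a standalone 'repodata' set per distribution
    el/7/os/x86_64/repodata/
    fedora/releases/30/Everything/x86_64/os/repodata/

    # DEB: one shared pool, with all per-distribution indexes kept
    # under a single dists/ tree of the OS
    ubuntu/pool/...
    ubuntu/dists/bionic/main/binary-amd64/Packages
    ubuntu/dists/xenial/main/binary-amd64/Packages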
>
>> 
>> Closes #3380
>> 
>> @TarantoolBot
>> Title: Update download instructions on the website
>> 
>> Need to update download instructions on the website, due to the new
>> repository based on MCS S3.
>
>First, it is a stub, not a request to the documentation team. What
>actions are requested here?
>
>Second, since we decided to keep the current repository structure
>(separate 1.10/2.1/... repos and no version in a package name), no
>actions are actually required.
Removed.
>
>> ---
>> 
>> Github:  https://github.com/tarantool/tarantool/tree/avtikhon/gh-3380-push-packages-s3-full-ci
>> Issue:  https://github.com/tarantool/tarantool/issues/3380
>> 
>> v6:  https://lists.tarantool.org/pipermail/tarantool-patches/2020-January/013763.html
>> v5:  https://lists.tarantool.org/pipermail/tarantool-patches/2020-January/013636.html
>> v4:  https://lists.tarantool.org/pipermail/tarantool-patches/2020-January/013568.html
>> v3:  https://lists.tarantool.org/pipermail/tarantool-patches/2019-December/013060.html
>> v2:  https://lists.tarantool.org/pipermail/tarantool-patches/2019-November/012352.html
>> v1:  https://lists.tarantool.org/pipermail/tarantool-patches/2019-October/012021.html
>
>> diff --git a/.gitlab.mk b/.gitlab.mk
>> index 48a92e518..243f83f2c 100644
>> --- a/.gitlab.mk
>> +++ b/.gitlab.mk
>> @@ -98,14 +98,38 @@ vms_test_%:
>>  vms_shutdown:
>>  	VBoxManage controlvm ${VMS_NAME} poweroff
>> 
>> -# ########################
>> -# Build RPM / Deb packages
>> -# ########################
>> +# ###########################
>> +# Sources tarballs & packages
>> +# ###########################
>
>But it is about packages, not tarballs. It seems you copied the comment
>from .travis.yml, but it is not appropriate here.
I really meant the sources, but since it is not appropriate I'm removing it.
>
>> +
>> +# Push alpha and beta versions to <major>x bucket (say, 2x),
>> +# stable to <major>.<minor> bucket (say, 2.2).
>> +GIT_DESCRIBE=$(shell git describe HEAD)
>> +MAJOR_VERSION=$(word 1,$(subst ., ,$(GIT_DESCRIBE)))
>> +MINOR_VERSION=$(word 2,$(subst ., ,$(GIT_DESCRIBE)))
>> +BUCKET=$(MAJOR_VERSION)_$(MINOR_VERSION)
>> +ifeq ($(MINOR_VERSION),0)
>> +BUCKET=$(MAJOR_VERSION)x
>> +endif
>> +ifeq ($(MINOR_VERSION),1)
>> +BUCKET=$(MAJOR_VERSION)x
>> +endif
>
>Let's push to 2.1; we'll add 2x for compatibility using redirects on
>download.tarantool.org.
Ok, removed redirections to 2x bucket.
>
>> 
>>  package: git_submodule_update
>>  	git clone  https://github.com/packpack/packpack.git packpack
>>  	PACKPACK_EXTRA_DOCKER_RUN_PARAMS='--network=host' ./packpack/packpack
>> 
>> +deploy: package
>> +	for key in ${GPG_SECRET_KEY} ${GPG_PUBLIC_KEY} ; do \
>> +		echo $${key} | base64 -d | gpg --batch --import || true ; done
>
>I guess we need just the secret key for signing.
To use the secret key for signing it has to be imported into the user's environment.
>
>> +	./tools/update_repo.sh -o=${OS} -d=${DIST} \
>> +		-b="s3://tarantool_repo/live/${BUCKET}" build
>
>I would hide name of an S3 bucket under a CI variable. It is more about
>our infrastructure and it would be good to show less in the repo.
The S3 bucket name is open and visible in the logs; there is no need to
hide it or pay special attention to it, since the name is not secret at
all, while setting up additional secret values would mean dead code or
additional manual steps.
>
>> +	for tag in $$(git tag) ; do \
>> +			git describe --long $${tag} ; \
>> +		done | grep "^$$(git describe --long)$$" >/dev/null && \
>
>Simpler way:
>
> | git name-rev --name-only --tags --no-undefined HEAD 2>/dev/null 
Ok, changed.
>
>
>See  https://stackoverflow.com/a/11489642/1598057
>
>> +		./tools/update_repo.sh -o=${OS} -d=${DIST} \
>> +			-b="s3://tarantool_repo/release/${BUCKET}" build
>
>> +function get_os_dists {
>> +    os=$1
>> +    alldists=
>> +
>> +    if [ "$os" == "ubuntu" ]; then
>> +        alldists='trusty xenial bionic cosmic disco eoan'
>> +    elif [ "$os" == "debian" ]; then
>> +        alldists='jessie stretch buster'
>> +    elif [ "$os" == "el" ]; then
>> +        alldists='6 7 8'
>> +    elif [ "$os" == "fedora" ]; then
>> +        alldists='27 28 29 30 31'
>> +    fi
>
>We have no Fedora 27 in CI. Should it be here? I don't mind, just found
>a tiny inconsistency.
I just wanted to have the same OS/distribution availability matrix that
the packagecloud repository has.
>
>> +
>> +    echo "$alldists"
>> +}
>> +
>> +function prepare_ws {
>> +    # temporary lock the publication to the repository
>> +    ws_suffix=$1
>> +    ws=${ws_prefix}_${ws_suffix}
>> +    ws_lockfile=${ws}.lock
>> +    if [ -f $ws_lockfile ]; then
>> +        old_proc=$(cat $ws_lockfile)
>> +    fi
>> +    lockfile -l 60 $ws_lockfile
>> +    chmod u+w $ws_lockfile && echo $$ >$ws_lockfile && chmod u-w $ws_lockfile
>> +    if [ "$old_proc" != ""  -a "$old_proc" != "0" ]; then
>> +        kill -9 $old_proc >/dev/null || true
>> +    fi
>
>So the current script can be killed by another instance of the script?
>This means that the lock does not work. 
No, it means that the common path used by the previous process should be
freed within the 60-second timeout. This was also discussed in the QA
chat and proved to work.
>
>
>> +
>> +    # create temporary workspace with repository copy
>> +    $rm_dir $ws
>> +    $mk_dir $ws
>
>Do I understand right: we don't create a copy of the remote repository?
Right, it is a copy of the newly created repository metafiles.
>


-- 
Alexander Tikhonov



* Re: [Tarantool-patches] [PATCH v7] gitlab-ci: implement packing into MCS S3
  2020-01-31  4:59   ` Alexander Tikhonov
@ 2020-01-31 22:53     ` Alexander Turenko
  2020-02-01 18:55       ` Alexander Tikhonov
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Turenko @ 2020-01-31 22:53 UTC (permalink / raw)
  To: Alexander Tikhonov; +Cc: Oleg Piskunov, tarantool-patches

I looked again over the GitLab CI rules and I have a couple of
questions:

- What will go on if one forks the repository and pushes something to the
  fork's master (or release branch)?
- What will go on if one opens a PR from a forked repository to the
  master / release branch of our repository?
- What will go on if one opens a PR from our repository branch to the
  master / release branch of our repository?
- Is there something like Travis-CI's 'cron jobs' in GitLab CI? What
  will go on if we enable it someday?

I asked this because I more or less know how Travis-CI works and I
guess GitLab CI is quite similar in those respects. In brief:

- Travis-CI cherry-picks commits from a source branch onto a target one
  for a PR (this is the difference between 'branch jobs' and 'PR jobs').
  So 'PR jobs' are run as if they were on a target branch.

- Travis-CI does not set variables from 'Settings' (at least secret
  ones) for jobs in other repositories (for security reasons). So
  this is also applicable to PRs to our repository from a fork. I'm not
  sure about a PR from a branch in our repository.

All those cases can be differentiated using environment variables that
Travis-CI sets or by conditions in .travis.yml that can be set for a
stage. So it is natural to split packaging and deployment and set
specific conditions for the deployment stage.

I guess we could extract deployment into a stage in GitLab CI like in
Travis-CI and also set the necessary conditions for when it should be
run. Aside from resolving the forks / PRs / cron jobs problems, it would
also allow us not to duplicate per-distro jobs in the config, I guess.

See also two comments below.

WBR, Alexander Turenko.

----

> >>  package: git_submodule_update
> >>  	git clone  https://github.com/packpack/packpack.git packpack
> >>  	PACKPACK_EXTRA_DOCKER_RUN_PARAMS='--network=host' ./packpack/packpack
> >> 
> >> +deploy: package
> >> +	for key in ${GPG_SECRET_KEY} ${GPG_PUBLIC_KEY} ; do \
> >> +		echo $${key} | base64 -d | gpg --batch --import || true ; done
> >
> > I guess we need just the secret key for signing.
>
> To use the secret key for signing it has to be imported into the user's environment.

Again, why do you need a public key? Let's try to add just a secret key.
If it works w/o the public one (I guess so), then remove the loop and
just add the secret key.
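
That is, something like this (a sketch only; the recipe and the variable
name are taken from the quoted diff):

    deploy: package
    	# importing the secret key also makes the public part available,
    	# so the loop over two keys becomes unnecessary
    	echo $${GPG_SECRET_KEY} | base64 -d | gpg --batch --import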

> >
> >> +	./tools/update_repo.sh -o=${OS} -d=${DIST} \
> >> +		-b="s3://tarantool_repo/live/${BUCKET}" build
> >
> >I would hide name of an S3 bucket under a CI variable. It is more about
> >our infrastructure and it would be good to show less in the repo.
>
> The S3 bucket name is open and visible in the logs; there is no need to
> hide it or pay special attention to it, since the name is not secret at
> all, while setting up additional secret values would mean dead code or
> additional manual steps.

An S3 bucket name is part of the underlying infrastructure, just like
the S3-compliant service we use. We should be able to replace it w/o
extra commits to the repository. Let's add a variable.

Hiding is not critical and is not my main point. I would however hide
the variable, but up to you.


* Re: [Tarantool-patches] [PATCH v7] gitlab-ci: implement packing into MCS S3
  2020-01-31 22:53     ` Alexander Turenko
@ 2020-02-01 18:55       ` Alexander Tikhonov
  0 siblings, 0 replies; 6+ messages in thread
From: Alexander Tikhonov @ 2020-02-01 18:55 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: Oleg Piskunov, tarantool-patches


Alexander, thanks for your review, I've made the changes that you suggested.

>Saturday, 1 February 2020, 1:53 +03:00 from Alexander Turenko <alexander.turenko@tarantool.org>:
>
>I looked again over the GitLab CI rules and I have a couple of
>questions:
>
>- What will go on if one forks the repository and pushes something to the
>  fork's master (or release branch)?
>- What will go on if one opens a PR from a forked repository to the
>  master / release branch of our repository?
>- What will go on if one opens a PR from our repository branch to the
>  master / release branch of our repository?
>- Is there something like Travis-CI's 'cron jobs' in GitLab CI? What
>  will go on if we enable it someday?
>
>I asked this because I more or less know how Travis-CI works and I
>guess GitLab CI is quite similar in those respects. In brief:
>
>- Travis-CI cherry-picks commits from a source branch onto a target one
>  for a PR (this is the difference between 'branch jobs' and 'PR jobs').
>  So 'PR jobs' are run as if they were on a target branch.
>
>- Travis-CI does not set variables from 'Settings' (at least secret
>  ones) for jobs in other repositories (for security reasons). So
>  this is also applicable to PRs to our repository from a fork. I'm not
>  sure about a PR from a branch in our repository.
>
>All those cases can be differentiated using environment variables that
>Travis-CI sets or by conditions in .travis.yml that can be set for a
>stage. So it is natural to split packaging and deployment and set
>specific conditions for the deployment stage.
>
>I guess we could extract deployment into a stage in GitLab CI like in
>Travis-CI and also set the necessary conditions for when it should be
>run. Aside from resolving the forks / PRs / cron jobs problems, it would
>also allow us not to duplicate per-distro jobs in the config, I guess.
Right, to make the jobs work correctly I've made the changes based on the links
you suggested:
https://docs.gitlab.com/ee/ci/merge_request_pipelines/#configuring-pipelines-for-merge-requests  
https://docs.gitlab.com/ee/ci/ci_cd_for_external_repos/#pipelines-for-external-pull-requests  

.deploy_only_template: &deploy_only_definition
  only:
    - master
  except:
    - external_pull_requests
    - merge_requests

.pack_only_template: &pack_only_definition
  only:
    - external_pull_requests
    - merge_requests
    - /^.*-full-ci$/

So external_pull_requests and merge_requests will be blocked from
deploying, while the packing jobs will run for them.
>
>See also two comments below.
>
>WBR, Alexander Turenko.
>
>----
>
>> >>  package: git_submodule_update
>> >>  	git clone  https://github.com/packpack/packpack.git packpack
>> >>  	PACKPACK_EXTRA_DOCKER_RUN_PARAMS='--network=host' ./packpack/packpack
>> >> 
>> >> +deploy: package
>> >> +	for key in ${GPG_SECRET_KEY} ${GPG_PUBLIC_KEY} ; do \
>> >> +		echo $${key} | base64 -d | gpg --batch --import || true ; done
>> >
>> > I guess we need just secret key to signing.
>>
>> To use the secret key for signing it has to be imported into the user's environment.
>
>Again, why do you need a public key? Let's try to add just a secret key.
>If it works w/o the public one (I guess so), then remove the loop and
>just add the secret key. 
Right, the public key was not really needed, so I removed it.
>
>
>> >
>> >> +	./tools/update_repo.sh -o=${OS} -d=${DIST} \
>> >> +		-b="s3://tarantool_repo/live/${BUCKET}" build
>> >
>> >I would hide name of an S3 bucket under a CI variable. It is more about
>> >our infrastructure and it would be good to show less in the repo.
>>
>> S3 bucket name is opened and visible in the logs, there is no need to hide it and
>> to pay the attention for it, due to name is not secret at all, while the additional
>> secret values setup will be the dead code or additional manual steps.
>
>An S3 bucket name is part of the underlying infrastructure, just like
>the S3-compliant service we use. We should be able to replace it w/o
>extra commits to the repository. Let's add a variable.
>
>Hiding is not critical and is not my main point. I would however hide
>the variable, but up to you.
Moved both the bucket and the directory paths to Gitlab-CI variables.
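
For example, the deploy recipe might then look like this (a sketch; the
variable name here is hypothetical, the real one is defined in the
Gitlab-CI settings):

    deploy: package
    	# S3_LIVE_REPO is a hypothetical CI variable with the bucket/dir path
    	./tools/update_repo.sh -o=${OS} -d=${DIST} \
    		-b="${S3_LIVE_REPO}/${BUCKET}" build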

-- 
Alexander Tikhonov



end of thread

Thread overview: 6+ messages
2020-01-27  5:13 [Tarantool-patches] [PATCH v6] gitlab-ci: implement packing into MCS S3 Alexander V. Tikhonov
2020-01-28 13:18 ` [Tarantool-patches] [PATCH v7] " Igor Munkin
2020-01-30 15:49 ` Alexander Turenko
2020-01-31  4:59   ` Alexander Tikhonov
2020-01-31 22:53     ` Alexander Turenko
2020-02-01 18:55       ` Alexander Tikhonov
