sonic-buildimage's Introduction

master builds:

Broadcom Centec Centec(arm64) Innovium Mellanox Marvell(armhf) Marvell(arm64) Nephos Nvidia-Bluefield Pensando VS

202311 builds:

Broadcom Centec Centec(arm64) Innovium Mellanox Marvell(armhf) Marvell(arm64) VS

202305 builds:

Barefoot Broadcom Centec Centec(arm64) Innovium Mellanox Marvell(armhf) Nephos VS

202211 builds:

Barefoot Broadcom Centec Centec(arm64) Innovium Mellanox Marvell(armhf) Nephos VS

202205 builds:

Barefoot Broadcom Centec Centec(arm64) Innovium Mellanox Marvell(armhf) Nephos VS

202111 builds:

Barefoot Broadcom Centec Centec(arm64) Innovium Mellanox Marvell(armhf) Nephos VS

202012 builds:

Barefoot Broadcom Centec Centec(arm64) Innovium Marvell(armhf) Mellanox Nephos VS

201911 builds:

Barefoot Broadcom Innovium Mellanox Nephos VS

201811 builds:

Broadcom Mellanox Innovium Nephos VS

SONiC Image Azure Pipelines

All SONiC project build pipelines can be found at the Download Portal for SONiC Images.

sonic-buildimage

Build SONiC Switch Images

Description

The following instructions describe how to build an ONIE-compatible network operating system (NOS) installer image for network switches, and how to build the docker images that run inside the NOS. Note that SONiC images are built per ASIC platform. Switches using the same ASIC platform share a common image. For a list of supported switches and ASICs, please refer to this list.

Hardware

Any server can be a build image server as long as it has:

  • Multiple cores to increase build speed
  • Plenty of RAM (less than 8 GiB is likely to cause issues)
  • 300G of free disk space
  • KVM Virtualization Support.

Note: If you are in a VM, make sure you have support for nested virtualization. Some cases (e.g. building the OVS image) also require extra configuration options to expose the full KVM interface to the VM (e.g. the KVM paravirtualization support on VirtualBox).
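The requirements above can be checked before starting a long build. The following is a minimal sketch, not part of the SONiC build system: the function names and the warning texts are hypothetical, and the thresholds are simply the 8 GiB and 300G figures from the list.

```shell
#!/bin/sh
# Hypothetical pre-flight check; thresholds mirror the hardware list above.
MIN_RAM_GIB=8
MIN_DISK_GIB=300

# at_least ACTUAL MINIMUM: succeed when ACTUAL >= MINIMUM (integers).
at_least() {
    [ "$1" -ge "$2" ]
}

# kib_to_gib KIB: convert kibibytes to whole gibibytes (rounded down).
kib_to_gib() {
    echo $(( $1 / 1024 / 1024 ))
}

# check_build_host [DIR]: print warnings for the build directory DIR.
check_build_host() {
    dir=${1:-.}
    cores=$(nproc)
    # MemTotal excludes memory reserved by the kernel, so a machine with
    # exactly 8 GiB installed may still trip the RAM warning.
    ram_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    # Column 4 of POSIX df output is the available space in KiB.
    disk_kib=$(df -Pk "$dir" | awk 'NR==2 {print $4}')

    echo "CPU cores: $cores"
    at_least "$(kib_to_gib "$ram_kib")" "$MIN_RAM_GIB" \
        || echo "WARNING: less than ${MIN_RAM_GIB} GiB of RAM"
    at_least "$(kib_to_gib "$disk_kib")" "$MIN_DISK_GIB" \
        || echo "WARNING: less than ${MIN_DISK_GIB}G of free disk space in $dir"
    [ -e /dev/kvm ] || echo "WARNING: /dev/kvm not found; KVM may be unavailable"
}
```

Calling `check_build_host /path/to/sonic-buildimage` prints the core count and any warnings; it only reports and does not stop the build.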

A good choice of OS for building SONiC is currently Ubuntu 20.04.

Prerequisites

  • Install pip and Jinja on the host build machine. Run the commands below if j2/j2cli is not available:
sudo apt install -y python3-pip
pip3 install --user j2cli
  • Install Docker and configure your system to allow running the 'docker' command without 'sudo':
    • Add current user to the docker group: sudo gpasswd -a ${USER} docker
    • Log out and log back in so that your group membership is re-evaluated
    • If you are using Linux kernel 5.3 or newer, then you must use Docker 20.10.10 or newer. This is because older Docker versions did not allow the clone3 syscall, which is now used in Bookworm.

Note: If a previous installation of Docker using snap was present on the system, remove it and also remove docker from snap before reinstalling Docker. This will avoid known bugs that falsely report read-only filesystem issues during the build process.
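The Docker version requirement above (20.10.10 or newer on kernel 5.3 or newer) can be tested with a version-aware comparison. This is a small sketch that assumes GNU `sort -V` is available; the function names are hypothetical.

```shell
#!/bin/sh
# version_ge A B: succeed when dotted version A is greater than or equal to B.
# Relies on `sort -V` (version sort from GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# docker_new_enough VERSION: compare against the 20.10.10 minimum from the text.
docker_new_enough() {
    version_ge "$1" 20.10.10
}

# Example call (not executed here):
#   docker_new_enough "$(docker version --format '{{.Server.Version}}')" \
#       || echo "Docker is older than 20.10.10"
```

Distribution-specific suffixes in the version string are not handled by this sketch.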

Clone the repository with all the git submodules

To clone the code repository recursively:

git clone --recurse-submodules https://github.com/sonic-net/sonic-buildimage.git

Usage

To build SONiC installer image and docker images, run the following commands:

# Ensure the 'overlay' module is loaded on your development system
sudo modprobe overlay

# Enter the source directory
cd sonic-buildimage

# (Optional) Checkout a specific branch. By default, it uses master branch.
# For example, to checkout the branch 201911, use "git checkout 201911"
git checkout [branch_name]

# Execute make init once after cloning the repo,
# or after fetching remote repo with submodule updates
make init

# Execute make configure once to configure ASIC
make configure PLATFORM=[ASIC_VENDOR]

# Build SONiC image with 4 jobs in parallel.
# Note: You can set this higher, but 4 is a good number for most cases
#       and is well-tested.
make SONIC_BUILD_JOBS=4 all

The supported ASIC vendors are:

  • PLATFORM=barefoot
  • PLATFORM=broadcom
  • PLATFORM=marvell
  • PLATFORM=mellanox
  • PLATFORM=cavium
  • PLATFORM=centec
  • PLATFORM=nephos
  • PLATFORM=nvidia-bluefield
  • PLATFORM=innovium
  • PLATFORM=vs
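A wrapper script could reject a mistyped vendor before `make configure` runs. The sketch below hard-codes the list above; the function names are hypothetical and the list would need updating whenever the supported vendors change.

```shell
#!/bin/sh
# is_supported_platform NAME: succeed when NAME is one of the vendors listed above.
is_supported_platform() {
    case "$1" in
        barefoot|broadcom|marvell|mellanox|cavium|centec|nephos|nvidia-bluefield|innovium|vs)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# configure_platform NAME: run make configure only for a known vendor.
configure_platform() {
    if ! is_supported_platform "$1"; then
        echo "unsupported PLATFORM: $1" >&2
        return 1
    fi
    make configure PLATFORM="$1"
}
```

For example, `configure_platform broadcom` runs `make configure PLATFORM=broadcom`, while `configure_platform brodcom` fails with an error message.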

Usage for ARM Architecture

sudo apt-get install --allow-downgrades -y docker-ce=5:18.09.0~3-0~ubuntu-xenial
sudo apt-get install --allow-downgrades -y docker-ce-cli=5:18.09.0~3-0~ubuntu-xenial

To build a 32-bit Arm (ARMHF) image for a platform:

# Execute make configure once to configure ASIC and ARCH
make configure PLATFORM=[ASIC_VENDOR] PLATFORM_ARCH=armhf
make target/sonic-[ASIC_VENDOR]-armhf.bin

example:

make configure PLATFORM=marvell PLATFORM_ARCH=armhf
make target/sonic-marvell-armhf.bin

To build a 32-bit Arm (ARMHF) image for the Marvell platform on an amd64 host for Debian buster using cross-compilation, run the following commands:

# Execute make configure once to configure ASIC and ARCH for cross-compilation build

NOJESSIE=1 NOSTRETCH=1 BLDENV=buster CROSS_BLDENV=1 \
make configure PLATFORM=marvell PLATFORM_ARCH=armhf

# Execute Arm32 build using cross-compilation environment

NOJESSIE=1 NOSTRETCH=1 BLDENV=buster CROSS_BLDENV=1 make target/sonic-marvell-armhf.bin

Running the above Arm32 build using cross-compilation instead of the QEMU emulator drastically reduces the build time.

To build a 64-bit Arm (arm64) image for a platform:

# Execute make configure once to configure ASIC and ARCH

make configure PLATFORM=[ASIC_VENDOR] PLATFORM_ARCH=arm64

# example:

make configure PLATFORM=marvell PLATFORM_ARCH=arm64

NOTE:

  • We recommend reserving at least 100G of free space to build one platform with a single job. The build process uses more disk space if you set SONIC_BUILD_JOBS to more than 1.

  • If Docker's workspace folder, /var/lib/docker, resides on a partition without sufficient free space, you may encounter an error like the following during a Docker container build job:

    /usr/bin/tar: /path/to/sonic-buildimage/<some_file>: Cannot write: No space left on device

    The solution is to move the directory to a partition with more free space.

  • Use http_proxy=[your_proxy] https_proxy=[your_proxy] no_proxy=[your_no_proxy] make to enable http(s) proxy in the build process.

  • Add your user account to the docker group and run make as that user. Building as root or with sudo is not supported.
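The last note can be checked from a script before the build starts. This is a sketch with hypothetical function names; it assumes the group is literally named `docker`, as in the Prerequisites section.

```shell
#!/bin/sh
# has_word LIST WORD: succeed when WORD is one of the space-separated items in LIST.
has_word() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# ready_to_build: fail when running as root or when the current user
# is not a member of the docker group.
ready_to_build() {
    if [ "$(id -u)" -eq 0 ]; then
        echo "do not build as root or with sudo" >&2
        return 1
    fi
    if ! has_word "$(id -nG)" docker; then
        echo "current user is not in the docker group" >&2
        return 1
    fi
}
```

A build script could then run `ready_to_build && make SONIC_BUILD_JOBS=4 all`.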

The SONiC installer contains all docker images needed. SONiC uses one image for all devices of the same ASIC vendor.

For Broadcom ASICs, we build ONIE and EOS images. The EOS image is used for Arista devices; the ONIE image is used for all other Broadcom ASIC based devices.

make configure PLATFORM=broadcom
# build debian stretch required targets
BLDENV=stretch make stretch
# build ONIE image
make target/sonic-broadcom.bin
# build EOS image
make target/sonic-aboot-broadcom.swi

You may find the rules/config file useful. It contains configuration options for the build process, like adding more verbosity or showing dependencies, username and password for base image etc.

Every docker image is built and saved to target/ directory. So, for instance, to build only docker-database, execute:

make target/docker-database.gz

The same goes for Debian packages, which are under target/debs/:

make target/debs/swss_1.0.0_amd64.deb

Every target has a clean target, so in order to clean swss, execute:

make target/debs/swss_1.0.0_amd64.deb-clean

It is recommended to use clean targets to clean all packages that are built together, such as dev packages. To become more familiar with the build process and to make changes to it, read this short Documentation.

Build debug dockers and debug SONiC installer image

The SONiC build system supports building dockers and the ONIE image with debug tools and debug symbols, to help with live and core debugging. For details, refer to the SONiC Buildimage Guide.

SAI Version

Please refer to SONiC roadmap on the SAI version for each SONiC release.

Notes

  • If you are running make for the first time, a sonic-slave-${USER} docker image will be built automatically. This may take a while, but it is a one-time action, so please be patient.
  • The root user account is disabled. However, the created user can sudo.
  • The target directory is ./target, containing the NOS installer image and docker images.
    • sonic-generic.bin: SONiC switch installer image (ONIE compatible)
    • sonic-aboot.bin: SONiC switch installer image (Aboot compatible)
    • docker-base.gz: base docker image where other docker images are built from, only used in build process (gzip tar archive)
    • docker-database.gz: docker image for in-memory key-value store, used as inter-process communication (gzip tar archive)
    • docker-fpm.gz: docker image for quagga with fpm module enabled (gzip tar archive)
    • docker-orchagent.gz: docker image for SWitch State Service (SWSS) (gzip tar archive)
    • docker-syncd-brcm.gz: docker image for the daemon to sync database and Broadcom switch ASIC (gzip tar archive)
    • docker-syncd-cavm.gz: docker image for the daemon to sync database and Cavium switch ASIC (gzip tar archive)
    • docker-syncd-mlnx.gz: docker image for the daemon to sync database and Mellanox switch ASIC (gzip tar archive)
    • docker-syncd-nephos.gz: docker image for the daemon to sync database and Nephos switch ASIC (gzip tar archive)
    • docker-syncd-invm.gz: docker image for the daemon to sync database and Innovium switch ASIC (gzip tar archive)
    • docker-sonic-p4.gz: docker image for all-in-one for p4 software switch (gzip tar archive)
    • docker-sonic-vs.gz: docker image for all-in-one for software virtual switch (gzip tar archive)
    • docker-sonic-mgmt.gz: docker image for managing, configuring and monitoring SONiC (gzip tar archive)

Contribution Guide

All contributors must sign a contribution license agreement before contributions can be accepted. Visit EasyCLA - Linux Foundation.

GitHub Workflow

We're following basic GitHub Flow. If you have no idea what we're talking about, check out GitHub's official guide. Note that merge is only performed by the repository maintainer.

Guide for performing commits:

  • Isolate each commit to one component/bugfix/issue/feature
  • Use a standard commit message format:

[component/folder touched]: Describe the intent of your changes

[List of changes]

Signed-off-by: Your Name [email protected]

For example:

swss-common: Stabilize the ConsumerTable

  • Fixing autoreconf
  • Fixing unit-tests by adding checkers and initialize the DB before start
  • Adding the ability to select from multiple channels
  • Health-Monitor - The idea of the patch is that if something went wrong with the notification channel, we will have the option to know about it (Query the LLEN table length).

Signed-off-by: [email protected]

  • Each developer should fork this repository and add the team as a Contributor
  • Push your changes to your private fork and do "pull-request" to this repository
  • Use a pull request to do code review
  • Use issues to keep track of what is going on
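The commit message format described above can be checked mechanically, for instance from a local git hook. The sketch below is an assumption about how such a check might look and is not part of the repository; it only verifies the "component: description" subject line and the presence of a Signed-off-by trailer.

```shell
#!/bin/sh
# valid_commit_message FILE: check the two conventions from the guide above.
valid_commit_message() {
    # The first line must look like "<component>: <description>".
    head -n 1 "$1" | grep -Eq '^[^:[:space:]][^:]*: .+' || return 1
    # A Signed-off-by trailer must be present somewhere in the message.
    grep -q '^Signed-off-by: ' "$1"
}
```

Note that `git commit -s` appends the Signed-off-by trailer automatically.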

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

sonic-buildimage's People

Contributors

abdosi, aravindmani-1, arunsaravananbalachandran, dgsudharsan, dprital, jleveque, junchao-mellanox, keboliu, lguohan, liuh-80, liushilongbuaa, marian-pritsak, mssonicbld, nazariig, oleksandrivantsiv, pavel-shirshov, prsunny, pterosaur, qiluo-msft, renukamanavalan, saiarcot895, shlomibitton, staphylo, stepanblyschak, stephenxs, taoyl-ms, theasianpianist, vivekrnv, xumia, yxieca

sonic-buildimage's Issues

Build will fail when working on a branch name with "/"

git_revision will contain "/", and later sed will fail.

Example log:

Build ONIE installer
+ arch=x86_64
+ machine=broadcom
+ platform=x86_64-broadcom-r0
+ installer_dir=installer
+ platform_conf=platform/broadcom/platform.conf
+ output_file=target/sonic-broadcom.bin
+ demo_type=OS
+ git_revision=taoyl/test.0-c8a315e
+ onie_image_part_size=32768
+ shift 9
+ [ ! -d installer ]
+ [ ! -r installer/sharch_body.sh ]
+ [ ! -d installer/x86_64 ]
+ [ ! -r installer/x86_64/install.sh ]
+ [ -n taoyl/test.0-c8a315e ]
+ [ -n 32768 ]
+ [ -r platform/broadcom/platform.conf ]
+ [ 1 -gt 0 ]
+ tmp_dir=
+ echo -n Building self-extracting install image .
Building self-extracting install image .+ mktemp --directory
+ tmp_dir=/tmp/tmp.85lbt3lbEE
+ tmp_installdir=/tmp/tmp.85lbt3lbEE/installer
+ mkdir /tmp/tmp.85lbt3lbEE/installer
+ cp -r installer/x86_64/install.sh installer/x86_64/macset.sh installer/x86_64/platforms /tmp/tmp.85lbt3lbEE/installer
+ cp onie-image.conf /tmp/tmp.85lbt3lbEE/installer
+ echo
+ sed -e s/[\/&]/\\&/g
+ EXTRA_CMDLINE_LINUX=
+ sed -i -e s/%%DEMO_TYPE%%/OS/g -e s/%%GIT_REVISION%%/taoyl/test.0-c8a315e/g -e s/%%ONIE_IMAGE_PART_SIZE%%/32768/ -e s/%%EXTRA_CMDLINE_LINUX%%// /tmp/tmp.85lbt3lbEE/installer/install.sh
sed: -e expression #2, char 26: unknown option to `s'
+ clean_up 1
+ rm -rf /tmp/tmp.85lbt3lbEE
+ exit 1
slave.mk:314: recipe for target 'target/sonic-broadcom.bin' failed
make: Leaving directory '/sonic'
make: *** [target/sonic-broadcom.bin] Error 1
Makefile:33: recipe for target 'target/sonic-broadcom.bin' failed
make: *** [target/sonic-broadcom.bin] Error 2
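The log shows that the "/" in git_revision ends the sed replacement early. One possible fix, sketched here as an assumption rather than the change that was actually merged, is to escape the revision before substituting it, in the same way the log already escapes another value with `sed -e s/[\/&]/\\&/g`. The function names are hypothetical.

```shell
#!/bin/sh
# escape_sed_replacement TEXT: backslash-escape '/', '&' and '\' so that TEXT
# can be used safely as the replacement part of a sed 's/.../.../' command.
escape_sed_replacement() {
    printf '%s' "$1" | sed -e 's/[\/&\\]/\\&/g'
}

# fill_git_revision REVISION: replace the %%GIT_REVISION%% placeholder
# in the text read from standard input.
fill_git_revision() {
    escaped=$(escape_sed_replacement "$1")
    sed -e "s/%%GIT_REVISION%%/${escaped}/g"
}
```

With this helper, `fill_git_revision 'taoyl/test.0-c8a315e' < install.sh` substitutes the branch-derived revision without triggering the "unknown option to `s'" error.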

Failure trying to run: chroot /sonic-buildimage/fsroot mount -t proc proc /proc

Hello,
I'm trying to build an image for Broadcom.

Steps 1-3 are fine; the problem appears when I run the make command from the Usage section.
Docker version 1.12.1, build 6f9534c

Error output:

I: Extracting util-linux...
I: Extracting liblzma5...
I: Extracting zlib1g...
W: Failure trying to run: chroot /sonic-buildimage/fsroot mount -t proc proc /proc
W: See /sonic-buildimage/fsroot/debootstrap/debootstrap.log for details
Makefile:141: recipe for target 'target/sonic-generic.bin' failed
make: *** [target/sonic-generic.bin] Error 1
rm src/initramfs-tools/initramfs-tools_0.120_all.deb src/sonic-linux-kernel/linux-image-3.16.0-4-amd64_3.16.7-ckt11-2+acs8u2_amd64.deb

Thanks.

LAGs are not updated correctly after teamd restart

Reproduced on:

build_version: HEAD.211-5585221
debian_version: 8.7
kernel_version: 3.16.0-4-amd64
asic_type: mellanox
build_date: Wed Apr 12 05:07:40 UTC 2017
built_by: johnar@jenkins-worker-1

Steps to reproduce:

$ systemctl restart teamd

After I restart teamd, some LAGs no longer work. From the SDK dump I can see that some of them have no members in hardware, some have only one, and some have two.

P4 builds fail nondeterministically

The test step fails randomly.

test_suite
  ThriftTestCase
    testByte ...                                                           [OK]
    testDouble ...                                                         [OK]
    testException ...                                                      [OK]
    testI32 ...                                                            [OK]
    testI64 ...                                                            [OK]
    testOneway ...                                                       [FAIL]
    testString ...                                                         [OK]
    testStruct ...                                                         [OK]
    testVoid ...                                                           [OK]

omem=1430033368

admin@str-s6000-acs-7:~$ redis-cli client list | grep -v omem=0 
id=96 addr=/var/run/redis/redis.sock:0 fd=40 name= age=162949 idle=162940 flags=U db=0 sub=0 psub=1 multi=-1 qbuf=0 qbuf-free=0 obl=16351 oll=83587 omem=1430033368 events=rw cmd=psubscribe
admin@str-s6000-acs-7:~$ show version

SONiC Software Version: SONiC.master.0-c7d540c
Distribution: Debian 8.7
Kernel: 3.16.0-4-amd64
Build commit: c7d540c
Build date: Sat Apr 29 09:56:25 UTC 2017
Built by: lgh@lgh-issaquah

Docker images:
REPOSITORY                TAG                 IMAGE ID            SIZE
docker-syncd-brcm         latest              e91056044328        282.9 MB
docker-orchagent-brcm     latest              5d6e3f1eeec2        252.8 MB
docker-lldp-sv2           latest              6362b759dd9c        254.9 MB
docker-dhcp-relay         latest              662c2975b024        248.5 MB
docker-database           latest              544a69626fd3        216.4 MB
docker-snmp-sv2           latest              301f15e9b299        290 MB
docker-teamd              latest              8dbd472b5815        250.3 MB
docker-platform-monitor   latest              29412eed58bf        273.5 MB
docker-fpm-quagga         latest              63517bd3a437        256.9 MB
admin@str-s6000-acs-7:~$ docker stop lldp
lldp

admin@str-s6000-acs-7:~$ redis-cli client list | grep -v omem=0

configurations are re-generated across reboots

@taoyl-ms Sometimes people would like to update some configurations on the device themselves, but rebooting triggers the files inside the dockers to be regenerated from minigraph.xml. Is this by design, or should we be able to leave the configurations in place until they are manually regenerated?

disklabel can only be 16 char in ext4, now it is too long

Creating new SONiC-OS partition /dev/sda3 ...
Could not create partition 3 from 788480 to 67897343
Unable to set partition 3's name to 'SONiC-OS-HEAD.751-701c1eb'!
Error encountered; not saving changes.
Warning: The first trial of creating partition failed, trying the largest aligned available block of sectors on the disk
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
mke2fs 1.42.8 (20-Jun-2013)
Discarding device blocks: done                            
Filesystem label=SONiC-OS-HEAD.75
OS type: Linux

Filesystem label=SONiC-OS-HEAD.75

This causes an issue when booting.

Booting `SONiC-OS-HEAD.751-701c1eb'

error: no such device: SONiC-OS-HEAD.751-701c1eb.
Loading SONiC-OS-HEAD.751-701c1eb OS kernel ...
Loading SONiC-OS-HEAD.751-701c1eb OS initial ramdisk ...
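The mismatch above comes from the 16-character limit on ext4 labels: mke2fs stored only "SONiC-OS-HEAD.75" while the boot loader looked for the full name. A small sketch of a guard that an installer script could apply before creating the filesystem follows; the function names are hypothetical and the sketch assumes ASCII labels.

```shell
#!/bin/sh
EXT4_LABEL_MAX=16

# fits_ext4_label LABEL: succeed when LABEL is at most 16 characters long.
fits_ext4_label() {
    [ "${#1}" -le "$EXT4_LABEL_MAX" ]
}

# truncate_label LABEL: print the first 16 characters of LABEL,
# which is what ended up as the filesystem label in the log above.
truncate_label() {
    printf '%.16s\n' "$1"
}
```

An installer could refuse to continue when `fits_ext4_label` fails, or use one shortened label consistently for both the filesystem and the boot entry.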

rc.local fails on boot

● rc-local.service - /etc/rc.local Compatibility
Loaded: loaded (/lib/systemd/system/rc-local.service; static)
Active: failed (Result: exit-code) since Thu 2017-03-02 12:01:05 UTC; 4min 35s ago
Process: 439 ExecStart=/etc/rc.local start (code=exited, status=1/FAILURE)

Mar 02 12:01:05 arc-switch1025 rc.local[439]: install platform dependent packages at the first boot time
Mar 02 12:01:05 arc-switch1025 rc.local[439]: cp: cannot stat '/usr/share/sonic/device//minigraph.xml': No such file or directory
Mar 02 12:01:05 arc-switch1025 systemd[1]: rc-local.service: control process exited, code=exited status=1
Mar 02 12:01:05 arc-switch1025 systemd[1]: Failed to start /etc/rc.local Compatibility.
Mar 02 12:01:05 arc-switch1025 systemd[1]: Unit rc-local.service entered failed state.

portstat fails

# portstat 
Traceback (most recent call last):
  File "/usr/bin/portstat", line 320, in <module>
    main()
  File "/usr/bin/portstat", line 273, in main
    cnstat_dict = portstat.get_cnstat()
  File "/usr/bin/portstat", line 77, in get_cnstat
    cnstat_dict[port] = get_counters(counter_port_name_map[port])
  File "/usr/bin/portstat", line 67, in get_counters
    fields[pos] += int(self.db.get(self.db.COUNTERS_DB, full_table_id, counter_name))
TypeError: int() argument must be a string or a number, not 'NoneType'

Reproduced on Mellanox platform.
Can anyone try it on another platform to help locate the problem?

README.md leaves out docker-database

I caught a question on "what is docker-database, is this different from docker-base?"

docker-database is not in the list of images in the readme.

Remove platform/ subdirectory from sonic-config-engine

1.) Confirm all components are now referencing device-specific data in the device/ directory, and that the platform/ subdirectory in sonic-config-engine is truly deprecated.

2.) Remove platform/ subdirectory from sonic-config-engine.

umount: /proc: target is busy

https://sonic-jenkins.westus.cloudapp.azure.com/job/common/job/buildimage-baseimage-pr/385/console

[INFO] Umount all
+ sudo LANG=C chroot ./fsroot fuser -km /proc
/proc:                6095
+ sudo LANG=C chroot ./fsroot umount /proc
umount: /proc: target is busy
        (In some cases useful info about processes that
         use the device is found by lsof(8) or fuser(1).)
+ rm -f /tmp/tmp.AjVwDYDeYQ
+ sudo umount ./fsroot/proc
umount: /sonic/fsroot/proc: target is busy
        (In some cases useful info about processes that
         use the device is found by lsof(8) or fuser(1).)
+ true
+ true
slave.mk:303: recipe for target 'target/sonic-generic.bin' failed
make: Leaving directory '/sonic'
make: *** [target/sonic-generic.bin] Error 1
Makefile:32: recipe for target 'target/sonic-generic.bin' failed
make: *** [target/sonic-generic.bin] Error 2
Build step 'Execute shell' marked build as failure
Adding one-line test results to commit status...
Setting status of ba493b72e4a0fb0170fc0f4c51488641e8da6658 to FAILURE with url https://sonic-jenkins.westus.cloudapp.azure.com/job/common/job/buildimage-baseimage-pr/385/ and message: ' No test results found.'
Using context: baseimage
Finished: FAILURE
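If the "target is busy" failure is transient (for example, a process killed by fuser has not fully exited yet), retrying the unmount a few times may be enough. That is an assumption about the cause, not a confirmed diagnosis. The helper below is a generic retry sketch with a hypothetical name.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds,
# giving up after ATTEMPTS failed runs and sleeping DELAY seconds in between.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example call (not executed here):
#   retry 5 1 sudo umount ./fsroot/proc
```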

The order to apply qos.json and buffers.json cannot be reversed

Currently there are internal dependencies between the qos.json and buffers.json files: if the order in which the configurations are applied cannot be guaranteed, some configurations are not applied correctly.

After brief debugging, it seems possible that there are some internal dependencies in the bcmsai package.

dpkg-query: error: failed to open package info file `/var/lib/dpkg/updates/0001' for reading: No such file or directory

dh_perl -plibnl-3-200-udeb
dh_shlibdeps -plibnl-3-200-udeb
dpkg-query: error: failed to open package info file `/var/lib/dpkg/updates/0001' for reading: No such file or directory
dpkg-shlibdeps: error: dpkg-query --control-path libc6:amd64 shlibs gave error exit status 2
dh_shlibdeps: dpkg-shlibdeps -Tdebian/libnl-3-200-udeb.substvars -tudeb debian/libnl-3-200-udeb/lib/libnl-3.so.200.22.0 returned exit code 2
/usr/share/cdbs/1/rules/debhelper.mk:277: recipe for target 'binary-predeb-IMPL/libnl-3-200-udeb' failed
make[2]: *** [binary-predeb-IMPL/libnl-3-200-udeb] Error 2
make[2]: Leaving directory '/var/build/workspace/mellanox/buildimage-mlnx-all-pr/src/libnl3/libnl3'

minigraph.py crashed when no png is in the minigraph

When there's no png info, minigraph.py will crash, because console_dev and mgmt_dev variables are missing. Better to have some basic checks here.

Besides, if the role tag is missing, the bgpd.conf.j2 will fail to be deployed.

docker-base not squashed

missing squash

## Flatten the image by importing an exported container on this image
## Note: it will squash the image with only one layer and lost all metadata such as ENTRYPOINT,
##       so apply only to the base image
## TODO: wait docker-squash supporting Docker 1.10+
## ref: https://github.com/jwilder/docker-squash/issues/45
if [ "$docker_image_name" = "docker-base" ]; then
    ## Run old image in a container
    tmp_container=$(docker run -d ${docker_image_name} /bin/bash)
    ## Export the container's filesystem, then import as a new image
    docker export $tmp_container | docker import - ${docker_image_name}
    ## Remove the container
    docker rm -f $tmp_container || true
    ## Remove the old image
    docker rmi -f $image_id || true
fi

docker-fpm and docker-team depend on libsai

docker-fpm needs fpmsyncd, and docker-team needs teamsyncd.
These two binaries depend only on libswsscommon, but they live in the swss repository.
The swss repository depends on libsairedis, which is in the sairedis repo.
The sairedis repo depends on libsai because syncd is in the sairedis repo.

Thus, docker-fpm and docker-team depend on libsai.

The ideal scenario is that only the fpmsyncd and teamsyncd binaries are inside the docker, with no libsairedis/libsai/swss installed.

Crash on changing of IP address on a network interface with an active neighbor

Topology
TG-DUT-DUT-TG

Steps

  1. Boot two switches with SONIC (OCP branch) and minigraph
  2. sudo ifconfig Ethernet0 2.2.2.1 netmask 255.255.255.0 on any switch

Observed results
Feb 10 14:23:23 OCPSCH0104001MS NOTICE orchagent: :- removeSubnetRoute: Remove subnet route to 10.10.1.2/30 from Ethernet0
Feb 10 14:23:23 OCPSCH0104001MS NOTICE orchagent: :- removeIp2MeRoute: Remove packet action trap route ip:10.10.1.2
Feb 10 14:23:23 OCPSCH0104001MS NOTICE orchagent: :- removeRouterIntfs: Remove router interface for port Ethernet0
Feb 10 14:23:23 OCPSCH0104001MS ERR orchagent: :- meta_sai_validate_oid: oid is set to null object id
Feb 10 14:23:23 OCPSCH0104001MS ERR orchagent: :- removeRouterIntfs: Failed to remove router interface for port Ethernet0, rv:-5
Feb 10 14:23:23 OCPSCH0104001MS ERR orchagent: :- main: Failed due to exception: Failed to remove router interface.
Feb 10 14:23:23 OCPSCH0104001MS INFO docker[1068]: terminate called without an active exception
Feb 10 14:23:23 OCPSCH0104001MS INFO kernel: [ 1264.781525] Core dump to |/usr/bin/coredump-compress orchagent 91 pipe failed

teamsyncd fails on start

● teamd.service - TEAMD container
Loaded: loaded (/etc/systemd/system/teamd.service; enabled)
Active: active (running) since Fri 2017-03-03 15:05:23 UTC; 4min 51s ago
Process: 14282 ExecStop=/usr/bin/teamd.sh stop (code=exited, status=0/SUCCESS)
Main PID: 14315 (teamd.sh)
CGroup: /system.slice/teamd.service
├─14315 /bin/bash /usr/bin/teamd.sh start
└─14324 docker start -a teamd

Mar 03 15:05:26 arc-switch1026 teamd.sh[14315]: This program is not intended to be run as root.
Mar 03 15:05:26 arc-switch1026 teamd.sh[14315]: This program is not intended to be run as root.
Mar 03 15:05:26 arc-switch1026 teamd.sh[14315]: This program is not intended to be run as root.
Mar 03 15:05:26 arc-switch1026 teamd.sh[14315]: This program is not intended to be run as root.
Mar 03 15:05:26 arc-switch1026 teamd.sh[14315]: This program is not intended to be run as root.
Mar 03 15:05:27 arc-switch1026 teamd.sh[14315]: This program is not intended to be run as root.
Mar 03 15:05:27 arc-switch1026 teamd.sh[14315]: This program is not intended to be run as root.
Mar 03 15:05:27 arc-switch1026 teamd.sh[14315]: This program is not intended to be run as root.
Mar 03 15:05:27 arc-switch1026 teamd.sh[14315]: terminate called after throwing an instance of 'std::system_error'
Mar 03 15:05:27 arc-switch1026 teamd.sh[14315]: what(): Unable to connect to redis (unixs-socket): Cannot assign requested address

docker connect failure in the build

https://sonic-jenkins.westus.cloudapp.azure.com/job/mellanox/job/buildimage-mlnx-all-pr/277/console

+ BUILD_TEMPLATES=files/build_templates
+ IMAGE_CONFIGS=files/image_config
+ trap_push clean_sys
+ local oldcmd=true
+ local 'newcmd=clean_sys; true'
+ trap -- 'clean_sys; true' EXIT INT TERM HUP
+ _trap_push 'clean_sys; true'
+ local 'next=clean_sys; true'
++ sed -e 's/'''/'''\''''''/g'
++ echo 'clean_sys; true'
+ eval 'trap_push() {
    local oldcmd='''clean_sys; true'''
    local newcmd="$1; $oldcmd"
    trap -- "$newcmd" EXIT INT TERM HUP
    _trap_push "$newcmd"
    }'
+ sudo LANG=C chroot ./fsroot mount sysfs /sys -t sysfs
+ sudo chroot ./fsroot service docker start
mount: cgroup is already mounted or /sys/fs/cgroup/cpu busy
cgroup is already mounted on /sys/fs/cgroup
mount: cgroup is already mounted or /sys/fs/cgroup/cpuacct busy
cgroup is already mounted on /sys/fs/cgroup
mount: cgroup is already mounted or /sys/fs/cgroup/net_cls busy
cgroup is already mounted on /sys/fs/cgroup
mount: cgroup is already mounted or /sys/fs/cgroup/net_prio busy
cgroup is already mounted on /sys/fs/cgroup
Starting Docker: docker.
+ sudo chroot ./fsroot docker version
Client:
Version: 1.11.1
API version: 1.23
Go version: go1.5.4
Git commit: 5604cbe
Built: Tue Apr 26 23:11:07 2016
OS/Arch: linux/amd64
Cannot connect to the Docker daemon. Is the docker daemon running on this host?

The switch crashes when the "reboot" command is entered on the host

Hi all,

I found a traceback issue in master (commit 1491bf9). When the user enters the "reboot" command on the host, the switch crashes. After investigating, it may be related to commit a877603 (Merge swss and syncd into single service).

The following is the error message.

[  207.772520] BUG: unable to handle kernel paging request at ffffffffa03ff0f0
[  207.780338] IP: [<ffffffff811a746f>] filp_close+0x1f/0x70
[  207.786396] PGD 1816067 PUD 1817063 PMD 27346a067 PTE 0
[  207.792260] Oops: 0000 [#1] SMP 
[  207.795872] Modules linked in: eeprom_mb(O) eeprom w83795 jc42 coretemp bridge stp llc i2c_mux_pca954x i2c_mux i2c_dev i2c_ismt i2c_i801 kvm_intel kvm crc32_pclmul xt_conntrack iTCO_wdt iTCO_vendor_support iptable_filter ipt_MASQUERADE aesni_intel aes_x86_64 xt_addrtype lrw gf128mul glue_helper ablk_helper lpc_ich mfd_core evdev cryptd serio_raw pcspkr iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 ipmi_msghandler tpm_tis tpm button nf_nat_ipv4 shpchp i2c_core nf_nat acpi_cpufreq nf_conntrack processor thermal_sys ip_tables x_tables autofs4 loop ext4 crc16 mbcache jbd2 nls_utf8 nls_cp437 vfat fat aufs(C) squashfs sg sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci libata ehci_pci ehci_hcd scsi_mod igb(O) usbcore usb_common dca ptp pps_core [last unloaded: linux_kernel_bde]
[  207.877421] CPU: 1 PID: 1555 Comm: syncd Tainted: G         C O  3.16.0-4-amd64 #1 Debian 3.16.36-1+deb8u2
[  207.896132] task: ffff880036dbe190 ti: ffff8802731bc000 task.ti: ffff8802731bc000
[  207.904518] RIP: 0010:[<ffffffff811a746f>]  [<ffffffff811a746f>] filp_close+0x1f/0x70
[  207.913283] RSP: 0018:ffff8802731bfce0  EFLAGS: 00010246
[  207.919232] RAX: ffffffffa03ff080 RBX: ffff88027349ff00 RCX: 0000000000000027
[  207.927227] RDX: ffff880272466858 RSI: ffff880272466800 RDI: ffff88027349ff00
[  207.935215] RBP: ffff880272466800 R08: ffff8802731bc000 R09: 000000000000b8fe
[  207.943211] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[  207.951207] R13: ffff880272466800 R14: 0000000000000001 R15: ffff880272466810
[  207.959205] FS:  00007f4388398740(0000) GS:ffff88027fc80000(0000) knlGS:0000000000000000
[  207.968274] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  207.974711] CR2: ffffffffa03ff0f0 CR3: 000000027317b000 CR4: 00000000001007e0
[  207.982708] Stack:
[  207.984956]  00000000000002ff 0000000000000027 0000000000000000 ffffffff811c5898
[  207.993262]  ffff880036dbe810 ffff880272a67400 ffff880272a67460 0000000000000005
[  208.001566]  ffff880036fe8760 ffff880036dbe190 ffffffff81069c0e ffff8802731bff58
[  208.009872] Call Trace:
[  208.012602]  [<ffffffff811c5898>] ? put_files_struct+0x78/0xc0
[  208.019142]  [<ffffffff81069c0e>] ? do_exit+0x28e/0xa70
[  208.024997]  [<ffffffff8106a469>] ? do_group_exit+0x39/0xa0
[  208.031241]  [<ffffffff81078928>] ? get_signal_to_deliver+0x1c8/0x5d0
[  208.038461]  [<ffffffff81013492>] ? do_signal+0x42/0xa10
[  208.044412]  [<ffffffff810779c4>] ? do_send_sig_info+0x54/0x70
[  208.050949]  [<ffffffff81013ed8>] ? do_notify_resume+0x78/0xa0
[  208.057488]  [<ffffffff81518a8a>] ? int_signal+0x12/0x17
[  208.063437] Code: 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 55 53 48 8b 47 38 48 89 fb 48 85 c0 74 44 48 8b 47 28 45 31 e4 48 89 f5 <48> 8b 40 70 48 85 c0 74 05 ff d0 41 89 c4 f6 43 45 40 75 16 48 
[  208.084896] RIP  [<ffffffff811a746f>] filp_close+0x1f/0x70
[  208.091039]  RSP <ffff8802731bfce0>
[  208.094942] CR2: ffffffffa03ff0f0
[  208.098652] ---[ end trace 7dcc5bf3c94d2437 ]---

Has anyone else encountered this issue?

[swss] 'oper_status' attribute can have a value of '2'?

The 'oper_status' attribute of a port in our DB has three potential values: 'up', 'down' and '2'.

This can be reproduced on a T0 setup: simply 'ifdown' a port that is part of the VLAN. If you read the database rapidly, you will see the 'oper_status' change from 'up' to 'down' and then quickly to '2', where it stays.

Likewise, when you bring the port back up using 'ifup', the 'oper_status' changes from '2' to 'down' and then to 'up'.

Jenkins builds failing due to lack of free space

https://sonic-jenkins.westus.cloudapp.azure.com/job/common/job/buildimage-baseimage-pr/413/:

make[4]: Entering directory '/sonic/src/sonic-linux-kernel/linux-3.16.36'
dh_installdocs 
dh_installchangelogs
dh_strip
dh_compress
dh_fixperms
dh_installdeb
dh_gencontrol -- -Vkernel:Recommends=irqbalance,
dh_md5sums
dh_builddeb -- -Zxz 
dpkg-deb: building package `linux-image-3.16.0-4-amd64' in `../linux-image-3.16.0-4-amd64_3.16.36-1+deb8u2_amd64.deb'.
make[4]: Leaving directory '/sonic/src/sonic-linux-kernel/linux-3.16.36'
dpkg-deb (subprocess): compressing data member: lzma write error: No space left on device
dpkg-deb: error: subprocess <compress> from tar -cf returned error exit status 2
dh_builddeb: dpkg-deb -Zxz --build debian/linux-image-3.16.0-4-amd64-dbg .. returned exit code 2
debian/rules.real:173: recipe for target 'install-base' failed

https://sonic-jenkins.westus.cloudapp.azure.com/job/broadcom/job/buildimage-brcm-all-pr/83/:

running build
running build_ext
error: [Errno 28] No space left on device: 'build/temp.linux-x86_64-3.6-pydebug'
Makefile:627: recipe for target 'sharedmods' failed
make[3]: *** [sharedmods] Error 1

https://sonic-jenkins.westus.cloudapp.azure.com/job/cavium/job/buildimage-cavm-all-pr/410/:

/tmp/ccJe73Wu.s: Assembler messages:
/tmp/ccJe73Wu.s: Fatal error: can't write tests-converter_ut.o: No space left on device
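
All three failures above end in "No space left on device". As a rough illustration only, a build job could check free space before starting instead of failing partway through. The sketch below is not part of the Jenkins setup; the helper name `has_free_space` and the choice of paths to check are assumptions made for the example.

```python
import shutil


def has_free_space(path, required_bytes):
    """Return True if the filesystem holding `path` has at least
    `required_bytes` available to the current user.

    Hypothetical helper for illustration; not part of sonic-buildimage.
    """
    # shutil.disk_usage returns a named tuple (total, used, free) in bytes.
    return shutil.disk_usage(path).free >= required_bytes
```

A job could call this for both the build tree and /tmp (the assembler error above was raised while compiling from a /tmp source file) and abort early with a clear message. The 300G figure from the hardware requirements earlier in this README is one possible threshold; whether it fits the Jenkins workers is an assumption.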

bgp container not started due to database

admin@str-s6000-acs-7:~$ sudo systemctl status bgp
● bgp.service - BGP container
   Loaded: loaded (/etc/systemd/system/bgp.service; enabled)
   Active: failed (Result: exit-code) since Wed 2017-03-08 10:55:45 UTC; 21h ago
 Main PID: 802 (code=exited, status=125)

Mar 08 10:55:39 str-s6000-acs-7 systemd[1]: Started BGP container.
Mar 08 10:55:42 str-s6000-acs-7 bgp.sh[802]: docker: Error response from daemon: No such container: database.
Mar 08 10:55:42 str-s6000-acs-7 bgp.sh[802]: See 'docker run --help'.
Mar 08 10:55:42 str-s6000-acs-7 systemd[1]: bgp.service: main process exited, code=exited, status=125/n/a
Mar 08 10:55:45 str-s6000-acs-7 bgp.sh[903]: Error response from daemon: No such container: bgp
Mar 08 10:55:45 str-s6000-acs-7 systemd[1]: bgp.service: control process exited, code=exited status=1
Mar 08 10:55:45 str-s6000-acs-7 systemd[1]: Unit bgp.service entered failed state.
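
The log above shows the bgp service starting before a container named "database" exists. One possible workaround, sketched here under assumptions and not taken from the actual bgp.sh, is to wait for the dependency before starting. The function names, the timeout values, and the `probe` parameter are hypothetical; the only external call is `docker inspect --format '{{.State.Running}}'`, which exits non-zero when the container does not exist.

```python
import subprocess
import time


def container_is_running(name):
    """Ask the docker CLI whether container `name` exists and is running.

    Raises FileNotFoundError if the docker binary is not installed.
    """
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{.State.Running}}", name],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    # A non-zero exit status means docker knows no such container.
    return result.returncode == 0 and result.stdout.strip() == "true"


def wait_for_container(name, timeout=60.0, interval=1.0,
                       probe=container_is_running):
    """Poll `probe(name)` until it returns True or `timeout` seconds pass.

    `probe` is injectable so the loop can be exercised without docker.
    Returns True if the container came up in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if probe(name):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A start script could call `wait_for_container("database")` and only launch the bgp container when it returns True. Expressing the ordering in the systemd unit itself may be the cleaner fix, but that depends on how the database service is defined, which the log does not show.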

[dockers]: "docker stop" command fails on dockers with shell entrypoints

docker stop sends a signal to the process with PID 1 inside the container. However, if the container's entrypoint is /bin/bash running one or more commands, /bin/bash becomes PID 1. It receives the signal but does not forward it to the processes running in the shell. As a result, docker stop times out instead of shutting the container down properly.
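
To make the forwarding problem concrete, here is a minimal sketch of an entrypoint wrapper that relays signals to the process it starts. It is an illustration under assumptions, not the fix adopted by the project: the name `run_and_forward` and the default signal set are hypothetical. By default docker stop sends SIGTERM first, so that is the signal the wrapper needs to pass on.

```python
import signal
import subprocess


def run_and_forward(cmd, signals=(signal.SIGTERM, signal.SIGINT)):
    """Start `cmd` as a child process and relay `signals` to it.

    Returns the child's exit status; a negative value means the child
    was terminated by that signal number. Must be called from the main
    thread, because Python only allows signal handlers to be installed
    there.
    """
    child = subprocess.Popen(cmd)

    def forward(signum, _frame):
        # Relay the signal only while the child has not been reaped.
        if child.poll() is None:
            child.send_signal(signum)

    # Install the forwarding handler and remember what was there before.
    previous = {sig: signal.signal(sig, forward) for sig in signals}
    try:
        return child.wait()
    finally:
        # Restore the original handlers once the child has exited.
        for sig, handler in previous.items():
            signal.signal(sig, handler)
```

When the entrypoint runs a single long-lived command, a simpler option is to start it with `exec` in the shell script, so the command replaces the shell and becomes PID 1 itself. A wrapper like the one above is mainly of interest when the entrypoint has to stay alive alongside the process it starts.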
