


    Scaling Red Hat OpenStack Platform to more than 500 Overcloud Nodes

    At Red Hat, performance and scale are treated as first class citizens and a lot of time and effort are put into making sure our products scale. We have a dedicated team of performance and scale engineers that work closely with product management, developer


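    Once the number of compute nodes deployed by director passed roughly 150, various issues started to appear whenever new nodes were deployed.
    Below I pick out only the parts of the original article above that are likely to be useful.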



    • /etc/keystone/keystone.conf
      •  We raised the number of Keystone admin workers to 32 and main workers to 24
      • The default is half the number of CPUs allocated to the director node
    [root@rhosp-director ~]# vi /etc/keystone/keystone.conf


    • /etc/httpd/conf.d/10-keystone_wsgi_admin.conf
      • Change the number of processes to 32
    • /etc/httpd/conf.d/10-keystone_wsgi_main.conf
      • Change the number of processes to 24
    [root@rhosp-director ~]# vi /etc/httpd/conf.d/10-keystone_wsgi_admin.conf
      WSGIDaemonProcess keystone_admin display-name=keystone-admin group=keystone processes=32 threads=1 user=keystone
    [root@rhosp-director ~]# vi /etc/httpd/conf.d/10-keystone_wsgi_main.conf
      WSGIDaemonProcess keystone_main display-name=keystone-main group=keystone processes=24 threads=1 user=keystone
    "Keystone processes do not take a substantial amount of memory,  so it is safe to increase the process count. Even with 32 processes of admin workers, keystone admin takes around 3-4 GB of memory and with 24 processes, Keystone main takes around 2-3 GB of RSS memory."
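After editing the two WSGI files, it can help to confirm which process counts are actually in place. The following is a minimal sketch, assuming Python 3 is available on the director node. The helper name `wsgi_process_count` is hypothetical, and it only understands the `WSGIDaemonProcess` line format shown above.

```python
import re


def wsgi_process_count(conf_text):
    """Return the processes= value from the first WSGIDaemonProcess line, or None."""
    for line in conf_text.splitlines():
        stripped = line.strip()
        # Skip comments and unrelated Apache directives.
        if not stripped.startswith("WSGIDaemonProcess"):
            continue
        match = re.search(r"\bprocesses=(\d+)\b", stripped)
        if match:
            return int(match.group(1))
    return None


if __name__ == "__main__":
    # Sample line copied from the admin vhost shown above.
    sample = (
        "WSGIDaemonProcess keystone_admin display-name=keystone-admin "
        "group=keystone processes=32 threads=1 user=keystone"
    )
    print(wsgi_process_count(sample))
```

To check a real file, pass its contents instead of the sample string, for example `wsgi_process_count(open("/etc/httpd/conf.d/10-keystone_wsgi_main.conf").read())`.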


    • We also had to enable caching with memcached to improve Keystone performance. (Configure Keystone to use memcached as its cache.)
    [root@rhosp-director ~]# vi /etc/keystone/keystone.conf
    enabled = true
    backend = dogpile.cache.memcached
    • Set the notification driver to noop
    [root@rhosp-director ~]# vi /etc/keystone/keystone.conf


    • /etc/heat/heat.conf
      • num_engine_workers=48
      • executor_thread_pool_size = 48
      • rpc_response_timeout=1200
    [root@rhosp-director ~]# vi /etc/heat/heat.conf
    executor_thread_pool_size = 48
    • Enable caching in /etc/heat/heat.conf
    [root@rhosp-director ~]# vi /etc/heat/heat.conf
    backend = dogpile.cache.memcached
    enabled = true
    memcache_servers =
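The Heat values listed above can also be applied with a script instead of by hand in `vi`. This is only a sketch under stated assumptions: the helper `apply_settings` and the `HEAT_TUNING` dictionary are hypothetical names, and the section placement (`[DEFAULT]` for the worker and timeout options, `[cache]` for the caching options) is my assumption, since the snippets above do not show section headers. `memcache_servers` is deliberately not set because its value is not given above.

```python
import configparser

# Values taken from the list above; the section names are assumptions.
HEAT_TUNING = {
    "DEFAULT": {
        "num_engine_workers": "48",
        "executor_thread_pool_size": "48",
        "rpc_response_timeout": "1200",
    },
    "cache": {
        "enabled": "true",
        "backend": "dogpile.cache.memcached",
    },
}


def apply_settings(path, settings):
    """Set the given options in an INI-style file, creating sections as needed."""
    # strict=False tolerates repeated options; interpolation=None keeps '%' literal.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # preserve option-name case
    parser.read(path)
    for section, options in settings.items():
        # DEFAULT always exists in configparser and cannot be added explicitly.
        if section != "DEFAULT" and not parser.has_section(section):
            parser.add_section(section)
        for key, value in options.items():
            parser.set(section, key, value)
    with open(path, "w") as handle:
        parser.write(handle)
```

Note that `configparser` drops comments when it rewrites a file, so it is safer to call `apply_settings` on a copy of `heat.conf`, compare the result with the original, and restart the Heat services only after reviewing the difference.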


    • /etc/my.cnf.d/galera.cnf
    [root@rhosp-director ~]# vi /etc/my.cnf.d/galera.cnf
    innodb_buffer_pool_size = 5G


    • /etc/neutron/neutron.conf
    [root@rhosp-director ~]# vi /etc/neutron/neutron.conf


    • /etc/ironic/ironic.conf
      • This has the effect of reducing ironic-conductor CPU usage
    [root@rhosp-director ~]# vi /etc/ironic/ironic.conf
    sync_power_state_interval = 180


    • /etc/mistral/mistral.conf
      • The execution_field_size_limit_kb value must be increased. (There does not seem to be a fixed recommended value; it probably has to be raised to suit the environment.)
    [root@rhosp-director ~]# vi /etc/mistral/mistral.conf


    • /etc/nova/nova.conf
    [root@rhosp-director ~]# vi /etc/nova/nova.conf




    > In OpenStack Queens, director/TripleO defaults to use an agent running on each overcloud node called os-collect-config. This agent periodically polls the undercloud Heat API for software configuration changes that need to be applied to the node. The os-collect-config agent runs os-refresh-config and os-apply-config as needed whenever new software configuration changes are detected. 

    On each compute node the os-collect-config service is running, and it periodically polls the director's Heat API
    [root@rhosp-comp-1 ~]# systemctl status os-*
    ● os-collect-config.service - Collect metadata and run hook commands.
       Loaded: loaded (/usr/lib/systemd/system/os-collect-config.service; enabled; vendor preset: disabled)
       Active: active (running) since Tue 2021-05-18 21:48:15 KST; 1 weeks 0 days ago
     Main PID: 2292 (os-collect-conf)
        Tasks: 1
       Memory: 161.5M
       CGroup: /system.slice/os-collect-config.service
               └─2292 /usr/bin/python /usr/bin/os-collect-config


    > To add 1 compute node to a 500+ node overcloud using the default Heat/os-collect-config method, the stack update took approximately 68 minutes. The time taken for stack update as well as the amount of CPU resources consumed by the heat-engine can be significantly cut down by passing the --skip-deploy-identifier flag to the overcloud deploy which prevents puppet from running on previously deployed nodes where no changes are required. In this example, the time taken to add 1 compute node was reduced from 68 to 61 minutes along with reduced CPU usage by heat-engine.
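The flag mentioned in the quote is appended to the usual deploy command. Here is a small sketch that only builds and prints the command line rather than running it. The function name `build_deploy_command` and the environment file path are hypothetical; a real scale-out must pass the same `-e` files that were used for the original deployment. `shlex.join` requires Python 3.8 or later.

```python
import shlex


def build_deploy_command(environment_files, skip_deploy_identifier=True):
    """Build the argument list for `openstack overcloud deploy` without running it."""
    command = ["openstack", "overcloud", "deploy", "--templates"]
    for env_file in environment_files:
        command += ["-e", env_file]
    if skip_deploy_identifier:
        # Avoids re-running puppet on already deployed nodes that have no changes.
        command.append("--skip-deploy-identifier")
    return command


if __name__ == "__main__":
    # Hypothetical environment file; replace with the files from the original deployment.
    print(shlex.join(build_deploy_command(["/home/stack/templates/node-info.yaml"])))
```

Printing the command first makes it easy to review the full argument list before running it on the director node as the deployment user.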
