Public:Honza801/debian cluster
testing
Testing is done on jessie.
pacemaker + corosync
- has to be pulled in from sid/stretch.
- clvm does not work (possibly a configuration error?); try again
- problem with clvm locking_type=3 and lvm2-lvmetad (lvmetad)
- still does not work after downloading the sources (apt-get source), recompiling, and fixing the lvm2 Makefile (a library was missing)
cman
- works from jessie
- packages
xen* 4.4.1-9+deb8u1
- clvm works too (without snapshots)
- problem when migrating virtual machines
- the virsh timeout can be worked around with:
virsh -k10 -K10 migrate --live docker xen+ssh://nesoi1/
- but it still ends with an error:
libxl: error: libxl.c:855:libxl_domain_unpause: unpausing domain 5: Invalid argument
cman + kvm
Installation can be done by listing the packages individually (cman, fence-agents ...) or, probably the better option, by installing the redhat-cluster-suite package. It pulls in more dependencies, but they are needed in the end anyway.
apt-get install -y redhat-cluster-suite
To boot from a partition, a grub image has to be built and given to the virtual machine as its kernel parameter
grub-mkimage -O i386-pc -o grub.img --prefix="(hd0)/boot/grub" part_msdos ext2 xfs biosdisk
xvda has to be changed to sda (when migrating a virtual machine from xen)
- jessie libvirt has a problem with systemd.
# virsh start docker
error: Failed to start domain docker
error: error from service: CreateMachine: Activation of org.freedesktop.machine1 timed out
- sysvinit works, including
- lvm, clvm
- kvm, live migration
- gfs2 works well
The cluster configuration is in the file /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster name="nesoi" config_version="3">
  <clusternodes>
    <clusternode name="nesoi1" nodeid="1">
    </clusternode>
    <clusternode name="nesoi2" nodeid="2">
    </clusternode>
    <clusternode name="nesoi3" nodeid="3">
    </clusternode>
  </clusternodes>
  <logging debug="on"/>
  <dlm protocol="tcp" timewarn="500">
  </dlm>
  <fencedevices>
  </fencedevices>
</cluster>
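A minimal sketch of how a skeleton like the one above could be generated for a different set of nodes. The function name make_cluster_conf and its argument order are made up for illustration; node ids are simply assigned in the order the nodes are given, and the logging/dlm/fencedevices parts are copied from the configuration above. As far as I know, config_version should be raised whenever the file changes.

```shell
#!/bin/sh
# make_cluster_conf CLUSTER_NAME CONFIG_VERSION NODE...
# (hypothetical helper, not part of cman) prints a cluster.conf skeleton
# with one <clusternode> per NODE, numbered from 1 in the given order.
make_cluster_conf() {
    cluster=$1
    version=$2
    shift 2
    printf '%s\n' '<?xml version="1.0"?>'
    printf '<cluster name="%s" config_version="%s">\n' "$cluster" "$version"
    printf '%s\n' '  <clusternodes>'
    id=1
    for node in "$@"; do
        printf '    <clusternode name="%s" nodeid="%s">\n' "$node" "$id"
        printf '%s\n' '    </clusternode>'
        id=$((id + 1))
    done
    printf '%s\n' '  </clusternodes>'
    printf '%s\n' '  <logging debug="on"/>'
    printf '%s\n' '  <dlm protocol="tcp" timewarn="500">'
    printf '%s\n' '  </dlm>'
    printf '%s\n' '  <fencedevices>'
    printf '%s\n' '  </fencedevices>'
    printf '%s\n' '</cluster>'
}
```

For example, make_cluster_conf nesoi 4 nesoi1 nesoi2 nesoi3 prints the same layout with config_version raised to 4; the output would still have to be reviewed before copying it to every node.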
libvirt
control
https://libvirt.org/virshcmdref.html
virsh start
virsh shutdown
virsh list
virsh define
virsh undefine
web manager
- mist.io - a solution for large clouds, requires registering an account at mist.io
- ovirt - tailored to rhel, needs the ovirt agent (not available for debian), large and complex
- ganeti
- archipel - communicates over xmpp, unsuitable
webvirtcloud
A small django application that uses libvirt/ssh. Our fork of the project, with our modifications, is at https://github.com/honza801/webvirtcloud.
The original uses nginx, so we have to come up with an apache configuration.
<Directory /srv/webvirtcloud/static>
    Require all granted
</Directory>
Alias /static /srv/webvirtcloud/static
SSLProxyEngine On
ProxyRequests Off
ProxyPreserveHost On
RequestHeader set X-Forwarded-Proto "https"
ProxyPass /static !
ProxyPass / http://127.0.0.1:8000/
ProxyPassReverse / http://127.0.0.1:8000/
linux
creating a new template
start rescue with an empty disk
qemu-img create -f qcow2 debian9-root.qcow2 10G
virt-rescue -a debian9-root.qcow2 --network -m 1024
inside, bring up the network and create the filesystem
dhclient eth0
mkfs.xfs /dev/sda
mount /dev/sda /sysroot
prepare the base system
debootstrap stretch /sysroot ftp://ftp.zcu.cz/pub/linux/debian
mount -t proc none /sysroot/proc
mount -t sysfs none /sysroot/sys
mount /dev /sysroot/dev --bind
chroot /sysroot bash
add a sensible repository
cat > /etc/apt/sources.list <<EOF
deb http://ftp.zcu.cz/pub/linux/debian stretch main non-free contrib
deb http://ftp.zcu.cz/pub/linux/debian stretch-updates main contrib non-free
deb http://download.zcu.cz/public/software/linux/debian stable main
EOF
apt-get update
apt-get install apt-transport-https
apt-get update
the image could contain these packages (configured)
apt-get -o Dpkg::Options::="--force-confdef" -y install acpid bash-completion cfengine-community curl grub2 krb5-user less ntp netfilter-persistent openssh-server openafs-client python perl linux-image-amd64 vim xfsprogs
dpkg-reconfigure tzdata
update-grub
passwd -d root
/etc/ntp.conf /etc/krb5.conf /etc/openafs/*
cat > /etc/network/interfaces.d/ens3 <<EOF
auto ens3
iface ens3 inet dhcp
EOF
copy in guest-init (it extends the disk and sets the hostname and static ip) and add it to rc.local
windows
the standard device types work, but for virtio we have to add drivers to the image
- create a virtual machine from the image on afs
- make an image
- create the partitions
- write the image onto the partition, resize
- write boot onto loop
- create a virtual machine with the ide image plus one more disk (partitioned, formatted)
- add a virtio disk as well
- attach the cd with the virtio drivers
- start the virtual machine
- install the virtio drivers (viostor, vioser, NetKVM, balloon)
- run sysprep, shut down
- create the virtio-image
- copy copy-sysprep-config
- copy the filesystem-extension scripts extend_disk.txt and setupcomplete.cmd to /Windows/Setup/scripts
- resize
- dd
- create a virtual machine from the virtio-image
- https://fedoraproject.org/wiki/Windows_Virtio_Drivers
- https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_Installation_Guide/form-Virtualization_Host_Configuration_and_Guest_Installation_Guide-Para_virtualized_drivers-Mounting_the_image_with_virt_manager.html
ideas
- do we want fencing?
- how to create machines - rt/form/let everyone onto the cloud
- how to announce (and to whom) that a machine has been created - see the backup problem
administration
a few good tips :)
cluster
- configuration in
/etc/cluster/cluster.conf
- service cman start
- clustat
clvm
- /etc/lvm/lvm.conf: locking_type = 3
- service clvm start
- vgs/lvs
gfs
- service gfs2-cluster start
- mkfs.gfs2 -t nesoi:gfs2-single -p lock_dlm -j 3 /dev/vg-single/gfs2
- mount /dev/vg-single/gfs2 /mnt
libvirt
- configuration in
/gfs/libvirt/etc/nesoi?/
- virtual machines
/gfs/libvirt/etc/nesoi?/qemu/
- images
/gfs/libvirt/storage/
- virtual machines
- service libvirtd start
- virsh help
- start/stop
- destroy
- undefine
- virsh list [--all]
web
https://cloud.civ.zcu.cz is a virtual machine on the nesoi cluster.
- virsh start cloud
editing a template
Our templates are in qcow2, so they cannot be mounted directly. We have to use qemu-nbd.
qemu-nbd -c /dev/nbd0 template0.qcow
mount /dev/nbd0 /mnt/disk
# chroot/edit
umount /dev/nbd0
qemu-nbd -d /dev/nbd0
see mount_qcow.sh
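The steps above can be wrapped into two small functions. This is only a sketch and not the contents of mount_qcow.sh: the names qcow_mount and qcow_umount and the RUN variable are made up for illustration. It assumes the nbd kernel module may not be loaded yet (hence the modprobe) and that the image holds a filesystem directly on the disk, without a partition table, as in the template created above.

```shell
#!/bin/sh
# RUN is a hypothetical dry-run hook: leave it empty to execute the commands,
# set RUN=echo to only print them.
RUN=${RUN:-}

# qcow_mount IMAGE MOUNTPOINT [NBD_DEVICE]
# Connects the image to an nbd device and mounts it.
qcow_mount() {
    image=$1
    mountpoint=$2
    dev=${3:-/dev/nbd0}
    $RUN modprobe nbd &&
    $RUN qemu-nbd -c "$dev" "$image" &&
    $RUN mount "$dev" "$mountpoint"
}

# qcow_umount [NBD_DEVICE]
# Unmounts the device and disconnects it from the image.
qcow_umount() {
    dev=${1:-/dev/nbd0}
    $RUN umount "$dev" &&
    $RUN qemu-nbd -d "$dev"
}
```

With RUN=echo, qcow_mount template0.qcow /mnt/disk only prints the three commands, which is a cheap way to check them before running as root.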
adding a disk
virt-manager does not create qcow even when I select it in the menu, so it has to be done by hand (jmeno_virtualu stands for the name of the virtual machine)
virsh vol-create-as gfs-single jmeno_virtualu-mnt.qcow2 100G --format qcow2
virsh attach-disk jmeno_virtualu /gfs-single/libvirt/storage/jmeno_virtualu-mnt.qcow2 vdb virtio qemu qcow2
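The two commands can be parameterized by the machine name and size. A sketch only: add_qcow_disk and RUN are hypothetical names, the pool name and storage path are taken from the commands above, and the positional virtio/qemu/qcow2 arguments are written out as the --targetbus, --driver and --subdriver options of virsh attach-disk.

```shell
#!/bin/sh
# RUN is a hypothetical dry-run hook: empty executes, RUN=echo only prints.
RUN=${RUN:-}

# add_qcow_disk VM SIZE [TARGET] [POOL] [POOL_DIR]
# Creates VM-mnt.qcow2 in the pool and attaches it to VM as a virtio disk.
add_qcow_disk() {
    vm=$1
    size=$2
    target=${3:-vdb}
    pool=${4:-gfs-single}
    pool_dir=${5:-/gfs-single/libvirt/storage}
    vol="$vm-mnt.qcow2"
    $RUN virsh vol-create-as "$pool" "$vol" "$size" --format qcow2 &&
    $RUN virsh attach-disk "$vm" "$pool_dir/$vol" "$target" \
        --targetbus virtio --driver qemu --subdriver qcow2
}
```

For example, RUN=echo add_qcow_disk web1 100G prints both commands for a machine called web1 (a made-up name) without touching libvirt.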
problems
cluster fails to form
It can happen that after one node goes down, the cluster does not form again correctly. How to tell? /gfs is not mounted. clustat and virsh list --all show nothing. cloud.civ shows no virtual machines on the restarted node. The services have to be started by hand in the correct order, and libvirt has to be restarted because its configuration is stored on gfs.
- service cman start
- service clvm start
- service gfs2-cluster start
- service gfs2-utils start
- service libvirtd restart
check
- virsh list --all
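A sketch of the manual recovery as one function that stops at the first service that fails to start, so that later services are not started on top of a broken one. The function name start_cluster_stack and the RUN variable are made up for illustration; the service names and their order are the ones listed above.

```shell
#!/bin/sh
# RUN is a hypothetical dry-run hook: empty executes, RUN=echo only prints.
RUN=${RUN:-}

# Start the cluster services in order, then restart libvirtd so that it
# re-reads its configuration from the freshly mounted gfs.
start_cluster_stack() {
    for svc in cman clvm gfs2-cluster gfs2-utils; do
        $RUN service "$svc" start || {
            echo "failed to start $svc, stopping here" >&2
            return 1
        }
    done
    $RUN service libvirtd restart
}
```

After it finishes, virsh list --all should show the virtual machines again.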
web goes down
How do I tell? Try to connect to the web from somewhere. Solution:
- go through all the nodes and check whether the cloud virtual machine happens to be running on one of them
- once I am sure it is not running, simply start it on the node where it is defined (nesoi3)
- ssh nesoi3 virsh start cloud
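Checking every node by hand can be scripted. A minimal sketch, assuming passwordless ssh to the nodes; list_domains and find_running are made-up names. virsh list --name prints the names of the running domains only, which is what the check needs.

```shell
#!/bin/sh
# list_domains NODE: print the names of the domains running on NODE.
list_domains() {
    ssh "$1" virsh list --name
}

# find_running VM NODE...
# Prints the first node on which VM is running and returns 0,
# or prints nothing and returns 1 if it runs on none of them.
find_running() {
    vm=$1
    shift
    for node in "$@"; do
        if list_domains "$node" | grep -qxF -- "$vm"; then
            echo "$node"
            return 0
        fi
    done
    return 1
}
```

Used as find_running cloud nesoi1 nesoi2 nesoi3 || ssh nesoi3 virsh start cloud, it starts the machine only when no node reports it as running. A node that cannot be reached looks the same as a node without the machine, so this does not replace making sure by hand.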
check:
virtual machine hangs
- independent of the storage type (lvm, gfs, local)
- independent of the driver (ide, sata, virtio)
- shows up during sync
- most likely a cache problem
- mounting with sync on the host did not help, and local operations were slow
- mounting with sync on the guest did not help
- the qemu io cache mode directsync seems to work well
see
- https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Tuning_and_Optimization_Guide/sect-Virtualization_Tuning_Optimization_Guide-BlockIO-Caching.html
- https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Tuning_and_Optimization_Guide/sect-Virtualization_Tuning_Optimization_Guide-BlockIO-IO_mode.html
- http://www.ibm.com/support/knowledgecenter/linuxonibm/liaat/liaatbpkvmguestcache.htm
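For reference, the directsync cache mode mentioned above is set with the cache attribute of the disk's driver element in the libvirt domain XML. A sketch that prints such a disk definition; disk_xml is a made-up helper, and the qcow2 default and the virtio bus are assumptions taken from the rest of these notes.

```shell
#!/bin/sh
# disk_xml FILE TARGET [FORMAT]
# Prints a libvirt <disk> element for FILE with cache='directsync'.
disk_xml() {
    file=$1
    target=$2
    format=${3:-qcow2}
    cat <<EOF
<disk type='file' device='disk'>
  <driver name='qemu' type='$format' cache='directsync'/>
  <source file='$file'/>
  <target dev='$target' bus='virtio'/>
</disk>
EOF
}
```

The output can be saved to a file and attached with virsh attach-device, or the same driver line can be set on an existing disk with virsh edit.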
web console reports error 1006
most likely the certificate in /srv/webvirtcloud/console/cert.pem is wrong
bringing up an inquorate cluster
This is for the case when most of the nodes are down and I am desperate enough that I need to run things on an inquorate cluster.
Say I have a 3-node cluster and can start only one node.
service cman start
cman_tool votes -v 2 # the number of votes must match the Quorum value in cman_tool status
service cman start # yes, it really is here a second time
fence_ack_manual node02
fence_ack_manual node03
service clvm start
service gfs2-cluster start
service gfs2-utils start
Once the other nodes can be started again, it is enough to bring up the services on them.
Check the vote counts in cman_tool status and correct them.
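The value 2 above follows from simple majority arithmetic. A small sketch, assuming every node has one vote and quorum is more than half of the expected votes, which is my understanding of how cman computes it; quorum_for is a made-up name.

```shell
#!/bin/sh
# quorum_for TOTAL_VOTES
# Prints the smallest number of votes that is a strict majority.
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}
```

quorum_for 3 prints 2, so a single surviving node of a 3-node cluster has to be given 2 votes to become quorate on its own. The Quorum line in cman_tool status remains the authoritative value.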
links
- https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/High_Availability_Add-On_Reference/ch-clustresources-HAAR.html
- https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html-single/Virtualization_Tuning_and_Optimization_Guide/index.html
- http://clusterlabs.org/quickstart-ubuntu.html
- https://wiki.debian.org/Debian-HA/ClustersFromScratch
- http://ppa.mmogp.com/apt/debian/
- https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Logical_Volume_Manager_Administration/LVM_administration.html#cluster_setup
- CIV:Granty/Optimalizace_správy_a_zabezpečení_virtuálních_strojů
- http://fondrozvoje.cesnet.cz/projekt.aspx?ID=571