<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Cerberus on Krkn</title><link>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/</link><description>Recent content in Cerberus on Krkn</description><generator>Hugo</generator><language>en</language><atom:link href="https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/index.xml" rel="self" type="application/rss+xml"/><item><title>Installation</title><link>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/installation/</link><pubDate>Thu, 05 Jan 2017 00:00:00 +0000</pubDate><guid>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/installation/</guid><description>&lt;p>Cerberus can be run in the following ways:&lt;/p>
&lt;ul>
&lt;li>Standalone Python program, installed through Git or as a Python package&lt;/li>
&lt;li>Containerized version using either Podman or Docker as the runtime&lt;/li>
&lt;li>Kubernetes or OpenShift deployment&lt;/li>
&lt;/ul>


&lt;div class="alert alert-primary" role="alert">
&lt;h4 class="alert-heading">Note&lt;/h4>

 Only OpenShift 4.x versions are tested.

&lt;/div>

&lt;h2 id="git">
 Git
 &lt;a class="td-heading-self-link" href="#git" aria-label="Heading self-link">&lt;/a>
&lt;/h2>
&lt;p>Pick the latest stable release to install &lt;a href="https://github.com/redhat-chaos/cerberus/releases/">here&lt;/a>.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">$ git clone https://github.com/redhat-chaos/cerberus.git --branch &amp;lt;release&amp;gt;
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="install-the-dependencies">
 Install the dependencies
 &lt;a class="td-heading-self-link" href="#install-the-dependencies" aria-label="Heading self-link">&lt;/a>
&lt;/h3>
&lt;p>&lt;strong>NOTE&lt;/strong>: It is recommended to use a virtual environment (pyenv, venv) to prevent conflicts with already installed packages.&lt;/p></description></item><item><title>Config</title><link>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/config/</link><pubDate>Thu, 05 Jan 2017 00:00:00 +0000</pubDate><guid>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/config/</guid><description>&lt;p>Cerberus Config Components Explained&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/config/#config">Sample Config&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/config/#watch-nodes">Watch Nodes&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/config/#watch-cluster-operators">Watch Operators&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/config/#watch-routes">Watch Routes&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/config/#watch-master-schedulable-status">Watch Master Schedulable Status&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/config/#watch-namespaces">Watch Namespaces&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/config/#watch-terminating-namespaces">Watch Terminating Namespaces&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/config/#publish-status">Publish Status&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/config/#inspect-components">Inspect Components&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/config/#custom-checks">Custom Checks&lt;/a>&lt;/li>
&lt;/ul>
&lt;h3 id="config">
 Config
 &lt;a class="td-heading-self-link" href="#config" aria-label="Heading self-link">&lt;/a>
&lt;/h3>
&lt;p>Set the components to monitor and the tunings, such as the duration to wait between checks, in the config file located at config/config.yaml. A sample config looks like:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-yaml" data-lang="yaml">&lt;span class="line">&lt;span class="cl">&lt;span class="nt">cerberus&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">distribution&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">openshift &lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Distribution can be kubernetes or openshift&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">kubeconfig_path&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">/root/.kube/config &lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Path to kubeconfig&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">port&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="m">8081&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># http server port where cerberus status is published&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">watch_nodes&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">True&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Set to True for the cerberus to monitor the cluster nodes&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">watch_cluster_operators&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">True&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Set to True for cerberus to monitor cluster operators&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">watch_terminating_namespaces&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">True&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Set to True to monitor if any namespaces (set below under &amp;#39;watch_namespaces&amp;#39;) start terminating&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">watch_url_routes&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c"># Route URLs you want to monitor. This is a double array with the URL and an optional authorization parameter&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">watch_master_schedulable&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># When enabled checks for the schedulable master nodes with given label.&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">enabled&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">True&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">label&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">node-role.kubernetes.io/master&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">watch_namespaces&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># List of namespaces to be monitored&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-etcd&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-apiserver&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-kube-apiserver&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-monitoring&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-kube-controller-manager&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-machine-api&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-kube-scheduler&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-ingress&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-sdn &lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># When enabled, it will check for the cluster sdn and monitor that namespace&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">watch_namespaces_ignore_pattern&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Ignores pods matching the regex pattern in the namespaces specified under watch_namespaces&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">cerberus_publish_status&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">True&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># When enabled, cerberus starts a light weight http server and publishes the status&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">inspect_components&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">False&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Enable it only when OpenShift client is supported to run&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c"># When enabled, cerberus collects logs, events and metrics of failed components&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">prometheus_url&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># The prometheus url/route is automatically obtained in case of OpenShift, please set it when the distribution is Kubernetes.&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">prometheus_bearer_token&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># The bearer token is automatically obtained in case of OpenShift, please set it when the distribution is Kubernetes. This is needed to authenticate with prometheus.&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c"># This enables Cerberus to query prometheus and alert on observing high Kube API Server latencies.&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">slack_integration&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">False&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># When enabled, cerberus reports the failed iterations in the slack channel&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c"># The following env vars need to be set: SLACK_API_TOKEN ( Bot User OAuth Access Token ) and SLACK_CHANNEL ( channel to send notifications in case of failures )&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c"># When slack_integration is enabled, a watcher can be assigned for each day. The watcher of the day is tagged while reporting failures in the slack channel. Values are Slack member IDs.&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">watcher_slack_ID&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># (NOTE: Defining the watcher IDs is optional. When the watcher Slack IDs are not defined, the slack_team_alias tag is used if it is set; otherwise no tag is used while reporting failures in the slack channel.)&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">Monday&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">Tuesday&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">Wednesday&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">Thursday&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">Friday&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">Saturday&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">Sunday&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">slack_team_alias&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># The slack team alias to be tagged while reporting failures in the slack channel when no watcher is assigned&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">custom_checks&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">custom_checks/custom_check_sample.py &lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Relative paths of files containing additional user defined checks&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">tunings&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">timeout&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="m">20&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Number of seconds before requests fail&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">iterations&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="m">1&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Iterations to loop before stopping the watch, it will be replaced with infinity when the daemon mode is enabled&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">sleep_time&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="m">3&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Sleep duration between each iteration&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">kube_api_request_chunk_size&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="m">250&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Large requests will be broken into the specified chunk size to reduce the load on API server and improve responsiveness.&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">daemon_mode&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">True&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Iterations are set to infinity which means that the cerberus will monitor the resources forever&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">cores_usage_percentage&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="m">0.5&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Set the fraction of cores to be used for multiprocessing&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">database&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">database_path&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">/tmp/cerberus.db &lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Path where cerberus database needs to be stored&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">reuse_database&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">False&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># When enabled, the database is reused to store the failures&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h4 id="watch-nodes">
 Watch Nodes
 &lt;a class="td-heading-self-link" href="#watch-nodes" aria-label="Heading self-link">&lt;/a>
&lt;/h4>
&lt;p>This flag returns any nodes where KernelDeadlock is not set to False and that do not have a &lt;code>Ready&lt;/code> status.&lt;/p></description></item><item><title>Example Report</title><link>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/example_report/</link><pubDate>Thu, 05 Jan 2017 00:00:00 +0000</pubDate><guid>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/example_report/</guid><description>&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:06,393 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Starting cerberus
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:06,401 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Initializing client to talk to the Kubernetes cluster
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:06,434 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Fetching cluster info
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:06,739 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Publishing cerberus status at http://0.0.0.0:8080
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:06,753 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Starting http server at http://0.0.0.0:8080
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:06,753 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Daemon mode enabled, cerberus will monitor forever
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:06,753 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Ignoring the iterations &lt;span class="nb">set&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:25,104 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 4: Node status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:25,133 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 4: Etcd member pods status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:25,161 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 4: OpenShift apiserver status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:25,546 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 4: Kube ApiServer status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:25,717 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 4: Monitoring stack status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:25,720 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 4: Kube controller status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:25,746 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 4: Machine API components status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:25,945 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 4: Kube scheduler status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:25,963 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 4: OpenShift ingress status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:26,077 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 4: OpenShift SDN status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:26,077 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> HTTP requests served: &lt;span class="m">0&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:26,077 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Sleeping &lt;span class="k">for&lt;/span> the specified duration: &lt;span class="m">5&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:31,134 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 5: Node status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:31,162 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 5: Etcd member pods status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:31,190 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 5: OpenShift apiserver status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">127.0.0.1 - - &lt;span class="o">[&lt;/span>26/Mar/2020 22:05:31&lt;span class="o">]&lt;/span> &lt;span class="s2">&amp;#34;GET / HTTP/1.1&amp;#34;&lt;/span> &lt;span class="m">200&lt;/span> -
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:31,588 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 5: Kube ApiServer status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:31,759 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 5: Monitoring stack status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:31,763 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 5: Kube controller status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:31,788 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 5: Machine API components status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:31,989 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 5: Kube scheduler status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:32,007 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 5: OpenShift ingress status: True
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:32,118 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Iteration 5: OpenShift SDN status: False
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:32,118 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> HTTP requests served: &lt;span class="m">1&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:32,118 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Sleeping &lt;span class="k">for&lt;/span> the specified duration: &lt;span class="m">5&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">+--------------------------------------------------Failed Components--------------------------------------------------+
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-03-26 22:05:37,123 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Failed openshift sdn components: &lt;span class="o">[&lt;/span>&lt;span class="s1">&amp;#39;sdn-xmqhd&amp;#39;&lt;/span>&lt;span class="o">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-05-23 23:26:43,041 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> ------------------------- Iteration Stats ---------------------------------------------
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-05-23 23:26:43,041 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Time taken to run watch_nodes in iteration 1: 0.0996248722076416 seconds
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-05-23 23:26:43,041 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Time taken to run watch_cluster_operators in iteration 1: 0.3672499656677246 seconds
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-05-23 23:26:43,041 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Time taken to run watch_namespaces in iteration 1: 1.085144281387329 seconds
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-05-23 23:26:43,041 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Time taken to run entire_iteration in iteration 1: 4.107403039932251 seconds
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2020-05-23 23:26:43,041 &lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> ---------------------------------------------------------------------------------------
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div></description></item><item><title>Usage</title><link>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/usage/</link><pubDate>Thu, 05 Jan 2017 00:00:00 +0000</pubDate><guid>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/usage/</guid><description>&lt;h3 id="config">
 Config
 &lt;a class="td-heading-self-link" href="#config" aria-label="Heading self-link">&lt;/a>
&lt;/h3>
&lt;p>Set the supported components to monitor and the tunings, such as the number of iterations to monitor and the duration to wait between checks, in the config file located at config/config.yaml. A sample config looks like:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-yaml" data-lang="yaml">&lt;span class="line">&lt;span class="cl">&lt;span class="nt">cerberus&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">distribution&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">openshift &lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Distribution can be kubernetes or openshift&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">kubeconfig_path&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">~/.kube/config &lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Path to kubeconfig&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">port&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="m">8080&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># http server port where cerberus status is published&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">watch_nodes&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">True&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Set to True for the cerberus to monitor the cluster nodes&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">watch_cluster_operators&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">True&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Set to True for cerberus to monitor cluster operators. Parameter is optional, will set to True if not specified&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">watch_url_routes&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Route url&amp;#39;s you want to monitor&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- - &lt;span class="l">https://...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="l">Bearer **** &lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># This parameter is optional, specify authorization need for get call to route&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- - &lt;span class="l">http://...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">watch_master_schedulable&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># When enabled, checks for schedulable master nodes with the given label&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">enabled&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">True&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">label&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">node-role.kubernetes.io/master&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">watch_namespaces&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># List of namespaces to be monitored&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-etcd&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-apiserver&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-kube-apiserver&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-monitoring&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-kube-controller-manager&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-machine-api&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-kube-scheduler&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-ingress&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">openshift-sdn&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">cerberus_publish_status&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">True&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># When enabled, cerberus starts a light weight http server and publishes the status&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">inspect_components&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">False&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Enable it only when OpenShift client is supported to run.&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c"># When enabled, cerberus collects logs, events and metrics of failed components&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">prometheus_url&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># The prometheus url/route is automatically obtained in case of OpenShift, please set it when the distribution is Kubernetes.&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">prometheus_bearer_token&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># The bearer token is automatically obtained in case of OpenShift, please set it when the distribution is Kubernetes. This is needed to authenticate with prometheus.&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c"># This enables Cerberus to query prometheus and alert on observing high Kube API Server latencies.&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">slack_integration&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">False&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># When enabled, cerberus reports status of failed iterations in the slack channel&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c"># The following env vars need to be set: SLACK_API_TOKEN ( Bot User OAuth Access Token ) and SLACK_CHANNEL ( channel to send notifications in case of failures )&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c"># When slack_integration is enabled, a watcher can be assigned for each day. The watcher of the day is tagged while reporting failures in the slack channel. Values are slack member ID&amp;#39;s.&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">watcher_slack_ID&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># (NOTE: Defining the watcher IDs is optional. When they are not defined, the slack_team_alias tag is used if it is set; otherwise no tag is used while reporting failures in the slack channel.)&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">Monday&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">Tuesday&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">Wednesday&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">Thursday&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">Friday&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">Saturday&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">Sunday&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">slack_team_alias&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># The slack team alias to be tagged while reporting failures in the slack channel when no watcher is assigned&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">custom_checks&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Relative paths of files containing additional user defined checks&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">custom_checks/custom_check_sample.py&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="w"> &lt;/span>&lt;span class="l">custom_check.py&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">tunings&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">iterations&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="m">5&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Iterations to loop before stopping the watch, it will be replaced with infinity when the daemon mode is enabled&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">sleep_time&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="m">60&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Sleep duration between each iteration&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">kube_api_request_chunk_size&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="m">250&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Large requests will be broken into the specified chunk size to reduce the load on API server and improve responsiveness.&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">daemon_mode&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">True&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Iterations are set to infinity which means that the cerberus will monitor the resources forever&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">cores_usage_percentage&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="m">0.5&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Set the fraction of cores to be used for multiprocessing&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">database&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">database_path&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">/tmp/cerberus.db &lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Path where cerberus database needs to be stored&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">reuse_database&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">False&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># When enabled, the database is reused to store the failures&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>
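&lt;p>The sample leaves several values empty. As a minimal sketch, the fragment below shows how they might be filled in for a Kubernetes cluster with Slack notifications enabled. Every value is a placeholder assumption: the Prometheus URL, the token, the Slack member IDs and the team alias are hypothetical. Only the keys come from the sample above, and keys not shown stay as in the sample.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-yaml" data-lang="yaml">cerberus:
    distribution: kubernetes                          # Prometheus details are not obtained automatically on Kubernetes
    prometheus_url: https://prometheus.example.com    # Hypothetical URL, replace with your own
    prometheus_bearer_token: REPLACE_WITH_TOKEN       # Placeholder, do not commit a real token
    slack_integration: True                           # Also needs the SLACK_API_TOKEN and SLACK_CHANNEL env vars
    watcher_slack_ID:
        Monday: U0000000001                           # Hypothetical Slack member ID
        Tuesday: U0000000002                          # Hypothetical Slack member ID
        Wednesday:                                    # No watcher assigned for this day
        Thursday:
        Friday:
        Saturday:
        Sunday:
    slack_team_alias: chaos-team                      # Hypothetical alias, tagged when no watcher is assigned
&lt;/code>&lt;/pre>&lt;/div>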

&lt;div class="alert alert-primary" role="alert">
&lt;h4 class="alert-heading">Note&lt;/h4>

 watch_namespaces supports regex patterns. Any valid regex pattern can be used to watch all namespaces that match it. For example, &lt;code>^openshift-.*$&lt;/code> watches all namespaces that start with &lt;code>openshift-&lt;/code>, and &lt;code>openshift&lt;/code> watches all namespaces that contain &lt;code>openshift&lt;/code>.

&lt;/div>
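
&lt;p>As a sketch of the note above, a &lt;code>watch_namespaces&lt;/code> list might combine a broad pattern with an anchored one. The &lt;code>my-app&lt;/code> namespace is a hypothetical name, and the quotes are there only to keep YAML from interpreting special characters.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-yaml" data-lang="yaml">cerberus:
    watch_namespaces:
        - '^openshift-.*$'    # Every namespace that starts with openshift-
        - '^my-app$'          # Hypothetical namespace, anchored so only the exact name matches
&lt;/code>&lt;/pre>&lt;/div>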



&lt;div class="alert alert-primary" role="alert">
&lt;h4 class="alert-heading">Note&lt;/h4>

 The current implementation can monitor only one cluster from one host. It can be used to monitor multiple clusters provided multiple instances of Cerberus are launched on different hosts.

&lt;/div>



&lt;div class="alert alert-primary" role="alert">
&lt;h4 class="alert-heading">Note&lt;/h4>

 The components, especially the namespaces, need to be changed depending on the distribution, i.e. Kubernetes or OpenShift. The defaults specified in the config assume that the distribution is OpenShift. A config file for Kubernetes is located at config/kubernetes_config.yaml.

&lt;/div></description></item><item><title>Alerts</title><link>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/alerts/</link><pubDate>Thu, 05 Jan 2017 00:00:00 +0000</pubDate><guid>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/alerts/</guid><description>&lt;p>Cerberus consumes the metrics from Prometheus deployed on the cluster to report the alerts.&lt;/p>
&lt;p>When the Prometheus URL and bearer token are provided in the config, Cerberus reports the following alerts:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>KubeAPILatencyHigh: alerts at the end of each iteration and warns if the 99th percentile latency for given requests to the kube-apiserver is above 1 second. It is the official SLI/SLO defined for Kubernetes.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>High number of etcd leader changes: alerts the user when an increase in etcd leader changes is observed on the cluster. Frequent elections may be a sign of insufficient resources, high network latency, or disruptions by other components and should be investigated.&lt;/p></description></item><item><title>Node Problem Detector</title><link>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/node-problem-detector/</link><pubDate>Thu, 05 Jan 2017 00:00:00 +0000</pubDate><guid>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/node-problem-detector/</guid><description>&lt;p>&lt;a href="https://github.com/kubernetes/node-problem-detector">node-problem-detector&lt;/a> aims to make various node problems visible to the upstream layers in the cluster management stack.&lt;/p>
&lt;h3 id="installation">
 Installation
 &lt;a class="td-heading-self-link" href="#installation" aria-label="Heading self-link">&lt;/a>
&lt;/h3>
&lt;p>Follow the instructions in the &lt;a href="https://github.com/kubernetes/node-problem-detector#installation">installation&lt;/a> section to set up Node Problem Detector on Kubernetes. The following steps set it up on OpenShift:&lt;/p>
&lt;ol>
&lt;li>Create the &lt;code>openshift-node-problem-detector&lt;/code> namespace from &lt;a href="https://github.com/openshift/node-problem-detector-operator/blob/master/deploy/ns.yaml">ns.yaml&lt;/a> with &lt;code>oc create -f ns.yaml&lt;/code>&lt;/li>
&lt;li>Add cluster role with &lt;code>oc adm policy add-cluster-role-to-user system:node-problem-detector -z default -n openshift-node-problem-detector&lt;/code>&lt;/li>
&lt;li>Add security context constraints with &lt;code>oc adm policy add-scc-to-user privileged system:serviceaccount:openshift-node-problem-detector:default&lt;/code>&lt;/li>
&lt;li>Edit &lt;a href="https://github.com/kubernetes/node-problem-detector/blob/master/deployment/node-problem-detector.yaml">node-problem-detector.yaml&lt;/a> to fit your environment.&lt;/li>
&lt;li>Edit &lt;a href="https://github.com/kubernetes/node-problem-detector/blob/master/deployment/node-problem-detector-config.yaml">node-problem-detector-config.yaml&lt;/a> to configure node-problem-detector.&lt;/li>
&lt;li>Create the ConfigMap with &lt;code>oc create -f node-problem-detector-config.yaml&lt;/code>&lt;/li>
&lt;li>Create the DaemonSet with &lt;code>oc create -f node-problem-detector.yaml&lt;/code>&lt;/li>
&lt;/ol>
&lt;p>Once installed, you will see node-problem-detector pods in the openshift-node-problem-detector namespace.
Now enable openshift-node-problem-detector in the &lt;a href="https://github.com/openshift-scale/cerberus/blob/master/config/config.yaml">config.yaml&lt;/a>.
Cerberus monitors only the &lt;code>KernelDeadlock&lt;/code> condition provided by the node problem detector, as it is system critical and can hinder node performance.&lt;/p></description></item><item><title>Slack Integration</title><link>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/slack/</link><pubDate>Thu, 05 Jan 2017 00:00:00 +0000</pubDate><guid>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/slack/</guid><description>&lt;p>Slack integration is optional and disabled by default. To use it, first create an &lt;a href="https://api.slack.com/apps?new_granular_bot_app=1">app&lt;/a> and add a bot to it on Slack. Then set the SLACK_API_TOKEN and SLACK_CHANNEL environment variables: SLACK_API_TOKEN is the Bot User OAuth Access Token, and SLACK_CHANNEL is the ID of the Slack channel that should receive the notifications. Make sure the Slack Bot Token Scopes contain these permissions: [calls:read] [channels:read] [chat:write] [groups:read] [im:read] [mpim:read]&lt;/p></description></item><item><title>Contribute</title><link>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/contribute/</link><pubDate>Thu, 05 Jan 2017 00:00:00 +0000</pubDate><guid>https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/contribute/</guid><description>&lt;h3 id="how-to-contribute">
 How to contribute
 &lt;a class="td-heading-self-link" href="#how-to-contribute" aria-label="Heading self-link">&lt;/a>
&lt;/h3>
&lt;p>Contributions are always appreciated.&lt;/p>
&lt;p>How to:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/contribute/#pull-request">Submit Pull Request&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/contribute/#fix-formatting">Fix Formatting&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://deploy-preview-247--krkn-chaos.netlify.app/docs/cerberus/contribute/#squash-commits">Squash Commits&lt;/a>&lt;/li>
&lt;/ul>
&lt;h2 id="pull-request">
 Pull request
 &lt;a class="td-heading-self-link" href="#pull-request" aria-label="Heading self-link">&lt;/a>
&lt;/h2>
&lt;p>To submit a change or a PR, fork the project and follow these instructions:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">$ git clone http://github.com/&amp;lt;me&amp;gt;/cerberus
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">$ &lt;span class="nb">cd&lt;/span> cerberus
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">$ git checkout -b &amp;lt;branch_name&amp;gt;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">$ &amp;lt;make change&amp;gt;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">$ git add &amp;lt;changes&amp;gt;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">$ git commit -a
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">$ &amp;lt;insert good message&amp;gt;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">$ git push
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="fix-formatting">
 Fix Formatting
 &lt;a class="td-heading-self-link" href="#fix-formatting" aria-label="Heading self-link">&lt;/a>
&lt;/h2>
&lt;p>Cerberus uses the &lt;a href="https://pre-commit.com">pre-commit&lt;/a> framework to maintain code linting and Python code styling.
The CI runs the pre-commit check on each pull request.
We encourage contributors to follow the same pattern when contributing to the code.&lt;/p></description></item></channel></rss>