Kubernetes Interview Questions
Kubernetes is a container management system developed by Google. Kubernetes (sometimes shortened to K8s, with the 8 standing for the number of letters between the "K" and the "s") is an open-source system for deploying, scaling, and managing containerized applications anywhere.
 
Kubernetes automates operational tasks of container management and includes built-in commands for deploying applications, rolling out changes to your applications, scaling your applications up and down to fit changing needs, monitoring your applications, and more—making it easier to manage applications.
Automated operations : Kubernetes has built-in commands to handle a lot of the heavy lifting that goes into application management, allowing you to automate day-to-day operations. You can make sure applications are always running the way you intended them to run.
 
Infrastructure abstraction : When you install Kubernetes, it handles the compute, networking, and storage on behalf of your workloads. This allows developers to focus on applications and not worry about the underlying environment.
 
Service health monitoring : Kubernetes continuously runs health checks against your services, restarting containers that fail or have stalled, and only making services available to users when it has confirmed they are running.
Kubernetes is used to create applications that are easy to manage and deploy anywhere. When available as a managed service, Kubernetes offers you a range of solutions to meet your needs. Here are some common use cases.
 
Increasing development velocity : Kubernetes helps you to build cloud-native microservices-based apps. It also supports containerization of existing apps, thereby becoming the foundation of application modernization and letting you develop apps faster.
 
Deploying applications anywhere : Kubernetes is built to be used anywhere, allowing you to run your applications across on-site deployments, public clouds, and hybrid deployments in between. So you can run your applications where you need them.
 
Running efficient services : Kubernetes can automatically adjust the size of the cluster required to run a service. This enables you to automatically scale your applications up and down based on demand and run them efficiently.
You can cluster together groups of hosts running Linux® containers, and Kubernetes helps you easily and efficiently manage those clusters.
 
Kubernetes clusters can span hosts across on-premise, public, private, or hybrid clouds. For this reason, Kubernetes is an ideal platform for hosting cloud-native applications that require rapid scaling, like real-time data streaming through Apache Kafka.
The primary advantage of using Kubernetes in your environment, especially if you are optimizing app dev for the cloud, is that it gives you the platform to schedule and run containers on clusters of physical or virtual machines (VMs).
 
More broadly, it helps you fully implement and rely on a container-based infrastructure in production environments. And because Kubernetes is all about automation of operational tasks, you can do many of the same things other application platforms or management systems let you do—but for your containers.
 
Developers can also create cloud-native apps with Kubernetes as a runtime platform by using Kubernetes patterns. Patterns are the tools a Kubernetes developer needs to build container-based applications and services.
 
With Kubernetes you can :
 
* Orchestrate containers across multiple hosts.
* Make better use of hardware to maximize resources needed to run your enterprise apps.
* Control and automate application deployments and updates.
* Mount and add storage to run stateful apps.
* Scale containerized applications and their resources on the fly.
* Declaratively manage services, which guarantees the deployed applications are always running the way you intended them to run.
* Health-check and self-heal your apps with autoplacement, autorestart, autoreplication, and autoscaling.

However, Kubernetes relies on other projects to fully provide these orchestrated services. With the addition of other open source projects, you can fully realize the power of Kubernetes. These necessary pieces include (among others) :
 
* Registry, through projects like Docker Registry.
* Networking, through projects like OpenvSwitch and intelligent edge routing.
* Telemetry, through projects such as Kibana, Hawkular, and Elastic.
* Security, through projects like LDAP, SELinux, RBAC, and OAUTH with multitenancy layers.
* Automation, with the addition of Ansible playbooks for installation and cluster life cycle management.
* Services, through a rich catalog of popular app patterns.
How does Kubernetes work?

A working Kubernetes deployment is called a cluster. You can visualize a Kubernetes cluster as two parts: the control plane and the compute machines, or nodes.
 
Each node is its own Linux® environment, and could be either a physical or virtual machine. Each node runs pods, which are made up of containers.
 
The control plane is responsible for maintaining the desired state of the cluster, such as which applications are running and which container images they use. Compute machines actually run the applications and workloads.
 
Kubernetes runs on top of an operating system (Red Hat® Enterprise Linux®, for example) and interacts with pods of containers running on the nodes.
 
The Kubernetes control plane takes the commands from an administrator (or DevOps team) and relays those instructions to the compute machines.
 
This handoff works with a multitude of services to automatically decide which node is best suited for the task. It then allocates resources and assigns the pods in that node to fulfill the requested work.
 
The desired state of a Kubernetes cluster defines which applications or other workloads should be running, along with which images they use, which resources should be made available to them, and other such configuration details.
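As an illustration, a minimal Deployment manifest expresses such a desired state. This is only a sketch; the name and image below are hypothetical examples, not taken from this article :

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-demo              # hypothetical name
spec:
  replicas: 3                 # desired state: keep three replicas running
  selector:
    matchLabels:
      app: web-demo
  template:
    metadata:
      labels:
        app: web-demo
    spec:
      containers:
      - name: web
        image: nginx:1.25     # the image the desired state pins

Kubernetes continuously compares the observed state against such a manifest and reconciles any drift, for example by replacing a crashed pod.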
 
From an infrastructure point of view, there is little change to how you manage containers. Your control over containers just happens at a higher level, giving you better control without the need to micromanage each separate container or node.
 
Your work involves configuring Kubernetes and defining nodes, pods, and the containers within them. Kubernetes handles orchestrating the containers.
 
Where you run Kubernetes is up to you. This can be on bare metal servers, virtual machines, public cloud providers, private clouds, and hybrid cloud environments. One of Kubernetes’ key advantages is it works on many different kinds of infrastructure.
The term orchestration refers to integrating multiple services so that they can automate processes or synchronize information in a specific time sequence.
 
For example, suppose we have six or seven microservices for an application to run. If you place them in separate containers, this would inevitably create obstacles for communication. Orchestration enables all the services in their individual containers to work together seamlessly to accomplish a single goal.
Often misunderstood as a choice between one or the other, Kubernetes and Docker are different yet complementary technologies for running containerized applications. 
 
Docker lets you put everything you need to run your application into a box that can be stored and opened when and where it is required. Once you start boxing up your applications, you need a way to manage them; and that's what Kubernetes does.
 
Kubernetes is a Greek word meaning ‘helmsman’ or ‘pilot’ in English. Just as a captain is responsible for the safe journey of a ship at sea, Kubernetes is responsible for carrying and delivering those boxes safely to locations where they can be used.
 
* Kubernetes can be used with or without Docker

* Docker is not an alternative to Kubernetes, so it’s less of a “Kubernetes vs. Docker” question. It’s about using Kubernetes with Docker to containerize your applications and run them at scale

* The difference between Docker and Kubernetes relates to the role each play in containerizing and running your applications

* Docker is an open industry standard for packaging and distributing applications in containers

* Kubernetes uses Docker to deploy, manage, and scale containerized applications
Deploying applications directly on a host relies on an architecture with a shared operating system. The operating system has a kernel that holds the various libraries installed on the operating system that applications need.
 
A container host, by contrast, is the system that runs the containerized processes. Each container is isolated from the other applications, so each application must bring the libraries it needs. Its binaries are separated from the rest of the system and cannot infringe on any other application.
10. What is the work of a kube-scheduler?
Kube-scheduler is the default scheduler for Kubernetes. It assigns nodes to newly created pods.
11. Define daemon sets
A daemon set is a set of pods that runs on every host (or a selected set of hosts). Daemon sets are used for host-layer functions, such as network monitoring or simple network services, that must run on each node.
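A minimal DaemonSet manifest might look like the following sketch; the name and the monitoring-agent image are illustrative assumptions :

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-monitor                # hypothetical name
spec:
  selector:
    matchLabels:
      app: node-monitor
  template:
    metadata:
      labels:
        app: node-monitor
    spec:
      containers:
      - name: agent
        image: prom/node-exporter   # illustrative monitoring agent

Because it is a DaemonSet, one copy of this pod is scheduled onto every node, and newly added nodes automatically receive one.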
The main components of the master node that help to manage worker nodes are as follows :
 
Kube-apiserver : It acts as the frontend of the cluster; all communication with the cluster goes through the API server.

Kube-controller-manager : It implements governance across the cluster and runs the set of controllers for the running cluster.

Kube-scheduler : It schedules pods onto the nodes, tracking node resource usage to determine the proper node for each pod.
Today, the majority of on-premises Kubernetes deployments run on top of existing virtual infrastructure, with a growing number of deployments on bare metal servers. This is a natural evolution in data centers. Kubernetes serves as the deployment and lifecycle management tool for containerized applications, and separate tools are used to manage infrastructure resources. 
You would start directly with bare metal servers and software-defined storage, deployed and managed by Kubernetes to give the infrastructure the same self-installing, self-scaling, and self-healing benefits as containers enjoy. This is the vision of Kubernetes-native infrastructure.
* Public cloud agility and simplicity on-premises to reduce friction between developers and IT operations
 
* Cost efficiency by eliminating the need for a separate hypervisor layer to run VMs
 
* Developer flexibility to deploy containers, serverless applications, and VMs from Kubernetes, scaling both applications and infrastructure
 
* Hybrid cloud extensibility with Kubernetes as the common layer across on-premises and public clouds
With the use of limits and requests, the resource usage of a pod can be controlled.
 
Request : The number of resources being requested for a container. If a container exceeds its request for resources, it can be throttled back down to its request.
 
Limit : An upper cap on the resources a single container can use. If it tries to exceed this predefined limit, it can be terminated if Kubernetes decides that another container needs these resources. If you are sensitive to pod restarts, it makes sense to have the sum of all container resource limits equal to or less than the total resource capacity of your cluster.
 
Example :
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  containers:
  - name: example1
    image: example/example1
    resources:
      requests:
        memory: "64Mi"   # example values; size these for your workload
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
Mainly, a Kubernetes cluster consists of two types of nodes : executor (worker) and master.
 
Executor node services : (these run on every executor node)
 
* Kube-proxy : This service is responsible for the communication of pods within the cluster and to the outside network, and it runs on every node. This service maintains network protocols when your pod establishes a network communication.

* kubelet : Each node runs a kubelet service that updates the running node according to the configuration (YAML or JSON) file. NOTE: the kubelet service manages only containers created by Kubernetes.

Master services :
 
* Kube-apiserver : The master API service, which acts as an entry point to the Kubernetes cluster.

* Kube-scheduler : Schedules pods according to the resources available on the executor nodes.

* Kube-controller-manager : A control loop that watches the shared state of the cluster through the apiserver and makes changes, attempting to move the current state towards the desired stable state.
18. What is Minikube?
With the help of Minikube, users can run Kubernetes locally. Minikube runs a single-node Kubernetes cluster on your personal computer, including Windows, macOS, and Linux PCs. With it, users can try out Kubernetes and also use it for daily development work.
A Kubernetes administrator can create a deployment of kind PodDisruptionBudget (PDB) for high availability of the application. It makes sure that the minimum number of running pods, as specified by the minAvailable attribute in the spec file, is respected. This is useful while performing a drain : the drain will halt until the PDB is respected, ensuring the high availability (HA) of the application. The following spec file shows minAvailable as 2, which implies that a minimum of 2 pods must remain available (even during a disruption such as a drain).
 
Example : YAML config using minAvailable :
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: zookeeper
20. Why use namespaces in Kubernetes?
Namespaces in Kubernetes are used for dividing cluster resources between multiple users. They suit environments where many users are spread across multiple teams or projects, and they provide a scope for resource names.
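As a quick sketch, a namespace is created with a one-object manifest, and workloads are then scoped to it through metadata.namespace; the names here are hypothetical :

apiVersion: v1
kind: Namespace
metadata:
  name: team-a          # hypothetical team namespace
---
apiVersion: v1
kind: Pod
metadata:
  name: demo
  namespace: team-a     # scopes this pod to the team-a namespace
spec:
  containers:
  - name: app
    image: nginx        # illustrative image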
Architecture of Kubernetes

Explanation of the critical components of the architecture :
 
Master Node : The master node is the first and most crucial component of the Kubernetes architecture. It is used to manage the Kubernetes cluster and is the entry point for all kinds of administrative tasks.
 
There may be more than one master node in the cluster for fault tolerance.
 
API Server : The API server is an entry point for all the REST commands. It is used to control the cluster.
 
Scheduler : The scheduler is used to schedule tasks to the slave nodes and distribute the workload. It also tracks resource usage information for every slave node.
 
Etcd : The etcd component stores the configuration details and the state of the cluster. It is a distributed key-value store that the other components communicate with to read and write cluster data.
 
Worker/Slave nodes : Worker nodes (slave nodes) are another essential component. They contain all the required services to manage the networking between the containers, communicate with the master node, and assign resources to the scheduled pods.
 
Kubelet : It gets the Pod's configuration from the API server and ensures that the described containers are up and running.
 
Docker Container : Docker runs on each of the worker nodes and runs the configured pods.
 
Pods : A pod is a combination of one or more containers that logically run together on a node.
Following is the list of various services available in Kubernetes :
 
* ClusterIP service : It exposes the service on a cluster-internal IP and makes the service reachable only within the cluster.

* LoadBalancer service : It exposes the service externally using a cloud provider’s load balancer, automatically creating the routing from the external load balancer to the service.

* NodePort service : It exposes the service on each node’s IP at a static port.

* ExternalName service : It maps the service to the contents of the externalName field by returning a CNAME record with its value.
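For instance, a NodePort service manifest might look like this sketch; the names and port numbers are illustrative assumptions :

apiVersion: v1
kind: Service
metadata:
  name: demo-nodeport     # hypothetical name
spec:
  type: NodePort
  selector:
    app: demo             # routes to pods carrying this label
  ports:
  - port: 80              # cluster-internal service port
    targetPort: 8080      # container port on the selected pods
    nodePort: 30080       # static port exposed on every node's IP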
Docker Swarm is an open-source container orchestration platform used to cluster and schedule Docker containers. It is native to Docker. Following is a list of key differences between Docker Swarm and Kubernetes :

| Docker Swarm | Kubernetes |
| --- | --- |
| Docker Swarm is easy and convenient to set up, but it doesn't have a robust cluster. | Kubernetes is more complicated to set up, but it assures a robust cluster. |
| Docker Swarm can't do auto-scaling as Kubernetes can, but Docker's scaling is five times faster than Kubernetes'. | Kubernetes can do auto-scaling, but its scaling is slower than Docker Swarm's. |
| Docker Swarm doesn't provide a GUI. | Kubernetes provides a GUI in the form of a dashboard. |
| Docker Swarm provides automatic load balancing of traffic between containers in a cluster. | Kubernetes requires manual intervention for load balancing such traffic. |
| Docker requires third-party tools such as the ELK stack for logging and monitoring. | Kubernetes provides integrated tools for logging and monitoring. |
| In Docker Swarm, we can easily share storage volumes with any container. | In Kubernetes, we can only share storage volumes with containers in the same pod. |
| In Docker Swarm, we can deploy rolling updates but not automatic rollbacks. | In Kubernetes, we can deploy rolling updates as well as automatic rollbacks. |
24. What is Kube-proxy?
Kube-proxy is an implementation of both a network proxy and a load balancer. It is used to support the service abstraction along with other networking operations. It is responsible for directing traffic to the right container based on the destination IP address and port number.
25. What is the role of Kube-apiserver?
The kube-apiserver validates and provides configuration data for the API objects, which include pods, services, and replication controllers. It provides REST operations and serves as the frontend of the cluster; this frontend holds the shared cluster state through which all other components interact.
Load balancing is a way to distribute incoming traffic across multiple backend servers, which is useful for ensuring the application is available to users.
 
All incoming traffic lands on a single IP address on the load balancer, which is a way to expose your service to the outside internet. The load balancer routes the incoming traffic to a particular pod (via a service) using an algorithm such as round-robin. If any pod goes down, the load balancer is notified so that traffic is not routed to that unavailable pod. Thus, load balancers in Kubernetes are responsible for distributing a set of tasks (the incoming traffic) across the pods.
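In Kubernetes, such a load balancer is typically requested through a Service of type LoadBalancer. The following is only a sketch, assuming a cloud provider that can provision the external load balancer; the names and ports are illustrative :

apiVersion: v1
kind: Service
metadata:
  name: demo-lb          # hypothetical name
spec:
  type: LoadBalancer     # asks the cloud provider for an external LB
  selector:
    app: demo            # backend pods selected by label
  ports:
  - port: 80             # the single external entry port
    targetPort: 8080     # traffic is spread across the matching pods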
By default, a pod can communicate with any other pod; we can set up network policies to limit this communication between the pods (see the sketch after this list). Other recommended measures include :
 
* RBAC (Role-based access control) to narrow down the permissions.

* Use namespaces to establish security boundaries.

* Set admission control policies to avoid running privileged containers.

* Turn on audit logging.
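As mentioned before the list, network policies limit pod-to-pod communication. A minimal sketch, assuming hypothetical frontend/backend labels :

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-only     # hypothetical policy name
spec:
  podSelector:
    matchLabels:
      app: backend              # the pods this policy protects
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend         # only frontend pods may connect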
This architecture depends upon the application and many other factors. Following are the common logging patterns :
 
* Node level logging agent.
* Streaming sidecar container.
* Sidecar container with the logging agent.
* Export logs directly from the application.

In this setup, Journalbeat and Filebeat run as DaemonSets. The logs they collect are pushed to a Kafka topic, which is eventually consumed by the ELK stack.
 
The same can be achieved using the EFK stack and Fluent Bit.
In this Kubernetes interview question, the interviewer would expect a thorough explanation. You can explain what it is and also how it has been useful to you (if you have used it in your work so far!). Heapster is a performance monitoring and metrics collection system for data collected by the Kubelet. This aggregator is natively supported and runs like any other pod within a Kubernetes cluster, which allows it to discover and query usage data from all nodes within the cluster.
Kubernetes uses etcd as a distributed key-value store for all of its data, including metadata and configuration data, and allows nodes in Kubernetes clusters to read and write data. Although etcd was purposely built for CoreOS, it also works on a variety of operating systems (e.g., Linux, BSD, and OS X) because it is open source. Etcd represents the state of a cluster at a specific moment in time and is a canonical hub for state management and cluster coordination of a Kubernetes cluster.
31. What is Kubectl?
Kubectl is the platform through which you can pass commands to the cluster. It provides the CLI to run commands against the Kubernetes cluster, with various ways to create and manage Kubernetes components.
32. What is Kubelet?
Kubelet is an agent service that runs on each node and enables the node to communicate with the master. Kubelet works from the description of containers provided to it in a PodSpec and makes sure that the containers described there are healthy and running.
The kube-scheduler is the default scheduler for Kubernetes. It is designed so that, if you prefer, you can write your own scheduler and use that instead (see the sketch after the lifecycle steps below).
 
Following is the syntax :
kube-scheduler [flags]
The scheduling life cycle :
 
* A pod is created with its desired state specified, and it is saved to etcd without the node name filled in
* The scheduler notices the new pod with no node bound
* It finds a suitable node for that pod
* It then informs the API server to bind the pod to the node, and the new desired state is saved to etcd
* Kubelets watch the bound pods and start the containers on the particular node
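If you do run your own scheduler, a pod opts into it through spec.schedulerName. A minimal sketch; the scheduler name and image are hypothetical :

apiVersion: v1
kind: Pod
metadata:
  name: custom-scheduled-pod
spec:
  schedulerName: my-custom-scheduler   # hypothetical custom scheduler
  containers:
  - name: app
    image: nginx                       # illustrative image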
34. Explain Prometheus in Kubernetes
Prometheus is an application used for monitoring and alerting. It can call out to your systems, grab real-time metrics, compress them, and store them properly in a time-series database.
An Ingress network is a collection of rules that act as an entry point to the Kubernetes cluster. It allows inbound connections, which can be configured to expose services externally through reachable URLs, to load-balance traffic, or to offer name-based virtual hosting. So, Ingress is an API object that manages external access to the services in a cluster, usually over HTTP, and it is the most powerful way of exposing a service.
 
Now, let me explain the working of the Ingress network with an example.
 
There are two nodes having the pod and root network namespaces with a Linux bridge. In addition to this, there is also a new virtual ethernet device called flannel0 (network plugin) added to the root network.
 
Now, suppose we want the packet to flow from pod1 to pod 4. Refer to the below diagram.
Ingress Network
* So, the packet leaves pod1’s network at eth0 and enters the root network at veth0.
* Then it is passed on to cbr0, which makes the ARP request to find the destination, and it turns out that nobody on this node has the destination IP address.
* So, the bridge sends the packet to flannel0, as the node’s route table is configured with flannel0.
* Now, the flannel daemon talks to the API server of Kubernetes to know all the pod IPs and their respective nodes, to create mappings from pod IPs to node IPs.
* The network plugin wraps this packet in a UDP packet with extra headers changing the source and destination IPs to their respective nodes, and sends this packet out via eth0.
* Now, since the route table already knows how to route traffic between nodes, it sends the packet to the destination node2.
* The packet arrives at eth0 of node2 and is handed to flannel0, which decapsulates it and emits it back into the root network namespace.
* Again, the packet is forwarded to the Linux bridge, which makes an ARP request to find out the IP that belongs to veth1.
* The packet finally crosses the root network and reaches the destination Pod4.
Replica Sets and Replication Controllers do almost the same thing. Both ensure that a specified number of pod replicas are running at any given time. The difference lies in the selectors used to replicate pods : Replica Sets use set-based selectors, while Replication Controllers use equality-based selectors.
 
Equality-based selectors : This type of selector allows filtering by label key and value. In layman's terms, an equality-based selector will only look for pods with exactly the same phrase as the label.
Example : Suppose your label key says app=nginx; then, with this selector, you can only look for those pods with the label app equal to nginx.

Set-based selectors : This type of selector allows filtering keys according to a set of values. In other words, a set-based selector will look for pods whose label value has been mentioned in the set.
Example : Say your label key says app in (nginx, NPS, Apache). Then, with this selector, if your app label is equal to any of nginx, NPS, or Apache, the selector will treat it as a true result (as shown in the sketch below).
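A set-based selector appears in a ReplicaSet spec as matchExpressions. The following fragment is an illustrative sketch; the names and values are assumptions :

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: demo-rs
spec:
  replicas: 2
  selector:
    matchExpressions:
    - key: app
      operator: In              # set-based: matches any of the listed values
      values: [nginx, apache]
  template:
    metadata:
      labels:
        app: nginx              # satisfies the selector above
    spec:
      containers:
      - name: web
        image: nginx            # illustrative image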
37. What is the Google Container Engine? What is its use?
The Google Container Engine (GKE) is a management platform tailor-made for Docker containers and clusters. It is used to provide support for the clusters that run in Google's public cloud services.
The biggest disadvantages of Kubernetes are listed below:
 
* Security is the biggest issue with Kubernetes; it is not very secure by default.

* Kubernetes dashboard is not very user-friendly.

* Kubernetes is very complex to use, which can reduce productivity.

* Kubernetes is costlier than its alternatives.
Following is the difference between a daemon set, a deployment, and a replication controller :
 
Daemon set : A daemon set is used to ensure that all the nodes you have selected run precisely one copy of a pod.
Deployment : A deployment is a resource object in Kubernetes. It is used to provide declarative updates to applications, and it manages the scheduling and lifecycle of pods. It offers many key features for managing pods, such as pod health checks, rolling updates of pods, and the ability to roll back and quickly scale pods horizontally.
 
Replication Controller : The replication controller is used to specify how many exact copies of a pod should be running in a cluster. It differs from a deployment in that it does not offer pod health checks, and the rolling update process is not as robust.
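To illustrate the rolling-update behavior that sets a deployment apart, the update strategy can be tuned in the spec. A sketch with illustrative names and values :

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rolling-demo          # hypothetical name
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1       # at most one pod down during an update
      maxSurge: 1             # at most one extra pod created at a time
  selector:
    matchLabels:
      app: rolling-demo
  template:
    metadata:
      labels:
        app: rolling-demo
    spec:
      containers:
      - name: app
        image: nginx:1.25     # changing this image triggers a rolling update

Because a deployment records its rollout history, kubectl rollout undo can then revert a bad update.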
40. What is a sidecar container, and what is its use?
A sidecar container is a utility container used to support the main container in a pod. We can pair a sidecar container with one or more main containers and enhance those primary containers' functionality. It is mainly used for monitoring or for handling the system's logs.
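A sketch of a pod pairing a main container with a logging sidecar over a shared volume; all names and images are illustrative :

apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  volumes:
  - name: logs
    emptyDir: {}                 # shared scratch space for log files
  containers:
  - name: main-app
    image: nginx                 # illustrative main container
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: log-sidecar
    image: busybox               # illustrative sidecar
    command: ["sh", "-c", "tail -F /var/log/app/access.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app

The sidecar reads the same files the main container writes, without the main container knowing the sidecar exists.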
There are mainly two types of pods in Kubernetes :
 
Single-container pods : We can create single-container pods by using the run command in Kubernetes.

Multi-container pods : We can create multi-container pods by using the "create" command in Kubernetes.
Following are some examples of recommended security measures for Kubernetes :
 
* Defining resource quota
* Support for auditing
* Restriction of etcd access
* Regular security updates to the environment
* Network segmentation
* Definition of strict resource policies
* Continuous scanning for security vulnerabilities
* Using images from authorized repositories etc.
Following is the list of some important Kubectl commands :
 
* kubectl annotate
* kubectl cluster-info
* kubectl attach
* kubectl apply
* kubectl config
* kubectl autoscale
* kubectl config current-context
* kubectl config set etc.
In Kubernetes, the scheduler is responsible for spawning pods onto nodes. Many factors can lead to an unstartable pod. The most common one is running out of resources; use commands like kubectl describe <POD> -n <Namespace> to see the reason why the pod has not started. Also, keep an eye on kubectl get events to see all the events coming from the cluster.
Various methods are available to achieve this.
 
nodeName : Specify the name of a node in the pod spec configuration; the pod will then try to run on that specific node.

nodeSelector : Assign a specific label to the node that has special resources, and use the same label in the pod spec so that the pod runs only on that node (see the sketch after this list).

nodeAffinity : requiredDuringSchedulingIgnoredDuringExecution and preferredDuringSchedulingIgnoredDuringExecution are hard and soft requirements for running the pod on specific nodes. This will replace nodeSelector in the future. It depends on the node labels.
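As promised in the nodeSelector item, a minimal sketch, assuming a node has already been labeled disktype=ssd :

apiVersion: v1
kind: Pod
metadata:
  name: ssd-pod
spec:
  nodeSelector:
    disktype: ssd      # schedules only onto nodes carrying this label
  containers:
  - name: app
    image: nginx       # illustrative image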
You should create a ServiceAccount. A service account creates a token, and tokens are stored inside a secret object. By default, Kubernetes automatically mounts the default service account. However, we can disable this behavior by setting automountServiceAccountToken: false in our spec. Also, note that each namespace has its own default service account.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-sa
automountServiceAccountToken: false
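A pod then uses the account by referencing it by name. A minimal sketch reusing the my-sa account above; the image is illustrative :

apiVersion: v1
kind: Pod
metadata:
  name: sa-demo
spec:
  serviceAccountName: my-sa      # the ServiceAccount defined above
  containers:
  - name: app
    image: nginx                 # illustrative image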
We can introduce probes. A liveness probe with a Pod is ideal in this scenario.
 
A liveness probe always checks whether an application in a pod is running; if this check fails, the container gets restarted. This is ideal in many scenarios where the container is running but the application inside it has crashed.
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/liveness
    args:
    - /server
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080 # httpGet requires a port; 8080 is assumed here
48. What do you understand by the Sematext Docker Agent?
The Sematext Docker Agent is a log collection agent with events and metrics, which runs as a small container on each Docker host. These agents gather metrics, events, and logs for all cluster nodes and containers.
No, the nodes in a cluster don't all need to be the same size. The Kubernetes components, such as the kubelet, take up resources on your nodes, and you still need more capacity for the node to do any work. In larger clusters, it is always preferable to create a mix of different instance sizes. That way, Kubernetes can easily schedule pods that require a lot of memory or intensive compute workloads on the large nodes, while the smaller nodes can handle smaller pods.
| Kubernetes Volumes | Docker Volumes |
| --- | --- |
| Kubernetes volumes are not limited to any single container; they are scoped to a pod. | Docker volumes are limited to a single container. |
| Kubernetes volumes are shared by all the containers deployed in a pod. | Docker volumes are not shared among all containers. |
The uses of GKE (Google Kubernetes Engine) are :
 
* It can be used to create Docker container clusters
* Resize application controllers
* Update and upgrade container clusters
* Debug container clusters
* GKE can be used to create replication controllers, jobs, services, container pods, and load balancers
By default, a pod can reach the external network, but for the reverse direction we need to make some changes. The following options are available to connect with a pod from the outer world :
 
* Nodeport (it will expose one port on each node to communicate with it)
* Load balancers (L4 layer of TCP/IP protocol)
* Ingress (L7 layer of TCP/IP Protocol)

Another method is to use kubectl proxy, which can expose a service with only a cluster IP on a local system port.
 
$ kubectl proxy --port=8080
# then open (placeholders shown in angle brackets) :
# http://localhost:8080/api/v1/proxy/namespaces/<NAMESPACE>/services/<SERVICE-NAME>:<PORT-NAME>/
53. Define OpenShift
OpenShift is a public cloud application development and hosting platform developed by Red Hat. It offers automation for management so that developers can focus on writing code.
54. What do you understand by PVC? What is its full form?
The full form of PVC is Persistent Volume Claim. It is storage requested by Kubernetes for pods. A PVC doesn't require knowledge of the underlying provisioning; you create the claim in the same namespace where the pod is created.
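A minimal PVC sketch; the claim name, size, and access mode are illustrative assumptions :

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-claim        # hypothetical claim name
spec:
  accessModes:
  - ReadWriteOnce         # mountable read-write by a single node
  resources:
    requests:
      storage: 1Gi        # illustrative requested capacity

The pod then mounts the claim by name, and Kubernetes binds it to a suitable persistent volume behind the scenes.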
Users need to understand the application's performance and resource utilization at all the different abstraction layers. Kubernetes creates abstractions at different levels, such as containers, pods, services, and the whole cluster, so that each can be monitored well. This process is called container resource monitoring.
 
Following is a list of some container resource monitoring tools :
 
Heapster : It is used to gather data and events from the containers and pods within the cluster.

InfluxDB : It is a time-series database used to store the data captured by the heapster pods.

Grafana : It is used along with InfluxDB for visualizing data within the Kubernetes environment.

CAdvisor : It is a built-in tool in the kubelet that automatically discovers all active containers and monitors them.

Prometheus : It is a CNCF project that provides powerful querying, alerting, and visualization capabilities.
A Kubernetes worker node runs the following components :
 
* Container runtime
* Kubelet
* Kube-proxy

A Kubernetes worker node is a machine where workloads get deployed. The workloads are in the form of containerized applications, and because of that, every node in the cluster must run a container runtime, such as Docker, in order to run those workloads. You can have multiple masters mapped to multiple worker nodes, or a single master with a single worker node. The worker nodes are not gossiping or doing leader election or anything that would lead to odd quantities. The role of the container runtime is to start and manage containers. The kubelet maintains the state of each node; it receives commands and work from the master, performs health checks of the nodes, and makes sure they stay healthy. The kubelet is also responsible for metric collection of pods. The kube-proxy is a component that manages host subnetting and makes services available to other components.
57. How do you forward the port '8080 (container) -> 8080 (service) -> 8080 (ingress) -> 80 (browser)', and how is it done?
The ingress exposes port 80 externally for the browser to access and connects to a service that listens on 8080. The ingress will listen on port 80 by default. An "ingress controller" is a pod that receives external traffic and handles the ingress; it is configured by an ingress resource. For this, you need to configure the ingress selector; if no 'ingress controller selector' is specified, then no ingress controller will control the ingress.
 
A simple ingress config will look like :
 
host: abc.org
http:
  paths:
  - backend:
      serviceName: abc-service
      servicePort: 8080
Then the service will look like :
kind: Service
apiVersion: v1
metadata:
  name: abc-service
spec:
  ports:
  - protocol: TCP
    port: 8080 # this is the port the service listens on
    targetPort: 8080
* Node Affinity ensures that pods are hosted on particular nodes.

* Pod Affinity ensures that two pods are co-located on a single node.
 
Node Affinity :
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/e2e-az-name
            operator: In
            values:
            - e2e-az1
Pod Affinity :
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: topology.kubernetes.io/zone # required field; zone-level co-location assumed
The pod affinity rule says that the pod can be scheduled onto a node only if that node is in the same zone as at least one already-running pod that has a label with key “security” and value “S1”.
You can attach an image pull secret to a service account. Any pod using that service account (including the default one) can take advantage of the secret. You can also bind the pullSecret to your pod directly, but you're still left with having to create the secret every time you make a namespace.
 
imagePullSecrets:
- name: test

Also, you can create the rc/deployment manually and either specify the imagePullSecret, or use a service account that has the secret, or add the imagePullSecret to the default service account, in which case you'd be able to use `kubectl run` and not have to make any manual changes to the manifest. Depending on your environment and how secret this imagePullSecret is, your approach will change.
The service selects pods based on labels, so if no pods have the appropriate labels, the service has nothing to route to; the labels can be anything you like. Since all pod names should be unique, you can just set the labels to the pod name. Since StatefulSets create the same pods multiple times, they won't be configured with distinct labels that you could use to point disparate services at the correct pod. If you give the pods their own labels manually, it will work. Also, a service selects pods based on the selector as well as their location label. The following .yaml file of the Grafana dashboard service shows the same :
 
apiVersion: v1
kind: Service
metadata:
  name: grafanaportforward
  namespace: kubeflow
  labels:
    run: grafana-test
spec:
  ports:
  - port: 3000
    protocol: TCP
    name: grafana
  externalIPs:
  - x.y.x.q
  selector:
    app: grafana-test