Google News
logo
PostgreSQL Interview Questions
PostgreSQL is an open-source object-relational database management system known as Object-Relational Database Management System(ORDBMS). It is known as Postgres or Postgresql. 
 
In the SQL world, it is one of the most widely and popularly used for Object-Relational Database Management System mainly used in large web applications. It is a powerful database management system that provides additional and substantial power by incorporating four basic concepts so that the users can extend the system without any problem.
 
It uses the SQL language and extends its features to store the data securely. It supports the best practices and allows users to retrieve the data when the request is processed.
PostgreSQL supports the common SQL data types; it also supports some unexpected ones, such as JSON. Check some of the most notable data types below:
 
Numeric Types : “Numeric types consist of two, four, and eight-byte integers; four and eight-byte floating-point numbers; and selectable-precision decimals.” 

Character Types : SQL defines two primary character types: “character varying(n) and character(n), where n is a positive integer. Both of these types can store strings up to n characters in length. The notations VARCHAR(n) and CHAR(n) are aliases for character varying(n) and character(n).” 

Binary Data Types : A binary string is a sequence of bytes.

Date/Time Types : These store dates, times, or dates and times (timestamps).

Boolean : Boolean values store only TRUE, FALSE, and NULL values.

Enumerated Types : “Enumerated (ENUM) types comprise a static, ordered set of values. They are equivalent to the ENUM types supported in a number of programming languages. An example of an ENUM type might be the days of the week, or a set of status values for a piece of data”. 

XML : The XML data type stores XML data.

JSON : The JSON data type stores JSON (JavaScript Object Notation) data.
The most important features of PostgreSQL are as follows :
 
Download : PostgreSQL is open-source software and free to download. We can easily download it from the official website of PostgreSQL.
 
Area of Compatibility
 
* PostgreSQL is compatible with several operating systems such as Microsoft Windows, Linux, MacOS X, UNIX (AIX, BSD, HP-UX, SGI IRIX, Solaris, and Tru64), etc.

* It is compatible with various programming languages such as C/C++, Java, Python, Perl, Ruby, Tcl, and ODBC (Open Database Connectivity).

PostgreSQL is compatible with multiple data types such as
 
* Structured data types, i.e., Array, Date and Time, UUID (Universally Unique Identifier), Array, Range, etc.
* Primitive data types, i.e., String, Integer, Boolean, Numeric, etc.
* Customizations data types, i.e., Custom Types, Composite, etc.
* Geometry data types, i.e., Polygon, Circle, Line, Point, etc.
* Document data types i.e. XML, JSON/JSONB, Key-value, etc.

Extensibility
 
* PostgreSQL is highly extensible as it supports procedural languages such as Perl, PL/PGSQL, Python, etc.
* It is compatible with foreign data wrappers, which connect to further databases with a standard SQL interface.
* It supports JSON/SQL path expressions, stored procedures, and functions.
* It supports a customizable storage interface for a table.

Secure and Highly Reliable
 
PostgreSQL is safe and secure because of the following security aspects :
 
* PostgreSQL provides a robust access control system and several authentications such as Lightweight Directory Access Protocol (LDAP), Generic Security Service Application Program Interface (GSSAPI), SCRAM-SHA-256, Security Support Provider Interface (SSPI), Certificate, etc.
* It also supports column and row-level security.
* It is highly reliable as it provides disaster recovery such as active standbys and PITR (Point in time recovery).
* It supports WAL (Write-ahead Logging)
* It supports different types of Replication like Synchronous, Asynchronous, and Logical.
* It supports Internationalization, which includes ICU collations, accent-insensitive and case-sensitive collations, and full-text searches.
* It is compatible with ANSI-SQL2008.
* It can improve the functionality of Server-Side programming.
The major functionalities/ features of PostgreSQL are
 
1) Object relational database

2) Extensibility and support for SQL

3) DB validation and flexible Application Program Interface.

4) Procedural languages and Multi Version Concurrency Control.

5) Client server and WAL.
* PostgreSQL does not support many open-source applications compared to the MySQL database.

* In PostgreSQL, creating replication is a bit complex which reduces its popularity.

* The speed and performance of PostgreSQL are not as good as compared to some other databases and tools.

* PostgreSQL is not maintained by one company. This may be one of its drawbacks.

* It is slower as compared to the MySQL database.

* It is not popular as MySQL, so the installation process is sometimes not easy for beginners.
Multi-Version Concurrency Control (MVCC) is an advanced method used in PostgreSQL for improving the performance of a database in a multi-user environment. Unlike lock models in other databases, PostgreSQL uses a multi-version environment in which locks that are acquired for reading data don’t conflict with locks acquired for writing the data. Hence, making the process more compartmentalized and a lot faster.
The ctid field exists in every PostgreSQL table. It is unique for every record in a table and denotes the turple location. It can be used to delete records. The thing to remember, we should only use ctid if we have absolutely no other unique identifier to use.
Difference between clustered index and non clustered index
 
Cluster index is an index type that is used to sort table data rows on the basis of their key values. In RDBMS primary key allows us to create a clustered index based on that specific column.

A non-clustered index (or regular b-tree index) is an index where the order of the rows does not match the physical order of the actual data. It is instead ordered by the columns that make up the index.
In PostgreSQL, the index is a common way for database performance enhancement. It allows the database server to find the retrieve specific rows faster compared to without index. It also adds overhead to the database system as a whole, so users have to implement them sensibly.
 
Indexes are special lookup tables that are used by the database search engine to speed up data retrieval. Simply defining, an index is a pointer to a specific data in a table.
Knowing the GROUP BY clause can demonstrate a slightly more advanced knowledge of SQL. Thus, you may be asked how the GROUP BY clause works in PostgreSQL.
 
The GROUP BY clause allows you to group the data that results from your queries. Organizing data into groups is useful in making sense of data and in the use of aggregate functions.
 
Imagine you have an eCommerce website that sells several product types. In your database, you have a table that stores information about the inventory you have in stock. If you want to find the count of each product type, you can use GROUP BY with the COUNT() aggregate function. (Note: Aggregate functions are covered in more detail in the next section.) Here’s what the query would look like:
SELECT product_type, COUNT(product_id)
FROM stock
GROUP BY product_type
In PostgreSQL, aggregate functions perform a calculation over multiple rows and return one value. As mentioned previously, aggregate functions are often used alongside the GROUP BY clause, but there are many possible uses for these functions.
 
There are five aggregate functions in SQL :
 
COUNT() : Returns the number of rows that fit the criteria stated in the WHERE clause.

SUM() : Calculates the total of all values in a column or an expression.

AVG() : Calculates the average column value.

MIN() : Returns the smallest value from a set.

MAX() : Returns the largest value from a set.
Another question that interviewers may use to gauge your knowledge of PostgreSQL is to ask you the difference between the WHERE and HAVING clauses.
 
The WHERE and HAVING clauses filter data and restrict unwanted data from appearing in your result set. The main difference between these filters is :
 
WHERE is applied at the record level.

HAVING is applied to sets of records.
13 .
What do you understand by a base directory in PostgreSQL?
The base directory in PostgreSQL is data_dir/base. It is a folder in PostgreSQL which contains all the sub-directories used by a database in clusters and stores all the data you have inserted in your databases.
14 .
What do you understand by a partitioned table in PostgreSQL?
In PostgreSQL, a partitioned table is a logical structure used to split a large table into smaller pieces. These small pieces of the tables are called partitions.
15 .
What is the maximum size for a table in PostgreSQL?
PostgreSQL provides unlimited user database size, but it doesn't provide an unlimited size for tables. In PostgreSQL, the maximum size for a table is set to 32 TB.
16 .
What is Multi-version control?
Multi-version concurrency control or MVCC is used to avoid unnecessary locking of the database. This removes the time lag for the user to log into his database. This feature or time lag occurs when someone else is accessing the content. All transactions are kept as a record.
17 .
Can you explain Pgadmin?
Pgadmin is a feature known to form a graphical front-end administration tool. This feature is available under free software released under an Artistic License. Pgadmin iii is the new database administration tool released under artistic license.
During updating the project, one can never be sure which features will go in and which ones won’t make the cut. The project has precise and stringent standards for quality, and some patches may or may not match them before the set deadline.

Currently, the 9.1 version is working on some essential features, including JSON support, synchronous replication, nearest-neighbor geographic searches, collation at the column level, SQL/MED external data connections, security labels, and index-only access. However, this list is highly likely to change completely by the time Postgre 9.1 is released.
The command enable-debug is used to enable the compilation of all the applications and libraries. The execution of this procedure usually impedes the system, but it also amplifies the binary file size. Debugging symbols that are present generally assist the developers in spotting the bugs and other problems which may arise associated with their script.
Postgres’ write ahead logging increases the reliability (in a way resiliency) of the db by logging changes before any changes or updates to the database. This provides log of database in case of a db crash and this also helps to start the work from the point it was discontinued. This it is useful for crash recovery.
Multi-Version Concurrency Control or MVCC is an advanced technique in PostgreSQL that improves database performance in a multi-user environment. It is mainly used to avoid unnecessary locking of the database by removing the time lag for the user to log into his database.

This time lag occurs when someone else is accessing the content. In Multi-Version Concurrency Control or MVCC, all the transactions are kept as records. That's why PostgreSQL maintains data consistency, unlike most other database systems which use locks for concurrency control.
22 .
What do you understand by string constants in PostgreSQL?
In the PostgreSQL database, a string constant is a sequence of some character bounded by single quotes ('). For example: 'This is an example of string Constant'.
To start Postgres Database server, we need to move to the location where pgsql script (ideally it’s a shell script) and the location of postgresql and start the services respectively as below.
 
We can add them to the auto startup scripts using chkconfig such that whenever server gets restarted postgres will be started.
 
* /usr/local/etc/rc.d/010.pgsql.sh start
* /usr/local/etc/rc.d/postgresql start
To stop the Postgres Database server, we need to move to the location where pgsql script (ideally it’s a shell script) and the location of postgresql and stop the services respectively as below.
 
Alternatively, we can add them to the auto startup scripts using chkconfig such that whenever server gets restarted postgres server will be restarted and thus we can avoid manual intervention.
 
* /usr/local/etc/rc.d/010.pgsql.sh stop
* /usr/local/etc/rc.d/postgresql stop
To enable debug in Postgres, we’ll have to enable the below shared library as shown below.
 
# Edit the entry to add the library ‘plugin_debugger’ :
shared_preload_libraries = ‘plugin_debugger’

Restart PostgreSQL
sudo service postgresql stop
sudo service postgresql start
To enable this extension in PostgreSQL
 
The following is done at each DB level, not instance level. So log into the DB and run following SQL :
CREATE EXTENSION pldbgapi;​

 

The above mentioned command is used for enabling the compilation of all libraries and applications.
 
This process generally slows down the system and it also increases the binary file size. Debugging symbols are present which can assist developers in noticing bugs and problems associated with their script.
A SQL subquery is a query placed inside of another query. Sometimes subqueries are also referred to as nested queries. Below is an example where the subquery comes after the WHERE clause; however, it is possible to have subqueries in the SELECT or FROM part of your queries.
 
SELECT name
FROM employees
WHERE id IN
  (SELECT id FROM sales_team)
Thankfully, PostgreSQL provides us with many methods of modifying the data that is in our database. You have probably heard of these commands before. They are the INSERT, UPDATE, and DELETE commands. Each of these operations has a different impact on the data inside your tables. Let’s take a look at the syntax of each of these statements.
 
SQL UPDATE Syntax

The syntax of the UPDATE statement is:
UPDATE table_name
SET column1 = value1,
        column2 = value2,
        …
[WHERE conditions]

 

SQL INSERT Syntax

You can write an INSERT statement with or without explicitly declaring the column names. The syntax for using INSERT without column names is:

INSERT INTO table_name

VALUES (value1, value2, value3, ...);

Note that you’ll need to provide a value for every column in the table if you omit the column names.

If you want to add data to some or all the columns in a table, use INSERT with column names:

INSERT INTO table_name (column1, column2, column3, ...)

VALUES (value1, value2, value3, ...);


SQL DELETE Syntax

The syntax for using DELETE is : 

DELETE FROM table_name

[WHERE condition]
A view is a database object that acts as a temporary or virtual table. You can query it like a regular table. It can be used in the FROM clause of a SELECT, and you can reference view columns in SELECT, WHERE, GROUP BY, and other clauses.
 
Views and tables differ in some important ways:
 
* Views do not store any records anywhere.
* Tables store records physically, on a disk.
* Views use existing records; they can also calculate new records as needed.

Views are useful for summarizing the data from single or multiple tables. They also provide an additional layer of data security. For example, you can create a view and only give the relevant users access to that view.
 
A simple example of a view might look like this :
 
CREATE VIEW european_customers AS
SELECT customer_id, customer_name
FROM customers
WHERE region = 'Europe';
 
To use this view, you’d write:
SELECT *
FROM european_customers;
29 .
What are the indices of PostgreSQL?
These are inbuilt functions or methods like GIST Indices, hash table, and B-tree, which the user can use to scan the index backwardly. Users can also define their indices of PostgreSQL.
The SQL standard is defined by four transaction isolation levels regarding three phenomena. The three phenomena must be prevented between concurrent transactions. The unwanted phenomenon is
 
Phantom read : a transaction that re-executes a query, returning a set of rows that satisfy a search condition and then finds that the location of rows that have been helping the situation has changed due to another recently committed transaction.

Non-repeatable read : a transaction that re-reads the data that it has previously read and then finds that another transaction has already modified data.

Dirty reads : a transaction that reads data written by a concurrent uncommitted transaction is the dirty read.
CTIDs are a field in every PostgreSQL table and are known to identify specific physical rows according to their block and offset positions within a particular table. They are used by index entries to point to physical rows. It is unique for each record in the table and quickly denotes the location of a tuple.

A logical row’s CTID changes when updated, so the CTID cannot be used as a long-term row identifier. However, it is sometimes helpful to identify a row within a transaction when no competing update is expected.
32 .
How to list the no of Databases ?
To list out all the databases in Postgres, below are the commands we should execute.
su -l pgsql
psql -l
SQL is essential to succeed in any data-focused job, especially data engineering. The more you know about SQL, the easier it’ll be for you to manipulate and query data.
 
Parallel Queries : PostgreSQL makes it possible to run parallel queries. This is when CPU power is leveraged to allow the running of multiple queries at once. This is especially important in data science, where often there is a general query

Full SQL syntax support : PostgreSQL supports a lot of SQL syntax and places an emphasis on SQL standard compliance. As a result, it supports window functions, table inheritance, and common table expressions.

Extended Data Support : PostgreSQL supports NoSQL data structures like JSON and XML.

Declarative Partitioning : This is when tables are split up into different segments called partitions. For example, you can create a different partition for each area code for large, geographically-distributed datasets.
Since no one type of clustering satisfies all needs, they have several different clustering tools which take various clustering approaches. The open source projects PostgresXC and Postgres-XL are available, as well as open source forks and proprietary tools such as Greenplum Database, Citus Data and several others.
 
Also, PostgreSQL is supported by filesystem-based clustering systems for failover, including ones from Red Hat, Microsoft, Veritas and Oracle.
PostgreSQL doesn't have Stored Procedures directly.
 
However, by working with Object Relational Database Management Systems i.e., PostgreSQL have not necessarily the function Stored Procedure (SP) defined, as we can find in most of other types of DBMS.
 
They are in fact, small sets of code stored at server side of a database.
 
On the contrary, to what happens in other databases, the Stored Procedures in PostgreSQL are defined as FUNCTIONS, as well as triggers, making this feature a bit more complicated, depending on its return type. These functions are important and variant in features, but created equal. Working with the creation of these small snippets of code is a good practice because it can leave quite complicated code acting on the server-side that can be used by multiple apps, thus avoiding the need to replicate them in each of these apps.
String constant contains a sequence of characters bound by single quotes. This feature is used during insertion of a character or passing character to database objects. PostgreSQL allows the usage of single quotes but embedded by a C style backslash. This feature is important in parsing data.
The key difference between Multi-Version Concurrency Control and lock models is that in MVCC, the locks acquired for querying or reading the data doesn't conflict with locks acquired for writing data. In this case, reading never blocks writing, and writing never blocks reading. So, Multi-Version Concurrency Control has the upper hand compared to other lock models.
To set up pgAdmin in PostgreSQL, we should follow the steps given below :
 
* First, start and launch pgAdmin 4.

* Then, go to the "Dashboard" tab, click on the "Quick Link" section and then click on "Add new Server."

* After clicking on the "Add new Server", you have to select the "Connection" tab in the "Create-Server" window.

* Now, configure the connection by entering your server's IP address in the "Hostname/Address" field.

* At last, you have to specify the "Port" as "5432," which is the by default port for the PostgreSQL server.
39 .
What do you understand by PL/Python?
PostgreSQL provides support to a procedural language known as PL/Python.
In the PostgreSQL server, PgAdmin is a free, open-source PostgreSQL database administration GUI or tool used in Microsoft Windows, Mac OS X, and Linux systems. PgAdmin is used to retrieve, develop, conduct quality testing, and maintain databases or other ongoing maintenances.
41 .
What is the purpose of Array_To_String in PostgreSQL?
The Array_To_String function concatenates array elements using a provided delimiter.
NoSQL” Databases (Not Only SQL) which are varied range of non-relational databases from small embedded databases to massive clustered (HA/Sarding) data processing platforms like Hadood &  MongoDB and thus it’s absurd to compare and remark on them as a general class RDBMS Databases.
 
Non relational databases preceded RDBMS (relational databases_ and have existed alongside them for 40 years, so choosing between relational and non relational databases is nothing strange.
 
Users should choose the database based on the following aspects :
 
* Features,
* Functionalities,
* Scalability
* Reliability,
* HA capabilities implementation (Replication) 
* Community support their current application needs.
 
Further, using multiple different databases for huge projects is becoming the standard, and PostgreSQL has no exception as well.
Use CURRENT_TIMESTAMP:

CREATE TABLE tbl_name (a int, modtime TIMESTAMP DEFAULT

CURRENT_TIMESTAMP );
44 .
What is TRUNCATE used for?
TRUNCATE TABLE statement is used to remove all data quickly and efficiently from the table.
45 .
Why do I get the error -error: memory exhausted in allocsetalloc ()?
You probably have run out of virtual memory on your system, or your kernel has a low limit for certain resources. Try this before starting the server:
 
Ulimit -D 262144 Limit Datasize 256m
Speed is imperative : At the expense of speed, PostgreSQL was designed with extensibility and compatibility in mind. If your project requires the fastest read operations possible, PostgreSQL may not be the best choice of DBMS.

Simple setups : Because of its large feature set and strong adherence to standard SQL, Postgres can be overkill for simple database setups. For read-heavy operations where speed is required, MySQL is typically a more practical choice.

Complex replication : Although PostgreSQL does provide strong support for replication, it’s still a relatively new feature and some configurations — like a primary-primary architecture — are only possible with extensions. Replication is a more mature feature on MySQL and many users see MySQL’s replication to be easier to implement, particularly for those who lack the requisite database and system administration experience.
It is the method or technique for the search of a single or collection of computer-stored documents in a full-text database. It can be easily differentiated from searches based on the metadata or on parts of the original texts represented in databases.
 
Although PostgreSQL is not as advanced as compared to the Elasticsearch and SOLR, both of these are specific for full-text search tools. Whereas in PostgreSQL, full-text search is only a feature and it is a pretty good one.
In PostgreSQL, tokens are the building blocks of any source code. Tokens contain several types of special character symbols like constants, quoted identifiers, other identifiers, and keywords. The keywords tokens contain pre-defined SQL commands and meanings. On the other hand, identifiers specify variable names like columns, tables, etc.
There are mainly four types of operators used in PostgreSQL :
 
* Arithmetic operators
* Comparison operators
* Logical operators
* Bitwise operators
Indices of PostgreSQL are inbuilt functions or methods such as GIST Indices, hash table, and B-tree (Binary tree). The user uses these to scan the index in a backward manner. PostgreSQL also facilitates their users to define their indices of PostgreSQL.
51 .
What is the way to avoid unnecessary locking of a database?
The best way to avoid unnecessary database locking is to use MVCC (Multi-version concurrency control). It is an advanced technique used in PostgreSQL for improving database performances.
52 .
What do we call database callback functions? What is its purpose?
The database callback functions are known as PostgreSQL Triggers. The PostgreSQL Triggers are performed or invoked automatically whenever a specified database event occurs.
To start, stop, and restart the PostgreSQL server on Windows, first, we have to find the PostgreSQL database directory. It would look something like this : C:\Program Files\PostgreSQL\10.4\data. Now, open the Command Prompt and execute the following commands :
 
To start the PostgreSQL server of Windows :
pg_ctl -D "C:\Program Files\PostgreSQL\9.6\data" start  

To stop the PostgreSQL server of Windows :
pg_ctl -D "C:\Program Files\PostgreSQL\9.6\data" stop​
  
To restart the PostgreSQL server of Windows :
pg_ctl -D "C:\Program Files\PostgreSQL\9.6\data" restart  

Another way to start, stop and restart the PostgreSQL server on Windows.
 
There is also another way to start, stop, and restart the PostgreSQL server on Windows. Follow the steps given below:
 
* First, open the Run Window by pressing the Windows key + R simultaneously.
* Then, type services.msc to find out the PostgreSQL services.
* Search Postgres service based on the version installed.
* Click the stop, start or restart option to do the same.
In PostgreSQL, the DROP TABLE command can be used to delete complete data from an existing table, but the biggest disadvantage of using the DROP TABLE command is that it removes complete table structure from the database. If you use the DROP TABLE command to delete the table, you must re-create a table to store data.
PostgreSQL MongoDB
PostgreSQL is a classical, relational database server that supports most SQL standards. MongoDB is a NoSQL database.
PostgreSQL is a traditional relational database management system (RDBMS) or SQL-based databases like Oracle and MySQL. It is open-source and free to use. MongoDB is a no-schema, NoSQL, JSON format database. It also provides a free version, but its enterprise-paid versions are more popular.
PostgreSQL database is written in C language. MongoDB is written in C++.
PostgreSQL is a Relational Database Management System. MongoDB is a Non-Relational Database Management System.
PostgreSQL is an Object-Oriented Database. MongoDB is Document Oriented Database.
PostgreSQL is available in multiple languages. MongoDB is only available in the English language.
PostgreSQL is 4 to 10 times faster than MongoDB on some parameters. MongoDB is slower than PostgreSQL. It is best suited for big data.
After you install the BRIN index, PostgreSQL is able to select the maximum and minimum value by reading your selected column for each 8k page of stored data. After that, PostgreSQL stores the page number, the minimum value, and the maximum value for your chosen column in BRIN Indexes.
The GiST is an extendable data structure that helps the users to develop indices over any kind of data. It also supports any lookup over that data. The GiST is able to do this by inserting an API to Postgres's index system.
In PostgreSQL, an inverted file is an index data structure used to map content to its location to a database file, within a document, or in sets of documents. It generally includes the distinct words found in a text and a list containing the occurrences of a word in the text. It is used in a data structure for document retrieval systems to provide a full-text search.
In PostgreSQL, a sequence is a special form of data created to generate multiple numeric identifiers in the database. It creates unique identifiers between multiple rows inside a table. It is mainly used to create sequences and artificial primary keys similar to Auto_Increment in MySQL.
Oracle PostgreSQL
Oracle is mostly an aid object-relational database management system. It is the first database management system designed for grid computing. PostgreSQL is a free, open-source object-relational database management system that follows SQL standards and extensibility.
Oracle is written and implemented in C, C++, and assembly language. PostgreSQL is written and implemented in C language.
Oracle is a comparatively old database. It was developed by Larry Ellison and Bob on 16 June 1977. PostgreSQL is comparatively a new database. It was developed by the PostgreSQL Global Development group on 8 July 1996.
One must require a license to use Oracle. PostgreSQL is open-source and free to use.
In Oracle, server operating systems are OS X, Linux, Windows, z/OS, AIX, HP-UX, etc. In PostgreSQL, server operating systems are HP-UX, NetBSD, Solaris, Windows, Unix, Linux, FreeBSD, etc.
Oracle provides advanced security options. PostgreSQL also provides good security support but less compared to Oracle.
Oracle provides support for the programming languages such as C, C++, JAVA, PERL, .NET, JavaScript, PHP, etc. PostgreSQL provides support for the programming languages such as C, C++, JAVA, PERL, SCALA, PHP, C#, COBOL, JavaScript, etc.