What is YAML?
YAML stands for 'YAML Ain't Markup Language'. It is not a programming language but a data format used to store and exchange data. It serves a similar purpose to JSON and XML, and it is popular because it is easy to read and has parser libraries for most programming languages.
YAML files use the extension .yaml or .yml
It is a data serialization language: serialization converts a data object into a series of bytes that captures the state of the object in a form that is easily stored or transmitted. In YAML we can store documents as well as object data.
Benefits of YAML
It is simple and easy to read.
It has a strict syntax: keys and values are case-sensitive, and indentation determines the structure of the data.
It can be easily converted to JSON and XML.
It handles complex, nested data well.
Data parsing is easy.
What does a YAML file look like?
This is just an example. We will be studying each component in detail.
Consider the following example:
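As a first look, here is a small sketch of a YAML file. The record it describes is hypothetical: every key and value below is an assumption chosen only for illustration.

```yaml
# A hypothetical student record (all names and values are illustrative)
name: Asha
age: 21
enrolled: true
skills:
  - Python
  - Docker
address:
  city: Pune
  pin: 411001
```

It mixes the three building blocks covered in this guide: scalars (name, age, enrolled), a list (skills) and a dictionary (address).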
To check it, we paste it into the YAML validator.
The validator reports valid syntax, so the file works perfectly fine!
Indentation
A YAML file relies on whitespace and indentation to indicate nesting. The number of spaces used per level doesn’t matter as long as it is consistent within a block, but tab characters are not allowed for indentation.
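A minimal sketch of how indentation expresses nesting. The key names are made up for illustration; two spaces per level is just a common convention, not a requirement.

```yaml
# Each extra level of indentation nests the keys one level deeper.
server:
  host: localhost
  ports:
    http: 80
    https: 443
# Back at the left margin, so this key is a sibling of 'server'.
debug: false
```

Here 'http' and 'https' belong to 'ports', which in turn belongs to 'server'.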
YAML generally has 3 data types:
Scalar
List
Dictionary
1. Scalar
A scalar is a simple data type that holds a single value. The value can be an integer, float, Boolean or string. Scalars are further classified into numeric data types and string data types.
i) Numeric data type
Take a look at the following Example 1:
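A sketch of what such a file might look like. The key names are the ones used in the description that follows; the values are assumptions.

```yaml
number: 42      # an integer
marks: 87.5     # a floating-point number
bool1: Yes      # boolean under YAML 1.1 rules; some YAML 1.2 parsers read this as a string
bool2: No
```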
In this example, 'number' holds an integer value, 'marks' holds a floating-point value, and 'bool1' and 'bool2' hold boolean values. Along with 'Yes' and 'No', we can also write a boolean as 'True' and 'False' or 'On' and 'Off'. Note that the Yes/No and On/Off forms come from YAML 1.1; parsers that follow the YAML 1.2 core schema treat only true and false as booleans and read the other forms as strings.
We can also specify the data type of a value explicitly by writing two exclamation marks followed by the data type (ex: !!int) before the value.
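A small sketch of explicit type tags, assuming a parser that supports the standard !!int, !!float and !!str tags. The keys and values are hypothetical.

```yaml
count: !!int "25"        # quoted text, but the tag asks for an integer
price: !!float 10        # looks like an integer, loaded as a float
zip_code: !!str 411001   # digits kept as a string
```

Tags are mostly useful when the plain value would otherwise be read as a different type than intended.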
Example 2:
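A sketch of the kind of file this example describes. The key names and values are assumptions, and the binary and octal forms follow YAML 1.1 rules, so a YAML 1.2 parser may read them differently.

```yaml
positive: 25
negative: -25
zero: 0
binary: 0b1101          # 13 under YAML 1.1 rules
octal: 015              # 13 under YAML 1.1 rules; YAML 1.2 writes this as 0o15
hexadecimal: 0xD        # 13
exponential: 6.02e+23   # a float written in exponent notation
```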
Take a close look at the above program and note the following points:
Positive, negative and zero values can be declared easily as shown above.
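Binary Numbers should have the prefix '0b' followed by the desired number. This is a YAML 1.1 feature; the YAML 1.2 core schema has no binary integers.
Octal Numbers should have the prefix '0' followed by the desired number in YAML 1.1; YAML 1.2 uses the prefix '0o' instead.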
Hexadecimal Numbers should have the prefix '0x' followed by the desired number.
We can also write an exponential number in a similar way as shown above.
ii) String data type
Look at the following example
There can be many ways to define and write a string.
We can use a key-value pair, in which the key is followed by a colon (:) and a space, and then write the desired string in double quotes " ".
We can also omit the double quotes and write the desired string directly after the key and the colon.
We can also mark a value as a string explicitly by writing !!str followed by the desired text.
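The three forms described above might look like this. The key names and values are made up for illustration.

```yaml
quoted: "Hello, YAML"   # string in double quotes
plain: Hello YAML       # the same kind of value without quotes
tagged: !!str 2024      # would be an integer without the !!str tag
```

Quotes become necessary when the text contains characters that YAML would otherwise interpret, such as a colon followed by a space or a leading #.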
Writing Multiple lines
In most programming languages, when we want to move to a new line we use '\n'. YAML has something more convenient.
Here we can just write | after the key and then, from the next line, start writing whatever we want in any number of indented lines. The line breaks are kept exactly as written. We can also omit the key 'bio' and simply use |, in which case the whole document is that one string.
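A minimal sketch of the | style, using the key 'bio' mentioned above. The text itself is an assumption.

```yaml
bio: |
  I am learning YAML.
  Each line break here
  is kept as written.
```

When loaded, the value of 'bio' is a three-line string, with a newline after each line.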
NOTE: Take a look at the indentation from line 2. As said earlier, whitespace plays a crucial role in writing YAML files. Every space and every indentation matters!
Writing multiple lines in a single line
There is also a way to spread text over several lines in the file and have it treated as one single line.
We need to use the symbol '>' followed by the indented lines of words and sentences. The line breaks between them are folded into spaces, so everything will be displayed on one single line on the YAML validator. Here we can also omit the key 'message' and simply use >
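A minimal sketch of the > style, using the key 'message' mentioned above. The text is an assumption.

```yaml
message: >
  These lines are
  written separately
  but load as one line.
```

When loaded, the value of 'message' is the single line "These lines are written separately but load as one line." followed by one trailing newline.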
NULL value
In a YAML file, we can set the value of a key to null, meaning it has no value yet. Later, a program that reads the file can replace the null with any other value. There are several ways to write null.
We can use the keyword 'null' (also written 'Null' or 'NULL'), use the symbol '~', or simply leave the value empty.
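A short sketch of the different spellings of a null value. The key names are hypothetical.

```yaml
middle_name: null
nickname: NULL
spouse: ~
phone:          # an empty value is also read as null
```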
The output of the above code on the YAML validator will be
Comments
In computer programming, a comment is a programmer-readable explanation or annotation in the source code of a computer program. They are added to make the source code easier for humans to understand and are generally ignored by compilers and interpreters.
In YAML, a comment starts with the symbol # and runs to the end of the line. YAML has no multi-line comment syntax, so every comment line needs its own #.
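A minimal sketch of comments in a YAML file. The key and value are illustrative.

```yaml
# A comment on its own line
name: Asha   # a comment after a value (needs a space before the #)
# There is no block-comment syntax,
# so each of these lines starts with its own #.
```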
Advanced Data Types
Mapping/Dictionaries
Mappings in YAML represent an unordered collection of key-value pairs. Maps can be nested by increasing the indentation, and returning to the previous indentation level ends the nested map and continues the outer one.
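A sketch of a nested mapping written in the indented (block) style and of a small mapping written inline with curly braces (flow style). All keys and values are assumptions.

```yaml
# Block style: nesting is shown by indentation
person:
  name: Asha
  address:
    city: Pune
    country: India

# Flow style: the same kind of structure on one line
point: {x: 1, y: 2}
```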
There are 2 ways of representing dictionaries: block style, which uses indentation, and flow style, which uses curly braces { }. Mappings can be simple, nested, or mixed with other types.
Sequence/List
In YAML we can also represent data as a sequence, i.e. an ordered list of items. Each item is written on its own line, starting with a hyphen (-) and a space, along with appropriate indentation.
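A minimal sketch of a sequence in the hyphen (block) style, plus the inline (flow) style with square brackets. The keys and items are made up for illustration.

```yaml
# Block style: one item per line, each starting with "- "
fruits:
  - Apple
  - Mango
  - Banana

# Flow style: comma-separated items inside square brackets
colors: [red, green, blue]
```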
Again there are 2 ways of representing a sequence: block style, with one hyphen-prefixed item per line, and flow style, with comma-separated items inside square brackets [ ]. We can also nest sequences and mix them with dictionaries.
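A sketch of nested sequences and of sequences mixed with dictionaries. The data is hypothetical.

```yaml
# A list of dictionaries, each holding its own list
students:
  - name: Asha
    skills:
      - Python
      - Go
  - name: Ravi
    skills: [Java]

# A list of lists
matrix:
  - [1, 2]
  - [3, 4]
```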
There is also a special type of sequence called a sparse sequence, in which some entries are empty or NULL. An empty entry is treated as a NULL value, so there won't be any error.
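A minimal sketch of a sparse sequence. The key name and the non-empty items are assumptions.

```yaml
sparse:
  - first
  -          # empty entry, read as null
  - ~        # null written with a tilde
  - null     # null written with the keyword
  - last
```

The list still has five entries; three of them are null.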
Pairs
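Keys in a mapping must be unique: once we declare a key-value pair, we cannot use that same key again and assign a different value to it. To overcome this there is a special type, pairs, written with the !!pairs tag. It stores an ordered list of key-value pairs in which the same key may appear more than once, so we can assign as many values as we want to any particular key. Refer to the below example for the syntax.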
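A sketch of the !!pairs tag, which is defined in the YAML 1.1 type repository. Not every parser supports it, and the keys and values here are made up for illustration.

```yaml
# The key 'apple' appears twice, which a plain mapping would not allow.
fruit_colours: !!pairs
  - apple: red
  - apple: green
  - banana: yellow
```

A parser that supports the tag typically loads this as an ordered list of (key, value) pairs rather than as a dictionary.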
Anchors and Alias
Sometimes, while writing large files, a particular chunk of data is repeated in several places. To avoid this duplication we use anchors. Anchors and aliases allow us to store and reuse data within our YAML file.
An anchor (&name) marks the part of the file which is to be reused, and an alias (*name) refers back to it. To copy an anchored mapping into another mapping we use the merge key syntax (<<: *name). Refer to the below example for a clear understanding.
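A sketch of an anchor, an alias and the merge key. The names Rahul and Rohit come from this guide, while the anchor name 'base', the key names and the fruit values are assumptions. The merge key (<<) is a YAML 1.1 feature that most, but not all, parsers support.

```yaml
# '&base' anchors this mapping so it can be reused below
base: &base
  fav_fruit: Mango
  dislikes: Grapes

rahul:
  <<: *base            # merge in everything from 'base'
  name: Rahul
  dislikes: Pineapple  # overrides the merged value

rohit:
  <<: *base
  name: Rohit
  fav_fruit: Orange    # overrides the merged value
```

Keys written directly in a mapping take precedence over keys brought in through the merge.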
We can also override a few of the merged values if we wish, by writing those keys again in the mapping that uses the merge. Here we have changed Rahul's dislike to Pineapple and Rohit's favourite fruit to Orange. The output of it will be something like this:
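As a rough illustration, assume a shared anchored block that sets fav_fruit to Mango and dislikes to Grapes (hypothetical key names and values). After merging and overriding, the data for the two people is equivalent to:

```yaml
rahul:
  name: Rahul
  fav_fruit: Mango      # inherited from the shared block
  dislikes: Pineapple   # overridden
rohit:
  name: Rohit
  fav_fruit: Orange     # overridden
  dislikes: Grapes      # inherited from the shared block
```

The exact layout printed by a validator may differ, but the resolved values should match.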
Documents
One self-contained block of YAML content is called a document. YAML files can contain multiple documents. To separate these multiple documents we use 3 hyphens (---). And if we wish to mark the end of a document explicitly then we use 3 dots (...).
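A minimal sketch of a file holding two documents. The contents of each document are placeholders.

```yaml
---
name: first document
kind: example
---
name: second document
kind: example
...
```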
In the above example, the documents are separated by --- and the last document is ended by ...
YAML vs JSON
JSON: It’s a lightweight, text-based format for data exchange between different systems. Because its syntax is derived from JavaScript object notation, JSON is fairly easy for humans to read and write, and it’s also easy for machines to parse and generate. It is easy to learn, fast to parse and validate, compact, and widely supported for connecting with other systems.
YAML: YAML is designed to be a highly human-readable data serialization format. Many developers consider it easier to read than JSON because it relies on indentation and uses very little punctuation. YAML can handle a variety of tasks, works with multiple data types, and allows the use of comments.
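To make the comparison concrete, here is the same hypothetical record written twice in one YAML file: once in indented YAML style and once in JSON syntax, which YAML 1.2 parsers generally accept as well. The key names are assumptions.

```yaml
# Indented YAML style, with a comment (JSON has no comments)
yaml_style:
  name: Asha
  skills:
    - Python
    - Docker

# The same data in JSON syntax
json_style: {"name": "Asha", "skills": ["Python", "Docker"]}
```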
Which to choose?
It is quite difficult to decide whether JSON or YAML is better. JSON is generally faster to parse than YAML. However, for small configuration files that people edit by hand, YAML is often the better choice since its syntax is much more friendly.
Uses of YAML
YAML is used in Ansible and for Kubernetes resources and deployments. A benefit of using YAML is that YAML files can be added to source control, such as GitHub, so that changes can be tracked and audited. YAML files are also used while working with other DevOps tools like Jenkins, Terraform, Docker Compose etc.
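As one illustration, a minimal Docker Compose file might look like the sketch below. The service name 'web', the image and the port numbers are assumptions chosen for the example.

```yaml
# Hypothetical docker-compose.yml defining a single web service
services:
  web:
    image: nginx:latest
    ports:
      - "8080:80"    # host port 8080 mapped to container port 80
```

It uses only the pieces covered in this guide: nested dictionaries, a list and quoted strings.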
Conclusion
So this was a beginner's guide to YAML. This will give you a basic and brief overview of the YAML language.
Credits
A huge thanks to Kunal Kushwaha for continuously motivating me and constantly providing his valuable knowledge.
Link to YAML Validator: https://www.yamllint.com/