YAML
YAML is a specialized language for storing structured information with a simple syntax. This tool allows you to store complex data in a compact and easy-to-read format (.yml file). This feature can be extremely useful in the context of DevOps and virtualization.
What is YAML?
YAML is a language designed to store information in a human-readable format. Its name stands for “Yet Another Markup Language.” However, this definition was later changed to “YAML is not a markup language” to clearly distinguish it from other languages in that category.
This language is similar to XML and JSON, but uses a more sophisticated syntax while maintaining similar capabilities. YAML is typically used to create configuration files within the Infrastructure as Code (IaC) approach, as well as for container management in the context of DevOps.
YAML is most often used to create automation protocols that can execute sequences of commands specified in a .yml file. This allows your system to function more quickly and independently, without additional developer involvement.
A growing number of companies are actively implementing DevOps practices and virtualization in their operations. For this reason, YAML proficiency is becoming an essential requirement for modern developers. Particularly valuable is the language’s compatibility with Python (including the PyYAML library), as well as popular technologies like Docker and Ansible, which significantly simplify integration.
Comparison of YAML, JSON, and XML
YAML (.yml)
Peculiarities:
- has a human-readable code;
- has a minimalistic syntax;
- data-oriented;
- includes a structure resembling JSON (YAML is an extended version of JSON);
- allows you to add comments;
- supports the use of unquoted strings;
- considered cleaner than JSON;
- provides additional features such as extensible data types, relative anchors, and preservation of key order.
Usage: YAML is ideal for data-intensive applications that rely on DevOps processes or use virtual machines. Increased data readability is especially useful in teams where developers regularly interact with this data.
JSON
Peculiarities:
- requires more effort to read.
- The syntax has strict and clear requirements.
- similar to the inline YAML style (some YAML parsers can interpret JSON files);
- There is no option to add comments.
- Strings must be enclosed in double quotes.
Usage: JSON is used in web development, representing the best format for serializing and transmitting data over an HTTP connection.
XML
Peculiarities :
- Requires more effort to read.
- Has a more complex structure.
- Serves as a markup language, while YAML is used to format data.
- Provides a wider range of possibilities, such as the use of tag attributes.
- Has a more rigid document structure.
Usage: XML is ideal for complex projects requiring careful control over validation, schema, and namespaces. XML is difficult to read, consumes more bandwidth, and requires more storage space, but it provides an unrivaled level of control.
What makes YAML unique?
Multiple document support
You have the ability to group multiple YAML documents into a single file, making file management and information processing much easier.
Documents are separated by a triple hyphen (-):
Ability to add comments
YAML allows comments to be inserted after the # symbol, similar to how Python does it:
Clear syntax
YAML file syntax uses an indentation system similar to that of the Python programming language. It’s important to use spaces instead of tabs to avoid confusion.
This approach eliminates the redundant use of characters common to JSON and XML (such as quotation marks, parentheses, and curly braces). As a result, the file becomes much more readable.
YAML:
JSON:
Explicit and implicit typing
YAML implements both explicit and implicit data typing. It provides the ability to automatically detect types and specify them explicitly. To use a specific data type, simply add
Examples of implicit typing:
No embedded executable files
The language does not contain embedded executables. This ensures the secure exchange of YAML files with other parties.
To work with executable files, you will need to integrate YAML with other languages such as Perl or Java.
YAML syntax
The YAML language has several fundamental concepts that enable the processing of a wide variety of data.
Key-value pairs
The bulk of the data in the yml file is in the form of key-value pairs, where the key represents a name and the value represents the corresponding data.
Scalars and mapping
A scalar denotes a single value associated with a particular name.
The YAML language supports standard types: int and float, boolean, string, and null.
These types can be represented in a variety of formats, including hexadecimal, octal, and scientific notation. Additionally, there are special types for mathematical concepts such as infinity, negative infinity, and NaN.
Lines
A line is a sequence of characters that can include words or entire sentences. Lines are denoted by the symbols | for individual lines and > for paragraphs.
It is important to note that YAML does not require quotation marks.
Sequences
Sequences are data structures similar to lists or arrays that store multiple values under a single key. They are defined using indentation or [].
Single-line sequences look more compact, but their readability suffers:
Dictionaries
Dictionaries are collections of key-value pairs grouped under a single key. They allow you to structure data into logical categories.
Anchors
Anchors are a unique feature of the YAML language, allowing you to create links to specific data elements within the document structure. This is especially useful when the same data is repeated in different places in the document, as anchors help avoid duplication of information.
Anchors work quite simply: you define a data element using an anchor, and then you can reference that anchor in other parts of the document. This allows you to store data in one place and reference it as needed.
Example of using anchors:
In this example, the &details anchor creates an anchor containing general data for a person. The data defined in the anchor is then linked <<: *detailsin sections using a link. This avoids duplication and simplifies data updates, as changes in one location are automatically reflected elsewhere.employee1employee2
Integration with Docker, Ansible, and other tools
YAML is actively used for integration with various automation, deployment, and management tools, such as Docker, Ansible, and many others. This integration enables more efficient configuration management, deployment, and task automation.
Docker Integration:
Docker uses YAML files to define container configurations. The docker-compose.yml file is an example of how to use the language to define multi-container applications, including containers, networks, and volumes.
Example of a service definition in Docker Compose:
Ansible integration:
Ansible uses YAML files to define infrastructure configurations and automate tasks. YAML files in Ansible contain “playbooks”—sets of tasks that describe what should be done on target systems and how.
Example Ansible playbook:
Extended forms of sequences and mapping
YAML has extended forms for describing sequences (lists) and mappings (dictionaries) that add additional capabilities and flexibility to structuring data.
Sequence forms:
- Multi-line strings within a sequence:
In this example, the second element “banana” is represented by a multi-line string, allowing for longer, more descriptive data to be included.
- Nested sequences:
Mapping forms:
- Block mapping style:
- Nested mappings:
Extended data types (timestamp, null, and others)
YAML supports extended data types in addition to the standard types (string, number, Boolean, etc.). These extended data types allow for more precise and concise descriptions of various entities.
Examples of extended data types in YAML:
- Timestamp:
Here timestamp_exampleis a timestamp in ISO 8601 format.
- Null (Empty value):
This example demonstrates the use of a value nullthat can denote the absence of a value.
- NaN (Not a Number):
The value .nanrepresents “not a number”, which can be used to denote undefined numeric values.
- Infinity:
The value .infrepresents positive infinity.
- Negative infinity:
The value -.infrepresents negative infinity.