The API provides a suite of specialized operations designed to optimize specific tasks.
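The currently available operations include analyze-image, compare-image, compare-video, compress, convert-pdfa, extract-archive, extract-stream, merge-stream, thumbnail, and validate-pdfa.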
The analyze-image operation enables you to extract valuable information from an image, utilizing both traditional and artificial intelligence (AI) techniques.
You can customize the extraction process for specific data by enabling the corresponding option.
Option Name | Type | Possible Values | Description |
---|---|---|---|
nsfw | boolean | true, false | This option activates the detection of Not Safe For Work (NSFW) content in images. It provides an indication of whether explicit or inappropriate material is present. |
age | boolean | true, false | Enabling this option triggers the identification of the age of individuals depicted in the images. It provides insights into the approximate age range of people captured in the visual content. |
max_outputs_per_input | integer | 1-2048 | Specifies the maximum number of outputs to be generated per input file during the conversion process. If the conversion yields more than one output per input, only the specified number of outputs will be returned. |
allow_multiple_outputs | boolean | true, false | If the operation produces more than one output file, by default all of them are compressed into a single file. Set this option to true if you want a download link for each file. |
{
  "conversion": [{
    "category": "operation",
    "target": "analyze-image",
    "options": {
      "nsfw": true,
      "age": true
    }
  }]
}
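The request bodies on this page all share one shape, so they can be generated by a small helper. The following Python sketch is illustrative only: the name `build_operation` is hypothetical, and how the body is sent (endpoint, authentication, input files) is not covered here.

```python
import json


def build_operation(target, **options):
    """Return a request body for one operation in the shape used on this page."""
    step = {"category": "operation", "target": target}
    if options:  # some operations, such as compare-video, take no options
        step["options"] = options
    return {"conversion": [step]}


body = build_operation("analyze-image", nsfw=True, age=True)
print(json.dumps(body, indent=2))
```

Keeping the body as a plain dictionary makes it easy to serialize with `json.dumps` or to pass to whichever HTTP client you use.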
The compare-image operation allows for the comparison of differences between two provided images.
Option Name | Type | Possible Values | Description |
---|---|---|---|
method | string | AE, MAE, NCC, PSNR, RMSE | The algorithm to use for image comparison. |
diff_color | string | red, green, blue, yellow, orange, violet, black, white | The color to use to show the pixels that are different in the diff image. |
threshold | integer | 0-100 | Sets the threshold in percentage above which color is to be considered different between the two images. A zero means that the pixels must be identical to be considered equal. |
allow_multiple_outputs | boolean | true, false | If the comparison produces more than one output file, by default all of them are compressed in just one file. Set this option to true if you want a download link for each file. |
Value | Description |
---|---|
AE | The Absolute Error (AE) algorithm measures the absolute differences between the pixel values of the two images being compared. |
MAE | The Mean Absolute Error (MAE) algorithm calculates the average absolute differences between the pixel values of the two images being compared. |
NCC | The Normalized Cross-Correlation (NCC) algorithm measures the similarity between the two images by comparing the correlation of the pixel values. |
PSNR | The Peak Signal-to-Noise Ratio (PSNR) algorithm calculates the ratio between the maximum possible pixel value of the image and the difference between the two images being compared. |
RMSE | The Root Mean Squared Error (RMSE) algorithm calculates the square root of the average of the squared differences between the pixel values of the two images being compared. |
{
  "conversion": [{
    "category": "operation",
    "target": "compare-image",
    "options": {
      "method": "mae",
      "diff_color": "orange",
      "threshold": 25,
      "allow_multiple_outputs": true
    }
  }]
}
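Before sending a compare-image request, the options can be checked locally against the table above. The sketch below is a hedged illustration, not part of the API: the function name is hypothetical, and matching `method` case-insensitively is an assumption made because the example request uses "mae" while the table lists upper-case values.

```python
METHODS = {"AE", "MAE", "NCC", "PSNR", "RMSE"}
DIFF_COLORS = {"red", "green", "blue", "yellow", "orange", "violet", "black", "white"}


def check_compare_image_options(options):
    """Return a list of problems found in a compare-image options dict."""
    problems = []

    method = options.get("method")
    # Assumption: the method name is case-insensitive.
    if method is not None and str(method).upper() not in METHODS:
        problems.append(f"unknown method: {method!r}")

    color = options.get("diff_color")
    if color is not None and color not in DIFF_COLORS:
        problems.append(f"unknown diff_color: {color!r}")

    threshold = options.get("threshold")
    if threshold is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(threshold, bool) or not isinstance(threshold, int):
            problems.append("threshold must be an integer")
        elif not 0 <= threshold <= 100:
            problems.append("threshold must be between 0 and 100")

    return problems


print(check_compare_image_options({"method": "mae", "diff_color": "orange", "threshold": 25}))
```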
The output of this image comparison operation may include two files: a diff image that highlights the differing pixels, and a JSON file with the comparison results.
The JSON output in this example represents the results of an image comparison and can be used to determine how similar or different the two input images are.
It provides a per-channel comparison and a relative measure of the difference between the images.
It includes the comparison method used and the resulting value for each color channel; since the method used in the call was mae, these are Mean Absolute Error values.
The top-level key "method" is a string naming the method used for the comparison. In this example, it is set to "Mean Absolute Error".
The next key is an identifier, "fe194466-3df5-4baf-8138-511d874e583e", associated with the comparison result, which contains the color channels red, green, blue, and all.
Each color channel maps to a string with two parts: the first is the Mean Absolute Error value, and the second (the value in parentheses) is the relative error as a ratio between 0 and 1, which helps to judge how different the two images are.
For example, the value "1911.56 (0.0291685)" for the "red" color channel means that the Mean Absolute Error for the red channel is 1911.56 and the relative error is 0.0291685.
{
  "method": "Mean Absolute Error",
  "fe194466-3df5-4baf-8138-511d874e583e": {
    "red": "1911.56 (0.0291685)",
    "green": "1813.52 (0.0276726)",
    "blue": "2007.71 (0.0306357)",
    "all": "1910.93 (0.029159)"
  }
}
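Because each channel value is a string that packs two numbers, a client usually has to split it before use. The following Python sketch shows one way to do that; the function names are hypothetical, and it assumes that every top-level key other than "method" is a result identifier.

```python
import re

# Matches strings such as "1911.56 (0.0291685)".
CHANNEL_RE = re.compile(r"^\s*(\S+)\s*\(\s*(\S+?)\s*\)\s*$")


def parse_channel(text):
    """Split one channel string into (error_value, relative_error) floats."""
    match = CHANNEL_RE.match(text)
    if match is None:
        raise ValueError(f"unexpected channel format: {text!r}")
    return float(match.group(1)), float(match.group(2))


def parse_comparison(result):
    """Return {channel: (error_value, relative_error)} for the first result entry."""
    # Assumption: every key other than "method" is a result identifier.
    entries = [value for key, value in result.items() if key != "method"]
    return {channel: parse_channel(text) for channel, text in entries[0].items()}


result = {
    "method": "Mean Absolute Error",
    "fe194466-3df5-4baf-8138-511d874e583e": {
        "red": "1911.56 (0.0291685)",
        "green": "1813.52 (0.0276726)",
        "blue": "2007.71 (0.0306357)",
        "all": "1910.93 (0.029159)",
    },
}

channels = parse_comparison(result)
# Pick the color channel with the largest relative error.
worst = max(("red", "green", "blue"), key=lambda name: channels[name][1])
print(worst, channels[worst])
```

With the example values, the blue channel has the largest relative error.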
The compare-video operation allows for the comparison of differences between two provided videos.
It utilizes the Video Multimethod Assessment Fusion (VMAF) video quality metric to evaluate the similarity in quality between the two videos.
VMAF is a machine learning-based video quality assessment algorithm that combines the strengths of several existing video quality metrics into a single, unified score.
It was designed to closely predict the Mean Opinion Score (MOS) of human viewers and has been shown to be highly correlated with subjective quality assessments.
By calculating the VMAF score for two videos, you can get an idea of how similar in quality the videos are to each other.
{
  "conversion": [{
    "category": "operation",
    "target": "compare-video"
  }]
}
Field Name | Description |
---|---|
integer_adm2 | The ADM feature, VMAF's Detail Loss Metric (DLM), which measures the loss of detail between the two videos. |
integer_adm_scale0, integer_adm_scale1, integer_adm_scale2, integer_adm_scale3 | The ADM metric calculated at a lower resolution. |
integer_motion2 | The Motion2 metric, which measures the degree of motion in the video. |
integer_motion | The Motion metric, which measures the degree of motion in the video. |
integer_vif_scale0, integer_vif_scale1, integer_vif_scale2, integer_vif_scale3 | The Visual Information Fidelity (VIF) metric calculated at a lower resolution. |
vmaf | The VMAF score is calculated on a per-frame basis and the vmaf field in the metrics field contains the VMAF score for each individual frame. |
pooled_metrics | Contains summary statistics for each metric over the entire video. The statistics include the minimum, maximum, mean and harmonic mean values for each metric. The vmaf field in the pooled_metrics field contains summary statistics for the VMAF scores over the entire video. |
The output of this operation includes information about the quality of each frame in the video, as well as summary statistics for the entire video.
The metrics field contains a set of quality metrics for each frame, as in the code example.
{
  "version": "2.3.1",
  "fps": 24.03,
  "frames": [
    {
      "frameNum": 0,
      "metrics": {
        "integer_adm2": 0.842100,
        "integer_adm_scale0": 0.779905,
        "integer_adm_scale1": 0.749054,
        "integer_adm_scale2": 0.846122,
        "integer_adm_scale3": 0.901889,
        "integer_motion2": 0.000000,
        "integer_motion": 0.000000,
        "integer_vif_scale0": 0.198811,
        "integer_vif_scale1": 0.522634,
        "integer_vif_scale2": 0.660773,
        "integer_vif_scale3": 0.768340,
        "vmaf": 42.687407
      }
    },
    <...more frames...>
  ],
  "pooled_metrics": {
    "integer_adm2": {
      "min": 0.794626,
      "max": 0.863825,
      "mean": 0.830444,
      "harmonic_mean": 0.830332
    },
    "integer_adm_scale0": {
      "min": 0.698013,
      "max": 0.818805,
      "mean": 0.761654,
      "harmonic_mean": 0.761299
    },
    <...more metrics...>
    "vmaf": {
      "min": 40.664485,
      "max": 61.936634,
      "mean": 51.017497,
      "harmonic_mean": 50.703806
    }
  }
}
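The relationship between the per-frame values and the pooled statistics can be illustrated by recomputing a summary from the frames list. This Python sketch is only an approximation under stated assumptions: the function name is hypothetical, the sample frames are made up apart from the first score, and the API may compute its own harmonic mean slightly differently from `statistics.harmonic_mean`.

```python
from statistics import fmean, harmonic_mean


def pool_metric(frames, name="vmaf"):
    """Compute min, max, mean, and harmonic mean of one metric across frames."""
    values = [frame["metrics"][name] for frame in frames]
    return {
        "min": min(values),
        "max": max(values),
        "mean": fmean(values),
        "harmonic_mean": harmonic_mean(values),
    }


# Made-up frames for illustration; only the first score comes from the example.
frames = [
    {"frameNum": 0, "metrics": {"vmaf": 42.687407}},
    {"frameNum": 1, "metrics": {"vmaf": 50.0}},
    {"frameNum": 2, "metrics": {"vmaf": 60.0}},
]

summary = pool_metric(frames)
print(summary)
```

A harmonic mean that is noticeably lower than the mean suggests that a few frames score much worse than the rest.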
The compress operation allows for the reduction of an image file's size, potentially with no noticeable loss in quality.
This is achieved through the use of various image compression algorithms, which are designed to remove redundant or unnecessary information from the image file while preserving the visual integrity of the image.
There are several types of image compression algorithms that can be used, each with their own trade-offs in terms of file size reduction and image quality.
The specific compression algorithm is selected automatically according to the compression_level option, which accepts the values low, medium, high, and best.
Option Name | Type | Possible Values | Description |
---|---|---|---|
compression_level | string | low, medium, high, best | Sets how strongly the file is compressed, while aiming for no perceivable loss of quality. |
compression_target | string | gif, jpg, png | Sets the target format if it differs from the input file's format. |
allow_multiple_outputs | boolean | true, false | If the operation produces more than one output file, by default all of them are compressed into a single file. Set this option to true if you want a download link for each file. |
{
  "conversion": [{
    "category": "operation",
    "target": "compress",
    "options": {
      "compression_level": "high",
      "compression_target": "jpg"
    }
  }]
}
The convert-pdfa operation allows for the conversion of a PDF file to a PDF/A archive, which is a specialized format designed for long-term preservation of electronic documents.
PDF/A ensures that documents will be rendered correctly and consistently over time, even if the software and hardware used to view the documents change.
Option Name | Type | Possible Values | Description |
---|---|---|---|
validation_profile | string | pdfa1a, pdfa1b, pdfa2a, pdfa2b, pdfa2u, pdfa3a, pdfa3b, pdfa3u | The converted file is PDF/A compliant to the specified profile |
strict_mode | boolean | true, false | If true, the operation does not modify the appearance of the input file to make it PDF/A compliant. Any elements of the file that do not meet the requirements of the selected PDF/A profile are left unaltered, which can result in a non-compliant file and a failed conversion. If false, the operation makes any modifications needed to make the input file PDF/A compliant. |
allow_multiple_outputs | boolean | true, false | If the operation produces more than one output file, by default all of them are compressed into a single file. Set this option to true if you want a download link for each file. |
Profile | Description |
---|---|
pdfa1a | PDF/A-1 (ISO 19005-1, based on PDF 1.4), conformance level A (accessible). In addition to the level B requirements, the document must be tagged with logical structure information and its text must map to Unicode. |
pdfa1b | PDF/A-1, conformance level B (basic). Ensures reliable reproduction of the document's visual appearance, for example by requiring that all fonts be embedded. |
pdfa2a | PDF/A-2 (ISO 19005-2, based on PDF 1.7), level A. Compared with PDF/A-1, it additionally allows features such as transparency and JPEG 2000 image compression, and it requires tagging with logical structure information. |
pdfa2b | PDF/A-2, level B. Like pdfa2a, but only the visual appearance is guaranteed; logical structure information is not required. |
pdfa2u | PDF/A-2, level U. Like pdfa2b, with the added requirement that all text can be mapped to Unicode. |
pdfa3a | PDF/A-3 (ISO 19005-3), level A. Like pdfa2a, but additionally allows files of any format to be embedded as attachments. |
pdfa3b | PDF/A-3, level B. Like pdfa2b, but additionally allows files of any format to be embedded as attachments. |
pdfa3u | PDF/A-3, level U. Like pdfa2u, but additionally allows files of any format to be embedded as attachments. |
{
  "conversion": [{
    "category": "operation",
    "target": "convert-pdfa",
    "options": {
      "validation_profile": "pdfa1b",
      "strict_mode": true
    }
  }]
}
The extract-archive operation allows for accessing the contents of an archived file and extracting them for further processing or use.
It can handle a range of different archive formats, such as a ZIP, 7ZIP, TAR.GZ, TAR.BZ2, or RAR file, making it a versatile tool for working with archived content.
Option Name | Type | Possible Values | Description |
---|---|---|---|
summary | boolean | true, false | If true, the output includes a summary of the file paths and sizes of the extracted files, structured so that it is easy to parse and to build a tree-like view for display, such as on a website. |
allow_multiple_outputs | boolean | true, false | If the operation produces more than one output file, by default all of them are compressed into a single file. Set this option to true if you want a download link for each file. |
{
  "conversion": [{
    "category": "operation",
    "target": "extract-archive",
    "options": {
      "summary": true
    }
  }]
}
The extract-stream operation allows for the extraction of individual streams from a multimedia file, such as an MP4 file, for further processing or use.
It can handle a range of different file formats, making it a versatile tool for working with multimedia content.
By default, this operation extracts all streams contained within the input file, including audio, video, and subtitle streams.
Option Name | Type | Possible Values | Description |
---|---|---|---|
stream | string | all, audio, video, subtitle | Selects which type of stream to extract. This can be useful for isolating and processing individual streams within a multimedia file. |
allow_multiple_outputs | boolean | true, false | If the operation produces more than one output file, by default all of them are compressed into a single file. Set this option to true if you want a download link for each file. |
The merge-stream operation allows for the merging of multiple multimedia streams, such as audio, video, and subtitle files, into a single multimedia file.
It offers a range of options for customizing the resulting merged file, making it a versatile tool for working with multimedia content.
Option Name | Type | Possible Values | Description |
---|---|---|---|
video_format | string | avi, flv, mkv, mov, mp4, mpg, ogv, webm, wmv | Allows the specification of the format of the merged file. |
keep_original_audio | boolean | true, false | If set to true, this option will keep the audio streams from the original video. |
input_filters | array of objects | see details later in this page | This option allows for the application of filters to the merged result, such as modifying the volume of individual streams. |
metadata | array of objects | see details later in this page | This option allows the specification of metadata for each stream, such as the language of an audio stream or the subtitle language of a subtitle stream. |
audio_channels | string | layout, back_center, back_left, back_right, front_center, front_left, front_left_of_center, front_right, front_right_of_center, low_frequency, side_left, side_right, top_center, top_front_center, top_front_left, top_front_right, top_back_center, top_back_left, top_back_right, wide_left, wide_right | This option allows the specification of the audio layout of the final merged file. See details later in this page. |
allow_multiple_outputs | boolean | true, false | If the operation produces more than one output file, by default all of them are compressed into a single file. Set this option to true if you want a download link for each file. |
input_filters: it requires an array of objects where each object contains the following fields:
Property Name | Type | Possible Values | Description |
---|---|---|---|
input_id | string | pattern: ^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$ | Specifies the input_id of the file for which the desired filter will be set. |
filter_name | string | volume | The filter to apply to the input identified by input_id. |
filter_value | string | The value(s) to use with the selected filter as a comma separated list | The value or values for the selected filter. For example, the volume filter controls the loudness of the stream: a value of 0 mutes the stream, 0.50 reduces it to half, 1 leaves it as in the original file, and 1.50 increases it by 50%. |
metadata: it requires an array of objects where each object contains the following fields:
Property Name | Type | Possible Values | Description |
---|---|---|---|
input_id | string | pattern: ^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$ | Specifies the input_id of the file for which the desired properties will be set. |
language | string | pattern: ^[a-z]{3}$ | The language (using ISO 639-2/T standard) of the stream contained within the input file, as specified by the input_id field |
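The two patterns in the tables above can be checked on the client before a request is built. The following Python sketch assembles `input_filters` and `metadata` entries with that validation; the helper names and the sample identifier are hypothetical and only follow the formats documented here.

```python
import re

INPUT_ID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)
LANGUAGE_RE = re.compile(r"^[a-z]{3}$")


def _check_input_id(input_id):
    if INPUT_ID_RE.fullmatch(input_id) is None:
        raise ValueError(f"invalid input_id: {input_id!r}")


def volume_filter(input_id, volume):
    """Build one input_filters entry that sets the volume of a stream."""
    _check_input_id(input_id)
    # filter_value is documented as a string, so the number is converted.
    return {"input_id": input_id, "filter_name": "volume", "filter_value": str(volume)}


def language_tag(input_id, language):
    """Build one metadata entry with a three-letter ISO 639-2/T language code."""
    _check_input_id(input_id)
    if LANGUAGE_RE.fullmatch(language) is None:
        raise ValueError(f"invalid language code: {language!r}")
    return {"input_id": input_id, "language": language}


# Hypothetical identifier in the documented format.
audio_id = "fe194466-3df5-4baf-8138-511d874e583e"
options = {
    "input_filters": [volume_filter(audio_id, 0.5)],
    "metadata": [language_tag(audio_id, "eng")],
}
print(options)
```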
audio_channels: This option allows the specification of the audio layout of the final merged file.
The option layout allows you to predefine the layout of sound sources. Possible values are described in the next table.
Option Value | Description |
---|---|
mono | A single channel, with all audio being played through a single speaker or set of speakers. |
stereo | 2 channels, typically played through a pair of left and right speakers. |
2.1 | 2 channels for left and right audio, plus a separate channel for low-frequency effects (LFE) played through a subwoofer. |
3.0 | 3 channels, with left, right, and center audio played through separate speakers. |
3.0-back | This audio layout is similar to 3.0, but includes an additional back channel for rear audio. |
3.1 | 3 channels for left, right, and center audio, plus a separate channel for LFE played through a subwoofer. |
4.0 | 4 channels, with left, right, left-surround, and right-surround audio played through separate speakers. |
4.1 | 4 channels for left, right, left-surround, and right-surround audio, plus a separate channel for LFE played through a subwoofer. |
quad | This audio layout is similar to 4.0, but with the addition of a separate channel for rear audio. |
quad-side | This audio layout is similar to quad, but with the addition of side channels for left-side and right-side audio. |
5.0 | 5 channels, with left, right, center, left-surround, and right-surround audio played through separate speakers. |
5.0-side | This audio layout is similar to 5.0, but with the addition of side channels for left-side and right-side audio. |
5.1 | 5 channels for left, right, center, left-surround, and right-surround audio, plus a separate channel for LFE played through a subwoofer. |
5.1-side | This audio layout is similar to 5.1, but with the addition of side channels for left-side and right-side audio. |
6.0 | 6 channels, with left, right, center, left-surround, right-surround, and back audio played through separate speakers. |
6.0-front | This audio layout is similar to 6.0, but with the addition of front channels for left-front and right-front audio. |
hexagonal | 6 channels, with left, right, left-surround, right-surround, left-back, and right-back audio played through separate speakers. |
6.1 | 6 channels for left, right, center, left-surround, right-surround, and back audio, plus a separate channel for LFE played through a subwoofer. |
6.1-back | This audio layout is similar to 6.1, but with the back channel replaced by left-back and right-back channels. |
6.1-front | This audio layout is similar to 6.1, but with the back channel replaced by left-front and right-front channels. |
7.0 | 7 channels, with left, right, center, left-surround, right-surround, left-back, and right-back audio played through separate speakers. |
7.0-front | This audio layout is similar to 7.0, but with the addition of front channels for left-front and right-front audio. |
7.1 | 7 channels for left, right, center, left-surround, right-surround, left-back, and right-back audio, plus a separate channel for LFE played through a subwoofer. |
7.1-wide | This audio layout is similar to 7.1, but with the addition of wide channels for left-wide and right-wide audio. |
7.1-wide-side | This audio layout is similar to 7.1-wide, but with the addition of side channels for left-side and right-side audio. |
octagonal | 8 channels, with left, right, left-surround, right-surround, left-back, right-back, left-side, and right-side audio played through separate speakers |
hexadecagonal | 16 channels, with left, right, center, left-surround, right-surround, left-back, right-back, left-side, right-side, left-front, right-front, left-wide, right-wide, top-front-center, top-back-center, and bottom-center audio played through separate speakers. |
Furthermore, the audio_channels option lets you assign the individual channels of the above layouts.
Each key is a channel name, and its value is the input_id of the file to use as the source for that channel.
The available channel names are: back_center, back_left, back_right, front_center, front_left, front_left_of_center, front_right, front_right_of_center, low_frequency, side_left, side_right, top_center, top_front_center, top_front_left, top_front_right, top_back_center, top_back_left, top_back_right, wide_left, wide_right.
{
  "conversion": [{
    "category": "operation",
    "target": "merge-streams",
    "options": {
      "audio_channels": {
        "layout": "3.0",
        "front_left": "{input_file_id_1}",
        "front_right": "{input_file_id_2}",
        "front_center": "{input_file_id_3}"
      }
    }
  }]
}
The thumbnail operation allows you to create small versions of images or document pages, called thumbnails.
This can be useful if you want to display a large number of images or document pages in a small space.
Option Name | Type | Possible Values | Description |
---|---|---|---|
thumbnail_target | string | png, jpg | Creates the thumbnail in the Portable Network Graphics (PNG) format or in the Joint Photographic Experts Group (JPEG) format. |
width | integer | Range: 10-500 | Specifies the width of the thumbnail in pixels. You can use either option alone or together with height to specify the thumbnail's aspect ratio. |
height | integer | Range: 10-500 | Specifies the height of the thumbnail in pixels. You can use either option alone or together with width to specify the thumbnail's aspect ratio. |
dpi | integer | Range: 10-300 | Specifies the horizontal and vertical density of the thumbnail image. This is important because it determines the size of the image when it is printed or displayed on a high-resolution screen. |
pages | string | Pattern: ^(((\d+|first)-?(\d*|last))|(first)|(last))(, ?(((\d+|first)-?(\d*|last))|(first)|(last)))*$ | Specifies the pages of a document to convert to image thumbnails, which is useful if you only need thumbnails of certain pages. You can specify a single page number or a range of pages separated by a hyphen, and combine several of them in a comma-separated list. For example, 1 creates a thumbnail of the first page, and 1-3 creates thumbnails of the first three pages. The aliases "first" and "last" are also accepted. This option applies only to document inputs. |
allow_multiple_outputs | boolean | true, false | If the operation produces more than one output file, by default all of them are compressed into a single file. Set this option to true if you want a download link for each file. |
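The pattern for the pages option is easier to understand with a few concrete inputs. The Python sketch below applies the documented pattern with the standard `re` module; the function name is hypothetical, and the check only covers syntax, not whether the pages exist in a given document.

```python
import re

# The pattern is copied from the pages row of the table above.
PAGES_RE = re.compile(
    r"^(((\d+|first)-?(\d*|last))|(first)|(last))"
    r"(, ?(((\d+|first)-?(\d*|last))|(first)|(last)))*$"
)


def is_valid_pages(value):
    """Return True if value is syntactically valid for the pages option."""
    return PAGES_RE.fullmatch(value) is not None


for candidate in ["1", "1-3", "first-last", "2, 4-6, last", "1;3", "one"]:
    print(f"{candidate!r}: {is_valid_pages(candidate)}")
```

The first four candidates are accepted; "1;3" and "one" are rejected because only commas separate entries and only digits, "first", and "last" name pages.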
The validate-pdfa operation allows for the validation of a PDF file's compliance with the PDF/A standard.
This operation checks the input file against the specified PDF/A profile to ensure that it meets the requirements for long-term preservation and will be rendered consistently over time, even if the software and hardware used to view the file change.
Option Name | Type | Possible Values | Description |
---|---|---|---|
validation_profile | string | pdfa1a, pdfa1b, pdfa2a, pdfa2b, pdfa2u, pdfa3a, pdfa3b, pdfa3u | The input file is checked for PDF/A compliance to this profile |
allow_multiple_outputs | boolean | true, false | If the operation produces more than one output file, by default all of them are compressed into a single file. Set this option to true if you want a download link for each file. |
Profile | Description |
---|---|
pdfa1a | PDF/A-1 (ISO 19005-1, based on PDF 1.4), conformance level A (accessible). In addition to the level B requirements, the document must be tagged with logical structure information and its text must map to Unicode. |
pdfa1b | PDF/A-1, conformance level B (basic). Ensures reliable reproduction of the document's visual appearance, for example by requiring that all fonts be embedded. |
pdfa2a | PDF/A-2 (ISO 19005-2, based on PDF 1.7), level A. Compared with PDF/A-1, it additionally allows features such as transparency and JPEG 2000 image compression, and it requires tagging with logical structure information. |
pdfa2b | PDF/A-2, level B. Like pdfa2a, but only the visual appearance is guaranteed; logical structure information is not required. |
pdfa2u | PDF/A-2, level U. Like pdfa2b, with the added requirement that all text can be mapped to Unicode. |
pdfa3a | PDF/A-3 (ISO 19005-3), level A. Like pdfa2a, but additionally allows files of any format to be embedded as attachments. |
pdfa3b | PDF/A-3, level B. Like pdfa2b, but additionally allows files of any format to be embedded as attachments. |
pdfa3u | PDF/A-3, level U. Like pdfa2u, but additionally allows files of any format to be embedded as attachments. |
{
  "conversion": [{
    "category": "operation",
    "target": "validate-pdfa",
    "options": {
      "validation_profile": "pdfa1b"
    }
  }]
}
The output of this operation includes details on the sections of the PDF that do not conform to the specified profile as in the example code.
{
  "profile": {
    "name": "PDF/A-2B validation profile",
    "description": "Validation rules against ISO 19005-2:2011, Level B"
  },
  "errors": [
    {
      "error": null,
      "rule": "The Catalog dictionary of a conforming file shall contain the Metadata key whose value is a metadata stream as defined in ISO 32000-1:2008, 14.3.2.",
      "location": "root/document[0]"
    },
    {
      "error": null,
      "rule": "DeviceRGB shall only be used if a device independent DefaultRGB colour space has been set when the DeviceRGB colour space is used.",
      "location": "root/document[0]/pages[0](1 0 obj PDPage)/contentStream[0](2 0 obj PDContentStream)/operators[6]/colorSpace[0]"
    }
  ]
}
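A validation report like the one above is often reduced to a short summary for logs or a user interface. The following Python sketch shows one possible way; the function name is hypothetical, and the sample report is a trimmed copy of the example output with the rule texts shortened.

```python
def summarize_report(report):
    """Return the profile name, the error count, and one line per error."""
    errors = report.get("errors", [])
    return {
        "profile": report["profile"]["name"],
        "error_count": len(errors),
        "lines": [f"{item['location']}: {item['rule']}" for item in errors],
    }


# Trimmed copy of the example output (rule texts shortened for readability).
report = {
    "profile": {
        "name": "PDF/A-2B validation profile",
        "description": "Validation rules against ISO 19005-2:2011, Level B",
    },
    "errors": [
        {
            "error": None,
            "rule": "The Catalog dictionary ... shall contain the Metadata key ...",
            "location": "root/document[0]",
        },
        {
            "error": None,
            "rule": "DeviceRGB shall only be used if ...",
            "location": "root/document[0]/pages[0](1 0 obj PDPage)/contentStream[0](2 0 obj PDContentStream)/operators[6]/colorSpace[0]",
        },
    ],
}

summary = summarize_report(report)
print(f"{summary['profile']}: {summary['error_count']} error(s)")
for line in summary["lines"]:
    print(" -", line)
```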