Connector Metadata.yaml File
The metadata.yaml
file contains crucial information about the connector, including its type, definition ID, Docker image tag, Docker repository, and much more. It plays a key role in the way Airbyte handles connector data and improves the overall organization and accessibility of this data.
N.B. the metadata.yaml
file replaces the previous source_definitions.yaml
and destinations_definitions.yaml
files.
Structure
Below is an example of a metadata.yaml
file for the Postgres source:
data:
  allowedHosts:
    hosts:
      - ${host}
      - ${tunnel_method.tunnel_host}
  connectorSubtype: database
  connectorType: source
  definitionId: decd338e-5647-4c0b-adf4-da0e75f5a750
  dockerImageTag: 2.0.28
  maxSecondsBetweenMessages: 7200
  dockerRepository: airbyte/source-postgres
  githubIssueLabel: source-postgres
  icon: postgresql.svg
  license: MIT
  name: Postgres
  tags:
    - language:java
    - language:python
  registries:
    cloud:
      dockerRepository: airbyte/source-postgres-strict-encrypt
      enabled: true
    oss:
      enabled: true
  supportLevel: certified
  documentationUrl: https://docs.airbyte.com/integrations/sources/postgres
metadataSpecVersion: "1.0"
The registries
Section
The registries
section within the metadata.yaml
file plays a vital role in determining the contents of the oss_registry.json
and cloud_registry.json
files.
This section contains two subsections: cloud
and oss
(Open Source Software). Each subsection contains details about the specific registry, such as the Docker repository associated with it and whether it's enabled or not.
Structure
Here's how the registries
section is structured in our previous metadata.yaml
example:
registries:
  cloud:
    dockerRepository: airbyte/source-postgres-strict-encrypt
    enabled: true
  oss:
    enabled: true
In this example, both the cloud and oss registries are enabled, and the Docker repository for the cloud registry is overridden to airbyte/source-postgres-strict-encrypt.
Updating Registries
When the metadata.yaml
file is updated, this data is automatically uploaded to Airbyte's metadata service. This service then generates the publicly available oss_registry.json
and cloud_registry.json
registries based on the information provided in the registries
section.
For instance, if a connector is listed as enabled: true
under the oss
section, it will be included in the oss_registry.json
file. Similarly, if it's listed as enabled: true
under the cloud
section, it will be included in the cloud_registry.json
file.
Thus, the registries
section in the metadata.yaml
file provides a flexible and organized way to manage which connectors are included in each registry.
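The inclusion rule described above can be sketched in a few lines of Python. This is only an illustration, not the metadata service's actual code: the function names `registries_for` and `docker_repository_for` are hypothetical, the metadata is written as a plain dict rather than parsed from YAML, and treating a missing `enabled` key as disabled is an assumption.

```python
def registries_for(metadata: dict) -> list[str]:
    """Return the registry files a connector would be listed in."""
    registries = metadata.get("data", {}).get("registries", {})
    files = []
    for name in ("oss", "cloud"):
        # Assumption: a registry with no `enabled` key counts as disabled.
        if registries.get(name, {}).get("enabled", False):
            files.append(f"{name}_registry.json")
    return files


def docker_repository_for(metadata: dict, registry: str) -> str:
    """Return the Docker repository for a registry, honoring an override."""
    data = metadata["data"]
    override = data.get("registries", {}).get(registry, {}).get("dockerRepository")
    return override or data["dockerRepository"]


# The Postgres example from this page, reduced to the relevant keys.
postgres = {
    "data": {
        "dockerRepository": "airbyte/source-postgres",
        "registries": {
            "cloud": {
                "dockerRepository": "airbyte/source-postgres-strict-encrypt",
                "enabled": True,
            },
            "oss": {"enabled": True},
        },
    }
}

print(registries_for(postgres))
print(docker_repository_for(postgres, "cloud"))
print(docker_repository_for(postgres, "oss"))
```

With the Postgres example, the connector lands in both registry files, and only the cloud entry uses the overridden repository.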
The tags
Section
The tags
field is an optional part of the metadata.yaml
file. It is designed to provide additional context about a connector and improve the connector's discoverability. This field can contain any string, which makes it a flexible way to record extra details about a connector.
In the metadata.yaml
file, tags
is a list that may contain any number of string elements. Each element in the list is a separate tag. For instance:
tags:
  - "language:java"
  - "keyword:database"
  - "keyword:SQL"
In the example above, the connector has three tags. Tags are used for two primary purposes in Airbyte:
- Denoting the Programming Language(s): Tags that begin with language: specify the programming languages used by the connector. This information is auto-generated by a script that scans the connector's files for recognized programming languages. In the example above, language:java means that the connector uses Java.
- Keywords for Searching: Tags that begin with keyword: make the connector more discoverable by adding searchable terms related to it. In the example above, the tags keyword:database and keyword:SQL can be used to find this connector when searching for database or SQL.
These are just examples of how tags can be used. As a free-form field, the tags list can be customized as required for each connector. This flexibility allows tags to be a powerful tool for managing and discovering connectors.
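As a rough illustration of how the `prefix:value` convention could be consumed, the sketch below groups a tags list by prefix. The helper name `group_tags` is hypothetical, and placing tags without a colon under an empty-string key is an assumption made for this example; Airbyte's own tooling may treat them differently.

```python
from collections import defaultdict


def group_tags(tags: list[str]) -> dict[str, list[str]]:
    """Group tags such as 'language:java' by the text before the first colon."""
    grouped = defaultdict(list)
    for tag in tags:
        prefix, separator, value = tag.partition(":")
        if separator:
            grouped[prefix].append(value)
        else:
            # Assumption: free-form tags without a prefix go under "".
            grouped[""].append(tag)
    return dict(grouped)


tags = ["language:java", "keyword:database", "keyword:SQL"]
grouped = group_tags(tags)
print(grouped.get("language", []))
print(grouped.get("keyword", []))
```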
The icon
Field
⚠️ This property is in the process of being refactored to be a file in the connector folder
You may notice an icon.svg
file in the connectors folder.
This is because we are transitioning away from icons being stored in the airbyte-platform
repository. Instead, we will be storing them in the connector folder itself. This will allow us to have a single source of truth for all connector-related information.
This transition is currently in progress. Once it is complete, the icon
field in the metadata.yaml
file will be removed, and the icon.svg
file will be used instead.
The releases
Section
The releases
section contains extra information about certain types of releases. The current types of releases are:
breakingChanges
breakingChanges
The breakingChanges
section of releases
contains a dictionary of version numbers (usually major versions, e.g. 1.0.0) and information about
their associated breaking changes. Each entry must contain the following parameters:
- message: A description of the breaking change, written in a user-friendly format. This message should briefly describe:
  - What the breaking change is, and which users it affects (e.g. all users of the source, or only those using a certain stream)
  - Why the change is better for the user (fixed a bug, something got faster, etc.)
  - What the user should do to fix the issue (e.g. a full reset, run a SQL query in the destination, etc.)
  Without all three of these points, the breaking change message is not helpful to users.
- upgradeDeadline: (YYYY-MM-DD) The date by which the user should upgrade to the new version.
When considering what the upgradeDeadline should be, target the amount of time that would be reasonable for the user to make the required changes described in the message and upgrade guide. If the required changes are simple (e.g. "do a full reset"), 2 weeks is recommended. Note that you do not want to link the duration of upgradeDeadline to an upstream API's deprecation date. While it is true that the older version of a connector will continue to work for that period of time, it means that users who are pinned to the older version of the connector will not benefit from future updates and fixes.
Here is an example:
releases:
  breakingChanges:
    1.0.0:
      message: "This version changes the connector’s authentication by removing ApiKey authentication, which is now deprecated by the [upstream source](upsteam-docs-url.com). Users currently using ApiKey auth will need to reauthenticate with OAuth after upgrading to continue syncing."
      upgradeDeadline: "2023-12-31"
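To make the two required parameters concrete, here is a minimal sketch of a check for a single breakingChanges entry. It is not Airbyte's validator: the function name `validate_breaking_change` and the wording of the problem strings are hypothetical, and the entry is written as a plain dict as it might look after the YAML is loaded.

```python
import re
from datetime import date


def validate_breaking_change(version: str, entry: dict) -> list[str]:
    """Return a list of problems found in one breakingChanges entry."""
    problems = []

    message = entry.get("message")
    if not isinstance(message, str) or not message.strip():
        problems.append(f"{version}: message is missing or empty")

    # str() also covers loaders that turn an unquoted date into a date object.
    deadline = str(entry.get("upgradeDeadline"))
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", deadline):
        problems.append(f"{version}: upgradeDeadline must be YYYY-MM-DD")
    else:
        try:
            date.fromisoformat(deadline)  # rejects dates such as 2023-02-30
        except ValueError:
            problems.append(f"{version}: upgradeDeadline is not a real date")
    return problems


entry = {
    "message": "ApiKey authentication was removed. Reauthenticate with OAuth.",
    "upgradeDeadline": "2023-12-31",
}
print(validate_breaking_change("1.0.0", entry))
```

The check covers only the structure of an entry. Whether the message actually explains what changed, why, and what to do still needs a human reviewer.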
scopedImpact
The optional scopedImpact
property allows you to provide a list of scopes for which the change is breaking.
This allows you to reduce the scope of the change; it's assumed that any scopes not listed are unaffected by the breaking change.
For example, consider the following scopedImpact
definition:
releases:
  breakingChanges:
    1.0.0:
      message: "This version changes the cursor for the `Users` stream. After upgrading, please reset the stream."
      upgradeDeadline: "2023-12-31"
      scopedImpact:
        - scopeType: stream
          impactedScopes: ["users"]
This change only breaks the users
stream - all other streams are unaffected. A user can safely ignore the breaking change
if they are not syncing the users
stream.
The supported scope types are listed below.
Scope Type | Value Type | Value Description |
---|---|---|
stream | list[str] | List of stream names |
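The scoping rule can be sketched as a small predicate that decides whether a given connection is affected. This illustrates the idea and is not the platform's implementation: the name `is_affected` is hypothetical, treating an entry without scopedImpact as affecting everyone is an assumption drawn from the property being optional, and treating an unrecognized scope type as affecting is a conservative choice made for this example.

```python
def is_affected(breaking_change: dict, synced_streams: set[str]) -> bool:
    """Decide whether a connection syncing `synced_streams` is affected."""
    scopes = breaking_change.get("scopedImpact")
    if not scopes:
        # Assumption: with no scopedImpact, the change affects all users.
        return True
    for scope in scopes:
        if scope.get("scopeType") == "stream":
            if synced_streams & set(scope.get("impactedScopes", [])):
                return True
        else:
            # Assumption: be conservative about scope types we do not know.
            return True
    return False


change = {
    "message": "This version changes the cursor for the `Users` stream.",
    "upgradeDeadline": "2023-12-31",
    "scopedImpact": [{"scopeType": "stream", "impactedScopes": ["users"]}],
}
print(is_affected(change, {"users", "orders"}))
print(is_affected(change, {"orders"}))
```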
remoteRegistries
The optional remoteRegistries
property allows you to configure how a connector should be published to registries like PyPI.
Important note: Currently no automated publishing will occur.
remoteRegistries:
  pypi:
    enabled: true
    packageName: airbyte-source-connector-name
The packageName
property of the pypi
section is the name of the installable package in the PyPI registry.
If not specified, all remote registry configurations are disabled by default.
The connectorTestSuitesOptions
section
The connectorTestSuitesOptions
contains a list of test suite options for a connector.
The list of declared test suites affects which suite will run in CI.
We currently accept three values for the suite
field:
unitTests
integrationTests
acceptanceTests
Each list entry can also declare a testSecrets
object, which enables our CI to fetch the connector-specific secret credentials required to run the suite.
The testSecrets
object
The testSecrets
object has three properties:
- name (required string): The name of the secret in the secret store.
- secretStore (required secretStore object): Where the secret is stored (more details on the object structure below).
- fileName (optional string): The name of the file in which our CI will persist the secret (inside the connector's secrets directory).
If you are a community contributor please note that addition of a new secret to our secret store requires manual intervention from an Airbyter. Please reach out to your PR reviewers if you want to add a test secret to our CI.
The secretStore
object
This object has two properties:
- type: Defines the secret store type. Only GSM (Google Secret Manager) is currently supported.
- alias: The alias of this secret store in our system, which is resolved into an actual secret store address by our CI. We currently have a single alias to store our connector test secrets: airbyte-connector-testing-secret-store.
How to enable a test suite
We currently support three test suite types: unitTests, integrationTests, and acceptanceTests.
To enable a test suite, add the suite name to the connectorTestSuitesOptions
list:
connectorTestSuitesOptions:
  - suite: unitTests
  # This will enable acceptanceTests for this connector
  # It declares that this test suite requires a secret named SECRET_SOURCE-FAKER_CREDS
  # In our secret store, and that the secret should be stored in the connector secret folder in a file named config.json
  - suite: acceptanceTests
    testSecrets:
      - name: SECRET_SOURCE-FAKER_CREDS
        fileName: config.json
        secretStore:
          type: GSM
          alias: airbyte-connector-testing-secret-store
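The rules for this section (three accepted suite names, required name and secretStore on each secret, GSM as the only store type) can be summarized in a short check. This is a sketch under those stated rules, not the validation that airbyte-ci performs: `check_suite_options` and its problem strings are hypothetical, and the options are given as plain Python data rather than loaded from metadata.yaml.

```python
ALLOWED_SUITES = {"unitTests", "integrationTests", "acceptanceTests"}
SUPPORTED_STORE_TYPES = {"GSM"}


def check_suite_options(options: list[dict]) -> list[str]:
    """Return a list of problems found in connectorTestSuitesOptions."""
    problems = []
    for index, option in enumerate(options):
        suite = option.get("suite")
        if suite not in ALLOWED_SUITES:
            problems.append(f"entry {index}: unknown suite {suite!r}")
        # testSecrets is optional, so a missing key means no secrets.
        for secret in option.get("testSecrets", []):
            name = secret.get("name")
            if not name:
                problems.append(f"entry {index}: a test secret has no name")
            store = secret.get("secretStore")
            if not isinstance(store, dict):
                problems.append(f"entry {index}: secret {name!r} has no secretStore")
            elif store.get("type") not in SUPPORTED_STORE_TYPES:
                problems.append(f"entry {index}: secret {name!r} uses an unsupported store type")
    return problems


options = [
    {"suite": "unitTests"},
    {
        "suite": "acceptanceTests",
        "testSecrets": [
            {
                "name": "SECRET_SOURCE-FAKER_CREDS",
                "fileName": "config.json",
                "secretStore": {
                    "type": "GSM",
                    "alias": "airbyte-connector-testing-secret-store",
                },
            }
        ],
    },
]
print(check_suite_options(options))
```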
Default paths and conventions
The airbyte-ci
tool will automatically locate specific test types based on established conventions and will automatically store secret files (when needed) in the established secrets directory - which should be already excluded from accidental git commits.
Python connectors
Tests are discovered by Pytest and are expected to be located in:
- unit_tests directory for the unitTests suite
- integration_tests directory for the integrationTests suite
Java connectors
No specific directory is determined. Which test will run is determined by the Gradle configuration of the connector.
airbyte-ci
runs the test
Gradle task for the unitTests
suite and the integrationTest
Gradle task for the integrationTests
suite.
Acceptance tests
They are language agnostic and are configured via the acceptance-test-config.yml
file in the connector's root directory. More on that here.
Default secret paths
The listed secrets in testSecrets
with a file name will be mounted to the connector's secrets
directory. The fileName
should be relative to this directory.
For example, fileName: config.json will be mounted to <connector-directory>/secrets/config.json.
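The path convention can be expressed with `pathlib`. The sketch below only computes where each secret file would land. It does not fetch or write anything, the helper name `secret_mount_paths` is hypothetical, and the connector directory used in the example is a placeholder.

```python
from pathlib import Path


def secret_mount_paths(connector_dir: str, test_secrets: list[dict]) -> list[Path]:
    """Return the expected file path for every secret that declares a fileName."""
    secrets_dir = Path(connector_dir) / "secrets"
    paths = []
    for secret in test_secrets:
        file_name = secret.get("fileName")
        if file_name is None:
            # fileName is optional; only secrets with a file name are mounted.
            continue
        paths.append(secrets_dir / file_name)
    return paths


test_secrets = [
    {"name": "SECRET_SOURCE-FAKER_CREDS", "fileName": "config.json"},
    {"name": "SECRET_WITHOUT_FILE"},
]
# "connectors/source-faker" is a placeholder for <connector-directory>.
paths = secret_mount_paths("connectors/source-faker", test_secrets)
print([path.as_posix() for path in paths])
```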