FLAIR-GG Network and Virtual Platform

Frequently Asked Questions

Do I need to change my database or any part of my existing infrastructure?

NO! FLAIR-GG requires, at a minimum, a document containing a description of your germplasm bank in DCAT format. You can put this document on your existing website and ask FLAIR-GG to visit it. It will be indexed and you will become part of the Network.
If you wish to have more power, you can install a FAIR Data Point of your own (follow the onboarding instructions on the main page).
If you wish to share your germplasm data, you need only export CSV from your existing database. This does not require any changes to your database or infrastructure.
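
To give a feel for how small such a DCAT description can be, here is a minimal sketch that builds one as JSON-LD. The bank name, URL, and keywords are invented for illustration, and the choice of JSON-LD and of these particular properties is an assumption; the onboarding instructions remain the reference for what FLAIR-GG actually expects.

```python
import json


def build_dcat_record(title, description, publisher_name, landing_page, keywords):
    """Return a minimal DCAT dataset description as a JSON-LD dictionary."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
            "foaf": "http://xmlns.com/foaf/0.1/",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dct:publisher": {"@type": "foaf:Agent", "foaf:name": publisher_name},
        "dcat:landingPage": {"@id": landing_page},
        "dcat:keyword": list(keywords),
    }


# All values below are hypothetical placeholders.
record = build_dcat_record(
    title="Example Germplasm Bank collection",
    description="Seed accessions held by a hypothetical germplasm bank.",
    publisher_name="Example Germplasm Bank",
    landing_page="https://example.org/germplasm",
    keywords=["germplasm", "seed bank"],
)

# The resulting text could be saved as a file on an existing website.
print(json.dumps(record, indent=2))
```

A static file like this needs no database changes; it only has to be reachable at a URL.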

In the area of germplasm conservation and use, which external suppliers of information could we connect under the FAIR principles?

We can include any resource that is compliant with the DCAT standard. This includes a wide range of government-sourced datasets, including ecological, environmental, and geographical.

What's a shared data model?

FAIR is partially about metadata and partially about data. In FLAIR-GG we require an agreement on the metadata model (i.e., DCAT) but we do not dictate a data model. Nevertheless, to maximize interoperability between germplasm resources, we have created several core data models that can be reused by Network members. We also provide a transformation tool that will take CSV exported from your existing Germplasm database, and will fill these shared data models, thus requiring very little expertise to fully participate in this FAIR network. Sharing a data model among all members allows us to send identical queries to all sites, maximizing the federation of the network.
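
The idea can be illustrated with a small sketch: two sites export CSV with different column names, each export is mapped onto the same field names, and a single query then runs unchanged against both. The field names, column maps, and sample rows are hypothetical, and the sketch uses plain dictionaries; it is not the FLAIR-GG transformation tool or its actual data models.

```python
import csv
import io


def to_shared_model(csv_text, column_map):
    """Rename a site's own CSV columns to the shared field names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {shared: row[local] for shared, local in column_map.items()}
        for row in reader
    ]


def federated_query(sites, predicate):
    """Run the same predicate over every site and collect the matches."""
    return [
        (site_name, record["accession_id"])
        for site_name, records in sites.items()
        for record in records
        if predicate(record)
    ]


# Hypothetical CSV exports from two banks with different column names.
bank_a_csv = "AccNo,Species,Origin\nA-001,Triticum aestivum,ES\nA-002,Hordeum vulgare,ES\n"
bank_b_csv = "id,scientific_name,country_code\nB-104,Triticum aestivum,PT\n"

sites = {
    "bank_a": to_shared_model(
        bank_a_csv,
        {"accession_id": "AccNo", "taxon": "Species", "country": "Origin"},
    ),
    "bank_b": to_shared_model(
        bank_b_csv,
        {"accession_id": "id", "taxon": "scientific_name", "country": "country_code"},
    ),
}

# One query, written once against the shared field names.
wheat = federated_query(sites, lambda r: r["taxon"] == "Triticum aestivum")
print(wheat)
```

Without the shared field names, the query would have to be rewritten for each bank's column layout.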

Could you give me an example to see the differences between the terms "reusability" and "interoperability"?

FAIR is a set of data publishing principles that describe behaviours that will maximize the ability of both humans and machines to reuse data. Reusability and Interoperability are tightly aligned concepts. Interoperability is mostly associated with the ability to merge data from multiple sites without manual manipulation. Reusability primarily concerns a single site, where the data can be reused by a machine without human intervention. Interoperability is mostly about technology, while Reusability is primarily about adequate annotation of the data, and the availability of a license and citation information. For example, combining accession records from two germplasm banks in a single query depends on Interoperability, whereas a script deciding whether it may use one bank's dataset, and how to cite it, depends on Reusability.

Doesn't increased findability and accessibility also mean increased findability and accessibility for hackers?

Not in any meaningful way. For example, one would not accuse Google of aiding hackers, even though it allows your website to be discovered by a search.

How is interoperability possible between seed bank databases that use different taxonomies?

FLAIR-GG does not mandate the use of any given taxonomy. We have selected World Flora Online (WFO) for a variety of reasons. It is possible to participate in the network without agreeing to use WFO; however, your interoperability will be limited because the network tools do not automatically do taxonomy mapping.
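
As a rough illustration of what taxonomy mapping involves if you do it yourself, the sketch below translates local taxon names to shared identifiers with a lookup table and reports the names it cannot map. The identifiers are placeholders, not real WFO identifiers, and the lookup table is hypothetical; a real mapping would have to be built against the chosen taxonomy.

```python
# Hypothetical lookup table: local taxon names -> shared taxon identifier.
# The identifiers are placeholders, NOT real World Flora Online IDs.
LOCAL_TO_SHARED = {
    "Triticum aestivum": "placeholder:taxon-1",
    "Triticum vulgare": "placeholder:taxon-1",  # treated as a synonym for illustration
    "Hordeum vulgare": "placeholder:taxon-2",
}


def map_taxa(local_names, lookup):
    """Split local names into those with a shared identifier and those without."""
    mapped = {}
    unmapped = []
    for name in local_names:
        key = " ".join(name.split())  # collapse stray whitespace before lookup
        if key in lookup:
            mapped[name] = lookup[key]
        else:
            unmapped.append(name)
    return mapped, unmapped


mapped, unmapped = map_taxa(
    ["Triticum vulgare", "Hordeum  vulgare", "Unknown species"],
    LOCAL_TO_SHARED,
)
print(mapped)
print(unmapped)
```

Names that end up in the unmapped list are the ones that would not match records from other sites in a network-wide query.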

Does a germplasm bank need a database expert to update information on its metadata and data?

For most levels of FLAIR-GG network participation, no special training is required (beyond what is normally available within the Germplasm bank team). For more advanced participation, such as deploying analytical interfaces over your database, it would be useful to have someone with slightly more experience. We can provide advice and guidance, based on your ambitions.

Is the FLAIR-GG platform open to in-situ conservation information?

Yes. Any data source can participate simply by creating their DCAT-compliant record. We currently do not have shared data models for in-situ germplasm data, but we would be pleased to work with you to help us build them!

Is there a limit on the number of germplasm banks that can be integrated into the FLAIR-GG platform?

There is no limit (beyond the capacity of our Index server, which is very high!). The FLAIR-GG network is mostly distributed, so the data remains at-source. As a result, we will likely never reach the limits of our capacity. The FLAIR-GG Virtual Platform has been tested with up to 3700 datasets (more than the number of germplasm banks on earth) and had acceptable performance. If performance ever became an issue, there are ways we could better optimize the code.

Who is responsible for updating the metadata and data? How can I update the metadata and data?

The "generic" response to this is that it is the responsibility of the group that is hosting the metadata and/or data. Updating the metadata requires you either to modify a text file (if you are using a DCAT-File-based FAIR Data Point) or to go to your FAIR Data Point website, log in, and edit the metadata through the web page. Updating data requires you to do a new CSV export of the data you want to update, and then trigger the transformation pipeline. This can be done whenever you wish.
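
The CSV export step might look like the following minimal sketch, which assumes a database reachable through Python's sqlite3 module. The table name, columns, and rows are hypothetical. How the transformation pipeline is triggered afterwards is not shown, because it depends on your installation.

```python
import csv
import io
import sqlite3


def export_table_to_csv(connection, table, out):
    """Write every row of `table` to `out` as CSV, header first; return the row count."""
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    cursor = connection.execute(f'SELECT * FROM "{table}"')
    writer = csv.writer(out)
    writer.writerow([column[0] for column in cursor.description])
    rows = cursor.fetchall()
    writer.writerows(rows)
    return len(rows)


# Hypothetical in-memory database standing in for an existing germplasm database.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE accessions (accession_id TEXT, taxon TEXT)")
connection.executemany(
    "INSERT INTO accessions VALUES (?, ?)",
    [("A-001", "Triticum aestivum"), ("A-002", "Hordeum vulgare")],
)

buffer = io.StringIO()
exported = export_table_to_csv(connection, "accessions", buffer)
print(buffer.getvalue())

# Next step (not shown): hand the CSV to the transformation pipeline.
```

The export only reads from the database, so it can be repeated as often as the data changes.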

If my bank is not following the FAIR principles, could it still benefit from the FAIR accessibility and reusability of data from others (FAIR germplasm banks, FAIR suppliers of georeferenced information, FAIR public administrative institutions, etc.)?

FAIR is entirely about making data reusable for both humans and machines. For this reason, any data that is FAIR can more easily be accessed and reused by anyone, regardless of whether their own data is FAIR or not! However, the full benefit of FAIR data is only achieved when it is easy to integrate your own data with data from others. So while you will find it easier to retrieve this third-party FAIR data (e.g. via the FLAIR-GG Virtual Platform), you will still struggle to integrate it with your own data to generate new knowledge.