Release notes

Table of Contents

v5.3.7 Release notes
v5.3.6 Release notes
v5.2.20 Release notes
v5.2.19 Release notes
v5.2.18 Release notes
What’s Changed
v5.2.16 Release notes
v5.2.15 Release notes
v5.2.14 Release notes
- Improvements
- Note
v5.2.13 Release notes
v5.2.12 Release notes
v5.2.11 Release notes
v5.2.10 Release notes
v5.2.9 Release notes
v5.2.8 Release notes
v5.2.7 Release notes
v5.2.6 Release notes
v5.2.5 Release notes
- Bug fixes
v5.2.4 Release notes
- New Features
v5.2.3 Release notes
- Improvements
- New Features
- Bug fixes
- New endpoints
v5.2.2 Release notes
- Internal improvements
v5.2.1 Release notes (v5.1.15 and v5.2.0 included)
- Internal improvements
v5.1.14 Release notes
- Bug Fixes
v5.1.13 Release notes
- Internal improvements
- Note
v5.1.12 Release notes
- New Features
- Bug Fixes
- New V2 endpoints
v5.1.11 Release notes
- New Features
- Internal improvements
- Bug Fixes
v5.1.10 Release notes
- New Features
- Internal improvements
- Bug Fixes
v5.1.6 Release notes (v5.1.5 included)
- New Features
- Internal improvements
v5.1.4 Release notes
- Bug Fixes
v5.1.3 Release notes
- Bug Fixes
v5.1.2 Release notes
- Internal improvements
v5.1.1 Release notes (v5.1.0 included)
- New Features
- Bug Fixes
v5.0.7 Release notes
- Bug Fixes
v5.0.6 Release notes
- New features
- Bug Fixes
v5.0.5 Release notes
- New features
- Bug Fixes
v5.0.4 Release notes
- New Features
v5.0.3 Release notes
- New Features
- Notifications
v5.0.2 Release notes
- New Features
v5.0.1 Release notes
- New Features
- Bug Fixes
v5.0.0 Release notes
- New features
v4.2.7 Release notes
- New features
- Bug fixes:
- Notifications
v4.2.6 Release notes
- New features
- Bug fixes:
- Notifications
v4.2.5 Release notes
- New features
- Notifications
- Bug Fixes
v4.2.3 Release notes
- New features
- Bug Fixes
v4.2.2 Release notes
- New features
- Bug Fixes
- Platform upgrades
v4.2.1 Release Notes
- New features:
- Bug fixes:
4.2.0
4.1.15
4.1.14
4.1.13
4.1.12
4.1.11
4.1.10
4.1.9
4.1.8
4.1.7
4.1.6
4.1.5
4.1.4
4.1.3
4.1.2
4.1.1
4.1.0
- New features
- Bug fixes
4.0.7
4.0.6
4.0.5
4.0.4
4.0.3
4.0.2
4.0.1
4.0.0
- Data content
- JSON /biosamples
- XML /biosamples/xml
- JSON /biosamples/api
- Acknowledgements

This page contains links to release notes for BioSamples for version 4.0.0 and higher. Version 4.0.0 represents a comprehensive overhaul and therefore previous release notes are no longer applicable.

v5.3.7 Release notes

MICROBE logo on the front page
NCBI and ENA sample mirroring fixes
Option added in BioSamples to perform JSON schema validation of all WEBIN samples

v5.3.6 Release notes

Fixes in documentation template
Fixes in handling NCBI sample mirroring in BioSamples

v5.2.20 Release notes

ERS accessioning in BioSamples
Java 17 and Spring Boot 2.5 upgrade

v5.2.19 Release notes

BioSamples client multi-threading fix

v5.2.18 Release notes

What’s Changed

add public filter for INSDC status != suppressed by @theisuru in #630
Bsd 2292 taxon importer codon by @theisuru in #629
fix release date/ sample status bug in accessioning of V1 and V2, sam… by @dipayan1985 in #631
Adding sample post release action to CI/CD by @dipayan1985 in #632
Fix Gitlab CI file for post release action pipeline by @dipayan1985 in #633
Fix CI/CD for sample post release action by @dipayan1985 in #634
Too large artifact error in CICD by @dipayan1985 in #635
Configure micrometer stackdriver by @dipayan1985 in #620
Stackdriver monitoring by @dipayan1985 in #636
BSD release 5.2.17 by @dipayan1985 in #637

Full Changelog: https://github.com/EBIBioSamples/biosamples-v4/compare/v5.2.16...v5.2.17

v5.2.16 Release notes

v5.2.15 Release notes

v5.2.14 Release notes

Improvements

Elixir biovalidator upgrade

Elixir biovalidator upgraded to the latest version. This version includes improved performance and error handling.
EVA logo in external links

Now EVA links will be identified and shown in the external links section with the logo.

Bug fixes

At sample submission time, if an AAP domain is not provided, the first AAP domain of the user is used by default.

Note

Update holiday message - Please note that the BioSamples team will be out of the office from December 19th 2022 to January 2nd 2023. Replies to Helpdesk requests will be delayed during this period.

v5.2.13 Release notes

Internal improvements and critical bug fixes

v5.2.12 Release notes

Internal improvements only

v5.2.11 Release notes

Internal improvements only

v5.2.10 Release notes

Internal improvements only

v5.2.9 Release notes

Internal improvements only

v5.2.8 Release notes

Internal improvements only

v5.2.7 Release notes

v5.2.6 Release notes

Internal improvements only

v5.2.5 Release notes

Bug fixes

Accession duplication fix

v5.2.4 Release notes

New Features

Bulk fetch by accessions added

v5.2.3 Release notes

Improvements

File Upload Submissions

Case insensitive column names are now accepted for file uploader submissions

Improvement in error reporting for failed submissions. Errors related to authentication, access and file format problems are clearly reported back to the submitter
Performance improvement in accessioning

Over the past few months, several request timeouts were observed when ENA attempted to create ~1000 sample accessions from BioSamples in a single API call. Several bottlenecks in the BioSamples accession generation process were identified and eliminated, and the process was replaced with a much simpler one, resulting in improved accessioning and submission performance. An example from May 4, 2022, taken from ENA logs, shows that BioSamples is now able to generate ~10000 accessions in a single API call:

“Requested 9985 accessions from BioSamples and registering 9985 new BioSample accessions took 81229 milliseconds”
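As a quick sanity check on the figures in the log line above, the implied throughput can be worked out as follows. This is only illustrative arithmetic on the two quoted numbers; the variable names are not taken from BioSamples or ENA.

```python
# Figures taken from the ENA log line quoted above.
accessions = 9985
elapsed_ms = 81229

elapsed_s = elapsed_ms / 1000
per_second = accessions / elapsed_s         # accessions generated per second
ms_per_accession = elapsed_ms / accessions  # average time per accession

print(f"{per_second:.1f} accessions/s, {ms_per_accession:.2f} ms per accession")
```

That is roughly 123 accessions per second, or about 8 ms per accession, for that single call.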

New Features

Reference to private BioSamples while doing an ENA (WEBIN) submission

It is now possible to create private samples in BioSamples and refer to their BioSamples accessions in an ENA submission. Such private samples are automatically made public in BioSamples when the runs/analyses that refer to them are made public in ENA.
Generic structured data model

Previously, BioSamples structured data submission was restricted to a few structured data types, such as AMR and histological data. With this release it is possible to submit any type of structured data to BioSamples.
- Structured data is additional information specific to a sample, for example antibiogram data, which is an overall profile of antimicrobial susceptibility testing results.
Improved NCBI and ENA sample imports

BioSamples import pipelines import newly created or updated samples from ENA and NCBI daily to remain consistent with other INSDC databases. Publication information for samples was not imported by BioSamples until now; it is imported from this release onwards. The ENA Browser is in its final phase of testing before it starts indexing samples from BioSamples, and the ENA Browser team requested this feature so that a single database can be queried for all information related to a sample.
Submissions with WEBIN authentication

Submitters are no longer required to pass the parameter authProvider=WEBIN for submissions done with WEBIN authentication.

Bug fixes

Filtered search for both samples and accessions returned inconsistent results when the results contained a mix of private and public samples; this has now been fixed
Solr out-of-memory issues have been resolved, which ensures consistency in searching and filtering

New endpoints

Structured data PUT and GET endpoints

1.1 PUT to add structured data to an already submitted sample: PUT structureddata/<accession>

1.2 GET to fetch the structured data of a sample: GET structureddata/<accession>

Documentation and example available here
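The two endpoints above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the official client: the base URL is borrowed from the curl examples elsewhere in these notes, the bearer-token header and the placeholder payload are assumptions, and `structured_data_request` is a hypothetical helper name. The requests are only built, not sent.

```python
import json
import urllib.request

# Assumption: the production base URL used by the curl examples in these notes.
BASE_URL = "https://www.ebi.ac.uk/biosamples"


def structured_data_request(accession, token, payload=None):
    """Build (but do not send) a request for structureddata/<accession>.

    With a payload the request is a PUT that adds structured data;
    without one it is a GET that fetches the existing structured data.
    """
    url = f"{BASE_URL}/structureddata/{accession}"
    headers = {
        "Accept": "application/json",
        "Authorization": f"Bearer {token}",  # assumption: JWT bearer auth
    }
    if payload is None:
        return urllib.request.Request(url, headers=headers, method="GET")
    headers["Content-Type"] = "application/json;charset=UTF-8"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# SAMEA0000000 and the payload are placeholders, not a real sample or schema.
get_req = structured_data_request("SAMEA0000000", token="<token>")
put_req = structured_data_request("SAMEA0000000", "<token>", {"data": []})
print(get_req.get_method(), get_req.full_url)
print(put_req.get_method(), put_req.full_url)
```

Passing either request to `urllib.request.urlopen` would send it; the actual payload format is described in the documentation referred to above.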

v5.2.2 Release notes

Internal improvements

File upload submissions: We previously required unique sample names per submitter for file uploader submissions. This restriction has now been removed, as we have received requests to allow multiple samples with the same sample name to be submitted by the same submitter or community for re-sequencing projects. The sample metadata should still be different.

v5.2.1 Release notes (v5.1.15 and v5.2.0 included)

Internal improvements

Improved error handling for the file uploader submissions with more user friendly error messages
Allowing case insensitive column names for the file uploader submission files
Improvement of handling structured data in BioSamples
Performance improvements in accessioning
Improvements in the ENA import pipeline
New pipeline added to handle sample release in BioSamples when ENA data (runs/analyses) referring to such samples are released

v5.1.14 Release notes

Bug Fixes

Fix bug with search indexing

v5.1.13 Release notes

Internal improvements

Update the release process and move away from SPOT infrastructure

Note

Update holiday message - Please note that the BioSamples team will be out of the office from December 20th 2021 to January 3rd 2022. Replies to Helpdesk requests will be delayed during this period.

v5.1.12 Release notes

New Features

Private sample search using Webin Authentication

Private samples submitted using Webin Authentication can now be searched using the GET API, including:

1.1. Single private sample search using accession

1.2. Filtered search result containing only private samples

1.3. Filtered search results containing a mix of private and public samples

Example API calls,

For 1.1 - curl 'https://www.ebi.ac.uk/biosamples/samples/<accession>?authProvider=WEBIN' -i -X GET -H "Content-Type: application/json;charset=UTF-8" -H "Accept: application/hal+json" -H "Authorization: Bearer $TOKEN"

For 1.2 and 1.3 - curl 'https://www.ebi.ac.uk/biosamples/samples?filter=attr:<attribute_name>:<attribute_value>&authProvider=WEBIN' -i -X GET -H "Content-Type: application/json;charset=UTF-8" -H "Accept: application/hal+json" -H "Authorization: Bearer $TOKEN"

To get the Webin Authentication token,

TOKEN=$(curl --location --request POST 'https://www.ebi.ac.uk/ena/submit/webin/auth/token' --header 'Content-Type: application/json' --data-raw '{ "authRealms": [ "ENA" ], "password": "<password_here>", "username": "<username_here>" }')
Additional field support for the drag’n’drop uploader

Publications, contacts and organizations can now be added to sample metadata for submission using the drag’n’drop uploader. For more details please refer to https://www.ebi.ac.uk/biosamples/docs/cookbook/upload_files.html
Generic structured data submission model

We have refactored the structured data API to accept generic data structures. This removes the need to update the code whenever a new data type is requested. Additionally, the structured data section of the sample now has its own owner, allowing it to fully support cases where structured data is added separately from the original sample metadata.

More details about the API can be found in our documentation here https://www.ebi.ac.uk/biosamples/docs/references/api/submit#_submit_structured_data_to_sample.
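The Webin-authenticated private sample search introduced in this release (token request, then filtered search) can also be sketched in Python. This is a minimal illustration that only builds the requests with the standard library and does not send them; the URLs, headers and body fields are copied from the curl examples in these notes, while the function names and the example attribute are hypothetical.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://www.ebi.ac.uk/ena/submit/webin/auth/token"
SAMPLES_URL = "https://www.ebi.ac.uk/biosamples/samples"


def token_request(username, password):
    """Build the POST request that asks the Webin service for a token."""
    body = json.dumps(
        {"authRealms": ["ENA"], "password": password, "username": username}
    ).encode("utf-8")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def private_search_request(token, attribute_name, attribute_value):
    """Build the filtered search GET request (cases 1.2 and 1.3)."""
    query = urllib.parse.urlencode(
        {
            "filter": f"attr:{attribute_name}:{attribute_value}",
            "authProvider": "WEBIN",
        }
    )
    return urllib.request.Request(
        f"{SAMPLES_URL}?{query}",
        headers={
            "Content-Type": "application/json;charset=UTF-8",
            "Accept": "application/hal+json",
            "Authorization": f"Bearer {token}",
        },
        method="GET",
    )


# "organism" / "Homo sapiens" is only an example filter.
req = private_search_request("<token>", "organism", "Homo sapiens")
print(req.full_url)
```

Either request can be sent with `urllib.request.urlopen`. As in the curl example, the response body of the token call is treated as the token itself.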

Bug Fixes

BioSamples API documentation has been fixed to include the HTTP request and HTTP response snippets

New V2 endpoints

New submission and accession endpoints will be deployed with this release to increase availability, in particular for bulk accessioning. Those will be first integrated with ENA for accessioning and monitored for performance. Metrics will be documented and made available for users. Expected date of general availability is December 10, 2021, if performance results are as expected. Target availability is 99.5%.

v5.1.11 Release notes

New Features

Internal improvements

Bug Fixes

Fix of private sample GET using WEBIN authentication. It was not possible to GET private samples using WEBIN authentication. This release has the fix for the issue.

v5.1.10 Release notes

New Features

Internal improvements

Bug Fixes

Fixed the missing curationdomain query parameter in the HAL section of the sample page API. Because of this omission, biosamples-client returned samples with curations applied when the sample page service was used with the no-curations flag, and consequently some samples might have been overwritten with curations applied.

v5.1.6 Release notes (v5.1.5 included)

New Features

File uploader improvements

The BioSamples file uploader has been changed to handle larger file uploads. Any file upload with more than 200 samples is queued, and the submitter is provided with a submission ID. Submitters can enter the submission ID in the View submissions tab to check the status of their uploads. If the submission ID is valid, the submitter receives a result JSON file containing the submission status and the sample accessions mapped against sample names.

Submissions can have one of three statuses: ACTIVE, COMPLETED or FAILED.

ACTIVE status: The submission is waiting to be processed or is being processed.

COMPLETED status: The submission has completed. For a submission in COMPLETED status, either the samples have been created and accessions generated, or the samples have failed validation against the minimal validation rules of the BioSamples database or against the checklist specified by the submitter during the file upload.

FAILED status: The submission has failed. A submission might have FAILED status if the uploaded file was invalid and BioSamples was not able to parse it, or if a technical issue in the BioSamples database prevented the submission from being processed.

JSON schema-store integration and BioSamples checklist (BSDC) ID space

BioSamples now has a checklist ID space starting from BSDC00001. This is to clearly distinguish between ENA checklists and BioSamples checklists. We have also imported ENA checklists into the BioSamples schema-store, preserving ENA checklist IDs. The ‘checklist’ attribute in a sample triggers validation at submission time: the checklist is retrieved from the schema-store and the sample is validated against it using the Elixir biovalidator.
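The three file uploader submission statuses described above lend themselves to a small client-side model. The sketch below is hypothetical (the class and helper names are not part of BioSamples) and only encodes what the status descriptions say: ACTIVE means keep waiting, and COMPLETED does not by itself guarantee that accessions were generated.

```python
from enum import Enum


class SubmissionStatus(Enum):
    ACTIVE = "ACTIVE"        # waiting to be processed or being processed
    COMPLETED = "COMPLETED"  # processed; samples accessioned or failed validation
    FAILED = "FAILED"        # file could not be parsed, or a technical problem


def should_keep_polling(status):
    """Return True while the submission has not reached a final state."""
    return SubmissionStatus(status) is SubmissionStatus.ACTIVE


def needs_per_sample_check(status):
    """COMPLETED does not guarantee accessions: validation may have failed."""
    return SubmissionStatus(status) is SubmissionStatus.COMPLETED


print(should_keep_polling("ACTIVE"), needs_per_sample_check("COMPLETED"))
```

A client would check the status string from the result JSON file with these helpers; the layout of that file is not described in these notes, so it is not modelled here.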

Internal improvements

Submission APIs have gone through some performance improvements for faster responses, and we hope this will result in a better submission experience
Internal pipelines have gone through some fault resilience tests and have been improved accordingly

v5.1.4 Release notes

Bug Fixes

ENA import pipeline fix for BioSamples authority samples

This bugfix release ensures that BioSamples authority samples, i.e. samples submitted to BioSamples and referred to in an ENA submission, are not re-updated with the Webin submission account ID when the SRA accession is attached to the sample. Updating the sample with the SRA accession is a requirement of the ENA browser.

v5.1.3 Release notes

Bug Fixes

Elixir biovalidate response format mismatch

Because different versions of the Elixir validator were in use, there were some output format errors. From now on BioSamples will only use the Elixir biovalidator.

v5.1.2 Release notes

Internal improvements

Several performance related internal improvements

v5.1.1 Release notes (v5.1.0 included)

New Features

JSON Schema store integration with BioSamples

We have integrated the JSON Schema store with BioSamples. JSON Schema store is an application for storing and managing JSON Schemas. All BioSamples’ checklists will be stored and managed in the JSON Schema store. In the future we plan to expose the API with authentication.
BioSamples File uploader

We have introduced a new drag and drop style file uploader for bulk uploading of samples. This is mostly intended for our non-programmatic submitters who want to fill in their sample metadata in a file for uploading and persisting samples in BioSamples. The drag and drop uploader in BioSamples supports both Webin and AAP authentication. More details on the uploader can be found in a newly added uploader guide. The guide has the required details about the file format, mandatory fields and other pre-conditions. [add link]
ENA taxonomy service integration with BioSamples

Samples submitted to BioSamples using ENA Webin authentication are put through additional checks to be compliant with ENA. All ENA samples must have taxonomy information and the taxonomy must be valid against the ENA taxonomy service. In BioSamples we have added a submission time validation of the mandatory organism attribute against the ENA taxonomy service.
BioSamples client changes

BioSamples client version 5.1.0 has undergone technical changes to support Webin authentication. The latest version of the client can be used to submit samples, curate samples or certify samples in BioSamples using Webin authentication.
Improved DUO code rendering

The rendering of DUO codes on the sample page has been improved. When the mouse pointer hovers over a DUO code, its description is displayed as a tooltip.

Bug Fixes

Fix Phenopacket export errors when exporting samples with disease related attributes

v5.0.7 Release notes

Bug Fixes

Re-introduce missing samples/validate endpoint

In the last release we removed the samples/validate endpoint in favour of the validate endpoint. Since most users are using samples/validate, we will keep it for now and deprecate it in a future release.
Support both json and hal+json for accept header

In the last release the validate endpoint did not support the hal+json accept header. This release adds support for it.
Enable ENA to pre-accession samples using WEBIN authentication instead of AAP

ENA will pre-accession samples using a WEBIN super user (prefixed SU-), and the metadata submission will be done by a non-super user. During metadata submission we check whether the sample was accessioned by the ENA registered super user; if so, we allow any general WEBIN user to submit metadata against the accession.

v5.0.6 Release notes

New features

Authentication

We have added additional authentication support in BioSamples. With this release BioSamples users can authenticate using EMBL-EBI’s European Nucleotide Archive (ENA) WEBIN authentication service. This is especially useful for users who intend to submit their sample metadata to BioSamples and sequencing data to ENA, as the same WEBIN credentials can be used to submit to both BioSamples and ENA. BioSamples continues to support the existing AAP authentication mechanism. AAP authentication is the default mode, and current users who use AAP authentication to submit sample metadata to BioSamples are not required to make any changes to their submission routines. More information related to authentication can be found here.
Sample search results bulk download

A new API enables searching and bulk downloading of results, up to a maximum of 100,000 samples. The API supports text search and sample filtering. When search results exceed the maximum allowable download size, only the first 100,000 samples will be downloaded. Download buttons have also been added to the search user interface. Currently, samples can be downloaded as JSON, XML or an accession list only.
Validation checklist in samples body (similar to existing ENA checklists)

Samples are validated at submission time. They are by default validated against the biosamples-minimal (ERC100001) checklist. Users can additionally provide the name of a known checklist in the sample body; when provided, this is also used for validation. If validation fails, the submission will be rejected. This enables users to define their preferred validation checklist at submission time. Please refer to the validation guide to see available checklists. The validation API is also available independently of submission and can be used to validate samples without submitting. We have updated our documentation to reflect these changes in certification and validation.

Bug Fixes

Link to new ENA browser - Samples having external reference to ENA were using the old ENA browser links. This has now been updated to link to the new ENA browser.

Example:

Old link - https://www.ebi.ac.uk/ena/data/view/SAMEA5776016
New link - https://www.ebi.ac.uk/ena/browser/view/SAMEA5776016
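The link change above amounts to a prefix swap. A minimal Python sketch, assuming all old links share the prefix shown in the example (the function name is hypothetical):

```python
OLD_PREFIX = "https://www.ebi.ac.uk/ena/data/view/"
NEW_PREFIX = "https://www.ebi.ac.uk/ena/browser/view/"


def to_new_ena_link(url):
    """Rewrite an old ENA browser link; leave any other URL unchanged."""
    if url.startswith(OLD_PREFIX):
        return NEW_PREFIX + url[len(OLD_PREFIX):]
    return url


print(to_new_ena_link("https://www.ebi.ac.uk/ena/data/view/SAMEA5776016"))
# prints https://www.ebi.ac.uk/ena/browser/view/SAMEA5776016
```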

v5.0.5 Release notes

New features

Private samples are searchable by authenticated users

Previously, private samples were only available for direct retrieval after logging in. This release enables searching of private samples through the API by their owner. The sample search endpoint requires a JWT and returns the private samples the user is authorised for.

Bug Fixes

Documentation updates

BioSamples documentation has been updated to remove links to deprecated AAP services. Furthermore, the documentation has been improved to distinguish between the dev and production authentication services.

v5.0.4 Release notes

New Features

Add Plant-MIAPPE checklist to BioSamples' schemas

We have added the Plant-MIAPPE checklist to BioSamples' schemas. At sample submission time, the certification service verifies whether the given sample complies with this checklist. If it does, a Plant-MIAPPE compliance certificate is attached to the sample. Please find more about certification and validation in our documentation here.
Remove holiday notification banner from the website

v5.0.3 Release notes

New Features

Further changes in representation of BioSamples dates

1.1 In response to additional user feedback, a few changes in how we present dates in the BioSamples user interface have been implemented. The “ID created date” was removed from the user interface. This internal bookkeeping date was generating confusion with the sample submission date. More information is available at https://wwwdev.ebi.ac.uk/biosamples/docs/faq#_why_was_the_code_id_created_on_code_field_removed

1.2 A collapsible section “BioSamples record history” has been added and contains the following dates:

Submitted on: The earliest date at which valid metadata has been provided by the submitter. This attribute is generated by BioSamples and other INSDC partners.

Released on: The user-supplied date at which the sample metadata is made available publicly for the first time.

Last reviewed: The date at which a new curation object has been created or the automatic curation pipelines have been run on a sample metadata. This field is only present if at least one curation object has been added by the curation pipelines. The last reviewed date is updated when the curation objects are reviewed, even if they are found still valid and are not modified and indicates that the sample is compliant with the latest BioSamples curation rules [https://www.ebi.ac.uk/biosamples/docs/guides/curation]. This attribute is generated by BioSamples.

Please refer to our documentation and FAQ section for further details, at https://www.ebi.ac.uk/biosamples/docs/guides/dates and https://wwwdev.ebi.ac.uk/biosamples/docs/faq

Modification to EBI search engine export pipeline

The “host” attribute is now represented as “host scientific name” in the daily sample export. This change has been done to accommodate a request from the EBI Search team around a new facet in EBI search.

Notifications

Please note that the BioSamples team will be out of the office from December 21st 2020 to January 3rd 2021. Replies to Helpdesk requests will be delayed during this period. This notification was added to the service home page.

v5.0.2 Release notes

New Features

Change in representation of BioSamples dates

In response to user feedback, and to alleviate possible confusion between sample ID creation and submission dates, we have updated the label ‘created on’ to ‘ID created on’ and added the ‘Submitted on’ date for newly added samples. We have also added documentation for the following dates, which will be displayed in the UI going forward:
- ID created on: The date at which the sample accession is created. This attribute is generated by BioSamples. IDs can be created in advance of collection or submission; BioSamples allows the pre-registration of sample accession to support cross-archive data exchange and data provenance management.
- Submitted on: The earliest date at which valid metadata has been provided by the submitter. This attribute is generated by BioSamples and other INSDC partners.
- Released on: The user-supplied date at which the sample metadata is made available publicly for the first time.
- Updated on: The last date at which the sample was updated. Samples can be updated for curation needs and other technical purposes. More information about curation is available in the documentation [https://www.ebi.ac.uk/biosamples/docs/guides/curation]. This attribute is generated by BioSamples.

v5.0.1 Release notes

New Features

Organism has been made a mandatory attribute for samples

Samples submitted to BioSamples must have either an organism attribute or a species attribute. Samples with neither an organism nor a species will not be persisted, and the submission request will be rejected with HTTP status code 400 (Bad request).
Certification Service

A new service has been added to BioSamples for sample validation using JSON schema checklists. Samples validated against a checklist are deemed certified by that checklist, and certificates are added to the sample. Please see the BioSamples user guide and API guide on the certification service for more details:
User guide - http://www.ebi.ac.uk/biosamples/docs/guides/certification
API reference - http://www.ebi.ac.uk/biosamples/docs/references/api/certify
First use case - The certification service has been used to validate the existence of organism or species in sample metadata submitted to BioSamples.
Schema reference - https://github.com/EBIBioSamples/biosamples-v4/blob/dev/webapps/core/src/main/resources/schemas/certification/biosamples-minimal.json
Structured data support for new types

Structured data support was extended to include new data formats: CHICKEN_DATA, HISTOLOGY_MARKERS, MOLECULAR_MARKERS and FATTY_ACIDS. This has been done to support structured data for the ‘HoloFood’ project, which involves the microbiome of agricultural animals (salmon and chicken). As part of this project, various submitters are going to generate data, some of which is suitable for ENA. Some of the data in structured form falls outside ENA’s remit (e.g. histological summaries for the samples), and BioSamples will provide support to store such structured data.
Sample recommendations endpoint

A new endpoint has been introduced for use along with the validation endpoint. Before submitting a sample, the submitter can check whether the sample conforms to the BioSamples recommended format and get suggestions for changes. Submitting a sample in the recommended format will increase the FAIRness of the data. Please refer to the API guide for more details - http://www.ebi.ac.uk/biosamples/docs/references/api/validate
Relationship curations

Previously, curations could only be applied to attributes and external references. Now curations can also be applied to relationships. This enables third parties to apply relationships to samples.
Retrospective KILLED samples handler added to the ENA pipeline

The ENA import pipeline that imports samples from ENA to BioSamples has been modified to retrospectively check whether samples have been KILLED in ENA. The status is updated accordingly in BioSamples so that sample metadata is consistent with ENA.
Cross-origin resource sharing (CORS) has been enabled for BioSamples APIs for all origins and all methods
BioSamples sample XML view has been modified to include the AMR Antibiogram model as well. Please download the XML from the example sample - https://wwwdev.ebi.ac.uk/biosamples/samples/SAMN09711403 to see the XML modelling of AMR data
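The mandatory organism rule described in this release can be illustrated with a short client-side pre-check. This is a sketch under assumptions, not the certification service itself: it assumes sample attributes are held under a "characteristics" object keyed by attribute name, compares names case-insensitively, and uses 201 as a stand-in success code. Only the 400 (Bad request) rejection comes from the notes above.

```python
def has_organism_or_species(sample):
    """Return True if the sample carries an organism or species attribute.

    Assumes attributes live under "characteristics", keyed by attribute
    name, and compares names case-insensitively (both are assumptions).
    """
    names = {name.strip().lower() for name in sample.get("characteristics", {})}
    return bool(names & {"organism", "species"})


def submission_status_code(sample):
    """Hypothetical helper: 400 (Bad request) when the check fails, else 201."""
    return 201 if has_organism_or_species(sample) else 400


with_organism = {"name": "s1", "characteristics": {"Organism": [{"text": "Homo sapiens"}]}}
without = {"name": "s2", "characteristics": {"sex": [{"text": "female"}]}}
print(submission_status_code(with_organism), submission_status_code(without))
```

Running such a check before submission avoids a round trip that would otherwise end in a rejected request.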

Bug Fixes

Bug fix in the EBI search pipeline to exclude killed and suppressed samples from the exported data
Bug fix in NCBI sample processing to avoid 400 (Bad request) errors for samples that don’t have an organism. The certification service rejects samples without an organism
Bug fix in pipelines to handle HTTP 404 errors when fetching samples with a blank curation domain. Pipeline failure is avoided in such cases and error logging is improved
The EBI search data export pipeline has been modified so that the data export dump includes only the 100 attributes most commonly present across all samples in the BioSamples database. Other attributes are omitted from the sample metadata sent to the EBI search engine, because the EBI search engine permits up to 100 query parameters and no more
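The "top 100 attributes" selection used by the export pipeline can be sketched as a frequency count followed by a filter. This is a minimal illustration, not the pipeline's code: it assumes attributes sit under a "characteristics" object keyed by attribute name, and both function names are hypothetical.

```python
from collections import Counter


def most_present_attributes(samples, limit=100):
    """Return the names of the `limit` attributes present in most samples."""
    counts = Counter()
    for sample in samples:
        # Count each attribute once per sample, however many values it has.
        counts.update(set(sample.get("characteristics", {})))
    return [name for name, _ in counts.most_common(limit)]


def trim_sample(sample, keep):
    """Drop attributes that are not in the `keep` collection."""
    kept = {k: v for k, v in sample.get("characteristics", {}).items() if k in keep}
    return {**sample, "characteristics": kept}


samples = [
    {"characteristics": {"organism": [], "sex": []}},
    {"characteristics": {"organism": []}},
]
keep = most_present_attributes(samples, limit=1)
print(keep, trim_sample(samples[0], keep))
```

With `limit=100` the same two steps yield a dump in which every sample carries at most the 100 most widely used attributes.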

v5.0.0 Release notes

New features

Retiring SampleTab API
The SampleTab, legacy-json and legacy-xml APIs have been retired in this release. Please contact us at biosamples@ebi.ac.uk if you have any questions/concerns. The following endpoints are no longer supported:

v4.2.7 Release notes

New features

Sample groups API:
The sample group API, which was present in SampleTab, is now present in the JSON API. However, we are still discussing whether there is a real user requirement for it. We would be happy to hear from users who have a use case in mind for sample groups.
Sample graph search API, interface and new neo4j dependency:
Sample graph search is an experimental feature that enables exploration of sample-to-sample and sample-to-external-resource relationships. It is backed by the neo4j graph database, so neo4j is introduced as a new dependency. The experimental interface (which will change in future) supports simple relationship queries and lists the results.
Domain transfer from old SampleTab domain to new AAP domain:
We have started moving old SampleTab domains to new DSP subs domains. This is done only on user request. Let us know if you need to move your samples from old domains to a new AAP domain.
Sample relationship source validation and relationship documentation:
In a sample relationship, the source must equal the accession of the containing sample. This is validated at sample submission time. A new section has been added to the user guide to explain sample relationships.
Clearinghouse import:
Now we have all the scripts in place for importing curations from clearinghouse. As a result we have also changed how we curate "not collected" and "not provided" values. This is described in documentation.
Improvements to EBI Search Engine data dump pipeline
BioSamples support to ENA presentation: External reference to ENA is added to samples submitted through BioSamples, i.e. BioSamples authority samples
Improve BioSamples documentation
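The relationship source rule introduced in this release (the source of a relationship must be the accession of the sample that contains it) can be sketched as a simple check. The field names "accession", "relationships", "source", "type" and "target" are assumptions about the sample JSON, and the function is a hypothetical illustration rather than the server-side validator.

```python
def invalid_relationships(sample):
    """Return relationships whose source is not the containing sample."""
    accession = sample.get("accession")
    return [
        rel
        for rel in sample.get("relationships", [])
        if rel.get("source") != accession
    ]


# SAMEA1, SAMEA2 and SAMEA3 are placeholder accessions.
sample = {
    "accession": "SAMEA1",
    "relationships": [
        {"source": "SAMEA1", "type": "derived from", "target": "SAMEA2"},
        {"source": "SAMEA3", "type": "derived from", "target": "SAMEA1"},
    ],
}
print(invalid_relationships(sample))
```

Here only the second relationship is reported, because its source is a different sample.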

Bug fixes:

Remove alt text from h1 tag in UI. Alt text in the h1 tag caused Google to wrongly index BioSamples in search results.
Include missing domain validation when updating samples:
Domain validation in the sample update service was missing in the previous version and has been added in this version. Now, if a user has access to an existing sample, they can update the sample using any domain they have access to.
Fix the curation pipeline to retain meaningful attributes having values like “not provided”, “not collected”
NCBI Exchange - There are cases of missing SRA accessions in NCBI samples imported to EBI BioSamples. In such cases the NCBI samples are cross-checked against the ENA Oracle database and, if an SRA accession is found there, the NCBI samples are updated with it
Updating samples that are already private in NCBI to private in EBI BioSamples often failed; this has been fixed in this release

Notifications

Please note that we will be removing SampleTab format submission support on 1st of July. Please let us know if you have any concerns regarding this.

v4.2.6 Release notes

New features

Changes to BioSamples indexing: Solr CDCR process is quite slow when we re-index BioSamples at the weekend. Therefore at the weekend, instead of using CDCR for datacenter replication, we will copy Solr index to the second datacenter and keep CDCR process down while copying.
Pipeline statistics: We will store pipeline related statistics in a new collection in MongoDB. This will enable us to have insight into BioSamples sample distribution and later enable visualization of BioSamples usage.
AMR Structured data support: AMR structured data submission support has been added to BioSamples. See the documentation for how to submit AMR structured data to BioSamples. Structured data submission retains access rights: if the sample submitter and the structured data submitter are different, the sample submitter can only update the sample metadata and the structured data submitter can only update the structured data.
Livelist pipeline has been improved to generate live samples list, suppressed samples list and killed samples list
New pipeline added to provide dump of biosamples to the EBI search engine with the scope of further improvements based on review of data dump
BioSamples support to ENA presentation: Feature has been added to ENA Pipeline to update SRA accession in samples submitted through BioSamples, i.e. BioSamples authority samples
Include COVID-19 query in BioSamples home page: BioSamples contains samples related to COVID-19 disease. COVID-19 related samples can be easily accessed by following the link on the home page.
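
The access-rights rule for AMR structured data described above can be sketched as a small decision function. Submitters are represented here as plain strings and the two parts of a sample as the labels "metadata" and "structured_data". Both are assumptions made for illustration and do not reflect how BioSamples models submitters or permissions internally.

```python
from typing import Set


def allowed_updates(requester: str, sample_submitter: str, data_submitter: str) -> Set[str]:
    """Return which parts of a sample the requester may update.

    Hypothetical helper: the sample submitter may update the sample metadata,
    the structured data submitter may update the structured data.
    """
    allowed = set()
    if requester == sample_submitter:
        allowed.add("metadata")
    if requester == data_submitter:
        allowed.add("structured_data")
    return allowed
```

When the same submitter provided both the sample and the structured data, the function returns both labels.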

Bug fixes:

Curation pipelines have been fixed to accept samples having blank attribute values
Bug fix in handling attribute name and measurement in ENA AMR import pipeline

Notifications

Data center migration and related maintenance tasks were completed as expected. BioSamples is operating at full capacity as usual.
Please note that we will be removing SampleTab format submission support on 1st of May. Please let us know if you have any concerns regarding this.

v4.2.5 Release notes

New features

Removed duplicate BioSamples accessions: A new pipeline has been developed to deal with duplicate ERS identifiers in BioSamples. It will initially be used to remove duplicate BioSamples accessions generated by imports from ENA and ArrayExpress. The duplication arose because BioSamples imports data from both ENA and ArrayExpress, each of which creates its own BioSamples IDs. ArrayExpress also includes a reference to ENA, which creates a duplicate of the ENA accession. The pipeline is generic and can be configured to remove similar duplicates in future.
Improvements to the /accessions endpoint to add pagination and wildcard search: The accessions endpoint now has the same capabilities as the /samples endpoint (text search, filters and paging), the only difference being that it returns just the accession numbers and not the full sample content. This was requested by NCBI. Instead of a list of accessions, it now returns a page with paging information.
- https://www.ebi.ac.uk/biosamples/accessions?text=human
- https://www.ebi.ac.uk/biosamples/accessions?filter=attr:organism:homo%20sapiens
- https://www.ebi.ac.uk/biosamples/accessions?filter=attr:organism:homo%20sapiens&page=1&size=100
Ontology annotations to AMR structured data added through Zooma. AMR structured data support in BioSamples was added in our last release, https://www.ebi.ac.uk/biosamples/samples/SAMEA3993565
Improvements in BioSamples Web UI
4.1 Broken hyperlinks have been removed through our curation pipelines.
4.2 Original ontology hyperlinks of attributes are maintained where links couldn’t be resolved by OLS.
4.3 Timestamps of samples have been moved to the bottom of the sample display webpage.
4.4 The BioSamples sample search page could be slow to load due to long facet generation time. We now return samples immediately, while facets are being loaded.
Planned maintenance message has been added.
BioSamples support for ENA Presentation – BioSamples will use NCBI sample attribute name and not attribute display names to form BioSample sample attribute names.
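
The /accessions query URLs listed above can be assembled programmatically. The sketch below uses only the query parameters that appear in those examples (text, filter, page, size). The helper name is hypothetical, and repeating the filter parameter for several filters is an assumption rather than documented behaviour.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://www.ebi.ac.uk/biosamples/accessions"


def accessions_url(text=None, filters=(), page=None, size=None):
    """Build an /accessions query URL from optional search parameters."""
    params = []
    if text is not None:
        params.append(("text", text))
    for item in filters:
        # Assumption: several filters are sent as repeated "filter" parameters.
        params.append(("filter", item))
    if page is not None:
        params.append(("page", page))
    if size is not None:
        params.append(("size", size))
    # Keep ":" readable and encode spaces as %20, as in the example URLs.
    query = urlencode(params, quote_via=quote, safe=":")
    return f"{BASE_URL}?{query}" if query else BASE_URL


print(accessions_url(filters=["attr:organism:homo sapiens"], page=1, size=100))
```

The call at the end reproduces the last example URL in the list above.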

Notifications

Some of our services are currently undergoing planned maintenance which is due to complete on 4th April 2020. There should be no impact on our users. If you experience any issues, please contact our helpdesk (biosamples@ebi.ac.uk) directly for support.
The planned maintenance will affect the Data Submission Portal (DSP). Consequently, and to provide ample time for our users to test and migrate to DSP, the BioSamples SampleTab APIs will be deprecated on May 1, 2020 (instead of April 1, 2020).

Bug Fixes

Fixing the BioSamples pipelines namely curation and zooma to retain the tag field in attributes
Fixing of pipeline failure notification system to send out emails if pipeline fails because of a network issue.

v4.2.3 Release notes

New features

1. Incorporation of AMR structured data support in BioSamples and addition of the new ENA-AMR import pipeline. The ENA-AMR import pipeline queries the ENA API for AMR data of samples. It receives back the samples that have AMR information, together with the FTP links to that information. It then attempts to fetch the AMR data from the FTP links, adds it to the sample and updates the sample in BioSamples. NCBI AMR data comes as part of the NCBI sample XML, and BioSamples imports it when the NCBI pipeline executes.
2. The following recommendations from ENA Presentation have been implemented to support the BioSamples support for ENA Presentation use case:

BioSamples JSON will have core attributes like description, title and organism in lower case

If a user-provided attribute of the same name exists and is in upper case, the two are treated as separate attributes in the BioSamples JSON:

"Description" : [ {
	  "text" : "user provided description in ENA sample",
	  "tag" : "attribute"
	} ],
"description" : [ {
	  "text" : "core description in ENA sample"
	} ]

If a user-provided attribute of the same name exists and is also in lower case, it becomes an additional element in the array for that attribute in the BioSamples JSON:

"description" : [ {
	  "text" : "core description in ENA sample"
	}, {
	  "text" : "user provided description in ENA sample",
	  "tag" : "attribute"
	} ]
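
The two cases above (same name in a different case versus the same name in lower case) can be illustrated with a minimal sketch. It assumes core and user-provided attributes arrive as simple name-to-text dictionaries; the function name and input shape are illustrative assumptions, not the import pipeline's actual interface.

```python
def merge_attributes(core: dict, user: dict) -> dict:
    """Combine core and user-provided attributes into one characteristics map."""
    characteristics = {}
    # Core attributes (description, title, organism, ...) are stored in lower case.
    for name, text in core.items():
        characteristics[name.lower()] = [{"text": text}]
    # User-provided attributes keep their original case and carry the "attribute" tag.
    # A name that matches a core attribute exactly is appended to the same array;
    # a name that differs in case ends up under its own key.
    for name, text in user.items():
        characteristics.setdefault(name, []).append({"text": text, "tag": "attribute"})
    return characteristics
```

Because dictionary keys are case-sensitive, "Description" and "description" stay separate without any special handling.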

Bug Fixes

Fixing the curami pipeline to deal with attributes having blank values

Fixing the curami pipeline to deal with attributes that have a tag. The curami pipeline was removing the tags while creating curation objects.

Please note: “tag” is used to specify any additional information about the attribute, for example the namespace of an external ID or a submitter ID, or to indicate that an attribute has been provided specifically by the user. A couple of examples are given below; in the second one, the tag "attribute" indicates a user-provided attribute:

"Submitter Id" : [ {
	  "text" : "E-MTAB-565:FOXK2_Dox_treated",
	  "tag" : "Namespace:UNIVERSITY OF MANCHESTER"
	} ],
"DiseaseState" : [ {
	  "text" : "Osteosarcoma",
	  "tag" : "attribute"
	} ]
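
A client reading BioSamples JSON might interpret the tag field along these lines. This is a sketch under the assumption that namespace tags always start with the literal prefix "Namespace:" as in the examples; the helper names are hypothetical.

```python
NAMESPACE_PREFIX = "Namespace:"


def namespace_of(value: dict):
    """Return the namespace carried in a value's tag, or None if there is none."""
    tag = value.get("tag", "")
    if tag.startswith(NAMESPACE_PREFIX):
        return tag[len(NAMESPACE_PREFIX):].strip()
    return None


def user_provided(characteristics: dict) -> list:
    """List attribute names that have at least one value tagged "attribute"."""
    return [
        name
        for name, values in characteristics.items()
        if any(value.get("tag") == "attribute" for value in values)
    ]
```

Stripping whitespace after the prefix lets the helper handle both "Namespace:SC" and "Namespace: Coriell" style tags.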

v4.2.2 Release notes

New features

Modification of the /accessions POST endpoint to improve pre-accessioning performance. Pre-accessioning of samples is used by ENA, which used our SampleTab APIs in the past. SampleTab will be deprecated from April 01, 2020 and the new, improved /accessions POST endpoint can be used for pre-accessioning instead.
Improvements to the /accessions GET endpoint: search filters, pagination and page sizing have been added at the request of NCBI. NCBI was using the BioSamples legacy-xml endpoints; before those endpoints are deprecated, the alternative /accessions REST endpoint needed these improvements so that it can provide similar functionality to NCBI.
RDF release pipeline has been added to BioSamples for continuous RDF release. The frequency of the release can be configured.
Improvement of BioSamples pipeline to report back error statuses and log correct error messages and failure cases.
The following recommendations from ENA Presentation have been implemented to make top-level attributes and user-provided attributes easy to identify, and to leave out any attribute that doesn’t make sense to them. This takes effect for all ENA and NCBI samples imported to BioSamples and relates to ENA Presentation querying the BioSamples APIs for sample metadata:
5.1. Add the tag “attribute” to all user-provided attributes.
5.2. Remove the tag “core” from specific top-level attributes (description, for example).
BioSamples will retain the create date of NCBI samples that are being imported. Previously the create date was overridden with the date and time when the sample was saved in BioSamples.

Bug Fixes

Bug fix to handle null dates in NCBI samples while being imported to BioSamples.

Platform upgrades

BioSamples now runs on Java 11 (Open JDK 11).

v4.2.1 Release Notes

New features:

Handler added to check and update sample status in BioSamples for SUPPRESSED samples in ENA/NCBI. SUPPRESSED samples that exist in ENA and not in BioSamples are created in BioSamples. This helps to have a consistent view of the samples in ENA and BioSamples.
Full contact details, including name, role, email and affiliation, are now saved and displayed by default. If the request parameter -setfulldetails is set to false in the request URI, the full contact details won’t be saved.
ENA BioSamples integration changes have been made in this release. These enable ENA Presentation to query the BioSamples API for sample metadata. A short description of the changes is given below:
1. Retaining of ArrayExpress elements in ENA imported samples
2. Mapping of alias in ENA sample XML to name (top-attribute) in BioSamples JSON
3. Mapping of SAMPLE_ATTRIBUTE/alias in ENA sample XML to characteristics/alias in BioSamples JSON
4. Removing tagging of core attributes from Synonyms for ENA/NCBI/DDBJ samples. SUBMITTER_ID, EXTERNAL_ID, UUID, ANONYMIZED_NAME, INDIVIDUAL_NAME attributes were earlier mapped to synonyms. With this release they are mapped to individual attributes under characteristics in BioSamples JSON, like characteristics/External Id, characteristics/Submitter Id and so on
5. Introduction of tag in BioSamples JSON for mapping namespace values in ENA/NCBI/DDBJ samples. An example below: "External_id" : [ { "text" : "GM18582", "tag" : "Namespace: Coriell" } ], "Submitter Id" : [ { "text" : "ZF_CR_MPX22_279-sc-2227782", "tag" : "Namespace:SC" } ]
6. Handling for multiple descriptions (core description and SAMPLE_ATTRIBUTE description) for ENA/NCBI/DDBJ samples. An example below. Reusing of tag to show if the description is of core or sample attributes "Description" : [ { "text" : "Protocols: U2OS cells …..)", "tag" : "core" }, { "text" : "This sample has been re-named", "tag" : "attribute" } ]
7. Removing characteristics/synonym from BioSamples JSON for ENA/NCBI/DDBJ samples. All attributes that were tagged under synonyms now have individual attributes under characteristics and hence synonym is not required. Alias is now mapped to name too, which also makes synonym redundant
8. PRIMARY_ID of NCBI/DDBJ samples mapped to characteristics/SRA accession in BioSamples JSON. This will bring samples metadata in BioSamples in sync for ENA/NCBI/DDBJ samples.
9. Title was mapped to characteristics/Title (for ENA samples) and characteristics/description title (for NCBI/DDBJ samples). Title is now mapped to characteristics/Title for all ENA/NCBI/DDBJ samples
10. GenBank common name handled in characteristics/Common Name for NCBI/DDBJ samples. Provision is kept for ENA samples too if such an attribute exists.
11. Performance improvements of ENA pipeline
12. Create date added for ENA/NCBI/DDBJ samples
13. Retaining of ENA prefixed attributes in BioSamples JSON
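
Items 4 and 5 in the list above (identifiers moving from synonyms to individual attributes, with the namespace kept in a tag) can be sketched as follows. The input is assumed to be a list of (kind, namespace, value) tuples, which is an illustrative simplification of the ENA/NCBI/DDBJ sample XML. Only the two target attribute names spelled out in these notes are mapped; the names used for the other identifier kinds are not stated here and are left out rather than guessed.

```python
# Target names taken from the notes; other identifier kinds are intentionally omitted.
TARGET_NAMES = {
    "SUBMITTER_ID": "Submitter Id",
    "EXTERNAL_ID": "External Id",
}


def map_identifiers(identifiers) -> dict:
    """Map (kind, namespace, value) tuples to individual characteristics."""
    characteristics = {}
    for kind, namespace, value in identifiers:
        name = TARGET_NAMES.get(kind)
        if name is None:
            continue  # identifier kind not covered by this sketch
        entry = {"text": value}
        if namespace:
            entry["tag"] = f"Namespace:{namespace}"
        characteristics.setdefault(name, []).append(entry)
    return characteristics
```

Each identifier becomes its own attribute under characteristics, so a separate synonym attribute is no longer needed.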

Bug fixes:

UI bugfix to display contact role. Earlier it used to show name instead of role.
Change curation-view pipeline to read samples from MongoDB. To crawl all the samples available in BioSamples, we can’t use the biosamples-client get-all-samples method as it does not return non-indexed samples (e.g. suppressed samples)

4.2.0

Deprecation of SampleTab submission format.
Adding static collection for samples+curations.
Modify applying order for the curation objects.
Add link to sample accession.

4.1.15

Update phenopacket version
Add curami pipeline to curate biosamples attributes

4.1.14

Add DUO attribute to external reference class
Add script to import EGA data
Add presto connector as a BioSamples client module

4.1.13

Added API in biosamples-client to utilize JWT tokens
Resolved issue where ENA pipeline failed if FIRST_PUBLIC date is not available

4.1.12

Replicate required ENA XML Dump functionality in the ENA pipeline
Added an annotation 'submitted via USI' to USI samples
Added support for suppressed samples imported through ENA pipeline
Added user documentation of JSON schema
Added logging and retry logic for reindexing pipeline
Refined ncbi pipeline to check suppressed samples are in solr index before removing

4.1.11

Added support for suppressed samples to enable dbGap data loading
Fix confusion between suppressed and private samples in dbGap data
Livelist file: adding flush to make sure file is written
Add validation and accessioning service
Fix SampleTab template download link

4.1.10

Remove the holiday message
Fix submit tab link in error pages

4.1.9

Added a Curation Undo Pipeline to allow for removal of erroneous curations.
Fix an issue where long attributes break the sample box UI.

4.1.8

Corrected error in curation pipeline which caused sample characteristics to be removed erroneously
Added holiday message

4.1.7

Added libraries to enable applications to use Graylog to allow configuration of aggregated logging
Switched to the AAP explore environment at https://explore.api.aai.ebi.ac.uk
Updated the default AAP URL used by the BioSamples client
Included sampletab template file in the sampletab documentation
Included ETAG and Curation Object recipes to the BioSamples cookbook
Removed name and API key lookup functionality from SampleTab process

4.1.6

Addition of AMR structured data into BioSamples
Submission of samples with a relationship not targeting a valid accession now return an error
Fixed bug with Phenopacket export not able to extract metadata for Orphanet terms
Updated user interface to use the newer version of the EBI visual framework
Improved documentation navigation experience adopting a new menu style

4.1.5

Fixed a bug where search failed when using a colon with a non-indexed field, e.g. taxon:9696
Added the BioSamples cookbook
Fixed issue where there are duplicate organism attributes with different cases in a sample
Updated the error message in the SampleTab UI to take into account large submissions timeout

4.1.4

As part of curation pipeline attributes with the value "not_applicable" are removed
Date titles on the sample page are now "Releases on" and "Updated on" rather than "Release" and "Update"
An initial accession endpoint has been added to the REST API to enable ENA to get a list of accessions for a project
A multi-step Docker build has been added to allow Docker images to be distributed on quay.io
A fix has been made for an issue that caused the Zooma Pipeline to fail on wwwdev

4.1.3

Additional sample attributes required by ENA are now available including a single, top-level taxId field
The export box for a sample has been renamed to download and contains a list of serialisations that always download as a file, fixing a blocked-popups issue in Safari
The search results now have an updated look and feel based on feedback from ENA

4.1.2

Sample JSON now contains a numeric taxId field at the top level
IRI of ontology terms now resolve to the defining ontology when they are available in multiple ontologies
Requests for a sample now contain a computed ETag header to identify changes
When requesting a private sample an explanation message is now provided in addition to the 403 error code
The search UI now contains a clear filters button
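
A client could use the ETag header and the 403 response mentioned above roughly as sketched below. The sketch relies on standard HTTP semantics only: If-None-Match and the 304 status are part of HTTP, but whether the BioSamples API answers conditional requests with 304 is an assumption, and both helper names are hypothetical.

```python
def conditional_headers(cached_etag=None) -> dict:
    """Build request headers that ask for the sample only if it has changed."""
    headers = {}
    if cached_etag:
        headers["If-None-Match"] = cached_etag
    return headers


def interpret_status(status: int) -> str:
    """Classify the response to a sample request by its HTTP status code."""
    if status == 200:
        return "fetched"    # body and a fresh ETag are available
    if status == 304:
        return "unchanged"  # assumed: the cached copy is still current
    if status == 403:
        return "private"    # private sample, an explanation message accompanies the code
    return "error"
```

Keeping the two helpers free of any HTTP library lets them be used with whichever client the caller already has.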

4.1.1

Expose the BioSchemas markup with enhanced context and Sample ontology code
SampleTab submission pipeline has been rewritten for better robustness
In the samples results page, the sample name and the sample accession now link to the single sample page
Fixed various broken hyperlinks on the home page and in documentation

4.1.0

New features

GDPR:
- SampleTab submissions enforce explicit acceptance of the terms of service and the privacy information
- GDPR notices added throughout
SampleTab submissions in which the target of a relationship is neither a sample name nor a sample accession are now rejected, with additional information provided to the user about the problematic data
Bioschema.org entities are exported in BioSamples and available both in the UI - embedded in a script tag - and through the API

Bug fixes

Solved issues with incorrect hyperlinks in the header
Solved issue with resolving relationship by name in SampleTab submissions
Solved issue with converting DatabaseURI to external references in SampleTab submissions
Improved special characters handling in SampleTab submissions

4.0.7

This is a bugfix release that addresses the following issues:
- GDPR notices
- Update format of the Sitemap file