Import of Specimen or Occurrence Records Into Taxonomic Manuscripts

Repositories and data indexing platforms, such as GBIF, BOLD Systems, or iDigBio, hold documented specimen or occurrence records along with their record IDs. To streamline the authoring process, save taxonomists’ time, and provide a workflow for peer review and quality checks of raw occurrence data, the ARPHA team has introduced a feature that imports specimen occurrence records directly into a taxonomic manuscript (see Fig. 1).

For the remainder of this post we will refer to specimen data as occurrence records, since an occurrence can be either an observation in the wild or a museum specimen.

Fig. 1: Workflow for directly importing occurrence records into a taxonomic manuscript.

Until now, users of the ARPHA Writing Tool who wanted to include occurrence records as materials in a manuscript had to either format the occurrences as an Excel sheet for upload to the Biodiversity Data Journal or enter the data manually. While the “upload from Excel” approach significantly simplifies the import of materials, it still requires a transposition step – data stored in a database must be reformatted to the specific Excel format. With the new import feature, occurrence data stored at GBIF, BOLD Systems, or iDigBio can be inserted directly into the manuscript by entering the relevant record identifier.

The functionality shows up when one creates a new “Taxon treatment” in a taxonomic manuscript prepared in the ARPHA Writing Tool. The import functions as follows:

  1. the author locates an occurrence record or records in one of the supported data portals;
  2. the author notes the ID(s) of the records that ought to be imported into the manuscript (see Fig. 2, 3, and 4 for examples);
  3. the author enters the ID(s) of the occurrence records in the form found in the materials section of the species treatment, selects the database from a list, and clicks ‘Add’ to import the occurrence directly into the manuscript.

In the case of BOLD Systems, the author may also enter a Barcode Index Number (BIN; BINs are discussed below), which pulls all occurrences in the corresponding BIN (see Fig. 5).

Fig. 2: (Left) An occurrence record in iDigBio. The UUID is highlighted. Fig. 3: (Right) An occurrence record in GBIF. The GBIF ID and the Occurrence ID are highlighted.

Fig. 4: (Left) An occurrence record in BOLD Systems. The record ID is highlighted. Fig. 5: (Right) All occurrence records corresponding to an OTU. The BIN is highlighted.

We will illustrate this workflow by creating a fictitious treatment of the red moss, Sphagnum capillifolium, in a test manuscript. Let’s assume we have started a taxonomic manuscript in ARPHA and know that occurrence records for S. capillifolium can be found in iDigBio. We first need to locate the ID of the occurrence record on the iDigBio website. For iDigBio, the ARPHA system supports import via a Universally Unique Identifier (UUID). We have already created a treatment for S. capillifolium and clicked on the pencil icon to edit materials (Fig. 6). Scrolling to the bottom of the pop-up window reveals the form displayed in the middle of Fig. 1.

Fig. 6: Edit materials.

From here, the following actions are possible:

  • insert (an) occurrence record(s) from iDigBio by specifying their UUIDs (universally unique identifiers) (Fig. 2);
  • insert (an) occurrence record(s) from GBIF by entering their GBIF IDs (Fig. 3);
  • insert (an) occurrence record(s) from GBIF by entering their occurrence IDs (note that not all GBIF records have an occurrence ID, which is meant to serve as a kind of universal identifier) (Fig. 3);
  • insert (an) occurrence record(s) from BOLD by entering their record IDs (Fig. 4);
  • insert a set of occurrence records from BOLD belonging to a BIN (Barcode Index Number) (Fig. 5).
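
Each of the five options expects a differently shaped identifier, so a quick sanity check before clicking Add can catch pasting mistakes. The following is a minimal sketch of such a check, not part of ARPHA itself: the option keys and the function name are hypothetical, and the BIN pattern (the prefix "BOLD:" followed by three letters and four digits) is an assumption about how BINs are usually written.

```python
import re
import uuid

# Assumed shape of a BIN, e.g. "BOLD:AAA0001" (placeholder value, not a real BIN).
BIN_PATTERN = re.compile(r"^BOLD:[A-Z]{3}\d{4}$")


def looks_valid(option: str, identifier: str) -> bool:
    """Return True if `identifier` has a plausible shape for the chosen option.

    The option keys are hypothetical names for the five choices listed above.
    """
    identifier = identifier.strip()
    if option == "idigbio_uuid":
        try:
            uuid.UUID(identifier)  # raises ValueError for malformed UUIDs
            return True
        except ValueError:
            return False
    if option == "gbif_id":
        # Assumption: GBIF IDs are plain integers.
        return identifier.isascii() and identifier.isdigit()
    if option in ("gbif_occurrence_id", "bold_record_id"):
        # Free-form identifiers: only require that something was entered.
        return bool(identifier)
    if option == "bold_bin":
        return BIN_PATTERN.match(identifier) is not None
    raise ValueError(f"unknown option: {option}")
```

For instance, looks_valid("idigbio_uuid", "b9ff7774-4a5d-47af-a2ea-bdf3ecc78885") accepts the UUID used in the example below, while a BIN entered without its "BOLD:" prefix would be rejected under the assumed pattern.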

In this example, select the iDigBio option, type or paste the UUID b9ff7774-4a5d-47af-a2ea-bdf3ecc78885, and click Add. This will pull the occurrence record for S. capillifolium from iDigBio and insert it as a material in the current paper (Fig. 7). The same workflow applies to the GBIF and BOLD portals.
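
Behind a form like this, each option presumably maps to a lookup against the chosen portal. The sketch below builds such lookup URLs. It is not ARPHA's implementation: the function name, the option keys and the URL patterns are assumptions based on our reading of the public GBIF, iDigBio and BOLD web services, and should be checked against each portal's current API documentation before use.

```python
from urllib.parse import quote, urlencode

# Assumed base URLs of the portals' public web services (verify before use).
GBIF_API = "https://api.gbif.org/v1/occurrence"
IDIGBIO_API = "https://search.idigbio.org/v2/view/records"
BOLD_API = "http://www.boldsystems.org/index.php/API_Public/specimen"


def lookup_url(option: str, identifier: str) -> str:
    """Build the URL at which a record (or set of records) could be requested.

    `option` is a hypothetical key for one of the five form choices.
    """
    identifier = identifier.strip()
    if option == "idigbio_uuid":
        return f"{IDIGBIO_API}/{quote(identifier)}"
    if option == "gbif_id":
        return f"{GBIF_API}/{quote(identifier)}"
    if option == "gbif_occurrence_id":
        # Occurrence IDs are not GBIF keys, so they go through the search endpoint.
        return f"{GBIF_API}/search?" + urlencode({"occurrenceId": identifier})
    if option == "bold_record_id":
        return f"{BOLD_API}?" + urlencode({"ids": identifier, "format": "json"})
    if option == "bold_bin":
        # A BIN returns every specimen record assigned to it.
        return f"{BOLD_API}?" + urlencode({"bin": identifier, "format": "json"})
    raise ValueError(f"unknown option: {option}")
```

Calling lookup_url("idigbio_uuid", "b9ff7774-4a5d-47af-a2ea-bdf3ecc78885") gives the address for the S. capillifolium record used above. The function only builds the URL, so fetching and parsing the response is left to the caller.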

Fig. 7: Materials after they have been imported.

This workflow can serve a number of purposes, but one of its most exciting future applications is the rapid re-description of Linnaean species, or new morphological descriptions of species together with DNA barcode sequences, using the Barcode Index Number (BIN) underlying an Operational Taxonomic Unit (OTU). (A barcode is a taxon-specific, highly conserved gene that provides enough inter-species variation for statistical classification to take place.) If a taxonomist is convinced that a species hypothesis corresponding to an OTU defined algorithmically at BOLD Systems clearly represents a new species, they can import all specimen records associated with that OTU by entering the OTU’s BIN in the respective field.

Having imported the specimen occurrence records, the author needs to designate one specimen as the holotype of the new species, others as paratypes, and so on. The author can also edit the records in the ARPHA tool, delete some, or add new ones.

By not having to retype or copy and paste species occurrence records, authors save a lot of effort. Moreover, the records are imported automatically in a structured Darwin Core format, which anyone who needs the data for reuse can easily download from the article text as structured data.
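
To give a feel for what structured Darwin Core data means in practice, here is a minimal sketch that marks imported materials as holotype or paratype and writes them out as CSV. The column names are real Darwin Core terms, but the function names, the choice of columns and the record values are illustrative assumptions, and the export format ARPHA actually produces may differ.

```python
import csv
import io

# A small, illustrative subset of Darwin Core terms.
DWC_FIELDS = ["occurrenceID", "scientificName", "typeStatus", "basisOfRecord"]


def assign_types(materials, holotype_index):
    """Return copies of the records with one holotype and the rest as paratypes."""
    typed = []
    for index, record in enumerate(materials):
        status = "holotype" if index == holotype_index else "paratype"
        typed.append({**record, "typeStatus": status})
    return typed


def materials_to_csv(materials):
    """Serialise records to CSV with Darwin Core column headers."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=DWC_FIELDS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for record in materials:
        # Missing terms become empty cells rather than raising an error.
        writer.writerow({field: record.get(field, "") for field in DWC_FIELDS})
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder records, not real occurrence data.
    records = [
        {"occurrenceID": "example-1", "scientificName": "Sphagnum capillifolium"},
        {"occurrenceID": "example-2", "scientificName": "Sphagnum capillifolium"},
    ]
    print(materials_to_csv(assign_types(records, holotype_index=0)))
```

Because every column is a named Darwin Core term, a reader can load such a file straight into a spreadsheet or database without guessing what each field means.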

Another important aspect of the workflow is that it will serve as a platform for peer review, publication and curation of raw data, that is, of unpublished individual data records coming from collections or observations stored at GBIF, BOLD and iDigBio. Taxonomists are used to publishing only records of specimens they or their co-authors have personally studied. In a sense, the workflow will serve as a “cleaning filter” for the portions of data that pass through the publishing process. Thereafter, the published records can be used to curate raw data at collections, e.g. to correct identifications, assign newly described species names to specimens belonging to the respective BIN, and so on.

Additional Information:

The work has been partially supported by the EC-FP7 EU BON project (ENV 308454, Building the European Biodiversity Observation Network) and the ITN Horizon 2020 project BIG4 (Biosystematics, informatics and genomics of the big 4 insect groups: training tomorrow’s researchers and entrepreneurs), under Marie Skłodowska-Curie grant agreement No. 642241.

Research Ideas & Outcomes: New open-access journal to publish entire research cycles

Research Ideas & Outcomes (RIO), a new open access journal, is formally announced. The new journal represents a paradigm shift in academic publishing: for the first time, RIO will publish research from all stages of the research cycle, across a broad suite of disciplines, from humanities to science.

Traditional journals accept only articles produced at the end of the research continuum, long after the core work has been completed. RIO will publish ideas and outputs from all stages of the research cycle: proposals, experimental designs, data, software, research articles, project reports, policy briefs, project management plans and more.

The journal goes a step further with a collaborative platform that allows all ideas and outputs to be labelled with Impact Categories based upon UN Millennium Development Goals (MDGs) and EU Societal Challenges. These categories provide social impact-based labelling to help funders, journalists and the wider public discover and finance relevant research, as well as to foster interdisciplinary collaboration around societal challenges.

These game-changing ideas come packed with technical innovation and unique features. The journal is published through ARPHA, the first publishing platform ever to support the full life cycle of a manuscript: from authoring to submission, public peer review, publication and dissemination, within a single, fully-integrated online collaborative environment. The new platform will also allow for RIO to offer one of the most transparent, open and public peer review processes, thus building trust in the reviewed outcomes.

These features come à la carte: RIO will offer flexible pricing where authors can choose exactly which publishing services fit their needs and budget. All its contents – including reviews and comments, data and code – will receive a persistent unique identifier, will be permanently archived and made available under open licenses without any access embargo.

“RIO is not just about different kinds of submissions, though that is a crucial feature and certainly unique for publishing ongoing or even proposed research: it is also about linking those submissions together across the research cycle, about reducing the time from submission to publication, about collaborative authoring and reviewing, about mapping to societal challenges, about technical innovation, about enabling reuse and about giving authors more choice in what features they actually want from the journal,” said Dr. Daniel Mietchen, a founding editor of RIO.

“I’m proud to pioneer the first journal which can publish research from all stages of the research process,” said Prof. Lyubomir Penev, Co-Founder of RIO and Pensoft. “For the first time, researchers can get formal publication credit for previously ‘hidden’ parts of their work like written research proposals. We can publish all outputs in one journal; the same journal – RIO.”

RIO is scheduled to start accepting manuscripts in November 2015.

Introducing the ARPHA Writing Tool

The former Pensoft Writing Tool (PWT) appears under a new name with exciting functionalities customised to your needs

It’s been almost two full years since we first launched the Pensoft Writing Tool (PWT) as the first ever workflow that supports the full life cycle of a manuscript, from authoring, to peer review, publishing and dissemination. Now it is time to move a step forward with an updated tool that incorporates all our accumulated experience and your invaluable feedback. PWT is now transforming into the ARPHA Writing Tool (AWT) – a rebrand that means much more than a change of name and design.

So, what is so cool about the new ARPHA Writing Tool? Here it is:

  • New modern outlook and user-friendly design
  • All editing happens in the manuscript preview mode
  • Plug-in for mathematical formulas
  • Pre-submission technical validation, by automated tool and humans
  • Pre-submission external peer-review
  • Importing manuscripts through Application Programming Interface (API)

Those of you who have been using the PWT remember the two writing modes – Preview and Editing. Over the past two years, we’ve learned that switching between them can be tricky. With the AWT, there will be no more flipping between modes. The tool now contains only one editing mode – this means rich editing functions and direct visualisation of your changes and comments right in the article preview.

Besides, the AWT will take a step beyond biodiversity data publishing towards providing a large set of predefined, yet flexible article templates to allow the publication of most types of research outcomes. As the scope is broadening, we also strive to simplify and improve the user experience.

The AWT is all about user-friendliness. With the new intuitive design and more comprehensible functions, the system is fast to navigate and get used to. While making every effort to improve user experience, we made sure functions are straightforward and easy to discover.

The AWT makes collaborative work on a manuscript with co-authors or peers easier than ever. Mentors, pre-submission reviewers, linguistic or copy editors can now contribute to the manuscript side by side. The collaborative peer-review process provides easy communication thanks to a track-change function, comments and replies, as well as automated, but customisable email and social network notification tools.

The tool also provides authors with a two-step technical validation – the manuscript is examined for consistency automatically by the system, followed by a second check from our staff ahead of publication. After an article is published, the AWT also offers easy republication of updated article versions via the authoring tool.

Perhaps the most innovative feature of the AWT, however, is the new functionality to invite reviewers while the manuscript is still being written. This function is, to our knowledge, still globally unique: it allows authors to discuss manuscripts with their peers before submission, and consequently to submit the reviews together with the manuscript. If the editor approves the manuscript for publication based on the pre-submission review(s), the manuscript can be published just a few days after submission.

Go to the AWT now and test it yourself: http://arpha.pensoft.net/