This specification describes SciFed, a standard for federating scientific activities and content using ActivityPub. It is intended to be used alongside the [[[ActivityStreams-Vocabulary]]] and provides a vocabulary of the activity types needed for scientific exchanges. The vocabulary is meant to be used in [[ActivityPub]] exchanges, for which we make dialectal recommendations that merge [[[ActivityStreams-Vocabulary]]] with the `` vocabulary.
### Author's Note

This draft is heavily influenced by the [[ActivityStreams-Vocabulary]], and uses part of `` as its object types. The author is very thankful for their significant contributions and gladly stands on their shoulders.
The provided namespace is not deployed (see Namespaces).
# Introduction

The SciFed Vocabulary relies on [[[ActivityStreams-Vocabulary]]] and extends it with a set of abstract properties that describe past, present and future Activities. The standard is meant as a modern replacement for [[OAI-PMH]], and provides functional parity with it. All SciFed implementations are expected to implement support for the Core Types.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [[RFC2119]].

This specification uses the common term URI to mean both IRI [[RFC3987]] and URI [[RFC3986]]. JSON-LD natively supports IRIs without any special measures.
## Namespaces

SciFed uses parts of the following namespaces:
(default namespace used as base URI)
## Conventions

The examples included in this document use the normative JSON serialization defined by the ActivityStreams Core specification.
# Core Types

The Core Types provide the minimum requirement to replicate the basic functionalities of a data repository, relying solely on ActivityPub. They are meant as a replacement for OAI-PMH's Item and Record notions.

Base URI: ``.

The SciFed Core Types include Object subtypes that are mapped to common scientific applications.

Implementations are expected to use properties that make sense within the specific context and requirements of their applications. They MUST, however, avoid using extension types or properties that unduly overlap with or duplicate the vocabulary defined here or in the [[[ActivityStreams-Vocabulary]]] that this document already extends.
## Activity Types

All Activity Types inherit the properties of the base Activity type. Some specific Activity Types are subtypes or specializations of more generalized Activity Types. The Activity Types include:
Class Description Properties
Read URI:

A modification of the `Read` type that adds the ability to review part or all of an Object, typically a Corpus object.

Extends: Read
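
As an illustration of the Read entry above, here is a minimal sketch of how such an activity might be serialized. Several things are assumptions, not part of this draft: the `https://example.org/scifed#` namespace is a hypothetical placeholder (the SciFed namespace is not deployed), the review text is carried in the standard ActivityStreams `content` property, and the base `Read` type is listed alongside the extended one so that plain ActivityStreams consumers can still interpret the activity.

```python
import json

# Hypothetical placeholder: the SciFed namespace is not deployed.
SCIFED_NS = "https://example.org/scifed#"


def make_read_review(actor, corpus_id, review_text):
    """Build a Read activity that reviews a Corpus (illustrative only)."""
    return {
        "@context": [
            "https://www.w3.org/ns/activitystreams",
            {"scifed": SCIFED_NS},
        ],
        # Assumption: keep the base ActivityStreams type next to the
        # extended one so generic consumers still recognize the activity.
        "type": ["Read", "scifed:Read"],
        "actor": actor,
        "object": corpus_id,
        # Assumption: the review text goes in the standard `content` property.
        "content": review_text,
    }


activity = make_read_review(
    "https://repo.example.org/users/alice",
    "https://repo.example.org/corpora/42",
    "The transcription conventions are documented clearly.",
)
print(json.dumps(activity, indent=2))
```

The draft does not list properties for this type, so the choice of `content` here is only one plausible option.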


## Object and Link types

All Object Types inherit the properties of the base Object and `schema:CreativeWork` types. Non-dataset documents should extend `schema:CreativeWork`, of which `schema:Dataset` is a subclass. The Object Types and Link Types include:
Class Description Properties
Corpus URI:

An ensemble of files or links, usually described with additional metadata.

Extends: Object | schema:Dataset
Properties: inLanguage | aboutLanguage | audiences. Inherits all properties from Object and schema:Dataset. Extends what attachment can express; see attachment.

Metadata URI:

A specialized Link that represents metadata for a document, usually used in a Corpus object's `attachment` or `url` field.

Extends: Link
Properties: Inherits all properties from Link, with the addition of `prov:alternateOf` | `oai:metadataPrefix` | `oai:metadataNamespace` | `oai:schema` | `http:RequestHeader`
WikiData URI:

An alternative to Hashtag that represents a WikiData entry.

Extends: Link
Properties: Inherits all properties from Link.
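
The three types above could be combined as in the following sketch, which builds a Corpus carrying one Metadata link in `url`, one file in `attachment`, and one WikiData tag. This is an illustration under stated assumptions rather than a normative example: the `scifed` namespace URI is a hypothetical placeholder, the prefix mappings in the context are assumed, all `repo.example.org` URLs are invented, and the helper `of_type` is introduced only for this example.

```python
import json

# Hypothetical placeholder: the SciFed base URI is not given in this draft.
SCIFED_NS = "https://example.org/scifed#"

corpus = {
    "@context": [
        "https://www.w3.org/ns/activitystreams",
        {
            "scifed": SCIFED_NS,
            # Assumed prefix mappings for the external vocabularies.
            "schema": "http://schema.org/",
            "oai": "http://www.openarchives.org/OAI/2.0/",
        },
    ],
    "type": "scifed:Corpus",
    "id": "https://repo.example.org/corpora/42",
    "name": "Seismic recordings, 2019 campaign",
    # Metadata about the Corpus itself lives in `url`.
    "url": [
        {
            "type": "scifed:Metadata",
            "href": "https://repo.example.org/corpora/42/oai_dc.xml",
            "mediaType": "text/xml",
            "oai:metadataPrefix": "oai_dc",
        }
    ],
    # Files belonging to the Corpus live in `attachment`.
    "attachment": [
        {
            "type": "Document",
            "mediaType": "audio/wav",
            "url": "https://repo.example.org/corpora/42/files/station-01.wav",
        }
    ],
    # A WikiData tag in place of a free-text Hashtag (Q2 is "Earth").
    "tag": [
        {
            "type": "scifed:WikiData",
            "href": "https://www.wikidata.org/wiki/Q2",
            "nameMap": {"en": "Earth", "fr": "Terre"},
        }
    ],
}


def of_type(items, type_name):
    """Return the entries of a list-valued property that have the given type."""
    return [item for item in items if item.get("type") == type_name]


metadata_links = of_type(corpus["url"], "scifed:Metadata")
wikidata_tags = of_type(corpus["tag"], "scifed:WikiData")
print(json.dumps(corpus, indent=2))
```

Using the standard `nameMap` property on the tag is one way to carry per-language labels while the `href` stays the same for every language.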
# Properties

Base URI: ``.

The "Domain" indicates the type of Object the property term applies to. The "Range" indicates the type of value the property term can have. Certain properties are marked as a "Subproperty Of" another term, meaning that the term is a specialization of the referenced term. Properties marked as being "Functional" can have only one value. Items not marked as "Functional" can have multiple values.
Term Description Example
attachment URI: `@attachment`
Notes: A modified `attachment`, with potential `prov` and `pav` properties. Supports Collections and other Corpus objects as attachments.
Domain: Object
Range: Object
inLanguage URI: `@inLanguage`
Notes: Language of the described resource, represented using a [[RFC5646]]-controlled vocabulary.
Domain: Object
Range: `xsd:string` | `rdf:langString`
aboutLanguage URI: `@aboutLanguage`
Notes: List of languages that the described resource is about, if different from `@inLanguage`, represented using a [[RFC5646]]-controlled vocabulary.
Domain: Object
Range: @set of `xsd:string` | `rdf:langString`
audiences URI: `@audiences`
Notes: List of domains/audiences the resource is targeted at, if any. Some platforms will want to see only scientific datasets, for instance, and it would be superfluous to use the `tag` property to describe the Dataset as 'scientific'.

Range: @set of `schema:Audience`.

Intended `audienceType` values are "scientific", "artistic", "politic", with their respective WikiData identifiers.
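
To make the language and audience properties more concrete, the sketch below assembles their values for a French-language resource about Breton. It rests on assumptions: the tag check is a deliberately simplified well-formedness test and not a full [[RFC5646]] parser, the `schema:` prefix is assumed to be mapped in the JSON-LD context, and the function names are hypothetical. Dropping the `inLanguage` value from `aboutLanguage` is one reading of "if different from `@inLanguage`".

```python
import re

# Simplified well-formedness check, NOT a full RFC 5646 parser:
# a 2-3 letter primary subtag plus optional 1-8 character subtags.
_LANG_TAG = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z0-9]{1,8})*$")


def looks_like_language_tag(tag):
    """Return True if the string has the rough shape of a language tag."""
    return bool(_LANG_TAG.match(tag))


def describe_resource(in_language, about_languages, audience_types):
    """Build the inLanguage, aboutLanguage and audiences values."""
    bad = [
        tag
        for tag in [in_language, *about_languages]
        if not looks_like_language_tag(tag)
    ]
    if bad:
        raise ValueError(f"not shaped like language tags: {bad}")
    return {
        "inLanguage": in_language,
        # Assumed reading: only list languages that differ from inLanguage.
        "aboutLanguage": sorted(set(about_languages) - {in_language}),
        # Assumption: the `schema:` prefix is mapped in the context.
        "audiences": [
            {"type": "schema:Audience", "schema:audienceType": kind}
            for kind in audience_types
        ],
    }


properties = describe_resource("fr", ["br", "fr"], ["scientific"])
print(properties)
```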

# Implementation Notes
## Object purpose rather than object origin

You can specify the general nature of the `schema:CreativeWork` via the `audiences` attribute, and are encouraged to do so, especially if the content is scientific. This makes it possible to clearly state the domain(s) of interest, and lets those interested in only one of them (e.g. institutional repositories and aggregators) filter out the others without relying solely on the origin of an object, and thus favor *a-posteriori* moderation.
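
The filtering described above might look like the following sketch on the receiving side. It assumes objects arrive with an `audiences` value shaped as a list of `schema:Audience` entries keyed by `schema:audienceType`; the function names and the sample inbox are hypothetical.

```python
def audience_types(obj):
    """Collect the audienceType values declared on an object."""
    audiences = obj.get("audiences", [])
    if isinstance(audiences, dict):
        # Tolerate a single value that was not wrapped in a list.
        audiences = [audiences]
    return {
        entry["schema:audienceType"]
        for entry in audiences
        if isinstance(entry, dict) and entry.get("schema:audienceType")
    }


def keep_for_audience(objects, wanted):
    """Keep only the objects that declare the wanted audience type."""
    return [obj for obj in objects if wanted in audience_types(obj)]


# Hypothetical inbox content from several origins.
inbox = [
    {
        "id": "https://repo.example.org/corpora/42",
        "audiences": [{"schema:audienceType": "scientific"}],
    },
    {
        "id": "https://social.example.net/notes/7",
        "audiences": [{"schema:audienceType": "artistic"}],
    },
    {"id": "https://social.example.net/notes/8"},
]

scientific_only = keep_for_audience(inbox, "scientific")
print([obj["id"] for obj in scientific_only])
```

Note that the filter looks only at the declared purpose of each object and never at the host it came from.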
## Object Type Motivating Use Cases

Corpora typically hold a lot of information about themselves and about the individual files they contain. They are versatile containers for all types of data, designed with scientists' needs as requirements, and can also be used for non-scientific activities that require this versatility.

Among the problems tackled is that of the existing metadata descriptions held in OAI-PMH repositories. These are often vastly more descriptive than the base ActivityStreams Object, which has fewer descriptive attributes than, say, Dublin Core or CMDI. The Object does, however, have a `url` attribute holding information about the Object itself, whereas `attachment` holds files and potentially metadata about some of these files. This allows us to select `url` as the place where metadata currently shared by OAI-PMH implementations can be held. In that sense, no information is lost while transitioning to, or complementing with, an ActivityPub implementation.

Hashtags, and tags in general, as used in most institutional repositories suffer from variant spellings and a lack of internationalization. Using alternative tags that rely on WikiData identifiers allows multiple representations depending on the selected language, while keeping the semantic meaning and the uniqueness needed for filtering.
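
As a sketch of how existing OAI-PMH metadata could be exposed through `url`, the function below turns the metadata formats advertised by a repository into Metadata links that point at the corresponding `GetRecord` requests. The repository base URL, the record identifier and the function name are hypothetical, and the `scifed:` and `oai:` prefixes are assumed to be mapped in the JSON-LD context; the `GetRecord` query parameters and the `oai_dc` schema and namespace URIs are those of OAI-PMH 2.0.

```python
from urllib.parse import urlencode


def metadata_links(base_url, identifier, formats):
    """Build one Metadata link per metadata format offered for a record."""
    links = []
    for fmt in formats:
        query = urlencode(
            {
                "verb": "GetRecord",
                "identifier": identifier,
                "metadataPrefix": fmt["metadataPrefix"],
            }
        )
        links.append(
            {
                # Assumption: prefixes are mapped in the JSON-LD context.
                "type": "scifed:Metadata",
                "href": f"{base_url}?{query}",
                "mediaType": "text/xml",
                "oai:metadataPrefix": fmt["metadataPrefix"],
                "oai:metadataNamespace": fmt["metadataNamespace"],
                "oai:schema": fmt["schema"],
            }
        )
    return links


# The kind of information a ListMetadataFormats response provides.
formats = [
    {
        "metadataPrefix": "oai_dc",
        "schema": "http://www.openarchives.org/OAI/2.0/oai_dc.xsd",
        "metadataNamespace": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    }
]

links = metadata_links(
    "https://repo.example.org/oai",  # hypothetical repository endpoint
    "oai:repo.example.org:42",  # hypothetical record identifier
    formats,
)
print(links[0]["href"])
```

The resulting list can be used as the `url` value of a Corpus, so a consumer that understands Dublin Core or CMDI can still fetch the full record.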