DocFX Document Schema v1.0 Specification
1. Introduction
DocFX supports different document processors to handle different kinds of input. For now, if the data model changes a bit, a new document processor is needed, even most of the work in processors are the same.
DocFX Document Schema (abbreviated to THIS schema below) is introduced to address this problem. This schema is a JSON media type for defining the structure of a DocFX document. This schema is intended to annotate, validate and interpret the document data.
2. Conventions and Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
3. Overview
DocFX Document Schema is in JSON format. It borrows most syntax from JSON Schema, while it also introduces some other syntax to manipulate the data.
3.1 Annotation
THIS schema is a JSON based format for the structure of a DocFX document.
3.2 Validation
JSON schema validation already defines many keywords. This schema starts from supporting limited keyword like type
, properties
.
3.3 Interpretation
Besides annotate and validate the input document model, THIS schema also defines multiple interpretations for each property of the document model.
For example, a property named summary
contains value in Markdown format, THIS schema can define a markup
interpretation for the summary
property, so that the property can be marked using DFM syntax.
4. General Considerations
- THIS schema leverages JSON schema definition, that is to say, keywords defined in JSON schema keeps its meaning in THIS schema when it is supported by THIS schema.
5. Detailed Specification
Format
The files describing DocFX document model in accordance with the DocFX document schema specification are represented as JSON objects and conform to the JSON standards. YAML, being a superset of JSON, can be used as well to represent a DocFX document schema specification file.
All field names in the specification are case sensitive.
This schema exposes two types of fields. Fixed fields, which have a declared name, and Patterned fields, which declare a regex pattern for the field name. Patterned fields can have multiple occurrences as long as each has a unique name.
By convention, the schema file is suffixed with .schema.json
.
Data Types
Primitive data types in THIS schema are based on JSON schema
Schema
For a given field, *
as the starting character in Description cell stands for required.
Schema Object
This is the root document object for THIS schema.
Fixed Field
Field Name | Type | Description |
---|---|---|
$schema | string | * The version of the schema specification, for example, https://dotnet.github.io/docfx/schemas/v1.0/schema.json# . |
version | string | * The version of current schema object. |
id | string | It is best practice to include an id property as an unique identifier for each schema. |
title | string | The title of current schema, LandingPage , for example. In DocFX, this value can be used to determine what kind of documents apply to this schema, If not specified, file name before schema.json of this schema is used. Note that . is not allowed. |
description | string | A short description of current schema. |
type | string | * The type of the root document model MUST be object . |
properties | Property Definitions Object | An object to hold the schema of all the properties. |
metadata | string | In json-pointer format as defined in http://json-schema.org/latest/json-schema-validation.html#rfc.section.8.3.9. The format for JSON pointer is defined by https://tools.ietf.org/html/rfc6901, referencing to the metadata object. Metadata object is the object to define the metadata for current document, and can be also set through globalMetadata or fileMetadata in DocFX. The default value for metadata is empty which stands for the root object. |
Patterned Field
Field Name | Type | Description |
---|---|---|
^x- | Any | Allows extensions to THIS schema. The field name MUST begin with x-, for example, x-internal-id. The value can be null, a primitive, an array or an object. |
Property Definitions Object
It is an object where each key is the name of a property and each value is a schema to describe that property.
Patterned Field
Field Name | Type | Description |
---|---|---|
{name} | Property Object | The schema object for the {name} property |
Property Object
An object to describe the schema of the value of the property.
Fixed Field
Field Name | Type | Description |
---|---|---|
title | string | The title of the property. |
description | string | A lengthy explanation about the purpose of the data described by the schema. |
default | what type defined |
The default value for current field. |
type | string | The type of the root document model. Refer to type keyword for detailed description. |
properties | Property Definitions Object | An object to hold the schema of all the properties if type for the model is object . Omitting this keyword has the same behavior as an empty object. |
items | Property Object | An object to hold the schema of the items if type for the model is array . Omitting this keyword has the same behavior as an empty schema. |
reference | string | Defines whether current property is a reference to the actual value of the property. Refer to reference for detailed explanation. |
contentType | string | Defines the content type of the property. Refer to contentType for detailed explanation. |
tags | array | Defines the tags of the property. Refer to tags for detailed explanation. |
mergeType | string | Defines how to merge the property. Omitting this keyword has the same behavior as merge . Refer to mergeType for detailed explanation. |
xrefProperties | array | Defines the properties of current object when it is cross referenced by others. Each item is the name of the property in the instance. Refer to xrefProperties for detailed description of how to leverage this property. |
Patterned Field
Field Name | Type | Description |
---|---|---|
^x- | Any | Allows extensions to THIS schema. The field name MUST begin with x-, for example, x-internal-id. The value can be null, a primitive, an array or an object. |
6. Keywords in detail
6.1 type
Same as in JSON schema: http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.25
The value of this keyword MUST be either a string or an array. If it is an array, elements of the array MUST be strings and MUST be unique.
String values MUST be one of the six primitive types ("null", "boolean", "object", "array", "number", or "string"), or "integer" which matches any number with a zero fractional part.
An instance validates if and only if the instance is in any of the sets listed for this keyword.
6.2 reference
It defines whether current property is a reference to the actual value of the property. The values MUST be one of the following:
Value | Description |
---|---|
none |
It means the property is not a reference. |
file |
It means current property stands for a file path that contains content to be included. |
6.3 contentType
It defines how applications interpret the property. If not defined, the behavior is similar to default
value. The values MUST be one of the following:
Value | Description |
---|---|
default |
It means that no interpretion will be done to the property. |
uid |
type MUST be string . With this value, the property name MUST be uid . It means the property defines a unique identifier inside current document model. |
href |
type MUST be string . It means the property defines a file link inside current document model. Application CAN help to validate if the linked file exists, and update the file link if the linked file changes its output path. |
xref |
type MUST be string . It means the property defines a UID link inside current document model. Application CAN help to validate if the linked UID exists, and resolve the UID link to the corresponding file output link. |
file |
type MUST be string . It means the property defines a file path inside current document model. Application CAN help to validate if the linked file exists, and resolve the path to the corresponding file output path. The difference between file and href is that href is always URL encoded while file is not. |
markdown |
type MUST be string . It means the property is in DocFX flavored Markdown syntax. Application CAN help to transform it into HTML format. |
6.4 tags
The value of this keyword MUST be an array
, elements of the array MUST be strings and MUST be unique. It provides hints for applications to decide how to interpret the property, for example, localizable
tag can help Localization team to interpret the property as localizable.
6.5 mergeType
The value of this keyword MUST be a string. It specifies how to merge two values of the given property. One use scenario is how DocFX uses the overwrite files to overwrite the existing values. In the below table, we use source
and target
to stands for the two values for merging.
The value MUST be one of the following:
Value | Description |
---|---|
key |
If key for source equals to the one for target , these two values are ready to merge. |
merge |
The default behavior. For array , items in the list are merged by key for the item. For string or any value type, target replaces source . For object , merge each property along with its own merge value. |
replace |
target replaces source . |
ignore |
source is not allowed to be merged. |
6.6 xrefProperties
The value of this keyword MUST be an array of string
. Each string
value is the property name of current object that will be exported to be cross referenced by others.
To leverage this feature, a new xref
syntax with template
attribute is support:
<xref uid="{uid}" template="{path_of_partial_template}" />
For the parital template, the input model is the object containing properties xrefProperties
defines.
For example, in the sample schema defined by 7. Samples, "xrefProperties": [ "title", "description" ],
, title
and description
are xrefProperties
for uid webapp
. A partial template to render this xref, for example, named partials/overview.tmpl
, looks like:
{{title}}: {{{description}}}
When someone references this uid using <xref uid="webapp" template="partials/overview.tmpl"
, docfx
expand this xref
into the following html:
Web Apps Documentation: <p>This is description</p>
In this way, users can not only cross reference others to get the target url, but also cross reference other properties as they like.
A common usage of this is the Namespace page in ManagedReference. The Namespace page shows a table of its Classes with the summary
of the Class, with the help of xrefProperties
, the source of truth summary
is always from Class. For the Namespace page, it can, for example:
- Define a
class.tr.tmpl
template:<tr><td>{{name}}</td><td>{{{summary}}}</td></tr>
- The namespace
namespace.tmpl
template, usexref
to render its children classes:{{#children}} <xref uid="{{uid}}" template="class.tr.tmpl" /> {{/children}}
7. Samples
Here's an sample of the schema. Assume we have the following YAML file:
### YamlMime:LandingPage
title: Web Apps Documentation
description: This is description
uid: webapp
metadata:
title: Azure Web Apps Documentation - Tutorials, API Reference
meta.description: Learn how to use App Service Web Apps to build and host websites and web applications.
services: app-service
author: apexprodleads
manager: carolz
ms.service: app-service
ms.tgt_pltfrm: na
ms.devlang: na
ms.topic: landing-page
ms.date: 01/23/2017
ms.author: carolz
sections:
- title: 5-Minute Quickstarts
children:
- text: .NET
href: app-service-web-get-started-dotnet.md
- text: Node.js
href: app-service-web-get-started-nodejs.md
- text: PHP
href: app-service-web-get-started-php.md
- text: Java
href: app-service-web-get-started-java.md
- text: Python
href: app-service-web-get-started-python.md
- text: HTML
href: app-service-web-get-started-html.md
- title: Step-by-Step Tutorials
children:
- content: "Create an application using [.NET with Azure SQL DB](app-service-web-tutorial-dotnet-sqldatabase.md) or [Node.js with MongoDB](app-service-web-tutorial-nodejs-mongodb-app.md)"
- content: "[Map an existing custom domain to your application](app-service-web-tutorial-custom-domain.md)"
- content: "[Bind an existing SSL certificate to your application](app-service-web-tutorial-custom-SSL.md)"
In this sample, we want to use the JSON schema to describe the overall model structure. Further more, the href
is a file link. It need to be resolved from the relative path to the final href. The content
property need to be marked up as a Markdown string. The metadata
need to be tagged for further custom operations. We want to use section
's title
as the key for overwrite section
array.
Here's the schema to describe these operations:
{
"$schema": "https://dotnet.github.io/docfx/schemas/v1.0/schema.json#",
"version": "1.0.0",
"id": "https://github.com/dotnet/docfx/schemas/landingpage.schema.json",
"title": "LandingPage",
"description": "The schema for landing page",
"type": "object",
"xrefProperties": [ "title", "description" ],
"properties": {
"metadata": {
"type": "object",
"tags": [ "metadata" ]
},
"uid": {
"type": "string",
"contentType": "uid"
},
"sections": {
"type": "array",
"items": {
"type": "object",
"properties": {
"children": {
"type": "array",
"items": {
"type": "object",
"properties": {
"href": {
"type": "string",
"contentType": "href"
},
"text": {
"type": "string",
"tags": [ "localizable" ]
},
"content": {
"type": "string",
"contentType": "markdown"
}
}
}
},
"title": {
"type": "string",
"mergeType": "key"
}
}
}
},
"title": {
"type": "string"
}
}
}
8. Q & A
- DocFX fills
_global
metadata into the processed data model, should the schema reflect this behavior?- Decision: NOT include, this schema is for input model, use another schema for output model.
- Is it necessary to prefix
d-
to every field that DocFX introduces in?- If keep
d-
- Pros:
d-
makes it straightforward that these keywords are introduced by DocFX- Keywords DocFX introduces in will never duplicate with the one preserved by JSON schema
- Cons:
d-
prefix provides a hint that these keywords are not first class keywords- Little chance that keywords DocFX defines duplicate with what JSON schema defines, after all, JSON schema defines a finite set of reserved keywords.
- For exampleSwagger spec is also based on JSON schema and the fields it introduces in has no prefix.
- Pros:
- Decision: Remove
d-
prefix.
- If keep