Configure checks for a collection

You have a directory of markdown files and want Katalyst to enforce checks on them. This guide adds a collection and attaches checks to it.

1. Point a collection at the directory

Collections are declared inside a storage instance. In a fresh project, that instance is .katalyst/storage/local.yaml (the default filesystem instance). Add the collection under collections:, keyed by its name; path is the collection's directory, relative to the instance root:

# .katalyst/storage/local.yaml
type: filesystem
root: .
collections:
  posts:
    path: content/posts

If you omit path, the directory defaults to the collection name. If you omit pattern (the glob that selects which files are items), it defaults to *.md.
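
Spelled out, those defaults correspond to the sketch below. It assumes that writing a default value explicitly behaves the same as omitting the key; the collection is still incomplete until it has checks (step 2).

```yaml
# .katalyst/storage/local.yaml
type: filesystem
root: .
collections:
  posts:
    path: posts       # what an omitted path resolves to: the collection name
    pattern: "*.md"   # what an omitted pattern resolves to
```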

2. Attach checks

Add a checks list. Each entry names a kind and supplies the keys that kind requires; see the check types reference for every check type:

# .katalyst/storage/local.yaml
type: filesystem
root: .
collections:
  posts:
    path: content/posts
    checks:
      - kind: markdown_requires_h1
      - kind: markdown_title_matches_h1
        field: title
      - kind: filesystem_name_case
        style: kebab

A collection must have at least one check: either a schema (see Add a schema) or a non-empty checks list.
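
As a sketch of the schema-only case, the collection below has no checks list and relies on a schema alone. The schema name page is a placeholder borrowed from the variants example later in this guide; defining the schema itself is covered in Add a schema.

```yaml
# .katalyst/storage/local.yaml
type: filesystem
root: .
collections:
  posts:
    path: content/posts
    schema: page   # placeholder name; the schema must exist (see Add a schema)
```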

3. Run it

katalyst check posts

Each item prints OK or a path:line: /pointer: message violation. Files in content/posts that do not match the pattern are reported as errors, so nothing is silently skipped. With a conforming hello-world.md beside a mis-named Bad_Title.md whose title and H1 disagree, the run reports each failing check and exits 1:

<project>/content/posts/hello-world.md: OK
<project>/content/posts/Bad_Title.md:4: /title: "Bad title" does not match first H1 "A different heading"
<project>/content/posts/Bad_Title.md: /: filename "Bad_Title" must be kebab-case
exit status 1

Lint the body as text

The checks above read frontmatter and filenames. To lint the body itself as raw text, regardless of markdown structure, use the text_* rules. Each takes a regex pattern (or a list of literal values) and an optional target selecting which slice of the body to test (body, line, first-line, matched-lines):

checks:
  # No line may contain "TODO".
  - kind: text_forbids
    target: line
    pattern: '\bTODO\b'
  # The body must mention "Sources" somewhere.
  - kind: text_requires
    pattern: Sources
  # Ban a set of literal markers (regex metacharacters are inert).
  - kind: text_denylist
    values: [FIXME, XXX]

Because text rules read only the body, they also lint plain-text items (a .txt file, or a markdown file with no frontmatter), so a collection with pattern: "*.txt" works the same way.
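
A plain-text collection might look like the sketch below. The collection name notes and its directory are hypothetical; the keys and check kinds are the ones already shown in this guide.

```yaml
# .katalyst/storage/local.yaml
type: filesystem
root: .
collections:
  notes:                  # hypothetical collection name
    path: notes           # hypothetical directory of .txt files
    pattern: "*.txt"
    checks:
      # No line may contain "TODO".
      - kind: text_forbids
        target: line
        pattern: '\bTODO\b'
      # Ban a set of literal markers.
      - kind: text_denylist
        values: [FIXME, XXX]
```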

A text_forbids rule may declare a fix: a replacement template ($1, ${name} capture syntax) applied to the matched text by katalyst fix. This one drops a trailing period from the first body line:

checks:
  - kind: text_forbids
    target: first-line
    pattern: '\.(\s*)$'
    fix: '$1'
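
The ${name} form can be sketched the same way. This example assumes the regex engine accepts (?P<name>...) named groups, which the guide does not confirm, and that a fix may be combined with target: line; the group name suffix is arbitrary.

```yaml
checks:
  # Forbid the spelling "colour" or "colours" as a whole word.
  - kind: text_forbids
    target: line
    # Assumed syntax: (?P<suffix>...) captures an optional plural "s".
    pattern: '\bcolour(?P<suffix>s?)\b'
    # Rewrites the matched word to "color", keeping any captured "s".
    fix: 'color${suffix}'
```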

Apply different checks per page type

When one collection holds more than one kind of item, for example a Hugo content tree where section landing pages (_index.md, carrying bookCollapseSection) sit beside ordinary content pages, use variants to give each kind its own checks. Each variant’s when is a metadata predicate (the same grammar as item list --filter); an item runs the base checks plus the checks of the first variant whose when matches.

pages:
  path: docs/content
  pattern: "**/*.md"
  schema: page                    # base: every page needs a title
  variants:
    # Content pages must declare their sort weight; section landing pages
    # (for which this `when` is false) are exempt and run the base alone.
    - when: "!bookCollapseSection"
      checks:
        - kind: object_required_field
          field: weight

Put a check in a variant, not the base, exactly when some page type must skip it. To require that every item match some variant, add useExhaustiveVariants: true; an unmatched item then fails with "matches no variant". Discrimination is by frontmatter only; selecting items by path is not supported yet.
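
An exhaustive setup might look like the sketch below. Three things here are assumptions rather than documented behavior: that useExhaustiveVariants sits on the collection beside variants, that a bare field name is a valid when predicate, and that markdown_requires_h1 is a sensible check for landing pages (it is used only for illustration).

```yaml
pages:
  path: docs/content
  pattern: "**/*.md"
  schema: page
  useExhaustiveVariants: true        # assumed placement: beside variants
  variants:
    # Section landing pages (assumes a bare field name is a valid predicate).
    - when: "bookCollapseSection"
      checks:
        - kind: markdown_requires_h1
    # Ordinary content pages must declare their sort weight.
    - when: "!bookCollapseSection"
      checks:
        - kind: object_required_field
          field: weight
```

Because the two predicates are complements, every item matches one of them; the flag then acts as a guard if a predicate is later narrowed and some pages stop matching.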