This runbook scans a local directory for markdown files, converts each one to Confluence-compatible HTML using pandoc, and creates or updates the corresponding Confluence page. It extracts page titles from the first # heading in each file and applies tracking labels automatically. Ideal for keeping Git-based docs in sync with Confluence.
It solves the recurring problem of documentation drift: when engineering teams write docs in markdown next to their code but stakeholders read those docs in Confluence, the two copies fall out of step. Run this script as the last step of a docs change (locally before a release, or automatically in CI on every merge to a docs branch) so that Confluence reflects what is in version control. Because it searches by page title and overwrites the body in place rather than creating duplicates, repeated runs are idempotent with respect to content: a file that has not changed converts to the same HTML, so re-publishing leaves the page body as it was. The script still issues an update call for every file, so Confluence may record a new page version even when nothing changed. For the full command set used here, see the command reference and the broader Confluence CLI guide.
atlassian-cli installed (install guide)
atlassian-cli auth login
pandoc installed for markdown-to-HTML conversion
jq installed for JSON processing
# Dry-run: see which pages would be created/updated
DRY_RUN=true ./doc-pipeline.sh DOCS prod
# Sync docs directory to DOCS space
./doc-pipeline.sh DOCS prod
# Sync a custom docs directory
DOCS_DIR="./my-docs" ./doc-pipeline.sh DOCS prod
#!/bin/bash
# Automated Documentation Pipeline: Markdown -> Confluence
#
# This script syncs markdown documentation from a Git repository to Confluence.
# It converts markdown files to Confluence storage format and creates/updates pages.
#
# Usage:
# ./doc-pipeline.sh DOCS prod   (positional: space key, then auth profile)
#
# Requirements:
# - atlassian-cli installed and configured
# - pandoc (for markdown -> HTML conversion)
# - jq (for JSON processing)
set -euo pipefail
# Configuration
SPACE_KEY="${1:-DOCS}"
PROFILE="${2:-default}"
DOCS_DIR="${DOCS_DIR:-./docs}" # Overridable from the environment, e.g. DOCS_DIR="./my-docs"
DRY_RUN="${DRY_RUN:-false}"
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
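# log and warn write to stderr so that functions whose stdout is captured
# (for example get_or_create_page) return only the page ID.
log() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $*" >&2
}
error() {
echo -e "${RED}[ERROR]${NC} $*" >&2
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $*" >&2
}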
# Check dependencies
check_dependencies() {
local missing=0
for cmd in atlassian-cli pandoc jq; do
if ! command -v "$cmd" &> /dev/null; then
error "Required command not found: $cmd"
missing=$((missing + 1))
fi
done
if [ $missing -gt 0 ]; then
error "$missing required dependencies missing. Please install them first."
exit 1
fi
}
# Convert markdown to Confluence storage format
md_to_confluence() {
local md_file="$1"
# Use pandoc to convert markdown to HTML
pandoc -f markdown -t html "$md_file" | \
# Basic cleanup for Confluence
sed 's/<h1>/<h1 style="margin-top: 20px;">/g' | \
sed 's/<code>/<code class="code-inline">/g'
}
# Get or create page by title
get_or_create_page() {
local space_key="$1"
local title="$2"
local parent_id="${3:-}"
# Search for existing page
local page_id
page_id=$(atlassian-cli confluence search cql \
--output json \
"space = $space_key AND title = \"$title\"" 2>/dev/null | \
jq -r '.results[0].content.id // empty')
if [ -n "$page_id" ]; then
echo "$page_id"
return
fi
# Create new page if not found
if [ "$DRY_RUN" = "true" ]; then
warn "[DRY-RUN] Would create page: $title"
echo "DRY_RUN_PAGE_ID"
return
fi
log "Creating new page: $title"
local create_args=(
"confluence" "page" "create"
"--profile" "$PROFILE"
"--space" "$space_key"
"--title" "$title"
)
if [ -n "$parent_id" ]; then
create_args+=("--parent" "$parent_id")
fi
atlassian-cli "${create_args[@]}" --output json | jq -r '.id'
}
# Update page content
update_page() {
local page_id="$1"
local title="$2"
local content="$3"
if [ "$DRY_RUN" = "true" ]; then
warn "[DRY-RUN] Would update page $page_id: $title"
return
fi
# Save content to temp file
local temp_file
temp_file=$(mktemp)
echo "$content" > "$temp_file"
log "Updating page $page_id: $title"
atlassian-cli confluence page update \
--profile "$PROFILE" \
"$page_id" \
--title "$title" \
--body "$temp_file"
rm -f "$temp_file"
}
# Add labels to page
add_labels() {
local page_id="$1"
shift
local labels=("$@")
if [ "$DRY_RUN" = "true" ]; then
warn "[DRY-RUN] Would add labels to $page_id: ${labels[*]}"
return
fi
for label in "${labels[@]}"; do
log "Adding label '$label' to page $page_id"
atlassian-cli confluence page add-label \
--profile "$PROFILE" \
"$page_id" \
"$label" || warn "Failed to add label: $label"
done
}
# Main pipeline
main() {
log "Starting documentation pipeline"
log "Space: $SPACE_KEY | Profile: $PROFILE | Dry-run: $DRY_RUN"
check_dependencies
# Find all markdown files
if [ ! -d "$DOCS_DIR" ]; then
error "Documentation directory not found: $DOCS_DIR"
exit 1
fi
local processed=0
local failed=0
# Process each markdown file
while IFS= read -r md_file; do
log "Processing: $md_file"
# Extract title from first heading
local title
title=$(grep -m 1 '^# ' "$md_file" | sed 's/^# //' || echo "$(basename "$md_file" .md)")
# Convert to Confluence format
local content
content=$(md_to_confluence "$md_file")
# Get or create page
local page_id
page_id=$(get_or_create_page "$SPACE_KEY" "$title")
if [ -z "$page_id" ]; then
error "Failed to get/create page for: $title"
failed=$((failed + 1))
continue
fi
# Update page content
update_page "$page_id" "$title" "$content"
# Add auto-generated label
add_labels "$page_id" "auto-generated" "documentation"
processed=$((processed + 1))
done < <(find "$DOCS_DIR" -name "*.md" -type f)
log "Pipeline complete: $processed processed, $failed failed"
if [ $failed -gt 0 ]; then
exit 1
fi
}
main "$@"
The pipeline is deliberately stateless — there is no local database tracking which page maps to which file. Instead, the markdown file's first H1 becomes the canonical page title, and a Confluence CQL query (space = KEY AND title = "...") is the lookup key. This means the "diff" the script performs is not a line-level comparison; it is an upsert. If a matching title is found, the converted body replaces the existing page content via confluence page update; if not, confluence page create makes a new page. The conversion itself is done by pandoc -f markdown -t html, after which two lightweight sed passes adjust the output for Confluence's storage format (adding top margin to h1 elements and a class to inline code). Pandoc handles fenced code blocks, links, and most tables cleanly, though complex GitHub-flavored markdown tables and raw HTML can need manual review. Setting DRY_RUN=true makes every create, update, and label step print what it would do without calling the API, which is the safe way to validate a new docs directory. The numbered steps below trace one full pass over the docs tree.
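Because the H1 title is the only lookup key, two markdown files that share a first heading would resolve to the same Confluence page and overwrite each other. The sketch below is one way to catch that before a sync. `find_duplicate_titles` is a hypothetical helper, not part of doc-pipeline.sh, and it assumes the same title rule as the script: the first `# ` heading, falling back to the file name.

```shell
#!/usr/bin/env bash
# Sketch: print every page title that more than one markdown file would claim.
# find_duplicate_titles is a hypothetical helper, not part of doc-pipeline.sh.
find_duplicate_titles() {
    local docs_dir="$1"
    find "$docs_dir" -name '*.md' -type f | while IFS= read -r md_file; do
        # Same rule as the pipeline: first "# " heading, else the file name.
        title=$(grep -m 1 '^# ' "$md_file" | sed 's/^# //' || true)
        if [ -z "$title" ]; then
            title=$(basename "$md_file" .md)
        fi
        printf '%s\n' "$title"
    done | sort | uniq -d
}

# Example (commented out so the block has no side effects when sourced):
# find_duplicate_titles ./docs
```

An empty result means every file maps to a distinct title. Any printed title is worth resolving before running the real sync.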
1. Check dependencies. Validates that atlassian-cli, pandoc, and jq are all available on the system PATH before proceeding.
2. Discover markdown files. Recursively scans the configured docs directory (default ./docs) for all .md files. Each file becomes one Confluence page.
3. Convert to Confluence format. Pipes each markdown file through pandoc -f markdown -t html, then applies basic cleanup for Confluence storage format compatibility.
4. Create or update pages. Searches Confluence by title to find existing pages. If found, updates the content. If not, creates a new page in the target space. Titles are extracted from the first # heading in each file.
5. Apply tracking labels. Tags each synced page with auto-generated and documentation labels so you can easily identify and filter machine-synced content in Confluence.
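To see what the cleanup in the conversion step does without installing pandoc, the two sed substitutions from md_to_confluence can be run on a hand-written HTML sample. This is a minimal sketch: `confluence_cleanup` is a hypothetical name for the same two substitutions, and the sample input is made up. One thing worth checking against your pandoc version: pandoc usually adds an id attribute to headings (`<h1 id="...">`), and the literal `<h1>` pattern would leave such headings untouched.

```shell
#!/usr/bin/env bash
# Sketch: the two sed passes from md_to_confluence, isolated as a stdin filter.
# confluence_cleanup is a hypothetical name used only for this illustration.
confluence_cleanup() {
    sed -e 's/<h1>/<h1 style="margin-top: 20px;">/g' \
        -e 's/<code>/<code class="code-inline">/g'
}

# Example (commented out so the block has no side effects when sourced):
# printf '<h1>Title</h1>\n<p>Run <code>make</code></p>\n' | confluence_cleanup
```

Piping real pandoc output through the filter and diffing the result against the input shows which of the two substitutions actually fire for a given docs tree.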
atlassian-cli auth login (or pass the correct --profile) and confirm the token has space write access.
Required command not found: pandoc. Pandoc is not on the PATH; install it with brew install pandoc (macOS) or apt-get install pandoc (Debian/Ubuntu).
SPACE_KEY argument and the parent page ID before re-running.
sleep between files; the CLI also retries transient 429s automatically.
pandoc -f gfm -t html.
Because the sync is idempotent and supports a dry-run, it runs cleanly in CI: gate it on changes under your docs/ path and publish on every merge so Confluence never lags behind the repository. Store the API token as an encrypted secret and pass it through the auth profile. The GitHub Actions workflow below runs the runbook whenever files under docs/ change on the main branch:
# .github/workflows/confluence-sync.yml
name: Sync docs to Confluence
on:
  push:
    branches: [main]
    paths:
      - 'docs/**'
jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: sudo apt-get update && sudo apt-get install -y pandoc jq
      - name: Install atlassian-cli
        run: pipx install atlassian-cli
      - name: Sync markdown to Confluence
        env:
          ATLASSIAN_API_TOKEN: ${{ secrets.ATLASSIAN_API_TOKEN }}
        run: ./doc-pipeline.sh DOCS prod
GitLab CI achieves the same result with a job that uses an only: changes: ["docs/**"] rule in place of the on.push.paths trigger, and a masked CI/CD variable for the token in place of the Actions secret.
Can it delete Confluence pages? No. The script only creates or updates pages — it never calls a delete endpoint, so removing a markdown file does not remove the corresponding Confluence page. Use the bulk cleanup runbook for deletions.
Does it support nested pages or multiple spaces? All discovered files sync into one target space as a flat set of pages. get_or_create_page accepts an optional parent page ID as its third argument, but main does not pass one, so nesting requires changing that call. Multi-space sync means running the script once per space key.
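Running the script once per space key can be wrapped in a small loop. The sketch below assumes the positional interface shown in the quick start (space key, then profile). `sync_spaces` and the `PIPELINE` override are hypothetical names introduced here for illustration and for testing; they are not part of doc-pipeline.sh.

```shell
#!/usr/bin/env bash
# Sketch: run the pipeline once per space key and report an overall status.
# sync_spaces and PIPELINE are hypothetical, not part of doc-pipeline.sh.
sync_spaces() {
    local profile="$1"
    shift
    local pipeline="${PIPELINE:-./doc-pipeline.sh}"
    local space
    local rc=0
    for space in "$@"; do
        # Keep going after a failure so one bad space does not block the rest.
        if ! "$pipeline" "$space" "$profile"; then
            echo "sync failed for space: $space" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example (commented out so the block has no side effects when sourced):
# sync_spaces prod DOCS ENG
```

Note that every space receives the same docs directory in this sketch. Giving each space its own DOCS_DIR would take one more parameter per call.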
How does it detect changes? It does not diff line by line. It looks up each page by its H1 title via CQL and upserts the converted body, so unchanged markdown re-publishes the same HTML and the page body stays as it was. Keep an offline copy with the Confluence backup runbook before large syncs.
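Since the lookup embeds the title directly in a CQL string, a title that contains a double quote would produce a malformed query. The sketch below shows one way to build the query defensively. It assumes CQL string literals accept backslash-escaped quotes, which is worth confirming against the Confluence CQL documentation. `cql_escape` and `build_title_query` are hypothetical helpers, not functions from the script or the CLI.

```shell
#!/usr/bin/env bash
# Sketch: escape a page title before embedding it in a CQL string literal.
# Assumption: CQL accepts \" and \\ inside double-quoted strings.
# cql_escape and build_title_query are hypothetical helpers.
cql_escape() {
    # Escape backslashes first, then double quotes.
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

build_title_query() {
    local space_key="$1"
    local title="$2"
    printf 'space = %s AND title = "%s"\n' "$space_key" "$(cql_escape "$title")"
}

# Example (commented out so the block has no side effects when sourced):
# build_title_query DOCS 'Say "hi"'
```

The resulting string could then be passed as the query argument in get_or_create_page in place of the inline interpolation.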