
Ingesting external sources

cv-corpus ingest asks each registered ingester whether it accepts() the input. The accepting ingester parses the input, and its output is appended to the corpus.
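The dispatch described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the accepts() name comes from this page, but the ingest() method, the record shape, the registry list, and the first-match-wins rule are hypothetical. The real Ingester protocol is defined in Plugin authoring.

```python
from pathlib import Path
from typing import Protocol


class Ingester(Protocol):
    """Hypothetical shape of an ingester; see Plugin authoring for the real one."""

    name: str

    def accepts(self, source: Path) -> bool: ...

    def ingest(self, source: Path) -> list[dict]: ...


class MarkdownIngester:
    name = "markdown"

    def accepts(self, source: Path) -> bool:
        # Only the file extension is checked here; frontmatter is not inspected.
        return source.suffix.lower() in {".md", ".markdown"}

    def ingest(self, source: Path) -> list[dict]:
        # Placeholder record; a real ingester would parse the file contents.
        return [{"ingester": self.name, "source": str(source)}]


class ZipIngester:
    name = "linkedin_export"

    def accepts(self, source: Path) -> bool:
        return source.suffix.lower() == ".zip"

    def ingest(self, source: Path) -> list[dict]:
        return [{"ingester": self.name, "source": str(source)}]


def dispatch(source: Path, registry: list[Ingester]) -> list[dict]:
    """Return the output of the first ingester that accepts the source.

    Assumption: first match wins. The tool may resolve ties differently.
    """
    for ingester in registry:
        if ingester.accepts(source):
            return ingester.ingest(source)
    raise ValueError(f"no registered ingester accepts {source}")


registry: list[Ingester] = [MarkdownIngester(), ZipIngester()]
records = dispatch(Path("talks/keynote.md"), registry)
```

Keeping accepts() cheap and free of side effects lets the dispatcher probe every ingester without reading the input more than necessary.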

Built-in ingesters

Name             Accepts
markdown         *.md / *.markdown with YAML frontmatter
linkedin_export  a LinkedIn Basic or Complete export ZIP
github_profile   a text file or --username flag
orcid            a text file or --orcid-id flag with a 19-char ID
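To make the "19-char ID" concrete: an ORCID iD is 16 characters in four hyphen-separated groups, and the last character is an ISO 7064 MOD 11-2 check character (a digit or X). The sketch below shows one way an accepts()-style check could validate that format. Whether the built-in orcid ingester validates the checksum is not stated here, and the function names are hypothetical.

```python
import re

# Four groups of four; the final character may be the check character "X".
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(candidate: str) -> bool:
    """Return True for a well-formed 19-character ORCID iD with a correct checksum."""
    if not ORCID_PATTERN.fullmatch(candidate):
        return False
    digits = candidate.replace("-", "")
    return orcid_check_char(digits[:15]) == digits[15]


# 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
sample_ok = is_valid_orcid("0000-0002-1825-0097")
```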

Command

cv-corpus ingest path/to/linkedin_export.zip --corpus my_career --dry-run
cv-corpus ingest path/to/linkedin_export.zip --corpus my_career

--dry-run prints the parsed delta without writing anything, so you can check it for spurious guesses before they enter the corpus.
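The dry-run contract can be pictured with a minimal sketch. It assumes, purely for illustration, that the corpus and the delta are lists of dict records and that the delta is printed as JSON. The apply_delta name and the return value are hypothetical, not the tool's internals.

```python
import json


def apply_delta(corpus: list[dict], delta: list[dict], dry_run: bool = False) -> int:
    """Append delta to corpus, or only print it when dry_run is set.

    Returns the number of records actually written.
    """
    if dry_run:
        # Show what would be added; leave the corpus untouched.
        print(json.dumps(delta, indent=2))
        return 0
    corpus.extend(delta)
    return len(delta)


corpus: list[dict] = []
delta = [{"role": "Engineer", "employer": "Example Corp"}]

previewed = apply_delta(corpus, delta, dry_run=True)  # prints, writes nothing
written = apply_delta(corpus, delta)                  # appends the records
```

Running the preview first and the real write second mirrors the two commands shown above.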

Writing a custom ingester

See Plugin authoring for the Ingester protocol and a minimal example.