API Reference
This page describes the core models, utility functions, and source producers exposed by the google-okf library.
Core Models
Concept
Represents a single OKF document consisting of YAML frontmatter and a Markdown body.
Example Usage
from datetime import datetime
from google_okf import Concept
concept = Concept(
type="Database Table",
title="Table: users",
description="Stores user accounts information",
resource="mysql://localhost/users",
tags=["database", "sql"],
timestamp=datetime.now(),
body="# Table: users\n\nThis table contains user data."
)
# Serialize to standard OKF string format
file_content = concept.to_file_content()
| Field Name | Type | Required? | Description |
|---|---|---|---|
type |
str |
Yes | The concept type (e.g. Database Table, Document, Metric). |
title |
Optional[str] |
No | Human-readable display name. |
description |
Optional[str] |
No | Brief summary snippet of the concept. |
resource |
Optional[str] |
No | Canonical URI identifying the underlying asset. |
tags |
Optional[List[str]] |
No | List of metadata tags. Defaults to []. |
timestamp |
Optional[datetime] |
No | Date and time when the concept was last modified. |
body |
str |
No | Free-form markdown content. Defaults to "". |
Core I/O Functions
write_concept
Writes a single Concept file to a bundle directory.
from google_okf import write_concept
write_concept(bundle_dir="my_bundle", concept_id="tables/users", concept=concept)
| Argument | Type | Description |
|---|---|---|
bundle_dir |
Union[Path, str] |
The root directory of the OKF bundle. |
concept_id |
str |
Relative path identifying the concept (e.g. tables/users). |
concept |
Concept |
The Concept instance to write. |
- Returns:
Path- The absolute path of the written file.
write_bundle
Writes a dictionary of concepts into a bundle directory structure.
from google_okf import write_bundle
concepts = {"tables/users": concept_a, "documents/guide": concept_b}
write_bundle(bundle_dir="my_bundle", concepts=concepts)
| Argument | Type | Description |
|---|---|---|
bundle_dir |
Union[Path, str] |
The root directory of the OKF bundle. |
concepts |
Dict[str, Concept] |
Dict mapping Concept IDs to Concept instances. |
read_concept
Parses a single OKF Markdown file into a Concept instance.
| Argument | Type | Description |
|---|---|---|
file_path |
Union[Path, str] |
Path to the OKF markdown file. |
- Returns:
Concept- The parsed concept object.
read_bundle
Scans and parses an entire OKF bundle directory, returning all concepts.
| Argument | Type | Description |
|---|---|---|
bundle_dir |
Union[Path, str] |
The root directory of the OKF bundle. |
- Returns:
Dict[str, Concept]- Map of Concept IDs (relative paths without.mdextension) to parsed concepts.
validate_bundle_links
Scans all concepts in a bundle and validates that all internal markdown relative links point to existing files.
| Argument | Type | Description |
|---|---|---|
bundle_dir |
Union[Path, str] |
The root directory of the OKF bundle. |
- Returns:
List[Tuple[str, str, str]]- List of broken link tuples:(source_concept_id, target_link, error_details).
Connectors & Producers
DocumentProducer
Extracts text and metadata from files (PDF, DOCX, MD, TXT) in a directory and maps them to OKF concepts.
Example
| Parameter | Type | Default | Description |
|---|---|---|---|
source_dir |
Union[Path, str] |
Required | Directory containing the source documents. |
output_prefix |
str |
"documents" |
Prefix subfolder for concept IDs (e.g. documents/deploy). |
tags |
List[str] |
[] |
List of global tags to apply to all imported documents. |
- Returns:
Dict[str, Concept]- Extracted document concepts.
MySQLProducer
Connects to SQL databases (MySQL, SQLite, PostgreSQL, etc. via SQLAlchemy), inspects tables, column metadata, primary/foreign keys, and outputs OKF table concepts.
Example
| Parameter | Type | Default | Description |
|---|---|---|---|
connection_uri |
str |
Required | SQLAlchemy connection string. |
output_prefix |
str |
"database/tables" |
Prefix subfolder for concept IDs. |
schema |
Optional[str] |
None |
Inspect a specific database schema name. |
tables |
Optional[List[str]] |
None |
Restrict extraction to these specific table names. |
exclude_tables |
Optional[List[str]] |
None |
Exclude these table names from extraction. |
- Returns:
Dict[str, Concept]- Extracted table concepts.
MongoDBProducer
Connects to MongoDB, samples collections to dynamically infer BSON/Python schemas, and outputs OKF collection concepts.
Example
| Parameter | Type | Default | Description |
|---|---|---|---|
connection_uri |
str |
Required | MongoDB connection URI. |
database_name |
str |
Required | Name of the database to inspect. |
output_prefix |
str |
"database/collections" |
Prefix subfolder for concept IDs. |
sample_size |
int |
10 |
Number of documents to sample per collection for schema inference. |
collections |
Optional[List[str]] |
None |
Include only these specific collection names. |
exclude_collections |
Optional[List[str]] |
None |
Exclude these collection names. |
- Returns:
Dict[str, Concept]- Extracted collection concepts.