
Connection

Methods of connecting to a database

Overview

Tashmet is split into two main parts: the storage engine and the client. The client issues commands in the form of plain objects to the storage engine via a proxy interface. The storage engine can live in the same application or be connected to remotely through a socket.

Local connection

The basic use case for Tashmet is one where both the storage engine and the client (Tashmet) are run in the same app. The storage engine exposes a proxy interface that we can make the connection through.

See the Hello world example for how this is set up.


Remote connection

A more advanced use case is when the storage engine runs in a different process, or even on a remote server. We can for instance have the Tashmet client run in a browser, connecting to the server and thus bypassing the need for a more traditional REST interface approach.

This is currently an experimental feature that lacks the security necessary for production. Only use it for development purposes.

Blog example

In the following example we will create a server that listens for incoming connections on port 8080.

import { LogLevel } from '@tashmet/core';
import mingo from '@tashmet/mingo';
import Nabu from '@tashmet/nabu';
import TashmetServer from '@tashmet/server';
import { terminal } from '@tashmet/terminal';

const store = Nabu
  .configure({
    logLevel: LogLevel.Debug,
    logFormat: terminal(),
    persistentState: db => `${db}.yaml`,
  })
  .use(mingo())
  .bootstrap();

new TashmetServer(store).listen(8080);

The blog posts are stored in content/posts and we have set up a Nabu database configuration that defines our collection. In the next step, when we set up the client and access the database called content, this file will be read by the storage engine as per the persistentState option we specified above.

collections:
  posts:
    storageEngine:
      directory:
        path: ./content/posts
        extension: .md
        format:
          yaml:
            frontMatter: true
            contentKey: articleBody

In the client we establish a connection using the ServerProxy, and then we are free to use the database as we normally would. Notice how we run an aggregation where markdown content is transformed to HTML. This operation is carried out by the storage engine on the server and the results are sent to the client. This means that if we run this in a browser the bundle size is quite small, since it only contains the Tashmet client, while we still have all the power to do things like complex data transformations.

import Tashmet from '@tashmet/tashmet';
import ServerProxy from '@tashmet/proxy';

Tashmet
  .connect(new ServerProxy({ uri: 'http://localhost:8080' }))
  .then(async tashmet =>  {
    const db = tashmet.db('content');
    const posts = await db.collection('posts');

    const cursor = posts.aggregate([
      { $set: { html: { $markdownToHtml: '$articleBody' } } }
    ]);

    for await (const doc of cursor) {
      console.log(doc);
    }

    tashmet.close();
  });

Storage engines

The heart of Tashmet

Introduction

The storage engine is responsible for storing documents. It is bundled with an aggregation engine that supports operations such as finding, updating, replacing and deleting those documents.


Nabu

The primary storage engine for Tashmet is called Nabu. It is a persistent storage solution that allows for documents to be written to and read from the file system. Nabu is also bundled with a fallback option for in-memory storage that is used by default.

Features

  • Persistent file storage

  • Built in support for JSON, YAML and Markdown

  • Available only on the server (Node.js)

  • Includes fallback in-memory storage option

Configuration

The following example creates a Nabu storage engine with the default configuration, using mingo as the aggregation engine.

import Nabu from '@tashmet/nabu';
import mingo from '@tashmet/mingo';

const store = Nabu
  .configure({})
  .use(mingo())
  .bootstrap();

See the Hello world example for how to connect to and operate on the store.

One important aspect of Nabu is that the state of the databases, i.e. which collections they have and how they are set up, can be persisted to disk in human-readable form. If you don't need to create databases and collections dynamically at runtime, it's probably more convenient to craft a configuration file by hand in YAML.

The following configuration option will tell Nabu to look up a database configuration in a yaml file with the same name as the database.

const store = Nabu
  .configure({
    persistentState: db => `${db}.yaml`
  })
  // ...

Let's create the configuration file for a database called mydb that should have a collection named posts. We're using a directory to store our documents. For more details and other options see storage options below.

collections:
  posts:
    storageEngine:
      directory:
        path: ./content/posts
        extension: .md
        format:
          yaml:
            frontMatter: true
            contentKey: articleBody

To connect to our database and use the collection we simply do the following.

Tashmet
  .connect(store.proxy())
  .then(async tashmet => {
    const db = tashmet.db('mydb');
    const posts = await db.collection('posts');
    // ...
  });

Storage options

Nabu supports a wide range of different storage options that determine how documents are read from and written to disk.

  • Array in file

  • Object in file

  • Directory

  • Glob

  • Memory

These can be configured per collection or be specified for the whole database.


Memory

The memory storage engine is a purely volatile storage solution.

Features

  • Volatile In-memory storage

  • Available both on server and in browser

For each supported operation the Tashmet client will build a command that is passed through a proxy, either the proxy provided by the storage engine, or through a network connection to a server that acts on the storage engine.

Hence, once a storage engine is created, we can actually execute these raw commands on the engine directly. Consider the following example:

import { TashmetNamespace } from '@tashmet/tashmet';
import Memory from '@tashmet/memory';
import mingo from '@tashmet/mingo';

const ns = new TashmetNamespace('mydb');

const storageEngine = Memory
  .configure({})
  .use(mingo())
  .bootstrap();

// Create a collection named 'test'
await storageEngine.command(ns, {create: 'test'});

// Insert a number of documents into it.
await storageEngine.command(ns, {insert: 'test', documents: [
  { _id: 1, category: "cake", type: "chocolate", qty: 10 },
  { _id: 2, category: "cake", type: "ice cream", qty: 25 },
  { _id: 3, category: "pie", type: "boston cream", qty: 20 },
  { _id: 4, category: "pie", type: "blueberry", qty: 15 }
]});

Introduction

Tashmet documentation

What is this?

Tashmet is a JavaScript database that provides an interface that tracks the interface of MongoDB as closely as possible. Basically, Tashmet leverages the power of the excellent aggregation framework mingo together with concepts like databases and collections to provide a MongoDB-like experience in pure JavaScript.

Why?

The primary motivation for this framework, and what really makes it powerful, is the ability to work with documents on your file system, be they JSON, YAML or other formats. Since custom operators are supported, an aggregation pipeline can also involve steps for doing things like transforming markdown to HTML or writing output to the file system. These features alone make Tashmet an excellent backbone in a project such as a static site generator.

Basic concepts

Just like MongoDB, Tashmet is built on a client/server architecture, but with the additional option to short-circuit that gap with a connection to a storage engine in the same process.

The connection medium between client and server (or storage engine) is referred to as the proxy.
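The idea can be sketched as follows. Note that the interface and class names here are illustrative only, not Tashmet's actual API:

```typescript
// Commands are plain objects; the proxy is just the channel that
// carries them to the storage engine and returns the reply.
interface CommandProxy {
  command(db: string, cmd: Record<string, unknown>): Promise<Record<string, unknown>>;
}

// In a local connection the "channel" is a direct method call.
class LocalProxy implements CommandProxy {
  async command(db: string, cmd: Record<string, unknown>) {
    // A real engine would dispatch on the command name (insert, find, ...)
    return { ok: 1, db, command: Object.keys(cmd)[0] };
  }
}

const proxy: CommandProxy = new LocalProxy();
proxy.command('content', { insert: 'posts' }).then(reply => {
  console.log(reply); // → { ok: 1, db: 'content', command: 'insert' }
});
```

A remote connection swaps the direct call for a socket without changing what the client sends.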

Collections

Operations on collections

Overview

Collection operations

Tashmet collections support a subset of operations from MongoDB.

insertOne

insertOne(
  document: OptionalId<TSchema>, options?: InsertOneOptions
): Promise<InsertOneResult>

Inserts a single document into the collection. If the document does not contain an _id field, one will be added by the driver, mutating the document. This behavior can be overridden by setting the forceServerObjectId flag.

  • document - The document to insert

  • options - Optional settings for the command

See: https://www.mongodb.com/docs/drivers/node/current/usage-examples/insertOne/


insertMany

insertMany(
  documents: OptionalId<TSchema>[], options?: BulkWriteOptions
): Promise<InsertManyResult>

Inserts an array of documents into the collection. If documents passed in do not contain the _id field, one will be added to each of the documents missing it by the driver, mutating the document. This behavior can be overridden by setting the forceServerObjectId flag.

  • documents - The documents to insert

  • options - Optional settings for the command

See: https://www.mongodb.com/docs/drivers/node/current/usage-examples/insertMany/


findOne

findOne(): Promise<WithId<TSchema> | null>;
findOne(filter: Filter<TSchema>): Promise<WithId<TSchema> | null>;
findOne(filter: Filter<TSchema>, options: FindOptions): Promise<WithId<TSchema> | null>;

Fetches the first document that matches the filter.

  • filter - Query for find Operation

  • options - Optional settings for the command

See: https://www.mongodb.com/docs/drivers/node/current/usage-examples/findOne/


find

find(): FindCursor<WithId<TSchema>>;
find(filter: Filter<TSchema>, options?: FindOptions): FindCursor<WithId<TSchema>>;

Creates a cursor for a filter that can be used to iterate over results from the collection

  • filter - The filter predicate. If unspecified, then all documents in the collection will match the predicate

  • options - Optional settings for the command

    • sort?: SortingMap - Set to sort the documents coming back from the query. Key-value map, ex. {a: 1, b: -1}

    • skip?: number - Skip the first number of documents from the results.

    • limit?: number - Limit the number of items that are fetched.

    • projection?: Projection<TSchema> - The fields to return in the query. Object of fields to either include or exclude (one of, not both), {'a':1, 'b': 1} or {'a': 0, 'b': 0}

See: https://www.mongodb.com/docs/drivers/node/current/usage-examples/find/
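As a sketch of how the sort, skip and limit options compose, here is the same logic in plain code (illustrative only, not Tashmet's implementation):

```typescript
// Four sample documents, unsorted on field 'a'
const docs = [{ a: 3 }, { a: 1 }, { a: 4 }, { a: 2 }];

const sort = { a: 1 }; // ascending on field 'a'
const skip = 1;
const limit = 2;

const result = docs
  .slice()                              // leave the source array untouched
  .sort((x, y) => (x.a - y.a) * sort.a) // apply the sorting map
  .slice(skip, skip + limit);           // then skip, then limit

console.log(result); // → [ { a: 2 }, { a: 3 } ]
```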


aggregate

aggregate<T extends Document = Document>(
  pipeline: Document[] = [], options: AggregateOptions = {}
): AggregationCursor<T>

Execute an aggregation framework pipeline against the collection

  • pipeline - An array of aggregation pipeline stages to execute

  • options - Optional settings for the command

    • batchSize?: number - The number of documents to return per batch. See aggregation documentation.

    • bypassDocumentValidation?: boolean - Allow driver to bypass schema validation

    • collation?: CollationOptions - Specify collation.

    • out?: string

See: https://www.mongodb.com/docs/drivers/node/current/usage-examples/aggregate/


distinct

distinct<Key extends keyof WithId<TSchema>>(
  key: Key
): Promise<Array<Flatten<WithId<TSchema>[Key]>>>;
distinct<Key extends keyof WithId<TSchema>>(
  key: Key,
  filter: Filter<TSchema>
): Promise<Array<Flatten<WithId<TSchema>[Key]>>>;
distinct<Key extends keyof WithId<TSchema>>(
  key: Key,
  filter: Filter<TSchema>,
  options: DistinctOptions
): Promise<Array<Flatten<WithId<TSchema>[Key]>>>;

// Embedded documents overload
distinct(key: string): Promise<any[]>;
distinct(key: string, filter: Filter<TSchema>): Promise<any[]>;
distinct(key: string, filter: Filter<TSchema>, options: DistinctOptions): Promise<any[]>;

The distinct command returns a list of distinct values for the given key across a collection.
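The semantics can be sketched in plain code (illustrative only, using a small sample collection):

```typescript
// Sample documents with a repeated 'category' value
const inventory = [
  { _id: 1, category: 'cake', qty: 10 },
  { _id: 2, category: 'cake', qty: 25 },
  { _id: 3, category: 'pie', qty: 20 },
];

// distinct('category'): each unique value of the key is collected once
const distinctValues = [...new Set(inventory.map(doc => doc.category))];

console.log(distinctValues); // → [ 'cake', 'pie' ]
```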


countDocuments

countDocuments(
  filter: Filter<TSchema> = {}, options: CountDocumentsOptions = {}
): Promise<number>

Gets the number of documents matching the filter.

  • filter - The filter for the count

  • options - Optional settings for the command


replaceOne

replaceOne(
  filter: Filter<TSchema>, replacement: TSchema, options?: ReplaceOneOptions
): Promise<UpdateResult>

Replace a document in a collection with another document


updateOne

updateOne(
  filter: Filter<TSchema>, update: UpdateFilter<TSchema>, options?: UpdateOptions
): Promise<UpdateResult>

Update a single document in a collection

  • filter - The filter used to select the document to update

  • update - The update operations to be applied to the document

  • options - Optional settings for the command

See: https://www.mongodb.com/docs/drivers/node/current/usage-examples/updateOne/

updateMany

updateMany(
  filter: Filter<TSchema>, update: UpdateFilter<TSchema>, options?: UpdateOptions
): Promise<UpdateResult>

Update multiple documents in a collection

  • filter - The filter used to select the documents to update

  • update - The update operations to be applied to the documents

  • options - Optional settings for the command

See: https://www.mongodb.com/docs/drivers/node/current/usage-examples/updateMany/

deleteOne

deleteOne(filter: Filter<TSchema>, options?: DeleteOptions): Promise<DeleteResult>

Delete a document from a collection

  • filter - The filter used to select the document to remove

  • options - Optional settings for the command


deleteMany

deleteMany(filter: Filter<TSchema>, options?: DeleteOptions): Promise<DeleteResult>

Delete multiple documents from a collection

  • filter - The filter used to select the documents to remove

  • options - Optional settings for the command

Installation

How to setup Tashmet

Tashmet works by combining three components: a storage engine, an aggregation engine and a client. Tashmet itself is the client and Nabu is the default storage engine. Mingo is currently the one and only aggregation engine.

To get started, let's install all three.
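All three packages are available on npm:

```shell
$ npm install @tashmet/tashmet @tashmet/nabu @tashmet/mingo
```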

Aggregation

Running an aggregation pipeline

Overview

In this example we create a storage engine in memory and perform a simple aggregation on a set of documents. It closely mimics the aggregation example from the MongoDB docs: https://www.mongodb.com/docs/drivers/node/current/fundamentals/aggregation/

Examples

The following example is available in the repo.

$ npm install @tashmet/tashmet @tashmet/nabu @tashmet/mingo

import Tashmet from '@tashmet/tashmet';
import Nabu from '@tashmet/nabu';
import mingo from '@tashmet/mingo';

const store = Nabu
  .configure({})
  .use(mingo())
  .bootstrap();

Tashmet
  .connect(store.proxy())
  .then(async tashmet => {
    const db = tashmet.db('aggregation');
    const coll = await db.createCollection('restaurants');

    // Create sample documents
    const docs = [
      { stars: 3, categories: ["Bakery", "Sandwiches"], name: "Rising Sun Bakery" },
      { stars: 4, categories: ["Bakery", "Cafe", "Bar"], name: "Cafe au Late" },
      { stars: 5, categories: ["Coffee", "Bakery"], name: "Liz's Coffee Bar" },
      { stars: 3, categories: ["Steak", "Seafood"], name: "Oak Steakhouse" },
      { stars: 4, categories: ["Bakery", "Dessert"], name: "Petit Cookie" },
    ];

    // Insert documents into the restaurants collection
    await coll.insertMany(docs);

    // Define an aggregation pipeline with a match stage and a group stage
    const pipeline = [
      { $match: { categories: "Bakery" } },
      { $group: { _id: "$stars", count: { $sum: 1 } } }
    ];

    // Execute the aggregation
    const aggCursor = coll.aggregate(pipeline);

    // Print the aggregated results
    for await (const doc of aggCursor) {
      console.log(doc);
    }
  });

The above example should yield the following output:

{ _id: 3, count: 1 }
{ _id: 4, count: 2 }
{ _id: 5, count: 1 }

Hello World

A simple example

Introduction

In this section we are going to create a simple hello world application.

Here's what you are going to learn:

  • Connecting to a database

  • Inserting and retrieving a document

Your first application

We start by setting up our storage engine. Nabu will, with its default configuration, create databases in memory for us. We will later explore how we can store documents on the file system.

Next we create a collection, insert a single document into it and then run a query for all documents on that same collection. Just like in MongoDB, the find method will return a cursor that we can use to extract the results. Here we pick the first entry and print it to the command line.

import Tashmet from '@tashmet/tashmet';
import Nabu from '@tashmet/nabu';
import mingo from '@tashmet/mingo';

// Create the storage engine
const store = Nabu
  .configure({})
  .use(mingo())
  .bootstrap();

Tashmet
  .connect(store.proxy())
  .then(async tashmet => {
    const db = tashmet.db('hello-world');
    const posts = db.collection('posts');

    // Insert a single document
    await posts.insertOne({title: 'Hello World!'});

    // Retrieve a cursor and get the first document from it
    const doc = await posts.find().next();
    
    console.log(doc);
  });

Running the app

To test it out you can install ts-node, which allows direct execution of TypeScript on Node.js.

$ npm install ts-node -g

Let's run the application:

$ ts-node index.ts

The output should be something like this:

{ title: 'Hello World!', _id: '624de5a6a92e999d3a4b5dbf' }

Notice that an ID has been generated for our document since we did not supply one ourselves.
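As the output shows, the generated ID is a 24-character hexadecimal string. The sketch below mimics the format only; it is not Tashmet's actual ID generator:

```typescript
// Build a random hex string of the given length
const randomHex = (length: number) =>
  Array.from({ length }, () => Math.floor(Math.random() * 16).toString(16)).join('');

const id = randomHex(24);
console.log(/^[0-9a-f]{24}$/.test(id)); // → true
```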

Array in file

Storing a collection as an array within a file

Introduction

This page describes how to set up persistent file storage for a collection using Nabu, where the documents are stored as an array within a single file.

Array in file storage requires the Nabu storage engine

Usage

Create a single collection

Tashmet.connect(store.proxy()).then(async tashmet => {
  const collection = await tashmet.db('myDb').createCollection('myCollection', {
    storageEngine: {
      arrayInFile: {
        path: 'content/myCollection.yaml',
        format: 'yaml'
      }
    }
  });
});

The collection is stored in 'content/myCollection.yaml', where the documents reside as an array within that single file.

Reuse across database

By defining a custom I/O rule for the storage engine, we can reuse the same configuration across multiple collections.

const store = Nabu
  .configure({})
  .use(mingo())
  .io('yaml', (ns, options) => ({
    arrayInFile: {
      path: `${ns.db}/${ns.collection}.yaml`,
      format: 'yaml'
    }
  }))
  .bootstrap();

Tashmet.connect(store.proxy()).then(async tashmet => {
  const collection = await tashmet.db('myDb').createCollection('myCollection', {
    storageEngine: 'yaml'
  });
});

Alternatively we can set the default I/O rule and skip the storageEngine directive entirely.

const store = Nabu
  .configure({
    defaultIO: 'yaml'
  })
  // ...

Tashmet.connect(store.proxy()).then(async tashmet => {
  const collection = await tashmet.db('myDb').createCollection('myCollection');
});

Store multiple collections within same file

By specifying a field option we can store a complete database within the same file. In the following example we set up the I/O so that the YAML-file contains an object where keys correspond to names of collections with values being the list of documents.

const store = Nabu
  .configure({})
  .use(mingo())
  .io('yaml', (ns, options) => ({
    arrayInFile: {
      path: `${ns.db}.yaml`,
      format: 'yaml',
      field: ns.collection
    }
  }))
  .bootstrap();

Example file output

collection1:
  - _id: d1
  - _id: d2
collection2:
  - _id: d1

Parameters

Path

path: string

Path to the file where documents are stored

Format

format: string

File format. The current valid file formats include:

  • format: 'json'

  • format: 'yaml'

Field

field?: string

An optional name of the field under which the documents are stored in the file. When omitted, the list of documents is stored at the root

Note that field supports nesting, e.g. 'foo.bar' is valid
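The effect of a nested field can be sketched in plain code (hypothetical sketch, not Nabu's actual implementation): the document array is placed at the nested path inside the file contents.

```typescript
// Place `value` at a dotted path inside `obj`, creating objects on the way
function setNestedField(obj: any, path: string, value: unknown): any {
  const keys = path.split('.');
  let target = obj;
  for (const key of keys.slice(0, -1)) {
    target = target[key] = target[key] ?? {};
  }
  target[keys[keys.length - 1]] = value;
  return obj;
}

// field: 'foo.bar' nests the document array under foo.bar
const fileContents = setNestedField({}, 'foo.bar', [{ _id: 'd1' }]);
console.log(JSON.stringify(fileContents)); // → {"foo":{"bar":[{"_id":"d1"}]}}
```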

Include array index

includeArrayIndex?: string

Include the index of the document within the stored array as a field.

{
  // ...
  includeArrayIndex: 'order'
}

Note that when using this option as specified above, changing the value of the order field will also affect the index of the document within the stored output.

Input

input?: Document[]

Optional additional pipeline stages to apply after documents have been read from file.

When using this option, make sure that output is also present and does the inverse transformation of input.

Output

output?: Document[]

Optional additional pipeline stages to apply before documents are written to file.

When using this option, make sure that input is also present and does the inverse transformation of output.
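A hypothetical inverse pair could look like this. The slug field is made up for illustration; $set, $toLower and $unset are standard aggregation operators:

```typescript
// On read, derive a `slug` field from the title; on write, strip it
// again so it never reaches the file.
const io = {
  input: [
    { $set: { slug: { $toLower: '$title' } } },
  ],
  output: [
    { $unset: 'slug' },
  ],
};

console.log(Object.keys(io)); // → [ 'input', 'output' ]
```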

Object in file

Storing a collection as an object within a file

Introduction

This page describes how to set up persistent file storage for a collection using Nabu, where the documents are stored as an object within a single file, with the ID as key and the rest of the document as value.

Object in file storage requires the Nabu storage engine

Usage

Per collection using storage engine option

To configure a single collection to be stored as an object in file we can specify the storageEngine option when creating the collection.

const store = Nabu
  .configure({})
  .use(mingo())
  .bootstrap();

Tashmet.connect(store.proxy()).then(async tashmet => {
  const collection = await tashmet.db('myDb').createCollection('myCollection', {
    storageEngine: {
      objectInFile: {
        path: 'content/myCollection.yaml',
        format: 'yaml'
      }
    }
  });
});

Insert a couple of documents

await collection.insertMany([
  { _id: 'doc1', title: 'foo' },
  { _id: 'doc2', title: 'bar' },
]);

After the insert operation above content/myCollection.yaml will contain the following:

doc1:
  title: foo
doc2:
  title: bar

Reuse across database

By defining a custom I/O rule for the storage engine, we can reuse the same configuration across multiple collections. Here we create a rule called objectInYaml that we can target when creating the collection.

const store = Nabu
  .configure({})
  .use(mingo())
  .io('objectInYaml', (ns, options) => ({
    objectInFile: {
      path: `${ns.db}/${ns.collection}.yaml`,
      format: 'yaml'
    }
  }))
  .bootstrap();

Tashmet.connect(store.proxy()).then(async tashmet => {
  const collection = await tashmet.db('myDb').createCollection('myCollection', {
    storageEngine: 'objectInYaml'
  });
});

Alternatively we can set the default I/O rule and skip the storageEngine option entirely.

const store = Nabu
  .configure({
    defaultIO: 'objectInYaml'
  })
  // ...

Tashmet.connect(store.proxy()).then(async tashmet => {
  const collection = await tashmet.db('myDb').createCollection('myCollection');
});

Store multiple collections within same file

By specifying a field option we can store a complete database within the same file. In the following example we set up the I/O so that the YAML-file contains an object where keys correspond to names of collections.

const store = Nabu
  .configure({
    defaultIO: 'dbInYaml'
  })
  .use(mingo())
  .io('dbInYaml', (ns, options) => ({
    objectInFile: {
      path: `${ns.db}.yaml`,
      format: 'yaml',
      field: ns.collection
    }
  }))
  .bootstrap();

Tashmet.connect(store.proxy()).then(async tashmet => {
  const collection = await tashmet.db('myDb').createCollection('myCollection');
  await collection.insertMany([
    { _id: 'doc1', title: 'foo' },
    { _id: 'doc2', title: 'bar' },
  ]);
});

Content on disk

myCollection:
  doc1:
    title: foo
  doc2:
    title: bar

Parameters

Path

path: string

Path to the file where documents are stored

Format

format: string

File format. The current valid file formats include:

  • format: 'json'

  • format: 'yaml'

Field

field?: string

An optional name of the field under which the documents are stored in the file. When omitted, the documents are stored at the root

Note that field supports nesting, e.g. 'foo.bar' is valid

Input

input?: Document[]

Optional additional pipeline stages to apply after documents have been read from file.

When using this option, make sure that output is also present and does the inverse transformation of input.

Output

output?: Document[]

Optional additional pipeline stages to apply before documents are written to file.

When using this option, make sure that input is also present and does the inverse transformation of output.

Memory

Storing a collection in memory

Introduction

This is the default storage when no storageEngine directive is given.

Usage

Create a single memory collection

Tashmet.connect(store.proxy()).then(async tashmet => {
  const collection = await tashmet.db('myDb').createCollection('myCollection');
});

Glob

Storing a collection in multiple files matching a glob pattern

Introduction

This page describes how to set up persistent file storage for a collection using Nabu, where each document is stored within a file under multiple directories on the file system

Glob storage requires the Nabu storage engine

Usage

Create a single glob collection

Tashmet.connect(store.proxy()).then(async tashmet => {
  const collection = await tashmet.db('myDb').createCollection('myCollection', {
    storageEngine: {
      glob: {
        pattern: 'content/myCollection/**/*.yaml',
        format: 'yaml'
      }
    }
  });
});

The collection is stored in files matching 'content/myCollection/**/*.yaml', where each document resides in a YAML file that derives its name from the _id of the document.

Reuse across database

By defining a custom I/O rule for the storage engine, we can reuse the same configuration across multiple collections.

const store = Nabu
  .configure({})
  .use(mingo())
  .io('yaml', (ns, options) => ({
    glob: {
      pattern: `${ns.db}/${ns.collection}/**/*.yaml`,
      format: 'yaml'
    }
  }))
  .bootstrap();

Tashmet.connect(store.proxy()).then(async tashmet => {
  const collection = await tashmet.db('myDb').createCollection('myCollection', {
    storageEngine: 'yaml'
  });
});

Alternatively we can set the default I/O rule and skip the storageEngine directive entirely.

const store = Nabu
  .configure({
    defaultIO: 'yaml'
  })
  // ...

Tashmet.connect(store.proxy()).then(async tashmet => {
  const collection = await tashmet.db('myDb').createCollection('myCollection');
});

Parameters

Pattern

pattern: string

Pattern matching files that should be included in the collection

Format

format: string | Document

File format. The current valid file formats include:

  • format: 'json'

  • format: 'yaml'

If we want to access YAML front matter, format also accepts the following configuration.

// YAML with front matter
{
  // ...
  format: {
    yaml: {
      frontMatter: true,
      contentKey: 'content' 
    }
  }
}

Merge stat

mergeStat?: Document

Include file information for each document when it's loaded from the file system. Expressions in mergeStat are computed against the underlying file information (lstat). Consider the following example for how to include the path of the file.

// Include file path
{
  // ...
  mergeStat: {
    path: '$path'
  }
}

The following fields are available for merging:

path, dev, mode, nlink, uid, gid, rdev, blksize, ino, size, blocks, atimeMs, mtimeMs, ctimeMs, birthtimeMs, atime, mtime, ctime, birthtime

Any fields defined in mergeStat will be pruned before writing documents to disk

Construct

construct?: Document

Add additional fields to the document. This stage is performed after mergeStat is completed. Expressions in construct are computed against the actual, loaded document.

Consider the following example where we read markdown files with YAML front matter. The markdown content will be stored under a field named markdown and we use construct to add an additional field html that contains the converted output.

{
  pattern: '/content/**/*.md',
  format: {
    yaml: {
      frontMatter: true,
      contentKey: 'markdown'
    }
  },
  construct: {
    html: {
      $markdownToHtml: '$markdown'
    }
  }
}

Any fields defined in the construct stage will be pruned before writing them to disk

File system

File system operators

Introduction

Custom operators for working with files

Usage

Nabu

When using the Nabu storage engine, File system operators are included by default.

Memory storage engine

If you are using the memory storage engine the fs plugin must be configured

import Memory from '@tashmet/memory';
import mingo from '@tashmet/mingo';
import fs from '@tashmet/fs';

const store = Memory
  .configure({})
  .use(mingo())
  .use(fs())
  .bootstrap();

Expression operators

$lstat

{ $lstat: <expression> }

Expression operator that resolves to the file information (lstat) of the file specified in the expression.

$fileExists

{ $fileExists: <expression> }

Expression operator that resolves to true if the file specified in the expression exists on the file system.

$readFile

{ $readFile: <expression> }

Expression operator that resolves to the content of the file specified in the expression

$basename

{ $basename: <path> | [<path>, <suffix>] }

Expression operator that returns the last portion of a path, similar to the Unix basename command.

See: https://nodejs.org/api/path.html#pathbasenamepath-suffix

$extname

{ $extname: <path> }

Expression operator that returns the extension of the path.

See: https://nodejs.org/api/path.html#pathextnamepath

$dirname

{ $dirname: <expression> }

Expression operator that returns the directory name of a path, similar to the Unix dirname command.

See: https://nodejs.org/api/path.html#pathdirnamepath

$relativePath

{ $relativePath: <expression> }

$joinPaths

{ $joinPaths: <expression> }


Aggregation stages

$writeFile

Write each document in the stream to file

{
  $writeFile: {
    content: 'content to write (expression or actual data)',
    path: 'path to file',
    overwrite: 'true if file should be overwritten if it exists',
  }
}
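As a concrete (hypothetical) instance of the stage above, writing each document's html field to a file named after its _id. The out/ directory and html field are made up for illustration; $concat is a standard aggregation operator:

```typescript
// Write each document's `html` field to out/<_id>.html, overwriting
// any existing file.
const writeStage = {
  $writeFile: {
    content: '$html',
    path: { $concat: ['out/', '$_id', '.html'] },
    overwrite: true,
  },
};

console.log(writeStage.$writeFile.overwrite); // → true
```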

$glob

{ $glob: <expression> }

$globMatch

{ $globMatch: <expression> }

JSON

JSON operators

Introduction

Custom operators for converting JSON

Usage

Nabu

When using the Nabu storage engine, JSON operators are included by default.

Memory storage engine

If you are using the memory storage engine the JSON plugin must be configured

import Memory from '@tashmet/memory';
import mingo from '@tashmet/mingo';
import json from '@tashmet/json';

const store = Memory
  .configure({})
  .use(mingo())
  .use(json())
  .bootstrap();

Operators

$jsonToObject

{ $jsonToObject: <expression> }

Convert a JSON string to an object

const input = [
  { json: '{ "foo": "bar" }' }
];
const pipeline: Document[] = [
  { $documents: input },
  { $set: { object: { $jsonToObject: '$json' } } }
];

const doc = await tashmet.db('test').aggregate(pipeline).next();

$objectToJson

{ $objectToJson: <expression> }

Convert an object to a JSON string

const input = [
  { object: { foo: 'bar' } }
];
const pipeline: Document[] = [
  { $documents: input },
  { $set: { json: { $objectToJson: '$object' } } }
];

const doc = await tashmet.db('test').aggregate(pipeline).next();

YAML

YAML operators

Introduction

Custom operators for converting YAML

Usage

Configuration options

  • indent?: number (default: 2) - Indentation width to use (in spaces) when serializing.

  • skipInvalid?: boolean (default: true) - Do not throw on invalid types (like function in the safe schema) and skip pairs and single values with such types.

  • flowLevel?: number (default: -1) - Specifies level of nesting, when to switch from block to flow style for collections. -1 means block style everywhere.

  • styles?: {[tag: string]: string} - "tag" => "style" map. Each tag may have its own set of styles.

  • sortKeys?: boolean | ((a: any, b: any) => number) (default: false) - If true, sort keys when dumping YAML. If a function, use the function to sort the keys.

  • lineWidth?: number (default: 80) - Set max line width for serialized output.

  • noRefs?: boolean (default: false) - If true, don't convert duplicate objects into references.

  • noCompatMode?: boolean (default: false) - If true, don't try to be compatible with older YAML versions. Currently: don't quote "yes", "no" and so on, as required for YAML 1.1.

  • condenseFlow?: boolean (default: false) - If true, flow sequences will be condensed, omitting the space between a, b, e.g. '[a,b]', and omitting the space between key: value and quoting the key, e.g. '{"a":b}'. Can be useful when using YAML for pretty URL query params, as spaces are %-encoded.

Nabu

When using the Nabu storage engine, the YAML plugin is included by default. If you want to pass configuration options you can do it like this

const store = Nabu
  .configure({
    yaml: { indent: 4 }
  })
  .bootstrap();

Memory storage engine

If you are using the memory storage engine the YAML plugin must be configured

import Memory from '@tashmet/memory';
import mingo from '@tashmet/mingo';
import yaml from '@tashmet/yaml';

const store = Memory
  .configure({})
  .use(mingo())
  .use(yaml({ /* configuration options */}))
  .bootstrap();

Operators

$yamlToObject

{ $yamlToObject: <expression> }

Convert a YAML string to an object

const data = dedent`
  title: foo
  list:
    - item1
    - item2
`;
const pipeline: Document[] = [
  { $documents: [{ data }] },
  { $set: { data: { $yamlToObject: '$data' } } }
];

const doc = await tashmet.db('test').aggregate(pipeline).next();
{ data: { title: 'foo', list: ['item1', 'item2'] } }

To convert YAML that contains front matter, we need to specify some extra parameters:

const data = dedent`
  ---
  title: foo
  ---
  Content goes here
`;
const pipeline: Document[] = [
  { $documents: [{ data }] },
  {
    $set: {
      data: {
        $yamlToObject: {
          data: '$data',
          frontMatter: true,
          contentKey: 'body'
        }
      } 
    }
  }
];
const doc = await tashmet.db('test').aggregate(pipeline).next();
{ data: { title: 'foo', body: 'Content goes here' } }

$objectToYaml

{ $objectToYaml: <expression> }

Convert an object to a YAML string

const input = [
  { data: { foo: 'bar' } }
];
const pipeline: Document[] = [
  { $documents: input },
  { $set: { data: { $objectToYaml: '$data' } } }
];

const doc = await tashmet.db('test').aggregate(pipeline).next();
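To build intuition for how the two operators relate, here is a conceptual sketch of a round trip for a flat document with string values. The functions objectToYaml and yamlToObject below are hand-rolled illustrations, not Tashmet's implementation, and handle only this trivial flat case:

```typescript
// Conceptual sketch only: NOT Tashmet's implementation. Illustrates what a
// round trip through $objectToYaml and $yamlToObject amounts to for a flat
// document whose values are plain strings.
function objectToYaml(obj: Record<string, string>): string {
  // Each key/value pair becomes one 'key: value' line.
  return Object.entries(obj)
    .map(([key, value]) => `${key}: ${value}`)
    .join('\n');
}

function yamlToObject(yaml: string): Record<string, string> {
  // Each 'key: value' line becomes one field on the result object.
  const result: Record<string, string> = {};
  for (const line of yaml.split('\n')) {
    const [key, value] = line.split(': ');
    result[key] = value;
  }
  return result;
}

const serialized = objectToYaml({ foo: 'bar' }); // 'foo: bar'
const roundTrip = yamlToObject(serialized);      // { foo: 'bar' }
```

In a pipeline, the same round trip would be expressed by chaining $objectToYaml and $yamlToObject over the same field.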

Directory

Storing a collection in multiple files within a directory

Introduction

This page describes how to set up persistent file storage for a collection using Nabu, where each document is stored within a file under a specific directory on the file system.

Directory storage requires the Nabu storage engine

Usage

Statically created

To configure directory storage statically, add the following collection definition to the database configuration file:

collections:
  myCollection:
    storageEngine:
      directory:
        path: ./content/myCollection
        extension: .yaml
        format: yaml

Dynamically created

Tashmet.connect(store.proxy()).then(async tashmet => {
  const collection = await tashmet.db('myDb').createCollection('myCollection', {
    storageEngine: {
      directory: {
        path: 'content/myCollection',
        extension: '.yaml',
        format: 'yaml'
      }
    }
  });
});

The collection is stored in 'content/myCollection', where each document resides in a YAML file that derives its name from the _id of the document.
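The naming scheme can be pictured with a small sketch. The filePath helper below is purely illustrative (not Nabu's actual code), showing how a document's _id combines with the configured path and extension:

```typescript
// Illustrative sketch of the naming scheme; not Nabu's actual implementation.
// The file name is derived from the document's _id, joined with the
// configured directory path and extension.
function filePath(directory: string, id: string, extension: string): string {
  return `${directory}/${id}${extension}`;
}

// A document with _id 'hello' in the collection configured above would be
// stored at:
const path = filePath('content/myCollection', 'hello', '.yaml');
// → 'content/myCollection/hello.yaml'
```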


Reuse across databases

By defining a custom I/O rule for the storage engine, we can reuse the same configuration across multiple collections.

const store = Nabu
  .configure({})
  .use(mingo())
  .io('yaml', (ns, options) => ({
    directory: {
      path: `${ns.db}/${ns.collection}`,
      extension: '.yaml',
      format: 'yaml'
    }
  }))
  .bootstrap();

Tashmet.connect(store.proxy()).then(async tashmet => {
  const collection = await tashmet.db('myDb').createCollection('myCollection', {
    storageEngine: 'yaml'
  });
});

Alternatively, we can set the default I/O rule and skip the storageEngine directive entirely.

const store = Nabu
  .configure({
    defaultIO: 'yaml'
  })
  // ...

Tashmet.connect(store.proxy()).then(async tashmet => {
  const collection = await tashmet.db('myDb').createCollection('myCollection');
});

Parameters

Path

path: string

Path to the directory where files are stored

Extension

extension: string

File extension (including the dot)

{
  // ...
  extension: '.yaml'
}

Format

format: string | Document

File format. The currently valid file formats are:

  • format: 'json'

  • format: 'yaml'

If we want to access YAML front matter, format also accepts the following configuration.

// YAML with front matter
{
  // ...
  format: {
    yaml: {
      frontMatter: true,
      contentKey: 'content' 
    }
  }
}
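Conceptually, front matter handling splits a file into the YAML between the '---' markers, which becomes the document, and the remaining text, which is stored under the configured contentKey. The splitFrontMatter function below is a sketch of that split only, not the actual Nabu parser:

```typescript
// Conceptual sketch of front matter splitting; not the actual Nabu parser.
// The YAML between the '---' markers is returned as frontMatter, and the
// rest of the file as content.
function splitFrontMatter(text: string): { frontMatter: string; content: string } {
  const match = /^---\n([\s\S]*?)\n---\n?([\s\S]*)$/.exec(text);
  if (!match) {
    // No front matter markers: the whole file is content.
    return { frontMatter: '', content: text };
  }
  return { frontMatter: match[1], content: match[2] };
}

const file = '---\ntitle: foo\n---\nContent goes here';
splitFrontMatter(file);
// → { frontMatter: 'title: foo', content: 'Content goes here' }
```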

Merge Stat

mergeStat?: Document

Include file information for each document when it is loaded from the file system. Expressions in mergeStat are computed against the underlying file information (lstat). Consider the following example, which includes the path of the file in each document.

// Include file path
{
  // ...
  mergeStat: {
    path: '$path'
  }
}

The following fields are available for merging:

path, dev, mode, nlink, uid, gid, rdev, blksize, ino, size, blocks, atimeMs, mtimeMs, ctimeMs, birthtimeMs, atime, mtime, ctime, birthtime

Any fields defined in mergeStat will be pruned before the document is written to disk
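The pruning step can be sketched as follows. This is an illustration under the assumption that the names of the merged fields are known; pruneMerged is a hypothetical helper, not Nabu's actual code:

```typescript
// Illustrative sketch: fields merged in from file stat information are
// removed again before the document is written back to disk, so they never
// leak into the stored file.
function pruneMerged(doc: Record<string, any>, mergedFields: string[]): Record<string, any> {
  const result = { ...doc };
  for (const field of mergedFields) {
    delete result[field];
  }
  return result;
}

pruneMerged({ _id: 'post1', title: 'Hello', path: '/content/post1.yaml' }, ['path']);
// → { _id: 'post1', title: 'Hello' }
```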

Construct

construct?: Document

Add additional fields to the document. This stage is performed after the mergeStat stage has completed. Expressions in construct are computed against the actual, loaded document.

Consider the following example where we read markdown files with YAML front matter. The markdown content will be stored under a field named markdown and we use construct to add an additional field html that contains the converted output.

{
  path: '/content',
  extension: '.md',
  format: {
    yaml: {
      frontMatter: true,
      contentKey: 'markdown'
    }
  },
  construct: {
    html: {
      $markdownToHtml: '$markdown'
    }
  }
}

Any fields defined in the construct stage will be pruned before the document is written to disk

Default

default?: Document

Set default values for fields that are not present in the stored content.

{
  path: '/content/posts',
  extension: '.md',
  format: {
    yaml: {
      frontMatter: true,
      contentKey: 'markdown'
    }
  },
  default: {
    slug: '$_id'
  }
}
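The behavior of defaults can be sketched like this. The applyDefaults helper is illustrative only, not Nabu's implementation; it assumes that a string value starting with '$' is a field reference into the loaded document, so slug: '$_id' resolves to the document's _id:

```typescript
// Illustrative sketch of applying defaults on load; not Nabu's actual
// implementation. Fields missing from the stored content are filled in,
// and a value like '$_id' is treated as a reference to a document field.
function applyDefaults(
  doc: Record<string, any>,
  defaults: Record<string, any>
): Record<string, any> {
  const result = { ...doc };
  for (const [key, value] of Object.entries(defaults)) {
    if (!(key in result)) {
      result[key] =
        typeof value === 'string' && value.startsWith('$')
          ? result[value.slice(1)] // resolve field reference, e.g. '$_id'
          : value;                 // plain literal default
    }
  }
  return result;
}

applyDefaults({ _id: 'hello-world', title: 'Hello' }, { slug: '$_id' });
// → { _id: 'hello-world', title: 'Hello', slug: 'hello-world' }
```

Note that a default never overwrites a value that is already present in the stored content.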