cloud to edge.
FØCAL is a web service that allows complex image and video processing pipelines to be built, tested, and deployed on scalable compute resources. Several important performance and usability goals are realized in the FØCAL software architecture. Most importantly, the public API provides a way for third-party developers to build and deploy FØCAL pipelines to suit the needs of their specific applications.
This document describes basic workflows made available to third-party developers through FØCAL’s public API. Most of the API interactions discussed in this document are issued through a command line interface or client code. They can, however, also be performed manually through our Dashboard UI. Discussion of Dashboard features is outside the scope of this document.
FØCAL maintains repositories for code, documentation, and issue tracking on GitHub.
- f0cal-bug – Project-wide issue tracking. Please submit everything here.
- f0cal-cli – The command line interface. Critical for developers.
- f0cal-spec – Machine-readable API documentation.
- f0cal-client-py – Official Python client. Generated from f0cal-spec using fullmetal.
- f0cal-sdk – Required to contribute service-side functionality. Apply for SDK access here.
- fullmetal – Full-stack code generation for Python. This is what we used to expose the FØCAL API to the web.
Following emerging best practices – see Swagger, RAML, and others – the FØCAL API is “self-documenting.” The XML instance describing the API can be found in f0cal-spec. Official client bindings are auto-generated from this XML instance using fullmetal.
Comprehensive, up-to-date, human-readable docs are best obtained using f0cal-cli.
Web-based documentation is also provided on a per-repository basis.
First, some terminology:
- Pipeline – A series of processing operations through which data is extracted from images or frames of video.
- Audit – The process of determining how well a given pipeline is doing its intended job.
- Ground truth – The data with which an audit is performed. This data provides both pre-condition and post-condition information for assessing a pipeline’s fitness, and is typically assembled by human experts.
- Bundle – A data structure for tracking the performance of one or more pipelines over time.
- Client credentials – A unique string of characters that allows client code to encrypt and decrypt communication with the API.
The FØCAL command line interface (CLI) is the principal mechanism by which service-bound resources are managed. This includes but is not limited to:
- Creating accounts
- Initializing client credentials
- Listing deployed resources
- Starting and stopping pipelines
- Connecting third-party services
The FØCAL CLI is written in Python, and requires both
for installation. We recommend performing the install within a
Installing the CLI will also install Python client bindings, the utility of which will be discussed in upcoming sections.
The first steps that you’ll take with FØCAL involve establishing the
authentication tokens that are required to communicate with the
API. These will require a valid email address. For the purposes of our
examples, we’ll use
Working with data
Computer vision systems are concerned with data of two important types:
- Live data – These are transient data that come from sources external to FØCAL, such as web cams and submission of images to third-party applications.
- Persisted data – These data exist in persistent storage, and are used for system training, testing, and audits.
The current section is concerned exclusively with persisted data. Working with live data will be discussed in upcoming sections, in the context of pipeline deployment.
With so many popular options available for cloud-based image and video storage, FØCAL does not provide its own persistence mechanism. Instead, we integrate Dropbox, Google Drive, Flickr, and other third-party storage services. The following CLI examples illustrate how to add an external data store to your account:
A data set is a collection of persisted images or videos that a user curates for system training or testing purposes. Data sets can be assembled from content persisted on multiple data stores.
The check command deserves a special mention. Since FØCAL relies on
external services to persist your image data, we need to have some
guarantee that your images aren’t being modified without FØCAL’s
knowledge. When FØCAL imports new data, it generates a unique hash
from the bytes of each imported image or video. This provides certain
guarantees about the data to future usage scenarios, namely that the
data are exactly the same, bit-for-bit, as when they were imported.
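The integrity guarantee described above can be sketched in a few lines of Python. This is a minimal illustration rather than FØCAL's implementation: the hash function FØCAL uses is not stated here, so SHA-256 is an assumption, and the function names `content_hash` and `is_unmodified` are hypothetical.

```python
import hashlib
from pathlib import Path


def content_hash(path, chunk_size=1 << 20):
    """Return a hex digest computed from the raw bytes of a file.

    SHA-256 is an assumption; the hash FØCAL actually uses is not documented here.
    """
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in chunks so that large videos do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unmodified(path, recorded_hash):
    """True if the file is bit-for-bit identical to what was hashed at import."""
    return content_hash(path) == recorded_hash
```

Because the digest depends on every byte, even a single-bit change made on the external storage service would produce a different hash and cause the comparison to fail.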
Working with pipelines
Pipelines are the fundamental units of data processing in the FØCAL architecture. Each pipeline encapsulates a series of operations designed by its author to extract information of interest from ingested imagery. The process of designing a pipeline to extract a specific kind of information is beyond the scope of this document and will be discussed in future tutorials.
Pipelines can be created from other, pre-existing pipelines.
Pipelines can also be authored “from scratch.”
One of the core design features of the FØCAL architecture is that even the most complex pipelines can be fully serialized to a single, readable document. All the ingredients necessary to exactly reproduce a particular analytic can be captured in a form that is easy to copy, modify, and reuse. The serialization format that FØCAL relies on is XML.
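To make the idea of a fully serialized pipeline concrete, here is a rough sketch of a round trip through XML using Python's standard library. The element and attribute names (`pipeline`, `operation`, `param`) and the operation names are invented for illustration; the real document structure is defined by FØCAL's pipeline XML schema, which is not reproduced here.

```python
import xml.etree.ElementTree as ET


def serialize_pipeline(name, operations):
    """Serialize a pipeline to an XML string.

    `operations` is a list of (operation_name, params_dict) tuples.
    Element and attribute names are hypothetical, not FØCAL's schema.
    """
    root = ET.Element("pipeline", {"name": name})
    for op_name, params in operations:
        op = ET.SubElement(root, "operation", {"type": op_name})
        for key, value in params.items():
            ET.SubElement(op, "param", {"name": key, "value": str(value)})
    return ET.tostring(root, encoding="unicode")


def deserialize_pipeline(xml_text):
    """Rebuild (name, operations) from XML; parameter values come back as strings."""
    root = ET.fromstring(xml_text)
    operations = [
        (op.get("type"), {p.get("name"): p.get("value") for p in op.findall("param")})
        for op in root.findall("operation")
    ]
    return root.get("name"), operations


doc = serialize_pipeline(
    "edge-demo", [("blur", {"radius": 2}), ("threshold", {"level": 128})]
)
name, ops = deserialize_pipeline(doc)
```

The point of the sketch is that the whole pipeline, operations and parameters alike, survives as one text document that can be copied, diffed, and edited.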
Pipelines can be authored in a number of ways:
- By hand – This option involves working directly with XML syntax in your favorite editor. The resultant instance document should conform to our pipeline XML schema.
- Dashboard GUI – FØCAL provides a graphical user interface for users who prefer interactive feedback about pipeline correctness.
- Client bindings – Official and third-party bindings allow the pipeline XML instance to be authored more intuitively, and without special knowledge of XML syntax. See next section for examples.
Python client bindings
Pipelines can accept persisted data as input.
Deployment is the process by which a pipeline is readied for live data ingest.
The --live-input directive causes the CLI to return a reference to
the live pipeline in the form of a URL. This URL is globally
accessible and accepts input data via standard HTTP or web
sockets. Only authorized FØCAL clients are permitted to communicate
with a deployed pipeline.
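As a rough sketch of what submitting data to a deployed pipeline over HTTP might look like, the following builds (but does not send) a POST request with Python's standard library. The URL, the bearer-token header, and the content type are all assumptions made for illustration; the actual URL is whatever the CLI returns, and the actual authorization scheme is defined by the FØCAL client credentials.

```python
import urllib.request


def build_frame_request(pipeline_url, image_bytes, token):
    """Build (but do not send) an HTTP POST carrying one image to a live pipeline.

    The bearer-token header and content type are assumptions for illustration.
    """
    return urllib.request.Request(
        pipeline_url,
        data=image_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/octet-stream",
        },
    )


# Placeholder URL and token; a real URL would be the one returned by the CLI.
request = build_frame_request(
    "https://example.invalid/pipelines/demo", b"\x89PNG", "TOKEN"
)
# urllib.request.urlopen(request) would send it; omitted because the endpoint is fictional.
```

In practice the official client bindings would be expected to handle this exchange, so hand-built requests like this one are mainly useful for understanding what travels over the wire.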
FØCAL provides bundles for tracking system performance across structural differences in constituent pipelines, structural changes to individual pipelines, and modifications to training data.
Bundles are important for two reasons:
- Comparison – Several structurally different pipelines can be imagined to solve any given analysis problem. But which one is the best for your unique input data? Bundles were developed to provide apples-to-apples comparisons across different pipelines.
- Debugging – Computer vision systems rely heavily on numerical algorithms. Bugs in numerical code are pernicious in ways that other software bugs are not. They tend to reveal themselves not with obvious stack traces, but instead in subtle changes to system performance. A best practice for dealing with numerical bugs is to create a kind of regression suite based on ground truth data. Bundles serve this purpose.
Basic operations on bundles include creation, adding data sets, adding pipelines, performing audits, and reviewing results.
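The operations listed above can be illustrated with a small in-memory model. This is a sketch of the concept only: the `Bundle` class, its method names, and the example pipeline names, revisions, and scores are all hypothetical and do not reflect FØCAL's client bindings or storage format.

```python
from dataclasses import dataclass, field


@dataclass
class Bundle:
    """Tracks audit scores for several pipelines across revisions (hypothetical)."""

    name: str
    data_sets: list = field(default_factory=list)
    # pipeline name -> list of (revision, score), in the order audits were run
    history: dict = field(default_factory=dict)

    def add_data_set(self, data_set):
        self.data_sets.append(data_set)

    def add_pipeline(self, pipeline_name):
        self.history.setdefault(pipeline_name, [])

    def record_audit(self, pipeline_name, revision, score):
        self.history[pipeline_name].append((revision, score))

    def best_pipeline(self):
        """Pipeline whose most recent audit score is highest."""
        latest = {p: runs[-1][1] for p, runs in self.history.items() if runs}
        return max(latest, key=latest.get)

    def regressions(self, tolerance=0.0):
        """Pipelines whose latest score fell below the previous one."""
        return [
            p
            for p, runs in self.history.items()
            if len(runs) >= 2 and runs[-1][1] < runs[-2][1] - tolerance
        ]


bundle = Bundle("face-detect")
bundle.add_data_set("ground-truth-v1")
for pipeline_name in ("cascade", "cnn"):
    bundle.add_pipeline(pipeline_name)
bundle.record_audit("cascade", "a1b2c3", 0.81)
bundle.record_audit("cnn", "d4e5f6", 0.90)
bundle.record_audit("cnn", "0a1b2c", 0.84)  # a later revision that scores worse
```

Here `best_pipeline()` covers the comparison use case, while `regressions()` covers the debugging use case by flagging a pipeline whose score dropped between revisions.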
Audits are a necessary evil. Machine learning systems – of which computer vision systems are a subset – can’t be trusted to generalize well. It is highly likely that (1) deployed systems will encounter input data that are qualitatively different from those that they were designed to handle, and that (2) new error modes will appear as a result. Though latent, these novel error modes will be detrimental to system performance. The only way to identify them is to compare known inputs to desired outputs in a regimented fashion. FØCAL audits support precisely this process.
The first step in performing an audit is establishing a ground truth data set, or a data set that maps known inputs to desired outputs. There are a number of ways to establish ground truth data, and many features of the FØCAL architecture are designed to expedite the process. Discussion of these is beyond the scope of this document.
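Conceptually, an audit against ground truth reduces to running a pipeline on known inputs and comparing the results with the desired outputs. The sketch below shows that loop under simplifying assumptions: the pipeline is modeled as a plain Python callable, outputs are compared by exact equality, and the `audit` function and toy classifier are hypothetical stand-ins rather than FØCAL's audit mechanism.

```python
def audit(pipeline, ground_truth):
    """Run `pipeline` on each known input and compare against the desired output.

    `pipeline` is any callable; `ground_truth` is a list of (input, expected) pairs.
    Returns the fraction correct plus the failing cases for inspection.
    """
    failures = []
    for sample, expected in ground_truth:
        actual = pipeline(sample)
        if actual != expected:
            failures.append((sample, expected, actual))
    total = len(ground_truth)
    accuracy = (total - len(failures)) / total if total else 0.0
    return accuracy, failures


# Toy stand-in for a pipeline: classify a pixel intensity as "dark" or "bright".
def toy_pipeline(intensity):
    return "bright" if intensity > 127 else "dark"


ground_truth = [(10, "dark"), (200, "bright"), (127, "dark"), (128, "dark")]
accuracy, failures = audit(toy_pipeline, ground_truth)
```

Returning the failing cases alongside the score matters: the list of mismatches is what exposes a new error mode, whereas a single accuracy number only signals that something changed.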
FØCAL data sets, pipelines, and bundles are all backed by Git
repositories. State on these data structures can be updated by
modifying the file structure within the corresponding Git repositories
and executing a
As with any revision-controlled data structures, FØCAL data sets, pipelines, and bundles can all be branched. The FØCAL API supports referencing these data structures by branch name and commit hash. The FØCAL git service includes a number of hooks that will prevent malformed data structures from being submitted.
Discussion of direct manipulation of FØCAL data structures contained in the git repositories is beyond the scope of this document. Please see the relevant technical documentation for:
The FØCAL API, and specifically its client CLI, supports a number of important workflows that can be leveraged quickly and easily by third-party developers. These workflows allow the developer to build, test, and deploy FØCAL resources on the cloud in support of their specific applications. Data sources, data sets, pipelines, bundles, and audits and their utility were all discussed in detail. Demonstrations were made of realistic API interactions throughout. Developers familiar with similar tools should be fully empowered to start using FØCAL productively.