This workflow needs to be updated with respect to the v1.2 User Interaction Workflow. -JH. Feb 3, 2010.

Introduction

This document describes user interaction with the Decapod system and the technical back-end activity that occurs in Decapod. Structuring this information as a start-to-finish workflow serves as a work plan for both design and development, and provides a conceptual view of how Decapod functions as a whole.

If you are new to Decapod, this document may be heavy on details, but it gives a good description of how the system will work and where our work is headed.

Digitization Process Diagram (Draft 2/August 6, 2009)

The following is a graphic illustrating the high-level workflow. More detailed descriptions follow further in this document.

Download PDF of digitization process diagram, Draft 2 (August 6, 2009)

Download PDF of digitization process diagram, Draft 1

Overview of Workflow for User and System

  1. Start Decapod
  2. Assemble hardware if not a fixed installation (Wireframe: Camera Setup Wizard)
  3. Calibrate (Wireframe: Camera Setup Wizard)
  4. User begins capturing (Wireframe: Detailed and Thumbnail View)
  5. User manages individual pages for exporting (Wireframe: Future work)
  6. User indicates they want to export (Wireframe: Menu Bar)
  7. Output generated to PDF (Wireframe: Future work)

Start Decapod

Hardware Assembly

Calibration

The process:

Capturing

Capture Post-Processing

Quality Control / Image Management

Data Storage / File Formats

Close the Session

Remastering for Output

Interaction has not been finalized yet.

There is no real user interaction per se, but there will likely be some notifications so users are aware of what the system is doing.

Wireframe: Future work

Input: Sequence of page spreads

High-level workflow:

  1. page segmentation
  2. line segmentation
  3. document flow and hierarchy analysis
  4. character segmentation - clustering - tokenization

Output: Pages that are appropriately split, OCR'ed, segmented, and flowed, with a font generated and proper document structure
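
The remastering workflow above can be read as an ordered pipeline over a sequence of page spreads. The following is only a minimal sketch of that ordering, not Decapod's actual implementation: the `Document` container, the `make_stage` helper, and the `remaster` function are hypothetical names introduced for illustration, and each stage is a placeholder that merely records that it ran.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Document:
    """Hypothetical container passed between remastering stages."""
    spreads: List[str]                                   # input: sequence of page spreads
    completed: List[str] = field(default_factory=list)  # names of stages run so far


Stage = Callable[[Document], Document]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records its own name (assumption:
    the real stages would transform the document instead)."""
    def stage(doc: Document) -> Document:
        doc.completed.append(name)
        return doc
    return stage


# Stage order is taken from the high-level workflow; the bodies are placeholders.
PIPELINE: List[Stage] = [
    make_stage("page segmentation"),
    make_stage("line segmentation"),
    make_stage("document flow and hierarchy analysis"),
    make_stage("character segmentation - clustering - tokenization"),
]


def remaster(spreads: List[str]) -> Document:
    """Run every stage in order over a copy of the input spreads."""
    doc = Document(spreads=list(spreads))
    for stage in PIPELINE:
        doc = stage(doc)
    return doc
```

Calling `remaster(["spread-001", "spread-002"])` returns a `Document` whose `completed` list names the four stages in workflow order. Keeping the stages in a plain list makes it easy to reorder or insert steps while the interaction design is still unsettled.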

Page Segmentation

The Tokenization Process:

Layout Detection Process:

OCR

Quality Control

During Capture

Wireframe: Future work

During Remastering

Layout Correction

Possible functionality

Technically possible, but takes resources. Possibly beyond scope of project.

Output

Scripts run:

Remastering Notes / Discussions

Wireframe: Future work