36.0.2 (6 Apr 2023)
Splitting a page's text into segments when "0" is a segment's only character – We've fixed an issue that prevented a page's text from being split correctly into text segments when segments contained only the "0" character. This issue caused processing delays and excessive memory usage.
Storing pre-calculations for classifying Structured documents – We've resolved an issue that caused invalid memory alloc request errors when the system attempted to store pre-calculated values for the release's Structured layout variations in the database. The issue affected instances with PostgreSQL databases.
User interface for Classification Supervision tasks – We've made the following fixes to the Classification Supervision user interface:
- We've widened the right-hand panel, enlarging the image of the page being categorized.
- We've fixed an issue that caused the screen to flicker each time a keyer clicked on a thumbnail in the left-hand panel.
- We've resolved an issue that caused the right-hand panel to be hidden when a keyer clicked on a page group in the middle panel.
Counting time spent on Classification Supervision tasks where pages are classified as "Other" – We've fixed an issue that prevented time spent on Classification Supervision tasks from being included in Document Classification Supervision Time Spent (Seconds) when keyers classified all pages as "Other" during the tasks. The data for Document Classification Supervision Time Spent (Seconds) appears in the KeyerPerformance.csv file in the Keyer Projection Report.
Updating com.fasterxml.jackson.core:jackson-databind – To address security vulnerabilities, we've updated com.fasterxml.jackson.core:jackson-databind to 2.14.2.
"API Access" tab in the Users section for deployments without AWS ALB – We've fixed an issue that caused the API Access tab to appear in the Users section of the application in deployments that did not use AWS ALB authentication.
Restricting access to /api/v5/audit_logs – We've restricted access to the /api/v5/audit_logs endpoint to System Admins. All other users can no longer access it.
36.0.1 (17 Mar 2023)
Retrieving blank thumbnail images of pages – We've fixed an issue that prevented blank thumbnail images from being retrieved when no thumbnail images existed for a submission. This issue prevented the application from being initialized in some situations.
Classifying Structured documents written in Japanese or Simplified Chinese – We've resolved an issue that caused submissions to halt at the Machine Classification step if they contained Structured documents written in Japanese or Simplified Chinese.
"Perform Tasks" link in Submissions table – We've fixed an issue that prevented the Perform Tasks link from appearing in the Submissions table for submissions with Classification Supervision tasks. The issue affected submissions whose first page was classified by the machine.
Keyer Data Management
Duplicate pages after training – We've fixed an issue that caused pages to be duplicated after their documents were used for training. The issue affected documents that contained at least one empty page.
"Latest version not live" after uploading releases and training data – We've resolved a timestamping issue that caused a "Latest version is not live" warning message to appear after uploading a release and the training data for its models.
Logging of artifact-export events – We've changed the severity of the following events from exceptions to warnings in the logs:
- Missing artifacts list
- Missing storage type
- Missing destination
Logging in without assigned user groups or permissions – Previously, if a user was not assigned to a user group in an identity provider (IdP), or if they were assigned to an IdP user group that did not have any permissions, they could log in to the application, but they could not log out. There was also no messaging to let the user know what they needed to do to resolve the issue. A fix for these issues is included in v36.0.1.
Indexing training-data records – We've fixed an indexing issue that caused duplicates of training-data records to be found during the upgrade process, which prevented instances from being updated to v36.
Authentication and SaaS features when AWS ALB is not used – We've resolved an issue that prevented users from authenticating in some situations when a method other than AWS ALB was used. This issue also caused some SaaS-specific features to be disabled in affected instances.
Application recovery after database failovers – We've fixed an issue that prevented the application from recovering quickly after database failovers. The issue sometimes caused the application to be unresponsive for long periods of time.
36.0.0 (17 Mar 2023)
There is an issue in v36.0.0 that prevents version information from appearing in the UI. For this reason, we recommend using v36.0.1 rather than v36.0.0.
New languages – We've added support for submissions written in the following languages:
We support automation on Structured and Semi-structured submissions in these languages, regardless of whether they contain handwritten or printed data.
To learn more about the languages we support, see Supported Languages.
Improvements to the Korean language model – We've enhanced the system's ability to accurately recognize and transcribe Korean words in fields with Generic Text, Address, Company Name, and Name data types.
Improvements to flow management – In an effort to provide more information about flows and the potential results of certain actions, we've made the following updates to flow-management tasks:
- Option to deploy subflows when deploying flows – When you deploy a flow, a confirmation dialog box appears, which includes an option to deploy all connected subflows.
- Indicators that distinguish subflows from each other – Each subflow shown in Flow Studio has its own highlight color, making it easier to determine whether subflows are identical or different. Flow Studio also now shows more characters of each subflow's title.
- Applying changes to a specific instance of a subflow – You can now choose to save changes to a specific instance of a subflow without impacting the other instances of that subflow in the main flow or in other flows.
- Visual explanation of options when saving changes to a subflow – We've added diagrams explaining the options available when saving changes to a subflow, which illustrate where the changes will be applied.
More details on these updates can be found in Connecting Flow Blocks to Other Flows.
Defining retry policies for flow blocks – You can now define retry policies for flow blocks at the system, flow, or block level. Defining these policies gives you more control over the execution of flows and may prevent submissions from halting when temporary failures occur.
For each policy you create, you can specify the total number of retry attempts that the applicable blocks should have after their initial failure, along with the amount of time that should pass between attempts. For example, you can have a retry policy in which the system retries the block up to three times, with increased time between each attempt.
You can define system-level policies in /admin/hyperflow/wfeconfig/. Flow- and block-specific policies can be defined in the flow settings and block settings, respectively.
For more information, see Defining Automatic Block-Retry Policies.
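The retry behavior described above (a fixed number of retries after the initial failure, with a growing wait between attempts) can be illustrated with a short Python sketch. This is not Hyperscience's implementation or configuration format. The names run_with_retries, retry_delays, max_retries, base_delay, and multiplier are hypothetical and chosen only for this example.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_delays(max_retries: int, base_delay: float, multiplier: float = 2.0) -> List[float]:
    """Return the wait in seconds before each retry; waits grow by `multiplier`."""
    return [base_delay * multiplier ** i for i in range(max_retries)]


def run_with_retries(
    block: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    multiplier: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `block`; after a failure, retry up to `max_retries` more times."""
    delays = retry_delays(max_retries, base_delay, multiplier)
    attempt = 0
    while True:
        try:
            return block()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            sleep(delays[attempt])  # wait longer before each new attempt
            attempt += 1
```

With the defaults, a block that keeps failing is run four times in total (one initial attempt plus three retries), with waits of 1, 2, and 4 seconds between attempts. The sleep parameter is injectable so the logic can be exercised without real delays.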
Custom Entity Detection Block (beta) – This block provides complementary functionality to the capabilities of the Named Entity Recognition Block. The Custom Entity Detection Block can automatically identify a variety of entities including date, SSN, address, policy number, loan number, credit card number, customer ID, account number, employee ID, employer ID, passport number, driver license number, case number, phone number, application number, routing number, and “other.” In general, the block can be configured to locate and identify:
- single words, and
- word patterns that can be described with a combination of regular expressions and keywords.
You need to use Custom Entity Detection Blocks in conjunction with Full Page Transcription Blocks. For example, you can build a redaction flow that processes documents through full-page transcription, then detects all custom entities that are defined in the Custom Entity Detection Block, and at the end uses a Custom Code Block to place black boxes over the detected entities.
To learn more, see the "Custom Entity Detection Block (Beta)" section of Flow Blocks.
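To make the idea of combining regular expressions with keywords more concrete, here is a minimal Python sketch of that general technique. It does not reflect how the Custom Entity Detection Block is configured or implemented. The RULES table, the detect_entities function, the patterns, and the 40-character keyword window are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical rules: each entity type pairs a regular expression with
# keywords, one of which must appear shortly before the match.
RULES: Dict[str, Tuple[str, List[str]]] = {
    "ssn": (r"\b\d{3}-\d{2}-\d{4}\b", ["ssn", "social security"]),
    "policy_number": (r"\b[A-Z]{2}\d{6}\b", ["policy"]),
}


def detect_entities(text: str, window: int = 40) -> List[Tuple[str, str, int, int]]:
    """Return (entity_type, matched_text, start, end) for keyword-backed matches."""
    found = []
    for entity_type, (pattern, keywords) in RULES.items():
        for match in re.finditer(pattern, text):
            # Look only at the text just before the match for a keyword.
            context = text[max(0, match.start() - window):match.start()].lower()
            if any(keyword in context for keyword in keywords):
                found.append((entity_type, match.group(), match.start(), match.end()))
    return sorted(found, key=lambda item: item[2])
```

The character offsets returned for each match are the kind of information a later step, such as a redaction step, would need to locate the detected text.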
Flow Executions page – To make it easier for users outside of the System Admin permission group to access flow-execution information, we’ve added a Flow Executions page to the Flows section of the application. Using the filters on this page, users can view a list of failed flow executions, which cause halted submissions, and retry the halted submissions that meet the filter’s criteria. Clicking the ID of a flow execution opens its Flow Run page, which contains a diagram of the flow and information about the progress of the flow’s execution.
Users need the View Flow Executions permission to access this page. By default, this permission is given to users in the System Admin and Business Admin permission groups.
For more information, see Flow Executions.
Flow Run enhancements – To provide more troubleshooting information about flows and their blocks, we've made the following improvements to the Flow Run page (formerly known as the Flow Execution page):
- New "Code" tab for Custom Code and Python Code Blocks – When you click on a Custom Code Block or a Python Code Block in a flow-execution diagram, a Code tab appears in the bottom panel, containing the Python code for that block.
- Viewing flow inputs, outputs, and errors – You can now view flow-level inputs, outputs, and errors in the Flow Input, Flow Output, and Flow Runtime Errors tabs, respectively, on the Flow Run page.
To learn more about these updates, see Testing and Debugging Flows.
Configuring Custom Code Blocks to accept specific file types as input – With the addition of the File input type within the Parameter class in v36, you can now configure Custom Code Blocks to accept any file type as input, including CSV and JSON files. Users can import and update files of the type expected by the Custom Code Block via the Flows settings sidebar in Hyperscience.
Downloading submission-activity logs – In v36, we’ve added support for downloading submission-activity logs. The submission-activity logs provide you with information about how your submissions progressed through their flows.
To download submission-activity logs, go to the Submissions table, click the menu icon, and then click Download Submission Activity Logs.
The downloaded submission-activity file is in CSV format.
For more details, see Navigating the Submissions Table.
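Because the downloaded file is a CSV, it can be summarized with standard tooling. The Python sketch below counts rows by the values in one column. The column names in the downloaded file are not listed in these notes, so the column is passed in by the caller, and the names used in the usage example are hypothetical.

```python
import csv
from collections import Counter
from typing import TextIO


def count_by_column(csv_file: TextIO, column: str) -> Counter:
    """Count rows in a CSV file, grouped by the values in `column`."""
    reader = csv.DictReader(csv_file)
    if column not in (reader.fieldnames or []):
        raise KeyError(f"Column not found in CSV header: {column}")
    return Counter(row[column] for row in reader)
```

For example, with a file opened as open("activity.csv", newline=""), calling count_by_column(f, "activity") would return how often each value appears in a column named "activity", assuming the file has such a column.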
Text Classification improvements – In v34, we introduced a "preview" version of the Text Classification feature. In v36, we've made updates to the application that streamline the use of Text Classification and make the model-management experience similar to that of other models:
- Text Classification models are now included on the Models page (Library > Models).
- A Model Details page is available for each Text Classification model, which shows the model's projected automation, training data, and information on each set of samples used for training.
- From the Model Details page, you can run training for a model, monitor the status of the training, and deploy the model after training is complete.
- You can also import and export Text Classification models.
To learn more about Text Classification in v36, see Text Classification.
Layout Variation Alerting – With Layout Variation Alerting, users are notified if pages marked as “No Layout Found” are matched to existing layout variations that are not included in the flow's release. When Layout Variation Alerting is enabled, the system attempts to find layout variations for pages marked as “No Layout Found” on a nightly basis.
Note that Layout Variation Alerting is not available in SaaS deployments of Hyperscience. Also, we do not recommend enabling it in instances that process more than a million pages per day.
To learn more about Layout Variation Alerting and how to enable it, see Layout Variation Alerting.
Extracting data points from unstructured documents – Unstructured extraction allows you to extract data points from long documents with unstructured text. To leverage the automation capabilities of unstructured extraction, you need to select the new field ID model called UNSTRUCTURED_EXTRACTION under Flex Engine Type for Training at /admin/form_extraction/template/ for a given layout before training.
With the introduction of this new ID model, you can achieve automation based on the threshold you specify in the Field Identification Target Accuracy flow setting.
To upload and annotate unstructured documents, you can use the Keyer Data Management functionalities in the Model Details page.
Note that unstructured extraction is available in SaaS deployments only.
To learn more about the new UNSTRUCTURED_EXTRACTION model, see Training a New Field Identification Model.
Field Anomaly Detection – As your keyers identify field values in documents, they may sometimes select different instances of the same value across documents. These inconsistencies lead to decreased performance in Field Identification models over time. The Field Anomaly Detection feature analyzes training data before it is used in model training and flags potential mistakes and inconsistencies in field identification. You can then review these annotations and verify or edit them before they are used in training.
Field Anomaly Detection runs as part of Training Data Analysis, which you can run from the Model Details page.
The system highlights documents that contain potential anomalies, and you can review each document's annotations by clicking the Edit annotations link for that document.
For more information about Field Anomaly Detection, see Detecting and Correcting Anomalies in Field Annotations.
Enhanced signature detection – With the updates made in v36, the system can better detect signature fields, improving automation in the processing of signatures.
Separate thresholds for the Automatic QA Sample Rate flow setting – To give you more flexibility when using automatic QA sampling, you now have separate thresholds for the Automatic QA Sample Rate flow setting. We’ve separated the thresholds in the following way:
- Structured Text Transcription QA Sample Rate
- Structured Checkbox Transcription QA Sample Rate
- Structured Signature Transcription QA Sample Rate
- Semi-structured Transcription QA Sample Rate
Note that the Automatic QA Sample Rate feature is optional, and you can still manually set QA sample rates.
To learn more, see Flow Settings.
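As a conceptual illustration of per-category sample rates, the Python sketch below decides whether an item is sampled for QA based on a rate looked up by category. It is not how Hyperscience selects QA samples. The dictionary keys, the rate values, and the should_sample_for_qa function are hypothetical.

```python
import random
from typing import Dict, Optional

# Hypothetical rates; the keys loosely mirror the four settings listed above.
SAMPLE_RATES: Dict[str, float] = {
    "structured_text": 0.05,
    "structured_checkbox": 0.02,
    "structured_signature": 0.10,
    "semi_structured": 0.05,
}


def should_sample_for_qa(
    category: str,
    rates: Optional[Dict[str, float]] = None,
    rng: Optional[random.Random] = None,
) -> bool:
    """Return True if an item of this category is selected for QA."""
    rates = SAMPLE_RATES if rates is None else rates
    rng = rng or random.Random()
    # random() is in [0, 1), so a rate of 0 never samples and 1 always does.
    return rng.random() < rates[category]
```

Keeping one rate per category means that, for example, signatures can be sampled more heavily than checkboxes without changing the rate for other transcriptions.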
Improved, faster transcription of PDFs – Previously, if a submission contained a PDF file, the system would convert each of the file's pages into an image before extracting data from the file. In v36, you can choose to extract data from the PDF directly rather than from images of its pages. This update improves machine transcription in PDFs and increases the speed of transcription.
To enable this feature, select the Faster PDF Transcription option in the Machine Classification Block's settings.
Note that you cannot have both Image Correction and Faster PDF Transcription enabled in the same Machine Classification Block, and PDFs must be oriented correctly in order for the Faster PDF Transcription feature to perform as intended.
For more information on the Faster PDF Transcription option, see the "Machine Classification" section of Flow Blocks.
Optimizations for full-page transcription – We've enhanced full-page transcription to make it faster and more accurate, particularly when processing long lines of text.
Transcription automation for fields with multiple bounding boxes – We've added support for transcription automation for fields identified with multiple bounding boxes in Semi-structured layouts.
Decision dependencies – The addition of decision dependencies for Custom Supervision tasks provides you with more flexibility for configuring available options in decision drop-down menus. In v36, Custom Supervision tasks support both decision dependencies within a single document and decision dependencies across multiple documents.
Configuring decision dependencies within a single document allows you to present different options in decision drop-down menus based on user input. For example, you can configure Custom Decision 1 with two possible answers. Based on the selected answer for Custom Decision 1, you will receive different possible answers for Custom Decision 2.
The newly-added support for decision dependencies across multiple documents also allows you to present different options in decision drop-down menus based on user input. The difference here is that your answers in one of the documents affect the possible answers in other documents. For example, you can configure Custom Decision 1 in Document 1 with two possible answers. Based on the selected answer for Custom Decision 1, you will receive different possible answers for Custom Decision 2 in Document 2.
Note that you can configure decision dependencies only for documents and cases.
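The two kinds of dependency described above can be pictured as a lookup from a parent decision's answer to the options offered for a dependent decision. The Python sketch below models that idea only. It is not the Custom Supervision configuration format, and the option values, BASE_OPTIONS, DEPENDENT_OPTIONS, and available_options are hypothetical.

```python
from typing import Dict, List, Tuple

DecisionKey = Tuple[str, str]  # (document name, decision name)

# Decisions with a fixed set of answers.
BASE_OPTIONS: Dict[DecisionKey, List[str]] = {
    ("Document 1", "Custom Decision 1"): ["Yes", "No"],
}

# Decisions whose answers depend on another decision's selected answer.
DEPENDENT_OPTIONS: Dict[DecisionKey, dict] = {
    # Dependency within a single document.
    ("Document 1", "Custom Decision 2"): {
        "depends_on": ("Document 1", "Custom Decision 1"),
        "options": {"Yes": ["A", "B"], "No": ["C"]},
    },
    # Dependency across documents.
    ("Document 2", "Custom Decision 2"): {
        "depends_on": ("Document 1", "Custom Decision 1"),
        "options": {"Yes": ["D"], "No": ["E", "F"]},
    },
}


def available_options(decision: DecisionKey, answers: Dict[DecisionKey, str]) -> List[str]:
    """Return the drop-down options for `decision`, given the answers so far."""
    if decision in BASE_OPTIONS:
        return list(BASE_OPTIONS[decision])
    rule = DEPENDENT_OPTIONS[decision]
    parent_answer = answers.get(rule["depends_on"])
    # No options are offered until the parent decision has been answered.
    return list(rule["options"].get(parent_answer, []))
```

In this model, the only difference between the single-document and cross-document cases is whether the parent decision belongs to the same document as the dependent one.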
Mandatory decisions – We’ve added support for mandatory decisions in Custom Supervision tasks. You can mark decisions that are critical for downstream processing and data quality as mandatory. Users must assign a value to each mandatory decision before they can move on to the next Supervision task.
To learn more about these updates, see Custom Supervision.
Keyer Data Management
Changing training statuses of multiple documents simultaneously – With the introduction of the Edit training status option in v36, you can now edit the training statuses of multiple documents in bulk. To take advantage of this functionality, select documents from the Training Documents table on a model's Model Details page, and then click the Edit training status option that is located in the Actions drop-down menu.
You can only change the statuses of annotated documents. If any of the selected documents are not annotated, a warning message will appear in the Edit Training Status dialog box.
Annotation suggestions for fields with multiple bounding boxes – To expand the capabilities of the Guided Data Labeling feature, the annotation suggestions now provide you with predictions about where all bounding boxes of a field might be located.
For more information on these updates, see Keyer Data Management.
Hourly breakdown of System Throughput report – We’ve added support for downloading hourly breakdowns of the System Throughput report (Reporting > Overview).
Note that hourly data is available only from the date v36 begins running in your instance. To manage database size, the system retains hourly data for up to 30 days. Once 30 days of data have accumulated, the oldest day’s data is deleted as each new day passes.
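The 30-day retention described above behaves like a rolling window. The Python sketch below shows that pattern under one assumption: the window counts the current day, so the oldest retained day is 29 days before today. The exact boundary used by the application is not stated in these notes, and prune_hourly_data and RETENTION_DAYS are hypothetical names.

```python
from datetime import date, timedelta
from typing import Dict, List

RETENTION_DAYS = 30  # retention period stated in the note above


def prune_hourly_data(
    hourly_by_day: Dict[date, List[int]], today: date
) -> Dict[date, List[int]]:
    """Keep only the most recent RETENTION_DAYS days of hourly data."""
    # Assumption: the window includes today, so the cutoff is 29 days back.
    cutoff = today - timedelta(days=RETENTION_DAYS - 1)
    return {day: hours for day, hours in hourly_by_day.items() if day >= cutoff}
```

Running a function like this once per day drops exactly one day of data each time after the window is full, which matches the behavior described in the note.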
To learn more about the System Throughput report, see System Throughput.
“Supervision” column in the Usage report – We’ve added a Supervision column to the following CSV files in the Usage report (Reporting > Usage):
The Supervision column indicates what the Transcription Supervision setting is for each field in the report. The Transcription Supervision settings are defined in the Layout Editor for each field in a layout variation.
For more information, see Usage Report.
Filtering by field type in the All Users Performance Summary report – We’ve added a Field Type filter to the All Users Performance Summary report (Reporting > User Performance). You can now choose to filter the report by one of the following field types:
Note that each download of the report includes only the data that meets the filter criteria.
More details can be found in All Users Performance Summary.
Importing and exporting system settings – To let you move system settings between multiple instances that are on the same major version, we’ve added support for importing and exporting the settings found in Administration > System Settings. You can find the import and export functionality at Administration > Import/Export. The export functionality lets you select which settings you want to export.
After making your selections, you can export your system settings to a JSON file. You can then use this JSON file to import your system settings to other instances.
More details can be found in Importing & Exporting System Settings.
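The export-then-import workflow can be pictured as serializing a chosen subset of settings to JSON and merging that JSON into another instance's settings. The Python sketch below illustrates only that general idea. The structure of the JSON file that Hyperscience produces is not described here, and the setting names and the export_settings and import_settings functions are hypothetical.

```python
import json
from typing import Any, Dict, Iterable


def export_settings(settings: Dict[str, Any], selected: Iterable[str]) -> str:
    """Serialize only the selected settings to a JSON string."""
    chosen = {name: settings[name] for name in selected if name in settings}
    return json.dumps(chosen, indent=2, sort_keys=True)


def import_settings(current: Dict[str, Any], exported_json: str) -> Dict[str, Any]:
    """Return a copy of `current` with the exported settings applied on top."""
    merged = dict(current)
    merged.update(json.loads(exported_json))
    return merged
```

Settings that were not selected for export are left untouched on the target side in this sketch, which is one plausible reading of a selective export.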
End of support for PostgreSQL 10.x in Hyperscience v38 – Beginning in v38, the Hyperscience application will no longer support PostgreSQL 10.x. The PostgreSQL project ended its own support for 10.x on November 10, 2022.
The following databases will be supported in v38:
- PostgreSQL 12.x, 13.x, and 14.x
- Amazon RDS for PostgreSQL
- Oracle 19c with DBMS_ALERT privileges
- Amazon RDS for Oracle
- Microsoft SQL Server (MSSQL) 2016, 2017, and 2019 with Service Broker enabled
- Amazon RDS for SQL Server
- Azure SQL Managed Instance
For more information on database requirements, see Infrastructure Requirements (Production).
Support for Red Hat OpenShift – To enhance the deployment experience, we now support the use of Red Hat OpenShift. With this enhancement, you can now deploy the Hyperscience application on your own Red Hat OpenShift instance.
Flows endpoints – With the /api/v5/flows endpoints, you can manage your flows programmatically and create scripts to automate frequently performed tasks.
You can complete the following actions with these endpoints:
- List all flows in your instance
- Retrieve information about a specific flow
- Import a flow from a JSON or ZIP file
- Deploy or disable a flow
- Archive or restore a flow
More information about these endpoints can be found in our API documentation.
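As a starting point for scripting against these endpoints, the Python sketch below builds and sends a request that lists flows. Only the /api/v5/flows path comes from these notes. The base URL, the token-style Authorization header, and the shape of the JSON response are assumptions, so check the API documentation for the exact authentication scheme and response format before relying on this.

```python
import json
import urllib.request
from typing import Any


def build_list_flows_request(base_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for the flows endpoint of an instance."""
    url = base_url.rstrip("/") + "/api/v5/flows"
    headers = {
        # Assumed header format; confirm it against the API documentation.
        "Authorization": f"Token {token}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def list_flows(base_url: str, token: str) -> Any:
    """Send the request and return the parsed JSON response body."""
    request = build_list_flows_request(base_url, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating request construction from sending keeps the URL and header logic easy to inspect without contacting a live instance.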
Importing and exporting models with Artifacts endpoints – We've added functionality to the /api/v5/artifacts endpoints that allows you to import and export Field Identification, Table Identification, and Classification models without logging in to the application.
Imported models do not replace live models. If the imported model matches a live model that doesn't already have a candidate model, the system saves the imported model as a candidate model. If a candidate model already exists, the import fails.
To learn more about importing and exporting models with the Artifacts endpoints, see our API documentation.
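The import rules above amount to a small decision procedure: an imported model never replaces the live model, it becomes the candidate if no candidate exists, and the import fails otherwise. The Python sketch below models those rules with hypothetical names (ModelSlot, CandidateAlreadyExists, import_model). The notes do not say what happens when an imported model matches no live model, so the sketch simply refuses that case.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class CandidateAlreadyExists(Exception):
    """Raised when an import would overwrite an existing candidate model."""


@dataclass
class ModelSlot:
    live: str
    candidate: Optional[str] = None


def import_model(slots: Dict[str, ModelSlot], model_key: str, imported: str) -> None:
    """Store `imported` as the candidate for the matching live model."""
    slot = slots.get(model_key)
    if slot is None:
        # Not covered by the release notes; this sketch rejects the import.
        raise LookupError(f"No live model matches {model_key!r}")
    if slot.candidate is not None:
        raise CandidateAlreadyExists(f"{model_key!r} already has a candidate model")
    slot.candidate = imported  # the live model is left unchanged
```

A script that imports models in bulk would need to handle the failure case, for example by reporting which models already have a candidate.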