Inverse Text Normalization (ITN)

Note: The ModelScope pipeline supports inference with every model in the model zoo. The Japanese ITN model is used here as an example to demonstrate usage.

Inference

Quick start

Japanese ITN model 

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Build the ITN pipeline once; the examples below reuse this object.
inference_pipeline = pipeline(
    task=Tasks.inverse_text_processing,
    model='damo/speech_inverse_text_processing_fun-text-processing-itn-ja',
    model_revision=None)

itn_result = inference_pipeline(text_in='百二十三')
print(itn_result)
# 123
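For intuition about what the 百二十三 to 123 conversion involves, here is a toy sketch of kanji-numeral parsing in plain Python. It is not how FunTextProcessing implements ITN (that relies on rule-based grammars); the function name `kanji_to_int` and the tables are hypothetical and cover only standalone numerals.

```python
# Toy illustration only: converts a standalone kanji numeral to an int.
DIGITS = {"一": 1, "二": 2, "三": 3, "四": 4, "五": 5,
          "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
LARGE_UNITS = {"万": 10**4, "億": 10**8}


def kanji_to_int(numeral: str) -> int:
    if not numeral:
        raise ValueError("empty numeral")
    # Positional reading such as 一九九九 (digits only, no unit characters).
    if all(ch in DIGITS for ch in numeral):
        return int("".join(str(DIGITS[ch]) for ch in numeral))

    total = 0    # value of completed 万/億 groups
    section = 0  # value accumulated inside the current group (below 万)
    digit = 0    # pending digit waiting for its unit
    for ch in numeral:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A unit with no leading digit counts once, e.g. 百 = 100.
            section += (digit or 1) * SMALL_UNITS[ch]
            digit = 0
        elif ch in LARGE_UNITS:
            total += (section + digit) * LARGE_UNITS[ch]
            section = digit = 0
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return total + section + digit


print(kanji_to_int("百二十三"))  # 123
```

Applying such a converter character by character to whole sentences would corrupt ordinary words that contain numeral characters (for example the name 幸四郎), which is one reason a grammar-based ITN model is used instead.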

Read text data directly:

rec_result = inference_pipeline(text_in='一九九九年に誕生した同商品にちなみ、約三十年前、二十四歳の頃の幸四郎の写真を公開。')
# 1999年に誕生した同商品にちなみ、約30年前、24歳の頃の幸四郎の写真を公開。

Text stored at a URL, for example: https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/ja_itn_example.txt

rec_result = inference_pipeline(text_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/ja_itn_example.txt')

For the full demo code, please refer to the demo.

API reference

Define pipeline

task: Tasks.inverse_text_processing
model: model name in the model zoo, or path to a model on local disk
output_dir: directory where results are written, if set (default: None)
model_revision: model version to use (default: None)

Infer pipeline

text_in: the input to decode, which can be:
- raw text, e.g.: “一九九九年に誕生した同商品にちなみ、約三十年前、二十四歳の頃の幸四郎の写真を公開。”
- text file, e.g.: https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/ja_itn_example.txt. For text file input, output_dir must be set so that the results can be saved.
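The output_dir requirement for text file input can be checked before the pipeline is called. The helper below is a minimal sketch: `classify_text_in` is a hypothetical name, and its rule for recognising a text file (an http(s) URL or a path ending in .txt) is an assumption of this example, not the pipeline's own detection logic.

```python
from typing import Optional


def classify_text_in(text_in: str, output_dir: Optional[str] = None) -> str:
    """Return "file" or "text" for a text_in value.

    Heuristic (an assumption of this sketch): http(s) URLs and paths ending
    in .txt are treated as text files; everything else is raw text.
    """
    is_url = text_in.startswith(("http://", "https://"))
    is_file = is_url or text_in.lower().endswith(".txt")
    if is_file and output_dir is None:
        raise ValueError("output_dir must be set when text_in is a text file")
    return "file" if is_file else "text"


# Intended use (comments only, since running it needs modelscope and a model):
#   classify_text_in(url, output_dir="./itn_results")   # validate first
#   pipeline(task=..., model=..., output_dir="./itn_results")(text_in=url)
```

Failing early with a clear message avoids running inference on a whole file and only then finding that nothing was saved.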

Modify Your Own ITN Model

The rule-based ITN code is open-sourced in FunTextProcessing, and users can modify the grammar rules for different languages. Taking Japanese as an example, users can add their own whitelist entries in FunASR/fun_text_processing/inverse_text_normalization/ja/data/whitelist.tsv. After modifying the grammar rules, users can export and evaluate their own ITN models in a local directory.
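Adding whitelist entries can be scripted. The sketch below appends one row to a TSV file and skips exact duplicates. The column layout is an assumption: the helper (hypothetically named `add_whitelist_entry`) writes two tab-separated fields and does not decide which column holds the spoken form, so check the existing rows of whitelist.tsv for the expected order before using it.

```python
from pathlib import Path


def add_whitelist_entry(tsv_path: str, first: str, second: str) -> bool:
    """Append "first<TAB>second" to tsv_path; return False if already present."""
    path = Path(tsv_path)
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    entry = f"{first}\t{second}"
    if entry in existing.splitlines():
        return False
    # Make sure the new row starts on its own line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    with path.open("a", encoding="utf-8") as f:
        f.write(f"{separator}{entry}\n")
    return True
```

After editing the whitelist, the model has to be exported again for the change to take effect in a locally evaluated model.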

Export ITN Model

Export the ITN model with FunASR/fun_text_processing/inverse_text_normalization/export_models.py. The following example exports the ITN model to a local folder:

cd FunASR/fun_text_processing/inverse_text_normalization/
python export_models.py --language ja --export_dir ./itn_models/

Evaluate ITN Model

Users can evaluate their own ITN model in a local directory with FunASR/fun_text_processing/inverse_text_normalization/inverse_normalize.py. Here is an example:

cd FunASR/fun_text_processing/inverse_text_normalization/
python inverse_normalize.py --input_file ja_itn_example.txt --cache_dir ./itn_models/ --output_file output.txt --language ja
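To turn the output file into a number, one option is to compare it line by line against hand-written expected results. This is a sketch under two assumptions: output.txt holds one normalized sentence per line, and a reference file with the expected text in the same order exists (here hypothetically called expected.txt). The function `exact_match_accuracy` is not part of FunASR.

```python
from pathlib import Path


def exact_match_accuracy(output_file: str, reference_file: str) -> float:
    """Fraction of lines in output_file identical to the reference line."""
    hyp = Path(output_file).read_text(encoding="utf-8").splitlines()
    ref = Path(reference_file).read_text(encoding="utf-8").splitlines()
    if len(hyp) != len(ref):
        raise ValueError(
            f"line count mismatch: {len(hyp)} outputs vs {len(ref)} references")
    if not ref:
        return 0.0
    # Ignore leading/trailing whitespace when comparing sentences.
    matches = sum(h.strip() == r.strip() for h, r in zip(hyp, ref))
    return matches / len(ref)


# Example call (file names are placeholders):
#   print(exact_match_accuracy("output.txt", "expected.txt"))
```

Sentence-level exact match is strict; it is mainly useful for spotting regressions after a grammar or whitelist change.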