01-04 12:22, 233 views

Implementing OCR Text Recognition in Unity

This article walks through calling the Baidu AI API from Unity to implement OCR text recognition, step by step, and may serve as a reference for study or work.

First, log in to the Baidu developer center and search for the text recognition service:

Create an application to obtain its AppID, APIKey, and SecretKey credentials:

Download the C# SDK and import the AipSdk.dll library into Unity:

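This article uses general text recognition as its example. According to the official documentation, general text recognition returns the following data structure: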

Define the corresponding data structures in Unity:

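using System;

/// <summary>
/// General text recognition response
/// </summary>
[Serializable]
public class GeneralOcr
{
    /// <summary>
    /// Image orientation: -1 undefined, 0 upright, 1 rotated 90 degrees counterclockwise,
    /// 2 rotated 180 degrees counterclockwise, 3 rotated 270 degrees counterclockwise
    /// </summary>
    public int direction;

    /// <summary>
    /// Unique log id, used for troubleshooting (a 64-bit value, so int is too small)
    /// </summary>
    public long log_id;

    /// <summary>
    /// Number of results, i.e. the number of elements in words_result
    /// </summary>
    public int words_result_num;

    /// <summary>
    /// Array of recognition results, one element per recognized line
    /// </summary>
    public WordsResult[] words_result;
}

/// <summary>
/// One recognized line (each element of words_result is a JSON object, not a plain string)
/// </summary>
[Serializable]
public class WordsResult
{
    /// <summary>
    /// Recognized text of the line
    /// </summary>
    public string words;

    /// <summary>
    /// Line confidence information, returned only when probability is requested
    /// </summary>
    public Probability probability;
}

/// <summary>
/// Line confidence information
/// </summary>
[Serializable]
public class Probability
{
    /// <summary>
    /// Average line confidence (a fractional value)
    /// </summary>
    public float average;

    /// <summary>
    /// Variance of the line confidence
    /// </summary>
    public float variance;

    /// <summary>
    /// Minimum line confidence
    /// </summary>
    public float min;
}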
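To see how such a response maps onto serializable classes, here is a minimal sketch. The class name OcrJsonSample, its nested types, and the sample payload are all made up for illustration; the payload is not real API output. It assumes each element of words_result is an object with a words field, and it relies on JsonUtility matching public field names to JSON keys.

```csharp
using System;
using UnityEngine;

// Hypothetical example class: the nested types are trimmed-down stand-ins
// for the response fields described above.
public static class OcrJsonSample
{
    [Serializable]
    public class Line
    {
        public string words;
    }

    [Serializable]
    public class Response
    {
        public long log_id;
        public int words_result_num;
        public Line[] words_result;
    }

    // Illustrative payload only, not captured from the real service.
    private const string SampleJson =
        "{\"log_id\":1234567890123,\"words_result_num\":2," +
        "\"words_result\":[{\"words\":\"Hello\"},{\"words\":\"Unity\"}]}";

    // Parses the sample and returns the recognized lines as plain strings.
    public static string[] ParseSample()
    {
        Response response = JsonUtility.FromJson<Response>(SampleJson);
        var lines = new string[response.words_result.Length];
        for (int i = 0; i < lines.Length; i++)
        {
            lines[i] = response.words_result[i].words;
        }
        return lines;
    }
}
```

JsonUtility leaves fields absent from the JSON at their default values, so optional fields such as direction need no special handling.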

The parameters passed in the call are as follows:

Wrap the call in a function:

using System;
using System.Collections.Generic;
using UnityEngine;

public class OCR
{
    // Obtain these values by creating an application in the Baidu developer center.
    // appID is kept for reference; the client constructor below does not use it.
    private const string appID = "";
    private const string apiKey = "";
    private const string secretKey = "";

    /// <summary>
    /// General text recognition
    /// </summary>
    /// <param name="bytes">Image data as a byte array</param>
    /// <param name="language">Language to recognize; defaults to CHN_ENG (mixed Chinese and English)</param>
    /// <param name="detectDirection">Whether to detect the image orientation</param>
    /// <param name="detectLanguage">Whether to detect the language; Chinese, English, Japanese, and Korean are currently supported</param>
    /// <param name="probability">Whether to return the confidence of each recognized line</param>
    /// <returns>The parsed result, or null if the call throws</returns>
    public static GeneralOcr General(byte[] bytes, string language = "CHN_ENG", bool detectDirection = false, bool detectLanguage = false, bool probability = false)
    {
        var client = new Baidu.Aip.Ocr.Ocr(apiKey, secretKey);
        try
        {
            // Flags are sent as lowercase "true"/"false" strings, the form used in
            // Baidu's documentation; a raw bool would be formatted as "True"/"False".
            var options = new Dictionary<string, object>
            {
                { "language_type", language },
                { "detect_direction", detectDirection ? "true" : "false" },
                { "detect_language", detectLanguage ? "true" : "false" },
                { "probability", probability ? "true" : "false" }
            };
            var response = client.GeneralBasic(bytes, options);
            return JsonUtility.FromJson<GeneralOcr>(response.ToString());
        }
        catch (Exception error)
        {
            Debug.LogError(error);
        }
        return null;
    }
}
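The wrapper above only catches exceptions. Baidu's services can also report failures inside the JSON body through error_code and error_msg fields, and JsonUtility would then quietly produce an empty result. The sketch below shows one way to detect that case before parsing. It assumes the SDK returns a Newtonsoft.Json JObject; the OcrErrors class and TryGetError method are hypothetical names, not part of the SDK.

```csharp
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of the Baidu SDK.
public static class OcrErrors
{
    // Returns true when the response is missing or carries an "error_code" field.
    // On true, message holds a short description for logging.
    public static bool TryGetError(JObject response, out string message)
    {
        message = null;
        if (response == null)
        {
            message = "empty response";
            return true;
        }

        JToken code = response["error_code"];
        if (code == null)
        {
            return false; // no error field, treat as a normal result
        }

        JToken text = response["error_msg"];
        message = code.ToString() + ": " + (text == null ? "" : text.ToString());
        return true;
    }
}
```

Inside General, this check could run on the response before the JsonUtility.FromJson call, logging the message and returning null when it reports an error.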

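The function above calls the API with the image's byte data. To recognize an image by URL instead, replace GeneralBasic with the GeneralBasicUrl method and pass the image URL: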
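A minimal sketch of the URL variant follows. The class name OcrUrlExample and the method GeneralFromUrl are hypothetical, the keys are placeholders, and the code assumes GeneralBasicUrl takes the image URL plus the same options dictionary as GeneralBasic; check the SDK version you downloaded before relying on that signature.

```csharp
using System.Collections.Generic;

// Hypothetical example class for the URL-based call.
public static class OcrUrlExample
{
    // Placeholders: fill in the keys of your own Baidu application.
    private const string apiKey = "";
    private const string secretKey = "";

    // Sends an image URL to the general text recognition service
    // and returns the raw JSON text of the response.
    public static string GeneralFromUrl(string imageUrl, string language = "CHN_ENG")
    {
        var client = new Baidu.Aip.Ocr.Ocr(apiKey, secretKey);
        var options = new Dictionary<string, object>
        {
            { "language_type", language }
        };

        // Assumed signature: GeneralBasicUrl(string url, Dictionary<string, object> options)
        var response = client.GeneralBasicUrl(imageUrl, options);
        return response.ToString();
    }
}
```

The returned JSON can then be parsed with JsonUtility.FromJson in the same way as the byte-based call.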

Test image:

OCR.General(File.ReadAllBytes(Application.dataPath + "/Picture.jpg"));
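The test call can be dropped into a small component, sketched below. The class name OcrTest is hypothetical, and the code assumes Picture.jpg sits directly in the project's Assets folder, which is where Application.dataPath points in the editor. The result is logged by serializing it back to JSON, so the sketch does not depend on the exact shape of the result fields.

```csharp
using System.IO;
using UnityEngine;

// Hypothetical test component: attach it to any GameObject in a scene.
public class OcrTest : MonoBehaviour
{
    private void Start()
    {
        // Assumes the test image is at Assets/Picture.jpg.
        string path = Path.Combine(Application.dataPath, "Picture.jpg");
        if (!File.Exists(path))
        {
            Debug.LogWarning("Test image not found: " + path);
            return;
        }

        // The call is synchronous, so it blocks the main thread until the response arrives.
        GeneralOcr result = OCR.General(File.ReadAllBytes(path));
        if (result == null)
        {
            Debug.LogWarning("OCR call failed; see the error logged above.");
            return;
        }

        // Pretty-print the parsed result for inspection in the Console window.
        Debug.Log(JsonUtility.ToJson(result, true));
    }
}
```

Because the request blocks until the network call returns, a real project would probably move it off the main thread; the sketch keeps it inline for brevity.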

That covers the details of implementing OCR text recognition in Unity.

Original article: https://blog.csdn.net/qq_42139931/article/details/122257969