Microsoft Cognitive Services – General availability for Face API, Computer Vision API and Content Moderator

Author: Microsoft Azure

This post is a translation of "Microsoft Cognitive Services – General availability for Face API, Computer Vision API and Content Moderator", posted on April 19.

This time we present an article from the Cognitive Services team.

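Microsoft Cognitive Services is a set of services for developers building next-generation applications that can see, hear, understand, and interpret their surroundings through natural methods of communication. With these services, you can add intelligent features to your platform more easily.

At the online event Microsoft Data Amp, held today, Microsoft announced the general availability of the Face API, Computer Vision API, and Content Moderator API in Microsoft Cognitive Services.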

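Face API: Detects human faces, compares similar faces, groups people by visual similarity, identifies previously tagged people, and determines their emotions.
Computer Vision API: Helps you understand the content of an image. For example, it can create tags that identify the celebrities and actions in an image and write a readable sentence describing its content. It can also detect landmarks and handwritten text in images. Handwriting detection remains in preview.
Content Moderator: Provides machine-assisted moderation of text and images, augmented with human review tools. Video moderation is available in preview as part of Azure Media Services.

The rest of this post describes what you can do with these APIs.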

Anna explains the latest updates to Cognitive Services

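Adding vision capabilities to your app

Until now, the Face API could return age, gender, face position, and pose. With this release, the same Face API call can also return emotion. This is useful when you need, for example, age and emotion at the same time. For details, see the Face API documentation.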
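As a rough illustration of asking for several attributes in one call, the sketch below builds a request URL in Python. It assumes the Face API v1.0 `detect` operation with its `returnFaceAttributes` query parameter and the same `westus` host used by the Computer Vision samples in this post; the helper name `build_detect_url` is made up for this example, and the post itself does not show a Face API request.

```python
from urllib.parse import urlencode

# Assumption: the Face API is reached on the same "westus" host as the
# Computer Vision samples in this post.
FACE_DETECT_ENDPOINT = "https://westus.api.cognitive.microsoft.com/face/v1.0/detect"


def build_detect_url(attributes):
    """Return a detect URL that requests all given face attributes in one call.

    Hypothetical helper for illustration only.
    """
    query = {"returnFaceAttributes": ",".join(attributes)}
    # Keep the commas readable instead of percent-encoding them.
    return FACE_DETECT_ENDPOINT + "?" + urlencode(query, safe=",")


if __name__ == "__main__":
    print(build_detect_url(["age", "emotion"]))
```

The image itself would then be sent in the body of a POST request to this URL, with the subscription key in the `Ocp-Apim-Subscription-Key` header, as in the C# samples later in this post.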

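Recognizing landmarks

The Computer Vision API can now recognize landmarks. Domain-specific models are available, including a landmark model and a celebrity recognition model. The landmark recognition model recognizes 9,000 natural and man-made landmarks from around the world. Domain-specific models continue to extend what the Computer Vision API can do.

I took the photo below while traveling. Let's have an app recognize it.

Some people may recognize this place, but how would a computer recognize it?

In C#, you can use these capabilities with the simple REST API call below. For other languages, see the links at the end of this post.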

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

namespace CSHttpClientSample
{
    static class Program
    {
        static void Main()
        {
            Console.Write("Enter image file path: ");
            string imageFilePath = Console.ReadLine();

            // Block until the request completes so the response is printed before the exit prompt.
            MakeAnalysisRequest(imageFilePath).GetAwaiter().GetResult();

            Console.WriteLine("\n\nHit ENTER to exit...\n");
            Console.ReadLine();
        }

        static byte[] GetImageAsByteArray(string imageFilePath)
        {
            // File.ReadAllBytes opens, reads, and closes the file in one step.
            return File.ReadAllBytes(imageFilePath);
        }

        static async Task MakeAnalysisRequest(string imageFilePath)
        {
            var client = new HttpClient();

            // Request header. Replace the second parameter with a valid subscription key.
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "putyourkeyhere");

            // Request parameters. To use the celebrity recognition model, change "landmarks"
            // to "celebrities" in requestParameters and change the URI accordingly.
            string requestParameters = "model=landmarks";
            string uri = "https://westus.api.cognitive.microsoft.com/vision/v1.0/models/landmarks/analyze?" + requestParameters;
            Console.WriteLine(uri);

            HttpResponseMessage response;

            // Request body. Try this sample with a locally stored JPEG image.
            byte[] byteData = GetImageAsByteArray(imageFilePath);

            using (var content = new ByteArrayContent(byteData))
            {
                // This example sets the content type to "application/octet-stream".
                // You can also use "application/json" and "multipart/form-data".
                content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
                response = await client.PostAsync(uri, content);
                string contentString = await response.Content.ReadAsStringAsync();
                Console.WriteLine("Response:\n");
                Console.WriteLine(contentString);
            }
        }
    }
}
```

A successful response returns JSON like the following.

```json
{
  "requestId": "b15f13a4-77d9-4fab-a701-7ad65bcdcaed",
  "metadata": {
    "width": 1024,
    "height": 680,
    "format": "Jpeg"
  },
  "result": {
    "landmarks": [
      {
        "name": "Colosseum",
        "confidence": 0.9448209
      }
    ]
  }
}
```
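Once the response arrives, an app usually wants only the landmark names it can trust. The following is a minimal Python sketch of that step, using the sample response above as input. The function name `landmarks_above` and the 0.5 default threshold are assumptions chosen for illustration, not part of the API.

```python
import json

# The sample response shown above, embedded so the sketch runs offline.
SAMPLE_RESPONSE = """
{
  "requestId": "b15f13a4-77d9-4fab-a701-7ad65bcdcaed",
  "metadata": {"width": 1024, "height": 680, "format": "Jpeg"},
  "result": {"landmarks": [{"name": "Colosseum", "confidence": 0.9448209}]}
}
"""


def landmarks_above(response_text, min_confidence=0.5):
    """Return (name, confidence) pairs at or above min_confidence.

    Hypothetical helper; the threshold is an illustrative choice.
    """
    body = json.loads(response_text)
    landmarks = body.get("result", {}).get("landmarks", [])
    return [
        (landmark["name"], landmark["confidence"])
        for landmark in landmarks
        if landmark["confidence"] >= min_confidence
    ]


if __name__ == "__main__":
    for name, confidence in landmarks_above(SAMPLE_RESPONSE):
        print(f"{name}: {confidence:.2f}")  # prints "Colosseum: 0.94"
```

Using `.get` with defaults means a response with no `result` or no `landmarks` produces an empty list rather than an exception.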

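Recognizing handwritten text

The Computer Vision API also offers a preview of handwriting OCR. This feature detects handwritten text in an image and extracts the recognized characters as a machine-usable string.
It can detect and extract handwritten text from notes, letters, essays, whiteboards, forms, and similar sources, and it works on a variety of surfaces and backgrounds such as paper, sticky notes, and whiteboards. With it, you no longer need to retype handwritten notes into a computer. Take a picture and run handwriting OCR to digitize your notes, which saves time, effort, and paper. When you want to revisit your notes, you can quickly search for the relevant passage.

You can try it yourself here by uploading a sample.

Now let's recognize the text written on the whiteboard below.

It is one of my favorite quotes.

In C#, run the code below.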

```csharp
using System;
using System.IO;
using System.Collections;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

namespace CSHttpClientSample
{
    static class Program
    {
        static void Main()
        {
            Console.Write("Enter image file path: ");
            string imageFilePath = Console.ReadLine();

            // Block until the request completes so the result is printed before the exit prompt.
            ReadHandwrittenText(imageFilePath).GetAwaiter().GetResult();

            Console.WriteLine("\n\n\nHit ENTER to exit...");
            Console.ReadLine();
        }

        static byte[] GetImageAsByteArray(string imageFilePath)
        {
            // File.ReadAllBytes opens, reads, and closes the file in one step.
            return File.ReadAllBytes(imageFilePath);
        }

        static async Task ReadHandwrittenText(string imageFilePath)
        {
            var client = new HttpClient();

            // Request header. Replace the sample key with a valid subscription key.
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "putyourkeyhere");

            // Request parameter and URI. With "handwriting=true", printed text is not targeted.
            string requestParameter = "handwriting=true";
            string uri = "https://westus.api.cognitive.microsoft.com/vision/v1.0/recognizeText?" + requestParameter;

            HttpResponseMessage response = null;
            IEnumerable<string> responseValues = null;
            string operationLocation = null;

            // Request body. Try this sample with a locally stored JPEG image.
            byte[] byteData = GetImageAsByteArray(imageFilePath);
            var content = new ByteArrayContent(byteData);

            // This example sets the content type to "application/octet-stream".
            // You can also use "application/json" and specify the URL of an image.
            content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

            try
            {
                response = await client.PostAsync(uri, content);
                responseValues = response.Headers.GetValues("Operation-Location");
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
                // Without an Operation-Location header there is no result to fetch.
                return;
            }

            foreach (var value in responseValues)
            {
                // This value is the URI where the result of the text recognition operation is written.
                operationLocation = value;
                Console.WriteLine(operationLocation);
                break;
            }

            try
            {
                // Note: the response may take a while. Handwriting recognition is an
                // asynchronous operation whose duration depends on the length of the text.
                // You may need to wait for the output or retry this operation.
                response = await client.GetAsync(operationLocation);

                // The output is in JSON format.
                Console.WriteLine(await response.Content.ReadAsStringAsync());
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
            }
        }
    }
}
```
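Because recognition is asynchronous, a single GET on the Operation-Location URI may return before the result is ready. One way to handle this is to poll until the `status` field reaches a final value, sketched below in Python. The sketch makes several assumptions: that `"Failed"` is the other terminal status besides the `"Succeeded"` value shown in this post, that one second between attempts is acceptable, and that the names `http_fetch` and `wait_for_result` are hypothetical helpers rather than part of any SDK.

```python
import json
import time
import urllib.request


def http_fetch(url, subscription_key):
    """GET the operation result, sending the subscription key header."""
    request = urllib.request.Request(
        url, headers={"Ocp-Apim-Subscription-Key": subscription_key}
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def wait_for_result(fetch, operation_location, max_attempts=10,
                    delay_seconds=1.0, sleep=time.sleep):
    """Poll operation_location until the status is terminal.

    fetch is any callable that takes a URL and returns the JSON body as text.
    Assumption: any status other than "Succeeded" or "Failed" means the
    operation is still in progress.
    """
    for _ in range(max_attempts):
        body = json.loads(fetch(operation_location))
        if body.get("status") in ("Succeeded", "Failed"):
            return body
        sleep(delay_seconds)
    raise TimeoutError(f"No final status after {max_attempts} attempts")
```

In real use the call might look like `wait_for_result(lambda url: http_fetch(url, "putyourkeyhere"), operation_location)`. Passing `fetch` and `sleep` as parameters keeps the loop testable without network access.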

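When recognition succeeds, the OCR result is returned as JSON containing the text together with the bounding boxes, lines, and words that locate it, as shown below.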

```json
{
  "status": "Succeeded",
  "recognitionResult": {
    "lines": [
      {
        "boundingBox": [542, 724, 1404, 722, 1406, 819, 544, 820],
        "text": "You must be the change",
        "words": [
          {
            "boundingBox": [535, 725, 678, 721, 698, 841, 555, 845],
            "text": "You"
          },
          {
            "boundingBox": [713, 720, 886, 715, 906, 835, 734, 840],
            "text": "must"
          },
          {
            "boundingBox": [891, 715, 982, 713, 1002, 833, 911, 835],
            "text": "be"
          },
          {
            "boundingBox": [1002, 712, 1129, 708, 1149, 829, 1022, 832],
            "text": "the"
          },
          {
            "boundingBox": [1159, 708, 1427, 700, 1448, 820, 1179, 828],
            "text": "change"
          }
        ]
      },
      {
        "boundingBox": [667, 905, 1766, 868, 1771, 976, 672, 1015],
        "text": "you want to see in the world !",
        "words": [
          {
            "boundingBox": [665, 901, 758, 899, 768, 1015, 675, 1017],
            "text": "you"
          },
          {
            "boundingBox": [752, 900, 941, 896, 951, 1012, 762, 1015],
            "text": "want"
          },
          {
            "boundingBox": [960, 896, 1058, 895, 1068, 1010, 970, 1012],
            "text": "to"
          },
          {
            "boundingBox": [1077, 894, 1227, 892, 1237, 1007, 1087, 1010],
            "text": "see"
          },
          {
            "boundingBox": [1253, 891, 1338, 890, 1348, 1006, 1263, 1007],
            "text": "in"
          },
          {
            "boundingBox": [1344, 890, 1488, 887, 1498, 1003, 1354, 1005],
            "text": "the"
          },
          {
            "boundingBox": [1494, 887, 1755, 883, 1765, 999, 1504, 1003],
            "text": "world"
          },
          {
            "boundingBox": [1735, 883, 1813, 882, 1823, 998, 1745, 999],
            "text": "!"
          }
        ]
      }
    ]
  }
}
```
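To turn this result into searchable text, an app can join the `text` of each line and, if it needs to highlight regions, reduce each bounding box to a simple rectangle. The Python sketch below does both on a trimmed copy of the response above (the `words` arrays are omitted for brevity). It assumes each `boundingBox` lists four corner points as x, y pairs, which matches the sample values but is not stated in this post; the helper names are made up for this example.

```python
import json

# Trimmed copy of the response above: line-level entries only.
SAMPLE_RESULT = """
{
  "status": "Succeeded",
  "recognitionResult": {
    "lines": [
      {"boundingBox": [542, 724, 1404, 722, 1406, 819, 544, 820],
       "text": "You must be the change"},
      {"boundingBox": [667, 905, 1766, 868, 1771, 976, 672, 1015],
       "text": "you want to see in the world !"}
    ]
  }
}
"""


def polygon_to_rect(bounding_box):
    """Reduce [x1, y1, ..., x4, y4] to (left, top, right, bottom).

    Assumption: values alternate x and y for four corner points.
    """
    xs = bounding_box[0::2]
    ys = bounding_box[1::2]
    return min(xs), min(ys), max(xs), max(ys)


def extract_lines(result_text):
    """Return a list of (text, rect) pairs, one per recognized line."""
    body = json.loads(result_text)
    lines = body.get("recognitionResult", {}).get("lines", [])
    return [(line["text"], polygon_to_rect(line["boundingBox"])) for line in lines]


if __name__ == "__main__":
    lines = extract_lines(SAMPLE_RESULT)
    print(" ".join(text for text, _ in lines))
    for text, rect in lines:
        print(rect, text)
```

The axis-aligned rectangle is slightly larger than the original four-point box when the handwriting is slanted, which is usually acceptable for highlighting or cropping.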

For C# and other languages, see the following resources.

Face API page, and quickstart guides for C#, Java, and Python
Computer Vision API page, and quickstart guides for C#, Java, and Python
Content Moderator page, and the Content Moderator test-drive demo: this demo walks you through the full content moderation lifecycle, from configuration to execution.

For real customer stories, see here. Also see how GrayMeta uses the Vision API.

We hope these help with your development.
