Transcriptions (speech-to-text)

POST

/v1/audio/transcriptions

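Transcribes audio in the language of the input.
The transcription API takes the audio file you want to transcribe as input and lets you specify the output format of the transcript.
Multiple input and output file formats are currently supported.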

Price: 0.003 PTC per minute
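
As a rough illustration of the per-minute rate, the sketch below estimates a cost from the audio duration. It assumes billing is strictly proportional to duration, with no rounding to whole minutes and no minimum charge, which this page does not state. `estimate_cost_ptc` is a hypothetical helper name, not part of the API.

```python
from decimal import Decimal

# Rate taken from the price line above: 0.003 PTC per minute of audio.
PRICE_PER_MINUTE_PTC = Decimal("0.003")


def estimate_cost_ptc(duration_seconds: float) -> Decimal:
    """Estimate the cost in PTC for audio of the given length.

    Assumption: cost scales linearly with duration (no rounding up to
    whole minutes, no minimum charge). Actual billing may differ.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    minutes = Decimal(str(duration_seconds)) / Decimal(60)
    return PRICE_PER_MINUTE_PTC * minutes
```

Under this assumption, a 10-minute file (600 seconds) comes to 0.03 PTC.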

Request

Header Params

string

required

Example:

application/json

Authorization

string

optional

Example:

Bearer {{YOUR_API_KEY}}

Body Params multipart/form-data

file

required

The audio file to transcribe, in one of the following formats: mp3, mp4, mpeg, mpga, m4a, wav, or webm.
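
A client may want to reject unsupported files before uploading. The following is a minimal sketch that checks only the file extension against the formats listed above. `has_supported_extension` is a hypothetical helper, and the server may apply stricter checks on the actual file contents.

```python
from pathlib import Path

# Formats listed in the `file` parameter description.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}


def has_supported_extension(path: str) -> bool:
    """Return True if the file name ends in one of the listed formats.

    This checks only the extension (case-insensitively), not the
    actual container or codec of the file.
    """
    suffix = Path(path).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_EXTENSIONS
```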

model

string

required

whisper-large-v3

Example:

whisper-large-v3

prompt

string

optional

Guides the model's style or continues a previous audio segment. The prompt must match the language of the audio.

response_format

string

optional

Format of the transcript output. Options: json, text, verbose_json.

Example:

json

temperature

number

optional

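Sampling temperature, between 0 and 1. Higher values (e.g. 0.8) make the output more random; lower values (e.g. 0.2) make it more focused and deterministic. If set to 0, the model uses log probability to raise the temperature automatically until certain thresholds are reached.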
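
The non-file body parameters described above can be gathered and checked on the client before sending. This is a sketch only: `build_form_fields` is a hypothetical helper, and omitting unset optional parameters is a design choice made here, not documented server behavior.

```python
from typing import Dict, Optional

# Values listed for `response_format` above.
RESPONSE_FORMATS = {"json", "text", "verbose_json"}


def build_form_fields(
    model: str = "whisper-large-v3",
    prompt: Optional[str] = None,
    response_format: Optional[str] = None,
    temperature: Optional[float] = None,
) -> Dict[str, str]:
    """Collect the non-file body parameters as string form fields.

    Optional parameters are left out when not provided, so the request
    carries only what the caller explicitly set.
    """
    fields = {"model": model}
    if prompt is not None:
        fields["prompt"] = prompt
    if response_format is not None:
        if response_format not in RESPONSE_FORMATS:
            raise ValueError(f"unsupported response_format: {response_format!r}")
        fields["response_format"] = response_format
    if temperature is not None:
        # The documented range is 0 to 1 inclusive.
        if not 0 <= temperature <= 1:
            raise ValueError("temperature must be between 0 and 1")
        fields["temperature"] = str(temperature)
    return fields
```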

Request samples

Shell

curl --location --request POST 'https://api.302.ai/v1/audio/transcriptions' \
--header 'Accept: application/json' \
--header 'Authorization: Bearer {{YOUR_API_KEY}}' \
--form 'file=@""' \
--form 'model="whisper-large-v3"' \
--form 'prompt=""' \
--form 'response_format="json"' \
--form 'temperature="0"'
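
For readers working in Python, the same request can be sketched with the standard library alone. The URL and header names come from the curl sample above. The helper names `encode_multipart` and `build_request` are hypothetical, and sending the file part as `application/octet-stream` is an assumption about what the server accepts. The sketch builds the request without sending it, and it does not escape quotes in file names.

```python
import urllib.request
import uuid
from typing import Dict, Tuple

API_URL = "https://api.302.ai/v1/audio/transcriptions"


def encode_multipart(
    fields: Dict[str, str], filename: str, file_bytes: bytes
) -> Tuple[bytes, str]:
    """Encode form fields plus one `file` part as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        )
        chunks.append(header.encode("utf-8") + value.encode("utf-8") + b"\r\n")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    chunks.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    # Closing delimiter that terminates the multipart body.
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def build_request(
    api_key: str, filename: str, file_bytes: bytes, fields: Dict[str, str]
) -> urllib.request.Request:
    """Build (but do not send) the POST request for a transcription."""
    body, content_type = encode_multipart(fields, filename, file_bytes)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; reading the API key from an environment variable rather than hard-coding it keeps it out of source control.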

Responses

200 OK

application/json

Body

text

string

required

Example

{
    "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger. This is a place where you can get to do that."
}
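
A JSON response of the shape shown above can be read as follows. This is a minimal sketch for the `json` response format only; `extract_text` is a hypothetical helper name.

```python
import json


def extract_text(response_body: str) -> str:
    """Return the `text` field from a JSON transcription response.

    Raises ValueError if the body is not a JSON object with a string
    `text` field, which is the only field this page documents.
    """
    payload = json.loads(response_body)
    if not isinstance(payload, dict) or not isinstance(payload.get("text"), str):
        raise ValueError("response does not contain a string 'text' field")
    return payload["text"]
```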

Modified at 2024-12-02 09:38:36
