Web API
Introduction
The Waygo API lets you extract text from images (optical character recognition, or OCR) automatically and at scale. The API currently supports text extraction for Chinese, English, Japanese and Korean. The detect endpoint returns the position, color and background color of detected text. If you have any questions about the Waygo API or the documentation, please reach out to sdk@waygoapp.com.
Authentication
To authorize, use this code:
# With shell, you can just pass the correct header with each request
curl "api_endpoint_here" \
  -H "Authorization: yourapikey"
Make sure to replace yourapikey with your API key.
Waygo uses API keys to allow access to the API. You can request a new Waygo API key by contacting us.
Waygo expects the API key to be included in every API request to the server, in a header that looks like the following:
Authorization: yourapikey
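As a minimal sketch of the same header in Python, the snippet below attaches the key to a request object using only the standard library. The helper name `authorized_request` and the `API_KEY` constant are illustrative assumptions, not part of the Waygo API; the request is only built here, not sent.

```python
import urllib.request

# Hypothetical placeholder; substitute the key issued to you by Waygo.
API_KEY = "yourapikey"


def authorized_request(url: str, api_key: str = API_KEY) -> urllib.request.Request:
    """Return a Request that carries the API key in the Authorization header."""
    request = urllib.request.Request(url)
    # The key is sent as the raw header value, matching the header shown above.
    request.add_header("Authorization", api_key)
    return request


request = authorized_request("http://demo.waygoapp.com/ocr/recognize/")
print(request.get_header("Authorization"))  # prints: yourapikey
```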
Image
Detect text
curl \
--request POST \
--url http://demo.waygoapp.com/ocr/recognize/ \
--header 'cache-control: no-cache' \
--header 'content-type: multipart/form-data' \
--header 'Authorization: yourapikey' \
--form 'image=@/path/to/image.jpg' \
--form lc_src=zh \
--form detect_type=default
The above command returns JSON structured like this:
[
  {
    "value": "漢堡",
    "translation": "Hamburger",
    "shape": [
      {"x": 10, "y": 40},
      {"x": 100, "y": 40},
      {"x": 100, "y": 80},
      {"x": 10, "y": 80}
    ],
    "colors": {
      "fg": [0, 4, 6],
      "bg": [240, 245, 255]
    }
  },
  {
    "value": "雞香堡",
    "translation": "Chicken Burger",
    "shape": [
      {"x": 10, "y": 140},
      {"x": 130, "y": 140},
      {"x": 130, "y": 180},
      {"x": 10, "y": 180}
    ],
    "colors": {
      "fg": [0, 4, 6],
      "bg": [240, 245, 255]
    }
  }
]
This endpoint detects the lines of text in an image and returns the position, colors and content of each detected line. If the source language is not English, it also returns an English translation and an English romanization of the text. If the source language is English, the translation and romanization fields currently mirror the value field.
The available form parameters for this request are:
- image (required) is the image to be used for detection, attached as part of a multipart-encoded request.
- lc_src (required) is short for “language code, source”, and represents the language code of the language that should be detected in the image. See the POST Form Parameters section for available language codes.
- lc_tgt (optional) is short for “language code, target”, and represents the language that should be translated into, if any. This defaults to English. See the POST Form Parameters section for available language codes.
- detect_type (optional) is a hint for what kind of text should be expected in this image. Different Waygo language and OCR models are optimized for different use cases, and when applicable, this hint can boost performance in certain situations. Currently, the only available type is default.
A list of detected labels is returned. Every detected label in the list contains the following fields:
- value is the detected text in the language specified by lc_src.
- translation is the translation of the detected text into English.
- shape contains a list of points for the shape that fits around the detected text. In most cases, this will be four coordinates representing the four corners of a rectangle. The coordinates are given in clockwise order.
- colors contains two keys, fg and bg, short for foreground and background. Foreground is the color of the detected text, while background is the average color behind the text. Both keys contain a list with three integer values between 0 and 255, representing the RGB color value. For example, "fg": [0, 34, 230] means that the text color has a red channel with value 0, green 34, and blue 230, or expressed in CSS, rgba(0, 34, 230, 1.0).
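The fields described above can be consumed along the following lines. This is a sketch rather than an official client: the helper names `bounding_box` and `css_color` are hypothetical, and the sample is the first label of the example response, inlined so the snippet runs without a network call.

```python
import json

# First label of the example response above, inlined for a self-contained demo.
sample = """
[
  {
    "value": "漢堡",
    "translation": "Hamburger",
    "shape": [
      {"x": 10, "y": 40},
      {"x": 100, "y": 40},
      {"x": 100, "y": 80},
      {"x": 10, "y": 80}
    ],
    "colors": {"fg": [0, 4, 6], "bg": [240, 245, 255]}
  }
]
"""


def bounding_box(shape):
    """Return (left, top, right, bottom) enclosing every point of a shape."""
    xs = [point["x"] for point in shape]
    ys = [point["y"] for point in shape]
    return min(xs), min(ys), max(xs), max(ys)


def css_color(rgb):
    """Format an [r, g, b] list of 0-255 integers as a CSS rgb() string."""
    r, g, b = rgb
    return f"rgb({r}, {g}, {b})"


for label in json.loads(sample):
    print(label["value"], "->", label["translation"])
    print("box:", bounding_box(label["shape"]))
    print("fg:", css_color(label["colors"]["fg"]))
    print("bg:", css_color(label["colors"]["bg"]))
```

Taking the minimum and maximum of the coordinates, instead of relying on point order, keeps the bounding box correct even when a shape has more than four points.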
HTTP Request
POST http://demo.waygoapp.com/ocr/recognize/
POST Form Parameters
| Parameter | Required | Default | Description |
|---|---|---|---|
| image | Yes | - | The image to be used for detection, attached as part of a multipart-encoded request. |
| lc_src | Yes | - | The ISO language code of the source language, that is, the language of the text in the image. Supported source languages are Chinese, English, Japanese and Korean (for example, zh for Chinese). If zh is specified, the API will automatically handle both simplified and traditional Chinese text. |
| lc_tgt | No | en | The ISO language code of the target language, that is, the language to translate to. Currently, the only valid option is en (English), and the parameter is not required. |
| detect_type | No | default | A hint for what kind of text should be expected in this image. Use default for the majority of use cases. |
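For readers not using curl, the form parameters in the table could be assembled into a multipart request roughly as follows. This is a sketch under stated assumptions: `build_detect_request` is a hypothetical helper, the `image/jpeg` content type and the `image.jpg` filename are placeholders, and the multipart body is built by hand with the Python standard library. The request is constructed but not sent.

```python
import urllib.request
import uuid

API_URL = "http://demo.waygoapp.com/ocr/recognize/"


def build_detect_request(image_bytes, api_key, lc_src, lc_tgt="en",
                         detect_type="default", filename="image.jpg"):
    """Build (but do not send) a multipart POST for the detect endpoint."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields from the parameter table.
    for name, value in (("lc_src", lc_src), ("lc_tgt", lc_tgt),
                        ("detect_type", detect_type)):
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The image file part. The content type is an assumption; adjust it
    # to the actual image format.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
        f'Content-Type: image/jpeg\r\n\r\n'.encode("utf-8")
        + image_bytes + b"\r\n"
    )
    # Closing boundary marks the end of the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Dummy bytes stand in for real image data in this demo.
request = build_detect_request(b"\xff\xd8\xff", "yourapikey", lc_src="zh")
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

Building the body by hand avoids a third-party dependency; a library with built-in multipart support would shorten this considerably.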
Errors
The standard error response is structured like this:
{
"code": 418,
"message": "error message",
"fields": ["field1", "field2"]
}
The fields field is only set if the error relates to specific fields in the request; otherwise it may be left out. The message field gives a human-readable description of the error.
The error code will match the HTTP response code, and possible codes are:
| Error Code | Meaning |
|---|---|
| 400 | Bad Request |
| 401 | Unauthorized |
| 403 | Forbidden |
| 404 | Not Found |
| 429 | Too Many Requests |
| 500 | Internal Server Error |
| 503 | Service Unavailable |
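A client might turn the standard error response into an exception along these lines. This is a sketch: `WaygoAPIError` and `parse_error` are hypothetical names, and treating 429 and 503 as retryable is an assumption based on common HTTP practice, not something the Waygo documentation states.

```python
import json

# Assumption: rate limiting and temporary unavailability are worth retrying.
RETRYABLE_CODES = {429, 503}


class WaygoAPIError(Exception):
    """Exception mirroring the standard error response body."""

    def __init__(self, code, message, fields=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        # "fields" may be absent when the error is not tied to a request field.
        self.fields = fields or []

    @property
    def retryable(self):
        return self.code in RETRYABLE_CODES


def parse_error(body: str) -> WaygoAPIError:
    """Build a WaygoAPIError from a JSON error response body."""
    payload = json.loads(body)
    return WaygoAPIError(payload["code"], payload["message"], payload.get("fields"))


error = parse_error('{"code": 400, "message": "error message", "fields": ["lc_src"]}')
print(error.code, error.fields, error.retryable)
```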
