Web API

Introduction

The Waygo API lets you extract text from images (optical character recognition, or OCR) automatically and at scale. It currently supports Chinese, English, Japanese, and Korean text. The detect endpoint returns the position, text color, and background color of each piece of detected text. If you have any questions about the Waygo API or this documentation, please reach out to sdk@waygoapp.com.

Authentication

To authorize, use this code:

# With shell, you can just pass the correct header with each request
curl "api_endpoint_here" \
  -H "Authorization: yourapikey"

Make sure to replace yourapikey with your API key.

Waygo uses API keys to allow access to the API. You can request a new Waygo API key by contacting us.

Waygo expects the API key to be included in every API request to the server, in a header that looks like the following:

Authorization: yourapikey

You must replace yourapikey with your personal API key.
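
As a rough illustration outside the shell, the same header can be attached in Python with the standard library's urllib.request. The helper name build_request is hypothetical, and the default URL is simply the demo endpoint used elsewhere on this page; treat this as a sketch rather than an official client.

```python
import urllib.request

# Demo endpoint used in the examples on this page.
API_URL = "http://demo.waygoapp.com/ocr/recognize/"


def build_request(api_key, url=API_URL):
    """Return a Request that carries the API key in the Authorization header.

    build_request is a hypothetical helper, not part of any Waygo library.
    """
    if not api_key:
        raise ValueError("an API key is required")
    # The key is sent as-is, exactly as shown in the header example above.
    return urllib.request.Request(url, headers={"Authorization": api_key})


req = build_request("yourapikey")
print(req.get_header("Authorization"))
```

Nothing is sent until the request object is passed to urllib.request.urlopen, so the header can be inspected before any network call is made.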

Image

Detect text

curl \
  --request POST \
  --url http://demo.waygoapp.com/ocr/recognize/ \
  --header 'cache-control: no-cache' \
  --header 'content-type: multipart/form-data' \
  --header 'Authorization: yourapikey' \
  --form 'image=@/path/to/image.jpg' \
  --form lc_src=zh \
  --form detect_type=default

The above command returns JSON structured like this:

[
  {
    "value": "漢堡",
    "translation": "Hamburger",
    "shape": [
      {"x": 10, "y": 40},
      {"x": 100, "y": 40},
      {"x": 100, "y": 80},
      {"x": 10, "y": 80}
    ],
    "colors": {
      "fg": [0, 4, 6],
      "bg": [240, 245, 255]
    }
  },
  {
    "value": "雞香堡",
    "translation": "Chicken Burger",
    "shape": [
      {"x": 10, "y": 140},
      {"x": 130, "y": 140},
      {"x": 130, "y": 180},
      {"x": 10, "y": 180}
    ],
    "colors": {
      "fg": [0, 4, 6],
      "bg": [240, 245, 255]
    }
  }
]

This endpoint detects the lines of text in an image and returns the position, colors, and content of each detected line. If the source language is not English, it also returns an English translation and an English romanization of the text. If the source language is English, the translation and romanization fields currently mirror the value field.

The available form parameters for this request are:

image (required) is the image to be used for detection, attached as part of a multipart-encoded request.
lc_src (required) is short for “language code, source”, and represents the language code of the language that should be detected in the image. See the POST Form Parameters section for available language codes.
lc_tgt (optional) is short for “language code, target”, and represents the language that should be translated into, if any. This defaults to English. See the POST Form Parameters section for available language codes.
detect_type (optional) is a hint for what kind of text should be expected in this image. Different Waygo language and OCR models are optimized for different use cases, and when applicable, this hint can boost performance in certain situations. Currently, the only available type is default.
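
To make the multipart encoding concrete, here is a minimal Python sketch that uses only the standard library. The helper name encode_multipart, the application/octet-stream part type, and the placeholder image bytes are assumptions for illustration; only the field names come from the list above.

```python
import uuid


def encode_multipart(fields, image_name, image_bytes):
    """Encode text fields plus one image as a multipart/form-data body.

    Hypothetical helper; returns (content_type_header, body_bytes).
    """
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        chunks.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n'
                "\r\n"
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # The file part is named "image" to match the required form parameter.
    chunks.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="image"; filename="{image_name}"\r\n'
            "Content-Type: application/octet-stream\r\n"  # assumed part type
            "\r\n"
        ).encode("utf-8")
    )
    chunks.append(image_bytes + b"\r\n")
    # The closing boundary ends with two extra dashes.
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(chunks)


content_type, body = encode_multipart(
    {"lc_src": "zh", "detect_type": "default"},
    "image.jpg",
    b"placeholder image bytes",
)
print(content_type.split(";")[0])
```

The returned content type would go in the request's Content-Type header and the bytes in its body; in practice an HTTP client library normally does this encoding for you.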

A list of detected labels is returned. Every detected label in the list contains the following fields:

value is the detected text in the language specified by lc_src
translation is the translation of the detected text into English
shape contains a list of points for the shape that fits around the detected text. In most cases, this will be four coordinates representing the four corners of a rectangle. The coordinates are given in clockwise order.
colors contains two keys, fg and bg, short for foreground and background. Foreground is the color of the detected text, while background is the average color behind the text. Both keys contain a list of three integer values between 0 and 255, representing the RGB color value. For example, "fg": [0, 34, 230] means that the text color has a red channel with value 0, green 34, and blue 230, or expressed in CSS, rgba(0, 34, 230, 1.0).
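
The fields above can be consumed as in the following Python sketch. The sample data is adapted from the first label of the example response on this page; bounding_box and css_rgb are hypothetical helper names, and the axis-aligned box is just one convenient way to use the shape points.

```python
import json

# One label, adapted from the example response on this page.
sample = json.loads("""
[{"value": "漢堡", "translation": "Hamburger",
  "shape": [{"x": 10, "y": 40}, {"x": 100, "y": 40},
            {"x": 100, "y": 80}, {"x": 10, "y": 80}],
  "colors": {"fg": [0, 4, 6], "bg": [240, 245, 255]}}]
""")


def bounding_box(shape):
    """Axis-aligned box (left, top, right, bottom) enclosing the shape points."""
    xs = [point["x"] for point in shape]
    ys = [point["y"] for point in shape]
    return min(xs), min(ys), max(xs), max(ys)


def css_rgb(channels):
    """Format an [r, g, b] list as a CSS rgb() string."""
    r, g, b = channels
    return f"rgb({r}, {g}, {b})"


for label in sample:
    print(label["value"], "->", label["translation"])
    print(bounding_box(label["shape"]))   # (10, 40, 100, 80)
    print(css_rgb(label["colors"]["fg"]))  # rgb(0, 4, 6)
```

Taking the minimum and maximum of the coordinates works for any number of points, so the helper does not depend on the shape being a four-corner rectangle.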

HTTP Request

POST http://demo.waygoapp.com/ocr/recognize/

POST Form Parameters

Parameter	Required	Default	Description
image	Yes	-	The image to be used for detection, attached as part of a multipart-encoded request.
lc_src	Yes	-	The ISO language code of the source language, or in other words, the language of the text in the image. The accepted language codes are: `en` (English) `zh` (Chinese) `ja` (Japanese) `ko` (Korean) If `zh` is specified, the API will automatically handle both simplified and traditional Chinese text.
lc_tgt	No	en	The ISO language code of the target language, or in other words, the language to translate to. Currently, the only valid option is `en` (English), and the parameter is not required.
detect_type	No	default	A hint for what kind of text should be expected in this image. Use `default` for the majority of use cases.
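
A client may want to check these parameters before uploading an image. The following Python sketch mirrors the table above; the function name build_form_fields and the choice to raise ValueError are assumptions for illustration, and the accepted values are only those listed on this page.

```python
# Accepted values, taken from the parameter table above.
SOURCE_LANGS = {"en", "zh", "ja", "ko"}
TARGET_LANGS = {"en"}
DETECT_TYPES = {"default"}


def build_form_fields(lc_src, lc_tgt="en", detect_type="default"):
    """Validate the text form fields and return them as a dict.

    Hypothetical helper; the image itself is attached separately
    as the multipart file part named "image".
    """
    if lc_src not in SOURCE_LANGS:
        raise ValueError(f"unsupported lc_src: {lc_src!r}")
    if lc_tgt not in TARGET_LANGS:
        raise ValueError(f"unsupported lc_tgt: {lc_tgt!r}")
    if detect_type not in DETECT_TYPES:
        raise ValueError(f"unsupported detect_type: {detect_type!r}")
    return {"lc_src": lc_src, "lc_tgt": lc_tgt, "detect_type": detect_type}


print(build_form_fields("zh"))
```

Validating locally avoids spending a request on a call the server would reject, though the server's own response remains the authority on what is accepted.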

Errors

The standard error response is structured like this:

{
    "code": 400,
    "message": "error message",
    "fields": ["field1", "field2"]
}

The fields field is set only when the error relates to specific fields in the request; otherwise it may be left out. The message field gives a human-readable description of the error.

The error code will match the HTTP response code, and possible codes are:

Error Code	Meaning
400	Bad Request
401	Unauthorized
403	Forbidden
404	Not Found
429	Too Many Requests
500	Internal Server Error
503	Service Unavailable
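
The error body can be turned into an exception on the client side, as in this Python sketch. The class name WaygoAPIError and the helpers parse_error and is_retryable are hypothetical, and treating 429, 500, and 503 as worth retrying is a common convention assumed here, not something this documentation specifies.

```python
import json


class WaygoAPIError(Exception):
    """Hypothetical exception wrapping the standard error response."""

    def __init__(self, code, message, fields=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        # "fields" may be absent when the error is not tied to a field.
        self.fields = fields or []


# Assumption: rate limiting and server-side failures may succeed on retry.
RETRYABLE_CODES = {429, 500, 503}


def parse_error(body):
    """Build a WaygoAPIError from a JSON error response body."""
    data = json.loads(body)
    return WaygoAPIError(data["code"], data["message"], data.get("fields"))


def is_retryable(error):
    """Return True if the error code is in the assumed retryable set."""
    return error.code in RETRYABLE_CODES


err = parse_error('{"code": 400, "message": "error message", "fields": ["lc_src"]}')
print(err.code, err.fields, is_retryable(err))
```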