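Here are some samples from the test data set:
(Note: Tieba compresses uploaded images; for the originals, download them from the GitHub repo.)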

$ ./ocr.py test/test-ipad.png
{"sanitized": [[22654, 169, 804272, 431], [23631, 48, 794332, 472], [23572, -73, 774473, 592], [22594, -47, 699730, 226]], "raw": [["22654pt", "(+169)", "804272", "431"], ["23631 pt", "(+48)", "794332", "472"], ["23572:\u201c", "(- 73)", "774473", "592"], ["22594;\u201d", "(-47)", "699730", "226"]]}
The resulting data is the "sanitized" field:
[[22654, 169, 804272, 431],
[23631, 48, 794332, 472],
[23572, -73, 774473, 592],
[22594, -47, 699730, 226]]
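
To use this result from another script, the JSON line printed by ocr.py can be parsed and the "sanitized" field picked out. A minimal sketch, assuming the output has already been captured as a string; the helper name extract_sanitized is made up for illustration and is not part of ocr.py:

```python
import json

# Output of ./ocr.py test/test-ipad.png, copied from the sample above.
# Raw strings keep the \u201c / \u201d escapes intact for the JSON parser.
ocr_output = (
    r'{"sanitized": [[22654, 169, 804272, 431], [23631, 48, 794332, 472], '
    r'[23572, -73, 774473, 592], [22594, -47, 699730, 226]], '
    r'"raw": [["22654pt", "(+169)", "804272", "431"], '
    r'["23631 pt", "(+48)", "794332", "472"], '
    r'["23572:\u201c", "(- 73)", "774473", "592"], '
    r'["22594;\u201d", "(-47)", "699730", "226"]]}'
)


def extract_sanitized(text):
    """Hypothetical helper: return the "sanitized" rows from ocr.py's JSON output."""
    return json.loads(text)["sanitized"]


rows = extract_sanitized(ocr_output)
for row in rows:
    print(row)
```

Each row is a list of four integers; the "raw" field is still there if you want to check what the OCR actually read.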

$ ./ocr.py test/test-tieba-data.jpg
{"sanitized": [[25001, 46, 1073078, 569], [24480, 41, 965042, 559], [23607, -36, 716435, 173], [23553, -50, 822266, 569]], "raw": [["25001pt", "(+46)", "1073078", "569"], ["24480pt", "(+41)", "965042", "559"], ["23607pt", "(-36)", "716435", "173"], ["23553pt", "(-50)", "822266", "569"]]}
The resulting data is:
[[25001, 46, 1073078, 569],
[24480, 41, 965042, 559],
[23607, -36, 716435, 173],
[23553, -50, 822266, 569]]
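
Comparing "raw" with "sanitized" in the samples, the cleanup looks like it drops units, brackets, plus signs and stray OCR punctuation and keeps the sign and digits. The sketch below is only a guess that reproduces the sample results, not necessarily how ocr.py does it; sanitize_cell and sanitize_rows are hypothetical names:

```python
import re


def sanitize_cell(text):
    """Hypothetical: keep only digits and minus signs, then convert to int."""
    return int(re.sub(r"[^0-9-]", "", text))


def sanitize_rows(raw_rows):
    """Apply sanitize_cell to every cell of every row."""
    return [[sanitize_cell(cell) for cell in row] for row in raw_rows]


# "raw" rows from the test-tieba-data.jpg sample above.
raw = [
    ["25001pt", "(+46)", "1073078", "569"],
    ["24480pt", "(+41)", "965042", "559"],
    ["23607pt", "(-36)", "716435", "173"],
    ["23553pt", "(-50)", "822266", "569"],
]

cleaned = sanitize_rows(raw)
for row in cleaned:
    print(row)

# Noisy cells taken from the test-ipad.png sample.
noisy = [sanitize_cell("23572:\u201c"), sanitize_cell("(- 73)"), sanitize_cell("23631 pt")]
print(noisy)
```

This simple rule would fail on a cell with no digits at all, or with a minus sign in an odd position, so a real implementation probably needs more checks.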